# Working with remote objects

Remote storage can be key to sharing and collaborating on multiomics data.
`MODOS` integrates with S3 object storage and htsget to allow remote storage, access and real-time secure streaming of genomic data.
Most of the `MODOS-api`'s functionalities work with remotely stored objects in the same way as with local objects. The user only has to specify the `s3_endpoint` of the remote object store.

## List remotely available MODOs

Listing all available `MODOs` at a specific S3 endpoint (in this tutorial we will use http://localhost as an example) will show `MODOs` in all buckets at that endpoint:

```{code-block} python
import modos.remote as remo

# Show all remote modos
remo.list_remote_items("http://localhost")
# ['modos-demo/GIAB', 'modos-demo/ex']
```

## Show metadata of a remote MODO

Metadata can be displayed directly, either for all `MODOs` or for a specific one:

```{code-block} python
import modos.remote as remo

# Get metadata of all MODOs at endpoint "http://localhost"
remo.get_metadata_from_remote("http://localhost")

# Get metadata of MODO with id ex
remo.get_metadata_from_remote("http://localhost", modo_id="ex")
```

## Find a specific MODO and get its S3 path

A specific `MODO`, and the __bucket name__ to load it from, can be queried by fuzzy search or by exact string matching:

```{code-block} python
import modos.remote as remo

# Query all MODOs with sequence similar to "ex"
remo.get_s3_path("http://localhost", query="ex")
# [{'http://localhost/s3/modos-demo/ex': {'s3_endpoint': 'http://localhost/s3', 'modo_path': 'modos-demo/ex'}}]

# Query all MODOs exactly matching "ex"
remo.get_s3_path("http://localhost", query="ex", exact_match=True)
# []
```

## Instantiate a remote MODO locally

Remotely stored `MODOs` can be instantiated by specifying their remote endpoint and then worked with as if they were stored locally.

::::{tab-set}

:::{tab-item} python
:sync: python

```{code-block} python
from modos.api import MODO

# Load MODO from remote storage
modo = MODO(path='s3://modos-demo/ex', endpoint='http://localhost')

# All operations can be applied as if locally
modo.metadata
# {'ex': {'@type': 'MODO', 'creation_date': '2024-02-19T00:00:00', 'description': 'Dummy modo for tests.', 'has_assay': ..}}
```

:::

:::{tab-item} cli
:sync: cli

```{code-block} console
# Interact with remotely stored MODO
modos --endpoint http://localhost show s3://modos-demo/ex
# ex:
#   '@type': MODO
#   creation_date: '2024-02-19T00:00:00'
#   description: Dummy modo for tests.
#   has_assay:
```

:::

::::

:::{warning}
The __bucket name__ and the __endpoint url__ are specified separately. The __bucket name__ is part of the `object_path` and needs to be included in the s3 path, followed by the `MODO`'s name (e.g. `s3://bucket_name/modo_name`), while the __endpoint url__ needs to be specified on its own.
Only paths that follow the s3 scheme are considered remote, regardless of whether `--endpoint` is specified.
:::

:::{note}
To avoid repetition, the endpoint can also be read from the `MODOS_ENDPOINT` environment variable.
The syntax is then the same as for local objects, except that the `object_path` needs to follow the s3 scheme:

```{code-block} console
export MODOS_ENDPOINT='http://localhost'
modos create s3://bucket/object1
modos show s3://bucket/object1
modos delete s3://bucket/object1
```

:::

(generate_remote)=
## Generate and modify a MODO at a remote object store

A `MODO` can be generated from scratch or from file in the same way as locally, by specifying the remote endpoint's url or `MODOS_ENDPOINT`:

::::{tab-set}

:::{tab-item} python
:sync: python

```{code-block} python
from modos.api import MODO
from pathlib import Path

# yaml file with MODO specifications
config_ex = Path("path/to/ex.yaml")

# Create a modo remotely
modo = MODO.from_file(config_ex, "s3://modos-demo/ex", endpoint="http://localhost")
```

:::

:::{tab-item} cli
:sync: cli

```{code-block} console
# Create a modo from file remotely
modos --endpoint "http://localhost" create --from-file "path/to/ex.yaml" s3://modos-demo/ex3
```

:::

::::

:::{note}
Similar to `MODO` creation, any other modifying functionality of the `modos-api` (e.g. `modos add`, `modos remove` or `MODO.add_element()`, `MODO.remove_element()`) can be performed on remotely stored objects by specifying the __endpoint__ and giving the object path in s3 scheme, including the __bucket name__.
:::

## Download and upload a MODO

A `MODO` can directly be __downloaded__ from a remote endpoint.

::::{tab-set}

:::{tab-item} python
:sync: python

```{code-block} python
from modos.api import MODO

# Load MODO from remote storage
modo = MODO(path='s3://modos-demo/ex', endpoint='http://localhost')

# Download MODO to local path "data/ex"
modo.download("data/ex")
```

:::

:::{tab-item} cli
:sync: cli

```{code-block} console
# Download a remote modo from "modos-demo/ex" to local path "data/ex"
modos --endpoint http://localhost remote download --target data/ex s3://modos-demo/ex
```

:::

::::

A local `MODO` can be __uploaded__ to a remote endpoint.

::::{tab-set}

:::{tab-item} python
:sync: python

```{code-block} python
from modos.api import MODO

# Load MODO from local storage
modo = MODO(path='data/ex')

# Upload MODO to remote path "modos-demo/ex"
modo.upload("s3://modos-demo/ex", s3_endpoint='http://localhost')
```

:::

:::{tab-item} cli
:sync: cli

```{code-block} console
# Upload a local modo from "data/ex" to remote path "modos-demo/ex"
modos --endpoint http://localhost remote upload --target s3://modos-demo/ex data/ex
```

:::

::::
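
The path convention used throughout this page (the bucket name sits inside the `s3://` path, the endpoint is given separately) can be illustrated with the standard library alone. The helper `split_s3_path` below is hypothetical and is not part of `modos`; it is only a minimal sketch of how a path such as `s3://bucket_name/modo_name` breaks down, and of why a path without the s3 scheme is treated as local.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse


def split_s3_path(path: str) -> Optional[Tuple[str, str]]:
    """Hypothetical helper (not a modos function).

    Return (bucket_name, modo_name) for an s3:// path,
    or None when the path does not use the s3 scheme.
    """
    parsed = urlparse(path)
    if parsed.scheme != "s3":
        # No s3 scheme: the path counts as local, whatever the endpoint is.
        return None
    # The bucket is the first component after the scheme;
    # the remainder of the path names the MODO.
    return parsed.netloc, parsed.path.lstrip("/")


print(split_s3_path("s3://modos-demo/ex"))  # ('modos-demo', 'ex')
print(split_s3_path("data/ex"))             # None
```

The endpoint url never appears in the parsed path, which mirrors the separation described in the warning above: the same `s3://modos-demo/ex` path can be combined with any endpoint.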