# Working with remote objects
Remote storage can be key to sharing and collaborating on multiomics data. `MODOS` integrates with S3 object storage and htsget to allow remote storage, access and real-time secure streaming of genomic data.
Most of the `MODOS-api`'s functionality works with remotely stored objects in the same way as with local objects. The user only has to specify the `s3_endpoint` of the remote object store.
## List remotely available MODOs
Listing all available `MODOs` at a specific S3 endpoint (this tutorial uses http://localhost as an example) shows the `MODOs` in all buckets at that endpoint:
```{code-block} python
import modos.remote as remo
# Show all remote modos
remo.list_remote_items("http://localhost")
# ['modos-demo/GIAB', 'modos-demo/ex']
```
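Assuming every returned entry has the form `bucket/modo_name`, as in the example output above, the listing can be grouped per bucket with plain Python. The sketch below works on a copy of that example output rather than a live endpoint, and `group_by_bucket` is a hypothetical helper that is not part of `modos`.

```python
from collections import defaultdict


def group_by_bucket(items):
    """Group 'bucket/modo_name' strings by bucket (hypothetical helper)."""
    grouped = defaultdict(list)
    for item in items:
        # Everything before the first "/" is taken to be the bucket name
        bucket, _, name = item.partition("/")
        grouped[bucket].append(name)
    return dict(grouped)


# Example output of remo.list_remote_items("http://localhost"), copied from above
items = ["modos-demo/GIAB", "modos-demo/ex"]
buckets = group_by_bucket(items)
print(buckets)
```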
## Show metadata of a remote MODO
Metadata can be displayed directly, either for all `MODOs` at an endpoint or for a specific one:
```{code-block} python
import modos.remote as remo
# Get metadata of all MODOs at endpoint "http://localhost"
remo.get_metadata_from_remote("http://localhost")
# Get metadata of MODO with id ex
remo.get_metadata_from_remote("http://localhost", modo_id="ex")
```
## Find a specific MODO and get its S3 path
There are two ways to look up a specific `MODO` and the __bucket name__ to load it from: fuzzy search or exact string matching.
```{code-block} python
import modos.remote as remo
# Query all MODOs with sequence similar to "ex"
remo.get_s3_path("http://localhost", query="ex")
# [{'http://localhost/s3/modos-demo/ex': {'s3_endpoint': 'http://localhost/s3', 'modo_path': 'modos-demo/ex'}}]
# Query all MODOs exactly matching "ex"
remo.get_s3_path("http://localhost", query="ex", exact_match=True)
# []
```
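The query result is a list of dictionaries keyed by the full URL of each match. As a minimal sketch, assuming the result keeps the shape shown in the example output above, the `modo_path` values can be turned into `s3://` paths. `to_s3_uris` is a hypothetical helper, and the `matches` value is a copy of the example output, not a live query.

```python
# Example result of remo.get_s3_path("http://localhost", query="ex"), copied from above
matches = [
    {
        "http://localhost/s3/modos-demo/ex": {
            "s3_endpoint": "http://localhost/s3",
            "modo_path": "modos-demo/ex",
        }
    }
]


def to_s3_uris(matches):
    """Build 's3://bucket/modo_name' strings from a query result (hypothetical helper)."""
    uris = []
    for match in matches:
        for info in match.values():
            # 'modo_path' already contains the bucket name followed by the MODO name
            uris.append(f"s3://{info['modo_path']}")
    return uris


uris = to_s3_uris(matches)
print(uris)
```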
## Instantiate a remote MODO locally
Remotely stored `MODOs` can be instantiated by specifying their remote endpoint and then worked with as if they were stored locally.
The example below assumes a public s3 bucket endpoint accessible anonymously (without credentials).
::::{tab-set}
:::{tab-item} python
:sync: python
```{code-block} python
from modos.api import MODO
# Load MODO from remote storage
modo=MODO(path='s3://modos-demo/ex', endpoint='http://localhost', s3_kwargs={"anon": True})
# All operations can be applied as if locally
modo.metadata
# {'ex': {'@type': 'MODO', 'creation_date': '2024-02-19T00:00:00', 'description': 'Dummy modo for tests.', 'has_assay': ..}}
```
:::
:::{tab-item} cli
:sync: cli
```{code-block} console
# Interact with remotely stored MODO
modos --anon --endpoint http://localhost show s3://modos-demo/ex
# ex:
# '@type': MODO
# creation_date: '2024-02-19T00:00:00'
# description: Dummy modo for tests.
# has_assay:
```
:::
::::
:::{warning}
The __bucket name__ and the __endpoint url__ are specified separately. The __bucket name__ is part of the `object_path`: it comes first in the s3 path, followed by the `MODO`'s name (e.g. `s3://bucket_name/modo_name`). The __endpoint url__ is passed on its own. Only paths that follow the s3 scheme are treated as remote, regardless of whether `--endpoint` is specified.
:::
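The path convention from the warning above can be illustrated with the standard library. This is only a sketch of the rule (s3 scheme means remote, first component is the bucket), not the check that `modos` itself performs; `split_s3_path` is a hypothetical helper.

```python
from urllib.parse import urlparse


def split_s3_path(path):
    """Return (bucket, modo_name) for an s3:// path, or None for any other path."""
    parsed = urlparse(path)
    if parsed.scheme != "s3":
        # No s3 scheme: the path would be treated as local
        return None
    return parsed.netloc, parsed.path.lstrip("/")


remote = split_s3_path("s3://modos-demo/ex")
local = split_s3_path("data/ex")
print(remote, local)
```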
:::{note}
To avoid repetition, the endpoint and anon values can also be read from environment variables.
The syntax is then the same as for local objects, except that the `object_path` must be given as an s3 path:
```{code-block} console
export MODOS_ENDPOINT='http://localhost'
export MODOS_ANON=true
modos create s3://bucket/object1
modos show s3://bucket/object1
modos delete s3://bucket/object1
```
:::
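The same variables can be read from Python. The sketch below shows one way to fall back to `MODOS_ENDPOINT` and `MODOS_ANON` when no explicit value is given. The precedence (explicit argument first) and the accepted truthy strings are assumptions made for illustration, and `resolve_remote_options` is a hypothetical helper, not a `modos` function.

```python
import os


def resolve_remote_options(endpoint=None, anon=None, env=os.environ):
    """Fill in endpoint/anon from environment variables when not passed explicitly."""
    if endpoint is None:
        endpoint = env.get("MODOS_ENDPOINT")
    if anon is None:
        # Assumed convention: "1", "true" or "yes" (any case) enable anonymous access
        anon = env.get("MODOS_ANON", "").lower() in {"1", "true", "yes"}
    return endpoint, anon


# A plain dict stands in for the environment set up by the exports above
example_env = {"MODOS_ENDPOINT": "http://localhost", "MODOS_ANON": "true"}
options = resolve_remote_options(env=example_env)
print(options)
```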
(authenticated_bucket)=
## Use authenticated buckets
Most use-cases require authentication to access the S3 bucket.
This usually requires an access key and secret key.
MODOS can access these keys through the standard [AWS environment variables](https://docs.aws.amazon.com/cli/v1/userguide/cli-configure-envvars.html):
```{code-block} console
export AWS_ACCESS_KEY_ID=
export AWS_SECRET_ACCESS_KEY=
export MODOS_ENDPOINT='http://modos.example.org'
modos show s3://protected-bucket/example
```
However, it is strongly recommended to avoid entering secrets in the terminal and instead store them in encrypted .env files.
Tools like [sops](https://github.com/getsops/sops) make this easy:
```{code-block} console
# create public/secret key pair
age-keygen -o keypair.txt
# create encrypted env file; <age-public-key> is a placeholder for
# the public key printed by age-keygen
sops --age <age-public-key> .enc.env
# Values decrypted in memory and injected in the modos process
sops exec-env .enc.env 'modos show s3://protected-bucket/example'
SOPS_AGE_KEY_FILE=keypair.txt sops exec-env .enc.env \
'modos show s3://protected-bucket/example'
```
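When scripting against a protected bucket, it can help to fail early if the credentials were not injected. The following sketch only checks that the two standard AWS variables named above are present and non-empty; `missing_credentials` is a hypothetical helper, and the dictionary passed in stands in for a real environment.

```python
import os

REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_credentials(env=os.environ):
    """Return the names of required AWS variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example environment in which only the access key id is set
missing = missing_credentials({"AWS_ACCESS_KEY_ID": "example-id"})
print(missing)
```

An empty list means both variables are available to the process, for example after launching it through `sops exec-env`.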
(generate_remote)=
## Generate and modify a MODO at a remote object store
A `MODO` can be generated from scratch or from a file in the same way as locally, by specifying the remote endpoint's url or setting `MODOS_ENDPOINT`:
::::{tab-set}
:::{tab-item} python
:sync: python
```{code-block} python
from modos.api import MODO
from pathlib import Path
# yaml file with MODO specifications
config_ex = Path("path/to/ex.yaml")
# Create a modo remotely
modo = MODO.from_file(config_ex, "s3://modos-demo/ex", endpoint="http://localhost")
```
:::
:::{tab-item} cli
:sync: cli
```{code-block} console
# Create a modo from file remotely
modos --endpoint "http://localhost" create --from-file "path/to/ex.yaml" s3://modos-demo/ex3
```
:::
::::
:::{note}
As with `MODO` creation, any other modifying functionality of the `modos-api` (e.g. `modos add`, `modos remove`, `MODO.add_element()` or `MODO.remove_element()`) can be performed on remotely stored objects by specifying the __endpoint__ and giving the object path as an s3 path that includes the __bucket name__.
:::
## Download and upload a MODO
A `MODO` can directly be __downloaded__ from a remote endpoint.
::::{tab-set}
:::{tab-item} python
:sync: python
```{code-block} python
from modos.api import MODO
# Load MODO from remote storage
modo=MODO(path='s3://modos-demo/ex', endpoint='http://localhost')
# Download MODO to local path "data/ex"
modo.download("data/ex")
```
:::
:::{tab-item} cli
:sync: cli
```{code-block} console
# Download a remote modo from "modos-demo/ex" to local path "data/ex"
modos --endpoint http://localhost remote download --target data/ex s3://modos-demo/ex
```
:::
::::
A local `MODO` can be __uploaded__ to a remote endpoint.
::::{tab-set}
:::{tab-item} python
:sync: python
```{code-block} python
from modos.api import MODO
# Load MODO from local storage
modo=MODO(path='data/ex')
# Upload MODO to remote path "modos-demo/ex"
modo.upload("s3://modos-demo/ex", s3_endpoint='http://localhost')
```
:::
:::{tab-item} cli
:sync: cli
```{code-block} console
# Upload a local modo from "data/ex" to remote path "modos-demo/ex"
modos --endpoint http://localhost remote upload --target s3://modos-demo/ex data/ex
```
:::
::::