API: Export¶

This page covers dataset export APIs.

export_batch (primary / recommended)¶

Signature¶

export_batch(
    *,
    spatials: List[SpatialSpec],
    temporal: Optional[TemporalSpec],
    models: List[str | ExportModelRequest],
    target: ExportTarget,
    config: ExportConfig = ExportConfig(),
    backend: str = "auto",
    device: str = "auto",
    output: OutputSpec = OutputSpec.pooled(),
    sensor: Optional[SensorSpec] = None,
    fetch: Optional[FetchSpec] = None,
    modality: Optional[str] = None,
    per_model_sensors: Optional[Dict[str, SensorSpec]] = None,
    per_model_fetches: Optional[Dict[str, FetchSpec]] = None,
    per_model_modalities: Optional[Dict[str, str]] = None,
) -> Any

Use export_batch(...) when you want to export one or many ROIs, one or many models, and the corresponding inputs, embeddings, and manifests together.

Although the public function still exposes many keyword arguments, the implementation first normalizes requests into ExportTarget, ExportConfig, and ExportModelRequest entries.

That is the real shape of the API internally, and it is the shape new code should follow.

Mental Model¶

Think about export_batch(...) as 4 decisions:

What to export: spatials, temporal, models
Where to write: target=ExportTarget(...)
How to run: config=ExportConfig(...)
Any shared or per-model settings: backend, device, output, fetch, sensor, modality, per_model_*

For new code, prefer target=ExportTarget(...) and config=ExportConfig(...), then use models=[..., ExportModelRequest(...)] when one model really needs special overrides.

Default Pattern¶

If you are not sure what to pass, this is the default pattern:

from rs_embed import export_batch, ExportConfig, ExportTarget, PointBuffer, TemporalSpec

export_batch(
    spatials=[PointBuffer(121.5, 31.2, 2048)],
    temporal=TemporalSpec.range("2022-06-01", "2022-09-01"),
    models=["remoteclip"],
    target=ExportTarget.combined("exports/run"),
    config=ExportConfig(),
)

That gives you one combined export artifact in the default .npz format, including inputs, embeddings, and a manifest, all with the default runtime behavior.

Parameters, Grouped by Job¶

1. Required dataset definition¶

Parameter	Meaning
`spatials`	Non-empty list of `BBox` or `PointBuffer`.
`temporal`	`TemporalSpec` or `None`. The parameter is optional at the API level, but some models or data sources still require it.
`models`	Non-empty list of model IDs or `ExportModelRequest(...)`.

2. Output location and layout¶

Prefer target=ExportTarget(...) in new code.

from rs_embed import ExportTarget

ExportTarget.per_item("exports", names=["p1", "p2"])
ExportTarget.combined("exports/run")

ExportTarget.per_item(...) writes one file per ROI, while ExportTarget.combined(...) writes one merged file for the whole run.

For combined, the output extension is normalized from config.format if missing. For per_item, names=[...] must have the same length as spatials.

3. Shared model/runtime settings¶

These usually apply to all models in the call:

Setting	Typical use
`backend`	Keep `backend="auto"` unless you need a specific provider such as `"gee"`.
`device`	`"auto"` is the normal choice.
`output`	Usually `OutputSpec.pooled()`.
`fetch`	Shared `FetchSpec` for resolution or compositing overrides.
`sensor`	Shared `SensorSpec` for advanced on-the-fly source overrides.
`modality`	Shared modality override for models that expose multiple public branches.

Use per-model overrides only when one model needs different settings.

Rule Of Thumb¶

Use fetch=FetchSpec(...) for shared resolution or compositing overrides. Use sensor=SensorSpec(...) only when a job really needs custom collection or bands. fetch and sensor cannot be passed together.

4. Legacy dict-style compatibility¶

The structured per-model path is ExportModelRequest(...) or ExportModelRequest.configure(...) inside models=[...].

per_model_sensors, per_model_fetches, and per_model_modalities remain accepted as legacy dict-style per-model overrides keyed by model name. They are kept for compatibility with older calling patterns.

If both are provided, inline values on ExportModelRequest(...) take precedence over per_model_*, and per_model_* takes precedence over the corresponding global sensor / fetch / modality.

5. ExportConfig: the knobs that matter most¶

config=ExportConfig(...) is the recommended place for runtime settings.

The most important ones are:

Option	Meaning
`format`	`"npz"` or `"netcdf"`.
`save_inputs`	Save model-ready input patches.
`save_embeddings`	Save embedding arrays.
`save_manifest`	Save JSON manifest metadata.
`resume`	Skip items already exported.
`input_prep`	Large-ROI policy, usually `"resize"` or `"tile"`.

You can usually ignore the rest until you need performance tuning or failure recovery.

config is optional; the default is ExportConfig().

6. Advanced runtime controls¶

These matter mainly for larger runs. chunk_size controls how many ROIs are processed at a time, infer_batch_size controls model batch size when batching is supported, and num_workers controls provider fetch concurrency. The remaining knobs such as continue_on_error, retry settings, asynchronous writing, progress display, and fail_on_bad_input mostly matter when you are tuning larger or less reliable runs.

If you do not know what these mean, leave them at defaults.

Example:

from rs_embed import ExportConfig

config = ExportConfig(
    format="npz",
    save_inputs=True,
    save_embeddings=True,
    save_manifest=True,
    resume=True,
    input_prep="resize",
)

Per-Model Overrides¶

When To Use Them¶

Most runs should pass plain model IDs:

models=["remoteclip", "prithvi"]

Use ExportModelRequest(...) when a specific model needs its own fetch, sensor, or modality:

from rs_embed import ExportModelRequest, FetchSpec

models=[
    "remoteclip",
    ExportModelRequest("prithvi", fetch=FetchSpec(scale_m=30)),
]

ExportModelRequest.configure(...) also accepts model-specific settings as keyword arguments, for example:

from rs_embed import ExportModelRequest

models=[
    "remoteclip",
    ExportModelRequest.configure("thor", variant="large"),
]

Typical use cases are when one model needs its own FetchSpec, modality="s1", a different SensorSpec, a different variant such as variant="large", or some override of the shared export settings.

This also matches the implementation path: string model IDs are first converted into ExportModelRequest(name=...), then resolved.

The same class of overrides can also be expressed with per_model_sensors={...}, per_model_fetches={...}, and per_model_modalities={...} keyed by model name. Those dict-style parameters remain accepted mainly for compatibility with older calling patterns.

Rules¶

export_batch(...) accepts a global modality, one model can override it through ExportModelRequest(...), and unsupported modality choices raise ModelError.

export_batch(...) does not have one global model-settings parameter shared across all models. Pass per-model settings through ExportModelRequest.configure("model", variant=...). Unsupported keyword arguments raise ModelError.

For sensor / fetch / modality, the effective precedence is: inline ExportModelRequest(...) value first, then per_model_*, then the corresponding global argument.

What Gets Returned¶

Return Shape¶

ExportTarget.per_item(...) returns List[dict], while ExportTarget.combined(...) returns dict.

In both cases, the return value is manifest-style metadata describing what was exported.

Common Patterns¶

One combined export file¶

from rs_embed import export_batch, ExportConfig, ExportTarget, PointBuffer, TemporalSpec

export_batch(
    spatials=[PointBuffer(121.5, 31.2, 2048)],
    temporal=TemporalSpec.range("2022-06-01", "2022-09-01"),
    models=["remoteclip"],
    target=ExportTarget.combined("exports/combined_run"),
    config=ExportConfig(save_inputs=True, resume=True),
)

One file per ROI¶

from rs_embed import export_batch, ExportConfig, ExportTarget, PointBuffer, TemporalSpec

spatials = [
    PointBuffer(121.5, 31.2, 2048),
    PointBuffer(120.5, 30.2, 2048),
]

export_batch(
    spatials=spatials,
    temporal=TemporalSpec.range("2022-06-01", "2022-09-01"),
    models=["remoteclip", "prithvi"],
    target=ExportTarget.per_item("exports", names=["p1", "p2"]),
    config=ExportConfig(
        input_prep="tile",
        chunk_size=32,
        num_workers=8,
    ),
)

One model needs its own modality¶

from rs_embed import (
    export_batch,
    ExportModelRequest,
    ExportTarget,
    PointBuffer,
    TemporalSpec,
)

export_batch(
    spatials=[PointBuffer(121.5, 31.2, 2048)],
    temporal=TemporalSpec.range("2022-06-01", "2022-09-01"),
    models=[ExportModelRequest("terrafm", modality="s1")],
    target=ExportTarget.combined("exports/terrafm_s1_run"),
    backend="gee",
)

Shared fetch override across models¶

from rs_embed import FetchSpec, export_batch, ExportTarget, PointBuffer, TemporalSpec

export_batch(
    spatials=[PointBuffer(121.5, 31.2, 2048)],
    temporal=TemporalSpec.range("2022-06-01", "2022-09-01"),
    models=["remoteclip", "prithvi"],
    fetch=FetchSpec(scale_m=10),
    target=ExportTarget.combined("exports/shared_sampling"),
)

One model needs its own variant¶

from rs_embed import (
    export_batch,
    ExportModelRequest,
    ExportTarget,
    PointBuffer,
    TemporalSpec,
)

export_batch(
    spatials=[PointBuffer(121.5, 31.2, 2048)],
    temporal=TemporalSpec.range("2022-06-01", "2022-09-01"),
    models=[ExportModelRequest.configure("thor", variant="large")],
    target=ExportTarget.combined("exports/thor_large_run"),
    backend="gee",
)

Runtime Behavior You Usually Need to Know¶

Inference scheduling¶

Model scheduling is serial, so one model runs at a time. Batch inference is used when the embedder supports it, and GPU or accelerator backends benefit the most from that path.

Per-item vs combined mode¶

per_item mode writes one artifact per ROI, while combined mode writes one merged artifact for the run. Combined mode also keeps the older behavior of preferring batch model APIs when possible.

Input reuse¶

If provider-backed export is used and both save_inputs=True and save_embeddings=True, rs-embed reuses the fetched input patch for both writing and embedding inference instead of downloading it twice.

Simple rule

Start with ExportTarget.combined(...) + ExportConfig(). Add ExportModelRequest.configure(...) only for the few models that need per-model sensor, fetch, modality, or variant overrides.