SatVision-TOA (satvision_toa)¶
Provider-backed SatVision-TOA adapter for 14-channel TOA inputs (default MODIS band order), with channel-aware reflectance/emissive normalization and token/grid outputs.
Quick Facts¶
| Field | Value |
|---|---|
| Model ID | satvision_toa |
| Family / Backbone | SatVision-TOA checkpoint (HF/local checkpoint loader) |
| Adapter type | on-the-fly |
| Typical backend | provider backend (gee) |
| Primary input | 14-channel TOA CHW (default MODIS/061/MOD021KM band order) |
| Temporal mode | range in practice (normalized via shared helper) |
| Output modes | pooled, grid |
| Extra side inputs | none (but channel calibration settings matter) |
| Training alignment (adapter path) | High only when channel order + calibration match checkpoint expectations |
When To Use This Model¶
Good fit for¶
- TOA-based experiments where you want SatVision checkpoints instead of S2 SR-specific encoders
- MODIS-style provider workflows with explicit control over channel normalization/calibration
- testing pooled vs patch-token grid outputs from the same model path
Be careful when¶
- changing
SensorSpec.bandsorder without updatingRS_EMBED_SATVISION_TOA_*channel settings - mixing raw TOA and unit-scaled inputs without logging the effective normalization mode
- assuming arbitrary 14-channel inputs will work if they do not match checkpoint training semantics
Input Contract (Current Adapter Path)¶
Spatial / temporal¶
SpatialSpec:BBoxorPointBufferTemporalSpec: normalized to a range via shared helper (TemporalSpec.range(...)recommended)- Provider backend only (
backend="gee"/ other provider-compatible backend)
Sensor / channels¶
Default SensorSpec if omitted:
- Collection:
MODIS/061/MOD021KM - Bands (strict default order):
1,2,3,26,6,20,7,27,28,29,31,32,33,34 scale_m=1000,cloudy_pct=100,composite="mosaic",fill_value=0.0
input_chw contract:
- must be
CHW Cmust equalRS_EMBED_SATVISION_TOA_IN_CHANS(default14)- adapter checks
len(sensor.bands) == in_chans - values may be raw TOA-like or already unit-scaled depending on normalization mode
Important:
- The adapter does not infer semantic channel order from values.
sensor.bandsorder must match what the checkpoint expects.
Preprocessing Pipeline (Current rs-embed Path)¶
- Resolve runtime settings (checkpoint/model ID,
in_chans, normalization, calibration arrays) - Fetch provider patch (
CHW) or useinput_chw - If provider metadata says input is already unit-scaled, effective norm mode is forced to
unit - Normalize with
RS_EMBED_SATVISION_TOA_NORM: auto: use[0,1]directly if values look unit-scaled, else fall back to raw scalingraw: channel-wise scaling- reflectance channels: divide by
RS_EMBED_SATVISION_TOA_REF_DIV(default100) - emissive channels: min-max normalize using
RS_EMBED_SATVISION_TOA_EMISSIVE_MINS/MAXS
- reflectance channels: divide by
unit: clip to[0,1]- Resize to
RS_EMBED_SATVISION_TOA_IMG(default128) - Forward model and decode output tensor
- Return:
- pooled vector (from token pooling or model-pooled output)
- grid via token reshape if output is token sequence
[N,D]
Environment Variables / Tuning Knobs¶
Model / weights¶
| Env var | Default | Effect |
|---|---|---|
RS_EMBED_SATVISION_TOA_ID |
SatVision TOA HF model ID | HF model identifier |
RS_EMBED_SATVISION_TOA_CKPT |
unset | Local checkpoint path override |
RS_EMBED_SATVISION_TOA_AUTO_DOWNLOAD |
1 |
Allow HF download when local checkpoint not set |
RS_EMBED_SATVISION_TOA_IMG |
128 |
Resize target image size |
RS_EMBED_SATVISION_TOA_IN_CHANS |
14 |
Expected channel count |
RS_EMBED_SATVISION_TOA_BATCH_SIZE |
CPU:2, CUDA:8 |
Inference batch size (batch APIs) |
RS_EMBED_SATVISION_TOA_FETCH_WORKERS |
8 |
Provider prefetch workers (batch APIs) |
Normalization / calibration¶
| Env var | Default | Effect |
|---|---|---|
RS_EMBED_SATVISION_TOA_NORM |
auto |
auto, raw, or unit |
RS_EMBED_SATVISION_TOA_REFLECTANCE_IDXS |
adapter defaults | Reflectance channel indices |
RS_EMBED_SATVISION_TOA_EMISSIVE_IDXS |
adapter defaults | Emissive channel indices |
RS_EMBED_SATVISION_TOA_REF_DIV |
100 |
Reflectance divisor |
RS_EMBED_SATVISION_TOA_EMISSIVE_MINS |
adapter defaults | Emissive min calibration values |
RS_EMBED_SATVISION_TOA_EMISSIVE_MAXS |
adapter defaults | Emissive max calibration values |
Default sensor overrides (if sensor omitted)¶
| Env var | Default | Effect |
|---|---|---|
RS_EMBED_SATVISION_TOA_COLLECTION |
MODIS/061/MOD021KM |
Default provider collection |
RS_EMBED_SATVISION_TOA_BANDS |
default 14-band MODIS order | Override default band list |
RS_EMBED_SATVISION_TOA_SCALE_M |
1000 |
Default fetch scale |
RS_EMBED_SATVISION_TOA_CLOUDY_PCT |
100 |
Default cloud filter |
RS_EMBED_SATVISION_TOA_FILL |
0 |
Default fill value |
RS_EMBED_SATVISION_TOA_COMPOSITE |
mosaic |
Default composite method |
Output Semantics¶
OutputSpec.pooled()¶
- If model output is token sequence
[N,D], adapter pools patch tokens (mean/max) - If model output is already
(D,), adapter returns it directly asmodel_pooled - Metadata records pooling behavior and whether CLS token was removed
OutputSpec.grid()¶
- Requires token sequence output
[N,D] - Reshapes patch tokens to
xarray.DataArray(D,H,W) - Grid is model patch-token layout, not georeferenced raster pixels
Examples¶
Minimal provider-backed example¶
from rs_embed import get_embedding, PointBuffer, TemporalSpec, OutputSpec
emb = get_embedding(
"satvision_toa",
spatial=PointBuffer(lon=121.5, lat=31.2, buffer_m=5000),
temporal=TemporalSpec.range("2022-07-01", "2022-07-31"),
output=OutputSpec.pooled(),
backend="gee",
)
Example normalization tuning (env-controlled)¶
# Example (shell):
# export RS_EMBED_SATVISION_TOA_NORM=raw
# export RS_EMBED_SATVISION_TOA_IMG=128
# export RS_EMBED_SATVISION_TOA_REF_DIV=100
# export RS_EMBED_SATVISION_TOA_IN_CHANS=14
Common Failure Modes / Debugging¶
- backend is not provider-compatible (
satvision_toais provider-only) sensor.bandscount does not matchRS_EMBED_SATVISION_TOA_IN_CHANS- calibration list lengths mismatch (
EMISSIVE_IDXSvsEMISSIVE_MINS/MAXS) - wrong channel order for the chosen checkpoint (results look unstable even if shapes pass)
- grid requested but model output is not a token sequence
- HF download/auth issues when using gated/private checkpoints
Recommended first checks:
- inspect metadata for
norm_modeandnorm_mode_effective - log
sensor.collection,sensor.bands, and calibration env values used - test
OutputSpec.pooled()first beforegrid
Reproducibility Notes¶
Keep fixed and record:
- checkpoint source (
RS_EMBED_SATVISION_TOA_IDor localCKPT) - exact band order and
in_chans - normalization mode plus calibration arrays/divisor
- temporal window + compositing settings
- output mode (
pooled/grid) and pooling choice
Source of Truth (Code Pointers)¶
- Registration/catalog:
src/rs_embed/embedders/catalog.py - Adapter implementation:
src/rs_embed/embedders/onthefly_satvision_toa.py - Token/grid helpers:
src/rs_embed/embedders/_vit_mae_utils.py