Skip to content

SatMAE RGB (satmae)

Sentinel-2 RGB on-the-fly adapter for SatMAE (rshf.satmae.SatMAE), returning pooled vectors or ViT patch-token grids from forward_encoder(mask_ratio=0.0).

Quick Facts

Field Value
Model ID satmae
Aliases satmae_rgb
Family / Backbone SatMAE via rshf.satmae.SatMAE
Adapter type on-the-fly
Typical backend provider backend (gee)
Primary input S2 RGB (B4,B3,B2)
Default resolution 10m default provider fetch (sensor.scale_m)
Temporal mode range window in practice (normalized via shared helper)
Output modes pooled, grid
Extra side inputs none
Training alignment (adapter path) Medium-High (higher when wrapper model.transform(...) is available and used)

When To Use This Model

Good fit for

  • strong RGB-only SatMAE baseline on Sentinel-2
  • MAE-style token-grid analysis (OutputSpec.grid())
  • comparisons with other RGB ViT adapters (remoteclip, scalemae, wildsat)

Be careful when

  • you need multispectral semantics (this is RGB-only)
  • assuming grid is georeferenced pixels (it is a patch-token grid)
  • comparing runs across environments where rshf wrapper preprocessing behavior may differ

Input Contract (Current Adapter Path)

Spatial / temporal

  • Provider backend only (backend="gee" / provider-compatible backend)
  • TemporalSpec is normalized via shared helper; use TemporalSpec.range(...) for reproducibility
  • Temporal window is used for compositing/filtering, not single-scene identity selection

Sensor / channels

Default SensorSpec if omitted:

  • Collection: COPERNICUS/S2_SR_HARMONIZED
  • Bands: ("B4", "B3", "B2")
  • scale_m=10, cloudy_pct=30, composite="median"

input_chw contract:

  • must be CHW with exactly 3 bands in (B4,B3,B2) order
  • expected raw S2 SR values in 0..10000
  • adapter converts to [0,1], then uint8 RGB before model preprocessing

Preprocessing Pipeline (Current rs-embed Path)

  1. Fetch S2 RGB patch as uint8 RGB (provider path) or convert input_chw raw SR -> [0,1] -> uint8
  2. Resize to RS_EMBED_SATMAE_IMG (default 224)
  3. Model preprocessing inside adapter:
  4. preferred: model.transform(rgb_u8, image_size) if wrapper exposes it
  5. fallback: generic CLIP-style tensor preprocessing (rgb_u8_to_tensor_clipnorm)
  6. Run forward_encoder(mask_ratio=0.0) to get token sequence [N,D]
  7. Return:
  8. pooled: patch-token pooling (mean / max)
  9. grid: patch-token reshape to xarray.DataArray

Notes:

  • Current adapter path always targets token output (not pooled wrapper outputs).
  • CLS token is removed automatically when pooling / grid reshape helpers detect it.

Environment Variables / Tuning Knobs

Env var Default Effect
RS_EMBED_SATMAE_ID MVRL/satmae-vitlarge-fmow-pretrain-800 HF model ID used by SatMAE.from_pretrained(...)
RS_EMBED_SATMAE_IMG 224 Resize / preprocess image size
RS_EMBED_SATMAE_FETCH_WORKERS 8 Provider prefetch workers for batch APIs
RS_EMBED_SATMAE_BATCH_SIZE CPU:8, CUDA:32 Inference batch size for batch APIs

Output Semantics

OutputSpec.pooled()

  • Pools SatMAE patch tokens using OutputSpec.pooling
  • Metadata records pooling="patch_mean" or patch_max, plus cls_removed

OutputSpec.grid()

  • Reshapes SatMAE token sequence [N,D] to xarray.DataArray (D,H,W)
  • Grid is ViT patch-token layout, not georeferenced raster pixels

Examples

Minimal provider-backed example

from rs_embed import get_embedding, PointBuffer, TemporalSpec, OutputSpec

emb = get_embedding(
    "satmae",
    spatial=PointBuffer(lon=121.5, lat=31.2, buffer_m=2048),
    temporal=TemporalSpec.range("2022-06-01", "2022-09-01"),
    output=OutputSpec.pooled(),
    backend="gee",
)

Example model/image-size tuning (env-controlled)

# Example (shell):
# export RS_EMBED_SATMAE_ID=MVRL/satmae-vitlarge-fmow-pretrain-800
# export RS_EMBED_SATMAE_IMG=224

Common Failure Modes / Debugging

  • backend mismatch (satmae is provider-only)
  • wrong input_chw shape or band order (must be CHW, C=3, (B4,B3,B2))
  • missing rshf / incompatible rshf version (no SatMAE wrapper or forward_encoder)
  • grid requests failing if token output shape is unexpected

Recommended first checks:

  • inspect metadata tokens_shape and grid_hw
  • confirm RS_EMBED_SATMAE_ID and RS_EMBED_SATMAE_IMG
  • verify your custom input_chw is raw SR (not already 0..1 unless you intentionally converted it)

Reproducibility Notes

Keep fixed and record:

  • RS_EMBED_SATMAE_ID
  • image size (RS_EMBED_SATMAE_IMG)
  • temporal window and compositing settings
  • output mode (pooled / grid) and pooling choice
  • rshf version (wrapper preprocessing behavior can matter)

Source of Truth (Code Pointers)

  • Registration/catalog: src/rs_embed/embedders/catalog.py
  • Adapter implementation: src/rs_embed/embedders/onthefly_satmae.py
  • Shared RGB/token helpers: src/rs_embed/embedders/_vit_mae_utils.py