Galileo (galileo)¶
Multi-frame Sentinel-2 adapter that constructs Galileo encoder inputs (including
months) from a temporal window and returns pooled vectors or S2-group patch-token grids.
Quick Facts¶
| Field | Value |
|---|---|
| Model ID | galileo |
| Family / Backbone | Galileo Encoder from official single_file_galileo.py |
| Adapter type | on-the-fly |
| Typical backend | provider backend (gee via public API) |
| Primary input | S2 10-band time series (T,C,H,W) |
| Temporal mode | range in practice (adapter normalizes via shared helper) |
| Output modes | pooled, grid |
| Extra side inputs | required months (per-frame month tokens), plus Galileo masks/tensors built by adapter |
| Training alignment (adapter path) | Medium (depends on FRAMES, IMG, PATCH, normalization, NDVI inclusion) |
When To Use This Model¶
Good fit for¶
- temporal S2 sequence modeling with explicit month tokens
- comparisons against other multi-frame adapters (
anysat,agrifm) - feature-grid analysis over Galileo S2-related token groups
Be careful when¶
- comparing to single-composite models without matching temporal assumptions
- changing
image_size/patch_sizeinconsistently (image_sizemust divide bypatch_size) - changing month handling (
RS_EMBED_GALILEO_MONTH) without documenting it
Input Contract (Current Adapter Path)¶
Spatial / temporal¶
SpatialSpec:BBoxorPointBufferTemporalSpec: normalized to range via shared helper (rangerecommended for reproducibility)- Adapter builds
Tframes by splitting the temporal window into equal sub-windows and compositing each frame - Month side input:
- default: derived from frame-bin midpoints
- optional override:
RS_EMBED_GALILEO_MONTH(constant month for all frames)
Sensor / channels¶
Default SensorSpec if omitted:
- Collection:
COPERNICUS/S2_SR_HARMONIZED - Bands: 10-band S2 set used by adapter (
B2,B3,B4,B5,B6,B7,B8,B8A,B11,B12) scale_m=10,cloudy_pct=30,composite="median",fill_value=0.0
input_chw contract:
- accepts
CHW(C=10) orTCHW(C=10) through shared coercion helper CHWrepeats toT;TCHWpads/truncates to exactT- values are clipped to raw-SR range
0..10000
Preprocessing Pipeline (Current rs-embed Path)¶
- Fetch S2 10-band
raw_tchwor coerce userinput_chwtoTCHW - Resolve months sequence from frame bins (or
RS_EMBED_GALILEO_MONTH) - Resize frames to
RS_EMBED_GALILEO_IMG(default64) if needed - Normalize S2 series with
RS_EMBED_GALILEO_NORM(defaultunit_scale) - Build Galileo encoder tensors and masks:
s_t_x,sp_x,t_x,st_x- masks
s_t_m,sp_m,t_m,st_m months- Optional NDVI channel injection when
RS_EMBED_GALILEO_INCLUDE_NDVI=1and model space supports NDVI - Forward Galileo encoder (
patch_size=RS_EMBED_GALILEO_PATCH, default8) - Outputs:
- pooled vector from visible tokens (
encoder.average_tokens(...)) - grid from S2-related space-time groups, averaged over time/group dimensions
Constraint:
image_size % patch_size == 0is required
Environment Variables / Tuning Knobs¶
| Env var | Default | Effect |
|---|---|---|
RS_EMBED_GALILEO_MODEL_SIZE |
nano |
Galileo model size selector |
RS_EMBED_GALILEO_MODEL_PATH |
unset | Local model/checkpoint path override |
RS_EMBED_GALILEO_REPO_PATH |
unset | Local Galileo repo path |
RS_EMBED_GALILEO_REPO_URL |
upstream Galileo repo | Repo clone source |
RS_EMBED_GALILEO_REPO_CACHE |
~/.cache/rs_embed/galileo |
Repo cache root |
RS_EMBED_GALILEO_AUTO_DOWNLOAD_REPO |
1 |
Auto-clone repo if missing |
RS_EMBED_GALILEO_IMG |
64 |
Frame resize target |
RS_EMBED_GALILEO_PATCH |
8 |
Encoder patch size |
RS_EMBED_GALILEO_FRAMES |
8 |
Number of temporal frames T |
RS_EMBED_GALILEO_NORM |
unit_scale |
S2 normalization mode |
RS_EMBED_GALILEO_ADD_LN |
1 |
Add layer norm on encoder exit |
RS_EMBED_GALILEO_INCLUDE_NDVI |
1 |
Include NDVI channel when supported |
RS_EMBED_GALILEO_MONTH |
unset | Force a constant month (1..12) for all frames |
RS_EMBED_GALILEO_FETCH_WORKERS |
8 |
Prefetch workers for batch APIs |
Output Semantics¶
OutputSpec.pooled()¶
- Default pooled path uses Galileo pooled token output (
token_meansemantics in metadata) - If
OutputSpec.pooling="max", adapter max-pools the produced grid instead (grid_max)
OutputSpec.grid()¶
- Returns S2-related Galileo space-time-group patch tokens as
xarray.DataArray(D,H,W) - Grid is produced by selecting S2 groups and averaging over time + channel-group axes
- This is model token structure, not georeferenced raster pixels
Examples¶
Minimal example¶
from rs_embed import get_embedding, PointBuffer, TemporalSpec, OutputSpec
emb = get_embedding(
"galileo",
spatial=PointBuffer(lon=121.5, lat=31.2, buffer_m=2048),
temporal=TemporalSpec.range("2022-01-01", "2023-01-01"),
output=OutputSpec.pooled(),
backend="gee",
)
Example tuning temporal packaging (env-controlled)¶
# Example (shell):
# export RS_EMBED_GALILEO_FRAMES=8
# export RS_EMBED_GALILEO_IMG=64
# export RS_EMBED_GALILEO_PATCH=8
# export RS_EMBED_GALILEO_NORM=unit_scale
# export RS_EMBED_GALILEO_INCLUDE_NDVI=1
Common Failure Modes / Debugging¶
- backend is not provider-compatible
image_sizenot divisible bypatch_size- wrong
input_chwchannel count (must be 10) - unexpected effects from
RS_EMBED_GALILEO_MONTHforcing a constant month - missing local repo/model path when auto-download is disabled
Recommended first checks:
- verify temporal window/frame count and month sequence in metadata
- inspect raw inputs before changing normalization or model settings
Reproducibility Notes¶
Record and keep fixed:
- temporal window and
RS_EMBED_GALILEO_FRAMES RS_EMBED_GALILEO_IMG,RS_EMBED_GALILEO_PATCH- normalization mode / NDVI inclusion
- month override (if any)
- model source (repo/model path, model size)
Source of Truth (Code Pointers)¶
- Registration/catalog:
src/rs_embed/embedders/catalog.py - Adapter implementation:
src/rs_embed/embedders/onthefly_galileo.py - Shared TCHW coercion helper:
src/rs_embed/embedders/runtime_utils.py