Built by IBM and ESA Φ-lab, trained at the Jülich Supercomputing Centre with funding via the FAST-EO project.
TerraMind has been accepted at ICCV 2025!
Meet TerraMind, the first any-to-any generative, multimodal foundation model for Earth observation. TerraMind reaches new levels of geospatial understanding, introduces new capabilities such as Thinking-in-Modalities (TiM), and significantly outperforms existing models across community-standard benchmarks.
PANGAEA benchmark results for TerraMind and the top five EO FMs based on average rank. The mIoU is visualized on a min-max normalized scale, with the best performance displayed in parentheses.
For one-shot classification, a labeled support set and unlabeled query data are mapped into an embedding space using the TerraMind encoder. Each query is then classified by assigning the label of the nearest support sample in the embedding space.
1-shot 5-way classification results using nearest neighbors, measured in accuracy and averaged over 200 runs. TerraMind outperforms baselines from both CV and EO, suggesting a well-structured latent space.
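The nearest-neighbor protocol described above fits in a few lines. The sketch below assumes the support and query embeddings have already been produced by the TerraMind encoder and stored as NumPy arrays; `one_shot_nearest_neighbor` is a hypothetical helper for illustration, not part of any released API.

```python
import numpy as np

def one_shot_nearest_neighbor(support_emb, support_labels, query_emb):
    """Assign each query the label of its nearest support embedding.

    support_emb:    (N, D) encoder embeddings of the labeled support set
    support_labels: (N,) class labels of the support set
    query_emb:      (M, D) encoder embeddings of the unlabeled queries
    """
    # Pairwise Euclidean distances between queries and support samples: (M, N)
    dists = np.linalg.norm(query_emb[:, None, :] - support_emb[None, :, :], axis=-1)
    # Each query gets the label of the closest support sample
    return support_labels[np.argmin(dists, axis=1)]

# Toy usage with random embeddings: 5 classes, 1 support sample each, 10 queries
rng = np.random.default_rng(0)
support = rng.normal(size=(5, 768))
labels = np.arange(5)
queries = support[rng.integers(0, 5, size=10)] + 0.01 * rng.normal(size=(10, 768))
print(one_shot_nearest_neighbor(support, labels, queries))
```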
Large tile generation of Sentinel-1 RTC data using a Sentinel-2 L2A input from Singapore. Many features such as ships and airport runways are clearly visible in the S-1 RTC generations, while clouds are completely ignored.
Large tile generation of a Sentinel-1 GRD radar map using a Sentinel-2 L2A input from Santiago de Compostela.
Large tile generation of a land-use map using a Sentinel-2 L2A input from a bay near Santiago de Compostela.
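The large-tile generations above are assembled from fixed-size patches. The exact generation pipeline is not shown here; the sketch below illustrates one way such a tiled workflow could be stitched together, with `generate_patch` as a hypothetical stand-in for a TerraMind any-to-any generation call and simple averaging over overlapping patches.

```python
import numpy as np

def generate_large_tile(s2_scene, generate_patch, patch=224, overlap=32):
    """Translate a large Sentinel-2 L2A scene patch by patch and stitch the results.

    s2_scene:       (C, H, W) array of Sentinel-2 L2A bands
    generate_patch: callable mapping a (C, patch, patch) input to a generated
                    (C_out, patch, patch) target-modality patch (e.g. S-1 RTC)

    Assumes the scene is at least `patch` pixels in each spatial dimension;
    edge pixels not covered by the sliding window are left at zero.
    """
    _, h, w = s2_scene.shape
    step = patch - overlap
    out, weight = None, np.zeros((h, w), dtype=np.float32)
    for y in range(0, h - patch + 1, step):
        for x in range(0, w - patch + 1, step):
            gen = generate_patch(s2_scene[:, y:y + patch, x:x + patch])
            if out is None:  # allocate once the output channel count is known
                out = np.zeros((gen.shape[0], h, w), dtype=np.float32)
            out[:, y:y + patch, x:x + patch] += gen
            weight[y:y + patch, x:x + patch] += 1.0
    return out / np.maximum(weight, 1.0)  # average overlapping predictions
```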
A bi-monthly award spotlighting the boldest, most imaginative ways to push TerraMind beyond "just another fine-tune". Whether you're prototyping a new multimodal workflow, exploring Thinking-in-Modalities, or inventing a never-before-seen geospatial application, we want you to share it with everyone.