Paper
arXiv
GeoAI
GIS
RemoteSensing
EarthObservation
SpatialIntelligence
GeoLargeModel
GeoFoundationModel
Title
GAIA: A Foundation Model for Operational Atmospheric Dynamics
Ata Akbari Asanjan, Olivia Alexander, Tom Berg, Stephen Peng, Jad Makki, Clara Zhang, Matt Yang, Disha Shidham, Srija Chakraborty, William Bender, Cara Crawford, Arun Ravindran, Olivier Raiman, David Potere, David Bell
Published
2025/5/15 13:07:09
Source type
preprint
Language
en
Abstract

We introduce GAIA (Geospatial Artificial Intelligence for Atmospheres), a hybrid self-supervised geospatial foundation model that fuses Masked Autoencoders (MAE) with self-distillation with no labels (DINO) to generate semantically rich representations from global geostationary satellite imagery. Pre-trained on 15 years of globally-merged infrared observations (2001-2015), GAIA learns disentangled representations that capture atmospheric dynamics rather than trivial diurnal patterns, as evidenced by distributed principal component structure and temporal coherence analysis. We demonstrate robust reconstruction capabilities across varying data availability (30-95% masking), achieving superior gap-filling performance on real missing data patterns. When transferred to downstream tasks, GAIA consistently outperforms an MAE-only baseline: improving atmospheric river segmentation (F1: 0.58 vs 0.52), enhancing tropical cyclone detection (storm-level recall: 81% vs 75%, early detection: 29% vs 17%), and maintaining competitive precipitation estimation performance. Analysis reveals that GAIA's hybrid objectives encourage learning of spatially coherent, object-centric features distributed across multiple principal components rather than concentrated representations focused on reconstruction. This work demonstrates that combining complementary self-supervised objectives yields more transferable representations for diverse atmospheric modeling tasks. Model weights and code are available at: https://huggingface.co/bcg-usra-nasa-gaia/GAIA-v1.
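
The abstract describes a hybrid objective that combines MAE-style masked reconstruction with DINO-style self-distillation, but it does not give the loss formulation. The following is a minimal NumPy sketch of how such a combined loss could be computed, not the authors' implementation. The function names, the `weight` parameter, and the temperature values (taken from the common DINO defaults) are all assumptions made for illustration.

```python
import numpy as np


def random_patch_mask(num_patches, mask_ratio, rng):
    """Return a boolean mask where True marks a hidden patch."""
    num_masked = int(round(num_patches * mask_ratio))
    mask = np.zeros(num_patches, dtype=bool)
    mask[rng.choice(num_patches, size=num_masked, replace=False)] = True
    return mask


def mae_loss(pred, target, mask):
    """Mean squared error computed over the masked patches only."""
    per_patch = ((pred - target) ** 2).mean(axis=-1)
    return per_patch[mask].mean()


def log_softmax(logits, temperature):
    """Numerically stable log-softmax with a temperature."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))


def dino_loss(student_logits, teacher_logits, center,
              student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between the centered, sharpened teacher distribution
    and the student distribution (temperatures are assumed defaults)."""
    p_teacher = np.exp(log_softmax(teacher_logits - center, teacher_temp))
    log_p_student = log_softmax(student_logits, student_temp)
    return -(p_teacher * log_p_student).sum(axis=-1).mean()


def hybrid_loss(pred, target, mask, student_logits, teacher_logits,
                center, weight=1.0):
    """Hypothetical combination: reconstruction term plus a weighted
    distillation term. The real weighting is not stated in the abstract."""
    return (mae_loss(pred, target, mask)
            + weight * dino_loss(student_logits, teacher_logits, center))


# Toy usage with random arrays standing in for model outputs.
rng = np.random.default_rng(0)
num_patches, patch_dim, out_dim = 196, 256, 64
mask = random_patch_mask(num_patches, 0.75, rng)
target = rng.normal(size=(num_patches, patch_dim))
pred = target + 0.1 * rng.normal(size=(num_patches, patch_dim))
student = rng.normal(size=(8, out_dim))
teacher = rng.normal(size=(8, out_dim))
center = teacher.mean(axis=0, keepdims=True)
loss = hybrid_loss(pred, target, mask, student, teacher, center)
```

In a real training loop the teacher would typically be an exponential moving average of the student and would receive no gradient. This sketch only evaluates the loss value, and `mae_loss` assumes at least one patch is masked.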

Metadata
arXiv: 2505.18179v2
Source: arXiv
Type: Paper
Extraction status: raw
Keywords
cs.LG
cs.AI