Paper
arXiv
GeoAI
GIS
RemoteSensing
EarthObservation
GeoLargeModel
GeoFoundationModel
English Title
GAIA: A Foundation Model for Operational Atmospheric Dynamics
Ata Akbari Asanjan, Olivia Alexander, Tom Berg, Stephen Peng, Jad Makki, Clara Zhang, Matt Yang, Disha Shidham, Srija Chakraborty, William Bender, Cara Crawford, Arun Ravindran, Olivier Raiman, David Potere, David Bell
Published
2025/5/15 13:07:09
Source Type
preprint
Language
en
Abstract
English Original

We introduce GAIA (Geospatial Artificial Intelligence for Atmospheres), a hybrid self-supervised geospatial foundation model that fuses Masked Autoencoders (MAE) with self-distillation with no labels (DINO) to generate semantically rich representations from global geostationary satellite imagery. Pre-trained on 15 years of globally-merged infrared observations (2001-2015), GAIA learns disentangled representations that capture atmospheric dynamics rather than trivial diurnal patterns, as evidenced by distributed principal component structure and temporal coherence analysis. We demonstrate robust reconstruction capabilities across varying data availability (30-95% masking), achieving superior gap-filling performance on real missing data patterns. When transferred to downstream tasks, GAIA consistently outperforms an MAE-only baseline: improving atmospheric river segmentation (F1: 0.58 vs 0.52), enhancing tropical cyclone detection (storm-level recall: 81% vs 75%, early detection: 29% vs 17%), and maintaining competitive precipitation estimation performance. Analysis reveals that GAIA's hybrid objectives encourage learning of spatially coherent, object-centric features distributed across multiple principal components rather than concentrated representations focused on reconstruction. This work demonstrates that combining complementary self-supervised objectives yields more transferable representations for diverse atmospheric modeling tasks. Model weights and code are available at: https://huggingface.co/bcg-usra-nasa-gaia/GAIA-v1.
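The abstract describes a hybrid objective that pairs MAE-style masked reconstruction with DINO-style self-distillation. The following is a minimal pure-Python sketch of how two such losses could be combined, not the authors' implementation. Every name here (`random_patch_mask`, `masked_mse`, `dino_cross_entropy`, `hybrid_loss`), the temperatures, the centering vector, and the `distill_weight` knob are assumptions made for illustration, and each patch is reduced to a single scalar to keep the example short.

```python
import math
import random


def random_patch_mask(num_patches, mask_ratio, rng):
    """Return a list of booleans; True marks a patch hidden from the encoder."""
    num_masked = int(round(num_patches * mask_ratio))
    masked = set(rng.sample(range(num_patches), num_masked))
    return [i in masked for i in range(num_patches)]


def masked_mse(target, prediction, mask):
    """MAE-style reconstruction loss: mean squared error over masked patches only."""
    errors = [(t - p) ** 2 for t, p, m in zip(target, prediction, mask) if m]
    return sum(errors) / len(errors) if errors else 0.0


def softmax(logits, temperature):
    """Numerically stable softmax with a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def dino_cross_entropy(student_logits, teacher_logits, center,
                       student_temp=0.1, teacher_temp=0.04):
    """DINO-style loss: cross-entropy from the centered, sharpened teacher
    distribution to the student distribution. Temperatures are illustrative."""
    teacher = softmax([t - c for t, c in zip(teacher_logits, center)], teacher_temp)
    student = softmax(student_logits, student_temp)
    # Small epsilon guards against log(0) when a student probability underflows.
    return -sum(t * math.log(s + 1e-12) for t, s in zip(teacher, student))


def hybrid_loss(target, prediction, mask, student_logits, teacher_logits,
                center, distill_weight=1.0):
    """Weighted sum of both objectives; distill_weight is a hypothetical knob."""
    reconstruction = masked_mse(target, prediction, mask)
    distillation = dino_cross_entropy(student_logits, teacher_logits, center)
    return reconstruction + distill_weight * distillation


if __name__ == "__main__":
    rng = random.Random(0)
    # 95% is the upper end of the masking range quoted in the abstract.
    mask = random_patch_mask(num_patches=100, mask_ratio=0.95, rng=rng)
    print(sum(mask), "of", len(mask), "patches masked")
```

In a real training loop the teacher would typically be an exponential moving average of the student, and both losses would be computed on tensors rather than Python lists; those details are omitted here.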

Metadata
arXiv: 2505.18179v3
Source: arXiv
Type: Paper
Extraction status: raw
Keywords
GeoAI
GIS
RemoteSensing
EarthObservation
GeoLargeModel
GeoFoundationModel
cs.LG
cs.AI