The increasing frequency and severity of climate-related disasters have intensified the need for real-time monitoring, early warning, and informed decision-making. Earth Observation (EO), powered by satellite data and Machine Learning (ML), offers powerful tools to meet these challenges. Foundation Models (FMs) have transformed ML for EO by enabling general-purpose pretraining on large-scale remote sensing datasets. However, most existing models rely on satellite imagery with high spatial resolution but low revisit rates, which limits their suitability for fast-evolving phenomena and time-critical emergency response. In this work, we present HighFM, a first step toward an FM for high-temporal-resolution, multispectral EO data. Leveraging over 2 TB of SEVIRI imagery from the Meteosat Second Generation (MSG) platform, we adapt the SatMAE masked autoencoding framework to learn robust spatiotemporal representations. To support real-time monitoring, we extend the original architecture with fine-grained temporal encodings that capture short-term variability. The pretrained models are then fine-tuned on cloud masking and active fire detection tasks. We benchmark our SEVIRI-pretrained Vision Transformers against traditional baselines and recent geospatial FMs, and observe consistent gains in both balanced accuracy and Intersection over Union (IoU). Our results highlight the potential of temporally dense geostationary data for real-time EO, offering a scalable path toward foundation models for disaster detection and tracking.
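For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how balanced accuracy and IoU can be computed for a binary segmentation mask such as a cloud or fire mask. It uses the standard textbook definitions and is not taken from the HighFM evaluation code; the function names, the toy 4x4 masks, and the handling of empty classes are illustrative assumptions.

```python
import numpy as np


def binary_confusion(y_true, y_pred):
    """Return (tp, tn, fp, fn) pixel counts for two binary masks."""
    y_true = np.asarray(y_true, dtype=bool).ravel()
    y_pred = np.asarray(y_pred, dtype=bool).ravel()
    tp = int(np.sum(y_true & y_pred))
    tn = int(np.sum(~y_true & ~y_pred))
    fp = int(np.sum(~y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))
    return tp, tn, fp, fn


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls (positive and negative class)."""
    tp, tn, fp, fn = binary_confusion(y_true, y_pred)
    # Assumption: a class with no ground-truth pixels contributes a recall of 0.
    recall_pos = tp / (tp + fn) if (tp + fn) else 0.0
    recall_neg = tn / (tn + fp) if (tn + fp) else 0.0
    return 0.5 * (recall_pos + recall_neg)


def iou(y_true, y_pred):
    """Intersection over Union of the positive class."""
    tp, _, fp, fn = binary_confusion(y_true, y_pred)
    union = tp + fp + fn
    # Assumption: two empty masks are treated as a perfect match.
    return tp / union if union else 1.0


# Hypothetical 4x4 example: 4 positive pixels out of 16.
truth = np.array([[0, 0, 0, 0],
                  [0, 0, 0, 0],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]])
pred = np.array([[0, 0, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 1, 1]])

print(f"balanced accuracy = {balanced_accuracy(truth, pred):.3f}")
print(f"IoU               = {iou(truth, pred):.3f}")
```

Both metrics are less sensitive to class imbalance than plain pixel accuracy, which matters when the positive class, such as active fire pixels, covers only a small fraction of a scene. In the toy example, plain accuracy is 14/16 = 0.875, while balanced accuracy is about 0.833 and IoU is 0.6.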