UrbanComp Lab | Learning Resource Library

Location Intelligence and Urban Sensing Laboratory, China University of Geosciences (Wuhan)

Paper

arXiv

GeoAI

GIS

RemoteSensing

EarthObservation

Chinese Title

Cryo-Bench：面向冰冻圈应用的基础模型基准测试

English Title

Cryo-Bench: Benchmarking Foundation Models for Cryosphere Applications

Saurabh Kaushik, Lalit Maurya, Beth Tellman

Published

2026/3/2 16:05:56

Source Type

preprint

Language

en

Abstract

Geo-Foundation Models (GFMs) have been evaluated on diverse Earth observation tasks spanning multiple domains and have demonstrated strong potential for producing reliable maps even with sparse labels. However, benchmarking of GFMs for Cryosphere applications has remained limited, primarily due to the lack of suitable evaluation datasets. To address this gap, we introduce Cryo-Bench, a benchmark compiled to evaluate GFM performance across key Cryospheric components. Cryo-Bench includes debris-covered glaciers, glacial lakes, sea ice, and calving fronts, spanning multiple sensors and broad geographic regions. We evaluate 14 GFMs alongside UNet and ViT baselines to assess their advantages, limitations, and optimal usage strategies. With a frozen encoder, UNet achieves the highest average mIoU of 66.38, followed by TerraMind at 64.02, across the five evaluation datasets included in Cryo-Bench. In the few-shot setting (10% input data), GFMs such as DOFA and TerraMind outperform UNet, achieving mIoU scores of 59.53, 56.62, and 56.60, respectively, compared to UNet's 56.60. When fully fine-tuning GFMs, we observe inconsistent performance across datasets and models. However, tuning the learning rate along with fine-tuning substantially improves GFM performance. For example, evaluation on two representative datasets (GLID and CaFFe) shows an average relative improvement of 12.77%. Despite having minimal Cryosphere representation in their pretraining data, GFMs exhibit notable domain adaptation capabilities and produce meaningful results across tasks. Based on our findings, we recommend encoder fine-tuning with hyperparameter optimization to achieve the best possible performance, and frozen encoders when users need quick results without extensive experimentation. GitHub: https://github.com/Sk-2103/Cryo-Bench
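For readers unfamiliar with the metrics quoted in the abstract, the following is a minimal sketch of how a mean intersection-over-union (mIoU) score and a relative improvement are commonly computed. It is not taken from the Cryo-Bench code: the function names `mean_iou` and `relative_improvement` are hypothetical, the averaging convention (skipping classes absent from both labels and predictions) is an assumption, and the paper may aggregate differently. The function returns a fraction in [0, 1], whereas the abstract reports mIoU on a 0 to 100 scale.

```python
from typing import Sequence


def mean_iou(y_true: Sequence[int], y_pred: Sequence[int], num_classes: int) -> float:
    """Mean IoU over classes that appear in the labels or the predictions.

    Assumption: classes absent from both are skipped rather than scored as 0.
    """
    ious = []
    for c in range(num_classes):
        # Pixels where label and prediction both equal class c.
        inter = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        # Pixels where either the label or the prediction equals class c.
        union = sum(1 for t, p in zip(y_true, y_pred) if t == c or p == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


def relative_improvement(baseline: float, tuned: float) -> float:
    """Percentage gain of `tuned` over `baseline` (assumed definition)."""
    return 100.0 * (tuned - baseline) / baseline


if __name__ == "__main__":
    # Toy flattened segmentation masks with two classes (illustrative only).
    labels = [0, 0, 1, 1]
    preds = [0, 1, 1, 1]
    print(mean_iou(labels, preds, num_classes=2))
    print(relative_improvement(baseline=50.0, tuned=55.0))
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12; the numbers are illustrative and unrelated to the results reported above.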

Resource Links

Paper PDF: arxiv.org/pdf/2603.01576v2
Original source page: arxiv.org/abs/2603.01576v2

Metadata

arXiv: 2603.01576v2

Source: arXiv

Type: Paper

Extraction status: raw

Keywords

GeoAI

GIS

RemoteSensing

EarthObservation

cs.CV