Paper
arXiv
GeoAI
GIS
RemoteSensing
EarthObservation
SpatialIntelligence
GeoLargeModel
GeoFoundationModel
Title
PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models
Valerio Marsocci, Yuru Jia, Georges Le Bellier, David Kerekes, Liang Zeng, Sebastian Hafner, Sebastian Gerard, Eric Brune, Ritu Yadav, Ali Shibli, Heng Fang, Yifang Ban, Maarten Vergauwen, Nicolas Audebert, Andrea Nascetti
Published
2024-12-05 22:40:41
Source Type
preprint
Language
en
Abstract

Geospatial Foundation Models (GFMs) have emerged as powerful tools for extracting representations from Earth observation data, but their evaluation remains inconsistent and narrow. Existing works often evaluate on suboptimal downstream datasets and tasks that are often too easy or too narrow, limiting the usefulness of the evaluations to assess the real-world applicability of GFMs. Additionally, there is a distinct lack of diversity in current evaluation protocols, which fail to account for the multiplicity of image resolutions, sensor types, and temporalities; this further complicates the assessment of GFM performance. In particular, most existing benchmarks are geographically biased towards North America and Europe, questioning the global applicability of GFMs. To overcome these challenges, we introduce PANGAEA, a standardized evaluation protocol that covers a diverse set of datasets, tasks, resolutions, sensor modalities, and temporalities. It establishes a robust and widely applicable benchmark for GFMs. We evaluate the most popular openly available GFMs on this benchmark and analyze their performance across several domains. In particular, we compare these models to supervised baselines (e.g., UNet and vanilla ViT), and assess their effectiveness when faced with limited labeled data. Our findings highlight the limitations of GFMs under different scenarios, showing that they do not consistently outperform supervised models. PANGAEA is designed to be highly extensible, allowing for the seamless inclusion of new datasets, models, and tasks in future research. By releasing the evaluation code and benchmark, we aim to enable other researchers to replicate our experiments and build upon our work, fostering a more principled evaluation protocol for large pre-trained geospatial models. The code is available at https://github.com/VMarsocci/pangaea-bench.

Metadata
arXiv: 2412.04204v2
Source: arXiv
Type: Paper
Extraction status: raw
Keywords
GeoAI
GIS
RemoteSensing
EarthObservation
SpatialIntelligence
GeoLargeModel
GeoFoundationModel
cs.CV