Paper
arXiv
GeoAI
GIS
RemoteSensing
EarthObservation
LLM
GeoLargeModel
GeoFoundationModel
English Title
Geospatial foundation models for image analysis: evaluating and enhancing NASA-IBM Prithvi's domain adaptability
Chia-Yu Hsu, Wenwen Li, Sizhe Wang
Published
2024/8/31 23:51:23
Source type
preprint
Language
en
Abstract
English Original

Research on geospatial foundation models (GFMs) has become a trending topic in geospatial artificial intelligence (AI) because of their potential for high generalizability and domain adaptability, which reduces model training costs for individual researchers. Unlike building large language models such as ChatGPT, constructing visual foundation models for image analysis, particularly in remote sensing, faces significant challenges, such as formulating diverse vision tasks into a general problem framework. This paper evaluates the recently released NASA-IBM GFM Prithvi for its predictive performance on high-level image analysis tasks across multiple benchmark datasets. Prithvi was selected because it is one of the first open-source GFMs trained on time series of high-resolution remote sensing imagery. A series of experiments was designed to compare Prithvi's performance with that of other pre-trained task-specific AI models in geospatial image analysis. New strategies, including band adaptation, multi-scale feature generation, and fine-tuning techniques, are introduced and integrated into an image analysis pipeline to enhance Prithvi's domain adaptation capability and improve model performance. In-depth analyses reveal Prithvi's strengths and weaknesses, offering insights for both improving Prithvi and developing future visual foundation models for geospatial tasks.

Metadata
arXiv: 2409.00489v1
Source: arXiv
Type: Paper
Extraction status: raw
Keywords
GeoAI
GIS
RemoteSensing
EarthObservation
LLM
GeoLargeModel
GeoFoundationModel
cs.CV
cs.AI