Title
Foundation Models for Astrobiology: Paper I -- Workshop and Overview
Ryan Felton, Caleb Scharf, Stuart Bartlett, Nathalie A. Cabrol, Victoria Da Poian, Diana Gentry, Jian Gong, Adrienne Hoarfrost, Manil Maskey, Floyd Nichols, Conor A. Nixon, Tejas Panambur, Joseph Pasterski, Anton S. Petrov, Anirudh Prabhu, Brenda Thomson, Hamed Valizadegan, Kimberley Warren-Rhodes, David Wettergreen, Michael L. Wong, Anastasia Yanchilina
Published
2025/10/9 04:01:22
Source type
preprint
Language
en
Abstract

Advances in machine learning over the past decade have resulted in a proliferation of algorithmic applications for encoding, characterizing, and acting on complex data that may contain many high-dimensional features. Recently, the emergence of deep-learning models trained across very large datasets has created a new paradigm for machine learning in the form of Foundation Models. Foundation Models are programs with an extensive number of parameters, trained on very large and broad datasets. Once built, these powerful and flexible models can be used in less resource-intensive ways to build many different downstream applications that integrate previously disparate, multimodal data. These applications can be developed rapidly and with a much lower demand for machine learning expertise. The necessary infrastructure and the models themselves are already being established within agencies such as NASA and ESA. At NASA, this work spans several divisions of the Science Mission Directorate and includes the NASA Goddard and INDUS Large Language Models and the Prithvi Geospatial Foundation Model. ESA initiatives to bring Foundation Models to Earth observations have led to the development of TerraMind. In February 2025, the NASA Ames Research Center and the SETI Institute held a workshop to investigate the potential of Foundation Models for astrobiological research and to determine what steps would be needed to build and use such a model or models. This paper shares the findings and recommendations of that workshop and describes clear near-term and future opportunities in the development of a Foundation Model (or Models) for astrobiology applications. These applications would include a biosignature, or life characterization, task; a mission development and operations task; and a natural language task for integrating and supporting astrobiology research needs.

Metadata
arXiv: 2510.08636v1
Source: arXiv
Type: paper
Extraction status: raw
Keywords
GeoAI
GIS
RemoteSensing
EarthObservation
LLM
GeoLargeModel
GeoFoundationModel
Multimodal
GeoMultimodal
astro-ph.IM
astro-ph.EP