Chinese Title
GeoJEPA:迈向消除多模态地理空间学习中的增强与采样偏差
English Title
GeoJEPA: Towards Eliminating Augmentation- and Sampling Bias in Multimodal Geospatial Learning
Theodor Lundqvist, Ludvig Delvret
Published
2025/2/26 06:03:28
Source type
preprint
Language
en
Abstract

English Original

Existing methods for self-supervised representation learning of geospatial regions and map entities rely extensively on the design of pretext tasks, often involving augmentations or heuristic sampling of positive and negative pairs based on spatial proximity. This reliance introduces biases and limits the representations' expressiveness and generalisability. Consequently, the literature has expressed a pressing need to explore different methods for modelling geospatial data. To address the key difficulties of such methods, namely multimodality, heterogeneity, and the choice of pretext tasks, we present GeoJEPA, a versatile multimodal fusion model for geospatial data built on the self-supervised Joint-Embedding Predictive Architecture. With GeoJEPA, we aim to eliminate the widely accepted augmentation- and sampling biases found in self-supervised geospatial representation learning. GeoJEPA uses self-supervised pretraining on a large dataset of OpenStreetMap attributes, geometries and aerial images. The results are multimodal semantic representations of urban regions and map entities that we evaluate both quantitatively and qualitatively. Through this work, we uncover several key insights into JEPA's ability to handle multimodal data.

Metadata
arXiv: 2503.05774v1
Source: arXiv
Type: Paper
Keywords
GeoAI
GIS
SpatialIntelligence
Multimodal
GeoMultimodal
cs.LG
cs.DB