Chinese Title
基于兴趣点数据的多模态对比学习城市空间表征
English Title
Multimodal Contrastive Learning of Urban Space Representations from POI Data
Xinglei Wang, Tao Cheng, Stephen Law, Zichao Zeng, Lu Yin, Junyuan Liu
Published
2024/11/10 00:24:07
Source Type
preprint
Language
en
Abstract
English Original

Existing methods for learning urban space representations from Point-of-Interest (POI) data face several limitations, including issues with geographical delineation, inadequate spatial information modelling, underutilisation of POI semantic attributes, and computational inefficiencies. To address these issues, we propose CaLLiPer (Contrastive Language-Location Pre-training), a novel representation learning model that directly embeds continuous urban spaces into vector representations that can capture the spatial and semantic distribution of urban environment. This model leverages a multimodal contrastive learning objective, aligning location embeddings with textual POI descriptions, thereby bypassing the need for complex training corpus construction and negative sampling. We validate CaLLiPer's effectiveness by applying it to learning urban space representations in London, UK, where it demonstrates 5-15% improvement in predictive performance for land use classification and socioeconomic mapping tasks compared to state-of-the-art methods. Visualisations of the learned representations further illustrate our model's advantages in capturing spatial variations in urban semantics with high accuracy and fine resolution. Additionally, CaLLiPer achieves reduced training time, showcasing its efficiency and scalability. This work provides a promising pathway for scalable, semantically rich urban space representation learning that can support the development of geospatial foundation models. The implementation code is available at https://github.com/xlwang233/CaLLiPer.
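
The abstract names a multimodal contrastive objective that aligns location embeddings with textual POI descriptions, but it does not give the loss itself. As a rough illustration only, the sketch below assumes a CLIP-style symmetric InfoNCE loss, in which the other pairs in a batch act as negatives so that no separate negative-sampling step is needed. The function names, the temperature value, and the toy vectors are all hypothetical and are not taken from the CaLLiPer repository.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(v - row_max) for v in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(loc_emb, txt_emb, temperature=0.07):
    """Assumed CLIP-style loss: average of location-to-text and text-to-location terms.

    loc_emb[i] and txt_emb[i] form a matched pair; every other pairing in the
    batch is treated as a negative.
    """
    loc = [l2_normalize(v) for v in loc_emb]
    txt = [l2_normalize(v) for v in txt_emb]
    # Pairwise cosine similarities, scaled by the temperature.
    logits = [
        [sum(a * b for a, b in zip(l, t)) / temperature for t in txt]
        for l in loc
    ]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-location view
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))


# Toy batch of two locations with 2-d embeddings (hypothetical values).
loc_emb = [[1.0, 0.0], [0.0, 1.0]]
aligned_txt = [[0.9, 0.1], [0.1, 0.9]]   # each text close to its own location
swapped_txt = [[0.1, 0.9], [0.9, 0.1]]   # each text close to the wrong location

loss_aligned = symmetric_contrastive_loss(loc_emb, aligned_txt)
loss_swapped = symmetric_contrastive_loss(loc_emb, swapped_txt)
print(loss_aligned < loss_swapped)
```

In this toy batch the loss is lower when each text embedding sits near its own location embedding than when the pairs are swapped, which is the behaviour a contrastive alignment objective is meant to reward. In an actual model the two embedding lists would come from a location encoder and a text encoder rather than being written by hand.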

Metadata
arXiv: 2411.06229v1
Source: arXiv
Type: Paper
Extraction status: raw
Keywords
GeoAI
GIS
GeoLargeModel
GeoFoundationModel
Multimodal
GeoMultimodal
cs.AI