Paper
arXiv
Title
How Tough Is Location Anonymization? Re-identifying 100K Real-User Trajectories in Japan
Abhishek Kumar Mishra, Mathieu Cunche, Heber H. Arcolezi
Published
2025/6/6 05:51:50
Source type
preprint
Language
en
Abstract

Mobility traces are among the most revealing forms of personal data, yet trajectory releases are often protected only by ad hoc transformations. We stress-test such practices on the recently released YJMob100K, an anonymized dataset of 100,000 user trajectories in Japan. First, we show that the applied protection leaves enough spatial and temporal structure to recover both the real-world geographic frame and the actual calendar timeline by exploiting density signatures, urban correlations, and temporal activity profiles. On top of this reconstruction, we quantify privacy risks through trajectory-level metrics that capture spatio-temporal k-anonymity, point unicity, home-work and multi-anchor uniqueness, and exposure to secluded and sensitive locations. These metrics reveal extensive re-identification surfaces: a small number of observations, anchors, or sensitive venues often suffices to uniquely pinpoint users or their social neighborhoods. Finally, we evaluate representative sanitization strategies (geo-indistinguishability, local differential privacy, and aggressive spatial de-structuring) and observe a consistent pattern: strong privacy parameters destroy downstream utility, while utility-preserving settings leave structural leakage largely intact. Overall, our findings show that current sanitization techniques are insufficient for large-scale mobility data, and they highlight the urgent need for trajectory-aware privacy mechanisms and stronger publication standards.
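
The point-unicity idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code or their exact metric definition: the function name `unicity`, the `(x, y, t)` tuple layout, and the toy trajectories are all hypothetical, chosen only to show how a few known observations can single out one user.

```python
import random
from collections import defaultdict


def unicity(trajectories, n_points, trials=200, seed=0):
    """Estimate the share of sampled users who are the only match for
    n_points observations drawn from their own trajectory.

    trajectories: dict mapping user id -> set of (x, y, t) tuples
    (hypothetical layout: grid cell coordinates plus a time slot).
    """
    rng = random.Random(seed)

    # Inverted index: observation -> set of users who produced it.
    index = defaultdict(set)
    for user, points in trajectories.items():
        for point in points:
            index[point].add(user)

    # Only users with at least n_points observations can be sampled.
    eligible = sorted(u for u, pts in trajectories.items() if len(pts) >= n_points)
    if not eligible:
        return 0.0

    unique = 0
    for _ in range(trials):
        user = rng.choice(eligible)
        known = rng.sample(sorted(trajectories[user]), n_points)
        # Users consistent with every observation the attacker knows.
        candidates = set.intersection(*(index[p] for p in known))
        if candidates == {user}:
            unique += 1
    return unique / trials


# Toy data (invented for illustration, not taken from YJMob100K).
toy = {
    "u1": {(1, 1, 8), (5, 5, 12), (1, 1, 20)},
    "u2": {(1, 1, 8), (6, 2, 12), (1, 1, 20)},
    "u3": {(3, 4, 9), (5, 5, 12), (3, 4, 21)},
}

for n in (1, 2, 3):
    print(n, unicity(toy, n))
```

In this toy setting a single observation often leaves more than one candidate, while all three observations always isolate the user, which mirrors the abstract's point that a small number of observations can suffice for re-identification.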

Metadata
arXiv: 2506.05611v2
Source: arXiv
Type: Paper
Extraction status: raw
Keywords
GeoAI
GIS
SpatialIntelligence
Trajectory
Mobility
cs.CR