Trajectory generation has recently drawn growing interest in privacy-preserving urban mobility studies and location-based service applications. Although many studies have used deep learning or generative AI methods to model trajectories and have achieved promising results, the robustness and interpretability of such models remain largely unexplored. This limits the applicability of trajectory generation algorithms to noisy real-world data and undermines their trustworthiness in downstream tasks. To address this issue, we exploit the regular structure inherent in urban trajectories and propose a deep generative model based on the pathlet representation, which encodes trajectories as binary vectors associated with a learned dictionary of trajectory segments. Specifically, we introduce a probabilistic graphical model to describe the trajectory generation process, which includes a Variational Autoencoder (VAE) component and a linear decoder component. During training, the model simultaneously learns the latent embedding of pathlet representations and the pathlet dictionary that captures mobility patterns in the trajectory dataset. The conditional version of our model can also generate customized trajectories based on temporal and spatial constraints. Our model effectively learns the data distribution even from noisy data, achieving relative improvements of $35.4\%$ and $26.3\%$ over strong baselines on two real-world trajectory datasets. Moreover, the generated trajectories can be conveniently used for multiple downstream tasks, including trajectory prediction and data denoising. Lastly, the framework design offers a significant efficiency advantage, reducing running time by $64.8\%$ and GPU memory usage by $56.5\%$ compared to previous approaches.
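
As a rough illustration of the pathlet representation described above, the toy sketch below encodes a trajectory as a binary vector over a small pathlet dictionary and recovers its road-edge usage with a linear map. It is not the authors' implementation: the dictionary here is hand-written rather than learned, the VAE component is omitted entirely, and all names (`PATHLET_DICTIONARY`, `dictionary_matrix`, `decode`) and the six-edge road network are hypothetical.

```python
# Toy sketch of a pathlet representation (illustrative assumptions only).
# Road-network edges are indexed 0..5; each pathlet is a short run of edges.
PATHLET_DICTIONARY = [
    [0, 1],  # pathlet 0 covers edges 0 and 1
    [2],     # pathlet 1 covers edge 2
    [3, 4],  # pathlet 2 covers edges 3 and 4
    [4, 5],  # pathlet 3 covers edges 4 and 5
]
NUM_EDGES = 6


def dictionary_matrix(pathlets, num_edges):
    """Build a binary matrix D where D[k][e] = 1 if pathlet k contains edge e."""
    matrix = [[0] * num_edges for _ in pathlets]
    for k, edges in enumerate(pathlets):
        for e in edges:
            matrix[k][e] = 1
    return matrix


def decode(code, matrix):
    """Linear decoding: sum the dictionary rows selected by the binary code."""
    num_edges = len(matrix[0])
    return [
        sum(code[k] * matrix[k][e] for k in range(len(matrix)))
        for e in range(num_edges)
    ]


D = dictionary_matrix(PATHLET_DICTIONARY, NUM_EDGES)

# A trajectory that uses pathlets 0, 1, and 2 (binary code over the dictionary).
trajectory_code = [1, 1, 1, 0]
edge_usage = decode(trajectory_code, D)
print(edge_usage)
```

Because decoding is a plain sum of dictionary rows, each nonzero entry of the code points to a concrete trajectory segment, which is the sense in which such a representation is interpretable. In the model described above, the binary codes would be produced by the generative component and the dictionary would be learned jointly from data rather than fixed by hand.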