Paper
arXiv
SpatialIntelligence
Trajectory
Mobility
GeoSimulation
Title
GSDrive: Reinforcing Driving Policies by Multi-mode Trajectory Probing with 3D Gaussian Splatting Environment
Ziang Guo, Chen Min, Xuefeng Zhang, Yixiao Zhou, Zufeng Zhang, Dzmitry Tsetserukou
Published
2026/5/1 00:59:07
Source Type
preprint
Language
en
Abstract

End-to-end (E2E) autonomous driving presents a promising approach for translating perceptual inputs directly into driving actions. However, prohibitive annotation costs and temporal data quality degradation hinder long-term real-world deployment. While combining imitation learning (IL) and reinforcement learning (RL) is a common strategy for policy improvement, conventional RL training relies on delayed, event-based rewards: policies learn only from catastrophic outcomes such as collisions, leading to premature convergence to suboptimal behaviors. To address these limitations, we introduce GSDrive, a framework that exploits 3D Gaussian Splatting (3DGS) for differentiable, physics-based reward shaping in E2E driving policy improvement. Our method incorporates a flow matching-based trajectory predictor within the 3DGS simulator, enabling multi-mode trajectory probing, where candidate trajectories are rolled out to assess prospective rewards. This establishes a bidirectional knowledge exchange between IL and RL by grounding reward functions in physically simulated interaction signals, offering immediate dense feedback instead of sparse catastrophic events. Evaluated on the reconstructed nuScenes dataset, our method surpasses existing simulation-based RL driving approaches in closed-loop experiments. Code is available at https://github.com/ZionGo6/GSDrive.
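
The abstract's core mechanism, probing several candidate trajectories in a simulator and scoring each with dense, physics-grounded rewards rather than waiting for a collision event, can be illustrated with a minimal sketch. Everything below is hypothetical: the function names (sample_candidate_trajectories, dense_reward, probe_trajectories), the reward terms, and the numpy stand-ins for the flow matching predictor and the 3DGS simulator are assumptions made for illustration, not the GSDrive implementation.

```python
import numpy as np

def sample_candidate_trajectories(obs, num_modes=6, horizon=8, rng=None):
    """Stand-in for a flow matching-based trajectory predictor: returns
    num_modes candidate trajectories, each a (horizon, 2) array of x/y
    waypoints in the ego frame."""
    rng = rng if rng is not None else np.random.default_rng(0)
    # Straight-ahead base motion plus smoothed per-mode perturbations.
    base = np.cumsum(np.tile([[1.0, 0.0]], (horizon, 1)), axis=0)
    offsets = np.cumsum(rng.normal(scale=0.4, size=(num_modes, horizon, 2)), axis=1)
    return base[None, :, :] + offsets

def dense_reward(waypoint, obstacles, lane_center_y=0.0):
    """Per-step reward: penalize lateral deviation from the lane center and
    proximity to obstacles (a stand-in for the physics-grounded signals that
    a 3DGS simulator rollout could provide)."""
    lane_penalty = -abs(waypoint[1] - lane_center_y)
    dists = np.linalg.norm(obstacles - waypoint, axis=1)
    proximity_penalty = -float(np.sum(np.exp(-dists)))  # smooth, immediate feedback
    return lane_penalty + proximity_penalty

def probe_trajectories(obs, obstacles):
    """Roll out each candidate and accumulate dense rewards, returning the
    best mode and the per-mode returns that could shape a policy update."""
    candidates = sample_candidate_trajectories(obs)
    returns = np.array([
        sum(dense_reward(wp, obstacles) for wp in traj) for traj in candidates
    ])
    best = int(np.argmax(returns))
    return candidates[best], returns

if __name__ == "__main__":
    obstacles = np.array([[4.0, 1.0], [7.0, -1.5]])  # toy scene, ego frame
    best_traj, mode_returns = probe_trajectories(obs=None, obstacles=obstacles)
    print("per-mode returns:", np.round(mode_returns, 2))
    print("best trajectory (first 3 waypoints):\n", np.round(best_traj[:3], 2))
```

The point of the sketch is the shape of the feedback: every waypoint of every candidate mode contributes to its return, so the per-mode scores form an immediate, dense learning signal that an RL update could use, instead of a single sparse penalty triggered only by a catastrophic outcome.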

Metadata
arXiv: 2604.28111v2
Source: arXiv
Type: Paper
Extraction status: raw
Keywords
SpatialIntelligence
Trajectory
Mobility
GeoSimulation
cs.RO