Paper
arXiv
SpatialIntelligence
Trajectory
Mobility
GeoSimulation
Title
GSDrive: Reinforcing Driving Policies by Multi-mode Trajectory Probing with 3D Gaussian Splatting Environment
Ziang Guo, Chen Min, Xuefeng Zhang, Yixiao Zhou, Zufeng Zhang, Dzmitry Tsetserukou
Published
2026/5/1 00:59:07
Source Type
preprint
Language
en
Abstract

End-to-end (E2E) autonomous driving presents a promising approach for translating perceptual inputs directly into driving actions. However, prohibitive annotation costs and temporal data quality degradation hinder long-term real-world deployment. While combining imitation learning (IL) and reinforcement learning (RL) is a common strategy for policy improvement, conventional RL training relies on delayed, event-based rewards: policies learn only from catastrophic outcomes such as collisions, leading to premature convergence to suboptimal behaviors. To address these limitations, we introduce GSDrive, a framework that exploits 3D Gaussian Splatting (3DGS) for differentiable, physics-based reward shaping in E2E driving policy improvement. Our method incorporates a flow matching-based trajectory predictor within the 3DGS simulator, enabling multi-mode trajectory probing, where candidate trajectories are rolled out to assess prospective rewards. This establishes a bidirectional knowledge exchange between IL and RL by grounding reward functions in physically simulated interaction signals, offering immediate dense feedback instead of sparse catastrophic events. Evaluated on the reconstructed nuScenes dataset, our method surpasses existing simulation-based RL driving approaches in closed-loop experiments. Code is available at https://github.com/ZionGo6/GSDrive.
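
The abstract's core mechanism, probing several candidate trajectories in a simulator and scoring each with dense, physics-grounded rewards rather than waiting for a collision event, can be illustrated with a minimal sketch. Everything below is hypothetical: the function names (sample_candidate_trajectories, dense_reward, probe_trajectories), the reward terms, and the numpy stand-ins for the flow matching predictor and the 3DGS simulator are assumptions made for illustration, not the GSDrive implementation.

```python
import numpy as np

def sample_candidate_trajectories(obs, num_modes=6, horizon=8, rng=None):
    """Stand-in for a flow matching-based trajectory predictor: returns
    num_modes candidate trajectories, each a (horizon, 2) array of x/y
    waypoints in the ego frame."""
    rng = rng if rng is not None else np.random.default_rng(0)
    # Straight-ahead base motion plus smoothed per-mode perturbations.
    base = np.cumsum(np.tile([[1.0, 0.0]], (horizon, 1)), axis=0)
    offsets = np.cumsum(rng.normal(scale=0.4, size=(num_modes, horizon, 2)), axis=1)
    return base[None, :, :] + offsets

def dense_reward(waypoint, obstacles, lane_center_y=0.0):
    """Per-step reward: penalize lateral deviation from the lane center and
    proximity to obstacles (a stand-in for the physics-grounded signals that
    a 3DGS simulator rollout could provide)."""
    lane_penalty = -abs(waypoint[1] - lane_center_y)
    dists = np.linalg.norm(obstacles - waypoint, axis=1)
    proximity_penalty = -float(np.sum(np.exp(-dists)))  # smooth, immediate feedback
    return lane_penalty + proximity_penalty

def probe_trajectories(obs, obstacles):
    """Roll out each candidate and accumulate dense rewards, returning the
    best mode and the per-mode returns that could shape a policy update."""
    candidates = sample_candidate_trajectories(obs)
    returns = np.array([
        sum(dense_reward(wp, obstacles) for wp in traj) for traj in candidates
    ])
    best = int(np.argmax(returns))
    return candidates[best], returns

if __name__ == "__main__":
    obstacles = np.array([[4.0, 1.0], [7.0, -1.5]])  # toy scene, ego frame
    best_traj, mode_returns = probe_trajectories(obs=None, obstacles=obstacles)
    print("per-mode returns:", np.round(mode_returns, 2))
    print("best trajectory (first 3 waypoints):\n", np.round(best_traj[:3], 2))
```

The point of the sketch is the shape of the feedback: every waypoint of every candidate mode contributes to its return, so the per-mode scores form an immediate, dense learning signal that an RL update could use, instead of a single sparse penalty triggered only by a catastrophic outcome.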

Metadata
arXiv: 2604.28111v2
Source: arXiv
Type: Paper
Extraction status: raw
Keywords
SpatialIntelligence
Trajectory
Mobility
GeoSimulation
cs.RO