Safety-critical traffic scenario generation is essential for evaluating autonomous driving systems under rare but high-risk interactions. Existing diffusion-based methods offer strong controllability in closed-loop generation, but their iterative denoising is computationally expensive and can accumulate sampling and guidance errors over long rollouts, producing unrealistic motion artifacts such as jitter, abnormal acceleration, and off-road driving. To address these issues, we propose RiskFlow, a closed-loop framework for generating safety-critical multi-agent traffic scenarios that formulates future trajectory generation as probability-flow transport in the action space. Instead of relying on iterative denoising, RiskFlow learns an average velocity field over a finite time interval that maps Gaussian action sequences to future acceleration and yaw-rate commands in a single forward pass, and it is trained efficiently and stably with an objective based on Jacobian-vector products (JVP). At inference time, RiskFlow applies output-space guidance to the generated actions, steering selected critical agents toward risky interactions while penalizing off-road behavior, and then reconstructs physically feasible trajectories through a vehicle dynamics model. Closed-loop experiments on nuScenes using tbsim show that RiskFlow achieves a strong adversariality-realism trade-off in both multi-agent and long-horizon settings. Compared with representative baselines, RiskFlow consistently improves the realism of generated scenarios while maintaining competitive safety-critical generation capability, and it substantially reduces the inference time required for evaluation.
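To make the last step of the pipeline concrete, the following is a minimal sketch of how a sequence of acceleration and yaw-rate commands could be integrated into a trajectory. It assumes a simple kinematic unicycle model with forward-Euler integration. The abstract only states that trajectories are reconstructed "through vehicle dynamics", so the model choice, the function name `rollout_unicycle`, the step size `dt`, and the non-negative speed clamp are all illustrative assumptions rather than the authors' implementation.

```python
import math


def rollout_unicycle(state, actions, dt=0.1):
    """Integrate (acceleration, yaw-rate) commands with a kinematic unicycle model.

    state:   (x, y, heading, speed) at the current time step.
    actions: iterable of (accel, yaw_rate) commands, one per future step.
    dt:      integration step in seconds (assumed value, not from the paper).

    Returns a list with the state reached after each command.
    """
    x, y, heading, speed = state
    trajectory = []
    for accel, yaw_rate in actions:
        # Move along the current heading at the current speed (forward Euler).
        x += speed * math.cos(heading) * dt
        y += speed * math.sin(heading) * dt
        # Apply the commands; clamping speed at zero is an assumption
        # that rules out reversing.
        speed = max(0.0, speed + accel * dt)
        heading += yaw_rate * dt
        trajectory.append((x, y, heading, speed))
    return trajectory


# Example: start at the origin heading along +x at 5 m/s and
# accelerate at 1 m/s^2 for ten steps with no turning.
traj = rollout_unicycle((0.0, 0.0, 0.0, 5.0), [(1.0, 0.0)] * 10, dt=0.1)
```

Because positions are derived from bounded acceleration and yaw-rate commands rather than predicted directly, any guidance applied to the actions still yields a trajectory that is consistent with the assumed motion model.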