UrbanComp Lab | Learning Resource Library

Chinese Title

TrafficClaw：基于统一物理环境建模的可泛化城市交通控制

English Title

TrafficClaw: Generalizable Urban Traffic Control via Unified Physical Environment Modeling

Siqi Lai, Pan Zhang, Yuping Zhou, Jindong Han, Yansong Ning, Hao Liu

Published

2026/4/19 22:17:56

Source Type

preprint

Language

Abstract

English Original

Urban traffic control is a system-level coordination problem spanning heterogeneous subsystems, including traffic signals, freeways, public transit, and taxi services. Existing optimization-based, reinforcement learning (RL), and emerging LLM-based approaches are largely designed for isolated tasks, limiting both cross-task generalization and the ability to capture coupled physical dynamics across subsystems. We argue that effective system-level control requires a unified physical environment in which subsystems share infrastructure, mobility demand, and spatiotemporal constraints, allowing local interventions to propagate through the network. To this end, we propose TrafficClaw, a framework for general urban traffic control built upon a unified runtime environment. TrafficClaw integrates heterogeneous subsystems into a shared dynamical system, enabling explicit modeling of cross-subsystem interactions and closed-loop agent-environment feedback. Within this environment, we develop an LLM agent with executable spatiotemporal reasoning and reusable procedural memory, supporting unified diagnostics across subsystems and continual strategy refinement. Furthermore, we introduce a multi-stage training pipeline with supervised initialization and agentic RL with system-level optimization, further enabling coordinated and system-aware performance. Experiments demonstrate that TrafficClaw achieves robust, transferable, and system-aware performance across unseen traffic scenarios, dynamics, and task configurations. Our project is available at https://github.com/usail-hkust/TrafficClaw.

Resource Links

Paper PDF: arxiv.org/pdf/2604.17456v1

Original source page: arxiv.org/abs/2604.17456v1

Metadata

arXiv ID: 2604.17456v1

Source: arXiv

Type: Paper

Extraction status: raw

Keywords

Trajectory

Mobility

LLM

Agent

UrbanTraffic

cs.AI