Urban traffic control is a system-level coordination problem spanning heterogeneous subsystems, including traffic signals, freeways, public transit, and taxi services. Existing optimization-based, reinforcement learning (RL)-based, and emerging large language model (LLM)-based approaches are largely designed for isolated tasks, which limits both cross-task generalization and the ability to capture coupled physical dynamics across subsystems. We argue that effective system-level control requires a unified physical environment in which subsystems share infrastructure, mobility demand, and spatiotemporal constraints, so that local interventions can propagate through the network. To this end, we propose TrafficClaw, a framework for general urban traffic control built upon a unified runtime environment. TrafficClaw integrates heterogeneous subsystems into a shared dynamical system, enabling explicit modeling of cross-subsystem interactions and closed-loop feedback between agent and environment. Within this environment, we develop an LLM agent with executable spatiotemporal reasoning and reusable procedural memory, supporting unified diagnostics across subsystems and continual strategy refinement. Furthermore, we introduce a multi-stage training pipeline that combines supervised initialization with agentic RL oriented toward system-level optimization, further improving coordination and system awareness. Experiments demonstrate that TrafficClaw achieves robust, transferable, and system-aware performance across unseen traffic scenarios, dynamics, and task configurations. Our project is available at https://github.com/usail-hkust/TrafficClaw.