Title
ParkSense: Where Should a Delivery Driver Park? Leveraging Idle AV Compute and Vision-Language Models
Die Hu, Henan Li
Published
2026/4/9 15:28:57
Source type
preprint
Language
en
Abstract

Finding parking consumes a disproportionate share of food delivery time, yet no system addresses precise parking-spot selection relative to merchant entrances. We propose ParkSense, a framework that repurposes idle compute during low-risk AV states -- queuing at red lights, traffic congestion, parking-lot crawl -- to run a Vision-Language Model (VLM) on pre-cached satellite and street view imagery, identifying entrances and legal parking zones. We formalize the Delivery-Aware Precision Parking (DAPP) problem, show that a quantized 7B VLM completes inference in 4-8 seconds on HW4-class hardware, and estimate annual per-driver income gains of 3,000-8,000 USD in the U.S. Five open research directions are identified at this unexplored intersection of autonomous driving, computer vision, and last-mile logistics.
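
To make the idle-compute idea concrete, here is a minimal Python sketch of a gate that only allows a VLM pass when the vehicle is in one of the three low-risk states named in the abstract and the idle window is long enough for the reported 4-8 second inference. The names `AVState`, `can_run_vlm`, and `expected_idle_s`, and the worst-case-latency rule itself, are illustrative assumptions and are not taken from the paper.

```python
from enum import Enum


class AVState(Enum):
    """Hypothetical vehicle states; the first three are the low-risk states in the abstract."""
    RED_LIGHT_QUEUE = "red_light_queue"
    CONGESTION = "congestion"
    PARKING_LOT_CRAWL = "parking_lot_crawl"
    NORMAL_DRIVING = "normal_driving"


# Low-risk states during which idle compute may be repurposed.
LOW_RISK_STATES = {
    AVState.RED_LIGHT_QUEUE,
    AVState.CONGESTION,
    AVState.PARKING_LOT_CRAWL,
}

# Upper end of the 4-8 second inference range reported in the abstract.
WORST_CASE_INFERENCE_S = 8.0


def can_run_vlm(state: AVState, expected_idle_s: float) -> bool:
    """Assumed gating rule: run only in a low-risk state whose expected
    idle window covers the worst-case inference latency."""
    return state in LOW_RISK_STATES and expected_idle_s >= WORST_CASE_INFERENCE_S
```

Under this assumed rule, a 30-second red-light queue would admit a VLM pass, while a 5-second congestion pause or any period of normal driving would not.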

Metadata
arXiv: 2604.07912v1
Source: arXiv
Type: paper
Extraction status: raw
Keywords
GeoAI
GIS
RemoteSensing
EarthObservation
LLM
Multimodal
UrbanTraffic
cs.CV
cs.RO