Paper
arXiv
RemoteSensing
EarthObservation
Multimodal
GeoMultimodal
Chinese Title
任务驱动的提示学习:一种面向多模态云去除与分割的联合框架
English Title
Task-Driven Prompt Learning: A Joint Framework for Multi-modal Cloud Removal and Segmentation
Zaiyan Zhang, Jie Li, Shaowei Shi, Qiangqiang Yuan
Published
2026/1/17 21:32:38
Source Type
preprint
Language
en
Abstract
English Original

Optical remote sensing imagery is indispensable for Earth observation, yet persistent cloud occlusion limits its downstream utility. Most cloud removal (CR) methods are optimized for low-level fidelity and can over-smooth textures and boundaries that are critical for analysis-ready data (ARD), leading to a mismatch between visually plausible restoration and semantic utility. To bridge this gap, we propose TDP-CR, a task-driven multimodal framework that jointly performs cloud removal and land-cover segmentation. Central to our approach is a Prompt-Guided Fusion (PGF) mechanism, which utilizes a learnable degradation prompt to encode cloud thickness and spatial uncertainty. By combining global channel context with local prompt-conditioned spatial bias, PGF adaptively integrates Synthetic Aperture Radar (SAR) information only where optical data is corrupted. We further introduce a parameter-efficient two-phase training strategy that decouples reconstruction and semantic representation learning. Experiments on the LuojiaSET-OSFCR dataset demonstrate the superiority of our framework: TDP-CR surpasses heavy state-of-the-art baselines by 0.18 dB in PSNR while using only 15% of the parameters, and achieves a 1.4% improvement in mIoU consistently against multi-task competitors, effectively delivering analysis-ready data.
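
The gating idea described in the abstract can be pictured with a minimal sketch. This is not the authors' PGF implementation: the function name `prompt_guided_fusion`, the `channel_weights` parameter, the use of the optical channel mean as "global channel context", and the sigmoid gate are all assumptions made purely for illustration.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a gate value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def prompt_guided_fusion(optical, sar, prompt, channel_weights):
    """Blend SAR features into optical features with a per-pixel gate.

    optical, sar: nested lists shaped [C][H][W] holding feature values.
    prompt: nested list shaped [H][W]; larger values are ASSUMED to mean
        heavier cloud degradation (a stand-in for a learnable prompt).
    channel_weights: length-C list scaling the global channel context
        (hypothetical parameter, not taken from the paper).
    """
    fused = []
    for c, (opt_c, sar_c) in enumerate(zip(optical, sar)):
        # Global channel context: here simply the mean of the optical channel.
        values = [v for row in opt_c for v in row]
        channel_logit = channel_weights[c] * (sum(values) / len(values))

        fused_c = []
        for h, (opt_row, sar_row) in enumerate(zip(opt_c, sar_c)):
            fused_row = []
            for w, (o, s) in enumerate(zip(opt_row, sar_row)):
                # Local spatial bias from the prompt shifts the gate per pixel.
                gate = sigmoid(channel_logit + prompt[h][w])
                # gate near 0 keeps optical; gate near 1 substitutes SAR.
                fused_row.append((1.0 - gate) * o + gate * s)
            fused_c.append(fused_row)
        fused.append(fused_c)
    return fused


if __name__ == "__main__":
    optical = [[[1.0, 1.0]]]   # one channel, one row, two pixels
    sar = [[[5.0, 5.0]]]
    prompt = [[-20.0, 20.0]]   # first pixel clear, second pixel cloudy
    print(prompt_guided_fusion(optical, sar, prompt, [0.0]))
```

In this toy setup a strongly negative prompt value leaves the optical feature essentially untouched, while a strongly positive one replaces it with the SAR feature, which mirrors the abstract's claim that SAR is integrated only where the optical data is corrupted.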

Metadata
arXiv: 2601.12052v2
Source: arXiv
Type: Paper
Extraction status: raw
Keywords
RemoteSensing
EarthObservation
Multimodal
GeoMultimodal
cs.CV