Aerosol Optical Depth (AOD) retrieval is essential for Earth observation, supporting applications from air quality monitoring to climate studies. Conventional physics-based AOD retrieval methods formulate the problem as a pixel-wise inversion, relying on radiative transfer modeling, memory-intensive look-up tables, and auxiliary meteorological data. While recent data-driven approaches have shown promise, many fail to exploit the spatial-spectral coherence of hyperspectral imagery, leading to spatially inconsistent and noise-sensitive retrievals. We present the first study exploring Foundation AI models for AOD retrieval and propose ViTCG, a Vision Transformer that applies Channel-wise Grouping within a spatial regression framework to reduce retrieval bias and error. ViTCG takes hyperspectral top-of-atmosphere radiance as input and jointly models spatial context and spectral information. Validation on PACE radiance observations demonstrates a 62% reduction in mean squared error compared to state-of-the-art foundation models, including Prithvi, and produces spatially coherent AOD fields.
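To make the channel-wise grouping idea concrete, the sketch below shows one plausible tokenization scheme: the hyperspectral channels are partitioned into groups, and each group is patchified independently so that spectrally similar bands share tokens while spatial context is preserved. This is an illustrative assumption, not the paper's actual implementation; the function name `channel_grouped_tokens` and all shapes are hypothetical.

```python
import numpy as np

def channel_grouped_tokens(cube, n_groups, patch):
    """Hypothetical channel-wise grouping tokenizer for a ViT-style model.

    cube     : (H, W, C) hyperspectral radiance cube
    n_groups : number of spectral channel groups (must divide C)
    patch    : spatial patch size (must divide H and W)

    Returns an array of shape (n_groups * (H//patch) * (W//patch),
    patch * patch * C // n_groups): one flattened token per
    (spectral group, spatial patch) pair.
    """
    H, W, C = cube.shape
    assert C % n_groups == 0 and H % patch == 0 and W % patch == 0
    gc = C // n_groups  # channels per group
    tokens = []
    for g in range(n_groups):
        # Select one contiguous spectral group of gc channels.
        grp = cube[:, :, g * gc:(g + 1) * gc]
        # Patchify the group spatially and flatten each patch.
        for i in range(0, H, patch):
            for j in range(0, W, patch):
                tokens.append(grp[i:i + patch, j:j + patch].reshape(-1))
    return np.stack(tokens)

# Toy example: an 8x8 scene with 6 bands, 3 groups, 4x4 patches
# yields 3 * 2 * 2 = 12 tokens of dimension 4 * 4 * 2 = 32.
cube = np.arange(8 * 8 * 6, dtype=float).reshape(8, 8, 6)
toks = channel_grouped_tokens(cube, n_groups=3, patch=4)
print(toks.shape)  # (12, 32)
```

In a full model, these tokens would be linearly projected and passed through transformer blocks, with a regression head predicting one AOD value per spatial patch; those details are not specified in the abstract.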