Image retrieval enables efficient search through vast amounts of satellite imagery and returns images similar to a query. Deep learning models can identify images across various semantic concepts without the need for annotations. This work proposes using Geospatial Foundation Models, such as Prithvi, for remote sensing image retrieval, with two benefits: i) the models encode multi-spectral satellite data, and ii) they generalize without further fine-tuning. We introduce two datasets to the retrieval task and observe strong performance: Prithvi processes six bands and achieves a mean Average Precision of 97.62% on BigEarthNet-43 and 44.51% on ForestNet-12, outperforming RGB-based models. Further, we evaluate three compression methods with binarized embeddings that balance retrieval speed and accuracy. They match the retrieval speed of much shorter hash codes while maintaining the same accuracy as floating-point embeddings, at a 32-fold compression. The code is available at https://github.com/IBM/remote-sensing-image-retrieval.
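The 32-fold figure follows from replacing each 32-bit floating-point dimension with a single bit. The sketch below illustrates one simple way to obtain and search such binarized embeddings; it is not necessarily one of the three compression methods evaluated here. It assumes sign thresholding at zero, a hypothetical embedding width of 768, and random vectors standing in for real model outputs; the helper names `binarize` and `hamming_search` are illustrative only.

```python
import numpy as np


def binarize(embeddings: np.ndarray) -> np.ndarray:
    """Threshold each dimension at zero (an assumption) and pack 8 bits per byte."""
    bits = (embeddings > 0).astype(np.uint8)
    return np.packbits(bits, axis=1)


def hamming_search(query_code: np.ndarray, database_codes: np.ndarray, k: int = 5) -> np.ndarray:
    """Return the indices of the k database codes closest to the query in Hamming distance."""
    # XOR marks the bits that differ; counting them gives the Hamming distance.
    xor = np.bitwise_xor(database_codes, query_code)
    distances = np.unpackbits(xor, axis=1).sum(axis=1)
    return np.argsort(distances, kind="stable")[:k]


# Random stand-ins for float32 embeddings (1000 images, hypothetical width 768).
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((1000, 768)).astype(np.float32)

codes = binarize(embeddings)                      # 768 bits -> 96 bytes per image
compression = embeddings.nbytes / codes.nbytes    # 32 bits per dimension -> 1 bit
top = hamming_search(codes[0], codes, k=5)        # the query itself ranks first
```

Packing the bits lets the distance computation run on bytes with XOR and bit counts rather than floating-point arithmetic, which is the usual reason binary codes are searched faster than dense vectors.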