地理空间知识发现日益依赖对异构资源的跨模态检索,包括数据集、地图、笔记本、软件、出版物以及它们之间的溯源关联。传统地理信息门户仅支持元数据与空间过滤,却极少在一个集成系统中提供语义检索、图感知的溯源遍历及对话式综合能力。本文介绍 I-GUIDE Smart Search——一个已投入生产的多模态地理空间检索增强生成(RAG)系统,该系统嵌入于 I-GUIDE 平台,并报告其设计、部署与评估结果。该系统整合了由生产环境维护的 OpenSearch 关键词索引、向量索引与空间索引,辅以 Neo4j 知识图谱,并采用迭代式 RAG 流水线,实现记忆感知的查询增强、推理、检索方法路由、相关性评分、基于证据的生成,以及幻觉检测与相关性校验。在单 A100 GPU 的 RAG 部署配置下,I-GUIDE Smart Search 可支持约 100 个并发模拟用户的交互式使用,在每查询调用 20–50 次大语言模型(LLM)的前提下,达到 4.4 请求/秒的吞吐量,p50 延迟约为 25 秒。在答案质量评估方面,我们构建了一个包含 170 个经人工筛选的面向用户查询的四类别基准测试集,并辅以十组针对特定意图、从已部署索引与图谱中生成的探针集。相较于无检索基线与朴素 RAG 基线,Smart Search 在检索证据覆盖度与人工评定的答案质量上均有提升,尤其在精确标识符匹配、空间约束型、简单推荐型及需依赖当前索引证据的领域特异性事实型查询中表现最为显著。我们提炼出适用于空间 RAG 系统的可迁移部署经验,涵盖空间元数据质量、图谱溯源建模、检索路由机制、接口契约设计、拒绝意识评估,以及延迟-成本权衡等维度。
Geospatial knowledge discovery increasingly requires search across heterogeneous artifacts: datasets, maps, notebooks, software, publications, and the provenance links among them. Conventional geoportals support metadata and spatial filtering, but they rarely provide semantic retrieval, graph-aware provenance traversal, and conversational synthesis in one integrated system. This paper presents I-GUIDE Smart Search, a production multimodal geospatial retrieval-augmented generation (RAG) system embedded in the I-GUIDE Platform, and reports on its design, deployment, and evaluation. The system combines production-maintained OpenSearch keyword, vector, and spatial indexes with a Neo4j knowledge graph and an iterative RAG pipeline for memory-aware query augmentation, reasoning, retrieval-method routing, relevance grading, grounded generation, hallucination and relevance checking. In a single-A100 RAG deployment, I-GUIDE Smart Search supports interactive use up to about 100 concurrent simulated users, reaching 4.4 requests per second with p50 latency near 25 seconds despite 20-50 LLM calls per query. For answer quality, we evaluate a four-category benchmark of 170 unique human-filtered user-facing queries, together with ten intent-specific probe sets generated from the deployed indexes and graph. Smart Search improves retrieved evidence coverage and judged answer quality over non-retrieval and naive-RAG baselines, with the clearest gains on exact-identifier, spatially constrained, simple-recommendation, and domain-specific factual queries requiring current indexed evidence. We distill transferable deployment lessons for spatial RAG systems, covering spatial metadata quality, graph provenance, retrieval routing, interface contracts, refusal-aware evaluation, latency-cost tradeoffs, and the role of the user interface in deployed geospatial cyberinfrastructure.