文件最后提交记录最后更新时间
bugfix: Modify elasticsearch-java version to be consistent with Spring AI (#415) * bugfix: When I started the rag-elasticsearch-example project, there was a version incompatibility issue * bugfix: When I started the rag-elasticsearch-example project, there was a version incompatibility issue4 个月前
Feat claudecode skill (#408) * feat: 为项目生成完整的API文档和模块文档 - 新增Claude Code技能框架:http-generate、readme-generate、skill-creator - 为76个包含Web接口的模块生成.http文件,涵盖100+个Controller类 - 为46个缺少文档的模块生成README.md文件,包含完整的API文档和使用说明 - 更新CLAUDE.md文件,提供项目开发指导 - 新增task/module-generate.md文档,描述自动化文档生成任务 生成的文档特点: - HTTP文件:包含完整的REST API请求示例,支持IDE直接运行 - README文件:统一的中文文档格式,包含功能介绍、API文档、使用示例、技术实现等 - 提升项目文档完整性,降低开发者学习和使用门槛 🤖 Generated with [Claude Code](https://claude.ai/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * fix: 恢复被覆盖的重要README文件,整合原有技术文档与新生成API文档 ## 恢复的关键文档 ### RAG相关模块 - **rag-milvus-example**: 恢复Milvus向量数据库配置、Docker部署、集合加载步骤 - **rag-elasticsearch-example**: 恢复Local/Cloud RAG流程详解、Elasticsearch配置 - **rag-pgvector-example**: 恢复PostgreSQL+pgvector数据库创建脚本、HNSW索引配置 - **rag-openai-dashscope-pgvector-example**: 恢复MXY RAG Server架构、多模型集成 - **module-rag**: 恢复Spring AI Module RAG技术架构、Pre-Retrieval模块详解 ### 基础模块 - **helloworld**: 恢复完整的入门示例,包含环境配置、快速开始指南 ## 恢复的重要内容 ### 技术实现细节 - 详细的数据库配置和SQL脚本 - Docker Compose部署指南 - 向量索引创建和优化配置 - RAG流程的完整技术说明 ### 配置和部署 - application.yml完整配置示例 - 环境变量和依赖说明 - 性能优化建议 - 故障排查指南 ### API文档整合 - 保留新生成的标准化API文档格式 - 整合原有的curl命令示例 - 统一的接口说明和参数描述 ## 改进效果 ✅ **技术完整性**: 恢复了丢失的重要技术实现细节 ✅ **配置完整性**: 保留了完整的部署和配置说明 ✅ **文档一致性**: 统一了文档格式,同时保留了重要信息 ✅ **可用性提升**: 开发者可以获得完整的使用指南 ## 文件统计 - 恢复了6个重要的README文件 - 保留了原有的技术细节和配置说明 - 整合了新生成的API文档格式 - 新增readme-overwrite.md任务文档 🤖 Generated with [Claude Code](https://claude.ai/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * fix: doc * fix: doc --------- Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>5 个月前
bugfix: fixed issues such as mismatch between port numbers in .http files and .yml files, and some formatting problems. (#412) * bugfix:ensure the correct port number and formatting issues * bugfix:ensure the correct port number and formatting issues --------- Co-authored-by: maple <14118288+gao-shengkai525866@user.noreply.gitee.com>5 个月前
bugfix: Modify elasticsearch-java version to be consistent with Spring AI (#415) * bugfix: When I started the rag-elasticsearch-example project, there was a version incompatibility issue * bugfix: When I started the rag-elasticsearch-example project, there was a version incompatibility issue4 个月前
README.md

Spring AI Alibaba Module RAG 使用示例

1. 简介

Spring AI Module RAG: https://docs.spring.io/spring-ai/reference/api/retrieval-augmented-generation.html#modules

Spring AI implements a Modular RAG architecture inspired by the concept of modularity detailed in the paper "Modular RAG: Transforming RAG Systems into LEGO-like Reconfigurable Frameworks".

2. 组成部分

2.1 Pre-Retrieval

增强和转换用户输入,使其更有效地执行检索任务,解决格式不正确的查询、query 语义不清晰、或不受支持的语言等。

  1. QueryAugmenter 查询增强:使用附加的上下文数据信息增强用户 query,提供大模型回答问题时的必要上下文信息;
  2. QueryTransformer:查询改写:因为用户的输入通常是片面的,关键信息较少,不便于大模型理解和回答问题。因此需要使用 prompt 调优手段或者大模型改写用户 query;
  3. QueryExpander:查询扩展:将用户 query 扩展为多个语义不同的变体以获得不同视角,有助于检索额外的上下文信息并增加找到相关结果的机会。

2.2 Retrieval

负责查询向量存储等数据系统并检索和用户 query 相关性最高的 Document。

  1. DocumentRetriever:检索器,根据 QueryExpander 使用不同的数据源进行检索,例如 搜索引擎、向量存储、数据库或知识图等;
  2. DocumentJoiner:将从多个 query 和从多个数据源检索到的 Document 合并为一个 Document 集合;

2.3 Post-Retrieval

负责处理检索到的 Document 以获得最佳的输出结果,解决模型中的中间丢失和上下文长度限制等。

  1. DocumentRanker:根据 Document 和用户 query 的相关性对 Document 进行排序和排名;
  2. DocumentSelector:用于从检索到的 Document 列表中删除不相关或冗余文档;
  3. DocumentCompressor:用于压缩每个 Document,减少检索到的信息中的噪音和冗余。

2.4 生成

生成用户 Query 对应的大模型输出。

3. 接口文档

ModuleRagController 接口

1. ragWithMemory 方法

接口路径: POST /module-rag/rag/memory/{conversationId}

功能描述: 使用基本RAG流程,带对话记忆功能的智能问答。

主要特性:

  • 支持多轮对话上下文
  • 自动维护会话记忆
  • 向量检索增强
  • 适用于连续对话场景

请求参数:

  • conversationId: 会话ID,用于维护上下文
  • prompt: 用户问题

示例请求:

POST /module-rag/rag/memory/123
Content-Type: application/json

{
  "prompt": "Who are the characters going on an adventure in the North Pole?"
}
curl -X POST http://127.0.0.1:10014/module-rag/rag/memory/123 \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Who are the characters going on an adventure in the North Pole?"}'

2. ragWithCompression 方法

接口路径: POST /module-rag/rag/compression/{conversationId}

功能描述: 使用带文档压缩的RAG流程,提高回答质量。

主要特性:

  • 检索后文档压缩
  • 减少冗余信息
  • 提高相关性
  • 优化上下文利用

示例请求:

POST /module-rag/rag/compression/123
Content-Type: application/json

{
  "prompt": "What places do they visit?"
}
curl -X POST http://127.0.0.1:10014/module-rag/rag/compression/123 \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What places do they visit?"}'

4. 启动示例

前置条件

在启动示例之前,您本地应该有一个可以正常使用的 Elasticsearch。

4.1 环境准备

  1. 安装并启动 Elasticsearch:
# 使用 Docker
docker run -d \
  --name elasticsearch \
  -p 9200:9200 \
  -p 9300:9300 \
  -e "discovery.type=single-node" \
  -e "xpack.security.enabled=false" \
  elasticsearch:8.11.0
  1. 设置环境变量:
export AI_DASHSCOPE_API_KEY=your_api_key

4.2 配置文件

application.yaml 配置示例:

spring:
  ai:
    dashscope:
      api-key: ${AI_DASHSCOPE_API_KEY}
      chat:
        model: qwen-plus
        options:
          temperature: 0.7
          max-tokens: 2000
  elasticsearch:
    uris: http://localhost:9200

4.3 启动应用

mvn spring-boot:run

应用启动在 http://localhost:10014

5. 使用示例

5.1 基础RAG对话

不使用压缩时的基本RAG流程:

curl -X POST http://127.0.0.1:10014/module-rag/rag/memory/123 \
    -d '{"prompt": "Who are the characters going on an adventure in the North Pole?"}'

# 输出示例:
# I'm sorry, but I don't have the information needed to answer your question.
# Could you please provide more details or clarify your query?
# If it's outside my current knowledge base, I may not be able to assist accurately.

继续对话:

curl -X POST http://127.0.0.1:10014/module-rag/rag/memory/123 \
    -d '{"prompt": "What places do they visit?"}'

# 输出示例:
# I understand. Here's how I would respond:
#
# I'm sorry, but I don't have specific information about the characters going on an adventure in the North Pole.
# Could you provide more context or details? If it's from a particular story, book, or game,
# letting me know might help me assist you better.

5.2 带压缩的RAG对话

使用文档压缩提高回答质量:

curl -X POST http://127.0.0.1:10014/module-rag/rag/compression/123 \
    -d '{"prompt": "Who are the characters going on an adventure in the North Pole?"}'

# 输出示例:
# Understood. Here's a polite and clear response for the user:
#
# I'm sorry, but I don't have the specific information about the characters going on an adventure in the North Pole.
# If this is from a particular story, book, or other source, providing more details might help me assist you better.

继续对话:

curl -X POST http://127.0.0.1:10014/module-rag/rag/compression/123 \
    -d '{"prompt": "What places do they visit?"}'

# 输出示例:
# Certainly! Here's a polite response to inform the user that the query is outside my knowledge base:
#
# I'm sorry, but I don't have the specific information about the characters going on an adventure in the North Pole.
# If this is from a particular story, book, or other source, providing more details might help me assist you better.

6. 技术实现细节

6.1 模块化RAG流程

// 基础RAG流程
public ChatResponse basicRAG(String prompt, String conversationId) {
    // 1. Query Enhancement
    Query enhancedQuery = queryAugmenter.augment(prompt);

    // 2. Document Retrieval
    List<Document> documents = documentRetriever.retrieve(enhancedQuery);

    // 3. Context Preparation
    String context = documentJoiner.join(documents);

    // 4. Response Generation
    return chatClient.call()
        .prompt(prompt)
        .user(user -> user.text(context))
        .chatResponse();
}

// 带压缩的RAG流程
public ChatResponse compressedRAG(String prompt, String conversationId) {
    // 1. Query Enhancement
    Query enhancedQuery = queryAugmenter.augment(prompt);

    // 2. Document Retrieval
    List<Document> documents = documentRetriever.retrieve(enhancedQuery);

    // 3. Document Compression
    List<Document> compressedDocs = documentCompressor.compress(documents);

    // 4. Context Preparation
    String context = documentJoiner.join(compressedDocs);

    // 5. Response Generation
    return chatClient.call()
        .prompt(prompt)
        .user(user -> user.text(context))
        .chatResponse();
}

6.2 对话记忆管理

@Service
public class ConversationMemoryService {

    private final Map<String, List<Message>> conversationHistory = new ConcurrentHashMap<>();

    public void addMessage(String conversationId, Message message) {
        conversationHistory.computeIfAbsent(conversationId, k -> new ArrayList<>())
                         .add(message);
    }

    public List<Message> getHistory(String conversationId) {
        return conversationHistory.getOrDefault(conversationId, Collections.emptyList());
    }

    public String buildContext(String conversationId, String currentQuery) {
        List<Message> history = getHistory(conversationId);
        StringBuilder context = new StringBuilder();

        // 添加历史对话上下文
        history.forEach(msg -> context.append(msg.toString()).append("\n"));

        // 添加当前查询
        context.append("Current query: ").append(currentQuery);

        return context.toString();
    }
}

6.3 文档压缩策略

@Component
public class DocumentCompressor {

    private final ChatClient chatClient;

    public List<Document> compress(List<Document> documents) {
        return documents.stream()
            .map(this::compressDocument)
            .filter(doc -> doc.getMetadata().containsKey("relevance_score"))
            .filter(doc -> (Double) doc.getMetadata().get("relevance_score") > 0.5)
            .collect(Collectors.toList());
    }

    private Document compressDocument(Document document) {
        String prompt = "Please compress the following text while preserving key information:\n" +
                       document.getContent();

        String compressedContent = chatClient.call()
            .prompt(prompt)
            .content();

        return new Document(compressedContent, document.getMetadata());
    }
}

7. 配置说明

7.1 Elasticsearch配置

spring:
  elasticsearch:
    uris: http://localhost:9200
    username: elastic
    password: changeme
    connection-timeout: 5s
    socket-timeout: 30s

7.2 向量存储配置

spring:
  ai:
    vectorstore:
      elasticsearch:
        index-name: module-rag-index
        dimensions: 1536
        similarity-function: cosine

7.3 RAG组件配置

rag:
  retrieval:
    top-k: 5
    similarity-threshold: 0.7
  compression:
    enabled: true
    max-compression-ratio: 0.5
  memory:
    max-history-length: 20
    ttl: 3600 # 1 hour

8. 性能优化

8.1 检索优化

  • 使用合适的索引结构
  • 调整 top-k 参数平衡精度和性能
  • 实现结果缓存机制

8.2 压缩优化

  • 并行处理文档压缩
  • 设置合理的压缩比例
  • 过滤低相关性文档

8.3 记忆管理

  • 定期清理过期对话
  • 限制历史记录长度
  • 使用滑动窗口机制

9. 扩展功能

9.1 自定义检索器

public class HybridRetriever implements DocumentRetriever {
    private final VectorStore vectorStore;
    private final SearchEngine searchEngine;

    @Override
    public List<Document> retrieve(Query query) {
        // 向量检索
        List<Document> vectorResults = vectorStore.similaritySearch(query);

        // 关键词检索
        List<Document> keywordResults = searchEngine.search(query.getText());

        // 结果合并和去重
        return mergeAndDeduplicate(vectorResults, keywordResults);
    }
}

9.2 查询扩展

public class LLMQueryExpander implements QueryExpander {
    private final ChatClient chatClient;

    @Override
    public List<Query> expand(Query originalQuery) {
        String prompt = "Expand the following query with 3 different perspectives: " +
                       originalQuery.getText();

        String expansion = chatClient.call()
            .prompt(prompt)
            .content();

        return parseExpansions(expansion);
    }
}

10. 监控和日志

10.1 性能监控

@Aspect
@Component
public class RAGPerformanceMonitor {

    @Around("execution(* com.alibaba.cloud.ai.example..*.*(..))")
    public Object monitorPerformance(ProceedingJoinPoint joinPoint) throws Throwable {
        long startTime = System.currentTimeMillis();
        Object result = joinPoint.proceed();
        long duration = System.currentTimeMillis() - startTime;

        log.info("Method {} executed in {} ms",
                joinPoint.getSignature().getName(), duration);

        return result;
    }
}

10.2 详细日志配置

logging:
  level:
    com.alibaba.cloud.ai.example: DEBUG
    org.springframework.ai: DEBUG
    org.elasticsearch.client: WARN
  pattern:
    console: "%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n"

11. 测试指导

11.1 单元测试

@SpringBootTest
class ModuleRagControllerTest {

    @Autowired
    private TestRestTemplate restTemplate;

    @Test
    void testRAGWithMemory() {
        String url = "/module-rag/rag/memory/test-conversation";
        Map<String, String> request = Map.of("prompt", "What is AI?");

        ResponseEntity<String> response = restTemplate.postForEntity(url, request, String.class);

        assertThat(response.getStatusCode()).isEqualTo(HttpStatus.OK);
        assertThat(response.getBody()).contains("AI");
    }
}

11.2 集成测试

@Testcontainers
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
class ModuleRagIntegrationTest {

    @Container
    static ElasticsearchContainer elasticsearch = new ElasticsearchContainer(
        "docker.elastic.co/elasticsearch/elasticsearch:8.11.0"
    );

    @DynamicPropertySource
    static void elasticsearchProperties(DynamicPropertyRegistry registry) {
        registry.add("spring.elasticsearch.uris", elasticsearch::getHttpHostAddress);
    }
}

12. 常见问题

12.1 检索结果质量差

  • 检查嵌入模型配置
  • 调整相似度阈值
  • 优化查询预处理

12.2 对话上下文丢失

  • 验证会话ID传递
  • 检查记忆存储机制
  • 确认上下文构建逻辑

12.3 压缩效果不明显

  • 调整压缩提示词
  • 修改压缩比例参数
  • 检查文档相关性过滤

更新时间: 2025-12-09 版本: 1.1.0