Reasoning-Augmented Generation (ReAG): Taking RAG to the Next Level

小虎哦哦

Published 2025-2-25 13:03

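When retrieval-augmented generation (RAG) first emerged, the industry had high hopes that it would push AI to a new level of intelligence.

In practice, however, RAG has exposed a number of shortcomings that limit how well it works and, with it, how far AI applications can go. Reasoning-augmented generation (ReAG) emerged against this backdrop. With a different architecture and operating logic, ReAG offers a new and workable approach to the problems RAG runs into, and it shows considerable potential.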

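1. The Pain Points of Traditional RAG

A traditional RAG system is like a librarian with a poor memory: it appears to be working hard to find material, yet things keep going wrong:

Semantic search misses the point: documents are matched on surface similarity. A search for "air pollution" turns up "car exhaust emissions", while a relevant study such as "Trends in Urban Lung Disease" is ignored.
Infrastructure adds trouble: chunking, embedding, and vector databases complicate the pipeline and are error-prone; stale indexes and bad splits are common.
Knowledge updates lag: medical and financial data change quickly, but RAG indexes are slow to refresh, so new knowledge does not make it in and the system falls behind.

Ask "Why are polar bear numbers declining?" and RAG answers only "melting sea ice", without mentioning the key issue of foraging. That is the flaw in RAG.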
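To make the "surface matching" complaint concrete, here is a toy sketch. It is not how a real RAG retriever scores documents (those use embeddings); the word-overlap scorer `overlap_score` is a hypothetical stand-in chosen only to show how a relevant title with no shared wording can score zero. The two titles are the examples from the list above.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercase word set; a deliberately crude stand-in for surface-level matching."""
    return set(re.findall(r"[a-z]+", text.lower()))

def overlap_score(query: str, title: str) -> float:
    """Jaccard overlap between the query words and the title words."""
    q, t = tokens(query), tokens(title)
    return len(q & t) / len(q | t) if q | t else 0.0

query = "air pollution"
titles = [
    "Air pollution from car exhaust emissions",
    "Trends in Urban Lung Disease",
]
scores = {title: overlap_score(query, title) for title in titles}
for title, score in scores.items():
    print(f"{score:.2f}  {title}")
```

The second title is plainly relevant to a reader, yet it shares no words with the query, so a purely surface-level scorer ranks it last.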

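2. Enter ReAG: Moving Beyond Traditional Retrieval

RAG has its share of problems; ReAG takes a different approach. It skips RAG's preprocessing pipeline and feeds raw material (text files, spreadsheets, URLs, and so on) directly to the language model.

The large language model then does the following:

Reads documents in full: no chunking or embedding, so the document's context is preserved intact.
Filters content precisely: it first decides whether a document is useful (relevance check), then determines which parts matter (content extraction).
Synthesizes answers intelligently: it integrates information the way a professional would and can find connections even when keywords do not match.

For example, when asked "Why are polar bears declining?", ReAG can analyze a report titled "Thermodynamics of Sea Ice" and, even though the words "polar bear" never appear in it, find the key passage on how shrinking sea ice affects their foraging and answer from it.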

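3. How ReAG Works

How does ReAG actually work? Here is the technical flow, broken down step by step:

Ingest raw documents directly: whether the source is Markdown, a PDF, or a URL, ReAG uses it as-is with no preprocessing.
Analyze documents in parallel: the large language model runs the relevance check and content extraction on every document concurrently.
Synthesize the answer dynamically: irrelevant documents are discarded and the answer is generated from the filtered content.

The flow is short: there is no index to build or maintain between ingesting a document and answering from it.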
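The three steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the article's implementation: `Assessment`, `reag_collect`, and `fake_analyze` are hypothetical names, and `fake_analyze` is a stub that stands in for a real LLM call so the example runs without any model.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable

@dataclass
class Assessment:
    content: str         # extracted passage (empty when irrelevant)
    is_irrelevant: bool  # result of the relevance check

# Hypothetical signature for the per-document model call: (question, document) -> Assessment.
Analyzer = Callable[[str, str], Assessment]

def reag_collect(question: str, documents: list[str], analyze: Analyzer,
                 max_workers: int = 4) -> list[str]:
    """Analyze every raw document in parallel and keep only the relevant extracts."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the input order of the documents.
        assessments = list(pool.map(lambda doc: analyze(question, doc), documents))
    return [a.content for a in assessments if not a.is_irrelevant]

def fake_analyze(question: str, document: str) -> Assessment:
    """Stub for an LLM call: treats a document as relevant if it mentions sea ice."""
    relevant = "sea ice" in document.lower()
    return Assessment(content=document if relevant else "", is_irrelevant=not relevant)

docs = [
    "Thermodynamics of sea ice: shorter ice seasons shrink hunting windows.",
    "Quarterly earnings report for a regional bank.",
]
context = reag_collect("Why are polar bears declining?", docs, fake_analyze)
print(context)
```

A thread pool fits here because each real call would spend most of its time waiting on the network; the filtered `context` list is what a synthesis model would then receive.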

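4. Where ReAG Comes Out Ahead: Strengths and Trade-offs

4.1 ReAG Strengths

Handles dynamic data quickly: constantly changing data such as real-time news and market feeds can be processed immediately, with no re-embedding required.
Copes with complex queries: for hard questions such as how regulatory policy affects community banks, ReAG is better than RAG at uncovering indirect connections.
Convenient multimodal analysis: ReAG can analyze charts, tables, and text together, without extra preprocessing.

4.2 ReAG Weaknesses

Higher cost: processing 100 documents takes 100 LLM calls with ReAG, whereas a RAG vector search costs far less.
Slow at scale: ReAG is slow on very large document sets; combining RAG and ReAG works better there.

ReAG has clear strengths and clear limits, so choose according to the workload.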
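A back-of-the-envelope calculation shows where the cost gap comes from. Every number below (tokens per document, chunk size, `top_k`, question length) is an assumed illustrative value, not a measurement, and the sketch ignores output tokens and RAG's one-time embedding cost.

```python
def reag_cost(num_docs: int, tokens_per_doc: int, question_tokens: int = 50) -> dict[str, int]:
    """ReAG: every document is read in full, one LLM call per document."""
    return {
        "llm_calls": num_docs,
        "input_tokens": num_docs * (tokens_per_doc + question_tokens),
    }

def rag_cost(top_k: int, tokens_per_chunk: int, question_tokens: int = 50) -> dict[str, int]:
    """RAG: vector search selects top_k chunks, then a single LLM call reads them."""
    return {
        "llm_calls": 1,
        "input_tokens": top_k * tokens_per_chunk + question_tokens,
    }

reag = reag_cost(num_docs=100, tokens_per_doc=2_000)
rag = rag_cost(top_k=5, tokens_per_chunk=400)
print(reag)
print(rag)
```

Under these assumptions ReAG sends about a hundred times more input tokens per query, which is the trade-off the list above describes.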

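5. The ReAG Tech Stack

The stack used in this article has three components:

5.1 Components

GROQ + Llama-3.3-70B-Versatile

Role: relevance assessment and initial document filtering.
Strengths: fast inference at 500+ tokens per second; 70 billion parameters for accurate scoring; a large 128K-token window.
Example: can recognize that "Thermodynamics of Sea Ice", which shares no keywords with the query, is relevant to "polar bear decline".

Ollama + DeepSeek-R1:14B

Task: response synthesis, reasoning its way to the answer.
Strengths: lightweight and cheap, optimized for extraction and summarization; runs locally for privacy and lower cost; 128K-token window.
Use: extracts key information from documents, such as data on changes in the foraging window during the ice-free period.

LangChain

Function: orchestrates and automates the pipeline.
Features: runs GROQ and Ollama tasks in parallel; manages documents, handles errors, and aggregates output.

5.2 Stack Advantages

Reasonable cost: GROQ handles the heavy work while Ollama handles lightweight tasks locally, which saves money.
Good scalability: GROQ's LPUs can handle a large number of concurrent assessments.
Flexible: models can be swapped without rewriting the pipeline.
Rule of thumb: for documents over 50 pages, pair ReAG with an LLM that has a large context window.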
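Whether a document fits a given context window can be estimated before sending it. The sketch below is a rough heuristic under stated assumptions: the 4-characters-per-token ratio is a common rule of thumb for English text rather than a real tokenizer count, `reserved_tokens` is a hypothetical allowance for the prompt and the reply, and the 128,000-token default mirrors the window size quoted above.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; a real tokenizer will give a different number."""
    return int(len(text) / chars_per_token)

def fits_in_window(text: str, window_tokens: int = 128_000,
                   reserved_tokens: int = 4_000) -> bool:
    """True if the estimated size leaves room for the prompt and the reply."""
    return estimate_tokens(text) <= window_tokens - reserved_tokens

short_doc = "x" * 40_000    # about 10,000 tokens under the heuristic
long_doc = "x" * 600_000    # about 150,000 tokens under the heuristic
print(fits_in_window(short_doc), fits_in_window(long_doc))
```

A document that fails the check would need a larger-window model or page-by-page processing, as in the implementation in section 6.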

6. Implementing ReAG in Code

Install the required dependencies

!pip install langchain langchain_groq langchain_ollama langchain_community pymupdf pypdf

Download the data

!mkdir ./data
!mkdir ./chunk_caches
!wget "https://www.binasss.sa.cr/int23/8.pdf" -O "./data/fibromyalgia.pdf"

Set up the large language models

from langchain_groq import ChatGroq
from langchain_ollama import ChatOllama
import os

# Supply your own key here; do not hard-code a real API key in shared code.
os.environ["GROQ_API_KEY"] = "<YOUR_GROQ_API_KEY>"

# Hosted model used for the per-page relevance check.
llm_relevancy = ChatGroq(
    model="llama-3.3-70b-versatile",
    temperature=0,
)

# Local model used to synthesize the final answer.
llm = ChatOllama(
    model="deepseek-r1:14b",
    temperature=0.6,
    num_predict=3000,  # maximum number of tokens to generate
)

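Define the system prompt

REAG_SYSTEM_PROMPT = """
# Role and Objective
You are an intelligent knowledge retrieval assistant. Your task is to analyze the provided documents or URLs and extract the most relevant information for the user's query.

# Instructions
1. Analyze the user's query carefully to identify key concepts and requirements.
2. Search the provided sources for relevant information and output the relevant parts in the "content" field.
3. If you cannot find the necessary information in the documents, return "isIrrelevant: true", otherwise return "isIrrelevant: false".

# Constraints
- Do not make assumptions beyond the available data
- Clearly indicate if no relevant information is found
- Stay objective when selecting sources
"""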

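Define the RAG prompt

rag_prompt = """You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question} 
Context: {context} 
Answer:
"""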

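Define the response schema

from pydantic import BaseModel, Field
from typing import List
from langchain_core.output_parsers import JsonOutputParser

class ResponseSchema(BaseModel):
    content: str = Field(..., description="The page content of the document that is relevant or sufficient to answer the question asked")
    reasoning: str = Field(..., description="The reasoning for selecting the page content with respect to the question asked")
    is_irrelevant: bool = Field(..., description="Specify 'True' if the content in the document is not sufficient or relevant to answer the question asked. Specify 'False' if the context or page content is relevant to answer the question asked")

class RelevancySchemaMessage(BaseModel):
    source: ResponseSchema

# Parses the model's JSON reply into a plain Python dict.
relevancy_parser = JsonOutputParser(pydantic_object=RelevancySchemaMessage)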

Load and process the input document

from langchain_community.document_loaders import PyMuPDFLoader

file_path = "./data/fibromyalgia.pdf"
loader = PyMuPDFLoader(file_path)
docs = loader.load()  # one Document per PDF page
print(len(docs))
print(docs[0].metadata)

Output

8
{'producer': 'Acrobat Distiller 6.0 for Windows',
'creator': 'Elsevier',
'creationdate': '2023-01-20T09:25:19-06:00',
'source': './data/fibromyalgia.pdf',
'file_path': './data/fibromyalgia.pdf',
'total_pages': 8,
'format': 'PDF 1.7',
'title': 'Fibromyalgia: Diagnosis and Management',
'author': 'Bradford T. Winslow MD',
'subject': 'American Family Physician, 107 (2023) 137-144',
'keywords': '',
'moddate': '2023-02-27T15:02:12+05:30',
'trapped': '',
'modDate': "D:20230227150212+05'30'",
'creationDate': "D:20230120092519-06'00'",
'page': 0}

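Helper function to format a document

from langchain_core.documents import Document

def format_doc(doc: Document) -> str:
    # Prefix the page text with its title and page number so the model can see the source.
    return f"Document_Title: {doc.metadata['title']}\nPage: {doc.metadata['page']}\nContent: {doc.page_content}"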

Helper function to extract relevant context

from langchain_core.prompts import PromptTemplate

def extract_relevant_context(question, documents):
    result = []
    for doc in documents:
        formatted_documents = format_doc(doc)
        system = f"{REAG_SYSTEM_PROMPT}\n\n# Available source\n\n{formatted_documents}"
        prompt = f"""Determine if the 'Available source' content supplied is sufficient and relevant to ANSWER the QUESTION asked.
        QUESTION: {question}
        #INSTRUCTIONS TO FOLLOW
        1. Analyze the context provided thoroughly to check its relevancy to help formulate a response for the QUESTION asked.
        2. STRICTLY PROVIDE THE RESPONSE IN A JSON STRUCTURE AS DESCRIBED BELOW:
            ```json
               {{"content":<<The page content of the document that is relevant or sufficient to answer the question asked>>,
                 "reasoning":<<The reasoning for selecting The page content with respect to the question asked>>,
                 "is_irrelevant":<<Specify 'True' if the content in the document is not sufficient or relevant.Specify 'False' if the page content is sufficient to answer the QUESTION>>
                 }}
            ```
         """
        messages = [
            {"role": "system", "content": system},
            {"role": "user", "content": prompt},
        ]
        # One LLM call per page: the model judges relevance and extracts content.
        response = llm_relevancy.invoke(messages)
        print(response.content)
        formatted_response = relevancy_parser.parse(response.content)
        result.append(formatted_response)
    final_context = []
    for items in result:
        # The model may return the flag as a boolean or as a string.
        if str(items['is_irrelevant']).strip().lower() == 'false':
            final_context.append(items['content'])
    return final_context
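Model replies do not always arrive as clean JSON: they may be wrapped in a code fence, or carry the flag as the string "False" instead of a boolean. The helper below is a defensive sketch using only the standard library. `parse_relevancy` is a hypothetical name, not part of LangChain, and treating unparseable replies as irrelevant is an assumed policy.

```python
import json
import re

def parse_relevancy(raw: str) -> dict:
    """Extract the JSON object from a model reply and normalize the relevance flag."""
    fallback = {"content": "", "reasoning": "", "is_irrelevant": True}
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match is None:
        # No JSON object at all: treat the page as irrelevant rather than crash.
        return fallback
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return fallback
    flag = data.get("is_irrelevant", True)
    if isinstance(flag, str):
        # Only an explicit "false" (any casing) counts as relevant.
        flag = flag.strip().lower() != "false"
    data["is_irrelevant"] = bool(flag)
    return data

fence = "`" * 3  # built at runtime so the example can show a fenced reply
reply = (
    f"{fence}json\n"
    '{"content": "Fibromyalgia is a chronic pain condition.", '
    '"reasoning": "Defines the term.", "is_irrelevant": "False"}\n'
    f"{fence}"
)
parsed = parse_relevancy(reply)
print(parsed["is_irrelevant"], parsed["content"])
```

With the flag normalized to a real boolean, the filtering loop can test `not item["is_irrelevant"]` directly.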

Call the function to retrieve relevant context

question = "What is Fibromyalgia?"
final_context = extract_relevant_context(question, docs)
print(len(final_context))

Helper function to generate the response

def generate_response(question, final_context):
    prompt = PromptTemplate(
        template=rag_prompt,
        input_variables=["question", "context"],
    )
    chain = prompt | llm
    response = chain.invoke({"question": question, "context": final_context})
    # Keep only the last paragraph, dropping the reasoning text that precedes the answer.
    answer = response.content.split("\n\n")[-1]
    print(answer)
    return answer
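Taking the last paragraph works for short answers but truncates any answer that spans several paragraphs. A sketch of an alternative follows, assuming the model wraps its reasoning in `<think>...</think>` tags, as DeepSeek-R1 typically does; `strip_reasoning` is a hypothetical helper name.

```python
import re

# Non-greedy match so that each <think>...</think> block is removed on its own.
THINK_BLOCK = re.compile(r"<think>.*?</think>", flags=re.DOTALL)

def strip_reasoning(text: str) -> str:
    """Remove <think>...</think> blocks and trim the surrounding whitespace."""
    return THINK_BLOCK.sub("", text).strip()

raw = (
    "<think>\nThe user asks about treatments.\n\nCheck the context.\n</think>\n\n"
    "First paragraph.\n\nSecond paragraph."
)
print(strip_reasoning(raw))
```

On this sample, `raw.split("\n\n")[-1]` would return only "Second paragraph.", while `strip_reasoning` keeps both answer paragraphs.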

Generate the response

final_response = generate_response(question, final_context)
final_response

Full response

'Fibromyalgia is a chronic condition characterized by widespread musculoskeletal pain, fatigue, disrupted sleep, and cognitive difficulties like "fibrofog." It is often associated with heightened sensitivity to pain due to altered nervous system processing. Diagnosis considers symptoms such as long-term pain, fatigue, and sleep issues without underlying inflammation or injury.'

Question 2

question =  "What are the causes of Fibromyalgia?"
final_context = extract_relevant_context(question, docs)
final_response = generate_response(question, final_context)

Full response

Fibromyalgia likely results from disordered central pain processing leading to heightened sensitivity (hyperalgesia and allodynia). Possible causes include dysfunction of the hypothalamic-pituitary-adrenal axis, inflammation, glial activation, small fiber neuropathy, infections like Epstein-Barr virus or Lyme disease, and a genetic component. Other conditions, such as infections or medication side effects, may also contribute to similar symptoms.

Question 3

question =  "Do people suffering from rheumatologic conditions may have fibromyalgia?"
final_context = extract_relevant_context(question, docs)
final_response = generate_response(question, final_context)

Full response

Yes, people with rheumatologic conditions, such as rheumatoid arthritis or psoriatic arthritis, may also have fibromyalgia. This is because they share overlapping symptoms, making diagnosis challenging.

Question 4

question =  "Mention the nonpharmacologic treatment for fibromyalgia?"
final_context = extract_relevant_context(question, docs)
final_response = generate_response(question, final_context)

Full response

Nonpharmacologic treatments for fibromyalgia include patient education, exercise, and cognitive behavior therapy (CBT).

Question 5

question =  "According to 2016 American College of Rheumatology Fibromyalgia what is the Diagnostic Criteria for Fibromyalgia?"
final_context = extract_relevant_context(question, docs)
final_response = generate_response(question, final_context)

Full response

The 2016 American College of Rheumatology diagnostic criteria for fibromyalgia require generalized pain in at least four of five body regions for at least three months. Additionally, patients must meet either a Widespread Pain Index (WPI) score of ≥7 with a Symptom Severity Scale (SSS) score of ≥5 or a WPI score of ≥4 with an SSS score of ≥9. Other disorders that could explain the symptoms must be ruled out.

Question 6

question =  "What is the starting dosage of Amitriptyline?"
final_context = extract_relevant_context(question, docs)
final_response = generate_response(question, final_context)

Full response

The starting dosage of Amitriptyline for adults is usually between 25 to 50 mg per day, often beginning with a lower dose of 5 to 10 mg at night to minimize side effects before gradually increasing.

Question 7

question = "What has been mentioned about AAPT 2019 Diagnostic Criteria for Fibromyalgia"
final_context = extract_relevant_context(question, docs)
final_response = generate_response(question, final_context)

Full response

The AAPT 2019 criteria for fibromyalgia include multisite pain in at least six of nine specified areas, moderate to severe sleep problems or fatigue, and symptoms lasting three months or more.

Question 8

question =  "What are the medications and doses for Fibromyalgia?"
final_context = extract_relevant_context(question, docs)
print(final_context)
final_response = generate_response(question, final_context)

Output

['Duloxetine, milnacipran, pregabalin, and amitriptyline are potentially effective medications for fibromyalgia. Nonsteroidal anti-inflammatory drugs and opioids have not demonstrated benefits for fibromyalgia and have significant limitations.',
 'Amitriptyline, cyclobenzaprine, duloxetine (Cymbalta), milnacipran (Savella), and pregabalin (Lyrica) are effective for pain in fibromyalgia.43,46-48,50,52,54',
 'Amitriptyline (tricyclic antidepressant) - 5 to 10 mg at night, 20 to 30 mg at night. Cyclobenzaprine (muscle relaxant; tricyclic derivative) - 5 to 10 mg at night, 10 to 40 mg daily in 1 to 3 divided doses. Duloxetine (Cymbalta; serotonin-norepinephrine reuptake inhibitor) - 20 to 30 mg every morning, 60 mg every morning. Milnacipran (Savella; serotonin-norepinephrine reuptake inhibitor) - 12.5 mg every morning, 50 mg twice daily. Pregabalin (Lyrica; gabapentinoid) - 25 to 50 mg at bedtime, 150 to 450 mg at bedtime.',
 'Fibromyalgia is often treated with medications such as pregabalin (Lyrica) and duloxetine (Cymbalta). Pregabalin can be started at a dose of 75 mg twice daily, with a maximum dose of 450 mg/day. Duloxetine can be initiated at a dose of 30 mg once daily, with a target dose of 60 mg/day.',
 'Fibromyalgia is often treated with medications such as pregabalin (Lyrica) and duloxetine (Cymbalta). Pregabalin can be started at a dose of 75 mg twice daily, with a maximum dose of 450 mg/day. Duloxetine can be initiated at a dose of 30 mg once daily, with a target dose of 60 mg/day.']

Final response

The medications commonly used to treat fibromyalgia include:

1. **Amitriptyline**: A tricyclic antidepressant typically taken at night in doses ranging from 5 to 30 mg.

2. **Cyclobenzaprine**: A muscle relaxant and tricyclic derivative, usually administered in doses up to 40 mg daily in divided doses.

3. **Duloxetine (Cymbalta)**: A serotonin-norepinephrine reuptake inhibitor taken in the morning, starting at 20-30 mg and increasing to 60 mg if needed.

4. **Milnacipran (Savella)**: Another serotonin-norepinephrine reuptake inhibitor, starting at 12.5 mg in the morning and potentially increased to 50 mg twice daily.

5. **Pregabalin (Lyrica)**: A gabapentinoid taken at bedtime, beginning with 75 mg twice daily and up to a maximum of 450 mg/day.

These medications are effective for managing pain associated with fibromyalgia. It's important to note that dosages should be adjusted under medical supervision, starting low and increasing as necessary. Additionally, NSAIDs and opioids are not recommended for treating fibromyalgia due to limited effectiveness and potential side effects.

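7. Practical Applications

Medical research: synthesizing insights from raw clinical trial data and journals.
Financial markets: analyzing live earnings reports and U.S. Securities and Exchange Commission (SEC) filings to inform real-time investment strategy.
Legal analysis: dissecting complex case law and identifying connections between precedents.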

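8. The Future of ReAG

Hybrid systems: use retrieval-augmented generation (RAG) for a first-pass filter, then reasoning-augmented generation (ReAG) for in-depth analysis.
Lower-cost models: open-source LLMs (such as DeepSeek) and quantization will bring costs down.
Larger context windows: future models may be able to handle documents containing a billion tokens, which would make ReAG more powerful still.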
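The hybrid idea in the list above can be sketched as a two-stage filter. This is an illustration under stated assumptions: `keyword_score` is a crude word-overlap stand-in for the vector search a real RAG stage would use, `deep_check` represents the per-document LLM relevance call, and `fake_deep_check` is a stub so the example runs offline. All names are hypothetical.

```python
import re
from typing import Callable

def keyword_score(question: str, document: str) -> int:
    """Count question words that appear in the document (stand-in for vector similarity)."""
    q_words = set(re.findall(r"[a-z]+", question.lower()))
    d_words = set(re.findall(r"[a-z]+", document.lower()))
    return len(q_words & d_words)

def hybrid_context(question: str, documents: list[str],
                   deep_check: Callable[[str, str], bool], top_k: int = 2) -> list[str]:
    """Stage 1: cheap ranking keeps top_k candidates. Stage 2: costly check on those only."""
    ranked = sorted(documents, key=lambda d: keyword_score(question, d), reverse=True)
    return [d for d in ranked[:top_k] if deep_check(question, d)]

calls = []  # records which documents reach the expensive stage

def fake_deep_check(question: str, document: str) -> bool:
    """Stub for the LLM relevance call: accepts documents that mention a dose."""
    calls.append(document)
    return "dose" in document.lower()

docs = [
    "Amitriptyline starting dose is 5 to 10 mg at night.",
    "Amitriptyline is a tricyclic antidepressant.",
    "Exercise and patient education are nonpharmacologic options.",
]
kept = hybrid_context("What is the starting dose of amitriptyline?", docs, fake_deep_check)
print(kept, len(calls))
```

Only two of the three documents reach the expensive stage, which is the saving the hybrid design aims for; the price is that a relevant document can be lost if the cheap first stage ranks it too low.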

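9. Conclusion

As AI technology keeps iterating, ReAG offers a fresh way of thinking. ReAG is not meant to replace RAG; rather, it rethinks how AI interacts with knowledge.

ReAG turns retrieval into a reasoning task, aiming to mirror the thoroughness, attention to detail, and contextual awareness of human research. In medical research it can sift through clinical data; in finance it can track market developments. These strengths are why it is starting to gain attention in several fields.

As the technology matures, ReAG may open up more application scenarios across industries.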

Reposted from AI科技论谈. Author: AI科技论谈
