如何讓 RAG 應用程式新增引用

本指南回顧了讓模型引用其在生成回應時參考的來源文件部分的方法。

我們將涵蓋五種方法

使用工具呼叫引用文件 ID；
使用工具呼叫引用文件 ID 並提供文本片段；
直接提示；
檢索後處理（即，壓縮檢索到的上下文以使其更相關）；
生成後處理（即，發出第二次 LLM 呼叫以使用引用註釋生成的答案）。

我們通常建議使用列表中第一個適用於您用例的方法。也就是說，如果您的模型支援工具呼叫，請嘗試方法 1 或 2；否則，或者如果這些方法失敗，請繼續向下列表。

首先，讓我們建立一個簡單的 RAG 鏈。首先，我們將使用 WikipediaRetriever 從維基百科檢索。我們將使用 RAG 教學中的相同 LangGraph 實作。

設定

首先，我們需要安裝一些依賴項

%pip install -qU langchain-community wikipedia

首先，讓我們選擇一個 LLM

選擇聊天模型

pip install -qU "langchain[openai]"

import getpass
import os

if not os.environ.get("OPENAI_API_KEY"):
  os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter API key for OpenAI: ")

from langchain.chat_models import init_chat_model

llm = init_chat_model("gpt-4o-mini", model_provider="openai")

我們現在可以載入檢索器並建構我們的提示

from langchain_community.retrievers import WikipediaRetriever
from langchain_core.prompts import ChatPromptTemplate

system_prompt = (
    "You're a helpful AI assistant. Given a user question "
    "and some Wikipedia article snippets, answer the user "
    "question. If none of the articles answer the question, "
    "just say you don't know."
    "\n\nHere are the Wikipedia articles: "
    "{context}"
)

retriever = WikipediaRetriever(top_k_results=6, doc_content_chars_max=2000)
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{question}"),
    ]
)
prompt.pretty_print()

API 參考：WikipediaRetriever | ChatPromptTemplate

================================[1m System Message [0m================================

You're a helpful AI assistant. Given a user question and some Wikipedia article snippets, answer the user question. If none of the articles answer the question, just say you don't know.

Here are the Wikipedia articles: [33;1m[1;3m{context}[0m

================================[1m Human Message [0m=================================

[33;1m[1;3m{question}[0m

現在我們有了模型、檢索器和提示，讓我們將它們全部鏈接在一起。按照關於為 RAG 應用程式新增引用的操作指南，我們將使其鏈返回答案和檢索到的文件。這使用了與RAG 教學中相同的 LangGraph 實作。

from langchain_core.documents import Document
from langgraph.graph import START, StateGraph
from typing_extensions import List, TypedDict


# Define state for application
class State(TypedDict):
    question: str
    context: List[Document]
    answer: str


# Define application steps
def retrieve(state: State):
    retrieved_docs = retriever.invoke(state["question"])
    return {"context": retrieved_docs}


def generate(state: State):
    docs_content = "\n\n".join(doc.page_content for doc in state["context"])
    messages = prompt.invoke({"question": state["question"], "context": docs_content})
    response = llm.invoke(messages)
    return {"answer": response.content}


# Compile application and test
graph_builder = StateGraph(State).add_sequence([retrieve, generate])
graph_builder.add_edge(START, "retrieve")
graph = graph_builder.compile()

API 參考：Document | StateGraph

from IPython.display import Image, display

display(Image(graph.get_graph().draw_mermaid_png()))

result = graph.invoke({"question": "How fast are cheetahs?"})

sources = [doc.metadata["source"] for doc in result["context"]]
print(f"Sources: {sources}\n\n")
print(f'Answer: {result["answer"]}')

Sources: ['https://en.wikipedia.org/wiki/Cheetah', 'https://en.wikipedia.org/wiki/Southeast_African_cheetah', 'https://en.wikipedia.org/wiki/Footspeed', 'https://en.wikipedia.org/wiki/Fastest_animals', 'https://en.wikipedia.org/wiki/Pursuit_predation', 'https://en.wikipedia.org/wiki/Gepard-class_fast_attack_craft']

Answer: Cheetahs are capable of running at speeds between 93 to 104 km/h (58 to 65 mph).

查看 LangSmith 追蹤。

工具呼叫

如果您的首選 LLM 實作了工具呼叫功能，您可以使用它讓模型指定在生成答案時參考的提供的文件。 LangChain 工具呼叫模型實作了 .with_structured_output 方法，該方法將強制生成符合所需架構的輸出（請參閱此處的詳細資訊）。

引用文件

要使用識別符引用文件，我們將識別符格式化到提示中，然後使用 .with_structured_output 強制 LLM 在其輸出中引用這些識別符。

首先，我們定義輸出的架構。 .with_structured_output 支援多種格式，包括 JSON 架構和 Pydantic。在這裡，我們將使用 Pydantic

from pydantic import BaseModel, Field


class CitedAnswer(BaseModel):
    """Answer the user question based only on the given sources, and cite the sources used."""

    answer: str = Field(
        ...,
        description="The answer to the user question, which is based only on the given sources.",
    )
    citations: List[int] = Field(
        ...,
        description="The integer IDs of the SPECIFIC sources which justify the answer.",
    )

讓我們看看當我們傳遞函式和用戶輸入時，模型輸出是什麼樣的

structured_llm = llm.with_structured_output(CitedAnswer)

example_q = """What Brian's height?

Source: 1
Information: Suzy is 6'2"

Source: 2
Information: Jeremiah is blonde

Source: 3
Information: Brian is 3 inches shorter than Suzy"""
result = structured_llm.invoke(example_q)

result

CitedAnswer(answer='Brian is 5\'11".', citations=[1, 3])

或作為字典

result.dict()

{'answer': 'Brian is 5\'11".', 'citations': [1, 3]}

現在，我們將來源識別符結構化到提示中，以與我們的鏈複製。我們將進行三項變更

更新提示以包含來源識別符；
使用 structured_llm（即，llm.with_structured_output(CitedAnswer)）；
在輸出中傳回 Pydantic 物件。

def format_docs_with_id(docs: List[Document]) -> str:
    formatted = [
        f"Source ID: {i}\nArticle Title: {doc.metadata['title']}\nArticle Snippet: {doc.page_content}"
        for i, doc in enumerate(docs)
    ]
    return "\n\n" + "\n\n".join(formatted)


class State(TypedDict):
    question: str
    context: List[Document]
    answer: CitedAnswer


def generate(state: State):
    formatted_docs = format_docs_with_id(state["context"])
    messages = prompt.invoke({"question": state["question"], "context": formatted_docs})
    structured_llm = llm.with_structured_output(CitedAnswer)
    response = structured_llm.invoke(messages)
    return {"answer": response}


graph_builder = StateGraph(State).add_sequence([retrieve, generate])
graph_builder.add_edge(START, "retrieve")
graph = graph_builder.compile()

result = graph.invoke({"question": "How fast are cheetahs?"})

result["answer"]

CitedAnswer(answer='Cheetahs are capable of running at speeds between 93 to 104 km/h (58 to 65 mph).', citations=[0, 3])

我們可以檢查索引 0 處的文件，模型引用了該文件

print(result["context"][0])

page_content='The cheetah (Acinonyx jubatus) is a large cat and the fastest land animal. It has a tawny to creamy white or pale buff fur that is marked with evenly spaced, solid black spots. The head is small and rounded, with a short snout and black tear-like facial streaks. It reaches 67–94 cm (26–37 in) at the shoulder, and the head-and-body length is between 1.1 and 1.5 m (3 ft 7 in and 4 ft 11 in). Adults weigh between 21 and 72 kg (46 and 159 lb). The cheetah is capable of running at 93 to 104 km/h (58 to 65 mph); it has evolved specialized adaptations for speed, including a light build, long thin legs and a long tail.
The cheetah was first described in the late 18th century. Four subspecies are recognised today that are native to Africa and central Iran. An African subspecies was introduced to India in 2022. It is now distributed mainly in small, fragmented populations in northwestern, eastern and southern Africa and central Iran. It lives in a variety of habitats such as savannahs in the Serengeti, arid mountain ranges in the Sahara, and hilly desert terrain.
The cheetah lives in three main social groups: females and their cubs, male "coalitions", and solitary males. While females lead a nomadic life searching for prey in large home ranges, males are more sedentary and instead establish much smaller territories in areas with plentiful prey and access to females. The cheetah is active during the day, with peaks during dawn and dusk. It feeds on small- to medium-sized prey, mostly weighing under 40 kg (88 lb), and prefers medium-sized ungulates such as impala, springbok and Thomson's gazelles. The cheetah typically stalks its prey within 60–100 m (200–330 ft) before charging towards it, trips it during the chase and bites its throat to suffocate it to death. It breeds throughout the year. After a gestation of nearly three months, females give birth to a litter of three or four cubs. Cheetah cubs are highly vulnerable to predation by other large carnivores. They are weaned a' metadata={'title': 'Cheetah', 'summary': 'The cheetah (Acinonyx jubatus) is a large cat and the fastest land animal. It has a tawny to creamy white or pale buff fur that is marked with evenly spaced, solid black spots. The head is small and rounded, with a short snout and black tear-like facial streaks. It reaches 67–94 cm (26–37 in) at the shoulder, and the head-and-body length is between 1.1 and 1.5 m (3 ft 7 in and 4 ft 11 in). Adults weigh between 21 and 72 kg (46 and 159 lb). The cheetah is capable of running at 93 to 104 km/h (58 to 65 mph); it has evolved specialized adaptations for speed, including a light build, long thin legs and a long tail.\nThe cheetah was first described in the late 18th century. Four subspecies are recognised today that are native to Africa and central Iran. An African subspecies was introduced to India in 2022. It is now distributed mainly in small, fragmented populations in northwestern, eastern and southern Africa and central Iran. It lives in a variety of habitats such as savannahs in the Serengeti, arid mountain ranges in the Sahara, and hilly desert terrain.\nThe cheetah lives in three main social groups: females and their cubs, male "coalitions", and solitary males. While females lead a nomadic life searching for prey in large home ranges, males are more sedentary and instead establish much smaller territories in areas with plentiful prey and access to females. The cheetah is active during the day, with peaks during dawn and dusk. It feeds on small- to medium-sized prey, mostly weighing under 40 kg (88 lb), and prefers medium-sized ungulates such as impala, springbok and Thomson\'s gazelles. The cheetah typically stalks its prey within 60–100 m (200–330 ft) before charging towards it, trips it during the chase and bites its throat to suffocate it to death. It breeds throughout the year. After a gestation of nearly three months, females give birth to a litter of three or four cubs. Cheetah cubs are highly vulnerable to predation by other large carnivores. They are weaned at around four months and are independent by around 20 months of age.\nThe cheetah is threatened by habitat loss, conflict with humans, poaching and high susceptibility to diseases. The global cheetah population was estimated in 2021 at 6,517; it is listed as Vulnerable on the IUCN Red List. It has been widely depicted in art, literature, advertising, and animation. It was tamed in ancient Egypt and trained for hunting ungulates in the Arabian Peninsula and India. It has been kept in zoos since the early 19th century.', 'source': 'https://en.wikipedia.org/wiki/Cheetah'}

LangSmith 追蹤： https://smith.langchain.com/public/6f34d136-451d-4625-90c8-2d8decebc21a/r

引用片段

要傳回文本跨度（可能除了來源識別符之外），我們可以使用相同的方法。唯一的變更是建立更複雜的輸出架構，此處使用 Pydantic，其中包括「引文」以及來源識別符。

旁註：請注意，如果我們分解文件，使我們擁有許多只有一兩個句子的文件，而不是少數長文件，則引用文件大致相當於引用片段，並且對於模型來說可能更容易，因為模型只需要為每個片段傳回識別符，而不是實際文本。可能值得嘗試這兩種方法並進行評估。

class Citation(BaseModel):
    source_id: int = Field(
        ...,
        description="The integer ID of a SPECIFIC source which justifies the answer.",
    )
    quote: str = Field(
        ...,
        description="The VERBATIM quote from the specified source that justifies the answer.",
    )


class QuotedAnswer(BaseModel):
    """Answer the user question based only on the given sources, and cite the sources used."""

    answer: str = Field(
        ...,
        description="The answer to the user question, which is based only on the given sources.",
    )
    citations: List[Citation] = Field(
        ..., description="Citations from the given sources that justify the answer."
    )

class State(TypedDict):
    question: str
    context: List[Document]
    answer: QuotedAnswer


def generate(state: State):
    formatted_docs = format_docs_with_id(state["context"])
    messages = prompt.invoke({"question": state["question"], "context": formatted_docs})
    structured_llm = llm.with_structured_output(QuotedAnswer)
    response = structured_llm.invoke(messages)
    return {"answer": response}


graph_builder = StateGraph(State).add_sequence([retrieve, generate])
graph_builder.add_edge(START, "retrieve")
graph = graph_builder.compile()

在這裡，我們看到模型已從來源 0 中提取了相關的文本片段

result = graph.invoke({"question": "How fast are cheetahs?"})

result["answer"]

QuotedAnswer(answer='Cheetahs are capable of running at speeds of 93 to 104 km/h (58 to 65 mph).', citations=[Citation(source_id=0, quote='The cheetah is capable of running at 93 to 104 km/h (58 to 65 mph); it has evolved specialized adaptations for speed.')])

LangSmith 追蹤： https://smith.langchain.com/public/e16dc72f-4261-4f25-a9a7-906238737283/r

直接提示

某些模型不支援函式呼叫。我們可以透過直接提示獲得類似的結果。讓我們嘗試指示模型為其輸出生成結構化的 XML

xml_system = """You're a helpful AI assistant. Given a user question and some Wikipedia article snippets, \
answer the user question and provide citations. If none of the articles answer the question, just say you don't know.

Remember, you must return both an answer and citations. A citation consists of a VERBATIM quote that \
justifies the answer and the ID of the quote article. Return a citation for every quote across all articles \
that justify the answer. Use the following format for your final output:

<cited_answer>
    <answer></answer>
    <citations>
        <citation><source_id></source_id><quote></quote></citation>
        <citation><source_id></source_id><quote></quote></citation>
        ...
    </citations>
</cited_answer>

Here are the Wikipedia articles:{context}"""
xml_prompt = ChatPromptTemplate.from_messages(
    [("system", xml_system), ("human", "{question}")]
)

我們現在對我們的鏈進行類似的小更新

我們更新格式化函式以將檢索到的上下文包裝在 XML 標籤中；
我們不使用 .with_structured_output（例如，因為它對於模型不存在）；
我們使用 XMLOutputParser 將答案解析為字典。

from langchain_core.output_parsers import XMLOutputParser


def format_docs_xml(docs: List[Document]) -> str:
    formatted = []
    for i, doc in enumerate(docs):
        doc_str = f"""\
    <source id=\"{i}\">
        <title>{doc.metadata['title']}</title>
        <article_snippet>{doc.page_content}</article_snippet>
    </source>"""
        formatted.append(doc_str)
    return "\n\n<sources>" + "\n".join(formatted) + "</sources>"


class State(TypedDict):
    question: str
    context: List[Document]
    answer: dict


def generate(state: State):
    formatted_docs = format_docs_xml(state["context"])
    messages = xml_prompt.invoke(
        {"question": state["question"], "context": formatted_docs}
    )
    response = llm.invoke(messages)
    parsed_response = XMLOutputParser().invoke(response)
    return {"answer": parsed_response}


graph_builder = StateGraph(State).add_sequence([retrieve, generate])
graph_builder.add_edge(START, "retrieve")
graph = graph_builder.compile()

API 參考：XMLOutputParser

請注意，引用再次結構化到答案中

result = graph.invoke({"question": "How fast are cheetahs?"})

result["answer"]

{'cited_answer': [{'answer': 'Cheetahs can run at speeds of 93 to 104 km/h (58 to 65 mph).'},
  {'citations': [{'citation': [{'source_id': '0'},
      {'quote': 'The cheetah is capable of running at 93 to 104 km/h (58 to 65 mph);'}]},
    {'citation': [{'source_id': '3'},
      {'quote': 'The fastest land animal is the cheetah.'}]}]}]}

LangSmith 追蹤： https://smith.langchain.com/public/0c45f847-c640-4b9a-a5fa-63559e413527/r

檢索後處理

另一種方法是對我們檢索到的文件進行後處理以壓縮內容，以便來源內容已經足夠精簡，我們不需要模型引用特定來源或跨度。例如，我們可以將每個文件分成一兩個句子，嵌入它們，並僅保留最相關的句子。 LangChain 具有一些用於此目的的內建組件。在這裡，我們將使用 RecursiveCharacterTextSplitter，它透過在分隔符子字串上分割來建立指定大小的區塊，以及 EmbeddingsFilter，它僅保留具有最相關嵌入的文本。

這種方法有效地更新了我們的 retrieve 步驟以壓縮文件。首先，讓我們選擇一個嵌入模型

選擇嵌入模型

pip install -qU langchain-openai

import getpass
import os

if not os.environ.get("OPENAI_API_KEY"):
  os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter API key for OpenAI: ")

from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-large")

我們現在可以重寫 retrieve 步驟

from langchain.retrievers.document_compressors import EmbeddingsFilter
from langchain_core.runnables import RunnableParallel
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=400,
    chunk_overlap=0,
    separators=["\n\n", "\n", ".", " "],
    keep_separator=False,
)
compressor = EmbeddingsFilter(embeddings=embeddings, k=10)


class State(TypedDict):
    question: str
    context: List[Document]
    answer: str


def retrieve(state: State):
    retrieved_docs = retriever.invoke(state["question"])
    split_docs = splitter.split_documents(retrieved_docs)
    stateful_docs = compressor.compress_documents(split_docs, state["question"])
    return {"context": stateful_docs}

API 參考：EmbeddingsFilter | RunnableParallel | RecursiveCharacterTextSplitter

讓我們測試一下

retrieval_result = retrieve({"question": "How fast are cheetahs?"})

for doc in retrieval_result["context"]:
    print(f"{doc.page_content}\n\n")

Adults weigh between 21 and 72 kg (46 and 159 lb). The cheetah is capable of running at 93 to 104 km/h (58 to 65 mph); it has evolved specialized adaptations for speed, including a light build, long thin legs and a long tail

The cheetah (Acinonyx jubatus) is a large cat and the fastest land animal. It has a tawny to creamy white or pale buff fur that is marked with evenly spaced, solid black spots. The head is small and rounded, with a short snout and black tear-like facial streaks. It reaches 67–94 cm (26–37 in) at the shoulder, and the head-and-body length is between 1.1 and 1.5 m (3 ft 7 in and 4 ft 11 in)

2 mph), or 171 body lengths per second. The cheetah, the fastest land mammal, scores at only 16 body lengths per second

It feeds on small- to medium-sized prey, mostly weighing under 40 kg (88 lb), and prefers medium-sized ungulates such as impala, springbok and Thomson's gazelles. The cheetah typically stalks its prey within 60–100 m (200–330 ft) before charging towards it, trips it during the chase and bites its throat to suffocate it to death. It breeds throughout the year

The cheetah was first described in the late 18th century. Four subspecies are recognised today that are native to Africa and central Iran. An African subspecies was introduced to India in 2022. It is now distributed mainly in small, fragmented populations in northwestern, eastern and southern Africa and central Iran

The cheetah lives in three main social groups: females and their cubs, male "coalitions", and solitary males. While females lead a nomadic life searching for prey in large home ranges, males are more sedentary and instead establish much smaller territories in areas with plentiful prey and access to females. The cheetah is active during the day, with peaks during dawn and dusk

The Southeast African cheetah (Acinonyx jubatus jubatus) is the nominate cheetah subspecies native to East and Southern Africa. The Southern African cheetah lives mainly in the lowland areas and deserts of the Kalahari, the savannahs of Okavango Delta, and the grasslands of the Transvaal region in South Africa. In Namibia, cheetahs are mostly found in farmlands

Subpopulations have been called "South African cheetah" and "Namibian cheetah."

In India, four cheetahs of the subspecies are living in Kuno National Park in Madhya Pradesh after having been introduced there

Acinonyx jubatus velox proposed in 1913 by Edmund Heller on basis of a cheetah that was shot by Kermit Roosevelt in June 1909 in the Kenyan highlands.
Acinonyx rex proposed in 1927 by Reginald Innes Pocock on basis of a specimen from the Umvukwe Range in Rhodesia.

接下來，我們像以前一樣將其組裝到我們的鏈中

# This step is unchanged from our original RAG implementation
def generate(state: State):
    docs_content = "\n\n".join(doc.page_content for doc in state["context"])
    messages = prompt.invoke({"question": state["question"], "context": docs_content})
    response = llm.invoke(messages)
    return {"answer": response.content}


graph_builder = StateGraph(State).add_sequence([retrieve, generate])
graph_builder.add_edge(START, "retrieve")
graph = graph_builder.compile()

result = graph.invoke({"question": "How fast are cheetahs?"})

print(result["answer"])

Cheetahs are capable of running at speeds between 93 to 104 km/h (58 to 65 mph). They are known as the fastest land animals.

請注意，文件內容現在已壓縮，儘管文件物件在其元數據的「摘要」鍵中保留了原始內容。這些摘要不會傳遞給模型；僅傳遞精簡的內容。

result["context"][0].page_content  # passed to model

'Adults weigh between 21 and 72 kg (46 and 159 lb). The cheetah is capable of running at 93 to 104 km/h (58 to 65 mph); it has evolved specialized adaptations for speed, including a light build, long thin legs and a long tail'

result["context"][0].metadata["summary"]  # original document  # original document

'The cheetah (Acinonyx jubatus) is a large cat and the fastest land animal. It has a tawny to creamy white or pale buff fur that is marked with evenly spaced, solid black spots. The head is small and rounded, with a short snout and black tear-like facial streaks. It reaches 67–94 cm (26–37 in) at the shoulder, and the head-and-body length is between 1.1 and 1.5 m (3 ft 7 in and 4 ft 11 in). Adults weigh between 21 and 72 kg (46 and 159 lb). The cheetah is capable of running at 93 to 104 km/h (58 to 65 mph); it has evolved specialized adaptations for speed, including a light build, long thin legs and a long tail.\nThe cheetah was first described in the late 18th century. Four subspecies are recognised today that are native to Africa and central Iran. An African subspecies was introduced to India in 2022. It is now distributed mainly in small, fragmented populations in northwestern, eastern and southern Africa and central Iran. It lives in a variety of habitats such as savannahs in the Serengeti, arid mountain ranges in the Sahara, and hilly desert terrain.\nThe cheetah lives in three main social groups: females and their cubs, male "coalitions", and solitary males. While females lead a nomadic life searching for prey in large home ranges, males are more sedentary and instead establish much smaller territories in areas with plentiful prey and access to females. The cheetah is active during the day, with peaks during dawn and dusk. It feeds on small- to medium-sized prey, mostly weighing under 40 kg (88 lb), and prefers medium-sized ungulates such as impala, springbok and Thomson\'s gazelles. The cheetah typically stalks its prey within 60–100 m (200–330 ft) before charging towards it, trips it during the chase and bites its throat to suffocate it to death. It breeds throughout the year. After a gestation of nearly three months, females give birth to a litter of three or four cubs. Cheetah cubs are highly vulnerable to predation by other large carnivores. They are weaned at around four months and are independent by around 20 months of age.\nThe cheetah is threatened by habitat loss, conflict with humans, poaching and high susceptibility to diseases. The global cheetah population was estimated in 2021 at 6,517; it is listed as Vulnerable on the IUCN Red List. It has been widely depicted in art, literature, advertising, and animation. It was tamed in ancient Egypt and trained for hunting ungulates in the Arabian Peninsula and India. It has been kept in zoos since the early 19th century.'

LangSmith 追蹤： https://smith.langchain.com/public/21b0dc15-d70a-4293-9402-9c70f9178e66/r

生成後處理

另一種方法是對我們的模型生成進行後處理。在此範例中，我們將首先僅生成一個答案，然後我們將要求模型使用引用註釋其自己的答案。這種方法的缺點當然是它速度較慢且成本更高，因為需要進行兩次模型呼叫。

讓我們將此應用於我們的初始鏈。如果需要，我們可以透過應用程式中的第三個步驟來實作此功能。

class Citation(BaseModel):
    source_id: int = Field(
        ...,
        description="The integer ID of a SPECIFIC source which justifies the answer.",
    )
    quote: str = Field(
        ...,
        description="The VERBATIM quote from the specified source that justifies the answer.",
    )


class AnnotatedAnswer(BaseModel):
    """Annotate the answer to the user question with quote citations that justify the answer."""

    citations: List[Citation] = Field(
        ..., description="Citations from the given sources that justify the answer."
    )


structured_llm = llm.with_structured_output(AnnotatedAnswer)

class State(TypedDict):
    question: str
    context: List[Document]
    answer: str
    annotations: AnnotatedAnswer


def retrieve(state: State):
    retrieved_docs = retriever.invoke(state["question"])
    return {"context": retrieved_docs}


def generate(state: State):
    docs_content = "\n\n".join(doc.page_content for doc in state["context"])
    messages = prompt.invoke({"question": state["question"], "context": docs_content})
    response = llm.invoke(messages)
    return {"answer": response.content}


def annotate(state: State):
    formatted_docs = format_docs_with_id(state["context"])
    messages = [
        ("system", system_prompt.format(context=formatted_docs)),
        ("human", state["question"]),
        ("ai", state["answer"]),
        ("human", "Annotate your answer with citations."),
    ]
    response = structured_llm.invoke(messages)
    return {"annotations": response}


graph_builder = StateGraph(State).add_sequence([retrieve, generate, annotate])
graph_builder.add_edge(START, "retrieve")
graph = graph_builder.compile()

display(Image(graph.get_graph().draw_mermaid_png()))

result = graph.invoke({"question": "How fast are cheetahs?"})

print(result["answer"])

Cheetahs are capable of running at speeds between 93 to 104 km/h (58 to 65 mph).

result["annotations"]

AnnotatedAnswer(citations=[Citation(source_id=0, quote='The cheetah is capable of running at 93 to 104 km/h (58 to 65 mph)')])

LangSmith 追蹤： https://smith.langchain.com/public/b8257417-573b-47c4-a750-74e542035f19/r

設定​

工具呼叫​

引用文件​

引用片段​

直接提示​

檢索後處理​

生成後處理​

此頁面是否對您有幫助？

設定