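Summarize Text
This tutorial demonstrates text summarization using built-in chains and LangGraph.
A previous version of this page showcased the legacy chains StuffDocumentsChain, MapReduceDocumentsChain, and RefineDocumentsChain. See here for information on using those abstractions and a comparison with the methods demonstrated in this tutorial.
Suppose you have a set of documents (PDFs, Notion pages, customer questions, etc.) and you want to summarize the content.
LLMs are a great tool for this given their proficiency in understanding and synthesizing text.
In the context of retrieval-augmented generation, summarizing text can help distill the information in a large number of retrieved documents to provide context for an LLM.
In this walkthrough we'll go over how to summarize content from multiple documents using LLMs.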
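Concepts
Concepts we will cover are:
- Using language models.
- Using document loaders, specifically WebBaseLoader, to load content from an HTML webpage.
- Two ways to summarize or otherwise combine documents:
  - Stuff, which simply concatenates documents into a prompt;
  - Map-reduce, for larger sets of documents. This splits documents into batches, summarizes those batches, and then summarizes the summaries.
Shorter, more targeted guides on these strategies and others, including iterative refinement, can be found in the how-to guides.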
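Setup
Jupyter Notebook
This guide (and most of the other guides in the documentation) uses Jupyter notebooks and assumes the reader does as well. Jupyter notebooks are well suited to learning how to work with LLM systems because things often go wrong (unexpected output, API down, etc.), and going through guides in an interactive environment is a great way to better understand them.
This and other tutorials are perhaps most conveniently run in a Jupyter notebook. See here for instructions on how to install.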
Installation
To install LangChain, run one of the following:
- Pip: pip install langchain
- Conda: conda install langchain -c conda-forge
For more details, see our Installation guide.
LangSmith
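Many of the applications you build with LangChain will contain multiple steps with multiple invocations of LLM calls. As these applications get more complex, it becomes crucial to be able to inspect what exactly is going on inside your chain or agent. The best way to do this is with LangSmith.
After you sign up at the link above, make sure to set your environment variables to start logging traces: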
export LANGSMITH_TRACING="true"
export LANGSMITH_API_KEY="..."
Or, if in a notebook, you can set them with:
import getpass
import os
os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_API_KEY"] = getpass.getpass()
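Overview
A central question for building a summarizer is how to pass your documents into the LLM's context window. Two common approaches are:
- Stuff: simply "stuff" all your documents into a single prompt. This is the simplest approach (see here for more on the create_stuff_documents_chain constructor, which is used for this method).
- Map-reduce: summarize each document on its own in a "map" step, and then "reduce" the summaries into a final summary (see here for more on MapReduceDocumentsChain, which is used for this method).
Note that map-reduce is especially effective when understanding a sub-document does not rely on preceding context, for example when summarizing a corpus of many shorter documents. In other cases, such as summarizing a novel or a body of text with an inherent sequence, iterative refinement may be more effective.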
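To make the difference between these strategies concrete, here is a minimal, library-free sketch. `CountingSummarizer` is a hypothetical stand-in for an LLM call (it keeps only the first sentence and counts how often it is invoked); the function names `stuff`, `map_reduce`, and `refine` are illustrative and are not LangChain APIs.

```python
from typing import Callable, List


class CountingSummarizer:
    """Hypothetical stand-in for an LLM: keeps the first sentence, counts calls."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, text: str) -> str:
        self.calls += 1
        return text.split(". ")[0].rstrip(".") + "."


def stuff(docs: List[str], summarize: Callable[[str], str]) -> str:
    """Stuff: concatenate every document into one prompt and call the model once."""
    return summarize("\n\n".join(docs))


def map_reduce(docs: List[str], summarize: Callable[[str], str]) -> str:
    """Map-reduce: summarize each document independently, then summarize the summaries."""
    partial = [summarize(doc) for doc in docs]  # map: no document depends on another
    return summarize(" ".join(partial))         # reduce: one final call


def refine(docs: List[str], summarize: Callable[[str], str]) -> str:
    """Iterative refinement: fold documents in order, carrying a running summary."""
    running = ""
    for doc in docs:
        running = summarize((running + " " + doc).strip())
    return running


docs = ["Cats sleep a lot. They nap all day.", "Dogs bark loudly. They guard homes."]

for strategy in (stuff, map_reduce, refine):
    model = CountingSummarizer()
    strategy(docs, model)
    print(f"{strategy.__name__}: {model.calls} model call(s)")
```

With two documents, stuffing needs one call, map-reduce needs three (two map calls plus one reduce call), and refinement needs two sequential calls. Only the map calls can run in parallel, which is why map-reduce suits large sets of independent documents.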
Setup
First set environment variables and install packages:
%pip install --upgrade --quiet tiktoken langchain langgraph beautifulsoup4 langchain-community
# Set env var OPENAI_API_KEY or load from a .env file
# import dotenv
# dotenv.load_dotenv()
import os
os.environ["LANGSMITH_TRACING"] = "true"
First we load our documents. We will use WebBaseLoader to load a blog post:
from langchain_community.document_loaders import WebBaseLoader
loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
docs = loader.load()
Next, let's select an LLM:
pip install -qU "langchain[openai]"
import getpass
import os
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter API key for OpenAI: ")
from langchain.chat_models import init_chat_model
llm = init_chat_model("gpt-4o-mini", model_provider="openai")
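Stuff: summarize in a single LLM call
We can use create_stuff_documents_chain, especially when using models with larger context windows such as:
- 128k token OpenAI gpt-4o
- 200k token Anthropic claude-3-5-sonnet-20240620
The chain will take a list of documents, insert them all into a prompt, and pass that prompt to an LLM: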
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate
# Define prompt
prompt = ChatPromptTemplate.from_messages(
    [("system", "Write a concise summary of the following:\n\n{context}")]
)
# Instantiate chain
chain = create_stuff_documents_chain(llm, prompt)
# Invoke chain
result = chain.invoke({"context": docs})
print(result)
The article "LLM Powered Autonomous Agents" by Lilian Weng discusses the development and capabilities of autonomous agents powered by large language models (LLMs). It outlines a system architecture that includes three main components: Planning, Memory, and Tool Use.
1. **Planning** involves task decomposition, where complex tasks are broken down into manageable subgoals, and self-reflection, allowing agents to learn from past actions to improve future performance. Techniques like Chain of Thought (CoT) and Tree of Thoughts (ToT) are highlighted for enhancing reasoning and planning.
2. **Memory** is categorized into short-term and long-term memory, with mechanisms for fast retrieval using Maximum Inner Product Search (MIPS) algorithms. This allows agents to retain and recall information effectively.
3. **Tool Use** enables agents to interact with external APIs and tools, enhancing their capabilities beyond the limitations of their training data. Examples include MRKL systems and frameworks like HuggingGPT, which facilitate task planning and execution.
The article also addresses challenges such as finite context length, difficulties in long-term planning, and the reliability of natural language interfaces. It concludes with case studies demonstrating the practical applications of these concepts in scientific discovery and interactive simulations. Overall, the article emphasizes the potential of LLMs as powerful problem solvers in autonomous agent systems.
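As a rough illustration of what "stuffing" amounts to, the sketch below joins document texts into a single prompt and checks the result against a token budget before any model call. It uses only the standard library. `rough_token_count` is a deliberately crude assumption (one token per whitespace-separated word), not a real tokenizer, and `build_stuff_prompt` is a hypothetical helper rather than part of LangChain.

```python
from typing import List

PROMPT_TEMPLATE = "Write a concise summary of the following:\n\n{context}"


def rough_token_count(text: str) -> int:
    """Crude estimate: one token per whitespace-separated word (an assumption)."""
    return len(text.split())


def build_stuff_prompt(docs: List[str], max_tokens: int, separator: str = "\n\n") -> str:
    """Join all documents into one prompt; fail early if it exceeds the budget."""
    context = separator.join(docs)
    prompt = PROMPT_TEMPLATE.format(context=context)
    used = rough_token_count(prompt)
    if used > max_tokens:
        raise ValueError(
            f"Prompt needs about {used} tokens but the budget is {max_tokens}; "
            "consider map-reduce instead."
        )
    return prompt


prompt_text = build_stuff_prompt(["First document.", "Second document."], max_tokens=100)
print(prompt_text)
```

When the combined text does not fit the budget, the map-reduce approach described later on this page is the usual fallback.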
Streaming
Note that we can also stream the result token-by-token:
for token in chain.stream({"context": docs}):
    print(token, end="|")
|The| article| "|LL|M| Powered| Autonomous| Agents|"| by| Lil|ian| W|eng| discusses| the| development| and| capabilities| of| autonomous| agents| powered| by| large| language| models| (|LL|Ms|).| It| outlines| a| system| architecture| that| includes| three| main| components|:| Planning|,| Memory|,| and| Tool| Use|.|
|1|.| **|Planning|**| involves| task| decomposition|,| where| complex| tasks| are| broken| down| into| manageable| sub|go|als|,| and| self|-ref|lection|,| allowing| agents| to| learn| from| past| actions| to| improve| future| performance|.| Techniques| like| Chain| of| Thought| (|Co|T|)| and| Tree| of| Thoughts| (|To|T|)| are| highlighted| for| enhancing| reasoning| and| planning|.
|2|.| **|Memory|**| is| categorized| into| short|-term| and| long|-term| memory|,| with| mechanisms| for| fast| retrieval| using| Maximum| Inner| Product| Search| (|M|IPS|)| algorithms|.| This| allows| agents| to| retain| and| recall| information| effectively|.
|3|.| **|Tool| Use|**| emphasizes| the| integration| of| external| APIs| and| tools| to| extend| the| capabilities| of| L|LM|s|,| enabling| them| to| perform| tasks| beyond| their| inherent| limitations|.| Examples| include| MR|KL| systems| and| frameworks| like| Hug|ging|GPT|,| which| facilitate| task| planning| and| execution|.
|The| article| also| addresses| challenges| such| as| finite| context| length|,| difficulties| in| long|-term| planning|,| and| the| reliability| of| natural| language| interfaces|.| It| concludes| with| case| studies| demonstrating| the| practical| applications| of| L|LM|-powered| agents| in| scientific| discovery| and| interactive| simulations|.| Overall|,| the| piece| illustrates| the| potential| of| L|LM|s| as| general| problem| sol|vers| and| their| evolving| role| in| autonomous| systems|.||
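The streamed pieces are fragments of the final text, so concatenating them in order reproduces the full summary. The following sketch shows that pattern with a hypothetical `fake_token_stream` generator standing in for `chain.stream(...)`; no model or network access is involved.

```python
from typing import Iterable, Iterator


def fake_token_stream(text: str) -> Iterator[str]:
    """Hypothetical stand-in for a token stream: yields word-sized pieces."""
    for i, word in enumerate(text.split(" ")):
        yield word if i == 0 else " " + word


def collect(stream: Iterable[str]) -> str:
    """Print each piece as it arrives and return the reassembled text."""
    parts = []
    for token in stream:
        print(token, end="|", flush=True)
        parts.append(token)
    print()
    return "".join(parts)


summary = collect(fake_token_stream("The article discusses autonomous agents."))
```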
Go deeper
- You can easily customize the prompt.
- You can easily try different LLMs (e.g., Claude) via the llm parameter.
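Map-Reduce: summarize long texts via parallelization
Let's unpack the map-reduce approach. For this, we'll first map each document to an individual summary using an LLM. Then we'll reduce, or consolidate, those summaries into a single global summary.
Note that the map step is typically parallelized over the input documents.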
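One way to picture a parallelized map step is with `asyncio.gather`, sketched below under stated assumptions: `toy_summarize` is a hypothetical placeholder for an asynchronous LLM call, and the `max_concurrency` cap is an illustrative choice rather than something this tutorial prescribes.

```python
import asyncio
from typing import List


async def toy_summarize(text: str) -> str:
    """Hypothetical async stand-in for an LLM call: truncates after a short delay."""
    await asyncio.sleep(0.01)  # simulate network latency
    return text[:20]


async def map_step(docs: List[str], max_concurrency: int = 4) -> List[str]:
    """Summarize all documents concurrently, with a cap on in-flight calls."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(doc: str) -> str:
        async with semaphore:
            return await toy_summarize(doc)

    # gather preserves input order, so summaries[i] corresponds to docs[i]
    return await asyncio.gather(*(bounded(doc) for doc in docs))


summaries = asyncio.run(map_step(["first document text", "second document text", "third"]))
print(summaries)
```

Inside a Jupyter notebook an event loop is already running, so there you would `await map_step(...)` directly instead of calling `asyncio.run`.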
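LangGraph, which is built on top of langchain-core, supports map-reduce workflows and is well suited to this problem:
- LangGraph allows individual steps (such as successive summarizations) to be streamed, giving greater control over execution;
- LangGraph's checkpointing supports error recovery, extension with human-in-the-loop workflows, and easier incorporation into conversational applications;
- The LangGraph implementation is straightforward to modify and extend, as we will see below.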
Map
Let's first define the prompt associated with the map step. We can use the same summarization prompt as in the stuff approach above:
from langchain_core.prompts import ChatPromptTemplate
map_prompt = ChatPromptTemplate.from_messages(
    [("system", "Write a concise summary of the following:\n\n{context}")]
)
We can also use the Prompt Hub to store and fetch prompts.
This will work with your LangSmith API key.
For example, see the map prompt here.
from langchain import hub
map_prompt = hub.pull("rlm/map-prompt")
Reduce
We also define a prompt that takes the document mapping results and reduces them into a single output.
# Also available via the hub: `hub.pull("rlm/reduce-prompt")`
reduce_template = """
The following is a set of summaries:
{docs}
Take these and distill it into a final, consolidated summary
of the main themes.
"""
reduce_prompt = ChatPromptTemplate([("human", reduce_template)])
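Orchestration via LangGraph
Below we implement a simple application that maps the summarization step over a list of documents, then reduces them using the above prompts.
Map-reduce flows are particularly useful when texts are long compared to the context window of an LLM. For long texts, we need a mechanism that ensures the context to be summarized in the reduce step does not exceed the model's context window size. Here we implement a recursive "collapsing" of the summaries: the inputs are partitioned based on a token limit, and summaries are generated for the partitions. This step is repeated until the total length of the summaries is within a desired limit, allowing for the summarization of arbitrary-length text.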
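The recursive collapsing idea can be sketched without any library, as below. Here `count_tokens` is a crude word count and `toy_summarize` keeps the first three words; both are assumptions that stand in for a real tokenizer and a real LLM. `split_by_budget` and `collapse` are hypothetical names, not the LangChain helpers used later on this page.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    """Crude token estimate: whitespace-separated words (an assumption)."""
    return len(text.split())


def split_by_budget(texts: List[str], token_max: int) -> List[List[str]]:
    """Greedily group consecutive texts so each group stays within token_max."""
    groups: List[List[str]] = []
    current: List[str] = []
    used = 0
    for text in texts:
        n = count_tokens(text)
        if current and used + n > token_max:
            groups.append(current)
            current, used = [], 0
        current.append(text)
        used += n
    if current:
        groups.append(current)
    return groups


def collapse(
    summaries: List[str],
    summarize: Callable[[str], str],
    token_max: int,
    max_rounds: int = 10,
) -> List[str]:
    """Repeatedly summarize groups until the total length fits within token_max."""
    rounds = 0
    while sum(count_tokens(s) for s in summaries) > token_max:
        if rounds >= max_rounds:
            raise RuntimeError("summaries did not shrink below token_max")
        groups = split_by_budget(summaries, token_max)
        summaries = [summarize(" ".join(group)) for group in groups]
        rounds += 1
    return summaries


def toy_summarize(text: str) -> str:
    """Hypothetical stand-in for an LLM: keep the first three words."""
    return " ".join(text.split()[:3])


collapsed = collapse(
    ["a b c d e", "f g h i j", "k l m n o", "p q r s t"], toy_summarize, token_max=10
)
print(collapsed)
```

The `max_rounds` guard is a design choice for the sketch: if the summarizer fails to shrink its input, the loop stops with an error instead of running forever.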
First we chunk the blog post into smaller "sub documents" to be mapped:
from langchain_text_splitters import CharacterTextSplitter
text_splitter = CharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=1000, chunk_overlap=0
)
split_docs = text_splitter.split_documents(docs)
print(f"Generated {len(split_docs)} documents.")
Created a chunk of size 1003, which is longer than the specified 1000
Generated 14 documents.
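The "longer than the specified 1000" message above is consistent with a splitter that cuts only at a separator and then merges pieces up to the size limit: a single piece that is already too long cannot be broken further. The sketch below illustrates that behavior under assumptions. It is not the CharacterTextSplitter implementation; `count_tokens` is a crude word count and `chunk_by_separator` is a hypothetical helper.

```python
from typing import List


def count_tokens(text: str) -> int:
    """Crude token estimate: whitespace-separated words (an assumption)."""
    return len(text.split())


def chunk_by_separator(text: str, chunk_size: int, separator: str = "\n\n") -> List[str]:
    """Split on a separator, then merge neighboring pieces up to chunk_size tokens."""
    pieces = [p for p in text.split(separator) if p.strip()]
    chunks: List[str] = []
    current: List[str] = []
    used = 0
    for piece in pieces:
        n = count_tokens(piece)
        if n > chunk_size:
            # A single piece cannot be cut without a separator, so it is kept whole.
            print(f"warning: piece of size {n} exceeds chunk_size {chunk_size}")
        if current and used + n > chunk_size:
            chunks.append(separator.join(current))
            current, used = [], 0
        current.append(piece)
        used += n
    if current:
        chunks.append(separator.join(current))
    return chunks


chunks = chunk_by_separator("one two\n\nthree four five six\n\nseven", chunk_size=3)
print(chunks)
```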
Next, we define our graph. Note that we define an artificially low maximum token length of 1,000 tokens to illustrate the "collapsing" step.
import operator
from typing import Annotated, List, Literal, TypedDict
from langchain.chains.combine_documents.reduce import (
    acollapse_docs,
    split_list_of_docs,
)
from langchain_core.documents import Document
from langgraph.constants import Send
from langgraph.graph import END, START, StateGraph
token_max = 1000
def length_function(documents: List[Document]) -> int:
    """Get number of tokens for input contents."""
    return sum(llm.get_num_tokens(doc.page_content) for doc in documents)
# This will be the overall state of the main graph.
# It will contain the input document contents, corresponding
# summaries, and a final summary.
class OverallState(TypedDict):
    # Notice here we use operator.add
    # This is because we want to combine all the summaries we generate
    # from individual nodes back into one list - this is essentially
    # the "reduce" part
    contents: List[str]
    summaries: Annotated[list, operator.add]
    collapsed_summaries: List[Document]
    final_summary: str
# This will be the state of the node that we will "map" all
# documents to in order to generate summaries
class SummaryState(TypedDict):
    content: str
# Here we generate a summary, given a document
async def generate_summary(state: SummaryState):
    prompt = map_prompt.invoke(state["content"])
    response = await llm.ainvoke(prompt)
    return {"summaries": [response.content]}
# Here we define the logic to map out over the documents
# We will use this as an edge in the graph
def map_summaries(state: OverallState):
    # We will return a list of `Send` objects
    # Each `Send` object consists of the name of a node in the graph
    # as well as the state to send to that node
    return [
        Send("generate_summary", {"content": content}) for content in state["contents"]
    ]
def collect_summaries(state: OverallState):
    return {
        "collapsed_summaries": [Document(summary) for summary in state["summaries"]]
    }
async def _reduce(input: dict) -> str:
    prompt = reduce_prompt.invoke(input)
    response = await llm.ainvoke(prompt)
    return response.content
# Add node to collapse summaries
async def collapse_summaries(state: OverallState):
    doc_lists = split_list_of_docs(
        state["collapsed_summaries"], length_function, token_max
    )
    results = []
    for doc_list in doc_lists:
        results.append(await acollapse_docs(doc_list, _reduce))
    return {"collapsed_summaries": results}
# This represents a conditional edge in the graph that determines
# if we should collapse the summaries or not
def should_collapse(
    state: OverallState,
) -> Literal["collapse_summaries", "generate_final_summary"]:
    num_tokens = length_function(state["collapsed_summaries"])
    if num_tokens > token_max:
        return "collapse_summaries"
    else:
        return "generate_final_summary"
# Here we will generate the final summary
async def generate_final_summary(state: OverallState):
    response = await _reduce(state["collapsed_summaries"])
    return {"final_summary": response}
# Construct the graph
# Nodes:
graph = StateGraph(OverallState)
graph.add_node("generate_summary", generate_summary) # same as before
graph.add_node("collect_summaries", collect_summaries)
graph.add_node("collapse_summaries", collapse_summaries)
graph.add_node("generate_final_summary", generate_final_summary)
# Edges:
graph.add_conditional_edges(START, map_summaries, ["generate_summary"])
graph.add_edge("generate_summary", "collect_summaries")
graph.add_conditional_edges("collect_summaries", should_collapse)
graph.add_conditional_edges("collapse_summaries", should_collapse)
graph.add_edge("generate_final_summary", END)
app = graph.compile()
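The `Annotated[list, operator.add]` annotation on `summaries` is what lets the outputs of many parallel `generate_summary` runs accumulate in one list instead of overwriting each other. The standard-library sketch below illustrates the general idea of reading a reducer from the type annotation and applying it to incoming updates. It is an illustration only, not LangGraph's actual implementation, and `apply_update` is a hypothetical helper.

```python
import operator
from typing import Annotated, List, TypedDict, get_type_hints


class State(TypedDict):
    contents: List[str]
    summaries: Annotated[list, operator.add]


def apply_update(state: dict, update: dict) -> dict:
    """Merge an update into the state, using a reducer where one is annotated."""
    hints = get_type_hints(State, include_extras=True)
    new_state = dict(state)
    for key, value in update.items():
        metadata = getattr(hints[key], "__metadata__", ())
        if metadata and key in state:
            # Annotated key: combine old and new values with the reducer.
            new_state[key] = metadata[0](state[key], value)
        else:
            # Plain key: the new value simply replaces the old one.
            new_state[key] = value
    return new_state


state = {"contents": ["doc a", "doc b"], "summaries": []}
for update in ({"summaries": ["summary a"]}, {"summaries": ["summary b"]}):
    state = apply_update(state, update)
print(state["summaries"])
```

Keys without a reducer, such as `contents` here, are overwritten by the latest update, which matches how `collapsed_summaries` is replaced on each collapse round in the graph above.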
LangGraph allows the graph structure to be plotted to help visualize its function:
from IPython.display import Image
Image(app.get_graph().draw_mermaid_png())
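When running the application, we can stream the graph to observe its sequence of steps. Below, we will simply print out the name of each step.
Note that because we have a loop in the graph, it can be helpful to specify a recursion_limit on its execution. This will raise a specific error when the specified limit is exceeded.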
async for step in app.astream(
    {"contents": [doc.page_content for doc in split_docs]},
    {"recursion_limit": 10},
):
    print(list(step.keys()))
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['collect_summaries']
['collapse_summaries']
['collapse_summaries']
['generate_final_summary']
print(step)
{'generate_final_summary': {'final_summary': 'The consolidated summary of the main themes from the provided documents is as follows:\n\n1. **Integration of Large Language Models (LLMs) in Autonomous Agents**: The documents explore the evolving role of LLMs in autonomous systems, emphasizing their enhanced reasoning and acting capabilities through methodologies that incorporate structured planning, memory systems, and tool use.\n\n2. **Core Components of Autonomous Agents**:\n - **Planning**: Techniques like task decomposition (e.g., Chain of Thought) and external classical planners are utilized to facilitate long-term planning by breaking down complex tasks.\n - **Memory**: The memory system is divided into short-term (in-context learning) and long-term memory, with parallels drawn between human memory and machine learning to improve agent performance.\n - **Tool Use**: Agents utilize external APIs and algorithms to enhance problem-solving abilities, exemplified by frameworks like HuggingGPT that manage task workflows.\n\n3. **Neuro-Symbolic Architectures**: The integration of MRKL (Modular Reasoning, Knowledge, and Language) systems combines neural and symbolic expert modules with LLMs, addressing challenges in tasks such as verbal math problem-solving.\n\n4. **Specialized Applications**: Case studies, such as ChemCrow and projects in anticancer drug discovery, demonstrate the advantages of LLMs augmented with expert tools in specialized domains.\n\n5. **Challenges and Limitations**: The documents highlight challenges such as hallucination in model outputs and the finite context length of LLMs, which affects their ability to incorporate historical information and perform self-reflection. Techniques like Chain of Hindsight and Algorithm Distillation are discussed to enhance model performance through iterative learning.\n\n6. 
**Structured Software Development**: A systematic approach to creating Python software projects is emphasized, focusing on defining core components, managing dependencies, and adhering to best practices for documentation.\n\nOverall, the integration of structured planning, memory systems, and advanced tool use aims to enhance the capabilities of LLM-powered autonomous agents while addressing the challenges and limitations these technologies face in real-world applications.'}}
In the corresponding LangSmith trace we can see the individual LLM calls, grouped under their respective nodes.
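Go deeper
Customization
- As shown above, you can customize the LLMs and prompts for the map and reduce stages.
Real-world use case
- See this blog post case study on analyzing user interactions (questions about LangChain documentation)!
- The blog post and associated repo also introduce clustering as a means of summarization.
- This opens up another path beyond the stuff or map-reduce approaches that is worth considering.
Next steps
We encourage you to check out the how-to guides for more detail on these and other concepts.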