
How to migrate off ConversationBufferWindowMemory or ConversationTokenBufferMemory

Follow this guide if you're trying to migrate off one of the old memory classes listed below:

ConversationBufferWindowMemory: Keeps the last n messages of the conversation. Drops the oldest messages once there are more than n.

ConversationTokenBufferMemory: Keeps only the most recent messages in the conversation, under the constraint that the total number of tokens in the conversation doesn't exceed a certain limit.

ConversationBufferWindowMemory and ConversationTokenBufferMemory apply additional processing on top of the raw conversation history to trim it to a size that fits inside a chat model's context window.

This processing functionality can be accomplished using LangChain's built-in trim_messages function.
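Before diving in, here is a minimal sketch of what a trim_messages call looks like; the toy message list is an assumption made for illustration, and the walkthroughs below cover the real options:

from langchain_core.messages import AIMessage, HumanMessage, trim_messages

history = [
    HumanMessage("hi! I'm bob"),
    AIMessage("Hello Bob!"),
    HumanMessage("what's my name?"),
]

# Keep only the last two messages, counting messages rather than tokens.
recent = trim_messages(history, token_counter=len, max_tokens=2, strategy="last")
# -> keeps the AIMessage and the final HumanMessage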

Important

We'll begin by exploring a straightforward method that involves applying the processing logic to the entire conversation history.

While this approach is easy to implement, it has a downside: as the conversation grows, so does the latency, since the logic is re-applied to all previous exchanges in the conversation at every turn.

More advanced strategies focus on updating the conversation history incrementally to avoid redundant processing.

For example, the langgraph how-to guide on summarization demonstrates how to maintain a running summary of the conversation while discarding older messages, ensuring they aren't re-processed on later turns.
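To give a flavor of that incremental approach, below is a hedged sketch of a summarization node in the style of that langgraph guide (not its verbatim code); the State schema, the two-message cutoff, and the prompt wording are illustrative assumptions:

from langchain_core.messages import HumanMessage, RemoveMessage
from langchain_openai import ChatOpenAI
from langgraph.graph import MessagesState

model = ChatOpenAI()


class State(MessagesState):
    summary: str  # running summary of everything discarded so far


def summarize_conversation(state: State):
    """Fold older messages into a running summary instead of re-processing them every turn."""
    summary = state.get("summary", "")
    if summary:
        prompt = f"This is the summary so far: {summary}\n\nExtend it with the new messages above:"
    else:
        prompt = "Summarize the conversation above:"
    response = model.invoke(state["messages"] + [HumanMessage(content=prompt)])
    # Delete all but the two most recent messages so they are never re-processed.
    deletes = [RemoveMessage(id=m.id) for m in state["messages"][:-2]]
    return {"summary": response.content, "messages": deletes}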

Setup

%%capture --no-stderr
%pip install --upgrade --quiet langchain-openai langchain
import os
from getpass import getpass

if "OPENAI_API_KEY" not in os.environ:
os.environ["OPENAI_API_KEY"] = getpass()

Legacy usage with LLMChain / ConversationChain

from langchain.chains import LLMChain
from langchain.memory import ConversationBufferWindowMemory
from langchain_core.messages import SystemMessage
from langchain_core.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder,
)
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate(
    [
        SystemMessage(content="You are a helpful assistant."),
        MessagesPlaceholder(variable_name="chat_history"),
        HumanMessagePromptTemplate.from_template("{text}"),
    ]
)

memory = ConversationBufferWindowMemory(memory_key="chat_history", return_messages=True)

legacy_chain = LLMChain(
    llm=ChatOpenAI(),
    prompt=prompt,
    memory=memory,
)

legacy_result = legacy_chain.invoke({"text": "my name is bob"})
print(legacy_result)

legacy_result = legacy_chain.invoke({"text": "what was my name"})
print(legacy_result)
{'text': 'Nice to meet you, Bob! How can I assist you today?', 'chat_history': []}
{'text': 'Your name is Bob. How can I assist you further, Bob?', 'chat_history': [HumanMessage(content='my name is bob', additional_kwargs={}, response_metadata={}), AIMessage(content='Nice to meet you, Bob! How can I assist you today?', additional_kwargs={}, response_metadata={})]}

Reimplementing ConversationBufferWindowMemory logic

Let's first create the logic for processing the conversation history, and then see how to integrate it into an application. You can later replace this basic setup with more advanced logic tailored to your specific needs.

We'll use trim_messages to implement logic that keeps the last n messages of the conversation, dropping the oldest ones once the count exceeds n.

In addition, we will keep the system message if one is present: when it exists, it is the first message in the conversation and contains the instructions for the chat model.

from langchain_core.messages import (
    AIMessage,
    HumanMessage,
    SystemMessage,
    trim_messages,
)
from langchain_openai import ChatOpenAI

messages = [
    SystemMessage("you're a good assistant, you always respond with a joke."),
    HumanMessage("i wonder why it's called langchain"),
    AIMessage(
        'Well, I guess they thought "WordRope" and "SentenceString" just didn\'t have the same ring to it!'
    ),
    HumanMessage("and who is harrison chasing anyways"),
    AIMessage(
        "Hmmm let me think.\n\nWhy, he's probably chasing after the last cup of coffee in the office!"
    ),
    HumanMessage("why is 42 always the answer?"),
    AIMessage(
        "Because it’s the only number that’s constantly right, even when it doesn’t add up!"
    ),
    HumanMessage("What did the cow say?"),
]
selected_messages = trim_messages(
    messages,
    token_counter=len,  # <-- len will simply count the number of messages rather than tokens
    max_tokens=5,  # <-- allow up to 5 messages.
    strategy="last",
    # Most chat models expect that chat history starts with either:
    # (1) a HumanMessage or
    # (2) a SystemMessage followed by a HumanMessage
    # start_on="human" makes sure we produce a valid chat history
    start_on="human",
    # Usually, we want to keep the SystemMessage
    # if it's present in the original history.
    # The SystemMessage has special instructions for the model.
    include_system=True,
    allow_partial=False,
)

for msg in selected_messages:
    msg.pretty_print()

API Reference: trim_messages
================================ System Message ================================

you're a good assistant, you always respond with a joke.
================================== Ai Message ==================================

Hmmm let me think.

Why, he's probably chasing after the last cup of coffee in the office!
================================ Human Message =================================

why is 42 always the answer?
================================== Ai Message ==================================

Because it’s the only number that’s constantly right, even when it doesn’t add up!
================================ Human Message =================================

What did the cow say?

Reimplementing ConversationTokenBufferMemory logic

Here, we'll use trim_messages to keep the system message and the most recent messages in the conversation, under the constraint that the total number of tokens in the conversation doesn't exceed a certain limit.

selected_messages = trim_messages(
    messages,
    # Please see API reference for trim_messages for other ways to specify a token counter.
    token_counter=ChatOpenAI(model="gpt-4o"),
    max_tokens=80,  # <-- token limit
    # Most chat models expect that chat history starts with either:
    # (1) a HumanMessage or
    # (2) a SystemMessage followed by a HumanMessage
    # start_on="human" makes sure we produce a valid chat history
    start_on="human",
    # Usually, we want to keep the SystemMessage
    # if it's present in the original history.
    # The SystemMessage has special instructions for the model.
    include_system=True,
    strategy="last",
)

for msg in selected_messages:
    msg.pretty_print()

API Reference: trim_messages
================================ System Message ================================

you're a good assistant, you always respond with a joke.
================================ Human Message =================================

why is 42 always the answer?
================================== Ai Message ==================================

Because it’s the only number that’s constantly right, even when it doesn’t add up!
================================ Human Message =================================

What did the cow say?

Modern usage with LangGraph

The example below shows how to use LangGraph to add simple conversation pre-processing logic.

Note

If you want to avoid running the computation over the entire conversation history each time, you can follow the how-to guide on summarization, which demonstrates how to discard older messages, ensuring they aren't re-processed on later turns.

import uuid

from langchain_core.messages import HumanMessage, trim_messages
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import START, MessagesState, StateGraph

# Define a new graph
workflow = StateGraph(state_schema=MessagesState)

# Define a chat model
model = ChatOpenAI()


# Define the function that calls the model
def call_model(state: MessagesState):
    selected_messages = trim_messages(
        state["messages"],
        token_counter=len,  # <-- len will simply count the number of messages rather than tokens
        max_tokens=5,  # <-- allow up to 5 messages.
        strategy="last",
        # Most chat models expect that chat history starts with either:
        # (1) a HumanMessage or
        # (2) a SystemMessage followed by a HumanMessage
        # start_on="human" makes sure we produce a valid chat history
        start_on="human",
        # Usually, we want to keep the SystemMessage
        # if it's present in the original history.
        # The SystemMessage has special instructions for the model.
        include_system=True,
        allow_partial=False,
    )

    response = model.invoke(selected_messages)
    # The response will get appended to the existing message list in the state
    return {"messages": response}


# Define the single node in the graph
workflow.add_edge(START, "model")
workflow.add_node("model", call_model)


# Adding memory is straightforward in langgraph!
memory = MemorySaver()

app = workflow.compile(checkpointer=memory)


# The thread id is a unique key that identifies
# this particular conversation.
# We'll just generate a random uuid here.
thread_id = uuid.uuid4()
config = {"configurable": {"thread_id": thread_id}}

input_message = HumanMessage(content="hi! I'm bob")
for event in app.stream({"messages": [input_message]}, config, stream_mode="values"):
    event["messages"][-1].pretty_print()

# Here, let's confirm that the AI remembers our name!
config = {"configurable": {"thread_id": thread_id}}
input_message = HumanMessage(content="what was my name?")
for event in app.stream({"messages": [input_message]}, config, stream_mode="values"):
    event["messages"][-1].pretty_print()
================================ Human Message =================================

hi! I'm bob
================================== Ai Message ==================================

Hello Bob! How can I assist you today?
================================ Human Message =================================

what was my name?
================================== Ai Message ==================================

Your name is Bob. How can I help you, Bob?

Usage with a pre-built langgraph agent

This example applies to code that uses an AgentExecutor with a pre-built agent constructed using the create_tool_calling_agent function.

If you are using one of the old LangChain pre-built agents, you should be able to replace that code with the new langgraph pre-built agent, which leverages the chat model's native tool-calling capabilities and will likely work better out of the box.

import uuid

from langchain_core.messages import BaseMessage, HumanMessage, trim_messages
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import create_react_agent


@tool
def get_user_age(name: str) -> str:
    """Use this tool to find the user's age."""
    # This is a placeholder for the actual implementation
    if "bob" in name.lower():
        return "42 years old"
    return "41 years old"


memory = MemorySaver()
model = ChatOpenAI()


def prompt(state) -> list[BaseMessage]:
    """Given the agent state, return a list of messages for the chat model."""
    # Apply the same trimming logic we defined earlier.
    return trim_messages(
        state["messages"],
        token_counter=len,  # <-- len will simply count the number of messages rather than tokens
        max_tokens=5,  # <-- allow up to 5 messages.
        strategy="last",
        # Most chat models expect that chat history starts with either:
        # (1) a HumanMessage or
        # (2) a SystemMessage followed by a HumanMessage
        # start_on="human" makes sure we produce a valid chat history
        start_on="human",
        # Usually, we want to keep the SystemMessage
        # if it's present in the original history.
        # The SystemMessage has special instructions for the model.
        include_system=True,
        allow_partial=False,
    )


app = create_react_agent(
    model,
    tools=[get_user_age],
    checkpointer=memory,
    prompt=prompt,
)

# The thread id is a unique key that identifies
# this particular conversation.
# We'll just generate a random uuid here.
thread_id = uuid.uuid4()
config = {"configurable": {"thread_id": thread_id}}

# Tell the AI that our name is Bob, and ask it to use a tool to confirm
# that it's capable of working like an agent.
input_message = HumanMessage(content="hi! I'm bob. What is my age?")

for event in app.stream({"messages": [input_message]}, config, stream_mode="values"):
    event["messages"][-1].pretty_print()

# Confirm that the chat bot has access to previous conversation
# and can respond to the user saying that the user's name is Bob.
input_message = HumanMessage(content="do you remember my name?")

for event in app.stream({"messages": [input_message]}, config, stream_mode="values"):
    event["messages"][-1].pretty_print()
================================ Human Message =================================

hi! I'm bob. What is my age?
================================== Ai Message ==================================
Tool Calls:
  get_user_age (call_jsMvoIFv970DhqqLCJDzPKsp)
 Call ID: call_jsMvoIFv970DhqqLCJDzPKsp
  Args:
    name: bob
================================= Tool Message =================================
Name: get_user_age

42 years old
================================== Ai Message ==================================

Bob, you are 42 years old.
================================ Human Message =================================

do you remember my name?
================================== Ai Message ==================================

Yes, your name is Bob.

LCEL: adding a preprocessing step

The simplest way to add complex conversation management is to introduce a pre-processing step in front of the chat model and pass the full conversation history to it.

This approach is conceptually simple and will work in many situations; for example, if you're using RunnableWithMessageHistory, instead of wrapping the bare chat model, wrap the chat model together with the pre-processor.

The obvious downside of this approach is that latency starts to increase as the conversation history grows, for two reasons:

  1. As the conversation gets longer, more data may need to be fetched from whatever store you're using to hold the conversation history (if it's not stored in memory).
  2. The pre-processing logic will end up doing a lot of redundant computation, repeating computation from earlier steps of the conversation.
Note

If you want to use a chat model's tool-calling functionality, remember to bind the tools to the model before adding the history pre-processing step to it!

from langchain_core.messages import (
    AIMessage,
    HumanMessage,
    SystemMessage,
    trim_messages,
)
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

model = ChatOpenAI()


@tool
def what_did_the_cow_say() -> str:
    """Check to see what the cow said."""
    return "foo"


message_processor = trim_messages(  # Returns a Runnable if no messages are provided
    token_counter=len,  # <-- len will simply count the number of messages rather than tokens
    max_tokens=5,  # <-- allow up to 5 messages.
    strategy="last",
    # The start_on is specified to make sure we do not generate a sequence where
    # a ToolMessage that contains the result of a tool invocation
    # appears before the AIMessage that requested the tool invocation,
    # as this will cause some chat models to raise an error.
    start_on=("human", "ai"),
    include_system=True,  # <-- Keep the system message
    allow_partial=False,
)

# Note that we bind tools to the model first!
model_with_tools = model.bind_tools([what_did_the_cow_say])

model_with_preprocessor = message_processor | model_with_tools

full_history = [
    SystemMessage("you're a good assistant, you always respond with a joke."),
    HumanMessage("i wonder why it's called langchain"),
    AIMessage(
        'Well, I guess they thought "WordRope" and "SentenceString" just didn\'t have the same ring to it!'
    ),
    HumanMessage("and who is harrison chasing anyways"),
    AIMessage(
        "Hmmm let me think.\n\nWhy, he's probably chasing after the last cup of coffee in the office!"
    ),
    HumanMessage("why is 42 always the answer?"),
    AIMessage(
        "Because it’s the only number that’s constantly right, even when it doesn’t add up!"
    ),
    HumanMessage("What did the cow say?"),
]


# We pass the history in explicitly to model_with_preprocessor for illustrative purposes.
# If you're using `RunnableWithMessageHistory`, the history will be automatically
# read from the source that you configure.
model_with_preprocessor.invoke(full_history).pretty_print()
================================== Ai Message ==================================
Tool Calls:
  what_did_the_cow_say (call_urHTB5CShhcKz37QiVzNBlIS)
 Call ID: call_urHTB5CShhcKz37QiVzNBlIS
  Args:

If you need a more efficient implementation and want to keep using RunnableWithMessageHistory for now, you can achieve this by subclassing BaseChatMessageHistory and defining appropriate logic for add_messages (one that doesn't simply append to the history, but instead re-writes it).
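For concreteness, here is a hedged sketch of that subclassing approach with an in-memory store; the class name TrimmedChatMessageHistory and the trimming parameters are illustrative assumptions, not an official implementation:

from collections.abc import Sequence

from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.messages import BaseMessage, trim_messages


class TrimmedChatMessageHistory(BaseChatMessageHistory):
    """In-memory history that re-writes itself on every write (illustrative)."""

    def __init__(self) -> None:
        self._messages: list[BaseMessage] = []

    @property
    def messages(self) -> list[BaseMessage]:
        # Reads are cheap: the stored history is already trimmed.
        return self._messages

    def add_messages(self, messages: Sequence[BaseMessage]) -> None:
        # Instead of just appending, re-write the stored history so the
        # trimming work isn't redone over the full history on later turns.
        self._messages = trim_messages(
            self._messages + list(messages),
            token_counter=len,
            max_tokens=5,
            strategy="last",
            start_on="human",
            include_system=True,
            allow_partial=False,
        )

    def clear(self) -> None:
        self._messages = []

Instances of this class would then be returned from the get_session_history callable that RunnableWithMessageHistory accepts.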

Unless you have a good reason to implement this solution, you should use LangGraph instead.

Next steps

Explore persistence with LangGraph

Add persistence with simple LCEL (favor langgraph for more complex use cases)

Working with message history

