
How to implement an integration package

This guide walks through the process of implementing a LangChain integration package.

Integration packages are just Python packages that can be installed with pip install <your-package-name>, which contain classes that are compatible with LangChain's core interfaces.
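
For example, once such a package is published, it can be installed and used like any other Python package (langchain-parrot-link is the example package name used throughout this guide):

pip install langchain-parrot-link

and then, in Python:

from langchain_parrot_link import ChatParrotLink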

We will cover:

  1. (Optional) How to bootstrap a new integration package
  2. How to implement components, such as chat models and vector stores, that adhere to the LangChain interface.

(Optional) bootstrapping a new integration package

In this section, we will outline 2 options for bootstrapping a new integration package, and you're welcome to use other tools if you prefer!

  1. langchain-cli: This is a command-line tool that can be used to bootstrap a new integration package with a template for LangChain components, using Poetry for dependency management.
  2. Poetry: This is a Python dependency management tool that can be used to bootstrap a new Python package with dependencies. You can then add LangChain components to this package.

Option 1: langchain-cli (recommended)

In this guide, we will use langchain-cli to create a new integration package from a template, which you can edit to implement your LangChain components.


Bootstrapping a new Python package with langchain-cli

First, install langchain-cli and poetry:

pip install langchain-cli poetry

Next, come up with a name for your package. For this guide, we'll use langchain-parrot-link. You can confirm that the name is available on PyPI by searching for it on the PyPI website.
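
As a quick command-line sketch, recent versions of pip also ship an experimental index subcommand that lists the published versions of a package; if it reports no matching versions, the name is likely free (exact behavior varies by pip version):

pip index versions langchain-parrot-link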

Next, create your new Python package with langchain-cli, and navigate into the new directory with cd:

langchain-cli integration new

> The name of the integration to create (e.g. `my-integration`): parrot-link
> Name of integration in PascalCase [ParrotLink]:

cd parrot-link

Next, let's add any dependencies we need:

poetry add my-integration-sdk

We can also add some typing or test dependencies in separate poetry dependency groups:

poetry add --group typing my-typing-dep
poetry add --group test my-test-dep
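
Poetry records these groups in pyproject.toml. After the two commands above, the file should contain sections roughly like the following (my-typing-dep and my-test-dep are the placeholder packages from above; the resolved version constraints will differ):

[tool.poetry.group.typing.dependencies]
my-typing-dep = "^1.0.0"

[tool.poetry.group.test.dependencies]
my-test-dep = "^1.0.0"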

Finally, have poetry set up a virtual environment with your dependencies, as well as your integration package:

poetry install --with lint,typing,test,test_integration

You now have a new Python package with a template for LangChain components! This template comes with files for each integration type, and you're welcome to duplicate or delete any of these files as needed (including the associated test files).
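
As a rough sketch (the exact file set depends on your version of langchain-cli), the scaffold looks something like:

langchain-parrot-link/
├── langchain_parrot_link/
│   ├── __init__.py
│   ├── chat_models.py
│   ├── embeddings.py
│   ├── retrievers.py
│   ├── tools.py
│   └── vectorstores.py
├── tests/
├── pyproject.toml
└── README.md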

To create any individual files from the template, you can run, e.g.:

langchain-cli integration new \
--name parrot-link \
--name-class ParrotLink \
--src integration_template/chat_models.py \
--dst langchain_parrot_link/chat_models_2.py

Option 2: Poetry (manual)

In this guide, we will use Poetry for dependency management and packaging, and you're welcome to use any other tools you prefer.


Bootstrapping a new Python package with Poetry

First, install Poetry:

pip install poetry

Next, come up with a name for your package. For this guide, we'll use langchain-parrot-link. You can confirm that the name is available on PyPI by searching for it on the PyPI website.

Next, create your new Python package with Poetry, and navigate into the new directory with cd:

poetry new langchain-parrot-link
cd langchain-parrot-link

Add main dependencies using Poetry, which will add them to your pyproject.toml file:

poetry add langchain-core

We will also add some test dependencies in a separate poetry dependency group. If you are not using Poetry, we recommend adding these dependencies in a way that won't package them with your published package, or simply installing them separately when you run the tests.

langchain-tests will provide the standard tests we will use later. We recommend pinning these to the latest version:

Note: Replace <latest_version> with the latest version of langchain-tests below.

poetry add --group test pytest pytest-socket pytest-asyncio langchain-tests==<latest_version>

Finally, have poetry set up a virtual environment with your dependencies, as well as your integration package:

poetry install --with test

You're now ready to start writing your integration package!

Writing your integration

Let's say you're building a simple integration package that provides a ChatParrotLink chat model integration for LangChain. Here's a simple example of what your project structure might look like:

langchain-parrot-link/
├── langchain_parrot_link/
│   ├── __init__.py
│   └── chat_models.py
├── tests/
│   ├── __init__.py
│   └── test_chat_models.py
├── pyproject.toml
└── README.md

All of these files should already exist from step 1, except for chat_models.py and test_chat_models.py! We will implement test_chat_models.py later, following the standard tests guide.

For chat_models.py, simply paste in the contents of the chat model implementation below.
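
Also make sure langchain_parrot_link/__init__.py re-exports the class, so that the import path used in the docstring below (from langchain_parrot_link import ChatParrotLink) works. A minimal sketch:

# langchain_parrot_link/__init__.py
from langchain_parrot_link.chat_models import ChatParrotLink

__all__ = ["ChatParrotLink"]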

Push your package to a public GitHub repository

This is only required if you want to publish your integration in the LangChain documentation.

  1. Create a new repository on GitHub.
  2. Push your code to the repository (see the example commands after this list).
  3. Confirm that your repository is viewable by the public (e.g. in a private browsing window, where you're not logged into GitHub).
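
For example, assuming you've already created an empty repository on GitHub (replace <your-username> with your GitHub username):

git init
git add .
git commit -m "Initial version of langchain-parrot-link"
git branch -M main
git remote add origin https://github.com/<your-username>/langchain-parrot-link.git
git push -u origin main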

Implementing LangChain components

LangChain components are subclasses of base classes in langchain-core. Examples include chat models, vector stores, tools, embedding models, and retrievers.

Your integration package will typically implement a subclass of at least one of these components. Expand the tabs below to see details on each.

Refer to the custom chat model guide for detail on a starter chat model implementation.

You can start from the following template or langchain-cli command:

langchain-cli integration new \
--name parrot-link \
--name-class ParrotLink \
--src integration_template/chat_models.py \
--dst langchain_parrot_link/chat_models.py

Example chat model code
langchain_parrot_link/chat_models.py
"""ParrotLink chat models."""

from typing import Any, Dict, Iterator, List, Optional

from langchain_core.callbacks import (
CallbackManagerForLLMRun,
)
from langchain_core.language_models import BaseChatModel
from langchain_core.messages import (
AIMessage,
AIMessageChunk,
BaseMessage,
)
from langchain_core.messages.ai import UsageMetadata
from langchain_core.outputs import ChatGeneration, ChatGenerationChunk, ChatResult
from pydantic import Field


class ChatParrotLink(BaseChatModel):
# TODO: Replace all TODOs in docstring. See example docstring:
# https://github.com/langchain-ai/langchain/blob/7ff05357bac6eaedf5058a2af88f23a1817d40fe/libs/partners/openai/langchain_openai/chat_models/base.py#L1120
"""ParrotLink chat model integration.

The default implementation echoes the first `parrot_buffer_length` characters of the input.

# TODO: Replace with relevant packages, env vars.
Setup:
Install ``langchain-parrot-link`` and set environment variable ``PARROT_LINK_API_KEY``.

.. code-block:: bash

pip install -U langchain-parrot-link
export PARROT_LINK_API_KEY="your-api-key"

# TODO: Populate with relevant params.
Key init args — completion params:
model: str
Name of ParrotLink model to use.
temperature: float
Sampling temperature.
max_tokens: Optional[int]
Max number of tokens to generate.

# TODO: Populate with relevant params.
Key init args — client params:
timeout: Optional[float]
Timeout for requests.
max_retries: int
Max number of retries.
api_key: Optional[str]
ParrotLink API key. If not passed in will be read from env var PARROT_LINK_API_KEY.

See full list of supported init args and their descriptions in the params section.

# TODO: Replace with relevant init params.
Instantiate:
.. code-block:: python

from langchain_parrot_link import ChatParrotLink

llm = ChatParrotLink(
model="...",
temperature=0,
max_tokens=None,
timeout=None,
max_retries=2,
# api_key="...",
# other params...
)

Invoke:
.. code-block:: python

messages = [
("system", "You are a helpful translator. Translate the user sentence to French."),
("human", "I love programming."),
]
llm.invoke(messages)

.. code-block:: python

# TODO: Example output.

# TODO: Delete if token-level streaming isn't supported.
Stream:
.. code-block:: python

for chunk in llm.stream(messages):
print(chunk.text(), end="")

.. code-block:: python

# TODO: Example output.

.. code-block:: python

stream = llm.stream(messages)
full = next(stream)
for chunk in stream:
full += chunk
full

.. code-block:: python

# TODO: Example output.

# TODO: Delete if native async isn't supported.
Async:
.. code-block:: python

await llm.ainvoke(messages)

# stream:
# async for chunk in (await llm.astream(messages))

# batch:
# await llm.abatch([messages])

.. code-block:: python

# TODO: Example output.

# TODO: Delete if .bind_tools() isn't supported.
Tool calling:
.. code-block:: python

from pydantic import BaseModel, Field

class GetWeather(BaseModel):
'''Get the current weather in a given location'''

location: str = Field(..., description="The city and state, e.g. San Francisco, CA")

class GetPopulation(BaseModel):
'''Get the current population in a given location'''

location: str = Field(..., description="The city and state, e.g. San Francisco, CA")

llm_with_tools = llm.bind_tools([GetWeather, GetPopulation])
ai_msg = llm_with_tools.invoke("Which city is hotter today and which is bigger: LA or NY?")
ai_msg.tool_calls

.. code-block:: python

# TODO: Example output.

See ``ChatParrotLink.bind_tools()`` method for more.

# TODO: Delete if .with_structured_output() isn't supported.
Structured output:
.. code-block:: python

from typing import Optional

from pydantic import BaseModel, Field

class Joke(BaseModel):
'''Joke to tell user.'''

setup: str = Field(description="The setup of the joke")
punchline: str = Field(description="The punchline to the joke")
rating: Optional[int] = Field(description="How funny the joke is, from 1 to 10")

structured_llm = llm.with_structured_output(Joke)
structured_llm.invoke("Tell me a joke about cats")

.. code-block:: python

# TODO: Example output.

See ``ChatParrotLink.with_structured_output()`` for more.

# TODO: Delete if JSON mode response format isn't supported.
JSON mode:
.. code-block:: python

# TODO: Replace with appropriate bind arg.
json_llm = llm.bind(response_format={"type": "json_object"})
ai_msg = json_llm.invoke("Return a JSON object with key 'random_ints' and a value of 10 random ints in [0-99]")
ai_msg.content

.. code-block:: python

# TODO: Example output.

# TODO: Delete if image inputs aren't supported.
Image input:
.. code-block:: python

import base64
import httpx
from langchain_core.messages import HumanMessage

image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
image_data = base64.b64encode(httpx.get(image_url).content).decode("utf-8")
# TODO: Replace with appropriate message content format.
message = HumanMessage(
content=[
{"type": "text", "text": "describe the weather in this image"},
{
"type": "image_url",
"image_url": {"url": f"data:image/jpeg;base64,{image_data}"},
},
],
)
ai_msg = llm.invoke([message])
ai_msg.content

.. code-block:: python

# TODO: Example output.

# TODO: Delete if audio inputs aren't supported.
Audio input:
.. code-block:: python

# TODO: Example input

.. code-block:: python

# TODO: Example output

# TODO: Delete if video inputs aren't supported.
Video input:
.. code-block:: python

# TODO: Example input

.. code-block:: python

# TODO: Example output

# TODO: Delete if token usage metadata isn't supported.
Token usage:
.. code-block:: python

ai_msg = llm.invoke(messages)
ai_msg.usage_metadata

.. code-block:: python

{'input_tokens': 28, 'output_tokens': 5, 'total_tokens': 33}

# TODO: Delete if logprobs aren't supported.
Logprobs:
.. code-block:: python

# TODO: Replace with appropriate bind arg.
logprobs_llm = llm.bind(logprobs=True)
ai_msg = logprobs_llm.invoke(messages)
ai_msg.response_metadata["logprobs"]

.. code-block:: python

# TODO: Example output.

Response metadata
.. code-block:: python

ai_msg = llm.invoke(messages)
ai_msg.response_metadata

.. code-block:: python

# TODO: Example output.

""" # noqa: E501

model_name: str = Field(alias="model")
"""The name of the model"""
parrot_buffer_length: int
"""The number of characters from the last message of the prompt to be echoed."""
temperature: Optional[float] = None
max_tokens: Optional[int] = None
timeout: Optional[int] = None
stop: Optional[List[str]] = None
max_retries: int = 2

@property
def _llm_type(self) -> str:
"""Return type of chat model."""
return "chat-__package_name_short__"

@property
def _identifying_params(self) -> Dict[str, Any]:
"""Return a dictionary of identifying parameters.

This information is used by the LangChain callback system, which
is used for tracing purposes make it possible to monitor LLMs.
"""
return {
# The model name allows users to specify custom token counting
# rules in LLM monitoring applications (e.g., in LangSmith users
# can provide per token pricing for their model and monitor
# costs for the given LLM.)
"model_name": self.model_name,
}

def _generate(
self,
messages: List[BaseMessage],
stop: Optional[List[str]] = None,
run_manager: Optional[CallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> ChatResult:
"""Override the _generate method to implement the chat model logic.

This can be a call to an API, a call to a local model, or any other
implementation that generates a response to the input prompt.

Args:
messages: the prompt composed of a list of messages.
stop: a list of strings on which the model should stop generating.
If generation stops due to a stop token, the stop token itself
SHOULD BE INCLUDED as part of the output. This is not enforced
across models right now, but it's a good practice to follow since
it makes it much easier to parse the output of the model
downstream and understand why generation stopped.
run_manager: A run manager with callbacks for the LLM.
"""
# Replace this with actual logic to generate a response from a list
# of messages.
last_message = messages[-1]
tokens = last_message.content[: self.parrot_buffer_length]
ct_input_tokens = sum(len(message.content) for message in messages)
ct_output_tokens = len(tokens)
message = AIMessage(
content=tokens,
additional_kwargs={}, # Used to add additional payload to the message
response_metadata={ # Use for response metadata
"time_in_seconds": 3,
},
usage_metadata={
"input_tokens": ct_input_tokens,
"output_tokens": ct_output_tokens,
"total_tokens": ct_input_tokens + ct_output_tokens,
},
)
##

generation = ChatGeneration(message=message)
return ChatResult(generations=[generation])

def _stream(
self,
messages: List[BaseMessage],
stop: Optional[List[str]] = None,
run_manager: Optional[CallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> Iterator[ChatGenerationChunk]:
"""Stream the output of the model.

This method should be implemented if the model can generate output
in a streaming fashion. If the model does not support streaming,
do not implement it. In that case streaming requests will be automatically
handled by the _generate method.

Args:
messages: the prompt composed of a list of messages.
stop: a list of strings on which the model should stop generating.
If generation stops due to a stop token, the stop token itself
SHOULD BE INCLUDED as part of the output. This is not enforced
across models right now, but it's a good practice to follow since
it makes it much easier to parse the output of the model
downstream and understand why generation stopped.
run_manager: A run manager with callbacks for the LLM.
"""
last_message = messages[-1]
tokens = str(last_message.content[: self.parrot_buffer_length])
ct_input_tokens = sum(len(message.content) for message in messages)

for token in tokens:
usage_metadata = UsageMetadata(
{
"input_tokens": ct_input_tokens,
"output_tokens": 1,
"total_tokens": ct_input_tokens + 1,
}
)
ct_input_tokens = 0
chunk = ChatGenerationChunk(
message=AIMessageChunk(content=token, usage_metadata=usage_metadata)
)

if run_manager:
# This is optional in newer versions of LangChain
# The on_llm_new_token will be called automatically
run_manager.on_llm_new_token(token, chunk=chunk)

yield chunk

# Let's add some other information (e.g., response metadata)
chunk = ChatGenerationChunk(
message=AIMessageChunk(content="", response_metadata={"time_in_sec": 3})
)
if run_manager:
# This is optional in newer versions of LangChain
# The on_llm_new_token will be called automatically
run_manager.on_llm_new_token(token, chunk=chunk)
yield chunk

# TODO: Implement if ChatParrotLink supports async streaming. Otherwise delete.
# async def _astream(
# self,
# messages: List[BaseMessage],
# stop: Optional[List[str]] = None,
# run_manager: Optional[AsyncCallbackManagerForLLMRun] = None,
# **kwargs: Any,
# ) -> AsyncIterator[ChatGenerationChunk]:

# TODO: Implement if ChatParrotLink supports async generation. Otherwise delete.
# async def _agenerate(
# self,
# messages: List[BaseMessage],
# stop: Optional[List[str]] = None,
# run_manager: Optional[AsyncCallbackManagerForLLMRun] = None,
# **kwargs: Any,
# ) -> ChatResult:
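
Because the template's default implementation just echoes the input, you can smoke-test the class locally before wiring in a real SDK. A quick sketch, run inside the poetry virtual environment (the model name here is arbitrary):

from langchain_parrot_link.chat_models import ChatParrotLink

llm = ChatParrotLink(model="parrot-001", parrot_buffer_length=10)

# Echoes the first 10 characters of the last message: "Hello, Par"
print(llm.invoke([("human", "Hello, Parrot!")]).content)

# Streams one chunk per character, plus a final metadata-only chunk
for chunk in llm.stream([("human", "Hi!")]):
    print(chunk.content, end="|")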

Next steps

Now that you've implemented your package, you can move on to testing your integration, where you'll add the standard tests for your integration and run them successfully.

