ChatNVIDIA

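This will help you get started with NVIDIA chat models. For detailed documentation of all ChatNVIDIA features and configurations, head to the API reference.

Overview

The langchain-nvidia-ai-endpoints package contains LangChain integrations for building applications with models on NVIDIA NIM inference microservices. NIM supports models across domains such as chat, embedding, and re-ranking, from the community as well as from NVIDIA. These models are optimized by NVIDIA to deliver the best performance on NVIDIA accelerated infrastructure and are deployed as NIMs: easy-to-use, prebuilt containers that deploy anywhere on NVIDIA accelerated infrastructure with a single command.

NVIDIA-hosted deployments of NIMs are available to test on the NVIDIA API catalog. After testing, NIMs can be exported from NVIDIA's API catalog using the NVIDIA AI Enterprise license and run on-premises or in the cloud, giving enterprises ownership and full control of their IP and AI applications.

NIMs are packaged as container images on a per-model basis and are distributed as NGC container images through the NVIDIA NGC Catalog. At their core, NIMs provide easy, consistent, and familiar APIs for running inference on an AI model.

This example shows how to use LangChain to interact with NVIDIA-supported models through the ChatNVIDIA class.

For more information on accessing the chat models through this API, check out the ChatNVIDIA documentation.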

Integration details

Class	Package	Local	Serializable	JS support	Package downloads	Package latest
ChatNVIDIA	langchain_nvidia_ai_endpoints	✅	beta	❌

Model features

Tool calling	Structured output	JSON mode	Image input	Audio input	Video input	Token-level streaming	Native async	Token usage	Logprobs
✅	✅	✅	✅	❌	❌	✅	✅	✅	❌

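Setup

To get started:

Create a free account with NVIDIA, which hosts NVIDIA AI Foundation models.
Click on your model of choice.
Under Input, select the Python tab and click Get API Key. Then click Generate Key.
Copy the generated key and save it as NVIDIA_API_KEY. From there, you should have access to the endpoints.

Credentials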

import getpass
import os

if not os.getenv("NVIDIA_API_KEY"):
    # Note: the API key should start with "nvapi-"
    os.environ["NVIDIA_API_KEY"] = getpass.getpass("Enter your NVIDIA API key: ")
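
The comment in the snippet above notes that a key should start with "nvapi-". As a minimal sketch, the prefix can be checked before any request is made. The helper name `has_nvidia_key_format` is hypothetical and not part of the package, and it only inspects the shape of the key, not whether the key is valid.

```python
import os

NVIDIA_KEY_PREFIX = "nvapi-"  # prefix mentioned in the credentials snippet


def has_nvidia_key_format(key: str) -> bool:
    """Return True if `key` starts with "nvapi-" and has more text after it.

    Hypothetical helper: it checks the shape of the key only, not its validity.
    """
    return key.startswith(NVIDIA_KEY_PREFIX) and len(key) > len(NVIDIA_KEY_PREFIX)


key = os.getenv("NVIDIA_API_KEY", "")
if key and not has_nvidia_key_format(key):
    # Warn early instead of waiting for an authentication error from the endpoint.
    print('Warning: NVIDIA_API_KEY does not start with "nvapi-".')
```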

If you want automated tracing of your model calls, you can also set your LangSmith API key by uncommenting the lines below:

# os.environ["LANGSMITH_TRACING"] = "true"
# os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")

Installation

The LangChain NVIDIA AI Endpoints integration lives in the langchain_nvidia_ai_endpoints package:

%pip install --upgrade --quiet langchain-nvidia-ai-endpoints

Instantiation

Now we can access models in the NVIDIA API Catalog:

## Core LC Chat Interface
from langchain_nvidia_ai_endpoints import ChatNVIDIA

llm = ChatNVIDIA(model="mistralai/mixtral-8x7b-instruct-v0.1")

API Reference: ChatNVIDIA

Invocation

result = llm.invoke("Write a ballad about LangChain.")
print(result.content)

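Working with NVIDIA NIMs

When ready to deploy, you can self-host models with NVIDIA NIM (included with the NVIDIA AI Enterprise software license) and run them anywhere, giving you ownership of your customizations and full control of your intellectual property (IP) and AI applications.

Learn more about NIMs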

from langchain_nvidia_ai_endpoints import ChatNVIDIA

# connect to a chat NIM running at localhost:8000, specifying a specific model
llm = ChatNVIDIA(base_url="http://localhost:8000/v1", model="meta/llama3-8b-instruct")

API Reference: ChatNVIDIA
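
When the NIM does not run on the default host and port, the base URL can be assembled from configuration. The following is a minimal sketch: `nim_base_url`, `connect_to_nim`, and the `NIM_HOST` / `NIM_PORT` environment variables are hypothetical names used only for illustration, and the scheme should be adjusted to match your deployment.

```python
import os


def nim_base_url(host: str = "localhost", port: int = 8000, scheme: str = "http") -> str:
    """Build the `/v1` base URL for a self-hosted NIM."""
    return f"{scheme}://{host}:{port}/v1"


def connect_to_nim(model: str = "meta/llama3-8b-instruct"):
    """Create a ChatNVIDIA client for a self-hosted NIM.

    NIM_HOST and NIM_PORT are hypothetical environment variables for this sketch.
    """
    # Imported lazily so nim_base_url() can be used without the package installed.
    from langchain_nvidia_ai_endpoints import ChatNVIDIA

    base_url = nim_base_url(
        host=os.getenv("NIM_HOST", "localhost"),
        port=int(os.getenv("NIM_PORT", "8000")),
    )
    return ChatNVIDIA(base_url=base_url, model=model)
```

Calling `connect_to_nim()` returns a client that can be used like the `llm` object above, provided a NIM is reachable at the resulting address.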

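Stream, Batch, and Async

These models natively support streaming and, as is the case with all LangChain LLMs, they expose a batch method to handle concurrent requests, as well as async methods for invoke, stream, and batch. Below are a few examples.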

print(llm.batch(["What's 2*3?", "What's 2*6?"]))
# Or via the async API
# await llm.abatch(["What's 2*3?", "What's 2*6?"])

for chunk in llm.stream("How far can a seagull fly in one day?"):
    # Show the token separations
    print(chunk.content, end="|")

async for chunk in llm.astream(
    "How long does it take for monarch butterflies to migrate?"
):
    print(chunk.content, end="|")
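
The top-level `await` and `async for` statements above rely on an environment that already provides a running event loop, such as a notebook. In a plain Python script, the same calls need to be wrapped in a coroutine and started with `asyncio.run`. A minimal sketch, in which `collect_stream` and `main` are hypothetical helper names:

```python
import asyncio


async def collect_stream(llm, prompt: str) -> str:
    """Consume llm.astream(prompt) and join the content of every chunk."""
    parts = []
    async for chunk in llm.astream(prompt):
        parts.append(chunk.content)
    return "".join(parts)


def main() -> None:
    """Stream one answer from the hosted endpoint (needs NVIDIA_API_KEY)."""
    # Imported lazily so collect_stream() can be reused without the package installed.
    from langchain_nvidia_ai_endpoints import ChatNVIDIA

    llm = ChatNVIDIA(model="meta/llama3-8b-instruct")
    question = "How long does it take for monarch butterflies to migrate?"
    print(asyncio.run(collect_stream(llm, question)))


# main()  # uncomment to run against the live endpoint
```

`collect_stream` works with any object that exposes an `astream` async generator yielding chunks with a `content` attribute, so it can be exercised with a stub before calling the real model.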

Supported models

Querying available_models returns all of the models that your API credentials give you access to.

The playground_ prefix is optional.

ChatNVIDIA.get_available_models()
# llm.get_available_models()
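
The returned list can be long, so it often helps to filter it. The sketch below assumes only that each returned model object exposes an `id` attribute, as used in the tool calling section of this page; `model_ids_containing` is a hypothetical helper name.

```python
from typing import Iterable, List


def model_ids_containing(models: Iterable, text: str) -> List[str]:
    """Return the sorted ids of models whose id contains `text` (case-insensitive)."""
    needle = text.lower()
    return sorted(m.id for m in models if needle in m.id.lower())


# Usage (requires the package and NVIDIA_API_KEY):
# from langchain_nvidia_ai_endpoints import ChatNVIDIA
# print(model_ids_containing(ChatNVIDIA.get_available_models(), "llama"))
```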

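Model types

All of the models above are supported and can be accessed via ChatNVIDIA.

Some model types support unique prompting techniques and chat messages. We review a few important ones below.

To find out more about a specific model, navigate to the API section of the corresponding AI Foundation model page.

General Chat

Models such as meta/llama3-8b-instruct and mistralai/mixtral-8x22b-instruct-v0.1 are good all-around models that you can use with any LangChain chat messages. An example follows.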

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_nvidia_ai_endpoints import ChatNVIDIA

prompt = ChatPromptTemplate.from_messages(
    [("system", "You are a helpful AI assistant named Fred."), ("user", "{input}")]
)
chain = prompt | ChatNVIDIA(model="meta/llama3-8b-instruct") | StrOutputParser()

for txt in chain.stream({"input": "What's your name?"}):
    print(txt, end="")

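API Reference: StrOutputParser | ChatPromptTemplate | ChatNVIDIA

Code Generation

These models accept the same arguments and input structure as regular chat models, but they tend to perform better on code-generation and structured-code tasks. An example of this is meta/codellama-70b.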

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are an expert coding AI. Respond only in valid python; no narration whatsoever.",
        ),
        ("user", "{input}"),
    ]
)
chain = prompt | ChatNVIDIA(model="meta/codellama-70b") | StrOutputParser()

for txt in chain.stream({"input": "How do I solve this fizz buzz problem?"}):
    print(txt, end="")

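Multimodal

NVIDIA also supports multimodal inputs, meaning you can provide both images and text for the model to reason over. An example model supporting multimodal inputs is nvidia/neva-22b.

Below is an example use: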

import IPython
import requests

image_url = "https://www.nvidia.com/content/dam/en-zz/Solutions/research/ai-playground/nvidia-picasso-3c33-p@2x.jpg"  ## Large Image
image_content = requests.get(image_url).content

IPython.display.Image(image_content)

from langchain_nvidia_ai_endpoints import ChatNVIDIA

llm = ChatNVIDIA(model="nvidia/neva-22b")

API Reference: ChatNVIDIA

Passing an image as a URL

from langchain_core.messages import HumanMessage

llm.invoke(
    [
        HumanMessage(
            content=[
                {"type": "text", "text": "Describe this image:"},
                {"type": "image_url", "image_url": {"url": image_url}},
            ]
        )
    ]
)

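API Reference: HumanMessage

Passing an image as a base64 encoded string

At the moment, some extra processing happens client-side to support larger images like the one above. For smaller images (and to better illustrate what happens under the hood), we can pass in the image directly, as shown below: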

import IPython
import requests

image_url = "https://picsum.photos/seed/kitten/300/200"
image_content = requests.get(image_url).content

IPython.display.Image(image_content)

import base64

from langchain_core.messages import HumanMessage

## Works for simpler images. For larger images, see actual implementation
b64_string = base64.b64encode(image_content).decode("utf-8")

llm.invoke(
    [
        HumanMessage(
            content=[
                {"type": "text", "text": "Describe this image:"},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{b64_string}"},
                },
            ]
        )
    ]
)

API Reference: HumanMessage
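
The snippet above hard-codes `image/png` in the data URL. When the downloaded bytes may be in another format, the MIME type can be derived from the file signature instead. The sketch below handles only PNG and JPEG, and all three helper names are hypothetical rather than part of the package.

```python
import base64


def sniff_image_mime(data: bytes) -> str:
    """Guess the MIME type from the file signature (PNG and JPEG only)."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    raise ValueError("Unrecognized image format; build the data URL manually.")


def image_to_data_url(data: bytes) -> str:
    """Encode raw image bytes as a base64 data URL with a matching MIME type."""
    encoded = base64.b64encode(data).decode("utf-8")
    return f"data:{sniff_image_mime(data)};base64,{encoded}"


def image_message_content(text: str, data: bytes) -> list:
    """Build the content list used for HumanMessage in the example above."""
    return [
        {"type": "text", "text": text},
        {"type": "image_url", "image_url": {"url": image_to_data_url(data)}},
    ]
```

With these helpers, the call above could be written as `llm.invoke([HumanMessage(content=image_message_content("Describe this image:", image_content))])`.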

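Directly within the string

The NVIDIA API uniquely accepts images as base64 images inlined within <img/> HTML tags. While this isn't interoperable with other LLMs, you can prompt the model directly in this way.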

base64_with_mime_type = f"data:image/png;base64,{b64_string}"
llm.invoke(f'What\'s in this image?\n<img src="{base64_with_mime_type}" />')
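
The same prompt string can be produced by a small helper, which keeps the quoting in one place. This is a sketch only: `inline_image_prompt` is a hypothetical name, and the default MIME type simply mirrors the `image/png` used above.

```python
import base64


def inline_image_prompt(question: str, image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Return a prompt with the image inlined as a base64 <img /> tag."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return f'{question}\n<img src="data:{mime_type};base64,{encoded}" />'


# Usage with the objects defined earlier on this page:
# llm.invoke(inline_image_prompt("What's in this image?", image_content))
```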

Example usage within a RunnableWithMessageHistory

Like any other integration, ChatNVIDIA works well with chat utilities such as RunnableWithMessageHistory, which is analogous to using ConversationChain. Below, we show the LangChain RunnableWithMessageHistory example applied to the mistralai/mixtral-8x22b-instruct-v0.1 model.

%pip install --upgrade --quiet langchain

from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

# store is a dictionary that maps session IDs to their corresponding chat histories.
store = {}  # memory is maintained outside the chain


# A function that returns the chat history for a given session ID.
def get_session_history(session_id: str) -> InMemoryChatMessageHistory:
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]


chat = ChatNVIDIA(
    model="mistralai/mixtral-8x22b-instruct-v0.1",
    temperature=0.1,
    max_tokens=100,
    top_p=1.0,
)

#  Define a RunnableConfig object, with a `configurable` key. session_id determines thread
config = {"configurable": {"session_id": "1"}}

conversation = RunnableWithMessageHistory(
    chat,
    get_session_history,
)

conversation.invoke(
    "Hi I'm Srijan Dubey.",  # input or query
    config=config,
)

API Reference: InMemoryChatMessageHistory | RunnableWithMessageHistory

conversation.invoke(
    "I'm doing well! Just having a conversation with an AI.",
    config=config,
)

conversation.invoke(
    "Tell me about yourself.",
    config=config,
)
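
Because the history lives in `store` rather than inside the chain, it can be inspected after the calls above. A minimal sketch, relying on the `type` and `content` attributes of LangChain messages; `format_history` is a hypothetical helper name.

```python
def format_history(messages) -> str:
    """Render messages as "type: content" lines, one per message."""
    return "\n".join(f"{m.type}: {m.content}" for m in messages)


# Usage with the `store` defined above, after the conversation.invoke calls:
# print(format_history(store["1"].messages))
```

A different `session_id` in `config` would create a separate entry in `store`, so each session keeps its own transcript.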

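Tool calling

Starting in v0.2, ChatNVIDIA supports bind_tools.

ChatNVIDIA provides integration with a variety of models on build.nvidia.com as well as local NIMs. Not all of these models are trained for tool calling. Be sure to select a model that supports tool calling for your experimentation and applications.

You can get a list of models that are known to support tool calling with: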

tool_models = [
    model for model in ChatNVIDIA.get_available_models() if model.supports_tools
]
tool_models

With a tool-capable model:

from langchain_core.tools import tool
from pydantic import Field


@tool
def get_current_weather(
    location: str = Field(..., description="The location to get the weather for."),
):
    """Get the current weather for a location."""
    ...


llm = ChatNVIDIA(model=tool_models[0].id).bind_tools(tools=[get_current_weather])
response = llm.invoke("What is the weather in Boston?")
response.tool_calls

API Reference: tool

See How to use chat models to call tools for additional examples.
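
The model only proposes tool calls; running them is up to the application. The sketch below dispatches each entry of `response.tool_calls` (a list of dicts with `name`, `args`, and `id` keys) to a plain Python function. `run_tool_calls` and `fake_weather` are hypothetical names, and `fake_weather` returns a canned string rather than real data.

```python
from typing import Any, Callable, Dict, List


def run_tool_calls(
    tool_calls: List[dict], registry: Dict[str, Callable[..., Any]]
) -> List[dict]:
    """Execute each requested tool call with the matching function in `registry`."""
    results = []
    for call in tool_calls:
        func = registry.get(call["name"])
        if func is None:
            raise KeyError(f"No function registered for tool {call['name']!r}")
        results.append(
            {
                "tool_call_id": call.get("id"),
                "name": call["name"],
                "output": func(**call["args"]),  # args is a dict of keyword arguments
            }
        )
    return results


def fake_weather(location: str) -> str:
    """Placeholder implementation; returns a canned string, not real data."""
    return f"(no real data) weather requested for {location}"


# Usage with the response from the example above:
# print(run_tool_calls(response.tool_calls, {"get_current_weather": fake_weather}))
```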

Chaining

We can chain our model with a prompt template like so:

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate(
    [
        (
            "system",
            "You are a helpful assistant that translates {input_language} to {output_language}.",
        ),
        ("human", "{input}"),
    ]
)

chain = prompt | llm
chain.invoke(
    {
        "input_language": "English",
        "output_language": "German",
        "input": "I love programming.",
    }
)

API Reference: ChatPromptTemplate

API reference

For detailed documentation of all ChatNVIDIA features and configurations, head to the API reference: https://langchain-python.dev.org.tw/api_reference/nvidia_ai_endpoints/chat_models/langchain_nvidia_ai_endpoints.chat_models.ChatNVIDIA.html

Related

Chat model conceptual guide
Chat model how-to guides
