ChatOpenAI

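This notebook provides a quick overview for getting started with OpenAI chat models. For detailed documentation of all ChatOpenAI features and configurations, head to the API reference.

OpenAI has several chat models. You can find information about their latest models and their costs, context windows, and supported input types in the OpenAI docs.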

Azure OpenAI

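Note that certain OpenAI models can also be accessed via the Microsoft Azure platform. To use the Azure OpenAI service, use the AzureChatOpenAI integration.

Overview

Integration details

Class	Package	Local	Serializable	JS support
ChatOpenAI	langchain-openai	❌	beta	✅

Model features

Tool calling	Structured output	JSON mode	Image input	Audio input	Video input	Token-level streaming	Native async	Token usage	Logprobs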
✅	✅	✅	✅	✅	❌	✅	✅	✅	✅

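Setup

To access OpenAI models you'll need to create an OpenAI account, get an API key, and install the langchain-openai integration package.

Credentials

Head to https://platform.openai.com to sign up for OpenAI and generate an API key. Once you've done this, set the OPENAI_API_KEY environment variable: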

import getpass
import os

if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")

If you want automated tracing of your model calls, you can also set your LangSmith API key by uncommenting the lines below:

# os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
# os.environ["LANGSMITH_TRACING"] = "true"

Installation

The LangChain OpenAI integration lives in the langchain-openai package:

%pip install -qU langchain-openai

Instantiation

Now we can instantiate our model object and generate chat completions:

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o",
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
    # api_key="...",  # if you prefer to pass the API key in directly instead of using env vars
    # base_url="...",
    # organization="...",
    # other params...
)

API Reference: ChatOpenAI

Invocation

messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]
ai_msg = llm.invoke(messages)
ai_msg

AIMessage(content="J'adore la programmation.", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 5, 'prompt_tokens': 31, 'total_tokens': 36}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_3aa7262c27', 'finish_reason': 'stop', 'logprobs': None}, id='run-63219b22-03e3-4561-8cc4-78b7c7c3a3ca-0', usage_metadata={'input_tokens': 31, 'output_tokens': 5, 'total_tokens': 36})

print(ai_msg.content)

J'adore la programmation.

Chaining

We can chain our model with a prompt template like so:

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant that translates {input_language} to {output_language}.",
        ),
        ("human", "{input}"),
    ]
)

chain = prompt | llm
chain.invoke(
    {
        "input_language": "English",
        "output_language": "German",
        "input": "I love programming.",
    }
)

API Reference: ChatPromptTemplate

AIMessage(content='Ich liebe das Programmieren.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 26, 'total_tokens': 32}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_3aa7262c27', 'finish_reason': 'stop', 'logprobs': None}, id='run-350585e1-16ca-4dad-9460-3d9e7e49aaf1-0', usage_metadata={'input_tokens': 26, 'output_tokens': 6, 'total_tokens': 32})

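Tool calling

OpenAI has a tool calling API (we use "tool calling" and "function calling" interchangeably here) that lets you describe tools and their arguments, and have the model return a JSON object with a tool to invoke and the inputs to that tool. Tool calling is useful for building tool-using chains and agents, and more generally for getting structured outputs from models.

ChatOpenAI.bind_tools()

With ChatOpenAI.bind_tools, we can easily pass in Pydantic classes, dict schemas, LangChain tools, or even functions as tools to the model. Under the hood these are converted to an OpenAI tool schema, which looks like: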

{
    "name": "...",
    "description": "...",
    "parameters": {...}  # JSONSchema
}

and passed in every model invocation.

from pydantic import BaseModel, Field

class GetWeather(BaseModel):
    """Get the current weather in a given location"""

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")

llm_with_tools = llm.bind_tools([GetWeather])

ai_msg = llm_with_tools.invoke(
    "what is the weather like in San Francisco",
)
ai_msg

AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_o9udf3EVOWiV4Iupktpbpofk', 'function': {'arguments': '{"location":"San Francisco, CA"}', 'name': 'GetWeather'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 68, 'total_tokens': 85}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_3aa7262c27', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-1617c9b2-dda5-4120-996b-0333ed5992e2-0', tool_calls=[{'name': 'GetWeather', 'args': {'location': 'San Francisco, CA'}, 'id': 'call_o9udf3EVOWiV4Iupktpbpofk', 'type': 'tool_call'}], usage_metadata={'input_tokens': 68, 'output_tokens': 17, 'total_tokens': 85})

`strict=True`

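Requires langchain-openai>=0.1.21

As of Aug 6, 2024, OpenAI supports a strict argument when calling tools that enforces that the model respects the tool argument schema. See more here: https://platform.openai.com/docs/guides/function-calling

Note: If strict=True, the tool definition is also validated, and only a subset of JSON Schema is accepted. Crucially, the schema cannot have optional arguments (those with default values). Read the full docs on which types of schema are supported here: https://platform.openai.com/docs/guides/structured-outputs/supported-schemas.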

llm_with_tools = llm.bind_tools([GetWeather], strict=True)
ai_msg = llm_with_tools.invoke(
    "what is the weather like in San Francisco",
)
ai_msg

AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_jUqhd8wzAIzInTJl72Rla8ht', 'function': {'arguments': '{"location":"San Francisco, CA"}', 'name': 'GetWeather'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 68, 'total_tokens': 85}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_3aa7262c27', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-5e3356a9-132d-4623-8e73-dd5a898cf4a6-0', tool_calls=[{'name': 'GetWeather', 'args': {'location': 'San Francisco, CA'}, 'id': 'call_jUqhd8wzAIzInTJl72Rla8ht', 'type': 'tool_call'}], usage_metadata={'input_tokens': 68, 'output_tokens': 17, 'total_tokens': 85})
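The optional-argument restriction can be checked before binding a tool. The sketch below is a standard-library pre-check, not OpenAI's or LangChain's validator: `strict_schema_problems` is a hypothetical helper that looks only at the top level of a JSON Schema `parameters` object and reports properties that are missing from `required`. Both sample schemas are hand-written for illustration.

```python
def strict_schema_problems(parameters):
    """Return a message for every top-level property that is not required."""
    required = set(parameters.get("required", []))
    problems = []
    for name in parameters.get("properties", {}):
        if name not in required:
            problems.append(f"'{name}' is optional, which strict mode does not allow")
    return problems


# Hand-written schema resembling the GetWeather tool: one required argument.
weather_parameters = {
    "type": "object",
    "properties": {
        "location": {
            "type": "string",
            "description": "The city and state, e.g. San Francisco, CA",
        }
    },
    "required": ["location"],
}

# Hypothetical variant with an extra optional argument (it has a default value).
weather_with_unit = {
    "type": "object",
    "properties": {
        "location": {"type": "string"},
        "unit": {"type": "string", "default": "celsius"},
    },
    "required": ["location"],
}

ok_problems = strict_schema_problems(weather_parameters)
unit_problems = strict_schema_problems(weather_with_unit)
print(ok_problems)
print(unit_problems)
```

Nested objects are not inspected here; a fuller check would have to walk the schema recursively.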

AIMessage.tool_calls

Notice that the AIMessage has a tool_calls attribute. It contains the tool calls in a standardized ToolCall format that is model-provider agnostic.

ai_msg.tool_calls

[{'name': 'GetWeather',
  'args': {'location': 'San Francisco, CA'},
  'id': 'call_jUqhd8wzAIzInTJl72Rla8ht',
  'type': 'tool_call'}]
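Because tool_calls uses a provider-agnostic shape, dispatching the calls to local code needs only the name, args, and id keys. This is a minimal sketch using plain dicts: `get_weather` and `TOOL_REGISTRY` are hypothetical names that are not part of LangChain, the tool call is copied from the output above, and the result dicts are stand-ins for what you would normally wrap in a ToolMessage.

```python
def get_weather(location):
    """Hypothetical local tool; returns canned text instead of calling a weather API."""
    return f"It is sunny in {location}."


# Maps the tool name the model emits to the local function that implements it.
TOOL_REGISTRY = {"GetWeather": get_weather}


def run_tool_calls(tool_calls):
    """Execute each tool call and pair its result with the originating call id."""
    results = []
    for call in tool_calls:
        handler = TOOL_REGISTRY.get(call["name"])
        if handler is None:
            raise KeyError(f"No handler registered for tool {call['name']!r}")
        results.append({"tool_call_id": call["id"], "content": handler(**call["args"])})
    return results


tool_calls = [
    {
        "name": "GetWeather",
        "args": {"location": "San Francisco, CA"},
        "id": "call_jUqhd8wzAIzInTJl72Rla8ht",
        "type": "tool_call",
    }
]

results = run_tool_calls(tool_calls)
print(results)
```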

For more on binding tools and tool call outputs, head to the tool calling docs.

Responses API

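Requires langchain-openai>=0.3.9

OpenAI supports a Responses API that is oriented toward building agentic applications. It includes a suite of built-in tools, including web and file search. It also supports management of conversation state, allowing you to continue a conversational thread without explicitly passing in previous messages.

ChatOpenAI will route to the Responses API if one of these features is used. You can also specify use_responses_api=True when instantiating ChatOpenAI.

Built-in tools

Equipping ChatOpenAI with built-in tools grounds its responses in outside information, such as context from files or the web. The AIMessage generated by the model will include information about the built-in tool invocation.

Web search

To trigger a web search, pass {"type": "web_search_preview"} to the model as you would another tool.

Tip

You can also pass built-in tools as invocation params: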

llm.invoke("...", tools=[{"type": "web_search_preview"}])

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")

tool = {"type": "web_search_preview"}
llm_with_tools = llm.bind_tools([tool])

response = llm_with_tools.invoke("What was a positive news story from today?")

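API Reference: ChatOpenAI

Note that the response includes structured content blocks that contain both the text of the response and OpenAI annotations citing its sources: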

response.content

[{'type': 'text',
  'text': 'Today, a heartwarming story emerged from Minnesota, where a group of high school robotics students built a custom motorized wheelchair for a 2-year-old boy named Cillian Jackson. Born with a genetic condition that limited his mobility, Cillian\'s family couldn\'t afford the $20,000 wheelchair he needed. The students at Farmington High School\'s Rogue Robotics team took it upon themselves to modify a Power Wheels toy car into a functional motorized wheelchair for Cillian, complete with a joystick, safety bumpers, and a harness. One team member remarked, "I think we won here more than we do in our competitions. Instead of completing a task, we\'re helping change someone\'s life." ([boredpanda.com](https://www.boredpanda.com/wholesome-global-positive-news/?utm_source=openai))\n\nThis act of kindness highlights the profound impact that community support and innovation can have on individuals facing challenges. ',
  'annotations': [{'end_index': 778,
    'start_index': 682,
    'title': '“Global Positive News”: 40 Posts To Remind Us There’s Good In The World',
    'type': 'url_citation',
    'url': 'https://www.boredpanda.com/wholesome-global-positive-news/?utm_source=openai'}]}]
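The annotations can be turned into a flat list of citations with a few lines of standard-library Python. The sketch below assumes that start_index and end_index are character offsets into the block's text; `extract_url_citations` is a hypothetical helper, and the sample content block is made up to mirror the shape of response.content rather than copied from a real response.

```python
def extract_url_citations(content):
    """Return one dict per url_citation annotation found in text blocks."""
    citations = []
    for block in content:
        if block.get("type") != "text":
            continue
        text = block["text"]
        for annotation in block.get("annotations", []):
            if annotation.get("type") != "url_citation":
                continue
            start, end = annotation["start_index"], annotation["end_index"]
            citations.append(
                {
                    "title": annotation["title"],
                    "url": annotation["url"],
                    "cited_text": text[start:end],
                }
            )
    return citations


# Made-up content block; the offsets are computed so they always match the text.
marker = "([example.com](https://example.com/story))"
text = "Students built a custom wheelchair. " + marker
start = text.index(marker)
sample_content = [
    {
        "type": "text",
        "text": text,
        "annotations": [
            {
                "type": "url_citation",
                "title": "Example story",
                "url": "https://example.com/story",
                "start_index": start,
                "end_index": start + len(marker),
            }
        ],
    }
]

citations = extract_url_citations(sample_content)
for citation in citations:
    print(f"{citation['title']}: {citation['url']}")
```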

Tip

You can recover just the text content of the response as a string by using response.text(). For example, to stream the response text:

for token in llm_with_tools.stream("..."):
    print(token.text(), end="|")

See the streaming guide for more detail.

The output message will also contain information from any tool invocations:

response.additional_kwargs

{'tool_outputs': [{'id': 'ws_67d192aeb6cc81918e736ad4a57937570d6f8507990d9d71',
   'status': 'completed',
   'type': 'web_search_call'}]}

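File search

To trigger a file search, pass a file search tool to the model as you would another tool. You will need to populate an OpenAI-managed vector store and include the vector store ID in the tool definition. See the OpenAI documentation for more detail.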

llm = ChatOpenAI(model="gpt-4o-mini")

openai_vector_store_ids = [
    "vs_...",  # your IDs here
]

tool = {
    "type": "file_search",
    "vector_store_ids": openai_vector_store_ids,
}
llm_with_tools = llm.bind_tools([tool])

response = llm_with_tools.invoke("What is deep research by OpenAI?")
print(response.text())

Deep Research by OpenAI is a new capability integrated into ChatGPT that allows for the execution of multi-step research tasks independently. It can synthesize extensive amounts of online information and produce comprehensive reports similar to what a research analyst would do, significantly speeding up processes that would typically take hours for a human.

### Key Features:
- **Independent Research**: Users simply provide a prompt, and the model can find, analyze, and synthesize information from hundreds of online sources.
- **Multi-Modal Capabilities**: The model is also able to browse user-uploaded files, plot graphs using Python, and embed visualizations in its outputs.
- **Training**: Deep Research has been trained using reinforcement learning on real-world tasks that require extensive browsing and reasoning.

### Applications:
- Useful for professionals in sectors like finance, science, policy, and engineering, enabling them to obtain accurate and thorough research quickly.
- It can also be beneficial for consumers seeking personalized recommendations on complex purchases.

### Limitations:
Although Deep Research presents significant advancements, it has some limitations, such as the potential to hallucinate facts or struggle with authoritative information. 

Deep Research aims to facilitate access to thorough and documented information, marking a significant step toward the broader goal of developing artificial general intelligence (AGI).

As with web search, the response will include content blocks with citations:

response.content[0]["annotations"][:2]

[{'file_id': 'file-3UzgX7jcC8Dt9ZAFzywg5k',
  'index': 346,
  'type': 'file_citation',
  'filename': 'deep_research_blog.pdf'},
 {'file_id': 'file-3UzgX7jcC8Dt9ZAFzywg5k',
  'index': 575,
  'type': 'file_citation',
  'filename': 'deep_research_blog.pdf'}]
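When a response cites the same file several times, it can help to collapse the annotations into one entry per file. This is a small standard-library sketch: `citations_by_file` is a hypothetical helper, and the input is the two annotations shown above.

```python
from collections import defaultdict


def citations_by_file(annotations):
    """Group file_citation annotations by filename, keeping each citation index."""
    grouped = defaultdict(list)
    for annotation in annotations:
        if annotation.get("type") == "file_citation":
            grouped[annotation["filename"]].append(annotation["index"])
    return dict(grouped)


annotations = [
    {
        "file_id": "file-3UzgX7jcC8Dt9ZAFzywg5k",
        "index": 346,
        "type": "file_citation",
        "filename": "deep_research_blog.pdf",
    },
    {
        "file_id": "file-3UzgX7jcC8Dt9ZAFzywg5k",
        "index": 575,
        "type": "file_citation",
        "filename": "deep_research_blog.pdf",
    },
]

grouped = citations_by_file(annotations)
print(grouped)
```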

It will also include information from the built-in tool invocations:

response.additional_kwargs

{'tool_outputs': [{'id': 'fs_67d196fbb83c8191ba20586175331687089228ce932eceb1',
   'queries': ['What is deep research by OpenAI?'],
   'status': 'completed',
   'type': 'file_search_call'}]}

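Computer use

ChatOpenAI supports the "computer-use-preview" model, which is a specialized model for the built-in computer use tool. To enable it, pass a computer use tool as you would pass another tool.

Currently, tool outputs for computer use are present in AIMessage.additional_kwargs["tool_outputs"]. To reply to the computer use tool call, construct a ToolMessage with {"type": "computer_call_output"} in its additional_kwargs. The content of the message will be a screenshot. Below, we demonstrate a simple example.

First, load two screenshots: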

import base64


def load_png_as_base64(file_path):
    with open(file_path, "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read())
        return encoded_string.decode("utf-8")


screenshot_1_base64 = load_png_as_base64(
    "/path/to/screenshot_1.png"
)  # perhaps a screenshot of an application
screenshot_2_base64 = load_png_as_base64(
    "/path/to/screenshot_2.png"
)  # perhaps a screenshot of the Desktop

from langchain_openai import ChatOpenAI

# Initialize model
llm = ChatOpenAI(
    model="computer-use-preview",
    model_kwargs={"truncation": "auto"},
)

# Bind computer-use tool
tool = {
    "type": "computer_use_preview",
    "display_width": 1024,
    "display_height": 768,
    "environment": "browser",
}
llm_with_tools = llm.bind_tools([tool])

# Construct input message
input_message = {
    "role": "user",
    "content": [
        {
            "type": "text",
            "text": (
                "Click the red X to close and reveal my Desktop. "
                "Proceed, no confirmation needed."
            ),
        },
        {
            "type": "input_image",
            "image_url": f"data:image/png;base64,{screenshot_1_base64}",
        },
    ],
}

# Invoke model
response = llm_with_tools.invoke(
    [input_message],
    reasoning={
        "generate_summary": "concise",
    },
)

API Reference: ChatOpenAI

The response will include a call to the computer-use tool in its additional_kwargs:

response.additional_kwargs

{'reasoning': {'id': 'rs_67ddb381c85081919c46e3e544a161e8051ff325ba1bad35',
  'summary': [{'text': 'Closing Visual Studio Code application',
    'type': 'summary_text'}],
  'type': 'reasoning'},
 'tool_outputs': [{'id': 'cu_67ddb385358c8191bf1a127b71bcf1ea051ff325ba1bad35',
   'action': {'button': 'left', 'type': 'click', 'x': 17, 'y': 38},
   'call_id': 'call_Ae3Ghz8xdqZQ01mosYhXXMho',
   'pending_safety_checks': [],
   'status': 'completed',
   'type': 'computer_call'}]}
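Before replying, the computer call has to be located in tool_outputs and its action interpreted. The sketch below works on a plain dict copied from the output above (with the reasoning entry omitted). `first_computer_call` and `describe_action` are hypothetical helpers, and only the click action seen in this example is handled; other action types are reported as unhandled rather than guessed at.

```python
def first_computer_call(additional_kwargs):
    """Return the first computer_call entry in tool_outputs, or None if absent."""
    for output in additional_kwargs.get("tool_outputs", []):
        if output.get("type") == "computer_call":
            return output
    return None


def describe_action(action):
    """Produce a human-readable summary of the requested action."""
    if action["type"] == "click":
        return f"{action['button']} click at ({action['x']}, {action['y']})"
    return f"unhandled action type: {action['type']}"


additional_kwargs = {
    "tool_outputs": [
        {
            "id": "cu_67ddb385358c8191bf1a127b71bcf1ea051ff325ba1bad35",
            "action": {"button": "left", "type": "click", "x": 17, "y": 38},
            "call_id": "call_Ae3Ghz8xdqZQ01mosYhXXMho",
            "pending_safety_checks": [],
            "status": "completed",
            "type": "computer_call",
        }
    ]
}

computer_call = first_computer_call(additional_kwargs)
summary = describe_action(computer_call["action"])
print(computer_call["call_id"], summary)
```

In a real loop you would perform the action in your own environment, take a new screenshot, and send it back using the call_id extracted here.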

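Next, we construct a ToolMessage with these properties:

It has a tool_call_id matching the call_id from the computer call.
It has {"type": "computer_call_output"} in its additional_kwargs.
Its content is either an image_url or an input_image output block (see the OpenAI docs for formatting).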

from langchain_core.messages import ToolMessage

tool_call_id = response.additional_kwargs["tool_outputs"][0]["call_id"]

tool_message = ToolMessage(
    content=[
        {
            "type": "input_image",
            "image_url": f"data:image/png;base64,{screenshot_2_base64}",
        }
    ],
    # content=f"data:image/png;base64,{screenshot_2_base64}",  # <-- also acceptable
    tool_call_id=tool_call_id,
    additional_kwargs={"type": "computer_call_output"},
)

API Reference: ToolMessage

We can now invoke the model again using the message history:

messages = [
    input_message,
    response,
    tool_message,
]

response_2 = llm_with_tools.invoke(
    messages,
    reasoning={
        "generate_summary": "concise",
    },
)

response_2.text()

'Done! The Desktop is now visible.'

Instead of passing back the entire sequence, we can also use previous_response_id:

previous_response_id = response.response_metadata["id"]

response_2 = llm_with_tools.invoke(
    [tool_message],
    previous_response_id=previous_response_id,
    reasoning={
        "generate_summary": "concise",
    },
)

response_2.text()

'The Visual Studio Code terminal has been closed and your desktop is now visible.'

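Managing conversation state

The Responses API supports management of conversation state.

Manually manage state

You can manage the state manually or using LangGraph, as with other chat models: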

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")

tool = {"type": "web_search_preview"}
llm_with_tools = llm.bind_tools([tool])

first_query = "What was a positive news story from today?"
messages = [{"role": "user", "content": first_query}]

response = llm_with_tools.invoke(messages)
response_text = response.text()
print(f"{response_text[:100]}... {response_text[-100:]}")

API Reference: ChatOpenAI

As of March 12, 2025, here are some positive news stories that highlight recent uplifting events:

*...  exemplify positive developments in health, environmental sustainability, and community well-being.

second_query = (
    "Repeat my question back to me, as well as the last sentence of your answer."
)

messages.extend(
    [
        response,
        {"role": "user", "content": second_query},
    ]
)
second_response = llm_with_tools.invoke(messages)
print(second_response.text())

Your question was: "What was a positive news story from today?"

The last sentence of my answer was: "These stories exemplify positive developments in health, environmental sustainability, and community well-being."
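The bookkeeping in the manual approach (append the reply, append the next user message, send the whole list again) can be wrapped in a small class. This is a sketch using plain dicts and a stand-in model so it runs offline: `Conversation` and `fake_model` are hypothetical names, and with a real chat model the reply would be an AIMessage rather than a dict, so the `reply["content"]` access would need adapting.

```python
class Conversation:
    """Keeps the full message history and resends it on every turn."""

    def __init__(self, model):
        self.model = model  # any callable that takes a message list and returns a reply
        self.messages = []

    def ask(self, text):
        self.messages.append({"role": "user", "content": text})
        reply = self.model(list(self.messages))  # pass a copy of the history so far
        self.messages.append(reply)
        return reply["content"]


def fake_model(messages):
    """Stand-in for a chat model: proves it can see earlier turns."""
    user_turns = [m["content"] for m in messages if m["role"] == "user"]
    return {
        "role": "assistant",
        "content": f"Seen {len(user_turns)} user message(s); first: {user_turns[0]}",
    }


conversation = Conversation(fake_model)
first_answer = conversation.ask("What was a positive news story from today?")
second_answer = conversation.ask("Repeat my question back to me.")
print(second_answer)
```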

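Tip

You can use LangGraph to manage conversational threads for you in a variety of backends, including in-memory and Postgres. See this tutorial to get started.

Passing `previous_response_id`

When using the Responses API, LangChain messages will include an "id" field in their metadata. Passing this ID to subsequent invocations will continue the conversation. Note that, from a billing perspective, this is equivalent to manually passing in the messages.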

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o-mini",
    use_responses_api=True,
)
response = llm.invoke("Hi, I'm Bob.")
print(response.text())

API Reference: ChatOpenAI

Hi Bob! How can I assist you today?

second_response = llm.invoke(
    "What is my name?",
    previous_response_id=response.response_metadata["id"],
)
print(second_response.text())

Your name is Bob. How can I help you today, Bob?

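Fine-tuning

You can call fine-tuned OpenAI models by passing the corresponding model name as the model parameter (the example below uses the model_name alias).

This generally takes the form of ft:{OPENAI_MODEL_NAME}:{ORG_NAME}::{MODEL_ID}. For example: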

fine_tuned_model = ChatOpenAI(
    temperature=0, model_name="ft:gpt-3.5-turbo-0613:langchain::7qTVM5AR"
)

fine_tuned_model.invoke(messages)

AIMessage(content="J'adore la programmation.", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 8, 'prompt_tokens': 31, 'total_tokens': 39}, 'model_name': 'ft:gpt-3.5-turbo-0613:langchain::7qTVM5AR', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-0f39b30e-c56e-4f3b-af99-5c948c984146-0', usage_metadata={'input_tokens': 31, 'output_tokens': 8, 'total_tokens': 39})
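The fine-tuned model name can be split back into its parts when you need to log or validate it. The sketch below is a standard-library illustration: `parse_fine_tuned_name` and `FineTunedModelName` are hypothetical names, and the empty field between the double colons is treated here as an optional suffix slot, which is an assumption about the format rather than something this page states.

```python
from typing import NamedTuple


class FineTunedModelName(NamedTuple):
    base_model: str
    org: str
    suffix: str  # assumed optional; empty in the example on this page
    model_id: str


def parse_fine_tuned_name(name):
    """Split 'ft:{base}:{org}:{suffix}:{id}' into named fields."""
    parts = name.split(":")
    if len(parts) != 5 or parts[0] != "ft":
        raise ValueError(f"Not a fine-tuned model name: {name!r}")
    _, base_model, org, suffix, model_id = parts
    return FineTunedModelName(base_model, org, suffix, model_id)


parsed = parse_fine_tuned_name("ft:gpt-3.5-turbo-0613:langchain::7qTVM5AR")
print(parsed)
```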

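Multimodal Inputs

OpenAI has models that support multimodal inputs. You can pass in images or audio to these models. For more information on how to do this in LangChain, head to the multimodal inputs docs.

You can see the list of models that support different modalities in OpenAI's documentation.

At the time of writing, the main OpenAI models you would use are:

Image inputs: gpt-4o, gpt-4o-mini
Audio inputs: gpt-4o-audio-preview

For an example of passing in image inputs, see the multimodal inputs how-to guide.

Below is an example of passing audio inputs to gpt-4o-audio-preview: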

import base64

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o-audio-preview",
    temperature=0,
)

with open(
    "../../../../libs/partners/openai/tests/integration_tests/chat_models/audio_input.wav",
    "rb",
) as f:
    # b64 encode it
    audio = f.read()
    audio_b64 = base64.b64encode(audio).decode()


output_message = llm.invoke(
    [
        (
            "human",
            [
                {"type": "text", "text": "Transcribe the following:"},
                # the audio clip says "I'm sorry, but I can't create..."
                {
                    "type": "input_audio",
                    "input_audio": {"data": audio_b64, "format": "wav"},
                },
            ],
        ),
    ]
)
output_message.content

API Reference: ChatOpenAI

"I'm sorry, but I can't create audio content that involves yelling. Is there anything else I can help you with?"

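The input_audio content block used above can be built from any bytes object. The sketch below generates a short silent WAV clip in memory with the standard library so that it runs without a file on disk; `make_silent_wav` and `audio_content_block` are hypothetical helpers, and the clip stands in for real speech.

```python
import base64
import io
import wave


def make_silent_wav(num_frames=1600, sample_rate=16000):
    """Build a mono, 16-bit WAV clip of silence entirely in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * num_frames)
    return buffer.getvalue()


def audio_content_block(audio_bytes, audio_format="wav"):
    """Wrap raw audio bytes in the input_audio block shape shown above."""
    return {
        "type": "input_audio",
        "input_audio": {
            "data": base64.b64encode(audio_bytes).decode("utf-8"),
            "format": audio_format,
        },
    }


wav_bytes = make_silent_wav()
block = audio_content_block(wav_bytes)
print(block["type"], block["input_audio"]["format"], len(block["input_audio"]["data"]))
```

The resulting block can be placed in a message's content list alongside a text block, as in the example above.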
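Predicted output

Info

Requires langchain-openai>=0.2.6

Some OpenAI models (such as their gpt-4o and gpt-4o-mini series) support Predicted Outputs, which let you pass in a known portion of the LLM's expected output ahead of time to reduce latency. This is useful for cases such as editing text or code, where only a small part of the model's output will change.

Here's an example: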

code = """
/// <summary>
/// Represents a user with a first name, last name, and username.
/// </summary>
public class User
{
    /// <summary>
    /// Gets or sets the user's first name.
    /// </summary>
    public string FirstName { get; set; }

    /// <summary>
    /// Gets or sets the user's last name.
    /// </summary>
    public string LastName { get; set; }

    /// <summary>
    /// Gets or sets the user's username.
    /// </summary>
    public string Username { get; set; }
}
"""

llm = ChatOpenAI(model="gpt-4o")
query = (
    "Replace the Username property with an Email property. "
    "Respond only with code, and with no markdown formatting."
)
response = llm.invoke(
    [{"role": "user", "content": query}, {"role": "user", "content": code}],
    prediction={"type": "content", "content": code},
)
print(response.content)
print(response.response_metadata)

/// <summary>
/// Represents a user with a first name, last name, and email.
/// </summary>
public class User
{
    /// <summary>
    /// Gets or sets the user's first name.
    /// </summary>
    public string FirstName { get; set; }

    /// <summary>
    /// Gets or sets the user's last name.
    /// </summary>
    public string LastName { get; set; }

    /// <summary>
    /// Gets or sets the user's email.
    /// </summary>
    public string Email { get; set; }
}
{'token_usage': {'completion_tokens': 226, 'prompt_tokens': 166, 'total_tokens': 392, 'completion_tokens_details': {'accepted_prediction_tokens': 49, 'audio_tokens': None, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 107}, 'prompt_tokens_details': {'audio_tokens': None, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_45cf54deae', 'finish_reason': 'stop', 'logprobs': None}

Note that predictions are currently billed as additional tokens and may increase your usage and costs in exchange for the reduced latency.
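To judge whether a prediction is paying off, the accepted and rejected prediction token counts in response_metadata can be compared. The sketch below is a standard-library illustration: `prediction_stats` is a hypothetical helper, and the token_usage dict is a trimmed copy of the response_metadata printed above.

```python
def prediction_stats(token_usage):
    """Summarize how much of the supplied prediction the model actually used."""
    details = token_usage["completion_tokens_details"]
    accepted = details["accepted_prediction_tokens"]
    rejected = details["rejected_prediction_tokens"]
    total = accepted + rejected
    return {
        "accepted": accepted,
        "rejected": rejected,
        "acceptance_rate": accepted / total if total else 0.0,
    }


token_usage = {
    "completion_tokens": 226,
    "prompt_tokens": 166,
    "total_tokens": 392,
    "completion_tokens_details": {
        "accepted_prediction_tokens": 49,
        "rejected_prediction_tokens": 107,
    },
}

stats = prediction_stats(token_usage)
print(f"accepted={stats['accepted']} rejected={stats['rejected']} "
      f"rate={stats['acceptance_rate']:.2f}")
```

A low acceptance rate suggests the prediction differs too much from the final output to be worth its extra token cost.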

Audio Generation (Preview)

Info

Requires langchain-openai>=0.2.3

OpenAI has a new audio generation feature that allows you to use audio inputs and outputs with the gpt-4o-audio-preview model.

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o-audio-preview",
    temperature=0,
    model_kwargs={
        "modalities": ["text", "audio"],
        "audio": {"voice": "alloy", "format": "wav"},
    },
)

output_message = llm.invoke(
    [
        ("human", "Are you made by OpenAI? Just answer yes or no"),
    ]
)

API Reference: ChatOpenAI

output_message.additional_kwargs['audio'] will contain a dictionary like:

{
    'data': '<audio data b64-encoded>',
    'expires_at': 1729268602,
    'id': 'audio_67127d6a44348190af62c1530ef0955a',
    'transcript': 'Yes.'
}

The format will be whatever was passed in model_kwargs['audio']['format'].
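The data field has to be base64-decoded before it can be played or saved. The sketch below builds a stand-in audio dictionary with the same keys as the one above, using silent WAV bytes generated in memory, then decodes it. `decode_audio`, `is_expired`, and `make_silent_wav` are hypothetical helpers, and treating expires_at as a Unix timestamp in seconds is an assumption.

```python
import base64
import io
import time
import wave
from typing import Optional


def make_silent_wav(num_frames=800, sample_rate=16000):
    """Build a mono, 16-bit WAV clip of silence to stand in for model audio."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * num_frames)
    return buffer.getvalue()


def decode_audio(audio):
    """Return the raw audio bytes from the base64-encoded 'data' field."""
    return base64.b64decode(audio["data"])


def is_expired(audio, now: Optional[float] = None) -> bool:
    """Assumes expires_at is a Unix timestamp in seconds."""
    current = time.time() if now is None else now
    return current >= audio["expires_at"]


# Stand-in for output_message.additional_kwargs["audio"].
sample_audio = {
    "data": base64.b64encode(make_silent_wav()).decode("utf-8"),
    "expires_at": 1729268602,
    "id": "audio_67127d6a44348190af62c1530ef0955a",
    "transcript": "Yes.",
}

wav_bytes = decode_audio(sample_audio)
with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
    frame_count = wav_file.getnframes()
print(sample_audio["transcript"], frame_count)
```

Writing wav_bytes to a file opened in binary mode would give a playable clip when the requested format is wav.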

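We can also pass this message with its audio data back to the model as part of a message history, as long as the OpenAI expires_at time has not been reached.

Note

Output audio is stored under the audio key in AIMessage.additional_kwargs, but input content blocks are typed with an input_audio type and key in HumanMessage.content lists.

For more information, see OpenAI's audio docs.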

history = [
    ("human", "Are you made by OpenAI? Just answer yes or no"),
    output_message,
    ("human", "And what is your name? Just give your name."),
]
second_output_message = llm.invoke(history)

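API reference

For detailed documentation of all ChatOpenAI features and configurations, head to the API reference: https://langchain-python.dev.org.tw/api_reference/openai/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html

Related

Chat model conceptual guide
Chat model how-to guides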
