LlamaEdge

LlamaEdge 讓您能夠在本機和透過聊天服務與 GGUF 格式的 LLM 聊天。

LlamaEdgeChatService 為開發人員提供 OpenAI API 相容服務，以透過 HTTP 請求與 LLM 聊天。
LlamaEdgeChatLocal 讓開發人員能夠在本機與 LLM 聊天（即將推出）。

LlamaEdgeChatService 和 LlamaEdgeChatLocal 都在由 WasmEdge Runtime 驅動的基礎架構上運行，WasmEdge Runtime 為 LLM 推論任務提供輕量且可攜式的 WebAssembly 容器環境。

透過 API 服務聊天

LlamaEdgeChatService 在 llama-api-server 上運作。按照 llama-api-server 快速入門中的步驟，您可以託管自己的 API 服務，以便您可以隨時隨地在任何裝置上與您喜歡的任何模型聊天，只要有網路連線即可。

from langchain_community.chat_models.llama_edge import LlamaEdgeChatService
from langchain_core.messages import HumanMessage, SystemMessage

API 參考：LlamaEdgeChatService | HumanMessage | SystemMessage

在非串流模式下與 LLM 聊天

# service url
service_url = "https://b008-54-186-154-209.ngrok-free.app"

# create wasm-chat service instance
chat = LlamaEdgeChatService(service_url=service_url)

# create message sequence
system_message = SystemMessage(content="You are an AI assistant")
user_message = HumanMessage(content="What is the capital of France?")
messages = [system_message, user_message]

# chat with wasm-chat service
response = chat.invoke(messages)

print(f"[Bot] {response.content}")

[Bot] Hello! The capital of France is Paris.

在串流模式下與 LLM 聊天

# service url
service_url = "https://b008-54-186-154-209.ngrok-free.app"

# create wasm-chat service instance
chat = LlamaEdgeChatService(service_url=service_url, streaming=True)

# create message sequence
system_message = SystemMessage(content="You are an AI assistant")
user_message = HumanMessage(content="What is the capital of Norway?")
messages = [
    system_message,
    user_message,
]

output = ""
for chunk in chat.stream(messages):
    # print(chunk.content, end="", flush=True)
    output += chunk.content

print(f"[Bot] {output}")

[Bot]   Hello! I'm happy to help you with your question. The capital of Norway is Oslo.

聊天模型概念指南
聊天模型操作指南

透過 API 服務聊天​

在非串流模式下與 LLM 聊天​

在串流模式下與 LLM 聊天​

相關內容​

此頁面是否有幫助？

透過 API 服務聊天

在非串流模式下與 LLM 聊天

在串流模式下與 LLM 聊天

相關內容