
Ray Serve

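Ray Serve is a scalable model-serving library for building online inference APIs. Serve is particularly well suited to system composition: it lets you build a complex inference service made up of multiple chains and business logic, all in Python code.

Goal of this notebook

This notebook shows a simple example of how to deploy an OpenAI chain to production. You can extend it to deploy your own self-hosted models, where you can define the amount of hardware resources (GPUs and CPUs) needed to run your model efficiently in production. Read the Ray Serve documentation to learn more about the available options, including autoscaling.

Set up Ray Serve

Install Ray with pip install "ray[serve]".

General skeleton

The general skeleton for deploying a service is the following: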

# 0: Import ray serve and request from starlette
from ray import serve
from starlette.requests import Request


# 1: Define a Ray Serve deployment.
@serve.deployment
class LLMServe:
    def __init__(self) -> None:
        # All the initialization code goes here
        pass

    async def __call__(self, request: Request) -> str:
        # You can parse the request here
        # and return a response
        return "Hello World"


# 2: Bind the model to deployment
deployment = LLMServe.bind()

# 3: Run the deployment
serve.run(deployment)
# Shutdown the deployment
serve.shutdown()
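
The goal section mentions defining hardware resources and autoscaling for a deployment. The following is a minimal sketch of how that might look, not a recommended configuration: the resource numbers, the DEPLOYMENT_OPTIONS name, and the with_resources helper are all hypothetical, while ray_actor_options and autoscaling_config are options of serve.deployment. The Ray import is deferred into the helper so the settings can be inspected without a running cluster.

```python
from typing import Any, Dict

# Hypothetical resource settings; tune them for your own model and cluster.
DEPLOYMENT_OPTIONS: Dict[str, Any] = {
    # Resources reserved for each replica of the deployment.
    "ray_actor_options": {"num_cpus": 2, "num_gpus": 1},
    # Let Serve scale the replica count between these bounds.
    "autoscaling_config": {"min_replicas": 1, "max_replicas": 4},
}


def with_resources(cls: type):
    """Wrap a plain (undecorated) class as a Serve deployment using the options above."""
    # Imported lazily so this module can be loaded without Ray installed.
    from ray import serve

    return serve.deployment(**DEPLOYMENT_OPTIONS)(cls)
```

Assuming an undecorated class such as the skeleton's LLMServe, with_resources(LLMServe).bind() would then take the place of LLMServe.bind().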

Example of deploying an OpenAI chain with a custom prompt

Get an OpenAI API key from OpenAI. When you run the following code, you will be prompted for your API key.

from langchain.chains import LLMChain
from langchain_core.prompts import PromptTemplate
from langchain_openai import OpenAI
from getpass import getpass

OPENAI_API_KEY = getpass()


@serve.deployment
class DeployLLM:
    def __init__(self):
        # We initialize the LLM, template and the chain here
        llm = OpenAI(openai_api_key=OPENAI_API_KEY)
        template = "Question: {question}\n\nAnswer: Let's think step by step."
        prompt = PromptTemplate.from_template(template)
        self.chain = LLMChain(llm=llm, prompt=prompt)

    def _run_chain(self, text: str):
        return self.chain(text)

    async def __call__(self, request: Request):
        # 1. Parse the request
        text = request.query_params["text"]
        # 2. Run the chain
        resp = self._run_chain(text)
        # 3. Return the response
        return resp["text"]
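
To make the prompt step concrete, here is a small sketch of what the template expands to before it is sent to the LLM. It uses plain str.format as a stand-in for PromptTemplate, on the assumption that the template has a single {question} placeholder as above; render_prompt is a hypothetical helper, not part of LangChain.

```python
# Same template string as in the deployment above.
TEMPLATE = "Question: {question}\n\nAnswer: Let's think step by step."


def render_prompt(question: str) -> str:
    """Fill the single {question} placeholder with the incoming text."""
    return TEMPLATE.format(question=question)


prompt_text = render_prompt("What is Ray Serve?")
print(prompt_text)
```

The value of the text query parameter plays the role of question here, so whatever the client sends ends up between "Question: " and the fixed "Answer:" suffix.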

Now we can bind the deployment.

# Bind the model to deployment
deployment = DeployLLM.bind()

When we run the deployment, we can specify the port number and host.

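# Example port number
PORT_NUMBER = 8282
# Configure the HTTP proxy, then run the deployment.
# Recent Ray releases take host/port via serve.start(http_options=...);
# older releases also accepted port= directly on serve.run.
serve.start(http_options={"host": "127.0.0.1", "port": PORT_NUMBER})
serve.run(deployment)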

Now that the service is deployed at localhost:8282, we can send a POST request to get the results back.

import requests

text = "What NFL team won the Super Bowl in the year Justin Bieber was born?"
# Ray Serve serves plain HTTP by default; params= URL-encodes the query string
response = requests.post(f"http://127.0.0.1:{PORT_NUMBER}/", params={"text": text})
print(response.content.decode())
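
Because the question travels in the query string, spaces and characters such as "?" need to be percent-encoded. The sketch below shows one way to build such a URL with only the standard library; build_query_url and its default host are hypothetical names chosen for illustration.

```python
from urllib.parse import urlencode


def build_query_url(port: int, text: str, host: str = "127.0.0.1") -> str:
    """Return the service URL with `text` percent-encoded as a query parameter."""
    return f"http://{host}:{port}/?{urlencode({'text': text})}"


url = build_query_url(8282, "Who won the Super Bowl in 1994?")
print(url)  # http://127.0.0.1:8282/?text=Who+won+the+Super+Bowl+in+1994%3F
```

On the server side, request.query_params["text"] decodes this back to the original string.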
