RELLM
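RELLM is a library that wraps local Hugging Face pipeline models for structured decoding.
It works by generating one token at a time. At each step, it masks the tokens that do not conform to the provided partial regular expression.
Warning - this module is still experimental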
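The masking step described above can be illustrated with a toy sketch. This is not RELLM's implementation: the vocabulary, the `ALLOWED` strings, and the scoring function are all hypothetical, and a simple prefix check over a finite set of strings stands in for partial regular expression matching.

```python
# Toy illustration of constrained decoding; NOT RELLM's actual code.
# Assumption: the "pattern" is a finite set of allowed strings, so a
# prefix check can stand in for partial regular expression matching.
ALLOWED = ["yes", "no", "maybe"]
VOCAB = ["y", "e", "s", "n", "o", "ma", "be", "x", "!"]  # hypothetical tokens


def is_valid_prefix(text):
    """Return True if `text` can still be extended into an allowed string."""
    return any(option.startswith(text) for option in ALLOWED)


def allowed_tokens(text_so_far):
    """Mask the vocabulary: keep tokens that leave the output a valid prefix."""
    return [tok for tok in VOCAB if is_valid_prefix(text_so_far + tok)]


def constrained_decode(score):
    """Greedy decoding that picks the best-scoring token among unmasked ones.

    `score(text, token)` is a stand-in for the model's logits.
    """
    text = ""
    while text not in ALLOWED:
        candidates = allowed_tokens(text)
        text += max(candidates, key=lambda tok: score(text, tok))
    return text


def prefers_x(text, tok):
    """A fake "model" that would emit "x" if nothing stopped it."""
    return {"x": 10, "n": 5}.get(tok, 0)


print(allowed_tokens(""))
print(constrained_decode(prefers_x))
```

Even though the fake model scores `"x"` highest, that token is never selectable because no allowed string starts with it, so the decoder falls back to the best token that keeps the output valid.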
%pip install --upgrade --quiet rellm langchain-huggingface langchain-experimental > /dev/null
Hugging Face Baseline
First, let's establish a qualitative baseline by checking the output of the model without structured decoding.
import logging
logging.basicConfig(level=logging.ERROR)
prompt = """Human: "What's the capital of the United States?"
AI Assistant:{
"action": "Final Answer",
"action_input": "The capital of the United States is Washington D.C."
}
Human: "What's the capital of Pennsylvania?"
AI Assistant:{
"action": "Final Answer",
"action_input": "The capital of Pennsylvania is Harrisburg."
}
Human: "What 2 + 5?"
AI Assistant:{
"action": "Final Answer",
"action_input": "2 + 5 = 7."
}
Human: 'What's the capital of Maryland?'
AI Assistant:"""
from langchain_huggingface import HuggingFacePipeline
from transformers import pipeline
hf_model = pipeline(
"text-generation", model="cerebras/Cerebras-GPT-590M", max_new_tokens=200
)
original_model = HuggingFacePipeline(pipeline=hf_model)
generated = original_model.generate([prompt], stop=["Human:"])
print(generated)
API Reference: HuggingFacePipeline
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
generations=[[Generation(text=' "What\'s the capital of Maryland?"\n', generation_info=None)]] llm_output=None
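That's not very impressive, is it? The model did not answer the question, and its output does not follow the JSON format. Let's try the structured decoder.
RELLM LLM Wrapper
Let's try again, this time providing a regular expression that matches the JSON structured format.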
import regex # Note this is the regex library NOT python's re stdlib module
# We'll choose a regex that matches to a structured json string that looks like:
# {
# "action": "Final Answer",
# "action_input": string or dict
# }
pattern = regex.compile(
r'\{\s*"action":\s*"Final Answer",\s*"action_input":\s*(\{.*\}|"[^"]*")\s*\}\nHuman:'
)
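To see what this pattern accepts, here is a small sketch that checks complete strings against it. It uses the standard library `re` module instead of `regex`, on the assumption that this particular pattern behaves the same in both when matching whole strings; the `conforms` helper and the sample strings are illustrative only.

```python
import re

# Same pattern as above, compiled with stdlib `re` for this check only.
# Assumption: full-string matching is enough here; RELLM itself relies on
# the third-party `regex` library during decoding.
PATTERN = re.compile(
    r'\{\s*"action":\s*"Final Answer",\s*"action_input":\s*(\{.*\}|"[^"]*")\s*\}\nHuman:'
)

# A completion in the few-shot format, followed by the next "Human:" turn.
well_formed = (
    '{\n"action": "Final Answer",\n'
    '"action_input": "The capital of Maryland is Baltimore."\n}\nHuman:'
)
# The baseline model's output from the previous section.
baseline = ' "What\'s the capital of Maryland?"\n'


def conforms(text):
    """Return True if the whole string matches the pattern."""
    return PATTERN.fullmatch(text) is not None


print(conforms(well_formed))
print(conforms(baseline))
# Without the trailing "\nHuman:" the pattern is not fully matched.
print(conforms(well_formed[: -len("\nHuman:")]))
```

The pattern ends with `\nHuman:`, so a complete match always finishes with the start of the next turn, which presumably lines up with the `stop=["Human:"]` argument used when calling the model.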
from langchain_experimental.llms import RELLM
model = RELLM(pipeline=hf_model, regex=pattern, max_new_tokens=200)
generated = model.predict(prompt, stop=["Human:"])
print(generated)
API Reference: RELLM
{"action": "Final Answer",
"action_input": "The capital of Maryland is Baltimore."
}
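Voilà! No parsing errors. Note that structured decoding constrains only the format of the output, not its correctness: the capital of Maryland is Annapolis, not Baltimore.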
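One way to check the output format is to feed both outputs to a JSON parser. The sketch below uses only the standard library; `parse_action` is a hypothetical helper, and the two strings are copied from the outputs shown above.

```python
import json

# Outputs copied from the two runs above.
structured = (
    '{"action": "Final Answer",\n'
    '"action_input": "The capital of Maryland is Baltimore."\n}'
)
baseline = ' "What\'s the capital of Maryland?"\n'


def parse_action(text):
    """Return the action input if `text` is a JSON object with the
    expected "Final Answer" action, otherwise None."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None
    # The baseline output happens to be a valid bare JSON string,
    # so the type check is needed to reject it.
    if isinstance(data, dict) and data.get("action") == "Final Answer":
        return data.get("action_input")
    return None


print(parse_action(structured))
print(parse_action(baseline))
```

The structured output yields its `action_input` value, while the baseline output is rejected because it is not a JSON object in the expected shape.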