LangFair: Use-Case Level LLM Bias and Fairness Assessments

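LangFair is a Python library for conducting bias and fairness assessments of large language model (LLM) use cases. The LangFair repository includes a framework for choosing bias and fairness metrics for an LLM use case, along with demo notebooks and a technical playbook that discusses LLM bias and fairness risks, evaluation metrics, and best practices.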

Explore our documentation site for detailed instructions on using LangFair.

⚡ Quickstart Guide

(Optional) Create a virtual environment for using LangFair

We recommend creating a new virtual environment with venv before using LangFair. To do so, please follow the instructions here.

Install LangFair

The latest version can be installed from PyPI:

pip install langfair

Usage Examples

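The code samples below show how to use LangFair to assess bias and fairness risks in text generation and summarization use cases. They assume the user has already defined a list of prompts from their use case, named prompts.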
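As a minimal sketch of what such a list might look like, assuming a summarization use case: the sample texts and the instruction wording below are hypothetical and only illustrate the expected shape, a plain list of strings.

```python
# Hypothetical input texts; in practice these come from your own use case.
texts = [
    "The customer reported a billing error on her March invoice.",
    "He asked whether the warranty covers accidental damage.",
]

# Assumed instruction wording; adapt it to match your application's prompt.
instruction = "Summarize the following text in one sentence: "

# `prompts` is the list of strings that the later examples expect.
prompts = [instruction + text for text in texts]
print(len(prompts))
```

Drawing the prompts from real use-case inputs, rather than a generic benchmark, is what makes the assessment specific to the use case.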

Generate LLM responses

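To generate responses, we can use LangFair's ResponseGenerator class. First, we must create a langchain LLM object. Below we use ChatVertexAI, but any of LangChain's LLM classes may be used instead. Note that InMemoryRateLimiter is used to avoid rate limit errors.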

from langchain_google_vertexai import ChatVertexAI
from langchain_core.rate_limiters import InMemoryRateLimiter
rate_limiter = InMemoryRateLimiter(
    requests_per_second=4.5, check_every_n_seconds=0.5, max_bucket_size=280,  
)
llm = ChatVertexAI(
    model_name="gemini-pro", temperature=0.3, rate_limiter=rate_limiter
)

API Reference: ChatVertexAI | InMemoryRateLimiter

We can use ResponseGenerator.generate_responses to generate 25 responses for each prompt, as is convention for toxicity evaluation.

from langfair.generator import ResponseGenerator
rg = ResponseGenerator(langchain_llm=llm)
generations = await rg.generate_responses(prompts=prompts, count=25)
responses = generations["data"]["response"]
duplicated_prompts = generations["data"]["prompt"] # so prompts correspond to responses
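The snippet above uses a top-level await, which works in a notebook but not in a plain Python script. The sketch below shows one way to run the same pattern from a script with asyncio.run. It uses a hypothetical stand-in, fake_generate_responses, instead of a real LLM call, and it assumes that each prompt is repeated count times in order so that prompts and responses line up index by index.

```python
import asyncio


async def fake_generate_responses(prompts, count):
    """Hypothetical stand-in that mimics the output shape used above."""
    data = {"prompt": [], "response": []}
    for prompt in prompts:
        for i in range(count):
            # Assumption: each prompt appears once per generated response.
            data["prompt"].append(prompt)
            data["response"].append(f"response {i} to: {prompt}")
    return {"data": data}


async def main():
    generations = await fake_generate_responses(["p1", "p2"], count=3)
    return generations["data"]["prompt"], generations["data"]["response"]


# asyncio.run drives the coroutine when no event loop is already running.
duplicated_prompts, responses = asyncio.run(main())
print(len(duplicated_prompts), len(responses))
```

Keeping the duplicated prompt list alongside the responses is what lets prompt-level metrics group the responses back by prompt.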

Compute toxicity metrics

Toxicity metrics can be computed with ToxicityMetrics. Note that passing a torch.device is optional; if a GPU is available, use it to speed up toxicity computation.

# import torch # uncomment if GPU is available
# device = torch.device("cuda") # uncomment if GPU is available
from langfair.metrics.toxicity import ToxicityMetrics
tm = ToxicityMetrics(
    # device=device,  # uncomment if GPU is available
)
tox_result = tm.evaluate(
    prompts=duplicated_prompts, 
    responses=responses, 
    return_data=True
)
tox_result['metrics']
# Output is below
# {'Toxic Fraction': 0.0004,
# 'Expected Maximum Toxicity': 0.013845130120171235,
# 'Toxicity Probability': 0.01}
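To make the three reported numbers concrete, here is a minimal sketch of how they are commonly defined, given per-response toxicity scores grouped by prompt. It is not LangFair's implementation: the function name, the sample scores, and the 0.5 threshold are assumptions chosen for illustration.

```python
def toxicity_summary(scores_by_prompt, threshold=0.5):
    """Summarize toxicity scores grouped as one list of scores per prompt."""
    all_scores = [s for scores in scores_by_prompt for s in scores]
    max_per_prompt = [max(scores) for scores in scores_by_prompt]
    n_prompts = len(max_per_prompt)
    return {
        # Share of all responses whose score reaches the threshold.
        "Toxic Fraction": sum(s >= threshold for s in all_scores) / len(all_scores),
        # Average, over prompts, of the worst score among that prompt's responses.
        "Expected Maximum Toxicity": sum(max_per_prompt) / n_prompts,
        # Share of prompts with at least one response reaching the threshold.
        "Toxicity Probability": sum(m >= threshold for m in max_per_prompt) / n_prompts,
    }


# Made-up scores: two prompts with four responses each.
scores_by_prompt = [
    [0.125, 0.75, 0.25, 0.125],
    [0.25, 0.125, 0.375, 0.125],
]
summary = toxicity_summary(scores_by_prompt)
print(summary)
```

The last two metrics are defined per prompt, which is why several responses are generated for each prompt.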

Compute stereotype metrics

Stereotype metrics can be computed with StereotypeMetrics.

from langfair.metrics.stereotype import StereotypeMetrics
sm = StereotypeMetrics()
stereo_result = sm.evaluate(responses=responses, categories=["gender"])
stereo_result['metrics']
# Output is below
# {'Stereotype Association': 0.3172750176745329,
# 'Cooccurrence Bias': 0.44766333654278373,
# 'Stereotype Fraction - gender': 0.08}

Generate counterfactual responses and compute metrics

We can generate counterfactual responses with CounterfactualGenerator.

from langfair.generator.counterfactual import CounterfactualGenerator
cg = CounterfactualGenerator(langchain_llm=llm)
cf_generations = await cg.generate_responses(
    prompts=prompts, attribute='gender', count=25
)
male_responses = cf_generations['data']['male_response']
female_responses = cf_generations['data']['female_response']
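The idea behind counterfactual prompts is that each pair differs only in the words that mention the protected attribute. The sketch below illustrates that substitution step with a deliberately tiny, assumed word mapping. It is not how CounterfactualGenerator is implemented, and the names MALE_TO_FEMALE and swap_gender_words are hypothetical.

```python
import re

# Assumed, deliberately tiny mapping; real tooling needs a far richer word list.
MALE_TO_FEMALE = {"he": "she", "him": "her", "his": "her", "man": "woman"}


def swap_gender_words(text, mapping):
    """Replace whole words found in `mapping`, preserving initial capitals."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, mapping)) + r")\b", re.IGNORECASE
    )

    def replace(match):
        word = match.group(0)
        swapped = mapping[word.lower()]
        return swapped.capitalize() if word[0].isupper() else swapped

    return pattern.sub(replace, text)


male_prompt = "He said the man lost his keys."
female_prompt = swap_gender_words(male_prompt, MALE_TO_FEMALE)
print(female_prompt)  # She said the woman lost her keys.
```

The word-boundary anchors keep substrings such as the "he" inside "the" untouched, so only standalone attribute words are swapped.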

Counterfactual metrics can be computed with CounterfactualMetrics.

from langfair.metrics.counterfactual import CounterfactualMetrics
cm = CounterfactualMetrics()
cf_result = cm.evaluate(
    texts1=male_responses, 
    texts2=female_responses,
    attribute='gender'
)
cf_result['metrics']
# Output is below
# {'Cosine Similarity': 0.8318708,
# 'RougeL Similarity': 0.5195852482361165,
# 'Bleu Similarity': 0.3278433712872481,
# 'Sentiment Bias': 0.0009947145187601957}
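The similarity metrics above compare each response in texts1 with its counterpart in texts2 and then aggregate over the pairs. As a rough sketch of that pairing logic, the example below averages a bag-of-words cosine similarity over aligned pairs. The bag-of-words vectors are a simplification swapped in for illustration, so the numbers are not comparable to LangFair's output, and both function names are hypothetical.

```python
import math
from collections import Counter


def bow_cosine(text_a, text_b):
    """Cosine similarity between word-count vectors of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


def mean_pairwise_similarity(texts1, texts2):
    """Average similarity over index-aligned pairs of responses."""
    pairs = list(zip(texts1, texts2))
    return sum(bow_cosine(x, y) for x, y in pairs) / len(pairs)


# Made-up response pairs: one identical pair, one with no shared words.
texts1 = ["the report is ready", "alpha beta"]
texts2 = ["the report is ready", "gamma delta"]
score = mean_pairwise_similarity(texts1, texts2)
print(round(score, 3))
```

Values closer to 1 mean the paired responses are more alike, which is the desired outcome when only the protected attribute differs between prompts.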

Alternative approach: Semi-automated evaluation with AutoEval

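To streamline assessments for text generation and summarization use cases, the AutoEval class runs a multi-step process that completes all of the steps above with two lines of code.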

from langfair.auto import AutoEval
auto_object = AutoEval(
    prompts=prompts, 
    langchain_llm=llm,
    # toxicity_device=device # uncomment if GPU is available
)
results = await auto_object.evaluate()
results['metrics']
# Output is below
# {'Toxicity': {'Toxic Fraction': 0.0004,
#   'Expected Maximum Toxicity': 0.013845130120171235,
#   'Toxicity Probability': 0.01},
#  'Stereotype': {'Stereotype Association': 0.3172750176745329,
#   'Cooccurrence Bias': 0.44766333654278373,
#   'Stereotype Fraction - gender': 0.08,
#   'Expected Maximum Stereotype - gender': 0.60355167388916,
#   'Stereotype Probability - gender': 0.27036},
#  'Counterfactual': {'male-female': {'Cosine Similarity': 0.8318708,
#    'RougeL Similarity': 0.5195852482361165,
#    'Bleu Similarity': 0.3278433712872481,
#    'Sentiment Bias': 0.0009947145187601957}}}
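The combined metrics come back as a nested dictionary, which can be awkward to log or compare across runs. The helper below is a small sketch, not part of LangFair, that flattens such a dictionary into one row per metric. The function name and the " / " separator are arbitrary choices, and the sample reuses two values from the output shown above.

```python
def flatten_metrics(metrics, prefix=""):
    """Flatten a nested metrics dict into {"Group / Metric": value} rows."""
    rows = {}
    for key, value in metrics.items():
        name = f"{prefix} / {key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested groups such as 'Counterfactual' -> 'male-female'.
            rows.update(flatten_metrics(value, name))
        else:
            rows[name] = value
    return rows


# Sample with the same nesting as the output shown above.
sample = {
    "Toxicity": {"Toxic Fraction": 0.0004},
    "Counterfactual": {"male-female": {"Sentiment Bias": 0.0009947145187601957}},
}
flat = flatten_metrics(sample)
for name, value in flat.items():
    print(f"{name}: {value:.4f}")
```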
