
Embedding Documents Using Optimized and Quantized Embedders

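Embed all documents using quantized embedders.

The embedders are based on optimized models, created using optimum-intel and IPEX.

The example text is based on SBERT.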

from langchain_community.embeddings import QuantizedBiEncoderEmbeddings

model_name = "Intel/bge-small-en-v1.5-rag-int8-static"
encode_kwargs = {"normalize_embeddings": True} # set True to compute cosine similarity

model = QuantizedBiEncoderEmbeddings(
    model_name=model_name,
    encode_kwargs=encode_kwargs,
    query_instruction="Represent this sentence for searching relevant passages: ",
)
loading configuration file inc_config.json from cache at 
INCConfig {
  "distillation": {},
  "neural_compressor_version": "2.4.1",
  "optimum_version": "1.16.2",
  "pruning": {},
  "quantization": {
    "dataset_num_samples": 50,
    "is_static": true
  },
  "save_onnx_model": false,
  "torch_version": "2.2.0",
  "transformers_version": "4.37.2"
}

Using `INCModel` to load a TorchScript model will be deprecated in v1.15.0, to load your model please use `IPEXModel` instead.
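
The `normalize_embeddings` setting in `encode_kwargs` is what lets a plain dot product stand in for cosine similarity later on. The following is a minimal sketch of that relationship in pure Python. The helper names (`l2_normalize`, `dot`, `cosine_similarity`) and the toy two-dimensional vectors are illustrative assumptions; they are not part of LangChain, and the vectors are not model output.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(vec):
    """Scale a vector so that its L2 norm is 1."""
    norm = math.sqrt(dot(vec, vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Toy vectors (hypothetical, chosen only for easy arithmetic).
a = [3.0, 4.0]
b = [4.0, 3.0]

cos_ab = cosine_similarity(a, b)
dot_unit = dot(l2_normalize(a), l2_normalize(b))

# Both values agree: once vectors have unit length,
# the dot product is the cosine similarity.
print(cos_ab, dot_unit)
```

If the embeddings were not normalized, the dot product would also depend on vector length, so scores would no longer be comparable as cosine similarities.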

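Let's ask a question and compare it against two documents. The first contains the answer to the question; the second does not.

We can then check which one better matches our query.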

question = "How many people live in Berlin?"
documents = [
    "Berlin had a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.",
    "Berlin is well known for its museums.",
]
doc_vecs = model.embed_documents(documents)
Batches: 100%|██████████| 1/1 [00:00<00:00,  4.18it/s]
query_vec = model.embed_query(question)
import torch

# Dot product of the query vector with each document vector (one score per document)
doc_vecs_torch = torch.tensor(doc_vecs)
query_vec_torch = torch.tensor(query_vec)
query_vec_torch @ doc_vecs_torch.T
tensor([0.7980, 0.6529])

We can see that the first document does indeed rank higher.
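
With more than two documents, reading the scores by eye stops being practical. As a rough sketch, the scores can be paired with their documents and sorted in descending order. The `rank_documents` helper is a hypothetical name introduced here for illustration; the scores are the two values printed above, hard-coded so the snippet runs without loading the model.

```python
def rank_documents(documents, scores):
    """Return (score, document) pairs sorted from most to least similar."""
    if len(documents) != len(scores):
        raise ValueError("documents and scores must have the same length")
    return sorted(zip(scores, documents), key=lambda pair: pair[0], reverse=True)


documents = [
    "Berlin had a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.",
    "Berlin is well known for its museums.",
]
# Similarity scores copied from the tensor output above.
scores = [0.7980, 0.6529]

ranked = rank_documents(documents, scores)
for score, doc in ranked:
    print(f"{score:.4f}  {doc}")
```

In a real pipeline the scores would come from something like `(query_vec_torch @ doc_vecs_torch.T).tolist()` rather than being typed in by hand.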

