Baidu Cloud ElasticSearch VectorSearch

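Baidu Cloud VectorSearch is a fully managed, enterprise-grade distributed search and analytics service that is 100% compatible with the open-source version. It provides a low-cost, high-performance, and reliable platform for retrieving and analyzing structured and unstructured data. As a vector database, it supports multiple index types and similarity distance methods.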
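To make "similarity distance methods" concrete, here is a minimal, library-free sketch of two common measures: cosine similarity and Euclidean distance. The text above does not say which methods the service offers, so these two are assumed examples, and the function names are illustrative only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between a and b (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between a and b (0.0 means identical)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy 2-dimensional "embeddings" (assumed values for illustration).
query = [1.0, 0.0]
doc_same_direction = [2.0, 0.0]
doc_orthogonal = [0.0, 3.0]

print(cosine_similarity(query, doc_same_direction))
print(cosine_similarity(query, doc_orthogonal))
print(euclidean_distance(query, doc_same_direction))
```

Cosine similarity ignores vector length and compares direction only, while Euclidean distance is sensitive to magnitude, so the two can rank the same documents differently.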

Baidu Cloud ElasticSearch provides a permission management mechanism that lets you configure cluster permissions freely, further ensuring data security.

This notebook shows how to use functionality related to the Baidu Cloud ElasticSearch VectorStore. To run it, you need a Baidu Cloud ElasticSearch instance up and running.

Read the help documentation to quickly get familiar with and configure a Baidu Cloud ElasticSearch instance.

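Once the instance is up and running, follow these steps to split documents, get embeddings, connect to the Baidu Cloud ElasticSearch instance, index the documents, and perform vector retrieval.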

We first need to install the following Python packages.

%pip install --upgrade --quiet langchain-community elasticsearch==7.11.0

First, we want to use QianfanEmbeddings, so we need to obtain the Qianfan AK and SK. For details about Qianfan, see the Baidu Qianfan large model platform.

import getpass
import os

if "QIANFAN_AK" not in os.environ:
    os.environ["QIANFAN_AK"] = getpass.getpass("Your Qianfan AK:")
if "QIANFAN_SK" not in os.environ:
    os.environ["QIANFAN_SK"] = getpass.getpass("Your Qianfan SK:")

Second, split the documents and get embeddings.

from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import CharacterTextSplitter

loader = TextLoader("../../../state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

from langchain_community.embeddings import QianfanEmbeddingsEndpoint

embeddings = QianfanEmbeddingsEndpoint()

API Reference: TextLoader | CharacterTextSplitter | QianfanEmbeddingsEndpoint
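To illustrate what chunk_size means when chunk_overlap is 0, here is a simplified sketch of separator-based splitting. It is not LangChain's implementation: split_text is a hypothetical helper, and the "\n\n" separator is an assumption chosen for the example.

```python
def split_text(text, chunk_size, separator="\n\n"):
    """Greedily merge separator-delimited pieces into chunks.

    Each chunk holds at most chunk_size characters, except that a single
    piece longer than chunk_size is kept whole rather than cut.
    """
    chunks = []
    current = ""
    for piece in text.split(separator):
        if not piece:
            continue  # skip empty pieces from repeated separators
        candidate = piece if not current else current + separator + piece
        if not current or len(candidate) <= chunk_size:
            current = candidate  # still fits, keep growing this chunk
        else:
            chunks.append(current)  # chunk is full, start a new one
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "aaaa\n\nbbbb\n\ncccccccc\n\ndd"
chunks = split_text(sample, chunk_size=10)
print(chunks)
```

With no overlap, each piece of the source text lands in exactly one chunk, so no text is embedded twice.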

Then, create a BESVectorStore connected to an accessible Baidu Cloud ElasticSearch instance and index the documents.

# Create a bes instance and index docs.
from langchain_community.vectorstores import BESVectorStore

bes = BESVectorStore.from_documents(
    documents=docs,
    embedding=embeddings,
    bes_url="your bes cluster url",
    index_name="your vector index",
)
bes.client.indices.refresh(index="your vector index")

API Reference: BESVectorStore

Finally, query and retrieve data.

query = "What did the president say about Ketanji Brown Jackson"
docs = bes.similarity_search(query)
print(docs[0].page_content)
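Conceptually, a similarity search embeds the query and returns the stored documents whose vectors score highest against it. The sketch below shows that idea as an exhaustive scan over toy vectors. It makes no claim about how the service ranks results internally: top_k is a hypothetical helper, dot-product scoring is an assumed metric, and the default k=4 is an assumption as well.

```python
def top_k(query_vec, doc_vecs, k=4):
    """Return (index, score) pairs for the k highest dot-product scores."""
    scored = [
        (i, sum(q * d for q, d in zip(query_vec, vec)))
        for i, vec in enumerate(doc_vecs)
    ]
    # Highest score first; ties keep their original order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy document vectors (assumed values for illustration).
doc_vecs = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]]
results = top_k([1.0, 0.0], doc_vecs, k=2)

for index, score in results:
    print(index, score)
```

An exhaustive scan like this grows linearly with the number of documents, which is the cost that a vector index is meant to avoid.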

If you run into any problems during use, feel free to contact liuboyao@baidu.com or chenweixu01@baidu.com, and we will do our best to support you.

Related

Vector store conceptual guide
Vector store how-to guides