
ApertureDB

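ApertureDB is a database that stores, indexes, and manages multimodal data such as text, images, videos, bounding boxes, and embeddings, together with their associated metadata.

This notebook explains how to use the embeddings functionality of ApertureDB.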

Install ApertureDB Python SDK

This installs the Python SDK used to write client code for ApertureDB.

%pip install --upgrade --quiet aperturedb
Note: you may need to restart the kernel to use updated packages.

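Run an ApertureDB instance

To continue, you need an ApertureDB instance up and running, and your environment configured to use it.
There are several ways to do that, for example: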

docker run --publish 55555:55555 aperturedata/aperturedb-standalone
adb config create local --active --no-interactive
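Before moving on, it can help to confirm that something is actually listening on the published port. The following is a minimal sketch using only the Python standard library. It assumes the instance runs on localhost and uses port 55555, as in the docker command above; `port_is_open` is a hypothetical helper and not part of the ApertureDB SDK. A successful TCP connection only shows that the port is open, not that the database is healthy or that your configuration is correct.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 55555, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for illustration; host and port defaults are assumptions
    taken from the docker command shown above.
    """
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("ApertureDB port reachable:", port_is_open())
```

If this prints False, check that the container is still running and that the port mapping matches the one you published.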

Download some web documents

Here we do a mini-crawl of a single web page.

# For loading documents from web
from langchain_community.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://docs.aperturedata.io")
docs = loader.load()
USER_AGENT environment variable not set, consider setting it to identify your requests.
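
The warning above is harmless for this walkthrough, but it can be silenced by setting the USER_AGENT environment variable. A minimal sketch follows; the value shown is a made-up identifier, so substitute one that describes your own application. It assumes the variable is read when the loader module is imported, so it is safest to run this before the WebBaseLoader import.

```python
import os

# Hypothetical identifier, chosen for illustration only.
# setdefault leaves any value you have already exported untouched.
os.environ.setdefault("USER_AGENT", "aperturedb-langchain-demo/0.1")
```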

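Select an embeddings model

We want to use OllamaEmbeddings, so we have to import the necessary modules.

Ollama can be set up as a Docker container as described in its documentation, for example: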

# Run server
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
# Tell server to load a specific model
docker exec ollama ollama run llama2

from langchain_community.embeddings import OllamaEmbeddings

embeddings = OllamaEmbeddings()

Split documents into segments

We want to turn our single document into multiple segments.

from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter()
documents = text_splitter.split_documents(docs)
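
To give a feel for what recursive character splitting does, here is a simplified sketch in plain Python. It is not the library's implementation: `recursive_split` is a hypothetical function, it ignores chunk overlap, and it assumes the separator order paragraph break, newline, space, then single characters. The idea is to break on the coarsest separator first and fall back to finer ones only for pieces that are still too long.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Split `text` into pieces of at most `chunk_size` characters.

    Simplified illustration (no overlap); not LangChain's actual algorithm.
    """
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: cut at fixed character offsets
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    parts = text.split(sep)
    if len(parts) == 1:
        # This separator does not occur; try the next, finer one
        return recursive_split(text, chunk_size, finer)

    chunks, current = [], ""
    for part in parts:
        candidate = part if not current else current + sep + part
        if len(candidate) <= chunk_size:
            current = candidate  # keep merging small parts together
            continue
        if current:
            chunks.append(current)
        if len(part) <= chunk_size:
            current = part
        else:
            # The part alone is too long: recurse with finer separators
            chunks.extend(recursive_split(part, chunk_size, finer))
            current = ""
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "alpha beta\n\ngamma delta epsilon\n\nzeta"
    for chunk in recursive_split(sample, chunk_size=12):
        print(repr(chunk))
```

With a small chunk size, the sample text is first broken at paragraph boundaries, and only the middle paragraph, which is too long, is broken further at spaces.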

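Create a vectorstore from documents and embeddings

This code creates a vectorstore in the ApertureDB instance. Within the instance, the vectorstore is represented as a "descriptor set". By default, the descriptor set is named langchain. The following code generates an embedding for each document and stores it in ApertureDB as a descriptor. This takes a few seconds because the embeddings have to be generated.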

from langchain_community.vectorstores import ApertureDB

vector_db = ApertureDB.from_documents(documents, embeddings)

Select a large language model

Again, we use the Ollama server we set up for local processing.

from langchain_community.llms import Ollama

llm = Ollama(model="llama2")

Build a RAG chain

Now we have all the components we need to create a RAG (Retrieval-Augmented Generation) chain. This chain does the following:

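  1. Generate an embedding descriptor for the user query
  2. Use the vectorstore to find text segments similar to the user query
  3. Pass the user query and the context documents to the LLM using a prompt template
  4. Return the LLM's answer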
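
The four steps above can be sketched without any external services. This toy version is only meant to show the data flow: `embed` uses word counts instead of a real embedding model, the passages live in a Python list instead of a descriptor set, and `llm` is any callable that takes a prompt string. All function names here are hypothetical.

```python
import math
import re
from collections import Counter

PROMPT = """Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}"""


def embed(text):
    """Toy embedding: lowercase word counts (a real chain calls an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, passages, k=2):
    q = embed(query)  # step 1: embed the query
    # step 2: rank stored passages by similarity to the query
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


def build_prompt(query, passages, k=2):
    # step 3: place the retrieved passages and the query into the prompt template
    context = "\n\n".join(retrieve(query, passages, k))
    return PROMPT.format(context=context, input=query)


def answer(query, passages, llm, k=2):
    # step 4: return whatever the LLM callable produces for the filled-in prompt
    return llm(build_prompt(query, passages, k))


if __name__ == "__main__":
    passages = [
        "ApertureDB stores images and videos.",
        "Ollama runs language models locally.",
        "Bananas are yellow.",
    ]
    # A stand-in "LLM" that simply echoes the prompt it receives
    print(answer("How does ApertureDB store images?", passages, llm=lambda p: p, k=1))
```

In the real chain, the retriever and the document chain play the roles of `retrieve` and `build_prompt`, and the Ollama model replaces the echo function.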

# Create prompt
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template("""Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}""")


# Create a chain that passes documents to an LLM
from langchain.chains.combine_documents import create_stuff_documents_chain

document_chain = create_stuff_documents_chain(llm, prompt)


# Treat the vectorstore as a document retriever
retriever = vector_db.as_retriever()


# Create a RAG chain that connects the retriever to the LLM
from langchain.chains import create_retrieval_chain

retrieval_chain = create_retrieval_chain(retriever, document_chain)

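Run the RAG chain

Finally, we pass a question to the chain and get our answer. This takes a few seconds to run because the LLM generates the answer from the query and the context documents.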

user_query = "How can ApertureDB store images?"
response = retrieval_chain.invoke({"input": user_query})
print(response["answer"])
Based on the provided context, ApertureDB can store images in several ways:

1. Multimodal data management: ApertureDB offers a unified interface to manage multimodal data such as images, videos, documents, embeddings, and associated metadata including annotations. This means that images can be stored along with other types of data in a single database instance.
2. Image storage: ApertureDB provides image storage capabilities through its integration with the public cloud providers or on-premise installations. This allows customers to host their own ApertureDB instances and store images on their preferred cloud provider or on-premise infrastructure.
3. Vector database: ApertureDB also offers a vector database that enables efficient similarity search and classification of images based on their semantic meaning. This can be useful for applications where image search and classification are important, such as in computer vision or machine learning workflows.

Overall, ApertureDB provides flexible and scalable storage options for images, allowing customers to choose the deployment model that best suits their needs.
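
Besides "answer", the dictionary returned by a chain built with create_retrieval_chain also carries the retrieved documents under the "context" key, which is useful for checking what the LLM was actually shown. The helper below is a hypothetical sketch; it assumes each document exposes `metadata` and `page_content`, and that the loader recorded a "source" entry in the metadata.

```python
def summarize_context(response, preview_chars=80):
    """Return (source, preview) pairs for the documents passed to the LLM.

    Hypothetical helper; falls back to "<unknown>" when no source is recorded.
    """
    rows = []
    for doc in response.get("context", []):
        source = doc.metadata.get("source", "<unknown>")
        # Collapse whitespace so the preview fits on one line
        preview = " ".join(doc.page_content.split())[:preview_chars]
        rows.append((source, preview))
    return rows
```

For example, `for source, preview in summarize_context(response): print(source, "|", preview)` lists one line per retrieved segment.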
