SQLServer
Azure SQL 提供專用的 向量資料類型,簡化了在關聯式資料庫中直接建立、儲存和查詢向量嵌入的流程。 這消除了對獨立向量資料庫和相關整合的需求,提高了解決方案的安全性,同時降低了整體複雜性。
Azure SQL 是一項穩健的服務,結合了可擴展性、安全性和高可用性,提供了現代化資料庫解決方案的所有優勢。 它利用精密的查詢最佳化工具和企業功能,執行向量相似度搜尋以及傳統 SQL 查詢,從而增強資料分析和決策能力。
請閱讀更多關於使用 搭配 Azure SQL Database 的智慧型應用程式
這個 notebook 向您展示如何利用這個整合式 SQL 向量資料庫來儲存文件,並使用餘弦 (餘弦距離)、L2 (歐幾里得距離) 和 IP (內積) 執行向量搜尋查詢,以定位靠近查詢向量的文件
設定
安裝 langchain-sqlserver
python 套件。
程式碼位於名為:langchain-sqlserver 的整合套件中。
!pip install langchain-sqlserver==0.1.1
憑證
執行此 notebook 不需要任何憑證,只需確保您已下載 langchain_sqlserver
套件。 如果您想要獲得一流的模型呼叫自動追蹤,您也可以透過取消註解下方內容來設定您的 LangSmith API 金鑰
# os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
# os.environ["LANGSMITH_TRACING"] = "true"
初始化
from langchain_sqlserver import SQLServer_VectorStore
在 Azure 入口網站的資料庫設定中找到您的 Azure SQL DB 連接字串
更多資訊:連線至 Azure SQL DB - Python
import os
import pyodbc
# Define your SQLServer Connection String
_CONNECTION_STRING = (
"Driver={ODBC Driver 18 for SQL Server};"
"Server=<YOUR_DBSERVER>.database.windows.net,1433;"
"Database=test;"
"TrustServerCertificate=yes;"
"Connection Timeout=60;"
"LongAsMax=yes;"
)
# Connection string can vary:
# "mssql+pyodbc://<username>:<password><servername>/<dbname>?driver=ODBC+Driver+18+for+SQL+Server" -> With Username and Password specified
# "mssql+pyodbc://<servername>/<dbname>?driver=ODBC+Driver+18+for+SQL+Server&Trusted_connection=yes" -> Uses Trusted connection
# "mssql+pyodbc://<servername>/<dbname>?driver=ODBC+Driver+18+for+SQL+Server" -> Uses EntraID connection
# "mssql+pyodbc://<servername>/<dbname>?driver=ODBC+Driver+18+for+SQL+Server&Trusted_connection=no" -> Uses EntraID connection
在本範例中,我們使用 Azure OpenAI 來產生嵌入,但您可以使用 LangChain 中提供的不同嵌入。
您可以依照此指南,在 Azure 入口網站上部署 Azure OpenAI 執行個體。 執行個體啟動並執行後,請確保您擁有執行個體的名稱和金鑰。 您可以在 Azure 入口網站的執行個體「金鑰和端點」區段中找到金鑰。
!pip install langchain-openai
# Import the necessary Libraries
from langchain_openai import AzureChatOpenAI, AzureOpenAIEmbeddings
# Set your AzureOpenAI details
azure_endpoint = "https://<YOUR_ENDPOINT>.openai.azure.com/"
azure_deployment_name_embedding = "text-embedding-3-small"
azure_deployment_name_chatcompletion = "chatcompletion"
azure_api_version = "2023-05-15"
azure_api_key = "YOUR_KEY"
# Use AzureChatOpenAI for chat completions
llm = AzureChatOpenAI(
azure_endpoint=azure_endpoint,
azure_deployment=azure_deployment_name_chatcompletion,
openai_api_version=azure_api_version,
openai_api_key=azure_api_key,
)
# Use AzureOpenAIEmbeddings for embeddings
embeddings = AzureOpenAIEmbeddings(
azure_endpoint=azure_endpoint,
azure_deployment=azure_deployment_name_embedding,
openai_api_version=azure_api_version,
openai_api_key=azure_api_key,
)
管理向量儲存區
from langchain_community.vectorstores.utils import DistanceStrategy
from langchain_sqlserver import SQLServer_VectorStore
# Initialize the vector store
vector_store = SQLServer_VectorStore(
connection_string=_CONNECTION_STRING,
distance_strategy=DistanceStrategy.COSINE, # optional, if not provided, defaults to COSINE
embedding_function=embeddings, # you can use different embeddings provided in LangChain
embedding_length=1536,
table_name="langchain_test_table", # using table with a custom name
)
將項目新增至向量儲存區
## we will use some artificial data for this example
query = [
"I have bought several of the Vitality canned dog food products and have found them all to be of good quality. The product looks more like a stew than a processed meat and it smells better. My Labrador is finicky and she appreciates this product better than most.",
"The candy is just red , No flavor . Just plan and chewy . I would never buy them again",
"Arrived in 6 days and were so stale i could not eat any of the 6 bags!!",
"Got these on sale for roughly 25 cents per cup, which is half the price of my local grocery stores, plus they rarely stock the spicy flavors. These things are a GREAT snack for my office where time is constantly crunched and sometimes you can't escape for a real meal. This is one of my favorite flavors of Instant Lunch and will be back to buy every time it goes on sale.",
"If you are looking for a less messy version of licorice for the children, then be sure to try these! They're soft, easy to chew, and they don't get your hands all sticky and gross in the car, in the summer, at the beach, etc. We love all the flavos and sometimes mix these in with the chocolate to have a very nice snack! Great item, great price too, highly recommend!",
"We had trouble finding this locally - delivery was fast, no more hunting up and down the flour aisle at our local grocery stores.",
"Too much of a good thing? We worked this kibble in over time, slowly shifting the percentage of Felidae to national junk-food brand until the bowl was all natural. By this time, the cats couldn't keep it in or down. What a mess. We've moved on.",
"Hey, the description says 360 grams - that is roughly 13 ounces at under $4.00 per can. No way - that is the approximate price for a 100 gram can.",
"The taste of these white cheddar flat breads is like a regular cracker - which is not bad, except that I bought them because I wanted a cheese taste.<br /><br />What was a HUGE disappointment? How misleading the packaging of the box is. The photo on the box (I bought these in store) makes it look like it is full of long flatbreads (expanding the length and width of the box). Wrong! The plastic tray that holds the crackers is about 2"
" smaller all around - leaving you with about 15 or so small flatbreads.<br /><br />What is also bad about this is that the company states they use biodegradable and eco-friendly packaging. FAIL! They used a HUGE box for a ridiculously small amount of crackers. Not ecofriendly at all.<br /><br />Would I buy these again? No - I feel ripped off. The other crackers (like Sesame Tarragon) give you a little<br />more bang for your buck and have more flavor.",
"I have used this product in smoothies for my son and he loves it. Additionally, I use this oil in the shower as a skin conditioner and it has made my skin look great. Some of the stretch marks on my belly has disappeared quickly. Highly recommend!!!",
"Been taking Coconut Oil for YEARS. This is the best on the retail market. I wish it was in glass, but this is the one.",
]
query_metadata = [
{"id": 1, "summary": "Good Quality Dog Food"},
{"id": 8, "summary": "Nasty No flavor"},
{"id": 4, "summary": "stale product"},
{"id": 11, "summary": "Great value and convenient ramen"},
{"id": 5, "summary": "Great for the kids!"},
{"id": 2, "summary": "yum falafel"},
{"id": 9, "summary": "Nearly killed the cats"},
{"id": 6, "summary": "Price cannot be correct"},
{"id": 3, "summary": "Taste is neutral, quantity is DECEITFUL!"},
{"id": 7, "summary": "This stuff is great"},
{"id": 10, "summary": "The reviews don't lie"},
]
vector_store.add_texts(texts=query, metadatas=query_metadata)
[1, 8, 4, 11, 5, 2, 9, 6, 3, 7, 10]
查詢向量儲存區
建立向量儲存區並新增相關文件後,您很可能希望在執行鏈或代理程式期間查詢它。
執行簡單的相似度搜尋可以如下進行
# Perform a similarity search between the embedding of the query and the embeddings of the documents
simsearch_result = vector_store.similarity_search("Good reviews", k=3)
print(simsearch_result)
[Document(metadata={'id': 1, 'summary': 'Good Quality Dog Food'}, page_content='I have bought several of the Vitality canned dog food products and have found them all to be of good quality. The product looks more like a stew than a processed meat and it smells better. My Labrador is finicky and she appreciates this product better than most.'), Document(metadata={'id': 7, 'summary': 'This stuff is great'}, page_content='I have used this product in smoothies for my son and he loves it. Additionally, I use this oil in the shower as a skin conditioner and it has made my skin look great. Some of the stretch marks on my belly has disappeared quickly. Highly recommend!!!'), Document(metadata={'id': 5, 'summary': 'Great for the kids!'}, page_content="If you are looking for a less messy version of licorice for the children, then be sure to try these! They're soft, easy to chew, and they don't get your hands all sticky and gross in the car, in the summer, at the beach, etc. We love all the flavos and sometimes mix these in with the chocolate to have a very nice snack! Great item, great price too, highly recommend!")]
篩選支援:
向量儲存區支援一組可以針對文件的中繼資料欄位套用的篩選器。此功能使開發人員和資料分析師能夠精簡其查詢,確保搜尋結果與其需求精確對齊。 透過根據特定中繼資料屬性套用篩選器,使用者可以限制其搜尋範圍,僅專注於最相關的資料子集。
# hybrid search -> filter for cases where id not equal to 1.
hybrid_simsearch_result = vector_store.similarity_search(
"Good reviews", k=3, filter={"id": {"$ne": 1}}
)
print(hybrid_simsearch_result)
[Document(metadata={'id': 7, 'summary': 'This stuff is great'}, page_content='I have used this product in smoothies for my son and he loves it. Additionally, I use this oil in the shower as a skin conditioner and it has made my skin look great. Some of the stretch marks on my belly has disappeared quickly. Highly recommend!!!'), Document(metadata={'id': 5, 'summary': 'Great for the kids!'}, page_content="If you are looking for a less messy version of licorice for the children, then be sure to try these! They're soft, easy to chew, and they don't get your hands all sticky and gross in the car, in the summer, at the beach, etc. We love all the flavos and sometimes mix these in with the chocolate to have a very nice snack! Great item, great price too, highly recommend!"), Document(metadata={'id': 3, 'summary': 'Taste is neutral, quantity is DECEITFUL!'}, page_content='The taste of these white cheddar flat breads is like a regular cracker - which is not bad, except that I bought them because I wanted a cheese taste.<br /><br />What was a HUGE disappointment? How misleading the packaging of the box is. The photo on the box (I bought these in store) makes it look like it is full of long flatbreads (expanding the length and width of the box). Wrong! The plastic tray that holds the crackers is about 2 smaller all around - leaving you with about 15 or so small flatbreads.<br /><br />What is also bad about this is that the company states they use biodegradable and eco-friendly packaging. FAIL! They used a HUGE box for a ridiculously small amount of crackers. Not ecofriendly at all.<br /><br />Would I buy these again? No - I feel ripped off. The other crackers (like Sesame Tarragon) give you a little<br />more bang for your buck and have more flavor.')]
具有分數的相似度搜尋:
如果您想要執行相似度搜尋並接收相應的分數,您可以執行
simsearch_with_score_result = vector_store.similarity_search_with_score(
"Not a very good product", k=12
)
print(simsearch_with_score_result)
[(Document(metadata={'id': 3, 'summary': 'Taste is neutral, quantity is DECEITFUL!'}, page_content='The taste of these white cheddar flat breads is like a regular cracker - which is not bad, except that I bought them because I wanted a cheese taste.<br /><br />What was a HUGE disappointment? How misleading the packaging of the box is. The photo on the box (I bought these in store) makes it look like it is full of long flatbreads (expanding the length and width of the box). Wrong! The plastic tray that holds the crackers is about 2 smaller all around - leaving you with about 15 or so small flatbreads.<br /><br />What is also bad about this is that the company states they use biodegradable and eco-friendly packaging. FAIL! They used a HUGE box for a ridiculously small amount of crackers. Not ecofriendly at all.<br /><br />Would I buy these again? No - I feel ripped off. The other crackers (like Sesame Tarragon) give you a little<br />more bang for your buck and have more flavor.'), 0.651870006770711), (Document(metadata={'id': 8, 'summary': 'Nasty No flavor'}, page_content='The candy is just red , No flavor . Just plan and chewy . I would never buy them again'), 0.6908952973052638), (Document(metadata={'id': 4, 'summary': 'stale product'}, page_content='Arrived in 6 days and were so stale i could not eat any of the 6 bags!!'), 0.7360955776468822), (Document(metadata={'id': 1, 'summary': 'Good Quality Dog Food'}, page_content='I have bought several of the Vitality canned dog food products and have found them all to be of good quality. The product looks more like a stew than a processed meat and it smells better. My Labrador is finicky and she appreciates this product better than most.'), 0.7408823529514486), (Document(metadata={'id': 9, 'summary': 'Nearly killed the cats'}, page_content="Too much of a good thing? We worked this kibble in over time, slowly shifting the percentage of Felidae to national junk-food brand until the bowl was all natural. By this time, the cats couldn't keep it in or down. What a mess. We've moved on."), 0.782995248991772), (Document(metadata={'id': 7, 'summary': 'This stuff is great'}, page_content='I have used this product in smoothies for my son and he loves it. Additionally, I use this oil in the shower as a skin conditioner and it has made my skin look great. Some of the stretch marks on my belly has disappeared quickly. Highly recommend!!!'), 0.7912681479906212), (Document(metadata={'id': 2, 'summary': 'yum falafel'}, page_content='We had trouble finding this locally - delivery was fast, no more hunting up and down the flour aisle at our local grocery stores.'), 0.809213468778896), (Document(metadata={'id': 10, 'summary': "The reviews don't lie"}, page_content='Been taking Coconut Oil for YEARS. This is the best on the retail market. I wish it was in glass, but this is the one.'), 0.8281482301097155), (Document(metadata={'id': 5, 'summary': 'Great for the kids!'}, page_content="If you are looking for a less messy version of licorice for the children, then be sure to try these! They're soft, easy to chew, and they don't get your hands all sticky and gross in the car, in the summer, at the beach, etc. We love all the flavos and sometimes mix these in with the chocolate to have a very nice snack! Great item, great price too, highly recommend!"), 0.8283754326400574), (Document(metadata={'id': 6, 'summary': 'Price cannot be correct'}, page_content='Hey, the description says 360 grams - that is roughly 13 ounces at under $4.00 per can. No way - that is the approximate price for a 100 gram can.'), 0.8323967822635847), (Document(metadata={'id': 11, 'summary': 'Great value and convenient ramen'}, page_content="Got these on sale for roughly 25 cents per cup, which is half the price of my local grocery stores, plus they rarely stock the spicy flavors. These things are a GREAT snack for my office where time is constantly crunched and sometimes you can't escape for a real meal. This is one of my favorite flavors of Instant Lunch and will be back to buy every time it goes on sale."), 0.8387189489406939)]
如需您可以在 Azure SQL 向量儲存區上執行的不同搜尋的完整清單,請參閱API 參考資料。
當您已經擁有想要搜尋的嵌入時,進行相似度搜尋
# if you already have embeddings you want to search on
simsearch_by_vector = vector_store.similarity_search_by_vector(
[-0.0033353185281157494, -0.017689190804958344, -0.01590404286980629, ...]
)
print(simsearch_by_vector)
[Document(metadata={'id': 8, 'summary': 'Nasty No flavor'}, page_content='The candy is just red , No flavor . Just plan and chewy . I would never buy them again'), Document(metadata={'id': 4, 'summary': 'stale product'}, page_content='Arrived in 6 days and were so stale i could not eat any of the 6 bags!!'), Document(metadata={'id': 3, 'summary': 'Taste is neutral, quantity is DECEITFUL!'}, page_content='The taste of these white cheddar flat breads is like a regular cracker - which is not bad, except that I bought them because I wanted a cheese taste.<br /><br />What was a HUGE disappointment? How misleading the packaging of the box is. The photo on the box (I bought these in store) makes it look like it is full of long flatbreads (expanding the length and width of the box). Wrong! The plastic tray that holds the crackers is about 2 smaller all around - leaving you with about 15 or so small flatbreads.<br /><br />What is also bad about this is that the company states they use biodegradable and eco-friendly packaging. FAIL! They used a HUGE box for a ridiculously small amount of crackers. Not ecofriendly at all.<br /><br />Would I buy these again? No - I feel ripped off. The other crackers (like Sesame Tarragon) give you a little<br />more bang for your buck and have more flavor.'), Document(metadata={'id': 6, 'summary': 'Price cannot be correct'}, page_content='Hey, the description says 360 grams - that is roughly 13 ounces at under $4.00 per can. No way - that is the approximate price for a 100 gram can.')]
# Similarity Search with Score if you already have embeddings you want to search on
simsearch_by_vector_with_score = vector_store.similarity_search_by_vector_with_score(
[-0.0033353185281157494, -0.017689190804958344, -0.01590404286980629, ...]
)
print(simsearch_by_vector_with_score)
[(Document(metadata={'id': 8, 'summary': 'Nasty No flavor'}, page_content='The candy is just red , No flavor . Just plan and chewy . I would never buy them again'), 0.9648153551769503), (Document(metadata={'id': 4, 'summary': 'stale product'}, page_content='Arrived in 6 days and were so stale i could not eat any of the 6 bags!!'), 0.9655108580341948), (Document(metadata={'id': 3, 'summary': 'Taste is neutral, quantity is DECEITFUL!'}, page_content='The taste of these white cheddar flat breads is like a regular cracker - which is not bad, except that I bought them because I wanted a cheese taste.<br /><br />What was a HUGE disappointment? How misleading the packaging of the box is. The photo on the box (I bought these in store) makes it look like it is full of long flatbreads (expanding the length and width of the box). Wrong! The plastic tray that holds the crackers is about 2 smaller all around - leaving you with about 15 or so small flatbreads.<br /><br />What is also bad about this is that the company states they use biodegradable and eco-friendly packaging. FAIL! They used a HUGE box for a ridiculously small amount of crackers. Not ecofriendly at all.<br /><br />Would I buy these again? No - I feel ripped off. The other crackers (like Sesame Tarragon) give you a little<br />more bang for your buck and have more flavor.'), 0.9840511208615808), (Document(metadata={'id': 6, 'summary': 'Price cannot be correct'}, page_content='Hey, the description says 360 grams - that is roughly 13 ounces at under $4.00 per can. No way - that is the approximate price for a 100 gram can.'), 0.9915737524649991)]
從向量儲存區中刪除項目
依 ID 刪除列
# delete row by id
vector_store.delete(["3", "7"])
True
捨棄向量儲存區
# drop vectorstore
vector_store.drop()
從 Azure Blob 儲存體載入文件
以下範例說明如何在將文件分割成區塊後,將 Azure Blob 儲存體容器中的檔案載入 SQL 向量儲存區。 Azure Blog 儲存體是 Microsoft 用於雲端的物件儲存體解決方案。 Blob 儲存體針對儲存大量非結構化資料進行了最佳化。
pip install azure-storage-blob
from langchain.document_loaders import AzureBlobStorageFileLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_core.documents import Document
# Define your connection string and blob details
conn_str = "DefaultEndpointsProtocol=https;AccountName=<YourBlobName>;AccountKey=<YourAccountKey>==;EndpointSuffix=core.windows.net"
container_name = "<YourContainerName"
blob_name = "01 Harry Potter and the Sorcerers Stone.txt"
# Create an instance of AzureBlobStorageFileLoader
loader = AzureBlobStorageFileLoader(
conn_str=conn_str, container=container_name, blob_name=blob_name
)
# Load the document from Azure Blob Storage
documents = loader.load()
# Split the document into smaller chunks if necessary
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
split_documents = text_splitter.split_documents(documents)
# Print the number of split documents
print(f"Number of split documents: {len(split_documents)}")
Number of split documents: 528
API 參考資料:AzureBlobStorageContainerLoader
# # Initialize the vector store & insert the documents in AzureSQLDB with their embeddings
vector_store = SQLServer_VectorStore(
connection_string=_CONNECTION_STRING,
distance_strategy=DistanceStrategy.COSINE,
embedding_function=embeddings,
embedding_length=1536,
table_name="harrypotter",
) # Replace with your actual vector store initialization
# Add split documents to the vector store individually
for i, doc in enumerate(split_documents):
vector_store.add_documents(documents=[doc], ids=[f"doc_{i}"])
print("Documents added to the vector store successfully!")
Documents added to the vector store successfully!
直接查詢
from typing import List, Tuple
# Perform similarity search
query = "Why did the Dursleys not want Harry in their house?"
docs_with_score: List[Tuple[Document, float]] = (
vector_store.similarity_search_with_score(query)
)
for doc, score in docs_with_score:
print("-" * 60)
print("Score: ", score)
print(doc.page_content)
print("-" * 60)
------------------------------------------------------------
Score: 0.3626232679001803
The Dursleys had everything they wanted, but they also had a secret, and their greatest fear was that somebody would discover it. They didn’t think they could bear it if anyone found out about the Potters. Mrs. Potter was Mrs. Dursley’s sister, but they hadn’t met for several years; in fact, Mrs. Dursley pretended she didn’t have a sister, because her sister and her good-for-nothing husband were as unDursleyish as it was possible to be. The Dursleys shuddered to think what the neighbors would say if the Potters arrived in the street. The Dursleys knew that the Potters had a small son, too, but they had never even seen him. This boy was another good reason for keeping the Potters away; they didn’t want Dudley mixing with a child like that.
------------------------------------------------------------
------------------------------------------------------------
Score: 0.44752797298657554
The Dursleys’ house had four bedrooms: one for Uncle Vernon and Aunt Petunia, one for visitors (usually Uncle Vernon’s sister, Marge), one where Dudley slept, and one where Dudley kept all the toys and things that wouldn’t fit into his first bedroom. It only took Harry one trip upstairs to move everything he owned from the cupboard to this room. He sat down on the bed and stared around him. Nearly everything in here was broken. The month-old video camera was lying on top of a small, working tank Dudley had once driven over the next door neighbor’s dog; in the corner was Dudley’s first-ever television set, which he’d put his foot through when his favorite program had been canceled; there was a large birdcage, which had once held a parrot that Dudley had swapped at school for a real air rifle, which was up on a shelf with the end all bent because Dudley had sat on it. Other shelves were full of books. They were the only things in the room that looked as though they’d never been touched.
------------------------------------------------------------
------------------------------------------------------------
Score: 0.4652486419877385
M r. and Mrs. Dursley, of number four, Privet Drive, were proud to say that they were perfectly normal, thank you very much. They were the last people you’d expect to be involved in anything strange or mysterious, because they just didn’t hold with such nonsense.
Mr. Dursley was the director of a firm called Grunnings, which made drills. He was a big, beefy man with hardly any neck, although he did have a very large mustache. Mrs. Dursley was thin and blonde and had nearly twice the usual amount of neck, which came in very useful as she spent so much of her time craning over garden fences, spying on the neighbors. The Dursleys had a small son called Dudley and in their opinion there was no finer boy anywhere.
------------------------------------------------------------
------------------------------------------------------------
Score: 0.4739086301927252
Hagrid was watching him sadly.
“Took yeh from the ruined house myself, on Dumbledore’s orders. Brought yeh ter this lot….”
“Load of old tosh,” said Uncle Vernon. Harry jumped; he had almost forgotten that the Dursleys were there. Uncle Vernon certainly seemed to have got back his courage. He was glaring at Hagrid and his fists were clenched.
“Now, you listen here, boy,” he snarled, “I accept there’s something strange about you, probably nothing a good beating wouldn’t have cured — and as for all this about your parents, well, they were weirdoes, no denying it, and the world’s better off without them in my opinion — asked for all they got, getting mixed up with these wizarding types — just what I expected, always knew they’d come to a sticky end —”
But at that moment, Hagrid leapt from the sofa and drew a battered pink umbrella from inside his coat. Pointing this at Uncle Vernon like a sword, he said, “I’m warning you, Dursley — I’m warning you — one more word….”
------------------------------------------------------------
用於檢索增強生成 (Retrieval-Augmented Generation) 的用法
使用案例 1:基於故事書的問答系統
問答功能允許使用者提出關於故事、角色和事件的特定問題,並獲得簡潔、內容豐富的答案。 這不僅增強了他們對書籍的理解,也讓他們感覺自己是魔法宇宙的一部分。
透過轉換為檢索器 (Retriever) 進行查詢
LangChain 向量儲存簡化了複雜的問答系統的建構,透過有效的相似性搜尋來找到基於使用者查詢的前 10 個相關文件。 檢索器 (retriever) 是從 vector_store 建立的,問答鏈是使用 create_stuff_documents_chain 函數建立的。 使用 ChatPromptTemplate 類別製作提示範本,以確保結構化和內容豐富的回應。 通常在問答應用程式中,向使用者展示用於產生答案的來源非常重要。 LangChain 內建的 create_retrieval_chain 將會把檢索到的來源文件傳遞到輸出中的 "context" 鍵下
在此處閱讀更多關於 Langchain RAG 教學和上述術語的信息:here
from typing import List, Tuple
import pandas as pd
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate
# Define the function to perform the RAG chain invocation
def get_answer_and_sources(user_query: str):
# Perform similarity search with scores
docs_with_score: List[Tuple[Document, float]] = (
vector_store.similarity_search_with_score(
user_query,
k=10,
)
)
# Extract the context from the top results
context = "\n".join([doc.page_content for doc, score in docs_with_score])
# Define the system prompt
system_prompt = (
"You are an assistant for question-answering tasks based on the story in the book. "
"Use the following pieces of retrieved context to answer the question. "
"If you don't know the answer, say that you don't know, but also suggest that the user can use the fan fiction function to generate fun stories. "
"Use 5 sentences maximum and keep the answer concise by also providing some background context of 1-2 sentences."
"\n\n"
"{context}"
)
# Create the prompt template
prompt = ChatPromptTemplate.from_messages(
[
("system", system_prompt),
("human", "{input}"),
]
)
# Create the retriever and chains
retriever = vector_store.as_retriever()
question_answer_chain = create_stuff_documents_chain(llm, prompt)
rag_chain = create_retrieval_chain(retriever, question_answer_chain)
# Define the input
input_data = {"input": user_query}
# Invoke the RAG chain
response = rag_chain.invoke(input_data)
# Print the answer
print("Answer:", response["answer"])
# Prepare the data for the table
data = {
"Doc ID": [
doc.metadata.get("source", "N/A").split("/")[-1]
for doc in response["context"]
],
"Content": [
doc.page_content[:50] + "..."
if len(doc.page_content) > 100
else doc.page_content
for doc in response["context"]
],
}
# Create a DataFrame
df = pd.DataFrame(data)
# Print the table
print("\nSources:")
print(df.to_markdown(index=False))
# Define the user query
user_query = "How did Harry feel when he first learnt that he was a Wizard?"
# Call the function to get the answer and sources
get_answer_and_sources(user_query)
Answer: When Harry first learned that he was a wizard, he felt quite sure there had been a horrible mistake. He struggled to believe it because he had spent his life being bullied and mistreated by the Dursleys. If he was really a wizard, he wondered why he hadn't been able to use magic to defend himself. This disbelief and surprise were evident when he gasped, “I’m a what?”
Sources:
| Doc ID | Content |
|:--------------------------------------------|:------------------------------------------------------|
| 01 Harry Potter and the Sorcerers Stone.txt | Harry was wondering what a wizard did once he’d fi... |
| 01 Harry Potter and the Sorcerers Stone.txt | Harry realized his mouth was open and closed it qu... |
| 01 Harry Potter and the Sorcerers Stone.txt | “Most of us reckon he’s still out there somewhere ... |
| 01 Harry Potter and the Sorcerers Stone.txt | “Ah, go boil yer heads, both of yeh,” said Hagrid.... |
# Define the user query
user_query = "Did Harry have a pet? What was it"
# Call the function to get the answer and sources
get_answer_and_sources(user_query)
Yes, Harry had a pet owl named Hedwig. He decided to call her Hedwig after finding the name in a book titled *A History of Magic*.
Sources:
| Doc ID | Content |
|:--------------------------------------------|:------------------------------------------------------|
| 01 Harry Potter and the Sorcerers Stone.txt | Harry sank down next to the bowl of peas. “What di... |
| 01 Harry Potter and the Sorcerers Stone.txt | Harry kept to his room, with his new owl for compa... |
| 01 Harry Potter and the Sorcerers Stone.txt | As the snake slid swiftly past him, Harry could ha... |
| 01 Harry Potter and the Sorcerers Stone.txt | Ron reached inside his jacket and pulled out a fat... |
API 參考
有關 SQLServer 向量儲存功能的詳細文件和配置,請前往 API 參考: https://langchain-python.dev.org.tw/api_reference/sqlserver/index.html