Rememberizer
Rememberizer is a knowledge enhancement service for AI applications created by SkyDeck AI Inc.
This notebook shows how to retrieve documents from Rememberizer into the Document format that is used downstream.
Setup
You will need an API key: you can get one after creating a common knowledge at https://rememberizer.ai. Once you have an API key, you must set it as the environment variable REMEMBERIZER_API_KEY or pass it as rememberizer_api_key when initializing RememberizerRetriever.
RememberizerRetriever has these arguments:
- optional top_k_results: default is 10; use it to limit the number of documents returned.
- optional rememberizer_api_key: required if you do not set the environment variable REMEMBERIZER_API_KEY (a sketch showing both parameters follows below).

get_relevant_documents() has one argument, query: the free text used to find documents in the common knowledge of Rememberizer.ai.
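Both parameters can also be passed explicitly at construction time instead of relying on the environment variable. A minimal sketch (the key string is a placeholder, not a real key):

# Sketch: configure the retriever without the REMEMBERIZER_API_KEY environment variable.
from langchain_community.retrievers import RememberizerRetriever

retriever = RememberizerRetriever(
    rememberizer_api_key="rem_...",  # placeholder; use your real API key
    top_k_results=3,  # return at most 3 documents per query
)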
Examples
Basic usage
# Set up the API key
import os
from getpass import getpass
from langchain_community.retrievers import RememberizerRetriever

REMEMBERIZER_API_KEY = getpass()
os.environ["REMEMBERIZER_API_KEY"] = REMEMBERIZER_API_KEY
retriever = RememberizerRetriever(top_k_results=5)
API Reference: RememberizerRetriever
docs = retriever.get_relevant_documents(query="How do Large Language Models work?")
docs[0].metadata # meta-information of the Document
{'id': 13646493,
'document_id': '17s3LlMbpkTk0ikvGwV0iLMCj-MNubIaP',
'name': 'What is a large language model (LLM)_ _ Cloudflare.pdf',
'type': 'application/pdf',
'path': '/langchain/What is a large language model (LLM)_ _ Cloudflare.pdf',
'url': 'https://drive.google.com/file/d/17s3LlMbpkTk0ikvGwV0iLMCj-MNubIaP/view',
'size': 337089,
'created_time': '',
'modified_time': '',
'indexed_on': '2024-04-04T03:36:28.886170Z',
'integration': {'id': 347, 'integration_type': 'google_drive'}}
print(docs[0].page_content[:400])  # the first 400 characters of the Document's content
before, or contextualized in new ways. on some level they " understand " semantics in that they can associate words and concepts by their meaning, having seen them grouped together in that way millions or billions of times. how developers can quickly start building their own llms to build llm applications, developers need easy access to multiple data sets, and they need places for those data sets
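Because the retriever was created with top_k_results=5, docs holds up to five documents. A small sketch that scans their sources, using only the metadata keys shown in the example output above:

# Sketch: list the name and URL of each retrieved document.
for i, doc in enumerate(docs):
    print(f"{i}: {doc.metadata['name']} -> {doc.metadata['url']}")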
Usage in a chain
# Set up the OpenAI API key for the chat model
OPENAI_API_KEY = getpass()
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY
from langchain.chains import ConversationalRetrievalChain
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model_name="gpt-3.5-turbo")
qa = ConversationalRetrievalChain.from_llm(model, retriever=retriever)
API Reference: ConversationalRetrievalChain | ChatOpenAI
questions = [
    "What is RAG?",
    "How do Large Language Models work?",
]
chat_history = []
for question in questions:
    result = qa.invoke({"question": question, "chat_history": chat_history})
    chat_history.append((question, result["answer"]))
    print(f"-> **Question**: {question} \n")
    print(f"**Answer**: {result['answer']} \n")
-> **Question**: What is RAG?
**Answer**: RAG stands for Retrieval-Augmented Generation. It is an AI framework that retrieves facts from an external knowledge base to enhance the responses generated by Large Language Models (LLMs) by providing up-to-date and accurate information. This framework helps users understand the generative process of LLMs and ensures that the model has access to reliable information sources.
-> **Question**: How do Large Language Models work?
**Answer**: Large Language Models (LLMs) work by analyzing massive data sets of language to comprehend and generate human language text. They are built on machine learning, specifically deep learning, which involves training a program to recognize features of data without human intervention. LLMs use neural networks, specifically transformer models, to understand context in human language, making them better at interpreting language even in vague or new contexts. Developers can quickly start building their own LLMs by accessing multiple data sets and using services like Cloudflare's Vectorize and Cloudflare Workers AI platform.
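ConversationalRetrievalChain is one of LangChain's legacy chains; on recent releases the same retriever also composes with LCEL. A minimal sketch, reusing the retriever and model defined above (the prompt wording is illustrative):

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough

# Sketch: a simple RAG chain built with LCEL; the prompt text is illustrative.
prompt = ChatPromptTemplate.from_template(
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\nQuestion: {question}"
)

def format_docs(docs):
    # Join the retrieved page contents into one context string.
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)

print(rag_chain.invoke("What is RAG?"))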