TiDB

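TiDB Cloud is a comprehensive Database-as-a-Service (DBaaS) solution that offers both dedicated and serverless options. TiDB Serverless is now integrating built-in vector search into the MySQL ecosystem. With this enhancement, you can develop AI applications on TiDB Serverless without adopting a new database or an additional technology stack. To be among the first to try it, join the waitlist for the private beta at https://tidb.cloud/ai.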

This notebook shows how to use TiDBLoader to load data from TiDB in langchain.

Prerequisites

Before using TiDBLoader, install the following dependencies:

%pip install --upgrade --quiet langchain langchain-community sqlalchemy pymysql

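Next, configure the connection to TiDB. This notebook follows the standard connection method provided by TiDB Cloud to establish a secure and efficient database connection.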

import getpass

# Copied from the TiDB Cloud console; replace with your own connection string
tidb_connection_string_template = "mysql+pymysql://<USER>:<PASSWORD>@<HOST>:4000/<DB>?ssl_ca=/etc/ssl/cert.pem&ssl_verify_cert=true&ssl_verify_identity=true"
tidb_password = getpass.getpass("Input your TiDB password:")
tidb_connection_string = tidb_connection_string_template.replace(
    "<PASSWORD>", tidb_password
)
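
A password that contains characters such as `@`, `/`, or `:` would produce an invalid URL if it were substituted verbatim. The following is a minimal sketch, assuming the same placeholder style as the template above, that percent-encodes the password with the standard library's `urllib.parse.quote` before substitution. The helper name `build_connection_string` and the shortened `TEMPLATE` are hypothetical and are not part of TiDB or langchain.

```python
from urllib.parse import quote

# Shortened version of the template above; all bracketed values are placeholders.
TEMPLATE = "mysql+pymysql://<USER>:<PASSWORD>@<HOST>:4000/<DB>"


def build_connection_string(template: str, password: str) -> str:
    """Hypothetical helper: insert a percent-encoded password into the template."""
    # safe="" ensures that "/", "@" and ":" are all encoded.
    return template.replace("<PASSWORD>", quote(password, safe=""))


example = build_connection_string(TEMPLATE, "p@ss/word:1")
print(example)
```

With this approach, the value returned by `getpass.getpass` can be passed straight to the helper instead of calling `str.replace` on the raw password.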

Load Data from TiDB

Here is a breakdown of the key arguments you can use to customize the behavior of TiDBLoader:

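query (str): The SQL query to execute against the TiDB database. The query should select the data you want to load into Document objects. For example, "SELECT * FROM my_table" fetches all data from my_table.
page_content_columns (Optional[List[str]]): The list of column names whose values are included in the page_content of each Document object. If set to None (the default), all columns returned by the query are included in page_content. This lets you tailor the content of each document to specific columns of your data.
metadata_columns (Optional[List[str]]): The list of column names whose values are included in the metadata of each Document object. By default this list is empty, so no metadata is included unless you specify it explicitly. This is useful for carrying additional information about each document that is not part of the main content but is still valuable for processing or analysis.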
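
To make the roles of the two column arguments concrete, here is a small pure-Python sketch of how one result row could be divided into page content and metadata. It is illustrative only and is not the loader's actual implementation: `split_row` is a hypothetical function, and the `column: value` line format is an assumption chosen to match the sample output shown later on this page.

```python
from typing import Any, Dict, List, Optional, Tuple


def split_row(
    row: Dict[str, Any],
    page_content_columns: Optional[List[str]] = None,
    metadata_columns: Optional[List[str]] = None,
) -> Tuple[str, Dict[str, Any]]:
    """Hypothetical helper: divide one row into (page_content, metadata)."""
    # None means every column returned by the query goes into page_content.
    content_cols = list(row) if page_content_columns is None else page_content_columns
    # A missing or empty list means no metadata is attached.
    meta_cols = metadata_columns or []
    page_content = "\n".join(f"{col}: {row[col]}" for col in content_cols)
    metadata = {col: row[col] for col in meta_cols}
    return page_content, metadata


row = {"id": 1, "name": "Item 1", "description": "Description of Item 1"}
content, meta = split_row(row, ["name", "description"], ["id"])
print(content)
print(meta)
```

Calling `split_row(row)` with no column arguments puts all three columns into the content and returns an empty metadata dictionary, which mirrors the defaults described above.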

from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine

# Connect to the database
engine = create_engine(tidb_connection_string)
metadata = MetaData()
table_name = "test_tidb_loader"

# Create a table
test_table = Table(
    table_name,
    metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String(255)),
    Column("description", String(255)),
)
metadata.create_all(engine)


# Insert sample rows in an explicit transaction; roll back and re-raise on failure
with engine.connect() as connection:
    transaction = connection.begin()
    try:
        connection.execute(
            test_table.insert(),
            [
                {"name": "Item 1", "description": "Description of Item 1"},
                {"name": "Item 2", "description": "Description of Item 2"},
                {"name": "Item 3", "description": "Description of Item 3"},
            ],
        )
        transaction.commit()
    except:
        transaction.rollback()
        raise

from langchain_community.document_loaders import TiDBLoader

# Setup TiDBLoader to retrieve data
loader = TiDBLoader(
    connection_string=tidb_connection_string,
    query=f"SELECT * FROM {table_name};",
    page_content_columns=["name", "description"],
    metadata_columns=["id"],
)

# Load data
documents = loader.load()

# Display the loaded documents
for doc in documents:
    print("-" * 30)
    print(f"content: {doc.page_content}\nmetadata: {doc.metadata}")

API Reference: TiDBLoader

------------------------------
content: name: Item 1
description: Description of Item 1
metadata: {'id': 1}
------------------------------
content: name: Item 2
description: Description of Item 2
metadata: {'id': 2}
------------------------------
content: name: Item 3
description: Description of Item 3
metadata: {'id': 3}

# Clean up: drop the test table
test_table.drop(bind=engine)

Related

Document loader conceptual guide
Document loader how-to guides
