
Pandas Dataframe

This notebook shows how to use an agent to interact with a Pandas DataFrame. It is mostly optimized for question answering.

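NOTE: this agent calls the Python agent under the hood, which executes Python code generated by the LLM. If that generated code is harmful, running it can cause real damage. Use with caution.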
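To make the warning concrete, here is a minimal sketch of what executing model-generated code means. The helper `run_generated` is hypothetical and is only a simplified stand-in for a Python REPL tool, not LangChain's implementation. The point is that whatever string the model emits runs with full access to the objects it is handed.

```python
import pandas as pd


def run_generated(code: str, df: pd.DataFrame) -> None:
    """Hypothetical stand-in for a Python REPL tool: executes a code string."""
    # The generated code sees `df` and can do anything Python allows.
    exec(code, {"df": df, "pd": pd})


original = pd.DataFrame({"Age": [22.0, 38.0, 26.0]})

# Working on a copy limits damage to the data; it does NOT sandbox the code.
scratch = original.copy()
run_generated("df.drop(df.index, inplace=True)", scratch)

print(len(scratch))   # the copy was emptied by the generated code
print(len(original))  # the original is untouched
```

Copying protects only the data. File system and network access need isolation at the process or container level.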

NOTE: Since langchain has migrated to v0.3, you should upgrade langchain_openai and langchain. This avoids import errors.

pip install --upgrade langchain_openai
pip install --upgrade langchain

from langchain.agents.agent_types import AgentType
from langchain_experimental.agents.agent_toolkits import create_pandas_dataframe_agent
from langchain_openai import ChatOpenAI
import pandas as pd
from langchain_openai import OpenAI

df = pd.read_csv(

agent = create_pandas_dataframe_agent(OpenAI(temperature=0), df, verbose=True)

Using OpenAI Functions

This shows how to initialize the agent using the OPENAI_FUNCTIONS agent type. Note that this is an alternative to the above.

agent = create_pandas_dataframe_agent(
    ChatOpenAI(temperature=0, model="gpt-3.5-turbo-0613"),
    df, verbose=True, agent_type=AgentType.OPENAI_FUNCTIONS,
)

agent.invoke("how many rows are there?")

> Entering new chain...

Invoking: `python_repl_ast` with `df.shape[0]`

891
There are 891 rows in the dataframe.

> Finished chain.
'There are 891 rows in the dataframe.'
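
The tool call in the trace above is plain pandas. As a minimal sketch on a small made-up frame (the column values are illustrative and are not the dataset used here), `df.shape[0]` reads the row count from the `(rows, columns)` tuple:

```python
import pandas as pd

# Made-up three-row frame; only its shape matters here.
toy = pd.DataFrame({"Name": ["a", "b", "c"], "Age": [22.0, None, 26.0]})

n_rows, n_cols = toy.shape  # shape is a (rows, columns) tuple
print(n_rows)               # same value as toy.shape[0] and len(toy)
```

Rows with missing values still count, since `shape` describes the frame's dimensions rather than its non-null entries.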
agent.invoke("how many people have more than 3 siblings")

> Entering new AgentExecutor chain...
Thought: I need to count the number of people with more than 3 siblings
Action: python_repl_ast
Action Input: df[df['SibSp'] > 3].shape[0]
Observation: 30
Thought: I now know the final answer
Final Answer: 30 people have more than 3 siblings.

> Finished chain.
'30 people have more than 3 siblings.'
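
The expression the agent ran, `df[df['SibSp'] > 3].shape[0]`, filters with a boolean mask and then counts the surviving rows. A minimal sketch on made-up values (not the dataset used here) shows the same idea, along with an equivalent count that sums the mask directly:

```python
import pandas as pd

# Made-up sibling counts for illustration.
toy = pd.DataFrame({"SibSp": [0, 1, 4, 5, 3, 8]})

mask = toy["SibSp"] > 3                # boolean Series, True where the condition holds
count_via_filter = toy[mask].shape[0]  # keep matching rows, then count them
count_via_sum = int(mask.sum())        # True counts as 1, so the sum is the row count

print(count_via_filter, count_via_sum)
```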
agent.invoke("whats the square root of the average age?")

> Entering new AgentExecutor chain...
Thought: I need to calculate the average age first
Action: python_repl_ast
Action Input: df['Age'].mean()
Observation: 29.69911764705882
Thought: I now need to calculate the square root of the average age
Action: python_repl_ast
Action Input: math.sqrt(df['Age'].mean())
Observation: NameError("name 'math' is not defined")
Thought: I need to import the math library
Action: python_repl_ast
Action Input: import math
Thought: I now need to calculate the square root of the average age
Action: python_repl_ast
Action Input: math.sqrt(df['Age'].mean())
Observation: 5.449689683556195
Thought: I now know the final answer
Final Answer: The square root of the average age is 5.449689683556195.

> Finished chain.
'The square root of the average age is 5.449689683556195.'
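
In the trace above the first attempt fails with a `NameError` because `math` had not been imported into the REPL's namespace; once the agent imports it, the same expression succeeds. A minimal sketch of the computation on made-up ages (chosen so the result is easy to check by hand):

```python
import math

import pandas as pd

# Made-up ages; one value is missing on purpose.
toy = pd.DataFrame({"Age": [16.0, None, 34.0, 25.0]})

mean_age = toy["Age"].mean()  # NaN is skipped by default: (16 + 34 + 25) / 3
root = math.sqrt(mean_age)

print(root)  # 5.0
```

Because `Series.mean()` skips missing values by default, the average is taken over the rows that actually have an age.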

Multi DataFrame Example


df1 = df.copy()
df1["Age"] = df1["Age"].fillna(df1["Age"].mean())
agent = create_pandas_dataframe_agent(OpenAI(temperature=0), [df, df1], verbose=True)
agent.invoke("how many rows in the age column are different?")

> Entering new AgentExecutor chain...
Thought: I need to compare the age columns in both dataframes
Action: python_repl_ast
Action Input: len(df1[df1['Age'] != df2['Age']])
Observation: 177
Thought: I now know the final answer
Final Answer: 177 rows in the age column are different.

> Finished chain.
'177 rows in the age column are different.'
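
The count in this trace presumably equals the number of missing ages in the original frame: a missing value (NaN) compares as not equal to anything, so every row that `fillna` filled registers as different. A minimal sketch on made-up data (not the dataset used here):

```python
import pandas as pd

# Made-up ages with two missing values.
toy = pd.DataFrame({"Age": [22.0, None, 26.0, None]})

filled = toy.copy()
filled["Age"] = filled["Age"].fillna(filled["Age"].mean())  # mean of 22 and 26 is 24

# NaN != 24.0 evaluates to True, so each filled row counts as a difference.
n_different = int((toy["Age"] != filled["Age"]).sum())
n_missing = int(toy["Age"].isna().sum())

print(n_different, n_missing)
```

Under these assumptions the two numbers match, which makes counting missing values a quick sanity check on the agent's answer.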
