pandas-ai/examples/quickstart.ipynb · pa/pandas-ai - AtomGit

Ggdcsinaptikupdated readme, examples notebook, agent page, changed getpanda.ai to pandas-ai.com
34d01994创建于 2025年9月22日历史提交
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# PandasAI Quickstart Guide\n",
    "\n",
    "This notebook demonstrates how to get started with PandasAI and how to use it to analyze data through natural language."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Set up LLM\n",
    "\n",
    "Use pandasai_litellm to select the LLm of your choice and use PandasAI"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandasai as pai\n",
    "from pandasai_litellm.litellm import LiteLLM\n",
    "\n",
    "# Initialize LiteLLM with your OpenAI model\n",
    "llm = LiteLLM(model=\"gpt-4.1-mini\", api_key=\"YOUR_OPENAI_API_KEY\")\n",
    "\n",
    "# Configure PandasAI to use this LLM\n",
    "pai.config.set({\n",
    "    \"llm\": llm\n",
    "})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Read CSV\n",
    "\n",
    "For this example, we will use a small dataset of heart disease patients from [Kaggle](https://www.kaggle.com/datasets/arezaei81/heartcsv)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "file_df = pai.read_csv(\"./data/heart.csv\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Chat with Your Data\n",
    "\n",
    "You can ask questions about your data using natural language"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "response = file_df.chat(\"What is the correlation between age and cholesterol?\")\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create Dataset\n",
    "\n",
    "To avoid to reading the csv again and again create dataset in PandasAI to reused.\n",
    "The path must be in format 'organization/dataset'."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dataset = pai.create(path=\"your-organization/heart\",\n",
    "    name=\"Heart\",\n",
    "    df = file_df,\n",
    "    description=\"Heart Disease Dataset\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load Dataset\n",
    "After creation you load dataset anytime with the following code"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dataset = pai.load(\"your-organization/heart\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}