“Why does every PandasAI tutorial I find break the moment I run pip install?” If you’ve asked that recently, you’re not imagining it. PandasAI 3.0 landed in October 2025 with breaking changes that invalidated roughly every blog post written before it. This guide installs PandasAI 3.0.0 the way the new docs actually expect – including the parts that bite v2 users hardest.
PandasAI is a Python library that lets you chat with dataframes, SQL, CSVs, and parquet files using LLMs. The sinaptik-ai/pandas-ai repo sits at 23,500+ stars as of October 2025 – and v3 was a structural rewrite, not a polish pass.
What actually changed in v3 (and why this guide exists)
v3 split the project into a small core plus extensions. The LLM no longer ships with the main package. The reasoning: every LLM provider releases new SDKs constantly, and bundling them all caused dependency hell for users who only needed one. The official v3 migration guide confirms LLMs are now extension-based – you install pandasai-litellm separately for access to 100+ models through one interface.
Config moved too. Global via pai.config.set() now – no more passing it into every SmartDataframe constructor.
Several v2 knobs are gone entirely. Removed: save_charts, enable_cache, security, custom_whitelisted_dependencies, save_charts_path, custom_head. If a tutorial tells you to set those, that tutorial is dead.
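If you are carrying an old config dict forward, a small filter can show which settings no longer do anything. This is only a sketch: the key list is the one given above, while the helper name split_v2_config is hypothetical and not part of PandasAI.

```python
# Keys listed above as removed in v3. The helper name is hypothetical.
REMOVED_V3_KEYS = {
    "save_charts",
    "enable_cache",
    "security",
    "custom_whitelisted_dependencies",
    "save_charts_path",
    "custom_head",
}


def split_v2_config(config):
    """Return (kept, dropped): settings that survive, and removed keys found."""
    kept = {k: v for k, v in config.items() if k not in REMOVED_V3_KEYS}
    dropped = sorted(k for k in config if k in REMOVED_V3_KEYS)
    return kept, dropped


old_config = {"verbose": True, "enable_cache": False, "save_charts": True}
kept, dropped = split_v2_config(old_config)
print(kept)     # {'verbose': True}
print(dropped)  # ['enable_cache', 'save_charts']
```

Whatever lands in kept still needs checking against the v3 docs; the filter only knows about the six removed keys.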
System requirements
| Component | Minimum | Recommended |
|---|---|---|
| Python | 3.8 | 3.10 or 3.11 |
| RAM | 2 GB free | 8 GB+ for larger CSVs |
| Disk | ~500 MB for deps | 2 GB if using Docker sandbox |
| OS | Linux / macOS / Windows | Linux for Docker sandbox |
| Network | Outbound HTTPS to LLM provider | – |
The Python ceiling is the gotcha. PyPI metadata pins it to Python <3.12, >=3.8. On 3.12 or 3.13, wheel resolution misbehaves – you get cryptic dependency errors that have nothing to do with your code. Spin up a 3.11 venv before anything else.
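A quick pre-flight check can catch the wrong interpreter before pip does. The sketch below assumes the >=3.8, <3.12 range quoted above is still current; the function name is illustrative.

```python
import sys

# Bounds taken from the PyPI constraint quoted above: >=3.8, <3.12.
MIN_PY = (3, 8)
MAX_PY_EXCLUSIVE = (3, 12)


def python_supported(version=None):
    """True if the (major, minor) pair falls inside the supported range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_PY <= major_minor < MAX_PY_EXCLUSIVE


if not python_supported():
    print(
        f"Python {sys.version_info.major}.{sys.version_info.minor} is outside "
        "the 3.8-3.11 range; create a 3.11 venv first."
    )
```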
Install PandasAI – the working sequence
Two packages. Not one. The library plus an LLM extension:
# Isolate deps - don't skip this
python3.11 -m venv pandasai-env
source pandasai-env/bin/activate  # Windows: pandasai-env\Scripts\activate
# Core library
pip install pandasai
# LiteLLM extension - OpenAI, Anthropic, Gemini, Ollama, and more
pip install pandasai-litellm
# Optional: Docker sandbox for safer code execution
pip install pandasai-docker
The official quickstart supports pip or Poetry – Poetry is what the maintainers use internally, but pip works fine for one-off setups.
Don’t skip the venv. PandasAI pulls in pandas, sqlglot, pydantic, litellm, and a dozen friends. Installed globally, it will conflict with at least one existing project. Learned that one the hard way.
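To confirm the venv is actually active before installing, compare the interpreter's two prefixes. This relies on standard venv behaviour (sys.prefix points at the environment, sys.base_prefix at the base install); the function name is illustrative.

```python
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """In a venv, sys.prefix differs from sys.base_prefix."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


if not in_virtualenv():
    print("No virtual environment active; activate pandasai-env first.")
```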
First-time configuration
One config call at startup. That’s the v3 model. Save this as quickstart.py:
import pandasai as pai
from pandasai_litellm.litellm import LiteLLM
llm = LiteLLM(model="gpt-4o-mini", api_key="YOUR_OPENAI_API_KEY")
pai.config.set({
    "llm": llm,
    "save_logs": True,
    "verbose": False,
    "max_retries": 3,
})
df = pai.read_csv("data/companies.csv")
response = df.chat("What is the average revenue by region?")
print(response)
pai.config.set() runs once globally – the LLM model string follows LiteLLM’s naming convention, so anthropic/claude-3-5-sonnet-latest, gemini/gemini-1.5-pro, or ollama/llama3 all route through the same interface. Pick your provider, swap the string.
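One way to make the provider swap a one-word change is a small lookup that also pulls the key from the environment instead of hardcoding it. This is a sketch, not PandasAI or LiteLLM API: the model strings are the ones named above, the environment variable names are assumed from the providers' usual conventions, and llm_kwargs is a hypothetical helper.

```python
import os

# Model strings come from the text above. The env var names are an
# assumption based on the providers' usual conventions.
PROVIDERS = {
    "openai": ("gpt-4o-mini", "OPENAI_API_KEY"),
    "anthropic": ("anthropic/claude-3-5-sonnet-latest", "ANTHROPIC_API_KEY"),
    "gemini": ("gemini/gemini-1.5-pro", "GEMINI_API_KEY"),
    "ollama": ("ollama/llama3", None),  # local model, no key needed
}


def llm_kwargs(provider, env=None):
    """Build keyword arguments for LiteLLM(...) without hardcoding the key."""
    env = os.environ if env is None else env
    model, key_var = PROVIDERS[provider]
    kwargs = {"model": model}
    if key_var is not None:
        if key_var not in env:
            raise KeyError(f"set {key_var} before selecting {provider!r}")
        kwargs["api_key"] = env[key_var]
    return kwargs


print(llm_kwargs("ollama", env={}))  # {'model': 'ollama/llama3'}
```

With that in place, the quickstart line would become llm = LiteLLM(**llm_kwargs("anthropic")), and switching providers means changing one string.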
If you’re using PandaBI Cloud features rather than your own LLM key, the env variable name matters. Turns out the maintainers renamed it – PANDASAI_API_KEY became PANDABI_API_KEY in PR #1511 (documented in the GitHub release notes). Code copy-pasted from old blogs silently fails to authenticate. No loud error. Just a refusal.
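Because the failure is quiet, it can be worth failing loudly yourself. The check below is a minimal sketch that only looks at the two variable names mentioned above; check_cloud_key is a hypothetical helper.

```python
import os

OLD_VAR = "PANDASAI_API_KEY"  # old name, per the rename described above
NEW_VAR = "PANDABI_API_KEY"   # new name


def check_cloud_key(env=None):
    """Return a warning if only the old variable name is set, else None."""
    env = os.environ if env is None else env
    if OLD_VAR in env and NEW_VAR not in env:
        return f"{OLD_VAR} is set but {NEW_VAR} is not; rename the variable."
    return None


warning = check_cloud_key()
if warning:
    print(warning)
```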
Which raises an honest question: if you’re switching LLM providers mid-project, does the single global config create problems? Mostly no – you can call pai.config.set() again at any point and it overwrites. But there’s no per-dataframe override anymore, so if you genuinely need two different LLMs active simultaneously (say, one cheap model for schema inference and one capable model for complex queries), that pattern isn’t supported in v3 open-source out of the box.
Verify the install
Four lines from a Python shell:
import pandasai as pai
print(pai.__version__) # expect 3.0.0 or later
df = pai.DataFrame({"x": [1, 2, 3]})
print(df.chat("What is the sum of x?")) # expect: 6
Version prints and chat returns 6? You’re live. No LLM configured yet? The chat line raises – expected. The import working cleanly is what the verify step is actually checking.
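If you want the install check without touching an LLM at all, the standard library can report which of the three packages pip actually installed. A minimal sketch using importlib.metadata; the distribution names are the ones from the install step, and installed_versions is an illustrative name.

```python
from importlib import metadata


def installed_versions(packages=("pandasai", "pandasai-litellm", "pandasai-docker")):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```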
Common install errors and fixes
Most v3 failures are one of four things:
- `ModuleNotFoundError: No module named 'pandasai.llm'` – classic v2-tutorial casualty. Fix: `pip install pandasai-litellm`, then swap your import to `from pandasai_litellm.litellm import LiteLLM`. The migration troubleshooting docs flag this as the #1 v3 error.
- `AttributeError: module 'pandasai' has no attribute 'api_key'` – this one surfaces when following docs written against v3 alpha builds, where the stable API hadn’t settled yet. On 3.0.0 stable: use `pai.config.set({"llm": llm})`. The `pai.api_key.set()` path doesn’t exist.
- `ModuleNotFoundError: No module named 'pandasai.core'` (or `'pandasai.connectors'`) – happens when pip has cached a v2 install and the v3 packages land on top of it. Full uninstall first: `pip uninstall pandasai pandasai-litellm`, then `pip install "pandasai==3.0.0"` pinned.
- Ollama / LiteLLM endpoint errors – GitHub issue #1777 documents this exactly: users pass `api_base='http://localhost:11434/api/generate'`. LiteLLM expects OpenAI-compatible routes. Use `api_base='http://localhost:11434'` with `model='ollama/llama3'` and let LiteLLM construct the path.
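The Ollama fix in the last item comes down to dropping the path from the URL. If the base URL arrives from config or user input, a small normalizer can do that for you. This is a sketch with a hypothetical helper name, using only urllib.parse.

```python
from urllib.parse import urlsplit


def ollama_api_base(url):
    """Strip any path (such as /api/generate) and keep scheme://host:port."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"expected a full URL like http://localhost:11434, got {url!r}")
    return f"{parts.scheme}://{parts.netloc}"


print(ollama_api_base("http://localhost:11434/api/generate"))  # http://localhost:11434
```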
Optional: Docker sandbox for untrusted prompts
PandasAI runs generated Python code on your machine. Fine for your own data. Risky if prompts come from end users.
import pandasai as pai
from pandasai_docker import DockerSandbox

sandbox = DockerSandbox()
sandbox.start()
# df is a dataframe loaded earlier, e.g. with pai.read_csv(...)
pai.chat("Who gets paid the most?", df, sandbox=sandbox)
sandbox.stop() # don't forget this - leaked containers add up
The sandbox runs code in an isolated container, keeping generated scripts away from your host filesystem. Extra second or two of overhead per call – worth it for any multi-tenant deployment. The most common mistake: forgetting sandbox.stop(). Docker doesn’t auto-clean these containers, and they accumulate fast.
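One way to make the cleanup automatic is a context manager that always calls stop(), even when the generated code raises. The sketch below relies only on the start() and stop() methods shown above. The running helper is hypothetical, and FakeSandbox is a stand-in so the example runs without Docker.

```python
from contextlib import contextmanager


@contextmanager
def running(sandbox):
    """Start the sandbox, hand it to the with-block, and always stop it."""
    sandbox.start()
    try:
        yield sandbox
    finally:
        sandbox.stop()


class FakeSandbox:
    """Stand-in with the same start()/stop() surface, for this demo only."""

    def __init__(self):
        self.events = []

    def start(self):
        self.events.append("start")

    def stop(self):
        self.events.append("stop")


fake = FakeSandbox()
try:
    with running(fake):
        raise RuntimeError("generated code blew up")
except RuntimeError:
    pass
print(fake.events)  # ['start', 'stop']
```

With the real class this would read: with running(DockerSandbox()) as sandbox: pai.chat("Who gets paid the most?", df, sandbox=sandbox).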
Upgrade from v2 and uninstall
Three rewrites are unavoidable when bringing v2 code forward.
Replace per-dataframe configs with pai.config.set(). Swap LLM imports from pandasai.llm.openai to pandasai_litellm.litellm. Remove any calls to agent.clarification_questions(), rephrase_query(), or explain() – those Agent methods were cut in v3, replaced by chat() and follow_up() per the migration guide.
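For a larger codebase, a rough scan can locate the lines that need those three rewrites. The regexes below are a hypothetical heuristic built from the patterns named above, not an official migration tool. Expect false positives, for example an unrelated .explain() call on some other object.

```python
import re

# Patterns drawn from the three rewrites described above. The checker
# itself is a hypothetical heuristic.
V2_PATTERNS = {
    "v2 LLM import": re.compile(r"\bfrom\s+pandasai\.llm\b|\bimport\s+pandasai\.llm\b"),
    "removed Agent method": re.compile(r"\.(clarification_questions|rephrase_query|explain)\s*\("),
    "per-dataframe config": re.compile(r"SmartDataframe\s*\([^)]*\bconfig\s*="),
}


def find_v2_usages(source):
    """Return (line_number, label) pairs for lines matching a v2-only pattern."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in V2_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label))
    return hits


sample = """from pandasai.llm.openai import OpenAI
sdf = SmartDataframe(df, config={"llm": llm})
agent.explain()
df.chat("total revenue?")
"""
for lineno, label in find_v2_usages(sample):
    print(f"line {lineno}: {label}")
```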
To uninstall completely:
pip uninstall pandasai pandasai-litellm pandasai-docker -y
rm -rf ~/.cache/pandasai # logs and cached responses
rm -f pandasai.log # -f: no error if the file isn't in your project root
Used vector store training? Clean those too. Training with ChromaDB, Qdrant, Pinecone, or LanceDB requires an enterprise license for production use (per the migration guide), and the local store files don’t remove themselves.
One honest caveat about licensing
Open source, but not entirely. PandasAI is MIT-licensed except for the pandasai/ee directory, which carries its own license. Cloud connectors, semantic agent training in production, certain enterprise features – those live in ee/. The free tier is generous. The line between free and enterprise isn’t always loud in the docs. Read it before you ship to production.
FAQ
Do I need an OpenAI key to use PandasAI?
No – LiteLLM routes to Anthropic, Gemini, Mistral, Groq, or local Ollama models with the right model string and matching key.
Can I still use SmartDataframe from v2 code?
Yes, with a real catch. The v3 backwards-compatibility layer keeps SmartDataframe importable, but config options like save_charts and enable_cache silently do nothing – they were removed from the engine. Basic queries keep working. But anything that depended on caching now makes a fresh LLM call every time, which means your bill goes up the moment you upgrade. If your v2 codebase is in production and you’re not ready to migrate, pin to a 2.x release (for example, pip install "pandasai<3") and stay there until you are.
What’s the difference between pandasai and the pandas library?
Separate projects. pandas is the dataframe library. pandasai wraps it with an LLM so you can ask questions in plain English. You still need pandas underneath – PandasAI adds to it, not replaces it.
Next step: open a terminal, create that 3.11 venv, run the two pip commands. If the verify script prints 6, point it at your own CSV.