Most tutorials skip this: ChatGPT deletes your uploaded files on a schedule that varies by subscription plan, but OpenAI doesn’t publish which plan gets what retention period. Upload a dataset, work with it for a week, return to the chat – gone. No warning.
Why this matters: Advanced Data Analysis isn’t a file viewer. It’s a Python execution environment writing code against your data. When that data vanishes mid-project, the session breaks in ways ChatGPT can’t always diagnose.
When You Actually Need This Feature
250MB CSV export from your analytics platform. Excel chokes. Google Sheets times out. You need to answer one question – “Which product category drove the spike in March?” – before Monday’s meeting.
Advanced Data Analysis was built for this. OpenAI’s documentation says ChatGPT can handle files too large for spreadsheet applications by processing them in a sandboxed Python environment.
“Can handle” ≠ “always handles.” The difference between a successful analysis and a cryptic error? Limits nobody explains upfront.
Think of it like this: You’re not uploading to a database. You’re uploading to a temporary workspace that might forget you were ever there.
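To make the March question concrete, here is a minimal sketch of the kind of aggregation involved. It is not necessarily what ChatGPT generates (it usually reaches for pandas). The column names date, category, and revenue and the sample rows are assumptions made up for illustration. It reads one row at a time, so memory use stays flat however large the export is.

```python
import csv
import io
from collections import defaultdict


def monthly_totals_by_category(rows, date_col="date", cat_col="category", value_col="revenue"):
    """Sum value_col per (category, YYYY-MM), reading one row at a time."""
    totals = defaultdict(float)
    for row in rows:
        month = row[date_col][:7]  # assumes ISO dates such as 2024-03-15
        totals[(row[cat_col], month)] += float(row[value_col])
    return totals


def biggest_jump(totals, prev_month, month):
    """Return (category, increase) for the largest rise from prev_month to month."""
    categories = {cat for cat, _ in totals}
    deltas = {
        cat: totals.get((cat, month), 0.0) - totals.get((cat, prev_month), 0.0)
        for cat in categories
    }
    top = max(deltas, key=deltas.get)
    return top, deltas[top]


# Tiny in-memory stand-in for the real export (hypothetical data).
sample = io.StringIO(
    "date,category,revenue\n"
    "2024-02-10,shoes,100\n"
    "2024-02-12,hats,80\n"
    "2024-03-03,shoes,110\n"
    "2024-03-09,hats,250\n"
)
totals = monthly_totals_by_category(csv.DictReader(sample))
top_category, increase = biggest_jump(totals, "2024-02", "2024-03")
print(top_category, increase)
```

Swap the StringIO for open("your_export.csv", newline="") to run the same logic against a real file on your own machine.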
What Happens When You Upload a File
ChatGPT doesn’t just read your file. The actual sequence, per OpenAI’s technical documentation:
1. Schema inspection. ChatGPT examines the first few rows to understand column types and structure.
2. Code generation. It writes Python (pandas, matplotlib, numpy) to process your request.
3. Execution. Code runs in a secure sandbox with hundreds of pre-installed libraries.
4. Error handling. If the code fails, ChatGPT reads the error and rewrites automatically.
This isn’t document summarization – it’s live code execution. Ask for a chart? You’re asking ChatGPT to write a matplotlib script, run it, catch exceptions, fix them, output a PNG.
Click “View Analysis” at the end of any response. That link shows every library import, every data transformation, every plot parameter ChatGPT used.
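A rough sketch of that inspect, run, fail, retry sequence, assuming pandas is available. The sample data and column names are invented, and the try/except here only imitates the retry behaviour. ChatGPT rewrites the code between attempts rather than wrapping it like this.

```python
import io

import pandas as pd

# Hypothetical upload: one value in 'amount' is not a number.
raw = io.StringIO(
    "order_id,category,amount\n"
    "1,shoes,19.99\n"
    "2,hats,unknown\n"
    "3,shoes,5.00\n"
)

# Schema inspection: peek at a few rows and the inferred column types.
df = pd.read_csv(raw)
print(df.head(3))
print(df.dtypes)

# Code generation + execution: a first attempt that assumes 'amount' is numeric.
try:
    total = df["amount"].astype(float).sum()
except ValueError as exc:
    # Error handling: read the failure, then retry with a tolerant conversion
    # that turns unparseable values into NaN (which sum() skips).
    print(f"first attempt failed: {exc}")
    amounts = pd.to_numeric(df["amount"], errors="coerce")
    total = amounts.sum()

print(round(total, 2))
```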
The Upload Mechanics Nobody Explains
Click the paperclip icon (or plus sign, depending on your interface version). Drag your file. Wait for the upload bar.
Simple. Except for the restrictions:
File size limits are tiered by file type. Documents: 512MB max. Spreadsheets: ~50MB (the exact limit depends on row complexity, per OpenAI’s FAQ). Images: 20MB cap.
The trap: text-heavy files also face a 2 million token limit per file. A heavily formatted 400-page PDF can hit that token cap even at 80MB. Spreadsheets? Exempt from the token limit entirely – which is why a CSV just under its ~50MB cap can upload fine while a smaller Word doc gets rejected.
| File Type | Size Limit | Token Limit | Common Gotcha |
|---|---|---|---|
| PDF, Word, Text | 512MB | 2M tokens | Token limit hits first on formatted docs |
| Spreadsheets (CSV, XLS) | ~50MB | None | Row size matters – wide rows cut the effective limit |
| Images (JPEG, PNG) | 20MB | N/A | Non-Enterprise plans can’t analyze image content in PDFs |
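
A small pre-flight check can catch the obvious cases before you upload. This is a sketch under stated assumptions. The limits are the ones quoted in this article and may change. The 4-characters-per-token figure is a rough rule of thumb for English text, not the tokenizer ChatGPT uses. The function name preflight is made up.

```python
MB = 1024 * 1024

# Limits as quoted in this article; approximate and subject to change.
SIZE_LIMITS_MB = {"document": 512, "spreadsheet": 50, "image": 20}
TOKEN_LIMIT = 2_000_000
CHARS_PER_TOKEN = 4  # rough heuristic (assumption), not an exact tokenizer


def preflight(kind, size_bytes, text_chars=None):
    """Return a list of likely problems; an empty list means no obvious blocker."""
    problems = []
    limit_mb = SIZE_LIMITS_MB[kind]
    if size_bytes > limit_mb * MB:
        problems.append(f"{size_bytes / MB:.0f}MB exceeds the ~{limit_mb}MB {kind} limit")
    # Only text documents face the token cap; spreadsheets and images do not.
    if kind == "document" and text_chars is not None:
        est_tokens = text_chars // CHARS_PER_TOKEN
        if est_tokens > TOKEN_LIMIT:
            problems.append(f"about {est_tokens:,} tokens, over the {TOKEN_LIMIT:,} cap")
    return problems


print(preflight("spreadsheet", 60 * MB))
print(preflight("document", 80 * MB, text_chars=10_000_000))
print(preflight("image", 5 * MB))
```

Pass the character count of the extracted text, not the file size, as text_chars. A PDF’s byte size says little about how much text it holds.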
Upload frequency caps exist but aren’t visible. Plus users: roughly 80 files per rolling 3-hour window. Free users: 3 per day. There’s also a 10GB storage cap per user and 100GB per organization – shared across all chats, Projects, and custom GPTs you’ve built.
Problem: ChatGPT provides no way to check your remaining quota. You discover you’ve hit the limit only when an upload fails with “upload limit reached.”
Pro tip: Hit the storage cap? Delete entire chat threads containing large files – not just the files themselves. Files count against your quota until the parent chat is deleted. Deletions take up to 30 days to clear from OpenAI’s systems.
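Since there’s no quota indicator, one option is to keep your own tally of uploads. A minimal sketch of a rolling-window counter follows. The class name UploadLog is made up, and the 80-per-3-hours default is the approximate Plus figure quoted above, not an official constant.

```python
import time
from collections import deque


class UploadLog:
    """Local tally of uploads inside a rolling time window."""

    def __init__(self, max_uploads=80, window_seconds=3 * 60 * 60):
        self.max_uploads = max_uploads
        self.window_seconds = window_seconds
        self._stamps = deque()

    def _prune(self, now):
        # Drop timestamps that have aged out of the window.
        while self._stamps and self._stamps[0] <= now - self.window_seconds:
            self._stamps.popleft()

    def record(self, now=None):
        """Note one upload (defaults to the current time)."""
        now = time.time() if now is None else now
        self._prune(now)
        self._stamps.append(now)

    def remaining(self, now=None):
        """Uploads left before the assumed cap is reached."""
        now = time.time() if now is None else now
        self._prune(now)
        return max(0, self.max_uploads - len(self._stamps))


log = UploadLog()
log.record()
print(log.remaining())
```

This only tracks what you tell it. It can’t see uploads made from another device or the storage cap.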
The Three Limits That Actually Break Things
Limit 1: The invisible token cap
Upload a 150MB legal contract PDF. Upload succeeds. Ask ChatGPT to extract all references to “indemnification.” It returns nothing, or only partial results.
What happened? Two things can go wrong after a successful upload. First, if the extracted text runs past the 2 million token cap, only part of the document is actually loaded. Second, non-Enterprise plans strip images and complex formatting during text extraction. If your PDF relied on tables, charts, or visual layout to convey meaning, that context is gone. Either way, the file uploaded, but ChatGPT analyzed a truncated or degraded text-only version.
Limit 2: Session resets orphan your files
Midway through a long analysis, the Python environment resets. Community users report this happening without warning – the session just dies. Files vanish from the execution context.
ChatGPT doesn’t always realize the files are gone. It keeps generating responses referencing variables that no longer exist. Result: Python errors like NameError: name 'df' is not defined. The workaround experienced users rely on: prompt ChatGPT to write self-contained scripts that don’t rely on session state from earlier in the conversation.
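What “self-contained” can look like in practice: every run reloads the data from disk and fails loudly if the file is missing, instead of trusting a leftover df. This is a sketch. The helper names and sample file are hypothetical, and the demo writes to a temporary directory rather than the sandbox’s real upload path.

```python
import csv
import os
import tempfile
from collections import Counter


def load_rows(path):
    """Reload the data from disk every time instead of trusting a leftover variable."""
    if not os.path.exists(path):
        raise FileNotFoundError(
            f"{path} is missing; the session may have reset, re-upload the file"
        )
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def count_by(rows, column):
    """Count rows per distinct value of the given column."""
    return Counter(row[column] for row in rows)


# Demo: create a stand-in file, analyse it, then simulate a reset by deleting it.
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "orders.csv")
    with open(path, "w", newline="", encoding="utf-8") as handle:
        handle.write("category,amount\nshoes,10\nhats,5\nshoes,7\n")

    counts = count_by(load_rows(path), "category")
    print(dict(counts))

    os.remove(path)
    try:
        load_rows(path)
    except FileNotFoundError as exc:
        message = str(exc)
        print(message)
```

The explicit error replaces a confusing NameError with a message that says what to do next.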
Limit 3: File retention is undocumented
Files are deleted “within a duration that varies based on your plan.” That’s the exact wording from OpenAI’s documentation. No retention table. No timeline. Enterprise keeps files longer than Plus, but OpenAI publishes numbers for neither.
You can’t rely on ChatGPT as file storage. Download anything you need before closing the session.
What the Code Execution Environment Can’t Do
The sandbox is locked down. No internet access. No ability to download third-party Python packages. No persistent file system across sessions.
It runs Python 3.8 (as of some 2023 reports – this may have been updated, but OpenAI doesn’t version the environment publicly). Hundreds of libraries come pre-installed – pandas, numpy, matplotlib, scikit-learn, TensorFlow, PyTorch – but you can’t pip install anything new.
Odd limitation: Someone wants to use a specific geospatial library for map visualization. Not pre-installed? No workaround – export the data and run that code locally.
Also: the environment can’t make outbound network requests. You can’t tell ChatGPT “fetch this API and merge it with my uploaded CSV.” It only works with what you upload.
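Before planning an analysis around a particular library, it’s worth asking ChatGPT to run an availability check. A minimal sketch using the standard library’s importlib follows. The module names in the demo are just examples, and the result depends entirely on the environment the code runs in.

```python
import importlib.util


def available(module_names):
    """Map each top-level module name to whether it can be imported here."""
    return {name: importlib.util.find_spec(name) is not None for name in module_names}


# Example names only; results vary by environment.
for name, ok in available(["pandas", "geopandas", "folium"]).items():
    print(f"{name}: {'installed' if ok else 'not installed'}")
```

If a library you need comes back as not installed, that’s your cue to export the data and run that step locally.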
When Uploads Fail (And What the Errors Actually Mean)
“Unknown Error Occurred” – the most common failure message, the least helpful. Community troubleshooting reveals this usually means one of four things: corrupted file structure (try re-saving the file with a different tool), password-protected or DRM-locked document, unsupported character encoding in a text file, or server-side throttling during peak hours.
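For the encoding case, re-saving the file as UTF-8 before uploading often helps. A minimal sketch follows. The function name to_utf8 is made up, and the cp1252 fallback is an assumption that suits many Windows and Excel exports but not every file.

```python
def to_utf8(data, fallback="cp1252"):
    """Return (utf8_bytes, source_encoding). Tries UTF-8 first, then the fallback."""
    try:
        data.decode("utf-8")
        return data, "utf-8"
    except UnicodeDecodeError:
        # Assumed fallback encoding; undecodable bytes become U+FFFD.
        text = data.decode(fallback, errors="replace")
        return text.encode("utf-8"), fallback


legacy = "café,12\n".encode("cp1252")  # not valid UTF-8
converted, source = to_utf8(legacy)
print(source, converted.decode("utf-8").strip())
```

Read the file with open(path, "rb").read(), pass the bytes through, and write the result to a new file so the original stays untouched.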
“Error Analyzing” with a code snippet – ChatGPT tried to run Python against your data and hit an exception it couldn’t auto-fix. Common causes: inconsistent data types in CSV columns (text in a numeric column), null values the code didn’t handle, or malformed date formats.
The fix? Move your analysis to a fresh chat. Session state corruption is real. Starting clean resolves more problems than it should.
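If a fresh chat still trips over the same data, cleaning the file first removes the usual suspects. A sketch assuming pandas, with an invented three-row sample showing text in a numeric column, a missing value, and a date in the wrong format.

```python
import io

import pandas as pd

# Hypothetical messy export.
raw = io.StringIO(
    "date,units\n"
    "2024-03-01,12\n"
    "03/02/2024,seven\n"
    "2024-03-03,\n"
)
df = pd.read_csv(raw)

# Turn anything that isn't a number into NaN instead of raising.
df["units"] = pd.to_numeric(df["units"], errors="coerce")

# Accept only ISO dates; anything else becomes NaT.
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d", errors="coerce")

# Keep the rejects so you can inspect them rather than silently losing rows.
bad_rows = df[df["units"].isna() | df["date"].isna()]
clean = df.dropna(subset=["units", "date"])

print(f"{len(clean)} clean rows, {len(bad_rows)} rows need attention")
```

Reviewing bad_rows before dropping anything matters. Coercion hides errors as easily as it fixes them.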
“Upload Limit Reached” despite uploading zero files today – you’ve hit the 10GB storage cap. Files from old chats still count. Delete chat threads you don’t need.
The PDF Problem Most Guides Skip
Scanned PDFs – documents that are just images of pages, not selectable text – fail silently. Upload completes. ChatGPT accepts the file. But when you ask it to analyze the content, responses are empty or wrong.
Why? Non-Enterprise plans don’t run OCR (optical character recognition). They extract only digital text. A scanned receipt, a photographed whiteboard, a PDF produced by a scanner – these are image files masquerading as documents. ChatGPT sees nothing.
Even text-based PDFs lose fidelity. Tables become linearized text. Multi-column layouts get scrambled. Footnotes appear in random places. If your PDF’s meaning depends on spatial layout, expect degraded results.
A researcher uploaded a 60-page technical paper with equations as embedded images. ChatGPT’s summary omitted every equation – because every equation was discarded during text extraction.
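One way to spot an image-only PDF before uploading is to check how much text actually comes out of each page. The sketch below assumes you’ve already extracted per-page text with a PDF library of your choice (that step isn’t shown). The function name and the 25-character threshold are arbitrary illustrations.

```python
def scanned_page_ratio(page_texts, min_chars=25):
    """Fraction of pages whose extracted text is shorter than min_chars."""
    if not page_texts:
        return 1.0  # nothing extracted at all: treat as fully image-based
    sparse = sum(1 for text in page_texts if len(text.strip()) < min_chars)
    return sparse / len(page_texts)


# Hypothetical extraction result: two empty pages and one with real text.
pages = ["", "   ", "Section 4. Indemnification. The supplier shall hold harmless ..."]
ratio = scanned_page_ratio(pages)
print(f"{ratio:.0%} of pages look image-only")
```

A high ratio suggests running the file through an OCR tool first.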
Why It’s Called Advanced Data Analysis Instead of Code Interpreter
OpenAI renamed this feature in August 2023 when they launched ChatGPT Enterprise. The functionality didn’t change – just the branding.
The original name, Code Interpreter, implied the feature was for programmers. The new name, Advanced Data Analysis, signals it’s for anyone working with data – analysts, marketers, researchers, PMs.
But the new name is misleading in the opposite direction: the feature isn’t just for data analysis. People use it to convert file formats, edit videos, extract audio from clips, resize images in bulk, solve math problems. Anything you can do in a Python sandbox? You can ask ChatGPT to do.
The feature is the same. The pitch changed.
FAQ
Can I upload files on the free version of ChatGPT?
Yes, but with severe restrictions. Free users get 3 file uploads per day (resets every 24 hours). You don’t get access to Advanced Data Analysis – just basic document reading. For actual data analysis? ChatGPT Plus required ($20/month as of 2026).
What happens to my data after I upload it?
Files stick around for a period that varies by plan (OpenAI doesn’t publish exact timelines). Delete a chat? Files are removed from OpenAI’s systems within 30 days unless needed for security or legal reasons. Team and Enterprise users have their data excluded from model training by default. Free and Plus users should assume uploaded content may be used for training unless they opt out in settings.
Never upload sensitive information – Social Security numbers, financial records, proprietary code, health data – regardless of plan. One company uploaded an internal pricing spreadsheet to test the feature. Two weeks later, an employee at a competitor mentioned seeing similar data in a ChatGPT response. Coincidence? Maybe. But why risk it.
Why does ChatGPT say it can’t access a file I just uploaded?
Session resets. The Python environment dies, your file disappears from memory, but ChatGPT doesn’t always realize this. It continues responding as if the file exists. Solution: re-upload the file or start a new chat.
Or: you hit the 2 million token limit on text documents, so only part of the file was actually loaded. Or: the file is a scanned PDF or has embedded images, and non-Enterprise plans extract only text – so there’s effectively nothing to analyze. Check the file type. Try converting scanned documents to text-based formats using an OCR tool first.
Next step: Upload a small test file – a 10-row CSV with clean data – and ask ChatGPT to create a simple bar chart. Watch what code it generates via “View Analysis.” That’s how you learn what the feature actually does under the hood, instead of treating it like magic.
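If you don’t have a clean file handy, this sketch writes one. The column names, values, and file name are invented for the test, and the output lands in your system’s temp directory.

```python
import csv
import os
import tempfile


def write_test_csv(path):
    """Write a 10-row CSV with a header and return the number of data rows."""
    rows = [(f"product_{i}", "A" if i % 2 else "B", 10 * i) for i in range(1, 11)]
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["product", "category", "units"])
        writer.writerows(rows)
    return len(rows)


test_path = os.path.join(tempfile.gettempdir(), "chatgpt_upload_test.csv")
written = write_test_csv(test_path)
print(f"wrote {written} rows to {test_path}")
```

Upload the resulting file and ask for a bar chart of units by product.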