Katie Academy

Spreadsheet Analysis

Intermediate · 16 minutes · Lesson 2 of 5

Learning objectives

  • Ask better questions of spreadsheets and CSVs.
  • Use ChatGPT for sanity checks, segmentation, and summary.
  • Avoid letting fluent analysis hide weak data thinking.

Spreadsheets often produce false confidence because they look precise. ChatGPT can help you analyze them faster, but only if you frame the job clearly and keep basic analytical hygiene intact.

[Illustration: a spreadsheet being transformed into questions, findings, and a decision-ready summary.]

What you'll learn
  • How to ask analytical questions instead of generic summary questions.
  • What kinds of spreadsheet tasks ChatGPT handles well.
  • Where you still need manual checking and judgment.

Why this matters

Tabular data is one of the places where ChatGPT can save real time: finding patterns, summarizing trends, segmenting categories, and turning messy observations into a readable narrative.

But spreadsheets also invite lazy interpretation. The more polished the explanation sounds, the more important it is to ask whether the data was sound and the question was well framed.

How data analysis works under the hood

ChatGPT's data analysis runs Python code behind the scenes. This capability was previously called "Code Interpreter," then "Advanced Data Analysis." It is now a default capability built into ChatGPT, with no manual enabling required. When you upload a spreadsheet, ChatGPT writes and executes Python to inspect, transform, and visualize your data.
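
To make "writes and executes Python to inspect your data" concrete, here is a minimal sketch of a first inspection pass. It is not the code ChatGPT actually generates (that varies per conversation and often relies on libraries such as pandas); this version uses only the standard library, and the sample data and the `inspect_csv` helper are hypothetical.

```python
import csv
import io

# Hypothetical sample standing in for an uploaded CSV file.
SAMPLE_CSV = """customer,segment,monthly_spend
Acme,enterprise,1200
Birch,smb,90
Cedar,smb,
"""

def inspect_csv(text):
    """Return the row count, column names, and blank-cell count per column."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    columns = list(reader.fieldnames or [])
    blanks = {
        col: sum(1 for r in rows if not (r[col] or "").strip())
        for col in columns
    }
    return {"rows": len(rows), "columns": columns, "blank_cells": blanks}

summary = inspect_csv(SAMPLE_CSV)
print(summary)
```

A first pass like this only describes the sheet. Whether the description is useful depends on the question you bring to it.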

Results now include interactive tables you can sort and filter, plus customizable charts (bar, line, pie, scatter) that you can download directly from the conversation.

The core idea

Good spreadsheet prompts define the business or analytical question, the key columns, and the form of the answer. You get better results when you ask for interpretation in relation to a goal rather than for a vague summary of the sheet.

The reason generic analysis fails is subtle. When you say "analyze this spreadsheet," ChatGPT will produce something that looks like analysis: it will describe column distributions, note a few trends, and maybe compute some averages. The output reads well. But it is not oriented toward any decision. It is the equivalent of describing a map without knowing where the person wants to go. The analysis has no direction, so it has no leverage.

A better approach starts from the other end. What question does this data need to answer? What decision does it need to inform? When you lead with the question, the analysis becomes purposeful. ChatGPT can filter for the relevant dimensions, ignore the noise, and structure findings around the thing you actually need to know.

Use ChatGPT for exploration, pattern detection, explanation, and first-pass synthesis. Use extra care for edge cases, outliers, calculations that matter materially, and any conclusion that would drive a real decision. The model is excellent at quickly surfacing patterns you might miss in a large sheet, but it is not a substitute for understanding whether those patterns are meaningful in context.

How it works

  1. State what decision or question the data should inform.
  2. Tell ChatGPT which columns or dimensions matter most.
  3. Ask for findings, caveats, and suggested next checks rather than only a polished narrative.
  4. When the data is complex, break analysis into stages: first inspect the data quality, then explore patterns, then synthesize findings.
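
If you write prompts like this often, the steps above can be captured as a small template. The sketch below is one possible way to do that; the function name `build_analysis_prompt` and the example values are illustrative assumptions, not part of any ChatGPT API.

```python
def build_analysis_prompt(decision, key_columns, output_sections):
    """Assemble a prompt naming the decision, the key columns, and the output format."""
    lines = [
        f"Analyze the attached spreadsheet to inform this decision: {decision}",
        "",
        "Key columns: " + ", ".join(key_columns),
        "",
        "Output format:",
    ]
    # One bullet per requested section, so caveats cannot be silently dropped.
    lines.extend(f"- {section}" for section in output_sections)
    return "\n".join(lines)

prompt = build_analysis_prompt(
    decision="whether to redesign the onboarding flow",
    key_columns=["stage", "segment", "completed"],
    output_sections=[
        "top findings",
        "cautions or limitations",
        "3 next analytical checks",
    ],
)
print(prompt)
```

The template forces you to fill in the decision and the columns before you send anything, which is most of the value.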

What skilled users do differently

A novice uploads a spreadsheet and asks ChatGPT to "analyze it" or "find insights." The result feels productive because it reads well, but it is usually shallow. The findings are generic because the question was generic.

A skilled user approaches the same spreadsheet with a question already formed. They might say: "I need to know whether our enterprise customers are churning faster than SMB customers in the last two quarters, and if so, at which stage." They name the columns that matter. They ask for the analysis to include caveats, such as whether the sample size is large enough to draw conclusions, or whether a few outliers might be skewing the average.
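
The sample-size caveat is easy to check yourself. The following sketch computes churn rate per segment and flags segments that are too small to trust. The records, the `churn_by_segment` helper, and the threshold of 30 are all assumptions for illustration; pick a threshold that fits your own data.

```python
from collections import defaultdict

# Hypothetical records: (segment, churned) pairs.
customers = [
    ("enterprise", True), ("enterprise", False), ("enterprise", False),
    ("smb", True), ("smb", True), ("smb", False), ("smb", False),
]

MIN_SAMPLE = 30  # assumed cutoff below which a rate is treated as unreliable

def churn_by_segment(records, min_sample=MIN_SAMPLE):
    """Return churn rate, sample size, and a small-sample flag per segment."""
    counts = defaultdict(lambda: {"total": 0, "churned": 0})
    for segment, churned in records:
        counts[segment]["total"] += 1
        counts[segment]["churned"] += int(churned)
    return {
        seg: {
            "churn_rate": c["churned"] / c["total"],
            "n": c["total"],
            "small_sample": c["total"] < min_sample,
        }
        for seg, c in counts.items()
    }

result = churn_by_segment(customers)
for seg, stats in result.items():
    print(seg, stats)
```

With this toy data both segments are flagged as small samples, which is exactly the kind of caveat a polished narrative tends to omit.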

Skilled users also ask for verification steps. They include instructions like "show me the top five rows driving the highest-impact finding" or "flag any data quality issues you notice." This turns the analysis into a collaborative process rather than a one-shot answer. They know that the most dangerous spreadsheet output is a confident narrative built on data they never checked.
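
A request like "show me the top five rows driving the finding" corresponds to a simple ranking. This sketch shows one way to do it, with each row's share of the total attached. The column names and figures are invented for illustration, and `top_contributors` is a hypothetical helper.

```python
# Hypothetical rows; "lost_revenue" is an assumed column name.
rows = [
    {"account": "A", "lost_revenue": 500},
    {"account": "B", "lost_revenue": 40},
    {"account": "C", "lost_revenue": 30},
    {"account": "D", "lost_revenue": 20},
    {"account": "E", "lost_revenue": 10},
    {"account": "F", "lost_revenue": 400},
]

def top_contributors(rows, value_key, n=5):
    """Return the n largest rows by value_key, each with its share of the total."""
    total = sum(r[value_key] for r in rows)
    ranked = sorted(rows, key=lambda r: r[value_key], reverse=True)[:n]
    return [
        {**r, "share": r[value_key] / total if total else 0.0}
        for r in ranked
    ]

top = top_contributors(rows, "lost_revenue", n=2)
for r in top:
    print(r["account"], r["lost_revenue"], f"{r['share']:.0%}")
```

If two rows account for most of a total, the "finding" may really be a story about those two rows, and they are the ones to inspect by hand.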

Three worked examples

Example 1: a vague request

Analyze this spreadsheet.

This prompt is weak for the same reason "summarize this document" is weak. It gives the model no direction. ChatGPT will describe what it sees: column names, row counts, a few averages, maybe a trend. But without knowing what question you are trying to answer, the analysis has no focus. You end up with a tour of the spreadsheet rather than an answer to a question.

Example 2: a decision-oriented analysis

Analyze the attached spreadsheet for onboarding performance.

Focus on these questions:
1. Which stage has the biggest drop-off?
2. Are there obvious differences by customer segment?
3. What patterns or anomalies should I inspect manually?

Output format:
- top findings
- supporting observations from the data
- cautions or limitations
- 3 next analytical checks

This version is stronger because every part of the prompt serves the analysis. The questions give direction. The output format separates findings from caveats, which prevents the model from burying uncertainty inside a confident narrative. The request for "next analytical checks" turns the output into a starting point rather than a final answer.
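
The first question in that prompt, which stage has the biggest drop-off, reduces to a small calculation you can reproduce when checking ChatGPT's answer. The stage names and counts below are hypothetical, as is the `stage_dropoffs` helper.

```python
# Hypothetical funnel: number of users who reached each onboarding stage, in order.
funnel = [
    ("signup", 1000),
    ("profile", 800),
    ("first_project", 400),
    ("invite_team", 360),
]

def stage_dropoffs(funnel):
    """Return (transition, drop-off rate) for each pair of consecutive stages."""
    drops = []
    for (prev_name, prev_n), (next_name, next_n) in zip(funnel, funnel[1:]):
        rate = (prev_n - next_n) / prev_n if prev_n else 0.0
        drops.append((f"{prev_name} -> {next_name}", rate))
    return drops

drops = stage_dropoffs(funnel)
worst = max(drops, key=lambda d: d[1])
print(drops)
print("Biggest drop-off:", worst)
```

Note that the drop-off here is measured relative to the previous stage, not to the top of the funnel. Which definition you want is worth stating in the prompt.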

Example 3: a data quality audit

I am uploading a CSV of customer support tickets from the last 90 days.

Before any analysis, run a data quality check:
1. How many rows and columns? Are there duplicate rows?
2. What percentage of values are missing in each column?
3. Are there obvious data entry errors (e.g., negative response times, dates in the future)?

Then, if the data looks clean enough:
- What is the median resolution time by ticket category?
- Which category has the highest reopen rate?
- Are there any agents with significantly different resolution patterns?

Present findings as a table, followed by 2-3 observations and any data quality warnings.

This example demonstrates a two-stage approach: quality check first, analysis second. This is important because spreadsheet analysis built on dirty data produces confident-sounding nonsense. The staged approach catches problems before they corrupt the findings.
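
The quality-check stage can be sketched in a few lines, which also makes it something you can rerun on your own machine. This is an illustrative sketch using only the standard library; the ticket rows, the column names, and the `audit` function are assumptions, and real exports will need their own parsing rules.

```python
from datetime import date

# Hypothetical ticket rows; all column names are assumptions.
tickets = [
    {"id": "1", "category": "billing", "opened": "2024-01-05", "resolution_hours": "4"},
    {"id": "1", "category": "billing", "opened": "2024-01-05", "resolution_hours": "4"},
    {"id": "2", "category": "", "opened": "2024-01-06", "resolution_hours": "-2"},
    {"id": "3", "category": "login", "opened": "2999-01-01", "resolution_hours": "7"},
]

def audit(rows, today):
    """Count exact duplicate rows, missing values per column, and obvious entry errors."""
    seen, duplicates = set(), 0
    for r in rows:
        key = tuple(sorted(r.items()))
        if key in seen:
            duplicates += 1
        seen.add(key)
    columns = list(rows[0].keys()) if rows else []
    missing_pct = {
        c: 100 * sum(1 for r in rows if not r[c].strip()) / len(rows)
        for c in columns
    }
    negative = [
        r["id"] for r in rows
        if r["resolution_hours"].strip() and float(r["resolution_hours"]) < 0
    ]
    future = [
        r["id"] for r in rows
        if r["opened"].strip() and date.fromisoformat(r["opened"]) > today
    ]
    return {
        "rows": len(rows),
        "duplicate_rows": duplicates,
        "missing_pct": missing_pct,
        "negative_resolution_ids": negative,
        "future_date_ids": future,
    }

report = audit(tickets, today=date(2024, 4, 1))
print(report)
```

Only once a report like this looks acceptable does it make sense to move on to medians and reopen rates.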

Prompt block

Analyze this spreadsheet.

Better prompt

Analyze the attached spreadsheet for onboarding performance.

Focus on these questions:
1. Which stage has the biggest drop-off?
2. Are there obvious differences by customer segment?
3. What patterns or anomalies should I inspect manually?

Output format:
- top findings
- supporting observations from the data
- cautions or limitations
- 3 next analytical checks

Why this works

The better prompt ties the analysis to real questions and explicitly requests cautions and next checks. This works because it mirrors good analytical practice: start with a question, gather evidence, note limitations, and identify what you do not yet know. When you ask ChatGPT only for findings, the output feels complete even when it is not. When you also ask for cautions and next checks, you get a more honest picture of what the data actually supports.

The output format is doing real work here. By separating findings from caveats, you make it easy to forward the findings to a colleague while keeping the caveats for your own review. That is output design for the next step in your workflow.

Common mistakes
  • Asking for a generic analysis with no decision context. Without a question, the model narrates the spreadsheet rather than analyzing it.
  • Trusting polished patterns without checking whether the underlying question was well framed. A fluent explanation of a meaningless pattern is worse than no explanation at all.
  • Skipping manual review for anomalies or materially important figures. If a number would change a decision, verify it in the raw data.
  • Ignoring data quality before diving into analysis. Missing values, duplicates, and inconsistent formatting can produce findings that look real but are not. Always ask for a data quality check first on unfamiliar datasets.
  • Treating the model's analysis as final rather than as a first pass. ChatGPT is excellent at quickly surfacing patterns, but it does not know your business context. The best use is as an analytical partner, not an oracle.

Mini lab
  1. Upload a spreadsheet you understand reasonably well.
  2. First, ask ChatGPT to run a data quality check: row count, missing values, and any obvious errors.
  3. Then ask one descriptive question, one comparative question, and one anomaly question.
  4. Check one conclusion manually in the raw data.
  5. In one sentence, name the difference between what a generic "analyze this spreadsheet" prompt would have emphasized and what your targeted questions revealed.
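
For the manual check in step 4, one option is to recompute the figure from the raw values and compare it with what ChatGPT reported. The numbers and the `matches_raw` helper below are hypothetical; substitute the column and statistic you are actually verifying.

```python
import statistics

# Hypothetical raw values copied from the sheet, and the figure ChatGPT reported.
raw_resolution_hours = [4, 7, 3, 12, 5]
claimed_median = 5

def matches_raw(claimed, values, tolerance=0.01):
    """Recompute the median from raw values and compare it with the claimed figure."""
    actual = statistics.median(values)
    return abs(actual - claimed) <= tolerance, actual

ok, actual = matches_raw(claimed_median, raw_resolution_hours)
print("claim holds:", ok, "| recomputed median:", actual)
```
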

Key takeaway

ChatGPT is strongest with spreadsheets when it supports a clear analytical question rather than narrating the sheet back to you.