Katie Academy

Fixing Bad Prompts

Intermediate · 19 minutes · Lesson 5 of 5


Learning objectives

  • Diagnose the most common prompt failure modes
  • Apply targeted fixes instead of random rewrites
  • Build a reliable prompt-debugging checklist you can reuse

Most bad prompts do not fail randomly. They fail in familiar ways.

The result is too generic. The tone is wrong. The scope is sloppy. The answer has structure problems. The evidence standard is too weak. Or the prompt looks bad only because the real issue is that the task belongs in a different workflow entirely.

Once you can name the failure mode, prompt repair becomes much calmer. You stop adding random detail and start applying the smallest useful fix.

[Figure: a decision tree with branches for generic output, wrong tone, wrong scope, weak structure, weak evidence, and wrong workflow.]

What you'll learn
  • How to identify common prompt failure modes quickly
  • What targeted repair looks like for each one
  • When the answer problem is really a workflow problem instead of a prompt problem
Why this matters

Prompt debugging is more useful than prompt inspiration. Inspiration gives you examples. Debugging gives you control.

That matters because most real work does not begin from a blank page. It begins from a prompt that almost worked. If you can repair nearly-good prompts systematically, your workflow becomes faster and less frustrating. You also become a better teacher of prompting because you can explain why a prompt failed instead of just replacing it.

This is also where prompting becomes more like engineering and less like folklore. You learn to test assumptions, isolate variables, and fix the real problem instead of decorating the symptoms.

The core idea

Most prompt failures fit a small set of categories.

Generic output means the task or context is under-specified.

Wrong tone often means the audience, role, or style boundaries are unclear.

Wrong scope can mean too much breadth, too little priority, or conflicting instructions.

Weak structure means the output shape was never designed.

Weak evidence means the prompt did not specify sources, uncertainty handling, or the right workflow for current information.

And sometimes the prompt looks bad because the job itself is in the wrong place. A plain chat prompt cannot reliably solve a task that actually needs Search, Deep Research, or the exact contents of a file.

The job of debugging is to identify which category applies before you start repairing.

How it works

Read the bad result once without reacting emotionally. Your first task is diagnosis, not rewriting.

Then ask: what is the main failure? If you can only name one, that is usually enough to start. 'Too generic' is better than 'bad.' 'Wrong audience' is better than 'not what I meant.' 'Needs source-backed comparison' is better than 'this feels weak.'

Next, apply the smallest repair that directly addresses the named failure. If the answer is generic, add context. If it is too broad, constrain scope. If it is messy, define output structure. If it is unsourced and the task needs current evidence, change the workflow or demand a higher evidence standard.

Finally, retest. If the fix works, save the improved version. If it does not, ask whether you misdiagnosed the problem or whether the workflow itself is wrong.

This is why debugging works better than random expansion. It gives you a loop: diagnose, repair, retest, and only then escalate.
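The loop above can be sketched in code. This is purely an illustration of the habit, not a real API: `run`, `diagnose`, `repair`, and `is_good` are hypothetical stand-ins for the model call and your own judgment.

```python
# Sketch of the diagnose -> repair -> retest -> escalate loop.
# Every callable here is a stand-in supplied by you; none of these
# names refer to a real library.

FAILURE_MODES = {
    "generic", "wrong_tone", "wrong_scope",
    "weak_structure", "weak_evidence", "wrong_workflow",
}

def debug_prompt(prompt, run, diagnose, repair, is_good, max_rounds=3):
    """Return an improved prompt, or raise when the fix is not wording."""
    for _ in range(max_rounds):
        answer = run(prompt)
        if is_good(answer):
            return prompt                 # save the working version
        mode = diagnose(answer)           # name exactly ONE failure mode
        assert mode in FAILURE_MODES, "diagnose before repairing"
        if mode == "wrong_workflow":
            # Escalate: change the workflow, not the wording.
            raise ValueError("wrong workflow: move the task, not the words")
        prompt = repair(prompt, mode)     # smallest targeted fix
    raise ValueError("still weak after retests: likely misdiagnosed")
```

The point of the sketch is the control flow: one named diagnosis per round, one small repair, and an explicit exit when the problem is the workflow rather than the prompt.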

A practical failure taxonomy

Failure mode 1: generic output

The answer sounds plausible but could have been written for almost anyone.

Typical cause: missing situation, audience, or goal.

Typical repair: add the specific context that changes what a good answer would look like.

Failure mode 2: wrong tone or audience

The content may be correct, but the voice, level, or framing is wrong.

Typical cause: the prompt did not specify who the answer is for or what register is appropriate.

Typical repair: name the audience, tone, and one or two style boundaries.

Failure mode 3: wrong scope

The answer is too broad, too narrow, or tries to do too much at once.

Typical cause: the task was not prioritized or bounded.

Typical repair: define what to include, what to exclude, and what matters most.

Failure mode 4: weak structure

The answer may contain useful ideas but arrives in a form that is hard to use.

Typical cause: no output format was specified.

Typical repair: request a table, checklist, memo, agenda, comparison grid, or another format that fits the next step.

Failure mode 5: weak evidence

The answer sounds confident but should be source-backed, current, or uncertainty-aware.

Typical cause: the task needed a different evidence standard or a different workflow.

Typical repair: ask for official sources, citations, uncertainty handling, or move the task into Search, Deep Research, or file analysis.

Failure mode 6: wrong workflow

The prompt is not the main problem. The mode is.

Typical cause: trying to solve a file problem, current-facts problem, or continuity problem in a plain chat.

Typical repair: change the workflow, not just the wording.
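The six failure modes above fit in a small lookup table. This is illustrative data for a personal debugging notebook, not a tool the lesson ships; the mode names and repair phrasings are taken from the taxonomy above.

```python
# Failure-mode taxonomy as a lookup table: mode -> typical smallest repair.

REPAIRS = {
    "generic":        "add the context that changes what a good answer looks like",
    "wrong_tone":     "name the audience, tone, and one or two style boundaries",
    "wrong_scope":    "define what to include, what to exclude, and what matters most",
    "weak_structure": "request a concrete output format (table, checklist, memo)",
    "weak_evidence":  "ask for sources and uncertainty handling, or raise the evidence bar",
    "wrong_workflow": "change the workflow (Search, Deep Research, files), not the wording",
}

def suggest_repair(mode: str) -> str:
    """Map one named failure mode to its typical smallest repair."""
    return REPAIRS.get(mode, "diagnose first: name exactly one failure mode")
```

Keeping the table to one repair per mode enforces the lesson's rule: one diagnosis, one targeted fix, then retest.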

Two worked examples

Example 1: generic but salvageable

Bad prompt:

Tell me everything I should know about customer onboarding.

The problem is not random. It is too broad, too generic, and unconstrained. 'Everything' is not a real scope. 'Customer onboarding' is not a complete context.

Better repair:

Explain the first 30 days of customer onboarding for a B2B SaaS operations team.

Focus on:
- kickoff
- implementation milestones
- handoff risks
- signs the account is off track

Output:
1. a short overview
2. a checklist of the core tasks
3. 3 common mistakes to watch for

That is a targeted fix. It narrows the scope, clarifies the audience context, and adds usable structure.

Example 2: wrong workflow disguised as a bad prompt

Bad prompt:

Tell me which AI coding tools are best this month.

You can improve this prompt, but the deeper issue is that 'this month' is a current-state requirement. If you stay in a plain unsourced thread, even a cleaner prompt may still underperform because the workflow is wrong.

Better repair: switch to a source-backed workflow, request official sources where possible, and ask for comparison criteria plus uncertainty notes.

This example matters because many users over-invest in prompt repair when the real fix is workflow selection.

A debugging checklist you can reuse

When a prompt fails, ask these questions in order:

  1. Did I define the job clearly?
  2. Did I provide the context that actually changes the answer?
  3. Did I set useful constraints?
  4. Did I choose a usable output format?
  5. Does this task need sources, files, or a different workflow?
  6. Am I trying to fix the wrong problem?

This list is intentionally short. Good debugging tools are usually compact.

Prompt block

Improve this prompt for me: 'Tell me everything I should know about customer onboarding.'

Better prompt

Debug this prompt step by step.

Original prompt:
'Tell me everything I should know about customer onboarding.'

Tasks:
1. Identify the exact failure modes in the prompt
2. Explain why each one matters
3. Rewrite the prompt so it is useful for a SaaS operations manager preparing a 30-minute onboarding review
4. Keep the revised prompt under 140 words
5. Tell me whether this is still a plain-chat task or whether another workflow would be better

Why this works

The weak version asks for repair only. The stronger version asks for diagnosis before repair.

That changes the lesson from a one-off fix to a reusable habit. It also adds the workflow question at the end, which is crucial. Many prompt repairs fail because they never ask whether the task belongs in the current mode.

A good debugging prompt teaches you to think like an operator. It turns a bad result into information.

Common mistakes
  • Adding more words without identifying the actual problem
  • Treating every weak result as if it were a wording issue
  • Trying to fix a current-facts task with plain-chat phrasing alone
  • Combining too many repairs at once and losing track of what helped
  • Never saving repaired prompts, which means you keep relearning the same lesson
Mini lab
  1. Choose a prompt from your recent work that disappointed you.
  2. Label the main failure mode in one line.
  3. Apply exactly one targeted repair.
  4. Retest the prompt.
  5. If the result is still weak, ask whether the workflow is wrong.
  6. Save the repaired prompt and write one note on what the original version failed to specify.

This lab works best on real work, not invented examples. Real failures teach faster.

Key takeaway

Bad prompts become fixable once you identify the failure mode. Prompt repair is clearer, faster, and more teachable than prompt guesswork.