Katie Academy

Build One Custom GPT

Advanced · 18 minutes · Lesson 3 of 6


Learning objectives

  • Choose one repeated job suitable for a custom GPT
  • Write a focused behavior brief
  • Keep the GPT narrow enough to stay reliable

Your custom GPT should not try to do the whole operating system. It should solve one repeated job inside it.

That is what keeps the GPT useful instead of bloated. A GPT that does one thing well gets used. A GPT that tries to do everything gets abandoned.

[Diagram: the system as a whole, with one narrow job extracted into a GPT]

What you'll learn
  • Which part of your workflow is worth turning into a GPT
  • How to keep the GPT focused
  • What a minimal first version should contain
Why this matters

Many users build one "everything GPT" and then wonder why it behaves inconsistently.

The better pattern is to choose one narrow, repeated job and build the GPT around that alone. Focus is a major source of reliability. A GPT with clear boundaries is easier to test, easier to trust, and easier to improve over time.

This also matters because custom GPTs are shared surfaces. If you build one for a team, clarity of purpose becomes even more important. A teammate who opens a GPT should understand within seconds what it does and what it does not do.

There is a subtler cost to building broad GPTs that many people miss: debugging becomes nearly impossible. When a focused GPT produces a bad result, you know which instruction to adjust because the GPT only has a few instructions. When a broad GPT produces a bad result, you have to figure out which of the many instructions interacted poorly with the input, and that diagnostic work often takes longer than rewriting the GPT from scratch.

There is also a user-experience consideration. When someone opens a GPT, they form an immediate expectation about what it does. A focused GPT with a clear name and purpose meets that expectation consistently. A broad GPT with an ambiguous purpose forces the user to experiment and guess. That guessing phase produces bad prompts, which produce bad output, which leads the user to conclude the GPT does not work. The GPT may be perfectly capable, but if the user cannot figure out what to ask, capability is irrelevant.

Choosing between a GPT and a skill

The decision between building a custom GPT and creating a ChatGPT Skill is worth making deliberately. A custom GPT is better when the repeated job benefits from a persistent persona, a distinct conversation context, or uploaded reference files. A skill is better when the repeated job is primarily about applying consistent instructions to varied inputs within your normal ChatGPT workflow. If you are unsure, start with the custom GPT. It is easier to convert a working GPT into a skill later than to debug a skill that should have been a GPT from the start.

The core idea

A capstone GPT should solve a repeated step, not represent your whole identity.

Examples include briefing, critique, lesson planning, source triage, onboarding explanation, or weekly summary drafting. The more clearly you can define the job and its boundaries, the better the GPT usually behaves.

The key question is not "what can this GPT do?" but "what should this GPT refuse to do?" Boundaries are what keep a GPT reliable. Without them, users push the GPT into territory it was not designed for, and the quality drops. A GPT that says "I only do competitive analysis briefs" is more trustworthy than one that says "I can help with anything marketing-related."

Think of the GPT as a specialist, not a generalist. Specialists earn trust because their scope is clear. Generalists create uncertainty because every interaction starts with the question "will this be good enough?"

There is a practical test for whether your GPT is narrow enough: can you describe its purpose in one sentence without using the word "and"? "This GPT turns meeting notes into structured action-item lists" passes. "This GPT helps with meeting notes and email drafting and project planning" does not. The "and" test catches scope creep before it undermines the GPT's reliability.
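The "and" test is mechanical enough to sketch in a few lines of code. This is an illustrative Python sketch, not part of any ChatGPT tooling; the function name and the sample purpose statements are hypothetical:

```python
def passes_and_test(purpose: str) -> bool:
    """Return True if a one-sentence purpose statement avoids the word
    'and' -- a rough signal that the scope covers a single job."""
    words = purpose.lower().replace(",", " ").split()
    return "and" not in words

# A focused purpose passes; a multi-job purpose fails.
print(passes_and_test("This GPT turns meeting notes into structured action-item lists"))
print(passes_and_test("This GPT helps with meeting notes and email drafting"))
```

The check is deliberately crude -- it would flag a harmless "and" inside a noun phrase -- but as a first-pass scope filter, crude is the point.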

Note that some use cases that previously required a custom GPT might now be better served by a ChatGPT Skill -- a workspace-level reusable workflow that applies automatically when relevant. Skills are lighter-weight and shared across the workspace rather than being a standalone assistant. If the repeated job is primarily about applying consistent instructions rather than maintaining a distinct persona, consider whether a skill is the better fit. See the Skills as Reusable Workflows lesson for details.

Use the GPT to reduce repeated setup on one narrow job. Avoid turning it into a generic assistant with too many competing goals.

How it works

  1. Pick one repeated job inside the use case. Look for the part you would gladly stop re-explaining every time. The best candidates are tasks where you find yourself pasting the same context or instructions into new conversations.
  2. Write a focused brief. Role, priorities, boundaries, and output style are enough to start. Keep the instructions under 500 words for the first version. You can always add detail later.
  3. Define what the GPT should refuse. Boundaries are as important as capabilities. A GPT without boundaries will drift into tasks it was not designed for.
  4. Test with real prompts. Use at least three: one typical request, one edge case, and one request that should be out of scope. If the GPT fails repeatedly, narrow the scope before you expand it.
  5. Iterate based on real usage. The first version is always a draft. Plan to revise the instructions after three to five real uses.
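Step 2's brief can be kept as structured data so the word-count ceiling is easy to check before each revision. This is a hypothetical sketch -- the `GPTBrief` class and the sample text are invented for illustration, and the 500-word limit comes from step 2 above:

```python
from dataclasses import dataclass


@dataclass
class GPTBrief:
    """A first-draft custom GPT brief: role, priorities, boundaries, output style."""
    role: str
    priorities: list[str]
    boundaries: list[str]   # what the GPT should refuse (step 3)
    output_style: str

    def word_count(self) -> int:
        parts = [self.role, self.output_style] + self.priorities + self.boundaries
        return sum(len(part.split()) for part in parts)

    def is_first_draft_sized(self, limit: int = 500) -> bool:
        # Step 2: keep the first version under ~500 words.
        return self.word_count() <= limit


brief = GPTBrief(
    role="You draft weekly competitive briefs from pasted notes.",
    priorities=["Summarize competitor moves first", "Flag open questions"],
    boundaries=["Decline requests unrelated to competitive analysis"],
    output_style="A headline, three bullets, and one suggested next step.",
)
print(brief.word_count(), brief.is_first_draft_sized())
```

If `is_first_draft_sized` ever fails, that is usually a scope signal rather than a writing problem: narrow the job before trimming the words.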

What skilled users do differently

Skilled users write their GPT instructions as if they are onboarding a careful new colleague. They specify not just what the GPT should do, but what it should avoid, what tone it should maintain, and what output structure the user expects.

They also test the GPT adversarially before sharing it. They ask edge-case questions, give it ambiguous inputs, and deliberately try to push it outside its boundaries. If the GPT handles those cases poorly, they tighten the instructions before calling it finished. This adversarial testing is not paranoia -- it is quality assurance. The edge cases you discover during testing are the same ones a real user will encounter during actual work.

Finally, skilled users treat the first version as a draft. They expect to revise the instructions after the first few real uses, because real usage always reveals gaps that hypothetical planning misses. They also keep the instruction brief short. A GPT with fifty lines of instructions is almost always a GPT that tried to do too many things. The best custom GPTs tend to have ten to twenty lines of instructions because their scope is narrow enough that ten to twenty lines are sufficient.

There is one more pattern worth noting: skilled users name their GPTs descriptively, not cleverly. "Weekly Competitive Brief Drafter" tells every user what the GPT does. "StrategyBot Pro" tells them nothing. Names are part of the interface, and a clear name reduces the chance that someone will misuse the GPT by sending it the wrong kind of request.

Two worked examples

Example 1: too broad

A teacher builds a GPT called "Teaching Assistant" with instructions covering lesson planning, grading rubrics, parent communication, student feedback, and curriculum mapping. The GPT produces mediocre results across all five areas because the instructions compete with each other and no single task gets enough guidance. When a prompt about grading arrives, the lesson-planning instructions create noise.

Example 2: well-scoped

The same teacher builds a GPT called "Exit Ticket Reviewer." It takes a set of student exit-ticket responses, identifies common misconceptions, and suggests one adjustment for the next lesson. The instructions are short, the output format is consistent, and the teacher uses it every afternoon. It does one thing and does it well.

Example 3: different domain

A recruiter builds a GPT called "Screening Note Drafter." After each phone screen, the recruiter pastes their raw notes and the GPT produces a structured screening summary: candidate strengths, concerns, and a recommendation for whether to advance. The GPT has clear boundaries: it does not write job descriptions, it does not schedule interviews, and it does not evaluate resumes. That focus keeps the output consistently useful.

Prompt block

Help me build a custom GPT for my workflow.

Better prompt block

Help me define one custom GPT inside my capstone workflow.

Workflow:
[describe it]

Please identify:
- one repeated job worth turning into a GPT
- the GPT's purpose
- the most important instruction priorities
- the boundaries it should respect
- 3 test prompts I can use immediately

Why this works

The better prompt forces specialization. That usually creates a stronger first GPT than trying to cover the entire workflow at once. By asking for boundaries and test prompts explicitly, the prompt also builds in the two elements most people skip: scope limits and quality checks. Those are exactly what separate a GPT that lasts from one that gets used once and forgotten.

The request for "3 test prompts I can use immediately" is particularly important. Test prompts turn the GPT from a design exercise into a testable product. If you cannot test it, you cannot improve it.

The explicit request for boundaries does similar work. Without stated boundaries, the GPT will cheerfully attempt any request, producing mediocre results for tasks it was never designed to handle. With boundaries, the GPT can decline gracefully, which is a far better outcome than producing low-quality work that the user might mistakenly trust.

Common mistakes
  • Trying to build a GPT for the whole workflow instead of one step
  • Adding too many capabilities too early
  • Skipping test prompts before deciding the GPT is done
  • Writing vague instructions that leave behavior to the model's defaults
  • Forgetting to define what the GPT should refuse to do
  • Naming the GPT with a clever brand name instead of a clear functional description
  • Never revising the instructions after the first version
Mini lab
  1. Review your capstone workflow and identify the single most repeated step. Choose the one where you re-explain the same context most often.
  2. Write a one-sentence purpose statement for a GPT that handles that step. Use the "and" test: if the sentence requires "and," the scope is too broad.
  3. Draft the instruction brief: role, priorities, boundaries, and output format. Keep it under 300 words. If you need more, the scope is probably too wide.
  4. Write three test prompts: one typical request, one edge case that tests the boundaries, and one out-of-scope request that the GPT should decline.
  5. Run the test prompts and revise the instructions based on what you observe. Note which instruction change had the most impact on output quality.
  6. Ask a colleague to use the GPT without explanation beyond its name and purpose statement. If they struggle to use it correctly, the instructions or the name needs work.
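Step 4's three test prompts are worth writing down before you run them, so each run checks the same cases. A minimal sketch of such a test set, using the "Exit Ticket Reviewer" example from earlier (all prompt text is illustrative):

```python
# Hypothetical test set for an "Exit Ticket Reviewer" GPT.
test_prompts = [
    {"prompt": "Here are 24 exit-ticket answers about fractions.",
     "kind": "typical", "expect": "in_scope"},
    {"prompt": "Only three students responded today.",
     "kind": "edge_case", "expect": "in_scope"},
    {"prompt": "Write an email to a parent about behavior.",
     "kind": "out_of_scope", "expect": "decline"},
]

# A test set is complete only when all three kinds are covered.
kinds = {case["kind"] for case in test_prompts}
print(kinds == {"typical", "edge_case", "out_of_scope"})
```

Keeping the expected behavior next to each prompt makes the revision loop in step 5 concrete: after a change to the instructions, rerun all three and confirm the out-of-scope case still gets declined.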

If any part feels vague after testing, narrow the use case before moving on. The strongest GPTs are the ones that do less but do it reliably.

Plan to revise the instructions after three real uses. The first version is always a draft, and real usage always reveals something that hypothetical design misses.

Key takeaway

One focused GPT is far more useful than one overly ambitious GPT. If your GPT's purpose cannot fit in one sentence without using "and," narrow the scope until it can. The best custom GPTs are boring in their scope and reliable in their output -- and reliability is what earns repeated use.