Few-Shot Prompting: Teaching AI by Example
Few-shot prompting means including a handful of worked examples — input paired with the exact output you want — directly in your prompt, so the model copies the pattern instead of guessing at it. It is the single most reliable way to lock in a format, a tone, or an edge-case rule, because a model pattern-matches on what you show far more faithfully than on what you describe.
Zero-shot, one-shot, few-shot
The "shots" are the examples. The number in front tells you how many you provided.
Zero-shot is the default: you state the task and the model answers from its training alone.
Classify the sentiment of this review as positive, negative, or neutral: "The screen is gorgeous but the battery barely lasts a morning."
That works for easy, common tasks; sentiment classification is one the model has seen countless times in training. But notice what is undefined: does a mixed review count as neutral, or does the negative half win? You did not say, so the model picks, and it may pick differently next time.
One-shot adds a single example. Often that is enough to fix format, less often enough to fix judgment.
Few-shot gives two to five examples. Now you are not just telling the model what to do — you are showing it the standard, the edge cases, and the precise shape of a correct answer. The examples do the teaching that adjectives cannot.
The model is a mimic before it is a reasoner. Show it three good answers and it will produce a fourth in the same mould. Describe a good answer in three sentences and it will produce its own interpretation of your description.
Why examples work when instructions don't
Instructions are abstract; examples are concrete. When you write "keep the tone professional but warm," the model has to translate six words into a thousand micro-decisions about word choice, sentence length, and punctuation, and its translation may not match yours. When you instead paste two messages written in exactly that tone, there is nothing to translate. The pattern is right there.
This matters most for things that are easy to recognise but hard to specify: house style, a particular JSON shape, how to handle the ambiguous case, the difference between "concise" and "terse." You know it when you see it. So does the model — once it has seen it.
The mechanism is the same one behind chain-of-thought prompting: you are reducing the model's guesswork by making the target explicit. Chain-of-thought makes the reasoning explicit; few-shot makes the output explicit. They stack well together.
Zero-shot vs few-shot, side by side
Here is the same extraction task done both ways. The goal: pull structured event data out of a messy sentence.
Zero-shot first:
Extract the event details from this text as JSON: "Reminder — the Q3 board sync moved to Thursday the 14th at 2pm, now in the Lisbon room not Berlin."
The model will return something JSON-ish, but you have not said what keys to use, what date format you want, or what to do with the cancelled location. You will get one of a dozen plausible shapes, and the next call may differ.
Now few-shot: two examples teach the schema, the day and time format, and how to drop noise.
Extract event details as JSON. Examples:
Input: "Lunch with Priya bumped to Monday 12:30 at the usual place." Output: {"event": "Lunch with Priya", "day": "Monday", "time": "12:30", "location": "the usual place"}
Input: "Sprint demo is Friday 10am in Studio B, bring laptops." Output: {"event": "Sprint demo", "day": "Friday", "time": "10:00", "location": "Studio B"}
Input: "Reminder — the Q3 board sync moved to Thursday the 14th at 2pm, now in the Lisbon room not Berlin." Output:
The second version is far less ambiguous. The keys are fixed, time is 24-hour, trailing notes like "bring laptops" are dropped because the second example's output dropped one, and "not Berlin" should be excluded because no example output carries a superseded detail. One gap remains: neither example shows how to handle a date such as "the 14th", so if that matters, add an example that covers it. You taught all of that without writing a single rule; the examples carried it.
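The few-shot prompt above can also be assembled from data rather than typed by hand, which keeps every example in the same format. This is a minimal sketch in Python, assuming the examples are stored as (input, output) pairs. `build_few_shot_prompt` is a hypothetical helper name, not part of any library, and no model is called. The query is the one from the text with its "Reminder" prefix trimmed.

```python
import json

# Worked examples from the text: (raw input, desired output) pairs.
EXAMPLES = [
    (
        "Lunch with Priya bumped to Monday 12:30 at the usual place.",
        {"event": "Lunch with Priya", "day": "Monday",
         "time": "12:30", "location": "the usual place"},
    ),
    (
        "Sprint demo is Friday 10am in Studio B, bring laptops.",
        {"event": "Sprint demo", "day": "Friday",
         "time": "10:00", "location": "Studio B"},
    ),
]


def build_few_shot_prompt(instruction, examples, new_input):
    """Hypothetical helper: render the instruction, the examples, and the open query."""
    lines = [instruction + " Examples:"]
    for text, output in examples:
        # json.dumps renders every example output with the same key order and quoting.
        lines.append(f'Input: "{text}" Output: {json.dumps(output)}')
    # End on a bare "Output:" so the model continues from exactly that point.
    lines.append(f'Input: "{new_input}" Output:')
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Extract event details as JSON.",
    EXAMPLES,
    "The Q3 board sync moved to Thursday the 14th at 2pm, "
    "now in the Lisbon room not Berlin.",
)
print(prompt)
```

Keeping the examples in a list means changing the behaviour is a matter of swapping one entry, and the scaffold around them cannot drift.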
A small set of input/output pairs
Examples are at their best for tasks with a consistent transform. Suppose you want product names rewritten into short, benefit-led ad headlines. Describing the style is hard. Showing it is easy:
Rewrite each product into a 6-word-max benefit headline.
Input: Noise-cancelling over-ear headphones Output: Silence everything but your music
Input: Standing desk converter Output: Stand up without buying a desk
Input: Reusable beeswax food wraps Output: Ditch plastic, keep food fresh
Input: Blue-light reading glasses Output:
Three examples pin down length, voice, the "lead with the benefit not the feature" rule, and the casual register, all things that would take a clumsy paragraph to specify and still come out wrong. The model finishes the last pair in the same key. This is the pattern to reach for whenever you have a repetitive transform and a clear sense of "right" you struggle to put into words.
How to choose good examples
This is where most few-shot prompts succeed or fail. The examples are your teaching material, so their quality is the ceiling on your output quality.
- Make them representative. Examples should look like the real inputs you expect, not idealised toy cases. If your real data is messy, show a messy example handled well.
- Cover the edge cases that matter. If the ambiguous case is "mixed sentiment," include a mixed review and label it the way you want. The model learns your rule from the example, not from a clause you forgot to write.
- Keep the format identical across every example. Same keys, same order, same casing, same delimiters. Any inconsistency between examples becomes a licence for the model to be inconsistent too.
- Use real, high-quality outputs. A sloppy example output teaches sloppiness. The model copies what you show, warts included — there is no "do as I say, not as I show."
- Balance the categories. For classification, if all your examples are positive, the model leans positive. Show each label at least once.
The fastest way to debug a misbehaving few-shot prompt is to read your own examples as if you were the model. Whatever inconsistency or bad habit is in them will be in the output.
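Reading your examples as the model would can be partly automated for JSON outputs. The sketch below is one possible check, not a standard tool. `lint_example_outputs` is a hypothetical name, and it assumes every example output is a JSON object whose keys should match the first example exactly, including order and casing.

```python
import json


def lint_example_outputs(outputs):
    """Hypothetical check: report outputs whose keys differ from the first example."""
    problems = []
    # json.loads preserves key order, so comparing lists catches reordering too.
    reference = list(json.loads(outputs[0]).keys())
    for index, raw in enumerate(outputs[1:], start=2):
        keys = list(json.loads(raw).keys())
        if keys != reference:
            problems.append(f"example {index}: keys {keys} != {reference}")
    return problems


outputs = [
    '{"event": "Lunch with Priya", "day": "Monday", "time": "12:30"}',
    '{"event": "Sprint demo", "Day": "Friday", "time": "10:00"}',  # "Day" drifted
]
for problem in lint_example_outputs(outputs):
    print(problem)
```

A check like this only covers structure. Whether the values themselves are good examples still takes a human read.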
How many shots
More is not automatically better. Each example costs tokens and attention, and past a point you get diminishing returns.
- One shot is often enough when the only thing you need to fix is format — the model already knows the task, it just needs to see the shape.
- Two to five shots is the sweet spot for most real work: enough to establish the pattern, show one edge case, and demonstrate consistency, without bloating the prompt.
- More than five rarely helps for simple transforms and can start to hurt — the model may overfit to surface features of your examples, or lose the thread in a long prompt. Reach for more only when the task has genuinely many distinct cases to cover.
If two well-chosen examples and three sloppy ones are on offer, pick the two. Consistency and quality beat quantity every time.
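One way to keep the shot count small without losing coverage is to pick one example per label first and only then fill any remaining slots. The following is a rough sketch under that assumption. `pick_shots` and the review pool are made up for illustration, and the default cap of five simply mirrors the range suggested above.

```python
def pick_shots(examples, max_shots=5):
    """Hypothetical selector: one example per label first, then fill up to max_shots."""
    chosen = []
    seen_labels = set()
    # First pass: take the first example of each label so every category appears.
    for example in examples:
        if example["label"] not in seen_labels:
            chosen.append(example)
            seen_labels.add(example["label"])
    # Second pass: fill any remaining slots in the original order.
    for example in examples:
        if len(chosen) >= max_shots:
            break
        if example not in chosen:
            chosen.append(example)
    # If there are more labels than slots, coverage is necessarily cut short.
    return chosen[:max_shots]


# Illustrative pool, invented for this sketch.
pool = [
    {"text": "Love it, works perfectly.", "label": "positive"},
    {"text": "Great value for the price.", "label": "positive"},
    {"text": "Broke after two days.", "label": "negative"},
    # Mixed review, labelled the way we want mixed cases handled (an assumption).
    {"text": "Gorgeous screen, terrible battery.", "label": "negative"},
    {"text": "It arrived on Tuesday.", "label": "neutral"},
]
shots = pick_shots(pool, max_shots=3)
print([shot["label"] for shot in shots])
```

The selector takes the first example it finds for each label, so put your best examples at the front of the pool.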
Format matching: the quiet superpower
The single most underrated thing examples do is teach format by demonstration. If your examples use a JSON object, the model returns JSON. If they end each answer with a one-line summary, so will the model. If your inputs are prefixed "Input:" and outputs "Output:", the model continues that scaffold rather than inventing its own.
This is why few-shot pairs so naturally with structured prompting. A clean structure plus clean examples is far more reliable than either alone — the structure says where things go and the examples say what they look like. If you want a framework for the surrounding prompt, the RCTCO structure gives you the skeleton; drop your examples into its "Output" or a dedicated examples block.
One practical tip: make the last example trail off exactly where you want the model to begin. Ending your prompt with "Output:" and nothing after it is a strong signal that the model should produce the next output and only that.
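Because the examples fix the output shape, the reply can be checked against that shape before anything downstream uses it. This is a minimal sketch, assuming the examples used the four keys from the event-extraction prompt. `parse_reply` is a hypothetical helper, and the reply string is a hand-written stand-in, not real model output.

```python
import json

EXPECTED_KEYS = ["event", "day", "time", "location"]


def parse_reply(reply, expected_keys):
    """Hypothetical validator: parse a reply and confirm it kept the example schema."""
    try:
        data = json.loads(reply.strip())
    except json.JSONDecodeError:
        return None  # Not valid JSON; the caller can retry or log it.
    if not isinstance(data, dict) or list(data.keys()) != list(expected_keys):
        return None  # Parsed, but the shape drifted from the examples.
    return data


# Stand-in reply written by hand for this sketch.
reply = ('{"event": "Q3 board sync", "day": "Thursday", '
         '"time": "14:00", "location": "Lisbon room"}')
print(parse_reply(reply, EXPECTED_KEYS))
print(parse_reply("Sure! Here is the JSON you asked for:", EXPECTED_KEYS))
```

Comparing key order as well as key names is deliberately strict. Relax it to a set comparison if order does not matter for your use.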
Common mistakes
A few failure modes show up again and again.
- Contradictory examples. Two examples that handle the same situation differently teach the model that the rule is "do whatever." If example one drops trailing notes and example two keeps them, your output is a coin flip. Audit your set for internal disagreement.
- Unrepresentative examples. Three clean, short inputs followed by a real input that is long and messy means the model has never seen its actual job. Match your examples to reality.
- Low-quality outputs. Examples with typos, inconsistent casing, or a mediocre answer drag the whole output down. The model has no way to know you meant better than you showed.
- Format drift between examples. "day" in one, "Day" in another, "weekday" in a third. Pick one and repeat it exactly.
- Label imbalance. All-positive examples bias a classifier positive. Cover every category.
- Too many examples for a trivial task. Eight shots to teach sentiment is wasted tokens and an invitation to overfit. Three, one per label, would do.
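Two of the mistakes above, label imbalance and contradictory examples, are mechanical enough to catch with a short script. The sketch below is illustrative only. `audit_examples` is a hypothetical name, the sample data is invented, and the contradiction check only catches identical inputs with different labels, not near-duplicates.

```python
from collections import Counter


def audit_examples(examples, labels):
    """Hypothetical audit: flag missing labels and inputs that carry two labels."""
    warnings = []
    counts = Counter(label for _, label in examples)
    for label in labels:
        if counts[label] == 0:
            warnings.append(f"no example for label '{label}'")
    first_seen = {}
    for text, label in examples:
        key = text.strip().lower()  # Normalise so trivial differences do not hide clashes.
        if key in first_seen and first_seen[key] != label:
            warnings.append(
                f"contradiction: '{text}' labelled both "
                f"'{first_seen[key]}' and '{label}'"
            )
        first_seen.setdefault(key, label)
    return warnings


# Invented sample set with one missing label and one clash.
examples = [
    ("Love it.", "positive"),
    ("Great value.", "positive"),
    ("Love it.", "neutral"),
]
for warning in audit_examples(examples, ["positive", "negative", "neutral"]):
    print(warning)
```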
When few-shot beats a long instruction
Sometimes you can feel yourself writing a paragraph of rules that keeps sprouting exceptions: "use sentence case, except for proper nouns, and keep it under eight words, but don't cut the verb, and lead with the benefit unless the feature is the benefit…" That is the signal to stop writing rules and start showing examples.
A short instruction plus three good examples will usually outperform a long instruction with zero, because:
- Examples encode the exceptions naturally — you just show the exception handled correctly.
- Examples are unambiguous where prose is interpretable.
- Examples are easier to maintain: to change the behaviour, swap an example rather than rewrite a tangle of clauses.
The flip side: if a task genuinely needs no demonstration — a one-off question, an open-ended brainstorm, anything where there is no fixed "shape" of a right answer — examples are overhead. Few-shot earns its keep on repeatable, format-sensitive, judgment-laden tasks. Use it there.
For the broader picture of where this technique sits among the others, the "how to prompt" reference and our guide "How to Prompt AI Correctly" both put few-shot in context alongside role, context, and constraints.
Putting it into practice
The best way to internalise few-shot is to build it visibly. In Studio you can lay out an examples block as its own node, see it sitting next to your role and constraints, and tell at a glance whether your examples are consistent — a missing or contradictory shot is obvious on a canvas in a way it never is buried in a paragraph. If you would rather drill the skill with feedback, Cortex has hands-on courses that grade you on choosing and formatting examples, not just reading about them.
And when you want examples that already work, borrow them. The Library has 2,750+ forkable prompts — many of them few-shot — that you can open, study, and adapt. The fastest way to learn what a good example set looks like is to take apart one that already produces great output. Open the Library, find a prompt close to your task, and steal its examples shamelessly.