What is Few-Shot Prompting?

From AISApedia, the AI skills & terms encyclopedia

Few-shot prompting is a technique where two to five examples of desired input-output pairs are included in the prompt before the actual task, enabling the model to infer the expected pattern, format, tone, and quality level from the examples rather than from explicit instructions alone. The technique exploits transformer models' strength at pattern matching, often producing more consistent results than detailed written specifications.

Why do examples often outperform detailed written instructions?

Written instructions describe what you want in abstract terms: "use a professional but approachable tone," "keep responses concise," "include relevant technical details." Each of these instructions is ambiguous — the model must interpret what "professional but approachable" means in your specific context. Different models, and even different runs of the same model, may interpret these descriptions differently, producing inconsistent outputs.

Examples eliminate this ambiguity by showing rather than telling. When the model sees three examples of your preferred output, it can extract the concrete patterns: typical sentence length, vocabulary level, the specific balance between technical detail and accessibility, formatting conventions, and structural choices. These patterns are often too subtle or too numerous to describe exhaustively in written instructions but are immediately apparent from well-chosen examples.

The technique is particularly powerful for tasks where the desired output has a distinctive style, format, or quality bar that is difficult to specify in words — executive summaries for a specific audience, data transformations with particular conventions, bug reports that match a team's format — see this prompt teardown for an example of expert-level prompting in action, or content that matches a brand voice. The examples encode tacit knowledge that explicit instructions cannot fully capture, because the knowledge was never fully articulated — it was learned through practice, not through specification.

How do you choose examples that teach the right patterns?

The most effective examples are diverse enough to cover the range of inputs the model will encounter but consistent enough to establish a stable output pattern. Three examples that all handle the same type of input teach the model one narrow pattern; three examples that each handle a different input type teach the model how to adapt the pattern across variations while maintaining consistency on the dimensions that matter (format, tone, depth, structure).

Example quality matters more than example quantity. Two excellent examples that represent your gold standard will produce better results than five mediocre examples that include internal inconsistencies. If the examples contradict each other — one uses formal tone and another uses casual tone, one includes detailed citations and another omits them — the model will inconsistently switch between patterns in its output. Curate examples that are internally consistent on every dimension you care about: tone, format, depth, structure, and vocabulary.

Include edge cases if they are relevant to the task. If you want the model to handle missing data gracefully, include one example where the input has gaps — using negative constraints to specify what NOT to do in those cases and the output handles them according to your preferred convention (flagging the gap, using a default, noting the limitation). Without this example, the model may handle missing data in an unpredictable way — inventing data to fill the gap, ignoring the field entirely, or halting with an unhelpful error.

How does few-shot prompting combine with other techniques?

Few-shot prompting is often most powerful in combination with other prompting techniques rather than in isolation. Combining it with <a href="/aisapedia/chain-of-thought-prompting">chain-of-thought prompting</a> — where the examples include visible step-by-step reasoning, not just final answers — teaches the model both the reasoning process and the output format simultaneously. This combination is particularly effective for analytical, mathematical, and diagnostic tasks where both the reasoning quality and the output presentation matter.

Combining few-shot with <a href="/aisapedia/role-prompting">role prompting</a> (setting a persona or expertise level before the examples) primes the model to interpret the examples through a specific professional lens. Examples of financial analysis following a role declaration as a "senior financial analyst specialising in SaaS metrics" will be processed differently than the same examples without the role context — the role constrains which patterns in the examples the model treats as essential versus incidental.

The main trade-off is context window consumption. Each example uses tokens that could otherwise be spent on the actual task input, additional instructions, or reference material. For simple, well-defined tasks, one or two examples may suffice to establish the pattern. For complex tasks with many degrees of freedom in the output — where format, tone, depth, structure, and analytical approach all matter — three to five examples are typically needed. Beyond five, the returns diminish and the token cost becomes significant.

What mistakes reduce the effectiveness of few-shot prompting?

The most common mistake is using examples that are too similar to each other. Three examples of summarising the same type of document in the same way teach the model one rigid pattern that it may not generalise to slightly different inputs. Varying the input type across examples — a long document, a short document, a document with missing sections — teaches the model to adapt the pattern rather than memorise a template.

Another frequent mistake is including examples with errors, hoping the model will learn from the contrast. In practice, models learn from examples by reproducing their patterns — they do not reliably distinguish between "this is a good example to follow" and "this is a bad example to avoid" unless explicitly told. If you include a bad example, clearly label it: "Here is an example of what NOT to do, followed by the corrected version." Without this labelling, the model may reproduce the error.

Finally, placing examples too far from the task in the prompt can reduce their influence. Due to attention distribution patterns, examples placed immediately before the task have more influence on the output than examples buried in the middle of a long system prompt. Position your examples as close to the actual task instruction as the prompt structure allows.

How should teams build and maintain libraries of few-shot examples?

For recurring tasks, curate a library of vetted examples as part of your prompt library that team members can draw from when constructing prompts. The library should be organised by task type and tagged with the dimensions each example demonstrates (format, tone, edge case handling, complexity level). When a team member needs to build a few-shot prompt for a specific task, they select examples from the library that cover the relevant dimensions rather than sourcing examples ad hoc each time.

Maintain the library actively. As team standards evolve — new formatting conventions, updated quality bars, changed terminology — the examples must evolve with them. An example that represented the gold standard six months ago may now demonstrate an outdated pattern that the team has moved away from. Review the example library on the same cadence as you review your <a href="/aisapedia/domain-prompt-templates">domain prompt templates</a>, and retire or update examples that no longer reflect current standards. A stale example library teaches the model yesterday's patterns and produces outputs that feel subtly wrong to team members working under today's conventions.

Try this yourself

Take a formatting task you do regularly (email responses, data transformations, report sections) and give Claude or ChatGPT 3 examples of your preferred style before asking it to create new ones. Compare against results without examples.

Real-world example

A consultant needed consistent executive summaries. Instructions alone produced varying formats. But after showing 3 examples — 'Situation: X, Complication: Y, Resolution: Z' — every AI-generated summary matched her firm's house style perfectly. The examples encoded nuances no instruction could capture.