Prompt Engineering Basics

Few-Shot Prompting

Few-shot prompting provides the model with concrete examples of the task before asking it to handle a new case. These examples act as in-context demonstrations, showing the model precisely what input-output behavior you expect. This is one of the most reliable techniques for improving output quality, consistency, and format adherence.

Why Few-Shot Works

When you provide examples in the prompt, the model's attention mechanism identifies patterns — the relationship between inputs and outputs — and continues that pattern for the new input. This is called in-context learning (ICL). The model isn't retrained; it's pattern-matching within the context window. Larger, more capable models show dramatically better ICL performance.

Basic Few-Shot Structure

from langchain_openai import ChatOpenAI
from langchain_core.prompts import FewShotChatMessagePromptTemplate, ChatPromptTemplate

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# Define examples
examples = [
    {
        "input": "The model completely failed to understand my requirements.",
        "output": '{"sentiment": "Negative", "category": "product_quality", "urgency": "high"}'
    },
    {
        "input": "Shipping took 2 days, arrived in perfect condition!",
        "output": '{"sentiment": "Positive", "category": "delivery", "urgency": "low"}'
    },
    {
        "input": "Product works fine but the documentation is unclear.",
        "output": '{"sentiment": "Mixed", "category": "documentation", "urgency": "medium"}'
    },
]

# Create few-shot prompt template
example_prompt = ChatPromptTemplate.from_messages([
    ("human", "{input}"),
    ("ai", "{output}"),
])

few_shot_prompt = FewShotChatMessagePromptTemplate(
    example_prompt=example_prompt,
    examples=examples,
)

# Full prompt with system message + examples + actual query
full_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a customer feedback classifier. Classify each review as JSON with sentiment, category, and urgency."),
    few_shot_prompt,
    ("human", "{input}"),
])

chain = full_prompt | llm
result = chain.invoke({"input": "I've been waiting 2 weeks and the order hasn't arrived."})
print(result.content)
# {"sentiment": "Negative", "category": "delivery", "urgency": "high"}

Example Selection Strategies

Not all examples are equally effective. Research shows:

Diversity matters: Examples should cover the range of input types you expect. A sentiment classifier given only product-review examples will perform worse on service reviews.

Recency matters: Examples closer to the end of the prompt (just before the actual query) tend to have more influence than early examples.

Relevance matters: The most effective few-shot setups use examples similar to the current input. Selecting examples dynamically based on semantic similarity to the query (dynamic few-shot) consistently outperforms static example sets.

from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_core.example_selectors import SemanticSimilarityExampleSelector

# Build a vector store of examples for dynamic retrieval
example_selector = SemanticSimilarityExampleSelector.from_examples(
    examples=examples,
    embeddings=OpenAIEmbeddings(),
    vectorstore_cls=FAISS,
    k=2,  # Retrieve the 2 most similar examples for each query
    input_keys=["input"],  # Embed only the input text, not the expected output
)

# The selector picks the most relevant 2 examples for each new input
dynamic_few_shot_prompt = FewShotChatMessagePromptTemplate(
    example_prompt=example_prompt,
    example_selector=example_selector,
    input_variables=["input"],  # The variable whose value is passed to the selector
)
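Under the hood, the selector embeds each stored example, embeds the incoming query, and returns the k nearest neighbors. Here is a minimal sketch of that ranking step using plain cosine similarity; the hand-written 2-d vectors stand in for real embedding-model output:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def select_examples(query_vec, example_store, k=2):
    # Rank stored example embeddings by similarity to the query embedding
    ranked = sorted(example_store, key=lambda ex: cosine(query_vec, ex["vec"]), reverse=True)
    return ranked[:k]

# Toy "embeddings": delivery-themed vs product-quality-themed examples
store = [
    {"input": "Shipping took 2 days...", "vec": [1.0, 0.1]},
    {"input": "The model failed to understand...", "vec": [0.1, 1.0]},
    {"input": "Arrived late and damaged.", "vec": [0.9, 0.2]},
]
picked = select_examples([1.0, 0.0], store, k=2)
# Both delivery-themed examples outrank the product-quality one
```

A real vector store does the same ranking with approximate nearest-neighbor search, which keeps selection fast even with thousands of candidate examples.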

Common Few-Shot Patterns

Format demonstration: The most common use — showing the exact JSON/CSV/markdown structure you need:

Example 1:
Input: "John Smith, 35, Software Engineer, San Francisco"
Output: {"name": "John Smith", "age": 35, "role": "Software Engineer", "city": "San Francisco"}

Example 2:
Input: "Maria Garcia, 28, Data Scientist, Austin"  
Output: {"name": "Maria Garcia", "age": 28, "role": "Data Scientist", "city": "Austin"}

Now extract from: "Alex Chen, 42, Product Manager, Seattle"
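A prompt like the one above can be assembled programmatically from a list of input-output pairs. The helper below is a hypothetical sketch, not a library API; json.dumps guarantees the demonstrated outputs are valid JSON:

```python
import json

examples = [
    ("John Smith, 35, Software Engineer, San Francisco",
     {"name": "John Smith", "age": 35, "role": "Software Engineer", "city": "San Francisco"}),
    ("Maria Garcia, 28, Data Scientist, Austin",
     {"name": "Maria Garcia", "age": 28, "role": "Data Scientist", "city": "Austin"}),
]

def build_extraction_prompt(query: str) -> str:
    # Render each example in the Input/Output format shown above,
    # then append the actual query as the final block
    blocks = []
    for i, (inp, out) in enumerate(examples, 1):
        blocks.append(f'Example {i}:\nInput: "{inp}"\nOutput: {json.dumps(out)}')
    blocks.append(f'Now extract from: "{query}"')
    return "\n\n".join(blocks)

prompt = build_extraction_prompt("Alex Chen, 42, Product Manager, Seattle")
```

Because every demonstration ends in strict JSON, the model's reply can usually be parsed directly with json.loads.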

Reasoning style demonstration: Show the model how to think, not just what to output:

Problem: A store sells apples for $0.50 each. If you buy 12 apples, how much do you pay?
Reasoning: Price per apple = $0.50. Total = 12 × $0.50 = $6.00
Answer: $6.00

Problem: The same store sells oranges for $0.75 each. How much for 8 oranges?
Reasoning: Price per orange = $0.75. Total = 8 × $0.75 = $6.00
Answer: $6.00
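Because each demonstration ends with a fixed "Answer:" line, the final value is easy to extract from the model's reply. A hypothetical parsing helper, assuming dollar-amount answers in the format shown above:

```python
import re

def parse_final_answer(response: str):
    # Pull the dollar amount from the trailing "Answer:" line;
    # returns None if the model omitted it
    match = re.search(r"Answer:\s*\$([0-9]+(?:\.[0-9]+)?)", response)
    return float(match.group(1)) if match else None

reply = "Reasoning: Price per orange = $0.75. Total = 8 x $0.75 = $6.00\nAnswer: $6.00"
value = parse_final_answer(reply)  # 6.0
```

Anchoring the parse to the "Answer:" marker means intermediate dollar amounts in the reasoning text don't get picked up by mistake.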

Optimal Number of Examples

More examples aren't always better — they consume tokens and can introduce noise if examples conflict:

Task Type               Recommended Examples
Simple classification   3-5
Complex formatting      3-8
Reasoning tasks         5-10
Novel domain            8-15

Always test with different example counts on a representative evaluation set. The optimal number varies significantly by task and model.
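One way to run that test is to sweep candidate counts over a held-out eval set and compare accuracy. This is a sketch, not a benchmark harness; run_model is a stand-in for your actual few-shot chain, and fake_model below exists only to make the sweep runnable:

```python
def sweep_example_counts(example_pool, eval_set, run_model, counts=(3, 5, 8)):
    # For each candidate count k, build a k-shot prompt and score it on the eval set
    scores = {}
    for k in counts:
        shots = example_pool[:k]
        correct = sum(
            run_model(shots, query) == expected
            for query, expected in eval_set
        )
        scores[k] = correct / len(eval_set)
    return scores

# Stand-in model: pretends accuracy jumps once 5+ examples are provided
def fake_model(shots, query):
    return "Negative" if len(shots) >= 5 else "Positive"

eval_set = [("late order", "Negative"), ("missing item", "Negative")]
scores = sweep_example_counts(list(range(10)), eval_set, fake_model)
# {3: 0.0, 5: 1.0, 8: 1.0}
```

Swapping fake_model for a function that formats the shots into a prompt and calls your real chain turns this into a genuine evaluation loop.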