Prompt Engineering Basics

Zero-Shot Prompting

Zero-shot prompting means asking an LLM to perform a task without providing any examples of how to do it. The model relies entirely on its training to understand and complete the request. It's the simplest form of prompting — and when done well, remarkably powerful.

Why Zero-Shot Works

Modern LLMs are trained on vast datasets that contain implicit demonstrations of almost every conceivable task: blog posts, code reviews, scientific papers, customer service transcripts, legal documents. This exposure to diverse human-written text means the model has internalized patterns for tasks like "summarize," "classify," "translate," and "explain" — without needing explicit examples at inference time.

The quality of zero-shot performance correlates strongly with model scale and instruction tuning. Models specifically fine-tuned to follow instructions (like GPT-4, Claude 3, and Gemini) are dramatically better at zero-shot tasks than base pretrained models.

Basic Zero-Shot Structure

A good zero-shot prompt has three components:

  1. Role or context (optional but helpful): Set the model's perspective
  2. Task description: What exactly you want done
  3. Input: The data or question to act on

from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage, HumanMessage

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# Minimal zero-shot: just the task and input
simple = llm.invoke([
    HumanMessage(content="Classify the sentiment of this review as Positive, Negative, or Neutral: 'The delivery was fast but the product quality was disappointing.'")
])
print(simple.content)  # "Negative"

# Better zero-shot: role + task + format specification
structured = llm.invoke([
    SystemMessage(content="You are a sentiment analysis expert. Respond with a JSON object containing 'sentiment' (Positive/Negative/Neutral), 'confidence' (0.0-1.0), and 'reasoning' (one sentence)."),
    HumanMessage(content="Review: 'The delivery was fast but the product quality was disappointing.'")
])
print(structured.content)
# {"sentiment": "Negative", "confidence": 0.85, "reasoning": "Despite positive delivery speed, product quality dissatisfaction dominates."}
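If you plan to parse that JSON programmatically, parse defensively: even with a format specification, models sometimes wrap JSON in markdown code fences or add surrounding prose. A minimal sketch of one way to handle this (the helper name is ours, not part of LangChain):

```python
import json
import re

def parse_sentiment_json(raw: str) -> dict:
    """Extract a JSON object from a model response.

    Models occasionally wrap JSON in ```json fences, so strip
    fences before parsing rather than calling json.loads directly.
    """
    # Remove a leading ```json (or bare ```) fence and a trailing ``` fence
    cleaned = re.sub(r"^```(?:json)?\s*|\s*```$", "", raw.strip())
    return json.loads(cleaned)

# Works on both bare and fenced responses
raw = '```json\n{"sentiment": "Negative", "confidence": 0.85}\n```'
result = parse_sentiment_json(raw)
print(result["sentiment"])  # Negative
```

For production use, LangChain's structured-output features can enforce a schema directly, but a fallback parser like this is still useful when you work with raw string responses.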

When Zero-Shot Excels

Classification tasks: Sentiment analysis, topic categorization, intent detection. These map directly to patterns the model saw repeatedly during training.

Text transformation: Translation, summarization, reformatting, tone adjustment. The model understands these operations intuitively.

Information extraction: Pulling structured data (names, dates, quantities) from unstructured text.

Code generation for common patterns: Generating boilerplate, writing standard algorithms, explaining code. The model has seen millions of code examples.
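For information extraction in particular, the role + task + format structure pays off when you make the field names and fallback behavior explicit. A plain-Python sketch of a reusable prompt builder (the function name and example fields are our own, not from any library):

```python
def build_extraction_prompt(text: str, fields: list[str]) -> str:
    """Compose a zero-shot extraction prompt: task, format spec, then input."""
    field_list = ", ".join(fields)
    return (
        f"Extract the following fields from the text below: {field_list}.\n"
        "Respond with a JSON object using exactly those field names. "
        "Use null for any field not present in the text.\n\n"
        f"Text: {text}"
    )

prompt = build_extraction_prompt(
    "Invoice #4821 dated 2024-03-15, total $1,250.00, payable to Acme Corp.",
    ["invoice_number", "date", "total", "payee"],
)
print(prompt)
```

Spelling out "use null for any field not present" is a small addition that prevents the model from inventing values for missing fields.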

Common Zero-Shot Mistakes

Being too vague:

# Bad: too ambiguous
"Improve this text."

# Good: specific criteria
"Rewrite this text to be more concise. Target 50% fewer words while keeping all key information. Preserve the professional tone."

Missing output format specification:

# Bad: format undefined — inconsistent outputs
"Extract the key points from this document."

# Good: format specified
"Extract the 3-5 most important points from this document as a bulleted list. Each bullet should be a complete sentence under 20 words."

Not accounting for edge cases:

# Bad: no instruction for ambiguous cases
"Classify this email as spam or not-spam."

# Good: handles uncertainty
"Classify this email as 'spam', 'not-spam', or 'uncertain'. Use 'uncertain' if there are indicators of both."
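Even with an 'uncertain' escape hatch, zero-shot classifiers occasionally answer with extra words ("This is spam.") or unexpected casing, so it helps to normalize the response onto your allowed label set before acting on it. A minimal sketch (the helper and label set are our own):

```python
ALLOWED_LABELS = {"spam", "not-spam", "uncertain"}

def normalize_label(raw: str) -> str:
    """Map a model's free-text answer onto the allowed label set."""
    cleaned = raw.strip().strip(".").lower()
    if cleaned in ALLOWED_LABELS:
        return cleaned
    # Otherwise look for an allowed label inside the response,
    # checking longer labels first so "not-spam" wins over "spam"
    for label in sorted(ALLOWED_LABELS, key=len, reverse=True):
        if label in cleaned:
            return label
    return "uncertain"  # safest fallback for unparseable output

print(normalize_label("Spam."))             # spam
print(normalize_label("This is not-spam"))  # not-spam
print(normalize_label("no idea"))           # uncertain
```

Falling back to 'uncertain' rather than raising an error keeps the pipeline running and routes ambiguous cases to review, which matches the prompt's own handling of uncertainty.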

Zero-Shot vs. Few-Shot Decision Guide

| Situation | Use Zero-Shot When | Use Few-Shot When |
| --- | --- | --- |
| Task complexity | Simple, well-defined task | Complex or unusual task |
| Output format | Standard (text, JSON) | Custom or domain-specific format |
| Model familiarity | Common task type | Niche domain or unusual conventions |
| Token budget | Tight | Available for examples |

Zero-shot is always worth trying first — it's simpler and cheaper. Move to few-shot prompting (covered in the next lesson) when zero-shot results are inconsistent or miss the mark.