Prompt Chaining
Prompt chaining is the technique of breaking a complex task into a series of simpler prompts, where the output of each step becomes the input for the next. It's one of the most powerful patterns for tasks that exceed what a single prompt can reliably accomplish.
Why Chain Prompts?
Single prompts have fundamental limitations:
- Cognitive overload: Asking a model to research, analyze, draft, edit, and format in one step produces mediocre results at each sub-task
- Context contamination: Early parts of a long response can degrade the quality of later parts
- Error cascading: One mistake in the middle of a complex single-prompt task propagates to the end
- Evaluation difficulty: It's hard to debug which part of a complex single prompt is failing
Chaining solves these by isolating each concern into its own prompt with its own evaluation point.
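That isolation can be captured in a tiny generic runner. The sketch below is a hypothetical helper (not part of the pipeline code that follows): each step is just a function that takes the previous step's output, and every intermediate result is kept so each stage has its own evaluation point.

```python
from typing import Callable

def run_chain(initial: str, steps: list[Callable[[str], str]]) -> list[str]:
    """Run each step on the previous step's output, keeping every
    intermediate result so each stage can be inspected separately."""
    outputs = []
    current = initial
    for step in steps:
        current = step(current)
        outputs.append(current)
    return outputs

# Stub steps stand in for real LLM calls:
history = run_chain("topic", [lambda s: s + " -> outline",
                              lambda s: s + " -> draft"])
print(history)  # ['topic -> outline', 'topic -> outline -> draft']
```

In a real pipeline each lambda would be an LLM call; the point is that the chain's intermediate outputs stay visible for debugging.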
Basic Chain Pattern
from openai import OpenAI

client = OpenAI()

def llm(system: str, user: str, model: str = "gpt-4o-mini", temperature: float = 0) -> str:
    """Helper to make a single LLM call."""
    response = client.chat.completions.create(
        model=model,
        temperature=temperature,
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    )
    return response.choices[0].message.content
def research_and_write_pipeline(topic: str) -> dict[str, str]:
    """Three-step chain: outline → draft → edit."""
    # Step 1: Create an outline
    outline = llm(
        system="You are a content strategist. Create clear, logical content outlines.",
        user=f"Create a detailed outline for a 500-word blog post about: {topic}\nOutput as numbered sections with 2-3 bullet points each."
    )

    # Step 2: Write the draft using the outline
    draft = llm(
        system="You are an expert technical writer. Follow outlines precisely.",
        user=f"Write a 500-word blog post based on this outline:\n\n{outline}"
    )

    # Step 3: Edit and improve the draft
    final = llm(
        system="You are a senior editor. Improve clarity, remove redundancy, strengthen the opening and closing.",
        user=f"Edit and improve this draft. Maintain the 500-word target:\n\n{draft}"
    )

    return {"outline": outline, "draft": draft, "final": final}

result = research_and_write_pipeline("the benefits of AI agents in software development")
print(result["final"])
Conditional Chains (Branching Logic)
Chains don't have to be linear — you can branch based on intermediate outputs:
def intelligent_support_router(user_message: str) -> str:
    """Route support requests to specialized handlers."""
    # Step 1: Classify the intent
    classification = llm(
        system="""Classify the support message intent. Respond with ONLY one of:
BILLING, TECHNICAL, ACCOUNT, GENERAL""",
        user=user_message
    )
    intent = classification.strip().upper()

    # Step 2: Branch to specialized handler
    if intent == "BILLING":
        return llm(
            system="You are a billing specialist. Help with invoices, payments, and subscriptions. Be precise about amounts.",
            user=user_message
        )
    elif intent == "TECHNICAL":
        return llm(
            system="You are a senior technical support engineer. Provide step-by-step troubleshooting. Include diagnostic commands.",
            user=user_message
        )
    elif intent == "ACCOUNT":
        return llm(
            system="You are an account manager. Help with account settings, access, and user management.",
            user=user_message
        )
    else:
        return llm(
            system="You are a helpful general support agent.",
            user=user_message
        )
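In practice, classifiers sometimes return extra words ("Intent: BILLING") or unexpected labels, so it helps to normalize the classification before branching. A minimal sketch, where `ALLOWED_INTENTS` and the `GENERAL` fallback are assumptions mirroring the labels above:

```python
ALLOWED_INTENTS = {"BILLING", "TECHNICAL", "ACCOUNT", "GENERAL"}

def normalize_intent(raw: str, default: str = "GENERAL") -> str:
    """Map a raw model response to a known intent label.
    Accepts exact matches, scans for an embedded label so replies like
    'Intent: BILLING' still route correctly, and falls back to default."""
    cleaned = raw.strip().upper()
    if cleaned in ALLOWED_INTENTS:
        return cleaned
    for intent in ALLOWED_INTENTS:
        if intent in cleaned:
            return intent
    return default

print(normalize_intent("  billing "))         # BILLING
print(normalize_intent("Intent: TECHNICAL"))  # TECHNICAL
print(normalize_intent("no idea"))            # GENERAL
```

With this in place, the `intent = classification.strip().upper()` line would become `intent = normalize_intent(classification)`, and the `else` branch becomes a true safety net rather than the landing spot for every malformed reply.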
Validation Chains (Self-Checking)
Add a validation step to catch errors before returning to the user:
import json

def validated_extraction(text: str, schema_description: str) -> dict | None:
    """Extract data and validate it with a second LLM call."""
    # Step 1: Extract
    extracted = llm(
        system=f"Extract data as JSON. Schema: {schema_description}",
        user=text
    )

    # Step 2: Validate the extraction
    validation = llm(
        system="""Review this extracted JSON data and the original text.
Return JSON: {"valid": boolean, "issues": string[], "corrected": object | null}
If valid: {"valid": true, "issues": [], "corrected": null}
If invalid: {"valid": false, "issues": ["description of issue"], "corrected": {corrected JSON}}""",
        user=f"Original text:\n{text}\n\nExtracted data:\n{extracted}"
    )

    try:
        result = json.loads(validation)
        if result.get("valid"):
            return json.loads(extracted)
        if result.get("corrected"):
            return result["corrected"]
    except json.JSONDecodeError:
        pass
    return None
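One practical failure mode here: models often wrap JSON output in Markdown code fences, which makes `json.loads` raise before validation even starts. A defensive pre-processing sketch (the fence format assumed is the common triple-backtick wrapper, optionally tagged `json`):

```python
def strip_json_fences(raw: str) -> str:
    """Remove a surrounding Markdown code fence, if present,
    leaving bare JSON that json.loads can parse."""
    text = raw.strip()
    if text.startswith("```"):
        lines = text.splitlines()
        # Drop the opening fence line (e.g. ```json) ...
        lines = lines[1:]
        # ... and the closing fence line, if there is one
        if lines and lines[-1].strip() == "```":
            lines = lines[:-1]
        text = "\n".join(lines).strip()
    return text

print(strip_json_fences('```json\n{"valid": true}\n```'))  # {"valid": true}
```

Running both `extracted` and `validation` through this helper before `json.loads` makes the chain noticeably more robust; an alternative is requesting `response_format={"type": "json_object"}` from the API so fences never appear.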
When to Chain vs. When to Use a Single Prompt
| Use a Single Prompt | Use Chaining |
|---|---|
| Simple, well-defined task | Multi-step tasks with distinct phases |
| Output is final answer | Intermediate results need validation |
| Single domain expertise needed | Different expertise per step |
| Latency is critical | Quality matters more than speed |
| Cost is the primary concern | Reliability matters more than cost |
Chaining increases latency and cost proportionally to the number of steps. For production systems, benchmark both approaches to find the right tradeoff for your specific use case.
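As a back-of-envelope illustration of that tradeoff (the per-call latency and cost figures below are hypothetical, not measured):

```python
def chain_overhead(steps: int, latency_per_call_s: float,
                   cost_per_call_usd: float) -> tuple[float, float]:
    """Total latency and cost of a sequential n-step chain:
    both scale linearly with the number of calls."""
    return steps * latency_per_call_s, steps * cost_per_call_usd

# A 3-step chain at an assumed 1.5 s and $0.002 per call
# versus a single call at the same rates:
latency, cost = chain_overhead(3, 1.5, 0.002)
print(latency, cost)
```

The linear model ignores that chained prompts are usually shorter (and so individually cheaper and faster) than one monolithic prompt, which is exactly why measuring both variants on real traffic beats estimating.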