Prompt Design Principles

Controlling Output Format

Controlling the format of LLM output is essential for building systems where the AI's response needs to be processed programmatically. Unstructured text is hard to parse reliably; structured output (JSON, Markdown, specific templates) makes your application robust. This lesson covers the full toolkit for format control.

Why Format Matters

Consider a system that extracts key information from job postings. If the model sometimes returns JSON, sometimes a bulleted list, and sometimes a prose paragraph — your parsing code needs to handle all three cases. Consistent format makes downstream processing deterministic.

Format instructions serve two purposes:

  1. Make output machine-parseable
  2. Ensure consistency across requests (same input structure → same output structure)

Technique 1: Explicit Format Instructions

The most straightforward approach — tell the model exactly what format to produce:

from openai import OpenAI
import json

client = OpenAI()

def extract_job_info(job_posting: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": """Extract job information and return ONLY valid JSON with no markdown formatting or explanation.
                
Required JSON structure:
{
  "title": string,
  "company": string,
  "location": string | null,
  "remote": boolean,
  "salary_min": number | null,
  "salary_max": number | null,
  "required_skills": string[],
  "experience_years_min": number | null,
  "seniority": "junior" | "mid" | "senior" | "lead" | "unknown"
}"""
            },
            {"role": "user", "content": job_posting}
        ],
        temperature=0,
    )
    
    raw = response.choices[0].message.content.strip()
    # Strip markdown code fences if the model added them despite instructions
    if raw.startswith("```"):
        raw = raw.split("```")[1]
        if raw.startswith("json"):
            raw = raw[4:]
        raw = raw.strip()
    
    return json.loads(raw)

Technique 2: JSON Mode (Guaranteed Valid JSON)

OpenAI's response_format={"type": "json_object"} guarantees syntactically valid JSON output:

response = client.chat.completions.create(
    model="gpt-4o-mini",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "You extract structured data. Always respond with valid JSON."},
        {"role": "user", "content": f"Extract the key metrics from: {report_text}"}
    ]
)
# response.choices[0].message.content is guaranteed to be parseable JSON
data = json.loads(response.choices[0].message.content)

Important: JSON mode guarantees syntactic validity but not schema compliance. The model may return a JSON object that doesn't match your expected schema. Always validate after parsing.
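A minimal post-parse check can catch that gap before bad data propagates. A sketch using only the standard library (the metric names here are hypothetical placeholders for whatever your schema actually expects):

```python
import json

# Hypothetical expected schema: key name -> allowed types
REQUIRED_KEYS = {"revenue": (int, float), "growth_pct": (int, float)}

def validate_metrics(raw: str) -> dict:
    """Parse JSON-mode output, then verify the keys and types we expect."""
    data = json.loads(raw)  # JSON mode guarantees this parse succeeds
    for key, expected_type in REQUIRED_KEYS.items():
        if key not in data:
            raise ValueError(f"missing key: {key}")
        if not isinstance(data[key], expected_type):
            raise ValueError(f"wrong type for {key}: {type(data[key]).__name__}")
    return data
```

For anything beyond a handful of keys, a real schema validator (such as Pydantic, shown next) is less error-prone than hand-rolled checks.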

Technique 3: Structured Outputs with Pydantic (Strictest)

OpenAI's structured outputs (using client.beta.chat.completions.parse) guarantees schema compliance:

from pydantic import BaseModel, Field
from typing import Literal, Optional
from enum import Enum

class Seniority(str, Enum):
    JUNIOR = "junior"
    MID = "mid"
    SENIOR = "senior"
    LEAD = "lead"
    UNKNOWN = "unknown"

class JobPosting(BaseModel):
    title: str = Field(description="Job title")
    company: str
    location: Optional[str] = None
    remote: bool = False
    salary_min: Optional[int] = Field(None, description="Minimum salary in USD")
    salary_max: Optional[int] = Field(None, description="Maximum salary in USD")
    required_skills: list[str] = Field(default_factory=list)
    experience_years_min: Optional[int] = None
    seniority: Seniority = Seniority.UNKNOWN

completion = client.beta.chat.completions.parse(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Extract job posting information into structured data."},
        {"role": "user", "content": job_posting_text}
    ],
    response_format=JobPosting,
)

job: JobPosting = completion.choices[0].message.parsed
print(f"Title: {job.title}")
print(f"Seniority: {job.seniority.value}")
print(f"Skills: {', '.join(job.required_skills)}")

Technique 4: Markdown Structure for Human Consumption

For outputs read by humans (reports, documentation), control the markdown structure:

report_format_instruction = """
Format your analysis as a professional report with this exact structure:

# Executive Summary
[2-3 sentences, no jargon, key finding only]

## Key Findings
[3-5 bullet points, each starting with a bold metric or insight]

## Detailed Analysis
[3-4 paragraphs with supporting data]

## Recommendations
[Numbered list, 3 items max, each actionable and specific]

## Risk Factors
[Table with columns: Risk | Likelihood | Impact | Mitigation]
"""

Format Validation in Production

Always validate LLM output before using it:

import logging

def safe_parse_llm_json(raw_text: str, schema: type[BaseModel]) -> BaseModel | None:
    """Parse LLM output with graceful failure — never raises in production."""
    try:
        # Try direct parse
        return schema.model_validate_json(raw_text)
    except Exception:
        # Try stripping markdown fences
        try:
            cleaned = raw_text.replace("```json", "").replace("```", "").strip()
            return schema.model_validate_json(cleaned)
        except Exception as e:
            logging.warning(f"Failed to parse LLM output: {e}\nRaw: {raw_text[:200]}")
            return None

Combining explicit format instructions with JSON mode or structured outputs gives you the highest reliability. For production systems, always validate and have a fallback behavior when parsing fails.