Prompt Design Principles

Controlling Output Format

Controlling the format of LLM output is essential for building systems where the AI's response needs to be processed programmatically. Unstructured text is hard to parse reliably; structured output (JSON, Markdown, specific templates) makes your application robust. This lesson covers the full toolkit for format control.

Why Format Matters

Consider a system that extracts key information from job postings. If the model sometimes returns JSON, sometimes a bulleted list, and sometimes a prose paragraph — your parsing code needs to handle all three cases. Consistent format makes downstream processing deterministic.

Format instructions serve two purposes:

  1. Make output machine-parseable
  2. Ensure consistency across requests (same input structure → same output structure)

Technique 1: Explicit Format Instructions

The most straightforward approach — tell the model exactly what format to produce:

from openai import OpenAI
import json

client = OpenAI()

def extract_job_info(job_posting: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": """Extract job information and return ONLY valid JSON with no markdown formatting or explanation.
                
Required JSON structure:
{
  "title": string,
  "company": string,
  "location": string | null,
  "remote": boolean,
  "salary_min": number | null,
  "salary_max": number | null,
  "required_skills": string[],
  "experience_years_min": number | null,
  "seniority": "junior" | "mid" | "senior" | "lead" | "unknown"
}"""
            },
            {"role": "user", "content": job_posting}
        ],
        temperature=0,
    )
    
    raw = response.choices[0].message.content.strip()
    # Strip markdown code fences if the model added them despite instructions
    if raw.startswith("```"):
        raw = raw.split("```")[1]
        if raw.startswith("json"):
            raw = raw[4:]
        raw = raw.strip()
    
    return json.loads(raw)

Technique 2: JSON Mode (Guaranteed Valid JSON)

OpenAI's response_format={"type": "json_object"} guarantees syntactically valid JSON output:

response = client.chat.completions.create(
    model="gpt-4o-mini",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "You extract structured data. Always respond with valid JSON."},
        {"role": "user", "content": f"Extract the key metrics from: {report_text}"}
    ]
)
# response.choices[0].message.content is guaranteed to be parseable JSON
data = json.loads(response.choices[0].message.content)

Important: JSON mode guarantees syntactic validity but not schema compliance. The model may return a JSON object that doesn't match your expected schema. Always validate after parsing.
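A minimal post-parse check can catch that gap before bad data propagates. A sketch using only the standard library (the metric names here are hypothetical placeholders for whatever your schema actually expects):

```python
import json

# Hypothetical expected schema: key name -> allowed types
REQUIRED_KEYS = {"revenue": (int, float), "growth_pct": (int, float)}

def validate_metrics(raw: str) -> dict:
    """Parse JSON-mode output, then verify the keys and types we expect."""
    data = json.loads(raw)  # JSON mode guarantees this parse succeeds
    for key, expected_type in REQUIRED_KEYS.items():
        if key not in data:
            raise ValueError(f"missing key: {key}")
        if not isinstance(data[key], expected_type):
            raise ValueError(f"wrong type for {key}: {type(data[key]).__name__}")
    return data
```

For anything beyond a handful of keys, a real schema validator (such as Pydantic, shown next) is less error-prone than hand-rolled checks.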

Technique 3: Structured Outputs with Pydantic (Strictest)

OpenAI's structured outputs (using client.beta.chat.completions.parse) guarantees schema compliance:

from pydantic import BaseModel, Field
from typing import Literal, Optional
from enum import Enum

class Seniority(str, Enum):
    JUNIOR = "junior"
    MID = "mid"
    SENIOR = "senior"
    LEAD = "lead"
    UNKNOWN = "unknown"

class JobPosting(BaseModel):
    title: str = Field(description="Job title")
    company: str
    location: Optional[str] = None
    remote: bool = False
    salary_min: Optional[int] = Field(None, description="Minimum salary in USD")
    salary_max: Optional[int] = Field(None, description="Maximum salary in USD")
    required_skills: list[str] = Field(default_factory=list)
    experience_years_min: Optional[int] = None
    seniority: Seniority = Seniority.UNKNOWN

completion = client.beta.chat.completions.parse(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Extract job posting information into structured data."},
        {"role": "user", "content": job_posting_text}
    ],
    response_format=JobPosting,
)

job: JobPosting = completion.choices[0].message.parsed
print(f"Title: {job.title}")
print(f"Seniority: {job.seniority.value}")
print(f"Skills: {', '.join(job.required_skills)}")

Technique 4: Markdown Structure for Human Consumption

For outputs read by humans (reports, documentation), control the markdown structure:

report_format_instruction = """
Format your analysis as a professional report with this exact structure:

# Executive Summary
[2-3 sentences, no jargon, key finding only]

## Key Findings
[3-5 bullet points, each starting with a bold metric or insight]

## Detailed Analysis
[3-4 paragraphs with supporting data]

## Recommendations
[Numbered list, 3 items max, each actionable and specific]

## Risk Factors
[Table with columns: Risk | Likelihood | Impact | Mitigation]
"""

Format Validation in Production

Always validate LLM output before using it:

import logging

def safe_parse_llm_json(raw_text: str, schema: type[BaseModel]) -> BaseModel | None:
    """Parse LLM output with graceful failure — never raises in production."""
    try:
        # Try direct parse
        return schema.model_validate_json(raw_text)
    except Exception:
        # Try stripping markdown fences
        try:
            cleaned = raw_text.replace("```json", "").replace("```", "").strip()
            return schema.model_validate_json(cleaned)
        except Exception as e:
            logging.warning(f"Failed to parse LLM output: {e}\nRaw: {raw_text[:200]}")
            return None

Combining explicit format instructions with JSON mode or structured outputs gives you the highest reliability. For production systems, always validate and have a fallback behavior when parsing fails.