
Containerising Your Agent with Docker

Docker packages your agent, its dependencies, its runtime, and its configuration into a single portable image. That image runs identically on your laptop, in CI, and in production — eliminating the "works on my machine" class of bugs that are especially painful for AI agents, where environment drift can produce subtle, hard-to-reproduce failures. This lesson builds a production-grade Docker setup from scratch.


Why Containerisation Matters for AI Agents

AI agents have specific containerisation needs that a basic Python web service does not:

  • Dependency pinning is critical — a minor version bump in langchain or openai can silently change agent behaviour
  • API keys must be injected as secrets at runtime, never baked into the image as plain environment variables
  • Startup time matters — loading model weights or connecting to vector databases on cold start can take 5–30 seconds; health checks must account for this
  • Multi-stage builds keep images lean — development tools and build artifacts should not ship to production
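
The pinning point above can be made concrete in the dependency manifest. A sketch of a pyproject.toml fragment with exact pins — the package names are from this lesson, but the version numbers are illustrative; use the versions your agent was actually tested with:

```toml
# pyproject.toml — illustrative pins only
[project]
name = "my-agent"
version = "0.1.0"
dependencies = [
    "langchain==0.2.16",        # exact pin — minor bumps can change agent behaviour
    "openai==1.40.0",
    "fastapi==0.112.0",
    "uvicorn[standard]==0.30.6",
]
```

Exact `==` pins trade automatic upgrades for reproducibility — the right trade for an agent whose behaviour depends on library internals.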

Project Structure for a Containerised Agent

my_agent/
├── Dockerfile
├── docker-compose.yml
├── docker-compose.override.yml   # Local dev overrides (gitignored)
├── .dockerignore
├── pyproject.toml                # Or requirements.txt
├── agent/
│   ├── __init__.py
│   ├── main.py                   # FastAPI or CLI entrypoint
│   ├── orchestrator.py
│   └── tools/
└── tests/

The Dockerfile

A production Dockerfile for a Python AI agent should use a multi-stage build: a builder stage installs dependencies, and a lean runtime stage copies only the artefacts needed to run.

# syntax=docker/dockerfile:1.6
# FILE: Dockerfile
# PURPOSE: Multi-stage build for a Python AI agent service.
#          Builder stage installs all deps; runtime stage ships only
#          what is needed to execute the agent.

# ─────────────────────────────────────────────────
# Stage 1: builder — install Python dependencies
# ─────────────────────────────────────────────────
FROM python:3.12-slim AS builder

# Set a consistent working directory for the build stage
WORKDIR /build

# Install system dependencies required for common Python packages
# (e.g. psycopg2 needs libpq-dev, cryptography needs libssl-dev)
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    libpq-dev \
    libssl-dev \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Copy only the dependency manifest first — this layer is cached as long
# as pyproject.toml doesn't change, even if application code does.
COPY pyproject.toml .

# Install dependencies into a known prefix we can copy to the runtime stage.
# --no-cache-dir keeps the image smaller. Note: some build backends refuse
# to build without the package source present; if yours does, install from
# an exported requirements/lock file here instead of the project itself.
RUN pip install --no-cache-dir --prefix=/install .


# ─────────────────────────────────────────────────
# Stage 2: runtime — minimal image for production
# ─────────────────────────────────────────────────
FROM python:3.12-slim AS runtime

# Security: run as a non-root user
RUN groupadd --gid 1001 agentuser \
    && useradd --uid 1001 --gid agentuser --shell /bin/bash --create-home agentuser

WORKDIR /app

# Copy installed packages from the builder stage only
COPY --from=builder /install /usr/local

# Copy application source code
COPY --chown=agentuser:agentuser agent/ ./agent/

# Switch to non-root user before running the process
USER agentuser

# Expose the port the agent's HTTP server listens on
EXPOSE 8000

# Health check — polls the /health endpoint every 30 seconds.
# The agent gets 60 seconds to start (--start-period) before checks begin.
# python:3.12-slim does not ship curl (it was installed only in the builder
# stage), so use Python's standard library for the probe instead.
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')" || exit 1

# The default command — override in docker-compose for worker or CLI mode
CMD ["python", "-m", "uvicorn", "agent.main:app", "--host", "0.0.0.0", "--port", "8000"]
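
As a rough rule of thumb, the HEALTHCHECK parameters above bound how long a container that never becomes healthy can linger before Docker flags it. A small illustrative calculation (a simplification — it ignores the per-check timeout and scheduling jitter):

```python
def max_time_to_unhealthy(interval_s: int, retries: int, start_period_s: int) -> int:
    """Approximate worst-case seconds before a never-healthy container is
    marked unhealthy: failed checks during --start-period don't count,
    then `retries` consecutive failures, one per interval, are required."""
    return start_period_s + retries * interval_s


# With the values above: 60s grace + 3 failures x 30s = 150s
print(max_time_to_unhealthy(30, 3, 60))
```

If your agent's cold start routinely exceeds `--start-period`, raise that value rather than the retry count — otherwise healthy-but-slow containers get killed by orchestrators that restart on unhealthy status.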

The .dockerignore File

A good .dockerignore file prevents bloating your image with files that do not belong in production:

# .dockerignore
.git
.github
.venv
__pycache__
*.pyc
*.pyo
.pytest_cache
.mypy_cache
.ruff_cache
tests/
docs/
*.md
.env
.env.*
docker-compose.override.yml
*.log
dist/
build/

Important: Always include .env in .dockerignore. Environment files containing API keys must never be copied into a Docker image.


Managing Secrets

API keys for OpenAI, Anthropic, Pinecone, etc. are secrets. There are three common patterns:

Pattern 1: Runtime Environment Variables (Development)

# docker-compose.yml — development only
services:
  agent:
    build: .
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}   # Interpolated from the host shell or a local .env file
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}

# On the host
export OPENAI_API_KEY=sk-...
docker compose up

Pattern 2: Docker Secrets (Production Compose)

# docker-compose.yml — production
services:
  agent:
    build: .
    secrets:
      - openai_api_key
      - anthropic_api_key
    environment:
      - OPENAI_API_KEY_FILE=/run/secrets/openai_api_key

secrets:
  openai_api_key:
    external: true   # Swarm: docker secret create openai_api_key - (plain Compose can use file: instead)
  anthropic_api_key:
    external: true

# agent/config.py — reading Docker secrets at runtime
import os
from pathlib import Path


def read_secret(name: str, env_var: str | None = None) -> str:
    """
    Read a secret from a Docker secrets file or fall back to an environment variable.

    Docker mounts secrets at /run/secrets/<name> at container startup.
    Kubernetes can mount Secret volumes at the same path, so the same
    code works in both environments.
    """
    secret_path = Path(f"/run/secrets/{name}")
    if secret_path.exists():
        return secret_path.read_text().strip()

    # Fall back to environment variable for local development
    if env_var and (value := os.getenv(env_var)):
        return value

    raise RuntimeError(
        f"Secret '{name}' not found. "
        f"Expected at {secret_path} or in env var '{env_var}'."
    )


# Usage
OPENAI_API_KEY = read_secret("openai_api_key", env_var="OPENAI_API_KEY")

Pattern 3: External Secret Manager (Recommended for Production)

For Kubernetes or cloud deployments, fetch secrets at runtime from AWS Secrets Manager, HashiCorp Vault, or Google Secret Manager. Never store secrets in the image or in source control.
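
As a sketch of the runtime-fetch pattern, assuming an AWS Secrets Manager-style client (the `get_secret_value` call matches boto3's `secretsmanager` client API; the client is injected so the function can be exercised without AWS credentials):

```python
def fetch_external_secret(client, secret_id: str) -> str:
    """Fetch a secret's string value from a Secrets Manager-style client.

    In production the client would be boto3.client("secretsmanager").
    The value is fetched at startup and held only in process memory —
    never written to disk or baked into the image.
    """
    response = client.get_secret_value(SecretId=secret_id)
    return response["SecretString"]
```

In practice you would call this once during startup and pass the values into your agent's configuration object, so a key rotation only requires a restart, not a rebuild.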


Docker Compose for Local Development

The full docker-compose.yml brings up the agent alongside its dependencies — a Redis instance for session storage and a PostgreSQL database for long-term memory:

# docker-compose.yml
version: "3.9"

services:

  agent:
    build:
      context: .
      target: runtime
    ports:
      - "8000:8000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - REDIS_URL=redis://redis:6379/0
      - DATABASE_URL=postgresql://agent:password@postgres:5432/agent_db
      - LOG_LEVEL=INFO
    depends_on:
      redis:
        condition: service_healthy
      postgres:
        condition: service_healthy
    restart: unless-stopped

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 3

  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: agent
      POSTGRES_PASSWORD: password
      POSTGRES_DB: agent_db
    volumes:
      - postgres_data:/var/lib/postgresql/data
    ports:
      - "5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U agent -d agent_db"]
      interval: 10s
      timeout: 5s
      retries: 5

volumes:
  postgres_data:

The development override file lets you mount source code for hot-reload without changing the base compose file:

# docker-compose.override.yml (gitignored — for local dev only)
version: "3.9"

services:
  agent:
    # No build-target override here: the builder stage installs packages
    # under /install (off sys.path) and contains no app code, so uvicorn
    # would not be importable there. Reuse the runtime target from the
    # base file and rely on the volume mount below for hot-reload.
    volumes:
      - ./agent:/app/agent   # Mount source for hot-reload
    environment:
      - LOG_LEVEL=DEBUG
    command: ["python", "-m", "uvicorn", "agent.main:app",
              "--host", "0.0.0.0", "--port", "8000", "--reload"]

Health Checks for Agent Services

Agent startup is not instantaneous. Loading the LLM client, connecting to Redis, and warming up vector index lookups can take 10–60 seconds. Your health check must reflect this:

# agent/main.py — health check endpoint
from fastapi import FastAPI, Response
import time
import logging

logger = logging.getLogger(__name__)
app = FastAPI()

# Track when the service started and whether initialisation has finished.
# Initialised at import time so /health reports a sane uptime even if it
# is polled before the startup hook completes.
_start_time: float = time.monotonic()
_ready: bool = False


@app.on_event("startup")  # Deprecated in newer FastAPI; prefer a lifespan handler
async def startup():
    """Run initialisation tasks and mark the service as ready."""
    global _start_time, _ready
    _start_time = time.monotonic()
    logger.info("[INFO][startup] Initialising agent service...")

    # Initialise dependencies (LLM client, Redis, vector index)
    await _init_dependencies()

    _ready = True
    elapsed = time.monotonic() - _start_time
    logger.info("[INFO][startup] Agent service ready in %.2fs", elapsed)


@app.get("/health")
async def health(response: Response):
    """
    Liveness and readiness health check.

    Returns 200 when the service is ready to handle requests.
    Returns 503 during startup or if a critical dependency is unavailable.
    Docker, Kubernetes, and load balancers poll this endpoint.
    """
    if not _ready:
        response.status_code = 503
        return {"status": "starting", "uptime_seconds": time.monotonic() - _start_time}

    # Quick dependency checks (redis_client is assumed to be a module-level
    # client created in _init_dependencies during startup)
    checks = {}
    try:
        await redis_client.ping()
        checks["redis"] = "ok"
    except Exception as exc:
        checks["redis"] = f"error: {exc}"

    all_healthy = all(v == "ok" for v in checks.values())
    if not all_healthy:
        response.status_code = 503

    return {
        "status": "healthy" if all_healthy else "degraded",
        "checks": checks,
        "uptime_seconds": round(time.monotonic() - _start_time, 1),
    }


async def _init_dependencies():
    """Initialise all agent dependencies during startup."""
    # Placeholder — connect to Redis, load vector index, warm LLM client
    import asyncio
    await asyncio.sleep(0)  # Replace with real init calls

Building and Running

# Build the production image
docker build --target runtime -t my-agent:latest .

# Run with environment variables from a .env file
docker compose up --build

# Check service health
curl http://localhost:8000/health

# View agent logs
docker compose logs -f agent

# Run tests inside the container. tests/ is excluded by .dockerignore, so
# bind-mount it in; this assumes pytest is among the installed dependencies.
docker compose run --rm -v "$(pwd)/tests:/app/tests" agent pytest tests/ -m "not slow"

# Push to a registry
docker tag my-agent:latest registry.example.com/my-agent:v1.0.0
docker push registry.example.com/my-agent:v1.0.0

Best Practices Summary

  • Multi-stage build: keeps the production image small; dev tools stay in the builder stage
  • Non-root user: reduces attack surface; required by some Kubernetes admission controllers
  • .dockerignore includes .env: prevents secrets from being copied into an image layer
  • --start-period on HEALTHCHECK: gives the agent time to initialise before Docker marks it unhealthy
  • Pin the base image by digest: python:3.12-slim@sha256:abc... prevents surprise upstream changes
  • COPY pyproject.toml before source code: maximises layer caching; the dependency layer only rebuilds when deps change
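
The digest-pinning practice deserves a concrete shape. One way to resolve and pin the digest — the sha256 value below is a placeholder; substitute the one `imagetools inspect` reports:

```dockerfile
# Resolve the current digest for the tag:
#   docker buildx imagetools inspect python:3.12-slim
# Then pin it so rebuilds can't silently pick up a changed upstream image:
FROM python:3.12-slim@sha256:<digest-from-inspect> AS builder
```

The tag still documents intent (`python:3.12-slim`), while the digest guarantees the exact bytes.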

Key Takeaways

  • Use multi-stage builds to keep your production image lean — the builder stage installs deps, the runtime stage ships the minimal artefact.
  • Never bake secrets into your image. Use runtime environment variables for development, Docker secrets or an external secret manager for production.
  • Implement a /health endpoint that reflects true service readiness, not just process liveness.
  • Use docker-compose.override.yml for local dev customisations — keep the base docker-compose.yml production-safe and commit it to source control.
  • Always add a .dockerignore that excludes .env files, __pycache__, test files, and other non-runtime artefacts.

Further Reading