Tool Dispatch and Routing
Tools are your agent's hands. The reasoning loop decides what to do; the tool dispatch layer decides how to do it — selecting the right tool, validating its arguments, executing it safely, and handling failures gracefully. A well-designed dispatch layer is the difference between a fragile demo and a production-ready agent.
Why Dispatch Deserves Its Own Layer
It is tempting to keep tool execution inline — just call the function directly from the orchestrator. But as your agent grows, a dedicated dispatch layer earns its existence by centralising:
- Tool registration and discovery — one place to add, remove, or document tools
- Argument validation — catch bad inputs before they cause hard-to-debug downstream errors
- Error handling and retries — isolate failures so one broken tool does not crash the whole agent
- Execution timeouts — prevent slow tools from blocking the reasoning loop indefinitely
- Audit logging — record every tool call for debugging and compliance
The Tool Registry Pattern
A tool registry maps tool names (the strings the LLM emits) to callable implementations. It is the single authoritative source of truth about what tools the agent can use.
Defining a Tool
from __future__ import annotations
import inspect
from dataclasses import dataclass
from typing import Any, Callable, Optional, get_type_hints
@dataclass
class ToolParameter:
"""Metadata for a single tool parameter."""
name: str
type_annotation: type
description: str
required: bool = True
default: Any = None
@dataclass
class ToolDefinition:
"""Complete metadata for one tool, used to generate the LLM's tool schema."""
name: str
description: str
parameters: list[ToolParameter]
fn: Callable
def to_openai_schema(self) -> dict:
"""
Convert to the OpenAI function-calling tool schema format.
This JSON structure is passed to the ChatCompletion API so the LLM
knows what tools are available and what arguments they accept.
"""
type_map = {str: "string", int: "integer", float: "number", bool: "boolean"}
properties: dict[str, Any] = {}
required: list[str] = []
for param in self.parameters:
properties[param.name] = {
"type": type_map.get(param.type_annotation, "string"),
"description": param.description,
}
if param.required:
required.append(param.name)
return {
"type": "function",
"function": {
"name": self.name,
"description": self.description,
"parameters": {
"type": "object",
"properties": properties,
"required": required,
},
},
}
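To make the output concrete, here is the schema `to_openai_schema` would produce for the `search_web(query: str)` tool registered later in this chapter (hand-written here to mirror the conversion logic above, so the snippet stands alone):

```python
import json

# Expected to_openai_schema() output for a tool with one required
# string parameter named "query".
search_web_schema = {
    "type": "function",
    "function": {
        "name": "search_web",
        "description": "Search the web for current facts and recent information.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "The query parameter.",
                },
            },
            "required": ["query"],
        },
    },
}

print(json.dumps(search_web_schema, indent=2))
```

This is exactly the JSON structure the ChatCompletion API expects in its `tools` parameter.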
Building the Registry
import logging
from functools import wraps
logger = logging.getLogger(__name__)
class ToolRegistry:
"""
Central registry for all agent tools.
Tools are registered via the @registry.tool() decorator. The registry
exposes tool schemas for the LLM and an execution interface for the
dispatcher. It is intentionally simple — a thin directory, not a framework.
"""
def __init__(self) -> None:
self._tools: dict[str, ToolDefinition] = {}
def tool(self, name: str | None = None, description: str | None = None):
"""
Decorator that registers a function as an agent tool.
Usage:
registry = ToolRegistry()
@registry.tool(description="Search the web for current information.")
def search_web(query: str) -> str:
...
"""
def decorator(fn: Callable) -> Callable:
tool_name = name or fn.__name__
tool_description = description or (fn.__doc__ or "").strip().split("\n")[0]
sig = inspect.signature(fn)
hints = get_type_hints(fn)
params: list[ToolParameter] = []
for param_name, param in sig.parameters.items():
if param_name == "self":
continue
annotation = hints.get(param_name, str)
has_default = param.default is not inspect.Parameter.empty
params.append(ToolParameter(
name=param_name,
type_annotation=annotation,
description=f"The {param_name} parameter.",
required=not has_default,
default=param.default if has_default else None,
))
self._tools[tool_name] = ToolDefinition(
name=tool_name,
description=tool_description,
parameters=params,
fn=fn,
)
logger.debug("[DEBUG][ToolRegistry] Registered tool: %s", tool_name)
return fn
return decorator
def get(self, name: str) -> Optional[ToolDefinition]:
"""Look up a tool definition by name. Returns None if not found."""
return self._tools.get(name)
def list_names(self) -> list[str]:
"""Return a sorted list of all registered tool names."""
return sorted(self._tools.keys())
def to_openai_schemas(self) -> list[dict]:
"""Return all tool schemas in OpenAI function-calling format."""
return [defn.to_openai_schema() for defn in self._tools.values()]
def __len__(self) -> int:
return len(self._tools)
def __contains__(self, name: str) -> bool:
return name in self._tools
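The signature inspection the decorator relies on is worth seeing in isolation. This standalone sketch (using a hypothetical `fetch_page` function, not one of the chapter's tools) shows what `inspect.signature` and `get_type_hints` recover for the registry:

```python
import inspect
from typing import get_type_hints

def fetch_page(url: str, max_bytes: int = 65536) -> str:
    """Download a page and return its text (hypothetical example tool)."""
    ...

sig = inspect.signature(fetch_page)
hints = get_type_hints(fetch_page)

# Mirror the decorator's extraction: (name, type name, required?)
extracted = []
for name, param in sig.parameters.items():
    required = param.default is inspect.Parameter.empty
    extracted.append((name, hints.get(name, str).__name__, required))

print(extracted)
```

A parameter is "required" exactly when it has no default, which is how the registry decides what goes into the schema's `required` list.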
Registering Real Tools
registry = ToolRegistry()
@registry.tool(description="Search the web for current facts and recent information.")
def search_web(query: str) -> str:
"""Return a text summary of web search results for the given query."""
    import os
    import httpx
    # Tavily's search endpoint expects a POST with a JSON body.
    response = httpx.post(
        "https://api.tavily.com/search",
        json={"query": query, "max_results": 5},
        headers={"Authorization": f"Bearer {os.environ['TAVILY_API_KEY']}"},
        timeout=10,
    )
response.raise_for_status()
results = response.json().get("results", [])
return "\n\n".join(f"**{r['title']}**\n{r['content']}" for r in results)
@registry.tool(description="Read the contents of a local file by path.")
def read_file(path: str) -> str:
"""Read and return the full text contents of the specified file."""
from pathlib import Path
resolved = Path(path).resolve()
if not resolved.exists():
return f"Error: file not found at '{path}'"
if resolved.stat().st_size > 1_000_000:
return f"Error: file too large to read ({resolved.stat().st_size} bytes)"
return resolved.read_text(encoding="utf-8")
@registry.tool(description="Execute a Python code snippet and return its stdout output.")
def run_python(code: str) -> str:
"""
Execute a Python code snippet in a subprocess and capture stdout.
Returns the output, or the error message if execution fails.
"""
import subprocess
import sys
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except subprocess.TimeoutExpired:
        return "Execution error: code timed out after 10 seconds"
    if result.returncode != 0:
        return f"Execution error:\n{result.stderr}"
    return result.stdout or "(no output)"
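Note that read_file as written will follow any path the LLM produces, including `/etc/passwd`. A hedged hardening sketch, assuming a `SANDBOX_ROOT` base directory of your choosing, confines reads to one tree:

```python
from pathlib import Path

# Assumption: a base directory you designate for agent file access.
SANDBOX_ROOT = Path("/tmp/agent_workspace")

def read_file_sandboxed(path: str) -> str:
    """Like read_file, but refuses paths that escape SANDBOX_ROOT."""
    # Joining an absolute path replaces the base entirely, and resolve()
    # collapses ".." components — both escape routes are caught below.
    resolved = (SANDBOX_ROOT / path).resolve()
    if not resolved.is_relative_to(SANDBOX_ROOT.resolve()):
        return f"Error: path '{path}' escapes the sandbox"
    if not resolved.exists():
        return f"Error: file not found at '{path}'"
    return resolved.read_text(encoding="utf-8")
```

`Path.is_relative_to` requires Python 3.9+; on older versions, compare against `resolved.parts` manually.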
Security Warning: run_python executes arbitrary code. In production, always run agent-generated code inside a container with no network access, a restricted filesystem, and CPU/memory limits. Never execute LLM-generated code directly on your host machine.
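Short of full containerisation, a few cheap subprocess options already shrink the blast radius. This is a hedged sketch, not a substitute for the container isolation described above; `-I` is CPython's isolated mode, which ignores environment variables and the user site directory:

```python
import subprocess
import sys
import tempfile

def run_python_hardened(code: str, timeout: int = 10) -> str:
    """run_python with a few inexpensive mitigations layered on."""
    with tempfile.TemporaryDirectory() as workdir:
        try:
            result = subprocess.run(
                [sys.executable, "-I", "-c", code],  # -I: isolated mode
                capture_output=True,
                text=True,
                timeout=timeout,
                cwd=workdir,   # scratch directory, discarded afterwards
                env={},        # no inherited secrets from os.environ
            )
        except subprocess.TimeoutExpired:
            return f"Execution error: timed out after {timeout}s"
    if result.returncode != 0:
        return f"Execution error:\n{result.stderr}"
    return result.stdout or "(no output)"
```

None of this stops a hostile snippet from, say, opening sockets — that still requires OS-level isolation.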
The Dispatcher
The dispatcher takes a tool name and raw arguments from the LLM, validates them, calls the tool, handles errors, and returns a clean observation string.
import time
from dataclasses import dataclass
@dataclass
class DispatchResult:
"""Outcome of a single tool dispatch."""
tool_name: str
arguments: dict[str, Any]
output: Optional[str] = None
error: Optional[str] = None
duration_ms: float = 0.0
@property
def succeeded(self) -> bool:
return self.error is None
def to_observation(self) -> str:
"""Render as the string shown to the LLM in the next iteration."""
if self.error:
return f"[Error from {self.tool_name}]: {self.error}"
return str(self.output)
class ToolDispatcher:
"""
Validates and executes tool calls on behalf of the reasoning loop.
Responsibilities:
- Confirm the requested tool exists in the registry.
- Coerce argument types where possible (e.g. string "42" → int 42).
    - Execute the tool, surfacing a TimeoutError (raised by the tool itself
      or by a timeout wrapper around it) as a structured failure.
- Catch and format all exceptions as observations rather than crashes.
- Enforce a maximum output length to prevent context window overflow.
"""
MAX_OUTPUT_CHARS = 8_000
def __init__(self, registry: ToolRegistry, timeout_seconds: int = 30) -> None:
self.registry = registry
self.timeout_seconds = timeout_seconds
def dispatch(self, tool_name: str, raw_args: dict[str, Any]) -> DispatchResult:
"""
Execute a tool by name with the given arguments.
This method never raises — all errors are captured into
DispatchResult so the reasoning loop can continue.
"""
start = time.monotonic()
definition = self.registry.get(tool_name)
if definition is None:
return DispatchResult(
tool_name=tool_name,
arguments=raw_args,
error=(
f"Unknown tool '{tool_name}'. "
f"Available: {self.registry.list_names()}"
),
)
try:
coerced_args = self._coerce_args(definition, raw_args)
except ValueError as exc:
return DispatchResult(
tool_name=tool_name,
arguments=raw_args,
error=f"Invalid arguments: {exc}",
)
try:
output = definition.fn(**coerced_args)
duration_ms = (time.monotonic() - start) * 1000
output_str = str(output)
if len(output_str) > self.MAX_OUTPUT_CHARS:
output_str = (
output_str[:self.MAX_OUTPUT_CHARS]
+ f"\n... [output truncated at {self.MAX_OUTPUT_CHARS} chars]"
)
logger.info(
"[INFO][ToolDispatcher][dispatch] %s completed in %.1fms",
tool_name, duration_ms,
)
return DispatchResult(
tool_name=tool_name,
arguments=coerced_args,
output=output_str,
duration_ms=duration_ms,
)
except TimeoutError:
return DispatchResult(
tool_name=tool_name,
arguments=coerced_args,
error=f"Tool '{tool_name}' timed out after {self.timeout_seconds}s",
duration_ms=(time.monotonic() - start) * 1000,
)
except Exception as exc:
logger.error(
"[ERROR][ToolDispatcher][dispatch] %s raised %s: %s",
tool_name, type(exc).__name__, exc,
)
return DispatchResult(
tool_name=tool_name,
arguments=coerced_args,
error=f"{type(exc).__name__}: {exc}",
duration_ms=(time.monotonic() - start) * 1000,
)
def _coerce_args(
self, definition: ToolDefinition, raw_args: dict[str, Any]
) -> dict[str, Any]:
"""
Coerce raw argument values to the types declared in the tool definition.
LLMs sometimes produce numbers as strings or booleans as "true" strings.
This method normalises common mismatches before calling the tool function.
"""
coerced: dict[str, Any] = {}
for param in definition.parameters:
if param.name in raw_args:
raw_value = raw_args[param.name]
if param.type_annotation == int and isinstance(raw_value, str):
coerced[param.name] = int(raw_value)
elif param.type_annotation == float and isinstance(raw_value, (str, int)):
coerced[param.name] = float(raw_value)
elif param.type_annotation == bool and isinstance(raw_value, str):
coerced[param.name] = raw_value.lower() in ("true", "1", "yes")
else:
coerced[param.name] = raw_value
elif param.required:
raise ValueError(f"Required parameter '{param.name}' is missing")
else:
coerced[param.name] = param.default
return coerced
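One gap worth calling out: as written, `timeout_seconds` only appears in the error message — nothing in `dispatch` actually interrupts a slow tool unless the tool raises TimeoutError itself. One way to enforce the deadline, sketched standalone below, is to run the call in a worker thread and abandon it on timeout (the thread cannot be killed, so CPU-bound tools still need their own limits):

```python
import concurrent.futures
import time
from typing import Any, Callable

def call_with_timeout(fn: Callable[..., Any], timeout_seconds: float, **kwargs: Any) -> Any:
    """Run fn(**kwargs) in a worker thread; raise TimeoutError on deadline."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(fn, **kwargs)
        return future.result(timeout=timeout_seconds)
    except concurrent.futures.TimeoutError:
        # Re-raise as the builtin TimeoutError the dispatcher catches
        # (they are distinct classes before Python 3.11).
        raise TimeoutError(f"tool call exceeded {timeout_seconds}s")
    finally:
        # wait=False: do not block on the abandoned worker thread.
        pool.shutdown(wait=False, cancel_futures=True)

def slow_tool(seconds: float) -> str:
    time.sleep(seconds)
    return "done"
```

Wrapping `definition.fn` in `call_with_timeout` inside `dispatch` would make the existing `except TimeoutError` branch fire as documented. `cancel_futures` requires Python 3.9+.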
Handling Tool Errors Gracefully
The golden rule for tool errors: always return an observation, never let an exception escape the dispatch layer. An error observation gives the LLM the information it needs to try a different approach. An unhandled exception crashes the agent entirely.
# Bad — exceptions escape to the orchestrator and crash the loop
def bad_dispatch(tool_name, args):
tool = registry.get(tool_name)
return tool.fn(**args) # Any exception here terminates the agent
# Good — all failures become observations the LLM can reason about
def good_dispatch(tool_name: str, args: dict) -> str:
result = dispatcher.dispatch(tool_name, args)
if not result.succeeded:
return (
f"The tool '{tool_name}' failed: {result.error}. "
"Consider trying a different approach or tool."
)
return result.to_observation()
Retry Logic for Transient Failures
Some tools fail transiently — network timeouts, rate limits, temporary server errors. A retry wrapper handles these without burdening the reasoning loop:
import time
from functools import wraps
from typing import Callable
def with_retry(
fn: Callable,
max_attempts: int = 3,
delay_seconds: float = 1.0,
retriable_exceptions: tuple[type[Exception], ...] = (TimeoutError, ConnectionError),
) -> Callable:
"""
Wrap a tool function with automatic retry on transient failures.
Uses linear back-off: waits delay_seconds * attempt before each retry.
Non-retriable exceptions are re-raised immediately without waiting.
"""
@wraps(fn)
def wrapper(*args, **kwargs):
last_exc = None
for attempt in range(1, max_attempts + 1):
try:
return fn(*args, **kwargs)
except retriable_exceptions as exc:
last_exc = exc
if attempt < max_attempts:
wait = delay_seconds * attempt
logger.warning(
"[WARN][with_retry] %s failed (attempt %d/%d), retrying in %.1fs: %s",
fn.__name__, attempt, max_attempts, wait, exc,
)
time.sleep(wait)
raise last_exc
return wrapper
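A quick exercise of the wrapper on a simulated flaky tool (a condensed, self-contained restatement of with_retry so the snippet runs on its own; the delay is zero to keep it fast):

```python
import time
from functools import wraps
from typing import Callable

def with_retry(fn: Callable, max_attempts: int = 3, delay_seconds: float = 0.0,
               retriable=(ConnectionError,)) -> Callable:
    @wraps(fn)
    def wrapper(*args, **kwargs):
        for attempt in range(1, max_attempts + 1):
            try:
                return fn(*args, **kwargs)
            except retriable:
                if attempt == max_attempts:
                    raise
                time.sleep(delay_seconds * attempt)  # linear back-off
    return wrapper

calls = {"n": 0}

@with_retry
def flaky_fetch() -> str:
    """Fails twice with ConnectionError, then succeeds (simulated)."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated transient failure")
    return "payload"

print(flaky_fetch())
```

The caller sees a single successful return; the two transient failures never reach the reasoning loop.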
Dynamic Tool Selection
For agents with many tools, sending all tool schemas to the LLM on every turn is wasteful and can confuse the model. A smarter pattern selects a relevant subset based on the current context:
class SelectiveDispatcher(ToolDispatcher):
"""
Extends ToolDispatcher with context-aware tool selection.
Filters the available tool schemas down to the most relevant subset
before each LLM call. Uses a simple keyword overlap score; in
production, replace this with a vector similarity search over
tool description embeddings.
"""
def relevant_schemas(self, context: str, max_tools: int = 8) -> list[dict]:
"""Return up to max_tools schemas most relevant to the current context."""
context_lower = context.lower()
scored: list[tuple[int, ToolDefinition]] = []
        # Note: reaches into the registry's private dict; in a larger
        # codebase, expose a public accessor on ToolRegistry instead.
        for defn in self.registry._tools.values():
description_words = set(defn.description.lower().split())
context_words = set(context_lower.split())
score = len(description_words & context_words)
scored.append((score, defn))
scored.sort(key=lambda x: x[0], reverse=True)
top = scored[:max_tools]
return [defn.to_openai_schema() for _, defn in top]
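The overlap scoring is easy to sanity-check in isolation. This standalone sketch scores the chapter's three tool descriptions against a context string:

```python
def overlap_score(description: str, context: str) -> int:
    """Count words shared between a tool description and the context."""
    return len(set(description.lower().split()) & set(context.lower().split()))

descriptions = {
    "read_file": "Read the contents of a local file by path.",
    "search_web": "Search the web for current facts and recent information.",
    "run_python": "Execute a Python code snippet and return its stdout output.",
}

context = "I need to read a local config file and check its contents"
ranked = sorted(
    descriptions,
    key=lambda name: overlap_score(descriptions[name], context),
    reverse=True,
)
print(ranked)
```

Even this naive score puts read_file first for a file-reading request, though common words like "a" and "and" contribute noise — one reason to prefer embedding similarity in production.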
Tip: Always include a small set of "always-on" general-purpose tools (like search_web) in every tool selection regardless of relevance score. These act as the agent's fallback when more specialised tools do not apply.
Tool Error Taxonomy
Not all tool errors are equal. Understanding the failure mode helps you decide whether to retry, escalate, or report:
| Error Type | Example | Recommended Response |
|---|---|---|
| Argument error | Missing required field | Report to LLM; ask it to re-call with correct args |
| Not found | File path does not exist | Report to LLM; it may try a different path |
| Transient network | Timeout, 503 response | Retry with back-off up to 3 times |
| Rate limit | HTTP 429 | Retry after Retry-After header delay |
| Permission denied | Filesystem or API auth error | Report as permanent failure; do not retry |
| Data error | Malformed response from external API | Truncate and return partial result with a warning |
Key Takeaways
- Use a ToolRegistry to centralise tool registration, documentation, and schema generation — never scatter tool definitions across the codebase.
- The dispatcher is responsible for argument validation, execution, error capture, and output truncation. The orchestrator should never call tools directly.
- All tool errors must become observations, never unhandled exceptions. An error observation lets the LLM recover; a crash does not.
- Add retry logic for tools that call external APIs — transient failures are common and recoverable.
- Dynamic tool selection reduces noise in the LLM's context for agents with large tool sets.
Further Reading
- OpenAI function calling documentation — the standard tool schema format
- Anthropic tool use guide — Claude's native tool use API
- LangChain tool documentation — higher-level tool abstractions