Plan-and-Execute: Separating Planning from Action

Why the Naive ReAct Loop Falls Short

ReAct agents are powerful for short-horizon tasks: pick up a tool, observe the result, pick up the next tool. But when a task requires 15–20 sequential steps, the agent's context fills up with intermediate observations, and it starts losing track of the original goal. It hallucinates tool calls, circles back to steps it already completed, or — worst of all — commits to a local optimum early and never reconsiders.

The Plan-and-Execute pattern solves this by enforcing a strict separation of concerns:

Planner — a language model (often a larger, more capable one) that looks at the user's goal and produces an explicit, numbered step-by-step plan before any tools are called.
Executor — a leaner agent that works through the plan one step at a time, calling tools as needed and reporting results.
Replanner — a model that inspects the current plan, the steps completed so far, and any failures, then revises the remaining steps.

User Goal
    │
    ▼
┌─────────┐     Plan (list of steps)    ┌──────────┐
│ Planner │ ──────────────────────────▶ │ Executor │
└─────────┘                             └──────────┘
    ▲                                        │
    │          Execution trace + failures    │
    └──────────────────────────────────────-─┘
         (via Replanner when step fails)

Core Concepts

The Planner

The planner sees only the user's objective and any relevant background context. It produces a structured plan — typically a numbered list. The key insight is that the planner is not allowed to call tools; it must reason about what should happen at a higher level of abstraction.

PLANNER_PROMPT = """You are a task planner. Given a user objective, produce
a numbered list of discrete, actionable steps that an executor agent can
carry out one by one. Each step should be self-contained and testable.

Do NOT call any tools. Only output the plan.

Objective: {objective}
"""

The plan might look like:

1. Search for the latest quarterly earnings report for AAPL.
2. Extract total revenue and net income figures.
3. Compare with the same quarter last year (YoY change).
4. Search for analyst consensus estimates for those metrics.
5. Summarize findings in a concise paragraph.

The Executor

The executor receives one step at a time along with the full tool catalogue. It is a standard ReAct-style agent but with a much shorter horizon — it only needs to complete one step, then hand control back.

EXECUTOR_PROMPT = """You are a task executor. You have access to the
following tools: {tool_descriptions}

Complete the following single step and report the result clearly.
Do not move on to any other steps.

Step: {current_step}

Previous steps completed:
{completed_steps}
"""

The Replanner

When a step fails (tool error, empty result, model refusal), the replanner kicks in. It sees the full original plan, the steps completed so far with their outputs, and the failure. It decides: can we skip this step? Should we retry differently? Do we need to insert new steps?

REPLANNER_PROMPT = """You are a task replanner. A step in the execution
plan has encountered a problem. Revise the remaining steps to achieve
the original objective.

Original objective: {objective}
Original plan: {original_plan}

Completed steps and results:
{completed_steps}

Failed step: {failed_step}
Failure reason: {failure_reason}

Output a revised plan for the REMAINING steps only (do not repeat
completed steps). If the objective can be achieved with fewer steps,
omit unnecessary ones.
"""

Implementation with LangChain

LangChain's langchain_experimental package ships a PlanAndExecute agent. Here is a minimal but complete example:

from langchain_experimental.plan_and_execute import (
    PlanAndExecute,
    load_agent_executor,
    load_chat_planner,
)
from langchain_openai import ChatOpenAI
from langchain.tools import DuckDuckGoSearchRun, WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

# --- Tools ---
search = DuckDuckGoSearchRun()
wikipedia = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())
tools = [search, wikipedia]

# --- Models ---
planner_llm = ChatOpenAI(model="gpt-4o", temperature=0)
executor_llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# --- Agent components ---
planner = load_chat_planner(planner_llm)
executor = load_agent_executor(executor_llm, tools, verbose=True)

# --- Assemble ---
agent = PlanAndExecute(
    planner=planner,
    executor=executor,
    verbose=True,
)

result = agent.invoke(
    "What is the GDP per capita of Norway, and how does it compare "
    "to the EU average? Provide the most recent figures."
)
print(result["output"])

Tip: Use a smaller, faster model for the executor (e.g., gpt-4o-mini) and reserve the larger model for planning. Planning requires broad reasoning; execution requires precise tool use. This split can cut costs by 40–60% on multi-step tasks.

Building a Custom Plan-and-Execute Loop

For production systems, you will want more control than the LangChain default. Here is a lean implementation using LangGraph:

from typing import TypedDict, List, Optional
from langgraph.graph import StateGraph, END
from langchain_core.messages import HumanMessage, AIMessage


class PlanExecuteState(TypedDict):
    objective: str
    plan: List[str]
    completed_steps: List[dict]   # {"step": str, "result": str}
    current_step_index: int
    final_answer: Optional[str]
    error: Optional[str]


def plan_node(state: PlanExecuteState, planner_chain) -> PlanExecuteState:
    """Generate the initial plan from the objective."""
    response = planner_chain.invoke({"objective": state["objective"]})
    # Parse numbered list from response
    lines = response.content.strip().split("\n")
    steps = [
        line.lstrip("0123456789. ").strip()
        for line in lines
        if line.strip() and line[0].isdigit()
    ]
    return {**state, "plan": steps, "current_step_index": 0}


def execute_node(state: PlanExecuteState, executor_agent) -> PlanExecuteState:
    """Execute the current step."""
    idx = state["current_step_index"]
    step = state["plan"][idx]
    try:
        result = executor_agent.invoke({
            "current_step": step,
            "completed_steps": state["completed_steps"],
        })
        completed = state["completed_steps"] + [
            {"step": step, "result": result["output"]}
        ]
        return {
            **state,
            "completed_steps": completed,
            "current_step_index": idx + 1,
            "error": None,
        }
    except Exception as exc:
        return {**state, "error": str(exc)}


def replan_node(state: PlanExecuteState, replanner_chain) -> PlanExecuteState:
    """Revise remaining steps after a failure."""
    failed_step = state["plan"][state["current_step_index"]]
    response = replanner_chain.invoke({
        "objective": state["objective"],
        "original_plan": "\n".join(
            f"{i+1}. {s}" for i, s in enumerate(state["plan"])
        ),
        "completed_steps": state["completed_steps"],
        "failed_step": failed_step,
        "failure_reason": state["error"],
    })
    # Replace remaining steps with revised plan
    revised = [
        line.lstrip("0123456789. ").strip()
        for line in response.content.strip().split("\n")
        if line.strip() and line[0].isdigit()
    ]
    new_plan = (
        state["plan"][: state["current_step_index"]] + revised
    )
    return {**state, "plan": new_plan, "error": None}


def should_continue(state: PlanExecuteState) -> str:
    if state.get("error"):
        return "replan"
    if state["current_step_index"] >= len(state["plan"]):
        return "finalize"
    return "execute"


# Build graph
builder = StateGraph(PlanExecuteState)
builder.add_node("plan", plan_node)
builder.add_node("execute", execute_node)
builder.add_node("replan", replan_node)
builder.add_conditional_edges("execute", should_continue, {
    "execute": "execute",
    "replan": "replan",
    "finalize": END,
})
builder.add_edge("plan", "execute")
builder.add_edge("replan", "execute")
builder.set_entry_point("plan")
graph = builder.compile()

Plan-and-Execute vs. ReAct: When to Choose Which

Criterion	ReAct	Plan-and-Execute
Task horizon	Short (< 5 steps)	Long (5–20+ steps)
Replanning needed	Rarely	Frequently
Transparency	Low (implicit)	High (explicit plan)
Latency	Lower (no planning phase)	Higher (extra LLM call)
Cost	Lower	Higher (planner call overhead)
Handles goal drift	Poorly	Well (replanner)
Debugging ease	Harder	Easier (inspect plan)

Note: Plan-and-Execute is not universally better. For simple Q&A with one or two tool calls, the extra planning overhead wastes tokens and latency. Use it when tasks are long, when you need an audit trail, or when you want users to review and approve the plan before execution begins.

Common Failure Modes and Mitigations

Over-planning: The planner generates 20 granular micro-steps for what should be a 5-step task. Mitigation: add a constraint in the planner prompt ("Generate between 3 and 8 steps").

Step dependency leakage: Step 4 references output from step 3 using a variable the executor has never seen. Mitigation: require each step to be self-contained, and inject completed step results into the executor's context.

Infinite replan loops: The replanner keeps generating steps that fail for the same underlying reason (e.g., a broken API). Mitigation: track replan count per step and hard-fail after N attempts.

MAX_REPLANS_PER_STEP = 3
replan_counts: dict[int, int] = {}

def should_continue_safe(state: PlanExecuteState) -> str:
    if state.get("error"):
        idx = state["current_step_index"]
        replan_counts[idx] = replan_counts.get(idx, 0) + 1
        if replan_counts[idx] >= MAX_REPLANS_PER_STEP:
            # Give up on this step, mark as skipped, move on
            return "finalize"
        return "replan"
    if state["current_step_index"] >= len(state["plan"]):
        return "finalize"
    return "execute"

Human-in-the-Loop Approval

One underappreciated benefit of Plan-and-Execute is that the plan is a natural checkpoint for human review. Before any tools are called, you can surface the plan to the user:

def run_with_approval(agent_graph, objective: str) -> str:
    """Run plan-and-execute with optional human plan approval."""
    # Generate plan only
    state = {"objective": objective, "plan": [], "completed_steps": [],
             "current_step_index": 0, "final_answer": None, "error": None}
    state = plan_node(state, planner_chain)

    # Show plan to user
    print("Proposed plan:")
    for i, step in enumerate(state["plan"], 1):
        print(f"  {i}. {step}")

    approval = input("\nApprove this plan? [y/n/edit]: ").strip().lower()
    if approval == "n":
        return "Task cancelled by user."
    if approval == "edit":
        # Let user edit individual steps
        state["plan"] = interactive_plan_editor(state["plan"])

    # Execute approved plan
    for event in agent_graph.stream(state):
        pass  # stream execution events
    return state.get("final_answer", "Execution complete.")

Summary

The Plan-and-Execute pattern brings discipline to long-horizon agentic tasks by externalising the plan as a first-class artifact. This makes agents more transparent, easier to debug, and more resilient — at the cost of an extra LLM call and slightly higher latency. For any task where you would otherwise be writing more than five ReAct steps, consider reaching for this pattern first.

Plan-and-Execute Agents