Agentic Software Construction
This section lives at the agentic level, the newest layer of software practice, where AI models aren’t just tools you use but collaborators you direct. Agentic software construction is the discipline of building software with and through AI agents: systems that can read code, propose changes, run commands, and iterate toward an outcome under human guidance.
The patterns here range from foundational concepts (what is a model, a prompt, a context window) to workflow patterns (plan mode, verification loops, thread-per-task) to execution patterns (compaction, progress logs, parallelization). Together they describe a way of working that’s already changing how software gets built, not by replacing human judgment, but by shifting where human judgment is most needed.
For patterns about controlling, evaluating, and steering agents, see Agent Governance and Feedback.
If the earlier sections of this book describe what to build and how to structure it, this section describes how to direct an AI agent to do that building effectively. The principles from every prior section still apply: agents need clear requirements, good separation of concerns, and honest testing. What changes is the workflow: you spend less time typing code and more time thinking, reviewing, and steering.
Foundations
What agents are made of: the core primitives that every agentic workflow builds on.
- Model — The underlying inference engine that generates language, code, plans, or tool calls.
- Prompt — The instruction set given to a model to steer its behavior.
- Context Window — The bounded working memory available to the model.
- Context Rot — The quiet decline in output quality as inputs grow, even inside the advertised window.
- Context Engineering — Deliberate management of what the model sees, in what order.
- Progressive Disclosure — Load instructions, tools, and references into the agent’s working memory only when they become relevant.
- Agent — A model in a loop that can inspect state, use tools, and iterate toward an outcome.
- Harness (Agentic) — The software layer around a model that makes it practically usable.
- Harness Engineering — The discipline of designing the configuration surfaces around a coding agent so a fixed model produces reliable outcomes in a specific codebase.
- REPL — The read-eval-print loop shell that wraps a coding agent so a human can direct it conversationally, one turn at a time, with session state preserved across turns.
- Deep Agents — The composite recipe behind every production coding agent: explicit planning, sub-agent delegation, persistent memory, and an extreme context-engineering layer applied together.
- Tool — A callable capability exposed to an agent.
- Agent-Computer Interface (ACI) — The discipline of designing tools, affordances, and interaction formats for a language-model agent rather than a human.
- MCP (Model Context Protocol) — A protocol for connecting agents to external tools and data sources.
- Structured Outputs — Constrain a model’s response to a known schema so the next program in the pipeline can parse it without guessing.
- Retrieval — Pulling relevant documents from an external corpus into the agent’s context at query time.
- ReAct — The thought-action-observation loop that turns a model into an agent; the inner primitive every coding agent runs on.
- Code Mode — Give the agent a small API and a sandbox; let it write code that calls tools instead of emitting JSON one step at a time.
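Several of these primitives compose into one small loop. A minimal sketch of the ReAct thought-action-observation cycle, with a hard-coded stub standing in for real model inference (every name here — `stub_model`, `list_files`, `react_loop` — is hypothetical, for illustration only):

```python
# Minimal ReAct sketch: a model in a loop that can inspect state,
# call tools, and iterate toward an outcome. The "model" is a stub
# that decides its next action from what is already in context.

def list_files(path: str) -> str:
    # Tool: a callable capability exposed to the agent.
    return "main.py\nutils.py"

TOOLS = {"list_files": list_files}

def stub_model(context: list[str]) -> dict:
    # Stand-in for inference: no observations yet -> go look first.
    if not any(turn.startswith("observation") for turn in context):
        return {"thought": "I should inspect the repo first.",
                "action": "list_files", "input": "."}
    return {"thought": "I have what I need.",
            "action": "finish", "input": "Repo contains main.py and utils.py"}

def react_loop(goal: str, max_turns: int = 5) -> str:
    context = [f"goal: {goal}"]              # the bounded working memory
    for _ in range(max_turns):
        step = stub_model(context)           # thought + chosen action
        if step["action"] == "finish":
            return step["input"]
        observation = TOOLS[step["action"]](step["input"])
        context.append(f"thought: {step['thought']}")
        context.append(f"observation: {observation}")
    return "gave up"
```

With a real model behind `stub_model`, the same skeleton is the inner loop every coding agent runs on; the harness is everything wrapped around it.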
Direction and Control
How you steer an agent: the patterns that shape what it does before, during, and between tasks.
- Plan Mode — A read-first workflow: explore, gather context, propose a plan before changing anything.
- Question Generation — Interview first, implement second: the agent asks structured clarifying questions before writing any code.
- Research, Plan, Implement — A three-phase discipline that separates understanding from decision-making from execution.
- Verification Loop — The cycle of change, test, inspect, iterate.
- Interactive Explanations — After the agent writes non-trivial code, have it build a small animated visualization that runs the real algorithm with scrub and step controls; use it to form the intuition a static description can’t give.
- Reflexion — Single-agent self-correction: the agent writes a natural-language post-mortem on each failure and feeds it back as context for the next attempt.
- Plan-and-Execute — Split the agent into a planner that thinks once, an executor that runs each step, and a re-planner that only re-engages when the plan needs to change.
- Agentic Context Engineering — Treat the agent’s working context as an evolving structured playbook of discrete tagged bullets, updated incrementally by three specialized roles (Generator, Reflector, Curator) instead of monolithic rewrites.
- Instruction File — Durable, project-scoped guidance for an agent.
- Skill — A reusable packaged workflow or expertise unit.
- Hook — Automation that fires at a lifecycle point.
- Memory — Persisted information for cross-session consistency.
- Compound Engineering — Make every shipped lesson land on a durable, agent-readable surface (instruction file, skill, hook, subagent, test) so the next feature is genuinely cheaper than the last.
- Agentic Engineering — The professional discipline of orchestrating coding agents to produce production software, where the human writes the spec, supervises the work, and reviews the output, and the agents write almost all of the code.
Coordination
How multiple agents and threads compose: from subagents to full teams.
- Subagent — A specialized agent delegated a narrower role.
- Thread-per-Task — Each coherent unit of work in its own conversation thread.
- Worktree Isolation — Separate agents get separate checkouts.
- Parallelization — Running multiple agents at the same time on bounded work.
- Orchestrator-Workers — A central agent decides the subtasks a goal requires, dispatches workers, and synthesizes the results.
- Back-Pressure (Agent) — Pacing mechanisms that keep an agent from overwhelming itself, its tools, or the humans and systems around it.
- Agent Teams — Multiple agents that coordinate with each other through shared task lists and peer messaging.
- Generator-Evaluator — Two agents in an adversarial loop: one writes, one judges, and quality improves through independent critique.
- Model Routing — Directing different tasks to different models based on cost, capability, and latency requirements.
- A2A (Agent-to-Agent Protocol) — A standard protocol for agents to discover each other and collaborate across vendor boundaries.
- Handoff — The structured transfer of context, authority, and state between agents or agent sessions.
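The orchestrator-workers shape reduces to three roles: plan, dispatch, synthesize. A minimal sketch under stated assumptions — `plan`, `worker`, and `synthesize` are hypothetical stubs for real agent calls, and the subtask split is hard-coded:

```python
# Orchestrator-workers sketch: a central agent decides the subtasks a
# goal requires, dispatches workers in parallel on bounded work, and
# synthesizes the results.

from concurrent.futures import ThreadPoolExecutor

def plan(goal: str) -> list[str]:
    # Orchestrator: decide which subtasks the goal requires.
    return [f"{goal}: research", f"{goal}: implement", f"{goal}: test"]

def worker(subtask: str) -> str:
    # Worker: run one bounded subtask independently.
    return f"done({subtask})"

def synthesize(results: list[str]) -> str:
    # Orchestrator again: merge worker outputs into one answer.
    return "; ".join(results)

def orchestrate(goal: str) -> str:
    subtasks = plan(goal)
    with ThreadPoolExecutor(max_workers=3) as pool:
        results = list(pool.map(worker, subtasks))  # order preserved
    return synthesize(results)
```

Bounding each worker's scope is what makes the parallelism safe: workers never see each other's context, so they cannot conflict mid-task.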
Execution Hygiene
How a single agent thread stays sane over long tasks: managing context, tracking progress, and recovering from interruptions.
- Compaction — Summarizing prior context so the agent can continue without exhausting the context window.
- Context Offloading — Route large tool results to the filesystem and pass the agent a summary plus a reference, keeping the active window lean while the full payload stays retrievable.
- Prompt Caching — Pin the unchanging prefix of a prompt so the provider can reuse its computed state and bill the repeat at a fraction of the cost.
- Progress Log — A durable record of what has been attempted, succeeded, and failed.
- Checkpoint — A gate in a workflow where the agent pauses, verifies conditions, and proceeds only if they pass.
- Externalized State — Storing an agent’s plan, progress, and intermediate results in inspectable files.
- Task Horizon — The length of task an agent can complete reliably on its own; the duration capacity that scopes every long-running run.
- Ralph Wiggum Loop — A shell loop that restarts an agent with fresh context after each unit of work, using a plan file as the coordination mechanism.
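Compaction, the first pattern in this group, can be sketched as a single function: when the transcript exceeds its budget, replace the older turns with a summary and keep only the recent tail. Here `summarize` is a hypothetical stand-in for a model call, and the budget numbers are arbitrary:

```python
# Compaction sketch: collapse older turns into a summary so the active
# window stays lean while recent turns remain verbatim.

def summarize(turns: list[str]) -> str:
    # Stand-in for a model-written summary of earlier work.
    return f"[summary of {len(turns)} earlier turns]"

def compact(transcript: list[str], budget: int = 6, keep_tail: int = 2) -> list[str]:
    if len(transcript) <= budget:
        return transcript                 # still within budget: no-op
    head = transcript[:-keep_tail]        # older turns to collapse
    tail = transcript[-keep_tail:]        # recent turns kept verbatim
    return [summarize(head)] + tail
```

The same shape underlies context offloading, with one difference: offloading writes `head` to the filesystem and keeps a reference, so the full payload stays retrievable instead of being lossily summarized away.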