Context Engineering

Pattern

A reusable solution you can apply to your work.

Understand This First

  • Context Window – context engineering manages a finite resource.
  • Prompt – the prompt is one component of the engineered context.

Context

At the agentic level, context engineering is the deliberate management of what a model sees, in what order, and with what emphasis. It goes beyond writing a good prompt: it covers the entire information environment presented to the model within its context window.

If prompting is writing a good question, context engineering is curating the entire briefing packet. It’s the difference between asking a consultant a question and giving that consultant the right documents, background, constraints, and examples before asking the question.

Problem

How do you ensure the model has the right information to produce high-quality output, given that its context window is finite and it can’t ask for what it doesn’t know it needs?

Most agent failures aren’t model failures. They’re context failures. The model is capable enough — it just wasn’t given the right information, or the right information was buried under noise.

Models work with what they’re given. If critical information is absent, the model fills gaps with plausible defaults. If irrelevant information crowds the window, it competes for the model’s attention and degrades output quality. The core challenge is signal-to-noise ratio: assembling the smallest possible set of high-signal tokens that maximize the likelihood of a good outcome.

Forces

  • Too little context leads to generic output that ignores your project’s specifics.
  • Too much context dilutes the model’s attention and wastes the finite window.
  • Context ordering matters. Models attend more strongly to the beginning and end of the window.
  • Context freshness matters. Stale information from earlier in a conversation can override current instructions.
  • You can’t always predict what the model will need, because the task may reveal requirements as it progresses.

Solution

Context engineering is the practice of assembling, ordering, and maintaining the information environment for a model. Four operations form the core of the discipline.

Select: Choose which files, documents, and instructions to pull into the context window. Prefer specific, relevant information over comprehensive dumps. If the agent is modifying one function, provide that function, its tests, and its interface contracts — not the entire repository. Let the agent extend its own selection through tools: an agent that can read files, search code, and run commands fetches information on demand rather than requiring everything preloaded.

Compress: As a conversation progresses, the context fills. Use compaction to summarize earlier exchanges, preserving decisions and state while discarding resolved tangents. Watch for signals of context degradation: the agent ignoring earlier instructions or regressing in quality.
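Compaction can be sketched as a threshold rule: once the transcript grows past a budget, older turns are replaced with a summary while recent turns stay verbatim. The `summarize` function here is a placeholder for an actual LLM summarization call; the turn counts are arbitrary.

```python
def summarize(turns: list[str]) -> str:
    # Placeholder: in practice an LLM distills decisions and open state
    # from the old turns, discarding resolved tangents.
    return f"[summary of {len(turns)} earlier turns]"

def compact(transcript: list[str], keep_recent: int = 4,
            max_turns: int = 10) -> list[str]:
    """Replace old turns with a summary once the transcript grows too long."""
    if len(transcript) <= max_turns:
        return transcript                    # still fits: leave it alone
    old, recent = transcript[:-keep_recent], transcript[-keep_recent:]
    return [summarize(old)] + recent         # state preserved, noise gone
```

A twelve-turn transcript compacts to a one-line summary plus the four most recent turns; a short transcript passes through untouched.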

Order: Place the most important information (project conventions, constraints, and the current task) at the beginning of the context. Supporting details and reference material follow. End with the specific request. Models attend most strongly to the beginning and end of the window, so structure matters.
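The ordering rule can be made concrete as a fixed section sequence, regardless of the order in which the pieces were gathered. The section names are illustrative, not a standard schema.

```python
def order_context(blocks: dict[str, str]) -> str:
    """Arrange sections by attention weight: conventions and the task
    lead the window, reference sits in the middle, the request closes it."""
    preferred = ["conventions", "task", "reference", "request"]
    return "\n\n".join(blocks[k] for k in preferred if k in blocks)
```

However the caller supplies the blocks, conventions always come out first and the specific request last, matching the primacy/recency structure described above.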

Isolate: Prevent cross-contamination between subtasks by giving each a clean context. Thread-per-task keeps unrelated work from polluting the current task’s window. Subagents take this further: each subagent gets its own context scoped to one narrow subtask, which is why multi-agent architectures often outperform a single agent on complex work.
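Isolation reduces to a simple invariant: each subtask's window contains the shared brief and that subtask only, never a sibling's transcript. The `run_model` function is a stand-in for a real model call; it returns a label so the per-call context size is visible.

```python
def run_model(context: list[str]) -> str:
    # Placeholder for an actual model call.
    return f"done ({len(context)} messages in window)"

def run_isolated(brief: str, subtasks: list[str]) -> list[str]:
    """Give every subtask a clean window: shared brief plus that
    subtask only, so unrelated work cannot pollute it."""
    return [run_model([brief, task]) for task in subtasks]
```

Every call sees exactly two messages, however many subtasks run; a single shared transcript would instead grow with each one.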

Beyond these four operations, two practices shape how context is built and maintained over time.

Layering: Use instruction files for durable project context that persists across every interaction. Use the prompt for task-specific context. Use memory for cross-session learnings. Each layer serves a different purpose and lifecycle — writing context into these persistent stores is what makes it available for future selection.
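Assembling the layers can be sketched as a simple merge, with each layer sourced from a store with a different lifecycle. The three-layer split mirrors the paragraph above; the concrete strings are illustrative.

```python
def build_context(instruction_file: str, memory: list[str], prompt: str) -> str:
    """Combine the three layers; each comes from a store with a
    different lifecycle."""
    layers = [
        instruction_file,    # durable: same for every interaction
        "\n".join(memory),   # cross-session learnings, appended over time
        prompt,              # task-specific, lives for this interaction only
    ]
    return "\n\n".join(layer for layer in layers if layer)
```

An empty layer (say, no memory yet) simply drops out rather than leaving a gap in the window.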

Formatting: Structure information for the model’s consumption. XML-style tags, clear section headers, and consistent delimiters help the model parse what it’s seeing. A wall of unstructured text is harder to work with than the same information organized under labeled sections, even though the token count is similar.
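The tagging idea can be shown in a few lines: the same information, wrapped in XML-style delimiters instead of concatenated into one block. The tag names are whatever the caller chooses; nothing here is a required schema.

```python
def tag_sections(sections: dict[str, str]) -> str:
    """Wrap each section in XML-style tags with clear delimiters,
    instead of emitting one unstructured wall of text."""
    return "\n".join(
        f"<{name}>\n{body}\n</{name}>" for name, body in sections.items()
    )
```

The token count barely changes, but each section now announces what it is before the model reads it.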

Tip

Structure your project’s instruction files in layers: a top-level CLAUDE.md for project-wide conventions, and directory-level files for subsystem-specific guidance. This way the agent always has relevant context without loading the entire project’s rules into every conversation.

How It Plays Out

A developer starts a session by pasting an entire 2,000-line file into the context and asking the agent to fix a bug on line 847. The agent’s output is mediocre; it struggles with the volume of irrelevant code. The developer starts over, providing only the relevant function, its test, and the error message. The agent fixes the bug on the first try.

A team creates a project instruction file that includes coding standards, architectural decisions, and common pitfalls. Every agent session starts with this context automatically. New team members notice that the agent produces code matching the team’s conventions from the first interaction, because the conventions are in the context, not just in human heads.

Example Prompt

“Before making changes, read CLAUDE.md for project conventions, then read src/api/routes.ts and its test file. Use the existing error-handling pattern you see in the routes file when adding the new endpoint.”

Consequences

Good context engineering dramatically improves the quality and consistency of agent output. It reduces the number of iterations needed to reach a good result and makes the agent’s work more predictable.

The cost is the effort of maintaining context artifacts: instruction files, memory entries, and curated reference documents. This is a new kind of work that didn’t exist before agentic coding. But it compounds: a well-maintained instruction file benefits every future session, and clear project documentation helps both agents and human newcomers.

At production scale, context engineering becomes an infrastructure concern. Token ratios in agentic workflows can run 100:1 input-to-output, making cache efficiency critical for cost and latency. Techniques like stable prompt prefixes, append-only context, and careful cache breakpoint placement move context engineering from an art of prompt-writing into a discipline of systems design.
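The append-only idea can be sketched with a toy model of prefix reuse: keep the prefix byte-stable and only ever append, so each new request shares its entire predecessor as a cacheable span. The longest-common-prefix function below is a stand-in for a provider's prompt cache, not a real KV cache.

```python
class AppendOnlyContext:
    """Keep a stable prefix and only ever append, so each new request
    shares its whole predecessor as a cacheable prefix."""

    def __init__(self, stable_prefix: str):
        self._parts = [stable_prefix]   # never edited after construction

    def append(self, text: str) -> None:
        self._parts.append(text)        # earlier tokens stay byte-identical

    def render(self) -> str:
        return "\n".join(self._parts)

def shared_prefix_len(previous: str, current: str) -> int:
    """Toy stand-in for cache reuse: length of the common leading span."""
    n = 0
    for a, b in zip(previous, current):
        if a != b:
            break
        n += 1
    return n
```

Appending keeps the entire previous render reusable; editing anything inside the prefix would shrink the shared span to the edit point and invalidate everything after it, which is why stable prefixes matter for cost and latency.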

  • Depends on: Context Window – context engineering manages a finite resource.
  • Depends on: Prompt – the prompt is one component of the engineered context.
  • Uses: Instruction File – instruction files are durable context.
  • Uses: Memory – memory carries context across sessions.
  • Uses: Thread-per-Task – isolation by giving each task a clean context window.
  • Uses: Subagent – subagents isolate subtask context from the parent conversation.
  • Uses: Compaction – compaction is a context engineering technique for long conversations.
  • Uses: Tool – tools let the agent select context on demand.

Sources

  • Tobi Lütke, CEO of Shopify, coined the term “context engineering” in a June 2025 post, defining it as “the art of providing all the context for the task to be plausibly solvable by the LLM.”
  • Andrej Karpathy amplified the concept days later, describing context engineering as “the delicate art and science of filling the context window with just the right information for the next step” and distinguishing it from the narrower practice of prompt crafting.
  • Anthropic’s “Effective Context Engineering for AI Agents” (2025) formalized the four core operations (write, select, compress, isolate) and established signal-to-noise ratio as the central design principle.
  • Philipp Schmid’s “The New Skill in AI Is Not Prompting, It’s Context Engineering” (2025) framed context failures as the primary source of agent failures, shifting the diagnostic focus from model capability to context quality.
  • Manus’s “Context Engineering for AI Agents: Lessons from Building Manus” demonstrated production-scale context engineering, introducing KV-cache hit rate as the critical metric and techniques like stable prefixes and append-only context for cache efficiency.
  • Nelson F. Liu et al., “Lost in the Middle: How Language Models Use Long Contexts” (2023), established that models attend most strongly to the beginning and end of the context window.