Harnessability

Pattern

A reusable solution you can apply to your work.

Harnessability is the degree to which a codebase’s structural properties make it tractable for AI agents to work in safely and effectively.

“Not every codebase is equally amenable to harnessing.” — Martin Fowler

Also known as: Agent-Friendliness, Ambient Affordances

Understand This First

  • Harness (Agentic) – the harness is the mechanism; harnessability is what the codebase provides for the harness to work with.
  • Feedforward – feedforward controls require harnessable properties (types, boundaries, conventions) to be effective.
  • Feedback Sensor – feedback sensors require structural properties (type systems, test suites) to generate useful signals.

Context

At the agentic level, harnessability describes a quality of the codebase itself, not the agent or the harness that wraps it. A harness provides feedforward controls and feedback sensors. But those controls can only work if the codebase gives them something to latch onto. A type checker is a powerful sensor, but only if the code is written in a typed language. An architectural boundary rule is a useful guide, but only if the codebase has clear module boundaries to enforce.

Ned Letcher coined the term “ambient affordances” for these structural properties: features of the environment that make it legible, navigable, and tractable to agents operating within it. Harnessability is the aggregate of those affordances. A highly harnessable codebase enables more effective controls; a low-harnessability codebase limits what even the best harness can do.

Problem

Why do identical agents, given the same task, perform well in one codebase and poorly in another?

The agent and the model are the same. The harness configuration is the same. The difference is the code they’re working in. One project has strong types, consistent naming, clear module boundaries, and a comprehensive test suite. The other has dynamic types, ad-hoc naming, tangled dependencies, and sparse tests. The first project gives the harness rich signals to work with. The second gives it almost nothing.

Forces

  • Harness quality has a ceiling set by the codebase. You can’t add a type-checking sensor to untyped code, or enforce module boundaries in a codebase that has none.
  • Harnessability overlaps with code quality, but isn’t identical. A codebase can be well-crafted for human developers yet still opaque to agents if it relies on implicit conventions that aren’t machine-readable.
  • Improving harnessability costs effort. Adding types to an untyped project, documenting conventions, or clarifying module boundaries takes work. The payoff comes later, spread across every agent session.
  • Different properties matter at different scales. Strong typing helps at the function level. Module boundaries help at the architectural level. Consistent naming helps everywhere.

Solution

Treat harnessability as a design property worth investing in, the same way you invest in testability or maintainability. A harnessable codebase gives agents structural handholds that the harness converts into controls.

The properties that matter most fall into three groups.

Type information. Strong, static types contribute more to harnessability than any other single property. A type checker running as a feedback sensor catches errors in milliseconds with zero ambiguity. Languages like TypeScript, Rust, Go, and Swift give agents a constant stream of fast, deterministic feedback. Dynamic languages can close part of the gap with type annotations (Python’s type hints, Ruby’s RBS), but the coverage is usually incomplete.

Module structure. Clear boundaries, explicit interfaces, and enforced dependency rules make a codebase navigable. An agent working in a well-modularized project can scope its changes to one module and trust that the boundary prevents unintended side effects elsewhere. Without boundaries, every change is potentially global, and the agent must reason about the entire system at once.
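A boundary only helps if something enforces it. One lightweight approach is an architecture test that inspects imports; the sketch below (module names and the `ui`/`db` rule are hypothetical) uses Python's `ast` module to flag cross-boundary imports:

```python
import ast

# Hypothetical boundary rule: code under `ui` must not import from `db`.
FORBIDDEN = {"ui": {"db"}}


def boundary_violations(module: str, source: str) -> list[str]:
    """Return names imported by `module` that cross a forbidden boundary."""
    top = module.split(".")[0]
    banned = FORBIDDEN.get(top, set())
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            continue
        violations += [n for n in names if n.split(".")[0] in banned]
    return violations


# The check fails loudly when ui code reaches into the db layer.
assert boundary_violations("ui.forms", "from db.models import User") == ["db.models"]
assert boundary_violations("ui.forms", "import json") == []
```

Run as part of the test suite, a check like this turns an implicit architectural rule into a feedback signal the agent receives on every change.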

Codified conventions. Naming patterns, file organization rules, and architectural decisions that exist only in developers’ heads are invisible to agents. The same conventions written into linter rules, instruction files, or configuration become feedforward controls that steer agents automatically. Fowler’s observation holds: frameworks that abstract away incidental detail (like Spring or Rails) implicitly increase harnessability by reducing the surface area where agents can make mistakes.

A fourth property cuts across all three: test coverage. Tests are the backbone of feedback sensing. A codebase with comprehensive, fast tests gives the steering loop the signals it needs to converge. Sparse or slow tests leave the agent flying blind.

Optimization Checklist

Knowing the categories is one thing. Knowing where to start is another. These are the highest-leverage changes you can make, roughly ordered by effort-to-impact ratio:

  • Add a single-command verification step. If make check or npm test runs all linters, type checks, and tests in one invocation, the agent can verify its own work without you specifying the right incantation each time.
  • Make CLI tools emit structured output. When your build scripts, test runners, and linters support --json or machine-readable output, the agent parses results directly instead of scraping human-formatted text. Fewer parsing errors, faster feedback loops.
  • Write an AGENTS.md or CLAUDE.md file. A single document describing module boundaries, naming conventions, forbidden patterns, and the project’s verification command gives the agent feedforward at the start of every session.
  • Add type annotations to your most-edited files first. Full-codebase type adoption is expensive. Start with the files agents touch most often and let coverage expand naturally.
  • Enforce module boundaries with tooling. An ESLint rule, an import linter, or an architecture test that prevents cross-boundary imports does more for harnessability than any amount of documentation about what modules should not import.
  • Keep test execution fast. A test suite that finishes in seconds lets the steering loop iterate quickly. A suite that takes minutes slows every correction cycle and tempts the agent (and you) to skip verification.
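To make the structured-output point concrete, here is a sketch of the agent-side half: parsing a machine-readable linter report instead of scraping prose. The JSON schema is illustrative, not any real tool's format:

```python
import json

# Hypothetical report, as a linter run with a --json flag might emit it.
raw = """
[
  {"file": "app/models.py", "line": 42, "severity": "error",
   "message": "incompatible types in assignment"},
  {"file": "app/views.py", "line": 7, "severity": "warning",
   "message": "unused import"}
]
"""


def blocking_issues(report_json: str) -> list[str]:
    """Keep only errors, formatted as stable `file:line message` strings."""
    issues = json.loads(report_json)
    return [
        f"{i['file']}:{i['line']} {i['message']}"
        for i in issues
        if i["severity"] == "error"
    ]


# Structured output parses deterministically; no regexes over human prose.
assert blocking_issues(raw) == ["app/models.py:42 incompatible types in assignment"]
```

The same idea applies to test runners and build scripts: a stable schema means the feedback loop never breaks because a tool reworded its human-facing output.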

Tip

When you notice an agent struggling with a specific part of your codebase, ask whether the problem is the agent or the code. If the same task succeeds in a well-typed module but fails repeatedly in an untyped utility folder, the folder’s low harnessability is the bottleneck. Improving the code improves every future agent session.

How It Plays Out

A team maintains a large Python monorepo. Half the codebase has type annotations and a strict mypy configuration. The other half predates the typing effort and runs with no type checking. When agents work in the typed half, the mypy sensor catches type mismatches on every change, and the agents self-correct quickly. In the untyped half, type errors surface only through test failures, which are slower, less specific, and sometimes absent for edge cases. The team tracks agent success rates by directory and finds a 40% gap in first-pass accuracy between the two halves. They prioritize adding type annotations to the most-edited untyped modules, not for human benefit alone, but because each annotated module immediately becomes more tractable for agents.

A solo developer starts a new Rust project. The language’s ownership model, strong types, and cargo-enforced module structure mean the codebase starts at high harnessability by default. The agent’s feedback loop includes the compiler (which catches memory, type, and borrow errors), clippy (which catches idiomatic mistakes), and cargo test. From the first commit, the agent operates inside a tight correction loop. The developer spends little time debugging agent output because the language’s structural properties do much of the work.

Example Prompt

“Run mypy across the codebase and show me which modules have no type annotations. Prioritize adding type stubs to the five most-edited files so future agent sessions get better feedback.”

Consequences

Investing in harnessability compounds. Every improvement to type coverage, module structure, or convention documentation benefits not just the current task but every future agent session. Teams that treat harnessability as a first-class concern find that their agents require less supervision over time, because the codebase itself constrains the agent toward correct behavior.

The cost is upfront effort that may feel disconnected from immediate feature work. Adding types, writing architectural rules, and documenting conventions don’t ship features. The return is indirect: faster agent iterations, fewer correction cycles, and higher first-pass accuracy. Teams that skip this investment often compensate with heavier human review, which is more expensive in the long run.

There’s also a language-choice implication. Codebases in statically typed languages start with higher harnessability than those in dynamic languages. This doesn’t make dynamic languages unusable with agents, but it does mean that teams using them must invest more deliberately in type annotations, linter rules, and convention documentation to reach comparable harnessability.

  • Depends on: Harness (Agentic) — the harness is the mechanism; harnessability is what the codebase provides for the harness to work with.
  • Depends on: Feedforward — feedforward controls require harnessable properties (types, boundaries, conventions) to be effective.
  • Depends on: Feedback Sensor — feedback sensors require structural properties (type systems, test suites) to generate useful signals.
  • Enables: Steering Loop — a harnessable codebase supports tighter, faster steering loops.
  • Related: Boundary — clear boundaries are a core harnessability property.
  • Related: Cohesion — cohesive modules are more navigable for agents.
  • Related: Instruction File — instruction files codify conventions that might otherwise be invisible to agents.
  • Related: Test — test coverage is the foundation of feedback sensing.

Sources

  • Martin Fowler and Birgitta Boeckeler introduced harnessability as a property of the agent’s working environment in their harness engineering article (2025), drawing on Letcher’s “ambient affordances.”
  • Ned Letcher coined the term “ambient affordances” for codebase properties that make environments legible and tractable to agents.
  • OpenAI’s harness engineering guide describes how codebase structure determines the effectiveness of agent controls.
  • Davide Consonni’s “Creating AI-Friendly Codebases” offers practical guidance on optimizing codebases for AI agent workflows.

Further Reading