Cohesion

Concept

A foundational idea to recognize and understand.

Context

A module or component groups code together. But grouping alone isn’t enough. What matters is whether the grouped things actually belong together. Cohesion measures that fit. It operates at the architectural scale and is one of the two fundamental metrics of structural quality, alongside coupling.

High cohesion means everything in a module relates to a single, clear purpose. Low cohesion means the module is a grab bag — a collection of unrelated things that happen to share a file or a namespace.

Problem

How do you know whether the contents of a module actually belong together, rather than just being lumped together by convenience or history?

Forces

Grouping by technical layer (all controllers together, all models together) is easy but often produces low cohesion. The contents share a mechanism but not a purpose.
Grouping by domain concept (all user-related code together) tends to produce higher cohesion but can blur layer boundaries.
Modules accumulate clutter over time as developers add “just one more thing” to the most convenient location.
Small, highly cohesive modules are individually clear but collectively numerous, with more boundaries to manage.

Solution

Apply a simple test: can you describe what a module does in a single sentence without using “and”? If you can, it’s probably cohesive. If you need “and” (“this module handles authentication and email formatting and logging configuration”), it’s doing too much.
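The test is meant for human judgment, but a rough mechanical version can be sketched as follows. The function name fails_and_test is hypothetical, and the check is only an approximation: a description can legitimately contain "and" (as in "search and replace"), so treat a hit as a prompt to look closer, not as a verdict.

```python
import re


def fails_and_test(description: str) -> bool:
    """Return True if a one-sentence module description uses the word 'and'.

    The word-boundary match (\\b) avoids false hits on words that merely
    contain the letters, such as 'handles'.
    """
    return re.search(r"\band\b", description, flags=re.IGNORECASE) is not None


print(fails_and_test(
    "This module handles authentication and email formatting "
    "and logging configuration."
))  # True
print(fails_and_test("This module delivers notifications to users."))  # False
```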

Aim for functional cohesion, where every element contributes to a single well-defined task or concept. Avoid coincidental cohesion, where elements are together only because someone had to put them somewhere.

When you notice low cohesion, refactor: extract the unrelated pieces into their own modules. In agentic coding, this refactoring pays off quickly. An AI agent working on a cohesive module can hold the module’s full purpose in mind. A module that does five unrelated things forces the agent to load context about all five, most of which is irrelevant to the task at hand.

How It Plays Out

A developer reviews a file called utils.py that has grown to 2,000 lines. It contains four groups of code that have nothing to do with one another: date formatting functions, HTTP retry logic, string sanitizers, and configuration loaders. She splits it into four cohesive modules: date_utils.py, http_retry.py, sanitizers.py, and config.py. Each module now has a single purpose and is far easier to understand at a glance.
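As an illustration of what the split might look like, here is a minimal sketch. Only the four file names come from the scenario; every function name and body is an assumption made up for this example. The four files are shown in one listing for brevity, with comments marking where each would begin.

```python
# Sketch only: four hypothetical files shown in one listing.
# All function names and bodies are illustrative assumptions.
import json
import time
from datetime import date


# --- date_utils.py: date formatting only ---
def format_iso_date(d: date) -> str:
    """Return a date as YYYY-MM-DD."""
    return d.isoformat()


# --- http_retry.py: retry logic only ---
def call_with_retry(fn, attempts: int = 3, delay_seconds: float = 0.0):
    """Call fn until it succeeds; re-raise the last error once
    attempts (expected to be >= 1) are exhausted."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)


# --- sanitizers.py: string cleaning only ---
def strip_control_chars(text: str) -> str:
    """Drop non-printable characters (this also removes tabs and newlines)."""
    return "".join(ch for ch in text if ch.isprintable())


# --- config.py: configuration loading only ---
def parse_config(raw_json: str) -> dict:
    """Parse configuration from a JSON string."""
    return json.loads(raw_json)
```

Each section passes the single-sentence test on its own, which is what makes the boundaries easy to draw.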

An AI agent is asked to fix a bug in notification delivery. The project has a notifications/ module containing only notification-related code: templates, delivery logic, preference management. The agent reads the module, understands the full picture, and fixes the bug in one pass. Had the notification code been scattered across a generic services.py, the agent would have needed to sift through unrelated code to find the relevant pieces.

Tip

The name of a file or module is a promise about its contents. When the name no longer matches what is inside, either rename the module or move the misfit code out. This is cheap maintenance that pays compound interest.

Example Prompt

“Split utils.py into cohesive modules: date_utils.py for date formatting, http_retry.py for retry logic, sanitizers.py for string cleaning, and config.py for configuration loading. Update all imports.”
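One way to check the "update all imports" step afterwards is to scan source files for leftover references to the old module. The sketch below uses Python's standard ast module. The helper name find_imports_of and the sample source string are hypothetical, and the check is deliberately narrow: it sees "import utils" and "from utils import ..." (including the relative form "from .utils import ..."), but not "from . import utils" or dynamic imports.

```python
import ast


def find_imports_of(source: str, module_name: str = "utils") -> list[tuple[int, str]]:
    """Return sorted (line number, imported name) pairs for imports of module_name."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.module == module_name:
            # from utils import a, b
            hits.extend((node.lineno, alias.name) for alias in node.names)
        elif isinstance(node, ast.Import):
            # import utils
            hits.extend(
                (node.lineno, alias.name)
                for alias in node.names
                if alias.name == module_name
            )
    return sorted(hits)


# Hypothetical file contents that still depend on the old module.
example = "import os\nfrom utils import format_date, retry\nimport utils\n"
print(find_imports_of(example))
# [(2, 'format_date'), (2, 'retry'), (3, 'utils')]
```

An empty result for every file in the project suggests the old module is no longer imported directly, though it does not prove that the new imports are correct.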

Consequences

High cohesion makes code easier to find, understand, test, and change. It reduces the amount of context needed to work on any single piece. It makes modules more reusable — a module that does one thing well can be used wherever that thing is needed.

The tradeoff is that splitting code into highly cohesive modules leaves you with more modules overall, and with them more explicit interfaces and more navigation. The trade is almost always a net win, but it takes investment in naming, directory structure, and module discovery.

Measures: Module, Component — cohesion evaluates whether a grouping is good.
Paired with: Coupling — high cohesion and low coupling are the twin goals of structural design.
Supports: Separation of Concerns — cohesive modules naturally separate concerns.
Guided by: Shape — the shape of a module reveals its cohesion (or lack thereof).
Informed by: Domain Model — modules that align with domain concepts tend to be highly cohesive.
Informed by: Ubiquitous Language — code organized around shared domain terms groups related behavior naturally.
Violated by: Big Ball of Mud – mud has no cohesion; modules accumulate unrelated responsibilities.

Encyclopedia of Agentic Coding Patterns