---
slug: primitive-obsession
type: antipattern
summary: "Using raw strings, integers, booleans, or loose maps where a named domain type would carry the meaning and enforce the rules."
created: 2026-05-22
updated: 2026-06-07
related:
  value-object:
    relation: corrected-by
    note: "Value objects replace raw primitives with named domain concepts that carry local rules."
  make-illegal-states-unrepresentable:
    relation: corrected-by
    note: "Tighter types make invalid combinations impossible instead of checking scattered raw values later."
  domain-model:
    relation: violates
    note: "Primitive obsession keeps domain concepts out of the model and leaves their rules implicit."
  ubiquitous-language:
    relation: violates
    note: "Raw strings and numbers erase the domain vocabulary that code, documents, and agents should share."
  code-smell:
    relation: specializes
    note: "Primitive obsession is one of the classic code smells catalogued by Fowler and Beck."
  hard-coding:
    relation: related
    note: "Hard coding embeds literal values; primitive obsession embeds domain concepts as undifferentiated primitives."
  data-model:
    relation: violates
    note: "A data model loses meaning when its important concepts collapse into strings, integers, and loose maps."
  refactor:
    relation: corrected-by
    note: "Refactoring replaces raw values with named types while preserving behavior."
  ai-smell:
    relation: produces
    note: "Agents often emit plausible primitive fields when the prompt doesn't supply the domain types."
---

# Primitive Obsession

> **Antipattern**
>
> A recurring trap that causes harm. Learn to recognize and escape it.

*Using raw strings, integers, booleans, or loose maps where a named domain type would carry the meaning and enforce the rules.*

*Also known as: Primitive Typing, Stringly Typed Code, Type-Code Obsession*

Primitive obsession is not a dislike of primitive values. Strings, numbers, and booleans are useful building blocks.
The trap begins when a real domain concept gets flattened into one of them and every caller has to remember what the value means, which values are allowed, and which combinations are impossible.

## Understand This First

- [Value Object](value-object.md): the usual corrective pattern for domain values with rules and no identity.
- [Make Illegal States Unrepresentable](make-illegal-states-unrepresentable.md): the broader design principle behind replacing loose values with tighter types.
- [Ubiquitous Language](ubiquitous-language.md): the shared vocabulary primitive obsession erases from code.

## Symptoms

- A `status`, `role`, `currency`, or `country` travels through the codebase as a string.
- A function accepts `amount: float` and `currency: string`, so every caller can pass dollars to code that expects euros.
- Several booleans describe one state: `is_paid`, `is_shipped`, `is_cancelled`, `is_refunded`.
- Validation logic for the same raw value appears in controllers, serializers, tests, and UI code.
- An agent invents a `dict[str, Any]` or generic JSON blob because the prompt didn't provide a domain type.
- Reviewers ask, "What does this number mean?" or "Which strings are valid here?"
- The code has comments like `// status is one of pending, active, suspended, closed` because the type system doesn't know that.

## Why It Happens

Primitive obsession starts with speed. A string is easy to add. A new class, enum, value object, or tagged union feels like design work, and design work feels expensive when the feature is small. So the first version stores `"premium"` as a string, `0.85` as a threshold, or three booleans as state.

Serialization also pushes teams toward primitives. APIs, databases, queues, and config files exchange strings and numbers, so it feels natural to keep those shapes inside the application too. The boundary format leaks inward. A value that should be parsed once at the edge stays loose all the way through the domain code.
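
The alternative to a leaking boundary format, parsing once at the edge, might look like the minimal sketch below. `Money` is the concept this article names; the `Currency` enum with only two members, the `parse_money` helper, the payload keys, and the rule that amounts must be finite and non-negative are all assumptions made for illustration, not part of any real API.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation
from enum import Enum


class Currency(Enum):
    # Hypothetical closed set; a real system would list its own currencies.
    USD = "USD"
    EUR = "EUR"


@dataclass(frozen=True)
class Money:
    # Amount and currency travel together as one immutable value.
    amount: Decimal
    currency: Currency


def parse_money(payload: dict) -> Money:
    """Turn a loose boundary payload into a Money value, or raise ValueError."""
    try:
        amount = Decimal(str(payload["amount"]))
        currency = Currency(str(payload["currency"]).strip().upper())
    except (KeyError, InvalidOperation, ValueError) as exc:
        raise ValueError(f"invalid money payload: {payload!r}") from exc
    # Assumed rule: reject NaN, infinities, and negative amounts at the edge.
    if not amount.is_finite() or amount < 0:
        raise ValueError(f"amount must be finite and non-negative: {amount}")
    return Money(amount, currency)


# Only the boundary sees the raw dict; domain code receives Money.
price = parse_money({"amount": "10.00", "currency": "usd"})
```

After this call, downstream functions can accept `Money` and never re-check spelling, case, or numeric validity. The raw dictionary stays at the edge.
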

Agents amplify the habit. When a prompt says "add a priority field," the agent sees a thousand tutorial examples where priority is a string. Unless the surrounding code or instruction file names a `Priority` type, the agent will often add `"low"`, `"normal"`, and `"high"` as raw text. The patch works until the next agent writes `"urgent"` or `"High"` somewhere else.

The deeper cause is a missing domain model. If the team hasn't named `Money`, `EmailAddress`, `OrderStatus`, or `DeploymentEnvironment` as concepts, the code can't carry those names. The primitives are not the problem by themselves. They are evidence that the meaning never found a proper home.

## The Harm

Primitive obsession spreads rules across the codebase. Every raw value needs its own validation, parsing, comparison, formatting, and error handling. If `OrderStatus` is a string, every function that touches it needs to know the allowed strings. If `Money` is two unrelated fields, every function that adds amounts needs to remember to check currency first.

It also creates invalid states. Three booleans can express eight combinations even when the business only allows four. A `string` can hold `"admin"`, `"Admin"`, `"administrator"`, `""`, or `"🤷"`. A `float` can hold negative money, `NaN`, and values rounded in ways no payment processor accepts. The system then grows defensive branches to handle states that should never have existed.

In agentic coding, the harm is review load. A human reviewer has to inspect every primitive-bearing patch for hidden domain assumptions. Did the agent use the canonical status strings? Did it preserve timezone meaning? Did it compare currency before adding amounts? Did it pass tenant IDs as strings to code that expects user IDs? The more meaning lives outside the types, the more supervision the human has to do by hand.

Primitive obsession also weakens future prompts. Agents read the local code as instruction.
If the code teaches them that statuses are strings and money is a float, they will keep producing more of that shape. The antipattern becomes self-reinforcing.

## The Way Out

Promote domain concepts into named types at the point where they first gain rules. The type does not have to be large. It only has to make the meaning explicit and keep invalid values from spreading.

Use four moves:

**Name the concept.** If a value has domain meaning, give it a domain name. `EmailAddress`, `Money`, `OrderStatus`, `TenantId`, and `DeploymentEnvironment` tell the next reader what the value is before they inspect its contents.

**Constrain the value.** Use enums for closed sets, value objects for structured values, and tagged unions for state that varies by case. Parse raw input once at the boundary, then pass the constrained type through the rest of the system.

**Move behavior to the type.** A `Money` type should know how to add money and reject mixed currencies. An `EmailAddress` type should validate format on construction. An `OrderStatus` type should expose allowed transitions. Don't make every caller rediscover the rules.

**Teach the agent the type.** Put the domain types in the prompt context or instruction file before asking for code. "Use `OrderStatus`, not raw strings. New statuses must be added to the enum and the transition table. Don't compare status values as text." That instruction gives the agent a local convention to follow.

> **💡 Tip**
>
> When reviewing an agent's patch, search for new string, number, boolean, and `dict` fields. For each one, ask whether it is a real domain concept wearing a primitive costume. If yes, ask the agent to extract the named type before the shape spreads.

## How It Plays Out

A subscription service stores plan tiers as strings: `"free"`, `"pro"`, and `"enterprise"`. An agent adds a billing feature and checks for `"paid"` in one branch because the prompt said "paid plans."
Tests pass for the pro case but fail in production for enterprise customers. The team replaces the string with a `PlanTier` enum and exposes `is_billable()` on the type. Future code asks the domain question directly instead of guessing which strings imply payment.

A payments module passes `amount: float` and `currency: string` through twenty functions. One path adds `10.00 USD` to `10.00 EUR` because both are floats by the time they reach the accumulator. The fix is a `Money` value object that stores a decimal amount and currency together. Its `add()` method rejects mixed currencies unless a conversion step has already produced a common currency. The bug disappears because the invalid operation no longer has an easy expression.

A workflow engine models job state with booleans: `started`, `finished`, `failed`, `cancelled`. The agent asked to add retries writes a branch for `failed && finished` because the data shape permits it. The state machine never meant to allow that combination. The team replaces the booleans with a tagged union: `Queued`, `Running`, `Succeeded`, `Failed(reason)`, `Cancelled(by)`. The retry code becomes shorter because each case carries only the fields it can legally have.

> **⚠️ Warning**
>
> Do not fix primitive obsession by wrapping every value blindly. `PageNumber`, `RetryCount`, and `PercentComplete` may earn names; a local loop index probably doesn't. The test is domain meaning plus rules, not discomfort with primitives in general.

## Consequences

Replacing primitives with domain types makes code easier to read and safer to change. The names carry the ubiquitous language into the implementation. Constructors and enums reject invalid values early. Agents given those types generate code that follows the model instead of inventing local conventions.

The cost is extra structure. Small systems can drown in tiny wrappers if every value becomes a class.
Serialization also needs care: strict internal types still have to cross loose external boundaries such as JSON, forms, CSV files, databases, and tool outputs. That conversion code belongs at the boundary. Once the value is inside the system, it should carry its meaning with it.

The judgment call is timing. Extract too early and you create ceremony. Extract too late and the primitive shape spreads through APIs, tests, fixtures, and stored data. A practical rule: when the second validation check appears, or when a value needs a second field to make sense, promote the concept.

## Sources

- Martin Fowler and Kent Beck's *[Refactoring: Improving the Design of Existing Code](https://martinfowler.com/books/refactoring.html)* names primitive obsession as a code smell and gives the core remedy: replace loose data values with objects that carry behavior and meaning. Fowler's online catalog entry for [Replace Primitive with Object](https://refactoring.com/catalog/replacePrimitiveWithObject.html) shows the small refactoring step behind the larger design move.
- Eric Evans's *Domain-Driven Design* provides the domain-modeling frame this article uses: the important concepts in the domain should appear in the model, and the model should speak the team's ubiquitous language. Domain Language's [DDD resources page](https://www.domainlanguage.com/ddd/) is the stable public pointer for Evans's book and surrounding work.
- Yaron Minsky's Jane Street writing on [Effective ML Revisited](https://blog.janestreet.com/effective-ml-revisited/) gives the type-design principle that makes this antipattern costly in practice: invalid states should be impossible to represent, not merely checked after they appear.