Test
“Testing shows the presence, not the absence, of bugs.” — Edsger Dijkstra
Understand This First
- Invariant — tests verify that invariants hold.
- Test Oracle — the oracle tells the test what the right answer is.
Context
You’ve built or modified software and you need to know whether it works. Not “probably works” or “looks right,” but an objective, repeatable answer. This is a tactical pattern, fundamental to every stage of software development.
A test builds on the idea of an Invariant or a Requirement: something the system should do or a property it should have. The test makes that expectation executable; it runs the code and checks the result.
Problem
Software behavior is invisible until you run it. Reading code can tell you what it probably does, but only execution reveals what it actually does. Manual checking is slow, unreliable, and doesn’t scale. How do you gain confidence that your software behaves correctly, and keep that confidence as the software changes?
Forces
- Manual verification is expensive and error-prone.
- Code that works today may break tomorrow after a seemingly unrelated change.
- Writing tests takes time that could be spent building features.
- Tests that are too tightly coupled to implementation become fragile and expensive to maintain.
- Without tests, you must re-verify everything by hand after every change.
Solution
Write executable claims about your software’s behavior. A test is a small program that sets up a situation, exercises a piece of code, and checks whether the result matches an expectation. If the result matches, the test passes. If not, it fails, and the failure tells you exactly where the problem is.
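The three-step shape described above (set up, exercise, check) can be sketched in a few lines of Python; the `slugify` function here is a hypothetical example, not something from this pattern:

```python
def slugify(title: str) -> str:
    """Turn a title into a URL-friendly slug (hypothetical code under test)."""
    return "-".join(title.lower().split())

def test_slugify_collapses_spaces():
    # Set up a situation.
    title = "Hello  World"
    # Exercise the code.
    result = slugify(title)
    # Check the result against the expectation.
    assert result == "hello-world"

test_slugify_collapses_spaces()  # passes silently; raises AssertionError on failure
```

The assertion is the executable claim: when it holds, the test passes; when it fails, the traceback points at exactly which expectation was violated.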
Tests come in many sizes. Unit tests check a single function or class in isolation. Integration tests check that multiple components work together. End-to-end tests simulate a real user interacting with the full system. Each level trades speed for realism: unit tests run in milliseconds but miss integration bugs; end-to-end tests catch more but run slowly and break easily.
The most important property of a good test is that it fails only when something is genuinely wrong. A test that fails randomly, or fails when you change an irrelevant detail, is worse than no test. It trains people to ignore failures.
How It Plays Out
A developer adds a function that calculates shipping costs based on weight and destination. They write three unit tests: one for a domestic package under 5 pounds, one for an international package, and one for a zero-weight edge case. Each test calls the function with specific inputs and asserts the expected output. These tests run in under a second and will catch any future change that accidentally breaks the shipping calculation.
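Those three tests might look like the sketch below. The body of `calculate_shipping` and its pricing rules are invented for illustration, since the scenario doesn't specify them:

```python
def calculate_shipping(weight_lb: float, destination: str) -> float:
    """Hypothetical shipping calculator: flat domestic rate under 5 lb,
    weight-based international rate, nothing to ship at zero weight."""
    if weight_lb == 0:
        return 0.0
    if destination == "domestic" and weight_lb < 5:
        return 5.99
    if destination == "international":
        return 15.00 + 2.50 * weight_lb
    return 9.99  # domestic, 5 lb and over

def test_domestic_under_five_pounds():
    assert calculate_shipping(3, "domestic") == 5.99

def test_international():
    assert calculate_shipping(2, "international") == 20.00

def test_zero_weight():
    assert calculate_shipping(0, "domestic") == 0.0

# A test runner such as pytest would discover these automatically;
# calling them directly works just as well for a sketch.
for test in (test_domestic_under_five_pounds, test_international, test_zero_weight):
    test()
```

Each test pins one specific input/output pair, so any future change that alters the calculation for these cases fails loudly.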
In an agentic workflow, tests become the primary feedback mechanism for AI agents. When you ask an agent to implement a feature, the agent writes code, runs the tests, sees failures, and iterates. The tests act as a specification the agent can check against, a machine-readable definition of “done.” Without tests, you’re left reviewing every line of generated code by hand.
Tests aren’t proof of correctness. They check specific cases you thought of. Bugs live in the cases you didn’t think of. Tests reduce risk; they don’t eliminate it.
A prompt that puts this into practice: “Write unit tests for the calculate_shipping function. Cover domestic under 5 pounds, international, and the zero-weight edge case. Each test should call the function with specific inputs and assert the expected output.”
Consequences
A healthy test suite gives you confidence to change code. You can refactor, add features, or upgrade dependencies, and the tests will catch most breakage immediately. This is especially valuable when working with AI agents that change code rapidly.
The cost is maintenance. Tests are code, and code has bugs. When the system’s behavior changes intentionally, you must update the tests to match. A large, poorly organized test suite can become a drag on development, where every change requires updating dozens of tests. The remedy is to test behavior, not implementation details, and to keep tests focused and independent.
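The difference between testing behavior and testing implementation details can be made concrete with a hypothetical example (the `Cart` class below is invented for illustration):

```python
class Cart:
    """Hypothetical shopping cart used to illustrate the distinction."""
    def __init__(self):
        self._items = {}  # private storage: name -> price

    def add(self, name: str, price: float) -> None:
        self._items[name] = price

    def total(self) -> float:
        return sum(self._items.values())

# Fragile: coupled to the private storage format. It breaks the moment
# the dict is swapped for a list, even though nothing observable changed.
def test_cart_internals():
    cart = Cart()
    cart.add("book", 12.50)
    assert cart._items == {"book": 12.50}

# Robust: checks only the observable behavior, so it survives refactoring.
def test_cart_total():
    cart = Cart()
    cart.add("book", 12.50)
    cart.add("pen", 1.50)
    assert cart.total() == 14.00

test_cart_total()
```

The behavioral test keeps its value through refactoring; the internals test is exactly the kind that turns every change into a round of test updates.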
Related Patterns
- Depends on: Invariant — tests verify that invariants hold.
- Depends on: Test Oracle — the oracle tells the test what the right answer is.
- Uses: Harness, Fixture — the surrounding infrastructure that runs tests.
- Enables: Regression detection — tests catch regressions automatically.
- Enables: Test-Driven Development — tests become a design tool.
- Enables: Refactor — tests make refactoring safe.
- Tests: Failure Mode — test each failure mode explicitly.
- Complements: Observability — tests verify before deployment; observability verifies after.
- Tests: Performance Envelope — load tests verify the envelope.
- Enables: Red/Green TDD — the TDD loop depends on working tests.
- Catches: Silent Failure — tests convert silent failures into loud ones.
- Related: Technical Debt — missing tests are a common form of debt.