Artifact
A durable, named, inspectable product of work — a thing you can reference after the moment that made it.
Understand This First
- State — an artifact is one of the places state is allowed to live between sessions.
- Source of Truth — an artifact becomes useful when something can be said to be authoritatively true about it.
What It Is
Write a plan down in a file and you’ve made an artifact. Sketch the same plan on a whiteboard, photograph it, commit the photo: still an artifact. Explain the same plan out loud in a meeting that nobody recorded and nothing stuck, and you haven’t. The difference isn’t the medium. It’s that one of them you can point at tomorrow, and the other is gone.
An artifact is a product of work that persists beyond the moment of its making. Three properties define it:
- Persistent. It survives the session that produced it. Close the laptop, end the conversation, restart the agent — the artifact is still there.
- Addressable. It has a name, a path, or an identifier that lets someone else reach it without being told the story of how it got made.
- Inspectable. A person or another agent who was not present when it was made can examine it and understand what it says.
Specifications, plans, design documents, architecture decision records, briefs, handoff notes, commits, pull requests, build outputs, release notes, progress logs, CLAUDE.md files, Parquet files staged between pipeline steps: all artifacts. Conversations, mental models, working memory, the half-formed intention in an agent’s context window: not artifacts. The moment one of those transient things is written down in a form the next person can open, it crosses the line.
Why It Matters
Agentic workflows are built on artifacts. The shift from “an engineer types code” to “an agent ships work” is, operationally, a shift from transient in-head state to a chain of durable things you can inspect: a brief becomes a spec, the spec becomes a plan, the plan becomes an implementation, the implementation becomes tests and a pull request, the pull request becomes a release note. Each link in that chain is a handoff, and each handoff requires the upstream step to have produced something the downstream step can read without the original author present.
Agents magnify this requirement. A human colleague can rebuild some of the lost context from tone, shared history, or a quick follow-up conversation. An agent starting a fresh session has only what was written down. If the previous session’s work lives only in a closed context window, the next session has nothing to pick up. If the previous session produced an artifact (a plan file with checkboxes, a design doc with open questions, a commit with a message), the next session has a place to start.
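As a minimal sketch of how a resuming session might pick up from a plan file with checkboxes, the following assumes the plan uses Markdown task-list lines ("- [ ]" and "- [x]"). The file name PLAN.md and the helper names read_plan and next_step are illustrative assumptions, not part of any particular tool.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: steps are Markdown task-list lines such as "- [ ] step" or "- [x] step".
CHECKBOX = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


def read_plan(text: str) -> list[tuple[bool, str]]:
    """Return (done, description) for every checkbox line in the plan text."""
    steps = []
    for line in text.splitlines():
        match = CHECKBOX.match(line)
        if match:
            done = match.group(1).lower() == "x"
            steps.append((done, match.group(2).strip()))
    return steps


def next_step(text: str) -> Optional[str]:
    """Return the first unchecked step, or None when every step is checked."""
    for done, description in read_plan(text):
        if not done:
            return description
    return None


if __name__ == "__main__":
    plan_path = Path("PLAN.md")  # hypothetical location of the plan artifact
    if plan_path.exists():
        print(next_step(plan_path.read_text(encoding="utf-8")))
```

The point of the sketch is that the new session needs nothing except the file: no transcript, no memory of the prior conversation.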
Treating work as artifact-producing also changes how much review is possible. A plan held in the agent’s head cannot be reviewed before execution; a plan written to PLAN.md can. A design implied by the structure of a commit cannot be argued with; a design written as an Architecture Decision Record can. Every artifact a workflow produces is another gate where a human can intervene, another point a second agent can learn from, and another piece of evidence the system can replay if something goes wrong later.
How to Recognize It
When you’re not sure whether something counts, run the three tests:
- Persistence: If the laptop crashes right now, is it still there?
- Address: Can you send someone a link, a path, or a filename and have them find it?
- Inspection: Can someone who wasn’t there read it and learn something useful?
A chat transcript in a closed window fails all three. A chat transcript saved to conversations/2026-04-23.md passes all three. The content didn’t change; saving it is what made the difference.
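The three tests can be approximated in code, with the caveat that these are rough proxies: whether a reader actually learns something useful still takes a reader. The sketch below assumes a transcript is saved as a dated Markdown file; save_transcript and artifact_checks are hypothetical helper names.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def save_transcript(text: str, directory: Path, day: date) -> Path:
    """Write the transcript to <directory>/<YYYY-MM-DD>.md and return its path."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{day.isoformat()}.md"
    path.write_text(text, encoding="utf-8")
    return path


def artifact_checks(path: Optional[Path]) -> dict[str, bool]:
    """Rough proxies for the three tests. None stands for 'no address at all'."""
    addressable = path is not None
    persistent = addressable and path.is_file()
    inspectable = False
    if persistent:
        try:
            # Proxy only: readable, non-empty text. Usefulness needs a real reader.
            inspectable = bool(path.read_text(encoding="utf-8").strip())
        except UnicodeDecodeError:
            inspectable = False
    return {
        "persistence": persistent,
        "address": addressable,
        "inspection": inspectable,
    }
```

A transcript that exists only in a closed window corresponds to artifact_checks(None), which fails every check; the same text passed through save_transcript passes all of them.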
Watch for near-misses. A ticket title without a body is technically persistent and addressable, but not very inspectable, since the content lives in the heads of the people who wrote it. A commit message that reads only “fix” fails the same way. The strongest artifacts are the ones that answer “what does this say?” without needing the author on the phone.
How It Plays Out
An SRE on the Friday overnight shift asks an agent to investigate why a checkout flow has been failing intermittently for the past week. The agent works through 90 minutes of log queries, distributed traces, and metric comparisons, narrows the suspect surface to one of three downstream services, and the shift ends. Saturday’s on-call inherits the case. If Friday’s agent kept its reasoning only in chat, Saturday’s SRE gets a vague summary and re-runs the same queries before making any new progress. If Friday’s agent wrote the timeline, the eliminated services, and the open hypotheses to INCIDENT_NOTES.md, Saturday’s SRE opens the file and resumes at the next narrowing step. Friday’s investigation costs 90 minutes either way. Only one version of it leaves the next person something to pick up.
A product manager asks an agent to analyze three months of support tickets and propose a roadmap. The agent does the analysis in a long conversation, lists five priorities at the end, and the window closes. A week later, the PM wants to share the reasoning with engineering. None of it exists anymore: no document, no ranked list, no evidence chain from tickets to priorities. The analysis happened, but because nothing was written down as an inspectable output, it can’t be shared, verified, or challenged. The fix is mechanical: at the start of the session, tell the agent to produce a ROADMAP.md that cites specific tickets for each priority. The conversation becomes scaffolding; the artifact is the deliverable.
A build pipeline treats every intermediate stage as an artifact. Source code compiles to an object file; the object file links into a binary; the binary is signed and packaged into a release bundle; the release bundle is published with a checksum and a version tag. Any stage that fails can be diagnosed by inspecting the outputs of the stages before it. If a production rollout goes wrong, the team can point at a specific versioned artifact and roll back to the previous one. None of that works if “the build” is a set of commands someone ran on their laptop.
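The final stage, publishing with a checksum and a version tag, might look roughly like the sketch below. The manifest layout, the ".manifest.json" suffix, and the function names are assumptions made for illustration; real pipelines usually rely on their build system's or registry's own metadata format.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash the file in chunks so large bundles do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(bundle: Path, version: str) -> Path:
    """Record the bundle's name, version tag, and checksum next to the bundle."""
    manifest = {
        "file": bundle.name,
        "version": version,
        "sha256": sha256_of(bundle),
    }
    # Hypothetical naming convention: <bundle name>.manifest.json
    manifest_path = bundle.parent / (bundle.name + ".manifest.json")
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest_path


def verify(bundle: Path, manifest_path: Path) -> bool:
    """Return True when the bundle still matches the checksum in its manifest."""
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    return manifest["sha256"] == sha256_of(bundle)
```

With a manifest like this, "roll back to the previous one" becomes a lookup of a named, versioned file whose contents can be checked, rather than a guess about what was deployed.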
Ask “what artifact does this produce?” as a routine question when directing an agent. If the answer is “nothing durable,” either add an output step or accept that the work is ephemeral and will need to be redone if anyone else ever cares about it.
Consequences
Treating work as artifact-producing makes agentic workflows auditable and resumable, and lets a reviewer step in at any point. A plan can be read before it runs. A decision leaves a trace. Handoffs across sessions, agents, and the humans on either side become reliable because the state of the work lives in files rather than in memory.
The cost is discipline and tokens. Producing an artifact for every step slows the workflow down, and not every piece of transient state earns its keep. A five-minute task doesn’t need a plan file; a trivial change doesn’t need a design doc. The judgment call is figuring out which stages of which workflows matter enough that losing them would hurt. For anything involving a handoff, multiple sessions, external review, or enough risk that an audit trail matters, the overhead pays for itself.
Artifacts also carry a fidelity risk. An out-of-date artifact is worse than no artifact, because it manufactures false confidence. A status file that claims six items are done when only four are will send the next session in the wrong direction. The remedy is to keep the artifact honest as the work progresses, and to reconcile it with reality whenever a session resumes. Never trust a stale file as if it were the territory.
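One way to reconcile a status file with reality on resume is to require evidence for every claim and check that the evidence exists. The sketch below assumes a hypothetical status format in which each item records a "done" flag and an "evidence" path relative to the project root; neither the format nor the stale_claims helper comes from any existing tool.

```python
import json
from pathlib import Path


def stale_claims(status: dict[str, dict], root: Path) -> list[str]:
    """Return items marked done whose evidence file is missing under root."""
    stale = []
    for item, entry in status.items():
        # Assumed entry shape: {"done": bool, "evidence": "relative/path"}
        if entry.get("done") and not (root / entry["evidence"]).is_file():
            stale.append(item)
    return stale


def load_status(path: Path) -> dict[str, dict]:
    """Load the status artifact; assumed to be a JSON object keyed by item name."""
    return json.loads(path.read_text(encoding="utf-8"))
```

File existence is a weak form of evidence. A stricter check might run the relevant tests or compare checksums, but even this much catches a file that claims six items when only four have left anything behind.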
Sources
The term “artifact” as a software work product traces to the software-engineering lifecycle literature of the 1970s and early 1980s, especially Winston Royce’s Managing the Development of Large Software Systems (IEEE WESCON, 1970) and Barry Boehm’s Software Engineering Economics (1981). Both treated specifications, designs, code, and test plans as first-class outputs produced at distinct phases, rather than as byproducts of one continuous activity.
The Unified Process, formalized by Ivar Jacobson, Grady Booch, and James Rumbaugh in The Unified Software Development Process (1999), made “artifact” a core vocabulary word for object-oriented development. Their definition, a piece of information produced, modified, or used by a process, is close to the one used here.
The Software Engineering Body of Knowledge (SWEBOK, IEEE, multiple editions) catalogs the standard artifacts of each software-engineering activity and remains the broadest reference for what the discipline counts as a work product.
The agentic-coding community has inherited the word largely through the lifecycle and DevOps literature rather than inventing a new one. Its renewed relevance comes from how much more depends on inspectable, durable outputs when the worker producing them is a stateless model.