The Spec is the New Code
The highest-leverage technical act an engineer can perform right now is not writing code. It is writing a precise specification for the agent that will.
In April 2025, developers using Cursor, the AI-powered code editor, started getting logged out unexpectedly when switching between machines. Frustrated, one user emailed support and received a prompt, confident reply from an agent named Sam: “Cursor is designed to work with one device per subscription as a core security feature.”
The policy was entirely invented. No such restriction existed.
Sam was a bot. It had encountered a bug it could not explain, and instead of saying “I don’t know,” it did what language models do under ambiguity: it generated a plausible, authoritative-sounding answer. Users took it as official. Cancellations followed. The post spread across Hacker News and Reddit before Cursor’s co-founder had to intervene publicly: “We have no such policy. Unfortunately, this is an incorrect response from a front-line AI support bot.”
The bug itself was survivable. A session invalidation mistake is frustrating but forgettable. What turned it into a reputational event was the absence of a specification: nobody had told Sam what to do when it did not know the answer. The bot had no grounded access to actual policy documentation, no instruction to express uncertainty, no escalation path to a human. It had been deployed into a customer-facing role with a vague mandate to help, and it filled the gaps in its knowledge with confident fabrication.
This is what I mean when I say the spec is the new code.
01 - The Bottleneck Has Moved
For years the primary constraint in software development was implementation capacity. There were never enough engineers to build everything that needed building. AI agents are loosening that constraint, fast. What is tightening in its place is something most engineering organizations have barely started to think about: specification capacity. The ability to describe what a system should do, with enough precision that an agent can execute it reliably and fail safely.
The evidence is not anecdotal. A GitHub study of over 2,500 agent instruction files found that most fail because they are too vague. An instruction like “you are a helpful coding assistant” is not a specification. “You are a test engineer who writes tests for React components, follows these examples, and never modifies source code” is. The difference in output quality between those two starting points is not marginal. Separate research across thousands of real bug reports found that resolvable problems scored 110% to 2,700% higher on description quality than non-resolvable ones. The range is wide because the gap between a well-specified problem and a vague one is not linear: at some threshold of vagueness, the agent stops being able to help at all.
The Cursor incident illustrates this at the product level. But the same dynamic plays out dozens of times a week in engineering teams adopting AI for implementation. A developer asks an agent to “refactor this module.” The agent refactors it, silently making assumptions about which patterns to apply, which dependencies to preserve, which behaviors to treat as intentional versus accidental. Some of those assumptions are wrong. The code compiles. The wrongness surfaces in review, or in production, or not at all until a month later when someone touches the refactored module and finds it does something subtly different from what it did before.
A vague spec does not produce a little wrong output. It produces a lot of wrong output, quickly, coherently structured, and difficult to unpick.
02 - What a Good Spec Actually Contains
The five components below are not a methodology. They are a checklist of the questions an agent cannot answer on your behalf, because the answers live in your head, your organization’s constraints, and your understanding of the problem. Every one of these that you leave unstated gets filled in by the model’s inference from patterns. Sometimes that inference is right. Often it is not.
The goal, stated from the outside in. Not “implement a caching layer” but “reduce database read latency for the user profile endpoint from 340ms to under 50ms.” The first describes a solution. The second describes an outcome the agent can evaluate its own work against. This distinction matters because agents that are handed solutions implement them without questioning whether they are right. Agents that are handed outcomes can flag when the proposed approach will not achieve them.
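The difference is testable. Here is a minimal sketch of what an outcome the agent can check its own work against might look like; the endpoint URL and sampling approach are illustrative, and the 50ms target comes from the example above.

```typescript
// Sketch: the goal as a measurable outcome, not a solution.
// Assumes a Node 18+ or browser runtime with global fetch and performance.
async function medianReadLatencyMs(url: string, samples = 20): Promise<number> {
  const timings: number[] = [];
  for (let i = 0; i < samples; i++) {
    const start = performance.now();
    await fetch(url); // exercise the same read path the spec targets
    timings.push(performance.now() - start);
  }
  return timings.sort((a, b) => a - b)[Math.floor(samples / 2)];
}

// Done is observable: this resolves below 50 for the profile endpoint,
// e.g. await medianReadLatencyMs("https://api.example.com/users/profile")
```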
The constraints that cannot be violated. Regulatory requirements, security boundaries, compatibility constraints, data retention policies. The agent does not know your compliance posture. In the Cursor case, Sam had no grounded access to actual policy. A constraint as simple as “if you do not have confirmed information from the policy document, say so and escalate” would have changed the outcome entirely. The same principle applies to a coding agent working on a feature that touches GDPR-regulated data, or a pharmaceutical system with formulary rules the agent has never seen.
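To make that concrete, here is a sketch of the escalation constraint as behavior. Every function name is hypothetical, and this is not Cursor’s actual architecture; the shape of the guard is the point.

```typescript
// Sketch of the "say so and escalate" constraint from the spec above.
type PolicyExcerpt = { source: string; text: string };

// Hypothetical building blocks, assumed to exist elsewhere in the system.
declare function searchPolicyDocs(question: string): Promise<PolicyExcerpt[]>;
declare function generateAnswer(q: string, grounding: PolicyExcerpt[]): Promise<string>;
declare function escalateToHuman(question: string): Promise<void>;

async function answerSupportQuestion(question: string): Promise<string> {
  const grounding = await searchPolicyDocs(question);
  if (grounding.length === 0) {
    // The constraint: no confirmed policy text, no confident answer.
    await escalateToHuman(question);
    return "I don't have confirmed information on this; a teammate will follow up.";
  }
  return generateAnswer(question, grounding); // answer only from retrieved policy
}
```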
The edge cases that matter. This is the most consistently skipped component and the one that causes the most expensive failures. A specification that only describes the happy path is half a specification. What happens when the upstream service returns a 503? When the input is valid JSON but values are outside expected ranges? When a user with an expired session triggers this feature? Identifying edge cases before implementation is where the real design work happens. The agent will handle them one way or another. The question is whether it handles them the way you intended.
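One lightweight way to keep edge cases from being skipped is to register them as named, empty tests before implementation starts. The sketch below assumes a Vitest- or Jest-style runner, and the behavior chosen for each case is an illustrative decision a real spec would make explicitly.

```typescript
// Edge cases from the spec, captured as a visible checklist before any code exists.
// `test.todo` registers a named case with no body, so each decision stays on the
// board until it has been made and tested.
import { test } from "vitest";

test.todo("upstream returns 503: respond 502 with a Retry-After hint, never cache");
test.todo("valid JSON, out-of-range values: reject with 422 and name the field");
test.todo("expired session mid-flow: redirect to login without losing the draft");
```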
The definition of done, stated as an observable state. Not “the tests pass.” Something like: “the feature is complete when a user can upload an image up to 5MB, receive a shareable link within three seconds, and the link returns a 404 after 30 days.” A definition of done written before implementation serves as the evaluation criterion for the agent’s output. Without it, review is subjective, and subjective review of agent-generated code misses the subtle semantic errors that look syntactically fine.
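Stated as observable states, the definition of done is one step away from being executable. A sketch, again assuming a Vitest-style runner and a hypothetical upload endpoint; the size, timing, and expiry numbers come straight from the spec above.

```typescript
// The definition of done as an acceptance test. Endpoint and response shape
// are assumptions; the observable states are the spec's.
import { test, expect } from "vitest";

test("a 5MB image upload returns a shareable link within three seconds", async () => {
  const image = new Blob([new Uint8Array(5 * 1024 * 1024)], { type: "image/png" });
  const started = Date.now();
  const res = await fetch("https://api.example.com/uploads", {
    method: "POST",
    body: image,
  });
  const { link } = await res.json();
  expect(Date.now() - started).toBeLessThan(3000);
  expect(link).toMatch(/^https:\/\//);
});

// The 30-day expiry clause needs a controllable clock: create an upload with a
// back-dated timestamp, then expect the link to return a 404.
```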
The out-of-scope statement. This is the most underrated component. One of the most valuable sentences in any specification is: “This implementation is not responsible for X, which is handled by Y.” An agent that does not know the boundaries of its task will fill them in, and the filled-in boundaries are usually larger than intended. Stating scope explicitly prevents the agent from attempting to re-implement tenant isolation inside a feature, or adding error handling that conflicts with existing middleware, or building something that technically works but overlaps badly with adjacent systems.
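Taken together, the five components fit in something as small as a typed object a team fills in before handing work to an agent. The field names below are illustrative, not a standard:

```typescript
// A minimal container for the five components. Nothing about the shape is
// magic; what matters is that every field must be consciously filled in.
interface FeatureSpec {
  goal: string;               // the outcome, stated from the outside in
  constraints: string[];      // rules that cannot be violated
  edgeCases: string[];        // what happens off the happy path
  definitionOfDone: string[]; // observable end states, checkable after the fact
  outOfScope: string[];       // "not responsible for X, which is handled by Y"
}
```

An empty array in any of those fields is itself information: it tells the reviewer the question was never asked.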
03 - The Spec as Design, Not Documentation
Most engineers, when they think about specifications at all, think of them as documentation: a record of decisions already made, written after the fact for future readers. This framing makes specs feel like overhead, which is why so few get written.
The more useful framing is that a specification is a design tool. The act of writing it is the act of design. When you sit down to define precisely what an edge case should produce, you are not documenting a decision you already made. You are making it, often for the first time, at a quality that is substantially higher than it would have been if the agent had made it implicitly mid-implementation.
This framing has been standard in hardware engineering and safety-critical software for decades. In those domains, you cannot begin implementation without a specification because the cost of rework is too high. AI agents are importing those economics into general software development. Implementation is now cheap and fast. Rework driven by a bad spec is still expensive and slow. The incentive to specify carefully before implementing is the strongest it has ever been in the history of the profession.
What is new is simply this: the consumer of your spec used to be a human engineer, who could fill gaps from context, ask questions mid-task, and apply judgment to ambiguity. The agent cannot do any of those things reliably. It will fill the gaps, but it will fill them from its training distribution, not from your intent. The spec is the only mechanism you have for aligning those two things.
04 - What This Means for Senior Engineers
The highest-leverage thing a senior engineer can do on an AI-assisted team is not write more code. It is write better specs. Specifically, the specs that junior engineers and agents cannot write on their own: the ones requiring deep familiarity with the system’s history, the organization’s constraints, the regulatory environment, and the unstated assumptions carried by people who built the thing years ago.
That institutional knowledge does not live in the agent’s training data. It lives in the engineers who have been in the room. The specification is the mechanism by which that knowledge gets transferred into something an agent can act on correctly.
An hour writing a precise specification for a complex feature is not an hour spent not coding. It is an hour that determines the quality of everything the agent produces in the following eight hours. Most engineering teams have not yet built the culture that values this, because they are still measuring AI adoption by output volume rather than output correctness. That measurement will change when the rework costs become visible enough to track. The teams that shift their thinking before that moment will have a compounding advantage over the ones that wait.
Start Here
Before you hand your next feature to an agent, write the spec yourself. Goal from the outside in, constraints, edge cases, definition of done, out-of-scope. Then compare what the agent produces to what you intended.
The gap between those two things, if there is one, is the specification debt you would otherwise have paid in review cycles, rework, and incidents. Seeing that gap on a real feature, even once, is more persuasive than any argument.
The spec is not overhead. It is not documentation. It is not someone else’s job.
In an AI-assisted engineering environment, it is the work.
Next issue: what happens when the specification itself is wrong, and how to build review processes that catch specification errors before agents execute them at scale.


