It’s become increasingly common to hear the claim that AI is now writing a significant and growing share of production code. You see it implied in LinkedIn posts from engineering leaders, in investor decks, and in conference talks that treat AI-assisted development as a settled fact rather than a hypothesis. The assumption isn’t just that AI tools are being used, but that they’re materially changing what ships.
Some of that confidence is reinforced by unusually candid disclosures. The CEO of Anthropic has claimed that Claude now writes between 70 and 90% of the company’s code, and the creator of Claude Code has gone further, noting that the tool wrote 100% of the code he personally contributed in December 2025. These are striking but rare disclosures, and they suggest how far AI-assisted development may have progressed for certain teams.
At the same time, the broader industry picture is far less clear. While belief in AI’s impact on software development has grown rapidly, concrete evidence about how much shipped code is AI-generated and how quickly that share is changing remains scarce outside of one-off self-reports.
The gap between confidence and measurement matters because it shapes real business decisions. In a recent example, Salesforce acknowledged that it had eliminated thousands of roles under the expectation that AI would offset the loss, only to later concede that the anticipated gains did not fully materialize. This misfire highlights how strongly the assumption of accelerating AI capability has taken hold, even when the outcomes are mixed.
This post isn’t an attempt to produce a definitive number where none exists. Instead, it’s a closer look at why those numbers are so difficult to surface, what signals we do have, and how we can interpret them carefully without overstating the case.
If you spend time in developer discourse, you’ll occasionally see a specific figure invoked: something like “40% of code is now AI-generated.” Variants of this number circulate widely, often without attribution, and almost never with a clear definition of what’s being counted.
As far as I can tell, there is no credible, industry-wide metric that supports claims like this. No major platform has published audited data tying AI usage to a measurable percentage of committed, reviewed, and deployed code. When numbers are shared, they’re usually narrow, contextual, and explicitly caveated.
That hasn’t stopped people from reaching for a single headline statistic, which is telling in itself. The demand for a number reflects how strongly people want to anchor an intuition that already feels true: that AI isn’t just assisting developers, but meaningfully contributing to what ships.
What we actually have instead are partial indicators (adoption surveys, time-use data, and a handful of unusually candid disclosures), none of which directly measure volume, but all of which point in the same direction.
If AI code generation is accelerating in production environments, it has implications well beyond developer productivity. It affects how teams reason about code quality, ownership, review processes, incident response, and ultimately system reliability. It also changes where responsibility sits when something goes wrong.
Given the scale of those implications, you might expect companies to be eager to measure and publish this data, especially if AI represents the next major productivity wave in software.
But there are understandable reasons this kind of data doesn’t surface publicly.
There is no agreed-upon definition of “AI-generated code.” Does it include boilerplate that a developer edits heavily? Code suggested line-by-line during refactoring? Tests written by an assistant but curated by a human? Even internally, teams struggle to define the boundary in a way that’s consistent or auditable.
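The definitional problem can be made concrete with a toy calculation. Everything below is invented for illustration (the provenance categories and line counts are assumptions, not the output of any real tool); the point is only that the same repository can honestly report wildly different figures depending on where the boundary is drawn:

```python
# Hypothetical provenance data for a pretend set of merged changes.
# Each entry: (lines of code, how those lines came to be).
commits = [
    (120, "human"),            # written entirely by hand
    (300, "ai_boilerplate"),   # AI-generated scaffolding, heavily edited after
    (80,  "ai_suggested"),     # line-by-line completions accepted during refactoring
    (150, "ai_tests"),         # tests drafted by an assistant, curated by a human
    (50,  "ai_verbatim"),      # AI output merged essentially unchanged
]

total = sum(lines for lines, _ in commits)

def ai_share(counted: set[str]) -> float:
    """Percent of lines whose provenance falls inside the chosen definition."""
    return 100 * sum(l for l, p in commits if p in counted) / total

# Strict definition: only verbatim AI output counts as "AI-generated".
strict = ai_share({"ai_verbatim"})

# Broad definition: anything the AI touched counts.
broad = ai_share({"ai_verbatim", "ai_boilerplate", "ai_suggested", "ai_tests"})

print(f"strict: {strict:.0f}%  broad: {broad:.0f}%")
# → strict: 7%  broad: 83%
```

The same history supports a "7% AI-generated" claim and an "83% AI-generated" claim with equal honesty, which is exactly why a headline number without a definition tells you very little.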
If a company were to say, “25% of our production code is AI-generated,” that number immediately becomes a benchmark. Stakeholders will expect it to grow. If it stalls (or worse, declines) the company now has to explain whether adoption slowed, quality issues emerged, or governance concerns forced a rollback. In that sense, publishing the metric creates an obligation to defend it over time.
Detailed disclosures about AI reliance can expose companies to scrutiny around IP provenance, licensing, and compliance. In many cases, it’s safer not to know, or at least not to formalize what you know, than to produce metrics that could be discoverable later.
Taken together, these forces make underreporting the rational choice, even if adoption is accelerating rapidly behind the scenes.
This is where the evidence we do have becomes useful, not as direct measurement, but as inference.
Developer surveys like those published by Stack Overflow show sustained year-over-year growth in the number of developers using AI tools as part of their regular workflow, not just experimenting with them. These surveys don’t measure output volume, but they do measure behavior, and the behavior has shifted decisively from occasional use to daily reliance.
Google’s DORA Report goes a step further by showing how embedded these tools have become. A large majority of developers report using AI assistance regularly, and many report spending a non-trivial portion of their working day interacting with these systems. Again, this doesn’t tell us how many lines of code are generated, but it does tell us that AI is present at the moment decisions are made and code is written.
Then there are the rare, partial disclosures like Anthropic’s. Google has also publicly stated that roughly a quarter of new code within its own engineering organization is now AI-generated. While this figure is narrow, context-specific, and almost certainly conservative, it shows that in a handful of mature engineering orgs with strong internal controls, AI is already responsible for a meaningful share of new code creation.
Finally, there’s the tooling itself. AI assistants are no longer external copilots; they’re embedded directly into IDEs, CLIs, and code review workflows. That level of integration is built for scale, not just marginal use.
Individually, none of these signals prove that AI-generated code volume is accelerating. But collectively, they make the opposite conclusion hard to defend.
It’s important to be precise about what we still don’t know: how much shipped code is AI-generated, and how quickly that share is changing.
What we can say is that AI is now structurally embedded in how software is written, and that the people closest to the work increasingly behave as if AI contribution is significant and growing, even when they can’t quantify it cleanly.
Historically, that’s what acceleration looks like before it becomes measurable.
If AI code generation is indeed scaling faster than our ability to measure it precisely, the next question moves from volume to confidence. How much do teams trust the code that’s being produced this way, and how does that trust (or lack thereof) shape the systems they build around deployment, review, and release?
That’s a question for a later post. Stay tuned!