The numbers come from different surveys, but they point to the same thing: most developers who use AI coding tools don't trust what those tools produce.
63% of developers now use AI coding tools weekly. 29% trust the output. That’s not a rounding error or a framing problem. It’s a gap that reflects something developers are observing directly in their work.
Why developers don’t trust AI output
The trust problem is partly about accuracy and partly about visibility.
On accuracy: AI coding tools produce statistically plausible code, not necessarily correct code. They're optimised to generate output that resembles working code, because plausibility, not correctness, is what their training rewards. When they're wrong, they're wrong in ways that aren't always obvious at the point of generation: code that passes review, deploys without errors, and then needs rewriting two sprints later.
On visibility: developers have limited ability to verify AI output at scale. For a self-contained function with a clear spec, verification is straightforward. For a complex refactor across multiple files, or a piece of code that interacts with systems the AI doesn’t have context for, verification is harder and more time-consuming.
The practical result: developers use AI assistance for the tasks where it’s most reliable — boilerplate, simple functions, test generation — apply extra scrutiny to its output on complex tasks, and end up with a general posture of guarded use rather than confident deployment.
What the gap means for engineering managers
The trust deficit among developers is a signal about code quality risk, not a signal about developer sentiment.
When 71% of developers who use AI tools have reservations about the output, that's information about the quality risk that comes with the velocity gains. Managers who track adoption rates and velocity without tracking quality are seeing only one side of the equation.
GitClear’s research puts some data on the quality side: AI-assisted commits show 2× higher churn rates on average compared to human-authored commits, and a 4× increase in duplicate code blocks. These figures describe what developers are observing intuitively when they say they don’t trust the output.
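To ground the churn number: churn here means code that gets rewritten or thrown away soon after it lands. GitClear's exact methodology isn't reproduced in this piece, so the sketch below uses a working definition assumed for illustration: a line counts as churned if it is modified or deleted within 14 days of being introduced. It also assumes per-line add/remove timestamps have already been extracted from git history (e.g. by walking `git log -p`), and that commits are somehow tagged as AI-assisted.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class LineRecord:
    """Hypothetical per-line record, extracted from git history beforehand."""
    added_at: datetime
    removed_at: Optional[datetime]  # None if the line still survives
    ai_assisted: bool               # however AI-assisted commits are tagged

# Assumed window for this sketch; not GitClear's published definition.
CHURN_WINDOW = timedelta(days=14)

def is_churned(line: LineRecord) -> bool:
    """A line counts as churn if it is rewritten or deleted within the window."""
    return (
        line.removed_at is not None
        and line.removed_at - line.added_at <= CHURN_WINDOW
    )

def churn_rate(lines: list[LineRecord]) -> float:
    """Fraction of lines that churned; 0.0 for an empty population."""
    return sum(is_churned(l) for l in lines) / len(lines) if lines else 0.0

def churn_ratio(lines: list[LineRecord]) -> float:
    """AI-assisted churn rate relative to the human-authored baseline."""
    ai = churn_rate([l for l in lines if l.ai_assisted])
    human = churn_rate([l for l in lines if not l.ai_assisted])
    return ai / human if human else float("inf")
```

On records like these, the reported 2× figure corresponds to `churn_ratio` coming out around 2.0 across a team's history.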
The trust gap, in other words, isn’t a perception problem to be managed with better training or change management. It’s a data signal about what’s actually happening in the codebase.
What narrows the gap
The trust deficit tends to narrow when developers have more visibility into their own AI-assisted output. Seeing that their AI-assisted commits churn at a particular rate, compared to their human-authored work, gives them specific information about where AI assistance is working well and where it needs more oversight.
Teams with that visibility don’t eliminate the gap entirely — AI output still requires careful review. But they tend to develop more calibrated use: better at identifying which tasks benefit most from AI assistance and better at reviewing the output on tasks where it’s less reliable.
What measurement changes
The practical value of measuring AI impact is that it converts a general, unverifiable concern into something specific and addressable.
Knowing that 29% of developers trust AI output is interesting. Knowing that your team’s AI-assisted commits churn at 2.1× the rate of human-authored commits, that three specific developers have churn rates significantly above that average, and that the problem is concentrated in a particular part of the codebase: that’s actionable.
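To make that concrete, here is a minimal sketch of the aggregation step, assuming you already have per-commit churn counts and some way of flagging AI-assisted commits (for example, a trailer the coding tool writes into the commit message; none of this is Scryable's actual method, and all the numbers below are made up). Grouping the same counts by author and by directory surfaces exactly the two observations above: which developers sit above the team's average and where in the codebase the churn concentrates.

```python
from collections import defaultdict
from typing import Callable

# Each row is (author, directory, ai_assisted, lines_added, lines_churned).
Row = tuple[str, str, bool, int, int]

def churn_by(rows: list[Row], key: Callable[[Row], str],
             ai_only: bool = True) -> dict[str, float]:
    """Churned lines / added lines per group, over AI-assisted commits by default."""
    added: dict[str, int] = defaultdict(int)
    churned: dict[str, int] = defaultdict(int)
    for row in rows:
        if ai_only and not row[2]:
            continue
        group = key(row)
        added[group] += row[3]
        churned[group] += row[4]
    return {g: churned[g] / added[g] for g in added if added[g]}

rows: list[Row] = [
    ("alice", "billing", True,  420, 160),
    ("alice", "api",     False, 380,  60),
    ("bob",   "api",     True,  300,  45),
    ("carol", "billing", True,  250, 110),
]

per_dev = churn_by(rows, key=lambda r: r[0])  # who needs more review support
per_dir = churn_by(rows, key=lambda r: r[1])  # where the churn concentrates

team_avg = sum(per_dev.values()) / len(per_dev)
outliers = {dev: rate for dev, rate in per_dev.items() if rate > team_avg}
print(per_dir)    # e.g. {'billing': 0.40..., 'api': 0.15}
print(outliers)   # developers churning above the team average
```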
The trust deficit is where most teams stop. The data is where the conversation about what to do about it can actually begin.
Scryable measures the quality of your team’s AI-assisted output from your git history, with before/after baseline comparisons. Get early access.