Vibe Coding, AI Code Review, and the New Trust Gap in AI-Generated Code

Over 60% of enterprise code commits now include AI-generated content. Our research also shows that AI-generated code is roughly 30% more prone to logic errors. Developers are increasingly leaning into what many call “vibe coding”: relying on intuition and rapid AI feedback loops instead of structured design and architecture.

The question isn’t whether AI can write code anymore. It’s whether we can trust what it writes and how we govern what gets into production.

In this episode of AppDevANGLE, I spoke with David Loker, Director of AI at CodeRabbit, where he leads development of agentic AI systems for code review and developer workflows. David has spent nearly two decades building large-scale AI systems and has been published at NeurIPS, ICML, and AAAI. Our conversation explored AI-generated code, vibe coding, AI-assisted code review, and what it will take for AI to “earn the commit.”

Productivity Gains vs Logic Errors: Where We Are Now

Tools like GitHub Copilot have materially changed how developers work. Time-to-commit drops, boilerplate disappears, and teams can stand up prototypes in a fraction of the time. In our data, more than 55% of new code repositories now include AI-generated snippets.

David’s take is pragmatic: the productivity benefits are real, but so is the quality gap.

“There’s a lot of productivity gains that can be had,” he told me. “At the same time, having a quality assurance buffer between that code and production systems is still an ever-growing problem as more AI-generated code is introduced.”

Human-only pull request review is already straining. As AI output scales, manual review becomes a bottleneck, not a safety net. That’s where David sees AI code review playing an increasingly critical role: as a buffer between AI-generated code and production systems.

We also see this pressure show up in release expectations. In our survey, 24% of organizations want to release code hourly, yet only 8% actually can. The desire for velocity is outpacing the ability to validate what’s being shipped.

David’s view: we’re still in a transition phase. Teams are figuring out how to use AI “in a way that doesn’t end up polluting our system with unmaintainable code riddled with bugs.” The genie is not going back in the bottle; the challenge is to build workflows that embrace AI without normalizing sloppiness.

Vibe Coding: Democratizing Prototyping, Complicating Maintainability

Nearly 40% of developers in our research describe their AI workflow as “vibe coding”: iterating with AI prompts, reacting to suggestions, and steering by intuition rather than upfront architecture. That’s especially true for citizen developers and business-side builders using low-code and no-code tools.

David sees both opportunity and risk.

On the plus side, vibe coding opens the door for people who don’t traditionally identify as developers:

“It opens up the space of development to people who don’t know anything about coding,” he said. “I can come in with an idea, vibe code something, and get a prototype out… that’s something we didn’t have before.”

That’s powerful. Ideas that would have died in a backlog can now be prototyped and tested quickly. But if vibe-coded prototypes get promoted directly into production without scrutiny, the long-term cost is significant: tangled logic, brittle flows, missing tests, and systems no one truly understands.

“At this point I’d be pretty concerned if people are just using that technique and then checking in the code,” David cautioned. Enterprises are still figuring out how to measure the risk of vibe-coded systems and how to constrain where and how they’re used.

The emerging pattern: AI-driven prototyping is fine as long as there’s a clear boundary where experienced engineers and robust review processes step in before production.

Should AI Be a Reviewer, Validator, or Lead?

One of the more striking signals in our 2025 data:

  • 68% of developers say they trust AI-assisted code reviews more than peer reviews for catching syntax and mechanical issues.
  • Only 22% say they trust AI for architectural or design-level reviews.

So where does AI belong in the review pipeline?

David broke it down cleanly. AI is already strong at:

  • Exhaustive line-by-line attention
  • Catching inconsistencies between comments and code
  • Surfacing off-by-one errors, unsafe patterns, and common security smells

“AI doesn’t get tired,” he pointed out. “It pays attention to every single line.”
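
To make that concrete, here’s a contrived Python snippet (my illustration, not CodeRabbit output) with two of the mechanical defects David describes: an off-by-one error and documentation that contradicts the code.

```python
def last_n_items(items: list, n: int) -> list:
    """Return the last n items of the list."""
    # Off-by-one: this slice yields n - 1 items; the correct
    # slice for "last n" is items[-n:].
    return items[-(n - 1):]

def can_delete_project(role: str) -> bool:
    """Allow deletion for admins and owners."""
    # Docstring/code mismatch: owners are never checked, so the
    # documented behavior and the actual behavior disagree.
    return role == "admin"
```

Catching that last_n_items([1, 2, 3, 4], 2) returns [4] instead of [3, 4] takes no architectural judgment, just patient attention to every line against its stated intent.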

Where it struggles today is in architectural trade-offs and business-context decisions, cases where there are multiple “right” answers depending on strategy, constraints, or deadlines.

When I asked whether AI should act as reviewer, validator, or lead reviewer, David’s answer was “all of the above—eventually, but not yet.”

He envisions a future where AI behaves more like a lead reviewer / design partner: “These are three options. Which one would you like? Here’s why I’d pick one versus the other.”

That requires much deeper understanding of business context, non-functional requirements, and organizational constraints than today’s models typically have. For now, the consensus is clear: human in the loop is non-negotiable.

Speed vs Stability: Is the Trade-Off Worth It?

GitHub’s metrics show AI code assistance can reduce time-to-first-commit by up to 46%. Our research, meanwhile, shows bug density in AI-generated code runs roughly 30% higher post-development.

So is the ROI worth it? David’s answer: it depends entirely on how you’re using these tools.

If developers are treating AI like a fire-and-forget generator—“ask it to do something, paste whatever it returns, and commit”—then the answer is no. You’re trading short-term speed for long-term instability and rework.

Used well, however, AI becomes an intense collaborative partner:

“When I engage with these systems, it’s a very intellectual process,” David said. “I’m constantly iterating with it, catching it, making sure it’s doing things in a very specific way… It’s exhausting—but it also helps me produce more code than I would by myself.”

In that mode, AI amplifies a strong engineer rather than replacing one. You still need:

  • Intentional design
  • Clear patterns and constraints
  • A robust code review stage that includes both humans and AI

From a DevOps and CI/CD perspective, we’re seeing KPIs shift from “how fast can I push the big green button?” to “how much quality can I guarantee per deploy?” AI can help on both sides, but only with guardrails and governance.
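
As a sketch of what such a guardrail could look like in practice, here is a minimal, hypothetical merge gate that treats AI review as a required check alongside, never instead of, human approval. The PullRequest fields, thresholds, and the 0.5 AI-ratio cutoff are all illustrative assumptions, not any vendor’s API.

```python
from dataclasses import dataclass

@dataclass
class PullRequest:
    # Hypothetical fields; real values would come from your
    # source-control platform and CI system.
    human_approvals: int
    ai_review_passed: bool      # AI reviewer found no blocking issues
    ai_generated_ratio: float   # share of the diff attributed to AI assistance
    tests_passed: bool

def ready_to_merge(pr: PullRequest) -> bool:
    """Gate: AI review is a required check, never a replacement for human sign-off."""
    if not (pr.tests_passed and pr.ai_review_passed):
        return False
    # Assumption for illustration: heavily AI-generated changes
    # get a higher bar -- two human approvals instead of one.
    required = 2 if pr.ai_generated_ratio > 0.5 else 1
    return pr.human_approvals >= required

if __name__ == "__main__":
    pr = PullRequest(human_approvals=1, ai_review_passed=True,
                     ai_generated_ratio=0.8, tests_passed=True)
    print(ready_to_merge(pr))  # False: mostly AI-written, so it needs a second human
```

The design choice mirrors David’s framing: the AI check scales the exhaustive, mechanical side of review, while the human-approval requirement preserves accountability for the changes AI wrote most of.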

When Will AI Truly “Earn the Commit”?

At some point, many organizations will ask a harder question: when do we let AI not just suggest code, but merge it autonomously into production under certain conditions?

David likened it to self-driving cars. We don’t flip directly from “no autonomy” to “fully autonomous for everyone, everywhere.” Instead, trust grows through:

  • Proven safety and reliability metrics
  • Narrow, controlled use cases in production
  • Gradual expansion as confidence builds

“Until we see studies showing that code output by these systems is en masse higher quality, fewer bugs, more maintainable than humans alone, there’s still going to be a lot of skepticism,” he said.

He expects:

  • Startups and lean teams to push further, faster. They have less legacy code, smaller user bases, and more appetite for risk.
  • Large enterprises to move more cautiously, protecting IP, uptime, and customer trust.

The likely pattern: AI will first “earn the commit” in tightly scoped domains (e.g., internal tooling, low-risk services), under strict guardrails and with heavy observability. Over time, as error rates drop and trust grows, its remit will expand. For now, as David put it, “don’t drop the code review.”

Analyst Take

AI-generated code is no longer experimental; it’s mainstream. The real question is how we govern it.

From this conversation with David, several guidance points emerge:

  • Treat AI as an accelerant, not an autopilot.
    Vibe coding and AI assistance can unlock huge productivity gains and broaden who can meaningfully prototype. But promoting ungoverned output straight into production is a recipe for brittle systems.
  • Elevate code review, don’t devalue it.
    As AI generates more of the codebase, code review becomes more important, not less. AI-powered code review is essential for scaling, but humans still own architecture, trade-offs, and final accountability.
  • Define how AI is used in your pipeline explicitly.
    Document where AI can be used (prototyping, test generation, refactoring), where it must be constrained, and where human-only standards still apply. Treat this as part of your SDLC, not a side note.
  • Invest in trust-building, not blind trust.
    Track bug density, incident patterns, maintainability metrics, and developer experience as AI usage grows. Use data to decide when and where AI can “earn” more responsibility (a minimal sketch follows this list).
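
As one way to operationalize that last point, here is a minimal sketch comparing defect density between AI-assisted and human-only changes. The Change record and the sample numbers are hypothetical; in practice, these records would be joined from commit metadata and your issue tracker.

```python
from dataclasses import dataclass

@dataclass
class Change:
    # Hypothetical record, joined from commit metadata and the issue tracker.
    ai_assisted: bool
    lines_changed: int
    bugs_attributed: int  # defects later traced back to this change

def defect_density(changes: list[Change], ai_assisted: bool) -> float:
    """Bugs per 1,000 changed lines for one origin class."""
    subset = [c for c in changes if c.ai_assisted == ai_assisted]
    lines = sum(c.lines_changed for c in subset)
    bugs = sum(c.bugs_attributed for c in subset)
    return 1000 * bugs / lines if lines else 0.0

if __name__ == "__main__":
    # Sample numbers for illustration only.
    history = [
        Change(ai_assisted=True, lines_changed=400, bugs_attributed=6),
        Change(ai_assisted=False, lines_changed=500, bugs_attributed=4),
    ]
    print(f"AI-assisted: {defect_density(history, True):.1f} bugs/KLOC")
    print(f"Human-only:  {defect_density(history, False):.1f} bugs/KLOC")
```

Tracked over time, a gap like this is exactly the evidence base for deciding where AI has, and has not, earned more autonomy.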

My recommendation: lean into AI as a partner in development and review, but maintain strong governance, human-in-the-loop review, and architectural rigor. Get that balance right, and you capture the upside of vibe coding and AI acceleration without sacrificing long-term code health.
