TL;DR
An analysis by CodeRabbit of 470 open-source pull requests found AI-generated contributions carry roughly 1.7× as many issues as human-written PRs, along with a higher share of critical and major defects. The report flags higher rates of logic, maintainability, security and performance problems in machine-generated code, while noting methodological limits and contrasting findings from other research.
What happened
CodeRabbit, which offers an AI-driven code review product, examined 470 open-source pull requests for its State of AI vs Human Code Generation report. The company found that PRs identified as AI-generated averaged about 10.83 issues each, versus 6.45 for PRs labeled human-authored: roughly 1.7 times as many findings when AI was used (10.83 / 6.45 ≈ 1.68). Severity also differed: AI-origin PRs contained about 1.4× more critical issues and 1.7× more major issues on average. Across categories, the report recorded higher rates of logic/correctness defects (1.75×), maintainability and code-quality problems (1.64×), security findings (1.57×) and performance issues (1.42×) in AI output. Specific security weaknesses (improper password handling, insecure direct object references, XSS and insecure deserialization) were all more common in AI-produced changes. The vendor cautions about limitations in its labeling and notes that other studies have produced mixed results.
Why it matters
- More numerous and more severe defects in AI-written code can increase review time and developer workload, slowing merge cycles.
- Elevated security and correctness issues raise the risk that machine-generated contributions could introduce exploitable vulnerabilities.
- Teams adopting AI coding tools may need to adjust testing, code-review and gating practices to catch predictable classes of faults (a minimal gating sketch follows this list).
- Contrasting academic and industry studies mean organizations should validate AI tooling effects against their own codebases before broad adoption.
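As one illustration of the gating practices mentioned above, here is a minimal, hypothetical Python sketch of a pre-merge check that scans a PR diff for a few risky patterns matching the defect classes in the report. The pattern list, function names and CLI wiring are assumptions for illustration, not anything CodeRabbit or the source describes.

```python
#!/usr/bin/env python3
"""Hypothetical pre-merge gate: flag risky patterns in changed lines.

Illustrative sketch only; the patterns below are naive signatures for
a few defect classes highlighted in the report, not a real SAST tool.
"""
import re
import sys

RISKY_PATTERNS = {
    "insecure deserialization": re.compile(
        r"\bpickle\.loads?\(|\byaml\.load\((?!.*SafeLoader)"),
    "possible XSS": re.compile(r"\|\s*safe\b|innerHTML\s*="),
    "hardcoded secret": re.compile(r"(?i)(password|secret|api_key)\s*=\s*['\"]"),
}

def scan_diff(diff_text: str) -> list[str]:
    """Return warnings for lines added by the PR ('+' prefix) that match."""
    findings = []
    for lineno, line in enumerate(diff_text.splitlines(), start=1):
        if not line.startswith("+") or line.startswith("+++"):
            continue  # only inspect additions, skip diff headers
        for label, pattern in RISKY_PATTERNS.items():
            if pattern.search(line):
                findings.append(f"line {lineno}: {label}: {line[1:].strip()}")
    return findings

if __name__ == "__main__":
    issues = scan_diff(sys.stdin.read())
    for issue in issues:
        print(f"WARNING: {issue}")
    sys.exit(1 if issues else 0)  # nonzero exit blocks the merge in CI
```

Piping the output of `git diff origin/main` into the script from a CI step would be one simple way to wire this in; in practice, teams would more likely lean on established linters and SAST tools than hand-rolled regexes.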
Key facts
- Sample size: CodeRabbit analyzed 470 open-source pull requests for the report.
- Issue counts: AI-generated PRs averaged ~10.83 issues each; human PRs averaged ~6.45 issues.
- Severity multipliers: AI PRs had ~1.4× more critical issues and ~1.7× more major issues than human PRs on average.
- Category differences: AI-generated code showed higher rates of logic/correctness (1.75×), maintainability (1.64×), security (1.57×) and performance (1.42×) issues.
- Security specifics: AI PRs were about 1.88× more likely to mishandle passwords, 1.91× more likely to introduce insecure direct object references, 2.74× more likely to add XSS vulnerabilities and 1.82× more likely to implement insecure deserialization.
- Exceptions: Human-authored PRs had more spelling errors (about 1.76×) and more testability issues (about 1.32×) than AI PRs.
- Methodological caveat: CodeRabbit acknowledged it cannot be certain that PRs labeled as human-authored were exclusively created by people.
- Other research: The report’s conclusions align with some industry findings but differ from several academic studies that have reported mixed or contrasting outcomes.
- Industry context: Trend Micro researcher Dustin Childs noted Microsoft patched 1,139 CVEs in 2025, and Microsoft reported around 30% of code in certain repositories was written by AI (as cited in the source).
What to watch next
- Whether organizations revise code-review and CI gating rules to specifically detect and block classes of faults associated with AI-generated code (not confirmed in the source).
- Improvements in AI code generators and guardrails that reduce logic, security and maintainability issues over time (not confirmed in the source).
- Changes in the share of repository code written by AI and whether that correlates with vulnerability trends—CodeRabbit and others note the proportion of AI-written code may rise, but future impact is not detailed in the source.
Quick glossary
- Pull request (PR): A contribution submitted to a version control repository proposing changes to code that can be reviewed and merged by maintainers.
- Code review: The process of examining source code changes to find bugs, enforce standards, and improve quality before merging.
- Cross-site scripting (XSS): A security vulnerability in which an attacker injects malicious scripts into web pages viewed by other users.
- Insecure deserialization: A flaw that occurs when untrusted data is deserialized, potentially allowing attackers to execute code or manipulate program state. Both flaws are sketched in the short example after this list.
- Maintainability: Attributes of code that make it easier or harder to understand, modify, and extend over time.
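For readers unfamiliar with the two web flaws defined above, the following Python sketch contrasts an unsafe and a safe variant of each. The function names and the pickle-versus-JSON pairing are illustrative assumptions, not code from the report.

```python
import html
import json
import pickle

# Insecure deserialization: unpickling untrusted bytes can invoke
# arbitrary attacker-chosen callables (via __reduce__), so pickle
# must never see untrusted input.
def load_profile_unsafe(blob: bytes):
    return pickle.loads(blob)          # vulnerable to code execution

def load_profile_safe(blob: bytes):
    return json.loads(blob.decode())   # data-only format, no code execution

# XSS: interpolating user input into HTML without escaping lets an
# attacker smuggle a <script> tag into pages that other users view.
def greeting_unsafe(name: str) -> str:
    return f"<p>Hello, {name}</p>"               # vulnerable

def greeting_safe(name: str) -> str:
    return f"<p>Hello, {html.escape(name)}</p>"  # entities neutralize markup
```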
Reader FAQ
How many pull requests did CodeRabbit analyze?
CodeRabbit examined 470 open-source pull requests for the report.
Were AI-generated PRs worse across all categories?
The report found AI PRs had higher rates of logic, maintainability, security and performance issues, though humans had more spelling and testability issues.
Is the CodeRabbit methodology flawless?
No — the company acknowledged it cannot be certain that PRs labeled as human-authored were exclusively written by people.
Do other studies agree with CodeRabbit?
Not uniformly; the source cites multiple academic and industry studies with differing conclusions about AI-generated code quality.

Sources
- AI-authored code contains worse bugs than software crafted by humans
- Our new report: AI code creates 1.7x more problems
- AI-generated code contains more bugs and errors than …
- AI-authored code needs more attention, contains worse bugs
Related posts
- Sanders urges nationwide pause on datacenter builds amid AI surge
- Purdue to require incoming undergraduates to meet an AI working competency
- Government asks BBC to revive computer literacy and explain AI to the public