AI-Assisted Code Reviews: What the Latest Research Reveals About GPT in Pull Request Workflows

At ZEN, we keep a close eye on emerging research that affects how engineering teams build, ship, and maintain software. Recently, we came across a new study — “The Impact of Large Language Models (LLMs) on Code Review Process” (Antonio Collante, Samuel Abedu, SayedHassan Khatoonabadi, Ahmad Abdellatif, Ebube Alor, Emad Shihab) — that analyses more than 25,000 GitHub pull requests to understand how GPT-style models influence code review speed and collaboration.
The findings are both promising and highly relevant for modern engineering teams working in cloud-native, agile, or distributed environments.
Below is our breakdown of the study — and what we think teams can take away from it.
What the Study Found
The researchers compiled a dataset of 25,473 pull requests from 9,254 GitHub repositories, identifying around 1,600 GPT-assisted PRs.
Here are the standout results:
1. GPT-assisted PRs are merged significantly faster
- Median merge time for GPT-assisted PRs: ~9 hours
- Median merge time for standard PRs: ~23 hours
That's roughly a 61% reduction.
2. Faster time to first review comment
The “At Review” phase improved dramatically:
- GPT-assisted PRs: ~1 hour
- Non-assisted PRs: ~3 hours
An improvement of ~66.7%.
3. Huge impact on “Waiting for Changes” phases
This is where the biggest gains show up:
- GPT-assisted PRs median wait: ~3 hours
- Non-assisted PRs median wait: ~24 hours
An 87.5% improvement.
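As a quick sanity check, the percentages quoted above follow directly from the reported median times. A minimal Python sketch of the calculation:

```python
def reduction(baseline_hours: float, assisted_hours: float) -> float:
    """Percentage reduction in median time relative to the non-assisted baseline."""
    return (baseline_hours - assisted_hours) / baseline_hours * 100

# Median times reported in the study (hours)
print(round(reduction(23, 9), 1))   # overall merge time → 60.9
print(round(reduction(3, 1), 1))    # time to first review comment → 66.7
print(round(reduction(24, 3), 1))   # waiting-for-changes phase → 87.5
```

The ~61% figure for merge time is simply 60.9% rounded; the other two phases work out to exactly 66.7% and 87.5%.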
4. How developers actually use GPT
The study classified GPT usage in PRs as:
- 60% – Enhancements (refactoring, renaming, error handling)
- 26% – Bug fixes
- 12% – Documentation
These patterns reflect how teams are naturally using LLMs: to accelerate small but important improvements that often clog review cycles.
Why This Matters
From our perspective, the findings illustrate something we’ve observed in the industry:
LLMs aren’t replacing reviewers — they’re compressing the waiting time around reviews.
The biggest bottleneck in PR workflows isn’t usually quality or complexity — it’s idle time: waiting for feedback, waiting for changes, waiting for someone to pick up the next step.
By providing instant suggestions and improvements, GPT reduces that downtime and keeps the workflow moving. For modern teams, especially those practising trunk-based development or high-velocity delivery, this is significant.
How Teams Can Apply These Insights
1. Use LLMs to support (not replace) reviewers
The study shows the biggest improvements in enhancements and bug fixes — not major features or architectural decisions.
Teams can benefit by using GPT for:
- cleaning up code
- eliminating trivial review comments
- improving naming and documentation
- drafting initial fixes or refactor suggestions
This leaves human reviewers free to focus on design, architecture, and correctness.
2. Be transparent about AI assistance
The authors observed that GPT-assisted PRs were identifiable through commit messages, PR descriptions, and patterns of changes.
Teams may want to formalise this by:
- adding an “LLM assistance used?” checkbox
- writing guidelines for acceptable usage
- documenting reviewer expectations
This reduces ambiguity and builds trust.
3. Track and measure real impact
The study used clear PR-level metrics (merge time, time-to-review, waiting time). Teams can replicate this internally to see whether LLM assistance is actually accelerating workflows — or just adding noise.
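These metrics can be derived from timestamps your Git host already records. Here is a minimal sketch of the idea; the record format is hypothetical, and in practice the timestamps would come from your hosting platform's API (e.g. GitHub's pull request endpoints):

```python
from datetime import datetime
from statistics import median

# Hypothetical PR records; in a real pipeline these timestamps would be
# fetched from your Git hosting API rather than hard-coded.
prs = [
    {"opened": "2024-05-01T09:00", "first_review": "2024-05-01T11:00", "merged": "2024-05-01T18:00"},
    {"opened": "2024-05-02T10:00", "first_review": "2024-05-02T15:00", "merged": "2024-05-03T09:00"},
]

def hours_between(start: str, end: str) -> float:
    """Elapsed time between two ISO-style timestamps, in hours."""
    fmt = "%Y-%m-%dT%H:%M"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds() / 3600

merge_times = [hours_between(p["opened"], p["merged"]) for p in prs]
review_times = [hours_between(p["opened"], p["first_review"]) for p in prs]

print(f"median merge time: {median(merge_times):.1f} h")
print(f"median time to first review: {median(review_times):.1f} h")
```

Segmenting these medians by whether a PR was LLM-assisted (for example, via a checkbox or label as suggested above) lets a team compare its own numbers against the study's.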
4. Train teams on safe and effective usage
The study mentions common pitfalls of LLM-generated changes:
- context loss
- misinterpretation of code
- superficial fixes
- incorrect refactoring under token limits
Teams should treat LLM output like a strong-but-junior engineer’s suggestion: useful, but always reviewed critically.
5. Keep humans in the loop
Even though metrics improved dramatically, quality wasn't evaluated in this study.
Our takeaway: LLMs can speed up the mechanics of a review, but not the judgment.
Design decisions, security concerns, domain logic, and architectural tradeoffs still require experienced engineers.
Limitations to Keep in Mind
We appreciate that the study is rigorous but also transparent about its constraints:
- It uses open-source GitHub projects — enterprise workflows may behave differently.
- GPT-assisted PRs were detected through heuristics, not precise logs.
- PRs assisted by GPT may skew toward simpler tasks.
- Quality of code changes wasn't analysed, only timing.
These limitations are important for interpreting the results responsibly.
Our Takeaway
This study provides one of the earliest large-scale, quantitative assessments of how LLMs impact code review workflows — and the results are encouraging.
As the ZEN Team, our takeaway is simple:
LLMs dramatically speed up the slow, idle stages of PR workflows — not by replacing human reviewers, but by reducing friction around them.
For engineering teams looking to improve developer experience, flow efficiency, and delivery speed, integrating LLM-assisted review practices is a promising direction — as long as it’s paired with strong human oversight and sensible guidelines.

Optimize with ZEN's Expertise
Upgrade your development process or let ZEN craft a subsystem that sets the standard.