A practical rule for AI code review: reject working diffs you cannot explain

Q: Is green CI enough for coding-agent pull requests?

No. CI can miss maintainability problems, domain-model mismatch, weak tests, data leakage, security issues, and unnecessary abstractions.

Q: What is a good first review question?

Ask the reviewer to explain the solution in their own words before discussing style or implementation details.

Q: When is agent-written code lower risk?

It is lower risk for disposable prototypes, scripts, and isolated internal tools where failure does not create durable maintenance debt.


Knowledge & Learning | Jun 21, 2026

@Zachas (Author, Admin)

A June 2026 developer essay and active Hacker News discussion point to a recurring coding-agent problem: green CI is not enough when the reviewer cannot explain the approach.

Vinicius Brasil argues that AI-generated code can pass local checks while still being the wrong solution to merge. His practical rejection test is whether the reviewer can explain the approach, justify the diff size, and see that the agent avoided premature abstraction. Hacker News discussion around the essay shows the same concern from practitioners: coding agents increase output speed, but review judgment becomes the constraint when maintainability, domain context, and hidden errors matter.

Key takeaways

The essay's core rule is simple: reject AI code when you cannot explain the approach in your own words.
Passing CI does not prove the solution is maintainable, scoped, secure, or aligned with the codebase's model.
HN comments add examples around oversized abstractions, weak human pushback, data leakage in ML code, and agents producing code faster than teams can review.
The useful workflow is slower than blind automation: plan, keep changes small, review the diff, discard weak first passes, then drive the agent with the understanding you gained.
Teams should separate low-risk disposable code from production code that future systems or people will depend on.

Practical LinkLoot angle

Use this as a review checklist for agent-authored pull requests. Before merging, the reviewer should be able to answer four questions: what problem does this solve, why is this the smallest adequate change, what assumptions did the agent make, and what failure mode would hurt users or maintainers later?
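One way to turn the four questions into a merge gate is sketched below. This is a minimal illustration, not a tool from the essay: the class name, field names, and the five-word minimum are all assumptions, and a word count only catches blank or token answers, not shallow ones.

```python
from dataclasses import dataclass, fields


@dataclass
class ReviewAnswers:
    """The reviewer's own-words answers to the four pre-merge questions."""
    problem_solved: str = ""
    why_smallest_adequate_change: str = ""
    agent_assumptions: str = ""
    worst_failure_mode: str = ""


def unanswered(answers: ReviewAnswers, min_words: int = 5) -> list[str]:
    """Return the names of questions whose answers are missing or too thin.

    min_words is an arbitrary illustrative threshold, not a measure of understanding.
    """
    return [
        f.name
        for f in fields(answers)
        if len(getattr(answers, f.name).split()) < min_words
    ]


def ready_to_merge(answers: ReviewAnswers) -> bool:
    """True only when every question has a non-trivial answer."""
    return not unanswered(answers)


# Usage: a reviewer who has only answered the first question is not ready.
draft = ReviewAnswers(
    problem_solved="Retries failed webhook deliveries with capped backoff"
)
print(unanswered(draft))
print(ready_to_merge(draft))
```

The gate cannot judge whether an answer is correct. Its only job is to make "I did not write anything down" visible before the merge button is.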

Review gate	What it catches	When it matters most	Source
Explain the approach	Blind trust in a working diff	Production code and unfamiliar domains	Essay
Compare diff size to problem size	Agent over-building and speculative abstractions	UI, API, and architecture changes	Essay, HN
Read tests and evals first	Green CI with weak or leaky checks	ML, data, auth, billing, migrations	HN
Keep human sign-off required	Ticket-closing without understanding	Team codebases with shared ownership	Essay, HN

The workflow is not anti-agent. It treats agents as implementation accelerators and keeps architectural judgment with the human who owns the code.
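The "compare diff size to problem size" gate can be partly mechanized. The sketch below parses the text produced by `git diff --numstat` (tab-separated added, deleted, and path columns, with "-" for binary files) and flags diffs that exceed a budget. The function names and the default budget of 5 files and 200 lines are assumptions for illustration; a sensible budget depends on the ticket.

```python
def diff_stats(numstat: str) -> tuple[int, int]:
    """Return (files_changed, lines_changed) from `git diff --numstat` text."""
    files = 0
    lines = 0
    for row in numstat.splitlines():
        parts = row.split("\t", 2)
        if len(parts) != 3:
            continue  # skip blank or malformed rows
        added, deleted, _path = parts
        files += 1
        # Binary files report "-" for both counts; count the file, not lines.
        if added.isdigit() and deleted.isdigit():
            lines += int(added) + int(deleted)
    return files, lines


def needs_size_justification(
    numstat: str, max_files: int = 5, max_lines: int = 200
) -> bool:
    """True when the diff is larger than the (hypothetical) budget."""
    files, lines = diff_stats(numstat)
    return files > max_files or lines > max_lines


# Usage with sample numstat text (hypothetical paths).
sample = "12\t3\tsrc/app.py\n-\t-\tassets/logo.png\n40\t0\ttests/test_app.py\n"
print(diff_stats(sample))
print(needs_size_justification(sample, max_lines=50))
```

A flag here does not mean the diff is wrong. It means the reviewer should be able to say why the problem needed that much code.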

What to verify before you act

Verify whether the post describes your team's risk profile. For prototypes, one-off scripts, and disposable internal tools, accepting a working agent diff may be rational. For durable libraries, security-sensitive paths, infrastructure, billing, analytics, data pipelines, and code other teams depend on, require explanation before merge.
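The split between disposable and durable code can be written down as path rules. The sketch below is one possible shape; every directory pattern in it is hypothetical and would need to match your repository's layout. Unclassified paths default to the stricter gate, on the assumption that unknown code is more likely to be depended on than not.

```python
from fnmatch import fnmatchcase

# Hypothetical layout: adjust both lists to your own repository.
DURABLE_PATTERNS = ["billing/*", "auth/*", "infra/*", "pipelines/*", "lib/*"]
DISPOSABLE_PATTERNS = ["prototypes/*", "scripts/one_off/*"]


def risk_tier(path: str) -> str:
    """Classify a changed path as 'durable', 'disposable', or 'unclassified'."""
    if any(fnmatchcase(path, pattern) for pattern in DURABLE_PATTERNS):
        return "durable"
    if any(fnmatchcase(path, pattern) for pattern in DISPOSABLE_PATTERNS):
        return "disposable"
    return "unclassified"


def requires_explanation(changed_paths: list[str]) -> bool:
    """True if any changed path is outside the disposable areas."""
    return any(risk_tier(path) != "disposable" for path in changed_paths)


# Usage: one durable file is enough to require the explanation gate.
print(requires_explanation(["prototypes/demo.py"]))
print(requires_explanation(["prototypes/demo.py", "billing/invoices/tax.py"]))
```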

Verify the tests, not just their status. HN commenters specifically called out cases where AI-generated ML or evaluation code appeared to work while leaking data or encoding the wrong domain assumptions.
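As one concrete example of reading an evaluation rather than trusting its status, the sketch below checks whether evaluation examples also appear in the training split. It is an assumed, deliberately narrow check: it only catches text that is identical after lowercasing and whitespace normalization, and the HN comments do not say which kind of leakage was involved. Function names are illustrative.

```python
import hashlib


def _fingerprint(text: str) -> str:
    """Hash the text after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def leaked_examples(train_texts: list[str], eval_texts: list[str]) -> list[str]:
    """Return eval examples whose normalized text also appears in training."""
    train_prints = {_fingerprint(text) for text in train_texts}
    return [text for text in eval_texts if _fingerprint(text) in train_prints]


# Usage: the first eval example duplicates a training example.
train = ["The cat sat.", "Hello world"]
evaluation = ["hello   WORLD", "A new sentence"]
print(leaked_examples(train, evaluation))
```

An empty result does not prove the evaluation is clean. Leakage through shared sources, time ordering, or derived features needs checks specific to the dataset.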

Verify ownership. If no human can explain the agent's change after review, the team has not gained maintainable code; it has gained an artifact that will be expensive to debug later.

Source check

The primary essay confirms the rejection criteria: inability to explain the approach, oversized diffs, premature abstraction, local success with worse reasoning, and trusting output more than understanding. The Hacker News thread independently confirms active practitioner concern, with examples around code-review overload, data leakage, agent loops that optimize for apparent success, and the difference between low-risk throwaway code and long-lived production systems.

FAQ

Should AI-generated code be rejected if it works?

Reject it when the reviewer cannot explain the approach, validate the assumptions, or justify the diff size for the problem.

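Is green CI enough for coding-agent pull requests?

No. CI can miss maintainability problems, domain-model mismatch, weak tests, data leakage, security issues, and unnecessary abstractions.

What is a good first review question?

Ask the reviewer to explain the solution in their own words before discussing style or implementation details.

When is agent-written code lower risk?

It is lower risk for disposable prototypes, scripts, and isolated internal tools where failure does not create durable maintenance debt.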

For workflow design, pair this with LinkLoot's guide to AI workflow automation and turn the questions above into a PR checklist.

Sources & links

References, demos, and supporting links.

Vinicius Brasil essay (vinibrasil.com), primary source
Hacker News discussion (news.ycombinator.com)