Posts tagged "pr-reviews"

4 posts · all topics

Our best reviewer was already a prompt.

I built a code-review subagent and named it after Mauro. This is not a joke about Mauro — Mauro really is our best reviewer. He reads the code with his eyeballs. He suggests an enum every time he sees three magic numbers in a row. He reads the strings inside the code, notices when "Error fetching MauroAgent data" should have been a template literal, and tells you. He won't accept `eslint-disable-next-line` without a reason. When he sees a prompt he can't follow, he says "if I can't understand it, the LLM won't either."

I wrote those rules down. That was the agent. It took an afternoon.

Mauro's reaction was that I replaced him because I got tired of waiting for his reviews. That part is also true. The interesting thing is how little of his review style I had to invent — most of it was already a list of habits he applies in the same order to every PR. The reviewers we trust most are the ones whose taste is the most legible. Turns out legible taste compiles.

The cheapest auto-publish architecture is a queue with a click.

I tried to make /fab-note auto-publish posts the moment a draft is confirmed. Two paths to do that, and both required carving an exception out of our org's branch protection — either a bot identity in the bypass list or a PAT scoped to repository admin. Neither is wrong, but each is a small concession against the policy that says "every change to main gets reviewed." So I stopped pushing on auto-merge and made the workflow assign the PR to me instead. Total clock time from "ship it" to live: about two minutes — most of it CI checking. The clicks in between (one approval, one merge) take three seconds and they preserve the property that a human approved the change. I was solving for the wrong thing. The friction of "wait for CI, click approve, click merge" is invisible to the author because they've moved on to the next thing by the time it's their turn. The friction of "build a policy exception around a fast publish path" is permanent and visible to anyone who later asks why this repo bypasses the rules. The cheapest version of automation is the one that lets the existing policy do its job.

Validation should be a sub-agent, not a checklist.

I've been thinking about where verification belongs in an agentic pipeline. The shape I keep coming back to is a quality validation sub-agent that runs *before* PR submission — its job is to come up with and verify a validation plan for the change, including running the relevant mabl tests, capturing evidence, and attaching that evidence to the PR. Then the PR review agent enforces the existence of the validation plan, not the rules underneath it. Then the full mabl suite runs on merge, and when something breaks, a failure-analysis skill identifies which PR introduced it and suggests fixes.

The reason for that structure: at our throughput, "did the engineer remember to validate this" is the wrong question. The right question is "does the PR carry evidence that validation happened, and does the evidence hold up?" Sub-agents do the validation; the review agent checks the evidence; the merge gate trusts the chain. None of the layers is doing checklist work — each one has a specific decision to make. Most teams that try to add AI to their existing CI/CD end up with checklist agents because that's the shape of CI/CD. I don't think that's where this lands.

Asking for human reviews should be the exception, not the rule.

At our current PR throughput, I'm going to burn out on reviews alone, never mind actual work. We can't move at this pace while also having a human read every change. The model I want us to move toward: ask for human review when you're truly unsure about something, and let the agent reviewers handle the rest.

The piece I shipped to make this real was getting /codex-review wired up across every repo as an on-demand second opinion. Different model family from Claude, uncorrelated blind spots, one comment away when you want it. The work was sixteen PRs across sixteen repos to enable the GitHub Actions trigger, plus a change to our shared workflows repo to make it a first-class slash command. The conceptual work was harder: deciding that "two agent reviews and a human glance" is now an acceptable pre-merge state for routine changes, and reserving real human attention for the changes that genuinely need it. We're not all the way there. But the trajectory is clear, and I'd rather build the routing now than burn out catching up to it later.