Guide · 8 min read

CodeRabbit + Ribo: PR Review Against Intent, Not Just Diff Hygiene

Wire Ribo into CodeRabbit as an MCP server and the PR reviewer gains a scoped, auditable view of your system's declared intent. Review becomes citation, not opinion.

A reviewer you trust leaves three comments on a pull request. Two are about extracting a helper function. One flags an unused import. The branch merges. Nine hours later, a support engineer notices the application log is writing customer email addresses to a log stream that flows, unredacted, into your data warehouse. The reviewer was thorough. It caught the bad code. It never saw the wrong code, because the thing that was wrong was not a property of the diff. It was a property of the system the diff was supposed to fit into.

This is what AI code review gets right and what it gets structurally wrong. Review that does not know what the system is supposed to be can only flag things the code looks like. CodeRabbit's MCP integration is the mechanism that closes that gap. Hook Ribo into it and review stops grading against style. It starts checking against your declarations.

AI review catches bad code. It does not catch wrong code.

CodeRabbit's 2025 analysis found that AI-authored pull requests contain roughly 1.7x more issues than human-authored ones. That figure has been cited to argue for better models, tighter style rules, and more careful prompting. All fine. None of them touch the underlying problem.

General-best-practices review has a ceiling. It can catch what a generic linter and a senior engineer with no context can catch: nil checks, unbounded allocations, dead branches, awkward abstractions. Those are real bugs. They are also the cheap category. The expensive category is code that compiles cleanly, passes tests, reads well, and contradicts a decision your team made three quarters ago that nobody wrote down in a reviewable form.

We named this earlier as the wrong-level problem. CodeRabbit is a better reviewer than what came before it. It is still reviewing at the wrong level, because it cannot see the level where most expensive bugs live: the declared intent of the system. That is not CodeRabbit's fault. It is the fault of an industry that has no shared place to declare what a system is supposed to be. MCP changes that, because the review agent can now reach out during analysis and ask.

What changes when the reviewer can cite the source of truth

CodeRabbit's review agent automatically calls relevant MCP tools during PR analysis. Once Ribo is configured as an MCP server on your CodeRabbit workspace, the agent gains four capabilities it did not previously have: it can search the artifacts that govern the area the PR touches, retrieve the specific declaration that applies, enumerate the artifacts in a project, and walk the relationships between them. The review is no longer a monologue from a linter with LLM taste. It becomes an argument grounded in your own declared identity.
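Those four capabilities can be sketched as a call sequence. This is illustrative only: the tool names (search_artifacts, get_artifact, list_artifacts, get_related) and the data shapes are invented stand-ins for whatever the Ribo MCP server actually exposes; the point is the sequence of lookups, not the API.

```python
# Hypothetical sketch of the four lookups a review agent gains once the
# Ribo MCP server is mounted. Tool names and payload shapes are invented.

def review_with_context(diff_paths, project, mcp):
    """Gather the declared intent governing the files a PR touches."""
    # 1. Search the artifacts that govern the area the PR touches.
    hits = mcp.call("search_artifacts", query=" ".join(diff_paths))
    # 2. Retrieve the specific declarations that apply.
    artifacts = [mcp.call("get_artifact", id=h["id"]) for h in hits]
    # 3. Enumerate everything the project declares, for coverage checks.
    all_ids = mcp.call("list_artifacts", project=project)
    # 4. Walk relationships, e.g. from a contract to the constraints it cites.
    related = [mcp.call("get_related", id=a["id"]) for a in artifacts]
    return artifacts, all_ids, related
```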

Take the opening scenario. The PR adds an endpoint that writes a structured log on every authenticated request. The engineer who wrote it used the same logging call as everywhere else in the service. The diff is clean. The tests pass. Without MCP, CodeRabbit has nothing to check against except style.

With Ribo mounted, the sequence looks different. CodeRabbit searches Ribo for artifacts related to the paths and symbols in the diff. It finds a compliance artifact declaring email and phone_number as PII. It finds a constraint artifact stating that application logs must not contain PII. It finds a contract artifact on the logger that declares a sanitization layer, and notices the new call bypasses it. It writes a review comment that says, in effect: the log line on line 42 emits user.Email directly, which the compliance artifact cmp_pii_v3 classifies as PII, and the constraint artifact c_log_sanitization forbids in logs. It cites the artifact IDs. A human reviewer could not have produced that comment without spending an hour reading internal docs. CodeRabbit produces it in seconds because the declarations are retrievable.
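To make the drift concrete, here is a minimal Python sketch of the kind of sanitization layer the contract artifact declares. The field names (email, phone_number) mirror the compliance artifact in the scenario; the function names and record shapes are invented for illustration.

```python
# Hypothetical sketch of a log-sanitization layer of the kind the contract
# artifact declares. PII field names follow the scenario's cmp_pii_v3
# compliance artifact; everything else is invented.

PII_FIELDS = {"email", "phone_number"}

def sanitize(record: dict) -> dict:
    """Redact declared PII before the record reaches the log stream."""
    return {k: ("[REDACTED]" if k in PII_FIELDS else v)
            for k, v in record.items()}

def log_event(event: str, record: dict) -> str:
    """The canonical logging path: every record passes through sanitize()."""
    return f"{event} {sanitize(record)}"

# The PR in the scenario bypasses this path and logs the email field
# directly. That bypass is exactly the drift the reviewer can now cite.
```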

That is the qualitative shift. Review stops being opinion and becomes citation. The reviewer's job moves from "what do I remember about this system?" to "what did the team declare about this system, and does this change hold?"

Role-appropriate projection: what the reviewer should see, and what it should not

One source of truth does not mean every integration reads every layer. Ribo's DNA splits into thirteen kinds: intent, contract, algorithm, evaluation, escalation, pace, monitor, glossary, integration, reporting, compliance, constraint, and tradeoff. A PR reviewer needs most of those, but not all. It should not see operational telemetry or incident routing. It does need the declarations that govern whether a change is correct for this system.

For CodeRabbit, the appropriate projection is nine layers: intent, contract, algorithm, evaluation, compliance, constraint, tradeoff, glossary, and integration. Intent tells the reviewer what the component exists to do. Contract tells it what the component promises. Algorithm tells it how a computation must behave. Evaluation tells it how correctness is measured. Compliance, constraint, and tradeoff tell it what the system must not become. Glossary tells it what the team's words actually mean, so the reviewer stops misreading tenant as an org when your system uses it to mean a logical deployment. Integration tells it where the system talks to the outside world, so a PR that starts calling a new billing gateway gets flagged against the integration artifact that named Stripe as the payment provider.

The four layers a PR reviewer should not see are pace, monitor, reporting, and escalation. Pace, monitor, and reporting are operational: how fast changes roll out, how the system watches itself, what gets reported upstream. Escalation is incident routing: who gets paged when a failure mode fires. None of them inform a diff review. Exposing them adds noise without adding review value, and widens the blast radius if the token leaks.

Ribo mints a separate token per integration, with an allowed_kinds list scoping which layers the integration can read. A CodeRabbit token created with the nine-layer projection above sees only those layers, regardless of what else the workspace contains. A support bot wired into the same Ribo workspace gets a different, narrower token. One source of truth, many projections.
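The projection mechanism can be sketched in a few lines. The kind names come from the article; the data structures and function are invented for illustration, assuming artifacts carry a kind field.

```python
# Sketch of role-appropriate projection: one artifact store, many per-token
# views. Kind names are from the article; the shapes are invented.

REVIEWER_KINDS = {
    "intent", "contract", "algorithm", "evaluation", "compliance",
    "constraint", "tradeoff", "glossary", "integration",
}

def project(artifacts: list[dict], allowed_kinds: set[str]) -> list[dict]:
    """Return only the artifacts a token's allowed_kinds list permits."""
    return [a for a in artifacts if a["kind"] in allowed_kinds]
```

A support bot's token would pass a narrower set to the same function; the store never changes, only the view.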

Two trade-offs to name. The first is latency: every MCP call the review agent makes adds to review time. On a large PR with many affected artifacts this is measurable but not painful, typically single-digit seconds. The second is MCP server capacity: CodeRabbit Pro supports up to five configured MCP servers, Pro+ supports fifteen, and Enterprise supports twenty. Ribo consumes one slot. If you plan to add several, think about which are load-bearing for review and which are not.

How to wire it up

Two surfaces, one token passing between them.

First, mint the token in the Ribo console. Navigate to the Agents page and create a new agent named after the repository or workspace you are wiring up (something like coderabbit-payments-monorepo makes future audit logs legible). Set scopes to artifact:read. Set projectIds to the Ribo project or projects that cover this repository. Set allowed_kinds to the nine-layer projection above: intent, contract, algorithm, evaluation, compliance, constraint, tradeoff, glossary, integration. Create the agent. Ribo returns the plaintext bearer token exactly once; copy it into your secrets manager immediately.
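Written out, the agent record those settings produce looks roughly like the following. The field names (scopes, projectIds, allowed_kinds) come from the steps above; the agent name and project ID are placeholder values, and the JSON shape itself is illustrative rather than Ribo's actual schema.

```json
{
  "name": "coderabbit-payments-monorepo",
  "scopes": ["artifact:read"],
  "projectIds": ["payments"],
  "allowed_kinds": [
    "intent", "contract", "algorithm", "evaluation",
    "compliance", "constraint", "tradeoff", "glossary", "integration"
  ]
}
```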

Second, configure the MCP server in CodeRabbit. Go to app.coderabbit.ai/integrations, open the MCP Server Integrations section, and click New MCP Server. The configuration needs four fields:

Name: Ribo DNA (or your internal naming convention)
Server URL: https://mcp.ribo.dev/context/mcp
Authentication: Bearer token (paste the agent key from the Ribo console)
User guidance: Instructions for the review agent (see below)

The User guidance field is the part most teams underfill. It is free-text instructions that CodeRabbit's review agent reads before using the MCP server, and it is where you translate "review against intent" into operational direction. A starting template that has worked for teams piloting this integration:

Before flagging a style or structural issue on this PR, search Ribo for artifacts related to the files and symbols in the diff. If the change touches an area governed by a declared intent, contract, constraint, compliance, or evaluation artifact, cite the artifact ID in your comment and explain whether the change is consistent with it or drifts from it. Prefer drift citations over general best-practice suggestions when both apply.

CodeRabbit's MCP server integrations docs cover the current UI in more detail and should be considered authoritative for any field names or tier behavior that changes. Once saved, CodeRabbit will automatically call relevant Ribo tools during every subsequent review, and surface the artifacts it consulted under the "Additional context used" section of its review report.

Before and after

The concrete change shows up in the review comments themselves.

Before the integration, a typical CodeRabbit comment on a cache invalidation change reads something like: "Consider extracting this repeated pattern into a helper function. Also, the error handling here could return early to reduce nesting." True. Style-level. Forgettable.

After the integration, on the same change: "This handler writes to the session cache without invalidating the user-profile cache. Contract ct_session_profile_coherence declares that any write to the session store must invalidate the paired profile entry before returning. The current change returns before the invalidation, which will produce stale-read behavior on the next profile fetch. Recommend routing through writeSessionWithInvalidation() from pkg/cache/coherent.go, which the contract references as the canonical write path."
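For readers who want the contract's shape rather than its prose, here is a hypothetical Python analogue of the canonical write path the comment references. The real helper in the scenario is the Go function writeSessionWithInvalidation(); the cache layout below is invented for illustration.

```python
# Hypothetical Python analogue of the canonical write path described by
# contract ct_session_profile_coherence: a session write must invalidate
# the paired profile entry before returning. Store layout is invented.

def write_session_with_invalidation(session_cache: dict, profile_cache: dict,
                                    user_id: str, session: dict) -> None:
    """Write the session, then invalidate the paired profile entry."""
    session_cache[user_id] = session
    profile_cache.pop(user_id, None)  # the step the drifting PR skipped
```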

The difference is not verbosity. It is citation. The comment is now anchored to an artifact the team authored deliberately. A disagreement about the comment becomes a disagreement about whether the artifact is right, not whether the reviewer has good taste. That is review at the right level.

What it's worth

Most teams we see adopt AI code review because review has become the bottleneck. One or two senior engineers hold the mental model that makes review possible, and they cannot scale. AI review tools absorb some of the load, but the load they absorb is the cheap category: style and surface bugs. The expensive category stays on the senior engineers' queue because only they hold the system's intent.

When the reviewer can cite the declared intent, the expensive category moves. Not all of it, but enough to matter. Architectural drift gets flagged in the PR instead of in the postmortem. Compliance constraints stop being a tribal memory and become a review gate. The senior engineers who used to be the only people who could say "that violates our contract with billing" now only need to confirm a comment that already said it.

The log lines that leak PII still get written. Some of them still merge. The ones that do not are the PRs where the reviewer had the compliance artifact open.


Review against source of truth catches what review against style cannot. Ribo is where we declare that truth so the reviewer can cite it while the diff is open.


The ribo.dev Team

Building the identity layer for software.
