An HTML diff with the reviewer's notes pinned in the margin beside the exact lines beats a comment thread every single time, and your agent can produce it.
A pull-request comment thread is where review context goes to die. Notes scatter down a page, detached from the lines they're about by the time you scroll. The good ones get resolved and collapsed. The whole conversation evaporates when the branch merges. And the most valuable part of any review: the reviewer's reasoning is the first thing the format loses.
The fix is small and a little unfashionable: have the agent emit the review as a single self-contained HTML file. The diff down the middle, the reviewer's notes pinned in the margin next to the exact lines, each one severity-tagged, with a one-screen summary on top. One artifact you can open, send, and keep.
The artifact has a fixed shape, and the shape is the whole trick. A summary header states the verdict and the counts, how many blockers, how many nits, so the reader knows in one screen whether to merge. The diff sits in the body, hunk by hunk. And the margin notes pin to specific lines, each one tagged by severity, so the reasoning lives next to the code it is about instead of a thousand pixels below it. Here is a fragment, rendered the way the file renders:
Assignment in a condition. The original = was almost certainly meant to be ===. The fix wraps it in parens to silence the linter, but it preserves the bug: this still overwrites token and treats any truthy lookup as valid. Decide which you meant.
Pull this to a named const beside the other session settings, SESSION_TTL_SECONDS reads better than a bare arithmetic literal.
Good call wrapping the revoke in a transaction: the previous version could half-revoke on a crash.
This is also the one place the format argument resolves cleanly. A review like this is read rendered: that is its whole value, but it is generated from a diff that lives in git as text. The truth stays reviewable line by line; the view stays readable by a human. The document and the source of record never get confused, because the document is plainly a view of the source. see "Stop handing me the markdown your agent had to fight"
The instruction is short and the output is consistent once you pin two things: the structure and the tags. Tell the agent to take the diff, emit a single HTML file with all CSS inlined so it opens on a double-click, anchor every note to a line range, lead with a verdict and counts, and tag each note from a fixed set, blocker, nit, question, praise. The vocabulary is what makes the result scannable: a reader can find every blocker in one pass and skip the praise until they have time for it.
The header is the part people skip and shouldn't. Put the PR number and the commit SHA in it. That one line is what turns the file from a screenshot into a record: six weeks later you can tie the review back to exactly the code it judged.
The one-file artifact optimizes for reading a review. When the job isn't reading a finished review, use the thread.
The comment thread is optimized for the person writing the review: quick to type, quick to resolve. The one-file artifact is optimized for the person reading it. On the reviews that matter: the ones a second person has to understand, sign off on, or find again later: the reader is the point. Hand them the file.