On the surface, these four tools look interchangeable: each lets an AI agent
navigate a page, click an element, fill a form, take a screenshot. Underneath,
they make very different bets about how the agent talks to the browser, and
that bet determines which one you actually want.
Three of the four are MCP servers: long-running processes the
agent speaks to over a tool-protocol channel, with each capability registered
as a named tool in the agent's prompt. The fourth, Vercel Labs'
agent-browser, is a plain CLI: a Rust binary
the agent invokes by writing agent-browser click @e2 in a shell, the
same way it'd write grep or git.
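
To make the two call shapes concrete, here is a minimal Python sketch of what an agent harness might emit in each case. The CLI path is just an argv list handed to a one-shot subprocess; the MCP path is a JSON-RPC 2.0 tools/call request sent to a server that is already running. The agent-browser click @e2 command comes from the text above. The MCP tool name "click", its "ref" argument, and the helper function names are assumptions for illustration, not any particular server's schema.

```python
import json
import subprocess


def cli_argv(action: str, *args: str) -> list[str]:
    """Build the argv for a one-shot CLI call, e.g. agent-browser click @e2."""
    return ["agent-browser", action, *args]


def run_cli(argv: list[str]) -> subprocess.CompletedProcess:
    """Run the CLI once; the process exits when the command finishes."""
    return subprocess.run(argv, capture_output=True, text=True, check=False)


def mcp_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 tools/call request as an MCP client would send it.

    The tool name and argument keys passed in below are hypothetical;
    a real server advertises its own names and schemas via tools/list.
    """
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


if __name__ == "__main__":
    # CLI: nothing is registered up front; the agent just writes a command.
    print(cli_argv("click", "@e2"))
    # MCP: the agent names a tool that was declared in its prompt beforehand.
    print(mcp_tool_call(1, "click", {"ref": "e2"}))
```

The sketch only builds the two messages; run_cli is included to show that the CLI path needs no long-lived connection, whereas the MCP request presumes a session that was set up earlier.
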
The CLI-vs-MCP split is the live argument of early 2026. MCP servers register
every tool definition into the agent's context window, so a 29-tool server like
Chrome DevTools MCP costs ~18k tokens just to exist
in the prompt. A CLI registers nothing: the agent reads --help
when it needs to, like a developer. That's why some shops (Perplexity, YC's
internal tooling) have been publicly moving back from MCP to CLI for tools
that don't need persistent state.
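
The arithmetic behind that claim is easy to sketch. Taking the figures above at face value, ~18k tokens across 29 tools averages out to a little over 600 tokens per tool definition, and that cost sits in the prompt whether or not a given tool is ever called. The Python sketch below does the division and shows a crude way to estimate what a set of tool schemas costs. The 4-characters-per-token ratio is a rough rule of thumb rather than a real tokenizer, and the sample schema is hypothetical.

```python
import json

CHARS_PER_TOKEN = 4  # assumed heuristic; real tokenizers vary by model


def estimate_tokens(tool_definitions: list[dict]) -> int:
    """Roughly estimate prompt tokens used by serialized tool definitions."""
    serialized = json.dumps(tool_definitions, separators=(",", ":"))
    return len(serialized) // CHARS_PER_TOKEN


def average_per_tool(total_tokens: int, tool_count: int) -> float:
    """Average context cost of one tool definition."""
    return total_tokens / tool_count


# Figures quoted in the text: ~18k tokens for a 29-tool server.
per_tool = average_per_tool(18_000, 29)

# Hypothetical tool schema, only to give the estimator something to measure.
sample_tool = {
    "name": "click",
    "description": "Click the element identified by an accessibility ref.",
    "inputSchema": {
        "type": "object",
        "properties": {"ref": {"type": "string"}},
        "required": ["ref"],
    },
}

if __name__ == "__main__":
    print(f"average per tool: {per_tool:.0f} tokens")
    print(f"sample schema estimate: {estimate_tokens([sample_tool])} tokens")
```

A CLI's equivalent cost is zero until the agent chooses to read --help, at which point it pays once for the output it asked for.
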

"Playwright is in the business of driving a browser, and Chrome DevTools MCP is
in the business of debugging one."
(Steve Kinney, "Playwright vs. Chrome DevTools MCP: Driving vs. Debugging")

Kinney's framing is the cleanest one I've found. Driving is
the Playwright/agent-browser axis: deterministic clicks against an accessibility
tree, designed to make the user flow work end-to-end.
Debugging is the Chrome DevTools axis: performance traces,
network waterfalls, source-mapped stack traces, the same data a human developer
opens DevTools to see. The four contenders below sort cleanly onto that split.
A third dimension, session reuse, determines whether the
agent runs in a fresh sandbox or attaches to the browser you're already logged
into.