OpenBrowser connects AI agents to the browser through raw CDP. No abstraction layer. The LLM writes Python in a persistent namespace, batching operations per call. Page state compresses to ~450 characters. Benchmarked against 3 other frameworks on 6 real tasks: 100% accuracy across the board, 2.6x fewer tokens, 59% lower inference costs. Methodology is public and reproducible. MIT licensed. CLI + MCP server. 15 LLM providers. Two published RL studies training open-source models for browser control.
We built OpenBrowser because we wanted to see what happens when an AI agent talks directly to Chrome.
Most browser automation goes through an abstraction layer between the LLM and the browser. We took a different path: raw CDP (Chrome DevTools Protocol). The LLM writes Python code that executes in a persistent namespace, batching multiple browser operations into a single tool call. Page state compresses to ~450 characters. The architecture is simple, and it turns out simplicity saves tokens.
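To make the persistent-namespace idea concrete, here is a minimal sketch in plain Python. It is not OpenBrowser's actual code: the `PersistentNamespace` class and its `run` method are hypothetical names invented for illustration, and no browser is involved. It only shows how variables defined in one tool call can stay available to the next, and how several statements can be batched into one call.

```python
import contextlib
import io


class PersistentNamespace:
    """Hypothetical helper: runs code snippets against one shared dict."""

    def __init__(self):
        # Every snippet executes with this dict as its globals,
        # so names defined in one call survive into the next.
        self.namespace = {}

    def run(self, code: str) -> str:
        """Execute a batch of statements and return whatever they printed."""
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, self.namespace)
        return buffer.getvalue()


session = PersistentNamespace()

# First "tool call": several statements batched into one snippet.
session.run("visited = []\nvisited.append('https://example.com')")

# Second "tool call": reuses the variable created by the first one.
output = session.run("visited.append('https://example.org')\nprint(len(visited))")
```

In this sketch the second call sees `visited` without redefining it, which is the property that lets an agent build on earlier work instead of restating it on every call.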
We wanted numbers, not intuition. We benchmarked 4 CLI frameworks head-to-head on 6 real browser tasks using Claude Sonnet 4.6 on AWS Bedrock, N=3 runs with randomized order, 10,000-sample bootstrap confidence intervals. All four achieved 100% accuracy. OpenBrowser used 2.6x fewer tokens on average and won 5 of 6 tasks on token efficiency. Every framework in the benchmark is a good tool. Ours just found a way to do the same work with less. Full methodology and reproducible scripts: https://docs.openbrowser.me/cli-...
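For readers unfamiliar with bootstrap confidence intervals, the following is a rough sketch of a percentile bootstrap on a mean, using only the standard library. It is an assumption about the general technique, not the benchmark's actual script (see the published methodology for that). The function name `bootstrap_ci` and the token counts are made up for illustration.

```python
import random
import statistics


def bootstrap_ci(samples, n_resamples=10_000, confidence=0.95, seed=0):
    """Percentile bootstrap interval for the mean of `samples`."""
    rng = random.Random(seed)
    # Resample with replacement, record the mean of each resample, then sort.
    means = sorted(
        statistics.fmean(rng.choices(samples, k=len(samples)))
        for _ in range(n_resamples)
    )
    alpha = (1 - confidence) / 2
    low = means[int(alpha * n_resamples)]
    high = means[min(int((1 - alpha) * n_resamples), n_resamples - 1)]
    return low, high


# Placeholder token counts for three runs of one task (not real benchmark data).
token_counts = [1200, 1350, 1100]
low, high = bootstrap_ci(token_counts)
```

With only three runs per sample the interval is necessarily wide, which is why reporting it alongside the mean is more informative than the mean alone.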
Then we went further. We are post-training open-source models specifically for browser control, with two published studies so far: one applies supervised fine-tuning (SFT) followed by GRPO reinforcement learning to Qwen3-8B for web form filling, and the other is a cross-paradigm comparison with diffusion language models. Both papers and all trained models are public on ResearchGate and HuggingFace.
What ships today:
- CLI tool and MCP server (pip install openbrowser-ai)
- 15 LLM providers (OpenAI, Anthropic, Google, Bedrock, Azure, Ollama, and more)
- Cloud platform with saved auth profiles and scheduled browser workflows
- Raw CDP engine with code batching and persistent variable namespace
This started as a capstone project at the University of Toronto. Four students, four months, one architectural question that turned into a framework, a cloud platform, and two research papers.
We would love your feedback. What browser automation tasks would you throw at this?