OpenAI WebSocket Mode for Responses API

Persistent AI agents. Up to 40% faster.

Every agent turn, you're resending the full context. Again. That overhead compounds fast. WebSocket Mode for the Responses API keeps a persistent connection, sends only incremental inputs, and cuts end-to-end latency by up to 40% on heavy tool-call workflows.

Top comment

I'm happy to hunt this one. WebSocket Mode for the Responses API looks like a small infra update, but it's quietly one of the more important shifts in how production agents get built.

Most agentic workflows today are built on a protocol designed for single-turn interactions. Every tool call resends the full conversation history. The model reprocesses what it already knows. Your infrastructure pays that toll on repeat, invisibly, at scale.

This changes the contract.

What's different with WebSocket Mode:

  • One persistent connection to /v1/responses -- no new HTTP handshake per turn

  • Only incremental inputs travel over the wire, not the full context

  • Session state lives in memory -- the model picks up exactly where it left off

  • Cline tested this in production: ~39% faster on complex multi-file tasks, up to 50% in best cases

  • Pair with server-side compaction and you can run agents for hours without hitting context limits
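To make the "full context vs incremental inputs" difference concrete, here is a rough back-of-the-envelope sketch. It does not call the Responses API at all; it only counts bytes sent by the client under two sending strategies. The function names and the 2,000-byte turn size are hypothetical, chosen purely for illustration, and the result says nothing about latency (the ~40% figure above is a latency number, not a payload number).

```python
def bytes_sent_full_resend(turn_sizes):
    """Total bytes uploaded when every turn resends the whole history so far."""
    total = 0
    history = 0
    for size in turn_sizes:
        history += size   # the history grows by this turn's new input
        total += history  # and the entire history is sent again
    return total


def bytes_sent_incremental(turn_sizes):
    """Total bytes uploaded when each turn sends only its new input."""
    return sum(turn_sizes)


# Hypothetical workload: 20 tool-call turns, 2,000 bytes of new input each.
turn_sizes = [2000] * 20

full = bytes_sent_full_resend(turn_sizes)
incremental = bytes_sent_incremental(turn_sizes)

print(f"full resend: {full} bytes")
print(f"incremental: {incremental} bytes")
print(f"ratio:       {full / incremental:.1f}x")
```

Under these assumed numbers, the full-resend strategy uploads 420,000 bytes against 40,000 for the incremental one. The gap grows roughly quadratically with the number of turns, which is why the benefit shows up mainly on long, tool-heavy sessions.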

🎯 Who this is actually for:

  • Teams running agentic coding tools with repeated tool calls

  • Computer-use and browser automation loops

  • Orchestration systems where agent latency affects user-perceived quality

⚠️ One honest caveat: the WebSocket handshake adds slight TTFT overhead on short, simple tasks. The gains compound on heavy workloads, not light ones. Know your use case before you swap.
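One way to reason about that trade-off is a simple break-even estimate: a one-time connection cost against a per-turn saving. The sketch below is a toy model, not a benchmark. The function name and both millisecond figures are assumptions made up for illustration; measure your own workload before relying on any number here.

```python
import math


def break_even_turns(handshake_overhead_ms, saving_per_turn_ms):
    """Smallest number of turns at which cumulative per-turn savings
    cover the one-time connection overhead (toy model)."""
    if saving_per_turn_ms <= 0:
        raise ValueError("saving_per_turn_ms must be positive")
    return math.ceil(handshake_overhead_ms / saving_per_turn_ms)


# Hypothetical numbers: 150 ms of extra setup, 40 ms saved on each turn.
print(break_even_turns(150, 40))
```

With these assumed inputs the connection pays for itself by the fourth turn. A one-shot request never gets there, which matches the caveat above.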

For teams already running production agents, is latency or context limits the bigger blocker right now? Curious what this unlocks for people here. 👇