The Real-Time Decision Nobody Explains Well
Every founder building a modern web app hits the same question: how do I push data from the server to the client in real time? The three mainstream answers are WebSockets, Server-Sent Events (SSE), and Long Polling. All three have been around for over a decade. Most engineering teams still pick the wrong one because the trade-offs are not obvious until you are debugging a production issue at 2am.
This guide walks through when each protocol wins, how they actually behave under load, and what it costs to operate them at scale. The short version: WebSockets are overused, SSE is underrated, and Long Polling is still useful in specific cases. But the full answer depends on your traffic shape, infrastructure, and what you are actually building.
For a broader look at real-time architecture, our real-time features guide covers the higher-level patterns. This article focuses specifically on the protocol choice.
WebSockets: Bidirectional and Persistent
WebSockets upgrade an HTTP connection to a persistent bidirectional TCP channel. Once the handshake completes, the client and server can both send messages without the overhead of new HTTP requests. This is the protocol most teams reach for first, often without realizing they do not actually need bidirectional communication.
How it works. The client sends an HTTP Upgrade request. The server responds with a 101 Switching Protocols. From that point, the connection stays open and both sides exchange binary or text frames. Frames can be tiny (a few bytes) or large (megabytes), and the framing protocol handles message boundaries automatically.
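If you have not seen the mechanics before, here is roughly what the server side looks like in Node.js using the popular ws package. The port and the echo behavior are placeholders; this is a sketch of the flow, not a production server.

```ts
// Minimal WebSocket echo server using the "ws" package (npm install ws).
// The port and message format are illustrative, not prescriptive.
import { WebSocketServer } from "ws";

const wss = new WebSocketServer({ port: 8080 });

wss.on("connection", (socket) => {
  // The HTTP Upgrade handshake has already completed by the time this fires;
  // from here on, both sides exchange frames over the open TCP channel.
  socket.on("message", (data) => {
    // Echo the frame back, demonstrating bidirectional messaging.
    socket.send(`echo: ${data.toString()}`);
  });

  socket.send(JSON.stringify({ type: "welcome" }));
});
```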
What it is good for. Chat applications, collaborative editors, multiplayer games, live trading platforms, and anything where the client needs to push data back to the server frequently. If you are building a Figma clone or a chat app, WebSockets are the obvious choice.
The problems. Persistent connections consume server memory and file descriptors. A single Node.js server typically handles 10K to 50K concurrent WebSocket connections before hitting limits. Load balancers and proxies sometimes drop idle connections, so you need client-side reconnect logic and heartbeat pings to keep connections alive. Corporate firewalls occasionally block WebSocket upgrades, though this is rare in 2026.
Scaling. Horizontal scaling with WebSockets requires a pub/sub backplane (Redis, NATS, or a managed service) because messages from user A on server 1 need to reach user B on server 2. This adds operational complexity and latency.
Authentication. WebSocket handshakes support cookies and custom headers, but refreshing tokens mid-connection is awkward. Most production WebSocket systems authenticate at handshake time and close the connection if the session expires.
Server-Sent Events: Underrated and Simpler
Server-Sent Events are a one-way streaming protocol built on top of standard HTTP. The client opens a GET request with the right headers (Accept: text/event-stream) and the server keeps the connection open, sending text events (each terminated by a blank line) as they happen. The browser's EventSource API handles reconnection, event parsing, and event IDs automatically.
How it works. Regular HTTP request. Content-Type is text/event-stream. The server streams lines formatted as "data: {json}\n\n" until it chooses to close. The browser reconnects automatically with the last received event ID, so you can resume from where you left off.
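Here is a minimal sketch of both ends, assuming a Node.js server and the browser's EventSource API. The /events path, the one-second tick, and the payload shape are illustrative.

```ts
// Minimal SSE endpoint with Node's built-in http module.
// The /events path and event payloads are placeholders.
import { createServer } from "http";

createServer((req, res) => {
  if (req.url !== "/events") {
    res.writeHead(404).end();
    return;
  }
  res.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    Connection: "keep-alive",
  });

  let id = 0;
  const timer = setInterval(() => {
    // Each event: an optional "id:" line (used by the browser to resume),
    // then "data:" lines, terminated by a blank line.
    res.write(`id: ${++id}\ndata: ${JSON.stringify({ tick: Date.now() })}\n\n`);
  }, 1000);

  req.on("close", () => clearInterval(timer));
}).listen(3000);

// Browser side: EventSource handles parsing, reconnection, and Last-Event-ID.
// const source = new EventSource("/events");
// source.onmessage = (event) => console.log(JSON.parse(event.data));
```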
What it is good for. Real-time notifications, live dashboards, stock tickers, AI response streaming, activity feeds, progress updates, and any case where the server pushes data and the client only needs to send occasional requests (which can go through a normal HTTP endpoint). SSE is the right choice for maybe 60% of the cases where teams default to WebSockets.
Why SSE keeps winning. Standard HTTP means proxies, load balancers, CDNs, and firewalls handle it natively. Authentication works with normal cookies and headers. Reconnection is built into the browser. There is no framing protocol to debug. You can use the same HTTP middleware you already use for regular API requests.
The limitations. SSE is unidirectional. The client cannot push data to the server over the same connection. This is only a problem if you genuinely need the bidirectional messaging that WebSockets provide natively. For most use cases, a separate HTTP POST for client-to-server messages works fine.
HTTP/2 and HTTP/3 bonus. Over HTTP/2, SSE streams are multiplexed onto a single connection, so the old 6-connection-per-host browser limit (an HTTP/1.1 constraint) no longer applies. You can run many concurrent SSE streams per tab without exhausting browser limits, which removes the main legacy objection to SSE.
AI streaming use case. Every modern LLM provider (OpenAI, Anthropic, Google) streams completions over SSE. If you are building an AI product, SSE is not just an option, it is the standard.
Long Polling: Still Useful in 2026
Long polling is the oldest of the three. The client makes an HTTP request, the server holds the request open until it has data or a timeout fires, and then the client immediately makes another request. It looks like regular HTTP from every perspective (proxies, firewalls, logging, monitoring), which is its biggest strength.
How it works. Client sends GET /events?since=123. Server checks if there are any events since that cursor. If yes, respond immediately. If not, hold the connection open for up to 30 to 60 seconds. When an event arrives, respond. Client sees the response and immediately makes another request with the updated cursor.
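A client-side sketch of that loop, assuming a fetch-capable environment and a hypothetical /events?since= endpoint that returns { events, nextCursor }:

```ts
// Minimal long-polling client loop using fetch. The endpoint, the "since"
// cursor parameter, and the response shape are assumptions for illustration.
async function pollLoop(baseUrl: string): Promise<void> {
  let cursor = 0;
  while (true) {
    try {
      // The server holds this request open (e.g. up to 30-60s) until it has
      // events newer than the cursor, or returns an empty list on timeout.
      const res = await fetch(`${baseUrl}/events?since=${cursor}`);
      const { events, nextCursor } = await res.json();
      for (const event of events) handleEvent(event);
      cursor = nextCursor ?? cursor;
    } catch {
      // Network error: back off briefly before polling again.
      await new Promise((resolve) => setTimeout(resolve, 2000));
    }
  }
}

function handleEvent(event: unknown): void {
  console.log("event:", event);
}
```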
What it is good for. Environments with strict proxy or firewall rules that block WebSockets or SSE. Mobile apps that need to preserve battery (each response ends the connection, letting the radio sleep). Low-traffic applications where the extra request overhead does not matter. Backup option when WebSockets or SSE are not available.
The downsides. Higher latency than WebSockets or SSE because each event triggers a round-trip of request setup. Higher bandwidth usage because every poll carries HTTP headers. Harder to implement correctly: you have to handle the race condition between "request times out" and "event arrives at the same instant."
When to reach for it. Honestly, not often in 2026. The main cases are legacy enterprise environments with hostile proxies and mobile apps that need extreme battery conservation. Otherwise, pick WebSockets or SSE.
How They Scale Under Load
The protocol choice matters less at 100 concurrent users. It starts to matter a lot at 10,000. Here is what actually breaks as you scale.
WebSockets at scale. The main constraint is the number of persistent connections per server. A tuned Node.js process (with ulimit raised and TCP keepalives configured) handles 30K to 100K WebSocket connections. Go and Rust servers can go higher. Beyond that, you shard connections across servers and need a pub/sub backplane. At 1M+ concurrent connections, you are looking at a dedicated team managing the WebSocket layer.
SSE at scale. The concurrent connection limits are similar to WebSockets: the connections are plain HTTP, but they stay open just as long. The advantage: SSE plays nicely with HTTP infrastructure, so you can put a CDN in front (Cloudflare, Fastly) and let the CDN handle connection termination. This can dramatically reduce the number of connections your origin sees.
Long polling at scale. Each request is short-lived (30 to 60 seconds max), so server memory is not the bottleneck. The problem is request overhead. At 100K active users, each completing a poll cycle every 30 to 60 seconds, you are fielding roughly 2,000 to 3,000 requests per second just from poll cycles. This is manageable but expensive compared to SSE or WebSockets.
Pub/sub backplane. For WebSockets and SSE, horizontal scaling requires a way to deliver a message from any server to any connection. Redis Pub/Sub is the simplest option. NATS or Apache Pulsar scale further. Managed services like Ably, Pusher, and PubNub handle the backplane for you.
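To make the backplane idea concrete, here is a rough sketch using Redis Pub/Sub via ioredis alongside a ws server. The channel name and the per-user routing are assumptions; real systems usually shard channels rather than broadcasting everything to every server.

```ts
// Sketch of a Redis pub/sub backplane for fanning out messages across
// WebSocket servers. Channel name and userId-based routing are illustrative.
import Redis from "ioredis";
import { WebSocket } from "ws";

const pub = new Redis();
const sub = new Redis(); // subscriber connections must be dedicated

// Connections held by *this* server instance, keyed by user id.
const localSockets = new Map<string, WebSocket>();

sub.subscribe("broadcast");
sub.on("message", (_channel, raw) => {
  const { userId, payload } = JSON.parse(raw);
  // Deliver only if the target user is connected to this instance.
  const socket = localSockets.get(userId);
  if (socket && socket.readyState === WebSocket.OPEN) {
    socket.send(JSON.stringify(payload));
  }
});

// Any server can publish; every server's subscriber receives it.
export function sendToUser(userId: string, payload: unknown): void {
  pub.publish("broadcast", JSON.stringify({ userId, payload }));
}
```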
If your use case involves reliable delivery of notifications to users who might be offline, our scalable notification system guide covers the persistence and retry patterns that sit on top of the real-time protocol.
Managed Services: Build vs Buy
Running your own WebSocket or SSE infrastructure is real work. Several managed services exist that handle the scaling, reconnection, and backplane for you. Here is the 2026 landscape.
Pusher. The original managed WebSocket service. Simple, well-documented, pay-as-you-go. $49 per month for 500 concurrent connections, $99 for 2,000. Starts to get expensive above 50K connections but handles everything you need out of the box.
Ably. More feature-rich than Pusher. Supports multiple protocols (WebSockets, SSE, MQTT), history, presence, and token authentication. Pricing is usage-based and scales more gracefully than Pusher at high volumes.
PubNub. Enterprise-oriented. Strong global network, reliable delivery guarantees, and a long feature list. Priced higher than Pusher or Ably but valuable if you need their specific features like functions and access manager.
Soketi. Open-source, Pusher-compatible. Self-host to cut costs and avoid vendor lock-in. Great option if you want Pusher's API without the bill.
Centrifugo. Open-source real-time messaging server. Supports WebSocket, SSE, and other transports. Used at scale by companies that outgrew managed services.
Cloudflare Durable Objects. Not a drop-in replacement, but an interesting pattern. Each Durable Object is a stateful, WebSocket-capable instance you can route clients to. Used for coordination and small-scale real-time workloads.
Supabase Realtime. Built on Phoenix Channels, integrated with Postgres row-level changes. Good if you are already using Supabase and your real-time needs are tied to database updates.
The build vs buy decision. Under 50K concurrent connections, a managed service almost always wins on total cost of ownership (engineering time beats infrastructure cost). Above 500K concurrent, self-hosting with Soketi, Centrifugo, or a custom Go service starts to pay off. Between 50K and 500K, it depends on team skill and how much real-time is core to your business.
Authentication, Reconnection, and Delivery Guarantees
These are the three things that separate toy real-time systems from production ones. Whatever protocol you pick, you need to handle all three.
Authentication. For WebSockets, pass a token in the query string or a cookie during the handshake (browser WebSocket clients cannot set custom headers; non-browser clients can). Validate at connect time and tie the connection to the authenticated user. For SSE, use normal HTTP cookies, or Authorization headers if you stream via fetch rather than the native EventSource API. For long polling, every request authenticates normally.
Token expiration mid-connection. This is where teams trip up. If a user's session expires while their WebSocket is still open, the server should close the connection and force a reconnect. Build this into your heartbeat logic: the server periodically checks token validity and disconnects expired sessions.
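A sketch of that pattern, assuming a hypothetical isSessionValid check against your session store and a 30-second interval:

```ts
// Periodically re-check session validity on long-lived WebSocket connections
// and close expired ones. isSessionValid and the interval are assumptions.
import { WebSocket } from "ws";

declare function isSessionValid(sessionId: string): Promise<boolean>;

export function watchSession(socket: WebSocket, sessionId: string): void {
  const timer = setInterval(async () => {
    if (!(await isSessionValid(sessionId))) {
      clearInterval(timer);
      // 4001 is an application-defined close code; the client should
      // re-authenticate before reconnecting.
      socket.close(4001, "session expired");
    }
  }, 30_000);

  socket.on("close", () => clearInterval(timer));
}
```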
Reconnection. SSE has reconnection built into the browser's EventSource API. WebSockets require manual reconnect logic with exponential backoff and jitter. The standard pattern is: initial delay 1 second, max delay 30 seconds, jitter 0 to 1 second. Give up after 10 failed attempts and surface an error to the user.
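A minimal browser-side sketch of that backoff pattern (the 10-attempt cap and the error handling are placeholders for whatever your UI does):

```ts
// Reconnect with exponential backoff and jitter:
// initial delay 1s, max 30s, 0-1s jitter, give up after 10 failed attempts.
function connect(url: string, attempt = 0): void {
  const socket = new WebSocket(url);

  socket.onopen = () => {
    attempt = 0; // reset backoff once a connection succeeds
  };

  socket.onclose = () => {
    if (attempt >= 10) {
      console.error("connection lost; giving up"); // surface to the user here
      return;
    }
    const base = Math.min(1000 * 2 ** attempt, 30_000);
    const jitter = Math.random() * 1000;
    setTimeout(() => connect(url, attempt + 1), base + jitter);
  };

  socket.onmessage = (event) => {
    console.log("message:", event.data);
  };
}
```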
Message delivery guarantees. By default, WebSockets and SSE are fire-and-forget. If the connection drops mid-message, the message is lost. For anything important (chat, notifications, transaction updates), you need at-least-once delivery: the server stores outbound messages with an ID, the client acknowledges received messages, and on reconnect the client requests any missed messages.
Ordering. TCP gives you in-order delivery within a single connection. Across reconnects, ordering is your responsibility. Include sequence numbers in every message and deduplicate on the client side.
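Here is a rough client-side sketch combining acknowledgements, sequence numbers, and deduplication. The ack and resume frame shapes are assumptions; the point is the bookkeeping, not the exact wire format.

```ts
// Client-side at-least-once delivery with ordering: every message carries a
// sequence number, the client acks what it has seen, and on reconnect it asks
// the server to replay anything after the last acknowledged message.
type ServerMessage = { seq: number; payload: unknown };

let lastSeq = 0;

function handleIncoming(socket: WebSocket, raw: string): void {
  const msg: ServerMessage = JSON.parse(raw);

  // Deduplicate: ignore anything we've already processed.
  if (msg.seq <= lastSeq) return;

  process(msg.payload);
  lastSeq = msg.seq;

  // Acknowledge so the server can drop its stored copy of the message.
  socket.send(JSON.stringify({ type: "ack", seq: msg.seq }));
}

function onReconnect(socket: WebSocket): void {
  // Ask the server to replay everything after the last message we saw.
  socket.send(JSON.stringify({ type: "resume", after: lastSeq }));
}

function process(payload: unknown): void {
  console.log("delivered:", payload);
}
```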
Presence. Tracking who is online requires a heartbeat. Client sends a ping every 30 seconds. Server marks the user offline if no ping in 60 seconds. At scale, presence state lives in Redis or a managed service because you cannot query every server for every user.
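One common way to implement this is a Redis key per user with a short TTL, refreshed on every heartbeat. A sketch, with the key names and TTL as assumptions:

```ts
// Presence via expiring Redis keys: a user is "online" as long as their
// key exists. A sorted set of last-seen timestamps works just as well.
import Redis from "ioredis";

const redis = new Redis();

// Called whenever a heartbeat ping arrives for this user.
export async function recordHeartbeat(userId: string): Promise<void> {
  await redis.set(`presence:${userId}`, Date.now().toString(), "EX", 60);
}

export async function isOnline(userId: string): Promise<boolean> {
  return (await redis.exists(`presence:${userId}`)) === 1;
}
```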
The Decision Framework for 2026
Here is the decision framework we use with clients. It covers 95% of real-time decisions in 2026.
Use SSE when: The server pushes data and the client rarely or never needs to push back over the same connection. Examples: notifications, live dashboards, activity feeds, AI streaming responses, live comments, live scores, real-time analytics. This is the default choice for most new products.
Use WebSockets when: The client and server need to exchange messages frequently and bidirectionally. Examples: chat, collaborative editing (Figma, Google Docs), multiplayer games, live trading platforms, video or audio control channels. Do not reach for WebSockets just because you might need bidirectional later. Migrate when you actually need it.
Use long polling when: You are stuck behind hostile proxies, need extreme battery conservation on mobile, or building a fallback path for environments where WebSockets and SSE are blocked.
Use a managed service when: Your real-time needs are a feature, not the core product. Pusher, Ably, or Supabase Realtime will get you to 100K concurrent users without a dedicated infrastructure team.
Self-host when: Real-time is core to your product, you have engineers who know their way around Go or Rust, and your scale justifies the operational investment.
Common mistakes. Reaching for WebSockets when SSE would work. Underestimating reconnect logic. Forgetting authentication refresh on long-lived connections. Skipping delivery guarantees and then debugging lost messages at 2am. Using long polling for AI streaming when SSE is the standard.
For a concrete example of putting it all together in a product, our collaboration tool build guide walks through the architectural decisions that apply to real-time apps end to end.
If you are architecting a real-time feature and trying to decide between these protocols or between managed and self-hosted options, we help engineering teams make these calls every week. Book a free strategy call and we will walk through the trade-offs for your specific use case and scale.