This document provides an overview of real-time web technologies including Comet, long polling, HTTP streaming, forever frames, Server-Sent Events, and WebSockets. Comet is an umbrella term for techniques used to push data from a server to a browser in real-time. Long polling, HTTP streaming, and forever frames are different Comet programming models. Server-Sent Events and WebSockets are HTML5 solutions for real-time connections, with Server-Sent Events using HTTP streaming and WebSockets using a bidirectional TCP connection. Both approaches have advantages and disadvantages in terms of features, reliability, and proxy/firewall support.
2. COMET
Comet is a programming model that allows a web server to
push data to a browser.
This is often achieved through a long-held HTTP request, but
there really is no standard or specification for Comet; it is
just an umbrella term for all the different ways that can be
used to achieve this.
The different Comet programming models are:
Long Polling
HTTP Streaming
Forever-frames
3. LONG POLLING
With long polling, the client sends an HTTP request and
waits for a server event. If an event occurs on the server
side, the server sends the response including the event data.
After receiving the response containing the event data, the
client will send a request again, waiting for the next event.
There is always a pending request which allows the server to
send a response at any time.
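The request/response cycle above can be sketched with a toy in-memory model. This is only an illustration, not a real HTTP server: the class `LongPollServer`, its methods, and the helper functions are hypothetical names, and a blocking queue stands in for the held-open request.

```python
import queue
import threading


class LongPollServer:
    """Toy in-memory stand-in for a long-polling HTTP endpoint (hypothetical)."""

    def __init__(self):
        self._events = queue.Queue()

    def publish(self, event):
        # Called when something happens on the server side.
        self._events.put(event)

    def handle_poll(self, timeout):
        # Hold the "request" open until an event arrives or the timeout expires.
        try:
            return self._events.get(timeout=timeout)
        except queue.Empty:
            return None  # Empty response: the client simply polls again.


def long_poll_client(server, expected, timeout=1.0):
    """Issue a new poll after every response until `expected` events arrive."""
    received = []
    while len(received) < expected:
        event = server.handle_poll(timeout)
        if event is not None:
            received.append(event)
    return received


def demo_long_poll():
    server = LongPollServer()
    # Publish events from another thread while the client is waiting.
    producer = threading.Thread(
        target=lambda: [server.publish(f"event-{i}") for i in range(3)]
    )
    producer.start()
    events = long_poll_client(server, expected=3)
    producer.join()
    return events
```

Each call to `handle_poll` plays the role of one pending request; the client loop re-issues it immediately after every response, so there is always one request waiting.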
4. HTTP STREAMING
With HTTP streaming, the server keeps the response
message open. In contrast to long polling, the HTTP
response message (body) is not closed after sending an
event to the client.
If an event occurs on the server-side, the server will write
this event to the open response message body.
The HTTP response message body represents a
unidirectional event stream to the client.
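One common way to keep a response body open is HTTP/1.1 chunked transfer encoding. The sketch below assumes that mechanism; the function names `encode_chunk` and `stream_events` and the `text/plain` content type are illustrative choices, not something the slides prescribe. It yields the response head once and then one chunk per event.

```python
def encode_chunk(data: bytes) -> bytes:
    # An HTTP/1.1 chunk is: payload length in hex, CRLF, payload, CRLF.
    return f"{len(data):X}".encode("ascii") + b"\r\n" + data + b"\r\n"


def stream_events(events):
    """Yield the bytes of a single streaming response carrying many events."""
    # The response head is sent once; the body stays open afterwards.
    yield (
        b"HTTP/1.1 200 OK\r\n"
        b"Content-Type: text/plain\r\n"
        b"Transfer-Encoding: chunked\r\n"
        b"\r\n"
    )
    for event in events:
        # Each event is written into the same open response body.
        yield encode_chunk(event.encode("utf-8") + b"\n")
    # A live stream would keep going; a zero-length chunk ends this demo.
    yield b"0\r\n\r\n"
```

In a real server the loop would block waiting for new events rather than iterate over a finished list.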
5. FOREVER-FRAMES
The forever-frame technique uses HTTP 1.1 chunked
encoding to establish a single, long-lived HTTP connection
in a hidden iframe. Data is pushed incrementally from the
server to the client over this connection, and rendered
incrementally by your web browser.
As events occur, the iframe is gradually filled with script
tags, containing JavaScript to be executed in the browser.
Because browsers render HTML pages incrementally, each
script tag is executed as it is received.
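As a rough sketch of what the server writes into the hidden iframe, each event can be wrapped in its own script tag. The callback name `parent.handleEvent` is an assumption made for this example; a real page would define whatever function it wants the iframe to call.

```python
import json


def forever_frame_script(event, callback="parent.handleEvent"):
    """Wrap one event in a script tag for a forever-frame response."""
    # JSON is valid JavaScript literal syntax for these simple values.
    # Escaping "</" keeps a payload from closing the script tag early.
    payload = json.dumps(event).replace("</", "<\\/")
    return f"<script>{callback}({payload});</script>"
```

The server would append one such tag to the open iframe document for every event, and the browser would execute it as soon as it arrives.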
6. COMET AND PROXY SERVERS
Long polling - provided it is implemented in a robust way,
it will not suffer from too many proxy server issues, because
it still uses the plain HTTP request and response model.
Streaming - is more efficient, but this approach is more
exposed to proxy server issues. For example, a proxy
server may buffer the response and cause latency.
Alternatively, the proxy server may be configured to
disconnect HTTP connections that are kept open for longer
than a certain amount of time. This is why most legacy Comet
solutions simply use long polling.
7. COMET PROTOCOLS
BOSH - http://xmpp.org/extensions/xep-0124.html,
bidirectional communication channel, can use HTTP
streaming as well as long polling
Bayeux -
http://svn.cometd.org/trunk/bayeux/bayeux.html,
bidirectional protocol, based on the long polling
approach
Both use "hacks" to break the HTTP request-response barrier.
This forces such protocols to implement complex session
and connection management.
9. SERVER-SENT EVENTS
HTML5 also applies the Comet communication pattern by
defining Server-Sent Events (SSE), standardizing Comet
for all standards-compliant web browsers. The SSE
specification "defines an API for opening an HTTP
connection for receiving push notifications from a server."
Server-Sent Events are based on HTTP streaming. The
response stays open and event data are written as they
occur on the server side.
Server-Sent Events include the new JavaScript interface
EventSource as well as a new MIME type, text/event-stream,
which defines an event framing format.
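To make the framing format concrete, here is a minimal server-side sketch that serializes one event as text/event-stream. The function name `format_sse` and its parameter names are illustrative; the field names (`event`, `id`, `retry`, `data`) and the blank-line terminator come from the event stream format.

```python
def format_sse(data, event=None, event_id=None, retry_ms=None):
    """Serialize one event in the text/event-stream framing format."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if retry_ms is not None:
        lines.append(f"retry: {retry_ms}")
    # Multi-line payloads become one "data:" field per line.
    for line in data.split("\n"):
        lines.append(f"data: {line}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"
```

The `id` field is what lets a reconnecting client tell the server which event it saw last.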
10. EXAMPLE CODE
The EventSource represents the client-side end point to
receive events. The client opens an event stream by creating
an EventSource, which takes an event source URL as its
constructor argument. The onmessage event handler will be
called each time new data is received.
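In the browser this is JavaScript (an EventSource object with an onmessage handler). To show what such a client does with the stream, the following Python sketch parses event-stream lines and invokes a callback per event. It is a simplified stand-in, not the browser API: the function name `dispatch_event_stream` and the callback signature are assumptions, and only the `data` and `id` fields are handled.

```python
def dispatch_event_stream(lines, on_message):
    """Parse text/event-stream lines and call on_message(data, last_event_id)."""
    data = []
    last_event_id = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event and triggers the handler.
            if data:
                on_message("\n".join(data), last_event_id)
            data = []
        elif line.startswith(":"):
            continue  # Comment line, often used as a keep-alive.
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # One leading space after the colon is dropped.
            if field == "data":
                data.append(value)
            elif field == "id":
                last_event_id = value
    return last_event_id
```

The returned last event ID is what a client would send back on reconnect so the server can resume the stream.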
11. WEBSOCKETS
WebSockets provide a bidirectional communication channel.
In contrast to Server-Sent Events, the WebSocket protocol
is not built on top of HTTP. However, the WebSocket protocol
defines an HTTP handshake that switches an existing
HTTP connection to a lower-level WebSocket connection.
The overhead involved in managing a WebSocket is
minimal. Because WebSockets are not implemented on
top of HTTP, they do not run into trouble caused by
HTTP protocol limitations.
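The server side of the HTTP handshake can be sketched as follows. The accept value is computed as RFC 6455 specifies (SHA-1 over the client key plus a fixed GUID, then Base64); the function names are illustrative, and a real server would also validate the request headers before answering.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key):
    """Compute the Sec-WebSocket-Accept value for a client's key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_response(sec_websocket_key):
    """Build the 101 response that switches the connection to WebSocket."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {websocket_accept(sec_websocket_key)}\r\n"
        "\r\n"
    )
```

After this response, both sides stop speaking HTTP on the connection and exchange WebSocket frames instead.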
12. On the other hand, WebSockets do almost nothing for
reliability. The protocol does not include reconnect handling
or support guaranteed message delivery, as Server-Sent
Events do.
Furthermore, as a non-HTTP-based protocol, WebSocket
cannot make use of the built-in reliability features of
HTTP. This means reliability has to be implemented in the
application (at the sub-protocol level) when using WebSockets.
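One possible shape for such application-level reliability is a sequence-numbered replay buffer: the sender keeps unacknowledged messages and resends everything the peer has not seen after a reconnect. This scheme and the class `ReplayBuffer` are hypothetical, chosen only to illustrate the idea; they are not part of the WebSocket protocol or of any particular sub-protocol.

```python
class ReplayBuffer:
    """Keeps sent messages until they are acknowledged (hypothetical scheme)."""

    def __init__(self):
        self._next_seq = 1
        self._sent = []  # (sequence number, message) pairs awaiting an ack

    def send(self, message):
        # Tag every outgoing message with a sequence number and remember it.
        seq = self._next_seq
        self._next_seq += 1
        self._sent.append((seq, message))
        return seq, message

    def ack(self, seq):
        # The peer confirmed everything up to and including `seq`.
        self._sent = [(s, m) for s, m in self._sent if s > seq]

    def replay_after(self, last_seen):
        # After a reconnect, resend whatever the peer has not seen yet.
        return [(s, m) for s, m in self._sent if s > last_seen]
```

On reconnect the client would report the last sequence number it received, and the server would call `replay_after` with it.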
14. WEB SOCKETS AND PROXY SERVERS
The problem with proxy servers for web applications that
have a long-lived connection (for example, Comet HTTP
streaming or HTML5 Web Sockets) is clear:
HTTP proxy servers, which were originally designed for
document transfer, may choose to close streaming or
idle WebSocket connections, because they appear to be
trying to connect with an unresponsive HTTP server.
Additionally, proxy servers may buffer unencrypted
HTTP responses, thereby introducing unpredictable
latency during HTTP response streaming.
17. In the case of the WebSocket upgrade request, a
transparent HTTP proxy may remove the Connection:
Upgrade header, which results in the WebSocket server
receiving a corrupt WebSocket upgrade request.
Today, most HTTP proxies are not familiar with the
WebSocket protocol.
Using secured WebSockets can avoid this effect. When
creating a secured WebSocket connection, the browser
opens an SSL/TLS connection to the WebSocket server. In
this case intermediaries are not able to interpret or modify
the data.
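The effect of a stripped header can be illustrated with a small server-side check of the upgrade request. This is a simplified sketch: the function name `is_websocket_upgrade` is made up for the example, and a complete server would check further headers (such as the protocol version) as well.

```python
def is_websocket_upgrade(headers):
    """Return True if the request headers look like a WebSocket upgrade."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    connection_tokens = [
        token.strip().lower()
        for token in normalized.get("connection", "").split(",")
    ]
    return (
        normalized.get("upgrade", "").lower() == "websocket"
        and "upgrade" in connection_tokens
        and "sec-websocket-key" in normalized
    )
```

If a proxy drops the Connection header on the way, the same request no longer passes this check and the server cannot complete the handshake.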
19. If a browser is configured to use an explicit proxy server
(for both encrypted and unencrypted WebSocket
connections) then it will first issue an HTTP CONNECT
method to that proxy server while establishing the
WebSocket connection.
If an unencrypted WebSocket connection (ws://) is used,
then in the case of transparent proxy servers, the browser
is unaware of the proxy server, so no HTTP CONNECT is
sent, and the proxy may remove the upgrade headers. As a
result, the connection is very likely to fail in practice today.
20. If an encrypted WebSocket Secure connection (wss://) is
used, then in the case of transparent proxy servers, the
browser is unaware of the proxy server, so no HTTP
CONNECT is sent. However, since the wire traffic is
encrypted, intermediate transparent proxy servers may
simply allow the encrypted traffic through, so there is a
much better chance that the WebSocket connection will
succeed if an encrypted WebSocket connection is used.
21. HTML5 WEB SOCKETS AND LOAD-BALANCING ROUTERS
TCP (Layer-4) load-balancing routers should work well
with HTML5 Web Sockets, because they have the same
connection profile: connect once up front and stay
connected, rather than the HTTP document transfer
request-response profile.
HTTP (Layer-7) load-balancing routers expect HTTP traffic
and can easily get confused by WebSocket upgrade traffic.
For that reason, Layer-7 load-balancing routers may need to
be configured to be explicitly aware of WebSocket traffic.
22. HTML5 WEB SOCKETS AND FIREWALLS
Since firewalls normally just enforce the rules for inbound
traffic rejection and outbound traffic routing (for example,
through the proxy server), there are usually no specific
WebSocket traffic-related firewall concerns.
23. SSE VS WEB SOCKETS
SSE | Web Sockets
Unidirectional server-to-client channel only | Full-duplex, bidirectional communication (not just server push)
Runs on top of HTTP (uses HTTP streaming) | Highly efficient: minimal overhead involved in managing a WebSocket; runs on top of TCP, so no HTTP hacks
Includes powerful features to reconnect and synchronize messages | Does not include reconnect handling or guaranteed message delivery
High reliability is a built-in feature | No built-in reliability; this has to be done at the application (sub-protocol) level
Not supported by IE | Supported by IE