Data Push Apps with HTML5 SSE (2014)

Chapter 6. Fallbacks: Data Push for Everyone Else

What we are going to look at in this chapter is a fallback approach, called long-polling, that (with a few tweaks) works just about everywhere. If the data being pushed is relatively infrequent, its inefficiency won’t even be noticed and you could get away with using it everywhere, but generally we will just use it for browsers where there is no native SSE support.

In both this and the next chapter I will start by showing the code in a minimal example. Then, after that, the FX demo from the end of Chapter 5 will be adapted to work with long poll. By the end of this chapter we will have 99% coverage (albeit with varying levels of inefficiency) for our production-quality, realistic, data push application.

Browser Wars

The differences between browsers (aka “The Browser Wars”) have been annoying us since the mid-1990s, but they became especially troublesome when Microsoft threw its hat into the ring and we went through a phase where each browser manufacturer tried to make the Web better, unilaterally, attempting to differentiate its product with features. That wasn’t what either the end users (you and me) or the content developers (again, you and me) wanted. Standards got discussed and ignored; it is only in the past few years that all the browser manufacturers have started taking standards seriously. The browser manufacturers finally realized they should differentiate themselves on user interface and speed, not proprietary features.

But we still have their mess to deal with. And when it comes to using the latest HTML5 technologies, the mess is still being created and it is going to be around for the next 3–4 years, at least. When it comes to SSE I am going to shine the spotlight of shame first on Google, then Microsoft. The built-in Android browser only started supporting SSE as of Android 4.4 (though many earlier Android devices use Chrome, which does support SSE); the XHR fallback (described in the next chapter) works since Android 3, but fails on Android 2.x (which still has a large number of users at the time of writing; see http://bit.ly/wiki-android-versions).

Over in the blue corner, Microsoft has made Internet Explorer (IE) more standards-compliant with each release, so much so that IE9 and later get grudging approval from most web developers. But, as of IE11, SSE is not supported, and next chapter’s XHR fallback does not work on IE9 and earlier. There is another fallback, iframe, also covered in Chapter 7, that only works on IE8 and above.

The long-poll approach shown in this chapter is not quite as efficient as native SSE (or the fallbacks we will study in Chapter 7), but for Android 2.x and IE6/7, it is the only choice. Actually, the inefficiency won’t be noticed in most applications. But if you are sending multiple updates per second, it is possible that the extra resource usage (client-side CPU, server-side CPU, and network bandwidth) will become noticeable.

NOTE

That is not to say you cannot use long-poll at subsecond frequencies. I’ve tested it at 10 updates/second, with no degradation. At 100 updates/second, and passing an ID to specify the last received data (see Sending Last-Event-ID), it keeps up—you get all the data, but it comes through in lumps: you won’t get 100 distinct messages each second.

What Is Polling?

Before I tell you what long-polling is, we should first talk about regular polling, which is shown graphically in Figure 6-1. Regular polling is where you knock on the door of your best friend and ask “Are you ready to play?” and she immediately says “yes” or “no.” If she says “yes,” you go out and have fun. If she says “no,” she shuts the door in your face, and 30 seconds later you knock on the door and ask her (poll her) again. Eventually she’ll be ready. Or she was only pretending to be your best friend. Or you had garlic for breakfast.


Figure 6-1. Polling your friend

The really odd thing that I want you to understand about regular polling is that once your friend is finally ready, she will just sit there, staring at the wall like a zombie, waiting for you to come and knock on the door again.

In the context of our FX demo, regular polling means that at a fixed rate, say every 10 seconds, we do an Ajax HTTP request to ask for data. When we do regular polling we have to decide if we want to do sampling or if we want to receive everything. If we are sampling, the server will send the latest price for all symbols. Just one price for each symbol. The client will get price snapshots, but not the prices between each of the polling requests. The alternative to sampling is that each time we poll the server we also send the timestamp of the last update we received, and request all updates since that time. The backend server might reply with a blank array to say no updates, or it might reply with a huge array if there have been lots of updates. Compared to sampling, you are able to maintain a full history on the client side.
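The two styles of regular polling can be sketched as two lookups over the same update history on the server. This is only an illustration: the priceHistory array, its field names, and both function names are assumptions made up for this example, not part of the FX demo.

```javascript
// Hypothetical update history, oldest first. Field names are assumptions.
var priceHistory = [
  {t: 1000, symbol: "EUR/USD", bid: 1.3001},
  {t: 2000, symbol: "USD/JPY", bid: 101.50},
  {t: 3000, symbol: "EUR/USD", bid: 1.3004},
  {t: 4000, symbol: "EUR/USD", bid: 1.3002}
];

// Sampling: reply with just the latest price for each symbol.
function latestPerSymbol(rows){
  var latest = {};
  for(var i = 0; i < rows.length; i++){
    latest[rows[i].symbol] = rows[i];  // later rows overwrite earlier ones
  }
  return latest;
}

// Receive everything: reply with all rows newer than the client's timestamp.
// An empty array means "no updates".
function updatesSince(rows, lastTimestamp){
  var out = [];
  for(var i = 0; i < rows.length; i++){
    if(rows[i].t > lastTimestamp)out.push(rows[i]);
  }
  return out;
}
```

A client that last saw timestamp 2000 gets both later EUR/USD rows back from updatesSince(priceHistory, 2000), whereas latestPerSymbol(priceHistory) only ever returns one row per symbol.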

How Does Long-Polling Work?

So, how is long-polling different from regular polling? Going back to our best friend, we go and knock on her door, and say, “Are you ready to play?” and she replies, “No, but let’s leave the door open and as soon as I’m ready I will come and tell you.” See Figure 6-2 and notice how you only knock on the door once, how the door stays open, and how each and every visit gets data.


Figure 6-2. Long-polling your friend

In the context of our FX demo, we do the Ajax HTTP request, but we don’t ask for the latest price. Instead we ask to be told about the next price change. If things have gone quiet, it just sits there, holding a socket open. Then as soon as the price changes, it sends it, and shuts the socket. We then immediately make another long-poll connection, for the next price change.

The key difference between SSE and long-poll is that long-poll needs a new HTTP connection for each piece of data. We still have the downside that SSE/WebSocket had, because we are using up a dedicated socket practically all the time. Latency-wise, it is almost as good as SSE: when a new price arrives we get to hear about it immediately. It’s almost but not quite as good, because making the new HTTP connection each time takes a few milliseconds. And once you turn up the rate of messages to 10 or more per second, that “just a few milliseconds” for each new connection starts to take up the majority of the time. (On high-latency networks such as mobile, that few milliseconds might actually be hundreds or even thousands of milliseconds, so it’s slow right from the start.)

IS LONG-POLLING ALWAYS BETTER THAN REGULAR POLLING?

Which is best depends on your definition of better. From the point of view of latency, long-polling is always better: as soon as the server has a new value, you get it. In regular polling you have to wait until the next time you poll to discover it. (This latency advantage also applies to SSE and the other fallbacks we discuss in the next chapter.)

But what about from the point of view of overall bandwidth usage? The answer to that is not so clear-cut. If your FX prices update twice a second, long-polling has to make 120 HTTP requests each minute. If you were instead doing regular polling once every 10 seconds, you would only make six HTTP requests each minute. So regular polling is better. But, conversely, if your FX prices update twice a minute, long-polling only makes two HTTP requests each minute, whereas regular polling every 10 seconds still makes six. And its latency is still worse!
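That comparison is simple enough to put in a function. This is a rough sketch that ignores any extra requests caused by timeouts or errors; the function name and the shape of the return value are my own invention for this example.

```javascript
// Compare HTTP requests per minute for long-poll and regular polling.
function requestsPerMinute(updatesPerMinute, pollIntervalSecs){
  var longPoll = updatesPerMinute;          // one request per update delivered
  var regularPoll = 60 / pollIntervalSecs;  // one request per polling interval
  return {
    longPoll: longPoll,
    regularPoll: regularPoll,
    fewer: longPoll < regularPoll ? "long-poll" :
           (regularPoll < longPoll ? "regular poll" : "same")
  };
}
```

Calling requestsPerMinute(120, 10) gives the first case in the text, and requestsPerMinute(2, 10) gives the second.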

Knowledge of your data, and exactly when it will update, can also be used with long-poll or native SSE: disconnect when not expecting any data. This gives you the same latency (assuming you reconnect in time), but saves on socket usage (and associated costs like an Apache process). This was the technique we showed in Adding Scheduled Shutdowns/Reconnects. (If you end up using it, pay close attention to why the reconnects were randomly jittered.)

Show Me Some Code!

That was a lot of words, so how about some code, for balance? First, the backend:

<?php

usleep(2500000); //2.5s

header("Content-Type: text/plain");

echo date("Y-m-d H:i:s")."\n";

Short and simple. Save this as minimal_longpoll.php and put it on your web server. When you call it, there is a 2.5 second wait and then it shows you the current timestamp. About the only thing I need to point out about the code is that we send the header after the sleep, not before. The sleep is simulating waiting for our next piece of data, and until we get that data we will not know what kind of data we are sending. For instance, we might end up needing to send back an error code instead, based on some external event, in which case the code would be something like this:

<?php

usleep(2500000); //2.5s

$cat = (rand(1,2) == 1) ? "dead" : "alive";

if($cat == "dead"){

    header("HTTP/1.0 404 Not Found");

    echo "Something bad happened. Sorry.";

    }else{

    header("Content-Type: text/plain");

    echo date("Y-m-d H:i:s")."\n";

    }

Now for the frontend. Put minimal_longpoll_test.html in the same directory as minimal_longpoll.php, and try loading it in a browser. You will see “Preparing!” flash on screen for a moment, then the JavaScript runs and it gets replaced by “Started!” Then a moment later it gets replaced with .[1]. This tells you a connection has been made (readyState==1). Two and a half seconds later it will typically show .[1].[2].[3].[4] followed by the timestamp, then on the next line another .[1] (meaning another long-poll connection has been made). What you see might vary from browser to browser, depending on exactly how Ajax has been implemented; it is only the [1] (Ajax request started) and the [4] (Ajax request completed) that are important. See Ajax readyState to learn what the 1, 2, 3, and 4 mean.

<!DOCTYPE html>

<html>

  <head>

    <noscript>

      <meta http-equiv="refresh"

                 content="0;URL=longpoll.nojs.php">

    </noscript>

    <meta charset="utf-8" />

    <title>Minimal long-poll test</title>

  </head>

  <body>

    <p id="x">Preparing!</p>

    <script>

    function onreadystatechange(){

        s += ".[" + this.readyState + "]";

        document.getElementById('x').innerHTML = s;

        if(this.readyState != 4)return;

        s += this.responseText + "<br/>\n";

        document.getElementById('x').innerHTML = s;

        setTimeout(start, 50);

        }

    function start(){

        var xhr;

        if(window.XMLHttpRequest){

           xhr = new XMLHttpRequest();

          }

        else{

          xhr = new ActiveXObject("Msxml2.XMLHTTP");

          }

        xhr.onreadystatechange = onreadystatechange;

        xhr.open('GET', 'minimal_longpoll.php?t=' +

           (new Date().getTime()));

        xhr.send(null);

        }

    var s = "";

    setTimeout(start,100);

    document.getElementById('x').innerHTML = "Started!";

    </script>

  </body>

</html>

Start your study of the source code by looking at the start() function. This initiates a long-poll request. First we create our XMLHttpRequest object, unless the browser does not provide one (as on older versions of Internet Explorer), in which case we create an Msxml2.XMLHTTP ActiveX object. The two objects have the same functions and behavior, so all other code is the same. The next line, xhr.onreadystatechange = onreadystatechange;, assigns the callback function we want all data to be sent to. As an aside, we could have used jQuery to hide this Ajax complexity. But there isn’t that much complexity in the end, just two or three extra lines.

Then we do xhr.open to say which page to get data from, and xhr.send() to actually start everything going. (The explicit null parameter to send() is needed on some browsers.)

At the beginning of this chapter, I mentioned that a few tweaks were needed to get long-poll working in all browsers. The first of those is that some browsers (e.g., Android) will cache the Ajax request. To avoid this, we append something to the URL. A simple approach is to use the current timestamp, expressed as milliseconds since 1970.
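If you want the cache-busting in one place, a small helper along these lines would do. The parameter name t matches the listing above; the helper name and the optional now argument (there only to make it testable) are assumptions for this sketch.

```javascript
// Append a cache-busting timestamp to a URL.
// `now` defaults to the current time in milliseconds since 1970.
function addCacheBuster(url, now){
  if(now === undefined)now = new Date().getTime();
  // Use "&" if the URL already has a query string, "?" otherwise.
  var sep = (url.indexOf("?") == -1) ? "?" : "&";
  return url + sep + "t=" + now;
}
```

The value itself is never read by the server; it only has to be different on each request.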

With IE6/7 there is another thing we need to be careful of: we must use a fresh XHR object for each request. If, instead, we create the XHR object once, then just call send() again each time we want to start a long-poll request, it works in all browsers except Internet Explorer 7 and earlier. But by creating a fresh object each time, it works everywhere. We do it that way for all browsers; it is not really any extra trouble.

Another tweak is the very first call to start(). Instead of calling it directly, we use setTimeout to add a 100ms delay. This is needed by some versions of Safari, at least. Without it, you see a permanent loading spinner. There has to be enough time for the rest of the page to be parsed and made ready. (It is not needed by Android, in my testing, so if Android is the only one of your supported browsers using long-poll, you could try removing the 100ms delay.)

The next function I would like you to look at is onreadystatechange (“on-ready-state-change”). This is a callback function that is called as it progresses through the request; see the following sidebar. All we are interested in here is when readyState becomes 4, because that means we’ve received some new data. It also means the remote server has closed the connection.

AJAX READYSTATE

An XMLHttpRequest object (and also Internet Explorer’s Msxml2.XMLHTTP ActiveX object) can be in a number of different states. You don’t normally need to care, and if you have only ever made your Ajax connections using jQuery, you won’t even have met them. The state is a number from 0 to 4, with the following meanings:

0: Request has not started yet.

1: A connection to the server has been made.

2: The request (and any post data) has been sent to the server.

3: Getting data.

4: Got all data and connection has closed.

For long-poll (and short-poll, and normal Ajax usage), we ignore everything we get until readyState becomes 4. Our onreadystatechange callback is called exactly once with readyState 4. In the next chapter we will look at a technique where we do care about readyState 3. The callback might be called more than once with readyState 3; different browsers treat it differently, and some make the data loaded so far available, while others do not. Browsers also differ in how they treat readyState 0, 1, and 2, so you cannot always rely on them being given to you.
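For reference, here are the names the XMLHttpRequest specification gives to the five states, plus a tiny helper that builds the same kind of trace our minimal example prints. The helper and the array are hypothetical, written just for this sidebar.

```javascript
// Names the XMLHttpRequest specification gives to readyState values 0 to 4.
var READY_STATE_NAMES = ["UNSENT", "OPENED", "HEADERS_RECEIVED", "LOADING", "DONE"];

// Build the kind of trace the minimal example shows, e.g. ".[1].[2].[3].[4]",
// and report whether the request has completed.
function traceReadyStates(states){
  var trace = "", done = false;
  for(var i = 0; i < states.length; i++){
    trace += ".[" + states[i] + "]";
    if(states[i] == 4)done = true;   // only state 4 means all data has arrived
  }
  return {trace: trace, done: done};
}
```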

So, we output a period and the readyState each time the function is called, but if readyState is not yet 4, then that is all we do. Once readyState has become 4, we output the message the server has sent us (found in responseText), and then we initiate the next long-poll request by calling start().

There is a 50ms delay on calling start(), again done with setTimeout() because otherwise some browsers get confused and eventually complain about stack overflows and such. Long-poll is our fallback for the dumbest browsers, so don’t sweat having to introduce a bit of extra latency. (Again, Android does not appear to need the 50ms delay in my testing.)

Optimizing Long-Poll

I mentioned earlier that long-poll is fine most of the time, but starts to become quite inefficient when things heat up. If we are sending a new update twice every second, that is up to 120 new HTTP requests a minute that have to be made. When this happens, there are two things we can do to reduce the load a bit.

The first is easy: have the client go slower. In fact, our code already does this—we have a 50ms sleep before initiating the next long-poll request. If you increased that from 50 to 1000, then the absolute maximum number of long-poll requests we can make is 60 per minute. Allowing for some network overhead, you are looking at a maximum of 40 to 50 requests per minute. When data is less frequent, the extra delay causes no real problem: you get your next update after 16 seconds instead of 15 seconds. You can think of the length of that sleep as the continuum between the extremes of long-poll (zero latency, possibly lots of requests) and regular-poll (predictable latency, predictable request rate).
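The arithmetic behind those numbers, as a sketch. The roundTripMs parameter is my stand-in for "network overhead"; the 250ms used in the note below is purely an assumed figure, not a measurement.

```javascript
// Rough upper bound on long-poll requests per minute.
// delayMs: pause before starting the next request (50 in the earlier listing).
// roundTripMs: assumed time to connect and receive an already-waiting message.
function maxRequestsPerMinute(delayMs, roundTripMs){
  return Math.floor(60000 / (delayMs + roundTripMs));
}
```

With a 1000ms delay and no overhead the ceiling is 60 per minute; add an assumed 250ms of overhead per request and it drops to 48.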

The other approach is server side. We could buffer up data for the long-poll clients, sending their data no more than once/second. How would this work? First, make a note of the time they connect (for example, 18:30:00.000). Then, say, we have data available to send to clients at 18:30:00.150, but we decide not to flush the data yet, because it has been less than a second since they connected. So instead we hold on to it, and set a timeout of 850ms. But before that timer triggers (at, for example, 18:30:00.900), we get more data to send to clients. Still we wait—another 100ms. No new data arrives in those 100ms so now we flush it and close the connection. The client gets two data items together.

Alternatively, how about if the client connects at 18:30:00.000, but the first new data comes through at 18:30:01.100 (1.1 seconds after the request started)? In that case we send it immediately and close the connection. In other words, the artificial latency is only being introduced when multiple messages come through in the space of a single second, which effectively means we only slow things down when there are a lot of messages. This is just what we want.
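Here is one way the flush decision could be expressed, as a pure function you can play with before touching a real server. Everything in it is an assumption for illustration: the function name, the use of millisecond offsets instead of clock times, and the idea that the message arrival times are known up front.

```javascript
// Decide when a buffering long-poll server would flush and disconnect.
// connectMs: when the client connected.
// arrivalsMs: sorted arrival times of the messages for that client.
// minMs: minimum time to hold the connection (1000 in the text).
// Returns null if no message has arrived.
function planFlush(connectMs, arrivalsMs, minMs){
  if(arrivalsMs.length == 0)return null;
  var earliest = connectMs + minMs;
  // Data arriving after the minimum time is sent immediately;
  // data arriving before it is held until the minimum time is reached.
  var flushAt = Math.max(arrivalsMs[0], earliest);
  var count = 0;
  for(var i = 0; i < arrivalsMs.length; i++){
    if(arrivalsMs[i] <= flushAt)count++;   // everything buffered by then goes out together
  }
  return {flushAt: flushAt, count: count};
}
```

Taking 18:30:00.000 as time zero, planFlush(0, [150, 900], 1000) describes the first scenario (two items flushed together at the one-second mark) and planFlush(0, [1100], 1000) describes the second (one item, sent as soon as it arrives).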

I suggest that if you do this, you make the minimum time configurable, so that you can easily experiment with values between 500 and 2000 milliseconds.

What If JavaScript Is Disabled?

If JavaScript is disabled, then nothing described in this chapter works. When the user runs our minimal example, they will see “Preparing!” on screen for the rest of their natural lives. And it is nothing more than they deserve. Nothing described in any of the other chapters works either.

What’s that? You sympathize with them? Bah, humbug. But, yes, there is a way to send updates to these 20th-century-ers. We’re going to just modify the minimal_longpoll_test.html file, not the fuller FX price demo. First, add this immediately after the <head> tag:

<noscript>

  <meta http-equiv="refresh" content="0;URL=longpoll.nojs.php">

</noscript>

Because it is between the <noscript> tags, it does nothing for almost everyone. But, what it does is send our JavaScript-disabled users to another page. That other page is PHP, not HTML. That PHP script has to generate a full HTML page, not just send the data, as it can when called over Ajax. The code is quite straightforward:

<!DOCTYPE html>

<html>

  <head>

    <script>window.location.href="minimal_longpoll_test.html"</script>

    <meta http-equiv="refresh" content="3">

    <meta charset="utf-8" />

    <title>Update test when JS disabled</title>

  </head>

  <body>

    <p><?= date("Y-m-d H:i:s"); ?></p>

    <p>(Enable JavaScript for better responsiveness.)</p>

  </body>

</html>

The key line is <meta http-equiv="refresh" content="3">, which says “reload this page after 3 seconds.” After 3 seconds an HTTP request is made, and the PHP script will run again and create a new web page, with a new timestamp in it.

I’d also like to point out the <script> line at the top of that file. This is a clever little trick: if the users had just temporarily disabled JavaScript, then as soon as they enable JavaScript, it will be detected on the very next page refresh and will take them back to your full-service live-updating site, where they will be welcomed to the 21st century with open arms.

Grafting Long-Poll onto Our FX Application

At the end of Chapter 5, we finished with a fairly robust demo application. It generates random FX data, with multiple symbols (multiplexing) and multifield data. It can maintain a history of all data received, and do interesting things with that history, such as charts and tables. It reconnects when things go wrong, and keeps track of the point up to which it’s seen data, as well as scheduled shutdowns and reconnects.

Luckily, to graft long-poll on to that existing application is not too much work. Did I just say luckily? In fact, the ease with which we can graft on an alternative delivery method is a direct result of all the little design decisions that were made in the previous few chapters.

Connecting

The FX application currently has an SSE object with a private variable called es and a function called startEventSource(). Our first task is to create the equivalent for long-poll.[25] Here are the new private variables added to the SSE object:

var xhr = null;

var longPollTimer = null;

As you can see, there is also a variable to store the timer handle (this is only used by disconnect()). And here are the functions we need to add:

function startLongPoll(){

if(window.XMLHttpRequest)xhr = new XMLHttpRequest();

else xhr = new ActiveXObject("Msxml2.XMLHTTP");

xhr.onreadystatechange = longPollOnReadyStateChange;

var u = url;

u += "longpoll=1&t=" + (new Date().getTime());

xhr.open("GET", u);

if(last_id)xhr.setRequestHeader("Last-Event-ID", last_id)

xhr.send(null);

}

function longPollOnReadyStateChange(){

if(this.readyState != 4)return;

longPollTimer = setTimeout(startLongPoll, 50);

processNonSSE(this.responseText);

}

The startLongPoll() function and its onreadystatechange callback are basically the same functions we saw earlier in this chapter, but with a few small differences:

§  Use the url global, instead of hardcoding the URL to connect to.

§  Pass the Last-Event-ID header, when last_id is set. See Sending Last-Event-ID. Unlike with EventSource, it is possible to send HTTP headers with XMLHttpRequest (and with Internet Explorer’s ActiveXObject too), and so we do.

§  The processing is handed to processNonSSE(), which will be written shortly.

§  longpoll=1 is added to the URL. This is so the backend knows to disconnect after sending data. (Remember, with long-poll the data does not get seen by the browser until the connection is closed.) By using this, we can have a single backend servicing the various frontend fallbacks.

§  The timer handle is recorded, so the timer can be cancelled by other code.

One more small addition is needed. In temporarilyDisconnect() there are a couple of tidy-up tasks:

if(keepaliveTimer != null)clearTimeout(keepaliveTimer);

if(es)es.close();

We could just add if(xhr)xhr.abort();, but there will be more to do in the next chapter, so let’s move all three commands to a disconnect() function, and call that from temporarilyDisconnect(). So the two functions look like this:

function disconnect(){

if(keepaliveTimer){

  clearTimeout(keepaliveTimer);

  keepaliveTimer = null;

  }

if(es){

  es.close();

  es = null;

  }

if(xhr){

  xhr.abort();

  xhr = null;

  }

if(longPollTimer){

  clearTimeout(longPollTimer);

  longPollTimer = null;

  }

}

function temporarilyDisconnect(secs){

var millisecs = secs * 1000;

millisecs -= Math.random() * 60000;

if(millisecs < 0)return;

disconnect();

setTimeout(connect,millisecs);

}

Long-Poll and Keep-Alive

If you remember back to the section Client Side, you know our keep-alive system is set up to call connect() if we don’t get any activity on the connection after 20 seconds. This causes a problem for long-poll because there is no way for long-poll to send keep-alives: it sends one message and disconnects. Well, of course, the server will happily send the keep-alives, but our client won’t receive them.

NOTE

In those browsers where onreadystatechange gets called for readyState==3 messages, we can get those keep-alives. But, if we can do that, then we would be using the XHR technique described in the next chapter, not bothering with the current long-poll technique.

See the longpoll_keepalive.php and longpoll_keepalive.html files in the book’s source code if you want to play around with this. It sends keep-alives every 2 seconds, then sends the real data after 10 seconds and exits. See what you get, and when, in each browser. In Android 2.3 (the main need for long-poll, if you support mobile users), you will see the callback is called immediately for readyState==1, but then there is nothing for 10 seconds and states 2, 3, and 4 all come through together at the end.

So, if our long-poll does not send anything within 20 seconds, what happens? Something not good. startLongPoll gets called again, so now we have two sockets open to the server. If the server doesn’t send anything for hours we will have hundreds of sockets open. Really? Hundreds? Kind of. Remember that if the server is sending keep-alives, the sockets will all be active, and so won’t be getting killed off. But not hundreds, because browsers have a limit on the number of simultaneous connections, typically six. In a sense this is worse: after a short time there will be six long-poll connections open, new requests will quietly get put in a queue, and all other communication with that server (e.g., for new images) will also be put on hold.

By adding the following two highlighted lines, we can avoid this Armageddon scenario:

function startLongPoll(){

if(xhr)xhr.abort();

if(window.XMLHttpRequest)xhr = new XMLHttpRequest();

else xhr = new ActiveXObject("Msxml2.XMLHTTP");

xhr.onreadystatechange = longPollOnReadyStateChange;

var u = url;

u += "longpoll=1&t=" + (new Date().getTime());

xhr.open("GET", u);

if(last_id)xhr.setRequestHeader("Last-Event-ID", last_id);

xhr.send(null);

}

function longPollOnReadyStateChange(){

if(this.readyState != 4)return;

xhr = null;

longPollTimer = setTimeout(startLongPoll, 50);

processNonSSE(this.responseText);

}

When the onreadystatechange callback is called successfully, with data, xhr is set to null; this partners with the first line in startLongPoll(), which calls abort() if xhr has not been set to null. In normal operation, xhr will always be null when startLongPoll() is entered. It is only when it is called by a keep-alive timeout that xhr will not be null and instead will represent the previous connection. In other words, if the long-poll request does not reply within 20 seconds, abort it and make a fresh call.
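To convince yourself that this pairing really does keep us down to one open connection, here is a simulation you can run without a browser. FakeXhr is a stand-in I made up that only records its state; startRequest() and requestCompleted() are simplified, hypothetical versions of startLongPoll() and the readyState==4 branch of the callback.

```javascript
// FakeXhr stands in for XMLHttpRequest; it only tracks open/aborted/done.
function FakeXhr(){ this.state = "open"; }
FakeXhr.prototype.abort = function(){ this.state = "aborted"; };

var current = null;   // plays the role of the xhr variable
var opened = [];      // every request ever started, kept for inspection

function startRequest(){
  if(current)current.abort();   // previous request never completed: drop it
  current = new FakeXhr();
  opened.push(current);
}

function requestCompleted(){
  current.state = "done";
  current = null;               // nothing left to abort next time
}

function countOpen(){
  var n = 0;
  for(var i = 0; i < opened.length; i++){
    if(opened[i].state == "open")n++;
  }
  return n;
}
```

Call startRequest() three times in a row (as two keep-alive timeouts would) and countOpen() still reports a single open request; the earlier two have been aborted.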

Happy? I’m not. Long-poll has become not-very-long-poll. Every 20 seconds we make a new connection. They were expensive enough as it was. OK. So how about we never use keep-alive when using long-poll? To understand if that is good or not, think about the reasons we have keep-alive in the first place:

§  To stop some intermediate server or router closing our socket.

§  To keep retrying if the initial request failed.

§  To detect when the backend has gone wrong in such a way that the socket is being kept open. (This also covers the case of intermittent browser bugs.)

The first point is moot if we are going to shut down the socket ourselves every 20 seconds. But the second and third points are good and noble reasons, and I wouldn’t want to be without them in a production system. The third point is the troublesome one: there is simply no way to tell the difference between a server that has no message to send yet and a server that has gone into an infinite loop and is never going to reply. My suggestion is that you set a much higher number for keep-alive timeouts when using long-poll, because you don’t ever really expect that crash, do you? You can do this by simply adding this line:

function startLongPoll(){

keepaliveSecs = 300;

if(xhr)xhr.abort();

...

For the second point (retrying if the initial request fails), see the next section.

Long-Poll and Connection Errors

Our previous long-poll code was rather optimistic: it assumed the URL was correct, and the server would always be happy to receive our request. What if the server is offline for any reason? Or if the URL is bad? When either of those happen, your longPollOnReadyStateChange callback will quickly be called with readyState==4. You identify them by looking at the status element of the xhr object. Typical codes you will see are listed in Table 6-1.

Table 6-1. Common XMLHttpRequest status codes

Status code    Meaning

0              Connection issue, such as bad domain name
200            Success
304            It came from cache
401            Authentication failed
404            Server exists, but bad URL
500            Server error

I hope you never see a 304, because that would defeat the whole point of streaming live data! A 401 is intercepted by the browser, which asks the user for their credentials, then sends the request again. You only receive a 401 in your code if the user clicks Cancel. Therefore, we treat everything except a 200 status code as an error. For all errors, we assume no valid data was sent, and we sleep 30 seconds[26] before trying to long-poll again. Only when the status code is 200 do we use the data and immediately long-poll again. With these changes, the onreadystatechange callback now looks like this:

function longPollOnReadyStateChange(){

if(this.readyState != 4)return;

xhr = null;

if(this.status == 200){

  longPollTimer = setTimeout(startLongPoll, 50);

  processNonSSE(this.responseText);

  }

else{

  console.log("Connection failure, status:"+this.status);

  disconnect();

  longPollTimer = setTimeout(startLongPoll, 30000);

  }

}

The call to disconnect() stops a couple of timers (longPollTimer and the keep-alive timer) to make sure nothing else will call startLongPoll() before those 30 seconds are up.
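The decision itself boils down to a tiny function. This sketch pulls it out of the callback so it can be tested on its own; the function name and return shape are assumptions, while the 50ms and 30-second values are the ones used in the listing above.

```javascript
// Decide what to do when a long-poll request completes.
// useData: whether the response body should be processed.
// retryMs: how long to wait before starting the next long-poll request.
function planAfterResponse(status){
  if(status == 200)return {useData: true, retryMs: 50};
  return {useData: false, retryMs: 30000};   // every other status is an error
}
```

Keeping the rule in one place like this also makes it easy to experiment with different delays for different status codes later.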

NOTE

If you really wanted to get clever, there are some status codes that have information in the payload. For instance, a 301 tells us a new URL we should try. A 305 tells us a proxy we should be using. If you are connecting to a third-party system, you may need to handle some of these; hopefully they will give you instructions on which ones. Watch out for 420 and 429, which tell you that you are making your connection attempts too frequently.

Server Side

Relative to the previous version of the server-side script (fx_server.id.php) we need a few changes. The first couple are specific to long-poll; at the very top of the script, see if "longpoll" has been requested by the client:

$GLOBALS["is_longpoll"] = array_key_exists("longpoll",$_POST)

  || array_key_exists("longpoll",$_GET);

$GLOBALS["is_sse"] = !($GLOBALS["is_longpoll"]);

This is a nice, compact expression to assign either true or false to $GLOBALS["is_longpoll"]. It only tests for the existence of longpoll in the input data (either in GET or POST data), not for its value. The second line says if it is not long-poll, then it must be SSE. The other part of this change is at the very end of the main loop:

  ..

  if($GLOBALS["is_longpoll"])break;

  }

Short and sweet. Just like a honey bee drone: one package delivered, and then it kills itself.

NOTE

I explicitly use the $GLOBALS[] array. This code and our main loop are both in the same scope (the global scope), so I could have assigned to the simpler $is_longpoll variable. But doing it this way means the code will still work if this code, or the main loop, gets refactored to its own function. It also documents the code: it screams “these are global variables” to the programmer who has to maintain this code in six months’ time.

The other changes we make will be used for all our fallbacks. Previously you may remember we had these helper functions:

function sendData($data){

echo "data:";

echo json_encode($data)."\n";

echo "\n";

@flush();@ob_flush();

}

function sendIdAndData($data){

$id = $data["rows"][0]["id"];

echo "id:".json_encode($id)."\n";

sendData($data);

}

When we use the fallbacks, the bits specific to SSE (the data: prefix, the extra blank line at the end, and the separate id: row) are not needed, and in fact they get in the way. So why not drop them?

function sendData($data){

if($GLOBALS["is_sse"])echo "data:";

echo json_encode($data)."\n";

if($GLOBALS["is_sse"])echo "\n";

@flush();@ob_flush();

}

function sendIdAndData($data){

if($GLOBALS["is_sse"]){

  $id = $data["rows"][0]["id"];

  echo "id:".json_encode($id)."\n";

  }

sendData($data);

}

This means sendIdAndData() is now identical to sendData() for long-poll. That is fine. You can find this version as fx_server.longpoll.php in the book’s source code. (If your server code sends retry:, you also need to do the same thing.)
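To see what actually goes down the wire in each mode, here is a rough JavaScript mirror of the two PHP helpers, written as a pure function that returns the text instead of echoing it. It is an illustration only: formatMessage() and the sample data are my own inventions, and JSON.stringify() is standing in for PHP's json_encode() (the two agree for simple data like this, but not in every edge case).

```javascript
// Mirrors the PHP sendIdAndData()/sendData() pair as a pure function,
// returning the text that would be echoed. Illustration only.
function formatMessage(data, isSse){
  var out = "";
  if(isSse){
    // SSE-only parts: the separate id: row and the data: prefix.
    out += "id:" + JSON.stringify(data.rows[0].id) + "\n";
    out += "data:";
  }
  out += JSON.stringify(data) + "\n";
  if(isSse)out += "\n";   //SSE-only: blank line to end the message
  return out;
}

// Hypothetical sample data; the real rows have different fields.
var sample = {rows:[{id:"7", bid:1.5}]};

// JSON.stringify() is used here just to make the line feeds visible.
console.log(JSON.stringify(formatMessage(sample, true)));
console.log(JSON.stringify(formatMessage(sample, false)));
```

In long-poll mode the client receives nothing but one line of JSON and a line feed, which is exactly what the frontend code wants to parse.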

NOTE

If you wanted to make a polyfill, you would not do this. Instead, on the client side you would strip off “data:” on lines that start with it, and you would look out for lines that start with anything else and ignore them.
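A minimal sketch of that polyfill-style alternative might look like the following. The function name extractDataLines() is hypothetical, and the sketch assumes LF line endings and exactly one data: line per message (our application protocol); a real polyfill would also have to join multi-line data and handle CR and CRLF line endings.

```javascript
// Sketch of the polyfill-style alternative: the server always sends SSE
// format and the client extracts the payload of each "data:" line.
function extractDataLines(msg){
  var lines = msg.split(/\n/);
  var payloads = [];
  for(var ix = 0; ix < lines.length; ix++){
    var s = lines[ix];
    if(s.indexOf("data:") != 0)continue;   //Ignore id:, retry:, comments, blanks
    s = s.substring(5);                    //Strip the "data:" prefix
    if(s.charAt(0) == " ")s = s.substring(1);   //SSE allows one optional space
    payloads.push(s);
  }
  return payloads;
}

var raw = 'id:"7"\ndata:{"a":1}\n\ndata: {"b":2}\n\n:comment\nretry:1000\n';
console.log(extractDataLines(raw));
```

Each returned string could then be handed to the same JSON-processing function that the native SSE callback uses.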

One final change. Replace this line:

header("Content-Type: text/event-stream");

with:[27]

if($GLOBALS["is_sse"])header("Content-Type: text/event-stream");

else header("Content-Type: text/plain");

Dealing with Data

Let’s go back to the frontend and the processNonSSE() function we introduced earlier. This function is used in the next chapter, too. It does a couple of jobs that are done by the browser for us when using SSE:

function processNonSSE(msg){

var lines = msg.split(/\n/);

for(var ix in lines){

  var s = lines[ix];

  if(s.length == 0)continue;

  if(s[0] != "{"){

    s = s.substring(s.indexOf("{"));

    if(s.length == 0)continue;

    }

  processOneLine(s);

  }

}

To see the first job more clearly, here is a cut-down version:

function processNonSSE(msg){

var lines = msg.split(/\n/);

for(var ix in lines){

  processOneLine(lines[ix]);

  }

}

The SSE protocol always gives our callback exactly one message at a time. With long-poll we might have been given multiple messages.[28] So the preceding code breaks up the lines and processes each separately. But this cut-down version is naive and dangerous.

Our application protocol is exactly one JSON object per message, which also implies one line (CR and LF have to be escaped in JSON). But do you remember the SSE protocol? It finishes each message with a blank line. So the next thing we do is look for blank lines (if(s.length == 0)) and throw them away (continue).

What about the if(s[0] != "{") block? This is a Dirty Data Defense. processOneLine() expects JSON, whole JSON, and nothing but JSON. If it gets anything else, it will throw an exception when it comes to parse it. In fact, it expects JSON representing an object, which means it must start with { and end with }. If there is any junk to the left of the opening curly bracket (if(s[0] != "{")), the s.substring(s.indexOf("{")) line strips it away. And if that leaves nothing, then skip it completely. (By the way, this particular Dirty Data Defense was added as part of the iframe support of the next chapter; I’ve not seen long-poll trigger it.)
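One case worth knowing about: when a line contains no { at all, indexOf() returns -1, and substring(-1) treats the negative index as 0 and hands back the whole line. So a line of pure junk would still reach the JSON parser. If that ever matters for your data, a stricter variant could be sketched like this. The name processNonSSEStrict() is my own, and the processOneLine() here is just a stand-in that collects the parsed objects so the result can be inspected.

```javascript
// Stand-in for the real processOneLine(); here it just collects the
// parsed objects so the behavior can be inspected.
var received = [];
function processOneLine(s){
  received.push(JSON.parse(s));
}

// Variant of processNonSSE() that also skips lines containing no "{".
function processNonSSEStrict(msg){
  var lines = msg.split(/\n/);
  for(var ix = 0; ix < lines.length; ix++){
    var s = lines[ix];
    var start = s.indexOf("{");
    if(start == -1)continue;   //Blank line, or junk with no JSON object in it
    if(start > 0)s = s.substring(start);   //Strip junk left of the "{"
    processOneLine(s);
  }
}

// Two good messages, one blank line, and one line of pure junk.
processNonSSEStrict('{"bid":1.5}\n\njunk{"bid":1.6}\nno json here\n');
console.log(received.length);
```

This also swaps the for...in loop for an index loop, which is a matter of taste here; it simply avoids picking up anything that another script may have added to the Array prototype.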

Wire It Up!

The last step is easy. Add the following highlighted line to connect(), then go and test it in a browser that does not support SSE:

function connect(){

gotActivity();

if(window.EventSource)startEventSource();

else startLongPoll();

}

How do we test long-poll (or any of our later fallbacks) on a browser that already supports SSE? Rather than mess around with commenting clauses out, I recommend slapping some temporary code in at the top of the connect() function, like this:

function connect(){

gotActivity();

if(true)startLongPoll();else  //TEMP

if(window.EventSource)startEventSource();

else startLongPoll();

}

I like to use those blank lines on each side, and the comment, to make it stand out and therefore hard to forget. (This temporary line to force long-poll is included in fx_client.longpoll.html; please experiment with removing it.)

IE8 and Earlier

The code we have up to this point works fine in just about every browser, including Android 2.x. Sigh, not IE8. The only issue in IE8 is that Object.keys is not available. (This is used in the makeHistoryTbody() function introduced in Adding a History Store.) To add support, use the following block of code; insert it in the <head> of the page:

<script>

Object.keys=Object.keys || function(o,k,r){

  r=[];

  for(k in o)if(o.hasOwnProperty(k))r.push(k);

  return r;

  }

</script>

If Object.keys is natively supported, it will use that: Object.keys=Object.keys. Otherwise the rest of this block of code assigns a simple function to Object.keys, which iterates through the properties of the given object and adds them to an array. The hasOwnProperty is to avoid including any keys that have been added to the Object prototype. Search online or refer to an advanced JavaScript book if you want to understand that more deeply.
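To watch the hasOwnProperty filter at work, here is a small sketch. It gives the fallback its own name, keysFallback(), so that it can be run even in a browser that has Object.keys natively, and it adds the extra key to a constructor's prototype rather than to the Object prototype itself, purely to keep the experiment from leaking into other code. The Quote object and its fields are hypothetical.

```javascript
// The fallback from the listing above, under its own name so it can be
// called even where Object.keys exists natively.
function keysFallback(o){
  var r = [];
  for(var k in o)if(o.hasOwnProperty(k))r.push(k);
  return r;
}

// Hypothetical data: two own keys and one inherited key.
function Quote(){ this.bid = 1.5; this.ask = 1.6; }
Quote.prototype.symbol = "EUR/USD";

var q = new Quote();
var own = keysFallback(q);   //Own keys only

var all = [];
for(var k in q)all.push(k);  //No filter: inherited keys are included too

console.log(own);
console.log(all);
```

The unfiltered loop picks up symbol from the prototype; the filtered one returns only bid and ask, which is the same answer the native Object.keys gives.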

IE7 and Earlier

Object.keys was missing in IE6, IE7, and IE8. But there is still one more thing missing in IE6 and IE7: JSON. The JSON object is built into modern browsers (including IE8 and later) and gives us the parse() and stringify() functions. Our code only needs JSON.parse(), so if you are seriously bandwidth-sensitive, you could strip down this solution. But this only affects IE6 and IE7 users, who by now must be so glad simply to find a website that still supports them that they won’t care about an extra file load, so I am going to use the readily available json2.js file.

NOTE

This file is in the book’s source code, or you can get it from https://github.com/douglascrockford/JSON-js.

Actually I am using a minified version, which reduces the file from 17,530 to 3,377 bytes.

Now, IE6 and IE7 represent maybe 1% of your users, so it is unreasonable to expect the other 99% to have to download a patch that they don’t need. (It does no harm; it is designed to only create the JSON object when one does not already exist, but it is a waste of bandwidth for both you and your users.) So I chose to use IE’s special version detection. This is an IE-only feature (which actually disappeared as of IE10), but is ideal for our purposes:

<!--[if lte IE 7]>

<script src="json2.min.js"></script>

<![endif]-->

IE7 and earlier will process that <script> command and load, then run, json2.min.js. IE8 and IE9 will process the command but not do anything. All other browsers will just see this as an HTML comment and ignore it completely.

Overall, for all the modern browsers, including IE9 and later, we waste 198 bytes on patching IE8 and earlier. IE6 and IE7 have a further 3,377 bytes to load.

The Long and Winding Poll

In this chapter we have looked at more primitive mechanisms that can be used as an alternative to SSE. Regular polling might sometimes be a better choice than SSE if you only need sampling (as opposed to full history), or if latency is not important (e.g., if you can wait up to 5 minutes for each batch of “latest” data; that way each client won’t be holding open a socket). Then we looked at long-poll. Its great advantage is that it works on any OS/browser where Ajax works, and that is just about everywhere nowadays. Its disadvantage is that for every message received, there is an extra HTTP request involved. The good news is that for some browsers, there are more efficient choices; this is the subject of the next chapter.


[25] The es and xhr variables are exclusive. In other words, either a browser will use es and xhr will always be null, or it will use xhr and es will always be null. So they could share the same variable name, perhaps called server. I have chosen not to, to emphasize that each holds a different type of JavaScript object. Another reason to use different names is for when closing them: es.close() but xhr.abort().

[26] This assumes keep-alive, when using long-poll, has been set to higher than 30 seconds. See Long-Poll and Keep-Alive. Otherwise the keep-alive will trigger first, which will then be interrupted by our 30-second timeout, which is not good.

[27] Did you think the Content-Type ought to be application/json instead of text/plain? The code we are writing here is for workarounds for browsers that are nowhere near the bleeding edge. Not the time for the semantic network soapbox. More seriously, the data we are sending is not strictly JSON. When we send 2+ data items together, it is two JSON strings separated by an LF. Each of those lines only turns into JSON inside the processOneLine() function.

[28] OK, fx_server.longpoll.php does not send multiple messages. But we could; look at the suggestions in Optimizing Long-Poll. And in the next chapter, multiple messages happen whether we want them to or not.