Data Push Apps with HTML5 SSE (2014)

Chapter 2. Super Simple Easy SSE

This chapter will introduce a simple frontend and backend that use SSE to stream real-time data to a browser client from a server. I won’t get into some of the exotic features of SSE (those are saved for Chapters 5, 8, and 9). I also won’t try to make it work on older browsers that do not support SSE (see Chapters 6 and 7 for that). But, even so, it will work on recent versions of most of the major browsers.

NOTE

Any recent version of Firefox, Chrome, Safari, iOS Safari, or Opera will work. It won’t work on IE11 and earlier. It also won’t work on the native browser in Android 4.3 and earlier. To test this example on your Android phone or tablet, install either Chrome for Android or Firefox for Android. Alternatively, wait for Chapter 6 where we will implement long-poll as a fallback solution. For the latest list of which browsers support SSE natively, see http://caniuse.com/eventsource.

If you want to go ahead and try it out, put basic_sse.html and basic_sse.php in the same directory,[9] a directory that is served by Apache (or whatever web server you use). It can be on localhost, or a remote server. If you’ve put it on localhost, in a directory called sse, then the URL you browse to will be http://localhost/sse/basic_sse.html. You should see a timestamp appearing once per second, and it will soon fill the screen.

Minimal Example: The Frontend

I will take this first example really slowly, in case you need an HTML5 or JavaScript refresher. First, let’s create a minimal file, just the scaffolding HTML/head/body tags. The very first line is the doctype for HTML5, which is much simpler than the doctypes you might have seen for HTML4. In the <head> tag I also specify the character set as UTF-8, not because I use any exotic Unicode in this example, but because some validation tools will complain if it is not specified:

<!doctype html>

<html>

  <head>

    <meta charset="UTF-8">

    <title>Basic SSE Example</title>

  </head>

  <body>

    <pre id="x">Initializing...</pre>

  </body>

</html>

You can also see I have a <pre> tag, with the id set to "x". I have used a <pre> tag rather than a <p> or <div> tag so that it can be filled with the received data (which contains line feeds) without any modification or formatting.

WARNING

Be aware of the potential for JavaScript injection when using server-side data with no checking.

Initially the <pre> block is hardcoded to say “Initializing….” We will replace that text with our data.

JQUERY VERSUS JAVASCRIPT

In case you’ve been using jQuery everywhere, the equivalent of $("#x") to get a reference to x in your HTML is document.getElementById("x"). To replace the text, we assign it to innerHTML. To append to the existing text, we use += instead of = like this:

//Equivalent of $("#x").html("New content\n");

document.getElementById("x").innerHTML = "New content\n"

//Equivalent of $("#x").append("Append me\n");

document.getElementById("x").innerHTML += "Append me\n"
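Because innerHTML parses whatever it is given as HTML, the injection warning above applies to both of these lines. One way to reduce the risk is to escape the text first. This is only a minimal sketch: escapeHtml is a hypothetical helper, not something from the book's source code, and it assumes the data is meant to be shown as plain text.

```javascript
// Hypothetical helper (not from the book): escape the characters that
// are significant in HTML so the text is displayed rather than parsed.
function escapeHtml(text){
  return String(text)
    .replace(/&/g, "&amp;")   //Must come first, or it re-escapes the others
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
  }

// In the browser it might be used like this:
//   document.getElementById("x").innerHTML += escapeHtml("Append me\n");
console.log(escapeHtml("<b>1 & 2</b>"));  // prints: &lt;b&gt;1 &amp; 2&lt;/b&gt;
```

If you never need markup in the output, assigning to textContent instead of innerHTML avoids HTML parsing altogether.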

Now let’s add a <script> block, at the bottom of the HTML body:

<!doctype html>

<html>

  <head>

    <meta charset="UTF-8">

    <title>Basic SSE Example</title>

  </head>

  <body>

    <pre id="x">Initializing...</pre>

    <script>

    var es=new EventSource("basic_sse.php");

    </script>

  </body>

</html>

We created an EventSource object that takes a single parameter: the URL to connect to. Here we connect to basic_sse.php. Congratulations, we now have a working SSE script. This one line is connecting to the backend server, and a steady stream of data is now being received by the browser. But if you run this example, you’d be forgiven for thinking, “Well, this is dull.”

To see the data that SSE is sending us we need to handle the “message” event. SSE works asynchronously, meaning our program does not sit there waiting for the server to tell it something, and meaning we do not need to poll to see if anything new has happened. Instead our JavaScript gets on with its life, interacting with the user, making silly animations, sending key presses to government organizations, and whatever else we use JavaScript for. Then when the server has something to say, a function we have specified will be called. This function is called an “event handler”; you might also hear it referred to as a “callback.” In JavaScript, objects generate events, and each object has its own set of events we might want to listen for. To assign an event handler in JavaScript, we do the following:

es.addEventListener('message',FUNCTION,false);

The es. at the start means we want to listen for an event related to the EventSource object we have just created. The first parameter is the name of the event, in this case 'message'. Then comes the function to process that event.[10]

The FUNCTION we use to process the event takes a single parameter, which by convention will be referred to simply as e, for event. That e is an object, and what we care about is e.data, which contains the new message the server has sent us. The function can be defined separately, and its name given as the second parameter. But it is more usual to use an anonymous function, to save littering our code with one-line functions (and having to think up suitable names for them). Putting all that together, we get this:

<!doctype html>

<html>

  <head>

    <meta charset="UTF-8">

    <title>Basic SSE Example</title>

  </head>

  <body>

    <pre id="x">Initializing...</pre>

    <script>

    var es = new EventSource("basic_sse.php");

    es.addEventListener("message", function(e){

      //Use e.data here

      },false);

    </script>

  </body>

</html>

Still it does nothing! So in the body of the event handler function, let’s have it append e.data to the <pre> tag. (We prefix a line feed so each message goes on its own line.) The final frontend code looks like this:

<!doctype html>

<html>

  <head>

    <meta charset="UTF-8">

    <title>Basic SSE Example</title>

  </head>

  <body>

    <pre id="x">Initializing...</pre>

    <script>

    var es = new EventSource("basic_sse.php");

    es.addEventListener("message", function(e){

      document.getElementById("x").innerHTML += "\n" + e.data;

      },false);

    </script>

  </body>

</html>

At last! We see one line that says “Initializing…,” then a new timestamp appears every second (see Figure 2-1).

Figure 2-1. basic_sse.html after running for a few seconds

We could be writing handlers for other EventSource events, but they are all optional, and I will introduce them later when we first need them.
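As mentioned earlier, the handler does not have to be anonymous. Here is a minimal sketch of the same script with the function defined separately and its name given as the second parameter. The name onMessage is my own choice for illustration, and the guard around EventSource is only there so the snippet does nothing outside a browser.

```javascript
// A named event handler; onMessage is an illustrative name.
// e is the event object, and e.data is the message from the server.
function onMessage(e){
  document.getElementById("x").innerHTML += "\n" + e.data;
  }

// Only connect when running in a browser page.
if(typeof EventSource !== "undefined" && typeof document !== "undefined"){
  var es = new EventSource("basic_sse.php");
  es.addEventListener("message", onMessage, false);
  }
```

The behavior is the same as the anonymous version; the only difference is that the function now has a name you can reuse or remove with removeEventListener.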

Using jQuery?

Nowadays most people use jQuery. However, the SSE boilerplate code is so simple that there is not much for jQuery to simplify. For reference, here is the minimal example rewritten for jQuery:

<!doctype html>

<html>

  <head>

    <meta charset="UTF-8">

    <title>Basic SSE Example</title>

    <script src="//code.jquery.com/jquery-1.11.0.min.js"></script>

  </head>

  <body>

    <pre id="x">Initializing...</pre>

    <script>

    var es = new EventSource("basic_sse.php");

    es.addEventListener("message", function(e){

      $("#x").append("\n" + e.data);

      },false);

    </script>

  </body>

</html>

This next version (basic_sse_jquery_anim.html in the book’s source code) spruces it up with a fade-out/fade-in animation each time. This version also does a replace instead of an append, so you get to see only the most recent item:

<!doctype html>

<html>

  <head>

    <meta charset="UTF-8">

    <title>Basic SSE Example</title>

    <script src="//code.jquery.com/jquery-1.11.0.min.js"></script>

  </head>

  <body>

    <pre id="x">Initializing...</pre>

    <script>

    var es = new EventSource("basic_sse.php");

    es.addEventListener("message", function(e){

      $("#x").fadeOut("fast", function(){

        $("#x").html(e.data);

        $("#x").fadeIn("slow");

        });

      },false);

    </script>

  </body>

</html>

Minimal Example: The Backend

The first backend (server-side) example we will study is written in PHP, and looks like this:

<?php

header("Content-Type: text/event-stream");

while(true){

  echo "data:".date("Y-m-d H:i:s")."\n\n";

  @ob_flush();@flush();

  sleep(1);

  }

Just like the frontend code, this is wonderfully short, isn’t it? No libraries, no dependencies, just a few simple lines of vanilla PHP. And just like the frontend there is more we could be doing, but again it is all optional.

Going through the script, the very first line, <?php, identifies this as a PHP script. Then we send back a MIME type of text/event-stream, using the header() function. text/event-stream is the special MIME type for SSE. Next we enter an infinite loop (while(true){...} is the PHP idiom for that), and in that loop we output the current timestamp every second.

The SSE protocol just involves prefixing our message data (the timestamp) with data: and following it with a blank line. So starting at 1 p.m. on February 28, 2014, it outputs:

data:2014-02-28 13:00:00

data:2014-02-28 13:00:01

data:2014-02-28 13:00:02

data:2014-02-28 13:00:03

...
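The framing just described can be sketched as a small function. This is written in JavaScript rather than PHP, formatSseMessage is a hypothetical helper name, and the handling of multi-line text (one data: prefix per line) is my reading of the SSE format rather than something this chapter's example needs.

```javascript
// Hypothetical helper (not from the book): wrap text in the SSE framing.
// Each line of the payload gets its own "data:" prefix, and the extra
// "\n" at the end produces the blank line that ends the message.
function formatSseMessage(text){
  var lines = String(text).split("\n");
  var out = "";
  for(var i = 0; i < lines.length; i++){
    out += "data:" + lines[i] + "\n";
    }
  return out + "\n";
  }

console.log(JSON.stringify(formatSseMessage("2014-02-28 13:00:00")));
// prints "data:2014-02-28 13:00:00\n\n"
```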

What about the @ob_flush();@flush(); line? This tells PHP (and Apache) to send the data back to the client immediately, rather than buffer it up and send it back in batches. The @ prefix means suppress errors, and is fine here: there are no interesting errors we need to know about, but ob_flush() might complain if there is no data to flush out. (In case you wondered, the order does matter. ob_flush() must come before flush().)

PHP ERROR SUPPRESSION

For the PHP experts: @ is said to be slow. But putting that in context, it adds on the order of 0.01ms to call it twice, as shown here. So, as long as you are not putting it inside a tight loop, just relax. @foo() is shorthand for $prev=error_reporting(0); before the call to foo(), then error_reporting($prev); afterwards. So if you are really performance-sensitive and you find a need to use @foo() in a loop, and understand the implications, it is better to put those commands outside the loop.

In the case of ob_flush(), it is an E_NOTICE that we want to suppress. So this is an even better longhand:

$prev = error_reporting();

error_reporting($prev & ~E_NOTICE);

...

ob_flush();

flush();

...

error_reporting($prev);

http://bit.ly/1gCNyfX suggests flush() can never throw an error, so @ could be dropped there, and we can just leave it on ob_flush(). http://bit.ly/1elPD1S shows the notices PHP might throw from ob_flush().

Do infinite loops make you nervous? It is OK here. We are using up one of Apache’s threads/processes, but as soon as the browser closes the connection (whether from JavaScript, or the user closing the window) the socket is closed, and Apache will close down the PHP instance.

What about caching, whether by the client or intermediate proxies, you may wonder? I agree, caching would be awfully bad for SSE: the whole point is we have new information we want the user to know about. In my testing the client has never cached anything. Because this is intended as a minimal example, I chose to ignore caching. Examples in other chapters will send headers that explicitly request no caching, just to be on the safe side (see Cache Prevention).

WARNING

One other thing to watch out for when using SSE is that the browser might kill the connection if it goes quiet. For instance, some versions of the Chrome browser kill (and reopen) the connection after 60 seconds. In our real applications we will deal with this (see Adding Keep-Alive). Here it is not needed, because the backend never goes quiet—we output something every single second.

The Backend in Node.js

In this section I will use Node.js for the backend. Node.js is the same JavaScript you know from the browser, even with the same libraries (strings, regexes, dates, etc.), but done server side, and then extended with loads of modules. The biggest thing to watch out for when using Node.js is that, by default, everything is nonblocking (asynchronous, in other words), and asynchronous coding needs a different mindset. But it is this nonblocking, event-driven behavior that makes it well-suited to data push applications.

The PHP server solution we used earlier is better termed “Apache+PHP” because Apache (or the web server of your choice) handles the HTTP request handling (and a whole heap of other stuff, such as authentication), and PHP just handles the logic of the request itself. Apart from keeping the code samples fairly small, this is also the most common way people use PHP. Node.js comes with its own web server library, and that is the way most people use it for serving web content—so that is the way we will use it here.

NOTE

Let’s not get drawn into language wars. All languages suck until you are used to them. Then they just suck in ways you know how to deal with. The real strengths of PHP and Node.js are rather similar: very popular, easy to find developers for, and lots of useful extensions.

Minimal Web Server in Node.js

So, before I show how to support SSE with Node.js, we should first take a look at the minimal web server in Node.js:

var http = require("http");

http.createServer(function(request,response){

  response.writeHead(200,

    { "Content-Type": "text/plain" }

    );

  var content="Hello World\n";

  response.end(content);

  }).listen(1234);

The first line includes the http library; this is the CommonJS way of importing a module. We can then start running an HTTP server with a single line:

http.createServer(myRequestHandler).listen(port);

There is a lot of power in that single line: it will start listening on the port we give, handle all the HTTP protocol, and handle multiple clients, and when each client connects the specified request handler is called. By default it will listen on all local IP addresses. If you just wanted it to listen on 127.0.0.1, specify that as follows:

http.createServer(myRequestHandler).listen(port,"127.0.0.1");

By convention the request handler is implemented with an anonymous function, and our example follows that convention. The function takes two parameters: request, which is an instance of http.IncomingMessage,[11] and response, which is an instance of http.ServerResponse.[12]

The request parameter tells us what the client is asking for. The response object is then used to give it to the client. This minimal example completely ignores the user request: everybody gets the same thing (the content string). We make two calls on the response object. The first is to specify the status (HTTP status code 200 means “Success”) and content-type header (here plain text, not HTML). The second call, response.end(content), is a shortcut for two calls: response.write(content) to send data to the client (optionally specifying the encoding), and response.end() to say that is everything that needs to be sent, we are done.

To test this code, save it as basic_sse_node_server1.js, and from the command line run node basic_sse_node_server1.js. Then in your browser visit http://127.0.0.1:1234/, and you should see “Hello World.”
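To make the response.end(content) shortcut concrete, here is a sketch of the same handler written with the two separate calls. The function name helloHandler is my own; the port in the comment is the one used above.

```javascript
// Hypothetical handler showing the longhand for response.end(content).
function helloHandler(request, response){
  response.writeHead(200, { "Content-Type": "text/plain" });
  var content = "Hello World\n";
  response.write(content);  //Send the data...
  response.end();           //...then say there is nothing more to send.
  }

// To serve it:
//   require("http").createServer(helloHandler).listen(1234);
```

The distinction matters for SSE: a streaming handler calls write() repeatedly and does not call end() until the stream is over.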

Pushing SSE in Node.js

In the previous section we ignored the user input, and output static plain-text content. For the next block of code we continue to ignore the user input, but output dynamic text—the current timestamp, just as our earlier PHP code did:

var http = require("http");

http.createServer(function(request, response){

  response.writeHead(200, { "Content-Type": "text/event-stream" });

  setInterval(function(){

    var content = "data:" +

      new Date().toISOString() + "\n\n";

    response.write(content);

    }, 1000);

  }).listen(1234);

The first change is trivial: output the text/event-stream content type. But the biggest change from the previous example is the addition of setInterval( ... ,1000) to run some code once per second. In PHP we used an infinite loop, and a sleep(1) command to run a command once per second. If we did that in Node.js we would block the whole web server, and no other clients could connect. When writing a Node.js HTTP server, it is important to exit the request handler as quickly as possible. So the Node.js way is to use setInterval. The code being called once each second is reasonably straightforward. The “data:” prefix and the “\n\n” suffix are the SSE protocol. new Date().toISOString() is the JavaScript idiom to get the current timestamp.

From the command line, start this with node basic_sse_node_server2.js. Don’t try to test it in a browser just yet (it won’t work). If you have curl installed, you can test with curl http://127.0.0.1:1234/. A new timestamp will appear once a second, with a blank line between each:

data:2014-02-28T13:00:00.123Z

data:2014-02-28T13:00:01.145Z

data:2014-02-28T13:00:02.140Z

data:2014-02-28T13:00:03.142Z

...
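If you want to convince yourself that a loop really does block everything else in Node.js, this small experiment (my own illustration, not part of the book's source code) schedules a timer and then spins for about 50ms. The timer cannot fire until the loop has finished and control has returned to Node.js.

```javascript
// Ask for a callback as soon as possible.
var timerFired = false;
setTimeout(function(){ timerFired = true; }, 0);

// Busy-wait for about 50ms without ever returning to the event loop.
var start = Date.now();
while(Date.now() - start < 50){ }

// The timer was due long ago, but its callback has not run yet.
var firedDuringLoop = timerFired;
console.log("Timer fired during the loop? " + firedDuringLoop);  // prints false
```

A request handler that sleeps in a loop would hold up every other client in the same way, which is why the streaming code uses setInterval and returns immediately.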

SOME IMPROVEMENTS

There are a couple of ways we can enhance the script, though they get away from this chapter’s theme of minimal. At the top, add this line:

var port = parseInt( process.argv[2] || 1234 );

Then change the final line of the script so it looks like this:

  ...

  }).listen(port);

This allows you to specify the port to listen to, on the command line. If you do not have a web server already running, you could run the script as root specifying port 80.
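If you want something a little more defensive, here is one possible variation. It is only a sketch: parsePort is a hypothetical helper, and the choice to fall back silently to the default port (rather than exit with an error) is an assumption you may not share.

```javascript
// Hypothetical helper: parse a port from a command-line argument,
// falling back to a default when it is missing or out of range.
function parsePort(arg, fallback){
  var n = parseInt(arg, 10);  //Radix 10, so "0x50" is not read as hex
  if(isNaN(n) || n < 1 || n > 65535)return fallback;
  return n;
  }

var port = parsePort(process.argv[2], 1234);
console.log("Would listen on port " + port);
```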

The next change is to give some insight into how it is working. Replace response.write(content); with these three lines:

var b = response.write(content);

if(!b)console.log("Data queued (content=" + content + ")");

else console.log("Flushed! (content=" + content + ")");

Just as in the browser, JavaScript console.log() is used to let the programmer see what is going on. The return value from response.write() is true if the data got flushed out cleanly. This happens most of the time, and it is good. It is false if the data had to be queued in memory first. That means that at the time response.write() returned, the data had not been sent to your client yet. This happens if you try to send data too quickly (this is hard to see; even changing the interval from 1000ms to 1ms won’t count as “too quickly,” but getting rid of setInterval and using a while(true){...} loop will do it), or if the socket has become broken.
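The chapter's script only logs the return value, but a script that sends a lot of data could act on it. Node.js writable streams emit a "drain" event once queued data has been sent, so one possible approach (a sketch only, with writeThenContinue and next being names I made up) is to hold off further writes until then:

```javascript
// Hypothetical helper: write content, then call next() once it is
// reasonable to write more.
function writeThenContinue(response, content, next){
  var flushed = response.write(content);
  if(flushed){
    next();  //Nothing is queued; carry on straight away.
    }
  else{
    //Data was queued in memory; wait until the queue has emptied.
    response.once("drain", next);
    }
  }
```

At one message per second this is unnecessary, which is why the minimal example does without it.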

Start the node server again, and then start your curl client again. Wait for some data to come through. Now press Ctrl-C to kill the curl client. Over in the node window see how it is still trying to send data. Uh-oh…that is something else Apache takes care of for us when we use Apache+PHP.

What we need to do is recognize when the client has disconnected, which can be done by listening for the “close” event. The “close” event is part of request.connection, so we can respond to it by adding this code:

request.connection.on("close", function(){

  response.end();

  clearInterval(timer);

  console.log("Client closed connection. Aborting.");

  });

This code has to come after the call to setInterval, because it needs the timer that setInterval returns. So first capture that return value as follows:

var timer = setInterval(function(){

  ...

So, now when the client disconnects, that function triggers and we get to cleanly close the response, as well as shut down the interval that was ticking away every second.
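Putting those two pieces in order, the streaming part of the request handler might look like the following. It is a sketch that pulls the handler out into a named function (streamTimestamps is my own name) so the ordering is easy to see: the timer is created first, then the "close" listener that cancels it.

```javascript
// Hypothetical named version of the streaming request handler.
function streamTimestamps(request, response){
  response.writeHead(200, { "Content-Type": "text/event-stream" });

  //Capture the timer so it can be cancelled later.
  var timer = setInterval(function(){
    response.write("data:" + new Date().toISOString() + "\n\n");
    }, 1000);

  //When the client goes away, finish the response and stop the timer.
  request.connection.on("close", function(){
    response.end();
    clearInterval(timer);
    console.log("Client closed connection. Aborting.");
    });
  }

// To serve it:
//   require("http").createServer(streamTimestamps).listen(1234);
```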

If you look at basic_sse_node_server3.js in the book’s source code, you will also spot a couple of extra console.log() commands.

Now to Get It Working in a Browser!

First, start up your node server (node basic_sse_node_server3.js), look up basic_sse.html from earlier in this chapter, open it in an editor, and find this line:

var es = new EventSource("basic_sse.php");

Change it to use our Node.js server that is listening on port 1234:

var es = new EventSource("http://127.0.0.1:1234/");

Now open basic_sse.html in your browser. (This is assuming you have Apache listening on port 80, serving at least HTML files.)

Nothing happens. You will see “Initializing…,” and it just sits there. Why? The problem is that the HTML is being loaded from port 80, but is then trying to make a connection to port 1234. A different port number is enough for it to count as a different origin, and connecting to a different origin is not allowed by default (for security reasons). We will look at cross-origin resource sharing (CORS) in Chapter 9, which gives servers a way to say they want to accept connections from clients that loaded their content from somewhere else. But the alternative is to use Node.js to deliver the HTML file to the clients; this is the normal way to do things in the Node.js world.
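The rule the browser applies is that scheme, host, and port must all match. As a rough illustration (sameOrigin is a hypothetical helper, and this is only a sketch of the comparison, not of everything a browser checks), the standard URL class can show why the two addresses above do not match:

```javascript
// Hypothetical helper: compare the origin (scheme + host + port) of two URLs.
function sameOrigin(a, b){
  return new URL(a).origin === new URL(b).origin;
  }

//The page on port 80 versus the Node.js server on port 1234.
console.log(sameOrigin("http://127.0.0.1/sse/basic_sse.html",
                       "http://127.0.0.1:1234/"));  // prints false

//The page and the PHP script served from the same place.
console.log(sameOrigin("http://127.0.0.1/sse/basic_sse.html",
                       "http://127.0.0.1/sse/basic_sse.php"));  // prints true
```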

(Before you go any further, change back basic_sse.html to connect to basic_sse.php again.) Then, so the script can read files from the local filesystem, add this line to the top of your script:

var fs = require("fs");

Then the big change is at the top of the request handler. Add this block:

if(request.url!="/basic_sse.php"){

  fs.readFile("basic_sse.html",

    function(err,file){

      response.writeHead( 200,

        {"Content-Type" : "text/html"}

        );

      response.end(file);

      });

  return;

  }

When you get a certain URL, treat it as a request for the streaming. But the rest of the time (notice the !=) send back the HTML file instead. readFile() is one of Node.js’s async operations. You give the filename, then an anonymous function to deal with the content when it has been loaded. In the meantime, while waiting for the file to be loaded, you return from the request handler. When the file does load, we simply spit it out to the client, with a text/html content type, and end() the connection.
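One thing the block above skips, in the interest of staying minimal, is the err parameter: if basic_sse.html cannot be read, file will be undefined. Here is a sketch of how the file-serving part could check for that. The helper name sendFile, the optional done callback, and the choice of a 500 status with a plain-text message are all my assumptions, not the book's code.

```javascript
var fs = require("fs");

// Hypothetical helper: send the file as HTML, or a 500 response if it
// cannot be read. done (optional) is called with the error, or null.
function sendFile(response, filename, done){
  fs.readFile(filename, function(err, file){
    if(err){
      response.writeHead(500, { "Content-Type": "text/plain" });
      response.end("Could not read " + filename + "\n");
      if(done)done(err);
      return;
      }
    response.writeHead(200, { "Content-Type": "text/html" });
    response.end(file);
    if(done)done(null);
    });
  }

// Inside the request handler it might be used like this:
//   if(request.url != "/basic_sse.php"){
//     sendFile(response, "basic_sse.html");
//     return;
//     }
```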

Now you can browse to http://127.0.0.1:1234 in your browser.

MODIFYING THE HTML FILE

What’s that? Why do we mention “php” in the preceding code snippet? You’ve gone to all the trouble of those language wars with the PHP Brigade, going so far as to drug their tea, complain about their personal hygiene to the boss, and email them over 35 links to articles on how important and easy async programming really is, and now it looks like you are using Node.js to serve PHP content. The reason is simple: basic_sse.html was written to connect to the PHP script, and I don’t want to make another file.

Well, this is easy to fix. Between loading the file from disk and sending it to the client, why not modify the URL it says to connect to? Make the following changes: the URL tested in the first line, and the three lines that create, modify, and send s:

if(request.url != "/sse"){

  fs.readFile("basic_sse.html",

    function(err,file){

      response.writeHead( 200,

        {"Content-Type" : "text/html"}

        );

      var s = file.toString();

      s = s.replace("basic_sse.php","sse");

      response.end(s);

      });

  return;

  }

By the way, file is actually a buffer, not a string (because it might contain binary data), which is why we first have to convert it to a string.
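Here is that conversion and replacement on its own, so you can try it without starting a server. It is a small sketch: rewriteStreamUrl is a hypothetical helper, and Buffer.from() is used only to stand in for the buffer that readFile() would hand us.

```javascript
// Hypothetical helper: turn the buffer from readFile() into a string
// and point the page at a different streaming URL.
function rewriteStreamUrl(file, newUrl){
  var s = file.toString();  //file is a buffer; the default encoding is UTF-8
  return s.replace("basic_sse.php", newUrl);  //Replaces the first match only
  }

var page = Buffer.from('var es = new EventSource("basic_sse.php");');
console.log(rewriteStreamUrl(page, "sse"));
// prints: var es = new EventSource("sse");
```

Because replace() with a string pattern changes only the first occurrence, this works for basic_sse.html, which mentions the URL once; a page that mentioned it several times would need a regular expression with the g flag.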

You can find the final file with the code from this section and from the two sidebars in the book’s source code as basic_sse_node_server.js, and here it is in full:

var http = require("http"), fs = require("fs");

var port = parseInt( process.argv[2] || 1234 );

http.createServer(function(request, response){

  console.log("Client connected:" + request.url);

  if(request.url!="/sse"){

    fs.readFile("basic_sse.html", function(err,file){

      response.writeHead(200, { 'Content-Type': 'text/html' });

      var s = file.toString();  //file is a buffer

      s = s.replace("basic_sse.php","sse");

      response.end(s);

      });

    return;

    }

  //Below is to handle SSE request. It never returns.

  response.writeHead(200, { "Content-Type": "text/event-stream" });

  var timer = setInterval(function(){

    var content = "data:" + new Date().toISOString() + "\n\n";

    var b = response.write(content);

    if(!b)console.log("Data got queued in memory (content=" + content + ")");

    else console.log("Flushed! (content=" + content + ")");

    },1000);

  request.connection.on("close", function(){

    response.end();

    clearInterval(timer);

    console.log("Client closed connection. Aborting.");

    });

  }).listen(port);

console.log("Server running at http://localhost:" + port);

It is quite a bit more code than basic_sse.php because it is doing the tasks that Apache was taking care of in the Apache+PHP solution.

Smart, Sassy Exit

So that was the Hello World of the SSE world. Just a few lines on the frontend and a few lines on the backend; it couldn’t be simpler, could it? In the next five chapters we build on this knowledge to make something more sophisticated and robust that is usable on practically every desktop and mobile browser.


[9] For the moment, stick to keeping your HTML and your server-side script on the same machine. In Chapter 9 we will cover CORS, which (in some browsers) will allow the server-side script to be on a different machine.

[10] The third parameter of false means handle the event in the bubbling phase, rather than the capturing phase. Yeah, whatever. Just use false.

[11] See http://nodejs.org/api/http.html.

[12] See http://nodejs.org/api/http.html#http_class_http_serverresponse.