Node.js in Practice (2015)

Part 2. Real-world recipes

Chapter 12. Node in production: Deploying applications safely

This chapter covers

·        Deploying Node applications to your own server

·        Deploying Node applications to cloud providers

·        Managing packages for production

·        Logging

·        Scaling with proxies and cluster

Once you’ve built and tested a Node application, you’ll want to release it. Popular PaaS (platform as a service) providers like Heroku and Nodejitsu make deployment simple, but you can also deploy to private servers. Once your code is out there, you’ll need to cope with unexpected errors, service outages, and bugs, and monitor performance.

This chapter shows you how to safely release and maintain Node programs. It covers privately hosted servers that use Apache and nginx, WebSockets, horizontal scaling, automated deployment, logging, and ways to boost performance.

12.1. Deployment

In this section you’ll learn how to deploy Node applications to popular cloud providers and your own private servers. You’ll typically use only one of these approaches, depending on the requirements of your application or employer, but being familiar with both is instructive. For example, the Git-based workflow employed by Heroku has influenced how people deploy applications to servers they control, and with a bit of knowledge you can set up a server without having to call for help from a DevOps specialist.

The first technique we cover is based on Windows Azure, Heroku, and Nodejitsu. This is probably the easiest way to deploy web applications today, and cloud providers have free plans that make it cheap and painless to share your work.

Technique 96 Deploying Node applications to the cloud

This technique outlines how to use Node with PaaS providers, and has tips on how to configure and maintain applications in production. The focus is on the practical aspects of deployment and maintenance, rather than pricing or business models.

You can try out Heroku and Azure for free, so follow along if you’ve ever wanted to run a Node application in the cloud.


Problem

You’ve built a Node web application and want to run it on servers so people can use it.


Solution

Use a PaaS provider like Heroku or Nodejitsu.


Discussion

We’ll look at three options for cloud deployment: Nodejitsu, Heroku, and Windows Azure. All of these services allow you to deploy Node web applications, but they all handle things slightly differently. The methods for uploading an application and configuring it vary, even though the fundamental concepts are the same.

Nodejitsu is an interesting case because it’s dedicated to Node. On the other hand, Windows Azure supports Microsoft’s software development tools, programming languages, and databases. Azure even has features beyond web application hosting, like databases and Active Directory integration. Heroku draws on a rich community of partners that offers add-ons, whereas Azure is more of a full-service offering.

If you look in the source code provided with this book, you should find a small Express application in production/inky. This is the application we used to research this technique, and you can use it as a sample application to try each service provider. Nodejitsu and Azure’s documentation includes examples based on Node’s http module, but you really need something with a package.json to see how things work for typical Node applications.

The first service provider we’ll look at is Nodejitsu. Nodejitsu is based in New York, and has data centers in North America and Western Europe. It was founded in 2010, and has funding from the Bloomberg Beta fund.

To get started with Nodejitsu, you’ll need to register an account. Go to Nodejitsu’s site and sign up. You can sign up without selecting a pricing plan if you intend to release an open source project through Nodejitsu.

Nodejitsu has a command-line client called jitsu. You can install it with npm install -g jitsu. Once npm has finished, you’ll need to sign in—type jitsu login and enter your username and password. This will save an API token to a file called ~/.jitsuconf, so your password won’t be stored locally. Figure 12.1 shows what this process looks like in the terminal.

Figure 12.1. The jitsu command-line client allows you to sign in.

To deploy an application, type jitsu deploy. The jitsu command will prompt with questions about your application, and then set it up to run on a temporary subdomain. If you’re using an Express application, it’ll automatically set NODE_ENV to production, but you can edit this setting along with other environmental variables in the web interface. In fact, the web interface can do most of the things the jitsu command does, which means you don’t necessarily need a developer on hand to do basic maintenance chores like restarting applications.

Figure 12.2 shows a preview of Nodejitsu’s web interface, which is called WebOps. It allows you to stop and start applications, manage environmental variables, roll back to earlier versions of your application, and even stream logs in real time.

Figure 12.2. The WebOps management interface

Unsurprisingly, Nodejitsu is heavily tailored toward Node applications, and the deployment process is strongly influenced by npm. If you have a strong grasp of npm and package.json files, and your projects are all Node applications, then you’ll feel at home with Nodejitsu.

Another PaaS solution that’s popular with Node developers is Heroku. Heroku supports several programming languages and platforms, including Node, and was founded in 2007. It has since been acquired by Salesforce, and uses a virtualized solution based on Ubuntu servers. To use Heroku, you’ll need to sign up at heroku.com. It’s easy to create a free account, and you can even run production applications on the free tier. Essential features like domain aliases and SSL are paid, so it doesn’t take many requirements to hit around $20 a month, but if you don’t mind using a Heroku subdomain, you can keep things running for free.

Once you’ve created an account, you’ll need to install the Heroku Toolbelt; there are installers for Linux, Mac OS X, and Windows. Once you’ve installed it, you’ll have a command-line client called heroku that can be used to create and manage applications. Before you can use it, you’ll have to sign in; heroku login can be used to do this, and functions in much the same way as Nodejitsu’s jitsu command. You only need to log in once because it stores a token that will be used for subsequent requests. Figure 12.3 shows what this should look like.

Figure 12.3. Signing in with Heroku

The next step with a Heroku deploy is to prepare your repository. You’ll need to git init and commit your project. If you’re using our code samples and have checked them out of Git, then you should copy the specific project that you want to deploy out of our working tree. The full steps are as follows:

1.  git init

2.  git add .

3.  git commit -m 'Create new project'

4.  heroku create

5.  git push heroku master

The heroku create command sets up a remote repository called heroku, and the first git push to it will trigger the creation of a temporary subdomain.

If your application can be started with npm start, it should just work. If not, you might need to add a file called Procfile to your application that contains web: node yourapp.js. This file lists the processes that your application needs to run—it could include background workers as well.
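For example, a Procfile for an application started by server.js, with an optional background worker, might look like this (both file names are illustrative):

```
web: node server.js
worker: node worker.js
```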

If you’re using an Express application that expects NODE_ENV to be set, then you’ll need to do this manually with Heroku. The command is just heroku config:set NODE_ENV=production, but notice that this is automatic with Nodejitsu.

The last PaaS provider we’ll discuss is Windows Azure. Microsoft’s Azure platform can be used entirely through the web interface, but there’s also a command-line interface that you can install with npm install -g azure-cli. Figure 12.4 shows what the command-line tool looks like.

Figure 12.4. The Azure CLI tool

Azure also has an SDK that you can download for Linux, Mac OS X, and Windows.

To start using Azure, you’ll need to sign in to the Azure portal with a Microsoft account. This is the same account that you can use for other Microsoft services, so if you already have an email account with Microsoft, you should be able to sign in. Azure’s registration process has extra security steps: a credit card and phone number are used to validate your account, so it’s a bit more tedious than Heroku or Nodejitsu.

Once you’ve created your Windows Azure account, you’ll want to go to the Portal page. Next go to Compute > Web Site, and then Quick Create. Just keep in mind that you’re creating a “Web Site” and you should be fine—Microsoft supports a wide range of services that are partly tailored to .NET development with their existing tools like Visual Studio, so it can be bewildering for Mac and Unix developers.

Once your application has been created, you’ll need to tie it to a source control repository. Don’t worry, you can use GitHub! Before we go any further, check that you’re looking at a page like the one in figure 12.5.

Figure 12.5. Azure’s web interface after creating a website


Cloud configuration

PaaS providers all seem to have their own approaches to application configuration. You can, of course, keep configuration settings in your code, or JSON files, but there are times when it’s useful to store them outside of your repository.

For example, we build open source web applications that we also run on Heroku, so we keep our database passwords outside of our open source repository and use heroku config:set instead.


Click your application’s name, select Set up deployment from source control, and then look for site URL on the right side. From here you’ll be able to choose from a huge range of repositories and service providers, but we tested our application with GitHub. Azure fetched the code and set up a Node application—it was the same Express code that we used for Heroku (listings/production/inky), and worked the first time.

Table 12.1 shows how to get and set configuration values on each of the cloud providers we’ve discussed here.

Table 12.1. Setting environmental variables

Nodejitsu:

jitsu env set name value
jitsu env delete name
jitsu env list

Heroku:

heroku config:set name=value
heroku config:unset name
heroku config

Azure:

azure site appsetting add name=value
azure site appsetting delete name
azure site appsetting list

Although Azure’s registration requirements might seem less convenient than Heroku’s or Nodejitsu’s, it does have several benefits: if you’re working with .NET, then you can use your existing tools. Also, Microsoft’s documentation is excellent, and includes guides on setup and deployment for Linux and Mac OS X.

Your own servers, rented servers, or cheap virtual hosts all have their own advantages. If you want complete control over your server, or if your business already has its own servers or data centers, then read on to learn how to deploy Node to your own servers.

Technique 97 Using Node with Apache and nginx

Deploying Node to private servers running Apache or nginx is entirely possible, and recommended for certain situations. This technique demonstrates how to run a Node program behind Apache and nginx.


Problem

You want to run a Node web application on your own server.


Solution

Use Apache or nginx proxying and a service supervisor like runit.


Discussion

While PaaS solutions are easy to use, there are times when you have to use dedicated hardware, or virtual machines that you have full control over. Larger businesses often have investments in their own data centers, so it doesn’t always make sense to switch to an external service provider.

Virtualization has transformed web hosting. Linux virtual machines have been a key solution for hosting web applications for several years, and services like Amazon Elastic Compute Cloud make it easy to create and destroy services on demand.

It’s therefore likely that at some point you’ll be faced with deploying Node applications to servers that need configuration and maintenance. If you’re already experienced with basic systems administration tasks, then you can reuse your existing skills and software. Otherwise, you’ll have to become familiar with web server daemons and the tools used to keep Node programs running and recovering from errors.

This technique presents examples for Apache and nginx. They’re both web servers, but their configuration formats are very different, and they’re built in different ways. Figure 12.6 shows the basic server architecture that we’ll create in this section.

Figure 12.6. A Node program running alongside Apache or nginx

It’s not actually necessary to run a web server—there are ways to make Node programs safely access port 80. But we assume that you’re deploying to a server that has existing websites. Also, some people prefer to serve static assets from Apache or nginx.

The same technique is used for both servers: proxying. The following listing shows how to do this with Apache.

Listing 12.1. Proxying requests to a Node application with Apache
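A minimal sketch of these directives, based on the description that follows (module paths vary by platform; these assume a standard Apache layout):

```apache
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
ProxyPass / http://localhost:3000/
```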

The directives in listing 12.1 should be added to your Apache configuration file. To find the right file, type apache2 -V on your server, and look for the HTTPD_ROOT and SERVER_CONFIG_FILE values—joining them will give you the right path and file. It’s likely that you won’t want to redirect all requests to your Node application, so you can add the proxy settings to a VirtualHost block.

With these three lines, requests to / will now be proxied to a process listening on port 3000. In this case, the process is assumed to be a Node program that you’ve run with node server.js or npm start, but it could technically be any HTTP server. The LoadModule directives tell Apache to use the proxy and HTTP proxying modules.

If you forget to start the Node process, or quit it, then Apache will return a 503 error. To avoid errors like this, you need a way to keep the Node process running, and to also run it when the server boots. One way to do this is with runit.

If you’re using Debian or Ubuntu, you can install runit with apt-get install runit. Once it’s ready, create a shell script that can start your Node process. First, create a directory for your project: sudo mkdir /etc/service/nodeapp. Next, create a file that will be used for the script: sudo touch /etc/service/nodeapp/run. Then edit the file to make it look like the next listing.

Listing 12.2. Running a program with runit
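A sketch of the run script, assuming an nvm-managed Node install and a project in /home/node/nodeapp (both paths are examples you’d change):

```sh
#!/bin/sh
# Add the nvm-managed Node binaries to PATH so runit can find node and npm
export PATH="$PATH:/home/node/.nvm/v0.10.26/bin"
cd /home/node/nodeapp
exec npm start
```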

Our server was using nvm to manage the installed versions of Node, so we added its location to $PATH; otherwise the shell couldn’t find where node and npm were installed. You may have to modify this based on the output of which node, or remove it entirely. The last two lines just change the directory to the location of your Node project, and then start it with npm start.

The application can be started with sudo sv start /etc/service/nodeapp and stopped with sudo sv stop /etc/service/nodeapp. Once the Node process is running, you can test it by killing it, and then checking to see that it automatically gets restarted by runit.

Now that you know how Apache handles proxies, and how to keep a process running, let’s look at nginx. Nginx is often used as a web server, but it’s technically a reverse proxy server that supports HTTP, HTTPS, and email. To make nginx proxy connections to Node applications, you can use the Proxy module, which uses a proxy_pass directive in a way similar to Apache.

Listing 12.3 has the settings needed by nginx. Like Apache, you could also put the server block in a virtual host file.

Listing 12.3. Proxying requests to a Node application with nginx
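A minimal server block matching this description might be the following sketch (a real configuration would typically add a server_name and static asset handling):

```nginx
server {
  listen 80;

  location / {
    # Hand every request to the Node process on port 3000
    proxy_pass http://localhost:3000;
  }
}
```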

If you have multiple applications on the same server, then you can use a different port, but we’ve used 3000 here. This example is basically the same as Apache—you tell the server what location to proxy, and then the port. And of course, this example could be combined with runit.

If you don’t want to run Apache or nginx, you can run Node web applications without a web server. Read on to learn how to do this using firewall rules and other techniques.

Technique 98 Safely running Node on port 80

You can still run Node without a web server daemon like Apache. To do this, you basically need to forward the external port 80 to an internal, unprivileged port. This technique presents some ways to do this in Linux.


Problem

You don’t want to use Apache or nginx.


Solution

Use firewall rules to redirect port 80 to another, unprivileged port.


Discussion

In most operating systems, binding to port 80 requires special privileges. That means that if you try to use app.listen(80) instead of port 3000 as we’ve used in most of our examples, you’ll see Error: listen EACCES. This happens because your current user account doesn’t have permission to bind to port 80.

You could get around this restriction by running sudo npm start, but this is dangerous. Ideally you want your Node program to run as a nonroot user.

In Linux, traffic can be redirected from port 80 to a higher port number by using iptables. Linux uses iptables to manage firewall rules, so you just need a rule that maps from port 80 to 3000:

iptables -t nat -I PREROUTING -p tcp --dport 80 \
    -j REDIRECT --to-port 3000

To make this change permanent, you’ll need to save the rules to a file that gets run whenever the network interface is set up. The general approach is to save the rules to a file, like /etc/iptables.up.rules, and then edit /etc/network/interfaces to use it:

auto eth0

iface eth0 inet dhcp

  pre-up iptables-restore < /etc/iptables.up.rules

  post-down iptables-restore < /etc/iptables.down.rules

This is highly dependent on your operating system; these rules are adapted from Debian and Ubuntu’s documentation, but it may be different in other Linux distributions.

One downside of this technique is that it maps traffic to any process that’s listening to that port. An alternative solution is to grant the Node binary extra capabilities. You can do this by installing libcap2.

In Debian and Ubuntu, you can use sudo apt-get install libcap2-bin. Then you just need to grant the Node binary the capabilities for accessing privileged ports:

sudo setcap cap_net_bind_service=+ep /usr/local/bin/node

You may need to change the path to Node—check the output of which node if you’re not sure where it is. The downside of using capabilities for this is that now the node binary can bind to all ports from 1–1024, so it’s not as specific as restricting it to port 80.

Once you’ve applied a capability to a binary, it will be fixed until the file changes. That means that you’ll need to run this command again if you upgrade Node.

Now that your application is running on a server, you’ll want to ensure that it runs forever. There are many different ways to do this; the next technique outlines runit and the forever module.

Technique 99 Keeping Node processes running

Programs inevitably crash, and it’s unfortunate when this happens. What matters is how well you handle failure—users should be informed, and programs should recover elegantly. This technique is all about keeping Node programs running, no matter what.


Problem

Your program crashed in the middle of the night, and customers were unable to use the service until you restarted it.


Solution

Use a process monitor to automatically restart the Node program.


Discussion

There are two main ways to keep a Node program running: service supervision, or a Node program that manages other Node programs. The first method is a generic, operating system–specific technique. You’ve already seen runit in technique 97. Runit supports service supervision, which means it detects when a process stops running and tries to restart it.

Another daemon manager is Upstart. You may have seen Upstart if you use Ubuntu. To use it, you’ll need a configuration file that describes how the Node program is managed. Listing 12.4 contains an example that you can modify for your server—it should be saved in /etc/init/nodeapp.conf, where nodeapp is the name of your application.

Listing 12.4. Managing a Node program with Upstart
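A sketch of such a configuration file (the application path and log file location are examples):

```
#!upstart
description "nodeapp"

env PATH=/usr/local/bin:/usr/bin:/bin

start on runlevel [23]
stop on shutdown

# Restart the process if it dies for any reason
respawn

script
  export NODE_ENV=production
  exec node /var/www/nodeapp/server.js >> /var/log/nodeapp.log 2>&1
end script
```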

This configuration file tells Upstart to respawn the application if it dies for any reason. It sets up a PATH that’s similar to the one you’ll see in your terminal if you type echo $PATH. Then it states the program should be run on run levels 2 and 3—run level 2 is usually when networking daemons are started.


Run levels

Unix systems handle run levels differently depending on the vendor. The Linux Standard Base specification describes run level 2 as multi-user mode, and 3 as multi-user mode with networking. In Debian, 2–5 are grouped as multi-user mode with console logins and the display manager. However, Ubuntu treats run level 2 as graphical multi-user with networking, so you should check how your system implements run levels before using Upstart.


The Upstart script stanza allows you to include a short script, so this means you can do things like set NODE_ENV to production. The application itself is launched with the exec instruction. We’ve included some logging support by redirecting standard out and standard error to a log file.

Upstart can be more work to set up than runit, but we’ve used it in production for three years now without any issues. Both are easier to set up and maintain than traditional stop/start init scripts, but there’s another technique you can use: Node programs that monitor other Node programs.

Node process managers work by using a small program that ensures another program runs continuously. This program is simple and therefore less likely to crash than a more complex web application. One of the most popular modules for this is forever, which can be used as a command-line program or programmatically.

Most people use it through the command-line interface. The basic usage is forever start app.js, where app.js is your web application. It has lots of options beyond this, though: it can manage log files and even wrap your program so it behaves like a daemon.

To start your program as a daemon, use the following options:

forever start -l forever.log -o out.log -e err.log app.js

This will start app.js, creating some additional files: one to store the current PID of the active process, a log file, and an error log file. Once the program is running, you can stop it gracefully like this:

forever stop app.js

Forever can be used with any Node program, but it’s generally seen as a tool for keeping web applications running for a long time. The command-line interface makes it easy to use alongside other Unix programs.

Deploying applications that use WebSockets can bring a set of unique requirements. It can be more difficult with PaaS providers, because they can kill requests that last for more than a certain number of seconds. If you’re using WebSockets, look through the next technique to make sure your setup will work in production.

Technique 100 Using WebSockets in production

Node is great for WebSockets—the same process can serve both standard HTTP requests and the newer WebSocket protocol. But how exactly do you deploy programs that use WebSockets in production? Read on to find out how to do this with web servers and cloud providers.


Problem

You want to use WebSockets in production.


Solution

Make sure the service provider or proxy you’re using supports HTTP Upgrade headers.


Discussion

WebSockets are amazing, but are still treated almost like second-class citizens by hosting providers. Nodejitsu was the first PaaS provider to support WebSockets, and it uses node-http-proxy to do this. Almost all solutions involve a proxy. To understand why, you need to look at how WebSockets work.

HTTP is essentially a stateless protocol, which means all interactions between a server and a client can be modeled with requests and responses that hold all of the required state. This level of encapsulation has led to the design of modern client/server web applications.

The downside of this is that the underlying protocol doesn’t support long-running full-duplex connections. There’s a wide class of applications that are built on TCP connections of this type; video streaming and conferencing, real-time messaging, and games are prominent examples. As web browsers have evolved to support richer, more sophisticated applications, we’re naturally left trying to simulate these types of applications using HTTP.

The WebSocket protocol was developed to support long-lived TCP-like connections. It works by using a standard HTTP handshake where the client establishes whether the server supports WebSockets. The mechanism for this is a new header called Upgrade. As HTTP clients and servers are typically bombarded with a variety of nonstandard headers, servers that don’t support Upgrade should be fine—the client will just have to fall back to old-fashioned HTTP polling.

Because servers have to handle WebSocket connections so differently, it makes sense to effectively run two servers. In a Node program, we typically have an http.listen for our standard HTTP requests, and another “internal” WebSocket server.

In technique 97, you saw how to use nginx with Node. The example used proxies to pass requests from nginx to your Node process, which meant the Node process could bind to a different port from 80. By using the same technique, you can make nginx support WebSockets. A typical nginx.conf would look like the next listing.

Listing 12.5. Adding WebSocket support to nginx
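A sketch of the relevant settings, extending the proxy configuration shown earlier:

```nginx
server {
  listen 80;

  location / {
    proxy_pass http://localhost:3000;
    # Required for WebSockets: HTTP/1.1 plus the Upgrade/Connection headers
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    # Skip the cache for upgraded (WebSocket) requests
    proxy_cache_bypass $http_upgrade;
  }
}
```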

Adding proxy_http_version 1.1 and proxy_set_header Upgrade enables nginx to filter WebSocket requests through to your Node process. This example will also skip caching for WebSocket requests.

Since we mentioned Nodejitsu supports WebSockets, what about Heroku? Well, you currently need to enable it as an add-on, which means you need to run a heroku command:

heroku labs:enable websockets

Heroku’s web servers usually kill requests that take longer than around 75 seconds, but enabling this add-on means requests that originate with an Upgrade header should keep running for as long as the network allows.

There are times when you might not be able to use WebSockets easily. One example is older versions of Apache, where the proxy module doesn’t support them. In cases like this, it can be better to use a proxy server that runs before everything else.

HAProxy is a flexible proxy server. The usage is similar to nginx’s, and it’s also event-based, so it has been widely adopted in the Node community. If you’re using an old version of Apache, you can proxy web requests to Apache or Node, depending on various options like URL or headers.

If you want to install HAProxy in Debian or Ubuntu, you can do so with sudo apt-get install haproxy. Once it’s set up, you’ll need to edit /etc/default/haproxy and set ENABLED=1—this is just because it ships with a default configuration, so it’s disabled by default. Listing 12.6 is a sample configuration that’s capable of routing requests to a Node web application that runs on port 3000, but will be accessible using port 80 externally.

Listing 12.6. Using HAProxy with a Node application
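A sketch of such a configuration (the timeout values are illustrative):

```
defaults
  mode http
  timeout connect 5s
  # Long timeouts so HAProxy doesn't close long-lived WebSocket connections
  timeout client 24h
  timeout server 24h

frontend http-in
  bind *:80
  default_backend node_backend

backend node_backend
  server node1 localhost:3000
```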

This should work with WebSockets, and we’ve used a long timeout so HAProxy doesn’t close WebSocket connections, which are typically long-lived. If you run a Node program that listens on port 3000, then after restarting HAProxy with sudo /etc/init.d/haproxy restart, your application should be accessible on port 80.

You can use table 12.2 to find the web server that’s right for your application.

Table 12.2. Comparing server options

Apache

·        Fast asset serving

·        Works well with established web platforms (PHP, Ruby)

·        Lots of modules for things like proxying, URL rewriting

·        Virtual hosts

Best for: May already be on servers

nginx

·        Event-based architecture, very fast

·        Easy to configure

·        Proxy module works well with Node and WebSockets

·        Virtual hosts

Best for: Hosting Node applications when you also want to host static websites, but don’t yet have Apache or a legacy server set up

HAProxy

·        Event-based and fast

·        Can route to other web servers on the same machine

·        Works well with WebSockets

Best for: Scaling up to a cluster for high-traffic sites, or complex heterogeneous setups

Native Node proxy

·        Reuse your Node programming knowledge

·        Flexible

Best for: Useful if you want to scale and have a team with excellent Node skills


Which server is right for me?

This chapter doesn’t cover every server choice out there—we’ve mainly focused on Apache and nginx for Unix servers. Even so, it can be difficult to pick between these options. We’ve included table 12.2 so you can quickly compare each option.


Your HAProxy setup can be made aware of multiple “back ends” by naming them with the backend instruction. In listing 12.6 we only have one—node_backend. It would be possible to also run Apache, and route certain requests to it based on the domain name:

frontend http-in

  mode http

  bind *:80

  acl static_assets hdr_end(host) -i

backend static_assets

  mode http

  server www_static localhost:8080

This works well if you have an existing set of Apache virtual hosts—perhaps serving things like static assets, blogs, and websites—and you want to add Node to the same server. Apache can be set up to listen on a different port so HAProxy can sit in front of it, and then route requests to Express on port 3000 and the existing Apache sites on port 8080. Apache allows you to change the port by using the Listen 8080 directive.

You can use the same acl option to route WebSockets based on URL. Let’s say you’ve mounted your WebSocket server on /chat in your Node application. You could have a specific instance of your server that just handles WebSockets, and route conditionally using HAProxy with path_beg. The following listing shows how this works.

Listing 12.7. Using HAProxy with WebSockets
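A sketch of this routing (the WebSocket-only Node process on port 3001 is an assumption):

```
frontend http-in
  mode http
  bind *:80
  # A request is a WebSocket handshake if it carries an Upgrade header,
  # or if it's addressed to the /chat mount point
  acl is_websocket hdr(Upgrade) -i WebSocket
  acl is_websocket path_beg /chat
  use_backend ws_backend if is_websocket
  default_backend node_backend

backend node_backend
  mode http
  server node1 localhost:3000

backend ws_backend
  mode http
  server ws1 localhost:3001
```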

HAProxy can match requests based on lots of parameters. Here we’ve used hdr(Upgrade) -i WebSocket to test if an Upgrade header has been used. As you’ve already seen, that denotes a WebSocket handshake.

By using path_beg and marking matching routes with acl is_websocket, you can now route requests based on the prefix expression if is_websocket.

All of these HAProxy options can be combined to route requests to your Node application, Apache server, and WebSocket-specific Node server. That means you can run your WebSockets off an entirely different process, or even another internal web server. HAProxy is a great choice for scaling up Node programs—you could run multiple instances of your application on multiple servers.

HAProxy provides a weight option that allows you to implement round-robin load balancing by adding balance roundrobin to a backend.

You can initially deploy your application without nginx or HAProxy in front of it, but when you’re ready, you can scale up by using a proxy. If you don’t have performance issues right now, then it’s worth just being aware that proxies can do things like route WebSockets to different servers and handle round-robin load balancing. If you already have a server using Apache 2.2.x that isn’t compatible with proxying WebSockets, then you can drop HAProxy in front of Apache.

If you’re using HAProxy, you’ll still have to manage your Node processes with a monitoring daemon like runit or Upstart, but it has proven to be an incredibly flexible solution.

Another approach that we haven’t discussed yet is to put your Node applications behind a lightweight Node program that acts as a proxy itself. This is actually used behind the scenes by PaaS providers like Nodejitsu.

Selecting the right server architecture is just the first step to successfully deploying a Node application. You should also consider performance and scalability. The next three techniques include advice on caching and running clusters of Node programs.

12.2. Caching and scaling

This section is mainly about running multiple copies of Node applications at once, but we’ve also included a technique to give you details on caching. If you can make the client do more work, then why not?

Technique 101 HTTP caching

Even though Node is known for high-performance web applications, there are ways you can speed things up. Caching is the major technique, and you should consider caching before deploying your application. This technique introduces the concepts behind HTTP caching.


Problem

You want to reduce how long it takes to make requests to your application.


Solution

Check to ensure that you’re using HTTP caching correctly.


Discussion

Modern web applications can be huge: image assets, fonts, CSS, JavaScript, and HTML all add up to a formidable payload that’s spread across several HTTP requests. Even with the best minimizers and compression, downloads can still run into megabytes. To avoid requiring users to wait for every action they perform on your site, the best strategy can be to remove the need to download anything at all.

Browsers cache content locally, and can look at the cache to determine if a resource needs to be downloaded. This process is controlled by HTTP cache headers and conditional requests. In this technique we’ll introduce cache headers and explain how they work, so when you watch your application serving responses in a debugging tool like WebKit Inspector, you’ll know what caching headers to expect.

The main two headers are Cache-Control and Expires. The Cache-Control header allows the server to specify a directive that controls how a resource is cached. The basic directives are as follows:

·        public —Allow caching in the browser and any intermediate proxies between the browser and server.

·        private —Only allow the browser to cache the resource.

·        no-store —Don’t cache the resource (but some clients still cache under certain conditions).

For a full list of Cache-Control directives, refer to the Hypertext Transfer Protocol 1.1 specification.

The Expires header tells the browser when to replace the local resource. The date must be in RFC 1123 format, expressed in GMT: Fri, 04 Apr 2014 19:06:31 GMT. The HTTP/1.1 specification notes that dates more than one year in the future shouldn’t be used, because the behavior is undefined.

These two headers allow the server to tell clients when a resource should be cached. Most Node frameworks like Express will set these headers for you—the static serving middleware that’s part of Connect, for example, will set maxAge to 0 to indicate that cache revalidation should occur. If you watch the Network console in your browser’s debugging tools, you should see Express serving static assets with Cache-Control: public, max-age=0 and a Last-Modified date based on the file’s modification time.

Connect’s static middleware, which is built on the send module, does this by using stat.mtime.toUTCString() to get the date of the last file modification. The browser will then make a standard HTTP GET request for the resource with two additional request headers: If-Modified-Since and If-None-Match. Connect checks If-Modified-Since against the file modification date, and responds with an HTTP 304 Not Modified if the file hasn’t changed. A 304 response has no body, so the browser can use its local copy instead of downloading the resource again.

Figure 12.7 shows a high-level overview of HTTP caching, from the browser’s perspective.

Figure 12.7. Browsers either use the local cache or make a conditional request, based on the previous request’s headers.

Conditional caching is great for large assets that may change, like images, because it’s much cheaper to make a GET request to find out if a resource should be downloaded again. This is known as a time-based conditional request. There are also content-based conditional requests, where a digest of the resource is used to see if a resource has changed.

Content-based conditional requests work using ETags. ETag is short for entity tag, and allows servers to validate resources in a cache based on their content. Connect’s static middleware generates ETags like this:

exports.etag = function(stat) {
  return '"' + stat.size + '-' + Number(stat.mtime) + '"';
};
Now contrast this to how Express generates ETags for dynamic content—this is usually content sent with res.send, like a JavaScript object or a string:

exports.etag = function(body){
  return '"' + crc32.signed(body) + '"';
};
The first example builds a tag from the file’s size and modification time. The second hashes the content itself with a CRC32 checksum. Both techniques send the browser tags derived from the content, but each has been optimized for its resource type: hashing a large static file on every request would be expensive, so its metadata stands in for the content.

There’s pressure on developers of static servers to make them as fast as possible. If you were to use Node’s built-in http module, you’d have to take all of these caching headers into account, and then optimize things like ETag generation. That’s why it’s advisable to use a module like Express—it’ll handle the details of the required headers based on sensible default behavior, so you can focus on developing your application.

Caching is an elegant way of improving performance because it effectively allows you to reduce traffic by making clients do a bit more work. Another option is to use a Node-based HTTP proxy to route between a cluster of processes or servers. Read on to learn how to do this, or skip to technique 103 to see how to use Node’s cluster module to manage multiple Node processes.

Technique 102 Using a Node proxy for routing and scaling

Local development is simple because you generally run one Node application at a time. But a production server can host multiple applications, and run the same application on multiple CPU cores to improve performance. So far we’ve talked about web and proxy servers, but this technique focuses on pure Node servers.


Problem

You want to use a pure Node solution to host multiple applications, or scale an application.


Solution

Use a proxy server module like Nodejitsu’s http-proxy.


Discussion

This technique demonstrates how to use Node programs to route traffic. It’s similar to the proxy server examples in technique 100, so you can reapply these ideas to HAProxy or nginx. But there are times when it might be easier to express routing logic in code rather than in configuration files.

Also, as you’ve seen before in this book, Node programs run as a single process, which doesn’t usually take advantage of a modern server that may have multiple CPUs and CPU cores. Therefore, you can use the techniques here to route traffic based on your production needs, but also to run multiple instances of your application so it can better take advantage of your server’s resources, reducing response latency and hopefully keeping your customers happy.

Nodejitsu’s http-proxy is a lightweight wrapper around Node’s built-in http core module that makes it easier to define proxies with code. The basic usage should be familiar to you if you’ve followed our chapter on Node web development. The following listing is a simple proxy that redirects traffic to another port.

Listing 12.8. Redirecting traffic to another port with http-proxy

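The listing itself isn’t reproduced here, so as a sketch of what it might contain, using http-proxy’s documented API:

```javascript
var http = require('http');
var httpProxy = require('http-proxy');

// Forward every request to the application listening on port 3000.
var proxy = httpProxy.createProxyServer({ target: 'http://localhost:3000' });

// http-proxy is event-based, so failures are handled with an error listener.
proxy.on('error', function(err, req, res) {
  res.writeHead(500, { 'Content-Type': 'text/plain' });
  res.end('Proxy error: ' + err.message);
});

// Port 9000 is used so the example is easy to run locally;
// production would listen on port 80.
http.createServer(function(req, res) {
  proxy.web(req, res);
}).listen(9000);
```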
This example redirects traffic to port 3000 by using http-proxy’s target option. The module is event-based, so errors can be handled by setting up an error listener. The proxy server itself listens on port 9000, but we’ve just used that so you can run it easily; port 80 would be used in production.

The options passed to createProxyServer can define other routing logic. If ws: true is set, then WebSockets will be routed separately. That means you can create a proxy server that routes WebSockets to one application, and standard requests elsewhere. Let’s look at that in a more detailed example. The next listing shows you how to route WebSocket requests to a separate application.

Listing 12.9. Routing WebSocket connections separately

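Again the listing isn’t reproduced here; a sketch of the pattern, using http-proxy’s ws option and upgrade events (the ports are illustrative):

```javascript
var http = require('http');
var httpProxy = require('http-proxy');

// Standard HTTP requests go to one application...
var proxy = httpProxy.createProxyServer({ target: 'http://localhost:3000' });
// ...while WebSocket connections are routed to another.
var wsProxy = httpProxy.createProxyServer({ target: 'http://localhost:3001', ws: true });

var server = http.createServer(function(req, res) {
  proxy.web(req, res);
});

// WebSocket handshakes arrive as HTTP upgrade events, so intercept
// them and hand the socket to the WebSocket-specific proxy.
server.on('upgrade', function(req, socket, head) {
  wsProxy.ws(req, socket, head);
});

server.listen(9000);
```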
This example creates two proxy servers: one for web requests and the other for WebSockets. The main web-facing server emits upgrade events when a WebSocket is initiated, and these are intercepted so the connection can be routed elsewhere.

This technique can be extended to route traffic according to any rules you like—if you can infer something from a request object, you can route traffic accordingly. The same idea can also be used to map traffic to multiple machines. This allows you to create a cluster of servers, which can help you scale up an application. The following listing could be used to proxy to several servers.

Listing 12.10. Scaling using multiple instances of a server

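The listing isn’t shown here, so as a sketch of the round-robin pattern it describes (the ports are illustrative, and could be replaced by other machines):

```javascript
var http = require('http');
var httpProxy = require('http-proxy');

// One set of options per backend instance.
var targets = [
  { target: 'http://localhost:3000' },
  { target: 'http://localhost:3001' },
  { target: 'http://localhost:3002' }
];

var proxies = targets.map(function(options) {
  return httpProxy.createProxyServer(options);
});

// Basic round-robin: move to the next proxy after each request.
var i = 0;
http.createServer(function(req, res) {
  proxies[i].web(req, res);
  i = (i + 1) % proxies.length;
}).listen(9000);
```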
This example uses an array that contains the options for each proxy server, and then creates a proxy server instance for each one. All you need to do then is create a standard HTTP server and map requests to each proxy. This example uses a basic round-robin implementation: after each request a counter is incremented, so the next request will be mapped to a different server. You could easily take this example and reconfigure it to map to any number of servers.

Mapping requests like this can be useful on a single server with multiple CPUs and CPU cores. If you run your application multiple times and set each instance to listen on a different port, then your operating system should schedule each Node process on a different core. This example uses localhost, but you could use other servers, thereby clustering the application across several machines.

In contrast to this technique’s use of additional servers for scaling, the next technique uses Node’s built-in features to manage multiple copies of the same Node program.

Technique 103 Scaling and resiliency with cluster

JavaScript programs are considered single-threaded. Whether they actually use a single thread or not is dependent on the platform, but conceptually they execute as a single thread. That means you may have to do additional work to scale your application to take advantage of multiple CPUs and cores.

This technique demonstrates the core module cluster, and shows how it relates to scalability, resiliency, and your Node applications.


Problem

You want to improve your application’s response time, or increase its resiliency.


Solution

Use the cluster module.


Discussion

In technique 102, we mentioned running multiple Node processes behind a proxy. In this technique we’ll explain how this works purely on the Node side. You can use the ideas in this technique with or without a proxy server to load balance. Either way, the goal is the same: to make better use of available processor resources.

Figure 12.8 shows a system with two CPUs with four cores each. A Node program is running on the system, but only fully utilizing a single core.

Figure 12.8. A Node process running on a single core

There are reasons why figure 12.8 isn’t entirely accurate. Depending on the operating system, the process might be moved around cores, and although it’s accurate to say a Node program is a single process, it still uses several threads. Let’s say you start up an Express application that uses a MySQL database, static file serving, user sessions, and so on. Even though it will run as a single process, it’ll still have eight separate threads.

We’re trained to think of Node programs as single-threaded because JavaScript platforms are conceptually single-threaded, but behind the scenes, Node’s libraries like libuv will use threads to provide asynchronous APIs. That gives us the event-based programming style without having to worry about the complexity of threads.

If you’re deploying Node applications and want to get more performance out of your multicore, multi-CPU system, then you need to start thinking more about how Node works at this level. If you’re running a single application on a multicore system, you want something like the illustration in figure 12.9.

Figure 12.9. Take advantage of more cores by running multiple processes.

Here we’re running a Node program on all but one core, the idea being that a core is left free for the system. You can get the number of cores for a system with the os core module. On our system, running require('os').cpus().length returns 4—that’s the number of cores we have, rather than CPUs. The cpus method returns an array of objects that describe each core:

[ { model: 'Intel(R) Core(TM) i7-4650U CPU @ 1.70GHz',
    speed: 1700,
    times: { user: 11299970, nice: 0, sys: 8459650, idle: 93736040, irq: 0 } },
  { model: 'Intel(R) Core(TM) i7-4650U CPU @ 1.70GHz',
    speed: 1700,
    times: { user: 5410120, nice: 0, sys: 2514770, idle: 105568320, irq: 0 } },
  { model: 'Intel(R) Core(TM) i7-4650U CPU @ 1.70GHz',
    speed: 1700,
    times: { user: 10825170, nice: 0, sys: 6760890, idle: 95907170, irq: 0 } },
  { model: 'Intel(R) Core(TM) i7-4650U CPU @ 1.70GHz',
    speed: 1700,
    times: { user: 5431950, nice: 0, sys: 2498340, idle: 105562910, irq: 0 } } ]

With this information, we can automatically tailor an application to scale to the target server. Next, we need a way of forking our application so it can run as multiple processes. Let’s say you have an Express web application: how do you safely scale it up without completely rewriting it? The main issue is communication: once you start running multiple instances of an application, how does it safely access shared resources like databases? There are platform-agnostic solutions to this that would require a big project rewrite—pub/sub servers, object brokers, distributed systems—but we’ll use Node’s cluster module.

The cluster module provides a way of running multiple worker processes that share access to underlying file handles and sockets. That means you can wrap a Node application with a master process that forks workers. Workers don’t need any special shared-state handling if you’re doing things like storing user sessions in a database; every worker has its own database connection, so you shouldn’t need to set up communication between workers for that.

Listing 12.11 is a basic example of using clustering with an Express application. We’ve just included the server.js file that loads the main Express application in app.js. This is our preferred method of structuring Node web applications—the part that sets up the server using .listen(port) is in a different file from the application itself. In this case, separating the server and application has the additional benefit of making it easier to add clustering to the project.

Listing 12.11. Clustering a Node web application

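The listing isn’t reproduced here, so as a sketch of the pattern it describes, assuming app.js exports an Express application:

```javascript
var cluster = require('cluster');
var os = require('os');

// Leave one core free for the system.
var workers = Math.max(os.cpus().length - 1, 1);

if (cluster.isMaster) {
  // The master process only forks workers; it doesn't serve requests.
  console.log('Running %d total workers', workers);
  for (var i = 0; i < workers; i++) {
    cluster.fork();
  }
} else {
  // Each worker runs this branch, loading the Express application
  // and listening on a shared socket.
  console.log('Worker PID:', process.pid);
  var app = require('./app');
  app.listen(3000);
}
```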
The basic pattern is to load the cluster core module, and then determine how many cores should be used. The cluster.isMaster property allows the code to branch if this is the first (or master) process, which then forks workers as needed with cluster.fork.

Each worker will rerun this code, so when a worker hits the else branch, it can run the code particular to workers. In this example workers start listening for HTTP connections, thereby starting the Express application.

There’s a full example that includes this code in this book’s code samples, which can be found in production/inky-cluster.

If you’re a Unix hacker, this should all look suspiciously familiar. The semantics of fork() are well known to C programmers. The way it works is whenever the system call fork() is used, the current process is cloned. Child processes have access to open files, network connections, and data structures in memory. To avoid performance issues, a system called copy on write is used. This allows the same memory locations to be used until a write is attempted, at which point each forked process receives a copy of the original. After the processes are forked, they’re isolated.

There’s an additional step to properly dealing with clustered applications: worker exit recovery. If one of your workers encounters an error and the process ends, then you’ll want to restart it. The cool thing about this is any other active workers can still serve requests, so clustering will not only improve request latency but also potentially uptime as well. The next listing is a modification of listing 12.11, to recover from workers exiting.

Listing 12.12. Recovering from untimely worker death

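As a sketch of that modification, again assuming app.js exports an Express application:

```javascript
var cluster = require('cluster');
var os = require('os');

var workers = Math.max(os.cpus().length - 1, 1);

if (cluster.isMaster) {
  for (var i = 0; i < workers; i++) {
    cluster.fork();
  }

  // The exit event fires when a worker dies; fork a replacement so
  // the full complement of workers is maintained.
  cluster.on('exit', function(worker) {
    console.log('Worker %d died', worker.id);
    cluster.fork();
  });
} else {
  require('./app').listen(3000);
}
```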
The cluster module is event-based, so the master can listen for events like exit, which denotes that a worker died. The callback for this event receives a worker object, so you can get a limited amount of information about the worker. After that, all you need to do is fork again, and you’ll be back to the full complement of workers.


Recovering from a crash in the master process

You might be wondering what happens when the master process itself dies. Even though the master should be kept simple to make this unlikely, a crash is still of course possible. To minimize downtime, you should still manage your clustered applications with a process manager like the forever module or Upstart. Both of these solutions are explored in technique 99.


You can run this example with an Express application, and then use kill to force workers to quit. The transcript of such a session should look something like this:

Running 3 total workers

    Worker PID: 58733

    Worker PID: 58732

    Worker PID: 58734

    Worker 1 died

    Worker PID: 58737

Three workers were running until kill 58734 was issued, and then a new worker was forked and 58737 started.

Once you’ve got clustering set up, there’s one more thing to do: benchmark. We’ll use ab, the Apache benchmarking tool. It’s used like this:

ab -n 10000 -c 100 http://localhost:3000/

This makes 10,000 requests with 100 concurrent requests at any one time. Using three workers on our system gave 260 requests per second, whereas a single process version resulted in 171 requests per second. The cluster was definitely faster, but is this really working as well as our round-robin example with HAProxy or nginx?

The advantage of the cluster module is that you can script it with Node. That means your developers can understand it without having to learn how HAProxy or nginx handles load balancing. Load balancing with an external proxy server also doesn’t offer the interprocess communication options that cluster has: with cluster you can use process.send and cluster.workers[id].on('message', fn) to communicate between the master and workers.

But proxies with dedicated load-balancing features have a wider choice of load-balancing algorithms. Like all things, it would be wise to invest time in testing HAProxy, nginx, and Node’s clustering module to see which works best for your application and your team.

Also, dedicated load-balancing servers can proxy requests to multiple servers—you could technically proxy from a central server to multiple Node application servers, each of which uses the cluster core module to take advantage of the server’s multicore CPU.

With heterogeneous setups like this, you’ll need to keep track of what instances of your application are doing. The next section is dedicated to maintaining production Node programs.

12.3. Maintenance

No matter how solid your server architecture is, you’re still going to have to maintain your production system. The techniques in this section are all about maintaining your Node program; first, package optimization with npm.

Technique 104 Package optimization

This technique is all about npm and how it can make deployments more efficient. If you feel like your module folder might be getting a bit large, then read on for some ideas on how to fix it.


Problem

Your application seems larger than expected when it’s released to production.


Solution

Try out some of npm’s maintenance features, like npm prune and npm shrinkwrap.


Discussion

Heroku makes your application’s size clear when you deploy: each release displays a slug size in megabytes, and the maximum size on Heroku is 300 MB. Slug size is closely related to dependencies, so as your application grows and new dependencies are added, you’ll notice that it can increase dramatically.

Even if you’re not using Heroku, you should be aware of your application’s size. It will impact how quickly you can release new code, and releasing new code should be as fast as possible. When deployment is fast, then releasing bug fixes and new features becomes less of a chore and less risky.

Once you’ve gone through your dependencies in package.json and weeded out any that aren’t necessary, there are some other tricks you can use to reduce your application’s size. The npm prune command removes packages that are no longer listed in your package.json, but it also applies to the dependencies themselves, so it can sometimes dramatically reduce your application’s storage footprint.

You should also consider using npm prune --production to remove devDependencies from production releases. We’ve found test frameworks in our production releases that didn’t need to be there. If you have ./node_modules checked into git, then Heroku will run npm prune for you, but it doesn’t currently run npm prune --production.


Why check in ./node_modules?

It might be tempting to add ./node_modules to .gitignore, but don’t! When you’re working on an application that will be deployed, then you should keep ./node_modules in your repository. This will help other people to run your application, and make it easier to reproduce your local setup that passes tests and everything else on a production environment.

Do not do this for modules you release through npm. Open source libraries should use npm to manage dependencies during installation.


Another command you can use to potentially improve deployment is npm shrinkwrap. This will create a file called npm-shrinkwrap.json that specifies the exact version of each of your dependencies, but it doesn’t stop there—it continues recursively to capture the version of each submodule as well. The npm-shrinkwrap.json file can be checked into your repository, and npm will use it during deployment to get the exact version of each package.

shrinkwrap is also useful for collaboration, because it means people can duplicate the modules you’ve had living on your computer during development. This helps when someone joins a project after you’ve been working solo for a few months.

Some PaaS providers have features for excluding files from deployment as well. For example, Heroku can accept a .slugignore file, which works like .gitignore—you could create one like this to ignore tests and local seed data:




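As a sketch, a .slugignore that skips tests and seed data might contain no more than two directory names (these names are illustrative):

```
test/
seed-data/
```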
By taking advantage of npm’s built-in features, you can create solid and maintainable packages, reduce deployment time, and improve deployment reliability.

Even with a well-configured, scalable, and carefully deployed application, you’ll still run into issues. When things go wrong, you need logs. Read on for techniques when dealing with log files and logging services.

Technique 105 Logging and logging services

When things break—not if, but when—you’ll need logs to uncover what happened. On a typical server, logs are text files. But what about PaaS providers, like Heroku and Nodejitsu? For these platforms you’ll need logging services.


Problem

You want to log messages from a Node application on your own server, or on a PaaS provider.


Solution

Either redirect logs to files and use logrotate, or use a third-party logging service.


Discussion

In Unix, everything is a file, and that partly dictates the way systems administrators and DevOps experts think about log files. Logs are just files: programs stream data into them, and we stream data out. This kind of setup is convenient for those of us that live in the command line—piping files through commands like grep, sed, and awk makes light work of even gigabyte-sized logs.

Therefore, whatever you do, you’ll want to correctly use console.log and console.error. It also doesn’t hurt to be aware of err.stack—instances of Error in Node get a stack property when they’re created, which can be extremely helpful for debugging problems in production. For more on writing logs, take a look at technique 6 in chapter 2.

The benefit of using console.error and console.log is that you can pipe output to different locations. The following command will redirect data from standard out (console.log) to application.log, and standard error (console.error) to errors.log:

npm start 1> application.log 2> errors.log

All you need to remember is the greater-than symbol redirects output, and using a number specifies the output stream: 1 is standard out, and 2 is standard error.

After a while, your log files will get too large. Fortunately, modern Unix systems usually come with a log rotation package. This will split files up over time and optionally compress them. The logrotate package can be installed in Debian or Ubuntu with apt-get install logrotate. Once you’ve installed it, you’ll need a configuration file for each set of log files you want to rotate. The following listing shows an example configuration that you can tailor for your application.

Listing 12.13. logrotate configuration

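The listing isn’t reproduced here; a configuration with the options discussed in this technique might look like this (the log path is illustrative):

```
/var/log/nodeapp/*.log {
    daily
    rotate 20
    compress
    copytruncate
}
```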
After listing the log files you want to rotate, you can list the options you want to use. logrotate has many options, and they’re documented in man logrotate. The first one here, daily, states that we want to rotate files every day. The next line makes logrotate keep 20 files; after that, older files will be removed. The third option makes sure old log files are compressed so they don’t use up too much space.

The fourth option, copytruncate, is more important for an application that uses simple standard I/O-based logging. It makes logrotate copy and then truncate the current log file. That means your application doesn’t need to close and reopen standard out—it should just work without any special configuration.

Using standard I/O and logrotate works well for a single server and a simple application, but if you’re running an application in a cluster, you might find it difficult to manage logging. There are Node modules that are dedicated to logging and provide cluster-specific options. Some people even prefer to use these modules because they generate output in standard log file formats.

Using the log4node module is similar to using console.log, but it has features that make it easier to use in a cluster. It creates one log file for all workers, and listens for a USR2 signal to determine when to re-open files. It supports configuration options, including log level and message prefix, so you can keep logs quiet during tests or increase the verbosity for critical production systems.

winston is a logging module that supports multiple transports, including Cassandra, which allows you to cluster your log writes. That means that if you have an application that writes millions of log entries an hour, then you can use multiple servers to capture the logs in a more reliable manner.

winston supports remote log services, including commercial ones like Papertrail. Papertrail and Loggly (see figure 12.10) are commercial services that you can pipe your logs to, typically using the syslogd protocol. They will also index logs, so searching gigabytes of logs is extremely fast, depending on the query.

Figure 12.10. Loggly’s dashboard

A service like Loggly is absolutely critical for Heroku. Heroku only stores the last 5,000 log entries, which can be flooded off within minutes of running a typical application. If you’ve deployed a Node application to Heroku that uses console.log, log4node, or winston, then you’ll be able to redirect your logs just by enabling the add-on.

With Heroku, Loggly can be configured by selecting a plan name and running heroku addons:add Loggly:PlanName from your project’s directory. Typing heroku addons:open loggly will open the Loggly web interface, but there’s also a link in Heroku’s administration panel under Resources. Any logging you’ve done with standard I/O should be sent straight to Loggly.

If you’re using winston, then there are transports available for Loggly. One is winston-loggly, which can be used for easy access to Loggly from non-Heroku services, or your own private servers.

Because winston transports can be changed by using winston.add(winston.transports.Loggly, options), you don’t need to do anything special to support Loggly if you’re already using winston.

There’s a standard for logging that you can use with your applications: The Syslog Protocol (RFC 5424). Syslog message packets have a standard format, so you won’t usually generate them by hand. Modules like winston typically support syslog, so you can use it with your Node application, but there are two main benefits to using it. The first is that messages have standardized log levels, so filtering logs is easier. Some examples include level 0, known as Emergency, and level 4, which is Warning. The second is that the protocol defines how messages are sent over the network, which means you can make your Node application talk to a syslog daemon that runs on a remote server.

Some log services like Loggly and Splunk can act as syslog servers; or, you could run your own daemon on dedicated hardware or a virtual machine. By using a standardized protocol like syslog, you can switch between log providers as your requirements change.

That’s the last technique on Node-specific production concerns. The next section outlines some additional issues relating to scaling and resiliency.

12.4. Further notes on scaling and resiliency

In this chapter we’ve demonstrated how to use proxies and the cluster module to scale Node programs. One of the advantages we cited in cluster’s favor is easier interprocess communication. If you’re running an application on separate servers, how can Node processes communicate?

One simple answer might be HTTP—you could build an internal REST API for communication. You could even use WebSockets if messages need faster responses. When we were faced with this problem, we used RabbitMQ. This allowed instances of our Node application to message each other using a shared message bus, thereby distributing work throughout a cluster.

The project was a search engine that used Node programs to download and scrape content. Work was classified into spidering, downloading, and scraping. Swarms of Node processes would take work from queues, and then push new jobs back to queues as well.

There are several implementations of RabbitMQ clients on npm—we used amqplib. There are also competitors to RabbitMQ: zeromq is a highly focused and simple alternative.

Another option is to use a hosted publish/subscribe service. One example of this is Pusher, which uses WebSockets to help scale applications. The advantage of this approach is that Pusher can message anything, including mobile clients. Rather than restricting messaging to your Node programs, you can create message channels that web, mobile, and even desktop clients can subscribe to.

Finally, if you’re using private servers, you’ll need to monitor resource usage. StrongLoop offers monitoring and clustering tools for Node, and New Relic also now has Node-specific features. New Relic can help you break down where time is being spent in a live application, so you can use it to discover bottlenecks in database access, view rendering, and application logic.

With service providers like Heroku, Nodejitsu, and Microsoft, and the tools provided by StrongLoop and New Relic, running Node software in production has rapidly matured and become entirely feasible.

12.5. Summary

In this chapter you’ve seen how to run Node on PaaS providers, including Heroku, Nodejitsu, and Windows Azure. You’ve also learned about the issues of running Node on private servers: safely accessing port 80 (technique 98), and how WebSockets relate to production requirements (technique 100).

No matter how fast your code is, if your application is popular, then you may run into performance issues. In our section on scaling, you’ve learned all about caching (technique 101), proxies (technique 102), and scaling with cluster (technique 103).

To keep your application running solidly, we’ve included maintenance-related techniques on npm in production (technique 104) and logging (technique 105). Now if anything goes wrong, you should have enough information to solve the problem.

Now you should know how to build Node web applications and release them in a maintainable and scalable state.