Bill Appleton - May 13, 2016

Our engineering team considered using Node.js to build the DreamFactory REST API backend. There are some great things about Node that we really like. Developers can write JavaScript on both the client and the server, and the Node package manager is great. But after a careful look, we decided that Node was not the best choice. Instead, we chose the Laravel framework, the V8 Engine, and PHP to write DreamFactory. This architecture offers some real advantages when it comes to building a REST API backend. Read on, and I think you will come to the same conclusion that we did.

Node is great for some things…

Node has an event-driven architecture capable of asynchronous input and output. This design optimizes throughput and scalability in Web applications with many input and output operations. Node has been used successfully for chat programs, browser games, and other applications that need real-time communications.

Node is single-threaded and has a single event loop. All of the JavaScript you write executes in this loop, and if that code performs a blocking operation, it blocks the entire loop and nothing else happens until it finishes. When you do something that takes time, like reading from a database, conducting an HTTP transaction, or using the file system, Node hands the work off with an asynchronous call and continues on immediately.
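To make the blocking behavior concrete, here is a minimal sketch. The function name measureTimerDelay is ours and purely illustrative. It schedules a zero-delay timer, then keeps the loop busy with synchronous work, and reports how late the timer actually fired.

```javascript
// Illustrative helper (hypothetical name): shows that synchronous work
// delays everything else waiting on the single event loop.
function measureTimerDelay(blockMs) {
  return new Promise((resolve) => {
    const scheduledAt = Date.now();

    // Ask for a callback "as soon as possible".
    setTimeout(() => resolve(Date.now() - scheduledAt), 0);

    // Busy-wait: this synchronous loop occupies the event loop,
    // so the timer callback cannot run until it finishes.
    const end = Date.now() + blockMs;
    while (Date.now() < end) {
      // intentionally empty
    }
  });
}

measureTimerDelay(200).then((delay) => {
  console.log(`0 ms timer actually fired after about ${delay} ms`);
});
```

The timer was requested with a delay of zero, yet it cannot fire until the busy loop ends, so the reported delay is at least as long as the blocking work.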

When one of these long-running operations is finished, Node receives a callback. The callbacks are delivered through the operating system’s non-blocking I/O facilities and, for work such as file system access, a small and efficient thread pool. In this manner, Node is very good at waiting for things to happen, and doing this with a bare minimum of resources. Other languages might have to spawn a thread or even start a new process in order to accomplish the same type of asynchronous operation.
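Here is a small sketch of that callback flow using the built-in fs module. The helper name readWithoutBlocking and the temporary file name are assumptions made for the example. The call to fs.readFile returns right away, and the callback runs later, once the read has completed.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Illustrative helper (hypothetical name): records the order in which
// things happen around an asynchronous file read.
function readWithoutBlocking(filePath) {
  const order = [];
  return new Promise((resolve, reject) => {
    fs.readFile(filePath, 'utf8', (err, text) => {
      if (err) {
        reject(err);
        return;
      }
      order.push('callback ran');
      resolve({ order, text });
    });
    // This line runs before the callback above: readFile did not block.
    order.push('readFile returned');
  });
}

// Demo with a throwaway file in the system temp directory.
const demoFile = path.join(os.tmpdir(), `event-loop-demo-${process.pid}.txt`);
fs.writeFileSync(demoFile, 'hello');

readWithoutBlocking(demoFile).then(({ order }) => {
  console.log(order.join(' -> ')); // readFile returned -> callback ran
  fs.unlinkSync(demoFile);
});
```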

One great application of Node is the new Lambda microservice from AWS. The purpose of Lambda is to simplify building smaller, on-demand applications that respond quickly to events and new information. Lambda spends most of the time waiting for an event to happen, after which your snippet of JavaScript is run as a microservice. Because of all the waiting involved, Node is a good choice for this type of product.

Another smart use of Node is for an MQTT message broker. This software is used for Internet of Things (IoT) applications where many devices need to communicate through a hub and spoke network. Each device on the network might publish certain messages and subscribe to others. As you can imagine, most of the time this network is just waiting for a message to be published, so Node also works well for an MQTT message broker.
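The publish and subscribe pattern itself can be sketched in a few lines with Node's built-in EventEmitter. To be clear, this is not MQTT and involves no networking. TinyBroker and the topic names are hypothetical and only illustrate how a hub routes messages to interested subscribers.

```javascript
const { EventEmitter } = require('events');

// A toy in-memory hub (hypothetical class, not an MQTT implementation).
class TinyBroker {
  constructor() {
    this.topics = new EventEmitter();
  }

  // Register a handler for every message published on a topic.
  subscribe(topic, handler) {
    this.topics.on(topic, handler);
  }

  // Deliver a message to all subscribers of the topic.
  // Returns true if at least one subscriber received it.
  publish(topic, message) {
    return this.topics.emit(topic, message);
  }
}

const broker = new TinyBroker();
const received = [];

broker.subscribe('sensors/temperature', (msg) => received.push(msg));

broker.publish('sensors/temperature', { deviceId: 'thermostat-1', celsius: 21.5 });
broker.publish('sensors/humidity', { deviceId: 'hygrometer-1', percent: 40 }); // no subscribers

console.log(received.length); // 1
```

Between messages the broker does nothing at all, which is the kind of mostly idle workload where an event loop shines.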

…but not everything

If Node is so nifty, then why don’t more people use it to build ordinary websites? The answer is that websites follow a particular data access pattern. A request comes in, there are some database transactions, and a response goes out. The advantages of Node start to fade away when there is a database transaction for every request and response. The problem is that the database driver needs to run in a separate process so that many clients can use it at the same time. And so if you have to create a new process anyway, the fact that Node can efficiently wait for an asynchronous database transaction to finish doesn’t help conserve resources or improve speed very much.

node.png

As it turns out, a REST API platform follows the same basic data access pattern that a simple website does. A request comes in, there is a database transaction, and the response goes out. On a website the transaction happens with HTML pages, and on a REST API backend the transaction happens with JSON documents, but the workflow is the same. So the advantages of Node are diminished when you are trying to build a REST API backend. In short, a REST API backend should be working all the time, not waiting for something to happen.
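The request, database transaction, and response cycle can be sketched as a single handler. Everything here is an assumption for illustration: the in-memory contacts map stands in for a real database, and findContact and handleGetContact are hypothetical names rather than DreamFactory code.

```javascript
// Stand-in for a database table (hypothetical data).
const contacts = new Map([
  [1, { id: 1, name: 'Ada Lovelace' }],
  [2, { id: 2, name: 'Alan Turing' }],
]);

// Stand-in for a database query; a real driver would do network I/O here.
async function findContact(id) {
  return contacts.get(id) || null;
}

// One request in, one database lookup, one JSON response out.
async function handleGetContact(request) {
  const id = Number(request.params.id);
  const record = await findContact(id);

  if (record === null) {
    return { status: 404, body: JSON.stringify({ error: 'Not found' }) };
  }
  return { status: 200, body: JSON.stringify(record) };
}

handleGetContact({ params: { id: '1' } }).then((response) => {
  console.log(response.status, response.body);
});
```

Every request passes through the database step, so the time spent per request is dominated by that transaction rather than by idle waiting.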

There is another problem with Node. If there is lots of heavy lifting to be done, either before or after the database transaction, then all of that work must be completed by a single thread running JavaScript. That might be fine for some applications, but if massive scalability is the goal, then this can become a bottleneck. Two of the main use cases for REST API services are widely distributed mobile applications and IoT deployments. Both of these scenarios could overwhelm a single CPU running Node at scale.
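A short sketch shows how CPU-bound work serializes on one thread. The names cpuBoundRequest and runTwoRequests are hypothetical, and the arithmetic loop simply stands in for transforming or formatting a payload. Both "requests" are started together, yet the second cannot begin until the first has finished.

```javascript
// Simulates a request whose handling is pure computation (hypothetical helper).
function cpuBoundRequest(name, iterations, log) {
  return new Promise((resolve) => {
    // Queue the work on the event loop, as an incoming request would be.
    setImmediate(() => {
      log.push(`${name} start`);
      let checksum = 0;
      for (let i = 0; i < iterations; i++) {
        checksum = (checksum + i * i) % 1000003;
      }
      log.push(`${name} end`);
      resolve(checksum);
    });
  });
}

// Start two requests "at the same time" and record what actually happens.
function runTwoRequests(iterations) {
  const log = [];
  return Promise.all([
    cpuBoundRequest('A', iterations, log),
    cpuBoundRequest('B', iterations, log),
  ]).then(() => log);
}

runTwoRequests(5000000).then((log) => {
  console.log(log.join(', ')); // A start, A end, B start, B end
});
```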

A better way

So what is the best language for building a simple request and response website? The world’s most popular content management systems include WordPress and Drupal, and both are written in PHP. In fact, by commonly cited survey figures, PHP powers around 80% of the websites whose server-side language is known, and it has been used at companies such as Facebook and Yahoo. There is a long history of solving scalability problems with LAMP stacks running PHP. Web servers, load balancers, operating systems, databases, and the PHP language have been optimized for this basic request and response model.

Some people might think that PHP is old hat. But take a look at the cutting-edge Laravel framework and the new performance enhancements in PHP 7.0. We have written elsewhere about the advantages of using modular Laravel components for the DreamFactory platform. The popularity of PHP is another advantage, because third-party database drivers are usually up to date and more widely tested than those for other languages. This was a key need for our REST API platform.

The diagram below shows the architecture for the DreamFactory backend. A process is assigned to the incoming request, the database or external service is called synchronously, and the response goes out on the same thread. If there is a lot of work to be done customizing, integrating, transforming, or formatting the JSON request or response, then this can happen in parallel across requests without the single-threaded bottleneck. You need a separate process, but that was going to happen anyway because of the database transaction. For enhanced scalability we recommend running on NGINX to reduce the overhead of handling multiple processes.

lamp.png

Node runs on top of Google’s V8 Engine. So we also incorporated V8, but rather than running it in a single process like Node, we run it in parallel for server-side scripting and customization. Compare the blue boxes in the two architecture diagrams above to see the difference. We also support PHP, Python and Node itself for scripting and customization, if you prefer.
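As a rough analogy for running script work in parallel rather than on one loop, here is a sketch using Node's built-in worker_threads module. This is not how DreamFactory is implemented, since the platform relies on PHP processes behind the web server. The runInWorker helper and the checksum workload are assumptions chosen only to show independent units of work executing side by side.

```javascript
const { Worker } = require('worker_threads');

// Source code for the worker, passed as a string for a self-contained demo.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  let checksum = 0;
  for (let i = 0; i < workerData.iterations; i++) {
    checksum = (checksum + i * i) % 1000003;
  }
  parentPort.postMessage({ name: workerData.name, checksum });
`;

// Run one unit of CPU-bound work in its own thread (hypothetical helper).
function runInWorker(name, iterations) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true, // treat the first argument as code, not a file path
      workerData: { name, iterations },
    });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

// Both jobs can make progress at the same time on a multi-core machine.
Promise.all([runInWorker('A', 5000000), runInWorker('B', 5000000)]).then((results) => {
  console.log(results.map((r) => r.name).join(', ')); // A, B
});
```

Each worker has its own V8 instance and event loop, so heavy computation in one does not stall the other.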

There are many debates on the Internet about how to scale Node applications. Vertical scaling seems possible, but horizontal scaling appears to be more difficult. In our case, we wanted to enable any system administrator to scale DreamFactory using techniques and technologies that they were already familiar with. Because the platform is configured as a LAMP stack, DreamFactory can be scaled vertically with additional processors or horizontally with a load balancer just like any website.
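For readers who have not set one up, the core idea behind a load balancer can be sketched as simple round-robin selection. In practice you would use NGINX or another off-the-shelf balancer rather than code like this. createRoundRobin and the server names are hypothetical.

```javascript
// Returns a function that hands out backends in rotation (hypothetical helper).
function createRoundRobin(backends) {
  if (backends.length === 0) {
    throw new Error('At least one backend is required');
  }
  let next = 0;
  return function pick() {
    const backend = backends[next];
    next = (next + 1) % backends.length;
    return backend;
  };
}

// Three identical application servers behind the balancer (made-up names).
const pickBackend = createRoundRobin(['app-1.internal', 'app-2.internal', 'app-3.internal']);

for (let i = 0; i < 4; i++) {
  console.log(pickBackend());
}
// app-1.internal, app-2.internal, app-3.internal, app-1.internal
```

Because each server handles a request independently of the others, adding capacity means adding another entry to the list.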

I have written this blog post to explore some of the technology and design decisions that we made along the way. Since DreamFactory is a REST API backend, the platform could always be rewritten in another language if there were a compelling reason to do so. But as things stand we are happy with the current architecture and scalability characteristics. Reach out and let me know what you think about it.