Ben Busse - January 18, 2016

One of the main design goals for DreamFactory 2.0 was to increase speed and scalability. The entire platform was rewritten in the Laravel framework, and we adopted JSON Web Tokens (JWT) for better security and completely stateless operation. This post presents benchmarking results intended to help enterprise customers scale their DreamFactory installation to any desired level of performance. The sections below cover vertical and horizontal scalability, and then look at the effect of increasing the number of concurrent users.

In this post, we’ll cover:

Vertical Scalability
Horizontal Scalability
Concurrent Users
Profile Results
Scalability Use Cases
Built For Speed 

Vertical Scalability

Below are some results that show the vertical scalability of a single DreamFactory instance, measured with Apache Bench. Two different Amazon Web Services EC2 servers were tested: an m4.2xlarge with 8 virtual processors and an m4.4xlarge with 16 virtual processors.

For this test, we ran 1,000 GET operations against the DreamFactory REST API, with 100 concurrent users making the requests. Each operation searched, sorted, and retrieved 1,000 records from a SQL database. This test was designed to exercise the server-side processing required for a complex REST API call.
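As a rough illustration of this setup, the sketch below builds an Apache Bench command line for a run of this shape. The `-n`, `-c`, and `-H` flags are standard `ab` options. The host name, table name, query parameters, and API key are placeholders assumed for the example, not the values used in our tests.

```python
import shlex


def build_ab_command(url, total_requests, concurrency, api_key):
    """Return the argument list for one Apache Bench (ab) run."""
    return [
        "ab",
        "-n", str(total_requests),  # total number of requests to send
        "-c", str(concurrency),     # number of requests kept in flight at once
        "-H", f"X-DreamFactory-Api-Key: {api_key}",  # extra request header
        url,
    ]


# Hypothetical endpoint and key, for illustration only.
TABLE_URL = "https://df.example.com/api/v2/db/_table/contact"
API_KEY = "your-api-key"

# Complex GET scenario: 1,000 requests, 100 at a time, 1,000 sorted records each.
complex_get_cmd = build_ab_command(
    TABLE_URL + "?limit=1000&order=last_name", 1000, 100, API_KEY
)

# Print a copy-and-paste friendly version of the command.
print(shlex.join(complex_get_cmd))
```

Running the printed command from a machine outside the server under test keeps the load generator from competing with DreamFactory for CPU.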

[Figure: complex GET results for the m4.2xlarge and m4.4xlarge servers]

Looking at the two servers, we see a nice doubling of capacity that matches the extra processors and memory. This really shows the vertical scalability of the system. The complex GET scenario highlights the advantages of the additional processor power. 

Next, we tried a similar test with a simple GET: 5,000 requests, each returning a single database record. Again, there were 100 concurrent users making the requests. In this situation, the fixed costs of Internet bandwidth, network switching, and file storage start to take over, and the additional processors contribute less.
[Figure: simple GET results for the m4.2xlarge and m4.4xlarge servers]

Look at these results for 5000 simple GETs from the API. As you can see, performance does not fully double with additional processors. This demonstrates the diminishing returns of adding processors without scaling up other fixed assets. By the way, we also looked at POST and DELETE transactions. The results were pretty much what you would expect and in line with the GET requests tested above.
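One way to put a number on "does not fully double" is to divide the measured speedup by the ideal speedup implied by the processor count. The sketch below does that. The throughput figures in it are made-up placeholders chosen to show the calculation, not our measured results.

```python
def scaling_efficiency(base_rps, scaled_rps, base_cpus, scaled_cpus):
    """Fraction of the ideal (linear) speedup that was actually achieved.

    1.0 means throughput grew in step with processor count;
    lower values mean diminishing returns.
    """
    measured_speedup = scaled_rps / base_rps
    ideal_speedup = scaled_cpus / base_cpus
    return measured_speedup / ideal_speedup


# Hypothetical requests-per-second values, for illustration only.
complex_eff = scaling_efficiency(50.0, 100.0, base_cpus=8, scaled_cpus=16)
simple_eff = scaling_efficiency(400.0, 600.0, base_cpus=8, scaled_cpus=16)

print(f"complex GET efficiency: {complex_eff:.2f}")
print(f"simple GET efficiency:  {simple_eff:.2f}")
```

An efficiency well below 1.0 on the simple GET is a hint that something other than CPU, such as network or storage, is the limiting resource.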

Horizontal Scalability

Below are some results that show the horizontal scalability of DreamFactory across multiple instances, measured with Apache Bench. Four m4.xlarge Amazon Web Services EC2 instances were configured behind a load balancer. The servers were configured with a common default database and EBS storage.

First, we tested the complex GET scenario. The load-balanced m4.xlarge servers ran at about the same speed as the m4.4xlarge server tested earlier. This makes sense because the two setups had similar total CPU and memory: four m4.xlarge instances with 4 virtual processors each add up to the 16 on a single m4.4xlarge. Since this example was bound by processing requirements, there was not much advantage to horizontal scaling.

Next, we tested the simple GET scenario. In this case there appears to be some advantage to horizontal scaling. This is probably due to improvements in network IO and the relaxation of other fixed constraints compared to the vertical scalability test.

Concurrent Users

We also evaluated the effects of concurrent users simultaneously calling REST API services on the platform. This test used the complex GET scenario where 1000 records were searched, sorted, and retrieved. The test was conducted with three different Amazon Web Services EC2 instances. The servers were m4.xlarge, m4.2xlarge, and m4.4xlarge. We started with 20 concurrent users and scaled up to 240 simultaneous requests.

The minimum time for the first requests to finish was always around 300 milliseconds. This is because some requests are executed immediately within the minimum time while others must wait to be executed.

The maximum time for the last request to finish will usually increase with the total number of concurrent users. Based on the processor size, the maximum time for the last request can increase sharply past some critical threshold. This is illustrated by the 8 processor example, where maximum request times spike past 160 concurrent users.
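A toy queueing model helps explain why the minimum stays flat while the maximum jumps past a threshold. It assumes every request arrives at the same instant, takes a fixed service time, and that the server can work on a fixed number of requests at once. The capacity of 160 and the 300 ms service time are assumptions picked to echo the 8 processor example, not measured properties of DreamFactory.

```python
def finish_times(concurrent_requests, capacity, service_ms):
    """Completion time (ms) of each request when all arrive at t=0.

    Requests are served in waves of `capacity`; each wave takes `service_ms`.
    """
    return [
        ((i + capacity - 1) // capacity) * service_ms  # wave number * service time
        for i in range(1, concurrent_requests + 1)
    ]


def min_max(concurrent_requests, capacity=160, service_ms=300):
    """Return (fastest, slowest) completion time for one burst of requests."""
    times = finish_times(concurrent_requests, capacity, service_ms)
    return times[0], times[-1]


for users in (20, 160, 240):
    fastest, slowest = min_max(users)
    print(f"{users:>3} users: min {fastest} ms, max {slowest} ms")
```

In this model the first wave always finishes in one service time, so the minimum never moves, while the maximum steps up as soon as the burst no longer fits in a single wave. Real servers degrade less cleanly, but the shape is similar.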

The 16 processor server never experienced any degradation of performance all the way to 240 concurrent users. This is the maximum number of concurrent users supported by the Apache Bench test program. Even then, the worst round trip delay was less than ½ second.

A typical mobile application only makes a handful of REST API calls every minute. A situation where hundreds of users are simultaneously calling the backend would be quite unusual, so these benchmarks represent a worst-case scenario. The 16 processor server tested here could probably support 100,000 registered users, where 10,000 would be using the app at any given time, and 1,000 would be making REST API calls during any given time frame.
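The estimate above is a simple funnel, and it can be reworked for other assumptions. The sketch below reproduces the arithmetic, with the one-in-ten ratios taken from the figures in the paragraph. The function and parameter names are ours, and the ratios are rules of thumb that will differ from one application to the next.

```python
def usage_funnel(registered_users, active_one_in=10, calling_one_in=10):
    """Estimate active users and users making API calls at a given moment.

    `active_one_in=10` means one registered user in ten is using the app;
    `calling_one_in=10` means one active user in ten is calling the API.
    """
    active = registered_users // active_one_in
    calling = active // calling_one_in
    return active, calling


active_users, calling_users = usage_funnel(100_000)
print(f"active: {active_users:,}  calling the API: {calling_users:,}")
```

Plugging in your own ratios gives a first guess at the concurrency level your application needs to be benchmarked at.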

Profile Results

We have also profiled the code base extensively, using the Blackfire profiling tools among others. Check these out if you get a chance. The profile shows time spread smoothly across all of the different classes and code in the stack, and extensive analysis has not turned up any bottlenecks in the platform.

Scalability Use Cases

Adding multiple processors is a very easy way to implement vertical scalability. As our results show, doubling the number of processors will just about double performance, except for installations that are network I/O or storage bound.

Adding new servers behind a load balancer is a surefire way to implement horizontal scalability. Because DreamFactory is completely stateless, you can add as many instances as needed and balance them any way you want. Just be sure that they all use the same default database and cache. Other than that, DreamFactory can be scaled just like any other simple LAMP stack.
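To see why signed tokens make this kind of scaling straightforward, consider the minimal sketch below. It signs and verifies an HS256 JSON Web Token using only the Python standard library. It is a generic illustration of the JWT idea, not DreamFactory's implementation, and the secret and claims are hypothetical. Any instance that holds the same secret can validate a token on its own, with no session lookup and no stickiness at the load balancer.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(claims: dict, secret: bytes) -> str:
    """Build a compact HS256 token: header.payload.signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    mac = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256)
    return signing_input + "." + _b64url(mac.digest())


def verify(token: str, secret: bytes) -> dict:
    """Return the claims if the signature matches, else raise ValueError.

    A production verifier would also check expiry and the header's algorithm.
    """
    signing_input, _, signature = token.rpartition(".")
    mac = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256)
    if not hmac.compare_digest(mac.digest(), _b64url_decode(signature)):
        raise ValueError("bad signature")
    return json.loads(_b64url_decode(signing_input.split(".")[1]))


# Hypothetical secret shared by every instance behind the load balancer.
SHARED_SECRET = b"hypothetical-shared-secret"
token = sign({"sub": "user-42"}, SHARED_SECRET)

# Two different "instances" validate the same token independently.
claims_on_instance_a = verify(token, SHARED_SECRET)
claims_on_instance_b = verify(token, SHARED_SECRET)
print(claims_on_instance_a == claims_on_instance_b)
```

Because the token carries its own proof of authenticity, the only state the instances need to share is the secret, plus the common database and cache mentioned above.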

DreamFactory 2.0 has also been designed to run with or without shared storage. This capability allows DreamFactory to operate well on Platform as a Service (PaaS) systems like Heroku or Bluemix, where persistent storage is not available. Scalability becomes even easier to calculate on PaaS, because you can simply specify the number of instances needed.

DreamFactory 2.0 can also be run in a Docker container managed by Kubernetes, Swarm, Fleet, or Mesos. This is a real game changer. We will discuss DreamFactory on Docker and the use of Microservices in more detail in an upcoming blog post. Meanwhile, there is more detailed information about DreamFactory scalability available.

Built For Speed 

We put a lot of work into making DreamFactory 2.0 highly scalable. The use of JSON Web Tokens and the complete rewrite in Laravel contributed greatly to the stateless and highly efficient architecture of the platform. The routing engine, eventing system, and caching options in Laravel are truly state of the art and were directly leveraged by the architecture.

In my tests and benchmarking, I never saw strange, unexplained delays or performance characteristics that failed to scale as expected. Want to double performance? Just double processors, servers, instances, or containers. You will need to get a sense of the load imposed by your application. But once you do, DreamFactory 2.0 will scale to any desired level of performance.