While NGINX is much younger than other web servers, it has quickly become a popular choice, largely because it appeals to those looking for a lightweight, high-performance web server.
In today's article, we'll be taking an out-of-the-box instance of NGINX and tuning it to get more out of an already high-performance web server. While not a complete tuning guide, this article should provide readers with a solid understanding of tuning fundamentals and a few common NGINX tuning parameters.
Before we get into tuning however, let's first install NGINX.
Installing NGINX
For this article, we will be running NGINX on an Ubuntu Linux-based server, so we can perform the installation with the apt-get command.
root@nginx-test:~# apt-get install nginx
This step will install a generic installation of NGINX, which already has some tuning parameters set out of the box. The default installation of NGINX, however, doesn't offer much in the way of content to serve. In order to give ourselves a realistic web application to tune, let's go ahead and deploy a sample site from GitHub.
root@nginx-test:~# git clone https://github.com/BlackrockDigital/startbootstrap-clean-blog.git /var/www/html
Cloning into '/var/www/html'...
remote: Counting objects: 308, done.
remote: Total 308 (delta 0), reused 0 (delta 0), pack-reused 308
Receiving objects: 100% (308/308), 1.98 MiB | 0 bytes/s, done.
Resolving deltas: 100% (119/119), done.
Checking connectivity... done.
When performance tuning, it's important to understand the type of application that's being tuned. In the case of NGINX, it's important to know if you're tuning for static content or dynamic content served by a downstream application. The difference between these two types of content can alter what tuning parameters to change, as well as the values for those parameters.
In this article, we'll be tuning NGINX to serve static HTML content. While most of the parameters will apply to NGINX in general, not all of them will. It's best to use this article as a guide for your own tuning and testing.
Now that our basic instance is installed and a sample site deployed, let's see how well an out-of-the-box installation of NGINX performs.
Establishing a Baseline
One of the first steps in performance tuning anything is to establish a unit of measurement. For this article, we will be using the HTTP load-testing tool ApacheBench, otherwise known as ab, to generate test traffic to our NGINX system.
This load-testing tool is very simple and well suited to web applications. ApacheBench provides quite a few options for different load-testing scenarios; however, for this article, we'll keep our testing pretty simple.
We will be executing the ab command with the -c (concurrency level) and -n (number of requests) parameters set.
$ ab -c 40 -n 50000 http://159.203.93.149/
When we execute ab, we'll be setting the concurrency level (-c) to 40, meaning ab will maintain at least 40 concurrent HTTP sessions to our target NGINX instance. We will also be setting a limit on the number of requests to make with the -n parameter. Essentially, these two options together will cause ab to open 40 concurrent HTTP sessions and send as many requests as possible until it reaches 50000 requests.
Let's go ahead and execute a test run to establish a baseline and identify which metric we will use for our testing today.
# ab -c 40 -n 50000 http://159.203.93.149/
This is ApacheBench, Version 2.3 <$Revision: 1528965 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 159.203.93.149 (be patient)
Completed 5000 requests
Completed 10000 requests
Completed 15000 requests
Completed 20000 requests
Completed 25000 requests
Completed 30000 requests
Completed 35000 requests
Completed 40000 requests
Completed 45000 requests
Completed 50000 requests
Finished 50000 requests

Server Software:        nginx/1.10.0
Server Hostname:        159.203.93.149
Server Port:            80

Document Path:          /
Document Length:        8089 bytes

Concurrency Level:      40
Time taken for tests:   16.904 seconds
Complete requests:      50000
Failed requests:        0
Total transferred:      420250000 bytes
HTML transferred:       404450000 bytes
Requests per second:    2957.93 [#/sec] (mean)
Time per request:       13.523 [ms] (mean)
Time per request:       0.338 [ms] (mean, across all concurrent requests)
Transfer rate:          24278.70 [Kbytes/sec] received
In the above output, there are several interesting metrics. Today we will be focusing on the Requests per second metric. This metric shows the average number of requests our NGINX instance can serve in a second. As we adjust parameters, we should see this metric go up or down.
Requests per second: 2957.93 [#/sec] (mean)
From the above, we can see that the mean requests per second was 2957.93. This might seem like a lot, but we will increase this number by quite a bit as we continue.
When performance tuning, it's important to remember to make small incremental changes and compare the results with the baseline. For this article, 2957.93 requests per second is our baseline measurement. For a parameter change to be considered successful, it must result in an increase in requests per second.
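Those baseline comparisons can also be scripted rather than read off by eye. The sketch below pulls the Requests per second figure out of ab's output with awk and compares it against the baseline; the sample output string is hard-coded here for illustration, and in practice you would capture real ab output instead.

```shell
# Baseline from our first test run.
BASELINE=2957.93

# Hard-coded sample of one line of ab output, for illustration only;
# normally this would be: AB_OUTPUT=$(ab -c 40 -n 50000 http://server/)
AB_OUTPUT='Requests per second:    3051.40 [#/sec] (mean)'

# Field 4 of the "Requests per second" line is the mean value.
RPS=$(printf '%s\n' "$AB_OUTPUT" | awk '/Requests per second/ {print $4}')

# Shell arithmetic is integer-only, so use awk for the float comparison.
if awk -v r="$RPS" -v b="$BASELINE" 'BEGIN {exit !(r > b)}'; then
    echo "improvement: $RPS > $BASELINE"
else
    echo "regression: $RPS <= $BASELINE"
fi
```

Running the same filter after every parameter change makes it obvious at a glance whether the change helped or hurt.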
With our baseline metrics set, let's go ahead and start tuning NGINX.
Worker Processes
One of the most basic tuning parameters in NGINX is the number of worker processes available. By default, the value of this parameter is auto, which tells NGINX to create one worker process for each CPU available to the system.
For most systems, one worker process per CPU is an even balance of performance and reduced overhead. In this article, however, we are trying to get the most out of NGINX serving static content, which should incur fairly low CPU overhead per request. Let's go ahead and see how many requests per second we can gain by increasing this value.
For our first test, let's go ahead and start two worker processes for each CPU on the system.
In order to figure out how many worker processes we need, we first need to know how many CPUs are available to this system. While there are many ways to do this, in this example we will use the lshw command to show hardware information.
root@nginx-test:~# lshw -short -class cpu
H/W path      Device     Class          Description
============================================
/0/401                   processor      Intel(R) Xeon(R) CPU E5-2650L v3 @ 1.80GHz
/0/402                   processor      Intel(R) Xeon(R) CPU E5-2650L v3 @ 1.80GHz
From the output above, it appears our system has 2 CPUs. This means that for our first test, we will need to set NGINX to start a total of 4 worker processes.
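The same arithmetic can be scripted so it follows the hardware rather than being hard-coded. This sketch assumes a Linux system with the coreutils nproc command available:

```shell
# Sketch: compute a worker_processes value of two workers per CPU.
CPUS=$(nproc)             # number of CPUs available to the system
WORKERS=$(( CPUS * 2 ))   # two worker processes per CPU, as in our first test
echo "worker_processes $WORKERS;"
```

On our 2-CPU test system this prints `worker_processes 4;`, matching the value we are about to set by hand.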
We can do this by editing the worker_processes parameter within the /etc/nginx/nginx.conf file. This is the default NGINX configuration file and the location of all the parameters we will be adjusting today.
worker_processes auto;
The above shows that this parameter is set to the default value of auto. Let's go ahead and change this to a value of 4.
worker_processes 4;
After setting the new value and saving the /etc/nginx/nginx.conf file, we will need to restart NGINX in order for the configuration change to take effect.
root@nginx-test:~# service nginx restart
root@nginx-test:~# ps -elf | grep nginx
1 S root     23465     1  0  80   0 - 31264 sigsus 20:16 ?        00:00:00 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
5 S www-data 23466 23465  0  80   0 - 31354 ep_pol 20:16 ?        00:00:00 nginx: worker process
5 S www-data 23467 23465  0  80   0 - 31354 ep_pol 20:16 ?        00:00:00 nginx: worker process
5 S www-data 23468 23465  0  80   0 - 31354 ep_pol 20:16 ?        00:00:00 nginx: worker process
5 S www-data 23469 23465  0  80   0 - 31354 ep_pol 20:16 ?        00:00:00 nginx: worker process
0 S root     23471 23289  0  80   0 -  3628 pipe_w 20:16 pts/0    00:00:00 grep --color=auto nginx
root@nginx-test:~#
We can see from the above that there are now 4 running processes with the name nginx: worker process. This indicates that our change was successful.
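Rather than eyeballing the ps output, the worker count can also be checked with pgrep. This is a small sketch; on a host where NGINX is not running it simply prints 0.

```shell
# Count processes whose full command line matches "nginx: worker".
# pgrep exits non-zero when nothing matches, so `|| true` keeps the
# command from aborting scripts that run under `set -e`.
COUNT=$(pgrep -c -f "nginx: worker" || true)
echo "nginx worker processes: ${COUNT:-0}"
```

After the restart above, this count should equal the worker_processes value in nginx.conf.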
Checking the effect
With our additional workers started, let's run ab again to see if there has been any change in throughput.
# ab -c 40 -n 50000 http://159.203.93.149/ | grep "per second"
Requests per second:    3051.40 [#/sec] (mean)
It seems that our change has had very little effect: our original Requests per second was 2957.93, and our new value is 3051.40. The difference here is roughly 100 more requests per second. While this is an improvement, it's not the level of improvement we were looking for.
Let's go ahead and change the worker_processes value to 8, four times the number of CPUs available.

worker_processes 8;

In order for this change to take effect, we will once again need to restart the NGINX service.
root@nginx-test:~# service nginx restart
With the service restarted, we can go ahead and rerun our ab test.
# ab -c 40 -n 50000 http://159.203.93.149/ | grep "per second"
Requests per second:    5204.32 [#/sec] (mean)
It seems that 8 worker processes have a much more significant effect than 4. Compared with our baseline metrics, we can see that with 8 worker processes we are able to serve roughly 2250 more requests per second.
Overall this seems like a significant improvement from our baseline. The question is how much more improvement would we see if we increased the number of worker threads further?
Remember, it's best to make small incremental changes and measure performance at each step of the way. For this parameter, I would simply keep doubling its value and rerun a test after each change, repeating until the requests per second value no longer increases. For this article, however, we will go ahead and move on to the next parameter, leaving the worker_processes value set to 8.
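That doubling procedure can be scripted so each iteration is repeatable. The sketch below operates on a temporary scratch copy of a configuration file; against the real /etc/nginx/nginx.conf you would follow the same sed call with a service restart and an ab run. GNU sed's in-place syntax is assumed (on BSD/macOS sed, -i requires an explicit suffix argument).

```shell
# Work on a scratch copy of an nginx.conf-style file, for illustration only.
CONF=$(mktemp)
echo "worker_processes 4;" > "$CONF"

# Double the worker count for the next test iteration (GNU sed -i syntax).
sed -i 's/^worker_processes .*/worker_processes 8;/' "$CONF"
cat "$CONF"
```

In a real test loop, each iteration would end with `service nginx restart` followed by another ab measurement.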
Worker Connections
The next parameter we are going to tune is the worker_connections configuration within NGINX. This value defines the maximum number of TCP sessions per worker. By increasing this value, the hope is that we can increase the capacity of each worker process.
The worker_connections setting can be found within the events block in the /etc/nginx/nginx.conf configuration file.
events {
        worker_connections 768;
        # multi_accept on;
}
The default setting for Ubuntu's installation of NGINX is 768. For this first test, we will try changing this setting to 1024 and measure the impact of that change.
events {
        worker_connections 1024;
        # multi_accept on;
}
Like the previous configuration change, in order for this adjustment to take effect we must restart the NGINX service.
root@nginx-test:~# service nginx restart
With NGINX restarted, we can run another test with the ab command.
# ab -c 40 -n 50000 http://159.203.93.149/ | grep "per second"
Requests per second:    6068.41 [#/sec] (mean)
Once again, our parameter change has resulted in a significant increase in performance. With just a small change in worker_connections, we were able to increase our throughput by roughly 800 requests per second.
Increasing worker connections further
If a small change in worker_connections can add roughly 800 requests per second, what effect would a much larger change have? The only way to find out is to make the parameter change and test again.
Let's go ahead and change the worker_connections value to 4096.
worker_rlimit_nofile 4096;

events {
        worker_connections 4096;
        # multi_accept on;
}
We can see the worker_connections value is 4096, but there is also another parameter whose value is 4096. The worker_rlimit_nofile parameter is used to define the maximum number of open files per worker process. This parameter is now specified because, when adjusting the number of connections per worker, we must also adjust the open file limit.
With NGINX, every open connection equates to at least one, and sometimes two, open files. By setting the maximum number of connections to 4096, we are essentially allowing every worker to open up to 4096 files. Without setting worker_rlimit_nofile to at least the same value as worker_connections, we may actually decrease performance, because each worker's attempts to open new files would be rejected by the default open file limit of 1024.
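A quick way to see whether this limit matters on a given host is to compare the shell's open-file limit against the target connection count. This is a rough sketch only, since NGINX workers may run under different limits than your interactive shell; the 4096 figure is the worker_connections value from our example.

```shell
DESIRED=4096            # the worker_connections value we want to support
LIMIT=$(ulimit -n)      # soft open-file limit for the current shell

# Some systems report "unlimited"; treat that as large enough.
[ "$LIMIT" = "unlimited" ] && LIMIT=$DESIRED

if [ "$LIMIT" -lt "$DESIRED" ]; then
    echo "open-file limit ($LIMIT) is below $DESIRED; set worker_rlimit_nofile"
else
    echo "open-file limit ($LIMIT) covers $DESIRED connections"
fi
```

If the limit falls short, worker_rlimit_nofile raises it for the worker processes without touching system-wide limits.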
With these settings applied, let's go ahead and rerun our test to see how our changes affect NGINX.
# ab -c 40 -n 50000 http://159.203.93.149/ | grep "per second"
Requests per second:    6350.27 [#/sec] (mean)
From the results of the ab test run, it seems we were able to add about 300 requests per second. While this may not be as significant a change as our earlier 800 requests per second, it is still an improvement in throughput. As such, we will leave this parameter as is and move on to our next item.
Tuning for Our Workload
When tuning NGINX or anything else for that matter, it's important to keep in mind the workload of the service being tuned. In our case, NGINX is simply serving static HTML pages. There is a set of tuning parameters that are very useful when serving static HTML.
http {
        open_file_cache max=1024 inactive=10s;
        open_file_cache_valid 120s;
The open_file_cache parameters within the /etc/nginx/nginx.conf file are used to define how many files NGINX can keep open and cached in memory, and for how long.
Essentially these parameters allow NGINX to open our HTML files during the first HTTP request and keep those files open and cached in memory. As subsequent HTTP requests are made, NGINX can use this cache rather than reopening our source files.
In the above, we are defining the open_file_cache parameter so that NGINX can cache a maximum of 1024 open files. Of those files, however, cache entries will be invalidated if they are not accessed within 10 seconds. The open_file_cache_valid parameter defines the interval between checks that currently cached files are still valid; in this case, every 120 seconds.
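For reference, the same cache block is often paired with two related directives. The min_uses and errors settings below are not part of this article's example and are shown here only as commonly used additions:

```nginx
http {
    open_file_cache          max=1024 inactive=10s;
    open_file_cache_valid    120s;
    open_file_cache_min_uses 2;   # only cache files requested at least twice
    open_file_cache_errors   on;  # cache file-lookup errors as well
}
```

Caching lookup errors avoids repeatedly hitting the filesystem for requests to paths that do not exist.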
These parameters should significantly reduce the number of times that NGINX must open and close our static HTML files. This means less overall work per request, which should mean higher throughput. Let's test our theory with another run of the ab command.
# ab -c 40 -n 50000 http://159.203.93.149/ | grep "per second"
Requests per second:    6949.42 [#/sec] (mean)
With an increase of nearly 600 requests per second, the open_file_cache parameters have quite an effect. While these parameters might seem very useful, it is important to remember that they work in our example because we are simply serving static HTML. If we were testing an application that served dynamic content on every request, these parameters might result in rendering errors for end users.
Summary
At this point, we have taken an out-of-the-box NGINX instance, measured a baseline of 2957.93 requests per second, and tuned this instance to 6949.42 requests per second, an increase of roughly 4000 requests per second. We did this by not only changing a few key parameters, but also experimenting with those parameters.
While this article only touched on a few key NGINX parameters, the methods used here to change parameters and measure their impact can be applied to other common NGINX tuning options, such as enabling content caching and gzip compression. For more tuning parameters, check out the NGINX Admin Guide, which has quite a bit of information about managing NGINX and configuring it for various workloads.
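As a starting point for that further tuning, a minimal gzip block might look like the following. The compression level and type list here are illustrative defaults, not values measured in this article, and any change should be benchmarked with ab the same way as the parameters above:

```nginx
http {
    gzip            on;
    gzip_comp_level 5;     # moderate CPU cost vs. size savings
    gzip_min_length 256;   # skip compressing tiny responses
    gzip_types      text/css application/javascript application/json;
}
```

Note that gzip trades CPU time for bandwidth, so on a CPU-bound static workload like ours it could just as easily lower the requests-per-second figure as raise it; only a measurement will tell.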