PHP-FPM – process management

Introduction

PHP-FPM has a fully functional pool configuration file out of the box – the idea of this article is to explore the default configuration settings vs ‘optimised’ settings.

It is reasonable to say setting up PHP-FPM beyond the documentation is a dark art unless you’re well experienced on the topic – fear not, hopefully you’ll have a better understanding after reading this post.

As usual the disclaimer: system admins that breathe, eat & sleep underlying architecture may raise eyebrows at my findings. Share your comments that will help others & me in the process. This article isn’t for newbies and it’s not a step-by-step setup guide either.

It’s also worth noting that OS settings can and will affect the applications running on it. E.g. on Linux: whether PHP-FPM listens on a Unix domain socket or a TCP socket, the maximum number of connections, open ports, open files etc.

You should look at system log files for any errors or issues thrown that are not logged in your PHP-FPM or Apache/Nginx logs.

PHP-FPM pool settings overview

If you’re lazy and don’t bother to read the well commented default pool configuration file (www.conf), then shame on you – it explains each of the settings in depth. However, since the comments are literal, they don’t explain how to set them correctly for your setup (e.g. dedicated web server, combined web/db server etc).

PHP-FPM – baseline stats

Before we start tinkering, let’s list the basic system specs and run a benchmark to see where we currently stand.

The System

  • System: Local VM (Hyper-V)
  • OS: Ubuntu 16.04.4 LTS server
  • CPU: Intel i7 6700K @ 4GHz, Hyper-threading enabled (4 vCPUs set)
  • Ram: 8GB RAM allocated
  • Disk: Virtual HD (software Raid 5)
  • Software: Nginx + PHP-FPM 5.6 + MySQL 5.7

Benchmarking

For benchmarking I’m going to use Siege – you can find out how to install it via the Siege website https://www.joedog.org/siege-home/

I am pointing Siege to a Magento 1.9.3.8 store (with demo data). I have disabled all Magento caching so we’re doing pure PHP processing (Opcode cache has also been disabled via php.ini).

Default www.conf (with the exception of enabling the status feature):

[www]
user = www-data
group = www-data
listen = /run/php/php5.6-fpm.sock
listen.owner = www-data
listen.group = www-data

pm = dynamic
pm.max_children = 5
pm.start_servers = 2
pm.min_spare_servers = 1
pm.max_spare_servers = 3

pm.status_path = /status

My siege command is:

siege -c50 -i -b -t1m -f urls.txt

c50 = 50 CONCURRENT users

i = “INTERNET user simulation, hits URLs randomly.”

b = “BENCHMARK: no delays between requests”

t1m = Test for 1 minute.

Siege results:

Transactions: 553 hits
Availability: 100.00 %
Elapsed time: 59.92 secs
Data transferred: 21.73 MB
Response time: 5.16 secs
Transaction rate: 9.23 trans/sec
Throughput: 0.36 MB/sec
Concurrency: 47.59
Successful transactions: 553
Failed transactions: 0
Longest transaction: 6.65
Shortest transaction: 0.30

PHP-FPM status page:

You’re able to monitor the PHP-FPM child processes by enabling the PHP-FPM status feature. http://php.net/manual/en/install.fpm.configuration.php#pm.status-path

The results of the PHP-FPM status page after the initial benchmark:

pool: www
 process manager: dynamic
 start time: 29/Mar/2018:19:03:48 +0100
 start since: 829
 accepted conn: 6167
 listen queue: 0
 max listen queue: 0
 listen queue len: 0
 idle processes: 2
 active processes: 1
 total processes: 3
 max active processes: 5
 max children reached: 2
 slow requests: 0

The value to look at is “max children reached” – this is a great indicator that we have already hit the ceiling of PHP-FPM child processes available to deal with the incoming requests. Hitting the limit is reported both on the PHP-FPM status page and in the PHP-FPM log file:

WARNING: [pool www] server reached pm.max_children setting (5), consider raising it
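
If you want to watch that counter without eyeballing the whole page, a small shell helper can pull a single field out of the status output. This is only a sketch: fpm_status_field is a name I made up, and the URL in the usage comment assumes your web server forwards /status to the pool, which depends on your Nginx/Apache config.

```shell
# Hypothetical helper: print the value of one field from PHP-FPM status
# output read on stdin. The field name must match exactly.
fpm_status_field() {
    awk -v key="$1" '{
        colon = index($0, ":")            # split on the first colon only
        if (colon == 0) next
        name = substr($0, 1, colon - 1)
        value = substr($0, colon + 1)
        gsub(/^[ \t]+|[ \t]+$/, "", name)
        gsub(/^[ \t]+|[ \t]+$/, "", value)
        if (name == key) print value
    }'
}

# Usage (assumed URL, adjust to your setup):
#   curl -s http://localhost/status | fpm_status_field "max children reached"
```

Splitting on the first colon only keeps values such as the start time, which contain colons themselves, in one piece.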

There are a few fixes:

  1. Reduce the amount of server traffic (lol)
  2. Increase the number of PHP-FPM child processes (pm.max_children)

PHP-FPM – child processes

pm.max_children

This is probably the most important value to set. Why? Because quite literally this is the maximum number of PHP-FPM child processes that will spawn.

Why does it matter? Let’s use an analogy to explain this based on the results above.

Have you ever been to a supermarket and there’s only 1 till open? What happens? A queue forms.

The same thing happens with PHP-FPM – if there are many incoming ‘connections’, it will take longer to process all the incoming requests (p.s. I’m not actually talking about the PHP-FPM listen queue here; that is sized by listen.backlog, whose default depends on your PHP version and OS).

This is what PHP-FPM request handling looks like if you have pm.max_children = 1.

A word before the illustrations of pools and workers below: my findings are based on online research and experience. Though I’m not entirely sure how the PHP-FPM master process prioritises or distributes load to the pools, I have understood it to work like the following (please feel free to correct me or explain further):

[Figure: PHP-FPM with 1 worker]

Each of those green humans is a request from the NGINX web server.

In the above, there is only one “supermarket till” (PHP-FPM worker), to deal with each request. Usually, with FAST PHP scripts, the queue will be processed before you can notice.

Note: to put it bluntly, poor hardware or a lack of OS optimisation could be the bottleneck despite having fast PHP code (e.g. CPU – raw processing power, hard disk IO, network, OS settings and so forth). For the remainder of this article, let’s assume everything is running fast as hell and it’s just our PHP-FPM configuration which needs tweaking.

Next, let’s illustrate 5x PHP-FPM workers (child processes) instead of 1:

[Figure: PHP-FPM with 5 child workers]

Note: from what I’ve read, the request queue does not belong to an individual PHP-FPM worker but to the PHP-FPM master process. FPM stands for FastCGI Process Manager, the keyword being Manager: a ‘master’ manages the work load for the workers in the pool(s).

E.g. if you have 1 pool with 5 workers, each worker does not have its own queue; the master process manages each request across all the pools and workers under its control.

Below is a micro view of requests being processed by a single pool:

[Figure: PHP-FPM incoming requests, single queue per pool]

And next a macro view of requests being processed by multiple pools:

[Figure: PHP-FPM pools and workers managed by the master process]

Great work – now that we have an overall view of child/worker processes, we can go ahead and work out what the pm.max_children value should be. To do this we need to figure out how much system RAM we can allocate to PHP-FPM.

Calculating the pm.max_children value

Since we’re dealing with PHP, you may have experienced a script that consumed way too much memory, causing the “Exhausted memory” error. You probably headed over to the php.ini file and raised the value of memory_limit until your error went away (or adjusted the memory limit locally in your PHP script) or even better, you fixed your memory leak =D.

First method for the calculation

Let’s say in the php.ini file memory_limit is set to 128MB and you have approximately 8GB RAM in your system.

Let’s reserve 512MB for the OS and other processes, leaving us with 7.5GB – we don’t want all our PHP-FPM processes to go beyond 7.5GB or even worse cause swap.

If we assume our PHP app / script, whatever it is, consumes a max of 128MB, we want to divide 7.5GB by 128MB:

7500MB / 128MB = 58 (rounded down).

pm.max_children = 58

Second method for the calculation

The second method involves a more precise calculation. You do this by finding the average memory consumption across all the running PHP-FPM processes. My best advice is to run the command while your server is under load (i.e. while running Siege or any crawler that will hit web pages that chew up a bit of memory).

I’ll illustrate this on Linux (sorry Windows peeps!).

Step 1 – name and shame

First of all, let’s find out what the PHP-FPM process is called using the service status command (we need to know what PHP-FPM is called so we can use it in Step 2).

sudo service --status-all | grep -i fpm
Output
[ + ] php-fpm5.6

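In the above output for my dev box, it’s called php-fpm5.6; for you it might be php5.6-fpm, php-fpm or php7.0-fpm, which is why we need the service status command to determine it. Bear in mind that ps -C in Step 2 matches the name of the running executable, which can differ from the service name, so if Step 2 fails, check the process list (e.g. ps aux | grep fpm) for the actual name.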

Step 2 – calculate the average memory consumption

Run the following command to give us the average memory used in human readable format (aka Megabytes).

Be sure to change php-fpm5.6 with whatever your output was:

ps --no-headers -o "rss,cmd" -C php-fpm5.6 | awk '{ sum+=$1 } END { printf ("%d%s\n", sum/NR/1024,"Mb") }'

Output:

35Mb

I re-ran the command while Sieging the website to get a more accurate value:

Output:

60Mb

So based on the result above, let’s put this into the formula for calculating pm.max_children:

7500MB / 60MB = 125
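
Both methods boil down to the same integer division, so it can be handy to keep it as a tiny shell function. A minimal sketch: fpm_max_children is a made-up name, and both arguments are assumed to be whole megabytes.

```shell
# Hypothetical helper: how many workers of a given size fit in the RAM
# budget. Both arguments are whole megabytes; the result is rounded down.
fpm_max_children() {
    budget_mb=$1
    per_process_mb=$2
    if [ "$per_process_mb" -le 0 ]; then
        echo "per-process memory must be positive" >&2
        return 1
    fi
    echo $(( budget_mb / per_process_mb ))
}

fpm_max_children 7500 128   # first method: the php.ini limit as worst case
fpm_max_children 7500 60    # second method: measured average under load
```

With the figures from this article it prints 58 and 125. Shell arithmetic rounds down, which is what you want here: rounding up would let the pool exceed the budget.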

My new PHP-FPM pool config now looks like this:

[www]

user = www-data
group = www-data
listen = /run/php/php5.6-fpm.sock
listen.owner = www-data
listen.group = www-data
pm = dynamic
pm.max_children = 125
pm.start_servers = 2
pm.min_spare_servers = 1
pm.max_spare_servers = 3


pm.status_path = /status

pm.min_spare_servers & pm.max_spare_servers

Since we have currently set the process manager to spawn child workers dynamically

pm = dynamic

we need to calculate the pm.min_spare_servers & pm.max_spare_servers. There is little information in the official doc on what to set these to http://php.net/manual/en/install.fpm.configuration.php#pm.min-spare-servers.

We could take a look at the current values and make some assumptions based on the default values.

pm.max_children = 5
pm.start_servers = 2 
pm.min_spare_servers = 1 
pm.max_spare_servers = 3

We know that pm.max_children is the absolute maximum number of child processes, so let’s call this 100%. We can then work out what percentage of it the other default values represent:

pm.min_spare_servers

; pm.min_spare_servers – the minimum number of children in ‘idle’
; state (waiting to process). If the number
; of ‘idle’ processes is less than this
; number then some children will be created.

pm.min_spare_servers = 1

(1 / 5) * 100 = 20%

According to this, we should set pm.min_spare_servers to 20% of our pm.max_children.

pm.max_spare_servers

; pm.max_spare_servers – the maximum number of children in ‘idle’
; state (waiting to process). If the number
; of ‘idle’ processes is greater than this
; number then some children will be killed.

pm.max_spare_servers = 3

(3 / 5) * 100 = 60%

According to this, we should set pm.max_spare_servers to 60% of our pm.max_children.

pm.start_servers

; pm.start_servers – the number of children created on startup.

Now that we have the min & max spare children, we can look at pm.start_servers. The comments in the default www.conf file offer some advice on how to set this value:

; The number of child processes created on startup.
; Note: Used only when pm is set to ‘dynamic’
; Default Value: min_spare_servers + (max_spare_servers - min_spare_servers) / 2

Putting it all together

Based on our value from earlier pm.max_children = 125, we can use this in the calculation methodology based on the above.

pm.min_spare_servers = 20% of pm.max_children
(20 / 100) * 125 = 25

pm.max_spare_servers = 60% of pm.max_children
(60 / 100) * 125 = 75

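pm.start_servers = min_spare_servers + (max_spare_servers - min_spare_servers) / 2
25 + (75 - 25) / 2 = 50

Note: the config and test runs below use 37 rather than 50. That figure comes from halving the whole sum, (25 + 75 - 25) / 2 = 37.5, and not from the formula as written. PHP-FPM accepts either, since pm.start_servers only has to lie between pm.min_spare_servers and pm.max_spare_servers.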
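
The percentages and the start_servers formula can be wrapped in a small shell function. A rough sketch: fpm_dynamic_settings is a made-up name, and the 20% / 60% ratios are simply the ones read off the default config above, not an official recommendation. Shell arithmetic rounds down and divides before it adds, so for 125 the formula gives pm.start_servers = 50.

```shell
# Hypothetical helper: derive the dynamic-mode settings from pm.max_children
# using the 20% / 60% ratios taken from the default www.conf.
fpm_dynamic_settings() {
    max_children=$1
    min_spare=$(( max_children * 20 / 100 ))
    max_spare=$(( max_children * 60 / 100 ))
    # www.conf default formula; the division happens before the addition
    start=$(( min_spare + (max_spare - min_spare) / 2 ))
    echo "pm.max_children = $max_children"
    echo "pm.start_servers = $start"
    echo "pm.min_spare_servers = $min_spare"
    echo "pm.max_spare_servers = $max_spare"
}

fpm_dynamic_settings 125
```

Feeding it 5 reproduces the shipped defaults (2, 1 and 3), which is a quick sanity check on the ratios. For very small pools the rounding can push pm.min_spare_servers to 0, so pick those values by hand.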

Updated values

[www]

user = www-data
group = www-data
listen = /run/php/php5.6-fpm.sock
listen.owner = www-data
listen.group = www-data
pm = dynamic
pm.max_children = 125
pm.start_servers = 37
pm.min_spare_servers = 25
pm.max_spare_servers = 75

pm.status_path = /status

Output of the status page after updates

pool:                 www
process manager:      dynamic
start time:           31/Mar/2018:18:44:09 +0100
start since:          18
accepted conn:        2
listen queue:         0
max listen queue:     0
listen queue len:     0
idle processes:       36
active processes:     1
total processes:      37
max active processes: 1
max children reached: 0
slow requests:        0

Testing our new values

Oh how exciting, will these new changes make a difference?

siege -c50 -i -b -t1m -f urls.txt

Results:

Lifting the server siege...
Transactions: 600 hits
Availability: 100.00 %
Elapsed time: 59.29 secs
Data transferred: 23.40 MB
Response time: 4.76 secs
Transaction rate: 10.12 trans/sec
Throughput: 0.39 MB/sec
Concurrency: 48.20
Successful transactions: 600
Failed transactions: 0
Longest transaction: 8.59
Shortest transaction: 2.77

And the results for the status page:

pool:                 www
process manager:      dynamic
start time:           31/Mar/2018:19:50:47 +0100
start since:          414
accepted conn:        636
listen queue:         0
max listen queue:     0
listen queue len:     0
idle processes:       74
active processes:     1
total processes:      75
max active processes: 53
max children reached: 0
slow requests:        0

Lets compare this to our baseline results:

                          Default          Optimised        Difference
Transactions              553              600              + 47
Availability              100%             100%
Elapsed time              59.92 secs       59.29 secs
Data transferred          21.73 MB         23.40 MB         + 1.67 MB
Response time             5.16 secs        4.76 secs        - 0.40 secs
Transaction rate          9.23 trans/sec   10.12 trans/sec  + 0.89 trans/sec
Throughput                0.36 MB/sec      0.39 MB/sec      + 0.03 MB/sec
Concurrency               47.59            48.20            + 0.61
Successful transactions   553              600              + 47
Failed transactions       0                0
Longest transaction       6.65 secs        8.59 secs        + 1.94 secs
Shortest transaction      0.30 secs        2.77 secs        + 2.47 secs
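
Absolute differences are hard to judge on their own, so it can help to express them as percentages of the baseline. A small sketch using awk for the floating point maths; pct_change is a made-up helper name.

```shell
# Hypothetical helper: percentage change from a baseline value to a new one,
# printed with one decimal place.
pct_change() {
    awk -v from="$1" -v to="$2" 'BEGIN {
        if (from == 0) { print "n/a"; exit }
        printf "%.1f\n", (to - from) / from * 100
    }'
}

pct_change 553 600     # transactions
pct_change 5.16 4.76   # average response time
```

For the two runs above this prints 8.5 and -7.8, i.e. roughly 8.5% more transactions and a 7.8% lower average response time.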

Most of the values improved; the two that got worse are the longest and shortest transactions. I wouldn’t worry too much about those, as they are single outliers. Overall the Response time, which is an average, is the metric that matters, along with the total Transactions – clearly our server was able to serve more requests and faster too.

Some questions itching

Admittedly these results weren’t “holy cow what a vast improvement”, but an improvement nonetheless. My guess is that the 4 vCPUs allocated to the VM are the bottleneck: htop shows 100% CPU usage across all cores while the benchmark runs.

I was interested to see what effect adding more CPUs would have on our benchmark, so I allocated 8 vCPUs instead of 4:

                          Optimised        Extra CPU cores  Difference
Transactions              600              733              + 133
Availability              100%             100%
Elapsed time              59.29 secs       59.70 secs
Data transferred          23.40 MB         28.79 MB         + 5.39 MB
Response time             4.76 secs        3.93 secs        - 0.83 secs
Transaction rate          10.12 trans/sec  12.28 trans/sec  + 2.16 trans/sec
Throughput                0.39 MB/sec      0.48 MB/sec      + 0.09 MB/sec
Concurrency               48.20            48.23            + 0.03
Successful transactions   600              733              + 133
Failed transactions       0                0
Longest transaction       8.59 secs        6.09 secs        - 2.50 secs
Shortest transaction      2.77 secs        1.56 secs        - 1.21 secs

No surprise there really, more cores = more transactions + faster responses overall.

PHP-FPM management type

Out of the box, PHP-FPM sets the default process management to dynamic. There are two other types: static and ondemand. Each is advantageous depending on your server and its resources.

; Choose how the process manager will control the number of child processes.
; Possible Values:
;   static  - a fixed number (pm.max_children) of child processes;
;   dynamic - the number of child processes are set dynamically based on the
;             following directives. With this process management, there will be
;             always at least 1 children.
;             pm.max_children      - the maximum number of children that can
;                                    be alive at the same time.
;             pm.start_servers     - the number of children created on startup.
;             pm.min_spare_servers - the minimum number of children in 'idle'
;                                    state (waiting to process). If the number
;                                    of 'idle' processes is less than this
;                                    number then some children will be created.
;             pm.max_spare_servers - the maximum number of children in 'idle'
;                                    state (waiting to process). If the number
;                                    of 'idle' processes is greater than this
;                                    number then some children will be killed.
;  ondemand - no children are created at startup. Children will be forked when
;             new requests will connect. The following parameter are used:
;             pm.max_children           - the maximum number of children that
;                                         can be alive at the same time.
;             pm.process_idle_timeout   - The number of seconds after which
;                                         an idle process will be killed.

pm = ondemand is self explanatory and is the most memory-efficient – no child processes are started up front, they are spawned only when needed, and lingering idle ones are killed after pm.process_idle_timeout to free memory.

This uses the least amount of memory.

pm = static will spawn the maximum number of child processes instantly. This has some benefit in that PHP-FPM doesn’t have to spawn new processes when needed – it’s quite obvious processing requests will be quicker off the bat if you already have child processes standing by – there is no delay in spawning.

This uses the most memory.

pm = dynamic behaves a little like static once you set pm.min_spare_servers and pm.max_spare_servers, in that some child processes are spawned at startup and wait for requests to process.

This is a good balance between static and ondemand in terms of memory usage.
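
To make the difference at startup concrete, here is a small shell sketch of how many workers each mode forks right after PHP-FPM starts. fpm_children_at_startup is a made-up name; the arguments are the pm mode, pm.max_children and pm.start_servers.

```shell
# Hypothetical helper: number of workers right after PHP-FPM starts,
# given the pm mode, pm.max_children and pm.start_servers.
fpm_children_at_startup() {
    case "$1" in
        static)   echo "$2" ;;   # the full pool is forked immediately
        dynamic)  echo "$3" ;;   # only pm.start_servers are forked
        ondemand) echo 0 ;;      # nothing is forked until a request arrives
        *)        echo "unknown pm mode: $1" >&2; return 1 ;;
    esac
}

fpm_children_at_startup static 125 37
fpm_children_at_startup dynamic 125 37
fpm_children_at_startup ondemand 125 37
```

With the values used in this article it prints 125, 37 and 0, which is why the idle memory footprint differs so much between the three modes.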

I did some testing to check these differences in a working environment.

I used the following command to calculate the total memory footprint right after restarting PHP-FPM and again after benchmarking:

ps --no-headers -o "rss,cmd" -C php-fpm5.6 | awk '{ sum+=$1 } END { printf ("%d%s\n", sum/1024,"Mb") }'

Memory footprint

pm mode     PHP-FPM restart   1 min after benchmark
ondemand    32Mb              32Mb
dynamic     70Mb              400Mb
static      412Mb             1864Mb

Conclusion

My final thought is that it’s worth tweaking your PHP-FPM settings or at least taking some time to understand where the bottlenecks lie and where you can squeeze every bit of performance with your PHP-FPM configuration.

To figure out whether to use dynamic, static or ondemand, just take a hard look at your server setup and what it’s doing.

If you have a dedicated web server, then I would personally use static.

If you have a server that does everything (web/db), dynamic would be a better bet.

If you’re running a really poxy server with little memory and you don’t get much traffic or if your application doesn’t take too long to process PHP code, then I suggest you use ondemand.

 
