How to Reduce Sleeping Processes in Linux

The total number of sleeping processes is not what matters, although once it goes over 400 the system dies. What matters is the number of sleeping nginx and php-fpm processes. You are on a public cloud. These things are cheap and promise the moon, but in practice they are a big pain.
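To check how many nginx and php-fpm processes are sleeping, a minimal Python sketch along these lines can read the process states from /proc. It assumes a Linux /proc filesystem and that the worker processes have command names starting with nginx or php-fpm; the function names are made up for illustration. State S is an interruptible sleep and state D is an uninterruptible sleep, which usually means the process is waiting on I/O.

```python
import os
from collections import Counter


def parse_stat(stat_line):
    """Return (command name, state letter) from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')' characters, so cut at the first '(' and the last ')'.
    start = stat_line.index("(")
    end = stat_line.rindex(")")
    comm = stat_line[start + 1:end]
    state = stat_line[end + 1:].split()[0]
    return comm, state


def sleeping_by_name(prefixes=("nginx", "php-fpm"), proc_root="/proc"):
    """Count processes in state S or D whose name starts with a prefix."""
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited or the line was unreadable
        if state in ("S", "D") and comm.startswith(tuple(prefixes)):
            counts[(comm, state)] += 1
    return counts
```

Calling sleeping_by_name() on the server returns a counter keyed by (name, state). A large share of D entries would point at I/O rather than at idle workers.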

In theory they lower the load and can handle more traffic, but on a dedicated server it is hard to get many sleeping processes unless the server has a problem with a hard drive or the hard drive controller.

The reason for sleeping processes in clouds is bad I/O between the cloud nodes. MySQL on a cloud is a big disaster, and it is the main reason the load average and CPU usage go up. The %CPU of the MySQL process can be over 1000% (top counts each core as 100%), which is pretty crazy. Clouds use LAN connectivity for everything, and a lot of that traffic is generated by the "mother" layer, not by your guest OS (CentOS, Debian, etc.).

Sometimes even the cloud's backup system can cause processes to "fall asleep". If you use PHP sessions, that means even more sleeping processes. You can reduce them by using memcached to serve pages directly from memory, which works if your pages are mainly "static".

A large number of sleeping processes is normal for a server with no load. Many of those processes are kernel threads, and the more cores the machine has, the more of them there are.

Sleeping processes aren't the cause, they're a symptom. They're sleeping because they're waiting on I/O to complete, which is reflected in the iowait stat.
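To put a number on iowait, one option is to sample /proc/stat twice and compare the counters. The sketch below is a minimal Python version under that assumption; the function names are illustrative, and the field order (user, nice, system, idle, iowait, irq, softirq, steal) is the one documented for the aggregate cpu line on Linux.

```python
import time


def read_cpu_times(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(v) for v in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two /proc/stat samples."""
    a = read_cpu_times(before)
    b = read_cpu_times(after)
    # Fields: user nice system idle iowait irq softirq steal [guest guest_nice].
    # Guest time is already counted in user/nice, so sum only the first eight.
    total = sum(b[:8]) - sum(a[:8])
    if total <= 0:
        return 0.0
    return 100.0 * (b[4] - a[4]) / total


def sample_iowait(interval=1.0, path="/proc/stat"):
    """Take two samples `interval` seconds apart and return iowait percent."""
    with open(path) as f:
        before = f.read()
    time.sleep(interval)
    with open(path) as f:
        after = f.read()
    return iowait_percent(before, after)
```

Running sample_iowait(5) a few times while the site is slow gives a rough idea of whether the CPUs are mostly waiting on the disks.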

Once you start going into swap, performance is going to degrade, Apache is going to get backed up (process count is going to start spiking), and your throughput is going to go way down.
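A quick way to tell whether the box has started going into swap is to compare SwapTotal and SwapFree in /proc/meminfo. The following Python sketch does only that; the function names are illustrative, and it assumes the usual "Name: value kB" layout of that file.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict of {field name: integer value}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            info[name.strip()] = int(parts[0])  # values are in kB where a unit is given
    return info


def swap_used_percent(text):
    """Return the percentage of configured swap that is currently in use."""
    info = parse_meminfo(text)
    total = info.get("SwapTotal", 0)
    if total == 0:
        return 0.0  # no swap configured
    return 100.0 * (total - info.get("SwapFree", 0)) / total
```

Feeding it the contents of /proc/meminfo (for example open("/proc/meminfo").read()) at intervals shows whether swap usage climbs as the Apache process count spikes.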

Based on the specs, and that you aren't aware of paying for anything more, you probably have a single 120GB hard drive. Depending on how much traffic you're trying to push, this may or may not be sufficient. Adding more spindles (more disks) is effective, but probably not appropriate in this situation.

More than likely, IPB (Invision Power Board) is the culprit. What's probably happening is that users are doing database-intensive operations, and when the server is slow to respond, they click the button a few more times, causing more operations to stack up.

This is thrashing the disks, and causing everything to slow down, causing Apache to stack up, causing the server to start going heavy into swap as it's loading more stuff into memory (more Apache threads), and worsening the situation.

Since we know you're running IPB and seem to have a significant user base (based on the load symptoms), there are three things that will have an immediate benefit:

  1. Do some tuning of IPB. Ask wherever people discuss IPB, be it on WHT (Web Hosting Talk) or another forum. Check the IPB docs; they probably have suggestions for improving performance.
  2. Do some tuning of MySQL to better suit IPB. Again, same as #1, check places where people are familiar with IPB; they'll be able to offer boilerplate suggestions that are appropriate for IPB.
  3. Add more RAM. MySQL loves RAM, but you do need to configure it to take advantage of it. Adding RAM alone will make things much better right away, but don't mistake that for no further action being needed: you can still make great gains by tuning MySQL to use the RAM, and instead of the problem recurring in 3 months, the improvement will last much longer.


Well, I'm going to answer my own question. The problem apparently had something to do with the request_terminate_timeout = 30s value that I was using, possibly combined with the ondemand FPM process manager. 

The double post always coincided with a PHP-FPM timeout error combined with a killing of a child process immediately after. So I disabled the request_terminate_timeout, and it appears to be redundant anyway since the php.ini file already specifies a 30 second timeout. 

I also realized that I don't really need the ondemand process manager because I'm the only user on this box with fairly stable load, so I switched to static and set the pm.max_requests fairly low to 100. 

Recycling each worker after 100 requests limits the damage from memory leaks. One or both of these changes have effectively eliminated the duplicate posts.
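To double-check which of these settings a pool is actually using, a small Python sketch can read the pool file with the standard configparser module. This assumes the pool file is plain INI with a [www] section and key = value lines; the sample pool text and the pm.max_children value in it are hypothetical, and summarize_pool is a made-up helper name. The fallbacks of 0 mirror the php-fpm defaults, where 0 means the feature is off.

```python
import configparser

# Hypothetical pool file reflecting the changes described above.
SAMPLE_POOL = """\
[www]
pm = static
pm.max_children = 5
pm.max_requests = 100
;request_terminate_timeout = 30s
"""


def summarize_pool(conf_text, pool="www"):
    """Pull the settings discussed above out of a php-fpm pool config."""
    # interpolation=None keeps '%' in values (e.g. log formats) from raising.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    section = parser[pool]
    return {
        "pm": section.get("pm"),
        "pm.max_requests": section.get("pm.max_requests", fallback="0"),
        "request_terminate_timeout": section.get(
            "request_terminate_timeout", fallback="0"
        ),
    }
```

Calling summarize_pool(SAMPLE_POOL) reports the process manager mode, the recycle threshold, and whether request_terminate_timeout is still set; a commented-out line counts as unset.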
