
Huge increase in Apache processes

Comments

16 comments

  • GOT
    Well, it sounds like you are getting some kind of DoS attack. This command: /usr/bin/lynx -dump -width 500
  • GoWilkes
    Thanks for the commands, those are very helpful! I didn't think about changing it from :80 after I moved everything to HTTPS. But I'm still not seeing a high number of connections. From the first command, I have 28 connections right now, but Munin is showing about 100 Apache processes; roughly double the number that I had at this time on May 30. And using the second command, the IP with the highest number of connections is a local IP, with 13 connections (pretty much what I would expect). I even blocked all non-US IP addresses in CSF (firewall) using CC_ALLOW_FILTER (only allowing US), but it had no noticeable impact on the problem.
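The per-IP connection count mentioned above can be approximated as follows. The commands posted earlier in the thread are cut off, so this is only a sketch under assumptions: `netstat -tn` as the data source, port 443 as the HTTPS listener, and `conns_per_ip` as a made-up helper name.

```shell
# Hypothetical helper: tally established TCP connections per remote IP.
# Reads `netstat -tn`-style lines on stdin. The local port (default 443)
# is an assumption; pass 80 if the site still serves plain HTTP.
conns_per_ip() {
    awk -v port="${1:-443}" '
        $1 ~ /^tcp/ && $6 == "ESTABLISHED" && $4 ~ (":" port "$") {
            ip = $5
            sub(/:[0-9]+$/, "", ip)   # strip the remote port, keep the address
            count[ip]++
        }
        END { for (ip in count) print count[ip], ip }
    ' | sort -rn
}

# Usage: netstat -tn | conns_per_ip 443 | head
```

Each output line is a count followed by an address, busiest first, which makes it easy to compare the total against the number of Apache processes.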
  • GOT
    You might want to read the docs on that csf filter. If memory serves, I don't think it works the way you're expecting. As for Munin, I would not necessarily use it for real-time diagnostics. ps axf|grep httpd|wc will give you a live count of Apache processes. From your numbers it doesn't sound like an attack, but you should look at your general Apache settings. I believe the default max children/servers is 150, and if you are exceeding that then pages won't load. I would also look at Apache status in WHM, because sometimes your eyes can show you things that just getting numbers from commands doesn't reveal.
  • GoWilkes
    This is what I was going by on CC_ALLOW_FILTER:
    [quote]An alternative to CC_ALLOW is to only allow access from the following countries but still filter based on the port and packets rules. All other connections are dropped[/quote]
    And this: crybit.com/block-whole-countries-csf/

    I used the command you posted (ps axf|grep httpd|wc) and this was the result: 46 366 3106

    There wasn't a column header, though, so I'm not sure what I'm looking at here. It's 1:30am here right now, and the 46 matches what Munin shows for the current number of processes, but I'm not sure what the 366 or 3106 represent. Regardless, I would usually have 46 processes at peak time, not at 1:30. It should be more like 15-20 right now.

    [quote]From your numbers it doesn't sound like an attack but you should look at your general Apache settings. I believe the default max children/servers is set to 150 by default and if you are exceeding that then pages won't load[/quote]
    You're right, and that turned out to be why my site was freezing up. Raising the number stopped it from freezing, but I have no clue why it increased in the first place :-(

    [quote]I would also look at Apache status in whm because sometimes your eyes can show you things that just getting numbers from commands doesn't reveal.[/quote]
    Possibly because of the increase I made on Max Clients and Server Limit, but I do have about 100 of these: ::1 myservername.com OPTIONS * HTTP/1.0 I'm guessing that's normal, though... 100+/- free slots?
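On the three numbers: `wc` with no options prints lines, words, and bytes, in that order, so 46 is the line count (one line per process) and 366 and 3106 are the word and byte totals of the `ps` output. The pipeline also counts the `grep httpd` process itself. A minimal sketch of a tighter count, where `count_httpd` is a made-up helper name:

```shell
# Hypothetical helper: count httpd processes in `ps` output read from stdin.
# The [h]ttpd pattern matches "httpd" but not the text "[h]ttpd" that
# appears in grep's own command line, so grep does not count itself.
count_httpd() {
    grep -c '[h]ttpd'
}

# Usage: ps ax | count_httpd
```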
  • MaFt
    I'm following this as I've seen exactly the same. For years my sites averaged 4-6 Entry Processes and suddenly on Friday around 4-5pm UK time I was hitting "resource limit is reached" errors as these were limited to 20 on this server. I'm a reseller though and have no control over the limits. I've managed to minimise this by shutting down 1 site completely and using Cloudflare's "I'm under attack" to reduce the number of visitors. Not ideal though as it's meant a 40% loss of income over the weekend compared to normal - but at least the sites are online. The hosts are being painfully slow and keep saying they'll increase the limits. They still haven't. They also haven't responded to my main query as to why the sites in question, with no changes at my end, are suddenly being reported as using a lot more processes than previously. Looking at the cPanel "concurrent usage" logs for 30 days you can see the sudden spike from Friday. It seems very weird that the only similar report I can find is this post, and that the same issue also started on Friday, at around the same time (assuming the original poster is in the US). I'm hopeful my hosts can find out what's going on and I'll certainly report back here if they find anything out.
  • GoWilkes
    You're right, MaFt, I'm in eastern US. That's too much to be a coincidence, I think. I ran ClamAV and rkhunter, and neither found anything, so I'm ruling out a virus on my end. Right now (roughly 2pm EST) I have 101 busy Apache servers, but only 46 connections. The IP with the most connections has 13, which is reasonable, so I think that I can rule out a DDoS attack. My RAM is high, too; I'm usually at around 3G at this time of day, but it's currently over 4G (I have 4G of RAM, so it's maxing out). My CPU load is fine, though: 0.87, and since I have 2 CPUs a load of 2 would be a normal-high. MaFt, there's no excuse for your host to be dragging their feet on increasing the limits. It literally takes 30 seconds, and the restart of Apache might have a downtime of less than 1 second. It doesn't solve the problem, but it definitely helps with the symptom (and should bring your revenue back on track).
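One way to sanity-check the RAM figure against the Apache limits is to average the resident memory of the running workers and multiply by the configured ceiling. A rough sketch, assuming `ps -C httpd -o rss=` as the data source (RSS in KiB, which double-counts shared pages, so the projection is an upper bound) and using the 150 limit discussed above as the default; `project_httpd_mem` is a made-up helper name.

```shell
# Hypothetical helper: read one RSS value (KiB) per line on stdin and
# project memory use if the worker count reached the given ceiling.
project_httpd_mem() {
    awk -v limit="${1:-150}" '
        $1 ~ /^[0-9]+$/ { total += $1; n++ }
        END {
            if (n == 0) { print "no httpd processes found"; exit 1 }
            avg = total / n
            printf "workers=%d avg_mib=%.1f projected_mib=%.0f\n", n, avg / 1024, avg * limit / 1024
        }
    '
}

# Usage: ps -C httpd -o rss= | project_httpd_mem 150
```

If the projected figure is well above physical RAM, raising the process limit trades the freeze for swapping, which fits the picture of memory maxing out while CPU load stays low.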
  • cPanelMichael
    Hello Everyone, Can anyone affected by this issue verify if the Prefork MPM is enabled? You can execute the following command to check: rpm -qa|grep mpm
    If so, verify if any recent entries like the one below exist in /usr/local/apache/logs/error_log: AH00144: couldn't grab the accept mutex
    Thank you.
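The log check can be scripted so that it returns a count instead of requiring a manual read. A minimal sketch, assuming the error_log path quoted above as the default and searching for the message text rather than the AH00144 code, since not every Apache version prefixes its messages with a code; `count_mutex_errors` is a made-up helper name.

```shell
# Hypothetical helper: count accept-mutex failures in an Apache error log.
# Default path is the one quoted in the thread; pass another as $1.
count_mutex_errors() {
    log="${1:-/usr/local/apache/logs/error_log}"
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 2
    fi
    # -F: fixed string, -c: print the number of matching lines.
    # "|| true" keeps a zero count from being reported as a failure.
    grep -cF "couldn't grab the accept mutex" "$log" || true
}

# Usage: count_mutex_errors /usr/local/apache/logs/error_log
```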
  • GoWilkes
    I SSH'ed in to my server as root via PuTTY, ran rpm -qa|grep mpm, and basically nothing happened. It ran for about 2 seconds, then just gave me the prompt again. In /usr/local/apache/conf/httpd.conf, though, the only reference to prefork is here:
        Timeout 60
        TraceEnable Off
        ServerSignature Off
        ServerTokens ProductOnly
        FileETag None
        StartServers 15
        MinSpareServers 10
        MaxSpareServers 20
        MinSpareServers 10
        MaxSpareServers 20
        ServerLimit 256
        MaxClients 150
        MaxRequestsPerChild 10000
        KeepAlive On
        KeepAliveTimeout 5
        MaxKeepAliveRequests 100
    I checked my error_log, anyway, but didn't find any reference to "mutex". The oldest entry was May 31, about 12 hours before this problem began the first time. I looked through, and don't see any errors other than attempts for pages that don't exist, and a handful of errors that I see all the time that I don't understand, but I doubt that they're related to this:
        RewriteOptions: MaxRedirects option has been removed in favor of the global LimitInternalRecursion directive and will be ignored.
        Hostname X provided via SNI and hostname example.com provided via HTTP are different
    Thanks, Michael!
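An empty result from `rpm -qa|grep mpm` only means that no installed RPM has "mpm" in its name, which is what you would see if Apache was built from source rather than installed from packages. The binary itself reports its MPM on the `Server MPM:` line of `httpd -V`. A small sketch, assuming the binary lives under the /usr/local/apache layout mentioned above; `server_mpm` is a made-up helper name.

```shell
# Hypothetical helper: extract the MPM name from `httpd -V` output on stdin.
# Lower-cases the value so "Prefork" and "prefork" compare the same.
server_mpm() {
    awk -F': *' '/^Server MPM:/ { print tolower($2) }'
}

# Usage (binary path is an assumption based on the paths in this thread):
#   /usr/local/apache/bin/httpd -V | server_mpm
```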
  • dalem
    Are you running a lot of WordPress sites? What you are describing sounds like just your run-of-the-mill Layer 7 attack, which happens 24/7, 365 days a year, non-stop from bots: the typical wp-login and xmlrpc attacks. I have noticed that some of the bots have a new plan: instead of rapid-fire brute force, they connect and reconnect, or go in and out once and then switch to a new IP, which lets them avoid getting banned as easily. So we did not notice right away what was going on. A good custom ModSecurity rule stops them in their tracks. One of our servers has been acting up as you described a couple of times a day, and we realized one of our clients' multiple Magento installs was getting hammered. We added a ModSecurity rule and all is well now (well, almost; as soon as all the IPs in the botnet get banned). We also realized that for some reason our WordPress ModSecurity rule was not working, which did not help.
  • MaFt
    I have 2 WordPress installs on the hosting I mentioned in my reply. Can you expand on what the "good mod security rule" would be?
  • cPanelMichael
    Hello @GoWilkes, Thank you for sharing the additional information. The issue reported on this thread does not appear related to the case quoted below, but feel free to test out the temporary workaround if the affected system uses the Prefork MPM to see if it has any impact on the reported issue: [QUOTE] Internal case EA-8508 was recently opened to address an issue where an update to the ea-apr RPM led to instability on some systems using the Prefork MPM. The temporary workaround for affected systems is to execute the following command: echo "Mutex sysvsem" >> /etc/apache2/conf.modules.d/000_mod_mpm_prefork.conf; /scripts/rebuildhttpdconf; /scripts/restartsrv_httpd --hard
    Note the above command includes a restart of the Apache service. We're tentatively planning to publish a fix for this case in the next EasyApache 4 release (you can follow the EA4 Change Log). [/QUOTE] Could you open a support ticket so we can rule out any issues with cPanel & WHM? Post the ticket number here and I'll link this thread to it. Thank you.
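The quoted one-liner appends to the Prefork config file unconditionally and then rebuilds and restarts Apache even if the append failed. A more cautious sketch is below; it only adds the line when the file exists and does not already contain it. The helper name `add_mutex_line` is made up for illustration, and the default path is the one from the quoted workaround.

```shell
# Hypothetical helper: append "Mutex sysvsem" to the Prefork MPM config
# only if the file exists and the line is not already there.
add_mutex_line() {
    conf="${1:-/etc/apache2/conf.modules.d/000_mod_mpm_prefork.conf}"
    if [ ! -f "$conf" ]; then
        echo "skipped: $conf not found" >&2
        return 1
    fi
    if grep -qx "Mutex sysvsem" "$conf"; then
        echo "unchanged: line already present"
        return 0
    fi
    echo "Mutex sysvsem" >> "$conf"
    echo "added"
}

# Usage: run add_mutex_line, and only if it prints "added" continue with
# the rebuild and restart commands from the quoted workaround.
```

A "skipped" result is itself informative: if that file does not exist, the system is probably not using the EasyApache 4 Prefork package that the case describes.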
  • dalem
    PS: this was just a guess as to what your issue is. On our server it was definitely the issue. You can do a quick check and see how many foreign IPs are brute forcing:
        grep -ir wp-login.php /var/log/apache2/domlogs
        grep -ir wp-admin /var/log/apache2/domlogs
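Those grep commands list every matching line. To see whether a few addresses account for most of the hits, the matches can be grouped by client IP. A minimal sketch, assuming Apache's common or combined log format with the client address as the first field; `hits_per_ip` is a made-up helper name.

```shell
# Hypothetical helper: count requests containing a path fragment per client IP.
# Reads access-log lines on stdin and treats the first field as the client IP.
hits_per_ip() {
    awk -v pat="${1:-wp-login.php}" '
        index($0, pat) { count[$1]++ }
        END { for (ip in count) print count[ip], ip }
    ' | sort -rn
}

# Usage (path as given above; adjust if your domlogs live elsewhere):
#   grep -rh wp-login.php /var/log/apache2/domlogs | hits_per_ip wp-login.php | head
```

The `-h` flag keeps grep from prefixing each line with the file name, which would otherwise end up in the first field instead of the IP.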
  • GoWilkes
    Michael, it turns out that I don't have Prefork, after all. This was the result when I ran the commands you gave:
        -bash: /etc/apache2/conf.modules.d/000_mod_mpm_prefork.conf: No such file or directory
        Built /usr/local/apache/conf/httpd.conf OK
        Waiting for "httpd"httpd"
        Service Status
        httpd (/usr/local/apache/bin/httpd -k start) is running as root with PID 4923 (pidfile+/proc check method).
        Startup Log
        [Wed Jun 05 02:50:50 2019] [error] VirtualHost *:443 -- mixing * ports and non-* ports with a NameVirtualHost address is not supported, proceeding with undefined results
        Log Messages
        [Wed Jun 05 02:50:51 2019] [notice] ModSecurity for Apache/2.9.0 (http://www.modsecurity.org/) configured.
        [Wed Jun 05 02:50:51 2019] [notice] suEXEC mechanism enabled (wrapper: /usr/local/apache/bin/suexec)
        httpd restarted successfully.
    I'm a tad concerned about the error message, considering that all of the accounts on the server were created with WHM and I haven't manually edited httpd.conf in years... probably not since I got this server, honestly. All of my sites seem to be running so I don't think it's a fatal error, but I definitely wasn't expecting it!

    @dalem, that was a great thought, but unfortunately not my issue :-( My log files were at: /usr/local/apache/domlogs/[USERNAME]/[DOMAIN.COM] I already test for references to wp-admin and wp-login via PHP and block IPs, but not at the firewall so it was an idea! But I only had 5 references to wp-login, and 2 to wp-admin. So that wasn't the culprit, either.

    @GOT, just FYI, it looks like CC_ALLOW_FILTER isn't blocking non-US IPs the way I'd hoped, so you could be right on that one. I was manually adding RIPE, APNIC, and LACNIC IP ranges but removed them in favor of CC_ALLOW_FILTER a few days ago. I didn't notice an increase in processes or anything, but I just now looked and saw that I have 7 RIPE connections.

    But anyway... no change on my end, I still have almost double the number of processes, my RAM usage is off the charts, etc. I'm at a complete loss.
  • dalem
    My log files were at: /usr/local/apache/domlogs/[USERNAME]/[DOMAIN.COM]

    You are still running EasyApache 3 (EOL); best to think about upgrading to EasyApache 4.
  • GoWilkes
    I am... I'm procrastinating for 2 reasons: 1. I always wait until the last minute for software updates, to let everyone else figure out the bugs before I deal with them; and 2. Nothing in the documentation mentions potential downtime during the update, so I'm waiting for a time when I have a few hours to possibly wait, and then another few hours to sort out bugs before the next business day.
