
All cron jobs randomly firing a bunch of times in parallel?

Comments

16 comments

  • Benjamin D.

  • cPRex Jurassic Moderator

    Hey there!  I can't say I've heard of such a thing, and I also don't have any other reports of similar behavior that I could find.

  • Benjamin D.

    It almost feels like some sort of race condition makes every CPU core fire the cron jobs in parallel at the exact same moment, seemingly at random. I can list all the dates and times when this occurred, and I've watched it happen live, so I know it's not just something written in the logs. It truly brings the server to its knees, and I have to kill all the processes manually when it happens.
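One way to build that list of dates and times is to count how often the same command was started within a single minute. The sketch below is only an illustration: it assumes the usual cronie syslog line shape (for example in `/var/log/cron`), which has not been confirmed for this server, and `find_bursts` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed line shape (typical cronie syslog output, NOT confirmed for this server):
#   "Mar  5 14:00:01 host CROND[12345]: (user) CMD (/path/to/job)"
CRON_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}\s+\S+\s+CROND\[\d+\]:\s+"
    r"\((?P<user>[^)]+)\)\s+CMD\s+\((?P<cmd>.*)\)\s*$"
)


def find_bursts(lines, threshold=2):
    """Return {(minute, user, command): count} for commands that were
    started at least `threshold` times within the same minute."""
    counts = Counter()
    for line in lines:
        match = CRON_LINE.match(line)
        if match:
            # The stamp is truncated to the minute, so seconds are ignored.
            counts[(match["stamp"], match["user"], match["cmd"])] += 1
    return {key: n for key, n in counts.items() if n >= threshold}
```

Since cron schedules a given entry at most once per minute, any count of 2 or more for the same user and command points at a duplicate launch. Feeding it the lines of the cron log (opened as a text file) would give the minutes worth comparing against other events on the server.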

  • Benjamin D.

    OH BOY. WTH IS THIS? Is this normal? And how did this happen?! EDIT: I read online that crond forks itself once for each scheduled task it has to run. Is this still the case in AlmaLinux 8.9? Am I panicking for nothing here, or is this abnormal? I ran the command again after some time and now there are only 2 crond processes, so I think it's normal for the number of crond processes to vary throughout the day, depending on the scheduled cron jobs. https://support.cpanel.net/hc/en-us/community/posts/19634192633879-Multiple-crond-instances

  • cPRex Jurassic Moderator

    I would expect the number of processes to vary throughout the day - I wouldn't expect them to be firing the crons multiple times. 

  • Benjamin D.

    Yeah, me neither, and I never saw that happen under CentOS 7 last year, before I upgraded to a new server running AlmaLinux 8 and WHM 116.

  • Benjamin D.

    It just happened again. 133 cron jobs (the same 8 or 9 crons, each multiplied into a bunch of parallel instances) of all kinds across the entire server fired at the same time. Since each one basically just makes an HTTP request, they used up most of Apache's slots and denied normal web browser traffic for a few minutes until I managed to kill the processes. I almost couldn't sign in to WHM to do so. I'm now wondering whether this could be a bug triggered when WHM auto-updates itself.
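Whatever launches the duplicates, a lock file can keep more than one copy of the same job from running at once. This is a minimal sketch, assuming each cron entry can be changed to go through a small wrapper; `try_lock`, `run_exclusive` and the lock path are hypothetical names, and this is a general mitigation rather than a fix for the underlying cause.

```python
import fcntl
import os
import subprocess


def try_lock(lock_path):
    """Return a file descriptor holding an exclusive lock on lock_path,
    or None if some other holder already has the lock."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        # LOCK_NB makes flock fail immediately instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def run_exclusive(lock_path, command):
    """Run `command` (a list of arguments) only if no other instance holds
    the lock. Returns the exit code, or None if the run was skipped."""
    fd = try_lock(lock_path)
    if fd is None:
        return None
    try:
        return subprocess.call(command)
    finally:
        os.close(fd)  # closing the descriptor releases the lock
```

With one lock file per job, a burst of 15 simultaneous launches of the same cron would result in one HTTP request and 14 immediate exits, which would leave Apache's slots free for normal traffic.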

    For future reference, this happened under WHM 118.0.4, let's see if the next time it happens is under a newer version or not.

  • Benjamin D.

    Will you look at that? It just happened again and now WHM is under 118.0.6.

  • Benjamin D.

    It happened again and now WHM is under 118.0.8. It's definitely tied to WHM self-updating, but I'm not sure how the two are related. All I know is that it seems quite clear that it happens once after each WHM version change. Could it be tied to a date/time sync or adjustment done after WHM updates? I'm really desperate to find a fix, as it drives customers away from my server because it's "unstable". If I'm not sitting in front of WHM to catch it happening, the server basically DoSes itself.
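To check the "once after each version change" pattern with numbers instead of impressions, the burst times can be lined up against the update times. A rough sketch, assuming both lists have already been collected by hand as `YYYY-MM-DD HH:MM` strings; `parse_stamp` and `delay_since_last_update` are hypothetical helpers.

```python
from bisect import bisect_right
from datetime import datetime


def parse_stamp(text):
    """Parse a 'YYYY-MM-DD HH:MM' string into a datetime."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M")


def delay_since_last_update(update_times, burst_times):
    """Map each burst time to the time elapsed since the most recent
    update at or before it (None if no update precedes the burst)."""
    updates = sorted(update_times)
    delays = {}
    for burst in burst_times:
        idx = bisect_right(updates, burst)  # number of updates <= burst
        delays[burst] = burst - updates[idx - 1] if idx else None
    return delays
```

If the delays cluster around a similar value, that would support the link to the update; widely scattered delays would suggest looking elsewhere.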

  • cPRex Jurassic Moderator

    And it will likely keep happening as this isn't related to cPanel.

  • Benjamin D.

    I'm trying to understand why it suddenly started doing this right after we got this new server earlier this winter. Those crons ran fine for YEARS on the old server, before we were forced to move to AlmaLinux. If you are 100% sure there's nothing in WHM that could cause this, then perhaps it comes from AlmaLinux itself? It's just so weird that it always happens once after WHM self-updates, and then it's fine for a week or two until WHM updates again. The crons run every hour (or every minute, depending on the cron), so there are a lot of runs in between WHM updates where things could go wrong, and they don't. It's almost predictable, but not fully: it does not seem to happen *immediately* after a WHM update, only once at some point afterwards.

    Here are the only differences I can think of between YEARS of it working perfectly fine (zero issue) and now:

    • AlmaLinux 8 instead of CentOS 7
    • PHP-FPM running 8.2 instead of suphp 8.2
    • Newer WHM version

    So, if it doesn't come from WHM, then it leaves only 2 possibilities that I see that differ from the old server:

    1) The AlmaLinux OS, which I highly doubt the issue comes from since cron is a very basic thing in an OS and there would have been thousands of AlmaLinux users complaining and the bug would have been fixed a long time ago...

    or 2) PHP-FPM... Now, I know PHP-FPM does something weird whenever you start or restart it. It seemingly pre-executes the website index file in order to (allegedly, not sure) compile/cache the HTML page that comes out of it. I don't recall any other PHP handler doing this, but to PHP-FPM's credit, it's faster than any other PHP handler, so I assume it caches responses, or at least part of its PHP code results, in some way. I've seen PHP-FPM "reload" all the websites whenever I tweak ANYTHING under MultiPHP settings in WHM (e.g. the "Max Requests" value). Most of them "reload" almost instantly, but some take 3-4 seconds. Anyway, I'm now wondering if PHP-FPM could cause all the crons to fire at the same time to "reload" (cache) the PHP scripts that are tied to them. But if that were the case, nobody changed anything in MultiPHP settings today, so unless WHM itself reloads the PHP-FPM pool or service, I don't see how this could occur automatically, or why it seemingly only happens once after a WHM self-update.

    For future reference: I disabled PHP-FPM on all accounts that have cron jobs, and we're now waiting for the next WHM self-update to see whether that resolves the issue.

     

  • Benjamin D.

    Alright, I just caught WHM updating to 120.0.5 and so far it's stable: crons have not fired in parallel. This makes me believe the whole thing was caused by PHP-FPM "reloading" all the crons after each WHM update. There was still a good 10 seconds of unresponsiveness across the whole server immediately following the WHM update, though. This is why I hate WHM updates in the middle of a weekday. Seriously, why not update at 3:00 AM on a Sunday instead of during work hours on a weekday? Is there a way to force it to never update during working hours?

  • Benjamin D.

    Ah, yes. I set it to update only on weekend days! Thanks cPRex. I'm also pleased to report that nothing unusual has occurred over the last 30-ish hours, so I'm quite confident that this whole thing was caused by PHP-FPM re-launching duplicates of cron jobs that were tied to PHP scripts after every WHM update. This is a serious bug that should be addressed by the PHP-FPM developers, but for the time being, as I mentioned above, I disabled PHP-FPM on all the subdomains that run cron jobs, and this seemingly fixed the issue!

  • cPRex Jurassic Moderator

    Sure thing!

  • Benjamin D.

    I'm pleased to report that WHM has updated again, to 120.0.9, and there was no surge in cron jobs and no DDoS-like state now that PHP-FPM is disabled on all accounts that have cron jobs.

    We can close this issue.

