Apache workers not exiting cleanly
On one of our servers we're seeing scoreboards eventually fill with workers trying to stop. They will typically say "Stopping = Yes" and have 1 or 2 connections and be stuck in stopping forever. Eventually they all get in that state and we need to restart httpd when the log starts getting lines like:
[QUOTE]
[mpm_event:error] [pid 269334:tid 47973321840000] AH03490: scoreboard is full, not at MaxRequestWorkers.Increase ServerLimit.
When gracefully restarting we got logs like:
[QUOTE]
[Wed Oct 27 09:35:20.693405 2021] [core:warn] [pid 269334:tid 47973321840000] AH00045: child process 1256367 still did not exit, sending a SIGTERM
[Wed Oct 27 09:35:22.695412 2021] [core:warn] [pid 269334:tid 47973321840000] AH00045: child process 1256367 still did not exit, sending a SIGTERM
[Wed Oct 27 09:35:24.697411 2021] [core:warn] [pid 269334:tid 47973321840000] AH00045: child process 1256367 still did not exit, sending a SIGTERM
[Wed Oct 27 09:35:26.699429 2021] [core:error] [pid 269334:tid 47973321840000] AH00046: child process 1256367 still did not exit, sending a SIGKILL
We are running:
[QUOTE]
Server Version: Apache/2.4.51 (cPanel) OpenSSL/1.1.1l mod_bwlimited/1.4 Phusion_Passenger/6.0.7
Server MPM: event
Server Built: Oct 7 2021 19:16:56
Anyone else seeing this issue and/or have any debugging tips? Edit: I just realized I put this in the wrong category/forum but I can't find a way to move it or delete it. Sorry about that!
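As a starting point for debugging, the restart log lines quoted in this post can be summarized per child process to see which PIDs needed repeated SIGTERMs or a final SIGKILL. This is only a minimal sketch: the function name `summarize_escalations` is hypothetical, and the regular expression assumes the AH00045/AH00046 message wording exactly as it appears in the log excerpt.

```python
import re
from collections import defaultdict

# Matches the AH00045 (SIGTERM) and AH00046 (SIGKILL) messages as quoted in the post.
ESCALATION = re.compile(
    r"AH0004[56]: child process (\d+) still did not exit, sending a (SIGTERM|SIGKILL)"
)

def summarize_escalations(log_text):
    """Count SIGTERM and SIGKILL escalations per child PID (hypothetical helper)."""
    counts = defaultdict(lambda: {"SIGTERM": 0, "SIGKILL": 0})
    for pid, signal_name in ESCALATION.findall(log_text):
        counts[int(pid)][signal_name] += 1
    return {pid: dict(per_signal) for pid, per_signal in counts.items()}

# The four log lines from the post, used as sample input.
sample_log = """
[Wed Oct 27 09:35:20.693405 2021] [core:warn] [pid 269334:tid 47973321840000] AH00045: child process 1256367 still did not exit, sending a SIGTERM
[Wed Oct 27 09:35:22.695412 2021] [core:warn] [pid 269334:tid 47973321840000] AH00045: child process 1256367 still did not exit, sending a SIGTERM
[Wed Oct 27 09:35:24.697411 2021] [core:warn] [pid 269334:tid 47973321840000] AH00045: child process 1256367 still did not exit, sending a SIGTERM
[Wed Oct 27 09:35:26.699429 2021] [core:error] [pid 269334:tid 47973321840000] AH00046: child process 1256367 still did not exit, sending a SIGKILL
"""

print(summarize_escalations(sample_log))
```

A PID that shows up here and also appears as "Stopping = Yes" in the status page is a reasonable first candidate to inspect before it gets killed.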
Hello! It sounds like the MaxRequestWorkers limit is getting hit repeatedly, which may be causing these issues. Could you let me know if these articles help? Tuning MaxRequestWorkers for Apache
It does not appear to be traffic related. What is happening is that some processes will say "stopping" and hang (for days) and never stop. Eventually they all get to stopping and there's nothing left to serve requests until Apache is restarted. Here is what I'm seeing right now. Most of those with stopping=yes have been that way for hours.

Slot  PID      Stopping  Connections        Threads     Async connections
                         total  accepting   busy  idle  writing  keep-alive  closing
0     3423094  no        15     yes         6     19    3        5           1
1     3027370  yes       1      no          0     0     0        0           0
2     3415720  yes       2      no          0     0     2        0           0
3     3053957  yes       2      no          0     0     0        0           0
4     3424137  no        7      yes         2     23    1        3           1
5     3425059  no        17     yes         6     19    1        8           2
6     3079413  yes       2      no          0     0     0        0           1
Sum   7        4         46                 14    61    7        16          5

___R__R_W_R____WW________.......R...............................
.....G..G...............R...........___________W__________R_____
_W_RW_______WW____R___............R.............................
........
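To watch this over time rather than by eye, the per-process rows from the status page can be parsed to list the children that are stopping but still hold connections. The following is a minimal sketch, assuming the rows are available as plain text in the column order shown in this post (slot, PID, stopping, connections total, accepting, threads busy, threads idle, async writing, keep-alive, closing); the function name `stuck_children` is hypothetical and not part of Apache or cPanel.

```python
import re

# One per-process row: slot, pid, stopping, total, accepting, busy, idle,
# writing, keep-alive, closing. Column order is assumed from the quoted table.
ROW = re.compile(
    r"\b(\d+)\s+(\d+)\s+(yes|no)\s+(\d+)\s+(yes|no)"
    r"\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\b"
)

def stuck_children(status_text):
    """Return (pid, connections) for children that are stopping but still hold connections."""
    stuck = []
    for match in ROW.finditer(status_text):
        pid = int(match.group(2))
        stopping = match.group(3)
        total_connections = int(match.group(4))
        if stopping == "yes" and total_connections > 0:
            stuck.append((pid, total_connections))
    return stuck

# The seven rows from the post, used as sample input.
sample = """
0 3423094 no 15 yes 6 19 3 5 1
1 3027370 yes 1 no 0 0 0 0 0
2 3415720 yes 2 no 0 0 2 0 0
3 3053957 yes 2 no 0 0 0 0 0
4 3424137 no 7 yes 2 23 1 3 1
5 3425059 no 17 yes 6 19 1 8 2
6 3079413 yes 2 no 0 0 0 0 1
"""

print(stuck_children(sample))
# prints [(3027370, 1), (3415720, 2), (3053957, 2), (3079413, 2)]
```

Running this periodically and recording the timestamps would show how long each PID stays in that state, which may help correlate the hang with a specific request or module.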
In order to rule out a cPanel-side issue, could you open a ticket using the link in my signature? Please update me with the incident ID so I can post the solution here.