
[CPANEL-21627] Chkservd reports service failures during graceful reboots

Comments

7 comments

  • cPanelMichael
    Hello @leadwatch, This can happen after a reboot when the service monitoring process (Chkservd) starts before the other services. You can try increasing the default value of "3" to a value such as "4" for the following option under the System tab in WHM >> Tweak Settings: ChkServd TCP check failure threshold. Per its description: "The number of times a ChkServd TCP check must fail before notification is sent and the service is restarted. On heavily loaded systems these types of service checks fail occasionally, producing erroneous indications that services are down. A value of 3 or higher is recommended for most systems." Thank you.
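
To illustrate how a failure threshold of this kind generally behaves, here is a minimal Python sketch. It is not cPanel's implementation: the class name `FailureThreshold`, the `record` method, and the assumption that only consecutive failures are counted are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FailureThreshold:
    """Hypothetical model of a 'check failure threshold' setting.

    Assumption: failures are counted consecutively and a successful
    check resets the count. The real Chkservd logic may differ.
    """

    threshold: int = 3
    failures: dict = field(default_factory=dict)

    def record(self, service: str, check_passed: bool) -> bool:
        """Record one check result.

        Returns True only on the check where the consecutive failure
        count reaches the threshold (the point at which a notification
        and restart would be triggered in this model).
        """
        if check_passed:
            self.failures[service] = 0
            return False
        count = self.failures.get(service, 0) + 1
        self.failures[service] = count
        return count == self.threshold


if __name__ == "__main__":
    monitor = FailureThreshold(threshold=3)
    for passed in (False, False, False):
        print(monitor.record("httpd", passed))
```

In this model, raising the threshold only delays the alert by additional failed checks; it does not suppress alerts for a service that stays unreachable across that many checks.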
  • leadwatch
    Hi Michael, I set the ChkServd TCP check failure threshold to 5 and rebooted the server to test. It did not fix the issue. I still got a series of alert emails for failed services right after reboot. Do you have any other suggestions? Thanks.
  • cPanelMichael
    Hello @leadwatch, Can you open a
  • leadwatch
    I opened a ticket as you suggested.
  • cPanelMichael
    Hello @leadwatch, To update, per support ticket 9730779, it looks like the services were not actually failing. Instead the service checks from Chkservd occurred while the server was in the process of shutting down and thus the notifications were sent when the server booted back up. Thank you.
  • leadwatch
    cPanelMichael wrote: "the services were not actually failing. Instead the service checks from Chkservd occurred while the server was in the process of shutting down and thus the notifications were sent when the server booted back up"
    Yes, this is the problem I'm having. I suspected the services were not actually failing, but it's nice to have confirmation. However, that still leaves me with the same problem: a flood of unnecessary and unwanted failure emails every time I restart the server. Response from cPanel support: "It was not a problem! I do believe there could be room for improvement here, in that we could possibly suspend chkservd if cpsrvd issues a shutdown/reboot (aka, doing a graceful restart from within WHM). Then on the tailwatchd start up process we can check to see if there is a suspended state and 'unsuspend' based on how much time has passed. I am going to do some testing on this and will be sure to update this ticket if an improvement case gets pushed out for this. Regarding the flood of email, I am unfortunately not seeing anything specific that can be changed to alter that behavior at the moment. The improvement case would definitely prevent that from happening so frequently as well. We greatly appreciate your understanding."
    Has anyone in the community experienced this issue? Does anyone know of a workaround or a setting that may be causing this?
  • cPanelMichael
    Hello @leadwatch, We do have an internal case open (CPANEL-21627) that would address this issue by suspending Chkservd (the service monitoring daemon) upon initiating a graceful reboot through WebHost Manager (WHM). I'll monitor this case and update this thread with more information on its status as it becomes available. There's no workaround to report at this time, but you can safely ignore the notifications that are sent while the server is rebooting. Thank you.
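
The suspend-on-reboot idea discussed in this thread could look roughly like the following Python sketch. This is only an illustration of the general pattern (write a timestamped marker before shutdown, then ignore it once it is too old); the function names, the marker file, the JSON format, and the `max_age` default are all hypothetical and do not describe how CPANEL-21627 is or will be implemented.

```python
import json
import os
import time

# Hypothetical default: how long a suspension stays valid, in seconds.
DEFAULT_MAX_AGE = 600


def suspend(marker_path, now=None):
    """Write a marker recording when monitoring was suspended.

    In this sketch it would be called just before a graceful reboot.
    """
    now = time.time() if now is None else now
    with open(marker_path, "w") as fh:
        json.dump({"suspended_at": now}, fh)


def is_suspended(marker_path, now=None, max_age=DEFAULT_MAX_AGE):
    """Return True while a recent suspension marker exists.

    A missing, unreadable, or expired marker means monitoring should
    run normally; an expired or unreadable marker is removed so the
    suspension cannot persist indefinitely.
    """
    now = time.time() if now is None else now
    try:
        with open(marker_path) as fh:
            suspended_at = float(json.load(fh)["suspended_at"])
    except FileNotFoundError:
        return False
    except (ValueError, KeyError, TypeError):
        os.remove(marker_path)
        return False
    if now - suspended_at >= max_age:
        os.remove(marker_path)
        return False
    return True
```

A monitoring loop built on this sketch would call `is_suspended(...)` before each round of checks and skip both the checks and any notifications while it returns True.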
