TailWatch / Service Manager has stopped monitoring MySQL
Hi,
I'm hoping someone can advise me, please.
Since June 2018 I have been Administrator of a cPanel server with one WordPress website using it. Occasionally we have an "abusive user" that hammers the facilities on the website in a stupid way and the MySQL server snarls up in a state of "statistics" or "waiting for table level lock".
From June until late November, Service Monitor/chkservd used to restart the MySQL server and clear the processes. I would receive an email saying:
- "MYSQL appears to be down", or;
- "the service MySQL failed to restart"
- followed closely by an email saying "MySQL is now operational"
- Nothing has been changed in the configuration of cPanel.
- In Service Monitor MySQL is checked under both "Enable" and "Monitor".
- cPanel v76.0.20
- MySQL version 5.6
Hi @TwilightZoneCP If the service manager shows that the MySQL service is in fact monitored and enabled, it should be automatically restarting the service after a failure. Is there anything MySQL-related, such as service restart failures, in /var/log/chkservd.log?
Here is an example from /var/log/chkservd.log from when Service Manager was successfully monitoring and restarting MySQL:

Service Check Started
Loading services .....apache_php_fpm....cpanellogd....cpdavd....cphulkd....cpsrvd....crond....dnsadmin....exim....ftpd....httpd....imap....ipaliases....lmtp....mailman....mysql....named....nscd....pop....queueprocd....rsyslogd....spamd....sshd..Done
[2018-10-19 13:51:29 +0100] Disk check .... / (/) [19.77%] ... /var/tmp (/var/tmp) [5.46%] ... /tmp (/tmp) [5.46%] ... {status:ok} ... Done
[2018-10-19 13:51:29 +0100] OOM check ......OOM Event:[anon_rss=308488kB,file_rss=0kB,is_cgroup=0,pid=29324,proc_name=mysqld,score=108,seconds_since_boot=697213.249977,time=1539953265,total_vm=9319476kB,uid=993,user=mysql]....OOM Event:[anon_rss=2824kB,file_rss=0kB,is_cgroup=0,pid=1653,proc_name=named,score=9,seconds_since_boot=697213.367078,time=1539953265,total_vm=755852kB,uid=25,user=named].....Skipped OOM Notification (too soon)......Skipped OOM Notification (too soon)...... Done
[2018-10-19 13:51:29 +0100] Service check ....
queueprocd [[check command:+][socket connect:N/A]]...
sshd [[check command:+][socket connect:N/A]]...
spamd [[check command:+][socket connect:N/A]]...
rsyslogd [[check command:+][socket connect:N/A]]...
pop [[check command:+][socket connect:+]]...
p0f [[check command:N/A][socket connect:N/A]]...
nscd [[check command:+][socket connect:N/A]]...
named [[check command:-][check command output:(XID wc7gm3) The "named" service is down. The subprocess "/usr/local/cpanel/scripts/restartsrv_named" reported error number 255 when it ended.][socket connect:N/A][fail count:1]Restarting named.... [notify:failed service:nameserver]]...
mysql [[check command:-][check command output:(XID eg4ua2) The "mysql" service is down. The subprocess "/usr/local/cpanel/scripts/restartsrv_mysql" reported error number 255 when it ended.][socket connect:N/A][fail count:1]Restarting mysql.... [notify:failed service:mysql]]...
mailman [[check command:+][socket connect:N/A]]...
lmtp [[check command:+][socket connect:+]]...
ipaliases [[check command:+][socket connect:N/A]]...
interval [[check command:N/A][socket connect:N/A]]...
imap [[socket_service_auth:1][check command:+][socket connect:+]]...
httpd [[check command:N/A][socket connect:+]]...
ftpd [[socket_service_auth:1][check command:+][socket connect:+]]...
exim [[check command:+][socket connect:+]]...
dnsadmin [[http_service_auth:1][check command:+][socket connect:+]]...
crond [[check command:+][socket connect:N/A]]...
cpsrvd [[http_service_auth:1][check command:N/A][socket connect:+]]...
cphulkd [[check command:+][socket connect:+]]...
cpdavd [[http_service_auth:1][check command:+][socket connect:+]]...
cpanellogd [[check command:+][socket connect:N/A]]...
cpanel_php_fpm [[check command:N/A][socket connect:N/A]]...
apache_php_fpm [[check command:+][socket connect:N/A]]...Done
Service Check Finished
Here is an example of the /var/log/chkservd.log from a time when MySQL was stuck for almost 2 hours and I had to manually restart MySQL myself to resolve the issue:

Service Check Started
Loading services .....apache_php_fpm....cpanellogd....cpdavd....cphulkd....cpsrvd....crond....dnsadmin....exim....ftpd....httpd....imap....ipaliases....lmtp....mailman....mysql....named....nscd....pop....queueprocd....rsyslogd....spamd....sshd..Done
[2019-02-24 08:13:23 +0000] Disk check .... / (/) [69.83%] ... /tmp (/tmp) [5.46%] ... /var/tmp (/var/tmp) [5.46%] ... {status:ok} ... Done
[2019-02-24 08:13:23 +0000] OOM check ....Done
[2019-02-24 08:13:23 +0000] Service check ....
queueprocd [[check command:+][socket connect:N/A]]...
sshd [[check command:+][socket connect:N/A]]...
spamd [[check command:+][socket connect:N/A]]...
rsyslogd [[check command:+][socket connect:N/A]]...
pop [[check command:+][socket connect:+]]...
p0f [[check command:N/A][socket connect:N/A]]...
nscd [[check command:+][socket connect:N/A]]...
named [[check command:+][socket connect:N/A]]...
mysql [[check command:+][socket connect:N/A]]...
mailman [[check command:+][socket connect:N/A]]...
lmtp [[check command:+][socket connect:+]]...
ipaliases [[check command:+][socket connect:N/A]]...
imap [[socket_service_auth:1][check command:+][socket connect:+]]...
httpd [Service check failed to complete Timeout while trying to get data from service: Died[check command:N/A][socket connect:-][socket failure threshold:16/3][fail count:14]Restarting httpd.... [notify:failed service:httpd]]...
ftpd [[socket_service_auth:1][check command:+][socket connect:+]]...
exim [[check command:+][socket connect:+]]...
dnsadmin [[http_service_auth:1][check command:+][socket connect:+]]...
crond [[check command:+][socket connect:N/A]]...
cpsrvd [[http_service_auth:1][check command:N/A][socket connect:+]]...
cphulkd [[check command:+][socket connect:+]]...
cpgreylistd [[check command:N/A][socket connect:N/A]]...
cpdavd [[check command:+][socket connect:N/A]]...
cpanellogd [[check command:+][socket connect:N/A]]...
cpanel_php_fpm [[check command:N/A][socket connect:N/A]]...
apache_php_fpm [[check command:+][socket connect:N/A]]...Done
Service Check Finished
Does this shed any light on the matter? It all means next to nothing to me! Thank you!
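For anyone trying to read these logs, the part that matters is the per-service entry in the "Service check" section. In the two samples above, the mysql entry shows "check command:-" when chkservd restarted it and "check command:+" when MySQL was stuck. The Python sketch below pulls those two fields out of a log excerpt. The function name service_status and the shortened sample strings are hypothetical, and reading "+" as passed and "-" as failed is an assumption based on how the samples above behave, not a statement of the official log format.

```python
import re

# Shortened excerpts of the two log samples posted above (illustrative only).
RESTARTED = (
    'mysql [[check command:-][check command output:(XID eg4ua2) The "mysql" '
    'service is down.][socket connect:N/A][fail count:1]Restarting mysql.... '
    '[notify:failed service:mysql]]... '
    'mailman [[check command:+][socket connect:N/A]]...'
)
STUCK = (
    'named [[check command:+][socket connect:N/A]]... '
    'mysql [[check command:+][socket connect:N/A]]... '
    'mailman [[check command:+][socket connect:N/A]]...'
)


def service_status(log_text, service):
    """Return (check_command, socket_connect) for one service, or None if absent."""
    # An entry starts with the service name followed by " [", e.g. "mysql [".
    start = re.search(r"(?<![\w.])" + re.escape(service) + r" \[", log_text)
    if start is None:
        return None
    rest = log_text[start.end():]
    # In the samples above, each entry is closed by the first "]]" that follows.
    end = rest.find("]]")
    entry = rest if end == -1 else rest[:end + 2]
    check = re.search(r"\[check command:(\+|-|N/A)\]", entry)
    sock = re.search(r"\[socket connect:(\+|-|N/A)\]", entry)
    return (
        check.group(1) if check else None,
        sock.group(1) if sock else None,
    )


print(service_status(RESTARTED, "mysql"))  # ('-', 'N/A')
print(service_status(STUCK, "mysql"))      # ('+', 'N/A')
```

In the second case the check command passes even though queries were hanging, which is why no restart was triggered.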
Hi @TwilightZoneCP That output in the second code box shows it seeing/checking MySQL but obviously not seeing it being down. I'm curious what the output of the following is:

ps faux |grep -i mysql
Here is the output:

# ps faux |grep -i mysql
root 25697 0.0 0.0 112708 996 pts/0 S+ 15:13 0:00 \_ grep --color=auto -i mysql
mysql 7454 0.0 0.0 113312 1292 ? Ss Feb25 0:00 /bin/sh /usr/bin/mysqld_safe
mysql 7701 1.9 11.5 6074220 926084 ? Sl Feb25 26:12 \_ /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --log-error=serversaddress.com.err --open-files-limit=10000 --pid-file=serversaddress.com.pid
Hi @TwilightZoneCP Nothing weird there; it all looks as it should. You don't have a custom MySQL dir, so that rules that out. Have you ever made any changes to the my.cnf? If not, can you please open a ticket using the link in my signature? Once it is open, please reply with the Ticket ID here so that we can update this thread with the resolution once the ticket is resolved. Thanks!
This is in the my.cnf:

[mysqld]
performance-schema=on
default-storage-engine=MyISAM
slow_query_log=1
slow_query_log_file=slow-query.log
innodb_file_per_table=1
max_allowed_packet=268435456
innodb_buffer_pool_instances=4
query_cache_size=0
query_cache_type=0
query_cache_limit=1M
tmp_table_size=16M
max_heap_table_size=16M
innodb_buffer_pool_size=4G
innodb_log_file_size=2G
open_files_limit=10000
Hi @TwilightZoneCP That all looks standard. Please go ahead and open a ticket so we can look further into the configuration on the server.
My Support Request ID is: 11534115. Many thanks for your assistance, it's much appreciated!
Hi @TwilightZoneCP You're most welcome and I'm sure we'll be able to get to the bottom of this soon. I'll update this thread with the outcome of the ticket as soon as more information is available. Thanks!
Hello, I checked in on this ticket today and it looks like the service was indeed being monitored, but the restart wasn't occurring because it was waiting on a table-level lock on a database to be cleared. The advice from the analyst was to look at switching to InnoDB for row-level locking.
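As a rough illustration of that advice, the Python sketch below builds the ALTER TABLE statements that would convert MyISAM tables to InnoDB. It assumes the input rows come from a query such as SELECT table_schema, table_name, engine FROM information_schema.tables. The function name innodb_conversion_statements and the sample schema and table names are hypothetical. It only generates SQL text and does not connect to or change any database. Take a backup before running any generated statement. Note also that the my.cnf posted earlier sets default-storage-engine=MyISAM, so newly created tables would presumably still default to MyISAM unless that line is changed.

```python
# Schemas that belong to MySQL itself and should be left alone.
SYSTEM_SCHEMAS = {"mysql", "information_schema", "performance_schema", "sys"}


def quote_identifier(name):
    """Quote a MySQL identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def innodb_conversion_statements(tables):
    """Build ALTER TABLE statements for every non-system MyISAM table.

    `tables` is an iterable of (schema, table, engine) tuples.
    """
    statements = []
    for schema, table, engine in tables:
        if schema in SYSTEM_SCHEMAS:
            continue
        # engine can be NULL (None) for views, so guard before lowercasing.
        if (engine or "").lower() != "myisam":
            continue
        statements.append(
            f"ALTER TABLE {quote_identifier(schema)}."
            f"{quote_identifier(table)} ENGINE=InnoDB;"
        )
    return statements


# Hypothetical rows, for illustration only.
rows = [
    ("wp_site", "wp_posts", "MyISAM"),
    ("wp_site", "wp_options", "InnoDB"),
    ("mysql", "user", "MyISAM"),
]
for statement in innodb_conversion_statements(rows):
    print(statement)
```

With the sample rows, only wp_site.wp_posts produces a statement: wp_options is already InnoDB and the mysql schema is skipped.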
It appears so. Also, for anyone else reviewing this thread in future, I upgraded MySQL from version 5.6 to version 5.7 and testing suggests that this has resolved the "table level lock" issue, as the server is handling much bigger requests without issue.