mysqld.pid killed our server...
Hi All,
We've been running our AWS server (a t2.2xlarge) since Oct 2019 and have had no issues.
We use Cloudflare, Imunify360, and KernelCare, and our typical load averages are 0.51 0.51 0.50.
Yesterday we were notified our site was running slow, and we logged in to see load averages of 365.55 400.98 411.46.
I looked at the Process Manager and saw this line:
/usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid
It was using 50% of the CPU. I then restarted the SQL server, restarted the server itself, and upgraded to the v90.0.8 WHM release. I have checked our error_log, our messages file, and our access_log, but I cannot see anything that would have caused the site to crash/hang. Is there anything else I can try to see what caused the issue? Like I said, we've been running for close to a year with no problems, so you would think something would stand out in a log, but we do not know where else to look on the server to find what caused the problem. Thanks so much!
Hello, to help you with this issue, could you provide a little more information? At the time of the problem, please send the output of top (the full header plus the first 10 lines) and also iotop. Thank you.
Did you happen to look at the MySQL process list before starting things? Have you looked in the MySQL log?
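If it helps, both can be checked quickly from the command line. A rough sketch, assuming root can connect to MySQL without a password (as is usual on a cPanel box):

mysql -e "SHOW FULL PROCESSLIST;"
mysql -e "SHOW VARIABLES LIKE 'log_error';"

The first shows the queries running at that moment; the second prints the path of the MySQL error log so you know which file to read.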
To help you with this issue, could you provide a little more information? At the time of the problem, please send the output of top (the full header plus the first 10 lines) and also iotop.
Hi - I do not have iotop installed, but top shows the following:

Tasks: 376 total, 4 running, 370 sleeping, 0 stopped, 2 zombie
%Cpu(s): 16.6 us, 7.9 sy, 0.0 ni, 75.2 id, 0.0 wa, 0.0 hi, 0.1 si, 0.1 st
KiB Mem : 32779080 total, 23613508 free, 3297020 used, 5868552 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 29046312 avail Mem

  PID USER   PR NI    VIRT    RES   SHR S %CPU %MEM   TIME+ COMMAND
  640 root   20  0 1488888  22652  6788 S  1.7  0.1 8:29.35 imunify360-php-
21683 nobody 20  0  341608 141780  3860 S  1.3  0.4 0:00.37 httpd
26002 nobody 20  0  358740  22444 12496 R  1.3  0.1 0:00.04 php-cgi
12113 nobody 20  0  341632 142060  4112 S  1.0  0.4 0:01.11 httpd
16235 nobody 20  0  341596 141960  4052 S  1.0  0.4 0:00.83 httpd
25426 nobody 20  0  341616 141236  3308 S  1.0  0.4 0:00.07 httpd
25504 nobody 20  0  341492 140964  3268 S  1.0  0.4 0:00.05 httpd
25547 nobody 20  0  341636 141200  3252 S  1.0  0.4 0:00.10 httpd
 6611 nscd   20  0 2692384   5528  1472 S  0.7  0.0 5:34.01 nscd
 8015 nobody 20  0  341608 141980  4052 S  0.7  0.4 0:01.76 httpd
 8028 nobody 20  0  341616 142084  4152 S  0.7  0.4 0:01.38 httpd
 8081 nobody 20  0  341620 142000  4068 S  0.7  0.4 0:01.54 httpd
10108 nobody 20  0  341616 141976  4044 S  0.7  0.4 0:01.30 httpd
11223 nobody 20  0  341616 141960  4028 S  0.7  0.4 0:01.28 httpd
12126 nobody 20  0  341604 141992  4072 S  0.7  0.4 0:01.17 httpd
14031 nobody 20  0  341492 141864  4060 S  0.7  0.4 0:00.92 httpd
15377 nobody 20  0  341600 141968  4056 S  0.7  0.4 0:00.89 httpd
Right now there are 2,485 active users, which is why the stats may look high.

Did you happen to look at the MySQL process list before starting things? Have you looked in the MySQL log?
Hi - when I looked at the Process Manager before restarting the server, it was this line using 50% of the CPU:

/usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid

When I look in my daily process logs for yesterday, I see the following Top Processes:

nobody 29.0 /opt/cpanel/ea-php73/root/usr/bin/php-cgi
nobody 25.0 /opt/cpanel/ea-php73/root/usr/bin/php-cgi
nobody 5.0 /opt/cpanel/ea-php73/root/usr/bin/php-cgi
site1 domain.com 2.3 cpaneld - serving 92.27.219.188
cpanelphpmyadmin 2.0 php-fpm: pool cpanelphpmyadmin
site2 domain2.com 0.8 php-fpm: pool domain2.com
site1 domain.com 0.7 cpaneld - serving 92.27.219.188
site1 domain.com 0.5 php-fpm: pool site1
site2 domain2.com 0.3 php-fpm: pool site2
imunify360-webshield 0.2 wsshdict: worker
imunify360-webshield 0.1 wsshdict: worker
ossec 0.1 /var/ossec/bin/ossec-analysisd
polkitd 0.1 /usr/lib/polkit-1/polkitd --no-debug
The MySQL error log shows nothing strange or noteworthy; these two lines appear around the time the server crashed:

[Note] InnoDB: page_cleaner: 1000ms intended loop took 4365ms. The settings might not be optimal. (flushed=2 and evicted=0, during the time.)

and:

[Note] Access denied for user 'Cpanel::MysqlUtils::Unprivileged'@'localhost' (using password: NO)

Thanks for the replies to both of you! :)
You can't really look at it 'now'; you'd have to look at it while it's having an issue. Looking at the MySQL process from the Linux command line (or the WHM process list) is not really going to help you much. You need to see what MySQL is actually 'doing' when it's using a lot of CPU. From the command line you can run:

mysqladmin processlist
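A rough sketch of how you might watch it during the next spike (exact flags may vary by MySQL version, and this assumes root's credentials are in /root/.my.cnf as cPanel normally sets up):

# repeat the full process list every 2 seconds; --verbose keeps the query text from being truncated
mysqladmin --verbose --sleep=2 processlist

Watch for queries that stay in the list across refreshes and note their User and db columns; those point at the site or database causing the load.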
Or (if you can) you can go into phpMyAdmin, Status, Processes. You'd then be able to see the user/database and at least part of the query that was running. More than likely it was one of your sites getting beat on (bots, crawlers, etc.). That in turn caused PHP to make a lot of MySQL requests (assuming your sites are PHP/MySQL), and that, in turn, cranked up the load on your server. If you can get into the server while it's going on, you can see which IPs are connecting to the web server (from netstat or Apache status). If there are a lot from the same IP or IP range, you can try to block that IP/range via your firewall/iptables. Hopefully that gets you going in the right direction the next time this happens.
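A rough example of both steps (just a sketch; the IP below is a placeholder, adjust to what you actually see, and if you run CSF or Imunify360 you would normally block through that firewall rather than raw iptables):

# count established connections per remote IP, busiest first
netstat -tn | grep ESTABLISHED | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -rn | head

# temporarily drop traffic from one offending address (example IP)
iptables -I INPUT -s 203.0.113.45 -j DROP

If one address or range dominates the first list while the load is high, that is usually your culprit.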