Suspect Updates are crashing server, or server cannot cope...
Hi all,
We have been running an AWS server (t2.2xlarge) with CentOS 7 since Oct 2019. We use Cloudflare (and while trying to resolve this issue we enabled Cloudflare DDoS protection, just to rule that out).
Most of the time the site seems to run fine with load averages of: 0.51 0.51 0.50
The site gets 700-900k page views a day, and each day it peaks at 2500-4000 active users on the site (from analytics). As mentioned, the server seems to handle these numbers fine, so we do not think the problem is the server failing to handle the load.
Twice our server crashed, and we haven't been able to work out the reason. The first time was in Sept 2020, and I asked about it
Hey there! First of all, cool username :D Sorry to hear about the issues with the site.
The mysql process that you mention is just the main process, which runs on CentOS 7 systems. I see the same thing on a CentOS 7 test machine when I check with the "ps aux" command:
[root@90test ~]# ps aux | grep mysql
mysql 1317 0.0 3.7 1036920 77220 ? Sl Nov20 1:22 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid
so I don't think that is the issue with the system, unless MySQL was using too many resources and causing stability issues.
The second thing you posted, which shows the InnoDB crash, is much more concerning. That indicates there is damage to the MySQL data on the system, which could keep things from working well and will cause issues. The guide we have posted in the link below explains how you can confirm this (although I would say the log you have provided is enough to confirm the issue) and also gives a link to some resources to help repair that:
Hi cPRex! Thank you so much for your reply.
I followed the guide you provided, but could not find anything. We did not have a /var/lib/mysql/HOSTNAME.err error log; the only log we had was /var/log/mysqld.log. I searched that log for the keywords in the guide, "corruption | failed | corrupt | deleted | moved", and none of those words appeared. When I ran the checks on the tables in MySQL, all the tables came back "ok" (see attached). So I am not sure whether the MySQL data is corrupt.
The server copes with a peak of 3k concurrent users and 800k pageviews a day, but could it be struggling during a server update? Our daily process logs show we use 37% of memory each day. Is there any way to check this theory?
Do you, or anyone else, have any other ideas about what I could look into to resolve this issue? Thanks again for your reply and help.
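For anyone repeating the keyword search described in this post, it can be done as one case-insensitive grep. This is only a minimal sketch: the function name scan_mysql_log is hypothetical (it is not a cPanel or MySQL tool), the default path is the /var/log/mysqld.log mentioned above, and the keyword list is the one quoted from the guide.

```shell
#!/bin/sh
# scan_mysql_log is a hypothetical helper name, not part of cPanel or MySQL.
# It prints every log line (prefixed with its line number) that contains one
# of the guide's keywords, ignoring case, or a note when nothing matches.
scan_mysql_log() {
    # Default path is the one mentioned in this thread; pass another as $1.
    log_file="${1:-/var/log/mysqld.log}"
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 2
    fi
    grep -inE 'corruption|failed|corrupt|deleted|moved' "$log_file" \
        || echo "no matches in $log_file"
}
```

Calling scan_mysql_log with no argument scans /var/log/mysqld.log; passing a path scans that file instead. An empty result only means these words do not appear in that particular file, not that the data is healthy.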
Thanks for the additional details. The earlier log showed the InnoDB crash clearly, so you may want to search around that point in the log file. The hostname.err log is just the default location, but it can be changed on the system in the /etc/my.cnf file.
For the memory check, a summary of any type isn't really accurate; it's best to watch the usage in real time while you're experiencing the issues with the machine. It's possible that while updates are performed that affect services, such as an Apache restart or MySQL restart, there could be additional slowness with the system. If you need to, you can adjust root's crontab to change the time the updates run on the system, or you could manually run "/scripts/upcp --force" to see if you can reproduce the issues just by running that.
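One possible way to watch usage in real time, as suggested above, is to record a timestamped sample every few seconds while the update runs in another terminal. The following is a rough sketch, not a cPanel tool: the function names sample_usage and log_usage are hypothetical, and it assumes a Linux system that exposes /proc/loadavg and a MemAvailable line in /proc/meminfo.

```shell
#!/bin/sh
# sample_usage and log_usage are hypothetical helper names for illustration.

# Print one line: timestamp, 1-minute load average, available memory in MB.
sample_usage() {
    load=$(cut -d ' ' -f 1 /proc/loadavg)
    # MemAvailable is reported in kB; fall back to 0 if the line is missing.
    avail_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
    echo "$(date '+%Y-%m-%d %H:%M:%S') load=$load mem_available_mb=$(( ${avail_kb:-0} / 1024 ))"
}

# log_usage COUNT INTERVAL FILE
# Append COUNT samples to FILE, waiting INTERVAL seconds between samples.
log_usage() {
    count="$1"
    interval="$2"
    out_file="$3"
    i=0
    while [ "$i" -lt "$count" ]; do
        sample_usage >> "$out_file"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, running "log_usage 360 5 /root/upcp-usage.log &" (the file name is just an example) records roughly 30 minutes of samples in the background. Starting "/scripts/upcp --force" afterwards lets you line up any slowdown with the timestamps in that file.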