Disk I/O Utilisation High and Errors in Logs - Site Unusable
Hi,
Just a few days ago my server started to show high load averages, but yesterday it really went out of control and I have battled to keep it up since then.
I am experiencing disk I/O utilisation of over 100%. There is an insane amount of writing going on. In addition, the load averages are in the double digits, so much so that I can't even get into WHM unless I reboot the virtual server via the hosting control panel.
In my messages log I find something like this:
Jan 6 04:43:46 server1 kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jan 6 04:43:46 server1 kernel: ata1.00: failed command: WRITE DMA
Jan 6 04:43:46 server1 kernel: ata1.00: cmd ca/00:60:50:9c:57/00:00:00:00:00/e3 tag 0 dma 49152 out
Jan 6 04:43:46 server1 kernel: res 40/00:01:06:4f:c2/00:00:00:00:00/a0 Emask 0x4 (timeout)
Jan 6 04:43:46 server1 kernel: ata1.00: status: { DRDY }
Jan 6 04:43:46 server1 kernel: ata1: soft resetting link
Jan 6 04:43:46 server1 kernel: ata1.00: configured for MWDMA2
Jan 6 04:43:46 server1 kernel: ata1: EH complete
Jan 6 04:47:09 server1 kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jan 6 04:47:09 server1 kernel: ata1.00: failed command: WRITE DMA
Jan 6 04:47:09 server1 kernel: ata1.00: cmd ca/00:58:c0:71:8a/00:00:00:00:00/e3 tag 0 dma 45056 out
Jan 6 04:47:09 server1 kernel: res 40/00:01:06:4f:c2/00:00:00:00:00/a0 Emask 0x4 (timeout)
Jan 6 04:47:09 server1 kernel: ata1.00: status: { DRDY }
Jan 6 04:47:09 server1 kernel: ata1: soft resetting link
Jan 6 04:47:09 server1 kernel: ata1.00: configured for MWDMA2
Jan 6 04:47:09 server1 kernel: ata1: EH complete
Jan 6 04:49:23 server1 kernel: Clocksource tsc unstable (delta = -8589720273 ns). Enable clocksource failover by adding clocksource_failover kernel parameter.
Jan 6 04:49:23 server1 kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jan 6 04:49:23 server1 kernel: ata1.00: failed command: WRITE DMA
Jan 6 04:49:23 server1 kernel: ata1.00: cmd ca/00:00:b0:75:8a/00:00:00:00:00/e3 tag 0 dma 131072 out
Jan 6 04:49:23 server1 kernel: res 40/00:01:06:4f:c2/00:00:00:00:00/a0 Emask 0x4 (timeout)
Jan 6 04:49:23 server1 kernel: ata1.00: status: { DRDY }
Jan 6 04:49:23 server1 kernel: ata1: soft resetting link
Jan 6 04:49:23 server1 kernel: ata1.00: configured for MWDMA2
Jan 6 04:49:23 server1 kernel: ata1: EH complete
The error log shows the following strange entries:
DBD::mysql::st execute failed: There is no such grant defined for user '' on host 'localhost' at /usr/local/cpanel/Cpanel/MysqlUtils/Connect.pm line 82.
DBD::mysql::st execute failed: There is no such grant defined for user '' on host 'server1.xxx.info' at /usr/local/cpanel/Cpanel/MysqlUtils/Connect.pm line 82.
DBD::mysql::st execute failed: There is no such grant defined for user '' on host 'localhost' at /usr/local/cpanel/Cpanel/MysqlUtils/Connect.pm line 82.
DBD::mysql::st execute failed: There is no such grant defined for user '' on host 'server1.xxx.info' at /usr/local/cpanel/Cpanel/MysqlUtils/Connect.pm line 82.
DBD::mysql::st execute failed: There is no such grant defined for user '' on host 'localhost' at /usr/local/cpanel/Cpanel/MysqlUtils/Connect.pm line 82.
DBD::mysql::st execute failed: There is no such grant defined for user '' on host 'server1.xxx.info' at /usr/local/cpanel/Cpanel/MysqlUtils/Connect.pm line 82.
I also suspect this is when it all started:
[2014-01-03 16:22:58 +0200] warn [cpses_tool] Timeout of authentication at /usr/local/cpanel/Cpanel/MysqlUtils/NetMySQL.pm line 63
at /usr/local/cpanel/Cpanel/MysqlUtils/NetMySQL.pm line 72
Cpanel::MysqlUtils::NetMySQL::connect('dbuser', 'root', 'dbserver', 'localhost', 'database', 'mysql', 'debug', 0, 'dbpass', 'xxx') called at /usr/local/cpanel/Cpanel/MysqlUtils/NetMySQL.pm line 29
Cpanel::MysqlUtils::NetMySQL::new('Cpanel::MysqlUtils', 'database', 'mysql', 'dbuser', 'root', 'dbpass', 'xxx', 'dbserver', 'localhost', 'debug', 0) called at /usr/local/cpanel/Cpanel/Mysql.pm line 113
eval {...} called at /usr/local/cpanel/Cpanel/Mysql.pm line 113
Cpanel::Mysql::_get_dbh(Cpanel::Mysql=HASH(0x271e000), 'localhost', 'root', 'xxx') called at /usr/local/cpanel/Cpanel/Mysql.pm line 77
Cpanel::Mysql::new('Cpanel::Mysql', HASH(0x2a5f650)) called at bin/cpses_tool line 182
bin::cpses_tool::action_CLEANUPSESSIONS('bin::cpses_tool', HASH(0x27652e0)) called at bin/cpses_tool line 73
bin::cpses_tool::process_request(HASH(0x27652e0)) called at bin/cpses_tool line 60
bin::cpses_tool::script('bin::cpses_tool') called at bin/cpses_tool line 36
[2014-01-03 16:23:11 +0200] warn [cpses_tool] Error while connecting to MySQL: Timeout of authentication at /usr/local/cpanel/Cpanel/Mysql.pm line 1600
Cpanel::Mysql::_log_error_and_output(Cpanel::Mysql=HASH(0x271e000), 'Error while connecting to MySQL: [_1]', 'Timeout of authentication') called at /usr/local/cpanel/Cpanel/Mysql.pm line 117
Cpanel::Mysql::_get_dbh(Cpanel::Mysql=HASH(0x271e000), 'localhost', 'root', 'xxx') called at /usr/local/cpanel/Cpanel/Mysql.pm line 77
Cpanel::Mysql::new('Cpanel::Mysql', HASH(0x2a5f650)) called at bin/cpses_tool line 182
bin::cpses_tool::action_CLEANUPSESSIONS('bin::cpses_tool', HASH(0x27652e0)) called at bin/cpses_tool line 73
bin::cpses_tool::process_request(HASH(0x27652e0)) called at bin/cpses_tool line 60
bin::cpses_tool::script('bin::cpses_tool') called at bin/cpses_tool line 36
[2014-01-03 16:27:13 +0200] warn [cpses_tool] Lock on /var/cpanel/cpanel.config.lock lost! at /usr/local/cpanel/Cpanel/SafeFile.pm line 159
Cpanel::SafeFile::safeunlock(Cpanel::SafeFileLock=ARRAY(0x2945640)) called at /usr/local/cpanel/Cpanel/SafeFile.pm line 77
Cpanel::SafeFile::safeclose(IO::Handle=GLOB(0x281d7c0), Cpanel::SafeFileLock=ARRAY(0x2945640)) called at /usr/local/cpanel/Cpanel/Config/LoadCpConf.pm line 179
Cpanel::Config::LoadCpConf::loadcpconf() called at /usr/local/cpanel/Cpanel/Locale.pm line 278
Cpanel::Locale::get_server_locale() called at /usr/local/cpanel/Cpanel/Locale/Utils/User.pm line 192
Cpanel::Locale::Utils::User::get_user_locale('root') called at /usr/local/cpanel/Cpanel/Locale/Utils/User.pm line 35
Cpanel::Locale::Utils::User::init_cpdata_keys() called at (eval 3) line 1
eval ' Cpanel::Locale::Utils::User::init_cpdata_keys(); \x0A;' called at /usr/local/cpanel/Cpanel/Locale.pm line 38
Cpanel::Locale::preinit() called at /usr/local/cpanel/Cpanel/Locale.pm line 99
Cpanel::Locale::get_handle('Cpanel::Locale') called at /usr/local/cpanel/Cpanel/Mysql.pm line 70
Cpanel::Mysql::new('Cpanel::Mysql', HASH(0x2b3f530)) called at bin/cpses_tool line 182
bin::cpses_tool::action_CLEANUPSESSIONS('bin::cpses_tool', HASH(0x28463b0)) called at bin/cpses_tool line 73
bin::cpses_tool::process_request(HASH(0x28463b0)) called at bin/cpses_tool line 60
bin::cpses_tool::script('bin::cpses_tool') called at bin/cpses_tool line 36
[2014-01-03 16:28:15 +0200] warn [cpses_tool] Couldn't connect to localhost:3306/tcp: IO::Socket::INET: connect: Connection refused at /usr/local/cpanel/Cpanel/MysqlUtils/NetMySQL.pm line 63
at /usr/local/cpanel/Cpanel/MysqlUtils/NetMySQL.pm line 72
Cpanel::MysqlUtils::NetMySQL::connect('debug', 0, 'dbserver', 'localhost', 'dbpass', 'xxx', 'database', 'mysql', 'dbuser', 'root') called at /usr/local/cpanel/Cpanel/MysqlUtils/NetMySQL.pm line 29
Cpanel::MysqlUtils::NetMySQL::new('Cpanel::MysqlUtils', 'database', 'mysql', 'dbuser', 'root', 'dbpass', 'xxx', 'dbserver', 'localhost', 'debug', 0) called at /usr/local/cpanel/Cpanel/Mysql.pm line 113
eval {...} called at /usr/local/cpanel/Cpanel/Mysql.pm line 113
Cpanel::Mysql::_get_dbh(Cpanel::Mysql=HASH(0x2801250), 'localhost', 'root', 'xxx') called at /usr/local/cpanel/Cpanel/Mysql.pm line 77
Cpanel::Mysql::new('Cpanel::Mysql', HASH(0x2b3f530)) called at bin/cpses_tool line 182
bin::cpses_tool::action_CLEANUPSESSIONS('bin::cpses_tool', HASH(0x28463b0)) called at bin/cpses_tool line 73
bin::cpses_tool::process_request(HASH(0x28463b0)) called at bin/cpses_tool line 60
bin::cpses_tool::script('bin::cpses_tool') called at bin/cpses_tool line 36
[2014-01-03 16:28:15 +0200] warn [cpses_tool] Error while connecting to MySQL: Couldn't connect to localhost:3306/tcp: IO::Socket::INET: connect: Connection refused at /usr/local/cpanel/Cpanel/Mysql.pm line 1600
Cpanel::Mysql::_log_error_and_output(Cpanel::Mysql=HASH(0x2801250), 'Error while connecting to MySQL: [_1]', 'Couldn\'t connect to localhost:3306/tcp: IO::Socket::INET: connect: Connection refused') called at /usr/local/cpanel/Cpanel/Mysql.pm line 117
Cpanel::Mysql::_get_dbh(Cpanel::Mysql=HASH(0x2801250), 'localhost', 'root', 'xxx') called at /usr/local/cpanel/Cpanel/Mysql.pm line 77
Cpanel::Mysql::new('Cpanel::Mysql', HASH(0x2b3f530)) called at bin/cpses_tool line 182
bin::cpses_tool::action_CLEANUPSESSIONS('bin::cpses_tool', HASH(0x28463b0)) called at bin/cpses_tool line 73
bin::cpses_tool::process_request(HASH(0x28463b0)) called at bin/cpses_tool line 60
bin::cpses_tool::script('bin::cpses_tool') called at bin/cpses_tool line 36
Building global cache for cpanel...Done
Any idea where to start?
Hello, You most likely have a hard drive problem. The first log you provided clearly indicates DMA write errors on ata1. The hard drive is slowly failing and needs to be replaced. Once that is done, see if the other problems go away.
Thank you, Peter. My host says I'm on a shared SAN that is subject to abuse now and then, but that there is no abuse, and they will therefore be moving me to another hypervisor to see if that resolves the issue. I'm not entirely convinced of this. Even restarting my virtual server does not solve the issue. What do you make of the other logs that I posted?
[QUOTE]My host says I'm on a shared SAN and is subject to abuse now and then[/QUOTE]
This:
Jan 6 04:47:09 server1 kernel: ata1.00: failed command: WRITE DMA
Jan 6 04:47:09 server1 kernel: ata1.00: cmd ca/00:58:c0:71:8a/00:00:00:00:00/e3 tag 0 dma 45056 out
Jan 6 04:47:09 server1 kernel: res 40/00:01:06:4f:c2/00:00:00:00:00/a0 Emask 0x4 (timeout)
Jan 6 04:47:09 server1 kernel: ata1.00: status: { DRDY }
is a sign of a hardware problem, not "abuse". It is happening a lot, based on the logs you provided. You need to ask your host to run diagnostics on each actual hard drive, because when you start to see that kind of error message, hardware failure may be imminent. This is not something that cPanel would have any control over. It is happening at a deeper level, in the hardware, and the hardware needs to be carefully investigated before a hard drive is lost (and your data with it).
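To put a number on "happening a lot", the failed commands in the messages log can be tallied per device. The following is only a minimal Python sketch: the function name `count_failed_commands` and the regular expression are illustrative assumptions, and the sample lines are copied from the log excerpt posted earlier in this thread.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   "Jan 6 04:43:46 server1 kernel: ata1.00: failed command: WRITE DMA"
FAILED_CMD = re.compile(r"kernel: (ata\d+(?:\.\d+)?): failed command: (.+)$")


def count_failed_commands(lines):
    """Return a Counter keyed by (device, command) for failed ATA commands."""
    counts = Counter()
    for line in lines:
        match = FAILED_CMD.search(line.rstrip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# A few lines taken from the log excerpt in this thread.
SAMPLE = """\
Jan 6 04:43:46 server1 kernel: ata1.00: failed command: WRITE DMA
Jan 6 04:43:46 server1 kernel: ata1: soft resetting link
Jan 6 04:47:09 server1 kernel: ata1.00: failed command: WRITE DMA
Jan 6 04:49:23 server1 kernel: ata1.00: failed command: WRITE DMA
""".splitlines()

if __name__ == "__main__":
    for (device, command), n in count_failed_commands(SAMPLE).items():
        print(f"{device}: {n} x {command}")
```

On a real server the same function could be fed an open file handle instead of `SAMPLE`, assuming the kernel messages are written to a file such as `/var/log/messages`. A count that keeps growing between checks supports the case for asking the host to run hardware diagnostics.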
Thanks, I agree with you. I've been moved over to another machine and everything is running smoothly now. One other thing they picked up was that I was running a debug version of the kernel. I actually upgraded the kernel yesterday; this was after these issues began, so it was not the primary cause. Security Advisor in cPanel recommended I upgrade to 2.6.32-431.3.1.el6, which I did. Hosting support changed it to the "normal" kernel in the bootloader. How would I ensure that the debug version is not selected on reboot going forward? How does one check the difference between the debug and normal versions and ensure it's configured properly in the bootloader?
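One possible way to approach the bootloader question is to check two things: whether the running kernel's release string looks like a debug build, and which entry the bootloader selects by default. The Python sketch below is not from this thread and rests on stated assumptions: that debug builds carry "debug" in their release string, and that the server uses a GRUB legacy style config with `default=N` and `title` lines. The helper names and the sample config (including the `x86_64` architecture and device names) are hypothetical.

```python
import platform
import re


def looks_like_debug(name):
    """Assumption: debug kernel builds include 'debug' in the release string or entry title."""
    return "debug" in name.lower()


def default_grub_title(grub_conf_text):
    """Return the title of the entry selected by 'default=N' in a GRUB legacy style config."""
    default_index = 0  # assumed fallback: first entry when no default line is present
    titles = []
    for raw in grub_conf_text.splitlines():
        line = raw.strip()
        default_match = re.match(r"default[\s=]+(\d+)$", line)
        title_match = re.match(r"title\s+(.*)$", line)
        if default_match:
            default_index = int(default_match.group(1))
        elif title_match:
            titles.append(title_match.group(1).strip())
    if default_index < len(titles):
        return titles[default_index]
    return None


# Hypothetical config for illustration only; not taken from the poster's server.
SAMPLE_GRUB_CONF = """\
default=1
timeout=5
title CentOS (2.6.32-431.3.1.el6.x86_64.debug)
    root (hd0,0)
    kernel /vmlinuz-2.6.32-431.3.1.el6.x86_64.debug ro root=/dev/sda1
title CentOS (2.6.32-431.3.1.el6.x86_64)
    root (hd0,0)
    kernel /vmlinuz-2.6.32-431.3.1.el6.x86_64 ro root=/dev/sda1
"""

if __name__ == "__main__":
    running = platform.release()
    print("running kernel:", running, "| debug build?", looks_like_debug(running))
    title = default_grub_title(SAMPLE_GRUB_CONF)
    print("default boot entry:", title, "| debug build?", looks_like_debug(title))
```

Entries are counted from zero, so `default=1` in the sample selects the second `title`, which is the non-debug kernel. To check a real server, the contents of its bootloader config would be passed in place of `SAMPLE_GRUB_CONF`; the location of that file depends on the distribution and is not stated in this thread.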