CloudLinux - High CPU usage and server crashes
Hello,
We are using CloudLinux on our web servers, and one of the servers has been behaving strangely since last week. It began with a sudden, total server crash that required a reboot to fix. It's a VM (VMware) running CloudLinux kernel 2.6.18-471.3.1.el5.lve0.8.72 and cPanel 11.40.1. Whenever the server crashes we see a lot of these in /var/log/messages:
------------------------------------------
INFO: task pdflush blocked for more than 300 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
CPU5:
ffff81023ec13f48 0000000000000000 0000000000000000 ffffffff801be201
ffff81023abced00 000000000000002f 000000000000002f ffffffff801be259
ffffffff8006235c ffffffff800241af ffff8101dfef7ef8 ffff8101dfef7d60
Call Trace:
[] showacpu+0x0/0x65
[] showacpu+0x58/0x65
[] call_softirq+0x1c/0x28
[] smp_call_function_interrupt+0xad/0xc9
[] call_function_interrupt+0x66/0x6c
[] :ext3:ext3_permission+0x0/0xc
[] virtinfo_notifier_call+0x2/0xa9
[] vfs_getattr+0x35/0xae
[] vfs_lstat_fd+0x2f/0x47
[] free_pages_and_swap_cache+0x70/0x87
[] fairsched_switch+0x24e/0x452
[] __sched_text_start+0x70c/0x110c
[] sys_newlstat+0x19/0x31
[] mntput_no_expire+0x1d/0x135
[] filp_close+0x5c/0x64
[] system_call+0x7e/0x83
------------------------------------------
I then ran yum update and also tried setting "vm.dirty_ratio = 10" in sysctl.conf, but this didn't help and I have reverted the change. Since yesterday I no longer see these error messages in the logs, but the server is still experiencing serious issues. The load now spikes suddenly and the server becomes completely unresponsive, with Apache consuming almost all available CPU. Please see:
------------------------------------------
Apache Server Status for localhost
Server Version: Apache/2.2.27 (Unix) mod_ssl/2.2.27
OpenSSL/0.9.8e-fips-rhel5 mod_bwlimited/1.4
Server Built: Apr 3 2014 20:45:34
_________________________________________________________________
Current Time: Tuesday, 03-Jun-2014 10:53:37 BST
Restart Time: Tuesday, 03-Jun-2014 10:48:31 BST
Parent Server Generation: 0
Server uptime: 5 minutes 6 seconds
Total accesses: 3173 - Total Traffic: 222.9 MB
CPU Usage: u41.5 s88.92 cu566.39 cs0 - 228% CPU load
10.4 requests/sec - 0.7 MB/second - 71.9 kB/request
82 requests currently being processed, 5 idle workers
------------------------------------------
Apache Server Status for localhost
Server Version: Apache/2.2.27 (Unix) mod_ssl/2.2.27
OpenSSL/0.9.8e-fips-rhel5 mod_bwlimited/1.4
Server Built: Apr 3 2014 20:45:34
_________________________________________________________________
Current Time: Tuesday, 03-Jun-2014 11:52:35 BST
Restart Time: Tuesday, 03-Jun-2014 11:48:04 BST
Parent Server Generation: 0
Server uptime: 4 minutes 31 seconds
Total accesses: 4371 - Total Traffic: 126.3 MB
CPU Usage: u6.38 s7.14 cu116.55 cs0 - 48% CPU load
16.1 requests/sec - 477.1 kB/second - 29.6 kB/request
6 requests currently being processed, 9 idle workers
And on another instance
CPU Usage: - 269% CPU load
40 requests currently being processed, 5 idle workers
------------------------------------------
With CloudLinux in place I was not expecting this behavior. I see a lot of the processes below on the server, which seems unusual to me.
------------------------------------------
nobody 55733 1.4 0.1 93984 15676 ? S 11:30 0:03 /usr/local/apache/bin/httpd -k start -DSSL
nobody 55749 1.4 0.1 94008 15664 ? S 11:30 0:03 /usr/local/apache/bin/httpd -k start -DSSL
nobody 55760 0.7 0.1 93888 15488 ? S 11:30 0:01 /usr/local/apache/bin/httpd -k start -DSSL
nobody 55763 2.0 0.1 93892 15564 ? S 11:30 0:04 /usr/local/apache/bin/httpd -k start -DSSL
nobody 55845 1.1 0.1 93884 14696 ? S 11:30 0:01 /usr/local/apache/bin/httpd -k start -DSSL
nobody 55849 1.6 0.1 94016 14808 ? R 11:31 0:02 /usr/local/apache/bin/httpd -k start -DSSL
nobody 55926 1.1 0.1 93888 14788 ? S 11:31 0:01 /usr/local/apache/bin/httpd -k start -DSSL
nobody 56258 11.5 0.1 93048 13732 ? S 11:33 0:00 /usr/local/apache/bin/httpd -k start -DSSL
nobody 56261 14.0 0.1 93044 13664 ? S 11:33 0:00 /usr/local/apache/bin/httpd -k start -DSSL
nobody 56262 35.0 0.1 93048 13620 ? S 11:33 0:00 /usr/local/apache/bin/httpd -k start -DSSL
------------------------------------------
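To get one number out of a listing like the one above, the %CPU column can be summed per user and executable. This is only a rough sketch: it assumes the standard `ps aux` column order (USER, PID, %CPU, ...), and the helper name `cpu_by_command` is made up for illustration. Bear in mind that ps reports %CPU as CPU time divided by the time the process has been alive, so workers that have only just started can show inflated values.

```python
from collections import defaultdict

def cpu_by_command(ps_text):
    """Sum the %CPU column of `ps aux`-style lines, grouped by (user, executable)."""
    totals = defaultdict(float)
    for line in ps_text.strip().splitlines():
        # ps aux has 11 columns; the last one is the full command line.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        try:
            pcpu = float(fields[2])
        except ValueError:
            continue  # header row or malformed line
        user = fields[0]
        executable = fields[10].split()[0]
        totals[(user, executable)] += pcpu
    return dict(totals)

# Three of the lines from the listing above, copied verbatim.
sample = """\
nobody 55733 1.4 0.1 93984 15676 ? S 11:30 0:03 /usr/local/apache/bin/httpd -k start -DSSL
nobody 56258 11.5 0.1 93048 13732 ? S 11:33 0:00 /usr/local/apache/bin/httpd -k start -DSSL
nobody 56262 35.0 0.1 93048 13620 ? S 11:33 0:00 /usr/local/apache/bin/httpd -k start -DSSL
"""

totals = cpu_by_command(sample)
for (user, exe), pcpu in totals.items():
    print(f"{user} {exe} {pcpu:.1f}%")
```

On a live server the same function could be fed the captured output of `ps aux` instead of the pasted sample.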
I have checked the server I/O and the MySQL processlist, and both look good to me. Below is the server configuration.
------------------------------------------
PHP 5.3.28 (cli) (built: Apr 3 2014 20:59:40)
Copyright (c) 1997-2013 The PHP Group
Zend Engine v2.3.0, Copyright (c) 1998-2013 Zend Technologies
------------------------------------------
Server version: Apache/2.2.27 (Unix)
Server built: Apr 3 2014 20:45:34
Cpanel::Easy::Apache v3.24.14 rev9999 +cloudlinux
------------------------------------------
Loaded Modules:
core_module (static)
authn_file_module (static)
authn_default_module (static)
authz_host_module (static)
authz_groupfile_module (static)
authz_user_module (static)
authz_default_module (static)
auth_basic_module (static)
include_module (static)
filter_module (static)
log_config_module (static)
logio_module (static)
env_module (static)
expires_module (static)
headers_module (static)
setenvif_module (static)
version_module (static)
proxy_module (static)
proxy_connect_module (static)
proxy_ftp_module (static)
proxy_http_module (static)
proxy_scgi_module (static)
proxy_ajp_module (static)
proxy_balancer_module (static)
ssl_module (static)
mpm_prefork_module (static)
http_module (static)
mime_module (static)
status_module (static)
autoindex_module (static)
asis_module (static)
info_module (static)
suexec_module (static)
cgi_module (static)
negotiation_module (static)
dir_module (static)
actions_module (static)
userdir_module (static)
alias_module (static)
rewrite_module (static)
so_module (static)
hostinglimits_module (shared)
bwlimited_module (shared)
bw_module (shared)
suphp_module (shared)
Syntax OK
------------------------------------------
The domlogs do not show any sites with a huge traffic inflow, so I am a bit confused here. Can someone please help me understand and resolve this issue? I have raised a call with CloudLinux and am still awaiting their advice/solution. It seems like it is going to take some time for them to respond due to the time zone difference, so I thought I would check here. Many Thanks, efheem.
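For anyone reading the Apache status figures above: the "CPU load" percentage looks consistent with the sum of the u, s, cu and cs fields divided by the server uptime in seconds. The sketch below recomputes it from the two complete status snapshots in this post. The function name `apache_cpu_load` is made up for illustration, and the formula is an assumption inferred from the numbers rather than taken from the Apache documentation.

```python
import re

def apache_cpu_load(cpu_usage_line, uptime_seconds):
    """Recompute the status page's CPU load (%) from its u/s/cu/cs fields.

    Assumes load = (u + s + cu + cs) / uptime * 100, which matches the
    figures quoted in this thread.
    """
    # Matches tokens such as "u41.5", "s88.92", "cu566.39", "cs0".
    pairs = re.findall(r"\b(cu|cs|u|s)([0-9.]+)", cpu_usage_line)
    times = {name: float(value) for name, value in pairs}
    return 100.0 * sum(times.values()) / uptime_seconds

# First snapshot: uptime 5 minutes 6 seconds = 306 s
first = apache_cpu_load("CPU Usage: u41.5 s88.92 cu566.39 cs0 - 228% CPU load", 306)
# Second snapshot: uptime 4 minutes 31 seconds = 271 s
second = apache_cpu_load("CPU Usage: u6.38 s7.14 cu116.55 cs0 - 48% CPU load", 271)

print(round(first), round(second))  # 228 48
```

Since both snapshots were taken only about five minutes after a restart, the percentage is an average over a very short window and can swing a lot between samples.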
Hello :) The following thread is a good place to start when troubleshooting load issues: "Troubleshooting High Loads On Linux Systems". Thank you.
Hi Michael, I have already had a look at it, but unfortunately it didn't help much in our case. We have identified that %sy CPU is too high on the server, and we don't know why. Do you have any advice? With Regards, Faheem.
I don't see anything in your output that would indicate an issue with the cPanel/WHM or CloudLinux software itself. You may need to consult with a qualified system administrator if the guide from my last reply was not helpful in determining the cause of your load issues. Thank you.
Hi Michael, %sys CPU is always high on the system, even when there is no traffic to the server. It averages 40-50% and goes up to 90%. I believe it's the kernel itself consuming this much CPU and that something is wrong with the kernel. As I mentioned, the issue began with a system crash with lots of kernel errors shown in the logs/console. Kind Regards, efheem
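One way to double-check a %sys figure like this independently of top is to take two samples of the aggregate "cpu" line in /proc/stat and look at how the elapsed ticks are split. The sketch below is a minimal illustration, assuming the usual field order (user, nice, system, idle, iowait, irq, softirq, steal). The helper names are made up, and the two sample lines are invented values used only to show the arithmetic, not readings from this server.

```python
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")

def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of tick counts."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))

def cpu_percentages(before, after):
    """Return each field's share (%) of the ticks elapsed between two samples."""
    deltas = {name: after[name] - before[name] for name in before}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return {name: 100.0 * delta / total for name, delta in deltas.items()}

def read_cpu_line(path="/proc/stat"):
    """Read a live sample; the aggregate line is the first line of the file."""
    with open(path) as handle:
        return parse_cpu_line(handle.readline())

# Hypothetical samples for illustration only.
before = parse_cpu_line("cpu 1000 0 2000 7000 0 0 0 0")
after = parse_cpu_line("cpu 1100 0 2450 7400 50 0 0 0")
pct = cpu_percentages(before, after)
print(f"user {pct['user']:.1f}%  system {pct['system']:.1f}%  idle {pct['idle']:.1f}%")
```

On the server itself, calling `read_cpu_line()` twice a few seconds apart and passing both results to `cpu_percentages` would give the live split. If the system share stays high with no traffic, that would support the suspicion that the time is being spent in the kernel rather than in Apache or PHP.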
Please try reinstalling VMware Tools, and make sure they are compiled from source (the RPMs will not work).