Info about Hardware tests suggested by OVH
Good afternoon,
OVH recommended that I perform these hardware tests to detect potential hardware failures: help.ovhcloud.com/csm/en-gb-dedicated-servers-hardware-diagnostics?id=kb_article_view&sysparm_article=KB0043500
-Is it safe to run these commands on a production server with CloudLinux, WHM/cPanel, LiteSpeed, and Imunify360?
-Can they be run directly in the terminal from WHM/cPanel, or do they need rescue mode?
-Does WHM/cPanel include any hardware testing software?
-Can the execution be stopped with "Ctrl + C" like other commands?
Thank you very much.
-
Hey there! This guide specifically says the server needs to be in rescue mode when the commands are run, so the server would be offline and cPanel wouldn't be running.
0 -
Thank you cPRex.
Does WHM/cPanel have any similar script or software that can safely test the hardware without rescue mode?
Thank you very much.
0 -
Unfortunately we don't. Since we don't manage the hardware of the machine, we don't make any tools for that.
What specifically are you trying to check, or what problems are you experiencing with the machine?
0 -
Hi cPRex,
Once every few months the server freezes, and the load average shown in WHM/cPanel goes up from 1 to 150 or similar.
Websites and email become inaccessible, and the server doesn't respond to any requests. After we restart the server from the data center, everything works correctly again.
For example, this is from the last time it happened:
top - 14:46:01 up 19 days, 16:30, 0 users, load average: 1,00, 0,84, 0,82
top - 14:47:01 up 19 days, 16:31, 0 users, load average: 0,76, 0,80, 0,81
top - 14:48:02 up 19 days, 16:32, 0 users, load average: 0,93, 0,85, 0,82
top - 14:49:01 up 19 days, 16:33, 0 users, load average: 3,45, 1,65, 1,10
top - 14:50:01 up 19 days, 16:34, 0 users, load average: 1,80, 1,51, 1,08
top - 14:51:01 up 19 days, 16:35, 0 users, load average: 1,74, 1,53, 1,11
top - 14:52:02 up 19 days, 16:36, 0 users, load average: 0,86, 1,31, 1,06
top - 14:53:01 up 19 days, 16:37, 0 users, load average: 0,77, 1,19, 1,04
top - 14:54:01 up 19 days, 16:38, 0 users, load average: 1,38, 1,25, 1,06
top - 14:55:01 up 19 days, 16:39, 0 users, load average: 1,05, 1,17, 1,05
top - 14:56:01 up 19 days, 16:40, 0 users, load average: 1,02, 1,13, 1,03
top - 14:57:01 up 19 days, 16:41, 0 users, load average: 0,58, 0,98, 0,99
top - 14:58:01 up 19 days, 16:42, 0 users, load average: 0,85, 0,97, 0,98
top - 15:35:01 up 0 min, 0 users, load average: 0,36, 0,08, 0,03
top - 15:36:01 up 1 min, 0 users, load average: 1,44, 0,47, 0,16
top - 15:37:01 up 2 min, 0 users, load average: 1,38, 0,61, 0,23
top - 15:38:01 up 3 min, 0 users, load average: 1,52, 0,79, 0,31
top - 15:39:01 up 4 min, 0 users, load average: 1,00, 0,79, 0,34
top - 15:40:02 up 5 min, 0 users, load average: 0,75, 0,76, 0,36
top - 15:41:01 up 6 min, 0 users, load average: 0,95, 0,84, 0,41
top - 15:42:01 up 7 min, 0 users, load average: 1,00, 0,87, 0,45
top - 15:43:01 up 8 min, 0 users, load average: 0,79, 0,83, 0,46
top - 15:44:02 up 9 min, 0 users, load average: 0,61, 0,78, 0,46
We rebooted the server at 15:31:22. Before the reboot, the server was frozen.
Between 14:58:01 and 15:35:01 the server was frozen and logged no load average data, but in WHM/cPanel we could see that the load was around 150 or similar.
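To pin down the silent window without reading the log by eye, the `top` header lines can be parsed and checked for missing samples. The following is only a minimal sketch: the function names `parse_top_lines` and `find_gaps` are hypothetical, it assumes one sample per minute as in the log above, and it assumes all samples fall on the same day (it does not handle a log that crosses midnight).

```python
import re
from datetime import datetime, timedelta

# A few lines copied from the log above. The load values use a comma as
# the decimal separator because of the server's locale.
SAMPLE = """\
top - 14:57:01 up 19 days, 16:41, 0 users, load average: 0,58, 0,98, 0,99
top - 14:58:01 up 19 days, 16:42, 0 users, load average: 0,85, 0,97, 0,98
top - 15:35:01 up 0 min, 0 users, load average: 0,36, 0,08, 0,03
top - 15:36:01 up 1 min, 0 users, load average: 1,44, 0,47, 0,16
"""

# Captures the wall-clock time and the three load averages.
LINE_RE = re.compile(
    r"^top - (\d{2}:\d{2}:\d{2}) up .*load average: "
    r"(\d+[.,]\d+), (\d+[.,]\d+), (\d+[.,]\d+)"
)


def parse_top_lines(text):
    """Return a list of (time, 1-minute load) tuples, one per matching line."""
    samples = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S")
            load1 = float(match.group(2).replace(",", "."))
            samples.append((stamp, load1))
    return samples


def find_gaps(samples, max_gap=timedelta(minutes=2)):
    """Return (last_seen, next_seen, duration) for each gap above max_gap."""
    gaps = []
    for (prev, _), (cur, _) in zip(samples, samples[1:]):
        if cur - prev > max_gap:
            gaps.append(
                (prev.strftime("%H:%M:%S"), cur.strftime("%H:%M:%S"), cur - prev)
            )
    return gaps


if __name__ == "__main__":
    for last_seen, next_seen, duration in find_gaps(parse_top_lines(SAMPLE)):
        print(f"no samples between {last_seen} and {next_seen} ({duration})")
```

A gap only shows that the logging job stopped running. It does not say whether the cause was hardware, I/O, or memory pressure.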
The kernel didn't crash:
# ll /var/crash/
total 0
What do you recommend?
Thank you very much.
0 -
I would start by checking the sar logs on the machine to see if that tells you anything helpful about the state of the server when the issue happens.
0 -
Hi cPRex,
It’s the same with that tool:
13:00:02 50 1035 0,65 0,88 1,02 0
13:10:01 20 920 0,76 0,86 0,98 0
13:20:01 27 954 2,20 1,90 1,34 0
13:30:01 47 997 0,40 0,72 0,97 0
13:40:01 12 901 0,53 0,87 0,99 0
13:50:01 13 908 1,71 1,64 1,22 0
14:00:01 47 1027 0,63 0,73 0,89 0
14:10:02 19 924 0,71 0,66 0,77 0
14:20:01 16 921 1,89 1,63 1,12 0
14:30:02 45 1008 0,67 0,76 0,86 0
14:40:01 19 931 0,96 0,90 0,86 0
14:40:01 runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15 blocked
14:50:01 12 914 1,80 1,51 1,08 0
Media: 26 954 1,08 1,03 0,92 0
15:34:34 LINUX RESTART
15:40:02 runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15 blocked
15:50:01 22 799 1,55 1,05 0,66 0
16:00:02 51 889 0,81 0,87 0,74 0
16:10:02 17 782 0,77 1,44 1,17 0
16:20:01 14 777 0,90 1,01 1,07 0
16:30:02 47 877 1,10 1,07 1,04 1
16:40:02 12 765 0,51 1,12 1,16 0
16:50:01 20 805 0,66 0,83 0,98 0
17:00:02 54 906 1,04 0,88 0,91 0
While the server is frozen, it doesn't store any load average data. Any idea?
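One possible way to get data closer to the freeze is to sample `/proc/loadavg` much more often than the 10-minute `sar` interval and force every sample to disk. The following is only a sketch under assumptions: the function names, the 5-second interval, and the log path in the usage note are all hypothetical, and if the freeze also blocks disk I/O or scheduling, this logger will stall just like `sar` does.

```python
import os
import time


def format_record(loadavg_text, now):
    """Build one log line from the raw contents of /proc/loadavg.

    The file looks like "0.85 0.97 0.98 2/914 12345": three load
    averages, runnable/total scheduling entities, and the last PID.
    """
    fields = loadavg_text.split()
    load1, load5, load15 = fields[0], fields[1], fields[2]
    runnable, total = fields[3].split("/")
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    return (
        f"{stamp} load={load1},{load5},{load15} "
        f"runnable={runnable} tasks={total}\n"
    )


def log_loadavg(path, interval=5.0, iterations=None):
    """Append one sample every `interval` seconds (forever if iterations is None)."""
    count = 0
    with open(path, "a") as log:
        while iterations is None or count < iterations:
            with open("/proc/loadavg") as src:
                log.write(format_record(src.read(), time.time()))
            # Flush Python's buffer, then ask the kernel to write the data
            # out, so the last samples before a freeze are not lost in memory.
            log.flush()
            os.fsync(log.fileno())
            count += 1
            time.sleep(interval)
```

For example, calling `log_loadavg("/root/loadavg-watch.log")` from a script started in a `screen` session would keep sampling until stopped with Ctrl + C. The path is only a placeholder.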
Thank you very much.
0 -
I was hoping it would give some data a bit closer to the time of the issue. I don't have any other ideas on my end as this issue wouldn't be related to the cPanel tools. Is your host able to check the system for issues on their end?
0