VPS shutting down randomly (no log entries, no high load)
Hi, I have been running WHM/cPanel on a VPS (AlmaLinux) since September.
About 3.5 weeks ago I upgraded the VPS for more CPU/RAM/disk (2/8GB/100GB => 8/32GB/500GB). My hosting provider helped me manually increase the disk space (100GB => 500GB) and removed the swap partition, which was only 100 KB in size, after which I created a 4GB swap file instead (using "create-swap").
However, since the upgrade (I don't know if that's the cause) the VPS has randomly shut down 4 times (3-6 days apart). I have checked sar, and memory use and server load were low on all 4 occasions, with swap file usage low or none. In the "messages" log the only pattern I can see is lots of "p0f WARNING: Too many host entries" lines. See the example below.
Dec 12 03:29:39 srv2 p0f[1324]: [!] WARNING: Too many host entries, deleting 1001. Use -m to adjust.
Dec 12 03:29:39 srv2 p0f[1324]: [!] WARNING: Too many host entries, deleting 1001. Use -m to adjust.
Dec 12 03:29:43 srv2 p0f[1324]: [!] WARNING: Too many host entries, deleting 1001. Use -m to adjust.
Dec 12 03:29:43 srv2 p0f[1324]: [!] WARNING: Too many host entries, deleting 1001. Use -m to adjust.
Dec 12 03:29:47 srv2 p0f[1324]: [!] WARNING: Too many host entries, deleting 1001. Use -m to adjust.
Dec 12 03:29:47 srv2 pdns_server[1224]: Error sending reply with sendmsg (socket=5, dest=10.0.2.157:53): Invalid argument
Dec 12 03:29:47 srv2 PAM-hulk[317161]: Brute force detection active: 580 LOGIN DENIED -- EXCESSIVE FAILURES -- IP TEMP BANNED
Dec 12 03:29:47 srv2 p0f[1324]: [!] WARNING: Too many host entries, deleting 1001. Use -m to adjust.
Dec 12 03:29:48 srv2 p0f[1324]: [!] WARNING: Too many host entries, deleting 1001. Use -m to adjust.
Dec 12 03:29:48 srv2 pdns_server[1224]: Error sending reply with sendmsg (socket=5, dest=10.0.2.157:53): Invalid argument
Dec 12 03:29:52 srv2 PAM-hulk[317168]: Brute force detection active: 580 LOGIN DENIED -- EXCESSIVE FAILURES -- IP TEMP BANNED
~~~~ VPS shut down here ~~~~
My hosting provider says there have been no incidents on the nodes on these occasions. I filed a ticket with cPanel support, but they were not able to determine the cause of the issue. Grateful for any clues!
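To see whether one program really dominates the log in the minutes before a shutdown, a small tally by program tag can help. This is only a sketch: the function name `tally_log_tags` is made up for illustration, and it assumes the traditional "Mon DD HH:MM:SS host tag[pid]: message" layout seen in the excerpt above.

```shell
# Count syslog lines per program tag (5th field, e.g. "p0f[1324]:").
# Hypothetical helper; assumes the classic syslog line layout.
# Reads the files given as arguments, or stdin if none are given.
tally_log_tags() {
    awk '{
             tag = $5
             sub(/\[[0-9]+\]:?$/, "", tag)   # drop a trailing "[pid]:"
             sub(/:$/, "", tag)              # drop a bare trailing ":"
             count[tag]++
         }
         END { for (t in count) printf "%d %s\n", count[t], t }' "$@" |
        sort -k1,1nr -k2,2                   # most frequent tag first
}

# Example: tally the last 500 lines of the messages log
# (usual AlmaLinux path, adjust if yours differs):
# tail -n 500 /var/log/messages | tally_log_tags
```

Run against the excerpt above, this should list p0f as the most frequent tag, which confirms the pattern but does not by itself explain the shutdown.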
Hey there! If our support team, with access to the server, wasn't able to find anything helpful, I don't think the Forums are going to be of much help. You'll likely want to reach out to your hosting provider or datacenter to have them check the machine and test the hardware for possible issues.
Hi cPRex, thank you for your reply, I appreciate it! As a beginner I thought I would give it a shot in the forums, since I had already reached out to both the datacenter and the cPanel support team. I felt I simply could not rule out the possibility of someone here having had a similar experience, or just having a hunch about where specifically to look for the answer. For example, cPanel support mentioned the "sys-snap" script as an option to possibly get further data concerning the problem. Would you say that is unnecessary, considering the high (?) probability of this being a hardware problem? I will follow your advice for sure and have the DC look again. Sorry for any inconvenience if this thread should never have been started :-) All the best, Henrik
Oh no, it's all good, and someone might have some ideas. It just seems to me that if someone with server access couldn't come up with anything, trying to guess is unlikely to be very helpful. sar is a good thing to check, as it shows historical data for the system, but then you have to hope that data reveals something after the fact, which is always tricky. Did it show any load spikes or other interesting data from around the time of an outage?
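For anyone following along, here is a rough sketch of pulling sar history for the window around an outage. It assumes the sysstat package is installed and that the daily data files live under /var/log/sa/saDD, which is the usual location on RHEL-family systems such as AlmaLinux. The function name `sar_window` is hypothetical.

```shell
# Show memory (-r), swap (-S) and load/run-queue (-q) history for a time
# window from one sysstat daily file. Hypothetical helper; assumes the
# /var/log/sa/saDD file layout used by default on RHEL-family systems.
# usage: sar_window DAY START END
sar_window() {
    sar -r -S -q -f "/var/log/sa/sa$1" -s "$2" -e "$3"
}

# Example: the half hour before the Dec 12 03:29 log excerpt in the first post
# sar_window 12 03:00:00 03:35:00
```

If the last samples before the gap look normal, as reported in this thread, that points away from load or memory pressure inside the VPS itself.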
Thank you cPRex. Unfortunately there are no clues so far in the sar history or any other logs. For reference, VPS message logs from all 4 shutdown incidents: this >>
For lack of other ideas I have now tried the following:
- Force reinstalled all cPanel files:
# /usr/local/cpanel/scripts/upcp --force
- Installed and started the sys-snap monitoring script:
# /root/sys-snap.pl --start
sys-snap is a great idea - hopefully that gives you more details!
OK, so the problem seems solved. The last unexpected shutdown was on December 26, 2022, and since then the server has been running fine. The only clue I got from the data center was that they saw an "out of memory kill" message in the log files for my VPS. So my conclusion is that this was never a cPanel problem. Instead, my speculation is that the physical server node ran out of resources for some reason, causing the crashes. Maybe the DC had too many nodes on one physical server? I feel the DC should be able to see this, but they are very vague in their answers. Anyway, thank you very much for the moral support :) Case closed for now.
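For readers who land here with the same symptoms, a quick way to check the VPS's own logs for OOM-killer activity is sketched below. The function name `find_oom_events` is made up for illustration, and the patterns are phrases the Linux kernel commonly logs when it kills a process for lack of memory, not an exhaustive list.

```shell
# Print kernel OOM-killer lines from syslog-style files (or stdin).
# Hypothetical helper; the patterns are common kernel phrases, not a full list.
# Exits non-zero when nothing matches, as grep does.
find_oom_events() {
    grep -iE 'out of memory|oom-killer|Killed process' "$@"
}

# Examples (usual AlmaLinux log path; the journalctl form needs a
# persistent journal to see the previous boot):
# find_oom_events /var/log/messages
# journalctl -k -b -1 | find_oom_events
```

If the kill happened on the host side rather than inside the VPS, the guest's own logs may show nothing at all, which would match the empty logs described in this thread.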
I'm glad to hear things are working well for you now!