[UPS-177] Service Failures after CentOS 7.7 update
Hello Everyone,
We have received a number of reports regarding service failures with the error "Too many levels of symbolic links". This is an upstream issue involving the latest update to systemd included in CentOS 7.7.
Technical Summary
The recent CentOS 7.7 update includes systemd-219-67, which changed how systemd opens symbolic links. If your kernel doesn't support the new method that systemd uses to open symbolic links, you may see "Too many levels of symbolic links" errors, as in the example below:
# systemctl status lfd.service
lfd.service - ConfigServer Firewall & Security - lfd
Loaded: loaded (/usr/lib/systemd/system/lfd.service; enabled; vendor preset: disabled)
Active: failed (Result: timeout) since Wed 2019-09-18 16:10:10 CEST; 1min 40s ago
Process: 24132 ExecStart=/usr/sbin/lfd (code=exited, status=0/SUCCESS)
Sep 18 16:08:40 server systemd[1]: Starting ConfigServer Firewall & Security - lfd...
Sep 18 16:08:41 server systemd[1]: Can't open PID file /var/run/lfd.pid (yet?) after start: Too many levels of symbolic links
Sep 18 16:10:10 server systemd[1]: lfd.service start operation timed out. Terminating.
Sep 18 16:10:10 server systemd[1]: Failed to start ConfigServer Firewall & Security - lfd.
Sep 18 16:10:10 server systemd[1]: Unit lfd.service entered failed state.
Sep 18 16:10:10 server systemd[1]: lfd.service failed.
Do you know of any additional error messages that should appear above? Reply to this thread to let us know!
Please understand this is an Operating System issue with the kernel and systemd. While the following link is a bug reported against Ubuntu, it is still relevant, and has a response from OpenVZ developers about the issue:
Workaround Notices
[QUOTE]
• We recommend that only experienced system administrators perform the steps in these workarounds.
• If you operate a VPS, but do not have access to the VPS hardware node, report the issue to your VPS hosting provider so they can update the kernel (you can proceed to workaround #2 below in the meantime).
• The workarounds in this thread are intended for OpenVZ/Virtuozzo systems using CentOS 7.7 with kernels older than 2.6.32-042stab134.7 (or in limited cases, CentOS 7.7 systems with kernels older than 2.6.39). Using these instructions on different operating systems or virtual environments is not recommended. The following commands will help you determine the type of server you are using:
# cat /etc/redhat-release
# uname -r
# lscpu|grep Hypervisor
• If you desire to hire a company to perform this system administration task, check out our system administration services page at SafeAdmin certification process.
[/QUOTE]
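The kernel comparison described in the notices above could be scripted roughly as follows. This is only a sketch: `kernel_older_than` is a hypothetical helper name, and it assumes GNU `sort -V` orders OpenVZ release strings the way you would expect. Treat the result as a hint rather than a diagnosis, since a later reply in this thread reports the issue on a newer kernel as well.

```shell
#!/bin/bash
# Sketch: report whether one kernel release string sorts before another.
# kernel_older_than is a hypothetical helper, not part of any distribution tool.
# It relies on GNU sort's version ordering (sort -V).

kernel_older_than() {
    local current="$1" threshold="$2"
    # Identical strings are not "older".
    [ "$current" = "$threshold" ] && return 1
    local lowest
    lowest=$(printf '%s\n%s\n' "$current" "$threshold" | sort -V | head -n 1)
    # If the current kernel sorts first, it is older than the threshold.
    [ "$lowest" = "$current" ]
}

# Example use on a live server (commented out so the sketch has no side effects):
# if kernel_older_than "$(uname -r)" "2.6.32-042stab134.7"; then
#     echo "Kernel $(uname -r) is older than 2.6.32-042stab134.7"
# fi
```

The threshold string comes from the notice above; pass whichever kernel release your hosting provider tells you contains the fix.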
Workaround 1 (requires access to the VPS hardware node)
1) Update the Virtuozzo kernel on the server.
2) Reboot the server.
[QUOTE]Note: If the server is rebooted prior to updating the kernel, the server's systemd target may default to graphical instead of multi-user, blocking other services from attempting to start. This is because `default.target' is also set via symlink.[/QUOTE]
Workaround 2
Create an add-on service file (a systemd drop-in) for each failing service in order to change the location of the PID file. Using "spamd" as an example:
1) Execute the following command via SSH as a user with root access:
systemctl edit spamd.service
2) The command opens a text editor. Add entries like the ones below to change where the PID file is located:
[QUOTE][Service]
PIDFile=/run/spamd.pid[/QUOTE]
Save your changes and exit the text editor. Adjust this content to match each failing service as needed.
3) Confirm the custom override file for systemd was automatically populated:
[CODE=rich]# cat /etc/systemd/system/spamd.service.d/override.conf
[Service]
PIDFile=/run/spamd.pid[/CODE]
4) After making any changes, reload systemd and restart the failed service. For example:
systemctl daemon-reload
/scripts/restartsrv_spamd
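For anyone who prefers not to use the interactive editor, the steps above can be sketched as a small function. This is a rough illustration, not an official cPanel tool: `write_pidfile_dropin` is a hypothetical name, the third argument (an alternate root directory) exists only so the function can be tried safely, and it deliberately refuses to replace an existing override.conf.

```shell
#!/bin/bash
# Sketch: create the same drop-in that "systemctl edit <unit>" would create.
# write_pidfile_dropin is a hypothetical helper, not part of systemd or cPanel.
# Arguments: unit name, new PID file path, optional config root
# (defaults to /etc/systemd/system; pass a scratch directory to experiment).

write_pidfile_dropin() {
    local unit="$1" pidfile="$2" root="${3:-/etc/systemd/system}"
    local dir="$root/$unit.d"
    local file="$dir/override.conf"
    # Do not clobber customizations that are already in place.
    if [ -e "$file" ]; then
        echo "Refusing to overwrite $file" >&2
        return 1
    fi
    mkdir -p "$dir" || return 1
    printf '[Service]\nPIDFile=%s\n' "$pidfile" > "$file"
}

# Example use for spamd (commented out so the sketch has no side effects):
# write_pidfile_dropin spamd.service /run/spamd.pid &&
#     systemctl daemon-reload &&
#     /scripts/restartsrv_spamd
```

If the function refuses because override.conf already exists, review that file by hand with "systemctl edit" instead.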
Additional Workarounds
Do you know of any additional workarounds that should appear above? Reply to this thread to let us know! We'll update this thread with more information and additional workarounds as they become available.
Thank you.
The workaround did help with spamd, but LFD failed.
Redirecting to /bin/systemctl start lfd.service
Job for lfd.service failed because a fatal signal was delivered to the control process. See "systemctl status lfd.service" and "journalctl -xe" for details.
[root@server ~]# systemctl status lfd.service
* lfd.service - ConfigServer Firewall & Security - lfd
   Loaded: loaded (/usr/lib/systemd/system/lfd.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/lfd.service.d
           `-override.conf
   Active: failed (Result: signal) since Wed 2019-09-18 16:35:04 UTC; 25s ago
  Process: 28325 ExecStart=/usr/sbin/lfd (code=killed, signal=KILL)
Sep 18 16:35:04 server systemd[1]: Starting ConfigServer Firewall & Security - lfd...
Sep 18 16:35:04 server systemd[1]: lfd.service: control process exited, code=killed status=9
Sep 18 16:35:04 server systemd[1]: Failed to start ConfigServer Firewall & Security - lfd.
Sep 18 16:35:04 server systemd[1]: Unit lfd.service entered failed state.
Sep 18 16:35:04 server systemd[1]: lfd.service failed.
Hello @jisha,
Please post the output from the commands below:
ls -al /etc/systemd/system/lfd.service.d
cat /etc/systemd/system/lfd.service.d/override.conf
uname -r
cat /etc/redhat-release
grep baseurl /etc/yum.repos.d/CentOS-Base.repo
cat /usr/local/cpanel/version
lscpu|grep Hypervisor
Additionally, can you share any other commands or actions you performed prior to using the workaround in this thread?
Thank you.
Hi! Same problem here. Thanks for the workaround, it works fine.
I've found that service files can be in two possible paths:
/usr/lib/systemd/system/
/etc/systemd/system/
I've made this script to automate the process (it changes /var/run to /run in all services). Feel free to update/improve it:
[SPOILER="User-Submitted Workaround Script (Not recommended at this time)"]
[CODE=bash]#!/bin/bash
function changePIDFile() {
    echo "Processing $1..."
    # Find every unit file in the directory that keeps its PID file under /var/run
    grep -r "^PIDFile=/var/run/" $1 | uniq | while read LINE
    do
        # Unit file name, with and without the .service extension
        SERVICE=$(echo "$LINE" | cut -d':' -f1 | rev | cut -d'/' -f1 | rev)
        SERVICE_NOEXT=$(echo $SERVICE | sed 's/\.service//')
        # Same PIDFile= line, pointing at /run instead of /var/run
        PIDLINE=$(echo "$LINE" | cut -d':' -f2 | sed 's/\/var\/run/\/run/')
        PID=$(echo "$PIDLINE" | cut -d'=' -f2)
        echo "Fixing $SERVICE..."
        mkdir /etc/systemd/system/$SERVICE.d/ 2>/dev/null
        cat > "/etc/systemd/system/$SERVICE.d/override.conf" << EOF
[Service]
$PIDLINE
EOF
        systemctl daemon-reload
        # Restart only services that are currently active
        if systemctl is-active --quiet $SERVICE; then
            echo "Restarting $SERVICE.."
            if [ -f /scripts/restartsrv_$SERVICE_NOEXT ]; then
                /scripts/restartsrv_$SERVICE_NOEXT
            else
                systemctl restart $SERVICE
            fi
        fi
    done
}
changePIDFile "/usr/lib/systemd/system/"
echo ""
changePIDFile "/etc/systemd/system/"[/CODE]
Moderator Edit Feedback:
• It will overwrite any existing customization to your drop-in files because override.conf is the default for systemctl edit $service. Consider using a different drop-in file name.
• Consider adding a method to verify a service is failing because of this particular issue and only restart that service (as opposed to all services).
[/SPOILER]
Thanks!
Ignacio
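One possible way to act on the second piece of moderator feedback is to check each failed unit's journal for the specific error before touching it. The following is a sketch under assumptions: both function names are hypothetical, the match pattern is taken from the lfd example at the top of this thread, and it assumes "journalctl -u" returns the systemd messages logged for that unit on your system.

```shell
#!/bin/bash
# Sketch only: both helper names are hypothetical.

# Succeeds if the log text on stdin contains the PID file symlink error
# shown in the lfd example at the top of this thread.
log_shows_symlink_error() {
    grep -q 'PID file .*Too many levels of symbolic links'
}

# Prints the failed service units whose journal for the current boot
# contains that error. Units that failed for other reasons are skipped.
list_symlink_failures() {
    systemctl list-units --type=service --state=failed --no-legend --plain |
        awk '{print $1}' |
        while read -r unit; do
            if journalctl -u "$unit" -b --no-pager | log_shows_symlink_error; then
                echo "$unit"
            fi
        done
}

# Example use (commented out so the sketch has no side effects):
# list_symlink_failures
```

Only the units printed by list_symlink_failures would then need a PID file drop-in and a restart; everything else can be left alone.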
I'm having this issue on CentOS 7.6 as well.
edit: never mind, it's 7.7
edit2: workaround #2 worked (spamd, dnsadmin, lfd)
We followed Workaround 2 and it brought up our critical services. However, cphulkd will not start regardless of whether we edit the configuration for the service or not. Has anyone else found a workaround for this? We get the following when running "systemctl status cphulkd":
● cphulkd.service - cPanel brute force detector services
   Loaded: loaded (/etc/systemd/system/cphulkd.service; disabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/cphulkd.service.d
           └─override.conf
   Active: failed (Result: exit-code) since Wed 2019-09-18 13:21:41 MDT; 50min ago
  Process: 4639 ExecStart=/usr/local/cpanel/scripts/restartsrv_cphulkd --notconfigured-ok --systemd-service=cphulkd (code=exited, status=255)
   CGroup: /system.slice/cphulkd.service
           ├─ 804 cPhulkd - processo
           └─2063 cPhulkd - dbprocesso
Sep 18 13:21:40 vps.example.com systemd[1]: Starting cPanel brute force detector services...
Sep 18 13:21:41 vps.example.com systemd[1]: cphulkd.service: control process exited, code=exited status=255
Sep 18 13:21:41 vps.example.com systemd[1]: Failed to start cPanel brute force detector services.
Sep 18 13:21:41 vps.example.com systemd[1]: Unit cphulkd.service entered failed state.
Sep 18 13:21:41 vps.example.com systemd[1]: cphulkd.service failed.
EDIT: Sorry for this post. It actually is running as a process, it just lists as a failed service.
Hello @imorandin,
Thanks for taking the time to share your script! I've edited the post to note that it's not recommended at this time and added some feedback for anyone who wants to work on improving it.
What services can be fixed using Workaround 2? Can Workaround 2 be used for MySQL? Thanks.
SpamAssassin is used as the example, but the steps are applicable to any service that's failing as a result of this issue.
Thank you.
We are still facing the same issue with CentOS 7.7 and the latest OpenVZ kernel, "2.6.32-042stab140.1".
Hello :)
Can you share more information about the behavior you are seeing and the workaround steps you have taken thus far?
Thank you.
We are getting the following at the command line:
systemctl list-units --type=service
  UNIT                            LOAD   ACTIVE     SUB     JOB   DESCRIPTION
  atd.service                     loaded active     running       Job spooling tools
* clamd.service                   loaded failed     failed        clamd antivirus daemon
  console-getty.service           loaded active     running       Console Getty
  cpanel-dovecot-solr.service     loaded active     running       Solr for cPanel Dovecot
  cpanel.service                  loaded active     running       cPanel services
  cpanel_php_fpm.service          loaded active     running       FPM service for cPanel Daemons
  cpanellogd.service              loaded active     running       cPanel Log services
  cpdavd.service                  loaded active     running       cPanel dav services
* cpgreylistd.service             loaded failed     failed        cPanel Greylisting Daemon
  cphulkd.service                 loaded active     running       cPanel brute force detector services
  cpipv6.service                  loaded active     exited        cPanel IPv6 service
  crond.service                   loaded active     running       Command Scheduler
  dbus.service                    loaded active     running       D-Bus System Message Bus
  dnsadmin.service                loaded active     running       cPanel DNS admin service
  dovecot.service                 loaded active     running       Dovecot Imap Server
  exim.service                    loaded active     running       Exim is a Mail Transport Agent, which is the program that moves mail from one machine to a
  filelimits.service              loaded active     exited        SYSV: Increases max open file limits
  getty@tty2.service              loaded active     running       Getty on tty2
  httpd.service                   loaded active     running       Apache web server managed by cPanel EasyApache
* mailman.service                 loaded failed     failed        mailman services
  mysqld.service                  loaded activating start   start MySQL Server
  named.service                   loaded active     running       Berkeley Internet Name Domain (DNS)
  network.service                 loaded active     exited        LSB: Bring up/down networking
  nscd.service                    loaded active     running       Name Service Cache Daemon
  pure-authd.service              loaded active     running       Pure-Authd
  pure-ftpd.service               loaded active     running       Pure-FTPd
  queueprocd.service              loaded active     running       cPanel Queue services
  quotaon.service                 loaded active     exited        Enable File System Quotas
  rhel-domainname.service         loaded active     exited        Read and set NIS domainname from /etc/sysconfig/network
  rhel-readonly.service           loaded active     exited        Configure read-only root support
  smartd.service                  loaded active     running       Self Monitoring and Reporting Technology (SMART) Daemon
  spamd.service                   loaded active     running       Apache SpamAssassin™ deferral daemon
  sshd.service                    loaded active     running       OpenSSH server daemon
  sysstat.service                 loaded active     exited        Resets System Activity Logs
  systemd-journal-flush.service   loaded active     exited        Flush Journal to Persistent Storage
  systemd-journald.service        loaded active     running       Journal Service
  systemd-logind.service          loaded active     running       Login Service
  systemd-random-seed.service     loaded active     exited        Load/Save Random Seed
  systemd-remount-fs.service      loaded active     exited        Remount Root and Kernel File Systems
* systemd-sysctl.service          loaded failed     failed        Apply Kernel Variables
  systemd-tmpfiles-setup.service  loaded active     exited        Create Volatile Files and Directories
  systemd-udev-trigger.service    loaded active     exited        udev Coldplug all Devices
  systemd-udevd.service           loaded active     running       udev Kernel Device Manager
But in cPanel, all services show as down.
Hello @saurabhnsonar,
Can you share the output from the following commands on the affected server?
[CODE=rich]systemctl get-default
systemctl list-jobs
uname -r
cat /etc/redhat-release
grep baseurl /etc/yum.repos.d/CentOS-Base.repo
cat /usr/local/cpanel/version
lscpu|grep Hypervisor[/CODE]
Thank you.
@cPanelMichael here you go.
# systemctl get-default
multi-user.target
# systemctl list-jobs
JOB UNIT                                 TYPE  STATE
 68 mysqld.service                       start running
121 systemd-update-utmp-runlevel.service start waiting
 88 systemd-readahead-done.timer         start waiting
  1 multi-user.target                    start waiting
107 vzfifo.service                       start waiting
5 jobs listed.
# uname -r
2.6.32-042stab140.1
# cat /etc/redhat-release
CentOS Linux release 7.7.1908 (Core)
# grep baseurl /etc/yum.repos.d/CentOS-Base.repo
# remarked out baseurl= line instead.
#baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/
#baseurl=http://mirror.centos.org/centos/$releasever/updates/$basearch/
#baseurl=http://mirror.centos.org/centos/$releasever/extras/$basearch/
#baseurl=http://mirror.centos.org/centos/$releasever/centosplus/$basearch/
# cat /usr/local/cpanel/version
11.82.0.15
# lscpu|grep Hypervisor
lscpu: failed to determine number of CPUs: /sys/devices/system/cpu/possible: No such file or directory
Hello,
Apparently it also happens in CentOS 6.x. What is the equivalent command in CentOS 6 for Workaround 2? The "systemctl edit spamd.service" command does not exist in CentOS 6.
2.6.32-042stab120.19
CentOS release 6.9 (Final)
cPanel version 11.82.0.15
OpenVZ
Hi, I found that my MySQL server failed to start after a server reboot. Is it related to the issue in this thread? I have attached a screenshot that shows the error. Please advise.
Is there any update from cPanel to fix the issue?
Hi,
I just want to share my experience troubleshooting this type of problem. I encountered these errors after an upgrade, with cpanel_php_fpm, dnsadmin, cphulkd, mailman, cpanellogd, cpdavd, and tailwatchd. None of these services started after updating cPanel. I have the same kernel as posted here.
To fix the problem, I used workaround #2 and edited each service that had not started with the "systemctl edit dnsadmin.service" command. For cpanel_php_fpm.service and dnsadmin.service, following the instructions worked like a charm.
However, cphulkd.service and tailwatchd.service still would not start. After searching the net and reading the journalctl logs, I checked the running processes. Even though cphulkd.service was not started, there were still processes running. For cphulkd there were two:
529 ? S 0:00 cPhulkd - processor
4867 ? S 0:00 \_ cPhulkd - dbprocessor
Stopping or killing these processes and then running "systemctl start cphulkd.service" fixed the problem. I did the same for tailwatchd.service: one process was still running for tailwatchd, and killing it and then running "systemctl start tailwatchd.service" fixed the problem.
This works for me.
Thank you,
valtechnical
Hello, I have problems with the clamd, dnsadmin, and lfd services, but I don't know how to solve them. I have a VPS, and my provider doesn't know how to fix it. My server runs CentOS 7.7 (Virtuozzo). Who should solve this problem? CentOS? cPanel? I don't know what I should do until it is fixed. Can someone tell me which file to modify and which lines I have to add? I would be very grateful. Thanks to everyone.
Same problem here.
I just started noticing this in the last few days (clamd, spamd, dnsadmin), and wrote to my provider to ask them to update the kernel on my VPS. I disabled clamd and asked them to look at it.
CentOS Linux release 7.7.1908 (Core)
2.6.32-042stab128.2
They wrote back and said that the update to 84.0.15 cleared up these issues, which I can confirm are no longer happening for spamd and dnsadmin. However, I reinstalled clamd and it immediately started reporting the same failures. They are looking at it again.