Service failures after upgrading system to CentOS version 7.6

December 04, 2018 08:23

Glad to hear it came back online! :) If you haven't done so already, I'd strongly recommend checking that you have working IPMI / KVM / Console access available to you. In a situation like this, it's the one (and only) hope available to revive a server that's inaccessible via normal means like SSH, WHM, etc. Personally speaking, I like to audit such out-of-band access systems every month or so, just to ensure they still work properly.

0

greektranslator

December 04, 2018 08:26

Thanks, looks like the host intervened automatically. Not sure, though, what I really need to change on the server. Here are the details of this operation:

Software diagnosis: The server is on the Grub rescue.
Action: Restarting the server in bzimage (Kernel OVH)
Result: Boot: OK. Ping: OK. Services started; server on login.
Recommendation: Software configuration to be corrected by the customer.

0

WorkinOnIt

December 04, 2018 11:30

I've opened a support ticket: 10885695. I am having an issue similar to these threads: [Removed Links - these threads were merged into this one]

Can't restart the server after applying update. I am using the KernelCare free symlink protection patch on this server, and when I click on restart I see the message:

kcarectl --info reporting unknown patch type: free:15xxxxxx at /usr/local/cpanel/Cpanel/KernelCare.pm line 80.
Cpanel::KernelCare::get_kernelcare_state() called at /usr/local/cpanel/Cpanel/KernelCare/Suggest.pm line 65
Cpanel::KernelCare::Suggest::get_suggestion() called at whostmgr/bin/whostmgr.pl line 5204
main::graceful_reboot_landing("graceful_reboot_landing") called at /usr/local/cpanel/Whostmgr/Dispatch.pm line 259
Whostmgr::Dispatch::_do_call("graceful_reboot_landing", HASH(0x2b9d000), HASH(0x2b9d988)) called at /usr/local/cpanel/Whostmgr/Dispatch.pm line 157
Whostmgr::Dispatch::dispatch("graceful_reboot_landing", 1, ARRAY(0x2b9cf88), HASH(0x2b9d988)) called at whostmgr/bin/whostmgr.pl line 393

I performed a graceful restart and the server failed to reload. Then I restarted the server at the host - the server won't restart. In the host console I get the message:

!!!!!failed to load selinux policy

The solution is to modify the root config in the GRUB boot (you need to be administrator). I did this at my server host console.

NOTE: Use at your own risk. This worked for me (CentOS KVM) - it might be a different process for you depending on your machine / OS / host. Please check with cPanel or your host if you are not 100% sure.

1. Access the host console and click the send CTRL+ALT+DEL button on the top right, or click [RESTART] to restart the server. As soon as the boot process starts, press ESC to bring up the GRUB boot prompt. You may need to stop the server first, then restart it to reach the GRUB boot prompt.
2. You will see a GRUB boot prompt - press E to edit the first boot option. (If you do not see the GRUB prompt, you may need to press any key to bring it up before the machine boots.)
3. Find the kernel line (it starts with "linux16"), REPLACE ro with rw init=/sysroot/bin/sh and leave the rest of the line unchanged.
4. Press CTRL+X or F10 to boot into single user mode.
5. Access the system with the command: chroot /sysroot
6. Edit the file with vi, e.g. vi /etc/selinux/config, and change the line SELINUX=enforcing to SELINUX=disabled
7. Reboot, or restart the machine at the host control panel.

I hope this helps someone! There is also a tutorial here: asafshoval.wordpress.com/2014/11/18/overcome-fail-to-load-selinux-policy-freezing-error-message-while-booting-linux (I didn't use it, but the instructions are similar.)
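Step 6 can also be done without an interactive editor. The following is a minimal sketch, assuming a POSIX shell with cp and sed available inside the chroot. The function name set_selinux_disabled and the .bak backup suffix are illustrative and not part of the steps above.

```shell
#!/bin/sh
# Hypothetical helper (name and .bak suffix are illustrative, not from the thread).
# Rewrites the SELINUX= line of an SELinux config file to "disabled".
set_selinux_disabled() {
    cfg="${1:-/etc/selinux/config}"          # default: the path named in this thread
    [ -f "$cfg" ] || { echo "no such file: $cfg" >&2; return 1; }
    cp -p "$cfg" "$cfg.bak" || return 1      # keep a backup of the original
    # Match only the setting line; comment lines and SELINUXTYPE= are left untouched.
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$cfg.bak" > "$cfg"
}
```

Inside the chroot from step 5, calling set_selinux_disabled with no argument edits /etc/selinux/config; passing another path lets you try it on a copy first.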

0

WorkinOnIt

December 04, 2018 11:40

Are you on an OVH managed server? I had a similar issue - I believe due to

0

WorkinOnIt

December 04, 2018 11:46

I also have several other servers - awaiting cPanel tech support to provide an update on how to mitigate this issue before updating other servers. I guess if you are running SELinux (I didn't know I was!) you can

0

greektranslator

December 04, 2018 11:51

Indeed. Thanks for the instructions on that thread. I think I will not do anything for now; I will just wait for the cPanel techs to check out my ticket and take the appropriate action.

0

JayFromEpic

December 04, 2018 16:43

I can confirm this is an active problem as well, at least for servers running on Vultr OS templates. For whatever reason, when system updates are applied automatically and the machine is rebooted as needed, SELinux somehow becomes enabled, which is definitely not supposed to happen on a cPanel server. Those are the same instructions Vultr support confirmed with me when I brought this to their attention after having 15 servers crap out at once with them. I was able to just disable SELinux and it resolved the issue. I'll be sending in a ticket as well, with access to a now-fixed machine, for further investigation if needed. EDIT: Ticket submitted with access to a server that was affected: 10888269

0

cPanelMichael

December 05, 2018 03:28

Hello Everyone,

We've received multiple reports of service failures and failed reboot attempts after a system is upgraded to CentOS version 7.6. The upgrade to CentOS 7.6 can occur automatically as part of nightly cPanel & WHM updates on CentOS 7 systems with Operating System Package Updates set to Automatic in WHM Home » Server Configuration » Update Preferences, or when a system administrator manually runs the yum update command.

Upon investigating the issue, it appears the issue is isolated to systems missing the /etc/selinux/config file that's part of the selinux-policy RPM. This RPM and file exist by default on stock CentOS 7 installations, however it looks like some providers intentionally exclude the /etc/selinux/config file in the CentOS and cPanel & WHM images provided to their customers.

Furthermore, an updated Bind package is included as part of the CentOS 7.6 upgrade which includes a hard dependency for the selinux-policy RPM. This leads to the automatic installation of the selinux-policy RPM on systems where it was previously uninstalled. The selinux-policy RPM includes the following post-script:

# rpm -q --scripts selinux-policy-3.13.1-229.el7_6.6.noarch
postinstall scriptlet (using /bin/sh):
if [ ! -s /etc/selinux/config ]; then
#
#     New install so we will default to targeted policy
#
echo "
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=enforcing
# SELINUXTYPE= can take one of three values:
#     targeted - Targeted processes are protected,
#     minimum - Modification of targeted policy. Only selected processes are protected.
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted
" > /etc/selinux/config

This post script automatically creates the /etc/selinux/config file with the SELINUX=enforcing value enabled if the file did not previously exist (or was empty). Once enforcing mode is enabled, issues such as service failures and failed reboot attempts will occur.

Available solutions:

Solution 1 - For servers booted and accessible via SSH

Modify the following value in the /etc/selinux/config file:

SELINUX=enforcing

To:

SELINUX=disabled

Once you've modified the /etc/selinux/config file, reboot the system to enable the change.

Solution 2 - For servers that fail to boot with "Failed to load SELinux policy" console errors

You will need to connect to your server's console using the method documented by your server provider. If console access is not available, contact your provider and ask them to review and perform the workaround steps provided by @WorkinOnIt below:

1. Access the host console and click the send CTRL+ALT+DEL button on the top right, or click [RESTART] to restart the server. As soon as the boot process starts, press ESC to bring up the GRUB boot prompt. You may need to stop the server first, then restart it to reach the GRUB boot prompt.
2. You will see a GRUB boot prompt - press E to edit the first boot option. (If you do not see the GRUB prompt, you may need to press any key to bring it up before the machine boots.)
3. Find the kernel line (it starts with "linux16"), REPLACE ro with rw init=/sysroot/bin/sh and leave the rest of the line unchanged.
4. Press CTRL+X or F10 to boot into single user mode.
5. Access the system with the command: chroot /sysroot
6. Edit the file with vi, e.g. vi /etc/selinux/config, and change the line SELINUX=enforcing to SELINUX=disabled
7. Reboot, or restart the machine at the host control panel.

Advanced Users: Additional information regarding longer than normal reboot times

The following steps explain how it's possible for reboot attempts to take longer than normal on affected systems:

The /etc/selinux/config file does not exist on the system (SELinux defaults to disabled in this state).
The server was rebooted at least once in the past (with /etc/selinux/config missing).
As part of the initial reboot, the /.autorelabel touch file is created during the shutdown phase.
With SELinux disabled, the reboot process finishes normally and the relabeling process is not initiated.
Later, the /etc/selinux/config is created and SELinux is enabled (see the beginning of this post for how this can happen).
Upon the next reboot, the /.autorelabel touch file initiates the relabeling process this time because it detects SELinux as enabled.
The relabeling process can take anywhere from a few minutes to several hours depending on the server hardware and usage.
Once the relabeling process is complete, the server will automatically reboot again.
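The two conditions in the list above (a leftover /.autorelabel file and SELinux no longer disabled in the config) can be checked before rebooting. This is a rough sketch only: the function name relabel_expected is hypothetical, and it reads nothing but the two files named in the list, so treat its answer as a hint rather than a guarantee.

```shell
#!/bin/sh
# Hypothetical check (function name is illustrative): report whether both
# conditions described above for a boot-time relabel are present.
# $1 is the filesystem root to inspect (defaults to /).
relabel_expected() {
    root="${1:-/}"
    root="${root%/}"                       # "/" becomes "", "/mnt/" becomes "/mnt"
    cfg="$root/etc/selinux/config"
    # A missing or empty config is treated as disabled, as described in the list above.
    mode=disabled
    if [ -s "$cfg" ]; then
        # Assumption: a file without a SELINUX= line yields an empty value,
        # which is conservatively treated as "not disabled" below.
        mode=$(sed -n 's/^SELINUX=//p' "$cfg" | tail -n 1)
    fi
    if [ -e "$root/.autorelabel" ] && [ "$mode" != "disabled" ]; then
        echo "relabel expected (SELINUX=$mode, $root/.autorelabel present)"
        return 0
    fi
    echo "no relabel expected"
    return 1
}
```

Calling relabel_expected with no argument inspects the running system; passing a mount point inspects a filesystem mounted elsewhere.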

If you have not yet upgraded your system to CentOS 7.6, and the /etc/selinux/config file does not exist on your system, ensure that it's created with the SELINUX=disabled value before performing the upgrade as seen in the output below:

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of three values:
#     targeted - Targeted processes are protected,
#     minimum - Modification of targeted policy. Only selected processes are protected.
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted
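For anyone who wants to script this pre-upgrade step, here is a minimal sketch. It assumes a POSIX shell run as root; the function name ensure_selinux_config is hypothetical, and the file it writes is a shortened version of the contents shown above. It uses the same -s test as the RPM scriptlet, so it only writes the file in the case where the scriptlet would otherwise have written SELINUX=enforcing.

```shell
#!/bin/sh
# Hypothetical pre-upgrade helper (name is illustrative). Creates the config
# with SELINUX=disabled only when the file is missing or empty, i.e. the same
# "-s" condition under which the RPM scriptlet would write SELINUX=enforcing.
ensure_selinux_config() {
    cfg="${1:-/etc/selinux/config}"
    if [ -s "$cfg" ]; then
        echo "exists, left unchanged: $cfg"
        return 0
    fi
    mkdir -p "$(dirname "$cfg")" || return 1
    cat > "$cfg" <<'EOF'
# This file controls the state of SELinux on the system.
SELINUX=disabled
SELINUXTYPE=targeted
EOF
    echo "created: $cfg"
}
```

An existing non-empty file is never modified, so a system that was deliberately set to permissive or enforcing keeps its setting.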

Let us know if you have any questions. Thank you.

0

Webdew

December 05, 2018 05:46

I'm on OVH and have just had this happen... naturally it's exactly 20 minutes before phone support closed and I cannot get through! Also, after booting to a rescue image and looking for the SELINUX=enforcing flag, only to discover it was already set to SELINUX=disabled, I'm totally lost as to what to do and now looking at 12 hours until phone support comes online - I have emailed them so hopefully they can work out what has happened and get it back online.

WHM/cPanel auto-update is enabled. I received a message that after a new kernel update the server needs to reboot. I logged into WHM and clicked the "You must reboot the server to apply software updates." notice. Then the server never came back. I'm on OVH with a physical machine. They have IPMI java console. I access the machine through this and I get grub> ?

This is not my area of expertise. But after I found the link above and booted into a rescue image to get disk access, I checked that the RAID is OK and used SSH to look for the SELINUX=enforcing line in /etc/selinux/config, and it was already set to 'disabled'. Now I am at a loss as to how to get the server back up. I have ticketed OVH as it is just after hours for phone support... I'm hoping now they can resolve it and understand my notes (I'm also hoping they will not discard my request if they see me connected by the IPMI).

0

Webdew

December 05, 2018 07:17

So OVH 'intervened'. Here are the details of this operation:

Diagnosis interface boot (rescue)
Date: 2018-12-05 18:52:16 AEDT (UTC +11:00)
Here are the details of the operation performed: The server was stuck on the grub. We have seen this issue on other servers. A restart with the OVH kernel (bzimage) allows the server to start.
Actions: restart with the OVH kernel (bzimage)
Result: Boot OK. Server pings and is on login.
Recommendations: Further action is required by the customer to fix the root cause of the grub/kernel issues.

Awesome, it is back up. I'm not sure what I need to do to fix the "grub issue" - certainly I'm extremely hesitant to restart this server right now. Any help or direction appreciated now.

0

LucasRolff

December 05, 2018 09:47

I'm on OVH and have just had this happen... naturally it's exactly 20 minutes before phone support closed and I cannot get through! Also, after booting to a rescue image and looking for the SELINUX=enforcing flag, only to discover it was already set to SELINUX=disabled, I'm totally lost as to what to do and now looking at 12 hours until phone support comes online - I have emailed them so hopefully they can work out what has happened and get it back online.

When booted into the rescue image, did you open /etc/selinux/config directly over SSH, or did you mount your partition on /mnt and change it there? The /etc/selinux/config file in rescue is the SELinux config for the rescue image, not for your server.
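To make the difference concrete, here is a small sketch that reads the SELINUX= value relative to a chosen root directory. The function name selinux_mode_at is hypothetical, and /mnt is only the mount point mentioned above; which partition holds your server's root filesystem is something you need to look up on your own machine.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): print the SELINUX= value from the
# config under a given root, so the rescue image's own /etc is not read by mistake.
selinux_mode_at() {
    root="${1%/}"                          # strip a trailing slash; "/" becomes ""
    cfg="$root/etc/selinux/config"
    [ -f "$cfg" ] || { echo "missing: $cfg" >&2; return 1; }
    sed -n 's/^SELINUX=//p' "$cfg"
}
```

In a rescue session, once the server's root partition is mounted on /mnt, selinux_mode_at /mnt reads the server's file, while selinux_mode_at / reads the rescue image's file.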

0

WorkinOnIt

December 05, 2018 09:58

@ above may assist you if you are confident and still offline - could be worth a go! Otherwise, if you are still online, check @cPanelMichael's explanation

0

WorkinOnIt

December 05, 2018 10:03

Glad you got it solved. Check this out - it worked for me. [QUOTE] Disable SELinux If your server runs an operating system from a source other than the documentation.

While cPanel & WHM can function with SELinux in permissive mode, we recommend that you do not use it. Permissive mode generates a large number of log entries.

To check the status of SELinux on your server, run the sestatus command.

Do not transfer the SELinux configuration file between computers. It may destroy the file's integrity.
Also read @cPanelMichael's

0

cPanelMichael

December 05, 2018 19:09

Hello Everyone, I updated my

0

sparek-3

December 05, 2018 19:16

Does this affect OpenVZ? As far as I know, OpenVZ does not support SELinux, so /etc/selinux/config does not exist. And OpenVZ systems do not install the selinux-policy RPM on the CentOS 7.6 update, so I believe they would be unaffected.

0

cPanelMichael

December 05, 2018 19:31

Does this affect OpenVZ? As far as I know, OpenVZ does not support SELinux, so /etc/selinux/config does not exist. And OpenVZ systems do not install the selinux-policy RPM on the CentOS 7.6 update, so I believe they would be unaffected.

Hello @sparek-3, I don't have an OpenVZ system available for immediate testing, but as I understand it, the /etc/selinux/config file exists by default with any stock CentOS 7 installation (with the SELINUX=disabled value). Are you using a particular image or template to provision the servers? Thank you.

0

sparek-3

December 05, 2018 20:08

I just happen to have 1 OpenVZ CentOS 7 server. It does not have an /etc/selinux/config file and yum check-update does not list an selinux-policy package to be installed (I haven't yet updated to CentOS 7.6). I'm not sure how CentOS 7 was installed on this particular OpenVZ VPS. Perhaps others might chime in? I suspect I can add an /etc/selinux/config with SELINUX=disabled just to be safe, but I don't think it would make any difference. The issue appears to only pop up on systems where selinux-policy is updated.

0

greektranslator

December 06, 2018 09:46

These are the contents of my selinux config, what should I do?


# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=permissive
# SELINUXTYPE= can take one of three two values:
#     targeted - Targeted processes are protected,
#     minimum - Modification of targeted policy. Only selected processes are protected.
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted

0

greektranslator

December 06, 2018 15:24

This was the host's reply (I had pointed them to this thread): Your server got updated and with the new update is not able to boot correctly. Since monitoring was enabled, our system detected that and put in an intervention to check why the server was offline. The message you have received is the result of that intervention, advising you to fix the software problems preventing your server from correctly booting into the OS. I checked the link that you sent me and it refers to a configuration file that does not exist. You can reboot the server in rescue mode and create the missing configuration in order for the server to boot. From what I saw, the server is now online and you are using one of the OVH kernels. I suggest in this case that you log in to your server and start the configuration of the kernel in order to boot from your kernel and have the services running.

0

cPanelMichael

December 06, 2018 15:35

The issue appears to only pop up on systems where selinux-policy is updated.

An updated Bind package is included as part of the CentOS 7.6 upgrade and comes with a hard dependency for the selinux-policy RPM. Thus, upgrading to CentOS 7.6 will automatically install the selinux-policy RPM on systems where it was previously uninstalled, leading to the issue described in this thread.

These are the contents of my selinux config, what should I do?

The contents you pasted show that SELinux is not configured with enforcing mode enabled, so you should not experience any problems after rebooting the server. Are reboot attempts failing? Or, are you experiencing any specific issues? Thank you.

0

Adamfynd

December 06, 2018 15:42

Hi. An hour ago I clicked on the updated WHM. Unfortunately, the entire server stopped working. The server is now in rescue mode. How do I fix the problem in rescue mode? Please, I want a quick fix.

0

JayFromEpic

December 06, 2018 16:14

These are the contents of my selinux config, what should I do?

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=permissive
# SELINUXTYPE= can take one of three two values:
#     targeted - Targeted processes are protected,
#     minimum - Modification of targeted policy. Only selected processes are protected.
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted

You would want your configuration file updated to the following:


# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of three two values:
#     targeted - Targeted processes are protected,
#     minimum - Modification of targeted policy. Only selected processes are protected.
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted

Once you have finished updating this file, reboot the server. As cPanelMichael mentioned, it will take some time for the command to process most likely because of the relabeling process. I was able to force reboot one of our servers via the control panel of our server provider but I don't normally recommend this due to the risk of data loss.

0

cPanelMichael

December 06, 2018 17:00

The server is now in rescue mode. How do I fix the problem in rescue mode?

You'll need console access to the server to solve the problem. If you have console access, you can perform the steps noted under "Solution 2" on if you don't have experience performing this type of action on your own.

Once you have finished updating this file, reboot the server. As cPanelMichael mentioned, it will take some time for the command to process most likely because of the relabeling process.

As long as SELinux is disabled before the server is rebooted, the relabeling process should not occur at boot time. Thank you.

0

Webdew

December 07, 2018 01:16

So OVH 'intervened' recommendations: Further action is required by the customer to fix the root cause of the grub/kernel issues

Received a follow up from OVH: "After some recent feedback from other clients, we can confirm that there's likely an issue with our installation template for CentOS 7 with the Host lineup of servers. Since the issue at hand has to do with the software, we wouldn't be able to provide a fix on our end, but it's possible that booting the server using a network kernel can remedy the issue. Below is a link with more information on booting from a network kernel and there's another link below it with more information on how to update the kernel if you wanted to implement your own fix. Updating the kernel on a dedicated server" So I'm still not sure how to be sure I have fixed it. I'm not wanting to reboot as I'm still not sure how OVH managed to get it started again when it stopped before. Will future cPanel patches be able to revert / check and ensure this is now OK?

0

cPanelMichael

December 07, 2018 16:31

So I'm still not sure how to be sure I have fixed it. I'm not wanting to reboot as I'm still not sure how OVH managed to get it started again when it stopped before.

They have IPMI java console. I access the machine through this and I get grub> ?

Hello @Webdew, Once you're at the "grub" boot prompt, you could try following the steps provided by @WorkinOnIt earlier in this thread. See a quote below:

2. You will see a GRUB boot prompt - press E to edit the first boot option. (If you do not see the GRUB prompt, you may need to press any key to bring it up before the machine boots.)
3. Find the kernel line (it starts with "linux16"), REPLACE ro with rw init=/sysroot/bin/sh and leave the rest of the line unchanged.
4. Press CTRL+X or F10 to boot into single user mode.
5. Access the system with the command: chroot /sysroot
6. Edit the file with vi, e.g. vi /etc/selinux/config, and change the line SELINUX=enforcing to SELINUX=disabled
7. Reboot, or restart the machine at the host control panel.

See also the advice from @LucasRolff as well:

When booted into the rescue image, did you open /etc/selinux/config directly over SSH, or did you mount your partition on /mnt and change it there? The /etc/selinux/config file in rescue is the SELinux config for the rescue image, not for your server.

Let me know if this helps. Thank you.

0

omaniyat

December 09, 2018 13:05

Hello, I'm on OVH and I have the same issue. Can someone help me and make the server start again, please? Thank you,

0

greektranslator

December 09, 2018 14:09

Hi, I had to assign this to a freelance sysadmin. The KVM was not working, neither the IPMI! OVH had to intervene to fix the IPMI. It took the sysadmin several hours to resolve the problem, quoting: The grub-efi package was never installed or removed. (Probably when setting up the server you chose for OVH kernel so they didn't install it). That explains why it needed a command to find the config file on each boot.

0

WorkinOnIt

December 10, 2018 01:41

Hi, I had to assign this to a freelance sysadmin. The KVM was not working, neither the IPMI! OVH had to intervene to fix the IPMI. It took the sysadmin several hours to resolve

Ouch! Painful - it never rains but it pours!! When I first spin up a machine, I always run through the console checks to make sure I will always have access in an emergency. Glad you got it sorted though.

0

cPanelMichael

December 10, 2018 14:54

Hi, I had to assign this to a freelance sysadmin. The KVM was not working, neither the IPMI! OVH had to intervene to fix the IPMI. It took the sysadmin several hours to resolve the problem, quoting: The grub-efi package was never installed or removed. (Probably when setting up the server you chose for OVH kernel so they didn't install it). That explains why it needed a command to find the config file on each boot.

Hello @greektranslator, Thanks for sharing the outcome. Do you have any additional details to share, such as the specific commands that were run, in case other OVH users face the same problem? Thank you.

0

greektranslator

December 10, 2018 16:39

No sorry, as I said, this was assigned to a sysadmin who struggled for a considerable amount of time, I can get people in touch with him if they so desire.

0
