Up to 18-hour delay for checkallsslcerts triggered web app errors during PEAK HOURS!

Comments

5 comments

  • cPRex Jurassic Moderator

    Hey there!  I hadn't heard about this one before, but you are absolutely correct - the maintenance tool picks a random time within the next 18 hours, which we can see inside the /usr/local/cpanel/scripts/maintenance file:

    sub action_checkallsslcerts {

        # Base install does this in the background before upcp
        return if $ENV{'CPANEL_BASE_INSTALL'};

        my $max_delay_seconds = 18 * 60 * 60;                                                  # 18 hours
        my $bytes_to_get      = length($max_delay_seconds) + 1;
        my $rand_int          = Cpanel::Rand::Get::getranddata( $bytes_to_get, [ 0 .. 9 ] );
        my $delay_seconds     = $rand_int % $max_delay_seconds;

        # Should be between 1 and $max_delay_seconds
        # scheduling a task for 0 seconds will cause queueprocd to throw an error
        $delay_seconds++;

        return (
            show_status('Scheduling task to check service default SSL/TLS certificates'),
            sub {
                Cpanel::ServerTasks::schedule_task(
                    ['SSLTasks'],
                    $delay_seconds,
                    'checkallsslcerts'

    You are also correct about the "why" - we originally implemented this to keep cPanel requests from overloading the AutoSSL servers at Sectigo.
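The delay arithmetic in the excerpt above can be restated roughly as follows. This is an illustrative Python sketch, not cPanel code: `pick_delay_seconds` is a hypothetical name, and it assumes `Cpanel::Rand::Get::getranddata( $n, [ 0 .. 9 ] )` returns `$n` random characters drawn from the digits 0-9.

```python
import secrets


def pick_delay_seconds(max_delay_seconds: int = 18 * 60 * 60) -> int:
    """Return a random delay between 1 and max_delay_seconds, inclusive."""
    # One more digit than the maximum has, mirroring length($max_delay_seconds) + 1.
    # For 64800 (18 hours) that is 6 digits, i.e. a number from 0 to 999999.
    digits_to_get = len(str(max_delay_seconds)) + 1

    # Assumed stand-in for getranddata: a string of random decimal digits.
    rand_digits = "".join(secrets.choice("0123456789") for _ in range(digits_to_get))
    rand_int = int(rand_digits)

    # The modulo folds the number into 0 .. max-1; adding 1 shifts it to 1 .. max,
    # so a delay of 0 seconds is never scheduled.
    return rand_int % max_delay_seconds + 1


delay = pick_delay_seconds()
print(f"{delay} seconds, about {delay / 3600:.1f} hours")
```

With the default of 64800 seconds, any delay from one second up to the full 18 hours can come out, which is why the check can land in the middle of the day.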

    One possible workaround would be to flush the task queue on the server immediately after upcp runs.  This would force the scheduled random maintenance to happen immediately.  You could do that by adding the following line to /scripts/postupcp:

    /usr/local/cpanel/bin/servers_queue run

    Let me know if that helps!

  • Kenric Ashe

    My personal solution was to change $max_delay_seconds to 1800, i.e. half an hour, which is plenty for the load issue, especially when cert updates happen only a few times a year. I also need the check to finish before my scheduled 3AM WHM Backup.
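To see what the two limits mean against a 3AM backup, here is a small Python sketch of the worst case. The 21:00 start time and the dates are made up for illustration, `latest_run` is a hypothetical helper, and the sketch assumes the delay is counted from roughly when upcp schedules the task.

```python
from datetime import datetime, timedelta


def latest_run(scheduled_at: datetime, max_delay_seconds: int) -> datetime:
    """Latest moment the delayed task could start, given the maximum delay."""
    return scheduled_at + timedelta(seconds=max_delay_seconds)


# Hypothetical values: upcp schedules the task at 21:00, backup starts at 03:00 the next day.
scheduled_at = datetime(2024, 1, 1, 21, 0)
backup_start = datetime(2024, 1, 2, 3, 0)

for max_delay in (18 * 60 * 60, 1800):
    latest = latest_run(scheduled_at, max_delay)
    verdict = "before" if latest < backup_start else "possibly during or after"
    print(f"max delay {max_delay:>5} s: latest run {latest:%Y-%m-%d %H:%M} ({verdict} the backup)")
```

Under these assumed times, the 18-hour limit allows a run as late as 15:00 the next day, while the 1800-second limit caps it at 21:30 the same evening.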

    But my main reason for this post was to identify an issue that affects all cPanel users. Probably the vast majority aren't even aware of it. But it is a bug which needs to be resolved for all users instead of individual sysadmins editing cPanel scripts on their servers, especially when those scripts might be overwritten in future upgrades.

    In other words, it should either function as stated in the docs (2AM) or at least limit the max delay and mention that in the docs.

    Can do?

  • cPRex Jurassic Moderator

    Sure - I've made case CPANEL-44141 to bring this up to our SSL team and I've linked them this thread as well.  If I hear any updates on it I'll be sure to post!

  • Kenric Ashe

    Thank you!

  • cPRex Jurassic Moderator

    Sure thing!
