Final state is Backup Partial Failure error

Hi there! Over the past couple of weeks, one of our servers has been unable to complete its weekly backup, even though nothing changed on our end. Our latest backup log isn't very helpful:


[2019-03-18 04:53:51 -0300] info [backup] no_transport = 0 .. and queueid = TQ:TaskQueue:24463
[2019-03-18 04:53:51 -0300] info [backup] leaving queue_backup_transport_item
[2019-03-18 04:53:51 -0300] info [backup] checking backup for user01
[2019-03-18 04:53:51 -0300] info [backup] Skipping suspended account user01.
[2019-03-18 04:53:51 -0300] info [backup] checking backup for user02
[2019-03-18 04:53:51 -0300] info [backup] Skipping suspended account user02.
[2019-03-18 04:53:51 -0300] info [backup] Queuing transport of meta file: /backup/weekly/2019-03-17/accounts/.master.meta
[2019-03-18 04:53:51 -0300] info [backup] no_transport = 0 .. and queueid = TQ:TaskQueue:24464
[2019-03-18 04:53:51 -0300] info [backup] leaving queue_backup_transport_item
[2019-03-18 04:53:51 -0300] info [backup] Queuing transport of file: /backup/weekly/2019-03-17/backup_incomplete
[2019-03-18 04:53:51 -0300] info [backup] no_transport = 0 .. and queueid = TQ:TaskQueue:24465
[2019-03-18 04:53:51 -0300] info [backup] leaving queue_backup_transport_item
[2019-03-18 04:53:51 -0300] warn [backup] Pruning of backup files skipped due to errors. at /usr/local/cpanel/bin/backup line 394.
        bin::backup::run("bin::backup") called at /usr/local/cpanel/bin/backup line 122
[2019-03-18 04:54:12 -0300] info [backup] Scheduling backup metadata vacuum
[2019-03-18 04:54:12 -0300] info [backup] Queuing transport reporter
[2019-03-18 04:54:12 -0300] info [backup] no_transport = 0 .. and queueid = TQ:TaskQueue:24466
[2019-03-18 04:54:12 -0300] info [backup] leaving queue_backup_transport_item
[2019-03-18 04:54:12 -0300] info [backup] Completed at Mon Mar 18 04:54:12 2019
[2019-03-18 04:54:12 -0300] info [backup] Final state is Backup::PartialFailure (0)
[2019-03-18 04:54:12 -0300] info [backup] Sent Backup::PartialFailure notification.
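
Since almost every entry in that log is at the info level, one way to narrow it down is to filter out everything that isn't. Here is a minimal Python sketch, assuming every entry follows the `[timestamp] level [source] message` layout shown above. The function name `non_info_lines` is my own and not part of cPanel.

```python
import re

# Matches entries such as:
#   [2019-03-18 04:53:51 -0300] warn [backup] Pruning of backup files skipped ...
# Continuation lines (e.g. the indented stack trace) do not match and are ignored.
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>\w+) \[(?P<src>[^\]]+)\] (?P<msg>.*)$"
)


def non_info_lines(lines):
    """Return (timestamp, level, message) for every entry whose level is not 'info'."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match and match.group("level") != "info":
            hits.append((match.group("ts"), match.group("level"), match.group("msg")))
    return hits
```

To use it, open the backup log file and pass the file object to `non_info_lines`. On the excerpt above it would surface only the "Pruning of backup files skipped due to errors" warning, which suggests the actual error was logged earlier in the run.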

However, last week's log does suggest something killed the process:


Skipping access-logs
Skipping .cpanel/caches
Skipping .cpanel/datastore
Skipping .cagefs
.........
.........
.........
.........
.........
.........
.........
.........
rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(638) [generator=3.1.2]
rsync error: received SIGUSR1 (code 19) at main.c(1429) [receiver=3.1.2]
[2019-03-10 00:40:37 -0300] info [backup] Final state is Backup::Failure (HUP)
[2019-03-10 00:40:37 -0300] info [backup] Sent Backup::Failure notification.

Given how different the two log outputs are, I'm not sure whether they were caused by the same issue. Our dmesg output doesn't go back far enough to cover when this occurred, so I can't confirm whether the OOM killer was triggered in either case. Our memory usage graphs don't show anything out of the ordinary. There's 1.6TB of free space available on the backup NFS mount, and I can write data to it just fine. Is there anything in cPanel v78 that causes its backup process to use more memory than it used to? The only recent change I can think of is cPanel's upgrade from v76 to v78. Any assistance is greatly appreciated. Thanks, everyone!
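
For the next run, the OOM question could be checked against a persisted kernel log rather than the dmesg ring buffer. Below is a rough Python sketch, assuming the kernel messages are saved to a text file somewhere on the server (the location depends on the syslog setup). The pattern covers the phrases the Linux kernel usually logs when the OOM killer fires; the function name `oom_events` is hypothetical.

```python
import re

# Phrases typically written by the Linux kernel when the OOM killer runs.
# The exact wording varies between kernel versions, so treat this as a starting point.
OOM_PATTERN = re.compile(
    r"invoked oom-killer|Out of memory: Kill(ed)? process", re.IGNORECASE
)


def oom_events(lines):
    """Return every line that looks like an OOM killer message."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]
```

Passing an open kernel log file to `oom_events` and comparing the timestamps of any hits with the backup failure times would show whether the two line up.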
