
Backup Pruning Timeout

Comments

12 comments

  • CanSpace
    To answer my own question, it seems like patching /usr/local/cpanel/Cpanel/Transport/Files/Rsync.pm to allow a greater than 300 second timeout would do the trick. I'm still not sure why cPanel limits this to 300 seconds. This solution was inspired by the posts in this thread:
  • cPanelLauren
    Hi @CanSpace There's a relatively old inquiry I found with this exact question and the response from development was as follows:

    [QUOTE]
    It's important to note what kind of timeout is being hit. Based on the log entry, this is an idle connection timeout:

    [2014-09-19 07:26:32 +0100] warn [cpbackup_transporter] Upload attempt failed: RequestTimeout: Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed. at /usr/local/cpanel/Cpanel/LoggerAdapter.pm line 26

    This means literally no packets made it either to or from the cPanel server in 5 minutes. Increasing this timeout wouldn't do any good, since this is 100% a network issue. If the transporter is actively transferring data, that timeout can be increased up to 2 days. The 300 seconds only applies if data stops flowing. Perhaps this needs to be documented better; we have a lot of different timeouts and they all signify different conditions. Keep in mind that a currently-running backup transporter will prevent a new one from running. Continually increasing timeouts will eventually run into problems with backups getting deleted before they can be transported, and/or customer drives filling up while they wait in vain for backups to get moved.
    [/QUOTE]

    In your case does the transport log indicate the same/similar issue?
  • CanSpace
    [2020-01-27 14:53:37 -0500] info [cpbackup_transporter] cpbackup_transporter - Processing next task
    [2020-01-27 14:53:37 -0500] info [cpbackup_transporter] Instantiating Object
    [2020-01-27 14:53:37 -0500] info [cpbackup_transporter] Starting a "prune" operation on the "xxxx" destination ID "LXhr8rkO7BDl5cnFU7v0yprS".
    [2020-01-27 14:53:37 -0500] info [cpbackup_transporter] Performing prune operation, retaining 2 items on: xxxx
    [2020-01-27 14:53:38 -0500] info [cpbackup_transporter] Pruning backup directory: 2020-01-22, from xxxx
    [2020-01-27 22:53:37 -0500] info [cpbackup_transporter] ERROR: Pruning 2020-01-22 from xxxx: time out reached
    [2020-01-27 22:53:37 -0500] info [cpbackup_transporter] The system could not prune the "2020-01-22" directory due to an error.

    Now that I look at it, it's not reaching the 300 second limit - it's reaching the 8 hour total destination backup limit. What might be causing this? If I log in to the server and delete the directory manually, it only takes a minute or so. Is there something else that could be disconnecting and killing the connection?

    On another server:

    [2020-01-27 12:04:29 -0500] info [cpbackup_transporter] cpbackup_transporter - Processing next task
    [2020-01-27 12:04:29 -0500] info [cpbackup_transporter] Instantiating Object
    [2020-01-27 12:04:29 -0500] info [cpbackup_transporter] Starting a "prune" operation on the "xxxx" destination ID "0lotB6lKc0bSDgR7OiWKYiXl".
    [2020-01-27 12:04:29 -0500] info [cpbackup_transporter] Performing prune operation, retaining 2 items on: xxxx
    [2020-01-27 12:04:29 -0500] info [cpbackup_transporter] Pruning backup directory: 2020-01-19, from xxxx
    [2020-01-27 13:07:18 -0500] info [cpbackup_transporter] ERROR: Pruning 2020-01-19 from xxxx: ssh slave failed: timed out
    [2020-01-27 13:07:18 -0500] info [cpbackup_transporter] The system could not prune the "2020-01-19" directory due to an error.

    Going by the timestamps, this one timed out after about 63 minutes. What might be causing this? Also notice the errors are slightly different.
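
How long each prune attempt ran can be read off the log timestamps. The following is a minimal sketch, assuming GNU `date` (for its `-d` option); the helper name `elapsed_seconds` is hypothetical, and the timestamps are copied from the two logs above.

```shell
# Hypothetical helper: seconds between two cpbackup_transporter log timestamps.
# Assumes GNU date, which parses "YYYY-MM-DD HH:MM:SS +ZZZZ" via -d.
elapsed_seconds() {
    echo $(( $(date -d "$2" +%s) - $(date -d "$1" +%s) ))
}

# First server: prune started at 14:53:38, error logged at 22:53:37.
elapsed_seconds "2020-01-27 14:53:38 -0500" "2020-01-27 22:53:37 -0500"

# Second server: prune started at 12:04:29, error logged at 13:07:18.
elapsed_seconds "2020-01-27 12:04:29 -0500" "2020-01-27 13:07:18 -0500"
```

The first gap is 28799 seconds, one second short of 8 hours. The second is 3769 seconds, a little under 63 minutes.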
  • CanSpace
    To answer my own question, this actually wasn't a timeout issue. If I log in to the backup server and su to the rsync backup user, I can see that one of the directories is set to -rw-r--r--, so the owner cannot delete it. cPanel is just throwing the wrong error. The reason I didn't check this originally is that on some of our servers, cPanel does show the permission error when pruning backups fails, whereas here it just shows a timeout.
  • cPanelLauren
    Nice, @CanSpace. I'm actually glad I deterred you from the rabbit hole of making the modifications you were initially planning. 644 for a directory is kind of unusual, but was the actual problem that the ownership was incorrect? Anything owned by the cPanel user/group should be removable by that user.
  • CanSpace
    The changes to be honest seem relatively straightforward - it's a fairly simple perl script. Yes it was owned by the same user, but the file did not have owner write permissions. When I su'd to that user I was not able to rm the file (permission denied), but I could (as the same user) chmod u+w the file and then remove it. As mentioned, I did not think this was a permission issue because when this happens on another server of ours, the cPanel backup failed email actually shows the permission issue, as opposed to just showing a timeout error.
  • cPanelLauren
    The changes to be honest seem relatively straightforward - it's a fairly simple perl script

    Yea, I agree, and thankfully pretty easy to read. It's not difficult, but I think there would have been some further confusion when the timeout wasn't resolved.
    As mentioned, I did not think this was a permission issue because when this happens on another server of ours, the cPanel backup failed email actually shows the permission issue, as opposed to just showing a timeout error.

    The actual error that's being output is probably not a permission error. I was more curious about what the underlying issue was than whether or not it was a permissions error. A permission failure caused by bad ownership may be something that isn't expected (though that specifically is doubtful).
  • CanSpace
    Again the issue is that some files don't have user write permissions, so the user cannot delete them, and the rm -rf fails. Whereas the cPanel email just says "timed out" - this is what led me in the wrong direction. On another server with the exact same issue (and the exact same backup destination server) the email actually does show the permission issues (ie cannot delete file, permission denied), so I assumed if there actually was a permission issue, that the email would have told me. Not sure why cPanel has this inconsistency - but this issue has likely led a lot of people to think there is a timeout issue when it's really just a permission issue.
  • cPanelLauren
    Again the issue is that some files don't have user write permissions, so the user cannot delete them, and the rm -rf fails. Whereas the cPanel email just says "timed out" - this is what led me in the wrong direction.

    Ok, to clarify what I mean: If I log in as the user lauren to my server - it's a standard user with no special permissions, and create a file or directory with 644 permissions - because it is owned by my user and group I can remove that file. There should absolutely not be an issue with that. Standard file permissions are 644 - if it were the case that the user was unable to remove files owned by them with permissions of 644 almost no one would be able to remove their own files. The issue I'm trying to understand here is specifically why this specific folder was unable to be removed. Either the attributes are modified or it isn't owned by that user/group
  • CanSpace
    Sorry I pasted the wrong permissions above. The directory was set to 400 (ie r--------), so the owner-write permission was not set, and therefore the user could not delete their own files (without first changing the permission on the directory to 600). This is a major nuisance because many applications create files without user-write permissions (444 or 400) like Drupal does by default on the /sites/default directory. rvSiteBuilder does this too with some specific template directories. I had to go in and manually change all these directories to 600 or 644 and then the backup user was able to delete them successfully. So every time a user installs Drupal and it gets backed up, cPanel will not be able to prune the backups properly. This is likely a very common issue with cPanel backups, but since cPanel sends an error message saying there was a "timeout" instead of actually showing the permission issues, everyone assumes there is a timeout issue - which is likely why there are so many discussions about this. I've fixed the permission issues and everything is working fine now, but it's only a matter of time until another Drupal installation causes the pruning issue again. The ideal solution to this would be for the rsync copy to make sure that files/directories it copies have at least permissions of 600.
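
The manual cleanup described above could be scripted on the backup destination. This is only a sketch: the function names and the example path are hypothetical, it assumes a `find` that accepts symbolic modes with `-perm` and runs `-exec ... \;` on a directory before descending into it (GNU find does), and it must run as the user that owns the backup files.

```shell
# Hypothetical helpers; names and paths are illustrative only.

# Print directories under $1 whose owner lacks write or execute permission.
# -prune stops find from descending into a directory it has just flagged,
# since it may not be able to traverse it anyway.
list_unprunable_dirs() {
    find "$1" -type d ! -perm -u=wx -print -prune
}

# Give the owner rwx on every such directory. Each chmod runs before find
# descends into that directory, so restricted directories nested below it
# are reached and fixed as well.
fix_unprunable_dirs() {
    find "$1" -type d ! -perm -u=wx -exec chmod u+rwx {} \;
}

# Example (placeholder path for a dated backup directory):
# list_unprunable_dirs /path/to/backups/2020-01-22
# fix_unprunable_dirs /path/to/backups/2020-01-22
```

Note that a directory needs both owner write and owner execute permission before entries inside it can be deleted, which is why the sketch sets `u+rwx` instead of mode 600.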
  • cPanelLauren
    Again, while I can appreciate that the permissions were incorrectly relayed prior, I think there is some confusion about what a user can and can't do. If my user:group is lauren:lauren and my file is owned by the same user ID and group ID (UID:GID), there are absolutely no restrictions as to what my user can remove unless the file's attributes have been modified. I can set the permissions of a file/directory my user owns to 000 and still be able to remove it. For example:

    [lauren@server ~]$ mkdir permissions_test
    [lauren@server ~]$ chmod 400 permissions_test
    [lauren@server ~]$ stat permissions_test/
    File: "permissions_test/"
    Size: 4096 Blocks: 8 IO Block: 4096 directory
    Device: fd01h/64769d Inode: 2099840 Links: 2
    Access: (0400/dr--------) Uid: ( 1000/ lauren) Gid: ( 1002/ lauren)
    Access: 2020-01-30 14:32:49.387711162 -0600
    Modify: 2020-01-30 14:32:49.387711162 -0600
    Change: 2020-01-30 14:32:57.651711765 -0600
    Birth: -
    [lauren@server ~]$ rm -rf permissions_test
    [lauren@server ~]$
    [lauren@server ~]$ stat permissions_test
    stat: cannot stat "permissions_test": No such file or directory

    There are only two cases in which the user wouldn't be able to remove a file:
    • The file has had its attributes modified (i.e., it is immutable)
    • The file is not owned by the same UID/GID as the user's.
    What I believe is actually occurring here is that, because the folder does not have the execute bit set, the pruning operation tries to navigate the directory to see if there is content below, cannot, and times out doing so. It doesn't report back an error, or the error is not the expected error. This isn't really a straightforward permissions error (though still technically one).
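
The conditions discussed here (ownership, mode bits, and attribute flags) can each be inspected directly. This is a minimal sketch, assuming GNU `stat` (for `-c`); the helper name `describe_path` is hypothetical, and `lsattr` is only consulted if it is installed and the filesystem supports it.

```shell
# Hypothetical helper: print the facts that decide whether "rm" can succeed.
describe_path() {
    # Octal mode, owner:group, and name of the entry itself.
    stat -c '%a %U:%G %n' "$1" || return 1
    # Same for the parent directory: unlinking an entry requires write and
    # execute permission on the directory that contains it.
    stat -c '%a %U:%G %n' "$(dirname "$1")" || return 1
    # Attribute flags, if available (an "i" marks an immutable entry).
    if command -v lsattr >/dev/null 2>&1; then
        lsattr -d "$1" 2>/dev/null || true
    fi
}

# Example:
# describe_path permissions_test
```

For an empty directory, only the parent's permissions matter for removal. Once the directory contains entries, its own write and execute bits decide whether those entries can be unlinked.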
  • CanSpace
    I don't think it's that simple. If lauren:lauren creates directory "dir1", goes into dir1 and creates a file "file1" (with permissions 777, or anything else; permissions on the file do not matter), then goes up a level, chmods dir1 to 400, and then tries to rm -rf dir1, the command will fail. For example:

    [user1@server1 ~]$ mkdir dir1
    [user1@server1 ~]$ cd dir1
    [user1@server1 dir1]$ touch file1
    [user1@server1 dir1]$ cd ..
    [user1@server1 ~]$ chmod 400 dir1
    [user1@server1 ~]$ rm -rf dir1
    rm: cannot remove "dir1/file1": Permission denied

    As you can see, the rm -rf fails. This is the command that the cPanel backup pruning process runs, and why it always fails.
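
A removal that tolerates this situation would restore the owner's permissions first and only then delete. This is a sketch only, not what the cPanel pruning code does: the helper name `force_remove` is hypothetical, and it assumes the caller owns every entry and that `chmod -R` changes a directory's mode before descending into it, as GNU chmod does.

```shell
# Hypothetical helper: remove a tree even if directories inside it are
# missing owner write/execute permission.
force_remove() {
    # u+rwX: owner read/write everywhere; execute only on directories
    # (and on files that already carry an execute bit).
    chmod -R u+rwX "$1" || return 1
    rm -rf "$1"
}

# Example, using the directory from the session above:
# force_remove dir1
```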
