[Opinion Requested] Most crucial server files to back up
Hello,
After reading
Really looking forward to reading your opinion!
Thank you
cPanel has a cPanel documentation for this), /home/ for most of the user data and /var/lib/ for the databases.
Thank you both for your replies.
We do use cPanel's system backups to back up the account files and also the
Disaster scenario #1: Let's say a VPS breaks down. Right now there are 2 solutions: 1) restore a VPS image backup and 2) set up cPanel, CloudLinux, etc. and then restore each account and the system settings. Both scenarios will take many hours to complete.
Disaster scenario #2: Let's say a dedicated server dies. The solution is to set up the Virtual Environment and then restore each VPS image backup. It will certainly take a day to complete.
Important Question: Is what we are doing enough? Can we do anything else to decrease the time needed to come back online and also increase the chances of success/decrease the chances of failure? What would you do differently in both scenarios? Have you heard of anything better?
I'm open to all suggestions and keen to try anything useful, so please be so kind as to share your knowledge/suggestions/ideas/experience.
Thank you
That's the answer I have been looking for over the last year. I think JetBackup incremental backups are a good option: if we face a hardware failure or a compromised server, we can restore the backup using disaster recovery, as the process is faster. If anyone has any ideas, please share.
We do use cPanel's system backups to back up the account files and also the
Disaster scenario #1: Let's say a VPS breaks down. Right now there are 2 solutions: 1) restore a VPS image backup and 2) set up cPanel, CloudLinux, etc. and then restore each account and the system settings. Both scenarios will take many hours to complete.
Yep, a restore could take hours (I remember pulling a 36-hour "shift" to restore a failed server once). Hard drives and computers are only so fast. If you have a VPS snapshot of the running system, you do not then need to set up cPanel/CloudLinux etc. Restoring the snapshot should restore everything. If you only have a "template snapshot" you use for your VPS, then yes, you will have to manually reinstall CloudLinux and WHM/cPanel before applying the backups (but if you have any deployment software in place, that reinstallation shouldn't take too long).
The only ways to avoid a restore taking hours are:
* Having a warm/hot spare, i.e. a "standby" server to which changes from your "main/live" server are copied very frequently. If your main server goes down, you just need to switch the IP addresses on the standby and you should be up and running with minimal loss (emails that reached the old server after the most recent sync will be lost, but that's better than the hours lost if you have to restore from yesterday's backup).
* Having all the data on a network share/mount using a SAN (Storage Area Network) or NAS (Network Attached Storage). If your main server dies, the data is available for another server to take over. However, if the SAN/NAS fails in any way, then you have to resort to backups again (a good SAN should be more reliable than a "common server", but odd issues can still occur: decades ago, I had to resolve an issue where the file allocation table on a SAN got corrupted. All the files were still there, but the system had no idea "where").
Disaster scenario #2: Let's say a dedicated server dies. The solution is to set up the Virtual Environment and then restore each VPS image backup. It will certainly take a day to complete.
You have two ways of restoring a VPS host server: either per VPS image (which means that sites/accounts will gradually come back online as they are restored) or "entire machine" (usually a block-level backup). Either way, it takes a while to restore. However, if you take frequent VPS snapshots from server 'A' and restore them to server 'B' (even if they are transferred across in a "powered-off"/"frozen" state) and server A then fails, you can just start them on server B instead (see the warm/hot spare solution above).
Important Question: Is what we are doing enough? Can we do anything else to decrease the time needed to come back online and also increase the chances of success/decrease the chances of failure? What would you do differently in both scenarios? Have you heard of anything better?
Disaster Recovery procedures are a whole can of worms... You first need to figure out what you are actually protecting against (a hot spare is good for hardware failure, but not so much against a malicious employee or a virus changing the data. Backups are good for multiple "point in time" recoveries, which give you a chance to roll back to before the employee/virus. However, if you keep the backups/hot spare in the same datacentre, which will make backups faster, a datacentre fire means the data is lost).
I would try to figure out how much an hour of outage would "cost" you (if you are a web hosting provider giving 99% uptime, that means you have over 3 days per year you can be offline without your "compensation" kicking in) and then work from there. It's no good speccing for a "99.999%" (5 minutes per year) backup/recovery system if the most it'll cost you is $100 per day...
I would avoid cheap hardware/datacentres and have a mixed backup strategy to cover major issues (so maybe keep the previous hour's/day's backup on a secondary drive in the machine, the previous days'/weeks' backups on a different server in the same datacentre, and the backup from the week before that somewhere else entirely. I've found Amazon Glacier quite good for stuff you hope you never have to touch).
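To put rough numbers on the uptime figures above, here is a minimal Python sketch of the arithmetic. The function names are made up for illustration, and it assumes a 365-day year and that outage cost accrues evenly across a day, which is a simplification rather than how any particular SLA is written.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def allowed_downtime_minutes(uptime_percent: float) -> float:
    """Minutes per year a service can be offline and still meet the uptime target."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (100 - uptime_percent) / 100


def outage_cost(hours_offline: float, cost_per_day: float) -> float:
    """Rough outage cost, assuming the daily cost is spread evenly over 24 hours."""
    return cost_per_day * hours_offline / 24


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% uptime: {minutes:.1f} min/year ({minutes / 1440:.2f} days)")
    # Example inputs taken from the thread: a 36-hour restore at $100 per day.
    print(f"36-hour restore at $100/day: about ${outage_cost(36, 100):.0f}")
```

With these assumptions, 99% works out to roughly 3.65 days per year and 99.999% to a little over 5 minutes, which matches the figures quoted above. Comparing the outage cost against the price of a standby server or second datacentre is one way to decide how much recovery speed is worth paying for.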
Thank you @rbairwell for your detailed reply! I really appreciate that you took the time for it!
Other than the SAN idea, we've gone through pretty much all of the other ideas in the past. Among other things, one reason we use NVMe-only servers is their higher speed when a restoration is needed. Out of all the options we have examined:
1) The standby server solution in a Failover Cluster setup seems the most reliable one.
2) The VPS snapshot is the cheapest one.
3) Amazon Glacier can't suit us, as you basically store pretty old backups in Glacier (files older than 6 months are stored there, if I remember correctly).
4) Unfortunately, having to rely on multiple data centers to be better safe than sorry is a reality.
We have not yet examined the JetBackup solution @retechpro mentioned. I had high hopes that the industry had already found other, more efficient ways of doing disaster recovery that we are not aware of. So, out of all the options available, have you heard what larger hosting providers are doing or what's working better? I'm sure that companies with a few hundred servers face disaster situations more frequently due to their larger server usage.
Thank you