Backup Tools
Latest revision as of 13:33, 28 January 2011
As of January 2011 I have moved to different backup software. Read about the new solution in "Proud to be paranoid -- a backup strategy for the lazy and forgetful".
Why?
Contrary to popular belief, computer failures do happen! They are rare with today's hardware, but when they do happen, they are devastating. Once I dropped my notebook while running from room to room in the office -- the computer survived with some minor blemishes -- but the hard drive was gone, with all the data on it! Another time my hard disk JUST failed! When these things happen, there is almost no way to recover the data from the failed hardware.

Of course there are numerous specialized companies doing data recovery (try Googling), but they are not for us mortals! Data recovery from failed hardware will cost you an arm and a leg. Once, one of my rich friends from a rich company faced this situation. First we tried local services, and we were surprised by the quotations. Then my friend started carrying the failed hardware with him on his numerous business trips to Asia, hoping (quite desperately, for like most of us he had not backed up his data!) to find a reasonably priced recovery service. To cut a long story short, he had to live with the loss and recover slowly. Having witnessed the amount of time and effort wasted during this event, I now preach the virtues of a good backup system to everyone I know. Until recently there was no backup software I could recommend outright to a person who does not want to make major sacrifices to the advancement of the art of hacking. But times have changed, so it's a good time to summarize the options.
Windows
First of all, every modern stable Windows operating system (the latter condition eliminates Vista -- I've never used it and never plan to!) has built-in backup software, but many people do not know about it. It is called NTBackup and can be opened by typing ntbackup in the Start > Run menu. I've used it successfully over the years, and it has saved me through several catastrophic failures. But the tool is basic at best, and it can be quite painful to work with.
Recently, however, a series of backup tools has been introduced that are much easier to work with. The good news is that the best one I've seen to date is free!
Cobian Backup
The best Windows backup tool I've seen, among both commercial and open source offerings -- period (as of 2009). Cobian Backup 8 is free and open source. Version 9 (the latest at the time of writing) is free, but not open source. Of course this is a risky situation: the author may be considering going commercial with a version 10! (Quite justifiably, and I'd recommend that people buy it if they can afford it.) But even if that happens, all is not lost for the free software community -- someone can start developing a fork from the open source version 8.
I use Cobian Backup on my home and office computers and it is a pleasure to work with. Using the menus, it is possible to set up backup tasks that define:
- What files (or directories) to back up.
- When the backups will take place (say, once a week).
- Where to store the backups.
- Whether to encrypt your backups (so that no one will be able to read them without a password, even if they get hold of your backup files).
- What kind of backup to make each time.
It is quite easy to use, with an intuitive user interface and a good help system. Here's what I do:
- Whenever possible I back up files to a different computer. This is easily done by sharing a folder on another computer as a network share[1]. Cobian Backup will then simply save the backup files to this network drive (and hence to the remote computer). You can easily make a deal with a colleague in your office to do this, each helping the other with backups. But be sure to encrypt your data if the remote computer is not yours!
- There are three basic types of backup: full, incremental and differential. Full means everything is backed up. Incremental backs up any files that were changed after the last backup. Differential backs up everything that was changed since the last full backup. I usually don't bother with incremental. I go with differential backups, with a full backup once in every 10 backup operations. I set the backup to happen twice a week (Tuesday and Friday), and I keep three full backup copies. (Cobian will automatically delete older backups, so you will not run out of space on the backup disk.)
- In this scenario, if my computer fails (say on a Wednesday), here's how I go about recovering my data. If Tuesday's backup was a full backup, I am in luck and only have to restore that to my new/repaired computer. If it was not a full backup (I make a full backup once in every 10 differentials), I first get the latest full backup from the remote computer and restore it to the new computer. Then I get the differential backup made on Tuesday and restore it over the previously restored data. My data is then back to its state when the Tuesday backup was made. Of course I've lost any changes I made during Wednesday, but that's only a day's worth of work and I can live with that.
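The restore order described above (latest full backup first, then the newest differential made after it) can be sketched as a small shell helper. This is only an illustration: the function name restore_plan, the "DATE TYPE" input format and the dates are assumptions made for the example, and nothing here is part of Cobian Backup.

```shell
# Hypothetical helper: reads "DATE TYPE" lines (oldest first) on stdin,
# where TYPE is "full" or "diff", and prints what to restore, in order.
restore_plan() {
    awk '
        # A new full backup supersedes every earlier differential.
        $2 == "full" { full = $1; diff = "" }
        # Only the newest differential after the latest full backup matters.
        $2 == "diff" && full != "" { diff = $1 }
        END {
            if (full != "") print "restore full " full
            if (diff != "") print "then differential " diff
        }
    '
}

# Usage: the failure happens the day after the last differential below.
printf '%s\n' \
    '2009-06-02 full' \
    '2009-06-05 diff' \
    '2009-06-09 diff' | restore_plan
```

Note that the differential from 2009-06-05 is never needed: a differential already contains everything changed since the last full backup, which is why only two restore steps are required.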
Linux/UNIX
In UNIX -- the OS of the brave -- things are much better. UNIX has always had a variety of utilities that allow well-oiled, efficient backup systems to be set up. Of course there is ready-to-use commercial and free software that lets you click your way forward, but it is not even half as fun as using the standard built-in utilities of UNIX. What follows is an account of the system that I use. If you Google, you'll find enough variations to keep your interest up for years to come!
I use rsync
There are a number of utilities that can be used to make backups in UNIX: scp, tar, dd, cpio, the list goes on. But my favorite is rsync. It is efficient, fast and reliable, even over unpredictable networks.
I do not use standard rsync
All UNIX systems either come with rsync installed or provide a way to install it from binaries. But for the purpose of backups, I build it from source. The reason is that I make use of a non-standard option called --time-limit, which allows an rsync session to be stopped after a specified time.
The procedure is:
- Go to the rsync web site [2].
- Download the source files [3], both the standard distribution and the patch files.
- Apply the time-limit patch.
- Compile and install.
    cd /tmp
    wget http://www.samba.org/ftp/rsync/rsync-3.0.6.tar.gz
    wget http://www.samba.org/ftp/rsync/rsync-patches-3.0.6.tar.gz
    tar -xzf rsync-3.0.6.tar.gz
    tar -xzf rsync-patches-3.0.6.tar.gz
    cd rsync-3.0.6
    patch -p1 <patches/time-limit.diff
    ./configure
    make
I have done this on Linux distributions and in Cygwin without issues. I do this on all machines that take part in my rsync activities.
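After building, it can be worth confirming that the binary really carries the patch before relying on it in a backup script. The sketch below assumes that the patched build lists --time-limit in its --help output; the function name has_time_limit and the ./rsync path are made up for the example.

```shell
# Hypothetical helper: succeeds if the given rsync binary mentions
# --time-limit in its help text (assumed to be true for a patched build).
has_time_limit() {
    "$1" --help 2>&1 | grep -q -- '--time-limit'
}

# Usage, run from the build directory (the path is an assumption):
if has_time_limit ./rsync; then
    echo "patched rsync: --time-limit is available"
else
    echo "this rsync does not list --time-limit"
fi
```

The same check can be pointed at the rsync found on the remote machine, since both ends are expected to run the custom build.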
Scenario 1: Daily backup of a website
First I install my custom rsync on both the local and remote machines. Then I make it possible to log in from the local machine to the remote machine without a password (see "SSH login without passwords" for how).
I create the following script in the directory /home/tommy/backups as sitebackup.bash:
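    #!/bin/bash
    cd /home/tommy/backups
    # Remote source (reached over ssh) and local destination.
    FROM="tom@tommy.net:/home/tom/"
    BKTO="/backup/site/"
    # --time-limit is given in minutes and needs the patched rsync.
    OPTS="--append-verify --time-limit=1400 -v -a --rsh=ssh --stats"
    export PATH=$PATH:/bin:/usr/bin:/usr/local/bin
    # One log per day of the month, e.g. sitebackup.bash.01.log.
    log=$0.`date +%d`.log
    rsync $OPTS $FROM $BKTO >& $log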
Then add a crontab entry:
    03 1 * * * /home/tommy/backups/sitebackup.bash > /home/tommy/backups/sitebackup.bash.log 2>&1
What this will do is:
- Every day at 01:03, an rsync session starts and backs up the remote directory /home/tom/ to the directory /backup/site/ on the local machine.
- The rsync session will not last more than 1400 minutes (just short of a day), which eliminates the possibility of two rsync sessions running at once. If the first day's session could not complete the job, the next day's session picks up from where it left off.
- If there is not much change on the remote machine, the rsync session will be short. If there is no change at all, it will be very short (just ring and check for changes!).
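To make "just short of a day" concrete, the arithmetic behind the 1400-minute limit can be checked in the shell. The function name slack_minutes is invented for this sketch; the two numbers come from the script's OPTS and from the once-a-day crontab entry.

```shell
# Minutes left between a run that hits its limit and the next cron start.
# $1: rsync time limit in minutes, $2: minutes between cron starts.
slack_minutes() {
    echo $(( $2 - $1 ))
}

# Daily schedule from the crontab entry, limit from the script's OPTS.
slack_minutes 1400 $((24 * 60))   # prints 40
```

So a session that runs into the limit still stops 40 minutes before the next one starts. If the cron schedule or the limit is changed, the same subtraction shows whether the two can still overlap.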
What I have to do:
- Occasionally:
  - Check the log files (sitebackup.bash.log, sitebackup.bash.01.log, sitebackup.bash.02.log, ...) to see if things are running smoothly.
  - Check the backup files at /backup/site to see if they are OK.
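The occasional log check can be partly automated. The sketch below relies on the fact that rsync reports failures on lines beginning with "rsync error"; the function name failed_logs is an assumption, and the directory is the one used by the script above. It is a rough first pass, not a substitute for actually looking at the backup files.

```shell
# Hypothetical helper: lists the per-day logs in a directory that contain
# an "rsync error" line. Prints nothing when no such log is found.
failed_logs() {
    grep -l 'rsync error' "$1"/sitebackup.bash.*.log 2>/dev/null
}

# Usage, with the directory used by the backup script:
failed_logs /home/tommy/backups ||
    echo "no rsync errors found (or no per-day logs present)"
```

The glob matches only the per-day logs (sitebackup.bash.01.log and so on), not the sitebackup.bash.log file written by cron, which would need to be checked separately.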
 




