There are lots of good ways to back up your computer. I’ve used several. Lately, I am enjoying the ease, convenience, and quality of rsync. In typical geek fashion, I was perusing the rsync man page the other day and found some nice options that I hadn’t known about, so I started to experiment.
I wanted to back up my laptop to a portable, external hard drive, starting with a full backup, then going to incremental backups after that. I also wanted to make sure the backup was kept in sync with my local hard drive, but without accidentally permanently deleting anything from the backup that I might want or need later. Here is what I came up with, posted here mainly for the sake of my memory, but you might find it interesting as well.
First, to back up the entire hard drive, I need to do this as root. Since I am using Ubuntu, and because I like sudo, I just add that to the beginning of the command and enter my password at the prompt. This reminds me to mention that it is important that your backup be kept in a secure location, just as with your computer. Anyone with physical access to the backup drive will eventually have access to all your data.
Here is the command I used, followed by an explanation of the options I am using.
sudo rsync --force --ignore-errors --delete --delete-excluded --exclude-from=/media/disk/matthew-exclude.txt --backup --backup-dir=`date +%Y-%m-%d` -av / /media/disk/backup/matthew-laptop
Options used:
--force: forces the deletion of directories on the backup drive, even if they are not empty
--ignore-errors: tells --delete to go ahead and delete files even when there are I/O errors
--delete: deletes files from the destination directories that no longer exist on the source
--delete-excluded: also deletes excluded files from the destination directories
--exclude-from=/media/disk/matthew-exclude.txt: tells rsync not to back up files or directories listed in this file, which I keep on the destination drive (my sample is below)
--backup: creates backups of files before deleting them
--backup-dir=`date +%Y-%m-%d`: creates a backup directory on the destination drive for those backups with today’s date as the directory name
-av: archive mode (-a), which combines lots of great stuff together like preserving file permissions and ownership, plus verbose output (-v), which is nice for knowing what is going on
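Because --delete and --delete-excluded remove things from the backup drive, it can be reassuring to preview a run first. The following is only a sketch: backup_command is a hypothetical helper name, the paths are the ones from the command above, and --dry-run is rsync's standard "show what would happen" option. The helper just prints the command line so it can be checked before it is pasted into a terminal.

```shell
#!/bin/sh
# Paths below are the ones from the post; adjust them for your own drive.
SRC="/"
DEST="/media/disk/backup/matthew-laptop"
EXCLUDES="/media/disk/matthew-exclude.txt"

# backup_command is a hypothetical helper name. It only prints the command
# line so it can be reviewed before anything is copied or deleted.
# Pass --dry-run as the first argument to get the preview variant.
backup_command() {
  today=$(date +%Y-%m-%d)   # same date format as the backtick expression
  printf 'sudo rsync %s--force --ignore-errors --delete --delete-excluded --exclude-from=%s --backup --backup-dir=%s -av %s %s\n' \
    "${1:+$1 }" "$EXCLUDES" "$today" "$SRC" "$DEST"
}

backup_command --dry-run   # preview: rsync reports what it would do
backup_command             # the real run
```

With --dry-run in place, rsync lists the files it would transfer or delete without touching the destination, which is a cheap way to catch a bad exclude list.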
This is my exclude file.
home/lost+found/
home/.Trash-root/
home/matt/.thumbnails/
home/matt/.Trash/
lost+found/
media
mnt
proc
root/.thumbnails/
root/.Trash/
sys
tmp
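One thing to be aware of: rsync matches a pattern without a leading slash against the end of any path, so an entry like tmp also excludes a directory named tmp deep inside a home directory. A pattern that starts with / is anchored to the top of the transfer, which here is the root of the laptop. As a sketch (write_excludes is a hypothetical helper name, and the temporary output path is only for illustration), an anchored variant of the same list could be generated like this:

```shell
#!/bin/sh
# write_excludes is a hypothetical helper; it writes an anchored variant of
# the exclude list to the file named by its first argument.
# A leading / anchors the pattern at the top of the transfer (the source /),
# and a trailing / makes it match directories only.
write_excludes() {
  cat > "$1" <<'EOF'
/home/lost+found/
/home/.Trash-root/
/home/matt/.thumbnails/
/home/matt/.Trash/
/lost+found/
/media/
/mnt/
/proc/
/root/.thumbnails/
/root/.Trash/
/sys/
/tmp/
EOF
}

# Write to a scratch location; copy it to the backup drive once reviewed.
write_excludes "${TMPDIR:-/tmp}/matthew-exclude.txt"
```

These patterns skip the directories themselves, so /proc, /sys and friends would need to be recreated by hand after a full restore.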
That’s it. The first time I ran it, it took a long time. Of course, I have some 75 gigabytes of data, so that isn’t surprising. After that, only things which have changed need to be transferred or deleted, so the process is quite fast.
use rsnapshot for regular backups, this encapsulates all these crazy rsync params 🙂
jim: the way I am using rsync in the example, when it is discovered that I have deleted something myself from the source drive, rsync then deletes it from the destination drive, but only after making a copy of the file/directory in a special, new directory titled using the date.
I believe you are saying the same thing.
I don’t blame you for wanting to confirm that. 🙂
@ Jim: I’ll second Ilja’s recommendation for rsnapshot. It is simple and automated, and easy to set up, then forget about. Well suited for a single system IMHO.
Check out BackupPC. It’s based on rsync and hard links, and it suits your requirements better than using pure rsync as you do now. Actually, it would be nice if a more user-friendly version of BackupPC could become part of Ubuntu. It could become a Linux equivalent of Time Machine.
Please be aware that Ubuntu’s rsync has a bug with large backups: bugs.launchpad.net/ubuntu…
hey matthew, thx for this. doesn’t look like a true incremental backup, though — that is, each time you perform a backup, you lose any data about the history of a file (old files are rewritten, rather than old & new versions being diffed & all versioning info saved). i think rdiff-backup works better with this, doesn’t it?
matt
matt: yeah, you are right. This isn’t incremental in that sense at all.
Maybe I need to find a better word to express, "It only saves the stuff that has changed since the last backup," even though it doesn’t save both the old and new versions, etc. If that was what I needed, then rdiff-backup would be better, or subversion, or git, etc.
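For anyone who wants dated snapshots without leaving rsync, one option is its --link-dest option, which hard-links unchanged files against a previous snapshot. This is the same general idea as the hard-link tools mentioned in other comments, not what the post's command does. The following is a minimal sketch under stated assumptions: snapshot_backup and the "latest" symlink are made-up names, and DEST_ROOT must be an absolute path.

```shell
#!/bin/sh
# snapshot_backup SRC DEST_ROOT
# Hypothetical helper: writes a timestamped snapshot under DEST_ROOT and
# hard-links unchanged files against the previous one (DEST_ROOT/latest).
# DEST_ROOT must be absolute, because rsync resolves a relative --link-dest
# path against the destination directory.
snapshot_backup() {
  src=$1
  dest_root=$2
  stamp=$(date +%Y-%m-%d-%H%M%S)
  mkdir -p "$dest_root"
  if [ -d "$dest_root/latest" ]; then
    # Unchanged files become hard links into the previous snapshot.
    rsync -a --delete --link-dest="$dest_root/latest" "$src/" "$dest_root/$stamp/"
  else
    # First run: plain full copy.
    rsync -a "$src/" "$dest_root/$stamp/"
  fi || return 1
  # Repoint "latest" at the snapshot just written.
  rm -f "$dest_root/latest"
  ln -s "$stamp" "$dest_root/latest"
}

# Example (not run here): snapshot_backup /home/matt /media/disk/snapshots
```

Each snapshot directory then looks like a full copy, but files that did not change share disk space with the earlier snapshots. It still stores whole files rather than diffs, so it is not a replacement for rdiff-backup if you want delta history.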
Matthew, three suggestions:
(1) look at -x to rsync, which excludes any mounted filesystems on top of the backup tree. It’s really nasty to accidentally have GNOME VFS, SSHFS, or something else mounted while backing up and before you know it you’ve backed up the universe.
(2) I personally prefer rsync to rdiff-backup and rsnapshot and friends because it’s really robust and reliable in all my experience, while I’ve had the other utilities flake on me before to the point of needing a rebuild. And that’s totally unacceptable in a backup solution… You might also want to think about combining rsync with LVM snapshotting to implement incremental or differential backups.
(3) This kind of procedure can also be easily used to clone a system to another. For the most part Linux doesn’t care what hardware it’s running on… I’ve not installed Ubuntu from a CD for 2 years now. I’ve got my cloning images of Ubuntu just the way I like it, and every time I need to load up another system or VM I just unpack one of those images.
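To get a feel for what -x (long form --one-file-system) would leave out, it helps to look at what is currently mounted below /. This is a rough sketch that assumes a Linux system with /proc/mounts; list_mounts is a hypothetical helper name.

```shell
#!/bin/sh
# list_mounts [FILE]
# Hypothetical helper: print every mount point other than / from a mounts
# table (defaults to /proc/mounts on Linux). Anything listed here that lives
# on a different filesystem than / is not descended into when rsync runs
# with -x. Mount points containing spaces appear with \040 escapes.
list_mounts() {
  awk '$2 != "/" { print $2 }' "${1:-/proc/mounts}"
}

if [ -r /proc/mounts ]; then
  list_mounts
fi
```

Note that -x cuts both ways: if /home sits on its own partition, it would be skipped too and would need to be given to rsync as a separate source. rsync still creates an empty directory at each mount point it skips.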
jdong: thanks! I will read up on -x and will very likely add it in there.
I’ve also had other backup options leave me in the lurch, which is why I was using tar archives until this week, and why I chose to try rsync instead of the other options. I’m glad I’m not the only one thinking this way. That encourages me a bit.
I especially appreciate your comment #3 as I found myself wondering earlier today whether this would work.
Your comment regarding others potentially having access to your backups should not be taken lightly. While it is great to maintain local backups on a portable hard disk, or even on a media server, that leaves the very large hole of fire and theft as possible means of losing everything. You really want an off-site storage setup. While there are a lot of companies offering that online these days, I have not seen all that many that are OS-agnostic enough to have great support for Linux. So you’ll probably end up taking media to a trusted friend, family member, or possibly work.
On the assumption that you do, you very probably should use LVM and encrypted partitions on the media.
Of course if you are using LVM on your laptop or the computer that you are backing up, you may want to use lvm snapshots to do the backup in the first place. This doesn’t give you the backup history that the provided rsync, or rdiff solutions provide, but may provide you with faster initial ‘full backup’ starting points for the rsync solutions.
As opposed to backing up the entire hard disk, though, I would recommend snapshots of the directory trees that hold variable and configuration information you will need (/etc, and many of the /var directories such as /var/www, and /var/mysql if you are running a web server). Also use the appropriate apt tools to collect a list of the installed applications for your system, possibly even using aptoncd or the like if you suspect that you may not have access to the internet when you need to recover.
That should leave the user directory tree as the primary tree that needs to be backed up.
For some media servers you may also want to back up folders used to capture media to, however that will depend upon your own archival requirements. If you had transferred all the Disney tapes to the media server for your 4 year old, who’s now 8, and far more interested in Anime, you might not find the Disney content to be all that necessary to maintain. But you might want to continue maintaining that copy of Casablanca you recorded off of TCM. Then again, they will probably show it again right?
Another point based on some things Rusty said…
If you are a "good" Ubuntu/Debian sysadmin who effectively leverages the package manager rather than hacking around it, you get rewarded when it comes to backups…
On my system, of the 16GB / partition I back up a 4KB text file (package list), a 500MB local APT repo, and /etc and a few data files in /var. Everything else can be reconstituted from a quick cp and a single apt command. This makes backing up the "entire system" a lot easier
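The comment does not say which commands are used for the package list. One common approach on Debian and Ubuntu is dpkg --get-selections to save the list and dpkg --set-selections followed by apt-get dselect-upgrade to replay it. The sketch below assumes that approach; the helper names and the output path are made up for illustration, and the restore step may need extra preparation on some releases.

```shell
#!/bin/sh
# save_package_list FILE: record the package selections known to dpkg.
save_package_list() {
  dpkg --get-selections > "$1"
}

# restore_package_list FILE: mark the saved packages for installation,
# then let apt-get install whatever is missing.
restore_package_list() {
  sudo dpkg --set-selections < "$1" &&
    sudo apt-get dselect-upgrade
}

# Only attempt the save on systems that actually have dpkg.
if command -v dpkg >/dev/null 2>&1; then
  save_package_list "${TMPDIR:-/tmp}/package-list.txt"
fi
```

The saved file is plain text, one package per line, so it fits comfortably next to /etc in a small backup.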
Ok, for a single system, it is overkill, that’s true.
My wife and I have two desktops and two laptops at home plus one media server running most of the time (that now doubles as a backup server), so it’s worth it for us.
Might want to tack on at least
~/.local/share/Trash
and
$XDG_CACHE_HOME (defaults to ~/.cache).
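In the exclude file format used in the post (paths written relative to /), those two additions might look like the lines this sketch appends. It assumes the default cache location ~/.cache, and add_user_excludes is a hypothetical helper name.

```shell
#!/bin/sh
# add_user_excludes FILE USER
# Hypothetical helper: append the trash and cache directories of USER to an
# rsync exclude file, using the same relative-to-/ style as the post.
# Assumes XDG_CACHE_HOME is unset, so the cache lives in ~/.cache.
add_user_excludes() {
  printf '%s\n' \
    "home/$2/.local/share/Trash/" \
    "home/$2/.cache/" >> "$1"
}

# Example (not run here):
#   add_user_excludes /media/disk/matthew-exclude.txt matt
```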
jdong: what are the few data files in /var and what is the single apt command? thx,
matt
I’m using rdiff-backup for backing up my home directory. It works great for me.
http://www.nongnu.org/rdiff-back...
I use mrb and I like it. It is quite easy to set up, it does incremental backups (based on hard links), and it is a fairly small make (well) script.
Druvaa inSync is a pure Python implementation of rsync. What I like best is its simplicity. It’s currently free for 25 licenses or fewer.
Extra features include –
1. Bandwidth scheduler, WAN optimizations
2. SSL secure, snapshot supports and all
3. Good compression
The company makes tall claims on its blog – blog.druvaa.com/2008/03/2…
They seem to be building single instancing around backup, which they say can make backups up to 4x faster and cut storage costs by the same factor.
It’s downloadable for Windows and available for Linux on demand. I have requested the Mac release and am waiting for it; someone from support promised it for the end of April.