Rsync provides a feature-rich and flexible set of options to copy and synchronize data.
- [Instructor] Whether you're using Linux as a desktop machine or on a server, it's important to back up your data. Data that needs to be backed up is generally data that's generated by the user, or by people using the system. Normally we don't really need to back up an operating system if we back up user data and configurations. Operating systems can be reinstalled, but holiday photos and customer data can't be as easily recreated. Before we start talking about backups, though, it's important to understand what a backup is. If you have a document file on your desktop and you make a copy of it and move it to your documents folder, that can seem like a kind of backup.
There are two copies in case something happens to one of them. But if your hard drive crashes, chances are you won't be able to access either copy. So that doesn't really help. A much more reliable approach is to copy files to another disk or system, ideally one that's usually detached from the system and stored somewhere safe. We can do this with Rsync, a very flexible file copying program. At its most basic, Rsync copies files from one place to another. We can do this with the cp command, but Rsync offers many features that make it a great candidate for more sophisticated backups.
Rsync can, for example, ensure that the contents of one path are consistent with those of another path, deleting files in the second location if they've been deleted in the first. And it can skip files that haven't been modified, saving a lot of time in folders or paths that have many files that aren't changed frequently like photo and music libraries. Let's take a look at a basic Rsync setup, copying files from a home folder to an external disk. I have a few folders here full of files. They're just simple files, nothing special. They're from my Ubuntu Desktop course here on LinkedIn Learning, if you want to go download them, or you can use your own files.
You can see there's a reasonable number of them, and some of them are fairly large. I also have a USB disk plugged in and mounted in my media folder. Let's use Rsync to copy these files. I'll write rsync -avz, copying everything in this folder over to my USB drive. The a option here is for archive mode and it's a shortcut for a whole bunch of other options, which recurse into directories, preserve ownership and modification times, and more.
Be sure to check out the man page for Rsync to learn about the many options it offers. V is for verbose. So we'll see each file as it transfers, and we'll get a summary at the end. If you leave this off, Rsync will just do its thing quietly. The Z option compresses data during the transfer, which can help speed things up, especially over slower network links. And here we go. Now I've made a copy of my files. Down here I can see that we've sent about 220 megabytes at about 15 million bytes per second.
Once we have an initial copy made, that's where Rsync's features become more helpful. If I were to edit one of these files or maybe make a copy of one, for example copying one of the photos to a new file, the next time that I run that backup, Rsync will figure out that it doesn't have to copy everything all over again. It'll just copy the changed file, and that saves a lot of time. If I delete a file from my original location and then run the sync again, nothing really happens.
And I can see that the deleted file is still on my backup disk. Now we're getting somewhere with the backup. If I deleted that file by accident, I could recover it from my backup, even after the copy process ran. This is useful as a safety net, but it can also take up a lot of space, and what if you really did want to delete that file? You can pass another option to Rsync, --delete, to tell it to delete anything from the destination that's been deleted from the source.
Here I can see that Rsync deleted that file, and if I take a look at my backup, now that file's gone. Rsync can also work with network paths and with remote servers via SSH, allowing you to back up to a central server or to an offsite host. One thing to keep in mind about backups across the internet is that at some point, unless you've seeded an initial backup manually, you could be transferring a lot of data upstream, and many home ISPs have limits or will start throttling the speed of large upstream transfers. Though, again, Rsync's ability to detect whether files have been modified helps to keep the data transfer to just a delta, rather than the whole volume of the folder, after an initial backup.
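A remote backup over SSH might look something like the sketch below. The account, host name, remote path, and the backup_to_remote helper are all hypothetical, and the function is only defined here, not run, because it needs a reachable server with SSH access.

```shell
# Hypothetical remote account, host, and path; replace with your own.
REMOTE="backup@backup.example.com"
REMOTE_DIR="/srv/backups/laptop/"

backup_to_remote() {
    # $1 is the local source directory. Any extra arguments, such as
    # --dry-run, are passed through to rsync unchanged.
    src="$1"
    shift
    # -e ssh makes the remote shell explicit. The trailing slash on the
    # source copies its contents rather than the directory itself.
    rsync -avz -e ssh "$@" "${src%/}/" "$REMOTE:$REMOTE_DIR"
}

# Example calls (not run here): preview the transfer, then send it.
# backup_to_remote "$HOME/Documents" --dry-run
# backup_to_remote "$HOME/Documents"
```

The -z option tends to matter more here than for a local disk, because the data is compressed before it crosses the slower upstream link.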
These Rsync commands can be tedious to write out over and over. So it's a good idea to write a little script to contain the command. And you could add some logging capability to keep a record of what was transferred and when. You might also consider adding your script to your crontab to make backups happen automatically. Backing up data isn't a one-size-fits-all kind of thing. You'll need to determine what data you need to back up, whether you want to keep things strictly in sync by deleting files that don't exist anymore, or whether you want to keep deleted files in the backup in case of accidental deletions on your workstation.
You might choose to run individual syncs for specific folders, or one big one for a top-level folder that contains everything. And you'll need to decide whether to back up to local disks, local servers, or remote servers. If you're designing a backup strategy for an organization, you might be better served with backup management software, but for one-off backups or backups of home machines, Rsync is a very useful tool.