Learn about working with rsync to compare the files in the original and destination locations, and copy only the files that are newer.
- Let's move on to the seventh video of this section: Backup snapshots with rsync. In the previous video, we learned how to create file systems with compression. In this video, we'll explore how to back up snapshots with rsync. We'll copy a source directory to a destination, back up data to a remote server, and restore it. Backing up data is something that most sysadmins need to do regularly. In addition to backing up local files, we may need to back up data from a web server or from remote locations. rsync is a command that synchronizes files and directories from one location to another while minimizing data transfer using difference calculations and compression.
The advantage of rsync over the cp command is that rsync uses strong difference algorithms. Additionally, it supports data transfer across remote machines. While making copies, it compares the files in the original and destination locations and copies only the files that are newer. It also supports compression, encryption, and a lot more. Let's see how to copy files and create backups with rsync. To copy a source directory to a destination, use this syntax. In this command, -a stands for archiving.
-v (verbose) prints the details or progress on standard output. The above command recursively copies all the files from the source path to the destination path. We can specify paths as remote or local paths. Now, in order to back up data to a remote server or host, use this command. To keep a mirror at the destination, run the same rsync command on a schedule at regular intervals. It will copy only changed files to the destination. Next, to restore the data from the remote host to the local host, use this.
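A minimal sketch of the archive-mode copy described above, run against a throwaway directory so nothing real is touched. The scratch paths and file names are made up for illustration; only the -a and -v options come from the video.

```shell
# Skip quietly on machines without rsync.
command -v rsync >/dev/null 2>&1 || { echo "rsync not found; skipping" >&2; exit 0; }

# Scratch area (hypothetical paths, removed when the script exits).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/source/docs"
echo "hello" > "$work/source/docs/note.txt"

# -a: archive mode (recursive; keeps permissions, timestamps, symlinks)
# -v: verbose, lists each file as it is transferred
rsync -av "$work/source" "$work/destination"

# A second run finds nothing changed, so no file data is sent again.
rsync -av "$work/source" "$work/destination"
```

Because the source has no trailing slash, the copy lands in destination/source/docs/note.txt.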
You should note that the rsync command uses SSH to connect to the remote machine. Hence, you should provide the remote machine's address in the format user@host, where user is the username and host is the IP address or hostname of the remote machine. PATH is the path on the remote machine from which the data needs to be copied. Make sure that the OpenSSH server is installed and running on the remote machine. Additionally, to avoid the password prompt for the remote machine, see the video Passwordless Auto Login with SSH from section seven.
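The remote backup and restore commands differ only in which side carries the user@host:PATH part. Here is a small sketch that prints both forms instead of running them, since it has no real server to talk to. The variable names, the helper functions, and the local paths are all placeholders for illustration, not values from the video.

```shell
# Placeholders, not values from the video: replace before use.
REMOTE_USER="user"
REMOTE_HOST="host"              # IP address or hostname
REMOTE_PATH="/path/on/remote"

# Each helper only prints the command; drop the leading "echo" to run it.
backup_cmd() {
    echo rsync -av "$1" "$REMOTE_USER@$REMOTE_HOST:$REMOTE_PATH"
}
restore_cmd() {
    echo rsync -av "$REMOTE_USER@$REMOTE_HOST:$REMOTE_PATH" "$1"
}

backup_cmd /home/test        # local -> remote
restore_cmd /home/restored   # remote -> local
```

The first call prints rsync -av /home/test user@host:/path/on/remote, and the second prints the same two paths in the opposite order with /home/restored as the local side.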
Let's now look into compressing the data while transferring through the network, which can significantly improve the speed of transfer. Here we can use the rsync option -z to compress data while it is transferred through a network. For example, synchronize one directory to another with this command. This command copies the source /home/test to an existing folder called backups.
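To try the -z option without a second machine, a local run like the following sketch is enough to confirm the syntax. The scratch directory and the big.txt file are invented for the example; on a local copy the compression brings little benefit, so treat this as a syntax check rather than a speed test.

```shell
# Skip quietly on machines without rsync.
command -v rsync >/dev/null 2>&1 || { echo "rsync not found; skipping" >&2; exit 0; }

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/test" "$work/backups"

# Highly repetitive text compresses well.
yes "the same line again and again" | head -n 5000 > "$work/test/big.txt"

# -z compresses file data in transit; it pays off on slow network links
# and makes little difference for a local copy like this one.
rsync -avz "$work/test" "$work/backups"

ls -l "$work/backups/test"
```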
Now copy a full directory inside another directory, as we do here. This command copies the source, /home/test, to a directory named backups, creating that directory. For the path format, if we use / at the end of the source, rsync will copy the contents of the directory specified in the source path to the destination. If / is not present at the end of the source, rsync will copy the directory itself to the destination. For example, the first command here copies the contents of the test directory.
While the second command copies the test directory itself to the destination. A trailing / on the destination path matters less when the source is a directory: with or without it, rsync copies the source into the destination directory, creating that directory if needed. For example, see this one here.
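The trailing-slash rule for the source path is easy to check locally. This sketch uses a scratch directory with made-up names (with_slash, without_slash, file.txt) and lists where the file ends up in each case.

```shell
# Skip quietly on machines without rsync.
command -v rsync >/dev/null 2>&1 || { echo "rsync not found; skipping" >&2; exit 0; }

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/test"
echo "data" > "$work/test/file.txt"

# Trailing slash on the source: copy the CONTENTS of test/.
rsync -a "$work/test/" "$work/with_slash"

# No trailing slash: copy the test directory ITSELF.
rsync -a "$work/test" "$work/without_slash"

# Show the resulting layout relative to the scratch directory.
find "$work/with_slash" "$work/without_slash" -type f | sed "s|^$work/||"
```

The listing shows with_slash/file.txt for the first command and without_slash/test/file.txt for the second.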
Let's look at how it works. rsync works with source and destination paths, which can be either local or remote. Note that only one of the two paths can be remote; rsync refuses to run when both the source and the destination are remote. Remote connections are usually made over SSH, so that rsync can calculate which files to copy and which to skip. Local and remote paths look like this. So this is the local path, and this is the remote path.
This path specifies an absolute path on the machine on which the rsync command is executed, while this one specifies the path /home/backups/data on the machine with IP address 192.168.0.6, logged in as user slynux. It's time to dive into some additional information. The rsync command has several additional functionalities that can be specified using its command-line options. Let's go through them. First, we talk about excluding files while archiving with rsync.
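To make the two path forms concrete, here is a small sketch that labels a path as local or remote. The classify_path helper is hypothetical and only approximates how the forms differ: it treats a colon that appears before the first slash as the mark of a remote user@host:PATH argument.

```shell
# Rough approximation: a colon before the first slash marks a remote path.
classify_path() {
    # ${1%%/*} keeps only the text before the first slash.
    case "${1%%/*}" in
        *:*) echo remote ;;
        *)   echo local ;;
    esac
}

classify_path /home/test                               # prints "local"
classify_path slynux@192.168.0.6:/home/backups/data    # prints "remote"
```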
Some files need not be updated while archiving to a remote location. It is possible to tell rsync to exclude certain files from the current operation. Files can be excluded in two ways. We can specify a wildcard pattern of files to be excluded with the --exclude option. For example, this command excludes .txt files from the backup. Or we can specify the files to be excluded by providing a list file.
For that, use --exclude-from FILEPATH. Since we have not created the file, we got this error, so try it with a valid filename that you created beforehand. Next, we move on to deleting non-existent files while updating an rsync backup. By default, rsync does not remove files from the destination if they no longer exist at the source. In order to remove the files from the destination that do not exist at the source, use the rsync --delete option, as we do here.
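The three options just covered can be tried side by side on a scratch directory. In this sketch, the file names, the exclude.list file, and the destination directory names are all invented for illustration.

```shell
# Skip quietly on machines without rsync.
command -v rsync >/dev/null 2>&1 || { echo "rsync not found; skipping" >&2; exit 0; }

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/src"
echo a > "$work/src/keep.c"
echo b > "$work/src/notes.txt"
echo c > "$work/src/debug.log"

# 1. Exclude by wildcard pattern: leaves keep.c and debug.log.
rsync -a --exclude='*.txt' "$work/src/" "$work/by_pattern"

# 2. Exclude by list file, one pattern per line: leaves only keep.c.
printf '%s\n' '*.txt' '*.log' > "$work/exclude.list"
rsync -a --exclude-from="$work/exclude.list" "$work/src/" "$work/by_list"

# 3. --delete: files removed from the source also vanish from the mirror.
rsync -a "$work/src/" "$work/mirror"
rm "$work/src/debug.log"
rsync -a --delete "$work/src/" "$work/mirror"

ls "$work/by_pattern" "$work/by_list" "$work/mirror"
```

Without --delete, the last run would leave debug.log sitting in the mirror even though it is gone from the source.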
Now let's discuss scheduling backups at intervals. You can create a cron job to schedule backups at regular intervals. A sample is as you see here. Let's add this line. The above crontab entry schedules rsync to be executed every 10 hours. This is the hour position of the crontab syntax, where */10 specifies to execute the backup every 10 hours (strictly, at hours 0, 10, and 20 of each day). If */10 is written in the minutes position, it will execute every 10 minutes.
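As a sketch of what such an entry can look like: the line below is an assumed example, not necessarily the exact one shown on screen. The minute value 0, the /home/code source, and the user@IP_ADDRESS destination are placeholders. The loop underneath simply works out which hours a step of 10 selects in the hour field.

```shell
# Assumed entry, not the exact line from the video: minute 0 of every 10th
# hour; /home/code, user, and IP_ADDRESS are placeholders.
entry='0 */10 * * * rsync -avz /home/code user@IP_ADDRESS:/home/backups'
echo "$entry"

# Hours matched by */10 in the hour field (range 0-23, step 10).
hours=""
h=0
while [ "$h" -lt 24 ]; do
    hours="$hours $h"
    h=$((h + 10))
done
echo "runs at hours:$hours"    # prints "runs at hours: 0 10 20"
```

To install an entry like this, open the current user's table with crontab -e and paste the line in.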
Have a look at the Scheduling with Cron video in the upcoming section, Administration Calls, to understand how to configure crontab. Great! In this video, we've learned how to back up snapshots with rsync, and at the end we dived into some additional information. In the next video, we'll look into version control based backup with the git command.
Note: This course was created by Packt Publishing. We are pleased to host this training in our library.
- Printing in the terminal
- Performing math in the Linux shell
- Getting and setting dates
- Working with functions and arguments
- Reading output
- Making comparisons
- Concatenating text
- Finding, editing, generating, and deleting files
- Running parallel processes
- Using regular expressions
- Downloading webpages
- Parsing data from a website
- Finding broken links
- Backing up and archiving
- Transferring files and data through the network
- Monitoring your Linux system
- Gathering data for system administration