Rsync - A Quick Guide

Last updated on

command line

Rsync is a very handy tool for doing high-speed file transfers between Linux hosts over a local network or remote hosts on the internet, such as EC2 instances on Amazon Web Services. You can pass options to Rsync to do things like recursive transfer (all files and sub-directories within the source directory are also transferred), ignore existing or newer files at the destination, and more.

Sections

  1. Rsync Basics
  2. Advanced Rsync-ing
  3. Using Rsync with Cron
  4. References

Rsync Basics

Let’s start with using rsync to copy a directory, and all it’s contents, to another location.

rsync -r /home/Chandler/stuff /opt/stuff

This will copy stuff, and all contents including sub-folders because of the -r option, into /opt/, creating a new stuff directory in the process.

If we wanted not to copy the stuff directory as a whole, but only the contents inside, to an existing directory, instead we’d use:

rsync -r /home/Chandler/stuff/* /opt/existing/

Here we’re sending stuff/* the trailing slash and asterisk specifies only the contents, meaning at the destination we also add a trailing slash to existing/ to specify the files being transferred go into that directory.

Now we upgrade to sending stuff between different machines on a network. Let’s start with transferring a single file:

rsync /home/Chandler/report.txt [email protected]:/home/Monica/junk/report.txt

The above sends report.txt to the user Monica on machine 192.168.1.20 (Monica’s IP address), into Monica’s directory at /home/Monica/junk.

Likewise with a directory, it would be like this:

rsync /home/Chandler/stuff [email protected]:/home/Monica/stuff

Or only the contents of Chander’s /stuff into Monica’s existing directory:

rsync /home/Chandler/stuff/* [email protected]:/home/Monica/junk/

Advanced Rsync-ing

To explain Rsync, let’s pretend we want to transfer a directory of photos between two Linux hosts. The hostname of the computer with the photos is called Workstation, and the main user is Monica. The hostname of the receiving computer is unknown, but it’s IP address is 192.168.1.19 and it’s user is Chandler. Monica only wants photos in .jpg format to be transferred, wants to skip transferring any photos that already exist at the destination unless she has a newer version of it, and wants to NOT transfer files over 100MB in size. She will use the below command and options:

rsync -rvuogtz --progress --max-size=100M /home/Monica/photos/*.jpg [email protected]:/mnt/storage/photos/

Let’s break it down:

  • rsync is the command to invoke Rsync, not much to explain here.
  • -r or --recursive will copy directories recursively, which will include all sub-directories (and sub-directories within sub-directories) and all files within them.
  • -v or --verbose displays information on the files being transferred, without this Rsync will work silently and not display any new information until it is done running.
  • -u or --update will skip any files which exist on the destination and have a modified time that is newer than the source file; replaces existing file at destination if it has same modification time as source file, but different size.
  • -o, -g and -t will preserve the owner, group and last modified times respectively of files when transferred to destination.
  • -z or --compress will compress the files being transferred, which can significantly increase transfer speeds, especially when transferring to remote hosts over a network.
  • --progress will display the progress of files as they are transferred, giving you even more information as Rsync runs.
  • --max-size=100M prevents files at or over the specified file size (in this case 100MB) from being transferred.

This is just a small sample of the options available, but this set of options is a good base to start using Rsync. If copying important files, especially as backups, it’s desirable to pass the flag -a for archive mode. According to the Rsync man page:

--archive, -a
archive mode is -rlptgoD (no -A,-X,-U,-N,-H)

Therefore using -a will automatically include the following:

  • -r: recurse into directories
  • -l: copy symlinks
  • -p: preserve permissions
  • -t: preserve modification times
  • -g: preserve preserve group
  • -o: preserve owner
  • -D: preserve device files & special files

As explained by the rsync man page, archive mode does not include other flags, these can be important and might want to consider using them (in addition to archive mode) when backing up important system files:

  • -A: preserve access control lists (ACLs)
  • -X: preserve extended attributes
  • -U: preserve access times
  • -N: preserve create times
  • -H: preserve hard links

Therefore, to backup important files with Rsync, like for example a /home directory to another machine, we should use this:

rsync -azAXUNH --progress /home/Monica [email protected]:/mnt/storage/backups

Using RSync with Cron

Rsync makes a decent automatic backup solution by using a cronjob and passing Rysnc the --daemon flag to schedule recurring background transfers during off-hours.

First use the following command:

crontab -e

This will open the crontab file in your default text editor. For this example, Monica wants to use Rsync to back up the contents her home directory (/home/Monica) which includes her entire media library to another directory located at /mnt/backup, and she wants this to happen automatically every Saturday at 3:00AM, so the transfer happens when no one is using the Workstation. She’ll have to put this into the crontab just below the commented text:

0 3 * * SAT rsync -aq --daemon /home/Monica /mnt/backup/*

The Rsync stuff first. Besides using --daemon to run Rsync as a background process, we’re using -a for archive mode and -q to suppress most information during transfer (the opposite of -v).

Now for the Cron stuff, the 0 3 * * SAT before rsync is a schedule expression. You schedule cron jobs by expressing the timing in five fields, from left to right they are: minute (0 - 60), hour (0 - 23), day of the month (1 - 31), month (1 - 12), and day of the week. In my example above, 0 3 * * SAT means it will run every Saturday at 3:00 AM.

If you think you’ll have a hard time remembering the scheduling expressions, just know that I am right there with you, which is why I use a handy tool called Crontab Guru</ato schedule cron. To make Rsync a little less daunting to use, check out Rsyncinator, which has a slick GUI that lets you create your Rsync command with checkboxes and fields specifying what they do.

Finally, for a list of ALL the rsync options with short descriptions of what they do, check out the Rsync man page, a truly exhaustive manual for understanding all the granular details of Rsync.