How to optimize data synchronization with rsync
rsync is a versatile tool that simplifies file transfer over network connections and speeds up the synchronization of local directories. Its flexibility makes it an excellent option for a variety of file-level operations.
What is rsync?
rsync, short for “remote synchronization”, is a flexible, network-capable synchronization tool for Linux and other Unix-like systems. The open-source program can be used to synchronize files and directories between local systems or across networks. The tool uses a delta-transfer technique, whereby only those sections of data that have actually changed are transferred. This minimizes the amount of data exchanged and considerably speeds up the synchronization process. Thanks to a variety of options, rsync allows precise control of synchronization behavior. The flexible syntax makes both simple local copies and complex network synchronizations possible.
What is the syntax for rsync?
The command syntax of rsync has a simple structure and is similar to that of ssh, scp and cp. The basic structure is as follows:
rsync [OPTION] source destination
The source path that the data should be synchronized from is entered in source, while the destination path is specified as destination. rsync offers a variety of options that let users adapt the synchronization process to their requirements. The most frequently used options are:
- -a (archive): Synchronizes recursively and preserves permissions, timestamps, groups, owners and special file properties
- -v (verbose): Displays detailed information about the synchronization process
- -r (recursive): Synchronizes directories and their contents recursively
- -u (update): Skips files that are newer in the target directory than in the source
- -z (compress): Compresses data during transfer to reduce network traffic
- -i (--itemize-changes): Displays a list of the changes made to each item
- --delete: Deletes files in the target directory that no longer exist in the source
- --exclude: Excludes certain files or directories from synchronization
- --dry-run (-n): Simulates the synchronization process without actually transferring files
- --progress: Shows the progress of the file transfer
- --partial: Keeps partially transferred files in the target directory if the transfer is interrupted. When the transfer is resumed, the file is continued from its last state
Examples of rsync syntax
The following examples of rsync syntax should make it easier to understand how the command is used. The following commands create the directory dir1 containing 100 empty test files and a second, empty directory dir2:
$ cd ~
$ mkdir dir1
$ mkdir dir2
$ touch dir1/file{1..100}
The contents of dir1 can be synchronized with dir2 on the same system using the -r option:
$ rsync -r dir1/ dir2
Alternatively, the -a option can be used, which synchronizes recursively and preserves symbolic links, special device files, modification times, groups, owners and permissions:
$ rsync -a dir1/ dir2
Note: The slash (/) at the end of the source directory in an rsync command is important because it indicates that the contents of the directory should be synchronized, not the directory itself. Adding the -v option lists the files that are transferred:
$ rsync -av dir1/ dir2
Here’s an example of the output:
sending incremental file list
./
file1
file10
file100
file11
file12
file13
file14
file15
file16
file17
file18
. . .
If the source directory doesn’t have a trailing slash, the source directory itself will be copied into the target directory:
$ rsync -av dir1 dir2
Here’s the output:
sending incremental file list
dir1/
dir1/file1
dir1/file10
dir1/file100
dir1/file11
dir1/file12
dir1/file13
dir1/file14
dir1/file15
dir1/file16
dir1/file17
dir1/file18
. . .
Using the slash at the end of the source directory ensures that the synchronization process runs as expected and that the contents of the source directory end up in the correct target directory.
How to synchronize with a remote system using rsync
Synchronizing with a remote system using rsync is usually not difficult, provided you have SSH access to the remote computer and the necessary authentication information. rsync typically uses SSH (Secure Shell) for secure communication with remote systems. For this to work, rsync has to be installed on both sides.
Once SSH access between the two computers is verified, the dir1 folder can be synchronized to a remote computer. In this case, the directory itself needs to be transferred, which is why the trailing slash has been omitted in the following command:
$ rsync -a ~/dir1 username@remote_host:destination_directory
If a directory is transferred from a local system to a remote system, this is referred to as a push operation. In contrast, when a remote directory is synchronized to a local system, this is referred to as a pull operation. The syntax for a pull operation is as follows:
$ rsync -a username@remote_host:/home/username/dir1 place_to_sync_on_local_machine
What other options are there in rsync?
The standard behavior of rsync can be further adapted using the options below.
Transferring non-compressed files with rsync
The network load when transferring non-compressed files can be reduced using the -z option, which compresses the data during transfer:
$ rsync -az source destination
Displaying progress and resuming interrupted transmissions
With -P you can combine the options --progress and --partial. This gives you an overview of the progress of transmissions and also allows you to resume interrupted transmissions:
$ rsync -azP source destination
Here’s the output:
sending incremental file list
./
file1
0 100% 0.00kB/s 0:00:00 (xfer#1, to-check=99/101)
file10
0 100% 0.00kB/s 0:00:00 (xfer#2, to-check=98/101)
file100
0 100% 0.00kB/s 0:00:00 (xfer#3, to-check=97/101)
file11
0 100% 0.00kB/s 0:00:00 (xfer#4, to-check=96/101)
. . .
Execute the command again to obtain a shorter output. rsync determines whether files have changed based on their size and modification time, and since nothing has changed, no files are transferred:
$ rsync -azP source destination
Here’s the output:
sending incremental file list
sent 818 bytes received 12 bytes 1660.00 bytes/sec
total size is 0 speedup is 0.00
Keep directories synchronized with rsync
To ensure that two directories are actually kept in sync, files that have been removed from the source directory also need to be deleted from the target directory. By default, rsync doesn’t remove files from the target directory. This can be changed with the --delete option. However, it’s important to use this option with caution since it deletes files in the target directory that no longer exist in the source.
Before using this option, you should run the command with the --dry-run option. This performs a simulation of the synchronization process without deleting any actual files. That way you can ensure that only the desired changes are made without accidentally losing important data:
$ rsync -a --delete --dry-run source destination
If the simulated run shows the expected changes, run the command without --dry-run:
$ rsync -a --delete source destination
Exclude files and directories from synchronization
In rsync, you can use the --exclude option to exclude certain files and directories from synchronization. This is useful if, for example, you don’t want to synchronize temporary files, log files or other content.
$ rsync -a --exclude=pattern_to_exclude source destination
If you’ve specified a pattern for excluding files, you can use the --include= option to override this exclusion for certain files that match a different pattern. Because rsync applies the first rule that matches a file, the --include option has to come before the --exclude option:
$ rsync -a --include=pattern_to_include --exclude=pattern_to_exclude source destination
Save backups with rsync
The --backup option allows you to save backups of important files. It can be used in conjunction with the --backup-dir option to specify the directory where the backup files should be saved:
$ rsync -a --delete --backup --backup-dir=/path/to/backups /path/to/source destination
You can find a detailed overview of the various backup scenarios in our article about server backups with rsync.