Switched to Restic Backup

Date: 2020-09-11

Hi! In this post I will explain why and how I switched the backup solution for my server to Restic. I use it both for local backups (on an external hard-drive attached to the server and on another hard-drive which I keep in a different place) and for cloud backups (using Backblaze B2 Cloud Storage). Since my server hosts Nextcloud and other services, basically all my important data is there: backups should be the main priority when self-hosting.

Previous solution

My previous solution involved rsync, one of the most used and versatile file-copying tools. Rsync can perform incremental backups: when backups are repeated over time, only new or modified files are copied. Consider a simple example. On the filesystem I want to back up there are three files:

file1
file2
file3

One day I make a backup of these files to a new location, and all the files are copied there. The following day, however, I modify only file1. If I then run another backup to the same location, only file1 is copied, because file2 and file3 have not changed.
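To see this behaviour in practice, here is a small sketch that reproduces the three-file example in a throwaway directory. It is not part of my setup: the temporary directory and the variable names (demo_dir, first_run, second_run) are made up for the illustration, and it assumes that rsync is installed.

```shell
#!/usr/bin/env bash
# Illustrative sketch: demo_dir, first_run and second_run are hypothetical names.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/src" "$demo_dir/backup"
for f in file1 file2 file3; do
    echo "original content" > "$demo_dir/src/$f"
done

if command -v rsync >/dev/null 2>&1; then
    # First backup: the destination is empty, so every file is transferred.
    # --out-format='%n' prints the name of each item rsync updates.
    first_run="$(rsync -a --out-format='%n' "$demo_dir/src/" "$demo_dir/backup/" | grep '^file')"

    # Modify only file1 (its size changes, so rsync notices the difference).
    echo "a new line" >> "$demo_dir/src/file1"

    # Second backup to the same location: only the modified file is transferred.
    second_run="$(rsync -a --out-format='%n' "$demo_dir/src/" "$demo_dir/backup/" | grep '^file')"

    echo "first run copied:  $(echo "$first_run" | tr '\n' ' ')"
    echo "second run copied: $(echo "$second_run" | tr '\n' ' ')"
else
    echo "rsync is not installed, skipping the demo"
fi

rm -rf "$demo_dir"
```

The second run should list only file1, while file2 and file3 are left alone.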

My backup scheme was this:

  1. Every day, at 4 am, a systemd timer launched a systemd service which ran a command. This command copied the / directory to an external hard-drive connected via USB (some directories, which should not be included in backups, were excluded).
  2. Every 3 or 4 months I would copy the content of the external drive to another external drive, which was kept in another room, not connected to any PC.

These were the contents of the timer and of the service:

ale@aleserver ~$ cat /etc/systemd/system/backup.timer   
[Unit]  
Description=Backup Timer  
  
[Timer]  
OnCalendar=*-*-* 4:00:00  
  
[Install]  
WantedBy=multi-user.target
ale@aleserver ~$ cat /etc/systemd/system/backup.service
[Unit]  
Description=Backup service  
  
[Service]  
User=root  
ExecStart=/usr/bin/rsync -aAXS -v --delete --log-file=/home/ale/rsync_config/rsync_log --exclude-from='/home/ale/rsync_config/exclude.txt' / /home/ale/backup/

Basically, it launched rsync with a bunch of options:

  1. The -aAXS flags are typically used when performing backups with rsync. The ArchWiki rsync page is a great resource to understand what these flags do;
  2. -v enables verbose logging;
  3. --delete means that files deleted on the source are deleted on the backup, too;
  4. --log-file indicates the file where the log should be saved;
  5. --exclude-from='...' points to a file which contains a list of paths which should be excluded from the backup;
  6. / is the source of the backup, while /home/ale/backup is the directory on which the external hard-drive is mounted.

The content of exclude.txt is:

/dev/*  
/proc/*  
/sys/*  
/tmp/*  
/run/*  
/mnt/*  
/media/*  
/lost+found/*  
/home/ale/backup  
swap.img  
/home/ale/.cache/*  
/var/lib/lxcfs/*  
/var/lib/docker/*

which are directories and files that you usually neither need nor want in backups. Note that /home/ale/backup has to be excluded, or the backup will go into an infinite loop! rsync would try to copy the content of the backup folder into the backup folder, which in turn would become a nested backup…
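A detail worth seeing in action is the trailing /* in patterns such as /tmp/*: the directory itself is still created in the backup, only its content is skipped. The following sketch shows this on a throwaway directory tree. The paths and names in it (demo_dir, copied, the sample files) are invented for the example, and it assumes that rsync is installed.

```shell
#!/usr/bin/env bash
# Illustrative sketch: demo_dir, copied and the sample files are hypothetical.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/src/tmp" "$demo_dir/src/etc" "$demo_dir/backup"
echo "scratch data" > "$demo_dir/src/tmp/junk"
echo "setting=1"    > "$demo_dir/src/etc/app.conf"

# A one-line exclude file, in the same style as exclude.txt.
# The leading / anchors the pattern to the root of the transfer.
printf '%s\n' '/tmp/*' > "$demo_dir/exclude.txt"

if command -v rsync >/dev/null 2>&1; then
    rsync -a --exclude-from="$demo_dir/exclude.txt" "$demo_dir/src/" "$demo_dir/backup/"

    # List what ended up in the backup: tmp/ exists but is empty.
    copied="$(cd "$demo_dir/backup" && find . | LC_ALL=C sort)"
    echo "$copied"
else
    echo "rsync is not installed, skipping the demo"
fi

rm -rf "$demo_dir"
```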

Problems with the previous method

These are the main reasons that made me switch backup solution:

  1. No snapshots: the backup folder reflects the server filesystem as it was at 4 am of the last night, and nothing else. If you delete a file during the day and want it back two days later, you can’t have it, because the backup does not contain it anymore. It happened to me with a config file.
  2. No “third location” backup: the backups were performed on an external disk and, every 3 or 4 months, on another drive which, however, was just in another room. In case of theft or fire, all the copies of my data could be lost!

Always remember the “3-2-1 backup rule”:

Keep at least three (3) copies of your data, and store two (2) backup copies on different storage media, with one (1) of them located offsite.

While I had 3 copies of my data, with 2 backup copies on separate storage media (the two external hard-drives), I failed to accomplish the last point. You may think that thieves or a fire will never strike your house, but it can still happen.

The alternatives

There are two alternatives that I checked: Borg and Restic.

Both are very good backup applications, widely used and well known. They provide data deduplication (which means that unique pieces of data are saved only once), encryption (which is fundamental if the backup location is not safe or can be accessed by someone else) and snapshots (which let you save many backup states at different moments, so you can restore the data as it was at a certain date).

In contrast with plain rsync, the syntax and functions of these applications are aimed at managing backups, which makes them a better solution for this use case.

While Borg also provides backup compression (coming soon for Restic), I decided to go with Restic. Honestly, this is mainly a matter of taste: I like Restic’s syntax more than Borg’s, and I prefer the output of Restic’s commands. Since I don’t have much data to back up, compression did not bring an important reduction of the occupied space in my tests, whereas Restic appeared to be faster on my system.

Lastly, I find that the configuration for Restic is a bit easier than the one for Borg. However, I like both programs.

Configuration

Local backups

As I said, local backups are performed on an external hard-drive connected to my server through USB. This protects me in case of internal hard-drive failure or corruption, but not in case of electrical problems, which may damage both the server and the attached device.

I still use the systemd timer shown in the first section, but instead of the rsync command the systemd unit launches a script:

[Unit]
Description=Backup service

[Service]
User=root
ExecStart=/home/ale/restic_config/backup.sh

The script is:

#!/usr/bin/env bash

# When a repository is initialized a password is
# requested. It is used for encryption purposes
# and it is needed every time the repository
# is accessed.

export RESTIC_PASSWORD="My_beautiful_password"
export RESTIC_REPOSITORY="/home/ale/backup/NUC-restic-repo"

# Launch the restic command. Exclude some paths
# from the backup and produce a verbose output.
# The directory to back up is "/".
# Note: nothing may follow the trailing backslashes,
# not even a space, or the command is cut short.
restic                                     \
   --exclude=/home/ale/backup              \
   --exclude=/home/ale/external_backup     \
   --exclude=/dev                          \
   --exclude=/media                        \
   --exclude=/mnt                          \
   --exclude=/proc                         \
   --exclude=/run                          \
   --exclude=/sys                          \
   --exclude=/tmp                          \
   --exclude=/var/tmp                      \
   --exclude=/var/lib                      \
   --exclude=/var/lib/lxcfs                \
   --exclude=/var/lib/docker               \
   --exclude=/home/ale/.cache              \
   --verbose                               \
   backup                                  \
   /
  
# This command forgets and deletes some snapshots
# from the backup repository, according to some
# policies.
# The last 5 snapshots are always kept; then,
# keep the last 3 daily snapshots, the last 4
# weekly snapshots, the last 6 monthly snapshots
# and the last 2 yearly ones.
restic                                     \
   forget                                  \
   --keep-last 5                           \
   --keep-daily 3                          \
   --keep-weekly 4                         \
   --keep-monthly 6                        \
   --keep-yearly 2                         \
   --prune

Sometimes I manually do the same operation on an external disk. You only have to change the repository directory. You can access the logs of the systemd service with journalctl -u backup.service.
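Since only the repository changes between the two disks, the commands can be wrapped in a function that takes the repository as an argument. What follows is just a sketch of that idea, not the script I actually run: run_backup, RESTIC_BIN, EXCLUDE_FILE and the path of the exclude file are names I made up for the example. It relies on two Restic options, --repo to select the repository and --exclude-file to read the exclude patterns from a file (one per line), and it assumes that RESTIC_PASSWORD is already exported.

```shell
#!/usr/bin/env bash
# Sketch only: run_backup, RESTIC_BIN and EXCLUDE_FILE are hypothetical names.

# Set RESTIC_BIN=echo to print the restic arguments instead of running restic.
RESTIC_BIN="${RESTIC_BIN:-restic}"

# Hypothetical file holding one exclude pattern per line (/dev, /proc, ...).
EXCLUDE_FILE="${EXCLUDE_FILE:-/home/ale/restic_config/exclude.txt}"

run_backup() {
    local repo="$1"

    # Back up "/" into the given repository, skipping the excluded paths.
    "$RESTIC_BIN" --repo "$repo" --verbose backup \
        --exclude-file="$EXCLUDE_FILE" /

    # Apply the same retention policy as in the script above.
    "$RESTIC_BIN" --repo "$repo" forget \
        --keep-last 5 --keep-daily 3 --keep-weekly 4 \
        --keep-monthly 6 --keep-yearly 2 --prune
}

# Example calls (commented out, so that sourcing this file runs nothing):
# run_backup /home/ale/backup/NUC-restic-repo
# run_backup /home/ale/external_backup/NUC-restic-repo
```

With RESTIC_BIN=echo the function only prints the arguments it would pass, which is a cheap way to check them before touching a real repository.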

Backblaze B2 backups

Restic has built-in support for Backblaze B2, a very affordable cloud storage service. Consider that you spend 0.005$/GB per month. This means that, for 60 GB saved in a bucket, you spend something like 0.3$ a month, which is less than 4$ a year.

With Restic backups some additional costs may apply, because some data must be downloaded as well as uploaded. Still, I think that it’s difficult to find a cheaper alternative.

The script used to perform backups to a Backblaze B2 bucket is still very simple:
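The estimate can be checked with a couple of lines of shell. This is only the arithmetic on the figures quoted above (0.005$/GB per month and 60 GB); the variable names are made up, and download or transaction fees are not taken into account.

```shell
#!/usr/bin/env bash
# Storage-only estimate; price_per_gb and stored_gb are the figures from the text.
price_per_gb=0.005   # $ per GB per month
stored_gb=60         # GB kept in the bucket

# awk does the floating point math; LC_ALL=C keeps "." as decimal separator.
monthly="$(LC_ALL=C awk -v p="$price_per_gb" -v g="$stored_gb" 'BEGIN { printf "%.2f", p * g }')"
yearly="$(LC_ALL=C awk -v p="$price_per_gb" -v g="$stored_gb" 'BEGIN { printf "%.2f", p * g * 12 }')"

echo "monthly cost: ${monthly}\$"
echo "yearly cost:  ${yearly}\$"
```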

#!/usr/bin/env bash
# Credentials of the Backblaze B2 account
# (replace the placeholders with your own values).
export B2_ACCOUNT_ID="<my id>"
export B2_ACCOUNT_KEY="<my key>"

export RESTIC_PASSWORD="my beautiful password"
export RESTIC_REPOSITORY="b2:mybeautifulbucket:/NUC-restic-repo-b2"

restic                                     \
   --exclude=/home/ale/backup              \
   --exclude=/home/ale/external_backup     \
   --exclude=/dev                          \
   --exclude=/media                        \
   --exclude=/mnt                          \
   --exclude=/proc                         \
   --exclude=/run                          \
   --exclude=/sys                          \
   --exclude=/tmp                          \
   --exclude=/var/tmp                      \
   --exclude=/var/lib                      \
   --exclude=/var/lib/lxcfs                \
   --exclude=/var/lib/docker               \
   --exclude=/home/ale/.cache              \
   --verbose                               \
   backup                                  \
   /
  
restic                                     \
   forget                                  \
   --keep-last 5                           \
   --keep-daily 3                          \
   --keep-weekly 4                         \
   --keep-monthly 6                        \
   --keep-yearly 2                         \
   --prune

Conclusion

After checking the backups, I saw that you can access them very simply: you can mount a repository in a directory and recover exactly the data you want, or you can restore an entire snapshot into a folder. I think that this solution solves the problems I stated at the beginning.
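As an illustration of the restore path, here is a small round trip on throwaway data: it creates a repository, backs up one file, restores the latest snapshot into a folder and reads the file back. It is a sketch and not the procedure I used on the server. The directory, the password and the variable names are invented for the example, and it only does something if restic is installed.

```shell
#!/usr/bin/env bash
# Sketch with throwaway data: demo_dir, the password and the file are hypothetical.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/data"
echo "important config" > "$demo_dir/data/app.conf"

if command -v restic >/dev/null 2>&1; then
    export RESTIC_PASSWORD="demo-password"
    export RESTIC_REPOSITORY="$demo_dir/repo"

    restic init                        # create the (encrypted) repository
    restic backup "$demo_dir/data"     # take one snapshot
    restic snapshots                   # list the snapshots in the repository

    # Restore the most recent snapshot into a separate folder.
    restic restore latest --target "$demo_dir/restored"

    # The original path is recreated below the target folder,
    # so look the file up instead of guessing its location.
    restored_file="$(find "$demo_dir/restored" -name app.conf)"
    restored_content="$(cat "$restored_file")"
    echo "restored content: $restored_content"

    # Browsing instead of restoring: "restic mount <mountpoint>" exposes the
    # snapshots as a FUSE filesystem and blocks until it is interrupted.
else
    echo "restic is not installed, skipping the demo"
fi

rm -rf "$demo_dir"
```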

There are two points to improve: