A while ago I got a testing account on rsync.net to try out the experience of using it with zrepl. The experience has been quite wonderful, and I want to share what I learned with you.
What is zrepl?
If you use zfs on your NAS, you inevitably think about doing an offsite backup of your data. For this you might use off-the-shelf tools like Duplicati or borg. As a zfs user you might also be aware of the option to use the zfs send command to send a byte-for-byte copy of a zfs snapshot to another system running zfs. Writing some shell scripts around this and using them as a backup has numerous drawbacks, however.
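To make the "shell scripts around zfs send" idea concrete, here is a minimal dry-run sketch of what such a script boils down to. It only prints the commands instead of running them, and every dataset, snapshot and host name in it is a hypothetical placeholder, not something from my setup.

```shell
#!/bin/sh
# Dry run: prints the commands a naive replication script would run.
# All names passed in below are hypothetical placeholders.
naive_replicate() {
  dataset=$1   # local dataset, e.g. tank/data
  prev=$2      # last snapshot that already exists on the remote
  cur=$3       # new snapshot to create and send
  remote=$4    # ssh destination
  target=$5    # dataset to receive into on the remote

  echo "zfs snapshot ${dataset}@${cur}"
  echo "zfs send -i ${dataset}@${prev} ${dataset}@${cur} | ssh ${remote} zfs receive ${target}"
}

naive_replicate tank/data monday tuesday user@example.rsync.net data1/tank-data
```

Note what is missing: nothing here deals with an interrupted transfer, with keeping track of which snapshot the remote already has, or with cleaning up old snapshots on either side.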
This is where zrepl comes in: it’s a “one-stop, integrated solution for ZFS replication”. It runs as a daemon on your local nas and your remote system, making sure the replication of your datasets proceeds reliably (for example by recovering from network outages) and transparently (by exposing various tools for monitoring).
It also manages the creation and cleanup of periodic snapshots of your datasets. You could for example keep 24 hourly snapshots on your local nas to quickly recover deleted files, while keeping 6 monthly snapshots on your remote in case you need to recover something older.
Zrepl is an awesome tool if you want your data to be in two places at the same time while using many native zfs features.
What is rsync.net?
Rsync.net is basically a cloud storage provider without all the fuss. Originally they would just provide you a vm to rsync your files to (hence the name) and make sure they stay there. You can also use various other tools to send them your data like sftp, borg, rclone, git-annex or just good old scp.
The happy coincidence is that their system is based on zfs, just like ours. You even get full access to your own zpool if you sign up for a zfs-send enabled account here. You can simply treat your rsync.net vm as a remote zpool under your control.
Setting up zrepl on rsync.net
As it turns out, all this allows us to run zrepl on our remote vm. The rest of this post is about how to set that up. I will skip the setup on your local nas as that is nicely explained by the zrepl docs.
When you sign up for your account you will get a user, password and a host to ssh into. You then want to do basic setup steps like adding an ssh key for your user, but I will skip this part as they have their own guides on how to get started with that. The vms are running on FreeBSD, so some things are a little different. (At least they were for me as a Linux user.)
To install zrepl simply run pkg install zrepl.
The next step is making sure that our local NAS’ zrepl daemon can access the rsync.net vm. Zrepl has various transports to make that work. For this guide we will use the simple ssh+stdinserver transport (be aware that it has some throughput limitations, however). You should generate a fresh ssh keypair for that on your local machine, putting the private key somewhere your local zrepl daemon can access it. The public counterpart goes into the familiar .ssh/authorized_keys file on the remote, prefixed with a preamble that restricts incoming connections using that key to running zrepl:
command="zrepl stdinserver client1",restrict ssh-ed25519 AAAAC3N...
Simply replace the ssh-ed25519 AAAAC3N... here with the public key you generated. The client1 after stdinserver is the client identity that the remote zrepl config refers to.
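If you would rather script this step than edit the file by hand, a small helper can assemble the line for you. This is only a sketch: the function name is made up, and the key path in the comments is a placeholder you need to adapt.

```shell
#!/bin/sh
# Generating the keypair on your local machine would look roughly like this
# (the path is a hypothetical placeholder):
#   ssh-keygen -t ed25519 -N "" -f /path/to/zrepl_key

# Build the authorized_keys line for a given client identity and public key.
make_authorized_keys_entry() {
  client_identity=$1   # must match client_identities in the remote config
  public_key=$2        # contents of the .pub file
  printf 'command="zrepl stdinserver %s",restrict %s\n' "$client_identity" "$public_key"
}

# Reproduces the example line from this post:
make_authorized_keys_entry client1 "ssh-ed25519 AAAAC3N..."
```

On the remote you would append the output to .ssh/authorized_keys, passing the real contents of your .pub file as the second argument.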
The next step is to configure zrepl by editing the config file at /usr/local/etc/zrepl/zrepl.yml; we will go into detail on that in the next section.
To enable the zrepl service on your remote, simply create the file /etc/rc.conf.d/zrepl and add a line containing zrepl_enable="YES" to it. Then start zrepl by running service zrepl start.
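The same step as a small shell sketch, in case you want to keep it in a setup script. The function name is my own invention, and the optional directory argument only exists so you can try it against a scratch directory first.

```shell
#!/bin/sh
# Write the rc.conf.d snippet that enables the zrepl service.
# The directory defaults to /etc/rc.conf.d; pass another one to test safely.
enable_zrepl() {
  rc_dir=${1:-/etc/rc.conf.d}
  mkdir -p "$rc_dir"
  printf 'zrepl_enable="YES"\n' > "$rc_dir/zrepl"
}

# On the rsync.net vm you would then run:
#   enable_zrepl
#   service zrepl start
```

Overwriting the file is fine here because it is dedicated to zrepl and contains only this one line.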
Simple zrepl config
I’ll provide you with a very simple zrepl config for your local nas and remote here. Be sure to read the zrepl docs to tune it for your use case.
Local: /etc/zrepl/zrepl.yml
global:
  logging:
    - format: human
      level: warn
      type: syslog
jobs:
  - connect:
      host: <your rsync.net vm name>.rsync.net
      identity_file: <path to the private key you created>
      port: 22
      type: ssh+stdinserver
      user: root
    filesystems:
      <datasets you want to replicate>: true
    name: <a unique name for this job>
    pruning:
      keep_receiver:
        - grid: 1x1h(keep=all) | 24x1h | 30x1d | 6x30d
          regex: ^zrepl_
          type: grid
      keep_sender:
        - type: not_replicated
        - count: 10
          type: last_n
    snapshotting:
      interval: 10m
      prefix: zrepl_
      type: periodic
    type: push
Remote: /usr/local/etc/zrepl/zrepl.yml
global:
  logging:
    - type: "stdout"
      level: "error"
      format: "human"
    - type: "syslog"
      level: "info"
      format: "logfmt"
jobs:
  - name: sink
    type: sink
    serve:
      type: stdinserver
      client_identities:
        - "client1"
    root_fs: "data1"
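One thing worth knowing when you go looking for your data on the remote: as far as I understand zrepl's sink job, received datasets end up below root_fs, in a subtree named after the client identity, followed by the dataset path from the sender. The helper below just spells out that naming scheme; the function name and the tank/home dataset are made-up examples.

```shell
#!/bin/sh
# Where a replicated dataset should show up on the remote, assuming the
# layout <root_fs>/<client identity>/<sender dataset>.
sink_target() {
  root_fs=$1          # root_fs from the remote config
  client_identity=$2  # identity from authorized_keys / client_identities
  dataset=$3          # dataset name on the local nas
  printf '%s/%s/%s\n' "$root_fs" "$client_identity" "$dataset"
}

sink_target data1 client1 tank/home   # prints data1/client1/tank/home
```

Running zfs list on the remote after the first replication is the easiest way to confirm where things actually landed.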
This will make zrepl push snapshots to your rsync.net vm with some snapshotting and pruning. Specifically, it takes a snapshot every 10 minutes on your local nas, but only keeps the 10 most recent ones around, plus any that have not been replicated yet. Your remote keeps every snapshot from the most recent hour, then 1 snapshot for each of the following 24 hours, 1 snapshot for each of the following 30 days and 1 snapshot for each of the following six 30-day periods (roughly 6 months).
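To get a feeling for how far back such a grid reaches, you can add up its buckets. The sketch below does that for the grid string from the config above; it is my own helper, not part of zrepl, and it only understands the h and d units used in this post.

```shell
#!/bin/sh
# Sum the time span covered by a zrepl grid spec, in hours.
# Only the "h" and "d" units are handled; "(keep=...)" modifiers are ignored.
grid_span_hours() {
  total=0
  # Strip "(keep=...)" and split the spec on "|"
  for bucket in $(printf '%s' "$1" | sed 's/([^)]*)//g' | tr '|' ' '); do
    count=${bucket%%x*}            # "24x1h" -> 24
    length=${bucket#*x}            # "24x1h" -> 1h
    value=${length%?}              # "30d"   -> 30
    unit=${length#"$value"}        # "30d"   -> d
    case $unit in
      h) hours=$value ;;
      d) hours=$((value * 24)) ;;
      *) echo "unsupported unit: $unit" >&2; return 1 ;;
    esac
    total=$((total + count * hours))
  done
  echo "$total"
}

span=$(grid_span_hours '1x1h(keep=all) | 24x1h | 30x1d | 6x30d')
echo "receiver grid covers ${span} hours, about $((span / 24)) days"
echo "sender keeps about $((10 * 10)) minutes of snapshots"
```

For the config in this post that comes out at 5065 hours, or about 211 days, on the remote, compared to roughly 100 minutes of snapshots on the local nas.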
Conclusion
Using zrepl on rsync.net provides us with a fairly straightforward way to replicate our zpool off site, with the added benefit of not having to convert our data into other backup formats (such as borg) while keeping most of their benefits. It’s a unique offering that I have not seen anywhere else and hopefully fits your use case as well as it does mine.
Disclaimer: I’m continuing to receive discounted access to rsync.net.