Introduction to ZFS Replication

Up your OpenZFS data management game and handle hardware failure with minimal data loss

In Basics of ZFS Snapshot Management, we demonstrated how easy and convenient it is to create snapshots and use them to restore data on the local system. 

In this article, we’ll demonstrate how to replicate snapshots to another system. This feature of OpenZFS really ups the data management game, providing a mechanism for handling a hardware failure with minimal data loss and downtime. Replication is also a convenient way to quickly spin up a copy of an existing system on new hardware, say when you purchase a new laptop, or to deploy a whole lab of similar systems. It can even be used to mirror the contents of your home directory on two different systems. 

The replication design used by OpenZFS is pretty ingenious. Unlike cloning software, replication does not do a byte-for-byte copy. Instead, zfs send converts snapshots into a serialized stream of data and zfs receive transforms the streams back into files and directories. Received snapshots are treated as a live file system, meaning that the data in the snapshot can be directly accessed on the receiving system. 
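In its simplest form, replication is just those two commands joined by a pipe. As a rough sketch with placeholder names (mypool, mydata, backuphost, and the monday snapshot are all hypothetical), a networked replication looks like this: 

zfs send mypool/mydata@monday | ssh backuphost zfs receive backups/mydata 

The rest of this article builds up to exactly this form, one step at a time. 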

In practical terms, snapshots are replicated to another system over the network, possibly in another geographic location, and typically on a schedule. This assumes that the other system has enough storage capacity to accept the replicated data, plus the replicated changes over time, and that the network can handle the data transfer. It can seem a bit daunting to determine the required storage for data that changes over time, as well as the needed network capacity. 

Fortunately, replication itself is easy to configure and understand. In this article we’ll keep things simple, and practice replicating small amounts of data to a virtual machine. Once you’re comfortable with how the commands work, you can start to apply them to real systems and larger amounts of data. 

Things to Know First

Replication requires both systems to have at least one OpenZFS pool. The pools do not need to be identical: for example, they can be a different size, use a different RAIDZ level, or have different properties. However, if you have explicitly enabled a feature on one system’s pool, it must also be enabled on the other system’s pool. 
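If you’re not sure whether the feature flags match, one way to compare them is to list each pool’s feature@ properties on both systems (mypool is a placeholder here): 

zpool get all mypool | grep feature@ 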

Depending upon the size of the snapshot and the speed of the network, the first replication can take a very long time to complete, especially when replicating an entire pool. If possible, perform an initial replication when the network is not busy. Once the first replication is complete, subsequent replications of incremental data are quick. When replicating pools, be aware that the replicated data will not be accessible on the receiving system until the replication is complete. 

Finally, it is very important to be aware of the available capacity on the receiving system and the size of the snapshot being sent. If you are scripting a replication schedule, include a space check before starting the replication. 
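A minimal sketch of such a check might look like the following, using the dataset, address, and pool names from the examples later in this article. zfs send -nvP performs a dry run that prints a parseable size estimate instead of sending data, and zfs get -Hp reports the receiving pool’s available space in bytes: 

#!/bin/sh 
# Dry run: print the estimated stream size in bytes without sending anything 
est=$(zfs send -nvP tank/usr/home/dru@homedir | awk '/^size/ {print $2}') 
# Ask the receiving pool how many bytes it has available 
avail=$(ssh 10.0.2.15 zfs get -Hp -o value available backups) 
if [ "$est" -lt "$avail" ]; then 
    zfs send tank/usr/home/dru@homedir | ssh 10.0.2.15 zfs receive backups/dru 
else 
    echo "Not enough space on backups: need $est bytes, have $avail" >&2 
fi 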

Preparing the Receiving System 

For these replication examples, a laptop is the sending system and a virtual machine is the receiving system. 

Note: The commands in this article are run as the root user. While you could use zfs allow to give a user permission for the send and receive commands, consider that replication involves another system and usually the transfer of system files or pools. This is different from giving a regular user permission to snap or restore their own data locally. 
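For reference, that delegation would look something like the following sketch, with dru and these dataset names as examples. The first command runs on the sending system, the second on the receiving system: 

zfs allow dru send,snapshot tank/usr/home/dru 
zfs allow dru receive,create,mount backups 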

I installed FreeBSD 13 into a VirtualBox virtual machine configured with two 16GB virtual storage devices. During installation, one virtual storage device was formatted with ZFS, DHCP was configured on the virtual network interface, and the default option of enabling SSH was selected. The IP address of this receiving system is 10.0.2.15. 

Create a Pool to Hold the Replicated Snapshots 

To FreeBSD, the two virtual storage devices appear as ada0, which holds the zroot pool created during installation, and ada1, which is still available: 

ls /dev/ad* 
/dev/ada0	/dev/ada0p1	/dev/ada0p2	/dev/ada0p3	/dev/ada1 	


I’ll use the zpool create command to create a pool named backups on the ada1 device: 

zpool create backups /dev/ada1 

This system now has 2 pools: zroot contains the operating system, and backups will be used to hold the replicated snapshots: 

zpool list 
NAME		SIZE	ALLOC	FREE	CKPOINT	EXPANDSZ	FRAG	CAP	DEDUP	  HEALTH 
backups   15.5G	 336K	15.5G		 -		 -	  0%	 0%	1.00x	  ONLINE 
zroot     15.5G	 336K	15.5G		 -		 -	  0%	 0%	1.00x	  ONLINE 

Configure SSH Access 

OpenZFS uses SSH to encrypt the replication stream during the network transfer. By default, root is not allowed to ssh into a FreeBSD system. Since root will be sending the replication stream, change this line in the SSH daemon configuration file (/etc/ssh/sshd_config): 

#PermitRootLogin no 

to: 

PermitRootLogin yes 

Then, tell the SSH daemon to reload its configuration: 

service sshd reload 

The receiving system is now configured. 

Preparing the Sending System 

On the laptop (sending system), I want to check that the root user has an SSH key pair. This authentication method requires a copy of the public key on the receiving system. By using a key pair without a passphrase, replication becomes fully scriptable, without prompting for user input.                                 

If a key pair does not exist, generate one as root. Press enter at all of the prompts to accept the defaults and not require a passphrase: 

ssh-keygen 
Generating public/private rsa key pair. 
Enter file in which to save the key (/root/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_rsa. 
Your public key has been saved in /root/.ssh/id_rsa.pub. 
<SNIP fingerprint and randomart image> 

Next, send a copy of the public key to the receiving system. This command assumes that you know the root password on the receiving system: 

cat .ssh/id_rsa.pub | ssh 10.0.2.15 'cat >>.ssh/authorized_keys' 
Password for root@10.0.2.15: 

Finally, verify that you can ssh to the receiving system without being prompted for a password or passphrase. If it works, you should just get the command prompt of the receiving system. Type exit to log out of the ssh session. 
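The whole test is just two commands, with no prompts in between: 

ssh 10.0.2.15 
exit 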

Testing a Replication 

Let’s start by creating a test dataset, populating it with a small amount of data, and taking a snapshot: 

zfs create tank/usr/home/dru/test 
cp -R /etc/* /usr/home/dru/test/ 
zfs snapshot tank/usr/home/dru/test@testbackup 
zfs list -t snapshot 
NAME                            		USED    AVAIL    REFER     MOUNTPOINT 
tank/usr/home/dru/test@testbackup       0        -     2.23M             - 

Remember: it is very important to be aware of the amount of snapshot data (REFER) to ensure it will fit on the receiving system. In this case, 2.23M is a trivial amount of data, even for the small 15.5G pool on the receiving system. 
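To check the other side of that equation, you can query the receiving pool’s free space directly from the sending system: 

ssh 10.0.2.15 zfs get -H -o value available backups 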

The command to replicate that snapshot creates a send stream with verbose stats (-v) of the tank/usr/home/dru/test@testbackup snapshot, then pipes (|) that stream to the ssh command, which logs in as root on 10.0.2.15 so that the receiving system can receive the stream and save it to the backups pool: 

zfs send -v tank/usr/home/dru/test@testbackup | ssh 10.0.2.15 zfs receive backups 
full send of tank/usr/home/dru/test@testbackup estimated size is 2.13M 
total estimated size is 2.13M 
TIME		SENT		SNAPSHOT 
cannot receive new filesystem stream: destination 'backups' exists 
must specify -F to overwrite it 
warning: cannot send 'tank/usr/home/dru/test@testbackup': signal received 

This error indicates an important difference between replicating a snapshot of a pool and replicating a snapshot of a dataset. When sending a snapshot of a dataset, you must append a location name to the name of the destination pool. You can name the location whatever you want, as long as it doesn’t already exist on the destination pool. In this example, I’ll specify a name of test: 


zfs send -v tank/usr/home/dru/test@testbackup | ssh 10.0.2.15 zfs receive backups/test 
full send of tank/usr/home/dru/test@testbackup estimated size is 2.13M 
total estimated size is 2.13M 
TIME		SENT		SNAPSHOT 

Note that the output indicates the estimated required storage capacity. Press ^C to abort the command if the receiving system doesn’t have the required capacity. In this case, the snapshot is so small that the transfer is almost instantaneous. I can verify it worked by checking the receiving system: 

ssh 10.0.2.15 
ls /backups/ 
test 
ls /backups/test 

The /test location was created automatically and the second listing displays the contents of /etc/, the original source of the dataset. 

Let’s try replicating a larger snapshot. Note that I specify a different location on the destination pool to hold this snapshot of dru’s home directory: 

zfs snapshot tank/usr/home/dru@homedir 
zfs send -v tank/usr/home/dru@homedir | ssh 10.0.2.15 zfs receive backups/dru 
full send of tank/usr/home/dru@homedir estimated size is 3.13G 
total estimated size is 3.13G 
TIME		SENT		SNAPSHOT 
07:47:14	3.37M		tank/usr/home/dru@homedir 
<SNIP> 
08:09:53	3.15G		tank/usr/home/dru@homedir 

This command perked along, indicating progress until the transfer successfully completed. Over this network, it took 22 minutes to send just over 3GB worth of data. Your transfer times will vary, so note them over a variety of transfer sizes and network activity to estimate your own baseline. 

If you’re impatient and try to list the contents of /backups/dru/ before the transfer completes, you’ll find that the destination doesn’t exist until the transfer is finished. Once the transfer is complete, a listing of /backups/dru/ should look just like her home directory. 

Incremental Replication 

Once an initial replication is complete, you can test sending an incremental snapshot. Let’s add a few new files to dru’s home directory and list the differences from the snapshot: 

cp /var/log/messages* /usr/home/dru/ 
zfs diff tank/usr/home/dru@homedir 
+	/usr/home/dru/messages 
+	/usr/home/dru/messages.0.bz2 
+	/usr/home/dru/messages.1.bz2 
+	/usr/home/dru/messages.2.bz2 

Let’s take a new snapshot that includes the 4 newly added files: 

zfs snap tank/usr/home/dru@homedir-mod 

To replicate these differences, add the incremental switch (-i) and specify the names of the two snapshots. This command requires the first snapshot to already exist in the specified destination on the receiving system: 

zfs send -vi tank/usr/home/dru@homedir tank/usr/home/dru@homedir-mod | ssh 10.0.2.15 zfs receive backups/dru 
send from @homedir to tank/usr/home/dru@homedir-mod estimated size is 24.1M 
TIME		SENT		SNAPSHOT 
10:34:23	3.50M		tank/usr/home/dru@homedir-mod 
<SNIP> 
10:34:32	24.1M		tank/usr/home/dru@homedir-mod 

While the initial replication took 22 minutes, the incremental replication took 9 seconds. 

If I ssh into the receiving system and do a listing, the 4 added files appear in /backups/dru/. 
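That check can also be done in one step from the sending system: 

ssh 10.0.2.15 ls /backups/dru/ 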

Note: 

If you receive this error when performing an incremental replication: 

cannot receive incremental stream: destination backups/dru has been modified since most recent snapshot 

warning: cannot send 'tank/usr/home/dru@homedir-mod': signal received 

it means that ZFS has determined that replicated data has changed on the destination. Since the snapshots on the sending and receiving systems are no longer identical, ZFS aborts the replication. If the dataset on the receiving system has accidentally changed since the last snapshot, you can use the zfs rollback command there to revert the changes. If you want to overwrite the data changes on the receiving system, use receive -F in the command to force the receiving system to roll back to the state of the last received snapshot so the systems are again in sync. 
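As a sketch, those two recovery paths look like this, using this example’s snapshot names. The rollback runs on the receiving system (here via ssh) and reverts backups/dru to the most recent snapshot it holds, after which the incremental send can simply be retried; alternatively, -F performs the rollback as part of the receive: 

ssh 10.0.2.15 zfs rollback backups/dru@homedir 

zfs send -vi tank/usr/home/dru@homedir tank/usr/home/dru@homedir-mod | ssh 10.0.2.15 zfs receive -F backups/dru 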

Conclusion

This article should get you started replicating data between systems. We recommend that you start with small amounts of data to get a better understanding of replication times within your own environment. 
