Klara

Every year, World Backup Day reminds us of the importance of safeguarding our digital assets. In an era where data breaches, hardware failures, and ransomware attacks are all too common, choosing an effective and dependable backup strategy can make all the difference. Designing the right disaster recovery system requires that we first understand our requirements and constraints. 

Requirements 

Objectives 

The two most critical performance metrics of a backup system are the RTO (Recovery Time Objective) and RPO (Recovery Point Objective). As described in detail in our article Achieving RPO/RTO Objectives with ZFS, the RTO is the time required to restore from a backup, while the RPO is how much time has passed since the last successfully completed backup. Both are critical factors, each influenced by: 

  • the volume of data to backup or restore,  
  • the impact of backups on system performance,  
  • available network bandwidth, and  
  • the presence of standby hardware. 

Consistency 

One of the most critical factors in achieving a short RPO is being able to take a consistent backup. Unlike the batch processing systems of the past, most critical data infrastructure does not have a long period of idle time that can be used to back up all of the data without any changes occurring during the process. This is where features like filesystem or VM snapshots are invaluable—providing an inexpensive, consistent point-in-time view of a live system. Taking a backup where all the data is guaranteed to be coherent and unchanged ensures that the backup will contain usable data—not broken dreams. 
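As a minimal sketch of this approach—the dataset name is hypothetical, and the command is guarded so it only runs where ZFS is installed—a snapshot can be taken atomically and then used as the immutable source for the backup job:

```shell
# Dataset name is hypothetical; substitute your own pool/dataset.
DATASET=${DATASET:-tank/vmdata}
SNAP="$DATASET@backup-$(date +%Y-%m-%d-%H%M%S)"

if command -v zfs >/dev/null 2>&1; then
    # The snapshot is atomic: every block is captured at the same
    # transaction group, so the backup source is globally consistent.
    zfs snapshot "$SNAP"
    # Back up from the read-only snapshot directory, not the live files.
    echo "backup source: /$DATASET/.zfs/snapshot/${SNAP#*@}"
else
    echo "zfs not installed; would run: zfs snapshot $SNAP"
fi
```

The backup tool then reads from the hidden `.zfs/snapshot` directory, which never changes while the job runs.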

Data Integrity 

The only thing worse than not having a backup is having a backup that is corrupt or compromised. A backup system must ensure both data integrity and authenticity. That means the backup system needs to protect the data from bit-flips and other hardware-induced errors, as well as from malicious modification. 

Scale 

As data volumes continue to grow exponentially, the need for a scalable backup system becomes ever more pressing. ZFS is designed to avoid the limitations of traditional filesystems, allowing it to scale to the maximum extent of your available hardware. Its ability to manage vast amounts of data without degradation in performance or reliability means that ZFS won’t let you down as your data volume continues to grow. 

ZFS creates dynamic storage pools, allowing administrators to add new storage devices to an existing pool without the need for complex reconfiguration or downtime. With this flexibility, you can easily expand backup storage as your data grows—ensuring that your backup solution remains robust and relevant in the long term. 

Affordability 

In order to provide backups for the immense volumes of data managed by modern systems, the solution must offer a reasonable cost per terabyte stored. Expensive software licenses, restrictive contract terms, vendor lock-in, and other issues common in the industry drive up the price of storage systems and make backups less affordable. 

Maintainability 

A successfully deployed disaster recovery plan must also account for the ongoing upkeep of the system. Monitoring is a critical component, not only to ensure that backups continue to complete successfully, but also to verify that data volumes match expectations, that no important data is being skipped, and that nothing unimportant is filling up your storage. 
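A minimal monitoring sketch, assuming snapshots are the unit of backup and using a hypothetical dataset name and threshold, might check the age of the newest snapshot and warn when it exceeds the expected backup interval:

```shell
# Sketch of a monitoring check: warn if the newest backup snapshot is
# older than a threshold. Dataset name and threshold are hypothetical.
DATASET=${DATASET:-tank/backups}
MAX_AGE_SECONDS=$((24 * 60 * 60))   # expect at least one backup per day

now=$(date +%s)
if command -v zfs >/dev/null 2>&1; then
    # Creation time (unix timestamp) of the newest snapshot of $DATASET.
    last=$(zfs list -H -p -t snapshot -o creation -S creation -d 1 "$DATASET" | head -n 1)
else
    last=$((now - 3600))   # stand-in value when zfs is unavailable
fi

age=$((now - last))
if [ "$age" -gt "$MAX_AGE_SECONDS" ]; then
    echo "WARNING: last backup of $DATASET is ${age}s old"
else
    echo "OK: last backup of $DATASET is ${age}s old"
fi
```

A check like this, wired into cron or an existing monitoring system, catches the quiet failure mode where backup jobs simply stop running.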

Backup Solutions 

Now that we have established the criteria our candidate backup solutions must meet, we can start to evaluate how each stacks up. 

Standard Offerings 

Standard commercial backup solutions—whether hardware- or software-based—have a number of drawbacks. When it comes to RTO and RPO, the biggest factor is often how much time it takes to determine which data has changed. The RTO is often limited by the throughput that can be achieved when restoring files from the backup media. Backup software that must scan each individual file and inspect its modification time and contents to decide whether it has changed will have an RPO limited by the amount of time such scanning takes. One way to measure this is by timing a “null” backup—how long it takes to back up a system when nothing has changed since the previous backup. 
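As a rough illustration—the paths are hypothetical placeholders, and a real measurement would time the backup tool itself—the cost of change detection can be approximated by timing a metadata walk for files newer than a marker left by the previous run:

```shell
# Approximate the scan cost of file-based backup software: time a walk
# that looks for files modified since the previous run's marker file.
# Both paths here are hypothetical placeholders.
BACKUP_ROOT=${BACKUP_ROOT:-/var/www}
MARKER=${MARKER:-/tmp/.last-backup-run}

touch "$MARKER"                      # stand-in for "previous backup finished"
start=$(date +%s)
changed=$(find "$BACKUP_ROOT" -type f -newer "$MARKER" 2>/dev/null | wc -l | tr -d ' ')
end=$(date +%s)
echo "null-backup scan found $changed changed files in $((end - start))s"
```

On a filesystem with millions of files, even this metadata-only walk can take minutes to hours, and that time is a floor on the achievable RPO.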

Any solution that operates on a live system, rather than an immutable snapshot, cannot achieve a globally consistent backup. While technologies like the Volume Shadow Copy Service (VSS) allow backup software to capture a consistent backup of individual files, there is no consistency across files. 

 

| Criteria | Score | Comments |
| --- | --- | --- |
| RTO | 4/5 | Limited by the throughput and latency of the media |
| RPO | 2/5 | Limited by how long each backup takes |
| Consistency | 1/5 | At best consistent only per-file, not per-filesystem |
| Data Integrity | 3/5 | Only some backup solutions create and verify checksums |
| Authenticity | 2/5 | Most solutions rely on disk encryption without authentication |
| Scale | 2/5 | As scale increases, RTO and RPO are more difficult to achieve |
| Affordability | 1/5 | Expensive licenses that scale up based on storage volume |
| Maintainability | 4/5 | Commercial solutions often offer slick UIs |

OpenZFS 

An advanced filesystem available for Linux, FreeBSD, macOS, and even Windows, OpenZFS can be the cornerstone of an unparalleled disaster recovery system. 

Because ZFS is itself a filesystem, it can minimize your RTO by allowing recovered systems to be provisioned directly from the backup data using its writable-clones feature. Rather than waiting for data to be restored to the replacement system, a virtual machine can be spun up from a copy-on-write clone of the backup and be back online within seconds. When restoring to another system that also uses ZFS, the replication feature maximizes throughput by sending a stream instead of individual files, removing the latency typically experienced when transferring many small files. 
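The two recovery paths can be sketched as follows; the pool, dataset, snapshot, and host names are all hypothetical, and the commands are guarded so the sketch only runs where ZFS is present:

```shell
# Hypothetical names throughout; adjust for your environment.
SRC="backup/vm-images@nightly-2025-03-31"
CLONE="backup/restore-vm01"

if command -v zfs >/dev/null 2>&1; then
    # Path 1: instant recovery. A writable copy-on-write clone of the
    # backup snapshot is usable immediately, with no data copied.
    zfs clone "$SRC" "$CLONE"

    # Path 2: full restore to another ZFS host. A replication stream
    # sends whole blocks instead of walking individual files.
    zfs send "$SRC" | ssh standby-host zfs receive -F tank/vm-images
else
    echo "zfs not installed; the recovery commands would be:"
    echo "  zfs clone $SRC $CLONE"
    echo "  zfs send $SRC | ssh standby-host zfs receive -F tank/vm-images"
fi
```

The clone in path 1 shares all of its blocks with the snapshot, which is why it is available in seconds regardless of how large the backup is.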

On the RPO and consistency fronts, ZFS is the hands-down winner with its instantaneous, transactionally consistent snapshots. You can take consistent snapshots many times per minute if necessary and maintain tens of thousands of snapshots with no read or write performance penalty. Any data that has been written synchronously is guaranteed to be included in the snapshot if the fsync() call completed before the snapshot was taken. 

One of the biggest advantages of ZFS is its ability to perform an online integrity check, called a scrub. Every block of data written to ZFS has its checksum recorded, and every time the data is read (during normal operations, replication, or a restore) the checksum is verified. If there is an error, ZFS will attempt to self-heal using additional copies or parity. A regularly scheduled scrub will verify the integrity of not only all data and metadata, but also parity, ensuring data remains intact and detecting any early signs of bit rot or other degradation while they can still be repaired. Compared to tape-based backups, where a verify operation means the tape drive is busy, a scrub can be performed while regular backup jobs continue running. 
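Starting a scrub is a one-line operation; the pool name below is hypothetical, and the commands are guarded for systems without ZFS:

```shell
POOL=${POOL:-tank}    # hypothetical pool name

if command -v zpool >/dev/null 2>&1; then
    zpool scrub "$POOL"       # runs online; backup jobs keep working
    zpool status -v "$POOL"   # shows scrub progress and any errors found
else
    echo "zpool not installed; would run: zpool scrub $POOL"
fi
```

Many distributions ship a cron job or timer that runs exactly this on a monthly schedule.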

When using ZFS encryption, the 256-bit checksum field is split in two. The first 128 bits store the checksum of the data as it is written to disk (ciphertext), while the second 128 bits are a Message Authentication Code (MAC) of the plaintext data. This cryptographic construct signs each block of data, ensuring that only someone with the encryption key could have written this data and it has not been modified. 

As a filesystem, ZFS was designed to scale beyond the limits of today’s hardware. Reaching the limits of what ZFS supports would require enough electricity to literally boil all the water in the oceans. Beyond supporting as much storage as you can connect to it, ZFS also provides a range of features to reduce storage requirements.  

First, ZFS supports transparent compression, including ZStandard, a best-in-class compression algorithm. It also includes zero elision, which suppresses blocks of empty data, and nop-write, which avoids overwriting unchanged blocks. 

Additionally, ZFS’s new Fast Deduplication feature uses strong checksums (already maintained by ZFS) to detect duplicate blocks and share their backing storage. Finally, the new BRT (Block Reference Table) feature allows applications or backup software to explicitly clone parts of a file. This allows backup software to perform server-side copy offloading, as well as to construct synthetic full backups: by cloning the unmodified parts of a file between a full and an incremental backup, a new full backup can be created without recopying data from the machine being backed up. 
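A sketch of these space-saving features in practice—the dataset and file names are hypothetical, block cloning requires an OpenZFS release with the feature enabled, and `cp --reflink` is the GNU coreutils interface to copy offload:

```shell
if command -v zfs >/dev/null 2>&1; then
    # Hypothetical dataset; both properties apply to newly written data.
    zfs set compression=zstd tank/backups   # transparent ZStandard compression
    zfs set dedup=on tank/backups           # checksum-based deduplication
fi

# Block cloning (BRT) is reachable through ordinary copy offload: on a
# filesystem with cloning support, the copy shares the source's blocks.
if cp --reflink=always full-backup.img synthetic-full.img 2>/dev/null; then
    result="cloned without copying data"
else
    result="reflink clone not available here"
fi
echo "$result"
```

Backup software can use the same clone operation to stitch a new synthetic full image out of an older full plus incrementals, consuming space only for the changed blocks.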

 

| Criteria | Score | Comments |
| --- | --- | --- |
| RTO | 5/5 | Limited only by throughput of the media. Directly deployable. |
| RPO | 5/5 | Unlimited instantaneous snapshots |
| Consistency | 5/5 | Immutable snapshots provide full-system consistency |
| Data Integrity | 5/5 | End-to-end integrity verification with 256-bit checksums |
| Authenticity | 5/5 | Encrypted data protected by 128-bit AES-GMAC |
| Scale | 5/5 | Full scale-up, many active 10+ PiB deployments. Features like transparent compression and BRT conserve storage space. |
| Affordability | 5/5 | Open-source solutions have no license fees; cost per TiB is purely the price of commodity hardware |
| Maintainability | 4/5 | ZFS-based backup solutions benefit from access to expertise, such as the Klara OpenZFS Support Subscription |

Other Factors 

Community and Ecosystem 

Another advantage of using ZFS is the vibrant community and robust ecosystem that surround it. Over the years, ZFS has built a dedicated following among open source enthusiasts and professionals, as well as appliance vendors and software products. 

This active community means that users have access to a wealth of shared knowledge, troubleshooting advice, and regular updates that keep the filesystem at the cutting edge of technology. Klara actively collaborates with the community to design and implement new features, such as the recent Fast Dedup feature. This broad support base ensures that ZFS continues to evolve and adapt, incorporating new features and improvements that enhance its functionality as a backup solution and solidify its place as one of the most advanced backup platforms available.

Looking Ahead: The Future of Backup Solutions with ZFS 

As we celebrate World Backup Day 2025, it’s clear that ZFS represents more than just a storage solution—it’s a philosophy of data stewardship. With its focus on data integrity, efficient storage management, advanced redundancy, and robust security features, ZFS is well-positioned to meet the evolving challenges of data backup in an increasingly digital world. 

Organizations that adopt ZFS for their backup strategy are not only investing in a technology that safeguards their data today but also in a platform built for future growth. The combination of scalable architecture, resilient design, and a supportive community makes ZFS a compelling choice for anyone serious about protecting their digital assets. 

Conclusion 

World Backup Day is an all too timely reminder of the importance of safeguarding our digital assets. As our digital footprints continue to expand, ensuring the safety and integrity of our data becomes more crucial than ever. ZFS is more than just a file system—it’s a commitment to the resilience and reliability that modern data backup demands, offering robust backup solutions to keep your information secure.

Happy World Backup Day 2025! Here’s to a future where every byte of data is safe, secure, and ready for recovery when you need it most. 

 

 
