
Klara

ZFS offers powerful scalability through its virtual device (VDEV) model, allowing administrators to build pools that balance performance, redundancy, and capacity across diverse workloads. However, as systems grow, an important question arises: how far can you push VDEV scaling before performance degradation, redundancy limits, or administrative burdens set in? 

This guide explores the practical boundaries of VDEV count, separating misconception from reality. Based on production deployments and operational experience, it lays out the considerations behind designing pools with high VDEV counts and offers strategies for scaling cleanly. 

What Is a VDEV? 

A VDEV, or virtual device, is the fundamental unit of storage aggregation in ZFS. It consists of one or more physical disks grouped together to provide redundancy and performance characteristics. VDEVs are not used individually but instead combined into a storage pool, where ZFS stripes data across them. Importantly, redundancy is handled at the VDEV level, not across VDEVs. This means that if a single VDEV fails entirely, the entire pool becomes unavailable. 

This architectural choice means that the configuration of each VDEV directly influences the durability and performance of the pool. VDEVs are the atomic failure domains in ZFS and must be treated as such in any serious deployment. 

Types of VDEV  

There are several types of VDEVs available in ZFS, each suited for different performance and fault tolerance requirements.  

  • Mirror VDEVs consist of two or more drives that replicate data for redundancy.  
  • RAID-Z configurations (RAID-Z1, Z2, Z3) offer single, double, or triple parity across disks within each VDEV, allowing for varying levels of fault tolerance and usable capacity.  
  • DRAID, or distributed RAID, is designed to simplify redundancy management and improve resilver speed in very large pools.  

 Additionally, ZFS supports special VDEVs, typically used for metadata or small block workloads, and dedicated log and cache devices for ZIL and L2ARC functionality, respectively. A more detailed overview of VDEV types is available in our article OpenZFS – Understanding ZFS VDEV Types. 

Why VDEVs Matter 

The design and quantity of VDEVs in a pool determine the pool’s performance ceiling, fault tolerance, and recovery characteristics. ZFS treats each VDEV as an independent unit, with its own read and write queues and healing logic; adding VDEVs increases the pool’s ability to parallelize operations.  

At the same time, each VDEV represents a distinct failure domain. A misconfigured VDEV can jeopardize the pool, as the loss of any whole VDEV makes the pool unrecoverable, but a well-planned layout can offer precise fault isolation, better rebuild characteristics, and cleaner operational behavior under load. 

Why More VDEVs Can Be a Good Thing 

ZFS encourages modularity and parallelism, and the structure of VDEVs reflects that design. Instead of building a single monolithic RAID group, administrators can construct pools from multiple independent VDEVs. This structure allows ZFS to distribute I/O across many devices, making scaling a practical and often beneficial choice. This section explains why adding more VDEVs tends to improve performance, growth flexibility, and failure isolation: 

Parallelism and Performance 

Adding more VDEVs to a pool increases the number of independent I/O paths available to ZFS. Each VDEV can handle its own set of read and write operations, allowing the filesystem to distribute I/O across them in parallel, which results in an increase in IOPS rather than just additional I/O queues.  

A single RAID-Z VDEV delivers roughly the random IOPS of one disk, since all disks within that VDEV participate in each operation. For example, a pool of 24 disks configured as two 12-wide RAID-Z2 VDEVs will provide about the IOPS of two disks. Reconfiguring the same disks as four 6-wide RAID-Z2 VDEVs doubles that to the IOPS of four disks. This improvement is particularly noticeable for random-access workloads such as databases, VM storage, and metadata-heavy applications.

Performance tends to scale predictably with the number of VDEVs until bottlenecks shift to other parts of the system, such as the CPU, storage controllers, or network interfaces.
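The 24-disk comparison above can be sketched as a back-of-the-envelope calculation. The figure of 250 random IOPS per disk is an illustrative assumption for a spinning drive, not a number from this article:

```python
# Rough random-IOPS estimate for RAID-Z layouts: per the rule of thumb
# above, each RAID-Z VDEV delivers roughly the random IOPS of a single
# member disk, and independent VDEVs handle I/O in parallel.

def pool_raidz_iops(total_disks, vdev_width, per_disk_iops):
    """Estimate pool random IOPS as one disk's worth per RAID-Z VDEV."""
    vdev_count = total_disks // vdev_width
    return vdev_count * per_disk_iops

PER_DISK_IOPS = 250  # illustrative figure for a 7200 RPM HDD

print(pool_raidz_iops(24, 12, PER_DISK_IOPS))  # two 12-wide VDEVs -> 500
print(pool_raidz_iops(24, 6, PER_DISK_IOPS))   # four 6-wide VDEVs -> 1000
```

Halving the VDEV width doubles the estimated random IOPS from the same 24 disks, which is the core of the parallelism argument.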

Faster Pool Expansion 

More VDEVs also mean more capacity. Unlike monolithic RAID arrays, ZFS allows administrators to incrementally expand storage by adding new VDEVs to an existing pool. Each new VDEV brings not only space but also bandwidth. This approach facilitates predictable and manageable growth. Mirror VDEVs, in particular, scale extremely well and are the easiest to add without disrupting the rest of the pool, making them ideal for steady scaling. 

Fault Isolation 

ZFS defines redundancy and failure domain at the VDEV level. The health of the entire pool depends on each VDEV remaining operational, but the fault tolerance within each VDEV is independent. A RAID-Z2 VDEV, for example, can sustain the loss of up to two disks. If a pool consists of four RAID-Z2 VDEVs, each one can independently lose two disks and continue operating. 

This distinction matters when considering total risk.  A pool built from a single large RAID-Z2 VDEV can survive only two disk failures in total. The same number of disks arranged in multiple smaller RAID-Z2 VDEVs can tolerate more simultaneous disk failures overall, provided those failures occur in different VDEVs. 
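The failure arithmetic above can be expressed as a minimal calculation. This sketch counts disks and parity only; it deliberately ignores which physical disks share a backplane or enclosure:

```python
# Disk-failure tolerance of a pool built from RAID-Z VDEVs: the pool
# survives as long as no single VDEV loses more disks than its parity.

def failure_tolerance(vdev_count, parity=2):
    worst_case = parity              # one more failure in any single VDEV is fatal
    best_case = vdev_count * parity  # failures spread evenly across VDEVs
    return worst_case, best_case

print(failure_tolerance(1))  # one wide RAID-Z2 VDEV: (2, 2)
print(failure_tolerance(4))  # four RAID-Z2 VDEVs:    (2, 8)
```

The worst case never improves with more VDEVs, but the best case grows linearly, which is why compartmentalizing failures across VDEVs reduces total risk.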

This isolation enables administrators to design large-scale pools with risk compartmentalized by chassis, rack, or workload. In mirror-heavy deployments, this principle allows for graceful degradation in the face of partial hardware failure, provided the failure domains are constructed with awareness of shared components and dependencies. 

Does More Become Too Much? 

In forums like the ZFS subreddit and various administrator communities, a common question recurs: is there a point where adding more VDEVs becomes a problem? This concern is usually expressed in vague terms, backed by anecdote or rule-of-thumb warnings that rarely specify context. Older Solaris deployments often cited operational concerns starting around 200 to 300 VDEVs. FreeBSD and OpenZFS do not impose stricter technical limits, and modern systems with large amounts of memory and high-core-count CPUs can comfortably drive much larger VDEV counts.

It is true that poorly designed large pools can exhibit fragility or degraded performance. However, most of these problems are not inherent to ZFS’s scalability model but arise from design shortcuts, underprovisioned hardware, or misaligned workloads.  This section addresses those concerns directly and clarifies what actually happens when you scale into dozens or hundreds of VDEVs. 

Provisioning Complexity, Not Ongoing Overhead 

While higher VDEV counts may appear to introduce operational risk, the reality in most production environments is that the complexity is front-loaded. The planning and provisioning phase requires careful attention, particularly in designing and aligning with physical failure domains and selecting VDEV types. However, once deployed, pools with high VDEV counts are not inherently more difficult to manage.  

Many of our customers operate pools with more than 200 mirror VDEVs, especially in high-IOPS environments, without requiring ongoing tuning or intervention. The administrative burden is determined by design quality, not VDEV count.

Import and Boot Considerations 

Another misconception is the assumption that large pools slow down boot times. This ignores the fact that pool import duration is not strictly a function of VDEV count. Rather, the total number of drives, the structure of VDEVs, and the presence of degraded or faulted components influence boot-time import duration.  

In high-availability environments, it is essential to measure and test import behavior under both normal and degraded scenarios. However, scaling VDEVs alone does not introduce unacceptable delays if hardware and configuration are well balanced. 

Parity Inefficiencies 

Another concern centers around RAID-Z pools composed of many small VDEVs, where parity overhead accumulates. In practice, the main drawback is not performance but space efficiency. Each RAID-Z VDEV incurs parity overhead, and when many small VDEVs are used, this overhead adds up. The result is a lower usable capacity compared to fewer, wider VDEVs using the same number of disks. The issue of parity efficiency at larger scales (100+ HDDs) is addressed by the design of DRAID, although it comes with its own trade-offs. 
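The space-efficiency trade-off can be illustrated with a simple raw-capacity calculation. This sketch ignores allocation padding and metadata overhead, which vary with ashift and recordsize, so real usable space will be somewhat lower:

```python
# Parity overhead for 24 disks split into RAID-Z2 VDEVs of varying width.

def usable_fraction(vdev_width, parity=2):
    """Fraction of raw space left after parity in one RAID-Z VDEV."""
    return (vdev_width - parity) / vdev_width

for width in (4, 6, 8, 12):
    vdev_count = 24 // width
    print(f"{vdev_count} x {width}-wide RAID-Z2: "
          f"{usable_fraction(width):.0%} usable")
# Narrower VDEVs buy more fault domains and IOPS but pay more parity:
# six 4-wide VDEVs leave 50% usable, two 12-wide VDEVs about 83%.
```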

However, concerns about degraded rebuild speed are largely misplaced. ZFS resilvering can skip entirely healthy VDEVs, focusing only on those with damaged or missing devices. This behavior means that pools with many small VDEVs may actually see faster recovery times in degraded scenarios, especially when failures are isolated. While zpool status reports progress as if the entire pool were being scanned, data on healthy VDEVs is skipped during a resilver to speed up the process. This can result in improbable figures in the resilver status (scanning at 100 GB/sec), but this is working as intended.

The tradeoff is therefore not between performance and resilience, but between space efficiency and the ability to scale out fault domains more granularly. When choosing RAID-Z widths, administrators must weigh usable capacity against deployment flexibility and recovery locality. 

Smarter Ways to Scale 

Knowing that ZFS can scale across many VDEVs is only useful if it informs a strategy that fits your hardware, workload, and fault model. Reckless scaling can lead to inefficiencies and operational fragility, while conservative planning can underutilize the platform’s capabilities. Smarter scaling means understanding which VDEV types fit your environment, when to switch strategies, and how to measure the system’s behavior under growth:  

Use Mirrors When IOPS and Resilver Speed Matter 

For latency-sensitive environments and high-IOPS applications, mirror VDEVs remain the most performant option. Each mirror provides an independent I/O path, enabling concurrency and minimal seek penalties. They also allow for the fastest possible resilver behavior, since data can be rebuilt in parallel and the recovery process is not constrained by parity calculations.  

That said, this performance comes at a cost. Mirrors offer the lowest usable capacity per disk and introduce risk if fault domains are poorly designed. A double disk failure within a single mirror group results in catastrophic data loss. For this reason, careful physical separation of mirror pairs is essential, particularly in systems with shared backplanes or enclosures. 
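A minimal sketch of the mirror trade-off, using illustrative per-disk figures (250 random IOPS and 10 TB per disk, both assumptions rather than numbers from this article):

```python
# Two-way mirrors: each pair is an independent VDEV, so write IOPS scale
# with the number of pairs, while usable capacity is half the raw space.
# Reads can do better still, since either side of a mirror can serve them.

def mirror_pool(total_disks, per_disk_iops, disk_tb):
    pairs = total_disks // 2
    return pairs * per_disk_iops, pairs * disk_tb  # (write IOPS, usable TB)

iops, capacity = mirror_pool(24, 250, 10)
print(iops, capacity)  # 3000 write IOPS and 120 TB usable from 240 TB raw
```

Twelve independent mirror VDEVs give far more random IOPS than any RAID-Z arrangement of the same 24 disks, at the cost of surrendering half the raw capacity.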

Adopt DRAID Only at Large Scale 

DRAID was introduced to address the resilver limitations of traditional RAID-Z in large-scale deployments. By pre-allocating spare capacity and distributing redundancy metadata across the pool, DRAID can drastically reduce recovery times after a disk failure. However, these benefits only materialize at a significant scale.  

Pools with fewer than 60 drives rarely experience resilver bottlenecks severe enough to justify DRAID's added complexity. Administrators should be cautious before adopting it in small environments, as the trade-offs, such as comparatively less predictable performance and more involved configuration, can overshadow the gains. DRAID is most useful in hyperscale or petabyte-class systems where resilver time becomes the primary design constraint. 

Monitor Key Metrics as You Scale 

 Regardless of VDEV type, the importance of observability increases with scale. Once a pool exceeds a few dozen VDEVs, monitoring becomes critical. Key metrics such as import time, scrub duration, and resilver throughput provide early warnings when systemic stress begins to appear. These values should be tracked consistently and benchmarked against known-good baselines established during initial deployment. Changes in these metrics can point to hardware degradation, unbalanced workloads, or silent configuration drift. Effective monitoring tools and alert thresholds make it possible to grow without losing control over operational reliability. 

Split Workloads Across Pools 

Finally, it is worth recognizing that a single massive pool is not always the best architectural choice. In some environments, especially those with mixed physical drives, workloads, or divergent performance goals, distributing storage across multiple pools can offer distinct advantages.  

This approach allows administrators to isolate fault domains, apply different redundancy models, and manage performance independently for each tier. It also simplifies testing, maintenance, and migrations, since each pool can be operated on in parallel. While this model introduces its own overhead in terms of pool coordination, it can provide the modularity needed to scale cleanly while preserving administrative simplicity. 

Wrapping Up 

There are many considerations when designing a storage pool to maximize performance and resilience. Having too many VDEVs is rarely a real concern; instead, the major factors are: 

  • Too few VDEVs may not provide the IOPS required 
  • Too wide VDEVs may take too long to rebuild 
  • Too wide VDEVs will be inefficient at small-block workloads 
  • Too narrow VDEVs will have too high a parity cost proportional to usable storage 
  • The wrong type of VDEV for a workload will not perform as well 

With these factors in mind, a pool can be designed to meet almost any challenging workload; the key is to match the tradeoffs of the different VDEV types with the tradeoffs that suit your workload. Expert advice from the team at Klara via our ZFS Storage Design service will ensure you make the right choices and are well served by your storage in the long term.
