ZFS provides strong guarantees for integrity, scalability, and manageability. Its storage stack integrates checksumming, copy-on-write, snapshots, and advanced caching in a way that traditional filesystems do not. With this sophistication comes temptation. The system exposes numerous tunables that appear to offer direct levers for performance improvement. Administrators frequently encounter blog posts or forum discussions that promise huge throughput gains from a handful of parameter changes, and the arrival of LLMs has only widened the appetite for quick adjustments.
The reality is more nuanced: ZFS performance tuning is both workload-specific and environment-dependent. The defaults are selected to provide robust performance across a wide range of deployments, though not for every case. In production systems, careless tuning can reduce stability or degrade performance in subtle ways. At the same time, a deeper understanding of the Adaptive Replacement Cache (ARC), the Level 2 ARC (L2ARC), and the Separate Log Device (SLOG) can help you evaluate whether tuning is appropriate for your workload.
This article examines these core performance components and considers the role of tunables and the risks of over-adjustment.
The Adaptive Replacement Cache (ARC)
At the heart of ZFS caching lies the ARC. Unlike traditional least-recently-used (LRU) caches, the ARC implements an adaptive strategy that balances recency and frequency of access. Internally, it maintains four lists:
- Most Recently Used (MRU): caches data accessed only once or infrequently.
- Most Frequently Used (MFU): caches data accessed repeatedly.
- Ghost MRU (GMRU): tracks metadata about recently evicted MRU entries.
- Ghost MFU (GMFU): tracks metadata about recently evicted MFU entries.
The ghost lists do not contain data blocks, but they provide feedback to the cache algorithm. If a workload begins re-accessing data that had been evicted, the ARC can respond by adjusting how much memory is assigned to recency or frequency. This should cause future accesses to result in a cache hit, instead of a cache miss. This self-tuning behavior makes ARC highly efficient across mixed workloads without manual adjustment. For more details, see Klara’s other article about the self-tuning of the ARC.
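To make the ghost-list feedback concrete, here is a toy Python sketch. It is much simpler than the real ARC, which sizes its lists in bytes rather than entries and has additional states, but it shows the core idea: a hit in a ghost list nudges the target split between the recency and frequency sides.

```python
from collections import OrderedDict

class ToyAdaptiveCache:
    """Toy model of ARC's recency/frequency balancing -- not the real
    algorithm, which tracks sizes in bytes and has more bookkeeping."""

    def __init__(self, capacity):
        self.c = capacity
        self.p = capacity // 2                 # target size of the recency side
        self.mru, self.mfu = OrderedDict(), OrderedDict()    # cached entries
        self.gmru, self.gmfu = OrderedDict(), OrderedDict()  # ghosts: keys only

    def access(self, key):
        if key in self.mru:                    # second hit: promote to MFU
            del self.mru[key]
            self.mfu[key] = True
            return "hit"
        if key in self.mfu:
            self.mfu.move_to_end(key)
            return "hit"
        if key in self.gmru:                   # recency side evicted too soon,
            self.p = min(self.c, self.p + 1)   # so grow its target
            del self.gmru[key]
        elif key in self.gmfu:                 # frequency side evicted too soon,
            self.p = max(0, self.p - 1)        # so shrink the recency target
            del self.gmfu[key]
        self._evict_to_fit()
        self.mru[key] = True
        return "miss"

    def _evict_to_fit(self):
        while len(self.mru) + len(self.mfu) >= self.c:
            if self.mru and (len(self.mru) > self.p or not self.mfu):
                k, _ = self.mru.popitem(last=False)   # evict, remember as ghost
                self.gmru[k] = True
            else:
                k, _ = self.mfu.popitem(last=False)
                self.gmfu[k] = True
        for ghost in (self.gmru, self.gmfu):          # bound ghost memory too
            while len(ghost) > self.c:
                ghost.popitem(last=False)
```

Re-accessing keys that were recently evicted from one side shifts `p`, so future accesses with the same pattern hit instead of miss.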
Memory Interactions
ARC resides in system memory and competes with applications and the kernel for RAM. On FreeBSD, the ARC grows until it approaches system memory pressure, at which point it releases memory. The Linux port originally implemented a fixed target size, but more recent versions have integrated ARC shrinkage into the kernel’s memory reclamation framework.
The ARC caches both data and metadata. Metadata caching is critical for workloads with small files or heavy directory traversal. In practice, you must be aware that the ARC may occupy a large fraction of system RAM, which can surprise operators unfamiliar with ZFS. This is not memory leakage but deliberate design.
Observability with arcstat and arc_summary
Two tools are commonly used for visibility into ARC behavior: arcstat and arc_summary.
- arcstat provides a live view of hit ratios, cache size, demand reads, prefetch reads, and hit/miss statistics. A high hit ratio suggests effective caching, while a persistently low ratio indicates a workload that exceeds available memory or does not benefit from caching.
- arc_summary offers a snapshot of ARC state, including target sizes, MFU/MRU distribution, and metadata ratios. It helps in understanding whether a workload is recency or frequency biased.
If you are evaluating performance problems, you should examine these outputs before considering tunable changes. If ARC is delivering strong hit ratios, adding an L2ARC or altering tunables may provide little benefit.
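The arithmetic behind the headline number is simple. This hypothetical helper mirrors how the hit percentage is derived from the raw hit/miss counters; the counter names follow OpenZFS's arcstats kstat, but the values are made up:

```python
def arc_hit_ratio(hits, misses):
    """Hit percentage, in the style of arcstat's hit-rate column."""
    total = hits + misses
    return 100.0 * hits / total if total else 0.0

# Made-up counters in the shape of /proc/spl/kstat/zfs/arcstats entries:
stats = {"hits": 981_234, "misses": 43_210}
print(f"ARC hit ratio: {arc_hit_ratio(stats['hits'], stats['misses']):.1f}%")
# -> ARC hit ratio: 95.8%
```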
ARC Tunables
ARC exposes parameters through vfs.zfs.arc_* sysctls (FreeBSD) or /sys/module/zfs/parameters (Linux). Important tunables include:
- arc_max and arc_min: define upper and lower bounds for ARC size. Adjusting arc_max can be useful in environments where ZFS should not grow to consume nearly all RAM, such as database servers where the application also relies heavily on caching.
- arc_meta_limit: controls the maximum memory dedicated to metadata.
- arc_meta_balance: controls the balance of how ghost hits impact the mix of data and metadata.
The general recommendation is to avoid tuning arc_meta_limit and arc_meta_balance unless there is a clear conflict between ZFS caching and application-level caching. The defaults work well in the vast majority of deployments.
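Capping the ARC on a host shared with a database is just arithmetic on total RAM. The helper below is a hypothetical sketch that yields a byte value you could then apply via `vfs.zfs.arc_max` (FreeBSD) or `zfs_arc_max` under `/sys/module/zfs/parameters` (Linux):

```python
GiB = 1 << 30

def suggested_arc_max(total_ram_gib, app_reserve_gib, floor_gib=1):
    """Cap the ARC so a co-located application keeps its own RAM reserve.
    Returns bytes, the unit zfs_arc_max / vfs.zfs.arc_max expect."""
    cap_gib = max(total_ram_gib - app_reserve_gib, floor_gib)
    return cap_gib * GiB

# 64 GiB host whose database wants 40 GiB to itself:
print(suggested_arc_max(64, 40))   # 25769803776 bytes (24 GiB)
```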
The Level 2 ARC (L2ARC)
The L2ARC supplements system RAM by extending the cache onto fast secondary devices, typically NVMe or SSD. While the ARC resides in DRAM, the L2ARC stores blocks that would otherwise be evicted. Instead of evicting the blocks, the ARC keeps only a header pointing to where the data is stored on the L2ARC device.
However, the L2ARC is not a write cache. Writes must be committed to the main pool devices. The L2ARC only stores evicted data blocks for potential future reads.
Metadata Overhead and Memory Consumption
Each block stored in L2ARC requires an in-memory pointer in the ARC. This means that adding a large L2ARC device increases metadata consumption in system RAM. A rule of thumb is that L2ARC requires approximately 80 bytes of ARC memory for every block cached in L2ARC. On systems with limited RAM, adding a large L2ARC can paradoxically reduce effective caching because valuable ARC space is consumed by pointers rather than data.
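Using the 80-bytes-per-block rule of thumb above, the RAM cost of an L2ARC device depends heavily on block size. The estimate below is illustrative; the real per-header size varies by OpenZFS version:

```python
TiB, GiB, KiB = 1 << 40, 1 << 30, 1 << 10

def l2arc_header_ram(l2arc_bytes, avg_block_bytes, header_bytes=80):
    """Estimate ARC RAM consumed by headers for a full L2ARC device."""
    return (l2arc_bytes // avg_block_bytes) * header_bytes

# 1 TiB L2ARC full of 8 KiB blocks (e.g. a database recordsize)...
print(l2arc_header_ram(1 * TiB, 8 * KiB) / GiB, "GiB of ARC")    # 10.0 GiB
# ...versus the same device full of 128 KiB blocks:
print(l2arc_header_ram(1 * TiB, 128 * KiB) / GiB, "GiB of ARC")  # 0.625 GiB
```

The 16x difference is exactly why a large L2ARC in front of a small-recordsize workload can starve the ARC of memory for actual data.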
Compression, Prefetch, and Eviction
L2ARC stores data in compressed form if the dataset uses compression. It also respects ARC prefetch policies, meaning sequential streaming reads are not always cached. Eviction from L2ARC follows a simple LRU policy. If a block is no longer referenced in ARC, it will also be invalidated in L2ARC.
Earlier versions of L2ARC did not persist across reboots, meaning that after a restart the device provided no immediate benefit. More recent versions of ZFS support persistent L2ARC, which allows cache content to be reloaded after reboots, providing faster warm-up.
Multi-Device Scaling
It is possible to add multiple L2ARC devices. The ARC balances the population across them. While this can expand effective cache size, administrators should ensure that the system has sufficient RAM to manage metadata overhead.
When L2ARC Helps
L2ARC is most beneficial when the working set of a workload is larger than system RAM but smaller than the combination of RAM and L2ARC. Examples include large read-mostly datasets such as analytics queries or virtual machine images that are frequently accessed but not constantly modified.
It is less effective for streaming workloads, highly random writes, or datasets that exceed the combined ARC and L2ARC size by orders of magnitude.
The Separate Log Device (SLOG)
All synchronous writes in ZFS are first written to the ZFS Intent Log (ZIL). The ZIL ensures that if the system crashes, committed writes can be replayed to maintain consistency. By default, the ZIL is stored on the main pool devices.
ZFS aggregates writes into transaction groups, which are flushed to disk at intervals (typically every 5 seconds). Asynchronous writes may remain in memory until the next flush, but synchronous writes must be acknowledged immediately. This is where the SLOG device becomes useful.
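From the application side, what makes a write "synchronous" is a durability request such as fsync(2) or O_SYNC. With the default sync=standard, the fsync in this minimal Python illustration is the call that forces data through the ZIL (and onto a SLOG, if present) before returning; a plain write() would simply wait for the next transaction group:

```python
import os
import tempfile

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"critical record\n")
    f.flush()              # push Python's buffer into the kernel
    os.fsync(f.fileno())   # block until the data is durable: on ZFS this
                           # is what routes the write through the ZIL/SLOG
os.unlink(f.name)
```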
Why Latency Dominates Bandwidth
The SLOG is not a write accelerator for all operations. It only affects synchronous writes. Its performance impact is determined primarily by latency, not throughput. A small, low-latency device such as an enterprise-grade NVMe can greatly reduce the time needed to acknowledge synchronous writes.
Bandwidth matters less because the volume of synchronous writes is limited to what can be flushed in a few seconds. A device with extremely high throughput but high latency provides little benefit. If the SLOG device cannot persist data significantly more quickly than the main pool storage, there is no benefit to writing to the SLOG to acknowledge the write sooner.
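A back-of-the-envelope comparison makes the point. The latencies here are illustrative assumptions, not measurements:

```python
def max_acks_per_sec(commit_latency_s):
    """Upper bound on serialized sync-write acknowledgements per second
    for a log device with the given commit latency."""
    return 1.0 / commit_latency_s

hdd_zil   = max_acks_per_sec(8e-3)    # ~8 ms commit on a spinning-disk pool
nvme_slog = max_acks_per_sec(20e-6)   # ~20 us on power-loss-protected NVMe

print(f"ZIL on HDD pool:  ~{hdd_zil:,.0f} acks/s")    # ~125
print(f"ZIL on NVMe SLOG: ~{nvme_slog:,.0f} acks/s")  # ~50,000
```

Even at 50,000 acknowledgements per second of 8 KiB records, the sustained bandwidth is only a few hundred MiB/s, which is why latency, not throughput, is the figure of merit.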
Mirroring, Device Failure, and Risk
The SLOG contains uncommitted data that has not yet been flushed to the main pool, so losing it can result in data loss. For this reason, SLOG devices should be mirrored where durability matters. If a non-redundant SLOG device fails, ZFS falls back to the main pool ZIL, but in-flight synchronous writes may be lost.
Safe Deployment Practices
SLOG devices should be chosen for low latency and high endurance. Consumer SSDs with volatile caches are inappropriate because they may lose data on power failure. Enterprise NVMe devices with capacitor-backed write caches are preferred. The SLOG does not need to be large, since it only holds a few seconds’ worth of writes, but it must be reliable.
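As a sizing sanity check, the hypothetical helper below assumes the default zfs_txg_timeout of 5 seconds and the common rule of thumb of holding roughly two transaction groups' worth of synchronous writes:

```python
GiB = 1 << 30

def slog_capacity_bytes(sync_bytes_per_sec, txg_timeout_s=5, txgs=2):
    """Rough SLOG sizing: room for a couple of transaction groups'
    worth of synchronous writes."""
    return sync_bytes_per_sec * txg_timeout_s * txgs

# Even 1 GiB/s of pure synchronous writes needs only ~10 GiB of SLOG:
print(slog_capacity_bytes(1 * GiB) / GiB, "GiB")   # 10.0 GiB
```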
Tunables in Practice
ZFS exposes many tunables related to ARC, L2ARC, and the ZIL. They can be grouped broadly into:
- ARC size controls: governing how much RAM is used.
- Prefetch and metadata settings: affecting how aggressively ZFS caches metadata or sequential reads.
- L2ARC feed controls: controlling how quickly data is written to L2ARC.
- ZIL parameters: not usually adjusted beyond specific use cases, as they influence consistency guarantees.
Which Tunables Are Safe
In practice, only a small subset of tunables should be adjusted:
- arc_max and arc_min to limit memory use.
- l2arc_feed_min_ms and l2arc_feed_secs to moderate how aggressively data is written to L2ARC.
- Prefetch toggles (zfs_prefetch_disable) in workloads where sequential prefetching causes harm.
- l2arc_write_max to set the maximum rate at which data is written to the L2ARC device. This setting is designed to avoid wearing out the flash device prematurely by writing data that may never be read before it is overwritten with newer data.
- l2arc_write_boost to temporarily increase the write rate during system startup, before the L2ARC has filled, to help it warm up faster.
- Other parameters, such as l2arc_noprefetch (enabled by default to prevent speculative reads from polluting the cache) and l2arc_headroom, give you fine-grained control over which data is eligible for caching.
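These feed-rate settings translate directly into warm-up time. The sketch below assumes the historical defaults of an 8 MiB l2arc_write_max per feed and a 1-second l2arc_feed_secs, to show why a large device can take many hours to fill without l2arc_write_boost:

```python
MiB, GiB = 1 << 20, 1 << 30

def l2arc_fill_hours(device_bytes, write_max_per_feed, feed_interval_s=1):
    """Lower bound on hours to fill an L2ARC device if every feed
    interval writes its full l2arc_write_max allowance."""
    feeds = device_bytes / write_max_per_feed
    return feeds * feed_interval_s / 3600

# A 400 GiB cache device at an 8 MiB-per-second feed rate:
print(f"{l2arc_fill_hours(400 * GiB, 8 * MiB):.1f} hours")   # 14.2 hours
```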
Most other tunables should remain at their defaults unless changes are guided by upstream recommendations or developer expertise.
Wrapping Up
Performance tuning in ZFS is best approached with a clear understanding of the architecture. The ARC delivers intelligent caching that requires minimal adjustment. The L2ARC extends caching capacity but introduces metadata overhead that must be considered. The SLOG improves synchronous write latency but only when paired with the right workload and reliable hardware.
It is worth emphasizing that the recommendation to avoid indiscriminate tuning is not propaganda, gatekeeping, or conspiracy. The defaults exist because they have been validated across many workloads and environments. Indiscriminate parameter changes are more likely to degrade performance, increase instability, or, in the worst case, expose a system to data loss than to help. The real risk is not that tuning “does nothing,” but that it silently undermines the guarantees ZFS was built to provide.
The safe path is to observe, measure, and analyze before adjusting any tunables. Where tuning is justified, it should be guided by real workload evidence, hardware awareness, and ideally by expertise from practitioners who live in the code. Otherwise, you may discover that the fastest way to make ZFS “perform better” is simply to break it faster.
Klara Systems provides specialized support for organizations running ZFS at scale, from workload analysis and performance tuning to long-term architectural planning. Our engineers work directly with the codebase and upstream development, ensuring that tuning decisions are grounded in both practice and deep technical insight.

Umair Khurshid
Developer, open source contributor, and relentless homelab experimenter.