During our Halloween special webinar, we received a cheeky write-in question. We did not have a chance to cover it live, because we wanted to provide a much more in-depth answer: this is bound to be a question that others will have, if not now, then in the future.
| Can we/should we trust AI for designing best practices when it comes to deploying technical resources? Is there a model that is best fit for these types of situations where we need the best scenario without going so far off the rails?
Before we get into this, let's take a moment to discuss what an AI LLM is and how it operates.
What Is An AI LLM?
At its core, an LLM (Large Language Model) is nothing more than a statistical model, created by training an extremely high-dimensional vector map whose connection strengths between linguistic units are saved as values in a giant database. As the input to an LLM is broken down, the connections in this vector map are used to calculate the most probable options for the next string of output.
LLMs are text completion engines that operate on statistics. If you want to understand the math behind LLMs, we highly recommend Episodes 5 and 6 of 3Blue1Brown's Neural Networks playlist, which focus on GPT-style LLMs.
It's a statistical model and nothing more.
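To make that concrete, here is a minimal sketch of the core operation: turning the model's raw scores into a probability distribution over candidate next tokens and emitting the most likely one. The vocabulary and the logit values here are invented purely for illustration.

```python
import math

# Hypothetical raw scores ("logits") a model might assign to candidate next tokens
logits = {"pool": 2.0, "dataset": 1.0, "sandwich": -1.0}

# Softmax: convert the scores into a probability distribution
total = sum(math.exp(v) for v in logits.values())
probs = {tok: math.exp(v) / total for tok, v in logits.items()}

# The model emits the statistically most likely token -- not a verified fact
best = max(probs, key=probs.get)
print(best)  # "pool"
```

Note that even the top token here carries only about 70% of the probability mass; the model will sometimes emit the others, and when one of those happens to be wrong, that is all a "hallucination" is.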
Now, while it's true that the statistical model may often provide results that are a very accurate representation, it cannot be correct all the time, because that's not how statistics works. Regardless of how much training data is available and how long the model is trained, there will always be a non-zero chance that it produces an answer that is mathematically valid according to its training data, yet factually wrong. With any statistical model, it's never black and white.
This is where the so-called “AI hallucinations” come from. These hallucinations are mathematically correct statistical answers according to the model's stored data; however, we recognize them as incorrect because they do not reflect what we know to be true. The term “hallucination” is in effect a clever marketing gimmick to cover up the fact that the model can produce incorrect answers. After all, it's hard to sell a predictive statistical model that's incorrect a significant percentage of the time. With clever marketing, along with anthropomorphizing an LLM by calling it "Artificial Intelligence", you can hand-wave away any mistakes.
Can We Trust AI For Designing And Tuning ZFS Pools?
So, with that understood, let's return to the issue at hand that we were asked about.
| Can we trust AI to design a ZFS pool, or recommend specific tuning? Is there a specific model that is better at these type of questions?
While the easy answer to this question would be “no”, let's look at it from another perspective. Is your data of so little value that you would be willing to let a sysadmin who routinely makes mistakes and recommends broken, destructive code set up, manage, and help maintain it?
ZFS is an incredibly powerful file system with a large number of complex adjustable parameters. The defaults in ZFS are the defaults for a reason: they will work reasonably for most workloads most of the time. We need to understand each of these parameters, what they do, how they are used by ZFS and how changing those parameters will interact with other parameters. We don’t even need to dig into the complexities of how ZFS operates and how parameters interact with each other to show that LLMs are not to be trusted.
LLMs are trained on content found on the internet, without regard for:
- how old that content might be
- whether it pertains to a different version of ZFS
- whether it applies to a different operating system
Here are six examples of responses to simple questions about ZFS parameters from one of the most accurate commercially available LLMs you can access.
Note: Accuracy based on model performance as of March 2025.
Example 1: spa_slop_shift

The LLM tells us that this parameter controls the extra space reserved for operations. This part is correct; however, the LLM then states that the default is 5 which it says is 32 sectors. This is not correct, as by default the last 3.2% (1/(2^spa_slop_shift)) of pool space is reserved. The minimum SPA slop space is limited to 128 MiB. Since ZFS 2.1.0, the maximum SPA slop space has been limited to 128 GiB, meaning it has not been necessary to tune this value manually to save space on very large pools since 2021.
Official documentation for spa_slop_shift.
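The documented behavior is easy to check arithmetically. This sketch reproduces the formula described above (1/2^spa_slop_shift of pool size, clamped between 128 MiB and, since ZFS 2.1.0, 128 GiB); the function name is ours, not part of ZFS.

```python
MiB = 1024 ** 2
GiB = 1024 ** 3
TiB = 1024 ** 4

def slop_space(pool_size, spa_slop_shift=5):
    """Reserved slop space per the OpenZFS docs: pool_size / 2^shift,
    clamped to [128 MiB, 128 GiB] (the upper clamp exists since ZFS 2.1.0)."""
    raw = pool_size >> spa_slop_shift
    return max(128 * MiB, min(raw, 128 * GiB))

# A 1 TiB pool reserves 1/32 of its space: 32 GiB (~3.125%)
print(slop_space(1 * TiB) // GiB)    # 32
# A 100 TiB pool would naively reserve 3.125 TiB, but is clamped to 128 GiB
print(slop_space(100 * TiB) // GiB)  # 128
```

The clamp is exactly why manual tuning of this value to reclaim space on very large pools has been unnecessary since 2021.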
Example 2: arc_min_prescient_prefetch_ms

The LLM tells us that the default for this value is 10ms, and that this is what is recommended. It goes further, claiming there is no significant performance benefit beyond 10ms for most workloads. In reality, this tunable defaults to 6000ms (6 seconds) in ZFS. This value controls the minimum time “prescient prefetched” blocks are locked in the ARC. ZFS's prescient prefetch feature examines usage patterns and attempts to guess which block the application will want next, prestaging those blocks in the cache to improve read performance, but this only helps if the data is still in the cache when the application requests it. Setting this tunable to only 10ms, as the LLM recommends, would result in lots of prefetched data being evicted from the ARC before it can be used. That means all of the work to prefetch those blocks was wasted, and it will be repeated when the block is actually read a few hundred milliseconds later.
Official documentation for arc_min_prescient_prefetch_ms.
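A toy model (our own construction, not ZFS code) shows why 10ms defeats the purpose of prefetching: a prefetched block only pays off if it survives in the cache until the application actually reads it.

```python
def prefetch_hit(request_delay_ms, min_lock_ms):
    """Toy model: assume memory pressure evicts a prefetched block as
    soon as its lock expires. The prefetch is only a cache hit if the
    read arrives while the block is still protected."""
    return request_delay_ms <= min_lock_ms

# Application reads the block 500 ms after ZFS prefetched it
print(prefetch_hit(500, 10))    # False: the LLM's 10 ms -> already evicted
print(prefetch_hit(500, 6000))  # True: the 6000 ms default -> prefetch pays off
```

Real eviction depends on ARC pressure rather than a hard timer, but the asymmetry between a 10ms lock and a read that arrives hundreds of milliseconds later holds regardless.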
Example 3: dirty_data_max

This one isn't incorrect, but it provides a poor explanation of what is actually going on, and following it could lead to data loss. Dirty data is data that ZFS has accepted for writing but has not yet committed to disk as part of a transaction group (TXG). This data has no redundancy or protection. On a system doing a large quantity of writes, if you increase dirty_data_max significantly your application will appear to write faster, but in fact the data is simply being held as dirty data in RAM. If a power loss or other system failure occurs at that moment, all of that data may be lost, as it was never written to disk. The value of dirty_data_max needs to be set carefully, balancing improved write I/O (from the ability to aggregate more data per write) against the importance of the data actually being saved.
Another way to think about this: it gives a similar experience to the file-writing delay of USB 2.0 and older thumb drives. When you copied large files to them, the file copy UI would report that the transfer was complete, but in fact the copy continued in the background for much longer. If the system was shut down or the drive was removed unsafely, the data in flight would be lost, and whatever file was in the process of being written would be corrupted.
Official documentation for dirty_data_max.
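The trade-off can be sketched with simple arithmetic (our own illustration, not a ZFS formula): the dirty data cap bounds how much acknowledged-but-unwritten data can sit in RAM, and therefore how much could vanish on a power failure.

```python
def max_data_at_risk(write_rate_mib_s, seconds_buffered, dirty_data_max_mib):
    """Upper bound (in MiB) on unwritten data sitting in RAM: you can never
    have more dirty data than the cap, nor more than you actually wrote."""
    return min(write_rate_mib_s * seconds_buffered, dirty_data_max_mib)

# Writing 500 MiB/s with the cap raised to 16 GiB: up to 5 s of writes exposed
print(max_data_at_risk(500, 5, 16384))  # 2500 (MiB)
# With a 1 GiB cap, exposure is bounded by the cap instead
print(max_data_at_risk(500, 5, 1024))   # 1024 (MiB)
```

Raising the cap makes the application feel faster precisely because it widens this exposure window.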
Example 4: dnodesize

This example is also incorrect. The LLM says that the default is "auto" and that ZFS will choose a size between 512B and 32K. The actual default for ZFS is "legacy", and the size can be set to 512B-16K; 32K is not an option. The ZFS documentation recommends this be set to auto if you need xattr (extended attributes) properties. Specific values should only be used for performance testing or when the optimal size is already known.
Official documentation for dnodesize.
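To pin down the factual claim, this snippet encodes the legal settings for the dnodesize property as given in the zfsprops documentation (the helper function itself is our own): note that 32k is absent and the default is legacy.

```python
# Legal dnodesize values per zfsprops(7): "legacy" (the default), "auto",
# or an explicit power-of-two size from 1k to 16k. "32k" is NOT valid.
VALID_DNODESIZE = {"legacy", "auto", "1k", "2k", "4k", "8k", "16k"}
DEFAULT_DNODESIZE = "legacy"

def check_dnodesize(value):
    """Return True if the value would be accepted for the dnodesize property."""
    return value in VALID_DNODESIZE

print(check_dnodesize("auto"))  # True: recommended when using xattrs
print(check_dnodesize("32k"))   # False: the LLM's claimed maximum doesn't exist
print(DEFAULT_DNODESIZE)        # "legacy", not "auto" as the LLM claimed
```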
Example 5: metaslab_aliquot

The LLM misses a lot of important nuance in its explanation, including any consideration of the reader's familiarity with metaslabs. The value does not set a minimum allocation size so much as a threshold after which ZFS will attempt to choose another vdev to allocate the next block from. This controls the frequency at which ZFS changes top-level devices, balancing load and free space while optimizing the number of writes that can be done with the loaded metaslabs.
The LLM also claims that the default is 512KiB, which is incorrect. While the LLM does recommend this be set to 1MiB for modern ZFS systems, the default in modern ZFS already is 1MiB. The LLM claims that this reduces fragmentation and improves allocation efficiency. Both of these are incorrect: changing this value does not reduce fragmentation for large files or improve allocation efficiency. Lastly, the LLM recommends that this not be set lower than 512KiB because doing so can increase metadata overhead. The direct effect of this is mitigated by the “log spacemap” feature, introduced in 2019 and released as part of OpenZFS 2.0. While it's probably not a good idea to set this below 512KiB, doing so would not increase metadata overhead to a significant degree.
Since ZFS spreads its writes across the disks in the pool, this parameter controls how much data gets written to a disk before switching to the next disk in the pool.
Official documentation for metaslab_aliquot.
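The rotation behavior described above can be sketched as a simple rotor (a deliberate simplification of the real allocator, for illustration only): once roughly metaslab_aliquot bytes have gone to one top-level vdev, allocation moves on to the next.

```python
MiB = 1024 ** 2

def assign_writes(write_sizes, num_vdevs, aliquot=1 * MiB):
    """Toy allocator: stay on the current top-level vdev until ~aliquot
    bytes have landed there, then rotate to the next vdev."""
    placements, vdev, written = [], 0, 0
    for size in write_sizes:
        placements.append(vdev)
        written += size
        if written >= aliquot:
            vdev = (vdev + 1) % num_vdevs
            written = 0
    return placements

# Six 512 KiB writes across 3 vdevs: two writes fill each 1 MiB aliquot,
# then the rotor advances
print(assign_writes([512 * 1024] * 6, 3))  # [0, 0, 1, 1, 2, 2]
```

A larger aliquot keeps more consecutive writes on one device (better sequential layout); a smaller one spreads load faster.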
Example 6: redundant_metadata

The LLM claims several things which are incorrect.
Firstly, it claims that only 2 copies of metadata are stored, which is not accurate. ZFS stores an extra copy of metadata, so that if a single block is corrupted, the amount of user data lost is limited. This extra copy is in addition to any redundancy provided at the pool level (e.g. by mirroring or RAID-Z), and in addition to any extra copies specified by the user via the copies property (up to a total of 3 copies). For example, if the pool is mirrored, copies=2, and redundant_metadata=most, then ZFS stores 6 copies of most metadata, and 4 copies of data and some metadata.
The LLM also claims that the "some" setting may risk pool operation. This is not correct: setting "some" will not risk the pool's ability to operate. With "some" set, if a single on-disk block is corrupt, at worst a single user file can be lost. In fact, even with this value set to "none", the pool's critical metadata is still redundant. While that could result in a lost dataset, the pool would continue to function.
The LLM claims the default is "most", which is also incorrect. The default in ZFS is "all".
Official documentation for redundant_metadata.
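The copy arithmetic in the mirrored example above can be verified in a few lines (our own bookkeeping, following the zfsprops description): the extra metadata ditto copy stacks on top of both the copies property and pool-level redundancy.

```python
def on_disk_copies(mirror_width, copies, is_metadata):
    """Physical copies of a block on a mirrored pool: logical copies
    (the `copies` property, +1 ditto copy for metadata, capped at 3)
    multiplied by the mirror width."""
    logical = copies + 1 if is_metadata else copies
    logical = min(logical, 3)
    return logical * mirror_width

# Mirrored pool (width 2), copies=2, redundant_metadata=most:
print(on_disk_copies(2, 2, is_metadata=True))   # 6 copies of most metadata
print(on_disk_copies(2, 2, is_metadata=False))  # 4 copies of data
```

This matches the worked example in the text: 6 copies of most metadata and 4 copies of data.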
Can LLMs Be Used For ZFS Performance And Tuning? No.
At first glance, a layman might well believe the answers given by the LLM. An individual who doesn't know better and makes decisions based on that information could end up with extremely poor performance and potentially suffer catastrophic data loss.
As shown with the examples above, even basic details about the ZFS parameters and their functionality were full of errors, mistakes, outdated information, and outright falsehoods. These answers are from one of the highest-quality LLMs available at the time, and you can plainly see how poorly it answered these simple questions.
While it may one day be possible to train an LLM specifically for ZFS, ZFS is an ever-evolving project with new features and capabilities being added constantly. An LLM made for ZFS would need to be continually retrained to keep up with the project's development.
A better solution, if you need help with your ZFS deployment, is to reach out and work with the engineers who contribute to making ZFS better every day.
Klara’s ZFS Services connect you directly with upstream contributors offering production-grade support, performance tuning, and architectural guidance—far beyond what any generic AI model can provide.

JT Pennington
JT Pennington is a ZFS Solutions Engineer at Klara Inc, an avid hardware geek, photographer, and podcast producer. JT is involved in many open source projects including the Lumina desktop environment and Fedora.