ZFS Channel Programs—essentially, a way to batch multiple ZFS operations together in a single, atomic operation—are a new OpenZFS feature available in FreeBSD 12.0 and newer.
Today’s article answers some common questions regarding ZFS Channel Programs and provides some resources for learning how to create your own Channel Programs.
zfs-program(8) defines it this way: The ZFS channel program interface allows ZFS administrative operations to be run programmatically via a Lua script. The entire script is executed atomically, with no other administrative operations taking effect concurrently. A library of ZFS calls is made available to channel program scripts. Channel programs may only be run with root privileges.
Let’s take a closer look at that definition:
You might be wondering, why not just write a script using my favorite scripting language for zfs batch operations? That brings us to the next question:
Consider this common scenario: as part of your OpenZFS snapshot management, you need to periodically traverse the list of snapshots and destroy snapshots that meet certain criteria. If you’re using snapshots, you have probably already written a script that does this for you.
However, you may have noticed that it takes some time for all of the snapshot operations to complete, especially when iterating over a large number of snapshots. The first reason for this is that each operation is issued separately by the zfs userland command. In contrast, a Channel Program is evaluated within the kernel, which avoids a round trip between userland and the kernel for every operation.
Running Channel Programs within the kernel gives the added advantage that Channel Programs guarantee consistency with concurrent ZFS modifications. For example, a Channel Program guarantees that the zfs list and zfs destroy commands both see the same ZFS pool state, whereas running these commands from a script leaves the chance that another ZFS operation outside of the script has modified the pool state before the next command in the script is run.
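The snapshot cleanup scenario described above can be sketched as a small Channel Program. This is a minimal sketch rather than a tested production tool: the file name prune_snaps.lua, the function names, and the choice to select snapshots by name prefix are assumptions made for illustration. The zfs.list.snapshots, zfs.check.destroy, and zfs.sync.destroy calls are the ones documented in zfs-program(8).

```lua
-- prune_snaps.lua (hypothetical name)
-- Assumed invocation: zfs program tank prune_snaps.lua tank/data auto-
local args = ...

-- True when the snapshot part of "pool/fs@name" starts with prefix.
function snapshot_has_prefix(snap, prefix)
    local at = string.find(snap, "@", 1, true)  -- plain-text search for "@"
    if at == nil then
        return false
    end
    return string.sub(snap, at + 1, at + #prefix) == prefix
end

-- Destroy every snapshot of dataset whose snapshot name starts with prefix.
-- Returns a table keyed by the name of each destroyed snapshot.
function prune_snapshots(dataset, prefix)
    -- Pass 1: collect the matching snapshots.
    local doomed = {}
    for snap in zfs.list.snapshots(dataset) do
        if snapshot_has_prefix(snap, prefix) then
            table.insert(doomed, snap)
        end
    end
    -- Pass 2: stop before changing anything if any destroy would fail.
    for _, snap in ipairs(doomed) do
        local err = zfs.check.destroy(snap)
        if err ~= 0 then
            error("cannot destroy " .. snap .. " errno: " .. err)
        end
    end
    -- Pass 3: destroy the snapshots and record each result.
    local destroyed = {}
    for _, snap in ipairs(doomed) do
        local err = zfs.sync.destroy(snap)
        assert(err == 0)
        destroyed[snap] = err
    end
    return destroyed
end

-- The zfs table exists only inside the Channel Program sandbox. This guard
-- lets the functions above be loaded by a plain Lua interpreter for testing.
if zfs then
    local argv = args["argv"]
    return prune_snapshots(argv[1], argv[2])
end
```

Because listing, checking, and destroying all happen inside one program, no other administrative operation can change the snapshot list between the three passes. If the check pass finds a problem, error() ends the program before any snapshot has been destroyed.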
To understand the difference, let’s do a quick overview of OpenZFS transaction groups and synctasks.
A transaction group—also known as a TXG—is a collection of queued ZFS operations. These operations could be disk operations, such as writes, or they might be administrative operations, such as zfs list or zfs destroy.
If a transaction group contains an administrative operation, the step of actually running the operation is known as a synctask. In order to provide filesystem consistency, transaction groups are numbered sequentially, are always committed in sequential order, and there is only one open transaction group at any given time.
If the transaction group contains a synctask, the synctask runs last and must complete and return its results before the next transaction group can run. The time for a synctask to complete depends upon the operation, but can vary from a few milliseconds to several seconds.
Channel Programs combine multiple operations, including iteration, into one compound operation that is performed in a single transaction group with a single synctask. This is a powerful concept that can provide both a significant performance boost and a filesystem consistency guarantee.
Consider a script that iterates through a snapshot listing, checks each snapshot against the criteria, destroys the snapshot if it meets them, then continues through the iteration. Every one of those operations requires its own transaction group and synctask. Besides the performance hit of running multiple synctasks, there is no guarantee that a transaction group created outside of the script won’t alter the state of the ZFS pool before the next transaction group generated by the script is run.
In contrast, a Channel Program containing the same operations would be sent to one transaction group and executed as one synctask. In addition to performance improvements, this ensures that the sequence of operations within that synctask occurs atomically and is not interrupted by any other transaction groups.
As another example, a Channel Program could promote a clone and destroy it in the same synctask. Or it could create recursive snapshots through the kernel API rather than through the zfs userland command.
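The recursive snapshot case could look something like the following sketch. The file name recursive_snap.lua and the function name snapshot_recursive are hypothetical. The sketch assumes only the zfs.sync.snapshot and zfs.list.children calls described in zfs-program(8), and it collects errors in a table instead of stopping at the first failure.

```lua
-- recursive_snap.lua (hypothetical name)
-- Assumed invocation: zfs program tank recursive_snap.lua tank/data before-upgrade
local args = ...

-- Snapshot root and all of its descendants as "<dataset>@<snapname>".
-- Returns a table mapping each dataset that could not be snapshotted
-- to the errno returned by zfs.sync.snapshot.
function snapshot_recursive(root, snapname, failed)
    failed = failed or {}
    local err = zfs.sync.snapshot(root .. "@" .. snapname)
    if err ~= 0 then
        failed[root] = err
    end
    for child in zfs.list.children(root) do
        snapshot_recursive(child, snapname, failed)
    end
    return failed
end

-- The zfs table exists only inside the Channel Program sandbox. This guard
-- lets the function above be loaded by a plain Lua interpreter for testing.
if zfs then
    local argv = args["argv"]
    return snapshot_recursive(argv[1], argv[2])
end
```

An empty result table means every snapshot was created. Returning the failures rather than calling error() is a design choice for this sketch; a stricter program could run zfs.check.snapshot on every dataset first and refuse to proceed if any check fails.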
In short, when you need to execute a series of dependent zfs commands, you want the performance and reliability of a Channel Program.
The blog post Proposed ZFS Feature: Channel Programs provides some more examples of how multi-operation zfs commands benefit from being replaced by Channel Programs.
zfs-program(8) lists the API functions for these supported operations:
As you can see, there are API functions for managing most bulk dataset and snapshot tasks.
If you’re not a programmer, or if Lua is not your programming language of choice, you may be wondering if it is worthwhile to learn yet another language in order to use Channel Programs. Lua was selected as it is small, easy to sandbox and embed in existing programs, and the interpreter is highly configurable.
zfs-program(8) lists the available APIs and provides some code examples, including one that demonstrates error handling. Refer to the Lua 5.2 reference manual for everything you need to know to get started with Lua.
The following example of a script for dataset batch deletion is from the ZFS Channel Programs blog post from Delphix. Refer to the blog post for a more complete description of how the script enforces correct deletion order and error handling. Note that the script is probably no longer or more complicated than similar scripts you may be using now.
args = ...
pool = args["argv"][1]

-- Append root and everything that depends on it (children, snapshots, and
-- clones of those snapshots) to to_destroy, dependents first.
function gather_destroy(root, to_destroy)
    for child in zfs.list.children(root) do
        to_destroy = gather_destroy(child, to_destroy)
    end
    for snap in zfs.list.snapshots(root) do
        for clone in zfs.list.clones(snap) do
            to_destroy = gather_destroy(clone, to_destroy)
        end
        table.insert(to_destroy, snap)
    end
    table.insert(to_destroy, root)
    return to_destroy
end

function cleanup_dataset(root)
    local datasets = gather_destroy(root, {})
    -- Check pass: abort before changing anything if a destroy would fail
    -- for any reason other than ECHILD.
    for _, ds in ipairs(datasets) do
        local err = zfs.check.destroy(ds)
        if (err ~= 0 and err ~= ECHILD) then
            error("failed to destroy " .. ds .. " errno: " .. err)
        end
    end
    -- Destroy pass: the list is already in a safe deletion order.
    for _, ds in ipairs(datasets) do
        assert(zfs.sync.destroy(ds) == 0)
    end
end

function recursive_cleanup(root)
    for child in zfs.list.children(root) do
        recursive_cleanup(child)
    end
    -- We may encounter these clones when recursing through children of some
    -- other filesystem, but we catch them here as well to make sure each is
    -- destroyed before its origin fs.
    for snap in zfs.list.snapshots(root) do
        for clone in zfs.list.clones(snap) do
            recursive_cleanup(clone)
        end
    end
    -- Only recursively destroy the dataset if it's marked for destruction
    if (zfs.get_prop(root, "gc:tmp_cleanup") == "yes") then
        cleanup_dataset(root)
    end
end

recursive_cleanup(pool)
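Before running a destructive script like the one above, it can be useful to see which datasets it would select. The following read-only sketch walks the child datasets of a pool and reports every dataset whose gc:tmp_cleanup user property is set to "yes", together with the result of zfs.check.destroy for that dataset. The file name list_marked.lua and the function name find_marked are hypothetical, and unlike the script above this sketch does not follow clones.

```lua
-- list_marked.lua (hypothetical name)
-- Assumed invocation: zfs program tank list_marked.lua tank
local args = ...

-- Returns a table mapping each dataset marked with gc:tmp_cleanup=yes to
-- the errno that zfs.check.destroy reports for it (0 means the destroy
-- would currently be allowed). Nothing is modified.
function find_marked(root, marked)
    marked = marked or {}
    -- zfs.get_prop returns the value and its source; only the value is used.
    local value = zfs.get_prop(root, "gc:tmp_cleanup")
    if value == "yes" then
        marked[root] = zfs.check.destroy(root)
    end
    for child in zfs.list.children(root) do
        find_marked(child, marked)
    end
    return marked
end

-- The zfs table exists only inside the Channel Program sandbox. This guard
-- lets the function above be loaded by a plain Lua interpreter for testing.
if zfs then
    local argv = args["argv"]
    return find_marked(argv[1])
end
```

A nonzero value for a marked dataset is not necessarily a problem: the cleanup script tolerates ECHILD during its own check pass because it destroys dependents first.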
In IBM terminology, the channel subsystem (CSS) moves data into and out of a mainframe. Since the CSS is independent of the mainframe’s processors, input/output (I/O) within a mainframe can be done asynchronously. Asynchronous I/O is handled within the channel subsystem by, you guessed it, a channel program.
If you substitute ZFS for CSS and kernel for mainframe, Channel Program seems like a logical term for the functionality provided by ZFS Channel Programs.
If you’re currently using scripts to manage snapshots or repetitive ZFS tasks, consider the performance and consistency benefits of replacing those scripts with Channel Programs. The resources mentioned in this article are a great starting point for creating your own Channel Programs.
If you need help creating or debugging Channel Programs, the OpenZFS experts at Klara are always available to provide support.