Understanding ZFS Channel Programs

Common Questions Around Creating Your Own Channel Programs

ZFS Channel Programs—essentially, a way to batch multiple ZFS operations together in a single, atomic operation—are a new OpenZFS feature available in FreeBSD 12.0 and newer.

Today’s article answers some common questions regarding ZFS Channel Programs and provides some resources for learning how to create your own Channel Programs.

What is a ZFS Channel Program?

zfs-program(8) defines it this way: The ZFS channel program interface allows ZFS administrative operations to be run programmatically via a Lua script. The entire script is executed atomically, with no other administrative operations taking effect concurrently. A library of ZFS calls is made available to channel program scripts. Channel programs may only be run with root privileges.

Let’s take a closer look at that definition:

Channel Programs are used to program a batch of ZFS administrative operations, such as a combination of the zfs snapshot, list, and destroy commands. (You can learn more about these commands in Basics of ZFS Snapshot Management).
Channel Programs use the Lua programming language to make ZFS kernel calls which combine ZFS operations and provide iteration and error handling.
Channel Programs are executed atomically: no other administrative operation takes effect while the program runs, and you don’t have to stop any running applications before running the Channel Program. Atomic does not mean automatic rollback, however. zfs-program(8) warns that after a fatal error a program may have not executed at all, may have partially executed, or may have fully executed, so a program that needs all-or-nothing behavior should verify its changes with dry-run checks before making any of them.
Channel Programs may only be run with root privileges. As an additional security measure, FreeBSD executes Channel Programs in a sandboxed environment and enforces memory and time limits to prevent poorly written or malicious Lua scripts from consuming all the memory in the kernel or causing an operation to block forever. By default, a Channel Program will stop if it runs longer than 10 million Lua instructions or uses more than 10 MB of memory.
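
To make the definition concrete, here is a minimal sketch of a read-only channel program that counts the snapshots of one dataset. It relies on zfs.list.snapshots and the argument table described in zfs-program(8); the function name and the guarded entry point are illustrative choices, not part of the ZFS library.

```lua
-- Count the snapshots of a single dataset.
function count_snapshots(dataset)
    local count = 0
    -- zfs.list.snapshots returns an iterator over the dataset's snapshots.
    for _ in zfs.list.snapshots(dataset) do
        count = count + 1
    end
    return count
end

-- zfs program hands the script one table; its argv field holds the
-- arguments given after the script name on the command line.
local args = ...
if args and args.argv then
    return count_snapshots(args.argv[1])
end
```

Saved as count.lua (a hypothetical file name), this could be run as root with `zfs program tank count.lua tank/data`, where tank and tank/data are placeholder names. The -t and -m options of zfs program adjust the instruction and memory limits for a single run.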

You might be wondering, why not just write a script using my favorite scripting language for zfs batch operations? That brings us to the next question:

Why Should I Use Channel Programs?

Consider this common scenario: as part of your OpenZFS snapshot management, you need to periodically traverse the list of snapshots and destroy snapshots that meet certain criteria. If you’re using snapshots, you have probably already written a script that does this for you.

However, you may have noticed that it takes some time for all of the snapshot operations to complete, especially when iterating over a large number of snapshots. The first reason for this is that the operations are issued by the zfs userland command, so each one is a separate round trip from userland into the kernel. In contrast, a Channel Program is evaluated entirely within the kernel, which avoids that per-operation overhead.

Running Channel Programs within the kernel gives the added advantage that Channel Programs guarantee consistency with concurrent ZFS modifications. For example, a Channel Program guarantees that the zfs list and zfs destroy commands both see the same ZFS pool state, whereas running these commands from a script leaves the chance that another ZFS operation outside of the script has modified the pool state before the next command in the script is run.

To understand the difference, let’s do a quick overview of OpenZFS transaction groups and synctasks.

A transaction group—also known as a TXG—is a collection of queued ZFS operations. These operations could be disk operations, such as writes, or they might be administrative operations, such as zfs list or zfs destroy.

If a transaction group contains an administrative operation, the step of actually running the operation is known as a synctask. In order to provide filesystem consistency, transaction groups are numbered sequentially, are always committed in sequential order, and there is only one open transaction group at any given time.

If the transaction group contains a synctask, the synctask runs last and must complete and return its results before the next transaction group can run. The time for a synctask to complete depends upon the operation, but can vary from a few milliseconds to several seconds.

Channel programs combine multiple operations, including iteration over datasets, into one compound operation that is performed in a single transaction group with a single synctask. This is a powerful concept that can provide both a significant performance boost and a filesystem consistency guarantee.

Consider a script that iterates through a snapshot listing, checks each snapshot against a set of criteria, destroys the snapshot if it meets them, then continues through the iteration. Every one of those operations requires its own transaction group and synctask. Besides the performance hit of running multiple synctasks, there is no guarantee that a transaction group created outside of the script won’t alter the state of the ZFS pool before the next transaction group generated by the script is run.

In contrast, a Channel Program containing the same operations would be sent to one transaction group and executed as one synctask. In addition to performance improvements, this ensures that the sequence of operations within that synctask occurs atomically and is not interrupted by any other transaction groups.
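
As a rough sketch of that scenario, the program below destroys every snapshot of one dataset that matches a criterion. The criterion used here, a snapshot-name prefix, is an assumption made for illustration, as are the function names; the zfs.list.snapshots, zfs.check.destroy, and zfs.sync.destroy calls come from zfs-program(8).

```lua
-- Return true when the snapshot part of "dataset@snap" starts with prefix.
function has_prefix(snapshot, prefix)
    local at = snapshot:find("@", 1, true)  -- plain-text search for "@"
    if not at then
        return false
    end
    return snapshot:sub(at + 1, at + #prefix) == prefix
end

-- Destroy every snapshot of dataset whose name starts with prefix.
function prune_snapshots(dataset, prefix)
    -- Collect the candidates first, so listing finishes before any change.
    local doomed = {}
    for snap in zfs.list.snapshots(dataset) do
        if has_prefix(snap, prefix) then
            table.insert(doomed, snap)
        end
    end
    -- Dry-run every destroy; raise an error before anything is modified.
    for _, snap in ipairs(doomed) do
        local err = zfs.check.destroy(snap)
        if err ~= 0 then
            error("cannot destroy " .. snap .. ", errno: " .. err)
        end
    end
    for _, snap in ipairs(doomed) do
        assert(zfs.sync.destroy(snap) == 0)
    end
    return doomed
end

local args = ...
if args and args.argv then
    return prune_snapshots(args.argv[1], args.argv[2])
end
```

The listing, the checks, and the destroys all run inside the one program, so no other administrative operation can change the snapshot list between them.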

As another example, a Channel Program could promote a clone and destroy it in the same synctask, or create recursive snapshots using a kernel API rather than the zfs userland application.
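
The recursive snapshot case might look something like the following sketch. The helper names are hypothetical; zfs.list.children, zfs.check.snapshot, and zfs.sync.snapshot are the library calls listed in zfs-program(8).

```lua
-- Append root and all of its descendants to names, parents first.
function gather_datasets(root, names)
    table.insert(names, root)
    for child in zfs.list.children(root) do
        gather_datasets(child, names)
    end
    return names
end

-- Snapshot root and every descendant under the same snapshot name.
-- Returns the number of snapshots taken.
function snapshot_recursive(root, snapname)
    local datasets = gather_datasets(root, {})
    -- Dry run: raise an error before any snapshot is taken if one would fail.
    for _, ds in ipairs(datasets) do
        local err = zfs.check.snapshot(ds .. "@" .. snapname)
        if err ~= 0 then
            error("cannot snapshot " .. ds .. ", errno: " .. err)
        end
    end
    for _, ds in ipairs(datasets) do
        assert(zfs.sync.snapshot(ds .. "@" .. snapname) == 0)
    end
    return #datasets
end

local args = ...
if args and args.argv then
    return snapshot_recursive(args.argv[1], args.argv[2])
end
```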

In short, when you need to execute a series of dependent zfs commands, you want the performance and reliability of a Channel Program.

The blog post Proposed ZFS Feature: Channel Programs provides some more examples of how multi-operation zfs commands benefit from being replaced by Channel Programs.

What operations are supported by Channel Programs?

zfs-program(8) lists the API functions for these supported operations:

zfs list: filesystems, snapshots, and clones; also used to iterate a dataset’s snapshots, children, or properties
zfs get: filesystem, snapshot, and volume properties
zfs destroy: dataset or snapshot; includes dry run support and marking a snapshot with holds or clones for deferred deletion
zfs promote: includes dry run support
zfs snapshot: includes dry run support
zfs rollback: for filesystems and zvols but not snapshots or mounted datasets; includes dry run support

As you can see, there are API functions for managing most bulk dataset and snapshot tasks.
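
As a small illustration of the list and get functions working together, this sketch walks a dataset tree and collects every dataset whose property has a given value. The function name is hypothetical, and the property and value are whatever the caller passes in; zfs.list.children and zfs.get_prop are the calls documented in zfs-program(8).

```lua
-- Collect every dataset at or below root whose property equals wanted.
function find_marked(root, property, wanted, found)
    found = found or {}
    -- zfs.get_prop returns the value first; only that value is compared.
    local value = zfs.get_prop(root, property)
    if value == wanted then
        table.insert(found, root)
    end
    for child in zfs.list.children(root) do
        find_marked(child, property, wanted, found)
    end
    return found
end

local args = ...
if args and args.argv then
    return find_marked(args.argv[1], args.argv[2], args.argv[3])
end
```

Because nothing here calls a zfs.sync function, the program only reads pool state.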

Why do Channel Programs use Lua?

If you’re not a programmer, or if Lua is not your programming language of choice, you may be wondering if it is worthwhile to learn yet another language in order to use Channel Programs. Lua was selected as it is small, easy to sandbox and embed in existing programs, and the interpreter is highly configurable.

zfs-program(8) lists the available APIs and provides some code examples, including one that demonstrates error handling. Refer to the Lua 5.2 reference manual for everything you need to know to get started with Lua.
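
In the same spirit as the manual page's error-handling example, here is a hedged sketch that records failures instead of aborting. It rolls back each direct child of a dataset and returns a table of the ones that failed. The function name is hypothetical; zfs.sync.rollback and zfs.list.children are taken from zfs-program(8).

```lua
-- Roll back each direct child of parent to its most recent snapshot.
-- Returns a table mapping each dataset that failed to its error number.
function rollback_children(parent)
    local failures = {}
    for child in zfs.list.children(parent) do
        -- A failed rollback is reported as a nonzero error number rather
        -- than a Lua error, so the loop can record it and carry on.
        local err = zfs.sync.rollback(child)
        if err ~= 0 then
            failures[child] = err
        end
    end
    return failures
end

local args = ...
if args and args.argv then
    return rollback_children(args.argv[1])
end
```

Returning the failures table lets the caller see exactly which datasets were skipped, whereas calling error() would stop the program at the first problem.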

What does a Channel Program look like?

The following example of a script for dataset batch deletion is from the ZFS Channel Programs blog post from Delphix. Refer to the blog post for a more complete description of how the script enforces correct deletion order and error handling. Note that the script is probably no longer or more complicated than the scripts you may be using now.

args = ...
pool = args["argv"][1]

function gather_destroy(root, to_destroy)
    for child in zfs.list.children(root) do
        to_destroy = gather_destroy(child, to_destroy)
    end
    for snap in zfs.list.snapshots(root) do
        for clone in zfs.list.clones(snap) do
            to_destroy = gather_destroy(clone, to_destroy)
        end
        table.insert(to_destroy, snap)
    end
    table.insert(to_destroy, root)
    return to_destroy
end

function cleanup_dataset(root)
    local datasets = gather_destroy(root, {})
    -- Dry-run every destroy first; abort before changing anything on an
    -- unexpected error.
    for _, ds in ipairs(datasets) do
        local err = zfs.check.destroy(ds)
        if (err ~= 0 and err ~= ECHILD) then
            error("failed to destroy " .. ds .. " errno: " .. err)
        end
    end
    for _, ds in ipairs(datasets) do
        assert(zfs.sync.destroy(ds) == 0)
    end
end

function recursive_cleanup(root)
    for child in zfs.list.children(root) do
        recursive_cleanup(child)
    end
    -- We may encounter these clones when recursing through children of some
    -- other filesystem, but we catch them here as well to make sure each is
    -- destroyed before its origin fs.
    for snap in zfs.list.snapshots(root) do
        for clone in zfs.list.clones(snap) do
            recursive_cleanup(clone)
        end
    end
    -- Only recursively destroy the dataset if it's marked for destruction
    if (zfs.get_prop(root, "gc:tmp_cleanup") == "yes") then
        cleanup_dataset(root)
    end
end

recursive_cleanup(pool)

Why is it called a Channel Program?

In IBM terminology, the channel subsystem (CSS) moves data into and out of a mainframe. Since the CSS is independent of the mainframe’s processors, input/output (I/O) within a mainframe can be done asynchronously. Asynchronous I/O is handled within the channel subsystem by, you guessed it, a channel program.

If you substitute ZFS for CSS and kernel for mainframe, Channel Program seems like a logical term for the functionality provided by ZFS Channel Programs.

Need help?

If you’re currently using scripts to manage snapshots or repetitive ZFS tasks, consider the performance and consistency benefits of replacing those scripts with Channel Programs. The resources mentioned in this article are a great starting point for creating your own Channel Programs.

If you need help creating or debugging Channel Programs, the OpenZFS experts at Klara are always available to provide support.
