Manipulating a Pool from the Rescue System

Going From Recovery Mode to Normal Operations with OpenZFS

We’ve all been there: that moment of panic when a system fails to boot back up. Perhaps there was a glitch with an upgrade. Maybe you’re wondering if you fumble-fingered a typo when you made that last change to loader.conf. But there you are, staring at single-user mode or, worse, the boot loader prompt, thinking “what now?”


Fortunately, FreeBSD and its built-in rescue mechanisms have you covered. Barring a truly catastrophic hardware failure, it is possible to quickly recover from most scenarios that prevent a system from booting into normal operation. And if you’re using OpenZFS, you can rest assured that your data is intact.

Let’s take a look at some common recovery scenarios.

Single-User Mode

If a system starts to boot normally but stops with this prompt after probing the disks, you’re in single-user mode:

Enter full pathname of shell or RETURN for /bin/sh:

In this mode, there is only one user (the superuser) and no authentication, no networking or running daemons, and most filesystems are unmounted. While that sounds rather dire, this mode provides the tools needed to repair whatever is preventing the system from completing the booting process.

Start by pressing Enter. If you get a # prompt, you’re now in the Bourne shell (/bin/sh). Next, see which of your OpenZFS datasets are mounted:

# mount
zroot/ROOT/default on / (zfs, local, noatime, read-only, nfsv4acls)

In this example, only the root dataset of the zroot pool is mounted, and it is mounted read-only. This means that if I try a command such as vi /etc/rc.conf, I’ll receive “read-only file system” errors. To remedy this, turn off the readonly property on the specified pool (note that the property name has no hyphen):

# zfs set readonly=off zroot

Then, mount all of the filesystems:

# zfs mount -a

Rerunning the mount command should show that all of the filesystems, including zroot/var, zroot/tmp, and zroot/usr, are mounted read-write. You should now be able to make any configuration file edits as well as use any other utilities needed to investigate and fix the problem. When finished, type exit. If your changes were successful, the system will continue to boot into normal operation.
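If the mount output is long, a small filter can pick out the ZFS filesystems that are still read-only. The following is only a sketch: list_readonly_zfs is a hypothetical helper name, and it assumes mount points without spaces and the output format shown in the example above.

```shell
# Hypothetical helper: print the dataset name of every ZFS filesystem that
# `mount` reports as read-only. It reads mount(8) output on standard input,
# so it can also be tried on saved output.
# Assumption: mount points contain no spaces.
list_readonly_zfs() {
    while read -r dataset _on _mountpoint options; do
        case "$options" in
            "(zfs,"*read-only*) printf '%s\n' "$dataset" ;;
        esac
    done
}

# Usage on the system being repaired (not run here):
#   mount | list_readonly_zfs
```

An empty result after zfs mount -a suggests that nothing is left mounted read-only.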

Using the Rescue Utilities

Since it is possible that the base utilities themselves (such as sh, mount, or vi) could become corrupt, FreeBSD provides a /rescue directory containing statically linked versions of these utilities.

This means that if a system in single-user mode is too damaged to enter the Bourne shell, you can type:

/rescue/sh

This should give you a shell prompt. As another example, if the single-user mode shell cannot run vi and you need to edit rc.conf, try this:

/rescue/vi /etc/rc.conf

If that fails, try the rescue version of the ed editor:

/rescue/ed /etc/rc.conf

Most of the commands you need to repair a system in single-user mode have rescue equivalents. You can see which utilities are available by typing ls /rescue.
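Rather than typing the /rescue/ prefix each time, one option is to put that directory at the front of the search path for the current shell session. This is a sketch, not a required step, and it only affects the shell you run it in:

```shell
# Prefer the statically linked tools in /rescue over the (possibly damaged)
# base-system binaries for the rest of this shell session.
PATH="/rescue:${PATH}"
export PATH
```

After this, a plain vi or mount resolves to the rescue copy whenever one exists, and falls back to the usual locations otherwise.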

Mount Root Prompt

If a system boot experiences an issue when mounting the root filesystem, it will stop at a prompt which looks like this:

mountroot>

If you suspect that the system is attempting to mount the wrong pool location, you can input the correct location at this prompt. This example points zfs to the location for a default FreeBSD installation, where zroot is the pool name, ROOT is the parent dataset, and default is the default boot environment:

mountroot> zfs:zroot/ROOT/default

However, that command will fail if the problem isn’t the location but rather that the required kernel modules were not loaded.

Boot Loader Prompt

To fix that situation, perform a cold boot and press 3 at the FreeBSD boot menu to “Escape to loader prompt”. This will stop the boot and display:

Exiting menu!
Type '?' for a list of commands, 'help' for more detailed help.
OK

The commands available at this prompt differ from those available from single-user mode or a fully booted system. Start by unloading any loaded kernel and modules and reloading the kernel. You should get an OK after each command:

OK unload
OK load /boot/kernel/kernel

Then, load the opensolaris and zfs kernel modules (.ko) which are needed to successfully mount an OpenZFS pool:

OK load /boot/kernel/opensolaris.ko
OK load /boot/kernel/zfs.ko
OK

If you get errors with the load commands and the boot failure followed a system upgrade, you can instead try loading the previous version of the kernel by replacing the /boot/kernel/ portion with /boot/kernel.old/ in all 3 load commands.

Once the load commands complete without errors, you should be able to successfully boot with a mounted pool:

OK boot

Once the system is up, double-check that these lines exist and are typed correctly in /boot/loader.conf:

opensolaris_load="YES"
zfs_load="YES"

Those lines instruct the system to load those kernel modules for you at boot. If the boot was into kernel.old, you will want to investigate and fix the reason for the upgrade failure so that you don’t have to repeat going into the loader prompt whenever the system reboots.
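Since a single mistyped character here (including a curly quote pasted from a web page) is enough to stop the module from loading, a quick check can help. The function below is a rough sketch: check_loader_conf is a hypothetical name, and it only recognizes the exact form shown above, with straight double quotes and uppercase YES.

```shell
# Hypothetical helper: verify that a loader.conf file contains both module
# lines exactly as shown in the article. Defaults to /boot/loader.conf.
# Returns 0 if both lines are present, 1 otherwise.
check_loader_conf() {
    conf=${1:-/boot/loader.conf}
    rc=0
    for var in opensolaris_load zfs_load; do
        if ! grep -q "^${var}=\"YES\"" "$conf"; then
            printf 'missing or mistyped: %s="YES"\n' "$var"
            rc=1
        fi
    done
    return "$rc"
}

# Usage (not run here):
#   check_loader_conf                        # checks /boot/loader.conf
#   check_loader_conf /mnt/boot/loader.conf  # checks a pool mounted under /mnt
```

Because the check is deliberately strict, a line it flags may still be valid and simply written in a different style.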

Mounting a Boot Environment

One of the quickest ways to recover from a boot failure due to a misconfiguration or failed update is to select a previous boot environment from the boot menu. (See our article on Managing Boot Environments for instructions on how to create and use boot environments. Its If Something Goes Wrong section covers repairing a system from a boot environment.)

If you’re in the habit of making a boot environment before performing an update or when testing configuration changes, the only down-time is the time it takes to reboot and select the previous boot environment. You can then mount the failing boot environment while the system is up and operational in order to fix the failure without disruption to the users of the system.
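To make that habit cheap, the creation step can be wrapped in a one-line helper. This is a sketch under assumptions: be_checkpoint and the RUN variable are hypothetical names, and the timestamped naming scheme is just one possible convention.

```shell
# Hypothetical helper: create a boot environment with a timestamped name
# before a risky change. Set RUN=echo to print the command instead of
# running it.
be_checkpoint() {
    name="${1:-pre-change}-$(date +%Y%m%d-%H%M%S)"
    ${RUN:-} bectl create "$name"
}

# Usage (not run here):
#   be_checkpoint pre-upgrade     # e.g. creates pre-upgrade-20240101-120000
#   RUN=echo be_checkpoint        # dry run, prints the bectl command
```

Once the system is running from the working environment, bectl mount can attach the failing one at a mount point of your choice so that you can inspect and repair it.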

Live USB

If all else fails, booting the system from the FreeBSD installation disk (USB or CD/DVD image) offers an option to enter a shell where you can try to repair your system.

First, import the pool with an ‘altroot’ (a path prepended to all of the mountpoints, so as not to mount over top of the live system you are using). We also set the “do not automatically mount filesystems” flag, because we need to manually mount the boot environment.

# zpool import -R /mnt -N -f zroot

Once that completes, confirm that you can see all of your datasets with zfs list, then mount the root dataset of the boot environment:

# mount -t zfs zroot/ROOT/default /mnt

If you are not sure what is currently the default boot environment, you can use:

# zpool get bootfs zroot

This returns the name of the current default boot environment. Once you have mounted the root dataset, you can mount the rest of the ZFS datasets:

# zfs mount -a

Now your broken system is mounted with a prefix of /mnt, so rc.conf will be in /mnt/etc/rc.conf, and you can edit files as required to repair your system.
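The three steps above can be collected into one function for repeated use from the live shell. Treat this as a sketch: rescue_mount and the RUN variable are hypothetical names, the ROOT/ parent dataset matches the default FreeBSD layout described earlier and may differ on your system, and -f forces the import, so only use it on a pool that no other system has imported.

```shell
# Hypothetical helper: import a pool under an altroot without mounting
# anything, mount the chosen boot environment as the root, then mount the
# remaining datasets. Set RUN=echo to print the commands instead of
# running them.
rescue_mount() {
    if [ "$#" -lt 2 ]; then
        echo "usage: rescue_mount pool bootenv [altroot]" >&2
        return 2
    fi
    pool=$1
    be=$2
    altroot=${3:-/mnt}
    ${RUN:-} zpool import -R "$altroot" -N -f "$pool" &&
    ${RUN:-} mount -t zfs "$pool/ROOT/$be" "$altroot" &&
    ${RUN:-} zfs mount -a
}

# Usage (not run here):
#   RUN=echo rescue_mount zroot default   # dry run
#   rescue_mount zroot default            # import and mount under /mnt
```

The && chain stops at the first failing step, so a failed import does not lead to a mount attempt against a pool that is not there.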

Then just reboot without the USB/CD image connected, and your system should boot as normal.

Takeaways

Part of an administrator’s toolkit for preparing a system for resiliency against boot failures is taking advantage of OpenZFS boot environments. They’re instantaneous to create: get in the habit of creating one before any operation that could potentially prevent the system from booting.

When attempting the first boot of a just-upgraded system, consider keeping the boot environment you created before you started as the default, and using the boot-once feature (`bectl activate -t newbootenvironment`) to try the upgraded one. The system will override the default boot environment for the next boot only, allowing you to revert to the working system by just rebooting after a failed boot.

All is not lost if a FreeBSD administrator didn’t create a boot environment and the system isn’t booting. The built-in tools in single-user mode, the loader prompt, and the rescue utilities are capable of recovering a system from most boot failures.


Meet the author: Dru Lavigne

Dru Lavigne is a retired network and systems administrator, IT instructor, author, and international speaker. Dru is author of BSD Hacks, The Best of FreeBSD Basics, and The Definitive Guide to PC-BSD.
