FreeBSD iostat – A Quick Glance

Understanding the Storage Subsystem and Disk I/O

iostat provides a window into the i/o effort of the storage subsystem. You can use it to determine usage patterns, bottlenecks and poor behavior at a glance. It can produce data to support conclusions and suggest further avenues of investigation when used judiciously. In this article, we will dissect its output and introduce disk subsystem troubleshooting using statistical output from iostat. 

With no arguments, iostat produces the following output:

       tty            nvd0             ada0              da0             cpu 
 tin  tout  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id 
   0   112 15.89   0  0.00  27.28   1  0.04  34.56   0  0.00   0  0  0  0 100 

Let’s decode the output:

tty: terminal activity (see the ‘-d’ flag under Reduce Cognitive Load):

  • tin: characters read from terminals
  • tout: characters written to terminals

Device columns (nvd0, ada0, da0): the column title is the device name, and it also vertically delineates the statistics for that device:

  • KB/t: average (mean) size of a transaction in KB
  • tps: transactions per second
  • MB/s: total throughput in megabytes per second (reads and writes combined)

cpu: percentage of time spent in each of the following modes:

  • us: user mode 
  • ni: nice 
  • sy: system (kernel) 
  • in: interrupt (hardware) 
  • id: idle 

The CPU statistics let you verify that the system was under load while you examine the storage figures. These stats must be interpreted with an understanding of the workload. A system with high user CPU utilization may not be i/o bound. However, if the system shows mostly kernel and interrupt time, there may be an i/o throughput limitation to investigate further.

Extended device data is available with the ‘-x’ switch:

                        extended device statistics 
device       r/s     w/s     kr/s     kw/s  ms/r  ms/w  ms/o  ms/t qlen  %b 
ada0           0       0     28.8      7.6     0     0     1     0    0   0 
da0            0       0      0.3      0.1     3    10     0     6    0   0 
... 
da41           0       0      0.6      3.7     3    12   786    17    0   0 

This view displays extended statistics for the entire system, including every disk device available on the system. These statistics can reveal the health of specific disks. 

The fields explained: 

  • r/s: read transactions per second
  • w/s: write transactions per second
  • kr/s: kilobytes read per second
  • kw/s: kilobytes written per second
  • ms/r: mean time (milliseconds) per read
  • ms/w: mean time (milliseconds) per write
  • ms/o: mean time (milliseconds) per other transaction (neither read nor write)
  • ms/t: mean time (milliseconds) per transaction of any kind
  • qlen: the depth of the transaction queue
  • %b: percentage of time the device was busy
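
The field list above maps directly onto a small data structure. The following is a minimal Python sketch, not part of iostat itself, that parses one extended report into a dictionary keyed by device name. The function name parse_extended and the SAMPLE text (copied from the output shown earlier) are illustrative assumptions; on other platforms the column set differs, so the header line is read rather than hard-coded.

```python
SAMPLE = """\
                        extended device statistics
device       r/s     w/s     kr/s     kw/s  ms/r  ms/w  ms/o  ms/t qlen  %b
ada0           0       0     28.8      7.6     0     0     1     0    0   0
da0            0       0      0.3      0.1     3    10     0     6    0   0
da41           0       0      0.6      3.7     3    12   786    17    0   0
"""


def parse_extended(text):
    """Parse `iostat -x` style text into {device: {column: float}}.

    If the text contains several reports, later rows for a device
    overwrite earlier ones.
    """
    rows = {}
    columns = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "device":
            # Header line: remember the column names that follow "device".
            columns = fields[1:]
            continue
        if columns is None or len(fields) != len(columns) + 1:
            continue  # banner line, or a row that does not match the header
        try:
            values = [float(v) for v in fields[1:]]
        except ValueError:
            continue  # not a data row
        rows[fields[0]] = dict(zip(columns, values))
    return rows


stats = parse_extended(SAMPLE)
print(stats["da41"]["ms/o"])  # 786.0
```

Keying each row by column name keeps later checks readable, for example stats["da0"]["ms/w"].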

The r/s and w/s values indicate how heavily the device is used. The figures should correlate to the workload’s demands.

For example, an OLTP application will burst reads and writes and will not tolerate high queue depths. Alternatively, a data recording application will show a continuous stream of writes while being able to tolerate higher queue depths in an effort to improve throughput.

kr/s and kw/s should similarly track the workload’s needs. A mismatch between the observed and expected values requires investigation. If write transactions per second (w/s) are high, yet the throughput (kw/s) is suspiciously low, an application may be using a poor i/o pattern such as single-byte writes. This ‘tinygram’ anti-pattern would support modifying the application to use larger writes.
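
The tinygram check is simple arithmetic: dividing kw/s by w/s gives the mean size of a write. A minimal Python sketch follows; the function names, the example figures, and the 4 KB cutoff are assumptions chosen for illustration, not values from iostat or from any standard.

```python
def mean_write_kb(w_per_s, kw_per_s):
    """Mean size of a write in KB, or None when no writes occurred."""
    if w_per_s <= 0:
        return None
    return kw_per_s / w_per_s


def looks_like_tinygram(w_per_s, kw_per_s, min_kb=4.0):
    """True when writes are frequent enough to measure but smaller than min_kb.

    min_kb is an assumed cutoff; pick one that suits your workload.
    """
    size = mean_write_kb(w_per_s, kw_per_s)
    return size is not None and size < min_kb


# Hypothetical readings: 5000 writes/s moving only 40 KB/s in total,
# which works out to roughly 8 bytes per write.
print(looks_like_tinygram(5000, 40.0))     # True
# Hypothetical readings: 200 writes/s moving 25600 KB/s (128 KB per write).
print(looks_like_tinygram(200, 25600.0))   # False
```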

The mean time grouping (ms/r, ms/w, ms/t) reveals the drive’s record of retiring requests. The value for a single transaction should be near the stated performance of the media: <3ms for SSDs and <20ms for spinning disks. Large departures from these values suggest a welfare check on the specific device.

qlen and %b provide a snapshot of how heavily loaded a device is; the two values are correlated. Tagged queuing and native command queuing allow a storage system to improve performance by collecting requests and ordering them optimally for execution. SSD and NVMe drives are comfortable with queue depths exceeding twenty, while spinning disks may struggle beyond eight outstanding transactions.
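
The rules of thumb above can be turned into a quick screening pass. This is a sketch under stated assumptions: the LIMITS table simply restates the figures quoted in the text, the review function and its media argument are hypothetical names, and iostat does not report the media type, so the caller has to supply it.

```python
# Rule-of-thumb limits restated from the text; tune them for real hardware.
LIMITS = {
    "ssd": {"latency_ms": 3.0, "qlen": 20},
    "hdd": {"latency_ms": 20.0, "qlen": 8},
}


def review(row, media):
    """Return a list of warnings for one device row of extended statistics.

    row maps column names (ms/r, ms/w, qlen) to numbers;
    media is "ssd" or "hdd" and must be supplied by the caller.
    """
    max_latency = LIMITS[media]["latency_ms"]
    max_qlen = LIMITS[media]["qlen"]
    warnings = []
    for column in ("ms/r", "ms/w"):
        if row[column] > max_latency:
            warnings.append(f"{column} is {row[column]:g} ms, above {max_latency:g} ms")
    if row["qlen"] > max_qlen:
        warnings.append(f"qlen is {row['qlen']:g}, above {max_qlen}")
    return warnings


# The da41 row from the sample output shown earlier.
da41 = {"ms/r": 3, "ms/w": 12, "qlen": 0}
print(review(da41, "hdd"))  # []
print(review(da41, "ssd"))  # ['ms/w is 12 ms, above 3 ms']
```

A warning from a screen like this is a prompt to look closer, not a verdict on the device.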

If a device shows sustained high queue lengths or high busy values, it might be the bottleneck in your workload. Investigate why that device is hot-spotted: it may be an opportunity to split the work across less heavily loaded peers, or the device itself may be delaying operations. Excessively delayed transactions can indicate that the disk is heading toward failure. The drive firmware will retry operations in the hope of hiding underlying faults, but those retries may block all the other i/o for your workload. If the drive has poor latency, investigate with a low-level tool such as smartctl. Replacing the disk before it fails turns an emergency into routine maintenance.

Trending

iostat will repeat its output at a fixed interval with the ‘-w <seconds>’ flag. A large delay between reports produces an overview that hides bursts and troughs in favor of a broad indication of throughput. Decreasing the value below 1.0 allows fine time steps. For example, ‘-w 0.050’ will produce a report every 50 milliseconds. This resolution may be helpful if you want to see bursts of i/o or are looking for fine-grained i/o patterns. At the opposite time scale, ‘-I’ reports totals rather than per-second averages, so the first report shows the cumulative numbers since boot time.

Reduce Cognitive Load

It’s easy to get torrents of numbers out of iostat, but resist the urge to collect more data than you need. Use filters to list only the data of interest. For example, the ‘-t’ parameter allows you to specify the device classes you are interested in (SCSI, IDE, tty …). By default, iostat includes the CPU and tty classes, which are interesting but not directly probative when diagnosing storage subsystem behavior. Adding ‘-d’ to the command line mutes them, as it is unlikely you are troubleshooting the teletype subsystem. If asked for all devices, iostat will also display the ‘pass’ devices associated with ‘da’ SCSI-like devices, but they are not relevant to throughput or health analysis. Provide ‘-c <count>’ to limit the number of reports and avoid flooding your terminal session. Naming devices at the end of the command selects them one by one; shell-like globbing patterns are not supported.
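
When iostat is driven from a script, the flags discussed so far (‘-d’, ‘-w’, ‘-c’ and trailing device names) can be assembled in one place. The helper below is a hypothetical sketch: the name iostat_command and its parameters are assumptions, and it only builds the argument list. Actually running the command (for example with Python’s subprocess module) is left out because it requires a FreeBSD host.

```python
def iostat_command(devices=(), interval=None, count=None, devices_only=True):
    """Build an argument list for iostat from the flags described above."""
    argv = ["iostat"]
    if devices_only:
        argv.append("-d")                # mute the tty and cpu columns
    if interval is not None:
        argv += ["-w", str(interval)]    # seconds between reports
    if count is not None:
        argv += ["-c", str(count)]       # number of reports to print
    argv += list(devices)                # devices are named one by one, no globbing
    return argv


print(" ".join(iostat_command(["ada0", "da0"], interval=1, count=5)))
# iostat -d -w 1 -c 5 ada0 da0
```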

Parting Caution

iostat is like the top command: it provides indicators at a glance. However, a glance is insufficient to fully characterize a complex system; don’t use a single glance with top or iostat to make critical decisions.

iostat reports the statistical mean as a primary indicator. The mean is infamous for hiding outliers and blurring modal distributions. If iostat reports a value that is unusual, investigate further with a tool that produces better statistical indicators. A histogram of latency is more useful than a mean; tools such as dtrace can generate these indicators.
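
A small worked example shows how a mean can hide a problem. The latencies below are invented for illustration (98 fast requests and 2 stalled ones), and the histogram helper with its power-of-two bucket edges is an assumption, loosely modeled on the kind of distribution a tracing tool can report.

```python
from collections import Counter
from statistics import mean

# Hypothetical per-request latencies in ms: 98 fast requests and 2 stalled ones.
latencies = [1.0] * 98 + [500.0] * 2


def histogram(samples, edges=(1, 2, 4, 8, 16, 32, 64, 128, 256, 512)):
    """Count samples per bucket; each bucket is labeled by its upper edge in ms."""
    counts = Counter()
    for sample in samples:
        # First edge that the sample fits under; anything larger goes to "inf".
        label = next((edge for edge in edges if sample <= edge), float("inf"))
        counts[label] += 1
    return dict(sorted(counts.items()))


print(mean(latencies))       # 10.98
print(histogram(latencies))  # {1: 98, 512: 2}
```

The mean of roughly 11 ms describes none of the requests: almost all completed in 1 ms, and two took half a second. The histogram makes both groups visible.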

iostat output varies in format across platforms, so interpret the output in the context of the platform that produced it.
