Improve the way you make use of FreeBSD in your company.
Find out more about what makes us the most reliable FreeBSD development organization, and take the next step in engaging with us for your next FreeBSD project.
FreeBSD Support FreeBSD DevelopmentAdditional Articles
Here are more interesting articles on FreeBSD that you may find useful:
- Debunking Common Myths About FreeBSD
- GPL 3: The Controversial Licensing Model and Potential Solutions
- Our 2023 Recommended Summer Reads 2023 FreeBSD and Linux
- FreeBSD – Linux and FreeBSD Firewalls – Part 2
- FreeBSD – 3 Advantages to Running FreeBSD as Your Server Operating System
A Star is Born
According to DTrace: The Reverse Engineer’s Unexpected Swiss Army Knife by Tiller Beauchamp and David Weston, development of DTrace started in 2001. At the time, Cantrill was the sole developer. Leventhal and Shapiro joined the effort later. The changelog, shows that DTrace was first released in Sun's Solaris 10 in 2004. In 2006, Cantrill talked to TechRepublic about how DTrace came about. The initial idea was sparked in 1997 when Sun ran into a problem while benchmarking a new system."In 1997 we were working on breaking some Solaris benchmarking and ran into problems for two weeks with some of the best minds in the community. The answer was simple: one of the computers was thinking it was a router when it was supposed to be benchmarking. However, after that experience I was a bit shaken. We had some of the best Solaris expertise in the world on this problem. What would happen if this was a customer? A lot of customers either live with the problem or escalate it with Sun and see if we could answer it or not. We realised we had to come up with a way to dynamically instrument production systems and answer questions about the systems. The problems we had before can now be solved by one person in one afternoon, instead of two weeks with 16 or so people."Cantrill talked further about DTrace's origins in an interview with Lightstep's Ben Sigelman. Cantrill said that DTrace came out of a simple problem they had—“if a complicated system is doing something unexpected, how do you determine what it is doing and why, without taking the system down? Why is the system not operating as well as we think it should be? It's not crashing. It's operating. But we want it to be operating much more efficiently.” Cantrill also remarked that “The debugging literature prior to that was really focused on debugging systems in development, as opposed to actually examine systems in production." Cantrill also said that they had to make sure DTrace was implemented in such a way that it would not panic or crash the system because "if you run a tool and the system dies as a result, you will never be allowed to run that tool again." DTrace: The Reverse Engineer’s Unexpected Swiss Army Knife goes on to state that, "DTrace was Sun’s first software component to be released under their own open source Common Development and Distribution License (CDDL)." However, some groups were slow to port DTrace because they didn't trust the CDDL—for example, Adam Leventhal claimed in 2011 that Oracle believed the CDDL license would “make DTrace too toxic for other Linux vendors.” These license concerns may have contributed to Red Hat’s decision to release a similar utility named SystemTap.
Other Systems Add Support
MacOS was one of the first non-Solaris operating system to add support for DTrace in 2007's Mac OS X 10.5 Leopard. Apple also included a graphical tool named Instrument to make DTrace more user-friendly.DTrace became part of FreeBSD with the release of FreeBSD 7.1 in 2009—two years before Oracle began porting DTrace, and nine years before Oracle eventually solved the inherent CDDL vs GPL license conflict by adding a GPLv2 dual license option for the kernel-space portion that they now owned, and a UPL license for the formerly-proprietary userspace tools.
Shortly after Oracle purchased Sun Microsystems in 2010, several key OpenSolaris developers announced the creation of Illumos, an open source fork of OpenSolaris which included DTrace. This fork was based on the OpenSolaris OS/Net consolidation. In 2018, Oracle's Linux port of Dtrace hit the 1.0 milestone. Oracle is still working to improve this version, with released in 2020. Up to this point, each port was maintained separately, with the Illumos, Apple, and FreeBSD versions of DTrace drifting apart, and the Oracle-maintained and developed project being entirely separate. That changed in 2016 when the OpenDTrace project appeared on GitHub. According to an article from TechGig, the purpose of the OpenDTrace project is to "produce a portable implementation of DTrace for multiple operating systems". The project's GitHub account states, "new scripts being licensed under the BSD 2-Clause license and modifications to the original scripts remaining under the CDDL" Also in 2018, Microsoft announced that they were porting DTrace to Windows from FreeBSD. DTrace for Windows became available to the general public in Windows 10 Insider builds, starting in March of 2019. Earlier versions of Windows were not supported.What can DTrace Do For Me?
The goal of dtrace is to be able to dynamically instrument a system. Traditionally, operating systems had debugging features, a flag you could enable at build or boot time, to output verbose messages about what the system was doing, to be able to investigate issues. Unfortunately, traditional debugging systems were generally not very fine grained—and, worse, produced significant (and sometimes enormous) performance impacts once enabled. This made troubleshooting performance problems very difficult, since the debugging system could constitute a performance issue of its own!Traditional debuggers also generally needed to be enabled when the software was being compiled or started—so if a problem cropped up while a system was running, the only way to try to investigate it was to restart with debugging enabled. This would cripple performance, and possibly make the cause of the issue disappear.
Between the destruction of state involved in restarting an app with debugging enabled, and the large change in performance profile with debugging running, many performance investigations effectively began already stuck at a dead end. By contrast, Dtrace allows the tracepoints to exist as disabled code which do not impact performance. They can then dynamically be enabled at or during runtime, when an issue arises. There are 3 main varieties of tracing: Statically defined tracing (SDT), system call tracing, and function boundary tracing (FBT). The former allows specific probes to be created while the system is being developed, allowing details of specific events to be output easily. The latter instruments nearly every function in the code, allowing the observer to see who called a function, and with what arguments, as well as the return value. With this set of tools, most problems can be investigated. Reads from disk seem slow? Start by asking (and answering) a few basic questions:- What applications are doing the most reads?
- How big are the reads?
- How long do the reads take?
Using DTrace
Now that we've looked at the history of DTrace, let's look at how to use it with some quick examples. These examples were tested on FreeBSD.kldload dtrace_all
List all of the probes:
dtrace -l
Search for a particular probe:
dtrace -l | grep read
Filter the list of probes:
dtrace -l -f syscall:freebsd: | grep read
Coming back to our previous example, if we want to determine what programs are performing read system calls, we can do:dtrace -n ‘syscall::read:entry { print(execname); }’
This will print out the executable name for each program when it does a read system call:dtrace: description 'syscall::read:entry ' matched 2 probes
CPU ID FUNCTION:NAME
5 90124 read:entry string "firefox"
5 90124 read:entry string "firefox"
5 90124 read:entry string "sshd"
3 90124 read:entry string "firefox"
5 90124 read:entry string "sshd"
3 90124 read:entry string "firefox"
5 90124 read:entry string "sshd"
3 90124 read:entry string "firefox"
5 90124 read:entry string "sshd"
5 90124 read:entry string "sshd"
Although this output turns out not to be terribly useful. Dtrace has a series of “aggregation” functions which produce more useful output. Let’s perform the same trace again—but this time, we’ll aggregate the results based on the executable name. This will run until you hit control+c, then display the results:dtrace -n 'syscall::read:entry { @[execname] = count(); }'
The results:dtrace: description 'syscall::read:entry ' matched 2 probes
ssh 1
qterminal 2
sshd 2
lumina-desktop 30
firefox 492
Much more usable output. Another aggregation function allows us to create a histogram, to break down these read calls by their size, and see how many of these calls fall into buckets of different sizes. For the read syscall, the arguments are: arg0=file descriptor to read from, arg1 = buffer to read into, arg2 = number of bytes to read, see man 2 read:dtrace -n ‘syscall::read:entry { @[execname] = quantize(arg2); }’
The output:dtrace: description 'syscall::read:entry ' matched 2 probes
firefox
value ------------- Distribution ------------- count
0 | 0
1 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 447
2 | 0
sshd
value ------------- Distribution ------------- count
0 | 0
1 |@@@@@@@@@ 27
2 | 0
4 |@@@@@@@ 21
8 |@ 2
16 |@ 2
32 |@ 4
64 |@@ 6
128 | 1
256 |@@ 6
512 |@ 2
1024 | 0
2048 |@ 3
4096 |@@@@@@@@ 26
8192 |@@@@@ 16
16384 |@ 4
32768 |@ 4
65536 | 0
dd
value ------------- Distribution ------------- count
16 | 0
32 | 2
64 | 0
128 | 1
256 | 1
512 | 0
1024 | 0
2048 | 0
4096 | 0
8192 | 0
16384 | 0
32768 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 3438337
65536 | 0
From this we can see that the reads from firefox are 1 byte, and not likely the cause of any issues. We see that a dd process is reading a large number of blocks of 32 KB, while sshd’s reads are distributed across an array of sizes. To answer our last question, we can look at the timing of the calls, by measuring the elapsed time between the entry to the syscall, and its eventual return:dtrace -n 'syscall::read:entry { self->ts = timestamp; } syscall::read:return /self->ts/ { @[execname, "ns"] = quantize(timestamp - self->ts); self->ts = 0; }'
This script will set a variable (self->ts) to the current timestamp when a read syscall is initiated. When a read syscall returns, if it has that variable set, we will subtract it from the current timestamp, and add that value to our aggregation:dtrace: description 'syscall::read:entry ' matched 4 probes
firefox ns
value ------------- Distribution ------------- count
256 | 0
512 |@@@@@@@@@ 131
1024 |@@@@@@ 97
2048 |@@@@@@@@@@@@@@@@ 251
4096 |@@@@@@@@ 116
8192 |@ 14
16384 | 0
sshd ns
value ------------- Distribution ------------- count
256 | 0
512 |@@@@@@@@@@@@ 104
1024 |@@@@@@ 54
2048 |@@@@@@@@@@ 88
4096 |@@@@@ 40
8192 |@@@@ 36
16384 |@@ 15
32768 | 2
65536 | 0
131072 | 0
262144 | 0
524288 |@ 6
1048576 | 0
dd ns
value ------------- Distribution ------------- count
8192 | 0
16384 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 54393
32768 |@@@@@@@@@@ 19190
65536 | 0
From this we see that about 26% of the reads done by dd took more than 32,768 nanoseconds, while the rest took between 16,384 and 32,767 ns. We also see a small number (6) of outliers in the ssh read calls, taking more than half a million nanoseconds. This is only a small taste of what dtrace can do, but you can see how quickly and easily you can determine what is happening on a system while it is running, in a non-disruptive way. Some other examples you might find useful: 1.) Count system calls by program name:dtrace -n 'syscall:::entry { @[execname] = count(); }'
2.) Find the parent of a process calling a syscall:dtrace -n 'syscall::read:entry { printf("execname=%s parent_execname=%s", execname, curthread->td_proc->p_pptr->p_comm); }'
3.) Trace file opens by process and filename:dtrace -n 'syscall::open*:entry { printf("%s %s", execname, copyinstr(arg0)); }'
4.) Trace file opens with process and filename:dwatch -X open
5.) Find the parent of a process calling a syscall:dwatch -R syscall::read:entry
6.) Watch signals being passed to kill syscall:dwatch -X kill
7.) Watch signals being passed between bash(1) and vi(1):dwatch -k bash -k vi -X kill
8.) Display the first process to call kill syscall and then exit:dwatch -N 1 kill
9.) Count IP-level events:dtrace -n 'ip::: { @[probename] = count(); }'
10.) Trace new processes showing parent program and command executed:dwatch execve
11.) Trace new processes showing mini process-tree for command executed:dwatch -R execve
12.) Trace new processes belonging to root super-user:dwatch -u root execve
13.) Trace new processes belonging to wheel super-group:dwatch -g wheel execve
You can find more information on using DTrace here.Allan Jude
ZFS Engineering Manager and co-Founder of Klara Inc., Allan has been a part of the FreeBSD community since 1999 (that’s more than 20 years ago!), and an active participant in the ZFS community since 2013. In his free time he enjoys baking, and obviously a good episode of Star Trek: DS9.
Learn About KlaraWhat makes us different, is our dedication to the FreeBSD project.
Through our commitment to the project, we ensure that you, our customers, are always on the receiving end of the best development for FreeBSD. With our values deeply tied into the community, and our developers a major part of it, we exist on the border between your infrastructure and the open source world.