Using Netgraph for FreeBSD’s Bhyve Networking

FreeBSD’s bhyve hypervisor offers support for virtual network connections. Beginning with FreeBSD 13, bhyve also supports a netgraph backend for its virtual network devices.


Understanding Netgraph

Netgraph is a high-performance modular networking framework that has been part of FreeBSD for more than 20 years: since the FreeBSD 3.4 release in 1999, to be exact. Netgraph’s modular design allows for arbitrary stacking of protocols and transports along with features such as filtering, tunneling, redirection, inspection, and injection. Essentially, netgraph is to networking what the geom layer is to disks and storage, with consumers including FreeBSD’s Bluetooth subsystem and PPP dial-up networking stack.

Despite its long history and rich feature set, netgraph is often overlooked. Much of the documentation describes individual modules rather than giving guided, step-by-step instructions for building a complete solution to a common problem.

The example we will present in this article is a basic recipe which demonstrates some common netgraph syntax and use-cases.

Bhyve Networking

Although it may be possible to use PCI passthrough to make a physical network card available to a virtual machine, that approach is neither convenient nor scalable. It is typically easier and more sensible to use bhyve’s built-in network stack to provide a completely virtual network device for the guest operating system.

From the perspective of the host system, the only choice of backend for this network device before FreeBSD 13 was tap(4), also known as vmnet. This creates a tap virtual network interface on the host system which can be configured like any other network interface. 

Once an IP address is allocated to the tap adapter, you can enable routing of packets between it and another network. In many cases, the simplest option is to use a bridge to join it with the host’s primary network interface. The tap adapter can then be passed down to the guest operating system, allowing it to have an address on the same subnet as the rest of your network. 
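For comparison with the netgraph approach later in this article, the tap-plus-bridge arrangement described above might be set up along these lines. This is only a sketch: setup_tap_bridge is a helper name made up for illustration, and the interface names passed to it (such as re0, tap0 and bridge0) are assumptions that will differ between systems.

```shell
# Sketch: join a tap interface and a physical interface with if_bridge(4).
# setup_tap_bridge is a hypothetical helper; interface names are assumptions.
setup_tap_bridge() (
  phys=$1      # physical interface, e.g. re0
  tap=$2       # tap interface handed to bhyve, e.g. tap0
  bridge=$3    # bridge interface, e.g. bridge0

  # Create both virtual interfaces, then add the members and bring the
  # bridge up. The chain stops at the first command that fails.
  ifconfig "$tap" create &&
  ifconfig "$bridge" create &&
  ifconfig "$bridge" addm "$phys" addm "$tap" up
)
```

Run as root, setup_tap_bridge re0 tap0 bridge0 would issue the three ifconfig commands in order.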

In the article Virtualise your network on FreeBSD with VNET, we demonstrated the setup of such a bridge to join a physical em0 interface to the epair0a interface that provided the host’s side of a connection to a VNET jail. That setup uses the if_bridge(4) network device, but there is also a netgraph counterpart: ng_bridge(4).

A Netgraph Bridge

A netgraph system consists of nodes joined together with edges to form a graph. Data packets flow from one node to another, with each node performing a single task. Netgraph also supports control messages that are passed directly between nodes.

Each node is an instance of a specific node type. The type defines what hooks the node supports, what the node does with data received on each hook, and what control messages it understands.

One common approach to bridging with netgraph is to use a modified copy of the script in /usr/share/examples/netgraph/ether.bridge. That script only works for bridging local interfaces, though, so we’ll break down the individual commands it uses.

First, you must load the relevant kernel modules. You can do this automatically at boot time by adding entries to /boot/loader.conf, or use the kldload command to load them on demand in a running system. Using kldload’s -n and -q options ensures that we don’t get error messages if the modules are already loaded:

  kldload -nq ng_ether ng_bridge
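To load the same modules at boot instead, the loader.conf route mentioned above usually takes one line per module. The fragment below is a sketch that assumes the conventional module_load naming described in loader.conf(5):

```shell
# /boot/loader.conf fragment (sketch): load the netgraph modules at boot
ng_ether_load="YES"
ng_bridge_load="YES"
```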

You now need to identify the physical network interface that you want to bridge to the virtual machine. You can run ifconfig -a to list network interfaces. This might list interfaces for all sorts of things like Wi-Fi, link aggregation, VLANs and so on. On my desktop, aside from lo0 for loopback, I have just a re0 interface so that’s what I’ll use in the examples. The naming here indicates which driver is used; the re(4) man page confirms that this is the driver for RealTek hardware and documents the options specific to it.

Controlling netgraph is done with the ngctl command. Let’s start by listing the available netgraph nodes:

    # ngctl list
    There are 2 total nodes:
      Name: re0             Type: ether           ID: 00000009   Num hooks: 0
      Name: ngctl3418       Type: socket          ID: 0000000e   Num hooks: 0

You should see a node of type ether for each network interface. (The ng_ether(4) man page covers the ether node type.) 
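On a host with many nodes, it can help to pick out just the ether nodes. The following is a minimal sketch that assumes the column layout shown in the listing above; ether_nodes is a hypothetical helper name, not part of ngctl.

```shell
# Sketch: print the names of all ether-type nodes from `ngctl list` output.
# ether_nodes is a hypothetical helper; it reads the listing on stdin and
# assumes the "Name: <name> Type: <type> ..." layout shown above.
ether_nodes() {
  awk '$1 == "Name:" && $3 == "Type:" && $4 == "ether" { print $2 }'
}
```

It would typically be used as ngctl list | ether_nodes.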

Next, we use the ngctl mkpeer command to create a bridge. The bridge is created as a peer of the existing re0 node, and a connection is made from the lower hook on that to the link0 hook on the bridge:

  ngctl mkpeer re0: bridge lower link0

We now have an unnamed node of type bridge. We can refer to this unnamed node via the hook as re0:lower, but it’s better practice to give it a proper name. This name can be anything, but we’ll use bnet0 here:

  ngctl name re0:lower bnet0

Often, both the lower and upper hooks of the ether node need to be connected to the bridge. The lower hook got connected as part of the creation, and connecting it first avoided any temporary loss of connectivity. The upper hook only needs to be connected in the case of interfaces that the host is using itself. The command to connect it is as follows:

  ngctl connect re0: bnet0: upper link1

Finally, we need to send a couple of control messages to the ether node. First, we enable promiscuous mode, to ensure that the interface will pick up all network packets, not just packets targeted to the host itself. Without promiscuous mode enabled, the host would reject packets destined for the virtual machines. 

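The second control message tells the ether node not to overwrite the source Ethernet address on outgoing packets, so that frames from the virtual machines keep their own MAC addresses: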

  ngctl msg re0: setpromisc 1
  ngctl msg re0: setautosrc 0
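To confirm that the bridge and the two settings are in place, you can read the state back. This sketch relies on the getpromisc and getautosrc messages documented in ng_ether(4) and on ngctl's show command; show_bridge_state is a hypothetical helper name, and the exact output format is not reproduced here.

```shell
# Sketch: read back the state configured above. show_bridge_state is a
# hypothetical helper; getpromisc and getautosrc are the ng_ether(4)
# counterparts of the set messages used in this article.
show_bridge_state() (
  iface=$1     # ether node, e.g. re0
  bridge=$2    # bridge node name, e.g. bnet0

  ngctl show "${bridge}:"             # list the bridge's hooks and peers
  ngctl msg "${iface}:" getpromisc    # query promiscuous mode
  ngctl msg "${iface}:" getautosrc    # query source address overwriting
)
```

For the names used in this article, the call would be show_bridge_state re0 bnet0.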

With our bridge up and configured, we’re ready to start the virtual machine. Bringing up a bhyve guest manually typically involves passing a lot of options. Each network adapter is presented to the guest as a virtual PCI device, which is configured with the -s option. In an existing setup, you might have something like the following:

  /usr/sbin/bhyve -c 2 -m 512M -H -A -P -g 0 \
    -s 0:0,hostbridge -s 1:0,lpc \
    -s 29,fbuf,tcp=127.0.0.1:5900,w=1024,h=768 \
    -l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd \
    -s 2:0,virtio-net,tap0 -s 3:0,virtio-blk,./disk.img \
    -l com1,stdio myvmname

The -s 2:0,virtio-net,tap0 part of that command connects a virtio-net device to the host’s tap0 interface.

Using the netgraph bridge we’ve just created, we would replace that segment of the command with:

  -s 2:0,virtio-net,netgraph,path=bnet0:,peerhook=link2

The initial number indicates a PCI slot; in most cases, we just need to ensure a unique slot number for each piece of hardware. 

The next argument indicates the type of hardware being emulated. For networking, bhyve supports e1000 and virtio-net. e1000 refers to a driver for physical Intel gigabit network cards that is widely supported by guest operating systems. As convenient as emulating already-supported physical hardware might seem, it is better to use virtio-net—a standard interface designed specifically for virtual machines—wherever possible. 

Now that the adapter type is set, we specify the netgraph node and hook to which we want to connect. In this example, path= names the bridge node (bnet0:) and peerhook= names the hook on it (link2). If you have multiple virtual machines, you’ll need to use link3, link4 and so on.

Hook numbers don’t need to be contiguous, and you don’t need to have used 0 and 1 for the host interface, so you can use whatever numbers are convenient. Internally, bhyve creates a node of type ng_socket. You can name this node and its single hook with the further socket= and hook= parameters, but these names are optional: the bridge and network function without them.
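When starting several guests from a script, the device argument can be assembled from its parts. The helper below is a sketch built only from the parameters described above (path=, peerhook=, socket= and hook=); netgraph_slot_arg is a hypothetical name and is not part of bhyve.

```shell
# Sketch: build the bhyve -s argument for a netgraph-backed virtio-net device.
# netgraph_slot_arg is a hypothetical helper, not part of bhyve.
#   $1 PCI slot, $2 bridge node name, $3 link number on the bridge,
#   $4 optional name for the ng_socket node, $5 optional name for its hook
netgraph_slot_arg() (
  arg="$1:0,virtio-net,netgraph,path=$2:,peerhook=link$3"
  # Append the optional names only when they were supplied.
  [ -n "${4:-}" ] && arg="$arg,socket=$4"
  [ -n "${5:-}" ] && arg="$arg,hook=$5"
  printf '%s\n' "$arg"
)
```

For the first guest in this article, -s "$(netgraph_slot_arg 2 bnet0 2)" expands to the same segment shown earlier, and a second guest could use link number 3.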

If it helps to visualize the graph of nodes and hooks, you can generate an image file showing the nodes and interconnections. This relies on the dot tool from the graphviz package, available from ports, which we invoke like this:

  ngctl dot | dot -T png -o netgraph.png

[Figure: graphviz rendering of the nodes and hooks for the setup in this example, with two virtual machines attached to the bridge]

Bridges are not the only type of node that you can connect to with bhyve—you could also connect directly to the lower hook of an ether type interface, for example. 

There are many netgraph modules, providing a wealth of opportunities to explore. Two that might be interesting to look at are ng_eiface, which allows you to create a fully virtual network interface connected into the netgraph network, and ng_vlan, which handles VLAN tagging.

Performance

For benchmarking we ran a basic test using iperf3. This reported around 6.5 Gbps for tap with if_bridge, and 7.5 Gbps with the direct netgraph connection and ng_bridge. Combining a tap adapter with ng_bridge resulted in a slower connection, at around 3.5 Gbps. 

This is a fairly rudimentary test—so if performance is crucial, it’s better to do your own benchmarking in a network environment and with a workload which better models your own.

Managing Bhyve VMs

There are a number of wrappers for starting bhyve virtual machines. FreeBSD provides a basic example script in /usr/share/examples/bhyve/vmrun.sh, but there are also more sophisticated stacks including cbsd, chyves, iohyve and vm-bhyve.

If you prefer to use your own wrapper scripts, you can handle all the ngctl commands with a single invocation by passing them to standard input. Using a sh script, the whole bridge setup can be coded as follows:

  if ! ngctl status bnet0: >/dev/null 2>&1; then
    ngctl -f- <<END
      mkpeer re0: bridge lower link0
      name re0:lower bnet0
      connect re0: bnet0: upper link1
      msg re0: setpromisc 1
      msg re0: setautosrc 0
  END
  fi

Since the example above is wrapped in an if test, it’s safe to run even if the bridge is already up and configured.
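If you bridge more than one interface, the same command list can be generated for any pair of names. The function below is a sketch of that idea; bridge_commands is a hypothetical helper that only prints the commands, leaving it to the caller to pipe them into ngctl.

```shell
# Sketch: print the ngctl commands that bridge an interface, for any names.
# bridge_commands is a hypothetical helper; pipe its output to `ngctl -f-`.
bridge_commands() (
  iface=$1     # ether node to bridge, e.g. re0
  bridge=$2    # name to give the bridge node, e.g. bnet0

  cat <<END
mkpeer ${iface}: bridge lower link0
name ${iface}:lower ${bridge}
connect ${iface}: ${bridge}: upper link1
msg ${iface}: setpromisc 1
msg ${iface}: setautosrc 0
END
)
```

Keeping the same guard as before, a wrapper could run ngctl status bnet0: >/dev/null 2>&1 || bridge_commands re0 bnet0 | ngctl -f- to create the bridge only when it does not already exist.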

Additional Resources

Here are some netgraph and bhyve related resources that you may also find useful:

FreeBSD Man Page, bhyve(8) – https://www.freebsd.org/cgi/man.cgi?query=bhyve&sektion=8

FreeBSD Man Page, netgraph(4) – https://www.freebsd.org/cgi/man.cgi?query=netgraph&sektion=4

FreeBSD Man Page, ng_bridge(4) – https://www.freebsd.org/cgi/man.cgi?query=ng_bridge&sektion=4

FreeBSD Man Page, ng_ether(4) – https://www.freebsd.org/cgi/man.cgi?query=ng_ether&sektion=4

FreeBSD Man Page, ng_socket(4) – https://www.freebsd.org/cgi/man.cgi?query=ng_socket&sektion=4

DaemonNews, All About Netgraph – https://people.freebsd.org/~julian/netgraph.html

AsiaBSDCon 2012, Introduction to Netgraph (presentation slides) – https://www.netbsd.org/gallery/presentations/ast/2012_AsiaBSDCon/Tutorial_NETGRAPH.pdf

FreeBSD code review for addition of netgraph backend to bhyve – https://reviews.freebsd.org/D24620
