Virtualize Your Network on FreeBSD with VNET
FreeBSD Jails are a well-known feature and have become core to many excellent tools on FreeBSD such as the Poudriere package builder. Jails offer process and file system isolation, but for a long time they did not offer very satisfying network isolation. VIMAGE provides isolation for networking through virtual network stacks or VNET.
VIMAGE was first presented in the paper “Implementing a Clonable Network Stack in the FreeBSD Kernel” by Marko Zec in 2003, and VIMAGE code first appeared in FreeBSD 8.0. However, it wasn’t until FreeBSD 12.0 that VIMAGE was built into FreeBSD GENERIC kernels, and it remains a somewhat overlooked feature.
If you have used jails before you might wonder what more VNET jails offer or why you would want to migrate from using localhost-based networking. VNET jails give each jail its own isolated copy of the network stack. Each jail gets everything from the IP layer up, a network stack that is entirely its own. Almost anything you could do with a distinct host you can do with a jail and VNET.
VNET network stacks can be given interfaces from the host. Once an interface has been delegated into a VNET jail it disappears from the view of the host and is visible only in the jail. FreeBSD offers the epair interface, which acts as a virtual Ethernet cable. Give one end of an epair to a VNET jail and keep the other end in the host and you have a way to network between the host and the jail, or even between multiple jails. Unlike normal jail networking, a VNET jail is able to fully use the interface, and once given the correct permissions it can perform packet captures and traces as if it were a full host.
VNET allows jails to be closer to full virtual machines with much less overhead and they offer the ability to test networking in isolation with a very similar environment to the jail host.
Using VNET with a jail
The best way to experiment with VNET is to create some test jails and play with their networks. With VNET it is easy and cheap to create a new separate network that will not interfere with the host’s configuration.
First, to demonstrate the power that VNET provides, we will create a simple jail that only isolates networking from the host. If we run the following three commands on a FreeBSD system we can create a simple jail:
host # jail -c name=emptyjail persist vnet
host # jexec emptyjail /bin/sh
emptyjail # ifconfig
lo0: flags=8008<LOOPBACK,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
The important flags to the jail command here are persist and vnet. persist keeps the jail around even though nothing is running inside of it, and vnet creates the jail with its own VIMAGE network stack.
The jail we have created contains a loopback interface (lo0) but cannot see any of the interfaces from the host. It has been isolated at the network and process level. Because we do not specify a new root to use for the jail’s file system, it can see everything on the host file system as before. This can be really handy when you are developing and running tests that take advantage of VNET.
For our jail to be useful we need to be able to access network interfaces. VNET jails can be given any interface, and once given to a jail they vanish from the view of the host network stack. When we want to network between the host and the jail we can use an epair(4) device. epair devices are a pair of Ethernet-like interfaces connected back to back, sort of like an emulated patch cable. Each side is denoted with a letter: they come as an ‘a’ and a ‘b’ part. If we create an epair and move one half into the jail, once it is configured we can communicate from the host to the jail.
host # ifconfig epair create
epair0a
host # ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x2
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
epair0a: flags=8842<BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:c5:7b:f8:1e:0a
        groups: epair
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
epair0b: flags=8842<BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:c5:7b:f8:1e:0b
        groups: epair
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
This time when we create our jail we delegate one half of the epair to it. We need to configure the interface in the jail and on the host. Once we have the interfaces configured we can communicate between the host and jail (Note: when you delegate an interface to a jail all the configuration will be erased. You must configure the interface once it is inside the jail).
host # jail -c name=networkjail persist vnet vnet.interface=epair0b
host # jexec networkjail /bin/sh
networkjail # ifconfig
lo0: flags=8008<LOOPBACK,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
epair0b: flags=8842<BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:c5:7b:f8:1e:0b
        groups: epair
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
networkjail # ifconfig epair0b inet 192.0.2.2/24 up
networkjail # exit
host # ifconfig epair0a inet 192.0.2.1/24 up
host # ping 192.0.2.2
PING 192.0.2.2 (192.0.2.2): 56 data bytes
64 bytes from 192.0.2.2: icmp_seq=0 ttl=64 time=0.143 ms
64 bytes from 192.0.2.2: icmp_seq=1 ttl=64 time=0.069 ms
^C
--- 192.0.2.2 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.069/0.106/0.143/0.037 ms
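The manual steps above can be collected into a small shell function. This is only a sketch under a few assumptions: the function names setup_vnet_jail and epair_peer are made up for illustration, the script is run as root on a host with VIMAGE, and `ifconfig epair create` prints the name of the ‘a’ side as it does in the session above.

```shell
#!/bin/sh
# Sketch: create an epair, delegate the "b" side to a new VNET jail and
# address both ends. Function names are hypothetical; run as root.

# Map the "a" side name printed by `ifconfig epair create` to its "b" side.
epair_peer() {
    echo "${1%a}b"
}

# usage: setup_vnet_jail <jail name> <host address> <jail address>
setup_vnet_jail() {
    jail_name=$1 host_addr=$2 jail_addr=$3

    # ifconfig prints the new "a" side, for example epair0a.
    side_a=$(ifconfig epair create) || return 1
    side_b=$(epair_peer "$side_a")

    # Delegating the interface erases its configuration, so the address
    # is assigned only after the interface is inside the jail.
    jail -c name="$jail_name" persist vnet vnet.interface="$side_b" || return 1
    jexec "$jail_name" ifconfig "$side_b" inet "$jail_addr" up || return 1
    ifconfig "$side_a" inet "$host_addr" up
}
```

Calling setup_vnet_jail networkjail 192.0.2.1/24 192.0.2.2/24 should leave the host and the jail in the same state as the manual session, after which the ping from the host can be repeated.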
Python 3 is installed on the host and, as our jail doesn’t isolate the file system, all installed software is also available to the jail. We can use the HTTP server built into Python 3 to demonstrate a simple network service running inside the jail.
host # jexec networkjail sh -c "cd /usr/src; python3 -m http.server"
Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...
With this running, a web browser on the host machine can be pointed at 192.0.2.2:8000 and we can browse around the structure of the FreeBSD source tree that lives in /usr/src. We can run any service that will run on a FreeBSD host from within this VNET jail. The difference from older jail networking is that we can do anything with the epair that we could do with a real network interface, such as add it to a bridge.
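The service can also be checked from the host’s command line rather than a browser. The sketch below uses fetch(1) from the FreeBSD base system and retries a few times in case the server is still starting. The function name wait_for_http and the default retry count are assumptions made for this example.

```shell
#!/bin/sh
# Sketch: poll a URL from the host until the service in the jail answers.
# wait_for_http is a hypothetical helper name.

# usage: wait_for_http <url> [tries]
wait_for_http() {
    url=$1 tries=${2:-5}
    while [ "$tries" -gt 0 ]; do
        # -q silences progress output, -o /dev/null discards the body.
        if fetch -q -o /dev/null "$url"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

For the jail above, wait_for_http http://192.0.2.2:8000/ should return success once the Python server is listening.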
When we remove networkjail the half of the epair that was delegated is returned to the host:
host # ifconfig epair0b
ifconfig: interface epair0b does not exist
host # jls -v
   JID  Hostname                      Path
        Name                          State
        CPUSetID
        IP Address(es)
     1                                /
        networkjail                   ACTIVE
        3
host # jail -r networkjail
host # ifconfig epair0b
epair0b: flags=8842<BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:c5:7b:f8:1e:0b
        groups: epair
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
When we destroy either half of the epair, both halves will be removed from the system, even if one side is delegated to a jail.
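Cleaning up after an experiment follows directly from these two behaviours. A minimal sketch, with teardown_vnet_jail as a made-up function name, removes the jail first and then destroys the host side of the epair:

```shell
#!/bin/sh
# Sketch: remove a VNET jail and the epair that connected it to the host.
# teardown_vnet_jail is a hypothetical helper name; run as root.

# usage: teardown_vnet_jail <jail name> <host side of the epair>
teardown_vnet_jail() {
    jail_name=$1 side_a=$2

    # Removing the jail returns the delegated "b" side to the host.
    jail -r "$jail_name" || return 1

    # Destroying either half of an epair removes both halves.
    ifconfig "$side_a" destroy
}
```

For the example above this would be called as teardown_vnet_jail networkjail epair0a.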
VNET offers a facility for local testing with a network segment that is very similar to a real network. It is possible to perform any action from the host as you would on a non-virtual network interface such as adding it to a bridge or firewalling.
Using VNET to isolate networks
We are not limited to using virtual interfaces when using VNET, but are able to delegate any interface on the host into the VNET. VNET opens up the opportunity to isolate out chunks of networks allowing us to build complex topologies such as soft routers and firewalls.
We can also use the isolation offered by VNET to run services on a particular interface that would be undesirable or destructive on the host’s own network, or to reach a piece of equipment whose address space clashes with the network that the host is on.
On this development machine we have network interfaces on two different networks: one is in the development environment and the other lives in the sandbox testing environment. Packets must never pass between the two. The sandbox testing network is where virtual machines live. Normally these bhyve machines are configured with a tap interface and they join the network via a bridge interface on the host. VNET allows us to do the same thing, but with jails.
Before we start we need to load the kernel modules for epair and bridge on the host. They normally load automatically on the host when you try to use them, but the jail is unable to do this.
host # kldload if_epair if_bridge
VNET also allows us to give a physical network interface on the development host to a jail and, from that jail, create further VNET sub-jails that are isolated from the host’s network:
host # jail -c name=isojail persist vnet vnet.interface=em0 children.max=1
This time we create our jail with the children.max parameter set to 1 to allow the creation of a sub-jail. Inside our isolated isojail we can create a bridge and an epair, delegating half of the pair to the sub-jail. Next we can bridge together the real external interface em0 and the half of the epair in the isolated jail isojail. When creating more complex configurations it is important to make sure that all interfaces are brought up, even if they are not being assigned any addresses. We do this for the epair0a side that is a member of the bridge in isojail.
host # jexec isojail /bin/sh
isojail # ifconfig bridge create
bridge0
isojail # ifconfig epair create
epair0a
isojail # jail -c name=subjail persist vnet vnet.interface=epair0b
isojail # ifconfig bridge0 addm em0 addm epair0a up
isojail # ifconfig epair0a up
With the hierarchy of jails set up and the interfaces bridged and delegated, we can configure the network interface in subjail on the sandbox network. The sandbox network offers DHCP, so all we need to do is start dhclient inside the subjail.
host # jexec isojail.subjail dhclient epair0b
With jail hierarchies and VNET we can create entire sub-environments that contain their own view of both virtual and real networks. This can be very helpful when you need to create environments that use the same address space or that cannot be contaminated with traffic from other places.
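The whole nested setup from this section can be sketched as one shell function run from the host. It is an illustration rather than a definitive script: build_sandbox is a made-up name, the jail names are the ones used above, and it assumes that `ifconfig bridge create` and `ifconfig epair create` print the names of the interfaces they create. Bringing the physical interface up explicitly is an extra precaution that the session above does not include.

```shell
#!/bin/sh
# Sketch: delegate a physical interface to isojail, bridge it to an epair
# and start a DHCP client in a nested subjail. build_sandbox is a
# hypothetical helper name; run as root on the host.

# usage: build_sandbox <physical interface>   (em0 in the article)
build_sandbox() {
    phys_if=$1

    # Jails cannot load kernel modules, so load them on the host first.
    # -n skips modules that are already loaded.
    kldload -n if_epair if_bridge

    jail -c name=isojail persist vnet vnet.interface="$phys_if" children.max=1 || return 1

    bridge=$(jexec isojail ifconfig bridge create) || return 1
    side_a=$(jexec isojail ifconfig epair create) || return 1
    side_b="${side_a%a}b"

    jexec isojail jail -c name=subjail persist vnet vnet.interface="$side_b" || return 1
    jexec isojail ifconfig "$bridge" addm "$phys_if" addm "$side_a" up

    # Every bridge member must be up, even without an address.
    jexec isojail ifconfig "$side_a" up
    jexec isojail ifconfig "$phys_if" up   # assumption: harmless if already up

    jexec isojail.subjail dhclient "$side_b"
}
```

Calling build_sandbox em0 should reproduce the hierarchy built by hand in this section.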
Using VNET to test potentially hazardous firewall changes
Everyone has had that horror moment when you press enter to commit some firewall configuration and everything stops. You wonder if your local network has gone down, or if the host has had a hiccup and eventually you have to admit that the new firewall rule that you were so sure of has locked you out of the machine.
VNET jails offer a safe(r) way to test firewall configurations by using an isolated network stack. If you lock yourself out with a bad firewall rule you can always jexec into the jail and tidy up your mistake. All three of the firewalls in FreeBSD support running in VNET (although some features such as dummynet with ipfw are not yet supported) and they are automatically tested as part of the FreeBSD firewall test suite using VNET.
The small downside to this is that the kernel modules for the firewall to be used must be loaded on the host. This should not cause any trouble, but the more things there are, the more possible bad interactions there can be. First we need to load the kernel module for ipfw and set it to allow traffic by default:
host # kenv net.inet.ip.fw.default_to_accept=1
host # kldload ipfw
Now let’s set up another jail like our first networkjail:
host # ifconfig epair create
epair0a
host # jail -c name=firewalljail persist vnet vnet.interface=epair0b
host # ifconfig epair0a inet 192.0.2.2/24 up
host # jexec firewalljail /bin/sh
firewalljail # ifconfig
lo0: flags=8008<LOOPBACK,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
epair0b: flags=8842<BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:46:6b:30:e6:0b
        groups: epair
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
firewalljail # ifconfig epair0b inet 192.0.2.1/24 up
firewalljail # ping -c 1 192.0.2.2
PING 192.0.2.2 (192.0.2.2): 56 data bytes
64 bytes from 192.0.2.2: icmp_seq=0 ttl=64 time=0.124 ms

--- 192.0.2.2 ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.124/0.124/0.124/0.000 ms
firewalljail # ipfw list
65535 allow ip from any to any
Inside the jail we can see that our ipfw instance has its own set of rules, with only the default allow all rule installed. We can also ping the address of the host’s epair side, showing that packets are allowed to pass.
When the local firewall drops an outgoing packet, the send call fails immediately, so rather than just timing out, ping will tell us ‘sendto: Permission denied’. We can configure a deny all rule in the jail and see all traffic to and from the jail stop, while the host’s networking is unaffected.
firewalljail # ipfw add 100 deny ip from any to any
00100 deny ip from any to any
firewalljail # ping -c 1 192.0.2.2
PING 192.0.2.2 (192.0.2.2): 56 data bytes
ping: sendto: Permission denied

--- 192.0.2.2 ping statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss
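Because the host can always jexec into the jail, a rule can be tried and undone automatically from the host. The sketch below adds a rule inside the jail, pings a known-good address from within the jail, and deletes the rule again if the ping fails. The function name try_rule and the choice of a single ping as the health check are assumptions made for this example, not part of ipfw.

```shell
#!/bin/sh
# Sketch: try an ipfw rule inside a VNET jail and roll it back if the
# jail can no longer reach a probe address. try_rule is a hypothetical
# helper name; run as root on the host.

# usage: try_rule <jail name> <probe address> <rule number> <rule body...>
try_rule() {
    jail_name=$1 probe_addr=$2 rule_num=$3
    shift 3

    jexec "$jail_name" ipfw -q add "$rule_num" "$@" || return 1

    # One ping with a two second timeout is the health check.
    if jexec "$jail_name" ping -c 1 -t 2 "$probe_addr" >/dev/null 2>&1; then
        echo "rule $rule_num kept"
    else
        jexec "$jail_name" ipfw -q delete "$rule_num"
        echo "rule $rule_num rolled back"
        return 1
    fi
}
```

With the deny all rule from above, try_rule firewalljail 192.0.2.2 100 deny ip from any to any should report that the rule was rolled back, leaving only the default allow rule in the jail.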
VNET jails are a very powerful feature of FreeBSD. These have only been simple examples which aim to demonstrate the ease with which they can be used and some of their benefits for testing. If you have ever wished that you had extra machines to test with or another network interface to use, you can recreate a lot of that utility by using VNET jails. VNET jails with nesting allow the creation of isolated environments that can completely separate test environments from the host’s network.