Cluster provisioning with Nomad and Pot on FreeBSD

While there have been efforts to bring Docker to FreeBSD, none of them is really mature. The presence of Docker in so many areas, and the lack of Docker support on FreeBSD, might make you think you are out of luck if you want DevOps workflows for managing clusters of computers.

Pot is a jail abstraction framework and management tool that aims to replace Docker in your DevOps tool chest, with support for using Nomad to orchestrate clustered services. The team behind Pot aims to provide modern container infrastructure on top of FreeBSD and has spent the last three years working to get Pot into production.

The Pot project was started in 2018 with the ambitious goal of taking the best things from Linux container management and creating a new container model based on FreeBSD technologies, running on FreeBSD. 

Pot is based on the core, proven FreeBSD tools: jails, ZFS, VNET and pf, and it uses rctl and cpuset to constrain the resources available to each container (a short rctl illustration follows the list below). These tools are used to manage:

  • Jail configuration 
  • Dataset/Filesystem management 
  • Network management 
  • Resource limitation 
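
To get a feel for how thin the abstraction is, here is what a manual resource limit looks like with plain rctl, the same mechanism Pot drives under the hood; the jail name and memory cap are placeholder values for illustration:

# rctl -a jail:example:memoryuse:deny=512m
# rctl jail:example

The first command caps memory use for a jail named example at 512 MB; the second lists the rules currently applied to it.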

Part of why the success of Docker and similar tools was such a surprise to FreeBSD sysadmins is that FreeBSD's core tools already made running relatively complex clusters quite straightforward. Pot aims to keep things simple and uses core FreeBSD features to implement functionality where possible. For example, there is no need to invent new machinery for moving images between hosts when zfs snapshot and zfs send | zfs recv already exist and are well understood.
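
As a sketch of that idea, copying a jail's dataset to another host with stock ZFS tools might look like this, where the dataset and host names are placeholders:

# zfs snapshot zroot/pot/jails/myjail@v1
# zfs send zroot/pot/jails/myjail@v1 | ssh otherhost zfs recv zroot/pot/jails/myjail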

Nomad is a cluster manager and scheduler that provides a common workflow to deploy applications across an infrastructure, distributing them over a set of hosts based on load and cluster usage. Nomad already supports provisioning and managing many different kinds of workload, so with Pot's goal of creating modern container infrastructure for FreeBSD it was not a huge leap to add support to Nomad for creating pot-style containers. That support is provided through the nomad-pot-driver package.

Setting up Minipot 

It can be difficult to experiment with cluster-based software without first going through a lot of setup work to build a pool of nodes. Minipot removes this requirement and gives us a single-node environment on which to test Pot and Nomad.

Minipot handles setting up and configuring all of the services required to run a Nomad cluster, including the Consul service directory and the Traefik HTTP proxy.

Minipot is packaged on FreeBSD and can be installed with: 

# pkg install minipot 

We need to configure pot before minipot will be able to run correctly. Refer to the pot installation instructions for details of what each pot configuration option controls.

We need to configure two things for pot: the network and the ZFS layout. Pot depends on ZFS, which uses datasets as the primary data storage layer.

Three configuration values need to be set so that Pot can figure out how to create datasets and configure the network. Edit /usr/local/etc/pot/pot.conf and uncomment the POT_ZFS_ROOT, POT_NETWORK and POT_EXTIF lines. The values below are the defaults from the sample configuration; set POT_EXTIF to whatever your machine's external interface is called (em0 here is a placeholder):

# pot configuration file
# All datasets related to pot use the same zfs dataset as parent
# With this variable, you can choose which dataset has to be used
POT_ZFS_ROOT=zroot/pot
# Internal Virtual Network configuration
# IPv4 Internal Virtual network
POT_NETWORK=10.192.0.0/10
# Internal Virtual Network netmask
POT_NETMASK=255.192.0.0
# The default gateway of the Internal Virtual Network
POT_GATEWAY=10.192.0.1
# The name of the network physical interface, to be used as default gateway
POT_EXTIF=em0

The configuration can be tested by initializing pot and VNET, running a test ping and then tidying it up: 

# pot init 
# pot vnet-init 

These two commands create the required pot ZFS datasets and configure pf for the pot network. The result can be tested by pinging the default bridge IP address, which is the POT_GATEWAY value from the configuration above:

# ping 10.192.0.1

Finally, pot de-init tidies everything up:

# pot de-init 

Using Minipot 

With Pot configured we are able to experiment with Nomad by using Minipot. Running minipot-init creates and configures a cluster for us:

# sudo minipot-init 
Creating a backup of your /etc/rc.conf 
/etc/rc.conf -> /etc/rc.conf.bkp-pot 
syslogd_flags: -b -b -a -> -b -b -a 
Creating a backup of your /etc/pf.conf 
/etc/pf.conf -> /etc/pf.conf.bkp-pot 
auto-magically editing your /etc/pf.conf 
Please, check that your PF configuration file /etc/pf.conf is still valid! 
nomad_user:  -> root 
nomad_env:  -> 
nomad_args:  -> -config=/usr/local/etc/nomad/minipot-server.hcl 
consul_enable:  -> YES 
nomad_enable:  -> YES 
traefik_enable:  -> YES 
traefik_conf:  -> /usr/local/etc/minipot-traefik.toml 
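
Once minipot-init has finished (or after minipot-start following a reboot), you can sanity-check the moving parts. These are ordinary rc services and standard Nomad commands rather than minipot-specific tooling, so the exact service names assumed here may vary:

# service consul status
# service nomad status
# nomad node status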

Pot can control how many resources a container is able to use, but for that the kern.racct.enable tunable needs to be set. This is a boot-time setting: add kern.racct.enable=1 to /boot/loader.conf and reboot.
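
For example, from the command line (editing /boot/loader.conf in an editor works just as well):

# echo 'kern.racct.enable=1' >> /boot/loader.conf
# shutdown -r now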

If you rebooted to enable resource limits, minipot can be restarted with: 

# minipot-start         

Minipot ships with an example nginx webserver job that can be used to test that the infrastructure works:

# cd /usr/local/share/examples/minipot 
# nomad run nginx.job 
Job Warnings: 
1 warning(s): 
* Group "group1" has warnings: 1 error occurred: 
        * 1 error occurred: 
        * Task "www1": task network resources have been deprecated as of Nomad 0.12.0. Please configure networking via group network block. 
==> 2021-11-19T15:13:11Z: Monitoring evaluation "38349af2" 
    2021-11-19T15:13:11Z: Evaluation triggered by job "nginx-minipot" 
==> 2021-11-19T15:13:12Z: Monitoring evaluation "38349af2" 
    2021-11-19T15:13:12Z: Evaluation within deployment: "a81f3323" 
    2021-11-19T15:13:12Z: Allocation "bb57cb82" created: node "83947219", group "group1" 
    2021-11-19T15:13:12Z: Evaluation status changed: "pending" -> "complete" 
==> 2021-11-19T15:13:12Z: Evaluation "38349af2" finished with status "complete" 
==> 2021-11-19T15:13:12Z: Monitoring deployment "a81f3323" 
  ✓ Deployment "a81f3323" successful 
    ID          = a81f3323 
    Job ID      = nginx-minipot 
    Job Version = 0 
    Status      = successful 
    Description = Deployment completed successfully 
    Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline 
    group1      1        1       1        0          2021-11-19T15:23:40Z
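
The nginx.job file we just ran is plain Nomad HCL using the pot task driver. As a hedged illustration, the sketch below is reconstructed from the nomad-pot-driver documentation rather than copied from the shipped file; the registry URL, image name, tag and resource figures are placeholder values, so check the file in /usr/local/share/examples/minipot for the authoritative version:

job "nginx-example" {
  datacenters = ["minipot"]
  type = "service"

  group "group1" {
    count = 1

    task "www1" {
      driver = "pot"

      config {
        # registry, pot image name and tag to fetch (placeholders)
        image = "https://pot-registry.zapto.org/registry/"
        pot   = "FBSD121-nginx"
        tag   = "1.0"
        # command to run inside the jail
        command = "nginx"
        args    = ["-g", "'daemon off;'"]
        # map the jail's port 80 to a host port chosen by Nomad
        port_map = {
          http = "80"
        }
        network_mode = "public-bridge"
      }

      resources {
        cpu    = 200
        memory = 64
      }
    }
  }
}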

When the example job has launched there will be a webserver answering on localhost port 8080:

$ curl -H 'host: hello-web.minipot' http://localhost:8080
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

If we have a look around, we can get an idea of what minipot has done to the system. First, we can see that it has created an nginx jail.

# jls
JID  IP Address      Hostname                      Path
1                  nginx-minipotwww1_bb57cb82-64 /opt/pot/jails/nginx-minipotwww1_bb57cb82-6490-f8d3-cfbb-f12204c997e8/m

To support the container, a number of ZFS datasets hosting the jail have been created:

# zfs list | grep pot
zroot/pot                                                                  132M   888G       96K  /opt/pot
zroot/pot/bases                                                             96K   888G       96K  /opt/pot/bases
zroot/pot/cache                                                           36.9M   888G     36.9M  /var/cache/pot
zroot/pot/fscomp                                                            96K   888G       96K  /opt/pot/fscomp
zroot/pot/jails                                                           95.0M   888G      128K  /opt/pot/jails
zroot/pot/jails/FBSD120-nginx_1_2                                         94.1M   888G       92K  /opt/pot/jails/FBSD120-nginx_1_2
zroot/pot/jails/FBSD120-nginx_1_2/m                                       93.9M   888G     93.9M  /opt/pot/jails/FBSD120-nginx_1_2/m
zroot/pot/jails/nginx-minipotwww1_bb57cb82-6490-f8d3-cfbb-f12204c997e8     804K   888G      616K  /opt/pot/jails/nginx-minipotwww1_bb57cb82-6490-f8d3-cfbb-f12204c997e8
zroot/pot/jails/nginx-minipotwww1_bb57cb82-6490-f8d3-cfbb-f12204c997e8/m   188K   888G     93.9M  /opt/pot/jails/nginx-minipotwww1_bb57cb82-6490-f8d3-cfbb-f12204c997e8/m

On the system, minipot has created a bridge interface to allow our jails to connect to the outside world, and an epair interface for each jail.

# ifconfig
igb0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether a8:a1:59:95:87:60
        inet netmask 0xffffff00 broadcast
        inet6 fe80::aaa1:59ff:fe95:8760%igb0 prefixlen 64 scopeid 0x1
        inet6 fddd:3c85:d32c:0:aaa1:59ff:fe95:8760 prefixlen 64 autoconf
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x2
        inet netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
epair0a: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 02:4a:22:96:d3:0a
        groups: epair
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 58:9c:fc:10:ff:a2
        inet netmask 0xffc00000 broadcast
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: epair0a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 3 priority 128 path cost 2000
        groups: bridge
        nd6 options=9<PERFORMNUD,IFDISABLED>

The bridge allows pot jails to connect to each other. Inside the jail, the b side of the epair has been given an IP address, which we can check with jexec:

# sudo jexec nginx-minipotwww1_bb57cb82-6490-f8d3-cfbb-f12204c997e8 ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        inet netmask 0xff000000 
        inet6 ::1 prefixlen 128 
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1 
        groups: lo 
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
epair0b: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 02:4a:22:96:d3:0b
        inet netmask 0xffc00000 broadcast 
        groups: epair 
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active

The pot configuration is derived from the defaults we configured; the image itself comes from a public registry of images. The registry is very clearly marked NOT FOR PRODUCTION USE and, as of writing, only contains nginx images built on different FreeBSD bases.

The pot developers have stated that they are interested in running a Docker-like repository of images, but until that exists you need to create your own images using Pot. The documentation on the GitHub wiki covers how to make your own images using the Pot tools and how to bundle them into a repository.
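
In outline, and hedged because the exact flags may differ between pot versions, building and exporting an image follows the wiki's walkthrough roughly like this; the pot name, base release, command and tag are all placeholders:

# pot create -p myimage -t single -b 13.1
# pot set-cmd -p myimage -c /usr/local/bin/my-service
# pot snapshot -p myimage
# pot export -p myimage -t 1.0 -D /tmp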

Next Steps

Minipot is a great way to experiment with Pot and Nomad, but there are clear warnings on the label. The next step is to experiment with larger deployments using Nomad, rather than just a single node.

The Pot developers have a three-part series of articles walking through building a virtual data center using Pot and Nomad. The first part covers an overview and the setup of Potluck, the second part discusses setting up Nomad, and the third part shows how to test the example services. They acknowledged in the Q3 2021 FreeBSD quarterly status report that the documentation is a little stiff, and they are working on improving it.

Pot is developed by Luca Pizzamiglio and around ten other contributors. nomad-pot-driver, which enables using pot from Nomad, was developed by Esteban Barrios. These small projects show some of the power of FreeBSD: small teams are able to create powerful tools that match the features of Linux tooling developed and managed by much larger teams of contributors. If you use Pot and Nomad on FreeBSD and run into bugs or rough edges, both projects would love to receive feedback and patches.

Meet the Author: Tom Jones

Tom Jones is an Internet researcher and FreeBSD developer who works on improving the core protocols that drive the Internet. He is a contributor to open standards in the IETF and is enthusiastic about using FreeBSD as a platform to experiment with new networking ideas as they progress towards standardisation.
