Building Your Own FreeBSD-based NAS with ZFS
This article is the first of a four-part series on building your own NAS on FreeBSD. This series will cover:
- Selecting a storage drive interface that meets your capacity and performance requirements both now and into the future.
- Why it makes sense to build your own NAS using FreeBSD rather than installing a NAS distribution (even a FreeBSD-based one). We’ll also discuss which configuration and tuning settings are needed.
- The nitty-gritty on sharing: configuring NFS, Samba, and iSCSI shares.
- Software maintenance and monitoring your NAS.
So, let’s kick off the series with a discussion of hardware.
Decisions, Decisions, Decisions…
When it comes to researching NAS hardware, one can quickly succumb to information overload. There is a dizzying array of technologies, vendor datasheets touting performance and reliability stats, and DIY hardware lists typically aimed at the home user or SOHO rather than the enterprise.
How do you balance your budget and hardware choices, knowing that storage technologies are rapidly evolving? While SAS has been the stable powerhouse for over a decade, NVMe will definitely start to dominate as its features become on par with SAS (with much higher performance) and costs continue to come down. Just deciding which device interface to use impacts the price of drives, backplanes, HBAs, cabling, etc.
With so many decision factors, one might ask: “why not just purchase a pre-built NAS solution and be done with it?”
Choosing the hardware for a custom build, rather than buying a solution from a vendor, offers several advantages:
- you control the design: integrate into existing infrastructure or build from newer technology, use components on-hand or from your preferred vendor, incorporate your plan for growing future capacity and replacing slower or smaller-capacity drives
- you control the cost: purchase for what you want/need and eliminate the middle man
- you have more control over security: no unused feature sets or waiting for patches from a solution vendor
- you choose the storage technology and control the schedule for switching to newer technologies
While we can’t tell you what hardware to buy in an article, we can discuss some of the factors to consider as you research which hardware best meets your storage requirements.
Throughput, Cost, Availability
Many storage purchase decisions boil down to three factors: throughput, cost, and purchase availability of initial and replacement disks. Table 1 provides a quick overview of these three factors as they apply to the available drive interface technologies:
Table 1: Quick Comparison of SATA, SAS, and NVMe

| Interface | Throughput | Cost | Availability |
|-----------|------------|------|--------------|
| SATA | 600MB/s half-duplex | Low | Wide |
| SAS | 1200MB/s full-duplex | Moderate | Declining |
| NVMe | Up to 256GB/s full-duplex | High | Improving |
At a quick glance, NVMe is the clear winner when it comes to throughput, while SATA wins at cost and availability but suffers for performance. How that works out in the real world depends upon your storage requirements:
Throughput: SSDs can easily saturate the SATA interface limit of 600MB/s, making that interface a bottleneck. However, for light workloads involving large amounts of seldom-accessed cold storage, SATA may be fast enough at a lower cost; for such organizations, the superior throughput of NVMe (and its price tag) is essentially overkill. For large numbers of drives, a SAS HBA can provide much higher bandwidth and port counts, even if the disks themselves use a SATA interface.
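A quick back-of-the-envelope check helps gauge whether an interface will bottleneck your drives. The figures below are illustrative assumptions (a typical SATA SSD against the SATA III and SAS-3 link ceilings), not measurements from any particular drive:

```python
# Rough check: how much of a link's bandwidth does one drive consume?
SATA3_LINK_MBPS = 600    # SATA III ceiling, ~600 MB/s (half-duplex)
SAS3_LINK_MBPS = 1200    # SAS-3 lane, 12 Gb/s, ~1200 MB/s (full-duplex)
SSD_SEQ_MBPS = 550       # assumed sequential throughput of a SATA SSD

def interface_headroom(drive_mbps: float, link_mbps: float) -> float:
    """Fraction of the link left unused by one drive (near zero = saturated)."""
    return (link_mbps - drive_mbps) / link_mbps

print(f"SATA III headroom: {interface_headroom(SSD_SEQ_MBPS, SATA3_LINK_MBPS):.0%}")
print(f"SAS-3 headroom:    {interface_headroom(SSD_SEQ_MBPS, SAS3_LINK_MBPS):.0%}")
```

With these assumed numbers, a single SSD already consumes over 90% of a SATA III link, while leaving roughly half of a SAS-3 lane free.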
A budget-wise solution for mixed storage workloads could use a mix of SATA, to reduce costs for cold storage, and SAS, for improved throughput for more frequently accessed data. Since a SAS controller can be configured to operate a mix of SAS and SATA drives, you can use SAS drives for hot (frequently accessed) data, while leveraging the cost advantage of SATA drives for colder data.
When throughput is your primary storage requirement, the value provided by NVMe is worth its cost.
Half-duplex vs Full-duplex: Half-duplex can be a serious performance bottleneck: reads and writes cannot execute simultaneously, since only one direction of the link can transfer data at a time. Servers with multicore processors and plenty of RAM end up waiting for data transactions (reads or writes) to complete, leaving compute resources underutilized.
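A simple timing model makes the duplex difference concrete. This is a sketch under idealized assumptions (a fixed link rate and no protocol overhead), not a benchmark:

```python
# Illustrative model: time to move R MB of reads and W MB of writes
# over half-duplex vs full-duplex links of the same raw bandwidth.
def half_duplex_time(read_mb: float, write_mb: float, link_mbps: float) -> float:
    # Reads and writes take turns; the link carries one direction at a time.
    return (read_mb + write_mb) / link_mbps

def full_duplex_time(read_mb: float, write_mb: float, link_mbps: float) -> float:
    # Reads and writes overlap; the busier direction sets the pace.
    return max(read_mb, write_mb) / link_mbps

# A balanced mixed workload finishes in half the time over full duplex:
print(half_duplex_time(600, 600, 600))  # 2.0 seconds
print(full_duplex_time(600, 600, 600))  # 1.0 second
```

For a read-only or write-only workload the two models converge, which is why half-duplex SATA can still be adequate for cold storage that is mostly read sequentially.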
Cost: While SATA SSDs are relatively cheap to purchase, those savings are lost to lower performance and underutilization of other components. While NVMe has a higher purchase cost, it provides offsetting gains: reduced power consumption and a maximum throughput that doesn’t bottleneck the other components you paid for.
Purchase Availability: Storage technology is definitely heading towards NVMe and the pricing for NVMe SSDs is starting to reflect that. As you might expect from any emerging technology, there are a number of competing formats, including the M.2 format popular with personal devices, and a number of different form factors for servers, the most popular of which is U.2. At the same time, the availability of SAS SSDs is starting to decline. For organizations that want to future-proof their storage infrastructure, NVMe will provide the most operational flexibility in the years to come.
Switch From SAS to NVMe?
If you’re planning a new storage infrastructure with hot storage, it makes sense to consider building it around NVMe. You’ll have a future-proof build that provides excellent performance now. If you already have an existing storage infrastructure, it’s probably running SAS. SAS connections and backplanes understand SATA, meaning a mix of SAS and SATA disks can be used in the same drive bay. SATA disks can easily be swapped out for SAS disks or SSDs, with no other infrastructure changes required.
Unfortunately, that same ease does not apply when upgrading to NVMe. Differing hardware technologies mean you can’t just swap a SAS drive for an NVMe one. NVMe drives either attach directly to PCIe lanes without an HBA or require an HBA that supports the NVMe protocol. Additionally, they may require upgrading existing backplanes to get enough NVMe ports.
This raises the question: “when does it make sense to invest in an NVMe infrastructure and phase out an existing SAS infrastructure?” You will need to balance the performance gain against the cost. Taking a close look at the following will both help in the decision-making process and identify areas that could be improved while planning the new infrastructure:
- performance analysis: where are the current bottlenecks? Will higher throughput resolve these or uncover bottlenecks in other components that will also need an upgrade?
- layout: does the current layout adequately separate hot and cold storage? Are there sufficient redundant paths? Are the current RAIDZ choices appropriate?
- inventory: which ports are available on the backplanes? Are there older backplanes reaching EoL? Are the sizes and number of current storage drives sufficient? Is there sufficient room in the chassis?
Some recently addressed NVMe limitations can also help you make an informed decision.
NVMe Form Factors
The older NVMe M.2 form factor is similar to memory sticks that slide parallel into M.2 slots. This form factor, plus the fact that it is not hot-swappable and only supports 3.3 V power, pretty much made NVMe a no-go as a SAS replacement in the enterprise.
The new NVMe U.2 form-factor changes that. This form factor:
- is the same size as most SATA SSDs
- uses a cable, allowing it to ride separately from the board so that the extra heat source isn’t laying directly on the board; unlike SATA cables, U.2 cables carry both power and data
- has increased speed and can make use of up to four PCIe lanes and two SATA lanes
- has larger capacities
- can use 3.3 V or 12 V power
- is hot-swappable
- has dual-port support
So, pretty much anything that SAS can do, NVMe U.2 can do faster. The current disadvantage is what one would expect of a recent technology advance: the expense and limited availability of U.2 backplanes. That new-technology premium will decrease over the next few years as costs fall and availability improves.
Dual Port and High Availability
You may have noticed that one of the features provided by U.2 is dual port, meaning that the drive has two independent physical channels. This can be used to maintain I/O to and from each connection independently and simultaneously. It can also be used to provide redundancy.
Most of today’s high availability storage solutions are based on dual-port SAS drives and an enclosure and HBA capable of configuring the second port to be used for redundancy, so that a disk is still available after a path or controller failure. Until U.2, NVMe did not support this type of dual-port architecture, making it unsuited for HA environments.
If HA is a requirement for your NAS, it might make sense to wait a bit for increased availability of HA enclosures that support NVMe.
Building the Chassis for Your ZFS NAS
Before purchasing hardware, take the time to review any existing ZFS topology and plan out what that topology should look like to meet the capacity needs for the next few years. Knowing how many disks you’ll need to support that topology will impact the choice of chassis and the number of required bays. Remember, the number of required disks includes:
- vdevs for data storage: See our article on Choosing the Right Storage Layout. A RAIDZ Capacity Calculator can also assist in determining the required number of disks.
- support vdevs: these include disks for L2ARC, SLOG, and special vdevs for metadata. To learn more about L2ARC and SLOG, see our articles OpenZFS: All about the cache vdev or L2ARC and Understanding OpenZFS SLOGs. A special vdev is a device dedicated solely to storing various kinds of internal metadata, using OpenZFS allocation classes. rsync.net recently published a technical note on their observations about using special vdevs.
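As a rough sketch of what such a capacity calculator does, the estimate below ignores the padding, ashift, and metadata overhead that real RAIDZ calculators account for; `raidz_usable_tb` and `disks_needed` are illustrative helpers, not part of any OpenZFS tooling:

```python
import math

def raidz_usable_tb(disks: int, disk_tb: float, parity: int) -> float:
    """Rough usable capacity of one RAIDZ vdev.

    parity: 1 for RAIDZ1, 2 for RAIDZ2, 3 for RAIDZ3.
    """
    if disks <= parity:
        raise ValueError("need more disks than parity devices")
    return (disks - parity) * disk_tb

def disks_needed(target_tb: float, disk_tb: float, parity: int) -> int:
    """Smallest vdev width whose rough usable space meets the target."""
    return math.ceil(target_tb / disk_tb) + parity

# Example: six 10 TB disks in RAIDZ2 yield roughly 40 TB usable.
print(raidz_usable_tb(6, 10, 2))   # 40
print(disks_needed(40, 10, 2))     # 6
```

Treat the result as an upper bound; in practice you also want to keep pools well below full to preserve performance, which argues for rounding the disk count up rather than down.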
One more thing…
The experts at Klara have been designing NAS solutions for over two decades and have in-the-trenches experience with using new technologies and planning for technology upgrades. Reach out to us if you would like to discuss the practicalities of creating your own NAS solution.
Worth noting: even if you’re using SATA drives, you probably want a SAS controller managing them. SAS controllers are almost always capable of managing SATA drives, and they generally can handle multiple GiB/sec aggregate throughput over those drives, where even high-quality on-motherboard SATA controllers generally bottleneck at around 650MiB/sec *aggregate* throughput, not just 650MiB/sec per individual attached drive.
The LSI Broadcom SAS 9300-8i is one example of such a controller. There are many, many more, but if you’re not sure of what’s out there that’s a good starting point, with a mix of decent cost, plenty of vendors stocking it, and no hardware RAID firmware you need to work around for ZFS duties.
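The aggregate-bandwidth point above can be sketched numerically. The per-drive and controller figures here are illustrative assumptions in the spirit of the numbers quoted above, not specs for any particular product:

```python
# Aggregate throughput is capped by whichever is smaller: what the drives
# can deliver together, or what the controller can move. Figures assumed.
def aggregate_mibps(n_drives: int, drive_mibps: float,
                    controller_cap_mibps: float) -> float:
    return min(n_drives * drive_mibps, controller_cap_mibps)

ONBOARD_SATA_CAP = 650      # ~650 MiB/s aggregate, per the figure above
SAS_HBA_CAP = 8 * 1200      # 8 lanes x ~1200 MiB/s, ignoring PCIe limits

# Eight spinning disks at ~250 MiB/s each:
print(aggregate_mibps(8, 250, ONBOARD_SATA_CAP))  # 650  -> controller-bound
print(aggregate_mibps(8, 250, SAS_HBA_CAP))       # 2000 -> drive-bound
```

Under these assumptions, the on-board controller throttles eight drives to less than a third of their combined throughput, while the HBA lets the drives themselves set the pace.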