Avoid Vendor Lock-In with MinIO and OpenZFS
Avoid Infrastructure Vendor Lock-in by leveraging MinIO and OpenZFS
Avoiding vendor lock-in is a key infrastructure concern these days. More and more engineers and infrastructure leaders want to avoid being locked in to systems that are becoming more expensive and less manageable. Looking at open source infrastructure as an alternative is becoming the logical next step.
Vendor Lock-In and Software Defined Storage
Getting locked into a particular vendor can cause your project undue pain through inflated bills and limited technical choices, and it makes moving away difficult if that vendor no longer suits your needs.
One of the most common cases of vendor lock-in today comes from the headlong rush to the cloud, and specifically to Amazon Web Services. An increasing number of projects rely on S3 storage buckets to function, and it’s easy to assume that AWS is the only place to get them.
Fortunately, that’s not the case—the MinIO project offers fully S3-compatible, Kubernetes-native, free and open source software defined storage.
MinIO, Software Licensing, and You
For the moment, let’s leave the technical discussion aside—MinIO offers you a drop-in replacement for Amazon S3 buckets, which in turn means you can feed your S3-designed projects viable storage with or without Amazon.
But we still need to talk about licensing, particularly since this article is about avoiding vendor lock-in. MinIO is a fully FOSS (Free and Open Source Software) project, licensed under the GNU Affero GPL v3. It’s also available under commercial licensing—and it’s a good idea to know when, and why, you’d want to pay for the latter.
An overview of the GNU Affero General Public License (AGPL)
The standard GPL offers users the freedom to install, use, modify, and redistribute GPL-licensed code, with the main catch being that if you distribute a modified version, you must make its source available under the same license. The AGPL works the same way, but with a significant extension: making AGPL-licensed software available to users across a network counts as “distribution.”
In order to understand this latter restriction, consider a hypothetical web application which allows users to upload, organize, and share photos. If this web app is licensed under the standard GPL, you could heavily modify it, put the app online for the public to use, and would not be required to share your modifications with the public.
If the same web app were licensed under the AGPL and made available to the public, however, its operators would be required to offer the source code of their modified version to anyone who used the application.
This is less of an onerous burden than it might initially appear—simply using MinIO doesn’t make your web application a derivative work of MinIO, and does not put you on the hook for making your application’s source code available to anyone who uses it.
MinIO’s AGPL in (hypothetical) action
Let’s return to our hypothetical photo-sharing web application for a moment.
What if we don’t want the app to be open source at all? The app itself can be copyrighted with no distribution rights granted, and still use MinIO for storage. MinIO’s AGPL license requires you (the operator of a MinIO instance) to make that MinIO instance’s source code available to you (the operator of the web app which uses it to store photos).
What MinIO’s AGPL license does not require is for you (the operator of the MinIO instance) to make any code available to random Internet users who manage their photos with your application. They’re not interacting with MinIO directly; they’re interacting with your application.
Even if the way you design your project causes your users to interact with your MinIO instance directly over the network—which would be a fairly unusual setup—you’re only on the hook to make the MinIO source code available, not the source for your entire project. And if you haven’t modified the MinIO source code yourself, that obligation can be met simply by linking to MinIO’s own Download page.
If you or your legal team is still uncomfortable with the AGPL, you can instead purchase a commercial license offering access to and use of the same MinIO codebase, but with all FOSS restrictions lifted.
Why not just use Amazon S3 itself?
If you’re already hosting your project on Amazon, using S3 storage may very well be a no-brainer. And that’s fine!
But even if you’re using S3 on Amazon in production, you may want to consider using MinIO in a development environment.
This can significantly reduce your dev infrastructure costs by allowing you to develop outside AWS itself, whether on a less expensive cloud provider such as Linode or DigitalOcean or in your own self-hosted environment.
This also makes sure that you can leave Amazon should you ever want to.
Knowing that your project works well with a MinIO-powered storage system means that if Amazon’s pricing or policies should ever change in an unpleasant way, you can move production to another service that suits you better.
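As a rough illustration of how small the switch can be, here is a minimal sketch in Python. It assumes your application uses the boto3 SDK and reads its storage settings from environment variables. The variable names (S3_ENDPOINT_URL, S3_ACCESS_KEY, S3_SECRET_KEY) and the example endpoint are hypothetical, chosen for this example only. When no endpoint is set, the client talks to Amazon S3 as usual. When one is set, the same code talks to a MinIO server instead.

```python
import os


def s3_client_kwargs(env=None):
    """Build keyword arguments for boto3.client("s3", ...).

    Assumption: S3_ENDPOINT_URL, S3_ACCESS_KEY and S3_SECRET_KEY are
    hypothetical variable names used only in this sketch.
    """
    env = os.environ if env is None else env
    kwargs = {}
    endpoint = env.get("S3_ENDPOINT_URL")
    if endpoint:
        # Point the client at an S3-compatible endpoint such as MinIO.
        kwargs["endpoint_url"] = endpoint
        if env.get("S3_ACCESS_KEY") and env.get("S3_SECRET_KEY"):
            kwargs["aws_access_key_id"] = env["S3_ACCESS_KEY"]
            kwargs["aws_secret_access_key"] = env["S3_SECRET_KEY"]
    # With no endpoint configured, boto3 uses Amazon S3 and its normal
    # credential lookup.
    return kwargs


def make_s3_client(env=None):
    # boto3 is imported lazily so the configuration logic above can be
    # exercised even where the SDK is not installed.
    import boto3

    return boto3.client("s3", **s3_client_kwargs(env))
```

In a development environment you might export S3_ENDPOINT_URL=http://localhost:9000 (assuming a MinIO server listening there) and leave it unset in production. The rest of the application code stays the same.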
OpenZFS and MinIO
So far, we’ve focused on the potential business benefits of using FOSS software defined storage. But there are real technical wins to be had, as well.
MinIO’s storage buckets can operate atop any filesystem, or even raw drives with no filesystem at all. In the simplest deployments—such as in small-scale development environments—this might simply be a folder on a single server.
MinIO also offers Erasure Code storage, in servers with at least four drives. MinIO’s erasure coding option offers many of the same benefits that OpenZFS does—including automatic bitrot detection and healing, as well as fault tolerance.
But even if you use MinIO Erasure Code (which, by default, tolerates the loss of up to half of the total drives in a system before data becomes unavailable), OpenZFS can offer significant additional benefits, such as access to OpenZFS snapshots and replication.
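To make the trade-off concrete, here is a small back-of-the-envelope sketch in Python. It is not MinIO code. It applies the general erasure coding rule that a set with a given number of parity drives can lose that many drives and still read its data, at the cost of that much raw capacity. The function name is hypothetical. The default of half the drives as parity is an assumption taken from the figure quoted above, and actual defaults depend on your MinIO version and configuration.

```python
def erasure_profile(total_drives, parity_drives=None):
    """Return (tolerated_drive_losses, usable_fraction) for one erasure set.

    Assumption: parity defaults to half the drives, matching the
    "up to half" figure in the text; real deployments may differ.
    """
    if total_drives < 4:
        raise ValueError("erasure coding needs at least four drives")
    if parity_drives is None:
        parity_drives = total_drives // 2
    if not 0 < parity_drives <= total_drives // 2:
        raise ValueError("parity must be between 1 and half the drives")
    data_drives = total_drives - parity_drives
    # Data stays readable while no more than `parity_drives` are lost;
    # only the data drives' share of raw capacity is usable.
    return parity_drives, data_drives / total_drives


losses, usable = erasure_profile(8)
print(f"8 drives, default parity: survives {losses} losses, {usable:.0%} usable")
losses, usable = erasure_profile(8, parity_drives=2)
print(f"8 drives, parity 2: survives {losses} losses, {usable:.0%} usable")
```

Lower parity buys back capacity at the cost of fault tolerance, which is one reason to think about what redundancy the layer underneath already provides.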
ZFS filesystem snapshots offer the storage administrator access to a volume at the point in time the snapshot was taken, and are atomically consistent. Taking a snapshot is an instantaneous operation, and results in virtually no additional storage load either from taking the snapshot, or from keeping it available on the system.
The administrator can access snapshots directly to “cherry-pick” data for restoration, clone the snapshot to create a mountable, writable copy of the snapshot, or “roll back” the snapshotted volume itself to the point in time the snapshot was created.
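The snapshot operations described here map onto a handful of zfs subcommands. The following Python sketch only builds the command lines, and prints them by default rather than running them. The dataset name tank/minio, the snapshot name, and the helper function names are hypothetical placeholders for this example.

```python
import subprocess


def snapshot_cmd(dataset, name):
    # Create a point-in-time snapshot: zfs snapshot pool/dataset@name
    return ["zfs", "snapshot", f"{dataset}@{name}"]


def clone_cmd(dataset, name, clone_dataset):
    # Create a writable, mountable copy of a snapshot.
    return ["zfs", "clone", f"{dataset}@{name}", clone_dataset]


def rollback_cmd(dataset, name):
    # Roll the dataset back to the snapshot. Without extra flags, zfs only
    # rolls back to the most recent snapshot of the dataset.
    return ["zfs", "rollback", f"{dataset}@{name}"]


def snapshot_browse_path(mountpoint, name):
    # Snapshots are browsable read-only under the hidden .zfs directory,
    # which is where individual files can be "cherry-picked" from.
    return f"{mountpoint.rstrip('/')}/.zfs/snapshot/{name}"


def run(cmd, dry_run=True):
    # Print the command by default; only execute when explicitly asked.
    if dry_run:
        print(" ".join(cmd))
        return None
    return subprocess.run(cmd, check=True)


run(snapshot_cmd("tank/minio", "before-upgrade"))
run(clone_cmd("tank/minio", "before-upgrade", "tank/minio-test"))
run(rollback_cmd("tank/minio", "before-upgrade"))
```

Passing dry_run=False would execute the commands, which requires a host with OpenZFS installed and sufficient privileges.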
OpenZFS replication builds on these snapshot capabilities by offering native, block-level data transfer. The net effect is similar to rsync, but frequently multiple orders of magnitude faster—and without the additional storage load that rsync’s need to grovel over directories and file contents creates.
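In practice, replication is usually a zfs send piped into a zfs receive on the target machine. The Python sketch below assembles such a pipeline as a shell string, assuming SSH access to the target. The host name, dataset names, snapshot names, and helper function names are all hypothetical. When a previous snapshot is given, the send is incremental, so only the blocks changed between the two snapshots are transferred.

```python
import shlex


def send_cmd(dataset, new_snap, prev_snap=None):
    cmd = ["zfs", "send"]
    if prev_snap is not None:
        # Incremental stream: only changes since prev_snap are sent.
        cmd += ["-i", f"{dataset}@{prev_snap}"]
    cmd.append(f"{dataset}@{new_snap}")
    return cmd


def receive_cmd(target_dataset):
    return ["zfs", "receive", target_dataset]


def replication_pipeline(dataset, new_snap, target_host, target_dataset,
                         prev_snap=None):
    # Build "zfs send ... | ssh host zfs receive ..." with shell quoting.
    send = shlex.join(send_cmd(dataset, new_snap, prev_snap))
    recv = shlex.join(["ssh", target_host] + receive_cmd(target_dataset))
    return f"{send} | {recv}"


print(replication_pipeline("tank/minio", "tuesday", "backup-host",
                           "backup/minio", prev_snap="monday"))
```

The first replication of a dataset would omit prev_snap to send the full stream. Later runs can pass the last snapshot that both sides have in common.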
To get the best performance, ZFS and MinIO need to be configured to cooperate with each other, both to avoid excessive write amplification and to ensure that all of the atomicity guarantees are respected. A thorough analysis of the workload and failure scenarios is required to determine which type of redundancy will provide the best outcomes and ensure the longevity of the data.
Need help? We’ve got you covered!
If you’d like help setting up your own MinIO+OpenZFS-powered storage system, Klara Systems’ technical experts are here to help. Schedule a consultation