This isn't really a "task", but a general decision that affects every system-building task. Choosing a filesystem is critical because switching between storage types is cumbersome, difficult, and time-consuming. So your storage configuration has to be future-proof.
Before hopping on, let's be honest. EXT4 is almost always enough, even for servers. "Snapshot" is a fancy version of `cp`, and you rarely touch volumes.
If it works, don't change it.
If you think you need something more, you must have a clear goal. No goals means no preferences, and you'll just be wasting your time choosing among options that are all equally worthless to you. Stop caring about non-problems and find your actual problems.
BTRFS is officially not production ready, so I don't trust BTRFS much. However, it's rock solid as an advanced EXT4 replacement, as proven by many brave souls. With swapfile support added in Linux 5.0, it's currently the best filesystem for a system partition.
It's also good for storage, but there's no dependable RAID5/6, and you simply don't want to risk your dataset w/ a non-production-ready filesystem. Things will change over time, but it's important to know that its time is yet to come.
So, if you're setting up a PC/laptop, or building a small server, use BTRFS for both system and storage. It's much better w/ LVM, which is covered in a later section.
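As a rough illustration of the swapfile support mentioned above, here is a minimal dry-run sketch that only builds the usual command sequence (empty file, copy-on-write disabled, then preallocation) and prints it. The function name, the `/swapfile` path, and the `4G` size are placeholders I made up for the example; nothing is executed, and the file should live on a subvolume you don't snapshot.

```python
import shlex


def btrfs_swapfile_commands(path="/swapfile", size="4G"):
    """Return the commands for a BTRFS swapfile (path and size are placeholders)."""
    return [
        ["truncate", "-s", "0", path],    # start from an empty file
        ["chattr", "+C", path],           # disable copy-on-write while the file is still empty
        ["fallocate", "-l", size, path],  # preallocate the full size
        ["chmod", "600", path],           # swap must not be readable by other users
        ["mkswap", path],
        ["swapon", path],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of running them.
    for argv in btrfs_swapfile_commands():
        print(shlex.join(argv))
```

Review the printed commands and run them as root only if they match your layout.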
ZFS is the best in terms of features. It can manage both volumes and files, and supports thin provisioning, encryption, RAID, etc. It's just a waste of time to list the good things about ZFS, so I'll only mention some bad parts:
ZFS is not in the Linux kernel, because of its license. Using ZFS on Linux can be a major PITA. DKMS can solve this problem, but it isn't perfect on fast rolling distros like Arch.
ZoL (ZFS on Linux) is a problem of its own compared to the BSD implementation. TBH, this is the very reason why I started writing this piece. But you should understand that maintaining kernel modules outside of the kernel is often a PITA.
ZFS is a memory pig. Even though the cache gets purged quickly in an OOM situation, it causes performance drops, UI stutter, and swapping. Limiting `arc_max` is not a smart solution. Naturally, ZFS performs best on a dedicated storage system. I strongly discourage running ZFS on VM hosts or compute nodes.
Zpool isn't flexible. A zpool can be upgraded (w/ caveats), but cannot be shrunk or reshaped like LVM/BTRFS. This can make some configuration changes an absolute PITA, and you'll end up recreating your tank. You must ensure that your setup is final, as with a dedicated NAS or a non-expandable laptop.
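On the memory point above: for anyone who wants to see what capping the ARC actually involves (workaround or not), the limit is the `zfs_arc_max` module parameter, given in bytes. A minimal sketch follows, assuming a made-up 25% fraction and hypothetical helper names; the resulting line is the kind of thing that typically goes into a modprobe.d file such as `/etc/modprobe.d/zfs.conf`.

```python
def arc_max_bytes(total_ram_gib, fraction=0.25):
    """Byte value for an ARC cap; the 25% default is an arbitrary example."""
    return int(total_ram_gib * fraction * 1024**3)


def modprobe_line(limit_bytes):
    """Format the module option line that carries the cap."""
    return f"options zfs zfs_arc_max={limit_bytes}"


if __name__ == "__main__":
    # Example: a 16 GiB machine with the ARC capped at a quarter of RAM.
    print(modprobe_line(arc_max_bytes(16)))
```

This only bounds the cache. It does nothing about the lower hit rate you pay for it, which is why a dedicated storage box remains the better answer.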
With the rise of ZFS and BTRFS, this decades-old technology has been losing ground little by little, but LVM still has its place: managing logical volumes.
LVM is really the best replacement for a partition table. No need to worry about partition order. Simply create, delete, and resize. Fragmentation is minimal, and overhead rounds to zero. LVM combined with BTRFS (files) and MDADM (RAID) provides extreme flexibility and can support almost any configuration you can think of. (This is probably why Red Hat builds Stratis on the same device-mapper layer that LVM uses.)
Just tuck it beneath any filesystem except ZFS, and you're 100% ready to mess up your system.
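To make "create, delete, and resize" concrete, here is a small dry-run sketch that builds the corresponding LVM commands without running them. The volume group `vg0`, the LV name `scratch`, and the sizes are hypothetical. Note that resizing the LV does not resize the filesystem inside it; on BTRFS you would follow up with something like `btrfs filesystem resize max <mountpoint>`.

```python
import shlex


def lv_create(vg, name, size):
    """Create a logical volume of the given size in volume group vg."""
    return ["lvcreate", "-L", size, "-n", name, vg]


def lv_resize(vg, name, size):
    """Resize an LV; a size like '+5G' is relative, '25G' is absolute."""
    return ["lvresize", "-L", size, f"{vg}/{name}"]


def lv_remove(vg, name):
    """Delete an LV (lvremove asks for confirmation when run interactively)."""
    return ["lvremove", f"{vg}/{name}"]


if __name__ == "__main__":
    # Dry run with made-up names: print the commands only.
    for argv in (
        lv_create("vg0", "scratch", "20G"),
        lv_resize("vg0", "scratch", "+5G"),
        lv_remove("vg0", "scratch"),
    ):
        print(shlex.join(argv))
```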
What makes LVM better than just an advanced partition table is device mapper. LVM both implements and utilizes device mapper to support various features: encryption, striping, mirroring, thin provisioning, snapshots, caching, integrity, etc. These are all built into the kernel and thus rock solid.
So, theoretically, LVM should be on par w/ ZFS, or even exceed it. However, the harsh reality is that LVM lacks tools. This can be broken into 2 different problems:
Some useful features are implemented outside the official repo (e.g. send/recv), or are simply not there yet.
No centralized tools for management. While both BTRFS and ZFS can be managed through a single command, setting up LVM w/ some features requires dedicated tools. Stratis might fill this hole, but is clearly not there yet.
A funny thing is, while some important bits are missing, LVM can perform some random voodoo rituals, like splitting mirrored volumes into individual LVs, unbalanced LV mirroring between multiple disks, and selective RAID on LVs. Zpool looks almost lame by comparison, but these are corner cases unlikely to happen.
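Two of those rituals can be sketched as dry-run command builders: RAID1 on a single LV (the rest of the volume group stays unmirrored), and splitting one mirror leg off into an independent LV. All names, sizes, and device paths below are made up for illustration, and nothing is executed.

```python
import shlex


def raid1_lv(vg, name, size, pvs=()):
    """Create one mirrored LV; other LVs in the same VG can stay unmirrored."""
    # Listing PVs at the end restricts which disks carry the mirror legs.
    return ["lvcreate", "--type", "raid1", "-m", "1", "-L", size, "-n", name, vg, *pvs]


def split_mirror(vg, name, new_name):
    """Detach one mirror leg into an independent LV called new_name."""
    return ["lvconvert", "--splitmirrors", "1", "--name", new_name, f"{vg}/{name}"]


if __name__ == "__main__":
    # Hypothetical volume group, LV names, and partitions.
    print(shlex.join(raid1_lv("vg0", "data", "10G", ["/dev/sda2", "/dev/sdb2"])))
    print(shlex.join(split_mirror("vg0", "data", "data_copy")))
```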
So, I recommend starting from LVM + BTRFS, mainly because it can save your day whatever happens. You can use either MD or DM-RAID (LVM) for RAID. This setup is field-tested and mainlined, and is unusually flexible.
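A minimal bring-up plan for that stack might look like the sketch below. It is a dry run that only prints commands, and the device path, volume group name, LV name, and size are all assumptions. With MD handling RAID, the device would be the array (e.g. a hypothetical `/dev/md0`) rather than a raw partition.

```python
import shlex


def lvm_btrfs_plan(devices, vg="vg0", lv="root", size="50G"):
    """Commands to put LVM on the given devices and BTRFS on one LV."""
    lv_path = f"/dev/{vg}/{lv}"
    return [
        ["pvcreate", *devices],                  # mark the devices as physical volumes
        ["vgcreate", vg, *devices],              # pool them into one volume group
        ["lvcreate", "-L", size, "-n", lv, vg],  # carve out a logical volume
        ["mkfs.btrfs", "-L", lv, lv_path],       # BTRFS on top, labeled after the LV
    ]


if __name__ == "__main__":
    # Hypothetical single-partition setup; review before running anything as root.
    for argv in lvm_btrfs_plan(["/dev/sda2"]):
        print(shlex.join(argv))
```

Leaving free space in the volume group is the point of the exercise: it is what lets you grow, add, or mirror volumes later.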
ZFS is good, and you really should use it for your massive storage. But its lack of flexibility and high RAM consumption make it less ideal for workstations, which tend to change configuration, consume lots of RAM, or both.