This post is about some basic ZFS commands (ZFS stands for Zettabyte File System; more can be found in the Wikipedia article). In case you are wondering, 1 zettabyte is 1 000 000 000 000 000 000 000 bytes (10^21). As you already know, I am using FreeBSD.
I assume you chose ZFS as the file system when installing FreeBSD. If you did not, and you have an additional drive, you can use the zpool command to create a pool on it.
In this article, we focus on the ZFS deduplication and compression features. Before we start, let us take a brief look at what ZFS is and how it works. For the purposes of this document, we will not go into much detail.
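If you do have a spare disk, creating a pool on it takes a single command. Here is a minimal sketch, assuming the disk shows up as da1 and the pool is to be called tank (both names are placeholders, not taken from this system). The script only prints the command, so you can check it before running it as root:

```shell
#!/bin/sh
# Placeholder names: adjust to your system before use.
POOL="tank"   # hypothetical pool name
DISK="da1"    # hypothetical spare disk, check with: geom disk list

# Build the command as a string so it can be reviewed first.
create_cmd="zpool create ${POOL} ${DISK}"
echo "Would run: ${create_cmd}"

# To really create the pool (this wipes the disk), run as root:
#   zpool create tank da1
#   zpool status tank
```

A single-disk pool like this has no redundancy, so treat it as a playground for the commands in this post rather than a place for important data.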
ZFS
Unlike traditional filesystem and RAID setups, ZFS has a unique structure: it merges volume management and the filesystem into one layer, and it uses a copy-on-write transactional mechanism. To understand ZFS we need to understand its building blocks: zpools, vdevs, and devices.
vdev (short for virtual device)
Each vdev consists of one or more physical devices.
RAIDz1, RAIDz2, and RAIDz3 are parity-based vdev types. The 1, 2, and 3 refer to how many parity blocks are allocated to each data stripe, which is also the number of disks the vdev can lose without losing data.
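To get a feel for what each parity level costs in capacity, here is a back-of-the-envelope sketch. It assumes equally sized disks and ignores padding, metadata, and reserved space, so a real pool will report somewhat less. The function name raidz_usable and the six 4 TB disks are made up for illustration:

```shell
#!/bin/sh
# Rough usable capacity of one RAIDZ vdev: (disks - parity) * disk size.
# Ignores padding, metadata and reserved space, so real numbers are lower.
raidz_usable() {
    disks=$1
    size=$2
    parity=$3
    if [ "$disks" -le "$parity" ]; then
        echo "need more disks than the parity level" >&2
        return 1
    fi
    echo $(( ($disks - $parity) * $size ))
}

# Six hypothetical 4 TB disks at each parity level
for p in 1 2 3; do
    echo "RAIDZ${p}: $(raidz_usable 6 4 "$p") TB usable"
done
```

With these inputs the loop prints 20, 16, and 12 TB: each extra parity level trades one disk of capacity for one more tolerated disk failure.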
zpool
The zpool is the uppermost ZFS structure. A zpool contains one or more vdevs, each of which in turn contains one or more devices.
Facts about zpools:
- Zpools are self-contained units.
- Zpools cannot share vdevs.
- ZFS redundancy is at the vdev level.
So does ZFS stripe writes across the pool the way RAID does? Not quite. For the most part, writes are distributed across the available vdevs according to how much free space each one has, so that all vdevs fill up at roughly the same time. Now we should have a very basic understanding of ZFS.
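The free-space idea can be pictured with a toy calculation. This is a simplified illustration only, not the real ZFS allocator: it splits one write across vdevs strictly in proportion to their free space. The function name split_write and the numbers are invented for the example:

```shell
#!/bin/sh
# Simplified illustration only: split a write across vdevs in proportion
# to their free space. The real ZFS allocator is more involved.
# Usage: split_write <write_mb> <free_mb_vdev1> <free_mb_vdev2> ...
split_write() {
    write_mb=$1
    shift
    total=0
    for free in "$@"; do
        total=$(( $total + $free ))
    done
    i=0
    for free in "$@"; do
        i=$(( $i + 1 ))
        # Integer division, so shares are rounded down
        echo "vdev${i}: $(( $write_mb * $free / $total )) MB"
    done
}

# Hypothetical pool: one nearly full vdev (100 MB free), one new vdev (900 MB free)
split_write 1000 100 900
```

The emptier vdev receives most of the write, which is how an unbalanced pool evens out over time. On a real system, zpool list -v shows how much is allocated on each vdev.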
zpool command
Before we start modifying the ZFS file system, we need to get familiar with the zpool command. It has many subcommands, as shown on its man page. We will focus on listing pools and datasets and on setting properties.
Examples:
root@test:~ # zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
zroot  17.5G  8.36G  9.14G        -         -    34%    47%  1.01x    ONLINE  -
root@test:~ #
You can get more detail on the datasets in the pool with the zfs command, for example:
root@test:~ # zfs list
NAME                 USED  AVAIL  REFER  MOUNTPOINT
zroot               8.44G  8.53G    96K  /zroot
zroot/ROOT          3.91G  8.53G    96K  none
zroot/ROOT/default  3.91G  8.53G  3.91G  /
zroot/tmp           7.98M  8.53G  7.98M  /tmp
zroot/usr           4.47G  8.53G    96K  /usr
zroot/usr/home        96K  8.53G    96K  /usr/home
zroot/usr/ports     3.71G  8.53G  3.71G  /usr/ports
zroot/usr/src        773M  8.53G   773M  /usr/src
zroot/var            768K  8.53G    96K  /var
zroot/var/audit       96K  8.53G    96K  /var/audit
zroot/var/crash       96K  8.53G    96K  /var/crash
zroot/var/log        256K  8.53G   256K  /var/log
zroot/var/mail       120K  8.53G   120K  /var/mail
zroot/var/tmp        104K  8.53G   104K  /var/tmp
root@test:~ #
ZFS dedupe
Deduplication is a process where identical data is only stored once, and this can greatly reduce storage size. Deduplication is a compromise and balance between many factors, including cost, speed, and resource needs. It must be understood and considered carefully before using it. In addition, keep in mind that this process will increase the CPU load.
When deduplication is turned on, ZFS keeps only one copy of each identical block and maintains a table that maps block checksums to the storage blocks holding the actual data, along with a count of how many times each block is referenced. This is the Deduplication Table (DDT). You use less space; however, if that single stored copy becomes corrupted, everything that references it is affected, so the consequences can be catastrophic.
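The DDT is looked up whenever deduplicated data is written, so it works best when it fits in memory. Here is a rough sizing sketch, assuming the often-quoted rule of thumb of about 320 bytes of RAM per DDT entry (an assumption; the real figure varies with the ZFS version and the pool). The helper name ddt_ram_bytes and the 1 TiB dataset are hypothetical:

```shell
#!/bin/sh
# Rule-of-thumb estimate, not an exact figure:
# assume ~320 bytes of RAM per unique block tracked in the DDT.
BYTES_PER_ENTRY=320

ddt_ram_bytes() {
    n=$1
    echo $(( $n * $BYTES_PER_ENTRY ))
}

# Hypothetical dataset: 1 TiB of unique data stored in 128 KiB records
entries=$(( 1024 * 1024 * 1024 * 1024 / (128 * 1024) ))
echo "entries: ${entries}"
echo "estimated DDT RAM: $(( $(ddt_ram_bytes "$entries") / 1024 / 1024 )) MiB"
```

Smaller records mean more entries for the same amount of data, so the estimate grows quickly for datasets full of small blocks.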
Deduplication is a dataset property, so it is turned on with the zfs command rather than zpool:
zfs set dedup=on <dataset>
To turn it off:
zfs set dedup=off <dataset>
Here is an example:
root@test:~ # zfs set dedup=on zroot/usr/ports
root@test:~ # zpool status -Dv
  pool: zroot
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          da0p3     ONLINE       0     0     0

errors: No known data errors

 dedup: DDT entries 5012, size 8.35K on disk, 2.69K in core

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1    4.56K    205M    133M    138M    4.56K    205M    133M    138M
     2      325   1.54M    784K   1.60M      656   3.08M   1.54M   3.23M
     4       15   8.50K   8.50K     60K       66     37K     37K    264K
     8        5      3K      3K     20K       42     25K     25K    168K
   256        1    512B    512B      4K      369    184K    184K   1.44M
 Total    4.89K    207M    134M    140M    5.66K    208M    135M    143M
Keep in mind that turning deduplication off only affects data written afterwards; blocks that were already deduplicated stay that way.
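To judge whether deduplication is paying off, compare how much data is referenced with how much is actually allocated. The sketch below assumes the ratio is simply referenced size divided by allocated size, and feeds in the DSIZE totals from the zpool status -D output in this post (143M referenced, 140M allocated). The helper name dedup_ratio is made up for the example:

```shell
#!/bin/sh
# Ratio of referenced to allocated space; both arguments in the same unit.
dedup_ratio() {
    awk -v ref="$1" -v alloc="$2" 'BEGIN { printf "%.2f\n", ref / alloc }'
}

# DSIZE totals from the zpool status -D example: 143M referenced, 140M allocated
echo "dedup ratio: $(dedup_ratio 143 140)x"

# On a host with ZFS, read the live values instead.
if command -v zpool >/dev/null 2>&1; then
    zpool get dedupratio zroot          # pool-wide ratio
    zfs get dedup zroot/usr/ports       # current setting on the dataset
fi
```

A ratio this close to 1 means very little data is duplicated, so on this pool the extra CPU and memory cost of deduplication buys almost nothing.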
ZFS compress
Compression is a pretty neat feature: ZFS compresses your data on the fly as it is written, which lets you store more on limited storage, and decompresses it transparently while you are accessing the files. You can request compression statistics per dataset through the compressratio property, and they will show you how much space you are saving.
Examples:
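root@test:~ # zfs set compression=lz4 zroot/usr/ports
root@test:~ # zpool get all zroot | grep compress
zroot  feature@lz4_compress   active   local
zroot  feature@zstd_compress  enabled  local
root@test:~ #
Note that compression, like dedup, is a dataset property set with zfs set. The zpool get output above lists the pool's compression feature flags, not the setting of an individual dataset.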
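The per-dataset setting and the achieved savings can be read with zfs get. Here is a small sketch, assuming the zroot/usr/ports dataset used in this post. The helper compress_ratio is hypothetical and only shows the arithmetic behind the number (logical size divided by physical size), with 207M and 134M borrowed from the LSIZE and PSIZE totals of the dedup table as sample inputs:

```shell
#!/bin/sh
# Compression ratio = logical (uncompressed) size / physical (on-disk) size.
compress_ratio() {
    awk -v logical="$1" -v physical="$2" 'BEGIN { printf "%.2f\n", logical / physical }'
}

echo "sample ratio: $(compress_ratio 207 134)x"

# On a host with ZFS, ask the dataset directly.
if command -v zfs >/dev/null 2>&1; then
    zfs get compression,compressratio zroot/usr/ports
fi
```

As with deduplication, changing the compression property only applies to data written afterwards; existing blocks keep whatever compression they were written with.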
If you are new to FreeBSD and would like to learn about ports management, feel free to check my document about installing VMware Tools, which covers some basics about ports. I also strongly recommend checking the FreeBSD Handbook.