
ZFS

Enable extended attributes for Linux

Without this enabled, extended attributes are stored in a hidden subdirectory, which hurts performance and clutters things.

zfs set xattr=sa tank tank2 tank3
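
To confirm the property took effect (same pool names assumed):

zfs get xattr tank tank2 tank3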

Exporting ZFS over NFS

On Linux, ZFS filesystems mounted remotely over NFS don't honor the umask unless POSIX ACLs are enabled. Enable them with:

zfs set acltype=posixacl tank tank2 tank3

It's also possible to manage NFS sharing directly from ZFS properties rather than exports(5) entries.

Some notes:

  • No need to bind mount the filesystem in /export/
  • Child datasets inherit this property, but can override it in their own properties
  • You can manually toggle sharing with zfs share and zfs unshare (-a for all filesystems)

zfs set sharenfs=sec=krb5,rw=@192.168.1.0/24,rw=@10.12.0.0/24,async,no_subtree_check,root_squash zstorage
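
To check what actually got exported (showmount comes from nfs-utils; assumes you're on the NFS server itself):

zfs get sharenfs zstorage
showmount -e localhost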

Other common tasks with ZFS

List pools, check status:

zfs list
zpool list
zpool status
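
A quick health check that only reports pools exhibiting problems:

zpool status -x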

Mounting, unmounting, mount all

zfs mount tank
zfs unmount tank
zfs mount -a
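
Running zfs mount with no arguments lists what's currently mounted:

zfs mount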

Prepare pool for another machine

To make a pool ready for importing elsewhere, mark its members inactive by exporting it:

zpool export tank

Automatic import may use generic device names (e.g. /dev/sda). I found it preferable to use the device id because it identifies the disk more reliably. To import a pool using a specific device naming scheme (by-id, by-uuid, etc.):

zpool import -d /dev/disk/by-id tank
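
Running zpool import with no pool name just lists what's available to import from that directory, which is handy before committing:

zpool import -d /dev/disk/by-id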

Scan and import existing pools:

systemctl start zfs-import-scan.service
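
To see which ZFS-related units your distro ships and enables (unit names vary between the cache-based and scan-based import setups):

systemctl list-unit-files 'zfs*'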

Display a history of a pool/all pools:

zpool history tank
zpool history

List block devices that are ZFS members:

lsblk -f|grep zfs

Display more info than you want to know about your pools:

zdb -U /etc/zfs/zpool.cache

Remove a device from the pool:

zpool remove tank <device_name>

Replace a device with another in the pool:

zpool replace tank <old_device_name> <new_device_name>
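
A replace kicks off a resilver; progress shows up in zpool status, e.g.:

watch -n 5 zpool status tank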

I/O statistics for the whole pool, and broken out by individual disk:

zpool iostat
zpool iostat -v
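
Pass an interval in seconds to keep sampling (Ctrl-C to stop):

zpool iostat -v 5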

Considerations for an external drive pool

TODO

Always export before disconnecting. If you don't, and you need to import the pool on another machine, you'll have to force it:

zpool import -f tank

Scrubbing won't happen on a regular schedule, so you'll have to manage it yourself.
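
While the pool is connected and imported, kick one off by hand and check on it later (zvideo from the example below):

zpool scrub zvideo
zpool status zvideo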

Import by pool name (ZFS will scan the directory for member devices):
zpool import -d /dev/disk/by-id/ zvideo

OR

Import with specific device names if you want:
zpool import -d /dev/disk/by-id/usb-WD_Elements_25A3_355147364A4D3946-0\:0 -d /dev/disk/by-id/usb-WD_Elements_25A3_355147394A4D3346-0\:0 zvideo

Do your work

Unmount and export:
zfs unmount zvideo
zpool export zvideo

Power off and remove the USB device (udiskie, udisksctl power-off).
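
For example, assuming the mirror members show up as /dev/sdc and /dev/sdd (device names here are hypothetical, check lsblk first):

udisksctl power-off -b /dev/sdc
udisksctl power-off -b /dev/sdd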

Moving a pool to new drives (example with my zvideo mirror)

This was my process for moving my video mirror from a pair of full internal disks (2x 8TB) to a set of larger external disks (2x 12TB) while keeping the same data.

May be worth noting that zfs send/recv on a local machine is probably slower than using cp/rsync. Take care to preserve attributes (file dates, etc.) if you use one of these.
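
If you go the rsync route instead, a sketch that preserves hard links, ACLs, and xattrs (source and destination mountpoints assumed):

rsync -aHAX --info=progress2 /mnt/zvideo/ /mnt/new_zvideo/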

1. get history, so we can create the new pool with the same settings
sudo zpool history zvideo|grep -v import|grep -v scrub
2. init new pool, configure pool.
sudo zpool create -f new_zvideo mirror \
      /dev/disk/by-id/usb-WD_Elements_25A3_355147364A4D3946-0\:0 \
      /dev/disk/by-id/usb-WD_Elements_25A3_355147394A4D3346-0\:0
sudo zfs set xattr=sa new_zvideo
3. snapshot old pool
sudo zfs snapshot zvideo@migrate12tb
4. send old pool (takes a long time; -R copies properties)
sudo sh -c 'zfs send -vR zvideo@migrate12tb | zfs recv -F new_zvideo'
5. rename old pool
sudo zfs unmount zvideo
sudo zpool export zvideo
sudo zpool import zvideo old_zvideo
6. change old pool mountpoint
sudo zfs set mountpoint=/mnt/old_zvideo old_zvideo
7. rename new pool to old name
sudo zfs unmount new_zvideo
sudo zpool export new_zvideo
sudo zpool import -d /dev/disk/by-id/ new_zvideo zvideo
8. define mount
sudo zfs set mountpoint=/mnt/zvideo zvideo
9. finish initializing unallocated space (-w waits for it to finish before returning)
sudo zpool initialize -w zvideo


10. (optional) verify the copy. (note: I tried zfs diff, but it only compares snapshots of the same dataset)
diff -qrN /mnt/zvideo/ /mnt/old_zvideo/
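
Once you're satisfied with the copy, the migration snapshot and, eventually, the old pool can be cleaned up (destructive, so only after verifying):

sudo zfs destroy zvideo@migrate12tb
sudo zpool destroy old_zvideo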

My zpool history

An abridged (scrubs and automatic imports removed) zpool history of my pools:

zpool history
History for 'zbackup' (4x 2TB raidz):
zpool create -f -m /var/backups/ zbackup raidz /dev/disk/by-id/ata-WDC_WD20EFRX_WD-1 /dev/disk/by-id/ata-WDC_WD20EFRX_WD-2 /dev/disk/by-id/ata-WDC_WD20EFRX_WD-3 /dev/disk/by-id/ata-WDC_WD20EFRX_WD-4
zpool upgrade zbackup

History for 'zhome' (2x 240GB SSD in mirror + spare):
zpool create zhome mirror ata-SATA_SSD_1 ata-SATA_SSD_2
zpool add zhome spare ata-SATA_SSD_3
zfs set mountpoint=/home zhome
zpool upgrade zhome
zpool trim zhome ata-SATA_SSD_1
zpool trim zhome ata-SATA_SSD_2
zpool set autotrim=on zhome
zpool set autotrim=off zhome
zfs set xattr=sa zhome

History for 'zstorage' (4x 4TB raidz2 + optane cache):
zpool create zstorage raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd
zfs set mountpoint=/mnt/zstorage zstorage
zpool import -f zstorage
zpool import zstorage
zpool export zstorage
zpool import -d /dev/disk/by-id/ zstorage
zpool add zstorage cache /dev/disk/by-id/nvme-INTEL_MEMPEK1W016GA
zpool upgrade zstorage
zfs set xattr=sa zstorage
zfs set acltype=posixacl zstorage

History for 'zstorage' (2x 4TB mirror, lz4 compression, NFS share):
zpool create -m /mnt/zstorage -o feature@lz4_compress=enabled -O compression=on -O acltype=posixacl -O xattr=sa zstorage mirror /dev/disk/by-id/ata-WDC_WD40EFRX_WD-1 /dev/disk/by-id/ata-WDC_WD40EFRX_WD-2
zfs receive -F zstorage
zfs set sharenfs=sec=krb5,rw=@192.168.1.0/24,rw=@10.12.0.0/24,async,no_subtree_check,root_squash zstorage

History for 'zvideo' (2x 8TB mirror):
zpool create -f -m /mnt/videotemp zvideo mirror /dev/disk/by-id/ata-WDC_WD80EMAZ-1 /dev/disk/by-id/ata-WDC_WD80EMAZ-2
zfs set mountpoint=/mnt/zvideo zvideo
zpool import zvideo
zpool clear zvideo
zpool upgrade zvideo
zfs set xattr=sa zvideo zbackup

SSD cache drives

SSD cache (L2ARC) drives probably aren't worth it unless your working set exceeds RAM.

For example, building this document touches at least a few hundred megabytes of images, and the cache gained only about 0.5s out of 48s.

Setup was two 8TB WD WD80EMAZ drives in a mirror and a Kowin 256GB M.2 SATA SSD as cache.

make clean-html; time make html:
MIRROR, NO CACHE
    real  0m47.518s 0m47.493s 0m47.559s 0m47.846s 0m47.977s
    user  3m23.100s 3m23.030s 3m22.193s 3m22.782s 3m22.827s
    sys   0m13.766s 0m13.356s 0m13.298s 0m13.390s 0m13.323s

MIRROR, CACHE (256GB SATA SSD)
    real  0m46.994s 0m46.926s 0m46.998s 0m47.110s 0m47.071s
    user  3m21.965s 3m22.086s 3m22.342s 3m22.285s 3m22.709s
    sys   0m13.348s 0m12.908s 0m12.870s 0m13.360s 0m13.254s
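
In hindsight, checking ARC hit rates first would have predicted this. These tools ship with OpenZFS on Linux (names may differ slightly by version):

arc_summary
arcstat 5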