ZFS on Linux
READ FIRST
Some considerations when working with ZFS
- ZFS builds pools from vdevs (virtual devices), not directly from physical disks.
- Be careful about how you add new disks to the array. Do not add or remove disks at random (the exceptions being upgrading disks or replacing a failed one).
- ZFS is very powerful, be mindful of what you are going to do and plan it out!
- After a vdev is created, it can never be removed from the pool, and you cannot add disks into it.
Example:
  NAME        STATE     READ WRITE CKSUM
  pool4tb     ONLINE       0     0     0
    raidz1-0  ONLINE       0     0     0
      sdb     ONLINE       0     0     0
      sdd     ONLINE       0     0     0
raidz1-0 is a vdev. To add more disks (other than hot spares) you must create a second vdev. In this case we are running two drives in a single raidz1 vdev, so it would be best to add a second pair of drives as another raidz1 vdev.
  NAME        STATE     READ WRITE CKSUM
  pool4tb     ONLINE       0     0     0
    raidz1-0  ONLINE       0     0     0
      sdb     ONLINE       0     0     0
      sdd     ONLINE       0     0     0
    raidz1-1  ONLINE       0     0     0
      sde     ONLINE       0     0     0
      sdf     ONLINE       0     0     0
Now data will be striped across both vdevs.
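For reference, a second two-disk raidz1 vdev like the one shown above could be attached to the existing pool with zpool add (a sketch; the device names follow this example and may differ on your system):

zpool add pool4tb raidz sde sdf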
Installation
It has been reported that when installing zfs and its dependencies at the same time, the kernel modules will not get created. Below are the current steps I found to work when installing ZFS.
yum -y install epel-release
Make sure the system is completely up to date.
yum -y update
reboot
After reboot
yum -y install kernel-devel
yum -y localinstall --nogpgcheck http://archive.zfsonlinux.org/epel/zfs-release.el7.noarch.rpm
yum -y install spl
If everything was done right, the following commands will take a while to complete (depending on hardware)
yum -y install zfs-dkms
yum -y install zfs
/sbin/modprobe zfs
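To sanity-check that the kernel module actually built and loaded (a quick check, not part of the original procedure):

lsmod | grep zfs
modinfo zfs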
At this point you can create your pool. Most of the time we will be interested in a RAIDZ configuration. Depending on how much parity you are interested in, use raidz/raidz1 (single parity), raidz2 (double parity), or raidz3 (triple parity).
zpool create <name of pool> raidz <disk1> <disk2> <etc>
NOTE: By default this will create a mount point of "/<name of pool>"
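For example, the pool4tb pool from the earlier output could be created like this (disk names are illustrative and will differ per system); it would be mounted at /pool4tb by default:

zpool create pool4tb raidz /dev/sdb /dev/sdd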
To add a spare drive
zpool add <name of pool> spare <disk>
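After adding it, the spare is listed in its own spares section of the status output:

zpool status <name of pool>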
Make sure to enable automatic rebuild when a drive fails, especially when using hot spares.
zpool set autoreplace=on <name of pool>
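autoreplace is off by default; the current value can be confirmed with:

zpool get autoreplace <name of pool>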
Create ZFS Volumes
zfs create <name of pool>/<Volume Name>
zfs set mountpoint=<mount point> <name of pool>/<Volume Name>
Example:
zfs create pool4tb/archive
mkdir /archive
zfs set mountpoint=/archive pool4tb/archive
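To verify the dataset and where it is mounted (names follow the example above):

zfs list -o name,mountpoint pool4tb/archive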
Additional Options
To enable compression
zfs set compression=lz4 <name of pool>
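Once data has been written, the space savings can be checked via the compressratio property (the pool/dataset name here is a placeholder):

zfs get compression,compressratio <name of pool>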
To increase the number of copies of a file on a dataset
zfs set copies=<1|2|3> <name of pool>/<dataset>
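For example, to keep two copies of every block in the archive dataset from the earlier example (this only affects data written after the property is set):

zfs set copies=2 pool4tb/archive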
To have the pool auto-expand
zpool set autoexpand=on <name of pool>
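Note that the extra capacity only becomes available once every disk in a vdev has been replaced with a larger one; an already-replaced disk can also be expanded manually, for example:

zpool online -e <name of pool> <disk>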
- Encryption
http://www.makethenmakeinstall.com/2014/10/zfs-on-linux-with-luks-encrypted-disks/
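The linked article layers ZFS on top of LUKS containers; the rough shape of that setup is sketched below (device names, mapper names, and the pool name are placeholders, see the article for the full procedure):

cryptsetup luksFormat /dev/sdX
cryptsetup luksOpen /dev/sdX crypt_sdX
cryptsetup luksFormat /dev/sdY
cryptsetup luksOpen /dev/sdY crypt_sdY
zpool create <name of pool> raidz /dev/mapper/crypt_sdX /dev/mapper/crypt_sdY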
EXAMPLE
1x 2TB HDD: sdb
4x 1TB HDDs: sdc sdd sde sdf

Using the above drives it is possible to create a variety of deployments. In this example we will create a RAID5-like configuration that spans three 2TB devices. We start by creating the pools and adding the drives.

[root@nas ~]# zpool create -f set1 raidz /dev/sdc /dev/sdd
[root@nas ~]# zpool create -f set2 raidz /dev/sde /dev/sdf
[root@nas ~]# zpool status
  pool: set1
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        set1        ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            sdc     ONLINE       0     0     0
            sdd     ONLINE       0     0     0

errors: No known data errors

  pool: set2
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        set2        ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            sde     ONLINE       0     0     0
            sdf     ONLINE       0     0     0

errors: No known data errors

[root@nas ~]# zfs create -V 1.50T set1/vdev1
[root@nas ~]# zfs create -V 1.50T set2/vdev1
[root@nas ~]# zfs list
NAME         USED  AVAIL  REFER  MOUNTPOINT
set1        1.55T   214G  57.5K  /set1
set1/vdev1  1.55T  1.76T    36K  -
set2        1.55T   214G  57.5K  /set2
set2/vdev1  1.55T  1.76T    36K  -

[root@nas ~]# ls /dev/
<condensed output> zd0 zd16

[root@nas ~]# zpool create -f data raidz1 /dev/sdb /dev/zd0 /dev/zd16
[root@nas ~]# zpool list
NAME   SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
data  4.47T   896K  4.47T         -     0%     0%  1.00x  ONLINE  -
set1  1.81T   742K  1.81T         -     0%     0%  1.00x  ONLINE  -
set2  1.81T   429K  1.81T         -     0%     0%  1.00x  ONLINE  -

[root@nas ~]# df -lh
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        33G  1.6G   32G   5% /
devtmpfs        3.8G     0  3.8G   0% /dev
tmpfs           3.8G     0  3.8G   0% /dev/shm
tmpfs           3.8G  8.5M  3.8G   1% /run
tmpfs           3.8G     0  3.8G   0% /sys/fs/cgroup
/dev/sda1       497M  200M  298M  41% /boot
tmpfs           775M     0  775M   0% /run/user/0
set1            214G  128K  214G   1% /set1
set2            214G  128K  214G   1% /set2
data            2.9T  256K  2.9T   1% /data
As you can see there is a LOT of wasted space using this method. Where we should have ~4TB of usable space, we end up with ~3TB. This was only an example; the better option is to create multiple independent datasets, as sketched below.
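A rough sketch of that simpler layout (names are illustrative): keep one pool per set of like-sized disks and create plain filesystem datasets in them, rather than layering zvols into another pool. The lone 2TB disk can be used as its own single-disk pool or kept for another purpose.

zpool create set1 raidz /dev/sdc /dev/sdd
zpool create set2 raidz /dev/sde /dev/sdf
zfs create set1/archive
zfs create set2/backups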