Gentoo + ZFS on Linux + RAID-Z1 = Awesome

Now that I have had a few days to form an opinion on ZFS, I will provide a more in-depth analysis, while still only scratching the surface. Before I begin, let me explain my build.

  • AMD ASRock E350M1-USB3 Mini-ITX motherboard (4x SATA 3 ports, all apparently on a single controller)
  • 4GB DDR3 1333 RAM (1x4GB)
  • 3x2TB HDDs (2 new and cheap, 1 six-month-old external that I pulled out of its case)
  • 1x120GB 2.5″ HDD manufactured in June 2007, used as the ext4 root

ZFS as a whole is amazing. Copy-on-write, clones, snapshots, compression, deduplication, NFS sharing, SMB sharing (not on Linux yet), and encryption (currently Solaris closed-source only) are some of its best features, to name a few. The feature set is simply astounding, which is what makes ZFS the volume manager of choice (I say volume manager because the package contains more than just a filesystem) for anyone interested in managing how their data is stored.

Personally, I have no real need for ZFS; I have plenty of space, copies of my data in two physical locations 700 miles apart, and infrequent hard copies of my data. The real motivator for trying ZFS is pure “sport,” if you will. Without further ado, here follow my notes on the whole setup.

I began my setup with Pendor’s guide over at GitHub. Pendor wrote a Gentoo Linux overlay, built a custom LiveCD, and put together a nice, easy guide for installing ZFS. His overlay got me up and running much faster than I otherwise would have been. I was able to shortcut some of the steps because I did not aim to have a ZFS root. Regardless, I thank Pendor for his excellent guide.

The first trouble I ran into was that, even after switching to the HEAD revision of the ZFSOnLinux project on GitHub, I could not compile SPL (the Solaris Porting Layer) under any kernel in the 3.0 line. SPL seems to rely on some kernel functions whose interfaces have changed. After moving to 2.6.39, the compile went as smooth as warm butter. From there, it has been smooth sailing. I have not encountered any bugs at all, which is semi-expected since the code isn’t a rewrite of ZFS but an adaptation for Linux.

Creating the initial zpool is, like every other ZFS command, simple. It is a one-liner that sets up three disks in a RAID-Z1 (single-parity, analogous to RAID-5) format:

zpool create rpool raidz sdb sdc sdd
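A quick sanity check right after creation never hurts; the two commands below (a sketch, with device names matching the create command above) show the vdev layout, health, and capacity of the new pool:

```shell
# Show the raidz vdev layout and health of the new pool
zpool status rpool
# Show total size, allocated space, and capacity percentage
zpool list rpool
```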

Next, I created the first filesystem and shared it over NFS. This one will be used for my Time Machine backups over the network:

zfs create rpool/cleteTimeMachine
zfs set sharenfs=on rpool/cleteTimeMachine

There is one small problem with the previous commands: Time Machine cannot be limited through its own user interface, so left unchecked it will grow to consume all the space it can. To fix that, I set a quota:

zfs set quota=500g rpool/cleteTimeMachine

Time Machine doesn’t compress its backups, so I have ZFS do it for me:

zfs set compression=gzip rpool/cleteTimeMachine

After setting up Time Machine, I created a few more filesystems: one to back up the NAS itself (the root drive isn’t part of the RAID), one to back up my family’s computers (CrashPlan), and one for my files.

zfs create rpool/cleteFiles
zfs create rpool/crashPlan
zfs create rpool/cleteNASBackup

Properties are inherited down the filesystem hierarchy, so you can set deduplication and compression at the pool level and override them on the filesystems that perform their own deduplication and/or compression:

zfs set dedup=on rpool
zfs set compression=gzip rpool
zfs set dedup=off rpool/cleteTimeMachine
zfs set dedup=off rpool/crashPlan
zfs set compression=off rpool/crashPlan
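To double-check which filesystems inherited the pool-level settings and which carry local overrides, a recursive `zfs get` works well; the SOURCE column distinguishes the two:

```shell
# Recursively show compression and dedup for the pool and children;
# SOURCE reads "local" for overrides, "inherited from rpool" otherwise
zfs get -r compression,dedup rpool
```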

Creating snapshots is easy and so is destroying data:

zfs snapshot rpool/cleteTimeMachine@12345abc
rm -rf /rpool/cleteTimeMachine/*
zfs rollback rpool/cleteTimeMachine@12345abc
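Snapshots can also be listed and, when you want a writable copy instead of a full rollback, cloned. The clone target name below (rpool/tmRestore) is just an illustrative choice, not something from my setup:

```shell
# List every snapshot in the pool and the space each holds
zfs list -t snapshot -r rpool
# Clone a snapshot into a new writable filesystem (name is hypothetical)
zfs clone rpool/cleteTimeMachine@12345abc rpool/tmRestore
```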

I really cannot stress enough how simple ZFS/ZVOL has been to use. It has been a painless experience so far. Keeping tabs on your filesystems is just as easy. I aliased together a command that prints information about the pool: its health, space usage, and compression and deduplication ratios.

cleteNAS ~ # zstatus
rpool                    559G  3.02T  38.6K  /rpool
rpool/cleteFiles         199G  3.02T  93.4G  /rpool/cleteFiles
rpool/cleteNASBackup    1.18G  98.8G   689M  /rpool/cleteNASBackup
rpool/cleteTimeMachine   253G   247G   252G  /rpool/cleteTimeMachine
rpool/crashPlan          103G   297G   103G  /rpool/crashPlan
rpool  5.44T   831G  4.63T    14%  1.01x  ONLINE  -
NAME                    PROPERTY       VALUE  SOURCE
rpool/cleteFiles        compressratio  1.00x  -
rpool/cleteNASBackup    compressratio  3.14x  -
rpool/cleteTimeMachine  compressratio  1.21x  -
rpool/crashPlan         compressratio  1.00x  -
all pools are healthy
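My actual alias isn’t shown above, but a sketch of what zstatus could combine to produce that output looks like this (the final `zpool status -x` is what prints the “all pools are healthy” line):

```shell
# Sketch of a zstatus alias: filesystem space usage, pool capacity and
# dedup ratio, per-filesystem compression ratios, and a health summary
alias zstatus='zfs list -r rpool; zpool list rpool; \
zfs get -r compressratio rpool; zpool status -x'
```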

ZFS on Linux performance is lacking. To be clear, I do not use ZFS-Fuse, but ZFSOnLinux, a real port of ZFS built as native kernel modules. While the kernel-enabled version is much faster than ZFS-Fuse, it is not yet optimized for speed at all. I have not really tested read performance, but write performance comes to roughly 8-10MB/sec with deduplication and compression on; without either, it climbs to 20+MB/sec. One thing to note is that I was testing over 802.11n, so these numbers should be taken with a grain of salt. My testing consisted of writing to the pool from two local computers and one remote computer at a time.

One other performance-related observation: deduplication and gzip compression together lower write speed to 8-10MB/sec, versus well over 25MB/sec with both disabled. They also use 60% or so of each core. All of this is expected with software RAID and the low processing power of an 18W TDP processor.

NOTE: I have since verified these speeds with local dd testing, so the numbers above are a fair representation of performance.

I have recently turned off atimes (access times) in hopes that it will give a small bump in performance. Since this machine will be used for backups and file storage, I am not very concerned with speed. What I am concerned about is the drives being able to sleep. So far, they haven’t spun down a single time, despite the fact that I have all drives set to a 5-minute spin-down. I believe this is mostly due to the CrashPlan engine. There seems to be a bug where CrashPlan keeps the drives awake, but I have yet to confirm this. If anyone has ideas about how to stop CrashPlan from keeping the drives up, I would greatly appreciate the help.
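For reference, atime is an inheritable property like the others, and drive spin-down on Linux is typically configured with hdparm; a -S value of 60 means 60 × 5-second units, i.e. 5 minutes. The device names below are assumptions matching the pool-creation command:

```shell
# Disable access-time updates pool-wide; child filesystems inherit it
zfs set atime=off rpool
# Ask each data drive to spin down after 5 minutes of inactivity
hdparm -S 60 /dev/sdb /dev/sdc /dev/sdd
```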

Edit: I am pleased to report that my drive performance has improved greatly, most likely due to updates to ZFS or SPL. I have been keeping ZFS on Linux and SPL at the master branch in order to get fixes as soon as they are released. The data I have is of little importance and can easily be rebuilt, so I do not mind the risk so much.

I ran some more tests last night and saw the following performance with compression=off and dedup=off. For a RAID-Z this is expected, since ZFS on Linux is still in its early stages and performance is still a low priority:

cleteNAS media # dd if=/dev/zero of=/rpool/media/output.img bs=8k count=1024k
 1048576+0 records in
 1048576+0 records out
 8589934592 bytes (8.6 GB) copied, 150.951 s, 56.9 MB/s

With compression=gzip-1 and dedup=off (note that any compression at all increases speed greatly when writing from /dev/zero, since zeroes compress almost perfectly):

cleteNAS media # dd if=/dev/zero of=/rpool/media/output.img bs=8k count=1024k
 1048576+0 records in
 1048576+0 records out
 8589934592 bytes (8.6 GB) copied, 37.8502 s, 227 MB/s

16 thoughts on “Gentoo + ZFS on Linux + RAID-Z1 = Awesome”

  1. I also tested with dd. 10MB/sec. Test is not inaccurate.

    During my original testing I was hitting it from many boxes, both local and remote. The connections were not maxed out.

    I don’t appreciate you calling me a retard.

  2. Hi,
    very interesting project indeed. I was thinking of building something very similar and was considering the same motherboard, but after your tests I am a little worried about the speed.

    I wonder if you have had time to try any speed optimization; 25MB/sec does sound a bit slow, doesn’t it?
    It would help me a lot if you could confirm the maximum speed, e.g. uploading or downloading a big file from a single client to the ZFS pool without deduplication and compression.

    Dual-channel memory might speed things up a lot; I believe the motherboard supports that?

  3. It’s very slow indeed. I haven’t tweaked anything (I don’t know what to tweak) nor have I benchmarked again since upgrading to a 3.x kernel, which could give me some benefits. I fear that much of my issues may lie in the block size, but I don’t know for sure.

    The motherboard is great. Enough power to compile Gentoo quickly while also not raising my electric bill too much. I do not think the motherboard supports dual channel memory. You may want to double check.

    CPU is not my bottleneck currently but utilization does get very high when writing to a partition that has deduplication and compression turned on. I will benchmark again this week.

  4. Peter,

    You’re right. I was mistaken; it must have been the specific RAM that I bought that does not support dual-channel.

    Most of my writes are for backups and those that are not for backups are for storing files for later FTP or for media streaming, both of which are not HDD-intensive. The speed wasn’t a concern of mine.

    Like I said, I’ll be sure to do new benchmarks later.

  5. Any updates on this?

    I’ve had my media fileserver running Gentoo with software RAID 6 for some time, and I would like to start looking at ZFS for storage instead.

    This on an MSI P35 Platinum board with a Q6600, 6GB ram and an IBM M1015/HP SAS Expander combo.
    Attached I have 2 100GB drives in RAID1 for the OS and 12 320GB drives for storage (the great floods happened just when I wanted to buy larger drives :@)

    Software RAID is able to do local benchmarks of 500-600MB/s in both reads and writes, and when recording game footage with xsplit/dxtory, I’m able to pull 150MB/s over the network. (4x bonded 1Gbit NICs on the server, 2x bonded 1Gbit NICs on the desktop.)

    Should be able to shrink my raid 6 array by 4 disks to start a ZFS RaidZ2 system and then start moving data over, but before that, I’d like some more info on performance and possible fatal bugs.

    I’d move the box over to freenas, but FreeBSD is still being idiotic in supporting the M1015 controller (which due to its price is extremely popular), so that still isn’t an option.

  6. FreeNAS fully supports the M1015 controller (as experienced with 8.0.x and 8.2). There’s some tweaking to be added to loader.conf for it to boot smoothly, though.

    Edit loader.conf and add the following:

    hw.pci.enable_msi="0" # Driver interrupt problem on SAS2008
    hw.pci.enable_msix="0" # Driver interrupt problem on SAS2008

    And you’re good to go!

  7. Very interesting article. I agree that ZFSOnLinux is kinda slow: I have similar hardware to the one used in the benchmarks above, and here I have the following numbers on a ReiserFS3.6 running on top of LVM which is itself on top of a md 3-disk RAID5:

    # dd if=/dev/zero of=/usr2/tmp/test bs=8k count=1024k
    1048576+0 records in
    1048576+0 records out
    8589934592 bytes (8.6 GB) copied, 90.2324 s, 95.2 MB/s

    (of course dd isn’t a reasonable benchmark, but as that’s what the author used…)

    So the goode-olde ReiserFS3 is almost twice as fast as ZFSOnLinux at its fastest (i.e., with dedup and compression turned off)… and ReiserFS has the handicap of having to cope with the overhead of LVM and md, which ZFS doesn’t (thanks to its built-in RAID and volume facilities).

    Really not impressed… let’s see how things stand when the ZFSOnLinux folks start giving due attention to performance…


    Durval Menezes.

