Since the goal of this little project is to actually read my geli-encrypted zfs filesystems on a Linux laptop, I had to get a USB enclosure that supports drives bigger than 2TB; I also got a model which supports USB 3.0. The news I have is:
Ungeli and zfs-on-linux work for this task. I was able to read files and verify that their content was the same as on the Debian kFreeBSD system.
The raw disk I tested with (WDC WD30 EZRX-00DC0B0) gets ~155MiB/s at the start, ~120MiB at the middle, and ~75MiB/s at the end of the first partition according to zcav. Even though ungeli has had no serious attempt at optimization, it achieves over 90% of this read rate when zcav is run on /dev/nbd0 instead of /dev/sdb1, up to 150MiB/s at the start of the device while consuming about 50%CPU.
(My CPU does have AES instructions but I don't know for certain whether my OpenSSL uses them the way I've written the software. I do use the envelope functions, so I expect that it will. "openssl speed -evp aes-128-xts" gets over 2GB/s on 1024- and 8192-byte blocks)
Unfortunately, zfs read speeds are not that hot. Running md5sum on 2500 files totalling 2GB proceeded at an average of less than 35MB/s. I don't have a figure for this specific device when it's attached to (k)FreeBSD via sata, but I did note that the same disk scrubs at 90MB/s. On the other hand, doing a similar test on my kFreeBSD machine (but on the raidz pool I have handy, not a volume made from a single disk) I also md5sum at about 35MB/s, so maybe this is simply what zfs performance is.
All in all, I'm simply happy to know that I can now read my backups on either Linux or (k)FreeBSD.
The geli infrastructure is strongly linked with FreeBSD and I didn't discover any documentation of the data formats. So, in the wake of my concerns about being able to read backups on Linux I read a lot of freebsd source code and now I've written a portable (I hope) userspace program which can decrypt at least a toy geli-encrypted volume.
It's called ungeli and I'm going to try letting it live on github instead of a personal git repo. So far it's a toy in that I've only tested it on a toy volume, the performance is not tuned, but it does seem to work and due to is smallness (<600SLOC at present) it may be a useful second reference if you too wish to understand geli.
Update: I added nbd support and squashed some bugs. Now I've succeeded in retrieving files from a geli-encrypted zfs volume on Linux using zfs-on-linux:
# ./ungeli -j geli-passfile npool.img /dev/nbd0 & # zpool import -d /dev -o readonly=on npool # (imports /dev/nbd0) # cat /npool/example/GPL-3 GNU GENERAL PUBLIC LICENSE Version 3, 29 June 2007 Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/> ...
As I recently discussed, I use zfs replication for my off-site backups, manually moving volumes from my home to a second location on a semi-regular schedule.
Of course, I would rather that if one of these drives were stolen or lost that the thief not have a copy of all my data. Therefore, I use geli to encrypt the entire zpool.
On my new Debian GNU/kFreeBSD system, my backup strategy has changed. On the previous system, I relied on incremental dumps and a DAT160 tape drive, which has an 80GB uncompressed capacity. When you have a few hundred gigabytes of photos to back up, this is an inconvenient solution. On the new system, I am using multiple removable 3TB hard drives and a set of scripts built around zfs send/receive.
Cron runs the rep.py script 4 times a day, which does a zfs send | zfs receive pipeline for each filesystem to be backed up. On a semi-regular basis (I haven't yet decided on what schedule to do this on; with tape backups I did it less than once a month even though it was comparatively easier), I remove the drive to an off-site location, return the other drive from off-site, and insert it. (There's also fiddling with zpool import/export, of course)
The rep.py script relies on the zfs python module, also of my own creation. This module has facilities for inspecting and interacting with zfs filesystems, e.g., to list filesystems and snapshots, to create and destroy snapshots, and to run replication pipelines.
The rep.py script needs to be customized for your system. Customization items are:
TARGETS = ['bpool', 'cpool'] SRCS = ['mpool', 'rpool']
"SRCS" is a list of zpools which are replicated. "TARGETS" is a list of zpools to which backups are replicated. The first available pool out of TARGETS is chosen. (so if more than one TARGET is inserted, only one will ever be used)
You can designate individual filesystems as not replicated by setting the user property net.unpy.zreplicator:skip to the exact string "1", i.e.,
zfs set net.unpy.zreplicator:skip=1 examplepool/junkfiles
Files currently attached to this page:
I'm in the process of setting up a new machine to be a home server. I'll include more boring details about this here. Trust me, it's pretty dry reading.
All older entries
Website Copyright © 2004-2017 Jeff Epler