Building a ZFS Backup Server

2015-11-16T20:30:00Z

I've been running an Ubuntu home server with ZFS-based storage for some time. What follows is a rather detailed walk-through of my implementation of a ZFS backup server. I have been meaning to implement ZFS snapshot-based backups for quite some time and I figured I may as well document the journey, not only as an aide-mémoire for myself but also in case it's of any use to other people interested in doing the same. As with everything else I write, this guide is provided as-is and your mileage may vary.

For the remainder of this article, the home server is referred to as brox and the backup server as mundo. And just in case you're wondering, the sensitive information (i.e. public keys etc.) featured in this article isn't real.


Updated 2016-01-04T21:00:00Z

The original backup server build used recycled components (cheap mobo, AMD Athlon 64 X2 CPU) and proved problematic in a number of ways, not least getting checksum errors in both zpools. I have since changed the mobo, CPU and memory. The new specification is as follows:

This has resolved (so far) the checksum problems, the dysfunctional Wake-On-LAN and the problems transferring ZFS snapshots from primary to backup server at anything over 100 Megabits/second. I've not amended the article to reflect these changes, but instead inserted this amendment to serve as a warning to anyone looking to cut corners when relying on ZFS.

Additionally, in the original article there were some problems with the way I was invoking zpool scrubs on the backup server. I was starting them remotely via ssh, and then not checking whether the scrub was still running or complete before shutting down the backup server. I'd lost sight of the fact that "zpool scrub" instantiates a background process, and there seems to be no way of forcing it to run in the foreground. Now, after starting a scrub, I check that the scrub is running, and then repeat the check every ten minutes until the scrub is complete. I have worked these amendments into this article.
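The amended scrub step can be sketched as follows. This is a minimal sketch, not my exact script: the pool name "tank" is a placeholder, and it assumes passwordless ssh and sudo are already set up for the invoking user. The status check is factored out so it works on any captured "zpool status" text.

```shell
# Succeeds while the supplied "zpool status" text reports a running scrub.
scrub_in_progress() {
    printf '%s\n' "$1" | grep -q 'scrub in progress'
}

# Start the scrub on mundo, then poll every ten minutes until it completes.
run_scrub() {
    ssh mundo "sudo zpool scrub $1"
    while scrub_in_progress "$(ssh mundo "zpool status $1")"; do
        sleep 600
    done
}
```

In the backup script, run_scrub is called with the pool name before the shutdown step, so mundo is never powered off mid-scrub.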

Sometime soon I will add an additional step to the backup script that causes a failure if there are any read/write/checksum errors in the zpool status output, and thus results in an email notification when the cron job fails.
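That planned step might look something like the sketch below. It leans on "zpool status -x", which prints "all pools are healthy" when there is nothing to report; anything else should fail the script, and hence the cron job, triggering the email. The helper name and the way it's wired in are my assumptions.

```shell
# Succeeds (exit 0) when the "zpool status -x" text indicates trouble,
# so the caller can bail out and let cron's failure email fire.
pool_unhealthy() {
    [ "$1" != "all pools are healthy" ]
}

# In the backup script, something along the lines of:
#   pool_unhealthy "$(ssh mundo zpool status -x)" && exit 1
```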


Hardware

The existing home server uses a Xeon E3-1220L v2 CPU, an efficient power supply, DDR3L ECC RAM and Noctua CPU cooler/case fans. This amounts to a package that consumes 15 Watts of electricity in its idle state, yet has reasonable compute power when required.

The new (to me) backup server has afforded no such luxuries. Whilst it would be nice to use ECC RAM, which is recommended for running ZFS, that would make this project more expensive. Electricity consumption is also less of a concern as I'd only expect it to be on for, at most, an hour each day. The only new component is the case, which I bought to match the existing home server.

I initially acquired a pair of Asus P5 series boards, each complete with an Intel Core 2 Quad (Q6600) CPU. These were binned components from what we affectionately call "the graveyard" at work. One of the boards turned out to have a series of visibly blown capacitors, so that was quickly returned. The second looked promising, but no amount of fettling would get it to boot stably. Plan B turned up an XFS MI-A78U-8309 AM2 motherboard complete with AMD Athlon 64 X2 CPU from eBay for the princely sum of £15. Normally, I'd consider this to be a terrible choice due to its low CPU mark and abysmal power consumption (65 Watts at idle, before adding disks).

When it arrived, I assembled the components and added four 2 TB hard disk drives recycled out of my desktop computer and my old NAS box. An additional 500 GB disk was salvaged out of an old media player to use as a boot drive. Memory (4 x 1 GB DDR2) was sourced from a box of spare parts, as was a 550 Watt PSU.

I've been tempted to move away from Ubuntu for some time - mainly out of the frustration I've had with the sometimes lengthy wait for patches to show up in the Ubuntu repositories. That said, the ZFS packages for Ubuntu have been rock-solid reliable and the 5 year extended support period for Ubuntu's LTS releases is useful. Debian stable releases are, in contrast, only supported for 12 months following the next stable release.

The first thing I did was boot it up. I was pleased to see it successfully POST. I rebooted and ran through the BIOS - it recognised all five SATA disks. We're off to a promising start.

Ubuntu Installation

I downloaded Ubuntu Server 14.04.3 LTS and prepared USB installation media. At this point, I encountered the first hitch - I couldn't get the XFS MI-A78U mobo to boot from USB media. Period. The only spare ROM drive I had in my parts bin was a PATA drive. Fortunately, this board has one PATA connector. I plumbed in the ROM drive and prepared an Ubuntu installation CD. On rebooting the server, I again checked the BIOS. The ROM drive was recognised, but instead of seeing five SATA disks I only saw four. Early indication was that this was an effect of connecting a PATA device to the motherboard. Never mind - I'll just have to disconnect the ROM drive post-install. Yes, this inflexibility is sub-optimal, but we're talking about a backup for a home server, rather than anything mission critical.

I rebooted with the Ubuntu Server installation CD in the ROM drive and launched Memtest86+ - it ran successfully for three entire passes before I rebooted the machine again and this time chose to test the installation media.

I continued with the installation, which went swiftly, until I rebooted. On boot, everything proceeds as normal until standard output shows plymouth-upstart-bridge respawning too fast, followed by a blank display.

Post-installation Boot Woes

I rebooted (again) and using the Grub menu, dropped into a root recovery console. The file system is, by default, mounted read only. The first step is mounting it read-write:

# mount -o rw,remount /

I then edited the plymouth-upstart-bridge.conf job

# vi /etc/init/plymouth-upstart-bridge.conf

And added to the end of the file:

post-stop exec sleep 2

Sure enough, this significantly reduced the visible complaints from the kernel about plymouth-upstart-bridge, but the blank display persisted. Once again I rebooted, dropped to the recovery console and remounted the root file system read-write. I then made some changes to /etc/default/grub:

# vi /etc/default/grub

And changed the line:

GRUB_CMDLINE_LINUX_DEFAULT=""

To:

GRUB_CMDLINE_LINUX_DEFAULT="noplymouth nosplash nomodeset"

This basically disables the splash screen and stops the kernel trying to use anything but the default BIOS video mode. After running update-grub to regenerate the boot configuration, I rebooted one last time and the black screen of death was supplanted by a log-in prompt.

Configuring Networking

The first thing I tried, post-boot, was pinging the new server.

biscuitninja@colnago ~ $ ping mundo
PING mundo.bikeshed.internal (172.16.3.49) 56(84) bytes of data.
64 bytes from mundo.bikeshed.internal (172.16.3.49): icmp_seq=1 ttl=64 time=1.11 ms
64 bytes from mundo.bikeshed.internal (172.16.3.49): icmp_seq=2 ttl=64 time=3.73 ms
64 bytes from mundo.bikeshed.internal (172.16.3.49): icmp_seq=3 ttl=64 time=1.35 ms
^C
--- mundo.bikeshed.internal ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 1.110/2.065/3.736/1.185 ms

Result. It's received an IP address and DNS records courtesy of the DHCP server running on brox. Now, for most of my devices, I'd create a DHCP reservation. In this instance, however, I want to statically define an IP address. I'm planning some sort of crude/semi-automated DHCP/DNS failover, whereby when I boot this server, if the DHCP/DNS services on brox are unavailable, it will take the reins. True high availability for most home set-ups is overkill - especially on a box that, with its disks, will eat the best part of 100 Watts.

I ssh into mundo and swap out its DHCP configuration for a static address:

biscuitninja@mundo:~$ sudo vi /etc/network/interfaces

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet static
    address 172.16.1.52
    netmask 255.255.248.0
    gateway 172.16.1.1
    dns-nameservers 172.16.1.52 127.0.0.1

biscuitninja@mundo:~$ sudo ifdown eth0 ; sudo ifup eth0

As a result I lose the ssh session. In another terminal window I check that I can ping the new IP address and then wander off to update my DNS records accordingly.

$ ping 172.16.1.52
PING 172.16.1.52 (172.16.1.52) 56(84) bytes of data.
64 bytes from 172.16.1.52: icmp_seq=1 ttl=64 time=0.504 ms
64 bytes from 172.16.1.52: icmp_seq=2 ttl=64 time=0.265 ms
64 bytes from 172.16.1.52: icmp_seq=3 ttl=64 time=0.249 ms
^C
--- 172.16.1.52 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 2999ms
rtt min/avg/max/mdev = 0.249/0.320/0.504/0.107 ms

$ ssh brox

$ sudo rndc freeze bikeshed.internal
$ sudo vi /var/lib/bind/db.bikeshed.internal

I incremented the zone's serial number and amended the record for mundo to match the new static IP address.

mundo                   A       172.16.1.52

$ sudo rndc thaw bikeshed.internal

I repeat the process to remove the old PTR record from the DHCP reverse lookup zone (3.16.172.in-addr.arpa) and add it to the correct reverse lookup zone (1.16.172.in-addr.arpa).
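Spelled out, the reverse-zone edits follow the same freeze/edit/thaw pattern as the forward zone. The zone-file names below are assumptions based on the forward zone's naming convention and may differ on your system:

```shell
# Remove the stale PTR record (old DHCP address 172.16.3.49):
sudo rndc freeze 3.16.172.in-addr.arpa
sudo vi /var/lib/bind/db.3.16.172.in-addr.arpa
# ...delete "49  PTR  mundo.bikeshed.internal." and bump the serial
sudo rndc thaw 3.16.172.in-addr.arpa

# Add the new PTR record (static address 172.16.1.52):
sudo rndc freeze 1.16.172.in-addr.arpa
sudo vi /var/lib/bind/db.1.16.172.in-addr.arpa
# ...add "52  PTR  mundo.bikeshed.internal." and bump the serial
sudo rndc thaw 1.16.172.in-addr.arpa
```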

And then finally, I test the changes:

biscuitninja@colnago ~ $ nslookup mundo
Server:     172.16.1.51
Address:    172.16.1.51#53

Name:   mundo.bikeshed.internal
Address: 172.16.1.52

biscuitninja@colnago ~ $ nslookup 172.16.1.52
Server:     172.16.1.51
Address:    172.16.1.51#53

52.1.16.172.in-addr.arpa    name = mundo.bikeshed.internal

biscuitninja@colnago ~ $ ping mundo
PING mundo.bikeshed.internal (172.16.1.52) 56(84) bytes of data.
64 bytes from mundo.bikeshed.internal (172.16.1.52): icmp_seq=1 ttl=64 time=2.25 ms
64 bytes from mundo.bikeshed.internal (172.16.1.52): icmp_seq=2 ttl=64 time=4.67 ms
64 bytes from mundo.bikeshed.internal (172.16.1.52): icmp_seq=3 ttl=64 time=1.40 ms
^C
--- mundo.bikeshed.internal ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 1.406/2.777/4.675/1.386 ms

Finally, I want to make sure "hostname -f" returns mundo's fully qualified domain name.

biscuitninja@mundo:~$ hostname -f
mundo

biscuitninja@mundo:~$ sudo vi /etc/hosts

Change the line

127.0.1.1       mundo

To

127.0.1.1      mundo.bikeshed.internal mundo.bikeshed mundo

And then test

biscuitninja@mundo:~$ hostname -f
mundo.bikeshed.internal

Securing SSH

Thus far we've built the server, moved it onto a static IP and reconfigured its DNS records. Before moving on, it's time to generate a public/private key pair for SSH and disable password authentication.

$ ssh-keygen -t rsa -b 3072 -f ~/.ssh/id_rsa_mundo.bikeshed.internal
Generating public/private rsa key pair.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/biscuitninja/.ssh/id_rsa_mundo.bikeshed.internal.
Your public key has been saved in /home/biscuitninja/.ssh/id_rsa_mundo.bikeshed.internal.pub.
The key fingerprint is:
99:52:54:20:d1:e7:8d:2d:95:0d:ba:b1:98:22:fa:34 biscuitninja@colnago
The key's randomart image is:
+--[ RSA 3072]----+
|      oooo. .+   |
|       o. ..o .  |
|        .oo=     |
|       . =++o    |
|    . o S o.     |
|   . . o         |
|  . E            |
|   o .           |
|    .            |
+-----------------+

biscuitninja@colnago ~ $ ssh-copy-id -o PreferredAuthentications=password -i ~/.ssh/id_rsa_mundo.bikeshed.internal mundo
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
biscuitninja@mundo's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh -o 'PreferredAuthentications=password' 'mundo'" and check to make sure that only the key(s) you wanted were added. 

At this point I will take the opportunity to amend my ssh config and backup the keys:

biscuitninja@colnago ~ $ vi ~/.ssh/config

Host mundo mundo.bikeshed.internal
    Hostname mundo.bikeshed.internal
    IdentityFile ~/.ssh/id_rsa_mundo.bikeshed.internal
    User biscuitninja

biscuitninja@colnago ~ $ cp ~/.ssh/* /mnt/it/infrastructure/secure/ssh/.
biscuitninja@colnago ~ $ rm /mnt/it/infrastructure/secure/ssh/known_hosts*
biscuitninja@colnago ~ $ chmod 400 /mnt/it/infrastructure/secure/ssh/*

I can then test my connection without having to specify the identity file.

biscuitninja@colnago ~ $ ssh mundo
The authenticity of host 'mundo.bikeshed.internal (172.16.1.52)' can't be established.
ECDSA key fingerprint is e2:63:b0:84:6f:3a:d8:e1:cc:e6:65:4a:64:85:8b:4b.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'mundo.bikeshed.internal' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 14.04.3 LTS (GNU/Linux 3.19.0-25-generic x86_64)

 * Documentation:  https://help.ubuntu.com/

  System information as of Fri Nov  6 21:21:50 GMT 2015

  System load:  0.0                Processes:           98
  Usage of /:   0.3% of 458.32GB   Users logged in:     1
  Memory usage: 1%                 IP address for eth0: 172.16.1.52
  Swap usage:   0%

  Graph this data and manage this system at:
    https://landscape.canonical.com/

56 packages can be updated.
25 updates are security updates.

Last login: Fri Nov  6 20:47:13 2015 from 172.16.3.41
biscuitninja@mundo:~$

Whilst I have a secure shell session running, I'll tweak the configuration to allow only public-key authentication.

biscuitninja@mundo:~$ sudo vi /etc/ssh/sshd_config

Amend/add the following entries:

PermitRootLogin no
PasswordAuthentication no
AllowTcpForwarding no
Banner /etc/issue.net

Save changes and close the editor. I'll then set up my preferred unauthorised-use warning as an ssh banner.

biscuitninja@mundo:~$ sudo vi /etc/issue.net

Insert the following, amending as appropriate:

********************************************************************
* Unauthorized use of this computer system constitutes a criminal  *
* offense.                                                         *
*                                                                  *
* Anyone accessing this system expressly consents to the           *
* monitoring of their activity.                                    *
*                                                                  *
* Any suspicious or criminal activity will be reported to law      *
* enforcement and/or relevant service providers, rendering the     *
* perpetrators liable to criminal investigations and other         *
* appropriate sanctions.                                           *
********************************************************************

Then restart the ssh daemon

biscuitninja@mundo:~$ sudo service ssh restart

Tweak apt and then Update

I'm a bit of a control freak, at least as far as technology goes. So I tend to prevent apt-get from automatically installing recommended packages.

biscuitninja@mundo:~$ sudo vi /etc/apt/apt.conf.d/99bikeshed-tweaks

Insert the following lines:

APT::Install-Recommends "false";
APT::Install-Suggests "false";

Save and close the new file. Then let's bring the new server bang up-to-date:

biscuitninja@mundo:~$ sudo apt-get update ; sudo apt-get upgrade -y

Set-up and Test Wake-On-LAN (WOL)

My first thought was to shut down mundo and hunt around the BIOS settings for any configuration relating to "Wake-On-LAN". After an extensive search, I drew a blank. I could see a resume setting within the Advanced Power Management menu, but I chose to ignore it for now as I suspect that relates to resuming from a suspended state, whereas I'm more interested in starting the server from a powered-off state.

biscuitninja@mundo:~$  sudo ethtool eth0
[sudo] password for biscuitninja: 
Settings for eth0:
    Supported ports: [ TP ]
    Supported link modes:   10baseT/Half 10baseT/Full 
                            100baseT/Half 100baseT/Full 
                            1000baseT/Half 1000baseT/Full 
    Supported pause frame use: No
    Supports auto-negotiation: Yes
    Advertised link modes:  10baseT/Half 10baseT/Full 
                            100baseT/Half 100baseT/Full 
                            1000baseT/Half 1000baseT/Full 
    Advertised pause frame use: No
    Advertised auto-negotiation: No
    Speed: 1000Mb/s
    Duplex: Full
    Port: Twisted Pair
    PHYAD: 0
    Transceiver: internal
    Auto-negotiation: on
    MDI-X: Unknown
    Supports Wake-on: pg
    Wake-on: d
    Current message level: 0x000000ff (255)
                   drv probe link timer ifdown ifup rx_err tx_err
    Link detected: yes

The ethtool output shows that Wake-On-LAN is supported - as denoted by the letter 'g' in the "Supports Wake-on" field. There's no 'g' in the "Wake-on" field, so we have to try and enable it.

biscuitninja@mundo:~$ sudo ethtool -s eth0 wol g
biscuitninja@mundo:~$ sudo ethtool eth0
Settings for eth0:
    Supported ports: [ TP ]
    Supported link modes:   10baseT/Half 10baseT/Full 
                            100baseT/Half 100baseT/Full 
                            1000baseT/Half 1000baseT/Full 
    Supported pause frame use: No
    Supports auto-negotiation: Yes
    Advertised link modes:  10baseT/Half 10baseT/Full 
                            100baseT/Half 100baseT/Full 
                            1000baseT/Half 1000baseT/Full 
    Advertised pause frame use: No
    Advertised auto-negotiation: No
    Speed: 1000Mb/s
    Duplex: Full
    Port: Twisted Pair
    PHYAD: 0
    Transceiver: internal
    Auto-negotiation: on
    MDI-X: Unknown
    Supports Wake-on: pg
    Wake-on: g
    Current message level: 0x000000ff (255)
                   drv probe link timer ifdown ifup rx_err tx_err
    Link detected: yes

Okay, that looks promising. The next step is to try and test it. First let's obtain the MAC address:

biscuitninja@colnago ~ $ arp -a | grep mundo
mundo.bikeshed.internal (172.16.1.52) at 00:e0:61:0d:79:b6 [ether] on wlan0

Let's shut it down

biscuitninja@mundo:~$ sudo shutdown -P now

And then try and wake it up again...

biscuitninja@colnago ~ $ sudo apt-get install -y wakeonlan
biscuitninja@colnago ~ $ wakeonlan 00:e0:61:0d:79:b6

No joy. Let's try again with port 7.

biscuitninja@colnago ~ $ wakeonlan -p 7 00:e0:61:0d:79:b6

Still no joy. Further reading suggests that the ethtool change doesn't persist across a reboot, so let's make it a bit more persistent.

biscuitninja@mundo:~$ sudo -i
root@mundo:~# echo '#!/bin/sh' > /etc/network/if-up.d/wol
root@mundo:~# echo 'ethtool -s eth0 wol g' >> /etc/network/if-up.d/wol
root@mundo:~# chmod a+x /etc/network/if-up.d/wol

Okay, let's reboot and enable Wake-On-LAN in the BIOS APM resume settings, boot, shutdown and then try again:

biscuitninja@colnago ~ $ wakeonlan 00:e0:61:0d:79:b6
biscuitninja@colnago ~ $ wakeonlan -p 7 00:e0:61:0d:79:b6

And again, no joy. I've a sneaking suspicion that this is happening because the network card remains visibly unpowered whilst the machine is in its shutdown state, as indicated by a lack of blinking lights. Okay, let's try suspending the server instead of shutting it down.

biscuitninja@mundo:~$ sudo apt-get install -y pm-utils
biscuitninja@mundo:~$ sudo pm-suspend

The server drops down to a suspended state - as noted by the power consumption reading 4.2 Watts, as opposed to the usual 2.4 Watts when it's properly shut down. There's still no blinking lights on the ethernet port, so things still don't look promising.

biscuitninja@colnago ~ $ wakeonlan 00:e0:61:0d:79:b6
biscuitninja@colnago ~ $ wakeonlan -p 7 00:e0:61:0d:79:b6

Nope. No cigar. I revisit the BIOS settings and enable all the 'wake' features (PCIE/USB/Keyboard/Mouse etc.) and then suspend the server again. This time, I still can't wake it up by sending a magic packet, but I can wake it up by pressing a key on the keyboard. So let's step back from the problem and examine the options.

In an ideal world, a cron job running on brox would wake up mundo and then, having checked that the ssh port has become available, send a ZFS snapshot to mundo. The current hardware doesn't appear to support Wake-On-LAN, so what are the alternatives?

I could use an additional network adapter that does support Wake-On-LAN. It's plausible, but it might fall down if the BIOS "Wake-On-PCIE" setting is ineffective. Furthermore, either PCIe 2.2 is required or a NIC with an independent power connection. I'm already thinking about upgrading the motherboard as soon as funds allow, so spending additional cash on a short-term solution seems silly.

A second option might be using a Raspberry Pi and a remote-controlled electric socket. I could leave the Pi running 24x7 and use it to power on the remote socket. All I need do is change mundo's powered-on state BIOS option. It's quite a viable option given that the Pi uses approximately 2.5 Watts and the remote socket just 1 Watt more. Given that when shut down and plugged in mundo is using 2.4 Watts, the difference is tiny. The only downsides are maintaining two machines instead of one (a tiny overhead, particularly given that the Pi has a single function) and, really, I'd like to keep the Pi spare as it's earmarked for a future project.

So, what else? Oh yeah, sifting through the BIOS options, I noticed on the Advanced Power Management sub-menu a "Wake on RTC" option. RTC? A quick search shows that to be the Real Time Clock. So we can wake up mundo at a given time, using the server's wall clock. Bingo. Although it would be nice to fire mundo up remotely using Wake-On-LAN - in the event brox has experienced a hardware failure and is no longer running DHCP/DNS etc. - that can wait until I find some more flexible hardware.

I configure the "Wake on RTC" option, shut mundo down and wait. Five minutes later, it starts up of its own accord. Sold. This approach does mean relying on the system wall clock, so brox must check first (with some tolerance) that mundo is awake before sending it any ZFS snapshots. If after a period of time mundo isn't available, then brox will need to send some sort of notification.
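That check on brox can be sketched as follows. The attempt count, delay and the use of bash's /dev/tcp are my assumptions; the notification itself is left to cron's email-on-failure behaviour.

```shell
# Give mundo a window of time to appear after its RTC wake-up before
# giving up. Probes the ssh port via bash's /dev/tcp, wrapped in
# "timeout" so an unresponsive host doesn't hang an attempt.
wait_for_host() {
    host=$1; port=$2; attempts=$3; delay=$4
    while [ "$attempts" -gt 0 ]; do
        if timeout 5 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep "$delay"
    done
    return 1
}

# In the backup cron job on brox, something like:
#   wait_for_host mundo 22 10 60 || { echo "mundo did not wake" >&2; exit 1; }
```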

I'll also ensure that when it's switched on, mundo synchronises its wall clock with the NTP service on brox.

Configure NTP

Check the current status of NTP.

biscuitninja@mundo:~$ ntpq -p
The program 'ntpq' is currently not installed. You can install it by typing:
apt-get install ntp

NTP isn't installed. First I remove the deprecated ntpdate...

biscuitninja@mundo:~$ sudo apt-get remove -y ntpdate
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following packages will be REMOVED
  ntpdate ubuntu-minimal
0 to upgrade, 0 to newly install, 2 to remove and 3 not to upgrade.
After this operation, 312 kB disk space will be freed.
(Reading database ... 56559 files and directories currently installed.)
Removing ubuntu-minimal (1.325) ...
Removing ntpdate (1:4.2.6.p5+dfsg-3ubuntu2.14.04.5) ...
Processing triggers for man-db (2.6.7.1-1ubuntu1) ...

Then install ntp:

biscuitninja@mundo:~$ sudo apt-get install -y ntp
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following extra packages will be installed:
  libopts25
Suggested packages:
  ntp-doc
The following NEW packages will be installed
  libopts25 ntp
0 to upgrade, 2 to newly install, 0 to remove and 3 not to upgrade.
Need to get 474 kB of archives.
After this operation, 1,677 kB of additional disk space will be used.
Get:1 http://gb.archive.ubuntu.com/ubuntu/ trusty/main libopts25 amd64 1:5.18-2ubuntu2 [55.3 kB]
Get:2 http://gb.archive.ubuntu.com/ubuntu/ trusty-updates/main ntp amd64 1:4.2.6.p5+dfsg-3ubuntu2.14.04.5 [419 kB]
Fetched 474 kB in 2s (174 kB/s)
Selecting previously unselected package libopts25:amd64.
(Reading database ... 56547 files and directories currently installed.)
Preparing to unpack .../libopts25_1%3a5.18-2ubuntu2_amd64.deb ...
Unpacking libopts25:amd64 (1:5.18-2ubuntu2) ...
Selecting previously unselected package ntp.
Preparing to unpack .../ntp_1%3a4.2.6.p5+dfsg-3ubuntu2.14.04.5_amd64.deb ...
Unpacking ntp (1:4.2.6.p5+dfsg-3ubuntu2.14.04.5) ...
Processing triggers for ureadahead (0.100.0-16) ...
ureadahead will be reprofiled on next reboot
Processing triggers for man-db (2.6.7.1-1ubuntu1) ...
Setting up libopts25:amd64 (1:5.18-2ubuntu2) ...
Setting up ntp (1:4.2.6.p5+dfsg-3ubuntu2.14.04.5) ...
Starting NTP server ntpd
   ...done.
Processing triggers for libc-bin (2.19-0ubuntu6.6) ...
Processing triggers for ureadahead (0.100.0-16) ...

biscuitninja@mundo:~$ sudo vi /etc/ntp.conf

Find the line

server ntp.ubuntu.com

Specify the following servers, deleting any that already exist:

server brox.bikeshed.internal prefer iburst
server 0.uk.pool.ntp.org
server 1.uk.pool.ntp.org
server 2.uk.pool.ntp.org
server 3.uk.pool.ntp.org
server ntp.ubuntu.com
server 127.127.1.0
fudge 127.127.1.0 stratum 16

This configuration specifies brox as mundo's primary time source via the "prefer" option. The "iburst" option confused me somewhat. The "burst" and "iburst" settings, according to the relevant documentation:

burst: When the server is reachable, send a burst of eight packets instead of the usual one. The packet spacing is normally 2 s; however, the spacing between the first and second packets can be changed with the calldelay command to allow additional time for a modem or ISDN call to complete. This is designed to improve timekeeping quality with the server command and s addresses.

iburst: When the server is unreachable, send a burst of eight packets instead of the usual one. The packet spacing is normally 2 s; however, the spacing between the first two packets can be changed with the calldelay command to allow additional time for a modem or ISDN call to complete. This is designed to speed the initial synchronization acquisition with the server command and s addresses and when ntpd(8) is started with the -q option

I figured that as brox should be "reachable", using the burst option would decrease the time it takes for ntpd to acquire and groom data before selecting a time source. However, in my testing, I found that using "iburst" meant brox was selected as a time source after 30 or so seconds, whereas with "burst" it took several minutes.

As a side note, using burst and iburst with public NTP servers is considered bad form.

The other servers are added primarily for redundancy. The last two lines mean that in the event no network is available, the local wall clock is used.

If the hardware clock on mundo drifts by over 1000 seconds (the ntp daemon's default panic threshold) whilst mundo is powered down, the ntp daemon will log a warning to the syslog and then exit. We reconfigure it with the "-g" option, which allows the initial time correction to be of any size.

biscuitninja@mundo:~$ sudo vi /etc/default/ntp

Add the "-g" option, if it doesn't already exist:

NTPD_OPTS='-g'

Restart the NTP service:

biscuitninja@mundo:~$ sudo service ntp restart
[sudo] password for biscuitninja: 
 * Stopping NTP server ntpd
   ...done.
 * Starting NTP server ntpd
   ...done.

Test:

biscuitninja@mundo:~$ ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*brox.bikeshed.i 94.125.132.7     3 u    3   64  377    0.246   -3.479   1.015
+kvm1.websters-c 193.190.230.65   2 u   38   64  377   22.726   -5.157   3.049
+5.77.45.219     81.63.144.23     3 u   44   64  377   24.682   -4.723   1.043
+lon.jonesey.net 81.174.136.35    2 u   30   64  377   23.332   -6.065   1.694
-159-253-77-127. 94.125.132.7     3 u   30   64  377   34.777   -3.020   1.661
+juniperberry.ca 140.203.204.77   2 u   32   64  377   24.964   -4.832   0.755
 LOCAL(0)        .LOCL.          16 l 1147   64    0    0.000    0.000   0.000

The asterisk denotes the remote time server mundo is using as its time source. It might take a short while for ntpd to settle on a source. That concludes all of the prerequisite work prior to installing and configuring ZFS.

Install ZFS

Install the ZFS PPA (Personal Package Archive). More information is available from the ZFS On Linux project.

biscuitninja@mundo:~$ sudo apt-get install software-properties-common
[sudo] password for biscuitninja: 
Reading package lists... Done
Building dependency tree       
Reading state information... Done
software-properties-common is already the newest version.
0 to upgrade, 0 to newly install, 0 to remove and 3 not to upgrade.

Okay, it was already installed. The above step is only necessary for an Ubuntu minimal installation. Let's add the ZFS On Linux PPA.

biscuitninja@mundo:~$ sudo add-apt-repository ppa:zfs-native/stable
 The native ZFS filesystem for Linux. Install the ubuntu-zfs package.

Please join this Launchpad user group if you want to show support for ZoL:

  https://launchpad.net/~zfs-native-users

Send feedback or requests for help to this email list:

  http://list.zfsonlinux.org/mailman/listinfo/zfs-discuss
  <email address hidden>

Report bugs at:

  https://github.com/zfsonlinux/zfs/issues  (for the driver itself)
  https://github.com/zfsonlinux/pkg-zfs/issues (for the packaging)

The ZoL project home page is:

  http://zfsonlinux.org/
More info: https://launchpad.net/~zfs-native/+archive/ubuntu/stable
Press [ENTER] to continue or ctrl-c to cancel adding it

gpg: keyring `/tmp/tmpoz3eqc57/secring.gpg' created
gpg: keyring `/tmp/tmpoz3eqc57/pubring.gpg' created
gpg: requesting key F6B0FC61 from hkp server keyserver.ubuntu.com
gpg: /tmp/tmpoz3eqc57/trustdb.gpg: trustdb created
gpg: key F6B0FC61: public key "Launchpad PPA for Native ZFS for Linux" imported
gpg: Total number processed: 1
gpg:               imported: 1  (RSA: 1)
OK

With the new PPA in tow, update the package lists and install ZFS.

biscuitninja@mundo:~$ sudo apt-get update && sudo apt-get install -y libc6-dev ubuntu-zfs

As I've stopped apt from installing recommended software, I have to explicitly include libc6-dev. The install may take a while whilst the kernel modules are compiled. Watch the output carefully for unexpected failures.

The tail end of the output should look something like:

Loading new spl-0.6.5.3 DKMS files...
First Installation: checking all kernels...
Building only for 3.19.0-25-generic
Building initial module for 3.19.0-25-generic
Done.

spl:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/3.19.0-25-generic/updates/dkms/

splat.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/3.19.0-25-generic/updates/dkms/

Running the post_install script:

depmod.......

DKMS: install completed.
Processing triggers for libc-bin (2.19-0ubuntu6.6) ...
Selecting previously unselected package zfs-dkms.
(Reading database ... 58704 files and directories currently installed.)
Preparing to unpack .../zfs-dkms_0.6.5.3-1~trusty_amd64.deb ...
Unpacking zfs-dkms (0.6.5.3-1~trusty) ...
Selecting previously unselected package spl.
Preparing to unpack .../spl_0.6.5.3-1~trusty_amd64.deb ...
Unpacking spl (0.6.5.3-1~trusty) ...
Selecting previously unselected package libuutil1.
Preparing to unpack .../libuutil1_0.6.5.3-1~trusty_amd64.deb ...
Unpacking libuutil1 (0.6.5.3-1~trusty) ...
Selecting previously unselected package libnvpair1.
Preparing to unpack .../libnvpair1_0.6.5.3-1~trusty_amd64.deb ...
Unpacking libnvpair1 (0.6.5.3-1~trusty) ...
Selecting previously unselected package libzpool2.
Preparing to unpack .../libzpool2_0.6.5.3-1~trusty_amd64.deb ...
Unpacking libzpool2 (0.6.5.3-1~trusty) ...
Selecting previously unselected package libzfs2.
Preparing to unpack .../libzfs2_0.6.5.3-1~trusty_amd64.deb ...
Unpacking libzfs2 (0.6.5.3-1~trusty) ...
Selecting previously unselected package zfsutils.
Preparing to unpack .../zfsutils_0.6.5.3-1~trusty_amd64.deb ...
Unpacking zfsutils (0.6.5.3-1~trusty) ...
Selecting previously unselected package ubuntu-zfs.
Preparing to unpack .../ubuntu-zfs_8~trusty_amd64.deb ...
Unpacking ubuntu-zfs (8~trusty) ...
Processing triggers for man-db (2.6.7.1-1ubuntu1) ...
Processing triggers for initramfs-tools (0.103ubuntu4.2) ...
update-initramfs: Generating /boot/initrd.img-3.19.0-25-generic
Processing triggers for ureadahead (0.100.0-16) ...
Setting up zfs-doc (0.6.5.3-1~trusty) ...
Setting up zfs-dkms (0.6.5.3-1~trusty) ...
Loading new zfs-0.6.5.3 DKMS files...
First Installation: checking all kernels...
Building only for 3.19.0-25-generic
Building initial module for 3.19.0-25-generic
Done.

zavl:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/3.19.0-25-generic/updates/dkms/

zcommon.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/3.19.0-25-generic/updates/dkms/

znvpair.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/3.19.0-25-generic/updates/dkms/

zpios.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/3.19.0-25-generic/updates/dkms/

zunicode.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/3.19.0-25-generic/updates/dkms/

zfs.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/3.19.0-25-generic/updates/dkms/
depmod....

DKMS: install completed.
Setting up spl (0.6.5.3-1~trusty) ...
Setting up libuutil1 (0.6.5.3-1~trusty) ...
Setting up libnvpair1 (0.6.5.3-1~trusty) ...
Setting up libzpool2 (0.6.5.3-1~trusty) ...
Setting up libzfs2 (0.6.5.3-1~trusty) ...
Setting up zfsutils (0.6.5.3-1~trusty) ...
Processing triggers for initramfs-tools (0.103ubuntu4.2) ...
update-initramfs: Generating /boot/initrd.img-3.19.0-25-generic
Processing triggers for ureadahead (0.100.0-16) ...
Setting up ubuntu-zfs (8~trusty) ...
Processing triggers for libc-bin (2.19-0ubuntu6.6) ...
biscuitninja@mundo:~$

The zfs kernel module won't be loaded yet, so let's load it now:

biscuitninja@mundo:~$ sudo modprobe zfs
biscuitninja@mundo:~$ lsmod | grep zfs
zfs                  2785280  0 
zunicode              331776  1 zfs
zcommon                57344  1 zfs
znvpair                90112  2 zfs,zcommon
spl                    94208  3 zfs,zcommon,znvpair
zavl                   16384  1 zfs
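The zfs packages normally install init scripts that load the module at boot, so this shouldn't strictly be necessary; as a belt-and-braces fallback (an optional extra, not something the packages require), the module can also be listed in /etc/modules:

```shell
# /etc/modules - kernel modules to load at boot time, one per line.
# Adding zfs here forces the module to load even if the packaged
# init scripts fail to do so.
zfs
```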

I'm planning to run two zpools, each consisting of two mirrored drives (vdevs). Incidentally, I'm using 4 x 2TB disks: two HGST and two Seagate. Both sets of disks have previously been used as RAID pairs, so the wear on them will be fairly even. To reduce the likelihood of a zpool being wiped out by concurrent failure of its disks (vdevs), I want each zpool to contain one disk of each type.

Having established which disk is which (using lsblk and lshw), it's time to create the zpools.

biscuitninja@mundo:~$ sudo mkdir /zfs
biscuitninja@mundo:~$ sudo zpool create -f -O aclinherit=passthrough -O casesensitivity=mixed -O nbmand=on -m /zfs/biz biz mirror /dev/disk/by-id/ata-Hitachi_HDS723020BLA642_MN1220F32027XD /dev/disk/by-id/ata-ST2000DM001-9YN164_W1E21B0T
biscuitninja@mundo:~$ sudo zpool create -f -O aclinherit=passthrough -O casesensitivity=mixed -O nbmand=on -m /zfs/bikeshed bikeshed mirror /dev/disk/by-id/ata-Hitachi_HDS723020BLA642_MN1220F320M63D /dev/disk/by-id/ata-ST2000DM001-9YN164_W1E219TZ
biscuitninja@mundo:~$ sudo zpool status
  pool: bikeshed
 state: ONLINE
  scan: none requested
config:

    NAME                                            STATE     READ WRITE CKSUM
    bikeshed                                        ONLINE       0     0     0
      mirror-0                                      ONLINE       0     0     0
        ata-Hitachi_HDS723020BLA642_MN1220F320M63D  ONLINE       0     0     0
        ata-ST2000DM001-9YN164_W1E219TZ             ONLINE       0     0     0

errors: No known data errors

  pool: biz
 state: ONLINE
  scan: none requested
config:

    NAME                                            STATE     READ WRITE CKSUM
    biz                                             ONLINE       0     0     0
      mirror-0                                      ONLINE       0     0     0
        ata-Hitachi_HDS723020BLA642_MN1220F32027XD  ONLINE       0     0     0
        ata-ST2000DM001-9YN164_W1E21B0T             ONLINE       0     0     0

errors: No known data errors
biscuitninja@mundo:~$ 

Thus far we've installed ZFS and we've created two separate storage pools. At this point there are two considerations worthy of note.

In many online tutorials you will see people creating zpools or adding/attaching vdevs (disks) to pools by their device names (e.g. /dev/sdb). This is a bad idea: as hardware is added to or removed from a system, these device names can change. It's better practice to refer to disks by their persistent disk ids under /dev/disk/by-id.

Secondly, I've used disks with differing physical sector sizes:

biscuitninja@mundo:~$ sudo parted /dev/sdb unit s print
Model: ATA Hitachi HDS72302 (scsi)
Disk /dev/sdb: 3907029168s
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start        End          Size         File system  Name  Flags
 1      2048s        3907012607s  3907010560s               zfs
 9      3907012608s  3907028991s  16384s

biscuitninja@mundo:~$ sudo parted /dev/sdd unit s print
Model: ATA ST2000DM001-9YN1 (scsi)
Disk /dev/sdd: 3907029168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt

Number  Start        End          Size         File system  Name  Flags
 1      2048s        3907012607s  3907010560s               zfs
 9      3907012608s  3907028991s  16384s

This actually turns out to be okay: the partition alignment on both disks starts at sector 2048, which works fine with both normal and advanced format disks. In short, partition misalignment on newer advanced format disks with 4KiB physical sectors can have significant performance implications. ZFS handles this well, and will even align partitions appropriately when mixing different types of disk in the same storage pool. You can read more about the 4KiB sector size issue here.
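As a quick back-of-the-envelope check (plain shell arithmetic, no ZFS involved), you can confirm that a start sector of 2048 on a 512-byte-logical-sector disk is aligned for 4KiB physical sectors:

```shell
# A partition starting at logical sector 2048 begins at byte
# 2048 * 512 = 1048576, which divides evenly by 4096, so reads and
# writes line up with 4KiB (advanced format) physical sectors.
start_sector=2048
logical_sector_bytes=512
offset_bytes=$((start_sector * logical_sector_bytes))
if [ $((offset_bytes % 4096)) -eq 0 ]; then
  echo "start sector ${start_sector} is 4KiB-aligned"
else
  echo "start sector ${start_sector} is NOT 4KiB-aligned"
fi
```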

With our storage pools created, it's time to remote onto the home server (brox) to create and transmit some snapshots.

Create ZFS Backup User and SSH Keys

Before I can securely transmit any ZFS snapshots from brox to mundo, I need to create a user and delegate them ZFS permissions.

biscuitninja@mundo:~$ sudo adduser zfsbackup
Adding user `zfsbackup' ...
Adding new group `zfsbackup' (1001) ...
Adding new user `zfsbackup' (1001) with group `zfsbackup' ...
Creating home directory `/home/zfsbackup' ...
Copying files from `/etc/skel' ...
Enter new UNIX password: 
Retype new UNIX password: 
passwd: password updated successfully
Changing the user information for zfsbackup
Enter the new value, or press ENTER for the default
    Full Name []: 
    Room Number []: 
    Work Phone []: 
    Home Phone []: 
    Other []: 
Is the information correct? [Y/n] y

In the next step, we give our new user the permissions necessary to receive a snapshot. Reading the documentation, the following should work...

biscuitninja@mundo:~$ sudo zfs allow zfsbackup receive,mount,create biz
biscuitninja@mundo:~$ sudo zfs allow zfsbackup receive,mount,create bikeshed

... however I found this insufficient and resorted instead to visudo

biscuitninja@mundo:~$ sudo visudo

Cmnd_Alias ZFS_CMDS =   /sbin/zfs receive -Fduv biz, /sbin/zfs receive -Fduv bikeshed, \
                        /usr/local/bin/zfs_destroy_biz_snapshots.sh, \
                        /usr/local/bin/zfs_destroy_bikeshed_snapshots.sh, \
                        /sbin/zpool scrub biz, \
                        /sbin/zpool scrub bikeshed, \
                        /sbin/zpool status, \
                        /sbin/shutdown -P now
zfsbackup       ALL=(ALL) NOPASSWD: ZFS_CMDS

I've added here commands for destroying old snapshots (shell scripts that will remove snapshots over 60 days old), scrubbing the zpools and of course remotely shutting down the backup server.

Finally, as I'm going to copy the keys across manually, I'll create an authorized_keys file for zfsbackup on mundo and set permissions appropriately.

biscuitninja@mundo:~$ sudo su - zfsbackup
zfsbackup@mundo:/home/zfsbackup$

zfsbackup@mundo:~$ mkdir .ssh
zfsbackup@mundo:~$ chmod 700 .ssh
zfsbackup@mundo:~$ touch .ssh/authorized_keys
zfsbackup@mundo:~$ chmod 600 .ssh/authorized_keys

Open up the new authorized_keys file for editing.
zfsbackup@mundo:~$ vi .ssh/authorized_keys

Then (using another terminal window) on brox we create new public/private key pairs that root can use to SSH onto mundo. There are eight keys in all: receive, scrub and destroy keys for each pool, a key to check scrub status, and a single key to shut mundo down. As passphrases will be omitted, we will swing by later and turn these into single-purpose keys, hence having so many of them.

biscuitninja@brox:~$ sudo ssh-keygen -t rsa -b 3072 -f /root/.ssh/id_rsa_zfsbackup_biz_mundo.bikeshed.internal -C "zfs backup biz"
biscuitninja@brox:~$ sudo ssh-keygen -t rsa -b 3072 -f /root/.ssh/id_rsa_zfsbackup_bikeshed_mundo.bikeshed.internal -C "zfs backup bikeshed"

biscuitninja@brox:~$ sudo ssh-keygen -t rsa -b 3072 -f /root/.ssh/id_rsa_zfsdestroy_biz_mundo.bikeshed.internal -C "zfs destroy biz snapshot"
biscuitninja@brox:~$ sudo ssh-keygen -t rsa -b 3072 -f /root/.ssh/id_rsa_zfsdestroy_bikeshed_mundo.bikeshed.internal -C "zfs destroy bikeshed snapshot"

biscuitninja@brox:~$ sudo ssh-keygen -t rsa -b 3072 -f /root/.ssh/id_rsa_zfsscrub_biz_mundo.bikeshed.internal -C "zfs scrub biz"
biscuitninja@brox:~$ sudo ssh-keygen -t rsa -b 3072 -f /root/.ssh/id_rsa_zfsscrub_bikeshed_mundo.bikeshed.internal -C "zfs scrub bikeshed"

biscuitninja@brox:~$ sudo ssh-keygen -t rsa -b 3072 -f /root/.ssh/id_rsa_zfsscrubcheck_mundo.bikeshed.internal -C "zfs scrub check"

biscuitninja@brox:~$ sudo ssh-keygen -t rsa -b 3072 -f /root/.ssh/id_rsa_shutdown_mundo.bikeshed.internal -C "shutdown mundo"

On brox, grab the first key:

biscuitninja@brox:~$ sudo cat /root/.ssh/id_rsa_zfsbackup_biz_mundo.bikeshed.internal.pub

Copy the key, switch back to the terminal session which is sshed into mundo, paste it into the authorized_keys file and save the changes (but don't close the file yet). Then, back on brox, check we can connect with the key.

biscuitninja@brox:~$ sudo ssh -i /root/.ssh/id_rsa_zfsbackup_biz_mundo.bikeshed.internal zfsbackup@mundo.bikeshed.internal
The authenticity of host 'mundo.bikeshed.internal (172.16.1.52)' can't be established.
RSA key fingerprint is 6e:f3:53:26:e3:73:95:fe:6d:98:2f:29:78:0f:06:64.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'mundo.bikeshed.internal,172.16.1.52' (RSA) to the list of known hosts.
********************************************************************
* Unauthorized use of this computer system constitutes a criminal  *
* offense.                                                         *
*                                                                  *
* Anyone accessing this system expressly consents to the           *
* monitoring of their activity.                                    *
*                                                                  *
* Any suspicious or criminal activity will be reported to law      *
* enforcement and/or relevant service providers, rendering the     *
* perpetrators liable to criminal investigations and other         *
* appropriate sanctions.                                           *
********************************************************************
Welcome to Ubuntu 14.04.3 LTS (GNU/Linux 3.19.0-25-generic x86_64)

 * Documentation:  https://help.ubuntu.com/

  System information as of Sun Nov  8 00:26:48 GMT 2015

  System load:  0.25               Processes:           177
  Usage of /:   0.3% of 458.32GB   Users logged in:     0
  Memory usage: 2%                 IP address for eth0: 172.16.1.52
  Swap usage:   0%

  Graph this data and manage this system at:
    https://landscape.canonical.com/

The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.

zfsbackup@mundo:~$
zfsbackup@mundo:~$ exit 

Success. Okay, before we continue, we should restrict the usage of the keys we've created.

biscuitninja@brox:~$ sudo -i
[sudo] password for biscuitninja: 
root@brox:~# cd /root/.ssh
root@brox:~/.ssh# vi id_rsa_zfsbackup_biz_mundo.bikeshed.internal.pub

Insert "command="unpigz | sudo /sbin/zfs receive -Fduv biz",no-agent-forwarding,no-port-forwarding,no-x11-forwarding,no-user-rc " as the first part of the public key. Close the file and save the changes.

The resultant public key should look like:

command="unpigz | sudo /sbin/zfs receive -Fduv biz",no-agent-forwarding,no-port-forwarding,no-x11-forwarding,no-user-rc ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCyBIoc+HkUQ9t/mTiIj8NUc7iQyEfXU7YklRHHyzg7Vf2FEvOtD8220DTJVdICDoqhT0y0Ag+eN4mNIasIXPc7DMIKtlUbbY22k9i03VejYnqS9z46yQOpUJNQxxnq/Y3F2CIMWD58/PMuFOcy+mSPjoB1uYn765TJV7V9KnNon83K15PoFvIV9iyazD35GYYm3dKn7heKhlw7YVR6jhTuO/7lHmnIG7K5Kp85Ob/wyMdtKgvhQ/TTctmshWWn8r2SUd1XkuUojE+QdcXuF7klqzCz5kzXUkERuuvsKVlyKnpK0c0ZmFVQFfca9NcLmwma4AFgPunAE5TPbSetmyBYF4vyXNU4JgXQUxJE7ebZfUbI/UeqPMi9cBv8PJAiJEHzuiXWekSlPfdb121tMFG8zuIQSubqJ5DOy6gtUoQpasNxvG/uV7YBff5Y0q0jT5cNdwd+VCreEK7TVZikTGiraUyfKH9gSMNvfoFu1LmGzRgkbGOIM+vlS47iGiHCxOc= zfs backup biz

I repeated this for the second key, id_rsa_zfsbackup_bikeshed_mundo.bikeshed.internal.pub, substituting bikeshed for biz in the command. Next up are the keys for destroying snapshots:

root@brox:~/.ssh# vi id_rsa_zfsdestroy_biz_mundo.bikeshed.internal.pub

Insert "command="sudo /usr/local/bin/zfs_destroy_biz_snapshots.sh",no-agent-forwarding,no-port-forwarding,no-x11-forwarding,no-user-rc " as the first part of the public key. Close the file and save the changes. Repeat for id_rsa_zfsdestroy_bikeshed_mundo.bikeshed.internal.pub

Next up are the keys for scrubbing the zpools. Insert "command="sudo /sbin/zpool scrub biz",no-agent-forwarding,no-port-forwarding,no-x11-forwarding,no-user-rc " as the first part of id_rsa_zfsscrub_biz_mundo.bikeshed.internal.pub. And again for id_rsa_zfsscrub_bikeshed_mundo.bikeshed.internal.pub, making the necessary substitution of "bikeshed" for "biz" in the command.

Let's not forget the key used to check the status of a zfs scrub on the backup server; we shouldn't shut down the backup server until all invoked scrubs are complete. Insert "command="sudo /sbin/zpool status",no-agent-forwarding,no-port-forwarding,no-x11-forwarding,no-user-rc " as the first part of id_rsa_zfsscrubcheck_mundo.bikeshed.internal.pub.
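The check itself is just a grep over the zpool status output. Here's the pattern exercised against a captured sample line (the sample_status variable stands in for the real output of the remote status command; the exact wording can vary between ZFS versions, so it's worth confirming against your own zpool status output):

```shell
# Detect a running scrub by grepping `zpool status` output for the
# "scrub in progress" marker, as the backup scripts will do over ssh.
sample_status='  scan: scrub in progress since Sun Nov  8 07:00:12 2015'
if echo "$sample_status" | grep -q "scrub in progress"; then
  echo "scrub still running"
else
  echo "scrub not running"
fi
```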

Almost finally we edit the key for shutting mundo down once backup activities are complete. Insert "command="sudo /sbin/shutdown -P now",no-agent-forwarding,no-port-forwarding,no-x11-forwarding,no-user-rc " as the first part of public key id_rsa_shutdown_mundo.bikeshed.internal.pub.

Let's copy all of the public keys we've created and tweaked so far from brox, and paste them in for mundo's zfsbackup user:

root@brox:~/.ssh# cat *mundo*.pub

Copy the output, switch to the terminal with the ssh session for zfsbackup@mundo.bikeshed.internal, and paste it into the authorized_keys file. Save your changes and close the file.

Create Our First Snapshot on the Source Server

Okay, now let's snapshot one of the existing storage pools on brox.

biscuitninja@brox:~$ sudo zfs snapshot -r biz@$(date -Iseconds | cut -c1-19)

biscuitninja@brox:~$ sudo zfs list -t snapshot 
NAME                          USED  AVAIL  REFER  MOUNTPOINT
biz@2015-11-08T02:52:10          0      -   152K  -
biz/dcp@2015-11-08T02:52:10      0      -   954G  -
biz/it@2015-11-08T02:52:10       0      -   429G  -
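The snapshot names come from `date -Iseconds | cut -c1-19`, which trims the timezone offset from the ISO 8601 timestamp, leaving a fixed-width 19-character name that sorts lexically in date order:

```shell
# Build the snapshot timestamp: ISO 8601 to seconds precision with the
# timezone offset trimmed, e.g. 2015-11-08T02:52:10
stamp=$(date -Iseconds | cut -c1-19)
echo "$stamp"
```

That lexical ordering is what lets the later scripts pick the newest snapshot with a plain sort | tail -1.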

Before I send the snapshot from brox over to mundo, I want to try limiting the upload bandwidth using trickle. As you can see, there's a significant amount of data to move and I don't want to leave brox's outbound network connection saturated. I'm also planning to compress the snapshot before sending it to mundo and then decompress it again before applying it, so I'm also installing pigz, a parallel implementation of gzip for modern multi-processor, multi-core machines.

biscuitninja@brox:~$ sudo apt-get install -y trickle pigz
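As a rough local sanity check of the compression step (entirely illustrative; it uses a zero-filled temp file rather than a real send stream, and falls back to plain gzip if pigz isn't installed):

```shell
# Compare original vs compressed size at level 4, the level used in the
# send pipeline below. /dev/zero is unrealistically compressible, but the
# mechanics are the same for a piped zfs send stream.
GZ=$(command -v pigz || command -v gzip)
head -c 1048576 /dev/zero > /tmp/sample.bin
"$GZ" -4 -c /tmp/sample.bin > /tmp/sample.bin.gz
orig=$(stat -c%s /tmp/sample.bin)
comp=$(stat -c%s /tmp/sample.bin.gz)
echo "original=${orig} bytes, compressed=${comp} bytes"
```

Level 4 trades a little compression ratio for speed, which matters when the compressor has to keep up with a multi-hundred-gigabyte stream.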

Now let's send the snapshot. The storage pools are quite chunky, so expect this to take some time.

root@brox:~# zfs send -R biz@2015-11-08T02:52:10 | pigz -4 | trickle -u 10240 ssh -i /root/.ssh/id_rsa_zfsbackup_biz_mundo.bikeshed.internal zfsbackup@mundo.bikeshed.internal "unpigz | sudo /sbin/zfs receive -Fduv biz"

That was successful, so I repeated the process for the second storage pool:

biscuitninja@brox:~$ sudo zfs snapshot -r bikeshed@$(date -Iseconds | cut -c1-19)

biscuitninja@brox:~$ sudo zfs list -t snapshot 
NAME                                            USED  AVAIL  REFER  MOUNTPOINT
biz@2015-11-08T02:52:10                            0      -   152K  -
biz/dcp@2015-11-08T02:52:10                        0      -   954G  -
biz/it@2015-11-08T02:52:10                         0      -   429G  -
bikeshed@2015-11-09T14:24:31                       0      -   152K  -
bikeshed/biscuitNinja@2015-11-09T14:24:31          0      -   454G  -
bikeshed/missBiscuitNinja@2015-11-09T14:24:31      0      -    68G  -

root@brox:~# zfs send -R bikeshed@2015-11-09T14:24:31 | pigz -4 | trickle -u 10240 ssh -i /root/.ssh/id_rsa_zfsbackup_bikeshed_mundo.bikeshed.internal zfsbackup@mundo.bikeshed.internal "unpigz | sudo /sbin/zfs receive -Fduv bikeshed"

Great stuff. Having taken some time to check that the filesystems on mundo are all present and correct, we now need to automate the process! But before I do, you might have noticed that the upload restriction I'm passing to trickle is quite low. The backup server is relying on its onboard NIC, a Marvell 88E8056 PCI-E Gigabit Ethernet Controller, and it seems any load on the NIC results in the kernel spamming the syslog:

Nov 11 19:06:00 mundo kernel: [ 5024.045470] net_ratelimit: 35 callbacks suppressed
Nov 11 19:06:00 mundo kernel: [ 5024.045480] sky2 0000:05:00.0: error interrupt status=0x40000008
Nov 11 19:06:00 mundo kernel: [ 5024.045506] sky2 0000:05:00.0 eth0: rx error, status 0x7ffc0001 length 996

It seems to be a common complaint when running modern versions of Linux on these controllers, and the issue is thought to be caused by an actual hardware timing problem. I'll make certain that mundo version 2 gets an Intel NIC :/

Automating the ZFS Backups

The first thing to do in automating the backups from brox to mundo is to produce some shell scripts. We'll start with the scripts that destroy old snapshots on mundo.

biscuitninja@mundo:~$ sudo vi /usr/local/bin/zfs_destroy_biz_snapshots.sh

Paste in the following:

#!/bin/bash

storagePool="biz"
oldestSnapshotToKeep=$(date --date="61 days ago" +"%Y%m%d%H%M%S")
for snapshot in $(/sbin/zfs list -t snapshot -o name -s name | grep ^${storagePool}@) ; do
        snapshotDate=$(echo $snapshot | grep -P -o '2\d{3}-[0-3]\d-[0-3]\dT[012]\d:[0-5]\d:[0-5]\d')
        if [ $(date --date="$snapshotDate" +"%Y%m%d%H%M%S") -lt $oldestSnapshotToKeep ] ; then
                /sbin/zfs destroy -R ${snapshot} &>/dev/null || exit 110
        fi
done

Copy and amend the file:

biscuitninja@mundo:~$ sudo cp /usr/local/bin/zfs_destroy_biz_snapshots.sh /usr/local/bin/zfs_destroy_bikeshed_snapshots.sh
biscuitninja@mundo:~$ sudo vi /usr/local/bin/zfs_destroy_bikeshed_snapshots.sh

Amend 'storagePool="biz"' to 'storagePool="bikeshed"', then save the changes and close the file. Then make both scripts executable:

biscuitninja@mundo:~$ sudo chmod 700 /usr/local/bin/zfs_destroy_bi*_snapshots.sh
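The retention comparison at the heart of these scripts can be exercised in isolation with a made-up snapshot name (no zfs commands involved; the snapshot below is hypothetical):

```shell
# Extract the timestamp from a snapshot name and compare it against the
# 61-day cutoff, exactly as the destroy scripts do.
snapshot="biz@2015-09-01T03:00:00"   # hypothetical old snapshot
oldestSnapshotToKeep=$(date --date="61 days ago" +"%Y%m%d%H%M%S")
snapshotDate=$(echo "$snapshot" | grep -P -o '2\d{3}-[0-3]\d-[0-3]\dT[012]\d:[0-5]\d:[0-5]\d')
if [ "$(date --date="$snapshotDate" +"%Y%m%d%H%M%S")" -lt "$oldestSnapshotToKeep" ]; then
  echo "would destroy ${snapshot}"
else
  echo "would keep ${snapshot}"
fi
```

Converting both timestamps to a plain YYYYMMDDHHMMSS number makes the "older than" test a simple integer comparison.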

We're done scripting on mundo. Let's start on the scripts that take snapshots on brox and send them to mundo.

biscuitninja@brox:~$ sudo vi /usr/local/bin/zfs_biz_backup.sh

Paste in the following:

#!/bin/bash

# Exit codes:
# 110   remoteDestination is not online
# 120   failed to take snapshot
# 130   failed to send snapshot
# 140   failed to destroy snapshot

storagePool="biz"
remoteDestination=mundo.bikeshed.internal
remoteUser=zfsbackup
identityFile=/root/.ssh/id_rsa_zfsbackup_biz_mundo.bikeshed.internal

for ((i=1;i<=20;i++)) ; do
  nc -z $remoteDestination 22 &> /dev/null && remoteDestinationOnline=true
  [ $remoteDestinationOnline ] && break
  sleep 30
done

if ! [ $remoteDestinationOnline ] ; then
    exit 110
fi

lastSnapshot=$(/sbin/zfs list -t snapshot -o name -s name | grep ^${storagePool}@ | sort | tail -1)
/sbin/zfs snapshot -r ${storagePool}@$(date -Iseconds | cut -c1-19) &>/dev/null || exit 120
newSnapshot=$(/sbin/zfs list -t snapshot -o name -s name | grep ^${storagePool}@ | sort | tail -1)

/sbin/zfs send -R -i ${lastSnapshot} ${newSnapshot} | pigz -4 | trickle -u 10240 ssh -q -i ${identityFile} ${remoteUser}@${remoteDestination} "unpigz | sudo /sbin/zfs receive -Fduv ${storagePool}" &>/dev/null || exit 130

oldestSnapshotToKeep=$(date --date="31 days ago" +"%Y%m%d%H%M%S")
for snapshot in $(/sbin/zfs list -t snapshot -o name -s name | grep ^${storagePool}@) ; do
    snapshotDate=$(echo $snapshot | grep -P -o '2\d{3}-[0-3]\d-[0-3]\dT[012]\d:[0-5]\d:[0-5]\d')
    if [ $(date --date="$snapshotDate" +"%Y%m%d%H%M%S") -lt $oldestSnapshotToKeep ] ; then
        /sbin/zfs destroy -R ${snapshot} &>/dev/null || exit 140
    fi
done

Save and close the file. I created a second copy of the above script with the values changed for the second storage pool, and then made them both executable:

biscuitninja@brox:~$ sudo chmod 750 /usr/local/bin/zfs*backup.sh

Then I created a shell script to call both storage pool backup scripts. I want to ensure that the cron job backs up both storage pools in sequence, not concurrently. This wrapper also calls the remote scripts that remove snapshots over sixty days old from mundo, kicks off a monthly scrub of each of mundo's storage pools, and shuts mundo down once everything is complete.

biscuitninja@brox:~$ sudo vi /usr/local/bin/zfs_backups.sh

#!/bin/bash

function handleError {
    echo "$1 failed with exit status $2" 1>&2
    exit $3
}

/usr/local/bin/zfs_biz_backup.sh 
exitStatus=$?
if [ $exitStatus -ne 0 ] ; then
  handleError "zfs_biz_backup" $exitStatus 110
fi

/usr/local/bin/zfs_bikeshed_backup.sh
exitStatus=$?
if [ $exitStatus -ne 0 ] ; then
  handleError "zfs_bikeshed_backup" $exitStatus 120
fi

ssh -q -i /root/.ssh/id_rsa_zfsdestroy_biz_mundo.bikeshed.internal zfsbackup@mundo.bikeshed.internal "sudo /usr/local/bin/zfs_destroy_biz_snapshots.sh"
exitStatus=$?
if [ $exitStatus -ne 0 ] ; then
  handleError "ssh -q -i /root/.ssh/id_rsa_zfsdestroy_biz_mundo.bikeshed.internal zfsbackup@mundo.bikeshed.internal sudo /usr/local/bin/zfs_destroy_biz_snapshots.sh" $exitStatus 130
fi

ssh -q -i /root/.ssh/id_rsa_zfsdestroy_bikeshed_mundo.bikeshed.internal zfsbackup@mundo.bikeshed.internal "sudo /usr/local/bin/zfs_destroy_bikeshed_snapshots.sh"
exitStatus=$?
if [ $exitStatus -ne 0 ] ; then
  handleError "ssh -q -i /root/.ssh/id_rsa_zfsdestroy_bikeshed_mundo.bikeshed.internal zfsbackup@mundo.bikeshed.internal sudo /usr/local/bin/zfs_destroy_bikeshed_snapshots.sh" $exitStatus 140
fi

if [ $(date +%d) -eq 5 ] ; then
  ssh -q -i /root/.ssh/id_rsa_zfsscrub_biz_mundo.bikeshed.internal zfsbackup@mundo.bikeshed.internal "sudo /sbin/zpool scrub biz"
  exitStatus=$?
  if [ $exitStatus -ne 0 ] ; then
    handleError "ssh -q -i /root/.ssh/id_rsa_zfsscrub_biz_mundo.bikeshed.internal zfsbackup@mundo.bikeshed.internal sudo /sbin/zpool scrub biz" $exitStatus 150
  fi

  # Check the biz scrub has started; if not, bail out
  sleep 30s
  ssh -q -i /root/.ssh/id_rsa_zfsscrubcheck_mundo.bikeshed.internal zfsbackup@mundo.bikeshed.internal "sudo /sbin/zpool status" | grep -q "scrub in progress"
  exitStatus=$?
  if [ $exitStatus -ne 0 ] ; then
    handleError "ssh -q -i /root/.ssh/id_rsa_zfsscrubcheck_mundo.bikeshed.internal zfsbackup@mundo.bikeshed.internal sudo /sbin/zpool status" $exitStatus 153
  fi

  # Wait for the biz scrub to complete... (it's a background process)
  sleep 10m
  while ssh -q -i /root/.ssh/id_rsa_zfsscrubcheck_mundo.bikeshed.internal zfsbackup@mundo.bikeshed.internal "sudo /sbin/zpool status" | grep -q "scrub in progress"
  do
    sleep 10m
  done

  ssh -q -i /root/.ssh/id_rsa_zfsscrub_bikeshed_mundo.bikeshed.internal zfsbackup@mundo.bikeshed.internal "sudo /sbin/zpool scrub bikeshed"
  exitStatus=$?
  if [ $exitStatus -ne 0 ] ; then
    handleError "ssh -q -i /root/.ssh/id_rsa_zfsscrub_bikeshed_mundo.bikeshed.internal zfsbackup@mundo.bikeshed.internal sudo /sbin/zpool scrub bikeshed" $exitStatus 160
  fi

  # Check the bikeshed scrub has started; if not, bail out
  sleep 30s
  ssh -q -i /root/.ssh/id_rsa_zfsscrubcheck_mundo.bikeshed.internal zfsbackup@mundo.bikeshed.internal "sudo /sbin/zpool status" | grep -q "scrub in progress"
  exitStatus=$?
  if [ $exitStatus -ne 0 ] ; then
    handleError "ssh -q -i /root/.ssh/id_rsa_zfsscrubcheck_mundo.bikeshed.internal zfsbackup@mundo.bikeshed.internal sudo /sbin/zpool status" $exitStatus 163
  fi

  # Wait for the bikeshed scrub to complete... (it's a background process)
  sleep 10m
  while ssh -q -i /root/.ssh/id_rsa_zfsscrubcheck_mundo.bikeshed.internal zfsbackup@mundo.bikeshed.internal "sudo /sbin/zpool status" | grep -q "scrub in progress"
  do
    sleep 10m
  done
fi

ssh -q -i /root/.ssh/id_rsa_shutdown_mundo.bikeshed.internal zfsbackup@mundo.bikeshed.internal "sudo /sbin/shutdown -P now"
exitStatus=$?
if [ $exitStatus -ne 0 ] ; then
  handleError "ssh -q -i /root/.ssh/id_rsa_shutdown_mundo.bikeshed.internal zfsbackup@mundo.bikeshed.internal sudo /sbin/shutdown -P now" $exitStatus 170
fi

exit 0

Let's make our new script executable:

biscuitninja@brox:~$ sudo chmod 750 /usr/local/bin/zfs_backups.sh

And give it a whirl:

biscuitninja@brox:~$ sudo /usr/local/bin/zfs_backups.sh

Fortunately (after some time) this returns an exit status of 0, indicating success, and running "sudo zfs list -t snapshot" on both brox and mundo concurs. Let's edit the crontab and schedule the backup script.

biscuitninja@brox:~$ sudo vi /etc/crontab

# /etc/crontab: system-wide crontab
# Unlike any other crontab you don't have to run the `crontab'
# command to install the new version when you edit this file
# and files in /etc/cron.d. These files also have username fields,
# that none of the other crontabs do.

SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin

# m h dom mon dow user  command
17 *    * * *   root    cd / && run-parts --report /etc/cron.hourly
05 0    * * *   root    /usr/local/bin/zfs_backups.sh
25 4    * * *   root    test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily )
47 5    * * 7   root    test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.weekly )   
52 6    1 * *   root    test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.monthly )
00 7    * * 0   root    /sbin/zpool scrub biz
00 7    * * 1   root    /sbin/zpool scrub bikeshed
00 7    * * 2   root    /sbin/zpool scrub media

And that's it. It's key to stay vigilant and periodically check that everything is working, especially in the early days. brox is configured to send an email if any of the crontab scripts fail. I will be working on a server monitoring project in the near future; an aspect of that will be checking the status of the filesystems as well as the general health of each server. I run a firewall, a few VPS servers and some other appliances too, so I hope to produce something inspired by Nagios but lighter weight. I'm not overly keen to reinvent the wheel, but the learning curve should help me pick up some new skills.

I hope you've enjoyed this rather lengthy journey and taken something useful from it. I've no doubt my implementation could be improved; your feedback and comments are more than welcome.