Arch Linux VMs, part 2 (3) | Lecture | NSWI106

Information below is not for the current semester. The current semester can be found here.

Introduction
Before class reading
LUKS-encrypted root partition
systemd
Left-overs

Introduction

Last week, we got our first Arch Linux VM running. This week, it’s time to set up the VM to do something useful. For that, we need to understand how systemd and package management work. We also need to get our VM online.

But first, encryption!

Before class reading

There was no before class reading for this lecture.

LUKS-encrypted root partition

One of the things we didn’t get to during the last lecture was device encryption, and how you can install your system to use an encrypted root partition. Since encryption-at-rest is a must for portable devices, we decided to open the lecture with that.

The procedure to install Arch Linux with encrypted root is almost identical to the basic UEFI+GPT installation, with only a handful of extra steps. The basic UEFI+GPT setup from last week called for a Btrfs root file system (mounted at /) on /dev/sda2:

┌─────────────────┬─────────────────────────────────────┐
│ FAT 32          │ Btrfs                               │
│ /boot           │ / (root file system)                │
├─────────────────┼─────────────────────────────────────┤
│ /dev/sda1       │ /dev/sda2                           │
│ 300 MiB         │ ~9.7 GiB                            │
└─────────────────┴─────────────────────────────────────┘

Today, we’ll create a LUKS container on /dev/sda2. When the container is unlocked (for which you need to provide a password), a new device /dev/mapper/<name> will appear. This is a block device that transparently encrypts and decrypts your data. We can then create Btrfs on that device. The resulting setup looks as follows:

                  ┌─────────────────────────────────────┐
                  │ Btrfs                               │
                  │ / (root file system)                │
┌─────────────────┼─────────────────────────────────────┤
│ FAT 32          │ LUKS container                      │
│ /boot           │ Not mounted                         │
├─────────────────┼─────────────────────────────────────┤
│ /dev/sda1       │ /dev/sda2                           │
│ 300 MiB         │ ~9.7 GiB                            │
└─────────────────┴─────────────────────────────────────┘

Then:

All data written to /dev/sda2 is encrypted. In practice, the data cannot be recovered without the password. It also cannot be recovered if the LUKS header is lost or damaged, so it is a very good idea to back up your LUKS header.
When you read a block from /dev/mapper/<name>, a block is read from /dev/sda2 and decrypted at that instant.
When you write a block to /dev/mapper/<name>, the block is encrypted at that instant and written to /dev/sda2.

For you as the user of the Btrfs file system (and for the file system itself) all of this is completely transparent.
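The header backup recommended above can be sketched as follows. This is a minimal sketch, not part of the lecture: backup_luks_header is a hypothetical helper name and the paths in the usage note are illustrative. It relies only on the luksHeaderBackup action of cryptsetup(8).

```shell
#!/bin/sh
# Sketch: save the LUKS header of a device to a file.
# backup_luks_header is a hypothetical helper, not a cryptsetup command.
backup_luks_header() {
    dev=$1   # device holding the LUKS container, e.g. /dev/sda2
    out=$2   # file to write the header backup to

    # Never clobber an existing backup by accident.
    if [ -e "$out" ]; then
        echo "refusing to overwrite $out" >&2
        return 1
    fi

    cryptsetup luksHeaderBackup "$dev" --header-backup-file "$out"
}
```

For example, backup_luks_header /dev/sda2 /root/sda2-luks-header.img (run as root). Keep the resulting file somewhere other than the encrypted disk itself, otherwise it is lost together with the header it is meant to replace.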

Boot a QEMU VM with a blank 10 GiB drive attached. We booted into UEFI with OVMF.
Create a GPT on /dev/sda with two partitions as depicted above.
Create FAT 32 on /dev/sda1.
Use cryptsetup(8) to initialize a LUKS container on /dev/sda2. You will be prompted for a password. We used foobar, which is obviously not a secure password:

cryptsetup luksFormat /dev/sda2

Then, unlock the container. The name cryptroot is arbitrary, but follows a convention whereby encrypted devices are prefixed with crypt (hence cryptroot, since this container holds our encrypted root file system):

cryptsetup luksOpen /dev/sda2 cryptroot

Run ls -l /dev/mapper; you should now see your new block device, cryptroot.
lsblk should indicate that the container is unlocked and resides on /dev/sda2.
Create Btrfs on /dev/mapper/cryptroot.
Mount /dev/mapper/cryptroot at /mnt, create /mnt/boot and mount the ESP at /mnt/boot.
The actual installation is exactly the same: pacstrap, generate fstab, arch-chroot to /mnt…
Install GRUB onto the ESP.
Edit /etc/default/grub and append the following parameters to the kernel command line:

cryptdevice=UUID=[UUID of the LUKS container]:cryptroot root=/dev/mapper/cryptroot

To do this, we used :read! blkid in Vim to obtain the UUIDs of the partitions, and then kept only the one we cared about. Note that the UUID must be the UUID of the LUKS container, not of the root file system. This makes sense: when the kernel boots up, it only sees the LUKS container; the Btrfs file system inside it is still encrypted (after all, that’s the point).
Please note that although these are formally kernel command line parameters, the kernel itself does not care. The early userspace will use the parameters to unlock the encrypted root partition.
We generated the main configuration file for GRUB afterwards.
Finally, we added the encrypt hook to the HOOKS array in /etc/mkinitcpio.conf (it must come before the filesystems hook) and regenerated the initramfs for the linux preset:

mkinitcpio -p linux

Leave chroot, unmount, reboot.
If the machine does not boot, you can correct any mistakes by jump-starting the installation with Archiso again, just as you would if encryption wasn’t used.
To override boot priority (so that you boot the Archiso instead of the system installed on the VM’s hard drive), you can select “UEFI Firmware Settings” in the boot menu, select Boot Manager and force boot from Archiso. Next time, the machine will boot from the hard drive again (kudos to Petr for telling me about this).
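Several of the steps above are described without the exact commands. The following is a dry-run sketch of the disk preparation up to the point where pacstrap takes over, under stated assumptions: the use of sgdisk for partitioning, the partition type codes, and the helper names run, luks_cmdline and prepare_disk are illustrative choices, not what was necessarily typed in the lecture. By default the script only prints the commands; it executes them only when DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run sketch of the disk preparation for the encrypted-root install.
# DISK, NAME and all helper names here are illustrative assumptions.
set -eu

DISK=${DISK:-/dev/sda}
NAME=${NAME:-cryptroot}

# Print the command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = 0 ]; then
        "$@"
    fi
}

# Build the kernel parameters from the UUID of the LUKS container ($1)
# and the mapper name ($2).
luks_cmdline() {
    printf 'cryptdevice=UUID=%s:%s root=/dev/mapper/%s\n' "$1" "$2" "$2"
}

prepare_disk() {
    # GPT with a 300 MiB ESP and the rest of the disk for LUKS.
    run sgdisk --new=1:0:+300M --typecode=1:ef00 "$DISK"
    run sgdisk --new=2:0:0 --typecode=2:8300 "$DISK"
    run mkfs.fat -F 32 "${DISK}1"
    # LUKS container on the second partition, then Btrfs inside it.
    run cryptsetup luksFormat "${DISK}2"
    run cryptsetup luksOpen "${DISK}2" "$NAME"
    run mkfs.btrfs "/dev/mapper/$NAME"
    # Root at /mnt, ESP at /mnt/boot.
    run mount "/dev/mapper/$NAME" /mnt
    run mkdir -p /mnt/boot
    run mount "${DISK}1" /mnt/boot
}

prepare_disk
```

Once the container exists, luks_cmdline "$(blkid -s UUID -o value /dev/sda2)" cryptroot prints the parameters to append to the kernel command line in /etc/default/grub. Note that blkid is pointed at /dev/sda2 (the LUKS container), not at /dev/mapper/cryptroot.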

Note on terminology:

dm-crypt is a Device mapper module,
LUKS is a particular type of encryption container (there’s also TrueCrypt),
cryptsetup is the tool to configure dm-crypt.

We also discussed the various disadvantages of carrying around an unencrypted device. Hopefully I scared some people. Note that:

You can encrypt an existing Linux filesystem,
There’s VeraCrypt for Windows.

In other words, there’s no good excuse for not encrypting your laptop :-).

systemd

We followed up with the exploration of systemd (mind the lowercase s), a service manager used by Arch Linux and many other distros.

To understand what systemd is, we talked about “services”, or “daemons” (cf daemon, demon). Services in this context are usually long-running processes which provide some—well—service; either to the machine itself, or to other machines on the network. Examples of services:

MTA
DHCP server
DNS server
…

We discussed the two main roles of systemd which are relevant to this course:

systemd is a service manager. It provides the component of Arch Linux which runs as process number 1 (“PID 1”), usually called “init.” It makes sure that whatever services are supposed to be running in the system are running, and whatever services are no longer supposed to be running are terminated. It also handles inter-service relationships (dependency, ordering).
systemd is a log collector. It collects stdout+stderr of the services and stores it in a binary log format on disk. It performs the equivalent of log rotation and enforces retention (keeping logs for some amount of time, or until they occupy a certain amount of space).
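Both roles can be observed on a running system. The following is a small sketch; pid1_name and show_roles are hypothetical helper names, and the commands inside them are standard systemctl(1) and journalctl(1) invocations.

```shell
#!/bin/sh
# Sketch: observe systemd's two roles on a running system.
# pid1_name and show_roles are hypothetical helper names.

# Role 1: the program running as PID 1. Reads /proc on Linux and
# falls back to ps elsewhere.
pid1_name() {
    cat /proc/1/comm 2>/dev/null || ps -p 1 -o comm=
}

show_roles() {
    echo "PID 1 is: $(pid1_name)"
    # Services that systemd is currently keeping running.
    systemctl list-units --type=service --state=running --no-pager
    # Role 2: how much disk space the collected logs occupy.
    journalctl --disk-usage
}
```

Calling show_roles on the Arch Linux VM should report systemd as PID 1. Retention can also be enforced by hand, for example with journalctl --vacuum-time=2weeks, or configured permanently with SystemMaxUse= in /etc/systemd/journald.conf.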

We then went ahead to take a look at the following:

Breaking the sshd.service

We looked at the output of systemctl(1) with no options, which lists the various units in the system.
We picked a victim (sshd.service) and killed it (pkill sshd). We noted that systemd took care of restarting the service.
We took a quick look at systemctl status sshd and from there, the associated service unit file /usr/lib/systemd/system/sshd.service:

[Unit]
Description=OpenSSH Daemon
Wants=sshdgenkeys.service
After=sshdgenkeys.service
After=network.target

[Service]
ExecStart=/usr/bin/sshd -D
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=always

[Install]
WantedBy=multi-user.target

We noticed that when we break sshd (rm "$(which sshd)"), systemd will try to restart the service a few times and then the service unit will enter a “failed” state. That seemed at odds with Restart=always. The explanation was that there’s rate limiting at play: by default, systemd gives up on a unit that is started more than 5 times within 10 seconds. This is common mistake #1.
We also noticed that when we systemctl start sshd, the command succeeds, but the service may still be crashing. That’s because with Type=simple (the default when no Type= is set, as in the unit above), systemd considers the service alive before it even tries to execute it. (The mere fact that the service is being started means that the service is ready. More precisely, systemd considers the service started after fork, before exec, so it cannot notice that the exec failed.) This is common mistake #2.
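Both mistakes can be guarded against from the shell. The sketch below is illustrative: start_and_verify and show_restart_limits are hypothetical helper names, and the two-second default delay is an arbitrary assumption. It starts a unit, waits briefly, and then asks systemd whether the unit is really active, instead of trusting the exit status of systemctl start.

```shell
#!/bin/sh
# Sketch: do not trust "systemctl start" alone for Type=simple services.
# start_and_verify and show_restart_limits are hypothetical helpers.

start_and_verify() {
    unit=$1
    delay=${2:-2}   # seconds to wait before checking; arbitrary default

    systemctl start "$unit" || return 1
    sleep "$delay"

    if systemctl is-active --quiet "$unit"; then
        echo "$unit is active"
    else
        echo "$unit is not active; inspect: systemctl status $unit" >&2
        return 1
    fi
}

# Show the restart policy and the start rate limit that apply to a unit.
show_restart_limits() {
    systemctl show -p Restart -p StartLimitBurst -p StartLimitIntervalUSec "$1"
}
```

For example, start_and_verify sshd.service followed by show_restart_limits sshd.service. Once a unit has hit the rate limit and entered the failed state, systemctl reset-failed sshd.service clears that state together with the start counter.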

hello.service

Next, we tried to create a bare-bones systemd service unit. First we wrote our /usr/local/bin/hello script providing the “service”:

#!/bin/sh
set -eu

while :; do
  printf "Hello World\n"
  sleep 1
done | cat -n

We wrote a systemd service unit for the service:

[Unit]
Description=The most important service in the system, don't turn off!

[Service]
ExecStart=/usr/local/bin/hello

We started, stopped and restarted the service (systemctl {start,stop,restart} hello).
We inspected the logs with journalctl -u hello -e and journalctl -u hello -f.
We modified the unit to actually keep restarting forever:

[Unit]
Description=The most important service in the system, don't turn off!

[Service]
ExecStart=/usr/local/bin/hello
Restart=always
RestartSec=1
StartLimitInterval=0

We broke the script (with some nonsense command) and expected systemd to keep restarting the service. But it didn’t work.
It turned out that you need to systemctl daemon-reload when you modify your unit definitions, since systemd keeps them loaded in memory, and disregards the changes on disk by default. This is common mistake #3.
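One way to avoid common mistake #3 is to let systemd say whether a reload is needed. The sketch below assumes nothing about where the unit file lives; apply_unit_changes is a hypothetical helper name, and it relies on the NeedDaemonReload property reported by systemctl show.

```shell
#!/bin/sh
# Sketch: reload unit definitions if needed, then restart the unit.
# apply_unit_changes is a hypothetical helper name.
apply_unit_changes() {
    unit=$1

    # NeedDaemonReload is "yes" when the unit file on disk has changed
    # since systemd loaded it into memory.
    if [ "$(systemctl show -p NeedDaemonReload --value "$unit")" = yes ]; then
        systemctl daemon-reload
    fi

    systemctl restart "$unit"
}
```

After editing the unit, apply_unit_changes hello.service picks up the new definition before restarting. Running systemctl daemon-reload unconditionally is equally valid; the check merely makes the reason for the reload visible.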

Left-overs

We didn’t have time to talk about Pacman. So instead, we’ll prepare a Pacman cheat-sheet for the next labs.
Our first task during the next lab will be to configure systemd-networkd and get us back online.