Yocto Hardening: Block Device Encryption with dm-crypt

The last few hardening posts have been about boot measurements, so maybe it’s time to do something else. Data safety is crucial in embedded systems. The devices can gather, process and store information that should be kept secret, and encryption can be used to achieve exactly that. I’ll cover the encryption topic in two parts: this first part is about block device encryption, and the upcoming second part will be about file system encryption. Let’s encrypt!

About Encryption

Encryption is the act of converting information from a readable format into a coded format to prevent unauthorised reads. The encrypted information can then be decrypted with a key or a password. Encrypting persistent data stored on a hard drive or other storage device is sometimes called data-at-rest encryption. The name comes from the fact that the data is stored encrypted when the device is not active. Having the correct user access control in a live system will not help much if someone just takes the storage device from a powered-off machine and reads the secret data stored in plain text.

I wish it was that easy

Of course, the benefits of encryption come with a price, and that price is the performance hit. As the data gets encrypted and decrypted, it will take some time. These days it is more common to decrypt only the required things at the required times to reduce the impact. This is more efficient, and could also be considered more secure because all the encrypted secrets won’t always get decrypted as a part of the unlocking process. Compared to one-off decryption, the performance overhead is constantly present while the system runs, but on the other hand, there won’t be a significant wait for the one-off decryption process.

Data encryption can be done on two different levels: block device level, and file system level. Block device encryption encrypts the entire device or a partition, whereas file system encryption encrypts individual folders and files in a file system. A more comprehensive comparison of the two approaches can be found in the Arch Wiki article “Data-at-rest encryption”, specifically the section “Block device vs stacked filesystem encryption”. It’s worth noting that the two approaches are not mutually exclusive.

In case you did not open the previous link and read the whole article, I want to highlight that encryption does not protect against all the possible threats. The most obvious thing to remember is that an unauthorised user can read secret data if they manage to get into the system with a proper access level while it is in the unlocked state. Second, the encryption keys may be extracted from the memory using a cold boot attack if the attacker has physical access to the device and the resources to perform the attack. And third, there is always a risk that a human being with the keys leaks them, either voluntarily or involuntarily.

About Block Device Encryption in Linux

The current de facto choice for block device encryption in Linux is dm-crypt. It is a device mapper target that provides block device encryption. Device mapper is a Linux kernel subsystem that allows creating virtual layers on top of storage devices. dm-crypt provides on-the-fly encryption, decrypting the data read from and encrypting the data written to the hardware device. To enable dm-crypt in the kernel, ensure that CONFIG_DM_CRYPT is set. A block device needs to be prepared with dm-crypt mappings to be encrypted. These can be conveniently set with the cryptsetup tool.
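Before going further, it can save some head-scratching to confirm that the option really made it into the kernel configuration. The following is a minimal sketch, assuming you have a plain-text kernel config file at hand. The function name has_dm_crypt and the example path are made up for illustration.

```shell
# Hypothetical helper: succeeds if the given kernel config file enables
# dm-crypt either as built-in (=y) or as a module (=m).
has_dm_crypt() {
    config_file="$1"
    grep -Eq '^CONFIG_DM_CRYPT=(y|m)$' "$config_file"
}

# Example usage (the path is an assumption, adjust it to your build):
# has_dm_crypt /path/to/kernel/.config && echo "dm-crypt enabled"
```

On a running target, the same check can be pointed at a decompressed copy of /proc/config.gz, provided the kernel was built with CONFIG_IKCONFIG_PROC.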

The block device can be configured either as a raw volume, where key management has to be done manually, or as a LUKS (Linux Unified Key Setup) volume, which adds a LUKS header to the device. This header allows easier key management. The header is considered non-secret, but in theory, some information can be extracted from it. If you want your disk to appear as totally random bits and bytes to possible snoopers, you can use the raw volume, or detached LUKS headers. I’m not going to cover detached headers, but the “Recommended Reading” section at the end contains a link to an article that has more information about them.

The “problem” with full disk encryption is that there is a lot of documentation and guides available, so it may be difficult to figure out the correct solution for your exact situation, as every tutorial seems to do things slightly differently or use different tools to achieve the same things. I have a feeling that this guide won’t be any different in that regard, but I hope that it helps you to think and plan your encryption process. Also, most of the guides are aimed at PC users, meaning that they have a slightly different perspective compared to embedded devices. I’ll try to write mostly from the embedded perspective. Speaking of which.

Full Disk Encryption in Embedded Devices

Does disk encryption make sense for embedded systems? Disk encryption is considered the most useful when the disk gets stolen from the machine, or the machine requires a secret input from the user to unlock the disk. In the embedded world, these assumptions are not quite true. First of all, the storage media is usually a fixed part of the device, meaning that a situation where the disk gets separated from the rest of the machine without destroying either is quite rare. Second, embedded devices usually cannot expect user input.

Therefore, the disk unlocking has to happen automatically. This could involve binding the unlock process to trusted platform module (TPM) platform configuration registers (PCRs) that contain the boot measurements. This would then prevent decrypting the disk if the kernel, boot arguments or something else has been tampered with. Some other alternatives could be fetching a password using a network connection, or using a hardware device to have a kind of an “ignition key”.

Now, with automatic unlocking, a malicious actor can just turn on the device and the disk gets decrypted, assuming they haven’t tampered with the boot configuration. This to some extent reduces the usefulness of disk encryption in systems without user input. It doesn’t mean that it is useless: there are plenty of situations where encryption still makes sense. For example, as mentioned above, it could help keep things secret if someone tampers with the boot flow. Also, on-the-fly decryption prevents leaking unnecessary secrets. However, it is good to be aware of the weaknesses of automatic unlocking so that you won’t end up designing a complex encryption scheme that does not improve the system’s security.

Choice of Programs

For creating the dm-crypt mappings and the LUKS header on the volume, I’ll use the standard cryptsetup. As far as I know, there aren’t many alternatives to it.

For performing the key generation, TPM sealing & unsealing, and device unlocking, there are more options. I chose to go with Clevis. Clevis is a framework for automated decryption. It can be used to unlock LUKS volumes with TPM2, but it also provides other unlock methods, like network unlocking with Tang.

Another commonly used tool for generating and sealing the key is systemd-cryptenroll. I usually prefer to use tools that are not tied to the init system because I’m not always working with systemd. However, if you know you are going to be working with a systemd-based system, it’s an alternative worth looking into. I haven’t given it a try, but it seems like an easier alternative.

In addition, there exists tpm2-initramfs-tool that can be used to seal the encryption key. The workflow is quite similar to Clevis, but overall it offers less functionality in a (presumably) smaller package. I didn’t try this one either, and it doesn’t seem to be that commonly used, but it is one option.

The core idea will be the same, regardless of the tools: we want to set up and format a LUKS volume using cryptsetup, then bind TPM to the volume by sealing an unlock key into the TPM, and finally perform the unlocking during the boot with one of the tools listed above. The “Recommended Reading” section at the end contains some information about the systemd tools if you want to read more about them.

Hardware

To test out the block device encryption, I used a Raspberry Pi 4 with the LetsTrust TPM module. I have written a blog post about adding LetsTrust support to Raspberry Pi Yocto builds, so you may want to check that one out. I also used serial UART for communicating with the device to ensure I had access to the early boot process. That’s all about hardware, I guess.

Yocto

Now, since we’re talking about Yocto hardening, I’m going to talk about Yocto. The actual volume formatting and binding steps are generic, but some steps have to be performed in the Yocto world to get the firmware built with the required toolset.

Cryptsetup

cryptsetup has an existing recipe in meta-oe, so adding it to the image is just as easy as adding the following line to your image recipe:

IMAGE_INSTALL:append = " cryptsetup"

Use PACKAGE_INSTALL instead if you’re building an initramfs image.

Clevis

Unfortunately, Clevis does not have a bitbake recipe. Fortunately, I made one. It is available in meta-clevis, alongside the other dependencies that don’t have an existing recipe. I had some trouble with cross-compiling Clevis though. It checks the existence of certain binaries. However, in a bitbake cross-compilation context, these checks do not make sense. Therefore, I patched some checks out to enable the required TPM2 functionality. There’s still disabled functionality as it builds just the bare minimum to do the things shown in this blog text. Therefore, you may have to patch out some more checks, or attempt to add dependencies, to get everything working properly. Pull requests are always welcome.

Raspberry Pi BSP & Linux

For building the Raspberry Pi image, I added the customary meta-raspberrypi layer in addition to the meta-slb9670-rpi layer for TPM support. I modified the kernel configuration fragment in meta-slb9670-rpi layer so that all the features are built into the kernel. I also enhanced the default Raspberry Pi SD card image with a brand new test partition for encryption purposes:

part /crypted --ondisk mmcblk0 --fstype=ext4 --label crypted --align 4096 --size 100

Also, the kernel is built with bundled initramfs. The initramfs is mostly the same as in my earlier blog text, with a slight modification to the init script where it drops to shell just before switching root. This allows working inside the initramfs, and manually switching root when/if so desired. Also, I added Clevis with its run time dependencies. And Busybox. And mkfs. This results in a quite big initramfs. It is worth considering if this is something you want to have in your actual final product or just use it for device initialization purposes (more on the initialization process a tiny bit later). In the end, PACKAGE_INSTALL of my initramfs looks like this:

PACKAGE_INSTALL = "minimal-initramfs-init busybox clevis cryptsetup luksmeta tpm2-tools libtss2-tcti-device e2fsprogs-mke2fs"

minimal-initramfs-init is my init script that was presented in the initramfs blog text.

Existing Encryption Work in Yocto

After figuring out how to do the encryption and writing half of this text, I realised that there exists a meta-layer for encrypting block devices. In my defense, Google didn’t find it and it took some mail archive digging to find it. Wind River has created a meta-encrypted-storage layer, part of their meta-secure-core. I didn’t give it a go myself, but the “Use case 2: luks-setup” section seems to be quite similar to what I’m going to present soon, so you may want to check that out as well.

The Initialization Process

However, there’s a small problem. Yocto does not have a mechanism for encrypting image partitions during the build. This makes sense, because if the image was encrypted during the build, it would mean that all the devices flashed with the image would have the same encryption key. This in turn would mean that if one device gets cracked, all the devices get cracked. Also, we are going to be sealing the encryption key to the TPM chip present in the actual hardware, and that is something that cannot be done during the build. Therefore, we need to think about an initialization process, or a factory process in fancy terms.

This process means performing some set-up work on a live system in the actual hardware before the device gets shipped into the wild. This can consist of installing additional software, performing quality control testing, setting up keys, etc. The image used in the initialization process can be different from the final firmware that is flashed to the device. Initialization images may contain some extra tools, scripts and tests that are not required for the actual operation of the device. Usually, the initialization work is handled by robots and automated scripts on the factory floor, but it’s a bit much to talk about “factory” when I’m doing this stuff in my bedroom.

Anyway, what I’m trying to say in this chapter is that unfortunately, we need to run the commands in the next chapter manually on the Raspberry Pi, and they cannot be performed during the build. So get an SD card flashed and boot the device.

Encrypting Block Device

The following chapters show what is required, in an ideal world, to encrypt the block device, bind the TPM to it, and (automatically) unlock it in subsequent boots. You can perform these actions in the initramfs, or in the user space. Commonly the encrypted devices are unlocked in the initramfs or the early init. It’s a good idea to perform both the binding and unlocking in the same phase of the boot. This should ensure that the TPM is in a similar state during both operations, minimizing the risk of PCR mismatches.

1. Prepare Block Device for dm-crypt

Format the device for dm-crypt with a LUKS header using the following command:

cryptsetup luksFormat --key-size 512 /dev/mmcblk0p3

Your device path may be different, but if you’ve added one extra partition to the default Raspberry Pi image, the third partition should be the demonstration partition. This command will ask for a passphrase that can be used to open the encrypted device. For embedded devices with unattended unlocking, it may make more sense to use a key file instead:

dd if=/dev/urandom of=/path/to/keyfile bs=1024 count=4
cryptsetup luksFormat --key-file=/path/to/keyfile /dev/mmcblk0p3

Depending on your use case, you could move the key file away from the device after formatting and use it only for recovery if TPM fails to decrypt the device. If you keep it on the device, make sure that it is well protected, and possibly even encrypted.
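Since the key file is as sensitive as a passphrase, it is worth creating it with restrictive permissions from the start instead of tightening them afterwards. A minimal sketch of that idea follows; the function name make_keyfile is hypothetical, and the size matches the dd command above.

```shell
# Hypothetical helper: write 4096 random bytes to the given path so that the
# file is readable and writable only by its owner from the moment it exists.
make_keyfile() {
    keyfile="$1"
    (
        umask 077    # the subshell keeps the umask change local
        dd if=/dev/urandom of="$keyfile" bs=1024 count=4 2>/dev/null
    )
}

# Example usage (the path is a placeholder):
# make_keyfile /path/to/keyfile
```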

Note that the luksFormat command destroys all the data already present on the device, so be sure that you’re formatting the correct device. If the device already contains some data, you may want to back it up, format the device and then copy the data back in place.
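In an automated initialization script, one way to reduce the risk of wiping an already prepared device is to check for an existing LUKS header first. The sketch below assumes the cryptsetup isLuks action and the --batch-mode option are available in your cryptsetup build. The function name luks_format_once and the CRYPTSETUP override variable are made up for this example.

```shell
# CRYPTSETUP can be overridden with a stub for testing (an assumption of
# this sketch, not a cryptsetup feature).
CRYPTSETUP="${CRYPTSETUP:-cryptsetup}"

# Hypothetical helper: run luksFormat only if the device has no LUKS header.
luks_format_once() {
    device="$1"
    keyfile="$2"
    if $CRYPTSETUP isLuks "$device"; then
        echo "$device already has a LUKS header, skipping luksFormat" >&2
        return 0
    fi
    # --batch-mode suppresses the interactive confirmation question
    $CRYPTSETUP luksFormat --batch-mode --key-size 512 \
        --key-file="$keyfile" "$device"
}

# Example usage on the target:
# luks_format_once /dev/mmcblk0p3 /path/to/keyfile
```

Note that this only guards against re-formatting a LUKS volume. A partition holding a plain file system would still be formatted, so the backup advice above applies regardless.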

2. Bind the TPM to the Block Device

To bind the block device to the TPM, you can use the following command:

clevis luks bind -d /dev/mmcblk0p3 tpm2 '{"pcr_bank":"sha256","pcr_ids":"1"}'

If you’re using a key file instead of the passphrase, add -k /path/to/keyfile to the command. Otherwise you’ll get prompted for the passphrase. tpm2 defines the binding type (PIN in Clevis terms), and the last parameter contains the configuration for the PIN.

This command generates new key data, adds it to a LUKS key slot and seals it into the TPM. The key can then be unsealed if the TPM PCR 1 is in the expected state. I’m using just PCR 1 in this example, because I’ll be demonstrating a change in the boot parameters that will modify PCR 1. Other PCRs (or a combination of them) may make more sense in your case. For example, PCR 7 (Secure Boot State) is commonly used. Brief documentation about PCRs can be found, for example, in the Arch Wiki.
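If you end up experimenting with different PCR combinations, building the JSON configuration by hand gets error-prone quickly. Here is a small sketch that generates the configuration string in the same shape as the bind command above. The function name clevis_tpm2_config is hypothetical, and the comma-separated pcr_ids form is used on the assumption that your Clevis version accepts it.

```shell
# Hypothetical helper: print a Clevis tpm2 pin configuration for the given
# PCR bank and comma-separated list of PCR indices.
clevis_tpm2_config() {
    bank="$1"
    pcr_ids="$2"
    printf '{"pcr_bank":"%s","pcr_ids":"%s"}\n' "$bank" "$pcr_ids"
}

# Example usage: bind against PCR 1 and PCR 7 instead of PCR 1 alone.
# clevis luks bind -d /dev/mmcblk0p3 -k /path/to/keyfile tpm2 \
#     "$(clevis_tpm2_config sha256 1,7)"
```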

Neat-o. This should encrypt the device, and allow automatic unlocking later on. However, nothing performs the unlocking yet. Rebooting the device leaves the block device in a locked state. For now.

3. Unlocking the Device

You can unlock the encrypted device with the following command:

clevis luks unlock -d /dev/mmcblk0p3

Note that it is also possible to use the cryptsetup luksOpen command to open the device with the initial passphrase or key file. However, this method is not bound to the TPM PCRs. Therefore, you may want to either have an uncrackable passphrase or move the key file away from the device to prevent this from happening. Another alternative is to kill the initial key slot from the LUKS header, but this is a risky choice and can result in the partition becoming permanently locked.

After Clevis unlocks the block device, a mapped device appears as /dev/mapper/luks-<UUID>. Remember that the luksFormat command formatted this partition, so it needs a file system before mounting. The file system can be created, for example, with mkfs.ext4. After this, it can be mounted with the regular mount /dev/mapper/luks-<UUID> /mnt command.
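In a script you usually don’t want to hard-code the UUID. A minimal sketch for resolving the mapper path follows, assuming the luks-<UUID> naming described above and the cryptsetup luksUUID action. The function name luks_mapper_path and the CRYPTSETUP override variable are invented for this example.

```shell
# CRYPTSETUP can be overridden with a stub for testing (an assumption of
# this sketch).
CRYPTSETUP="${CRYPTSETUP:-cryptsetup}"

# Hypothetical helper: print the path where the unlocked device is expected
# to show up, based on the UUID stored in the LUKS header.
luks_mapper_path() {
    device="$1"
    uuid=$($CRYPTSETUP luksUUID "$device") || return 1
    echo "/dev/mapper/luks-$uuid"
}

# Example usage on the target:
# mapped=$(luks_mapper_path /dev/mmcblk0p3)
# mkfs.ext4 "$mapped"    # first time only, this wipes the mapped device
# mount "$mapped" /mnt
```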

Automated Unlocking

You usually want to automatically unlock the device when the machine boots. In systemd, this is handled with the /etc/crypttab file and systemd-cryptsetup-generator, which creates the systemd-cryptsetup services that unlock the volume. If you’re not using systemd, you’ll have to write an init script for it yourself. Well, Debian has cryptdisks scripts that can be used to handle the crypttab file in sysvinit systems, but unfortunately they are not available in Yocto. An alternative to an init script is a program that unlocks and locks the partition as needed (instead of automatically doing the unlocking during boot).

If you’re doing this in initramfs, you could just call clevis luks unlock directly as a part of the init script.
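As a rough sketch of what such an init script step could look like, the function below wraps the unlock call and reports a failure instead of silently carrying on. The function name unlock_or_report and the UNLOCK_CMD variable are assumptions made for this example. What to do on failure (drop to a shell, reboot, try a recovery key) is a design decision left to you.

```shell
# UNLOCK_CMD exists to make the sketch testable. On the target it defaults
# to the Clevis command used earlier in this text.
UNLOCK_CMD="${UNLOCK_CMD:-clevis luks unlock -d}"

# Hypothetical init-script step: try to unlock the device and report how
# it went. Returns non-zero if the unlock failed.
unlock_or_report() {
    device="$1"
    if $UNLOCK_CMD "$device"; then
        echo "unlocked $device"
    else
        echo "failed to unlock $device, PCR values may have changed" >&2
        return 1
    fi
}

# Example usage in the initramfs init script:
# unlock_or_report /dev/mmcblk0p3 || exec sh
```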

4. Re-Binding the Device

We should now have a working encryption and decryption of a volume. At least if nothing changes in the device. However, usually something will change. The kernel gets updated, boot arguments change, or something happens that results in the bound PCR values being different. You can simulate this on the Raspberry Pi by editing cmdline.txt containing the kernel boot arguments in the Raspberry Pi boot partition:

mount /dev/mmcblk0p1 /boot
vi /boot/cmdline.txt
# Add something to the kernel command line arguments

Now if you boot the device and try to unlock the volume, it should fail because the different kernel boot arguments get measured into the PCR 1. To fix this, we need to re-bind the TPM. There are a few options for this. You could either try clevis luks regen, or clevis luks unbind followed by bind. However, I had some trouble with these. They don’t accept key files, and also seem to rely on /dev/tty existing (which wasn’t the case in the initramfs I was using). In the end, I used the following command to remove the outdated key:

cryptsetup luksKillSlot --key-file=/path/to/key /dev/mmcblk0p3 <SLOT>

The slot to kill should be 1: indexing starts from 0, slot 0 is the initial key slot, and the Clevis binding took the next free one. You can verify this with clevis luks list -d /dev/mmcblk0p3.

Now you should be able to perform the binding from step 2 again without using another key slot. Then if you reboot the device, the unlocking should work again. That covers the happy path.
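If the re-binding has to be scripted, for example as a part of a firmware update, the slot number can be picked from the clevis luks list output instead of assuming it is 1. The sketch below assumes that each output line has the form `<slot>: <pin> '<config>'`, which is worth verifying against your Clevis version. The function name tpm2_slots is hypothetical.

```shell
# Hypothetical helper: read "clevis luks list" output on stdin and print the
# slot numbers that hold a tpm2 binding, one per line.
# Assumed line format:  <slot>: <pin> '<json config>'
tpm2_slots() {
    awk -F': ' '$2 ~ /^tpm2 / { print $1 }'
}

# Example usage on the target (key path is a placeholder):
# for slot in $(clevis luks list -d /dev/mmcblk0p3 | tpm2_slots); do
#     cryptsetup luksKillSlot --key-file=/path/to/key /dev/mmcblk0p3 "$slot"
# done
# clevis luks bind -d /dev/mmcblk0p3 -k /path/to/key tpm2 \
#     '{"pcr_bank":"sha256","pcr_ids":"1"}'
```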

The Problems

As mentioned before, I chose to do this in initramfs. Because of this, I had two issues. Well, three if you count the aforementioned /dev/tty issue that I didn’t fully investigate. But here are the other two I had to sort out.

Missing TPM Device in Initramfs

The kernel configuration I used built the SPI drivers as modules. Because the kernel drivers were not available in initramfs, I had to modprobe the required modules manually from the root file system. Not a big problem, but required some head-scratching. As mentioned earlier, I modified the kernel configuration so that the TPM drivers were built-in, but I had to probe the following SPI drivers manually:

mount /dev/mmcblk0p2 /mnt
modprobe /mnt/lib/modules/6.6.22-v8/kernel/drivers/spi/spi-bitbang.ko.xz
modprobe /mnt/lib/modules/6.6.22-v8/kernel/drivers/spi/spi-gpio.ko.xz

These get automatically probed in the user space, so this was an initramfs specific problem. It’s worth noting that this creates a dependency between the initramfs and root file system, which isn’t a good thing.
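A missing driver tends to show up as a confusing Clevis error, so it can help to check for the TPM device node before attempting to bind or unlock. A minimal sketch, assuming the usual /dev/tpm0 and /dev/tpmrm0 device names; the function name tpm_device_present is made up, and its optional argument exists only so the check can be tried against another directory.

```shell
# Hypothetical helper: succeed if a TPM character device is visible.
# The optional argument overrides /dev, which is only useful for testing.
tpm_device_present() {
    dev_dir="${1:-/dev}"
    [ -e "$dev_dir/tpm0" ] || [ -e "$dev_dir/tpmrm0" ]
}

# Example usage in the initramfs:
# tpm_device_present || echo "no TPM device, are the SPI drivers loaded?" >&2
```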

Process Substitution in Clevis TPM2 Encryption Script

Binding the TPM to the block device with Clevis seems to perform a shell process substitution. This assumes that there is a /dev/fd directory containing symlinks to the current process’s open file descriptors. However, these seem to be present only in /proc/self/fd (at least in my case). Therefore I had to add the following link in initramfs to get Clevis working:

ln -s /proc/self/fd /dev/fd

It seems that this link is present in the user space, so this problem again seems to exist specifically in the initramfs. I guess it’s my fault for trying to run Bash in initramfs, but it is a dependency for Clevis, so what can you do? In hindsight, perhaps Clevis wasn’t the optimal choice for initramfs work, and tpm2-initramfs-tool would have fared better.
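If the link creation goes into the init script, it is safer to make it idempotent so that it doesn’t fail on systems where /dev/fd already exists. A small sketch of that follows. The function name ensure_dev_fd is hypothetical, and the arguments default to the real paths and are there only to allow testing.

```shell
# Hypothetical helper: create the fd symlink unless something already
# exists at the link path. Both arguments are optional.
ensure_dev_fd() {
    target="${1:-/proc/self/fd}"
    link="${2:-/dev/fd}"
    if [ -e "$link" ] || [ -L "$link" ]; then
        return 0
    fi
    ln -s "$target" "$link"
}

# Example usage in the initramfs init script, before calling Clevis:
# ensure_dev_fd
```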

Encrypting the Root File System Device

The root file system is a special case worth mentioning because it has to be decrypted before the actual init kicks off (because the init process is located in the encrypted root file system). Therefore, the encryption and decryption have to be handled in the initramfs, and they cannot be performed later in the user space. Also, encrypting the root file system means that the performance hit from encryption is constantly present.

Encrypting the root file system volume has its risks. If the decryption fails for some reason, the machine essentially becomes bricked. An alternative worth considering would be having no secrets on a read-only root file system, and just verifying its integrity with dm-verity. Secrets could be stored in another partition, and the modifications to the read-only rootfs could be applied with an overlayfs.

Another point worth noting: because the partition needs to be formatted with cryptsetup luksFormat on a live system, the root file system partition gets formatted clean and needs to be re-populated. This is inconvenient, but of course not impossible.

Closing Words

Phew, that was exhausting. As you can see, encryption is not a one-size-fits-all solution. Instead, you need to carefully think about what you want to encrypt, where you want to encrypt it, how to unlock the data, how to handle the firmware updates, etc. As a good rule of thumb, the most securely stored data is the data that is not stored at all. The important thing is that you have a solid understanding of the encryption so that you won’t end up bricking all your devices after a bad firmware update. Hopefully, this text has given you some pointers on your encryption planning process, and maybe even practical guidance on how to achieve your goal. Until next time, when we discuss file system encryption!

Recommended Reading

A ton of wiki articles coming up. They contain plenty of good content, but as mentioned before, they are usually written from a PC perspective, so they don’t always completely apply to the embedded world.
