My First Plug-In: Pastel Distortion

It’s time to finish a project. Lately, I have been mostly interested in embedded tinkering, but I’m also fascinated by audio and DSP programming. Partially because it is an interesting field, but mostly because I make music as a hobby, so it’s nice to see how the virtual instruments and audio effects work. So, in this text I’m presenting my first full-fledged and complete VST plug-in, Pastel Distortion. In a way it’s my second plug-in, as I made a Delayyyyyy plug-in earlier (mentioned in some older texts of this blog as well), but that project was abandoned in a state that I can’t quite call complete. However, here’s a screenshot of something that I actually have completed:

tada.wav

In short, VST plug-ins are software used in music production. They create and modify sound based on the information they’re given by the VST host, which is usually a digital audio workstation. Plug-ins are commonly chained together so that one plug-in’s output is connected to the next one’s input. This is all done in real time, while the music is playing.

There will be free downloads at the end, but first, let’s go through the history of the project, some basic theory, and a six-paragraph subchapter that I like to call “I’m not sponsored by JUCE, but I should be”.

History of the Project

About two and a half years ago I started working on a distortion VST plug-in following this tutorial. Half a year after that, I got distracted when I thought about testing the plug-in (which resulted in this blog text and my first conference talk). As a side note, it may tell something about the development process and schedule that testing was “thought about” six months after starting the project. A year after that I got a MacBook for Macing Mac builds and got distracted by the new shiny laptop. And some time after that I made some overenthusiastic plans for the plug-in that never quite became reality, and then I forgot to develop the plug-in.

The timeline is almost as confusing as the Marvel Multiverse and full of delays, detours, and time loops. In the end, I’ve come to the conclusion that I’ll release Pastel Distortion as it is, and add the new cool features later on if there’s interest in the plug-in. If there’s no interest, I can start working on a new plug-in, so it’s a win-win situation. But let’s finish this first.

What Is Waveshaping Distortion?

The Physics

In the real world that surrounds us all, sound is a change of pressure in a medium. Our ears then receive these changes of pressure and turn them into some sort of electricity in the brain. In short, it’s magic, that’s the best way I can explain it. To translate this into the world of computers, a microphone receives changes of pressure in air and converts them into changes in electricity that an analogue-to-digital converter then turns into ones and zeros understood by a computer. Magic, but of a slightly different kind.

I asked AI to generate an infographic for this section. Hopefully this helps you to understand.

After this transformation, we can process the sound in the digital domain, and then convert it back into an analogue signal and play it out from speakers. Commonly the pressure/voltage changes get mapped into numbers within some range. One common range is [-1.0, 1.0]: -1.0 and 1.0 represent the extreme pressure changes where the microphone’s diaphragm is at its limit positions (= receiving a loud sound), while the value 0 is the resting position where the diaphragm receives no pressure (= silence).

Well, I also tried drawing the information myself. I’m not sure which one is better.

The Maths

Now we’ve established what sound is. But what is waveshaping distortion? You can think of it as a function that gets applied to the sampled values. Let’s take an example function that does not actually do any shaping, y=x:

This is quite possibly the dullest shaping function. It takes x as an input and returns it as-is. However, this is in principle what waveshaping does: it takes the input samples from -1.0 to 1.0, puts them into a mathematical function, and uses the output as the new samples. Let’s take another example, y=sign(x):

This takes an input sample and outputs one of the extreme values. You can emulate this effect by turning up the gain of a microphone, shoving the mic in your mouth, and screaming as loud as possible. It’s not really a nice effect. Finally, let’s take a look at a useful function, y=sqrt(x) where x >= 0, y=-sqrt(-x) where x < 0:

Finally, we get a function that does something but isn’t too extreme. This will create a sound that’s more pronounced, because the quieter samples get amplified. Or, in other words, values get mapped further away from zero. The neat part is that the waveshaper function can be pretty much anything. It can be a simple square root curve like here, or it can be a quartic equation combined with all of the trigonometric functions (assuming your processor can calculate it fast enough). Maybe it doesn’t sound good, but it’s possible.
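
If code speaks to you more than curves, here’s that square root shaper as a plain function. This is just an illustrative sketch, not Pastel Distortion’s actual source:

#include <cmath>

// Square root waveshaper: maps [-1.0, 1.0] back to [-1.0, 1.0],
// pushing quiet samples further away from zero.
float waveshape(float x)
{
    return x >= 0.0f ? std::sqrt(x) : -std::sqrt(-x);
}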

But Why Bother?

It’s always a good idea to think about why something is done. Why would I want to use my precious processor time to calculate maths when I could be playing DOOM instead? As an engineer, I’m not 100% sure; I think it has something to do with psychoacoustics, a field of science I know nothing about, and to be honest it sounds a bit made up. From a music producer’s point of view, I can say that distortion effects give the sound more character, warmth, and loudness (and other vague adjectives that don’t mean anything), so it’s a good thing.

Implementing the Distortion

I have talked about JUCE earlier in this blog, but that was so long ago that it’s probably been forgotten, so I’ll summarize it briefly again. It’s a framework for creating audio software. It handles input and output routing, VST interfacing, the user interface, and all that other boring stuff so that we can focus on what we actually want to do: making the computer go bleep-bloop.

The actual method of audio signal processing may vary between different types of projects. For a VST audio effect like this, there usually is a processBlock function that receives an input buffer periodically. It is then your duty as a plug-in developer to do whatever you want with that input buffer and fill it with values that you deem correct. Doing all this in a reasonable amount of CPU time, of course.

In this Pastel Distortion plug-in, we receive an input buffer filled with values ranging from -1.0 to 1.0, and then we feed those samples to the waveshaping function and replace the buffer contents with the newly calculated values. Sounds simple, and to be honest, that’s exactly what it is because JUCE does most of the heavy work.

JUCE has a ProcessorChain template class that can be filled with various effects to process the audio. There’s a WaveShaper processor, to which you simply give the mathematical function you want it to perform, and the rest is done almost automatically! As you can guess, the plug-in uses that. In the plug-in there are also some filters, EQs, and compressors to tame the distorted signal a bit more because the distortion can start to sound really ugly really quickly. That doesn’t mean that you can’t create ugly sounds with Pastel Distortion, quite the contrary.
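
To give an idea of how little code the core of this takes, here’s a rough sketch of a processor using ProcessorChain and WaveShaper. This is not Pastel Distortion’s actual source, just the general shape of the thing (a real plug-in would have the filters and parameters in the chain too):

// Rough sketch of the waveshaping part of a JUCE plug-in processor.
// Assumes the usual JUCE module setup (juce_dsp available via JuceHeader.h).
juce::dsp::ProcessorChain<juce::dsp::WaveShaper<float>> chain;

void prepareToPlay (double sampleRate, int samplesPerBlock)
{
    // Hand the WaveShaper the square root curve from the maths chapter
    chain.get<0>().functionToUse = [] (float x)
    {
        return x >= 0.0f ? std::sqrt (x) : -std::sqrt (-x);
    };
    chain.prepare ({ sampleRate, (juce::uint32) samplesPerBlock, 2 });
}

void processBlock (juce::AudioBuffer<float>& buffer, juce::MidiBuffer&)
{
    // Wrap the buffer and let the chain replace its contents in-place
    juce::dsp::AudioBlock<float> block (buffer);
    chain.process (juce::dsp::ProcessContextReplacing<float> (block));
}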

The life of a designer is a life of fight: fight against the ugliness

Another great feature of JUCE is its built-in graphics library. It’s especially good in the sense that an embedded developer like me can create a somewhat professional-looking user interface, even though I usually program small devices where the only human-computer interaction methods are a power switch and a two-colour LED. Although I have to admit, most of the development time went into making the user interface. You wouldn’t believe the amount of hours that went into drawing the little swirls next to the knobs.

Honestly, it was pure luck that I managed to get these things looking even remotely correct. The best part is that in the end they’re barely even visible.

All in all, Pastel Distortion is a completed plug-in that I think is quite polished (at least by the usual standards of my projects). There’s the distortion effect of course, but in addition to that there’s tone control for shaping the distortion and output signal, a dry-wet mixer for blending the distorted and clean signals, and multiple waveshape functions to choose from. Besides the GUI, I also spent quite a lot of time tweaking the distortion parameters, so hopefully that effort can be heard in the final product.

There’s still optimization that could be done, but the performance is in fairly good shape already. Compared to Disructor, the FL Studio stock distortion plug-in, it has about the same CPU usage: Disructor averages around 8%, while Pastel Distortion averages 9%. Considering that my previous delay plug-in used about 20%, I consider this a great success. This good number is most likely a result of the optimizations in JUCE and not of my programming genius.

But enough talk, let’s get to the interesting stuff. How to try this thing out?

Getting Pastel Distortion

Obtaining the Pastel Distortion plug-in is quite easy. Just click this link to go to the Gumroad page where you can get it. And if you’re quick, you can get it for free! The plug-in costs $0 until the end of February 2024. After that, you can get the demo version for free to try it out, or you can ask me to generate some sort of discount code (I’d like to get feedback on the product in exchange for the discount).

If you don’t want to download Pastel Distortion but want to see it in action, check out the video below. I put all the skills I’ve learned from Windows Movie Maker and years of using Ableton into this one:

That’s all this time. I’ve already started working on the next plug-in, let’s hope that it won’t take another two and a half years. Maybe the next text will be out sooner than that when I get something else ready that’s worth writing about. I’ve been building a Raspberry Pi Pico-based gadget lately, and it got a bit out of hand, but maybe I’ll finish that soon.

Fixing Stability Issue In The Blog Server

The biggest fans of this blog (or just the people usually browsing between 6:00 and 7:00 UTC) may have noticed a frustrating issue where the site occasionally loads really slowly. Or, in the worst-case scenario, refuses to load at all, and only an error page containing a message about a failing database connection gets returned.

Well, at least the message is short and to the point.

Investigating the Issue

This issue started to occur sporadically in August and became consistent in October. And I started to consider fixing it in November. This kind of relaxed response time is common for hobby projects. The first obvious step to fix the issue was to check what was going on in the server when load times got longer. Once I noticed that the site was slowing down, I checked the monitoring stats. From the graphs, I saw that both CPU usage and disk reads were spiking. CPU was peaking at 80%, and disk reads were over 100MB/s for over 15 minutes. From the 7-day monitoring graph, it could be seen that this kind of spiking was happening almost daily.

Not every day though, and some spikes are taller than others.

Investigating the system log and comparing it with the time stamps of the peaks revealed the following cycle:

  1. One of the two daily apt package manager upgrade services gets started
  2. The CPU and disk activity starts ramping up
  3. The system starts heavy swapping and the website load times get longer
  4. About 15-20 minutes after the apt service starts, the OOM (out-of-memory) killer kicks in and stops MySQL. A few other services may time out or get killed in this phase as well.
  5. MySQL restarts and the blog works again

I started investigating why the daily apt services seemed to constantly cause the server to run out of memory. The first of the two services downloaded the packages for upgrading, and the second one installed the downloaded upgrades. After trying out a few different things I realized that just installing or removing a package caused the server to randomly run out of memory if either of the apt services was started a few minutes earlier. It’s fun to do tests like this on a live server.

Fix Attempt 1: Installing System Upgrades

Some further investigation into the daily apt services revealed that the unattended upgrades had been failing for a long time. It seemed like the MySQL apt repository was missing keys, causing the apt update to fail. Also, it seemed like some upgrades required input from the user to configure packages. So I took a server backup and started installing the upgrades manually.

This isn’t foreshadowing. At least yet. Let’s see in three months.

Out of 141 packages, 127 wanted an update, which is “quite many” (to put it lightly). Fortunately, I have made no promises about the availability of this site, so I could liberally reboot the server as much as I needed for the upgrades. I was hoping that installing these pending upgrades would clear some cache that would reduce the RAM usage of the apt services. And in the worst-case scenario, it wouldn’t fix the issue but I would get an up-to-date server, so upgrading seemed like a win-win.

In addition to the upgrades I also installed an improved DigitalOcean monitoring service. This actually revealed something that should have been quite obvious from the beginning. The new monitoring service monitored RAM usage (the old one did not), and I could see that the server was using 90% of the RAM when it was idle. In hindsight, checking the RAM usage and monitoring how it gets consumed should have been the very first step when investigating an OOM issue.

Needless to say, 90% RAM usage is not good. I guess this happens because I’m running this blog on a low-end instance that doesn’t have much RAM (I actually checked the minimum requirements of the OS, and the instance barely fills even those). However, before investigating the insufficient RAM, I wanted to first see if the upgrades would fix the original OOM issue. They did not.

So, the problem started to look like a case of insufficient RAM. To fix this kind of issue, there are usually two options: scale the server up or scale the services down. In other words, throw money at the problem, or try to optimize the server. Being a cheapskate, I chose the latter option. Besides, I usually work with embedded things, so “just adding more RAM” feels like cheating. And considering that on average I have about 20 daily visitors, beefing up the server seems like the wrong direction.

Fix Attempt 2: Optimizing RAM Usage

I used top to check the biggest memory consumers and found two RAM gluttons: MySQL and Apache. Both are required for the well-being and existence of WordPress (the platform of this blog), but perhaps they could be optimized. At least they used to work on this server before, so perhaps they could be configured to work once again.

In the case of MySQL, there was a single mysqld daemon consuming plenty of RAM. Some googling revealed that disabling the performance schema could help lower memory consumption. It seems to be a feature that measures the performance of the MySQL database server. Considering that I’m using WordPress and hope to write zero direct database queries myself, that seemed nonmandatory. Perhaps such stats could be useful when developing new software that uses MySQL. Disabling the performance schema lowered mysqld’s RAM consumption from 39% to 19%.
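
For reference, the change itself is a one-liner in the MySQL server configuration (the exact file path varies between installations):

# e.g. in /etc/mysql/mysql.conf.d/mysqld.cnf
[mysqld]
performance_schema = OFF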

In the case of Apache, there were ten worker threads, each consuming about 5%-8% of RAM. If my math is correct, in the last month I had about 0.00083 concurrent visitors on average. With that in mind, ten worker threads felt a bit excessive, and I scaled their amount down (see the config sketch below). I think it could be lowered even more, but I wanted to have enough workers in case there’s a sudden influx of readers.

Aaaany day now.
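
For the Apache side, the worker amounts live in the MPM configuration. Here’s an illustrative sketch for the prefork MPM (the module, paths, and values depend on the setup; these numbers aren’t necessarily the ones I ended up with):

# e.g. /etc/apache2/mods-available/mpm_prefork.conf
<IfModule mpm_prefork_module>
    StartServers             2
    MinSpareServers          2
    MaxSpareServers          4
    MaxRequestWorkers        5
    MaxConnectionsPerChild   1000
</IfModule>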

Conclusion

These actions took the idle RAM usage from 90% down to 60%. After this drop, I haven’t seen the OOM killer get activated in the past seven days, so I hope the issue is fixed. 60% is still a bit more than I’d like, but as long as the server stays stable and the performance doesn’t notably degrade I think that’s an acceptable percentage. Also, using the cheaper virtual machine saves me $6 a month!

The root cause for the increased RAM usage is still a bit of a mystery. I’m suspecting that installing WordPress plugins caused it because I was installing SEO plugins around the time the issue became more prevalent. If there’s one thing I’ve learnt from this, it’s that updates should be checked manually every now and then, and consumption of the system resources should be constantly monitored.

Open-source contribution: chdir for BusyBox

Coming soon to a Linux box near you:

Hopefully this doesn’t age like milk

So yeah, I managed to get a commit into one of the open-source projects I use on a daily basis: BusyBox. I guess many others use it too, either knowingly or unknowingly. BusyBox is a software suite providing plenty of Unix utilities in a single minimized executable. For example, when you’re using the dmesg command you don’t necessarily know whether the implementation comes from util-linux or BusyBox. But if you’re using OpenWRT, Alpine, or Yocto, you’re most likely using the BusyBox version.

The Problem

Because the BusyBox binary is minimized, the utilities it provides are often missing lesser-used features. As mentioned in the previous Aioli devblog, start-stop-daemon, for example, is missing the -d/--chdir option present in its full Debian counterpart. As mentioned in that text, I wrote a patch to add the feature. What I didn’t really mention is that I also submitted the patch to the BusyBox mailing list. I was hoping that it would get applied, and eventually it did!

start-stop-daemon is a program commonly used in SysVinit scripts to control the lifecycle of system services. It doesn’t only start and stop daemons; it can also reload them, check their status and… well, that’s primarily it. What the --chdir option does is change the working directory of the start-stop-daemon process before it launches the program it’s been assigned to start. This effectively changes the working directory of the process that will actually be started.
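
As a usage example, something like this now works with the BusyBox version as well (mydaemon is obviously a made-up program):

# Start a daemon with its working directory set to /var/lib/mydaemon
start-stop-daemon -S -d /var/lib/mydaemon -x /usr/sbin/mydaemon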

The Solution

The patch for this feature was quite straightforward. Mostly it consisted of adding a variable to hold the new working directory, inserting the new -d option into the opt list for the option parser, and editing the usage message. Then, if the new option flag was set, it was just a matter of calling xchdir() in libbb (BusyBox’s utility library) to change into the given directory (or die).
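
In code, the core of the change boils down to something like this (a paraphrased sketch, not the verbatim patch; the flag and variable names are illustrative):

/* After option parsing: change the working directory if -d/--chdir was given */
if (opts & OPT_CHDIR)
        xchdir(opt_chdir); /* libbb helper: chdir() or die */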

The less popular sequel to “Skate or Die” and “Ski or Die”.

In addition to this, I looked at how the tests for BusyBox work and wrote tests for the new flag. And cleaned up the TODO. In the end, the commit delta ended up being less than 60 lines. From what I’ve understood from the commit stats, start-stop-daemon got bloated by about 79 bytes as a result. So the next time you’re updating BusyBox and cursing the fact that it doesn’t fit into your root file system with 67 bytes of free space remaining, you know who to blame.

All in all, getting the patch merged was an interesting process. I could definitely contribute more to BusyBox if there are suitable issues. Something perhaps a bit less simple the next time. But whether there will be more commits or not, it’s wild to think that my code could be running in Linux boxes around the world. Although, I guess that would require the device vendors to update their devices to run the new (still unreleased) version of BusyBox, so I guess it’s not happening too soon.

Yocto Hardening: Kernel and GCC Configuration

Find all of the Yocto hardening texts from here!

Would you like to make your Yocto image a tiny bit harder to hack ‘n’ crack? Of course you would. This time we’re going to be doing two things to improve its security: hardening the Linux kernel, and setting the hardening flags for GCC. The motivation for these is quite obvious. The kernel is the privileged core of the system, so it had better be as hardened as possible. The GCC compilation flags, on the other hand, affect almost every C and C++ binary and library that gets compiled into the system. As you may know, over the years we’ve accumulated quite a few of those, so it’s a good idea to use any help the compiler can provide with hardening them.

On the other hand, who wouldn’t like to live a bit dangerously?

Kernel Configuration Hardening

The Linux kernel is the heart of the operating system and environment. As one can guess, configuring the kernel incorrectly or enabling everything “just in case” will in the best case lead to suboptimal performance and/or size, and in the worst case provide unnecessary attack surface. However, optimizing the configuration manually for size, speed, or safety is a massive undertaking. According to Linux from Scratch, there are almost 12,000 configuration switches in the kernel, so going through all of them isn’t really an option.

Fortunately, there are automatic kernel configuration checkers that can help guide the way. Alexander Popov’s Kernel Hardening Checker is one such tool, focusing on the safety of the kernel. It combines a few different security recommendations into one checker. The project’s README contains the list of recommendations it uses as the guideline for a safe configuration. The README also contains plenty of other useful information, like how to run the checker. Who would have guessed! For the sake of example, let’s go through the usage here as well.

Obtaining and Analyzing Kernel Hardening Information

The kernel-hardening-checker doesn’t only check the kernel configuration that defines the build-time hardening; it also checks the command-line and sysctl parameters for boot-time and runtime hardening. Here’s how you can obtain the info for each of the checks:

  • Kernel configuration: in Yocto, you can usually find this from ${STAGING_KERNEL_BUILDDIR}/.config, e.g. <build>/tmp/work-shared/<machine>/kernel-build-artifacts/.config
  • Command line parameters: run cat /proc/cmdline on the system to print the command line parameters
  • Sysctl parameters: run sysctl -a on the system to print the sysctl information

Once you’ve collected all the information you want to check, you can install and run the tool in a Python virtual environment like this:

python3 -m venv venv
source venv/bin/activate
pip install git+https://github.com/a13xp0p0v/kernel-hardening-checker
kernel-hardening-checker -c <path-to-config> -l <path-to-cmdline> -s <path-to-sysctl>

Note that you don’t have to perform all the checks if you don’t want to. The command will print out the report, most likely recommending plenty of fixes. Green text is better than red text. Note that not all of the recommendations necessarily apply to your system. However, at least disabling the unused features is usually a good idea because it reduces the attack surface and (possibly) optimizes the kernel.

To generate the config fragment that contains the recommended configuration, you can use the -g flag without the input files. As the README states, the configuration flags may have performance and/or size impacts on the kernel. This is listed as recommended reading about the performance impact.

GCC Hardening

Whether you like it or not, GCC is the default compiler in Yocto builds. Well, there exists meta-clang for building with clang, and as far as I know, the support is already in quite good shape, but that’s beside the point. Yocto has had hardening flags for GCC compilation for quite some time. To check these flags, you can run the following command:

bitbake <image-name> -e | grep ^SECURITY_CFLAGS=

How the security flags get applied to the actual build flags may vary between Yocto versions. In Kirkstone, SECURITY_CFLAGS gets added to the TARGET_CC_ARCH variable, which gets set to HOST_CC_ARCH, which finally gets added to the CC command. HOST_CC_ARCH also gets added to the CXX and CPP commands, so SECURITY_CFLAGS applies to C++ programs as well. bitbake -e is your friend when trying to figure out what gets set and where.

I don’t think any other meme can capture this feeling of madness

So, in addition to checking the SECURITY_CFLAGS, you most likely want to check the CC variable as well to see that the flags actually get added to the command that gets run:

# Note that the CC variable gets exported so grep is slightly different
bitbake <image-name> -e | grep "^export CC="

The flags are defined in the security_flags.inc file in Poky (the link goes to the Kirkstone version of the file). It also shows how to make package-specific exceptions with the pn-<package-name> override. The PIE (position-independent executable) flags are perhaps worth mentioning, as they’re a bit special. The compiler is built to create position-independent executables by default (seen in the GCCPIE variable), so the PIE flags are empty and not part of SECURITY_CFLAGS; they are explicitly disabled only when PIE is not wanted.

Extra Flags for GCC

Are the flags defined in security_flags.inc any good? Yes, they are, but they can also be expanded a bit. GCC will most likely get a new -fhardened flag in early 2024 that sets some options not present in Yocto’s security flags:

-D_FORTIFY_SOURCE=3 (or =2 for older glibcs) 
-D_GLIBCXX_ASSERTIONS 
-ftrivial-auto-var-init=pattern 
-fPIE -pie -Wl,-z,relro,-z,now
-fstack-protector-strong
-fstack-clash-protection
-fcf-protection=full (x86 GNU/Linux only)

Lines 2, 3, and 6 are not present in the Yocto flags. Those could be added with a SECURITY_CFLAGS:append in a suitable place if so desired. I had some trouble with the trivial-auto-var-init flag though; it seems to have been introduced in GCC 12.1, while Yocto Kirkstone is still using the 11 series. Most of the aforementioned flags are explained quite well in this Red Hat article. Considering that there’s plenty of overlap between SECURITY_CFLAGS and -fhardened, it may be that in future versions of Poky the security flags will contain just -fhardened (assuming the flag actually gets implemented).
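
For example, a line along these lines in a distro or local configuration would add the two missing flags that GCC 11 already understands (my own sketch, so test before trusting it):

SECURITY_CFLAGS:append = " -D_GLIBCXX_ASSERTIONS -fstack-clash-protection"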

All in all, assuming you have a fairly modern version of Yocto, this GCC hardening chapter consists mostly of just checking that you have the SECURITY_CFLAGS present in CC variable and adding a few flags. Note once again that using the hardening flags has its own performance hit, so if you are writing something really time- or resource-critical you need to find a suitable balance with the hardening and optimization levels.

In Closing

While this was quite a short text, and the GCC hardening chapter mostly consisted of two grep commands and silly memes, the Linux kernel hardening work is something that actually takes a long time to complete and verify. At least my simple check for core-image-minimal with systemd enabled resulted in 136 recommendation fails and 110 passes. Fixing them would most likely take quite a bit longer than writing this text. Perhaps not all of the issues need to be fixed, but deciding what’s an actual issue and what isn’t takes its own time as well. So good luck with that, and until next time!

As a reminder, if you liked this text and/or found it useful, you can find the whole Yocto hardening series from here.


The Movember blog series continues with this text! As usual, after reading this text I ask you to do something good. Good is a bit subjective, so you most likely know what’s something good that you can do. I’m going to eat a hamburger, change ventilation filters, and donate a bit to charity.

Aioli Audiostreamer: Music To The People

Check out the previous part of the Aioli Audiostreamer saga here. In case you don’t want to check it out, here’s a quick recap: I started a new project in which the goal is to stream audio from one Raspberry Pi to another over an IP network. Last time we got the streaming to work in theory, but the practical part of it was (and still is) missing. This time we’re not going to address that.

Instead of focusing on getting the streaming to work robustly and automatically, I chose to add a Bluetooth connection between the Raspberry Pi controller device and an external audio source. This way the system can stream something other than just the audio files present on the controller device. Here’s the graph, with a chunky red line showing the focus for today:

Diagram showing the Aioli Audiostreamer system overview

Unfortunately, this means confronting my old nemesis: the BlueZ stack. Or Bluetooth in general. Something about it rubs me the wrong way, and I’m not sure if it’s actually that bad. However, every time my headphones fail to connect to my phone I curse the whole protocol to the ninth circle of hell. Which happens every single morning. And still, it’s the best choice for this kind of project. But yeah, plenty of that coming up.

Picture of a robot pounding a "no fun allowed" sign to the ground

Bluetooth Connectivity

The first step is making our Raspberry Pi audio server advertise itself as a Bluetooth device wanting to receive audio: headphones, speakers, or anything along those lines. In theory, this sounds like a lot of work, but once again, the open-source community comes to the rescue. This bt-speaker project makes a Raspberry Pi act as a Bluetooth speaker, which is exactly what we want. The phone (or some other audio source) can connect to the bt-speaker daemon running on Raspberry Pi and stream audio to it. bt-speaker then outputs the received audio to the desired audio device.

The program required some tweaking for cross-compilation, and some things weren’t quite as generic as they could have been, causing some QA errors in Yocto. However, it mostly worked quite nicely out of the box. I guess because the bulk of the program is written in Python, there aren’t that many compilation issues to wrestle with. There was also one codec that needed to be compiled, and then there was the issue of figuring out the correct dependencies, but all in all, fairly simple stuff. The Yocto recipe can be found here.

The Actual Troubles

What actually took a long time was getting the start-up script working. I’m still sticking to SysVinit for simplicity, which means that I ended up using start-stop-daemon to launch the program. However, it turned out BusyBox’s implementation of start-stop-daemon was missing the -d/--chdir option. bt-speaker loads a codec from a relative path, meaning that the program fails at start-up because it’s launched from the root directory. Because I’m not much of a Python programmer, I chose to patch the feature into BusyBox instead of doing the sensible thing and installing the codec into the correct location and fixing bt-speaker. An open-source contribution to BusyBox coming soon, I hope.

Well, after that came the second problem: the Bluetooth chip in the Raspberry Pi wasn’t stable during start-up. The script starting BlueZ worked well, and bt-speaker launched successfully as well. However, after a few seconds, the Bluetooth device became undiscoverable. I tried to track down all the Bluetooth-related changes happening in the system during boot: changes in the Bluetooth device information and BlueZ status, reading syslog & dmesg, but no luck. The Bluetooth chip just reset a few seconds after BlueZ launched. So I did what any sane person would do: power-cycled the BT chip as a part of the start-up, added a “reasonable amount” of sleep, and moved on with my life.

The Less Actual Troubles

After getting the thing to start automatically during boot, there was still a small problem. The problems never end with Bluetooth, do they? Well, even better, there are two problems. First, for some reason, my phone says that it has trouble connecting to this Frankensteinian BlueZ device. Streaming music from Spotify works nicely though, and even the volume control behaves as expected. So all in all, this sounds like a very typical Bluetooth device already: nothing works, except that it works, except when it doesn’t. I’m not yet sure if this is actually a problem for anyone else except my phone.

Screenshot from a mobile phone showing connection issue with BlueZ 5.66 device
I just wanted to flex my phone and watch with this screenshot. The ironic thing here is that if I “turn device off & back on” as suggested, it’ll be forever unable to connect again unless the BlueZ cache is cleared on Raspberry Pi. That may be the fault of dodgy Bluetooth code and not the protocol itself though.

The bigger issue is that we don’t want the audio to be output from the Raspberry Pi we’re connected to. Instead, we want to stream the audio to the other Raspberry Pis in the LAN and have those output the audio. GStreamer has an alsasrc source that can take input from an ALSA device and work with that. However, we have a bit of a mismatch here: GStreamer wants an input device to receive the audio from (e.g. a microphone), but bt-speaker generates audio that goes to an output device (e.g. a speaker).

Loop Devices to the Rescue!

A loopback device is a virtual audio device that redirects audio from a virtual output device to a corresponding virtual input device (or vice versa). This means that we can have a virtual “microphone” that outputs the audio bt-speaker has received through Bluetooth. Maybe my explanation just made it worse; the idea is quite simple. This blog post explains the functionality quite well.

To explain more: probing the snd-aloop module creates two loopback sound cards, both for input and output (four cards in total). The virtual cards have two devices, and each device has eight subdevices, so quite a lot of devices get created. These devices are special because the output of an output device gets directed to the input of the corresponding input device. For example, if I play music with aplay to card 1, device 0, subdevice 0, the same music can be captured from input card 1, device 1, subdevice 0. Notice how the device number is flipped: output to input, and vice versa, as I’ve been repeating for two chapters now.
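
The flipping is easy to try out on the command line. Something along these lines, assuming the first loopback card is card 1 (the numbers of course vary between systems):

# Play a file into the loopback output device (card 1, device 0, subdevice 0)...
aplay -D hw:1,0,0 music.wav &
# ...and capture the same audio from the flipped device number
arecord -D hw:1,1,0 -f cd captured.wav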

Meme saying "snd-aloop transforms input to an out and vice versa" in French
Google Translate don’t fail me now.

With this method, when bt-speaker uses aplay to output the audio it receives, we can define the output device to be a loopback device. Then, GStreamer can use the corresponding loopback input device to receive the Bluetooth audio and pass it to the LAN. A bit of patching to the bt-speaker, and something like this seems to do the trick:

# Play command that bt-speaker uses when it receives audio
aplay -D hw:2,0,1 -f cd -

# Streaming command to send audio to 192.168.1.182
gst-launch-1.0 alsasrc device=hw:2,1,1 ! audioconvert ! audioresample ! rtpL24pay ! udpsink host=192.168.1.182 port=5001
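
For the sake of completeness, the receiving end could then look something like this. Note that this one is an assumption of mine, and the caps have to match what rtpL24pay actually sends (44.1 kHz stereo here):

# Hypothetical receiving command on 192.168.1.182
gst-launch-1.0 udpsrc port=5001 caps="application/x-rtp,media=audio,clock-rate=44100,encoding-name=L24,channels=2" ! rtpL24depay ! audioconvert ! audioresample ! autoaudiosink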

Is This a Good Idea?

I’m not sure. I think it would be possible for the bt-speaker daemon to launch the GStreamer directly once the Bluetooth connection has been initialized. This approach would skip aplay altogether and wouldn’t require the loop devices. However, keeping the Bluetooth and networking separate should keep the system simpler. Both processes do their own thing without knowledge of each other. bt-speaker can output audio without caring if anyone listens to it. On the other end, GStreamer can stream whatever it happens to receive through the loopback device.

This also allows kicking the bt-speaker and GStreamer individually when they eventually and inevitably start misbehaving. The drawback is that GStreamer streams silence if nothing is received from Bluetooth, but I think that’s an acceptable weakness for now. After all, I’ve paid for a WLAN router to route some bits, so I’m going to route them bits, even if they’re all zeros.

I noticed that there actually is a BlueZ plugin for GStreamer. However, it’s labelled as one of the “bad” plugins, and dabbling with such dark magic sounds like a bad idea. If something is labelled “bad” even in the official documentation it’s usually better to avoid it. It doesn’t necessarily mean that the quality itself is bad as the label may also mean a lack of testing or maintenance, but still.

Movie poster of Bad Boys
TBH I don’t exactly know what WASAPI is, but after a quick Google search, I’m not sure I even want to know.

Closing Words

This text was a bit shorter than anticipated, but some things in life are unexpectedly short. Next time we’ll get rid of the static WLAN configuration. Instead, we’ll create a mechanism for passing the SSID and password during run-time. We’ll be doing that mostly because I already started working on that feature. Now I have DHCP servers running amok in my home LAN causing trouble and slowing down development.

One question still remains: does this system contain any code that I have written? Not really at the moment. But in my experience, that’s the story for most of the embedded Linux projects: find half a dozen somewhat working pieces of software and glue them together with some scripts. Until next time!


This blog text is a part of my Movember 2023 series. If you found this text useful, I’d ask you to do “something good”. That doesn’t necessarily mean shoving your money to charities or volunteering weekends away (although those are good ideas), it can be something as simple as asking a family member or colleague how they’re doing. Or it can mean selling your earthly possessions away and becoming a monk. It’s really up to you.

Yocto Hardening: Firewalls, Part 2: firewalld

Find all of the Yocto hardening texts from here!

People often ask me two things. The first question is “Why did you choose to write this firewall text in two parts?”. The answer to that is I actually started writing this over a year ago, the scope of the text swelled like crazy, and in the end, to get something published I chose to write the thing in two parts to have a better focus. The second question that I get asked is “Do you often have imaginary discussions with imaginary people in your head?”. To that, the answer is yes.

Comic about imaginary conversations
I prepare for daily stand-up by mentally going through conversations where I get deservedly fired because I didn’t close three tickets the previous day.

But without further ado, let’s continue where we left off in part 1. As promised, let’s take a look at another way of setting up and configuring a firewall: firewalld.

One Step Forward, Two Steps Back: Configuring Kernel (Again)

To get firewalld running, we need a few more kernel configuration items enabled. How many? Well, I spent a few days trying to figure out the minimal possible configuration for running some simple firewalld commands, all without success. firewalld is not too helpful with its error messages, as it basically says “something is missing” without really specifying what that special something is.

Meme about Linux kernel modules
I’ve always liked Lionel’s ‘stash, but I feel like he’s looking extra fine in this pic.

After banging my head against the wall for a few days, I attempted to enable every single Netfilter and nftables configuration item (because firewalld uses nftables), but still no success. After some further searching, I found this blog post that contained a kernel configuration that had worked for its author. I combined that with my desperate efforts, and the resulting config actually worked! In the end, my full configuration fragment looked like this:

As you can see, my shotgun approach to configuration wasn’t 100% accurate, because there were a few NF and NFT configuration items I had missed. Is everything here absolutely necessary? Most likely not. There actually are some warnings about unused config options for Linux kernel version 5.15, but I’m afraid to touch the configuration at this point.

I spent some more time optimizing this and trying to create the “minimal config”, just to realize that I was minimizing something that can’t really be minimized: the “minimal config” depends entirely on what your actual firewall configuration contains. After this realization, I decided that it’s better to have too much than too little, and found inner peace. You can treat the fragment as a starting point from which you can remove useless-sounding options if you feel like it.

While we’re still talking about the build side of things, remember to add kernel-modules to your IMAGE_INSTALL as instructed in part 1. Or, if you chose to go the RRECOMMENDS route, remember to add all 36 modules to the dependencies.

Meme about saving disk space by removing kernel modules
To be fair, I’ve been in a situation where a few KBs of root file system content have made the difference between a success and a catastrophic failure.

The Hard Firewall – firewalld

Now that I’ve spent three chapters basically apologizing for my Linux config, it’s time to get to the actual firewall stuff. The “hard” way of setting up a firewall consists of setting up firewalld. This is in my opinion suitable if you have multiple interfaces with changing configurations, possibly edited by a human user, meaning that things change (and usually in a more complex direction).

firewalld is a firewall management tool designed for Linux distributions, providing a dynamic interface for network security (another sentence from ChatGPT, thank you AI overlords). Or, to put it into more understandable words, it’s a front-end for nftables providing a D-Bus interface for configuring the firewall. The nice thing about firewalld is that it operates with zones & services, allowing different network interfaces to operate in different network zones with different available services, all of which can be edited at run-time using D-Bus, creating mind-boggling, evolving, and Lovecraftian configurations. This is the nice thing, so you can imagine what the not-so-nice things are.

Now that we are somewhat aware of what we’re getting into, we can add firewalld to IMAGE_INSTALL:

IMAGE_INSTALL:append = " firewalld"

One thing worth noting is that firewalld is not part of the Poky repository, but it comes as a part of meta-openembedded. Most likely this is not a problem, because every project I’ve worked on has meta-openembedded, but it’s worth knowing nevertheless.

Now that everything is in place, we can start hammering firewalld into the desired shape. To edit the configuration, we can use the firewall-cmd tool shipped with the firewalld package. firewall-cmd (among some other firewall-* tools) uses the D-Bus interface to command firewalld, which in turn commands nftables, which finally sets up Netfilter in the kernel. This picture illustrates it all nicely:

Graph showing firewalld internal structure
The picture originates from here:
https://firewalld.org/2018/07/nftables-backend

So, about the actual configuration process: each network interface can be assigned a zone, and the zones can enable a certain set of services. Here’s a short command reference for how to add a custom zone, add a custom service, add the new service to the new zone, and set the new zone to a network interface:

# List available zones
firewall-cmd --get-zones
# These zones are also listed in /usr/lib/firewalld/zones/
# and /etc/firewalld/zones/

# Get the default zone with the following command
# (This is usually public zone)
firewall-cmd --get-default-zone

# This'll return "no zone" if nothing is set,
# meaning that the default zone will be used
firewall-cmd --get-zone-of-interface=eth0

# Create a custom zone blocking everything
firewall-cmd --permanent --new-zone=test_zone
firewall-cmd --permanent --zone=test_zone --set-target=DROP
# If permanent option is not used, changes will be lost on reload.
# This applies to pretty much all of the commands

# Reload the firewalld to make the new zone visible
firewall-cmd --reload

# List services, and add a custom service
firewall-cmd --list-services
firewall-cmd --permanent --new-service=my-custom-service
firewall-cmd --permanent --service=my-custom-service --add-port=22/tcp
firewall-cmd --reload

# Add the service to the zone
firewall-cmd --permanent --zone=test_zone --add-service=my-custom-service

# Set zone
firewall-cmd --zone=home --change-interface=eth0
# Add --permanent flag to make the change, well, permanent

You can (and should) check between the commands how the ports appear to the outside world by running nmap against the device, to get a better grasp of what the different commands actually do. The changes are not always as immediate and intuitive as one would hope. A good rule of thumb is that if the settings survive --reload, they’re set up properly.

How does this all work with file-based configuration and Yocto? You could create the zone and service files for your system using firewall-cmd, install them during build time, and change the default zone in the configuration file /etc/firewalld/firewalld.conf to have the rules active by default. This is similar to the nftables approach, but it’s a bit easier to customize, for example from a graphical user interface or scripts, and doesn’t require custom start-up scripts. An example of adding some zones and services, and a patch setting the default zone, can be found in the meta-firewall-examples repo created for this blog text.
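
The default zone change itself is a one-line edit in that file:

# /etc/firewalld/firewalld.conf
DefaultZone=test_zone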

Comparison and Closing Words

So, now we have two options for firewalling. Which one should be used? Well, I’d pick nftables, use it as long as possible, and move to firewalld only if it becomes necessary later on. That approach is suitable when the target is quite simple and lightweight, at least at the beginning of the device’s lifespan. On the other hand, if it’s already known from the get-go that the device is going to be a full-fledged configurable router, it’d be better to pick firewalld right off the bat.

Meme about differences of nftables and firewalld

What about iptables and ufw then? Both have bitbake recipes and can definitely be installed, and I suppose they even work. iptables is a bit of legacy software these days though, so maybe avoid that one. It should be possible to use the iptables syntax with an nftables backend if that’s what you want. ufw in Yocto uses iptables by default, but it should be possible to use nftables as the backend for ufw as well. So, as usual with open-source projects, everything is possible with a bit of effort. And as you can guess from the amount of weasel words in this chapter, I don’t really know these two that well; I just took a quick glance at their recipes.

Note that the strength of the firewall depends also on how protected the rulesets are. If the rules are in a plaintext file that everyone can write to, it’s safe to assume that everyone will do so at some point. So give a thought to who has the access and ability to edit the rules. Also, it may be worthwhile to consider some tampering detection for the rule files, but something like that is worth a text of its own.

In closing, I hope this helped you to set up your firewall. There are plenty of ways to do it, and this two-parter provides two options. It’s important to check with external tooling that the firewall actually works as it should and that there are no unwanted surprises. Depending on the type of device you want to configure, you can choose the simple way or the configurable way. I’d recommend sticking to the simple solution if it’s possible. And finally, these are just my suggestions. I don’t claim to know anything, except that I know nothing, so you should check your system’s overall security with someone who’s an actual pro.

Yocto Hardening: Firewalls, Part 1: nftables

Find all of the Yocto hardening texts from here!

The eternal task of making the Yocto Linux build an impenetrable fortress continues. Next, we’ll look into setting up a firewall in two different ways: the easy way with nftables, and in part two, the easily configurable way with firewalld. I must warn you beforehand that this text contains a bit of resentment towards kernel configuration. I also used ChatGPT as a tool for writing a part of this text, but I’ll point out the section I asked it to write.

So yeah, firewall. The system keeping the barbarians out of our Linux empire. Also one of the most bad-ass names for a program, at least if taken literally. The reality is a bit less exciting: it turns out “firewall” is originally a technical construction term. In the magical world of computers, a firewall is a piece of software used to allow or deny network traffic coming in and out of a machine. It’s mighty useful for improving the overall security of the system, as it can prevent unwanted remote connections and such.

Step 0 – Configuring Kernel

The first step of installing and configuring a firewall on a Yocto image is ensuring that the kernel is configured correctly. What, do you think you can just start configuring the firewall and blocking your nefarious enemies just like that? Pfft, get out with that casual attitude (please don’t).

To get some basic traffic filtering with nftables done, we need to ensure that Netfilter is enabled in the kernel, along with its nftables support, and that some nftables modules are installed as well. It’s all a bit confusing, but to summarize: Netfilter is the kernel framework for packet filtering, and nftables is the front-end for Netfilter that can be used from user space. nftables can then be built with or without modules to extend its functionality. The following kernel configuration fragment should do the trick:
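
# A sketch of the kind of options needed (not the complete fragment;
# the exact list and the =y/=m split depend on the rules you plan to use):
CONFIG_NETFILTER=y
CONFIG_NF_CONNTRACK=m
CONFIG_NF_TABLES=m
CONFIG_NF_TABLES_INET=y
CONFIG_NFT_CT=m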

Getting this config fragment working was where things got problematic for me. I thought that I’d just slap =y onto everything and be done with it, no need to install and load modules. Well, the configuration process of the kernel consists of merging the defconfig, kernel features, and configuration fragments, and as it turns out, the kernel feature enabling Netfilter overrode some parts of my config fragment, changing =y to =m. For some reason bitbake didn’t give a warning about this. I’m quite sure it used to.

But where is this mythical Netfilter kernel feature located? The fragment above enables only nftables-related things, so how does Netfilter get enabled? Well, the Kirkstone release of Poky has this line that enables the feature by default in the kernel, and the related configuration fragment looks like this. The multiple configs lead to funny situations where you try to remember the difference between CONFIG_NF_CONNTRACK and CONFIG_NFT_CT, where they are defined, and to what values they are set.

Ensure that you’re also installing the kernel modules to the image by adding the following line somewhere in the image configuration:

IMAGE_INSTALL:append = " kernel-modules"

No use enabling the modules if they’re not installed; by default, nothing installs them, so they won’t get added to the final image. At least to the qemux86-64 core-image-minimal image they won’t be installed by default, guess if I found that out the hard way as well. If you need more fine-grained control over the modules that get installed, you can add them as a runtime recommendation to some package:

RRECOMMENDS:${PN} += "kernel-module-nf-tables kernel-module-nf-conntrack kernel-module-nft-ct ..."

Most likely the nftables package is the best place for this. However, I’d advise against this approach if possible, because there will be a lot of modules later on. If you choose to go down this route, the naming format of the modules is fortunately quite self-explanatory: CONFIG_NF_TABLES being built as a module generates the kernel-module-nf-tables package, CONFIG_NFT_CT results in kernel-module-nft-ct being built, etc., so this should be easily scriptable (although painful to maintain).

The Easy Firewall – nftables

The easy way of getting some firewalling done involves installing nftables, adding some rules to it, and loading them as a part of the start-up. This is suitable if you have a few interfaces with a simple configuration that doesn’t change often, preferably ever.

The actual first step of getting the firewall up and running is adding the nftables package to the image, if you don’t already have it installed. This can be easily checked by running the nft command. If it fails, add the package to the image configuration:

IMAGE_INSTALL:append = " nftables"

If you see the following types of errors when running the nft command, it means that your kernel configuration is missing either the nftables support (the first error) or some nftables module (the second error):

root@qemux86-64:~# nft
../../nftables-1.0.2/src/mnl.c:60: Unable to initialize Netlink socket: Protocol not supported
root@qemux86-64:~# 
root@qemux86-64:~# nft -f /tmp/test.conf
/tmp/test2:32:9-16: Error: Could not process rule: No such file or directory
        ct state vmap { established : accept, related : accept, invalid : drop } 
        ^^^^^^^^
root@qemux86-64:~# 

Quite often the firewall in Yocto is empty by default. The active rules of the firewall can be checked by running the nft list ruleset command to print the current ruleset. If nothing is output, no rules are present. If something gets output, there are some rules present. Here’s a short reference of commands for setting some basic rules and listing them on the command line.

# List ruleset
nft list ruleset

# Add a table named filter for IP traffic 
nft add table inet filter

# Add an input chain that drops traffic by default
nft add chain inet filter input \{ type filter hook input priority 0 \; policy drop \}

# Add rule to allow traffic to port 22
nft add rule inet filter input tcp dport \{ 22 \} accept

# List tables and rules in a table
nft list tables
nft list table inet filter

In addition to listing the tables and rulesets from inside the device, you should also try a black-box approach. Running nmap against your device is a good way to check how the outside world sees it. In this example, I have added the Dropbear SSH server to the image. Because the server is listening on port 22 and there are no rules defined in the firewall, the port is seen as open:

esa@ubuntu:~$ nmap 192.168.7.2
Starting Nmap 7.80 ( https://nmap.org ) at 2023-07-22 16:14 UTC
Nmap scan report for 192.168.7.2
Host is up (0.0024s latency).
Not shown: 999 closed ports
PORT   STATE SERVICE
22/tcp open  ssh
Nmap done: 1 IP address (1 host up) scanned in 0.06 seconds

The command-line commands are useful for editing the configuration at run-time, but for the rest of the text we’re going to focus on a file-based configuration that is set during the Yocto build. The configurations presented here are mostly based on the “simple ruleset for a server” in the nftables wiki. It contains some useful extra things not presented here, like a rule for the loopback interface, differentiation between IPv4 and IPv6 traffic, and logging, so I recommend checking it out.

Usually, a good way to start creating a firewall ruleset is to block everything by default, and then start poking holes as needed. To block everything, you can write a configuration file with the following content:

flush ruleset

table inet firewall {
    chain inbound {
        type filter hook input priority 0; policy drop;
    }

    chain forward {
        type filter hook forward priority 0; policy drop;
    }

    chain output {
        type filter hook output priority 0; policy drop;
    }
}

This config creates one rule table named firewall for inet traffic. The table contains three chains: inbound, forward and output. You can load the ruleset with nft -f <config_file> command. Once you’ve blocked all traffic, you can give the nmap command another go. As you can see, blocking is really effective. nmap actually considers the device to be non-existent at this point:

esa@ubuntu:~$ nmap 192.168.7.2
Starting Nmap 7.80 ( https://nmap.org ) at 2023-07-22 16:18 UTC
Note: Host seems down. If it is really up, but blocking our ping probes, try -Pn
Nmap done: 1 IP address (0 hosts up) scanned in 3.03 seconds

Trying to ping the outside world from the device itself also fails because the output is blocked. In theory, allowing outgoing traffic is not the worst idea; many sample configurations actually allow it. But if you know the traffic that’s going to go out and the ports that will be used, there’s in my opinion no sense in allowing all of the unnecessary traffic. That’s just an attitude that will lead to botnets. But if you want or need to allow all outgoing traffic, you can just drop the output chain from the config. By default, everything will be accepted.

After nothing goes in or out, it’s time to start allowing some traffic. For example, if you want to allow SSH traffic from the barbarians to port 22 (generally a Bad Idea, but for the sake of an example), you can use the following ruleset:

flush ruleset

table inet firewall {
    chain inbound {
        type filter hook input priority 0; policy drop;

        # Allow SSH on port TCP/22 for IPv4 and IPv6.
        tcp dport { 22 } accept
    }

    chain forward {
        type filter hook forward priority 0; policy drop;
    }

    chain output {
        type filter hook output priority 0; policy drop;

        # Allow established and related traffic
        ct state vmap { established : accept, related : accept, invalid : drop } 
    }
}

If you want to define multiple allowed destination ports, separate them using commas, like { 22, 80, 443 }. If you choose not to block outgoing connections, you obviously don’t need the output chain presented here.

On the other hand, if there’s a process in the Yocto image that wants to contact the outside world, you could consider adding a ruleset like below:

flush ruleset

table inet firewall {
    chain inbound {
        type filter hook input priority 0; policy drop;

        # Allow established and related traffic
        ct state vmap { established : accept, related : accept, invalid : drop } 
    }

    chain forward {
        type filter hook forward priority 0; policy drop;
    }

    chain output {
        type filter hook output priority 0; policy drop;

        # You can check the kernel's ephemeral port range
        # from /proc/sys/net/ipv4/ip_local_port_range
        tcp sport 32768-60999 accept
    }
}

Note how we need to use a port range for the output. Outgoing connections usually pick a port from an ephemeral range, meaning that you may not know beforehand the exact source port that needs to be allowed. Having a fixed source port for outgoing traffic would make the device more susceptible to port scanning and would prevent multiple simultaneous connections from the same service (because the port is already in use), but if you do choose to create a service that uses a static source port, you can use a single port number in the rule.

You can combine the rules in the chains as required by your system and services. If you want to enable the ping command (for IPv4), you can add the following piece of configuration to the input chain (note that this is just a single line of configuration, not a full config):

icmp type echo-request limit rate 5/second accept

In the end, creating a firewall ruleset is fairly simple: block everything and allow only the necessary things. The difficulty comes from keeping the rules up-to-date when something changes. However, when developing an embedded device with Yocto, you generally should have a well-defined list of allowed ports and traffic (as opposed to a general-purpose computer used by a living human, where all sorts of traffic and configs may come and go at the user's whim). If that's not the case, part 2, where we work with firewalld, may be useful to you.

So, now we know how to write the perfect configuration file. But how do we add this to a Yocto build? You can create a bbappend file for nftables in your own meta-layer. This append then installs the configuration file, because nftables does not install any configuration by default. An example of how to add a configuration to nftables can be found in the meta-firewall-examples repository I made for this blog post.
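To give an idea of the shape of such an append, here's a minimal sketch of an nftables_%.bbappend (the config file name firewall.conf and the destination path are my assumptions here; the real example lives in the repository):

# Look for files in the bbappend's own files/ directory
FILESEXTRAPATHS:prepend := "${THISDIR}/files:"

SRC_URI += "file://firewall.conf"

do_install:append() {
    install -d ${D}${sysconfdir}
    # Root-only permissions, the firewall rules are nobody else's business
    install -m 0600 ${WORKDIR}/firewall.conf ${D}${sysconfdir}/nftables.conf
}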

You should also note that the nftables package doesn't contain any mechanism to activate the rules during start-up. Therefore you should append the nftables recipe so that it adds an init.d script or a service file that activates the rules from the configuration file before networking is started. You can find an example of this as well in the same meta-firewall-examples repository.
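For illustration, a systemd service that loads the rules could look roughly like this (a sketch matching the config path assumed above, not the exact file from the repository):

[Unit]
Description=Load nftables firewall ruleset
Before=network-pre.target
Wants=network-pre.target

[Service]
Type=oneshot
ExecStart=/usr/sbin/nft -f /etc/nftables.conf
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target

The Before=network-pre.target ordering is what gets the rules in place before the network interfaces come up.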

Note that if you make runtime edits to the configuration with the nft command, the changes will be lost when the device is reset. To make sure the changes stick, you can run the following command to overwrite the default config with the new, enhanced configuration:

nft list ruleset > /path/to/your/configuration-file

And one more thing: if you face No such file or directory errors when trying more exotic configurations, you are most likely missing some kernel configuration options. In part 2, we'll be enabling pretty much every feature of Netfilter, so that'll (most likely) be helpful. On that cliffhanger, we eagerly await the next chapter in this ever-evolving journey of discovery and innovation (I asked Chat-GPT to write that sentence).

How to Turn Laptop Webcam into Digital Camera

AI generated image of a camera

In my last blog post, I said that I wanted to try finishing a project instead of starting a new one. Let's forget that and kick off a new project. In my defense, I already started working on this idea a few months ago, and now I am (mostly) finishing it.

Back then I thought that it would be fun to have a retro-style camera. An analog camera would be neat, but developing film in this day and age is a bit of a pain. An early 2000s digital camera could be an option, but paying for one of those would be a bit of a waste. I mean, they cost maybe 10€, but it's more a matter of principle. A Polaroid camera would tick all the boxes, but that idea didn't come to my mind.

Then I thought why not build a digital camera? Well, for starters, it takes some effort that’s definitely worth more than 10€. Also, it takes some materials that are worth more than 10€. So yeah, all around a silly idea. Let’s do it.

Camera Lens

When I say that I'm going to build a camera, it doesn't mean building a lens. Lenses are quite precise devices, and it would be a bold claim to say that I possess the skills or facilities to make one. Instead, I had to salvage one from somewhere. Fortunately, every laptop contains a camera lens in the form of a webcam, and I have some spare laptops lying around. An additional benefit of using a laptop webcam is that it contains a controller that can usually be interfaced with over USB.

So, it’s time to get the fine-tuning hammer and give my ancient laptop a light tap with it:

Image of a disassembled laptop
*bonk*

This process of course varies from laptop to laptop, but usually searching Google for "<laptop name> disassembly" is a good starting point. There is always some small repair shop that has made a disassembly video for your exact laptop. After the webcam is within reach, it's just a matter of cutting the wires and terminating them. It's not a good idea to have unterminated wires inside a laptop, at least if it's still going to be used.

So now I have the camera module for my camera; I just need to figure out how to connect it to anything. The wires didn't seem to follow any standard coloring scheme, and after some googling, I couldn't find any standard order for the wires either. However, there was this useful information printed on the silkscreen underneath a sticker on the back:

Image of a laptop webcam module with the wire labels printed on a silkscreen
I wonder what these ancient hieroglyphs mean

The next thing to do is solder the power wire of the webcam to the power of a USB cable, ground to ground, D+ to D+, and D- to D-, right? Wrong (maybe). I'm not sure if someone mislabeled the wires or if I just messed them up, but soldering D+ of the webcam to D+ of the USB cable (and the same for D-) resulted in the following errors in Linux when plugging in the device:

kernel: usb 4-1: new low-speed USB device number 14 using ohci-pci
kernel: usb 4-1: device descriptor read/64, error -62
kernel: usb 4-1: device descriptor read/64, error -62
kernel: usb 4-1: new low-speed USB device number 15 using ohci-pci
kernel: usb 4-1: device descriptor read/64, error -62
kernel: usb 4-1: device descriptor read/64, error -62
kernel: usb usb4-port1: attempt power cycle
kernel: usb 4-1: new low-speed USB device number 16 using ohci-pci
kernel: usb 4-1: device not accepting address 16, error -62
kernel: usb 4-1: new low-speed USB device number 17 using ohci-pci
kernel: usb 4-1: device not accepting address 17, error -62
kernel: usb usb4-port1: unable to enumerate USB device

This is where I gave up a few months ago on my first try. I was too sad to go on.

Meme that says "my disappointment is immeasurable and my day is ruined"

The Project: Rebirth

Fast forward two months. I saw an ad for a contest. The competition was looking for something that could be described as an "overengineered DiWhy" project, but maybe in a bit more positive sense. After seeing the ad and realizing that a 3D printer was available as a reward, I knew what I had to do: finish the digital camera.

So I dug up the abandoned project, soldered D+ to D- and D- to D+, plugged the thing in, and was pleasantly surprised that it now actually worked and that I hadn't caused the heat death of the device with careless soldering. The final schematic ended up looking like this:

Schematic of the webcam module connected to a USB port
Note that D+ and D- are subjective truth in this reality. Mixing them up shouldn’t break the device but it won’t work either.

Two 1N4001 diodes are used to drop the voltage from the USB's 5V closer to the 3.3V expected by the camera module: each diode drops roughly 0.7V, so the module sees about 3.6V. After reading some blog posts about similar projects, it seems other people have had the same problem with D+ and D-, so it may be a common point of confusion.

Picture of a webcam soldered to a USB cable
The quality of the connections has nothing to do with the possible communication issues.

The next step was coming up with a plan for what to actually do with this thing. I ended up using a Raspberry Pi powered by a power bank and creating a small control/status board using GPIO pins of Raspberry Pi.

Control/Status Board

I put together all my entry-level electronics knowledge and tried to remember which way the LED should be connected. I failed. After googling some basic Arduino tutorials for children, I came up with this kind of schematic.

Schematic of a breadboard containing a button and two LEDs connected to a Raspberry Pi
LEDs are connected to GPIO pins 23 and 24, button is connected to 22.

To be honest, I don’t quite understand electronics as well as I would want to, and I’m not 100% sure if the resistors are the correct size. At least the board works and doesn’t produce audio-visual bang-smoke output, so I guess it’s good? Or maybe I’m slowly but steadily causing some irreparable damage that will manifest itself in surprising and slightly disappointing ways? Only time will tell.
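For the record, the back-of-the-envelope check goes like this: with the Pi's 3.3V GPIO and a typical red LED dropping about 2V, a 330 Ω series resistor (an assumed value, I didn't write down what I actually used) passes (3.3 − 2) / 330 ≈ 4 mA, comfortably below the roughly 16 mA a single GPIO pin is specified to source.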

Image of a breadboard with a button and two LEDs
Can a button be anything besides big and red? I think not.

The purpose of the button is quite obvious: press it to take a picture. The LEDs give an indication of the system’s status. One LED turns on when the device is listening to button presses and ready to take pictures. The second one turns on when a picture is being taken (it’s a surprisingly long process).

I’ve Got the Power

To power up the Raspberry Pi I used a power bank with enough juice for the job: an Anker Powercore II 10000 that outputs 5V at 3A, which is within the recommended limits. In addition, I "soldered" a switch to the power wire of the USB cable to have a rudimentary power button. "Soldered" is in quotes because I tried new lead-free solder for the first time, and nothing really stuck to anything despite maximum heat and effort, so it was closer to suffering than soldering.

Image of a USB cable with a power switch
I haven't tested whether the data lines still work, but if I had to guess, I'd say no. I'm surprised that even the power line works.

Software

This type of device could use two software components:

  1. A program that handles GPIO input and output
  2. A program that captures the camera frame and outputs the image

I wanted to write a program that would actually communicate with the camera module, but because I had to finish the camera in time for the contest, I opted for an existing solution. fswebcam is a command-line program that captures frames from a webcam, which is exactly what we're going to do.
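The capture itself boils down to a one-liner along these lines (the device node, resolution, and output path here are illustrative, not the exact ones the project uses):

# Grab a single frame from the webcam and save it as a JPEG
fswebcam -d /dev/video0 -r 640x480 --jpeg 85 /mnt/photos/photo.jpg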

The first piece of code I wrote is camera-gpio. It turns on the standby LED, polls the button state, and turns on another LED while an image is being taken. If the button is pressed, camera-gpio launches the second program that actually takes the picture. Quite self-explanatory. It's a C program that's built with CMake, because I wanted to refresh my memory on how those work.
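The actual implementation is C, but the control flow is roughly the following (sketched here in shell with the sysfs GPIO interface; the pin numbers follow the schematic above and the camera-handler call is simplified):

# Set up the pins (BCM numbering)
echo 22 > /sys/class/gpio/export   # button
echo 23 > /sys/class/gpio/export   # standby LED
echo 24 > /sys/class/gpio/export   # "taking a picture" LED
echo in  > /sys/class/gpio/gpio22/direction
echo out > /sys/class/gpio/gpio23/direction
echo out > /sys/class/gpio/gpio24/direction

echo 1 > /sys/class/gpio/gpio23/value          # ready to take pictures
while true; do
    if [ "$(cat /sys/class/gpio/gpio22/value)" = "1" ]; then
        echo 1 > /sys/class/gpio/gpio24/value  # busy LED on
        camera-handler                         # take the actual photo
        echo 0 > /sys/class/gpio/gpio24/value
    fi
    sleep 0.1
done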

The second code repo is camera-handler, and it's for the program that takes the photo. Currently, it's pretty much just a wrapper for fswebcam. camera-handler also generates filenames for the images from the system time, which is a bit useless because the board doesn't have an RTC to store the time or NTP to sync it. But you have to consider the opportunities, all the things it could be! It's a C++ program that's built with autotools, because I wanted to refresh my memory on how those work.

If you’ve been reading this blog before, you may know what comes next: Yocto build. In addition to these code repos, I made a meta-layer named meta-camera with the required files to build these into a Linux firmware image ready to be flashed into an SD card.

Final Schematic and Photos

Once all the pieces were ready, all that was left to do was plug in the thingys, boot up the thingy, and hope for the best.

Schematic of the Raspberry Pi camera

Hope is a powerful thing. It may lead to a semi-functional thingy that takes pictures. Here’s a picture of the final build:

Image showing the Raspberry Pi camera
Doesn’t look quite as good as it did in the schematic. Just don’t take it with you on an airplane.

Beautiful. I posted this picture earlier on LinkedIn and considered sorting out the wires. I decided to keep them as they are because, to be honest, I can't be bothered.

How do the actual pictures look? Well, I’m going to be polite and say “quite retro”, which was the original goal of the project. See for yourself:

Image taken with the Raspberry Pi showing two decorative pigs
“Coffee break of the piggies threatened by a long-legged spider”
Image taken with the Raspberry Pi showing a forest
“Sunrise in Suburbia (picture taken at 13:02)”
Image taken with the Raspberry Pi showing a close up of a leaf
“Oh wow didn’t expect this close-up to actually look tolerable”

At least the file sizes are small.

Future Work

Honestly, I think this project could still use plenty of work. Here’s the list of things that could be fixed or improved:

  • Make the boot-up time shorter: While it's delightfully old-school to wait 20 seconds for the camera to start up, it's also quite a nerve-racking experience to wait almost half a minute every time to see if the device still works.
  • Make the picture capture time shorter: It takes quite a few seconds to take a photo. Well, taking the picture itself is almost immediate, but using fswebcam is quite slow.
  • Store the pictures on their own partition: Currently, images are stored in the boot partition because that’s where they can be easily accessed. This is a hilariously bad idea for multiple reasons.
  • Audio feedback: It feels weird to use a camera that doesn’t make an artificial shutter sound.
  • PTP device: Currently, the SD card needs to be removed from the camera and inserted into a computer to access the images. It’d be nifty if the device could just be plugged into a computer to browse the photos.
  • Fix kernel issue at boot: Oh, did I forget to mention that there’s a kernel error message printed during boot? Well, there is. And it should most likely be looked at.
  • Create a nice case for the camera: I wish I had a 3D printer to print a case for the camera. I wonder where I could get one. *wink wink nudge nudge*

All in all, I think there are more things that should be done than have actually been done. However, I'm also quite happy with the things that are already done. It's a good starting point for whatever comes next.

Image taken with the Raspberry Pi showing overexposed light
“The future is bright (and full of overexposure)”

Aioli Audiostreamer: Moving the Sound

AI generated picture of an amplifier with raspberries

People need projects to consume their free time. I've lately felt that I want to actually try finishing a project (instead of just starting them), that the project should be somehow related to audio, that it would be nice if it had some real-world use, and that it should make use of the old Raspberry Pis I have lying around. Plenty of requirements then. I still think this is a better-formulated train wreck than an average customer project.

After considering a few different options, I ended up attempting to create a multi-speaker streaming system named Aioli (so yeah, I started another project). This text is closer to a devlog than a tutorial, but there will be open-source code repositories in case you want to see how it's done. Enough with the blabber, let's move on.

Overview Of The Project

Basically, in this project I want to have one audio source, and the audio from that single source gets wirelessly transmitted to multiple speakers. To be more specific, in my case there's one Raspberry Pi 4 connected to an external audio source, and other Raspberry Pi 2's connected to the speakers. The Raspberry Pis in this scenario handle at least the streaming, networking, receiving the audio, and playing the audio. This graph attempts to explain the situation:

To start out the project I decided to focus on the streaming between Raspberry Pis because I didn’t feel masochistic enough to start working with Bluetooth yet. Everything is all fun and games until Bluetooth is added into the mix, and I want to have a bit of fun and games.

Today’s focus is this part to be exact

Obviously, the first thing to do is to create a custom Yocto distro, because every self-respecting hobby project needs its own Linux distribution. Perhaps further down the line this distro can contain some useful configs and other things that actually justify its existence, but for now, it’s just a renamed example Poky distro.

Creating The Network Of Raspberries

To get the Raspberry Pis talking to each other, the first step is getting the devices connected to the same LAN. I wanted to use WLAN to avoid having cables around the house; using Ethernet cables would defeat the whole point of the system anyway, as I could then just use audio cables. I also considered an ad hoc network but decided to use WLAN to keep things familiar for now. The Raspberry Pi 4 I own has an internal Wifi chip, so that was easy to sort out, but the two Raspberry Pi 2's did not. I had one Wifi dongle that worked out of the box, but the other dongle required some extra work. You can read about it in my previous blog post if you're interested.

After getting the hardware sorted out, it was time to get the devices actually connected to the WLAN. For that purpose, I added wpa_supplicant to the distro. wpa_supplicant is a program that, in layman's terms, "connects the device to wifi" (or so I've understood as a layman). A properly configured supplicant that launches during boot should in theory automatically connect the device to the WLAN. Surprisingly enough, it usually does. The following simple configuration in /etc/wpa_supplicant.conf, added to the Raspberry Pi during the build, does the trick:

network={
    ssid="WLANname"
    psk="SecretPassword"
}
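The supplicant itself is typically launched during boot with something along these lines (the interface name wlan0 is an assumption, yours may differ):

# -B backgrounds the process, -i picks the interface, -c points to the config
wpa_supplicant -B -i wlan0 -c /etc/wpa_supplicant.conf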

This of course means that you have a statically defined network you want to connect to, and that the password is stored in plaintext on the device. Both are bad things for different reasons, but they'll do for now because this is the simplest solution; it will be fixed later on in another text. If you have a WLAN network without a password or want to use a calculated key instead of a plaintext password, you can read more about wpa_supplicant in the Arch wiki. It's a good read. Pay attention to the quotation marks around the psk value, they caused me a lot of headache.

With quotation marks the value is a plaintext password, without them it’s a calculated key value. Makes “sense”.
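By the way, if you don't want the plaintext password on the device at all, the wpa_passphrase tool that ships with wpa_supplicant prints a ready-made network block with the calculated key (the psk value is replaced with a placeholder here; the real output is a 64-character hex key):

$ wpa_passphrase "WLANname" "SecretPassword"
network={
    ssid="WLANname"
    #psk="SecretPassword"
    psk=<calculated 64-character hex key>
}

Just remember to drop the commented plaintext line before installing the output on the device.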

After the devices wirelessly connected to the router, I gave them static IP address leases to make the development somewhat easier. I also ran a quick ping test to check that the Raspberry Pis can reach Google and each other before proceeding.

Moving The Audio Bits

Making the audio streaming work was actually fairly simple because there already is an open-source solution, as there usually is. GStreamer is an “open source multimedia framework”, which can mean many things. This is quite fitting because GStreamer does many things, and with the help of its plugins, it can do pretty much anything you can dream of. Assuming your dreams revolve around handling and processing multimedia.

My dream was to find a way to stream audio over an IP network. And dreams, they sometimes do come true. Actually, a bit too well: it was slightly difficult to find the best option for streaming the audio among all the encoding options, protocols, and what-not, and I'm still not sure I picked the right things. To keep the prototyping fast I worked with the command line tools provided by GStreamer (as opposed to using the API, which may be worth looking into in the future).

GStreamer works with pipelines. Pipelines have sources where the media originates, sinks where the media ends up, and parsers, encoders, and other kinds of elements in between that manipulate the input and pass it forward. For example, here's a simple pipeline that reads an audio file, and then parses, converts, resamples, and outputs it to the appropriate default sink:

gst-launch-1.0 filesrc location=/opt/sample-files/sample1.wav ! wavparse ! audioconvert ! audioresample ! autoaudiosink

This command may result in sound being output from your speakers. Quite often it doesn't. It depends on what your default ALSA output device is, whether you're using PulseAudio, and whether it's the third Tuesday of the month. In the case of Raspberry Pi, the default output device is the HDMI audio, I'm not using PulseAudio, and it's not the third Tuesday today, meaning that I actually got sound out of a television connected to the HDMI port. If you want the audio output from the Raspberry Pi's headphone jack instead, you can be a bit more specific about the sink:

gst-launch-1.0 filesrc location=/opt/sample-files/sample1.wav ! wavparse ! audioconvert ! audioresample ! alsasink device=hw:1
# use "aplay -l" command to list the available ALSA devices

To send the audio over the network, we can use the RTP protocol, which is meant for delivering audio and video. Basic GStreamer functionality can be easily extended with plugins, and as it turns out, there exists a plugin for RTP. It's weird how these things work out nicely. Almost like someone has had the same ideas before me. Now we can package the audio into 16-bit RTP payloads, and instead of an alsasink we can use a udpsink (from another plugin) to output the stream to a target in the network instead of an audio device.

gst-launch-1.0 filesrc location=/opt/sample-files/sample1.wav ! wavparse ! audioconvert ! audioresample ! rtpL16pay ! udpsink host=192.168.1.182 port=5001

Then, the intended receiver of the stream can use udpsrc instead of filesrc to read the stream, decode it, and deliver the contents to its own audio sink. Simple as.

gst-launch-1.0 udpsrc port=5001 ! 'application/x-rtp,media=audio,payload=96,clock-rate=44100,encoding-name=L16,channels=2' ! rtpL16depay ! audioconvert ! autoaudiosink

To get the audio sent to multiple devices, a multiudpsink can be used on the sending side. The receiving end still uses the same command:

gst-launch-1.0 filesrc location=/opt/sample-files/sample1.wav ! wavparse ! audioconvert ! audioresample ! rtpL16pay ! multiudpsink clients=192.168.1.182:5001,192.168.1.183:5001

In theory, we could use multicast streaming instead of multiple individual streams, but for some reason I couldn't get it working. Most likely it had something to do with the third Tuesday of the month. I couldn't even complete a simple multicast test on my network of Raspberry Pis, so I guess something is wrong with my setup. For the sake of completeness, AFAIK these commands (should (in theory)) work, but don't. I'll look into this later on because multicasting seems like a more sensible approach to this problem:

# Controller command
gst-launch-1.0 filesrc location=/opt/sample-files/sample1.wav ! wavparse ! audioconvert ! audioresample ! rtpL16pay ! udpsink host=224.1.1.1 auto-multicast=true port=3000

# Speaker command
gst-launch-1.0 udpsrc multicast-group=224.1.1.1 auto-multicast=true port=3000 ! 'application/x-rtp,media=audio,payload=96,clock-rate=44100,encoding-name=L16,channels=2' ! rtpL16depay ! audioconvert ! autoaudiosink
Considering the amount of multicast memes floating around the internet, I’m not the only one having issues with it.

By using these commands we can send the audio over network from the controller device to the speaker devices. However, this is still a bit cumbersome, because we need to manually run the gst-launch-1.0 commands, figure out the intended receivers & their IP addresses, and so on. Later on I plan to introduce a manager process that’s dynamically able to find the clients in LAN and control the streaming, but that’s a topic for another text.

There's a recipe for GStreamer and its plugins in Yocto, so getting these things installed into the new custom distro is just a matter of adding a few packages. It's almost simpler than using a package manager. At least if you've spent the last five years learning the ins and outs of Yocto and don't need to install anything at runtime. Something like this should do the trick:

IMAGE_INSTALL:append = " \
    gstreamer1.0 \
    gstreamer1.0-meta-base \
    gstreamer1.0-meta-audio \
    gstreamer1.0-plugins-good-udp \
    gstreamer1.0-plugins-good-rtp \
"

Plugins are sorted into good, bad, and ugly (I guess it's no big surprise that the bluez plugin is "bad"). To figure out which group a plugin belongs to, you can check the documentation. The documentation is quite good by the way, I recommend reading it. For example, the udp plugin page contains information about the pipeline elements the plugin provides, and also mentions which group it belongs to.

That mostly covers everything for this text. We're now able to send sound over the network from one device to another. Next time we'll stop this goofing off and get painfully serious by adding Bluetooth to the system, and instead of using sample audio files, we'll actually stream something from a phone.

You can find the top-level repo-tool manifest repository here. Please note that the progress of the project is a bit further along than what's presented in this blog text, and that the progress is also "a bit" all over the place, so the manifest repository and the subrepositories contain plenty of spoilers and confusion.

One more question remains: why the name Aioli? Well, it kinda sounds like audio combined with I/O, and I like garlic flavoured condiments. That’s as good reason as any.

Open-source contribution: RTL8821AU driver recipe

This is a story of how I became a useful member of society by doing my first open-source contribution.

It all began one fateful afternoon, when I purchased a TP-Link Wifi dongle, thinking that it would allow me to connect my old Raspberry Pi 2 wirelessly to the internet. It was running my own Poky-based distro, but what could really go wrong with random USB devices and Linux?

Well, quite a lot really. I plugged the device in, but I couldn't connect to the highway of data. No delicious internet cookies for me. Not even a blinking LED.

To begin troubleshooting the issue, I checked whether the network interface was seen by the kernel by running both ifconfig -a and ip link show. No wlan devices were found. Some googling suggested running lsusb. That showed the device, which at least proved that it wasn't broken and was recognized by the kernel. Some sort of network driver was clearly needed.

Bus 001 Device 004: ID 2357:011f TP-Link 802.11ac WLAN Adapter 
Bus 001 Device 003: ID 0424:ec00 Microchip Technology, Inc. (formerly SMSC) SMSC9512/9514 Fast Ethernet Adapter
Bus 001 Device 002: ID 0424:9514 Microchip Technology, Inc. (formerly SMSC) SMC9514 Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

Finding the correct driver turned out to be a bit tougher than expected. First I tried googling the name of the wifi dongle suffixed with "driver". Bad Idea. This led to a lot of ancient forum posts suggesting all kinds of Realtek drivers for (almost) similarly named devices, installed by enabling a variety of kernel configuration options. None of the drivers worked.

After some more furious googling I found out that the wifi dongle I bought required an out-of-tree kernel module instead. That meant I couldn't just enable a kernel configuration option to build the driver into my distro. Finding the correct driver was another trial-and-error type of affair. Someone suggested a driver for the 8812au chip. It did not work, but it helped me find the correct trail.

Fortunately, there are far fewer diseases on this trail.

The RTL8812AU driver repo contained a file called supported-device-IDs that, as expected, did not contain the device ID output by lsusb. However, that gave me an idea (that I really should have gotten from the beginning): googling "driver 2357:011f". Who would have guessed that searching for a driver with the exact device ID instead of vague product names would yield the correct driver(s)? This search also helped me find the name of the Realtek chip, 8821au, which I confused plenty of times with 8812au. I'm not sure if this info would have been available in the manual of the dongle, because I did not read it.

After finding the driver and chip, I connected some dots and realized that there actually is a kernel configuration option named CONFIG_RTL8XXXU, which I tried. Despite what the name suggests, it does not work with the rtl8821au.

Once the correct driver was figured out, it was time to add it to the Yocto build. Some more googling revealed that there is a meta-layer called meta-rtlwifi for these Realtek out-of-tree modules. Unfortunately, it didn't contain the RTL8821AU driver. Fortunately, I've been using git at work, so I could fix that myself. You can see where this is heading.

So I took the RTL8812AU driver recipe, as I suspected it should mostly work, and updated the relevant parts, i.e. the repo the driver is fetched from. I was pleasantly surprised that the build worked just like that. Even more shocking was that the module worked as well. After that, it was just a matter of a pull request to get the driver added to the meta-layer alongside the other friendly drivers.
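For the curious, a recipe for an out-of-tree kernel module is pleasantly short. Here's a sketch of the general shape; the URL, revision, and checksum are placeholders, and the real recipe lives in meta-rtlwifi:

SUMMARY = "RTL8821AU out-of-tree kernel module"
LICENSE = "GPL-2.0-only"
LIC_FILES_CHKSUM = "file://LICENSE;md5=<checksum>"

# The module class takes care of do_compile and do_install
# for modules built against the kernel of the image
inherit module

SRC_URI = "git://github.com/morrownr/<driver repo>;protocol=https;branch=<branch>"
SRCREV = "<commit hash>"
S = "${WORKDIR}/git"

RPROVIDES:${PN} += "kernel-module-8821au"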

There were actually multiple drivers available for the 8821au. At least morrownr, ulli-kroll and ivanborislav provide RTL8821AU drivers. In the end, I chose the morrownr driver because it worked satisfactorily out of the box and the same author's driver is also used for the 8812au. I first gave the ivanborislav driver a shot, but it filled my TTY with logs about power save mode. Most likely a configuration mistake on my side, but usually the thing that works without extra tinkering is the better choice.

It’s almost weird that there’s a meme for literally everything.

That's how I got quite familiar with my wifi dongle, and made my first open-source contribution in the process. I also learned something, though I'm not yet 100% sure what. Perhaps that the device ID is quite important when trying to find a suitable driver. And that googling can yield all kinds of interesting and useful information. Until next time!