
Aioli Audiostreamer: Music To The People

Check out the previous part of the Aioli Audiostreamer saga here. In case you don’t want to check it out, here’s a quick recap: I started a new project in which the goal is to stream audio from one Raspberry Pi to another over an IP network. Last time we got the streaming working in theory, but the practical part was (and still is) missing. This time we’re not going to address that either.

Instead of focusing on getting the streaming to work robustly and automatically, I chose to add a Bluetooth connection between the Raspberry Pi controller device and an external audio source. This way the system can stream something other than just the audio files present on the controller device. Here’s the graph, with a chunky red line showing the focus for today:

Diagram showing the Aioli Audiostreamer system overview

Unfortunately, this means confronting my old nemesis: the BlueZ stack. Or Bluetooth in general. Something about it rubs me the wrong way. I’m not sure if it’s actually that bad, but every time my headphones fail to connect to my phone I curse the whole protocol to the ninth circle of hell. Which happens every single morning. And yet it’s still the best choice for this kind of project. But yeah, plenty of that coming up.

Picture of a robot pounding a "no fun allowed" sign to the ground

Bluetooth Connectivity

The first step is making our Raspberry Pi audio server advertise itself as a Bluetooth device that wants to receive audio: headphones, speakers, or anything along those lines. In theory, this sounds like a lot of work, but once again the open-source community comes to the rescue. The bt-speaker project makes a Raspberry Pi act as a Bluetooth speaker, which is exactly what we want. A phone (or some other audio source) can connect to the bt-speaker daemon running on the Raspberry Pi and stream audio to it. bt-speaker then outputs the received audio to the desired audio device.

The program required some tweaking for cross-compilation, and some things weren’t quite as generic as they could have been, causing some QA errors in Yocto. However, it mostly worked quite nicely out of the box. I guess that because the bulk of the program is written in Python, there aren’t that many compilation issues to wrestle with. One codec also needed to be compiled, and there was the issue of figuring out the correct dependencies, but all in all fairly simple stuff. The Yocto recipe can be found here.

The Actual Troubles

What actually took a long time was getting the start-up script working. I’m still sticking to SysVinit for simplicity, which means that I ended up using start-stop-daemon to launch the program. However, it turned out that Busybox’s implementation of start-stop-daemon was missing the -d/--chdir option. bt-speaker loads a codec from a relative path, meaning that the program fails at start-up because it’s launched from the root directory. Because I’m not much of a Python programmer, I chose to patch the feature into Busybox instead of doing the sensible thing: installing the codec into the correct location and fixing bt-speaker. An open-source contribution to Busybox coming soon, I hope.
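For the record, the launch line in the init script ends up looking roughly like this. This is a sketch only; the install path and entry point are made up for illustration:

# Launch bt-speaker from its own directory so its relative codec path
# resolves (this is what the patched -d/--chdir option enables)
start-stop-daemon --start --background --chdir /opt/bt_speaker \
    --exec /usr/bin/python3 -- /opt/bt_speaker/bt_speaker.py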

Well, after that came the second problem: the Bluetooth chip in the Raspberry Pi wasn’t stable during start-up. The script starting BlueZ worked well, and bt-speaker launched successfully as well. However, after a few seconds, the Bluetooth device became undiscoverable. I tried to check all the Bluetooth-related changes that happened in the system during boot: changes in the Bluetooth device information and BlueZ status, reading syslog & dmesg, but no luck. The Bluetooth chip just reset a few seconds after BlueZ launched. So I did what any sane person would do: power cycled the BT chip as a part of the start-up, added a “reasonable amount” of sleep, and moved on with my life.
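In case you’re wondering what that looks like, the workaround amounts to something like the following sketch. The exact commands and the “reasonable amounts” of sleep are assumptions, not a verified recipe:

# Power cycle the Bluetooth chip during start-up and wait for it to settle
hciconfig hci0 down
sleep 1
hciconfig hci0 up
sleep 10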

The Less Actual Troubles

After getting the thing to start automatically during boot, there was still a small problem. The problems never end with Bluetooth, do they? Well, even better, there are two problems. First, for some reason, my phone says that it has trouble connecting to this Frankensteinian BlueZ device. Streaming music from Spotify works nicely though, and even the volume control behaves as expected. So all in all, this sounds like a very typical Bluetooth device already: nothing works, except that it works, except when it doesn’t. I’m not yet sure if this is actually a problem for anyone other than my phone.

Screenshot from a mobile phone showing connection issue with BlueZ 5.66 device
I just wanted to flex my phone and watch with this screenshot. The ironic thing here is that if I “turn device off & back on” as suggested, it’ll be forever unable to connect again unless the BlueZ cache is cleared on Raspberry Pi. That may be the fault of dodgy Bluetooth code and not the protocol itself though.

The bigger issue is that we don’t want the audio to be output from the Raspberry Pi we’re connected to. Instead, we want to stream the audio to the other Raspberry Pis in the LAN and have those output the audio. GStreamer has an alsasrc element that can take input from an ALSA device and work with that. However, we have a bit of a mismatch here: GStreamer wants an input device to receive the audio from (e.g. a microphone), but bt-speaker generates audio that goes to an output device (e.g. a speaker).

Loop Devices to the Rescue!

A loopback device is a virtual audio device that redirects audio from a virtual output device to a corresponding virtual input device (or vice versa). This means that we can have a virtual “microphone” that outputs the audio bt-speaker has received through Bluetooth. Maybe my explanation just made it worse; the idea is quite simple. This blog post explains the functionality quite well.

To explain more: probing the snd-aloop module creates two loopback sound cards, each serving as both input and output (four cards in total). The virtual cards have two devices, and each device has eight subdevices, which results in a lot of devices being created. These devices are special because the output of an output device gets directed to the input of the corresponding input device. For example, if I play music with aplay to card 1, device 0, subdevice 0, the same music can be captured from input card 1, device 1, subdevice 0. Notice how the device number is flipped. Output to input, and vice versa, as I’ve been repeating for two paragraphs now.
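The pairing can be tried out in practice with something like this (a sketch; the card number matches the example above, but the actual index and file names depend on the system):

# Play a test file into loopback card 1, device 0, subdevice 0...
aplay -D hw:1,0,0 test.wav

# ...and capture the same audio from card 1, device 1, subdevice 0
arecord -D hw:1,1,0 -f cd captured.wav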

Meme saying "snd-aloop transforms an input into an output and vice versa" in French
Google Translate don’t fail me now.

With this method, when bt-speaker uses aplay to output the audio it receives, we can define the output device to be a loopback device. Then GStreamer can use the corresponding loopback input device to receive the Bluetooth audio and pass it to the LAN. A bit of patching to bt-speaker, and something like this seems to do the trick:

# Play command that bt-speaker uses when it receives audio

aplay -D hw:2,0,1 -f cd -

# Streaming command to send audio to 192.168.1.182

gst-launch-1.0 alsasrc device=hw:2,1,1 ! audioconvert ! audioresample ! rtpL24pay ! udpsink host=192.168.1.182 port=5001

Is This a Good Idea?

I’m not sure. I think it would be possible for the bt-speaker daemon to launch GStreamer directly once the Bluetooth connection has been initialized. This approach would skip aplay altogether and wouldn’t require the loop devices. However, keeping the Bluetooth and networking sides separate should keep the system simpler: both processes do their own thing without knowledge of each other. bt-speaker can output audio without caring whether anyone listens to it, and on the other end, GStreamer can stream whatever it happens to receive through the loopback device.

This also allows kicking the bt-speaker and GStreamer individually when they eventually and inevitably start misbehaving. The drawback is that GStreamer streams silence if nothing is received from Bluetooth, but I think that’s an acceptable weakness for now. After all, I’ve paid for a WLAN router to route some bits, so I’m going to route them bits, even if they’re all zeros.

I noticed that there actually is a BlueZ plugin for GStreamer. However, it’s labelled as one of the “bad” plugins, and dabbling with such dark magic sounds like a bad idea. If something is labelled “bad” even in the official documentation, it’s usually better to avoid it. The label doesn’t necessarily mean that the quality itself is bad; it may also mean a lack of testing or maintenance, but still.

Movie poster of Bad Boys
TBH I don’t exactly know what WASAPI is, but after a quick Google search, I’m not sure I even want to know.

Closing Words

This text was a bit shorter than anticipated, but some things in life are unexpectedly short. Next time we’ll get rid of the static WLAN configuration and instead create a mechanism for passing the SSID and password during run-time. We’ll be doing that mostly because I already started working on that feature, and now I have DHCP servers running amok in my home LAN, causing trouble and slowing down development.

One question still remains: does this system contain any code that I have written? Not really, at the moment. But in my experience, that’s the story of most embedded Linux projects: find half a dozen somewhat-working pieces of software and glue them together with some scripts. Until next time!


This blog text is a part of my Movember 2023 series. If you found this text useful, I’d ask you to do “something good”. That doesn’t necessarily mean shoving your money to charities or volunteering weekends away (although those are good ideas), it can be something as simple as asking a family member or colleague how they’re doing. Or it can mean selling your earthly possessions away and becoming a monk. It’s really up to you.

Delayyyyyy: v0.1.0, The First Release

Time for a celebration! The 0.1.0 release is here. It took quite a lot of effort to get everything in place and tested. As usual, the last mile turned out to be more difficult than expected, but in the end, things finally felt good enough for the first beta release. I broke the philosophy of having only one logical change in a single pull request by stuffing all the final changes into one PR, sorry about that. In this blog post, I’m going to go briefly through the commits that led up to the beta release. The section headers are links to the individual commits in case you want to judge my code.

Fix bug in short BPM synced delays

So, first things first, the cliffhanger I mentioned in the previous blog post: the bug in the synced delays. The bug was that when there was a high number of echoes but not enough synced time signature values for them, the plugin went quiet. Instead, there should have been the few short echoes that were available, even if fewer than the specified amount. This was caused by prematurely breaking out of the loop that creates the delay buffers. Breaking out of the loop would have been fine if we had been incrementing the iterator (i.e. creating the longest delay first). However, the loop iterates in reverse order (i.e. creating the shortest delay first), and thus we can’t break from the loop before going through all the synced time intervals.
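In case that explanation got convoluted, here’s a simplified sketch of the idea behind the fix. This is not the actual Delayyyyyy code; the names and the surrounding logic are made up for illustration:

#include <algorithm>
#include <vector>

// When fewer synced intervals are available than the requested echo count,
// create buffers for the ones that exist instead of bailing out of the
// loop early (which left the plugin silent).
std::vector<double> pickSyncedDelays (const std::vector<double>& intervalsShortestFirst,
                                      int requestedEchoes)
{
    const int usable = std::min ((int) intervalsShortestFirst.size(), requestedEchoes);
    std::vector<double> delays;
    for (int i = 0; i < usable; ++i)     // shortest delay first, as in the post
        delays.push_back (intervalsShortestFirst[i]);
    return delays;                       // may hold fewer echoes than requested
}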

It may not be a good idea to lie to the user about the final number of echoes, though.

This is how I imagine an average user.

Fix parameter state storage

I finally tried simply saving the parameters in a digital audio workstation project. As it turns out, empty functions don’t do much saving (or loading), and when I reloaded the project, the carefully set parameters were gone. The solution was simple: it consisted of copy-pasting code from an official JUCE tutorial. Not only for the saving but for the loading as well; one feature isn’t much use without the other.
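For reference, the copy-pasted functions look essentially like this (the tutorial’s AudioProcessorValueTreeState version; the ‘parameters’ member name and the processor class name are assumptions on my part):

void DelayyyyyyAudioProcessor::getStateInformation (juce::MemoryBlock& destData)
{
    // Serialize the parameter state into the host's project file
    auto state = parameters.copyState();
    std::unique_ptr<juce::XmlElement> xml (state.createXml());
    copyXmlToBinary (*xml, destData);
}

void DelayyyyyyAudioProcessor::setStateInformation (const void* data, int sizeInBytes)
{
    // Restore the parameter state when the project is loaded
    std::unique_ptr<juce::XmlElement> xml (getXmlFromBinary (data, sizeInBytes));
    if (xml != nullptr && xml->hasTagName (parameters.state.getType()))
        parameters.replaceState (juce::ValueTree::fromXml (*xml));
}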

Get BPM from a possible VST host

This is the big one. Basically, I did the following things to achieve this:

  • Run setDelayBufferParams in its own thread
  • Read BPM (beats per minute) from the playhead object
  • Detect if BPM changes

The first change was done out of necessity to make the second change possible. The playhead is an object that contains audio playback information, like the position of the audio track or the BPM (if available). However, this object can only be read in the processBlock function. I can’t remember if I’ve mentioned this earlier, but the processBlock function should execute as quickly as possible to keep audio latency as small as possible.
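Reading the BPM inside processBlock looks roughly like this with the classic JUCE playhead API. A sketch only: lastBpm and bpmChanged are hypothetical members, the latter being e.g. a juce::WaitableEvent:

// Inside processBlock: peek at the host's playhead and wake the
// buffer-rebuilding thread if the tempo has changed
if (auto* playHead = getPlayHead())
{
    juce::AudioPlayHead::CurrentPositionInfo info;
    if (playHead->getCurrentPosition (info) && info.bpm > 0.0 && info.bpm != lastBpm)
    {
        lastBpm = info.bpm;     // hypothetical member for change detection
        bpmChanged.signal();    // notify the setDelayBufferParams thread
    }
}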

So, we can’t run setDelayBufferParams in processBlock because we want to keep things smooth. Instead, a separate thread needs to be created and notified in case the BPM (or any other parameter) changes. The thread also checks the parameters every now and then by itself. For thread control, I tried to make a sort of one-sided locking where setDelayBufferParams always waits for processBlock to finish (even if it’s not running, because it’ll run soon anyway) and then hopefully runs before the next block has to be processed, because processBlock is a really busy function that has no time to wait for anything or anyone.

However, thread control based on assumptions that things “should definitely work this way” will fail with 100% certainty. This is no exception. I’ll fix this in a future commit, but please keep this silly design in mind (and then forget it so you won’t create it yourself).

Add plugin header and version info

What’s worse than getting a bug report? Getting a bug report without the version information, or even the name of the product. This commit added a neat little header that shows the name of the plugin, the name of the creator, and even the version in the bottom right corner. I should have dropped the name of the creator, though, so no one could send me bug reports.

This commit uses a neat UI feature of JUCE: removeFrom. It can be used to crop (and return) a certain section of the screen area, which is perfect for headers. Just remove a bit from the top, and the removed area is the new header area! No need to break the existing flexbox-based UI to try and fit the header info in there.
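As a sketch of the idea (the component names are my own placeholders, not the actual Delayyyyyy code):

void DelayyyyyyAudioProcessorEditor::resized()
{
    auto area = getLocalBounds();
    // Crop a strip off the top; the returned rectangle is the header area
    auto header = area.removeFromTop (24);
    versionLabel.setBounds (header.removeFromRight (80)); // version in the corner
    titleLabel.setBounds (header);                        // plugin & creator name
    // ...the existing flexbox layout then gets 'area', which no longer
    // includes the header strip...
}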

Remove negativity, add Fonzie and update TODO

Does what it says; all the changes here went to the README. There may be better issue management tools than just updating the README, but at least it’s not as bad as JIRA.

I have watched zero episodes of Happy Days in my life, hopefully I’m using this correctly.

Set the initial values of parameters to default values

This commit sets the initial values of the parameters to their default values. Would you have guessed? Basically, it makes the code a bit neater. However, this introduced quite a big bug that I somehow missed: the plugin makes no sound before the UI elements are edited, because the delay buffers only get created when the parameter values change. If the parameter values are initialized to their defaults, the buffers won’t get created before the UI changes to something non-default. This programming bug is equivalent to a fist-sized cockroach in real life. I reverted this change here shortly after merging it, but I may need to revisit the idea some other day.


And then I thought that I was ready for the 0.1.0 release. I did some final testing of the plug-in, fiddling around and making sounds. Life was beautiful. Until suddenly, a deafening “pop” sound occurred. Ableton recorded it at 173 dBFS. Fortunately, I wasn’t using my headphones, otherwise my head would have exploded. People miles away heard the sound of my mistake, which was slightly embarrassing.

Fun fact: the loudest sound on Earth most likely originated from the Krakatoa volcano eruption in 1883. It was recorded at 172 decibels from 100 miles away. And that’s probably the most educational thing in this blog so far.

Well, the reality is a bit less exciting, to be honest. The sound wasn’t that loud, and I guess Ableton just calculated it “a bit” wrong. But there was a pop, and it was easily reproducible: just wiggle the controls of the plugin for a minute or two.

And this is where my “slightly” “stupid” approach to thread control comes in.

Fix race-condition in setDelayBufferParams and processBlock

Basically, it’s dumb dumb very very dumb to assume that some things can and will always run in a certain time frame. In the end, it’s better to have a bit more latency than weird popping audio artifacts. The sound occurred when setDelayBufferParams didn’t finish before processBlock was called. As you remember, processBlock waits for no one. So setDelayBufferParams was happily replacing/removing the delay buffers while processBlock was happily reading those garbled memory locations, and a pop sound occurred. That’s why I added a wait to processBlock so that it won’t run too early. Not the best possible design, but it’ll have to do for now (and most likely until the end).
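I won’t claim this is the exact code, but the shape of the fix is a shared lock that makes processBlock wait for the buffer rebuild instead of reading garbage (delayBufferLock being a hypothetical juce::CriticalSection member):

void DelayyyyyyAudioProcessor::setDelayBufferParams()
{
    const juce::ScopedLock guard (delayBufferLock);
    // ...replace/resize the delay buffers safely...
}

void DelayyyyyyAudioProcessor::processBlock (juce::AudioBuffer<float>& buffer,
                                             juce::MidiBuffer& midiMessages)
{
    // Blocks until setDelayBufferParams is done, trading a bit of latency
    // for not reading half-rebuilt buffers
    const juce::ScopedLock guard (delayBufferLock);
    // ...normal delay processing...
}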

Add companyName for VST information

Finally, while testing the plug-in, I noticed that the company name wasn’t set correctly (it said yourcompany or something like that), so I fixed it to point to my musical alter ego.

And that’s it! That’s the story of how 0.1.0 came to be. I made one final PR where I added the ready-built x64 VST3 file and updated the README with the known issues: the plugin being a CPU hog and the popping that occurs while editing parameters. The CPU optimization is worth another story, but I already have a few ideas for that (and for the popping as well). Meanwhile, you can download the VST3 plugin here:

Delayyyyyy_0.1.0 release

This is how it feels to optimize. Even the quality of the GIF is optimized.

Delayyyyyy: BPM synced delay effect

As a hobby project, I have been working on a VST plugin lately. For people who are unaware of what a VST plug-in is (I assume that’s most people), it’s basically an audio effect or instrument used with DAWs (digital audio workstations). To simplify the concept even further, it’s a piece of software that either creates or modifies an audio or MIDI signal.

The plugin I’ve been working on for the past few months is an audio delay/echo plugin. It takes an audio signal and repeats it a given number of times within a certain maximum delay time. It also has a few simple options: a decay time for deciding how quickly the repeated signal levels decay, and a “ping pong” option for alternating the signal between the left and right sides.

Ta-dah! This is how the plug-in looks at the moment.

Now that I’ve briefly explained what the project is about, I can talk about my latest addition to the plugin: BPM synced delay (basically a delay that is synced to the rhythm of the music). I’ll admit, this may feel a bit like jumping onto a moving train, since there’s existing code that hasn’t been explained in previous blog posts, but I’ll try my best to keep this simple and understandable. Here’s a link to the PR that contains the BPM sync feature I’ll be talking about in this blog post.

There are changes to two classes, PluginEditor and PluginProcessor. The former is responsible for the UI, and the latter for the actual signal processing.

In the PluginEditor, the changes were quite simple. I added a BPM sync checkbox that toggles the visibility of the synced and unsynced delay sliders. I actually had most of the stuff already in place, just commented out (commenting out code is definitely the best method of version control, trust me on this, I’ve had “senior” in my title for three whole months /s). The checkbox’s value is also used on the PluginProcessor side to see if the delay should be synced or not. I’m using CustomSliders to get a custom look and feel on the sliders, but everything should work equally well (or even better) with a regular slider class.

There are also some UI gymnastics going on to get everything aligned correctly. The UI should be resizable (kinda), which causes its own headaches there. Possibly the most interesting thing is the custom textFromValueFunction that allows showing musical time signatures instead of raw slider values in the slider’s value box. Besides these, I also had to inherit the Button listener to be able to update the UI and delay parameters whenever the user clicks the checkbox. There are more elegant ways to do this, but for two lines of code in a hobby project, I think this is a sufficient solution.
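The time signature labels come from textFromValueFunction, roughly like this (a sketch with a made-up mapping from slider steps to note lengths):

// Show musical note lengths in the value box instead of raw slider values
syncedDelaySlider.textFromValueFunction = [] (double value)
{
    static const juce::StringArray names { "1/16", "1/8", "1/4", "1/2",
                                           "1 bar", "2 bars", "4 bars" };
    const int index = juce::jlimit (0, names.size() - 1, (int) value);
    return names[index];
};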

Lazy day every day

Simple so far? Well, things should get a bit more interesting (i.e. complicated) in the PluginProcessor. First of all, I had to create parameters for the synced delay toggle and the synced delay amount in the ValueTree. ValueTree is a class that, you guessed it, stores different values and helps in handling them.

However, maybe the most important functional changes in the PR are to the setDelayBufferParams function, and those could definitely use some explanation. The delay in this plugin works by creating a vector of size N, where N is the number of echoes that we hear back. This vector is then filled with ring buffers, and these buffers have read and write pointers that are offset from each other by a certain amount. There’s a picture after this, and I hope it’ll explain the idea better than two paragraphs of words and poorly thought-out mathematical formulas.

The offset between read and write is calculated from the maximum delay time X and the echo amount Y that the user set in the UI. The first buffer gets an offset of X/2^(Y-1), the second X/2^(Y-2), and so on, until the exponent reaches zero and the last buffer just gets the full value X. This creates an interesting (yeah, for sure) non-linear, yet fairly symmetrical delay effect. Please note that the first echo buffer has the smallest offset between the read and write pointers.
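In code form, the offsets could be computed like this (a sketch, with hypothetical names; with the upcoming picture’s values X = 1.0 s and Y = 4, it yields 0.125 s, 0.25 s, 0.5 s and 1.0 s):

#include <cmath>
#include <vector>

// Offset of echo n (1-based) is X / 2^(Y - n); the last echo gets the
// full maximum delay X
std::vector<double> echoOffsets (double maxDelay, int echoCount)
{
    std::vector<double> offsets;
    for (int n = 1; n <= echoCount; ++n)
        offsets.push_back (maxDelay / std::pow (2.0, echoCount - n));
    return offsets;
}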

ANYWAY, after this offset is calculated, the write pointer is set to this value while the read pointer is set to zero. When there’s an audio signal coming in, a read is first performed on the buffer to obtain the possible previous, delayed signal. Then the incoming signal, with possible ping pong and decay applied, is written to the buffer. Finally, the earlier read value is added to the incoming signal. This is then repeated for all the buffers. Since each buffer has a different, greater-than-zero offset between the write and read pointers, the audio signal gets repeated at different intervals, creating a delay effect. Wowzers! There’s also a sample-level sketch of this after the picture below.

In this example, the maximum delay is set to one second, and there are four echoes. I’ve been told never to apologize for using Paint to explain things, but hopefully this makes sense.
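And here’s the promised read-then-write step as a sample-level sketch (hypothetical names; decay and ping pong are reduced to a single gain for brevity):

#include <vector>

// One ring buffer step: read the delayed sample first, overwrite it with
// the processed incoming sample, advance both pointers, and mix
float processEchoSample (std::vector<float>& ring, int& readPos, int& writePos,
                         float input, float gain)
{
    const float delayed = ring[readPos];            // previous, delayed signal
    ring[writePos] = input * gain;                  // store the incoming signal
    readPos  = (readPos  + 1) % (int) ring.size();
    writePos = (writePos + 1) % (int) ring.size();
    return input + delayed;                         // delayed copy added on top
}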

Now, how does this idea work in a rhythmically synced world? Well, time for a little music theory for a change. 97% of modern western music is in the 4/4 time signature, meaning that there are four kick drum hits in a bar, and the music is constructed of building blocks four bars in size (a gross oversimplification, but I don’t want to end up neck-deep in music theory because I’ll explain it wrong). So in a track with a BPM of 120, there are 120 kick drum hits in a minute, and 30 bars in a minute.

This gets a bit subjective from here, but I personally like each synced delay to be a division or multiple of two of the previous value, starting from one bar, to create a nice rhythm that sounds good with most music. This means that the delay gets values like 1/2, 1/4, 1/8, and 1/16 of a bar when going shorter than one bar, and values like 2, 4, 8 & 16 bars when lasting longer than one bar. When we know the beats per minute, or BPM, it becomes fairly trivial to calculate write and read pointer offsets that are rhythmically synced. Instead of using the user-set value for the maximum delay, we calculate the time for a certain number of beats at a certain BPM.
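The conversion itself is simple: assuming a 4/4 time signature, one beat lasts 60/BPM seconds and one bar four times that. A sketch:

// Delay time in seconds for a note length given in bars (4/4 assumed):
// at 120 BPM a bar lasts 2.0 s, so e.g. 0.25 bars -> 0.5 s
double syncedDelaySeconds (double bpm, double bars)
{
    const double secondsPerBar = 4.0 * 60.0 / bpm;
    return bars * secondsPerBar;
}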

Looks similar, doesn’t it? Max delay set to 1 bar, and echoes to 4. The difference is that the duration of the delay doesn’t depend on absolute time; it depends on the tempo of the music.

Hopefully, you’ve understood what I’ve tried to say here. To make this code change slightly more confusing, there are some optimization changes thrown in as well. Earlier, I had a static 10-second size for the delay buffers. However, with synced delays this may not be enough, and for shorter un-synced delays it is way too much. So I set each buffer’s size to be equal to its write pointer offset, because there is no point in having the buffer be longer than that. The ring buffer will wrap around more often, but that isn’t really an issue because the old values are not needed.

And that’s how I added BPM synced delays to my VST plugin. The theory is quite simple once you get the idea of how the delay buffers are implemented; after that, it’s just a matter of modifying the write pointer offset. The tempo is still hardcoded to 120 BPM, but that’ll be fixed in the next PR. There’s also one bug in this implementation that I’ve already fixed on my local branch, but I’ll leave the details as a cliffhanger for my next blog post.