c++

My First Plug-In: Pastel Distortion

It’s time to finish a project. Lately, I have been mostly interested in embedded tinkering, but I’m also fascinated by audio and DSP programming. Partially because it is an interesting field, but mostly because I make music as a hobby so it’s interesting to see how the virtual instruments and audio effects work. So, in this text I’m presenting my first full-fledged and complete VST plug-in, Pastel Distortion. In a way it’s my second plug-in, as I used to make Delayyyyyy plug-in (that’s mentioned in some older texts of this blog as well), but that project has been abandoned in a state that I can’t quite call complete. However, here’s a screenshot of something that I actually have completed:

tada.wav

In short, VST plug-ins are software used in music production. They create and modify the sound based on the information they’re given by the VST host, that is usually a digital audio workstation. Plug-ins are commonly chained together so that one plug-in’s output is connected to the next one’s input. This is all done real-time, while the music is playing.

There will be free downloads at the end, but first, let’s go through the history of the project, some basic theory, and a six-paragraph subchapter that I like to call “I’m not sponsored by JUCE, but I should be”.

History of the Project

About two and a half years ago I started working on a distortion VST plug-in following this tutorial. Half a year after that, I got distracted when I thought about testing the plug-in (which resulted in this blog text and my first conference talk). As a side note, it may tell something about the development process and schedule when testing is “thought about” six months after starting the project. A year after that I got a Macbook for Macing Mac builds and got distracted by the new shiny laptop. And some time after that I made some overenthusiastic plans for the plug-in that didn’t quite come to reality and then I forgot to develop the plug-in.

The timeline is almost as confusing as Marvel Multiverse and full of delays, detours and time loops. In the end, I’ve just come to a conclusion that I’ll release this Pastel Distortion as it is, and add the new cool features later on if there’s interest in the plug-in. If there’s no interest, I can start working on a new plug-in, so it’s a win-win situation. But let’s finish this first.

What Is Waveshaping Distortion?

The Physics

In the real world that surrounds us all, sound is a change of pressure in a medium. Our ears then receive these changes of pressure and turn them into some sort of electricity in the brain. In short, it’s magic, that’s the best way I can explain it. To translate this into the world of computers, a microphone receives changes of pressure in air and converts them into changes in electricity that an analogue-to-digital converter then turns into ones and zeros understood by a computer. Magic, but of a slightly different kind.

I asked AI to generate an infographic for this section. Hopefully this helps you to understand.

After this transformation, we can process the analogue sound in the digital domain, and then convert it back into an analogue signal and play it out from speakers. Commonly the pressure/voltage changes get mapped into numbers between some range. One common range is [-1.0, 1.0]. -1.0 and 1.0 represent the extreme pressure changes where the microphone’s diaphragm is at its limit positions (=receiving loud sound), while the value of 0 is the position where it receives no pressure (receiving= silence).

Well, I also tried drawing the information myself. I’m not sure which one is better.

The Maths

Now we’ve established what sound is. But what is waveshaping distortion? You can think of it as a function that gets applied to the sampled values. Let’s take an example function that does not actually do any shaping, y=x:

This is quite possibly the dullest shaping function. It takes x as an input, and returns it. However, this is in theory what waveshaping does. It takes the input samples from -1.0 to 1.0, puts them into mathematical function, and uses output for new samples. Let’s take another example, y=sign(x):

This takes an input sample and outputs one of the extreme values. You can emulate this effect by turning up the gain of a microphone, shoving the mic in your mouth, and screaming as loud as possible. It’s not really a nice effect. Finally, let’s take a look at a useful function, y=sqrt(x) where x >= 0, y=-sqrt(-x) where x < 0:

Finally, we get a function that does something but isn’t too extreme. This will create a sound that’s more pronounced because the quieter samples get amplified. Or, in other words, values get mapped further away from zero. The neat part is that the waveshaper function can be pretty much anything. It can be a simple square root curve like here. It also can be a quartic equation combined with all of the trigonometric functions (assuming your processor can calculate it fast enough). Maybe it doesn’t sound good, but it’s possible.

But Why Bother?

It’s always a good idea to think why something is done. Why would I want to use my precious processor time to calculate maths when I could be playing DOOM instead? As an engineer, I’m not 100% sure, I think it has something to do with psychoacoustics which is a field of science of which I know nothing about and to be honest it sounds a bit made up. From a music producer’s point of view, I can say that distortion effects make the sound have more character, warmth, and loudness (and other vague adjectives which don’t mean anything), so it’s a good thing.

Implementing the Distortion

I have talked about JUCE earlier in this blog, but I think that’s been so long ago that it’s forgotten. So I’ll summarize it shortly again. It’s a framework for creating audio software. It handles input and output routing, VST interfacing, user interface, and all that other boring stuff so that we can focus on what we actually want to do: making the computer go bleep-bloop.

The actual method of audio signal processing may vary between different types of projects. For a VST audio effect like this, there usually is a processBlock function that receives an input buffer periodically. It is then your duty as a plug-in developer to do whatever you want with that input buffer and fill it with values that you deem correct. Doing all this in a reasonable amount of CPU time, of course.

In this Pastel Distortion plug-in, we receive an input buffer filled with values ranging from -1.0 to 1.0, and then we feed those samples to the waveshaping function and replace the buffer contents with the newly calculated values. Sounds simple, and to be honest, that’s exactly what it is because JUCE does most of the heavy work.

JUCE has a ProcessorChain template class that can be filled with various effects to process the audio. There’s a WaveShaper processor, to which you simply give the mathematical function you want it to perform, and the rest is done almost automatically! As you can guess, the plug-in uses that. In the plug-in there are also some filters, EQs, and compressors to tame the distorted signal a bit more because the distortion can start to sound really ugly really quickly. That doesn’t mean that you can’t create ugly sounds with Pastel Distortion, quite the contrary.

The life of a designer is a life of fight: fight against the ugliness

Another great feature of JUCE is that it has a graphics library built-in. It’s especially good in a sense that an embedded developer like me can create a somewhat professional-looking user interface, even though I usually program small devices where the only human-computer interaction methods are a power switch and a two-colour LED. Although I have to admit, most of the development time went into making the user interface. You wouldn’t believe the amount of hours that went into drawing these little swirls next to the knobs.

Honestly, it was pure luck that I managed to get these things looking even remotely correct. The best part is that in the end they’re barely even visible.

All in all, Pastel Distortion is a completed plug-in that I think is quite polished (at least considering the usual standards for my projects). There’s the distortion effect of course, but in addition to that there’s tone control to shape the distortion and output signal, a dry-wet mixer for blending the distorted and clean signal, and multiple waveshape functions to choose from. Besides GUI, I also spent quite a lot of time tweaking the distortion parameters, so hopefully that effort can be heard in the final product.

There’s still optimization that could be done, but the performance is in fairly good shape already. At least compared to the FL Studio stock distortion plug-in Disructor it seems to have about the same CPU usage. Disructor averages at around 8%, while Pastel Distortion averages at 9%. Considering the fact that my previous delay plug-in used about 20% I consider this a great success. This good number is most likely a result of the optimizations in JUCE and not because of my programming genius.

But enough talk, let’s get to the interesting stuff. How to try this thing out?

Getting Pastel Distortion

Obtaining Pastel Distortion plug-in is quite easy. Just click this link to go to the Gumroad page where you can get it. And if you’re quick, you can get it for free! The plug-in costs $0 until the end of February 2024. After that you can get the demo version for free to try it out, or if you ask me I can generate some sort of a discount code for it (I’d like to get feedback on the product in exchange for the discount).

If you don’t want to download Pastel Distortion but want to see it in action, check out the video below. I put all the skills I’ve learned from Windows Movie Maker and years of using Ableton into this one:

That’s all this time. I’ve already started working on the next plug-in, let’s hope that it won’t take another two and a half years. Maybe the next text will be out sooner than that when I get something else ready that’s worth writing about. I’ve been building a Raspberry Pi Pico-based gadget lately, and it got a bit out of hand, but maybe I’ll finish that soon.

Unit testing audio processors with JUCE & Catch2

Testing. The final frontier. Or so it often feels. It’s the place where no man boldly goes, it’s the place where they tend to crawl to when being forced to do so after the test coverage checker lets them know that there’s less than 60% of the line coverage. Hopefully, this is just a tired stereotype, because testing is mighty useful in all kinds of applications for quite obvious reasons: it saves time, helps catch bugs, and a good & passing test set gives a better peace of mind than a ten-minute mindfulness session. In this text, I’m writing a bit about unit testing in audio processing applications and presenting my small example plugin that contains some unit tests. It’s basically my upcoming ADC 2022 talk in a text format, unless something surprising happens in the upcoming week. Let’s hope not, I react poorly to surprises.

Surprise Seinfeld GIF - Find & Share on GIPHY

Testing

Unit testing is a type of testing where small sections of the program code are individually and independently scrutinized. Usually, this is done by the means of feeding known input to the functions being tested, and comparing the outputs, function calls, etc. against an expected set of values. Basic stuff, really.

However, audio applications are slightly different, or at least have some features that don’t fully follow this idea of known inputs and outputs. For example, think about a high-pass filter that adds some “character” to the signal it filters. The character here means something extra being added to the audio signal during processing. Depending on the type of “character” we may know the output only partially. Sure, the audio gets filtered according to the filter parameters, but the processing may also add some random elements, like a tiny smidgen of noise to keep things exciting. In such a situation we can’t directly compare the output of the processing function to some expected value, because the expected value is only partially known.

Of course, the test in such a case could be reduced to smaller components that have fully predictable outputs. And it could even be argued that testing these small components is the whole idea of unit testing. But even if we break down the filter into the smallest possible pieces, there still would be that one noise generator element that can’t be fully predicted (unless using the same random seed every time). How can we know what to expect when we only have a general idea of what something is supposed to sound like, but it can’t be exactly defined? As a sidenote, this kind of vague hand-waving describes my music-making fairly well.

Arm Flailing GIF - Find & Share on GIPHY
Basically me whenever I’m asked to explain anything (please don’t ask difficult questions)

Math to the rescue

I dislike math as much as anyone else, but no one can really disagree with the fact that it’s definitely useful. Without it, we wouldn’t have trebuchets. And from what I’ve heard, mathematics plays some role in most of the computer systems as well.

Fast fourier transform is a magica… mathematical algorithm that can perform discrete fourier transformations. Fourier transformation on the other hand can convert time or space functions to their frequency components. Or, if these words mean nothing to you, it can convert this kind of waveform image (i.e. time domain representation):

Difficult to say what it sounds like, but at least it’s loud.

Into this kind of frequency spectrum image (frequency domain representation):

Ah, it’s a sine sweep.

How does this happen? I wish I understood. Wikipedia article for Fourier transform is filled with sentences like “for each frequency, the magnitude of the complex value represents the amplitude of a constituent complex sinusoid with that frequency, and the argument of the complex value represents that complex sinusoid’s phase offset” and to be honest, sentences like this make me scared. For all I know, it could say that there’s a tiny wizard somewhere in the computer that has nothing better to do than some signal processing. Although, all the signal-processing people I know tend to be wizards, so maybe there’s some truth in that.

However, with the FFT we can get the characteristics of the audio, like what frequency range has a lot of energy, or where there is no energy at all. Audio signals are signals, and as such, they have some amount of energy. These kinds of frequency characteristics can be compared in non-exact, fuzzy ways by setting thresholds for the allowed energy amounts for different frequency ranges.

For example, if we have a simple gain function, we can use FFT to check that the audio signal energy is indeed increasing or decreasing. Or if there’s a filter that needs to be tested we can check that some frequency bands have no energy to ensure that the filter doesn’t allow frequencies to pass from where they shouldn’t. Or if there’s a white noise generator, it can be checked that there is energy across the whole frequency range. Or with a synthesizer, we can check that there are no unexpected artefacts in the signal. Without fast fourier transform finding out these kinds of things automatically can be tricky.

Pamplejuce

Integrating a unit testing framework with an audio development framework can be challenging. It took me surprisingly much googling to find anything useful on the topic. There were plenty of tutorials/advertisements about building different unit testing framework demos. I’m using JUCE, and it has some level of unit testing support, but that seemed insufficient. There also were conference talks confirming that yes, you indeed should test your software even if it handles audio signals, that’s not an acceptable reason to skip testing even though that’s a tempting thought.

Finally, after failing to do what I wanted with JUCE’s unit tests, and failing to integrate two different unit testing frameworks into my JUCE plugin, I came across Pamplejuce. It’s a template project using CMake workflow, and more importantly (at least for me), it integrates Catch2 unit testing framework to JUCE. It’s good that someone around here is a competent developer.

Catch2 is a unit testing framework. I don’t have much else to say about it. It’s being actively developed and it has all the features I could ask for. This isn’t all that much, I have about as many requirements for unit testing frameworks as I do for hotel rooms: hotel rooms should have a bed, a shower and a lockable door. Unit testing frameworks should have the possibility to compare two things and an option to write custom matchers. And I should be able to integrate them into my projects. The last one was surprisingly rare.

90S Internet GIF - Find & Share on GIPHY
I couldn’t yet find “Unit testing framework integration for dummies”, but once I find one I’ll be sure to get one.

FilterUnitTest

To test out Pamplejuce I created a new imaginatively named project FilterUnitTest from the template. To have some actual audio processing to test I made a simple highpass filter plugin using JUCE’s LadderFilter. No GUI, just simple filtering pleasure. Well, it also has a parameter for the wet amount. I’ll give a warning that there’s still plenty of refactoring & warning fixing to be done in the FilterUnitTest, but I’m currently working on it whenever I have the time. I’m quite certain that the tests at least leak memory, but I haven’t yet gotten around to fixing that (I know, I know, it isn’t very C++17 to have memory leaks).

So, after the base of the project was done, it was time to finally implement some tests. First is the dummy test already part of the Pamplejuce template, which simply checks that the name of the plugin can be fetched and that it is actually the expected name. The best test is the one someone writes for you. The test cases presented in this blog text can be found from Tests/PluginBasics.cpp file in the FilterUnitTest repository.

TEST_CASE("Plugin instance name", "[name]")
{
    testPluginProcessor = new AudioPluginAudioProcessor();
    CHECK_THAT(testPluginProcessor->getName().toStdString(),
               Catch::Matchers::Equals("Filter Unit Test"));
    delete testPluginProcessor;
}

To verify the actual functionality of the program, I identified two things that need to be tested:

  1. Testing that the filter performs filtering to the audio signal
  2. Testing that the wet parameter controls the amount of applied filtering

The first test is quite simple: create an audio signal buffer, or read it from an audio file. Run it through the audio processing function, and check that the output contains less energy than before filtering. This of course requires writing a custom matcher, more on that a bit later. If you’ve set your filtering function in stone, you could consider storing the output from one of the test runs and comparing the test results to that instead. During the development phase, it may be easier to do some non-exact comparisons of the FFT values.

If you choose to go the exact value comparison route, you can check out the writeBufferToFile and readBufferFromFile helper functions in Tests/Helpers.h. They serialize and deserialize an audio buffer to/from a file. These helpers can be used to create the exact expected values, and they can also be used to fetch the expected value and compare the output to it. This dummy test basically writes a random buffer to a file, reads the file and ensures that the two buffers have identical contents.

TEST_CASE("Read and write buffer", "[dummy]")
{
    juce::AudioBuffer<float> *buffer = Helpers::generateAudioSampleBuffer();
    Helpers::writeBufferToFile(buffer, "test_file");
    juce::AudioBuffer<float> *readBuffer = Helpers::readBufferFromFile("test_file");
    CHECK_THAT(*buffer,
               AudioBuffersMatch(*readBuffer));
    juce::File test_file ("test_file");
    test_file.deleteFile();
}

As you can see, this type of test requires a custom matcher, AudioBuffersMatch. As does the FFT comparison, and any other custom comparison. For FilterUnitTest, I wrote four different types of comparators, these can be found from Tests/Matchers.h:

  • Audiobuffers are equal
  • Audiobuffer has higher energy than another audio buffer
  • Audiobuffer has a maximum energy of N in its frequency bands (N can vary between different bands, and the check for a band can also be skipped)
  • Audiobuffer has minimum energy of N in its frequency bands (Same here)

The second approach of using FFT to ensure that the audio buffer has lower energy after filtering can use a combination of the second and third matcher. By combining these two, we can ensure that the total energy of the signal is indeed lower and that the amount of lower frequencies is within a certain limit:

TEST_CASE("Filter", "[functionality]")
{
    int samplesPerBlock = 4096;
    int sampleRate = 44100;

    testPluginProcessor = new AudioPluginAudioProcessor();

    //Helper to read a sine sweep wav
    juce::MemoryMappedAudioFormatReader *reader = Helpers::readSineSweep();
    juce::AudioBuffer<float> *buffer = new juce::AudioBuffer<float>(reader->numChannels, reader->lengthInSamples);
    reader->read(buffer->getArrayOfWritePointers(), 1, 0, reader->lengthInSamples);

    juce::AudioBuffer<float> originalBuffer(*buffer);

    //Dismiss the partial chunk for now
    int chunkAmount = buffer->getNumSamples() / samplesPerBlock;

    juce::MidiBuffer midiBuffer;

    testPluginProcessor->prepareToPlay(sampleRate, samplesPerBlock);

    //Process the sine sweep, one chunk at a time
    for (int i = 0; i < chunkAmount; i++) {
        juce::AudioBuffer<float> processBuffer(buffer->getNumChannels(), samplesPerBlock);
        for (int ch = 0; ch < buffer->getNumChannels(); ++ch) {
            processBuffer.copyFrom(0, 0, *buffer, ch, i * samplesPerBlock, samplesPerBlock);
        }

        testPluginProcessor->processBlock(processBuffer, midiBuffer);
        for (int ch = 0; ch < buffer->getNumChannels(); ++ch) {
            buffer->copyFrom(0, i * samplesPerBlock, processBuffer, ch, 0, samplesPerBlock);
        }
    }

    //Check that originalBuffer has higher total energy
    CHECK_THAT(originalBuffer,
               !AudioBufferHigherEnergy(*buffer));

    juce::Array<float> maxEnergies;
    for (int i = 0; i < fft_size / 2; i++) {
        //Set the threshold to some value for the lowest 32 frequency bands
        if (i < 32) {
            maxEnergies.set(i, 100);
        }
        //Skip the rest
        else {
            maxEnergies.set(i, -1);

        }
    }
    //Check that lower end frequencies are within limits
    CHECK_THAT(*buffer,
               AudioBufferCheckMaxEnergy(maxEnergies));

    //I guess programming C++ like this in the year 2022 isn't a good idea to do publicly
    delete buffer;
    delete reader;
    delete testPluginProcessor;
}

The second test for testing the wet parameter is basically a continuation of this. Get your audio buffer and run it through the audio processing function with varying levels of wet-parameter. Ensure that the higher the wet parameter is, the higher the filtering effect. This means there’s again less low-end energy, and less energy in general. Or if you want to do a super simple test as I did, just check that with a wet value of 0 signal doesn’t change, and with the max wet parameter value of 1 it does.

TEST_CASE("Wet Parameter", "[parameters]")
{
    testPluginProcessor = new AudioPluginAudioProcessor();
    //Helper to generate a buffer filled with noise
    juce::AudioBuffer<float> *buffer = Helpers::generateAudioSampleBuffer();
    juce::AudioBuffer<float> originalBuffer(*buffer);    

    juce::MidiBuffer midiBuffer;

    testPluginProcessor->prepareToPlay(44100, 4096);
    testPluginProcessor->processBlock(*buffer, midiBuffer);

    //Check that initial value of wet is not zero, i.e. filtering happens
    CHECK_THAT(*buffer,
               !AudioBuffersMatch(originalBuffer));

    delete buffer;

    buffer = Helpers::generateAudioSampleBuffer();

    //Get and set parameter
    auto *parameters = testPluginProcessor->getParameters();
    juce::RangedAudioParameter* pParam = parameters->getParameter ( "WET"  );
    pParam->setValueNotifyingHost( 0.0f );

    for (int ch = 0; ch < buffer->getNumChannels(); ++ch)
        originalBuffer.copyFrom (ch, 0, *buffer, ch, 0, buffer->getNumSamples());
    testPluginProcessor->processBlock(*buffer, midiBuffer);

    //Check that filter now doesnt affect the audio signal
    CHECK_THAT(*buffer,
               AudioBuffersMatch(originalBuffer));

    delete buffer;

    buffer = Helpers::generateAudioSampleBuffer();
    pParam->setValueNotifyingHost( 1.0f );


    for (int ch = 0; ch < buffer->getNumChannels(); ++ch)
        originalBuffer.copyFrom (ch, 0, *buffer, ch, 0, buffer->getNumSamples());
    testPluginProcessor->processBlock(*buffer, midiBuffer);

    //Finally, check that with max wet the signal is again affected
    CHECK_THAT(*buffer,
               !AudioBuffersMatch(originalBuffer));

    delete buffer;
    delete testPluginProcessor;
}

I wish I could argue which approach is better here, but I think it’s quite apparent whether or not it’s better to do proper or half-assed testing. I’ll leave it as an exercise for the reader to figure out how to do the proper way of testing.

Hey Arnold Nicksplat GIF - Find & Share on GIPHY

Images

If you’re given the following sample values, can you figure out what the audio signal sounds like?

[0.25, -0.59, 0.96, 0.21, -0.22, -0.36, -0.45, -0.14, 0.39, 0.35, 0.87, 0.64, -0.32, 0.12, -0.86, -0.67], repeated ad nauseam

I don’t know either. In general, humans tend to absorb information via visual means. A bunch of decimal numbers isn’t the most intuitive way of understanding something unless you’re one of the wizards I talked about earlier. But what if I showed you this:

Seems like it’s a signal of sorts.

As you can see (pun intended) visual information is a lot easier to digest. It’s not the most impressive of graphs really, nor the easiest to read, but there’s a good explanation for that: I wrote the code for it. At least the code allows easy drawing of images as a part of unit tests to see what’s going on with the inputs and outputs, as opposed to printing out audio buffer contents and FFT results and hoping that staring at the numbers absorbs them into the brain. You can find the image drawing function from Tests/ImageProcessing.h, here it is in action:

juce::AudioBuffer<float> *buffer = Helpers::generateBigAudioSampleBuffer();
ImageProcessing::drawAudioBufferImage(buffer, "NoiseBuffer");

Just give it a buffer and a filename without the .png extension and it’ll handle the rest. So, for example to make sure you’ve hooked things up correctly in your testing set-up, you can call the drawing function before and after doing some processing to the audio signal to see if the changes are at least somewhat sensible.

As you can guess, this was one of the earlier attempts of getting things working.

Benchmarking

Imagine you’re in an all-you-can-eat buffet. You can eat whatever as much as you want and the staff won’t kick you out for a few hours. Usually eating a lot feels like a good idea. However, after you’ve stopped eating, the reality of the situation settles in and you realize that it wasn’t a good idea. You feel sick.

The same applies to coding. It’s fun to code without limits. More features, more, MORE, you’ll think to yourself. However, after coding for a while you’ll realize that this wasn’t a good idea either. The program starts to become slow and sluggish. You’ve introduced latency to your code, and that is the second worst thing that can be done. The only thing worse is a 127 dBFS pop that was caused by careless buffer handling when you were starting out with audio signal processing.

To keep things in check, Catch2 has some simple benchmarking macros. There are a few example usages of those in FilterUnitTest-repo. It’s quite basic C++, meaning that it took me about three compilation attempts and one illegal memory access to get the syntax right. After some trials and a lot of errors, I ended up with something like this:

TEST_CASE("Processblock Benchmark", "[benchmarking]")
{
    testPluginProcessor = new AudioPluginAudioProcessor();
    juce::AudioBuffer<float> *buffer = Helpers::generateAudioSampleBuffer();

    juce::MidiBuffer midiBuffer;

    testPluginProcessor->prepareToPlay(44100, 4096);

    //Example of an advanced benchmark with varying random input
    BENCHMARK_ADVANCED("Plugin Processor Processblock ADVANCED")(Catch::Benchmark::Chronometer meter) {
        juce::Array<juce::AudioBuffer<float>> v;
        for (int j = 0; j < meter.runs(); j++) {
            v.add(*Helpers::generateAudioSampleBuffer());
        }
        meter.measure([&v, midiBuffer] (int i) mutable { return testPluginProcessor->processBlock(v.getReference(i), midiBuffer); });
    };

    delete buffer;
    delete testPluginProcessor;
}
Im Smart Parks And Recreation GIF - Find & Share on GIPHY
Attempting to read C++ documentation on lambdas and pretending to understand what’s going on.

Closing words

I hope you found this blog post useful. As mentioned, I was struggling to get started with JUCE and unit testing, so hopefully this writing helps you to think about how to test your application, assists in integrating a unit testing framework, and contains some useful and practical resources to get you started with testing. Also, I want to say that this type of FFT matching isn’t the only solution for unit-testing audio applications. You can for example remove the random elements from your tests, use pre-determined random seeds, or mock some parts of your code if needed. I’ve just found the FFT approach really intuitive and flexible after I got my head wrapped around it. Thanks for reading!

Comments are welcome but due to a quite hefty amount of bot spam, the comments will go through moderation so it may take some time to see your prose in the comments section. As long as you’re not trying to sell Viagra or women’s haircuts to me it’ll eventually appear there. If you happen to be going to ADC 2022 feel free to let me know!

GIF by South Park  - Find & Share on GIPHY
I don’t know the context of this gif, but knowing it’s South Park it must be something very nice and wholesome.

Delayyyyyy: v0.1.0, The First Release

Time for a celebration! The 0.1.0 release is here. It took quite a lot of effort to get everything in place and tested. As usual, the last mile turned out to be more difficult than expected. But in the end, things finally felt good enough for the first beta release. I broke the philosophy of having only one logical change in a single pull request by stuffing all the final changes into one PR, sorry about that. In this blog post, I’m going to go briefly through those commits that lead up to the beta release. The section headers are links to individual commits in case you want to judge my code.

Fix bug in short BPM synced delays

So, first things first, the cliffhanger I mentioned in the previous blog post: the bug in the synced delays. The bug was that when there is a high amount of echos, but not enough synced time signature values for them, the plugin went quiet. Instead, there should have been the few short echoes that were available, but still less than the specified amount. This was caused by prematurely breaking out of the loop that creates the delay buffers. Breaking out of the loop would have been fine if we would have been incrementing the iterator (i.e. creating the longest delay first). However, the loop iterates in reverse order (i.e. creating shortest delay first) and thus we can’t break from the loop before going through all synced time intervals.

It may not be a good idea to lie to the user about the final number of echos though.

This is how I imagine an average user.

Fix parameter state storage

I finally tried a simple saving of parameters in a digital audio workstation project. As it turns out, empty functions don’t do much of saving (or loading), and when I reloaded the project the carefully set parameters were gone. The solution was simple, actually it only consisted of copy-pasting code from an official JUCE tutorial. Not only for the saving but the loading as well. One feature isn’t much of use without the other.

Get BPM from a possible VST host

This is the big one. Basically, I did the following things to achieve this:

  • Run setDelayBufferParams in it’s own thread
  • Read BPM (beats per minute) from the playhead object
  • Detect if BPM changes

The first change was done out of necessity to make the second change possible. Playhead is an object that contains audio playback information, like the position of the audio track or the BPM (if available). However, this object can only be read in the processBlock-function. I can’t remember if I’ve mentioned this earlier, but processBlock-function should execute as quickly as possible to keep audio latency as small as possible.

So, we can’t run setDelayBufferParams in the processBlock because we want to keep things smooth. Instead, a separate thread needs to be created and notified in case the BPM (or any other parameter) changes. The thread also checks parameters every now and then by itself. For thread control, I tried to make a sort of one-sided locking where setDelayBufferParams always waits for the processBlock to finish (even if it’s not running, because it’ll run soon anyway), and then hopefully run before the next block has to be processed because the processBlock is a really busy function and it has no time to wait for anything or anyone.

However, thread-control based assumptions that things “should definitely work this way” will fail with 100% certainty. This is no exception. I’ll fix this in a future commit, but please keep this silly design in mind (and then forget it so you won’t create it yourself)

Add plugin header and version info

What’s worse than getting a bug report? Getting a bug report without the version information, or even the name of the product. This commit added a neat little header that shows the name of the plugin, the name of the creator, and even the version in the bottom right corner. I should have dropped the name of the creator though, so no one could send me bug reports.

This commit uses a neat UI feature of JUCE: removeFrom. This can be used to crop (and return) a certain section of the screen area. Something like this is perfect for the headers. Just remove a bit from the top, and the removed area is the new header area! No need to break the existing flexbox-based UI to try and fit header info in there.

Remove negativity, add Fonzie and update TODO

Does what it says, all changes here went to the README. There may be better issue management tools than just updating the README, but at least it’s not as bad as JIRA.

I have watched zero episodes of Happy Days in my life, hopefully I’m using this correctly.

Set the initial values of parameters to default values

This commit sets the initial values of the parameters to their default values. Would you have guessed? Basically makes the code a bit neater. However, this introduced a quite big bug that I somehow missed: the plugin makes no sound before the UI elements are edited because the delay buffers only get created if the parameter values change. If the parameter values are initialized to their default values, the buffers won’t get created before UI changes to something non-default. This programming bug is equivalent to a fist-sized cockroach in real life. I reverted this change here shortly after merging it, but I may need to revisit the idea some other day.


And then I thought that I was ready for the 0.1.0 release. I did some final testing of the plug-in, fiddling around and making sounds. Life was beautiful. Until suddenly, a deafening “pop”-sound occurred. Ableton recorded it to be 173 dBFS. Fortunately, I wasn’t using my headphones, otherwise my head would have exploded. People from miles away heard the sound of my mistake, which was slightly embarrassing.

Fun fact: the loudest sound on earth has most likely originated from the Krakatoa volcano eruption in 1883. It was recorded to be 172 decibels 100 miles away. And that’s probably the most educational thing in this blog so far.

Well, the reality is a bit less exciting to be honest. The sound wasn’t that loud, and I guess Ableton just calculated it “a bit” wrong. But there was a pop, and it was reproducible easily: just by wiggling the controls of the plugin for a minute or two.

And this is where my “slightly” “stupid” approach to thread control comes in.

Fix race-condition in setDelayBufferParams and processBlock

Basically, it’s dumb dumb very very dumb to assume that some things can and will always run in a certain time frame. In the end, it’s better to have a bit more latency than weird popping audio artifacts. Basically, the sound occurred when setDelayBufferParams didn’t finish before the processBlock was called. As you remember, processBlock waits for no one. So setDelayBufferParams was happily replacing/removing the delay buffers and processBlock was happily reading those garbled memory locations, and a pop sound occurred. That’s why I added a wait to the processBlock so that it won’t run too early. Not the best possible design, but it’ll have to do for now (and most likely until the end).

Add companyName for VST information

Finally, while testing out the plug-in I noticed that the company name wasn’t set correctly (it said yourcompany or something like that), so I fixed it to point to my musical alter ego.

And that’s it! That’s the story of how 0.1.0 came to be. I made one final PR where I added the ready-built x64 VST3-file and updated the README with the known issues of the plugin being a CPU hog and popping occurring while editing parameters. The CPU optimization is worth another story, but I already have a few ideas for that (and for the popping as well). But meanwhile, you can download the VST3 plugin here:

Delayyyyyy_0.1.0 release

This is how it feels to optimize. Even the quality of the GIF is optimized.

Delayyyyyy: BPM synced delay effect

As a hobby project, I have been working on a VST plugin lately. For people who are unaware of what a VST plug-in is (I assume this is most of the people), it’s basically an audio effect or instrument used with DAWs (digital audio workstations). To simplify the concept even further, it’s a piece of software that either creates or modifies an audio or MIDI signal.

The plugin I’ve been working on for the past few months is an audio delay/echo plugin. It takes an audio signal and repeats it a given amount of times with some certain maximum delay value. It also has few simple options: decay time for deciding how quickly the repeated signal levels decay, and a “ping pong” to alternate the left and right sides of the signal.

Ta-dah! This how the plug-in looks at the moment.

Now that I’ve briefly explained what the project is about, I can talk about my latest addition to the plugin: BPM synced delay (basically a delay that is synced to the rhythm of the music). I’ll admit, this may feel a bit like jumping into a moving train when there’s existing code that hasn’t been explained in previous blog posts, but I’ll try my best to keep this simple & understandable. Here’s a link to the PR that contains the BPM sync feature I’ll be talking about in this blog post.

There are changes to two classes, PluginEditor and PluginProcessor. The former is responsible for the UI, and the latter for the actual signal processing.

In the PluginEditor changes were quite simple. I added a BPM sync check box that toggles the visibility of the synced and unsynced delay sliders. I actually had most of the stuff already in place, just commented out (commenting out code is definitely the best method of version control, trust me on this, I’ve had “senior” in my title for three whole months /s). Checkbox’ value is also used in the PluginProcessor side to see if the delay should be synced or not. I’m using CustomSliders to have a custom look and feel on the sliders, but everything should work on a regular slider class equally well (or even better).

There are also some UI gymnastics going on to get everything aligned correctly. The UI should be resizable (kinda), which causes its own headaches there. Possibly the most interesting thing is the custom textFromValueFunction that allows showing musical time signatures instead of raw slider values in the slider value box. Besides these, I also had to inherit the Button listener to be able to update UI and delay parameters whenever the user would click the button. There are more elegant ways to do this, but I think for two lines of code in a hobby project this is a sufficient solution.

Lazy day every day

Simple so far? Well, things should get a bit more interesting (i.e. complicated) in PluginProcessor. First of all, I had to create parameters for the synced delay toggle and synced delay amount to ValueTree. ValueTree is a class that, you guessed it, stores different values and helps handling them.

However, maybe the most important functional changes in the PR are to the setDelayBufferParams-function, and that could definitely use some explanation. The delay in this plugin works by creating a vector of size N, where N is the number of echos that we hear back. This vector is then filled with ring buffers, and these buffers have read and write pointers that are offset by a certain amount from each other. There’s a picture after this, and I hope that it’ll explain the idea better than two chapters of words and poorly thought-out mathematical formulas.

The offset between read and write is calculated from the maximum delay time X and echo amount Y user set from the UI. The first buffer gets a value of X/(2(Y-1)), second X/(2(Y-2)), and so on, until (2(Y-n)=0. In that situation, the last buffer just gets the value X. This creates an interesting (yeah for sure) non-linear, yet fairly symmetrical delay effect. Please note that the first echo buffer has the smallest offset between read and write pointers.

ANYWAY, after this offset is calculated, the write pointer is set to this value while the read pointer is set to zero. When there’s an audio signal coming in, there’s first a read performed on the buffer to obtain the possible previous, delayed signal. Then the incoming signal with possible ping pong and decay applied is written to the buffer. Finally, the earlier read value is added to the incoming signal. This is then repeated for all the buffers. Since each buffer has a different, bigger than zero offset between write and read pointers, the audio signal gets repeated at different intervals, creating a delay effect. Wowzers!

In this example, the maximum delay is set to one second, and there are four echoes. I’ve been told to never apologize using Paint in explaining things, but hopefully this makes sense.

Now, how does this idea work in a rhythmically synced world? Well, time for a little music theory for a change. 97% of the modern western music is in 4/4 time signature, meaning that there are four kick drum hits in a bar, and music is constructed of building blocks with a size of four bars (gross oversimplification, but I don’t want to end up neck-deep in music theory because I’ll explain it wrong). So in a track with a BPM of 120 there are 120 kick drum hits in a minute, and 30 bars in a minute.

This gets a bit subjective from here, but I personally like synced delays to be divisions and multiples of two from the previously calculated value, starting from 1 bar, to create a nice rhythm that sounds good on most music. This means that the delay gets values like 1/2, 1/4, 1/8, and 1/16 when going shorter than one bar, and values like 2, 4, 8 & 16 bars when lasting longer than one bar. When we know the beats-per-minute, or BPM, it becomes fairly trivial to calculate the write and read pointer offsets that are rhythmically synced. Instead of using the user set value for the maximum delay, we calculate the time for a certain amount of beats in a certain BPM.

Looks similar, doesn’t it? Max delay set to 1 bar, and echoes to 4. The difference is that the duration of the delay doesn’t depend on the time, it depends on the tempo of the music

Hopefully, you’ve understood what I’ve tried to say with this. To make this code change slightly more confusing, there are some optimization changes thrown in there as well. Basically, earlier I had a static 10-second size for the delay buffers. However, with synced delays, this may not be enough, and for the shorter un-synced delays, this will be way too much. So I set the buffer’s size to be equal to the write pointer offset because there is no point in having the buffer be longer than that. The ring buffer will rotate more often, but that isn’t really an issue because old values are not needed.

And that’s the process of how I did the BPM synced delays to my VST plugin. The theory is quite simple once you get the idea of how I implemented the delay buffers, after that it’s just a matter of modifying the write pointer offset. The rhythm is still hardcoded to 120 BPM, but it’ll be fixed in the next PR. There’s also one bug in this implementation that I’ve fixed on my local branch already, but I’ll leave details on that as a cliffhanger to my next blog post.