Random ramblings

Fixing Stability Issue In The Blog Server

The biggest fans of this blog (or just the people usually browsing between 6:00-7:00 UTC) may have noticed a frustrating issue where the site occasionally loads really slowly or, in the worst-case scenario, refuses to load at all and returns only an error page with a message about a failing database connection.

Well, at least the message is short and to the point.

Investigating Issue

This issue started to occur sporadically in August and became consistent in October. And I started to consider fixing it in November. This kind of relaxed response time is common for hobby projects. The first obvious step to fix the issue was to check what was going on in the server when load times got longer. Once I noticed that the site was slowing down, I checked the monitoring stats. From the graphs, I saw that both CPU usage and disk reads were spiking. CPU was peaking at 80%, and disk reads stayed above 100 MB/s for over 15 minutes. The 7-day monitoring graph showed that this kind of spiking was happening almost daily.

Not every day though, and some spikes are taller than others.

Investigating the system log and comparing it with the time stamps of the peaks revealed the following cycle:

  1. One of the two daily apt package manager upgrade services gets started
  2. The CPU and disk activity starts ramping up
  3. The system starts heavy swapping and the website load times get longer
  4. About 15-20 minutes after the apt service starts, the OOM (out-of-memory) killer kicks in and stops MySQL. A few other services may time out or get killed in this phase as well.
  5. MySQL restarts and the blog works again
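
The comparison of log entries against the spikes can also be done mechanically. Below is a minimal Python sketch of that idea, not the method used in this post: it scans syslog-style lines, remembers the latest apt service start, and reports how many minutes later each OOM kill happened. The sample lines, host name, PID and year are made up for illustration, and the exact wording of the log messages is an assumption that may differ between systemd and kernel versions.

```python
import re
from datetime import datetime

# Illustrative syslog-style lines; host name, PID and times are made up.
SAMPLE_LOG = """\
Nov 12 06:10:00 blog systemd[1]: Starting Daily apt download activities...
Nov 12 06:27:30 blog kernel: Out of memory: Killed process 812 (mysqld)
Nov 12 06:27:35 blog systemd[1]: Started MySQL Community Server.
"""

# Assumed message wording; adjust the patterns to match your own logs.
APT_START = re.compile(r"Starting Daily apt ")
OOM_KILL = re.compile(r"Killed process \d+ \(([^)]+)\)")
ASSUMED_YEAR = "2000"  # syslog timestamps carry no year, so one is assumed


def parse_timestamp(line):
    """Parse the leading 'Mon DD HH:MM:SS' stamp, or return None."""
    try:
        return datetime.strptime(ASSUMED_YEAR + " " + line[:15], "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def oom_kills_after_apt(log_text):
    """Return (process name, minutes since last apt start) per OOM kill."""
    last_apt_start = None
    kills = []
    for line in log_text.splitlines():
        when = parse_timestamp(line)
        if when is None:
            continue  # not a syslog-style line
        if APT_START.search(line):
            last_apt_start = when
        killed = OOM_KILL.search(line)
        if killed and last_apt_start is not None:
            minutes = (when - last_apt_start).total_seconds() / 60
            kills.append((killed.group(1), minutes))
    return kills


print(oom_kills_after_apt(SAMPLE_LOG))
```

With real logs, delays that keep landing in the same 15-20 minute window would point at the same cycle as the list above.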

I started investigating why the daily apt services seemed to constantly cause the server to run out of memory. The first of the two services downloaded the packages for upgrading, and the second one installed the downloaded upgrades. After trying out a few different things I realized that just installing or removing a package caused the server to randomly run out of memory if either of the apt services was started a few minutes earlier. It’s fun to do tests like this on a live server.

Fix Attempt 1: Installing System Upgrades

Some further investigation into the daily apt services revealed that the unattended upgrades had been failing for a long time. It seemed like the MySQL apt repository was missing keys, causing the apt update to fail. Also, it seemed like some upgrades required input from the user to configure packages. So I took a server backup and started installing the upgrades manually.

This isn’t foreshadowing. At least yet. Let’s see in three months.

Out of 141 packages, 127 wanted an update, which is “quite many” (to put it lightly). Fortunately, I have made no promises about the availability of this site, so I could liberally reboot the server as much as I needed for the upgrades. I was hoping that installing these pending upgrades would clear some cache that would reduce the RAM usage of the apt services. And in the worst-case scenario, it wouldn’t fix the issue but I would get an up-to-date server, so upgrading seemed like a win-win.

In addition to the upgrades I also installed an improved DigitalOcean monitoring service. This actually revealed something that should have been quite obvious from the beginning. The new monitoring service monitored RAM usage (the old one did not), and I could see that the server was using 90% of the RAM when it was idle. In hindsight, checking the RAM usage and monitoring how it gets consumed should have been the very first step when investigating an OOM issue.
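
For a quick check without any monitoring service, the kernel exposes the same information in /proc/meminfo. The following is a small Python sketch, assuming the MemTotal and MemAvailable fields are present (they are on reasonably recent Linux kernels). The sample numbers are made up to mirror the 90% figure, and the monitoring graph may define “used” differently.

```python
# Made-up sample in /proc/meminfo format (values are in kB).
SAMPLE_MEMINFO = """\
MemTotal:        1000000 kB
MemFree:           40000 kB
MemAvailable:     100000 kB
"""


def used_ram_percent(meminfo_text):
    """Share of RAM that is not available to new processes, in percent."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        values = rest.split()
        if values:
            fields[name.strip()] = int(values[0])  # drop the trailing unit
    total = fields["MemTotal"]
    available = fields["MemAvailable"]
    return 100.0 * (total - available) / total


print(used_ram_percent(SAMPLE_MEMINFO))
```

On the server itself the input would be the content of /proc/meminfo instead of the sample string.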

Needless to say, 90% RAM usage is not good. I guess this happens because I’m running this blog on a low-end instance that doesn’t have much RAM (I actually checked the minimum requirements of the OS, and the instance barely meets even those). However, before investigating the insufficient RAM, I wanted to first see if the upgrades would fix the original OOM issue. They did not.

So, the problem started to look like a case of insufficient RAM. To fix this kind of issue, there are usually two options: scale the server up or scale the services down. In other words, throw money at the problem, or try to optimize the server. Being a cheapskate, I chose the latter option. I usually work with embedded things, so “just adding more RAM” feels like cheating. Besides, with about 20 daily visitors on average, beefing up the server seems like the wrong direction.

Fix Attempt 2: Optimizing RAM Usage

I used top to check the biggest memory consumers, and found two RAM gluttons: MySQL and Apache. Both are required for the well-being and existence of WordPress (the platform this blog runs on), but perhaps they could be optimized. After all, they used to work on this server before, so perhaps they could be configured to work once again.
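
top lists every process separately, which makes it easy to underestimate a service that is split across many workers. Here is a rough Python sketch of summing memory per command name instead. It is not taken from the original investigation: the input is assumed to be the output of `ps -e -o rss,comm`, and the sample figures are made up.

```python
from collections import Counter

# Made-up sample in the format printed by `ps -e -o rss,comm` (RSS in kB).
SAMPLE_PS = """\
   RSS COMMAND
400000 mysqld
 60000 apache2
 70000 apache2
  5000 sshd
"""


def rss_by_command(ps_output):
    """Sum resident memory per command name, largest consumer first."""
    totals = Counter()
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 1)
        if len(fields) == 2 and fields[0].isdigit():
            totals[fields[1].strip()] += int(fields[0])
    return totals.most_common()


for command, rss_kb in rss_by_command(SAMPLE_PS):
    print(f"{command}: {rss_kb} kB")
```

Keep in mind that summing RSS counts memory shared between workers more than once, so the totals are an upper bound rather than an exact figure.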

In the case of MySQL, there was a single mysqld daemon that was consuming plenty of RAM. Some googling revealed that disabling the performance schema could help lower memory consumption. It seems to be a feature that measures the performance of the MySQL database server. Considering that I’m using WordPress and hope to write zero direct database queries, that seemed optional. Perhaps such stats could be useful when developing new software on top of MySQL. Disabling the performance schema lowered the mysqld RAM consumption from 39% to 19%.

In the case of Apache, there were ten worker threads, each consuming about 5%-8% of RAM. If my math is correct, in the last month I had about 0.00083 concurrent visitors on average. With that in mind, ten worker threads felt a bit excessive, and I reduced their number. I think it could be lowered even more, but I wanted to have enough workers in case there’s a sudden influx of readers.
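
The post does not say how the concurrency figure was computed. One common way to estimate such a number is Little’s law: average concurrency equals the arrival rate multiplied by the time each visitor spends on the site. The Python sketch below uses the roughly 20 daily visitors mentioned earlier, while the 60 seconds per visit is purely an assumption, so the result (about 0.014) is not meant to reproduce the number above.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_concurrent_visitors(visits_per_day, seconds_per_visit):
    """Little's law: average number in the system = arrival rate * time in system."""
    arrivals_per_second = visits_per_day / SECONDS_PER_DAY
    return arrivals_per_second * seconds_per_visit


# 20 visits a day is from the post; 60 seconds per visit is an assumption.
print(average_concurrent_visitors(20, 60))
```

Whatever visit length is plugged in, the result stays far below one, which is the point: ten workers are plenty for this kind of traffic.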

Aaaany day now.

Conclusion

These actions took the idle RAM usage from 90% down to 60%. After this drop, I haven’t seen the OOM killer get activated in the past seven days, so I hope the issue is fixed. 60% is still a bit more than I’d like, but as long as the server stays stable and the performance doesn’t notably degrade I think that’s an acceptable percentage. Also, using the cheaper virtual machine saves me $6 a month!

The root cause for the increased RAM usage is still a bit of a mystery. I suspect that installing WordPress plugins caused it, because I was installing SEO plugins around the time the issue became more prevalent. If there’s one thing I’ve learnt from this, it’s that updates should be checked manually every now and then, and the consumption of system resources should be constantly monitored.

Open-source contribution: chdir for BusyBox

Coming soon to the Linux box near you:

Hopefully this doesn’t age like milk

So yeah, I managed to get a commit into one of the open-source projects that I use on a daily basis: BusyBox. I guess many others use it too, either knowingly or unknowingly. BusyBox is a software suite providing plenty of Unix utilities in a minimized single executable. For example, when you’re using the dmesg command you don’t necessarily know if the implementation comes from util-linux or BusyBox. But if you’re using OpenWrt, Alpine or Yocto you’re most likely using the BusyBox version.

The Problem

Because the BusyBox binary is minimized, the utilities it provides are often missing lesser-used features. As mentioned in the previous Aioli devblog, start-stop-daemon is, for example, missing the -d/--chdir option present in the full Debian counterpart. As also mentioned there, I wrote a patch to add that feature. What I didn’t really mention is that I submitted the patch to the BusyBox mailing list. I was hoping that it would get applied, and eventually it did!

start-stop-daemon is a program that’s commonly used in SysVinit scripts to control the lifecycle of system services. It doesn’t only start and stop daemons, it can also reload them, check their status and… well, that’s pretty much it. The --chdir option changes the working directory of the start-stop-daemon process before it launches the program it’s been assigned to start. Because a child process inherits the working directory of its parent, this effectively changes the working directory of the process that actually gets started.
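
The mechanism is easy to try out in isolation. The following Python sketch is only an illustration of the idea and has nothing to do with the actual BusyBox C code: the launcher changes its own working directory and then starts a child process, which reports the directory it ended up in. The function name and the temporary directory are made up for the demo.

```python
import os
import subprocess
import sys
import tempfile


def launch_after_chdir(workdir, argv):
    """Change our own working directory, then start argv as a child process."""
    previous = os.getcwd()
    os.chdir(workdir)  # the launcher itself moves, as with -d/--chdir
    try:
        child = subprocess.run(argv, capture_output=True, text=True, check=True)
    finally:
        os.chdir(previous)  # restore only so this demo has no side effects
    return child.stdout.strip()


# The child just reports where it finds itself.
REPORT_CWD = [sys.executable, "-c", "import os; print(os.getcwd())"]

with tempfile.TemporaryDirectory() as workdir:
    reported = launch_after_chdir(workdir, REPORT_CWD)
    child_started_in_workdir = os.path.realpath(reported) == os.path.realpath(workdir)

print(child_started_in_workdir)
```

The child never calls chdir itself. It simply inherits whatever directory the launcher was in at the moment it was started.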

The Solution

The patch for this feature was quite straightforward. Mostly it consisted of adding a variable to hold the new working directory, inserting the new -d option to the opt list for the option parser, and editing the usage message. Then, if the new option flag was set, it was just a matter of calling xchdir() from libbb (BusyBox’s library) to change to the given directory (or die).

The less popular sequel to “Skate or Die” and “Ski or Die”.

In addition to this, I looked at how the tests for BusyBox work and wrote tests for the new flag. And cleaned up the TODO. In the end, the commit delta ended up being less than 60 lines. From what I’ve understood of the commit stats, the start-stop-daemon got bloated by about 79 bytes as a result. So the next time you’re updating BusyBox and curse the fact that it doesn’t fit into your root file system that has 67 bytes of free space remaining, you know who to blame.

All in all, getting the patch merged was an interesting process. I could definitely contribute more to BusyBox if there are suitable issues. Something perhaps a bit less simple the next time. But whether there will be more commits or not, it’s wild to think that my code could be running in Linux boxes around the world. Although, I guess that would require the device vendors to update their devices to run the new (still unreleased) version of BusyBox, so I guess it’s not happening too soon.

Open-source contribution: RTL8821AU driver recipe

This is a story of how I became a useful member of society by doing my first open-source contribution.

It all began one fateful afternoon, when I purchased a TP-Link Wifi dongle, thinking that it would allow me to connect my old Raspberry Pi 2 wirelessly to the internet. It was running my own Poky-based distro, but what could really go wrong with random USB devices and Linux?

Well, quite a lot really. I plugged the device in, but I couldn’t connect to the highway of data. No delicious internet cookies for me. Not even a blinking LED.

To begin troubleshooting the issue, I checked whether the network interface was seen by the kernel by running both ifconfig -a and ip link show. No wlan devices were found. Some googling suggested running lsusb. That showed the device, which at least proved that it wasn’t broken and was recognized by the kernel. Some sort of network driver was clearly needed.

Bus 001 Device 004: ID 2357:011f TP-Link 802.11ac WLAN Adapter 
Bus 001 Device 003: ID 0424:ec00 Microchip Technology, Inc. (formerly SMSC) SMSC9512/9514 Fast Ethernet Adapter
Bus 001 Device 002: ID 0424:9514 Microchip Technology, Inc. (formerly SMSC) SMC9514 Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

Finding the correct driver turned out to be a bit tougher than expected. First I tried googling the name of the wifi dongle suffixed with “driver”. Bad Idea. This led to a lot of ancient forum posts that suggested all kinds of Realtek drivers for (almost) similarly named devices, to be installed by enabling a variety of kernel configuration options. None of the drivers worked.

After some more furious googling I found out that the wifi dongle I bought required an out-of-tree kernel module instead. That meant I couldn’t just enable a kernel configuration option to build the driver in my distro. Finding the correct driver was another trial-and-error affair. Someone suggested a driver for the 8812au chip. It did not work, but it helped me find the correct trail.

Fortunately there’s a lot less diseases on this trail.

The RTL8812AU driver repo contained a file, supported-device-IDs, that as expected did not contain the device ID output by lsusb. However, that gave me an idea (that I really should have had from the beginning): googling “driver 2357:011f”. Who would have guessed that searching for a driver with an exact device ID instead of vague product names would yield the correct driver(s)? This search also helped me find the name of the Realtek chip, 8821au, which I confused plenty of times with 8812au. I’m not sure if this info would have been available in the manual of the dongle because I did not read it.
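
Picking the ID out of the lsusb output by eye works fine, but it can also be scripted. Here is a small Python sketch that extracts the vendor:product IDs and turns them into the kind of search string described above. The function names are made up, and the regular expression assumes the usual “Bus … Device …: ID vvvv:pppp description” line format.

```python
import re

# The first two lines of the lsusb output shown earlier in this post.
LSUSB_OUTPUT = """\
Bus 001 Device 004: ID 2357:011f TP-Link 802.11ac WLAN Adapter
Bus 001 Device 003: ID 0424:ec00 Microchip Technology, Inc. (formerly SMSC) SMSC9512/9514 Fast Ethernet Adapter
"""

LSUSB_LINE = re.compile(r"^Bus \d+ Device \d+: ID ([0-9a-f]{4}:[0-9a-f]{4})\s*(.*)$")


def usb_devices(lsusb_text):
    """Return (vendor:product ID, description) pairs from lsusb output."""
    devices = []
    for line in lsusb_text.splitlines():
        match = LSUSB_LINE.match(line.strip())
        if match:
            devices.append((match.group(1), match.group(2)))
    return devices


def driver_search_queries(lsusb_text, keyword):
    """Build 'driver <ID>' search strings for devices whose description matches."""
    return [
        f"driver {usb_id}"
        for usb_id, description in usb_devices(lsusb_text)
        if keyword.lower() in description.lower()
    ]


print(driver_search_queries(LSUSB_OUTPUT, "wlan"))
```

The same ID can also be compared against lists such as the supported-device-IDs file mentioned above.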

After finding the driver and chip I connected some dots and realized that there actually is a kernel configuration option named CONFIG_RTL8XXXU, which I had already tried. Despite what the name suggests, it does not work with rtl8821au.

Once the correct driver was figured out it was time to add it to the Yocto build. Some more googling revealed that there is a meta layer called meta-rtlwifi for these Realtek out-of-tree modules. Unfortunately, it didn’t contain the RTL8821AU driver. Fortunately, I’ve been using git at work so I could fix that myself. You can see where this is heading.

So I took the RTL8812AU driver recipe as I suspected that it should mostly work, and updated the relevant parts, i.e. the repo to fetch the driver from. I was pleasantly surprised that the build worked just like that. Even more shocking was that the module worked as well. After that, it was just a matter of a pull request to get the driver added to the meta-layer alongside the other friendly drivers.

There were actually multiple drivers available for 8821au. At least morrownr, ulli-kroll and ivanborislav provide RTL8821AU drivers. In the end, I chose the morrownr driver because it worked satisfactorily out of the box and the same author’s driver is also used for 8812au. I first gave the ivanborislav driver a shot, but it filled my TTY with logs about power save mode. Most likely a configuration mistake on my side, but usually the thing that works without extra tinkering is the better choice.

It’s almost weird that there’s a meme for literally everything.

That’s how I got quite familiar with my wifi dongle, and made my first open-source contribution in the process. I also learned something. I’m not yet 100% sure what that is. Perhaps it’s that the device id is quite important when trying to find a suitable driver. And googling can give all kinds of interesting useful information. Until next time!

How to build Fritzing for Windows

Fritzing is an open-source tool used to design and draw electrical wiring circuit diagrams, like this:

It can also be used to draw schematics and PCB diagrams, so it’s a really handy program indeed.

Why am I covering an open-source tool’s build process here? Because it was a surprisingly difficult process. You can buy the prebuilt version for 8€ and support the developer, but if you’re an adventurous soul (or a cheapskate) you of course want to build it yourself. I’m going to say that it took me almost a full afternoon to figure this build out, so the hourly rate wasn’t really that good, and the software is so useful that I’ll most likely be paying that 8€ anyway.

Here you can find the official Fritzing build instructions wiki. It’s a bit confusing at some points, but I’ll mostly follow the steps there, and mention it when I don’t.

This guide assumes you have a Windows machine with git, Visual Studio 2019 and a “sufficiently new” CMake installed.


Installing (correct) Qt version

If you have not installed Qt already, this step is quite straightforward. Download the Qt installer from here, agree with their open source policy and install the 5.15.2 version of Qt.

ERRATA: I did some more digging after publishing this text and realized that there actually is a simpler way to install multiple Qt versions than described in the next chapters. In the root folder of the Qt installation there is a tool named MaintenanceTool.exe. This tool can be used to install, remove and update versions of Qt. So use it instead of following these instructions. However, I’ll leave the chapters here as proof that not everybody on the Internet is smart.

However, if you’ve already installed a different version of Qt, this step is a bit trickier to complete (or at least I managed to make it difficult for myself). Qt Creator (=Qt’s IDE) itself doesn’t allow downloading different versions of Qt, and the official documentation only says that if you want to add a new version of Qt to Qt Creator you need to locate a qmake file. But where is this qmake file?

In the end, I couldn’t quite find a satisfactory answer to this. What I did was use the same installer as for a clean install, and install the desired version of Qt in a different location. However, the installer forces you to install Qt Creator again, so it pollutes your system a bit.

If you’re not seeing the version you’d like to see, try checking the Archive & LTS boxes from the menu on the right.

After the version is installed, it can be linked in Qt Creator using this guide. Basically, just navigate to Edit->Preferences->Kits->Qt Versions->Add in Qt Creator and add the mythical qmake executable there (the executable is in a path something along these lines: Qt\5.15.2\msvc2019_64\bin\qmake.exe).

There is a small possibility that I’m just dumb and there is an “Install Qt version” button somewhere in the depths of the Qt Creator and I just couldn’t find it. But this approach at least works, even though it installs a bit of extra to the system. I also found the source packages for different versions, but didn’t feel like compiling Qt just to get Fritzing up and running.

Downloading the sources

There are a few repos that need to be pulled for the build. The first one is obviously the Fritzing app itself. Besides that, boost version 1.x.0 and libgit2 version 0.28.x are needed for building. For running the application you’ll also want the Fritzing parts repo to get some actual components for your diagrams. All these repositories should be placed side by side so that you’ll end up with something like this:

Versions I used were:

  • fritzing-app: f0af53a9077f7cdecef31d231b85d8307de415d4
  • fritzing-parts: 4713511c894cb2894eae505b9307c6555afcc32c
  • libgit2: v0.28.5
  • boost: 1.79.0

Compiling dependencies

The next step is to compile libgit2. This is where I hit a big problem. The Fritzing wiki says to build with -DBUILD_SHARED_LIBS=OFF. However, with this flag, the build doesn’t output the .dll file required later on in the build. So the commands I actually used to build libgit2 were:

cd libgit2
mkdir build64
cd build64
#Note that BUILD_SHARED_LIBS is ON
cmake .. -G "Visual Studio 16 2019" -A x64 -DBUILD_SHARED_LIBS=ON -DBUILD_CLAR=OFF
cmake --build . --config Release

Boost and Fritzing parts shouldn’t require any compilation at this stage.

Compiling Fritzing

Next, we get to open Qt Creator and open the phoenix.pro file located in the root of the fritzing-app folder. Configure it for Qt 5.15.2 and, as the build wiki instructs, add the following to Projects->Run->Command Line Arguments in Qt Creator:

-f "/path/to/fritzing-app/" -parts "/path/to/fritzing-parts/" -db "/path/to/fritzing-parts/parts.db"

After this is done there’s still one more hurdle to overcome. Building now seems to result in this error:

error: dependent 'F:\Esa\Documents\Fritzing\debug64\ui_fabuploaddialog.h' does not exist.

To fix this issue, we need to navigate to the build folder that gets generated alongside the fritzing-app folder and is named like build-phoenix-* (the rest of the name depends a bit on your build configuration). There we need to use Qt’s jom to build the compiler_uic_make_all target:

P:\ath_to_QT\Tools\QtCreator\bin\jom\jom.exe -f Makefile.Debug compiler_uic_make_all

This will generate the missing headers to the debug64-folder that’s also alongside the fritzing-app and build folders. After this, the build should be as simple as clicking the green arrow in Qt Creator. If you get a boost-include error, make sure you have the boost folder directly under the boost_1_79_0 folder, and that you don’t have a structure boost_1_79_0\boost_1_79_0\boost as I did. This wrong structure resulted in the following error:

F:\Esa\Documents\Fritzing\fritzing-app\src\svg\groundplanegenerator.cpp:40: error: C1083: Cannot open include file: 'boost/math/special_functions/relative_difference.hpp': No such file or directory
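
The folder structure can be checked before starting the build. Below is a small Python sketch that tells a correct boost folder apart from the doubled one. The function name is made up, and the header path is taken from the error message above. The demo at the end builds the wrong structure in a temporary directory only to show the result.

```python
import tempfile
from pathlib import Path

# Header named in the compiler error above, relative to the boost root.
HEADER = Path("boost/math/special_functions/relative_difference.hpp")


def boost_layout(boost_root):
    """Classify a boost folder as 'ok', 'nested' (doubled folder) or 'missing'."""
    root = Path(boost_root)
    if (root / HEADER).is_file():
        return "ok"
    if (root / root.name / HEADER).is_file():
        return "nested"
    return "missing"


# Demo: recreate the doubled boost_1_79_0\boost_1_79_0\boost structure.
with tempfile.TemporaryDirectory() as scratch:
    outer = Path(scratch) / "boost_1_79_0"
    header = outer / "boost_1_79_0" / HEADER
    header.parent.mkdir(parents=True)
    header.touch()
    outer_result = boost_layout(outer)
    inner_result = boost_layout(outer / "boost_1_79_0")

print(outer_result, inner_result)
```

If the check reports the nested case for your boost folder, moving the inner folder up one level should be enough.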

Once the build completes, Fritzing will start. Because the -db argument is given on the command line, the actual program won’t open. Instead, Fritzing generates the parts database and closes itself after the process finishes.

Running Fritzing

Congratulations, you should now have built & prepared Fritzing! After the initial build & database generation remove the -db argument from the run arguments so that Fritzing starts properly with Qt Creator. This type of launch was a good enough solution for me, and I didn’t feel like going through the hassle of creating an actual executable for Fritzing. I think I can pay 8€ for that pleasure.

Stay tuned for texts that include more Fritzing diagrams!

How to start a (tech) blog with WordPress & DigitalOcean

Haven’t we all dreamt about it? Writing some simple tutorial blogs for people who don’t want to go through man-pages and watching the ad revenue pour in while building the personal brand to ensure success in every aspect of life. The only problem: you don’t have that blog yet. Fortunately, there are both simple and not-so-simple solutions to this!

Please note that despite the clickbaity title this isn’t really a step-by-step tutorial, although it may be useful when starting out with the technologies mentioned in the title. It’s more of documentation of the things I did to get this site you’re reading now up and running.

Option A: The casual approach
Step 1: (well okay there are steps, but this still isn’t a tutorial)
Go to wordpress.com, create your free account and start writing.

This is definitely a valid approach, but it sounds easy, doesn’t it? A bit too easy one might say…

Option B: The tech approach
Step 1: Obtain the domain
Buy the domain you want. I personally use Namesilo as a domain registrar. It is cheap, and as a nice bonus, it often reminds you that you get what you pay for. There are other alternatives that could definitely be considered. GoDaddy at least seems to be commonly used, and many services provide documentation on how to get their service working with GoDaddy.

Whichever registrar you choose, the process isn’t much harder than ordering anything else online. Registrars often try to sell all sorts of extras, but you don’t really need any of those. If you like to wear hats made of tin foil like me, then WHOIS privacy is the most useful one if it’s offered.

Step 2: Obtain the server
Go to digitalocean.com and create an account. DigitalOcean is a cloud infrastructure provider, and what we want from them is a virtual machine instance that runs WordPress (and Apache, and MySQL, and PHP, and Linux, and…). Fortunately, we are not the first people who want to do this, and the process is really streamlined. DigitalOcean even talks about One-Click Install, although there definitely are more clicks than that. After you’ve created the account, go here and get started with the process:
https://marketplace.digitalocean.com/apps/wordpress

I recommend following DigitalOcean’s instructions because they most likely are going to be more up-to-date than mine. When there are questions about some extra services or premium CPUs or what-not, just say no. Your blog isn’t going to be a big hit in the first few days/months/years anyway, so no use wasting money on 12 CPUs yet. Backups might be useful though after you get the things rolling.

After you’ve clicked through the marketplace, you should be greeted with a screen where you can see your Droplet (DigitalOcean’s fancy name for a virtual machine). It’ll take a moment for the machine to initialize. When it’s done, you can click the Getting Started button in the middle of the screen to get started (these instructions will most likely be quite similar to the instructions in the marketplace link above):

Useful big red circle

The first step of the instructions, at least at the time of writing, was making an SSH connection to the Droplet. You can either use your favorite SSH client or DigitalOcean’s browser console. After the SSH connection was formed, an automatic helper script helped to finish the configuration (it seems that the SSH connection may close a few times if you try it before the machine has properly settled). The setup is mostly basic stuff, setting up the admin account, etc., but the script also asks about Let’s Encrypt. Skip it for now, as the DNS settings aren’t yet in place.

If you break something, fear not. You can trash the Droplet easily:

Useful big red arrow

Then just create a new one and start from the beginning. I did this a few times.

Step 3: Obtain the DNS
After the Droplet has been created and WordPress has been installed, it’s time to set up the DNS. Now, I’m going to be honest here: I don’t know much about this stuff, so that’s why I’m putting a lot of pictures in this section. Basically, there are two things that need to be done: changing the nameservers at your domain registrar to DigitalOcean’s nameservers, and setting up the DNS records in DigitalOcean. The first step varies a bit depending on which registrar you chose earlier. Here’s how it is done in Namesilo:

Select this thingy right here (stack of pancakes?)
…and enter the DigitalOcean nameservers.

Setting up the DNS in DigitalOcean is fairly simple. The following steps should create correct DNS settings almost automatically:

First this…
…and then this…
…and finally this. Or something similar to this. I’m not 100% sure if the www A-record is required, but at least it doesn’t hurt. I guess. I tried getting this right more times than I want to admit, and when I finally got things working I decided that I don’t want to touch the settings ever again.

The changes done in these steps will take some time to actually have an effect. I’ve seen a correct domain name resolution happen in five minutes, and I’ve seen it not happen in a few hours. I noticed that Firefox Focus is the most efficient browser at destroying its caches, so I’d recommend using it to verify that you’ve set up DNS correctly. The private mode of regular browsers doesn’t seem to help at all.
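
Another way to take the browser cache out of the equation is to ask the resolver directly. Here is a minimal Python sketch, assuming you know the IPv4 address of your Droplet. The address 203.0.113.10 and the fake_lookup helper are placeholders so that the example runs without network access. Note that the real lookup still goes through the operating system’s resolver, which may have a cache of its own.

```python
import socket


def resolves_to(domain, expected_ip, lookup=socket.gethostbyname_ex):
    """True if expected_ip is among the IPv4 addresses the domain resolves to."""
    try:
        _name, _aliases, addresses = lookup(domain)
    except OSError:
        return False  # the name does not resolve (yet)
    return expected_ip in addresses


def fake_lookup(domain):
    """Stand-in resolver so the example runs without network access."""
    return (domain, [], ["203.0.113.10"])


print(resolves_to("yourdomain.com", "203.0.113.10", lookup=fake_lookup))
```

For a real check, leave out the lookup argument and pass your own domain and the address of your Droplet.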

Step 4: Obtain the HTTPS
After you’ve set up the DNS properly, you can and definitely should set up HTTPS with Let’s Encrypt. This is actually quite straightforward. Just connect to the Droplet via SSH again and run the following command:

certbot --apache -d yourdomain.com -d www.yourdomain.com

I think you could actually set up the DNS before setting up the VM and create the certificates with the helper script during the initialization, but at least I did things the hard way.

Step 5: Obtain the stress of updating WordPress plugins and seeing the cash flow out of your pocket
At this point, most of the actually stressful work should be done. Unless you’re like me and already forgot your WordPress admin password. In case you have a better memory than I do, you should be able to navigate to yourdomain.com/wp-admin and log in to WordPress there. After that, the usage should be more or less the basic WordPress stuff, which is outside the scope of this “tutorial”.

Just remember to update your WordPress plugins. I already have outdated plugins after a whopping five minutes of use.

In case you’re a financially responsible person, it is nice to know that DigitalOcean allows you to track quite easily how much money has gone into maintaining your blog. It’s actually shown right in the upper right corner of the Droplet management screen (the number is updated once a day).

Option C: The true tech approach
Step 1: Set up your own server HW

I’m not going to do this.

But that’s the process of doing simple things the hard way! Or one method of doing so. At least you save a few bucks by hosting the server yourself (as opposed to buying premium WordPress elsewhere), and you already have a topic for your first blog post. Hopefully, the next one will have some actually informative content.