
Fixing Stability Issue In The Blog Server

The biggest fans of this blog (or just the people usually browsing between 6:00-7:00 UTC) may have noticed a frustrating issue where the site occasionally loads really slowly or, in the worst-case scenario, refuses to load at all. In that case only an error page containing a message about a failing database connection gets returned.

Well, at least the message is short and to the point.

Investigating the Issue

This issue started to occur sporadically in August and became consistent in October. And I started to consider fixing it in November. This kind of relaxed response time is common for hobby projects. The first obvious step to fix the issue was to check what was going on in the server when load times got longer. Once I noticed that the site was slowing down, I checked the monitoring stats. From the graphs, I saw that both CPU usage and disk reads were spiking. CPU was peaking at 80%, and disk reads were over 100MB/s for over 15 minutes. From the 7-day monitoring graph, it could be seen that this kind of spiking was happening almost daily.

Not every day though, and some spikes are taller than others.

Investigating the system log and comparing it with the time stamps of the peaks revealed the following cycle:

  1. One of the two daily apt package manager upgrade services gets started
  2. The CPU and disk activity starts ramping up
  3. The system starts heavy swapping and the website load times get longer
  4. About 15-20 minutes after the apt service starts, the OOM (out-of-memory) killer kicks in and stops MySQL. A few other services may time out or get killed in this phase as well.
  5. MySQL restarts and the blog works again
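
A minimal sketch of how the OOM kills in step 4 could be pulled out of the kernel log. It assumes the kernel writes a line containing "Killed process <pid> (<name>)" for each kill (the exact wording varies between kernel versions), and the function name `oom_victims` is made up for this example.

```shell
# Print the victims of the OOM killer found in kernel log text on stdin.
oom_victims() {
  # Assumes each kill is logged as "... Killed process <pid> (<name>) ...".
  sed -nE 's/.*Killed process ([0-9]+) \(([^)]*)\).*/\2 (pid \1)/p'
}

# Typical use on a systemd machine (reading the kernel log may need sudo):
#   journalctl -k --since "7 days ago" | oom_victims
# The apt timers can be listed with the following command. The unit name
# pattern is the one used on Debian/Ubuntu, which is an assumption here:
#   systemctl list-timers 'apt-daily*'
```

Comparing the timestamps of those log lines with the timer schedule is one way to confirm that the kills really follow the apt runs.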

I started investigating why the daily apt services seemed to constantly cause the server to run out of memory. The first of the two services downloaded the packages for upgrading, and the second one installed the downloaded upgrades. After trying out a few different things I realized that just installing or removing a package caused the server to randomly run out of memory if either of the apt services was started a few minutes earlier. It’s fun to do tests like this on a live server.

Fix Attempt 1: Installing System Upgrades

Some further investigation into the daily apt services revealed that the unattended upgrades had been failing for a long time. It seemed like the MySQL apt repository was missing keys, causing the apt update to fail. Also, it seemed like some upgrades required input from the user to configure packages. So I took a server backup and started installing the upgrades manually.

This isn’t foreshadowing. At least yet. Let’s see in three months.
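
A missing repository key like the one described above usually shows up in the output of an apt update as a `NO_PUBKEY` warning followed by the key ID. The sketch below extracts those IDs; the function name `missing_apt_keys` is hypothetical, and the actual key and repository on this server are not known from the post.

```shell
# Print the key IDs that apt reports as missing, reading `apt-get update` output on stdin.
missing_apt_keys() {
  grep -oE 'NO_PUBKEY [0-9A-Fa-f]+' | awk '{ print $2 }' | sort -u
}

# Typical use (2>&1 because apt prints these warnings on stderr):
#   sudo apt-get update 2>&1 | missing_apt_keys
```

An empty result means apt did not complain about keys, so a failing unattended upgrade would then have some other cause.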

Out of 141 packages, 127 wanted an update, which is “quite many” (to put it lightly). Fortunately, I have made no promises about the availability of this site, so I could liberally reboot the server as much as I needed for the upgrades. I was hoping that installing these pending upgrades would clear some cache that would reduce the RAM usage of the apt services. And in the worst-case scenario, it wouldn’t fix the issue but I would get an up-to-date server, so upgrading seemed like a win-win.

In addition to the upgrades I also installed an improved DigitalOcean monitoring service. This actually revealed something that should have been quite obvious from the beginning. The new monitoring service monitored RAM usage (the old one did not), and I could see that the server was using 90% of the RAM when it was idle. In hindsight, checking the RAM usage and monitoring how it gets consumed should have been the very first step when investigating an OOM issue.
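
For a quick check without any monitoring service, the idle RAM usage can be estimated straight from `/proc/meminfo`. This is only a sketch: it treats "used" as `MemTotal` minus `MemAvailable`, which may not be how the DigitalOcean graphs calculate their percentage, and the function name `ram_used_percent` is made up.

```shell
# Print the percentage of RAM in use, based on MemTotal and MemAvailable.
# $1: a /proc/meminfo-style file (defaults to /proc/meminfo on Linux).
ram_used_percent() {
  awk '/^MemTotal:/     { total = $2 }
       /^MemAvailable:/ { avail = $2 }
       END { if (total > 0) printf "%d\n", (total - avail) * 100 / total }' "${1:-/proc/meminfo}"
}

# Typical use on the server itself:
#   ram_used_percent
```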

Needless to say, 90% RAM usage is not good. I guess this happens because I’m running this blog on a low-end instance that doesn’t have much RAM (I actually checked the minimum requirements of the OS, and the instance barely meets even those). However, before investigating the insufficient RAM, I wanted to first see if the upgrades would fix the original OOM issue. They did not.

So, the problem started to look like a case of insufficient RAM. To fix this kind of issue, there are usually two options: scale the server up or scale the services down. In other words, throw money at the problem, or try to optimize the server. Being a cheapskate, I chose the latter option. I also usually work with embedded things, so “just adding more RAM” feels like cheating. And considering that I have about 20 daily visitors on average, beefing up the server seems like the wrong direction.

Fix Attempt 2: Optimizing RAM Usage

I used top to check the biggest memory consumers, and found two RAM gluttons: MySQL and Apache. Both are required for the well-being and existence of WordPress (the platform this blog runs on), but perhaps they could be optimized. After all, they used to work on this server before, so perhaps they could be configured to work once again.
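
Because top lists every process separately, a service with many workers can look smaller than it is. The sketch below sums the memory share per command name instead. It assumes the procps version of `ps` (for the `%mem` column), and `mem_by_command` is a name invented for this example.

```shell
# Sum memory share per command name.
# Input (stdin): lines of "<%mem> <command>"; output: "<total %mem> <command>", largest first.
mem_by_command() {
  awk '{ pct = $1; $1 = ""; sub(/^ +/, ""); sum[$0] += pct }
       END { for (cmd in sum) printf "%.1f %s\n", sum[cmd], cmd }' | sort -rn
}

# Typical use (the "=" after each column name suppresses the header line):
#   ps -e -o %mem= -o comm= | mem_by_command | head -n 5
```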

In the case of MySQL, there was a single mysqld daemon that was consuming plenty of RAM. Some googling revealed that disabling the performance schema could help lower memory consumption. It seems to be a feature that measures the performance of the MySQL database server. Considering that I’m using WordPress and hope to never write a single database query by hand, that seemed optional. Such stats could perhaps be useful when developing new software on top of MySQL. Disabling the performance schema lowered the mysqld RAM consumption from 39% to 19%.
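
The post does not say where the setting was changed, so the following is only one possible way to do it. `performance_schema` is a MySQL server option that is read at startup and goes under the `[mysqld]` section; the helper name, the file name and the `/etc/mysql/conf.d/` directory are assumptions based on a typical Debian/Ubuntu layout.

```shell
# Write a small MySQL config override that turns the performance schema off.
# $1: path of the file to create (chosen by the caller).
write_perf_schema_override() {
  cat > "$1" <<'EOF'
[mysqld]
performance_schema = OFF
EOF
}

# Typical use (file name, directory and service name are assumptions):
#   write_perf_schema_override /tmp/performance-schema.cnf
#   sudo install -m 644 /tmp/performance-schema.cnf /etc/mysql/conf.d/
#   sudo systemctl restart mysql
```

Keeping the change in its own small file makes it easy to revert if the statistics are ever needed again.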

In the case of Apache, there were ten worker threads, each consuming about 5%-8% of RAM. If my math is correct, in the last month I had about 0.00083 concurrent visitors on average. With that in mind, ten worker threads felt a bit excessive, and I reduced their number. I think it could be lowered even more, but I wanted to have enough workers in case there’s a sudden influx of readers.

Aaaany day now.
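
The arithmetic behind the Apache numbers can be sketched as follows. The inputs behind the 0.00083 figure are not given in the post, so the 3.6 seconds per visit below is purely an assumed value that happens to reproduce it when combined with the roughly 20 daily visitors mentioned earlier. Both function names are hypothetical.

```shell
# Average number of simultaneous visitors:
# visits per day * seconds per visit / seconds in a day.
avg_concurrency() {
  awk -v visits="$1" -v secs="$2" 'BEGIN { printf "%.5f\n", visits * secs / 86400 }'
}

# Combined RAM share of N workers, given a low and high per-worker estimate in percent.
worker_ram_range() {
  echo "$(( $1 * $2 ))-$(( $1 * $3 ))"
}

avg_concurrency 20 3.6     # 20 visits/day from the post, 3.6 s per visit is assumed; prints 0.00083
worker_ram_range 10 5 8    # ten workers at 5%-8% each, as quoted in the post; prints 50-80
```

Read this way, ten workers alone could account for 50%-80% of RAM, which is why trimming their number has such a visible effect.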

Conclusion

These actions took the idle RAM usage from 90% down to 60%. After this drop, I haven’t seen the OOM killer get activated in the past seven days, so I hope the issue is fixed. 60% is still a bit more than I’d like, but as long as the server stays stable and the performance doesn’t notably degrade I think that’s an acceptable percentage. Also, using the cheaper virtual machine saves me $6 a month!

The root cause for the increased RAM usage is still a bit of a mystery. I suspect that installing WordPress plugins caused it, because I was installing SEO plugins around the time the issue became more prevalent. If there’s one thing I’ve learnt from this, it’s that updates should be checked manually every now and then, and the consumption of system resources should be constantly monitored.

How to start a (tech) blog with WordPress & DigitalOcean

Haven’t we all dreamt about it? Writing some simple tutorial blogs for people who don’t want to go through man-pages and watching the ad revenue pour in while building the personal brand to ensure success in every aspect of life. The only problem: you don’t have that blog yet. Fortunately, there are both simple and not-so-simple solutions to this!

Please note that despite the clickbaity title this isn’t really a step-by-step tutorial, although it may be useful when starting out with the technologies mentioned in the title. It’s more a record of the things I did to get the site you’re reading now up and running.

Option A: The casual approach
Step 1: (well okay there are steps, but this still isn’t a tutorial)
Go to wordpress.com, create your free account and start writing.

This is definitely a valid approach, but it sounds easy, doesn’t it? A bit too easy one might say…

Option B: The tech approach
Step 1: Obtain the domain
Buy the domain you want. I personally use Namesilo as a domain registrar. It is cheap, and as a nice bonus, it often reminds you that you get what you pay for. There are other alternatives that could definitely be considered. GoDaddy at least seems to be commonly used, and many services provide documentation on how to get their service working with GoDaddy.

Whatever registrar you choose, the buying process isn’t much harder than ordering anything else online. Registrars often try to sell all sorts of extras, but you don’t really need any of those. If you like to wear hats made of tin foil like me, then WHOIS privacy is the most useful one if it’s offered.

Step 2: Obtain the server
Go to digitalocean.com and create an account. DigitalOcean is a cloud infrastructure provider, and what we want from them is a virtual machine instance that runs WordPress (and Apache, and MySQL, and PHP, and Linux, and…). Fortunately, we are not the first people who want to do this, and the process is really streamlined. DigitalOcean even talks about One-Click Install, although there definitely are more clicks than that. After you’ve created the account, go here and get started with the process:
https://marketplace.digitalocean.com/apps/wordpress

I recommend following DigitalOcean’s instructions because they are most likely going to be more up-to-date than mine. When there are questions about some extra services or premium CPUs or whatnot, just say no. Your blog isn’t going to be a big hit in the first few days/months/years anyway, so there’s no use wasting money on 12 CPUs yet. Backups might be useful though, once you get things rolling.

After you’ve clicked through the marketplace, you should be greeted with a screen where you can see your Droplet (DigitalOcean’s fancy name for a virtual machine). It’ll take a moment for the machine to initialize. When it’s done, you can click the Getting Started button in the middle of the screen to get started (these instructions will most likely be quite similar to the instructions in the marketplace link above):

Useful big red circle

The first step of the instructions, at least at the time of writing, was making an SSH connection to the Droplet. You can either use your favorite SSH client or DigitalOcean’s browser console. Once the SSH connection was established, an automatic helper script actually helped to finish the configuration (it seems that the SSH connection may close a few times if you try it before the machine has properly settled). The setup is mostly basic stuff, setting up the admin account, etc., but the script also asks about LetsEncrypt. Skip it for now, as the DNS settings aren’t yet in place.

If you break something, fear not. You can trash the Droplet easily:

Useful big red arrow

Then just create a new one and start from the beginning. I did this a few times.

Step 3: Obtain the DNS
After the Droplet has been created and WordPress has been installed, it’s time to set up the DNS. Now, I’m going to be honest here: I don’t know much about this stuff, so that’s why I’m putting a lot of pictures in this section. Basically, there are two things that need to be done: changing the nameservers at your domain registrar to DigitalOcean’s nameservers, and setting up the DNS in DigitalOcean. The first step varies a bit depending on which registrar you chose earlier. Here’s how it is done in Namesilo:

Select this thingy right here (stack of pancakes?)
…and enter the DigitalOcean nameservers.

Setting up the DNS in DigitalOcean is fairly simple. The following steps should create correct DNS settings almost automatically:

First this…
…and then this…
…and finally this. Or something similar to this. I’m not 100% sure if the www A-record is required, but at least it doesn’t hurt. I guess. I tried getting this right more times than I want to admit, and when I finally got things working I decided that I don’t want to touch the settings ever again.

The changes done in these steps will take some time to actually have an effect. I’ve seen a correct domain name resolution happen in five minutes, and I’ve seen it not happen in a few hours. I noticed that Firefox Focus is the most efficient browser at destroying its caches, so I’d recommend using it to verify that you’ve set up DNS correctly. The private mode of regular browsers doesn’t seem to help at all.
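
As an alternative to fighting browser caches, the DNS records can also be queried from a terminal. The sketch below assumes the `dig` tool is installed and that DigitalOcean’s nameservers are hosts under `digitalocean.com`; the function name `uses_do_nameservers` is invented for this example, and `yourdomain.com` is a placeholder as elsewhere in this post.

```shell
# Succeed only if every nameserver read from stdin (one per line) is under digitalocean.com.
uses_do_nameservers() {
  awk '{ n++; if (tolower($1) !~ /(^|\.)digitalocean\.com\.?$/) bad = 1 }
       END { exit (n == 0 || bad) ? 1 : 0 }'
}

# Typical use:
#   dig +short NS yourdomain.com | uses_do_nameservers && echo "delegated to DigitalOcean"
#   dig +short A yourdomain.com        # should print the IP address of the Droplet
```

Note that the resolver your machine uses may still hand out an old answer for a while, so a failed check right after the change doesn’t necessarily mean the settings are wrong.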

Step 4: Obtain the HTTPS
After you’ve set up the DNS properly, you can and definitely should set up SSL with LetsEncrypt. This is actually quite straightforward. Just connect to the Droplet via SSH again and run the following command:

certbot --apache -d yourdomain.com -d www.yourdomain.com

I think you could actually set up the DNS before setting up the VM and create the certificates with the helper script during the initialization, but at least I did things the hard way.

Step 5: Obtain the stress of updating WordPress plugins and seeing the cash flow out of your pocket
At this point, most of the actually stressful work should be done. Unless you’re like me and already forgot your WordPress admin password. In case you have a better memory than I do, you should be able to navigate to yourdomain.com/wp-admin and log in to WordPress there. After that, the usage should be more or less the basic WordPress stuff, which is outside the scope of this “tutorial”.

Just remember to update your WordPress plugins. I already have outdated plugins after a whopping five minutes of use.

In case you’re a financially responsible person, it is nice to know that DigitalOcean allows you to track quite easily how much money has gone into maintaining your blog. The amount is shown right in the upper right corner of the Droplet management screen (the number is updated once a day).

Option C: The true tech approach
Step 1: Set up your own server HW

I’m not going to do this.

But that’s the process of doing simple things the hard way! Or one method of doing so. At least you save a few bucks by hosting the server yourself (as opposed to buying premium WordPress elsewhere), and you already have a topic for your first blog post. Hopefully, the next one will have some actually informative content.