Thursday, January 18, 2007

Oh no -- I've become unbalanced!

I recently discovered an interesting feature of the Linux kernel. It all started when I noticed that some new web servers at work were exhibiting a peculiar behavior -- the two CPU cores didn't seem to have balanced workloads. The first core usually carried the majority of the load, while the second core carried much less (all witnessed via the mpstat -P ALL 10 command). Given that these servers were Apache web servers, I expected the two cores to be within a few percent of each other, but instead they were separated by anywhere from 5-25%.

After poking around a bit, I decided to tweak the server to try to force the kernel to balance better. These servers each had two Ethernet interfaces, one to a 'front channel' across which the HTTP interaction with clients was performed, and one to a 'back channel' across which some database and other calls were made. One of my early thoughts was to dedicate each of the two cores to one of the two Ethernet interfaces: this way, each core would always service the interrupts from the same interface. The default setup on the 2.6 kernel is to have all interrupts serviced by the first core, so I thought that perhaps the data load on the two interfaces was forcing the first core to run a bit hotter than the other.

It was easy enough to make the change. I first looked up the interrupt numbers for the two interfaces (ifconfig), then forced the second one to always use the second processor core (echo "2" >/proc/irq/177/smp_affinity; the value is a hexadecimal bitmask of CPUs, so 2 selects the second core). This achieved the desired effect of rebalancing the interrupts, but it didn't solve my original problem: the cores were still unbalanced. In fact, the problem became even more interesting -- the two cores had almost flipped their loads, with the second core having the consistently higher usage.
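
If the "2" in that command looks arbitrary, here is a small Python sketch of how such a mask can be worked out. The helper name affinity_mask is my own invention for illustration, and the sketch assumes a machine with fewer than 32 CPUs, where the mask is a single hexadecimal number.

```python
def affinity_mask(cpus):
    """Return the hex bitmask string for a list of zero-based CPU indexes."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU indexes are zero-based and non-negative")
        mask |= 1 << cpu  # bit N of the mask selects CPU N
    return format(mask, "x")


if __name__ == "__main__":
    print(affinity_mask([0]))     # first core only
    print(affinity_mask([1]))     # second core only
    print(affinity_mask([0, 1]))  # either core
```

The second call produces the same "2" that I echoed into smp_affinity above.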

At first I thought that perhaps the second interface might be transferring more data; that was wrong (ifconfig). Next I thought that, even with less total data transferred, perhaps the second interface was triggering more interrupts. That, too, was incorrect (cat /proc/interrupts). Finally, I checked to make sure that both interfaces were using the same driver, which they were (dmesg | grep eth). I had just about given up when I finally decided to put some logging into every web page (using PHP) to see what processor it used. Imagine my surprise when I discovered that nearly every web process started off on the lower-usage core, but then almost immediately migrated to the higher-usage core!
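
My logging was done in PHP, but the idea carries over to any language. The following is a rough Python sketch of one way to find out which core a process last ran on, by reading field 39 ("processor") of /proc/self/stat as described in the proc(5) man page. The function names are hypothetical, and this is not necessarily how my PHP pages did it.

```python
def cpu_from_stat(stat_line):
    """Extract the 'processor' field (field 39) from a /proc/<pid>/stat line."""
    # Field 2 (the command name) is wrapped in parentheses and may contain
    # spaces, so only split the text that follows the last ')'.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 39 sits at index 39 - 3 = 36.
    return int(rest[36])


def current_cpu(path="/proc/self/stat"):
    """Return the CPU the calling process last ran on (Linux only)."""
    with open(path) as f:
        return cpu_from_stat(f.read())
```

Calling current_cpu() at the top and bottom of a request handler and logging both values is enough to spot a process that hops from one core to the other mid-request.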

I had uncovered the superficial culprit: when a user requested a web page, one of the first things that the server would do is get a copy of their current session data from another server through the second interface. The moment that the second interface was used, the web process would flip over to the second core (or whatever core was assigned to service that interface's interrupts).

A little more research into the kernel code revealed more of the mechanism (caveat: I'm not a kernel hacker, so the details are still a bit fuzzy). When a network interface receives some data, Linux will find the process that is sleeping while waiting for that data and wake it up. When waking up the process, Linux will try to run it on the same processor core that received the interrupt. I imagine that it does this in an effort to better utilize the core's memory cache, which should speed things up. However, it also appears to unbalance the load across the cores, which is exactly what I was seeing on my servers.

Of course, there are ways to fix this "problem". The easiest is to simply remove the flags in the kernel code that trigger this behavior (you'll find them in include/linux/topology.h; just remove any lines that say SD_WAKE_AFFINE), and then recompile the kernel. I've discovered, though, that this change doesn't improve things as I thought it might. It did seem to balance out the load between the two cores, but the average load across the cores was actually slightly higher (1-2%, usually). This would make some sense; as I said before, the kernel is probably trying to make more efficient use of each core's cache, which apparently has a greater positive effect than having a more equal load distribution.

In the end, I suppose I can live with unbalanced CPU cores. I just wish that this behavior had been a bit better documented -- it was a rather painful process to have to figure it out myself. I suppose it could have been worse, though: I could be using closed-source Windows. ;)

Tuesday, January 16, 2007

Web serving on the cheap

Have you ever wanted to set up a small web server at home that can handle a small but reasonable amount of traffic? Perhaps you have a pet project or a home business and can't justify the cost of professional hosting. Maybe you have a fetish for low powered servers on small internet connections. Or possibly you just want to see if a Slashdotting of your home DSL line will trigger a call from your ISP. In any case, here are some tips on how to get the most out of a small setup.

Understand your limitations

If you don't have a large budget, then you obviously are going to have two big problems:
  • A lack of CPU power
  • A lack of bandwidth

Any effort to improve the performance of your small web site should be directly aimed at alleviating one of these two problems. I'll discuss each of these two issues below. There are other issues, such as a lack of RAM or disk space or disk performance, but usually your CPU and bandwidth will dominate the situation. I'll also ignore the most obvious solution to these problems, which is simply to buy a better processor and a bigger tube.


Bandwidth

More often than not, bandwidth will be your biggest hurdle, especially if you're running your server on your home internet connection; most home connections have terrible upstream speeds, which really hurts you when you're a content producer and not a consumer. In addition, some connections, like DSL, will have high latencies.

To mitigate your lack of speed, you're simply going to need to push fewer bytes down the pipe. Look at your web pages and appreciate all of those pretty pictures while you can, because they're the first thing to go. A typical image can be anywhere from 10 to 200 kilobytes, which is simply too large for a small connection, especially if you have 20 of them on the front page. If you can stand it, remove every single GIF, JPEG, and PNG from your site. You may need to redesign your site around the new image-less paradigm, but you won't regret it in a few weeks when you get your bandwidth bill.
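
To put some numbers on that, here is a back-of-the-envelope sketch in Python. The 384 kbit/s upstream rate is purely an assumption for illustration (plug in your own line's figure), and the calculation ignores protocol overhead and latency.

```python
def transfer_seconds(total_bytes, upstream_kbps):
    """Seconds needed to push total_bytes through an upstream link of
    upstream_kbps kilobits per second (ignoring overhead and latency)."""
    return (total_bytes * 8) / (upstream_kbps * 1000)


UPSTREAM_KBPS = 384  # assumed upstream speed; not a measured value
IMAGES_ON_PAGE = 20

for image_kb in (10, 200):
    page_bytes = IMAGES_ON_PAGE * image_kb * 1024
    seconds = transfer_seconds(page_bytes, UPSTREAM_KBPS)
    print(f"{IMAGES_ON_PAGE} images of {image_kb} KB: {seconds:.1f} s per page view")
```

On that assumed line, a single view of the front page ties up the connection for somewhere between a few seconds and well over a minute, and that is before a second visitor shows up.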

Next, move to a CSS-based design instead of a pure HTML one. You should be able to slim down your HTML this way, which will make it that much faster for a user to download. For an added bonus, you should put all of the CSS rules into a separate file. This will slow things down a tiny bit on the user's first visit to the site, but it also means that the browser can reuse its cached copy of the CSS file on every subsequent page view instead of downloading the styling again.

Finally, you should go for the biggest savings of all: compressed web pages. The idea is simple: the web server compresses any text files before sending them to a user, and the user's web browser automatically decompresses them before reading them. Every modern web browser supports compressed web pages, and you can see immense space savings from using them. Page compression can be a tricky thing to get right, especially if you're short on CPU power, because it obviously takes some effort by the web server to compress things. There are a few ways to get around this, one of which is to pre-render compressed versions of frequently accessed pages and then dish those out to users. You need to experiment with compression to see how much it affects your server's CPU.
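
One way to run that experiment is with a quick Python sketch like the one below, which uses the standard gzip module to compare compressed size and time at a few compression levels. The sample page is made up and far more repetitive than a real one, so treat the numbers as a rough guide and feed it your own HTML.

```python
import gzip
import time


def compare_levels(data, levels=(1, 6, 9)):
    """Return (level, compressed_size, seconds) for each gzip level tried."""
    results = []
    for level in levels:
        start = time.perf_counter()
        packed = gzip.compress(data, compresslevel=level)
        elapsed = time.perf_counter() - start
        results.append((level, len(packed), elapsed))
    return results


# Made-up sample page; substitute the bytes of a real page from your site.
sample = ("<p>Hello, visitor. Thanks for stopping by.</p>\n" * 500).encode("utf-8")

for level, size, elapsed in compare_levels(sample):
    print(f"level {level}: {len(sample)} -> {size} bytes in {elapsed * 1000:.2f} ms")
```

If the higher levels buy only a few extra bytes for noticeably more time, a low level (or pre-compressed files) is probably the better deal on a slow processor.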


CPU power

If you're on a budget, then you most likely have an old computer with an outdated processor. This isn't necessarily a problem -- even very old computers can saturate a small internet connection -- but you're going to need to code your site correctly if you want to prevent it from being your big bottleneck.

The very first thing to go is the database. Sure, a database like MySQL is nice to have and does provide some convenience, but it can totally kill your web server's performance. Not everyone can drop it, but many people can; it's usually a waste to store every page of a small web site in a full-fledged database system. Many small web sites could eliminate MySQL completely if they just stored data directly in files instead. For those that absolutely must have a database, you will need to at least remove any direct database query from your front page. Sometimes you can even just keep a copy of a MySQL query's result in a file and have a program update that file every so often.
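
Here is a minimal Python sketch of that last idea. Instead of a separate program refreshing the file on a schedule, this variation refreshes the file lazily whenever it is older than a given age; the function name cached_result and the run_query callback are placeholders of mine, standing in for whatever actually talks to your database.

```python
import json
import os
import tempfile
import time


def cached_result(path, max_age_seconds, run_query):
    """Return the query result stored in path, refreshing it when too old."""
    try:
        fresh = time.time() - os.path.getmtime(path) < max_age_seconds
    except OSError:
        fresh = False  # no cache file yet
    if fresh:
        with open(path, encoding="utf-8") as f:
            return json.load(f)

    result = run_query()  # the expensive call we want to make rarely
    # Write to a temporary file and rename it into place, so that a reader
    # never sees a half-written cache file.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(result, f)
    os.replace(tmp_path, path)
    return result
```

With a five-minute maximum age, the front page hits the database at most once every five minutes no matter how many visitors arrive, at the cost of showing data that may be up to five minutes old.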

Another way to avoid making database calls or any other expensive operation is to pre-render entire pages. If you know that your home page is only updated a few times a day, then why dynamically generate it every time someone views it? Just take the rendered page's HTML and save it in a file; the next time a user requests that page, just throw the rendered copy at them. You need to be careful to not give users stale pages, but it usually isn't too hard to figure it out on a small site. This concept ties in well with the previously mentioned tactic of saving pre-compressed copies of pages.
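
As a rough illustration, the Python sketch below writes a rendered page to disk along with a pre-compressed copy. The name publish_page and the ".gz" sibling-file convention are my own assumptions; whether your web server can serve such a sibling file directly depends on the server and how it is configured.

```python
import gzip
import os


def publish_page(path, html):
    """Save rendered HTML to path, plus a gzip-compressed copy at path + '.gz'."""
    data = html.encode("utf-8")
    for target, payload in ((path, data), (path + ".gz", gzip.compress(data))):
        tmp_path = target + ".tmp"
        with open(tmp_path, "wb") as f:
            f.write(payload)
        # Rename into place so visitors never receive a half-written page.
        os.replace(tmp_path, target)
```

To avoid stale pages, call publish_page again from whatever code changes the content (for example, right after a new post is saved) rather than on every page view.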

Finally, try to avoid web scripting in general. It is usually hard to avoid, given how much power it provides, but a poorly coded PHP or Perl page can chew through your CPU and RAM; it is best to use it only when it is truly needed. If you manage to free yourself of scripting on all but a few pages, you can even take advantage of a new class of lighter and faster web servers like lighttpd or TUX. You'll still need to run a heavier web server process for those few scripted pages, but most of your traffic will hopefully run through the fast and light web server process.