Study material for understanding distributed data stores

Here’s a quick starting point to help people who’ve grown up with SQL databases sort out what the “SQL doesn’t scale! No More SQL!” noise is about, and what the world of distributed data stores looks like. Part of this is from material I used for a discussion track at the Silicon Valley Patterns Group (a study group for serious techies).

A motivation behind the “No SQL” movement is the observation that relational databases don’t scale, that distributed data stores do, and that distributed data tables are a better fit for a large class of applications that have been straining to force their data into SQL schemas. You may have noticed a progression from selectively denormalizing relational schemas for performance, to actively avoiding JOINs for more performance, to using an in-memory key/value cache for even more performance, to sharding data for even more performance. A logical step along this path is ditching SQL in favor of distributed in-memory key/value caches (i.e., distributed hash tables, or DHTs). There’s an implied “where this makes sense”, but that’s a discussion for a later post.

The jumping-off point in understanding distributed data stores is the ‘CAP Theorem’, which states that, when building distributed systems, one can choose no more than two of Consistency, Availability, and Partition Tolerance (the ability of the system to keep going when pieces can’t talk to one another). For large distributed systems, partitions are a given (due to temporary routing problems or longer fiber cuts), leaving a choice between Consistency and Availability. Systems with SQL back-ends (and designed with ACID compliance in mind) have typically chosen Consistency, and will suffer unavailability until consistency can be guaranteed. Large e-commerce systems often choose Availability, taking on the responsibility for coping with data that may become inconsistent.
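To make that choice concrete, here’s a toy sketch in Python. It is not how any real store works: the TwoReplicaStore class, the Unavailable exception, and the replica names "a" and "b" are all made up for illustration. It only shows what each preference does to a write while the two replicas can’t talk to each other.

```python
class Unavailable(Exception):
    """Raised when the store refuses a request rather than risk inconsistency."""


class TwoReplicaStore:
    """Toy key/value store with two replicas, "a" and "b" (hypothetical names)."""

    def __init__(self, prefer):
        if prefer not in ("consistency", "availability"):
            raise ValueError("prefer must be 'consistency' or 'availability'")
        self.prefer = prefer
        self.partitioned = False  # True while the replicas cannot talk
        self.replicas = {"a": {}, "b": {}}

    def write(self, replica, key, value):
        if not self.partitioned:
            # Healthy network: every replica receives the write.
            for data in self.replicas.values():
                data[key] = value
        elif self.prefer == "consistency":
            # Partitioned, consistency first: refuse the write.
            raise Unavailable("peer replica unreachable")
        else:
            # Partitioned, availability first: accept locally and diverge.
            self.replicas[replica][key] = value

    def read(self, replica, key):
        return self.replicas[replica].get(key)


store = TwoReplicaStore("availability")
store.write("a", "cart", "book")
store.partitioned = True
store.write("a", "cart", "book, pen")
# The replicas now disagree; reconciling them is the application's problem.
print(store.read("a", "cart"), "|", store.read("b", "cart"))
```

Constructed with "consistency" instead, the second write raises Unavailable, which is the trade the SQL-backed systems above are making.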

Werner Vogels’ paper Eventually Consistent – Revisited is a good introduction to the CAP Theorem and Amazon’s approach to it. Vogels covers some of the same ground in a video presentation. Read the first or watch the second. A basic understanding of CAP is essential.

With CAP under your belt, I recommend watching Todd Lipcon’s Intro Session at the NoSQL conference (the first two video links). He packs a lot of useful information into an hour.

From there, it’s on to specifics. Two influential distributed data stores are Amazon’s Dynamo (a distributed hash table) and Google’s BigTable (a distributed data store with more structure than a simple hash table).

Werner Vogels has a good article on Dynamo that delves far enough into Dynamo’s implementation to give a general idea of how to build a distributed hash table.
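For a taste of what’s involved, here’s a minimal sketch of consistent hashing, the partitioning technique Dynamo relies on. This is a toy in Python, not Dynamo’s implementation: the HashRing class, the vnodes parameter, and the use of SHA-256 are my own assumptions, and it leaves out replication, versioning, and failure handling entirely.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring mapping keys to node names."""

    def __init__(self, nodes, vnodes=64):
        # Each node is placed on the ring at several pseudo-random positions
        # ("virtual nodes") so that keys spread more evenly.
        ring = []
        for node in nodes:
            for i in range(vnodes):
                ring.append((self._position(f"{node}#{i}"), node))
        ring.sort()
        self._positions = [pos for pos, _ in ring]
        self._owners = [node for _, node in ring]

    @staticmethod
    def _position(label):
        # Hash a string to a 64-bit position on the ring.
        digest = hashlib.sha256(label.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key):
        # Walk clockwise from the key's position to the next node position,
        # wrapping around at the end of the ring.
        idx = bisect.bisect_right(self._positions, self._position(key))
        return self._owners[idx % len(self._owners)]


ring = HashRing(["node-a", "node-b", "node-c"])
print(ring.node_for("user:42"))  # one of the three node names
```

The property that makes this attractive for a distributed hash table: when a node leaves the ring, only the keys that node owned move elsewhere. With plain hash-modulo sharding, changing the node count reshuffles most keys.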

Dynamo inspired Project Voldemort, an open source implementation started at LinkedIn.

BigTable inspired Project Cassandra, an open source implementation started at Facebook, designed by one of Dynamo’s authors. Follow up on this if you have data structuring needs beyond what can fit into simple hash tables.

The videos from the NoSQL conference cover a few other players in the distributed data store space.

There’s a lot more good material out there. This is just one way to get started.

XSS Exploits in WordPress Themes

I found a family of Cross-site Scripting (XSS) vulnerabilities while checking out WordPress themes. It’s the same vulnerability, copy/pasted from one theme to its “inspired by” children.

An XSS vulnerability is where a site allows a hacker to spit back JavaScript of the hacker’s choosing to some innocent third-party, typically by way of some type of sneaky JavaScript injection that the site isn’t coded to protect against. The third-party’s browser executes the JavaScript as if it came from the site, which lets the injected script have access to any cookies that the third-party has with the site. The hacker stealthily retrieves the cookies, and it’s all downhill from there. Lost cookies can lead to lost passwords.

One of my longer-overdue tasks has been to get this site off of the default WordPress theme. So I’ve slowly been checking out themes, taking notes of what I like, and kicking a few of them around on a private WP install on my laptop. Quite a few themes are fodder for XSS attacks, by way of incorrectly sanitized search pages.

Here’s how to tell if the theme you’re looking at (or using!) is vulnerable to the XSS exploit. Type this into the search box:

<script>alert('xss');</script>

If you get a pop-up dialog that says ‘xss’, you’ve found a vulnerable theme. Repair the vulnerability before using the theme, or find a safer one.

To pick on one theme in particular (because it’s a nice theme, and the author hasn’t fixed the exploit or responded to my email), try this on the DePo Clean theme. Scroll to the bottom to get to the search box.

Fortunately, repairing the vulnerability is straightforward. Find and open search.php in the editor of your choice, then replace

<?php echo $s; ?>

with

<?php echo strip_tags($s); ?>

This prevents a script entered via search from being echoed back to the user.
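One caveat worth illustrating: stripping tags handles the test string above, but a search term that gets echoed inside an HTML attribute (say, the value of the search box) can cause trouble without containing any tags at all. Escaping the output is the more general defense. In PHP that’s htmlspecialchars(), and WordPress provides esc_html() and esc_attr() for the purpose. Here’s a sketch of the difference, in Python rather than PHP. The naive_strip_tags function is a rough regex stand-in I made up for illustration, not a port of PHP’s strip_tags, and the two render functions are hypothetical.

```python
import html
import re


def naive_strip_tags(text):
    """Rough stand-in for tag stripping: drop anything that looks like <...>."""
    return re.sub(r"<[^>]*>", "", text)


def render_stripped(term):
    """Echo the term into an attribute after stripping tags only."""
    return '<input type="text" name="s" value="%s">' % naive_strip_tags(term)


def render_escaped(term):
    """Echo the term into an attribute after escaping &, <, > and quotes."""
    return '<input type="text" name="s" value="%s">' % html.escape(term, quote=True)


script_payload = "<script>alert('xss');</script>"
attribute_payload = '" onmouseover="alert(\'xss\')'

# Tag stripping removes the <script> element...
print(render_stripped(script_payload))
# ...but the attribute payload contains no tags, so it passes through
# untouched and closes the value attribute early.
print(render_stripped(attribute_payload))
# Escaping turns the quotes into entities, so the payload stays inert text.
print(render_escaped(attribute_payload))
```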

And consider sending a note to the theme’s author. Fixing a problem at the source is usually best.

And no, no new theme here yet. Working on it.

A Hardware Interlude, With Pliers

When it came time to replace my always-on home Linux box, I settled on the Intel Atom in its dual-core flavor, and set about building a small system based on Intel’s D945GCLF2 motherboard. The idea of a system that draws 45 watts when running full out was compelling.

I also wanted quiet. Several reports noted that the fan that comes on the board is noisy. The guy at the local computer store who sold me the parts confirmed the problem, and handed me a Zalman ZM-NBF47 fanless northbridge cooler as a replacement.

When I put the system together, it looked like the Zalman wouldn’t fit. The CPU heatsink blocks one orientation, and RAM blocks the other. So I left the stock heatsink/fan in place. Big mistake. Lots of noise.

After a bit of head scratching and some Googling, I found the trick.

First, mount the clips on the Zalman and bring them out on a diagonal. Then, with needle-nosed pliers, bend the hooks backwards like this:

[Photo: Zalman heatsink with hooks twisted backwards]

Next, remove the stock northbridge heatsink and fan, clean off the old thermal grease, and lay down a new coat from the tube that comes with the Zalman. Set the Zalman down diagonally, and attach it through the back side of the hooks on the motherboard, like this:

[Photo: Zalman heatsink hooked onto motherboard]

It’s a tight fit. There’s less than 1/8″ clearance between the Zalman and the RAM stick, and I bent a few of the fins up so that they wouldn’t touch the CPU heatsink. The Zalman is quite warm to the touch, so you’ll want a case with good ventilation.

The result, with the board in a case with a fanless quiet power supply, is a very quiet system. If I listen, I can hear the disk spinning, but that’s it. Bliss.

I’m sure that replacing the stock heatsink/fan voided the warranty on the board, but if you insist on quiet, it’s worth the risk.

The D945GCLF2 with 2 GB of RAM seems to run Ubuntu 8.10 without any problems. The onboard video drives a 1600×1200 LCD reasonably well, though there’s no mistaking it for a gaming box. I haven’t tried audio yet. One pleasant surprise was checking /proc/cpuinfo and finding four processors listed instead of the two I expected. Each core is hyper-threaded.
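If you’d rather not eyeball the file, here’s a small Python sketch that separates logical processors from physical cores. It relies on the "processor", "physical id", and "core id" fields that x86 Linux kernels put in /proc/cpuinfo. The summarize_cpuinfo name is made up, and the sample text is abridged and invented for illustration (a real entry has many more fields), so treat it as a sketch rather than output from my box.

```python
def summarize_cpuinfo(text):
    """Return (logical_processors, physical_cores) from /proc/cpuinfo text."""
    logical = 0
    cores = set()
    # Entries are separated by blank lines; each line is "name : value".
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            name, sep, value = line.partition(":")
            if sep:
                fields[name.strip()] = value.strip()
        if "processor" in fields:
            logical += 1
            if "core id" in fields:
                # A physical core is identified by (socket, core) together.
                cores.add((fields.get("physical id", "0"), fields["core id"]))
    return logical, len(cores)


# Abridged, made-up sample shaped like a hyper-threaded dual-core listing.
SAMPLE = """\
processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 0
core id : 0

processor : 2
physical id : 0
core id : 1

processor : 3
physical id : 0
core id : 1
"""

logical, physical = summarize_cpuinfo(SAMPLE)
print(logical, "logical processors on", physical, "physical cores")
```

On a Linux box, pass it the real thing with summarize_cpuinfo(open("/proc/cpuinfo").read()).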

Addenda: There’s a good discussion on quieting this board at silentpcreview.com. Note in particular concerns about how hot the chipset runs, and assumptions about the fan providing cooling for the passive heatsink next door. Taking a closer look at the power supply in my box, I found that it had a very quiet fan that provided airflow over the southbridge heatsink. If you don’t have airflow, the ZM-NBF47 might not be what you want.

A discussion on the Ubuntu forums has instructions for enabling lm-sensors so that you can see the chip temperatures.

Scalability, and what limits it

Paul Hoffman of Joyent gave a very pragmatic talk on Scaling Ruby applications the other night at Google. The talk will be up on Google Video soon. I recommend watching for it.

Scalability is a fascinating topic, and is almost always approached from the technical side. When I start musing about scalability, it’s usually over some technical detail such as database partitioning or caching (memcached FTW!). Cal Henderson’s book and his presentations on scaling Flickr are almost entirely technical.

Paul came at it from a more pragmatic angle. Here’s his “Fundamental Limits of Scalability” list, with my interpretation and commentary.

At the top of Paul’s list…

1. Money

Now there’s a reality slap. And it’s so obvious. No money, no game.

2. Time

Second reality slap. If you don’t have the time to sort out scalability, you’re screwed. Given enough time and enough money, many things are possible. But there’s never enough time.

Time and money also join together as cash flow management. Many startups fail for lack of good cash flow management. (I’ve seen that up close, but that’s a story for another day.)

O.K., time and money. Got it. Time for something technical?

3. People

Damn, forgot people. If you’re going to scale an application, you’re going to need to scale Operations to manage all those servers, and you’ll need to scale support. Duh.

Got it. Money, time, and people. Something technical next?

4. Experience

Oh, right. If you can’t get the right mix of experience, you’re going to have to reinvent the wheel yourself, which is going to cost time and money, and risks burning out your people.

Got it. Money, time, and people with the right experience. Can we talk about event busses yet?

5. Power

Oh, that. 2,000 kW runs ~400 servers (Paul’s numbers). Getting more than that pulled into a standard commercial building may take some doing. Power is why Google builds server farms near hydroelectric plants.

6. Bandwidth

Now we finally get technical, but in a back-of-the-envelope sort of way. Bandwidth requires pipes (the internet being basically a series of tubes), and pipes cost money. Fat pipes cost lots of money. You can calculate how much bandwidth to the outside world you’ll need using some assumptions about what a “standard page” looks like and a simple formula. Catch the video for Paul’s numbers. I jotted down “100 Mbps is good for 5.5M page views/day”, but he had more to say there.
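I don’t have Paul’s formula in my notes, so what follows is only my own back-of-the-envelope version in Python. It assumes traffic is spread evenly over 24 hours and that the pipe is fully used, both of which are unrealistic, and the function names are mine. Run backwards against the number I jotted down, it shows what size of “standard page” that figure would imply under those assumptions.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_day(link_mbps, utilization=1.0):
    """Bytes a link can move in a day at the given average utilization."""
    return link_mbps * 1_000_000 / 8 * SECONDS_PER_DAY * utilization


def page_views_per_day(link_mbps, page_kilobytes, utilization=1.0):
    """Page views a link can serve per day for a given page weight."""
    return bytes_per_day(link_mbps, utilization) / (page_kilobytes * 1000)


def implied_page_kilobytes(link_mbps, views_per_day, utilization=1.0):
    """Page weight (in KB) implied by a link speed and a daily view count."""
    return bytes_per_day(link_mbps, utilization) / views_per_day / 1000


# 100 Mbps and 5.5M page views/day, assuming a flat, fully used link.
print(round(implied_page_kilobytes(100, 5_500_000)))  # about 196 KB per page view
```

Real traffic has peaks, so you’d plan with a utilization well under 1.0, which shrinks the page budget proportionally.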

Bandwidth also limits scalability at internal interconnect points. A GigE interconnect eventually limits how much database replication you can do. Finally, we’re at a limit that has direct architectural consequences, and we’ve only just started to talk about the technology stack.

I think it’s fair to collapse power and bandwidth into the money bucket. That’s what outfits like Joyent and Engine Yard are for. But the next time I start to think seriously about building a big web application, I’ll try to remember time and people.

Using breakpoints as a checklist

I had an “ahah” moment this afternoon while wrapping tests around some legacy Java code that lacked any. To create an instance of the class, I had to construct object graphs of “stunt doubles” to pass to the constructor. The doubles had to have just enough behavior to pass muster with private methods in the parent class that were invoked from the parent constructor. Getting an initial spike working took a lot of “let’s just pass null and see how far we get,” followed by hand-rolling mock objects.

When I could finally create an instance, I began attacking the class’s single instance method. By eyeball, its cyclomatic complexity was about 40. Lots of branches to cover, and it was already late in the day.

That’s when I had the ahah!

By setting breakpoints on every statement in the method, running the tests in the debugger, and clearing each breakpoint as it was hit, I could get a quick checklist of which lines of code were still untested. This was a lot faster than generating a coverage report after every successful test run.

I’m sure this is an old trick, but it was new to me. (Possibly because, outside of Java, I rarely spend time in debuggers.)

And yes, I know that it’s really not that simple. This technique doesn’t account for things like zero-trip for loops, which often represent hidden branches. The hidden branches I had to note separately as I went. And yes, you really only need breakpoints on the first statement of each basic block, but that takes thinking, and the time that takes is easily canceled out by how fast it is to just put breakpoints on every line that’ll take one. Plus, stopping at each line once gives you a chance to notice little things, which can pay off when you’re working with legacy code.
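The same checklist can be approximated in code. Here’s a minimal sketch in Python (rather than Java, for brevity) that uses sys.settrace and dis.findlinestarts from the standard library to play the role of one-shot breakpoints: start with every line that carries bytecode, and cross each one off the first time it runs. The unvisited_lines and classify names are hypothetical. Like the breakpoint trick, it tracks lines and not branches, so hidden branches still have to be noted separately.

```python
import dis
import sys


def executable_lines(func):
    """Line numbers in func that carry bytecode, i.e. could take a breakpoint."""
    starts = dis.findlinestarts(func.__code__)
    return {line for _, line in starts if line is not None}


def unvisited_lines(func, calls):
    """Call func once per argument tuple in calls; return lines never executed."""
    code = func.__code__
    remaining = executable_lines(func)

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # ignore every other function
        if event in ("call", "line"):
            remaining.discard(frame.f_lineno)  # "clear the breakpoint"
        return tracer

    sys.settrace(tracer)
    try:
        for args in calls:
            func(*args)
    finally:
        sys.settrace(None)  # always switch tracing back off
    return remaining


def classify(n):
    if n < 0:
        return "negative"
    if n == 0:
        return "zero"
    return "positive"


# One test case so far: report the untested lines as offsets from the def line.
first = classify.__code__.co_firstlineno
print(sorted(line - first for line in unvisited_lines(classify, [(5,)])))
```

Add calls for a negative number and for zero, and the list comes back empty.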

Django and Rails

I’ve been using Django to build my latest scratch-an-itch project, and had been thinking about writing up a comparison of Django and Rails.

Ben Askins and Alan Green saved me the trouble.

Their presentation (which requires the latest Flash plugin) is a decent, balanced summary of the similarities and differences between the two frameworks. It’s 100 pages, but is paced for speed (and it’s a fine model for how to do compare/contrast presentations).

The only bit I’d add is that Django favors parallel development to a greater extent than Rails. With Django, you get to a clean separation between code and templates sooner, so that UI designers can be working on presentation while coding continues. And Django’s out-of-the-box admin interface lets people start doing data entry pretty much as soon as you’ve built your data models. Django doesn’t yet have migrations, though, so there’s more emphasis on getting your data model right up front.

Maybe one more bit: Rails has built-in support for AJAX and visual effects, with several books that cover that support in depth. Django is JavaScript toolkit neutral, shipping with no built-in toolkit support (or no lock-in, if you want to think of it that way). For a sizeable project, I doubt this will be an issue, but for smaller projects this can give Rails an advantage.

If you’ve already chosen one of these frameworks, it may be worth your time to check out the other. There are good ideas in both.