Archive for April, 2006

Accidentally defeating MySQL’s query cache

Friday, April 28th, 2006

I just learned some neat things about MySQL’s query optimizer, including a bit about how MySQL 4.0.1 (and later) caches queries and result sets. When the query cache is enabled (which it is by default on Fedora Core 3), and you present MySQL with a SELECT query, it consults the cache before preparing or analyzing the query. On finding a match, the cached result from the prior execution of the query is returned.

The cache is actually keyed by the MD5 hash of the full query, including whitespace and comments. In most applications, queries are effectively static. Even when queries are generated dynamically, an identical query string is regenerated each time one is requested. In these cases, hashing the full query isn’t a problem. Same query, same hash.
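To make "same query, same hash" concrete, here is a minimal sketch in Python. It models only the behavior described above (a key derived from the exact query text); it is not MySQL's actual implementation, and `cache_key` is a hypothetical helper name.

```python
import hashlib


def cache_key(query: str) -> str:
    """Toy cache key: MD5 of the exact query text, byte for byte.

    Assumption: this mirrors the description in the post, not MySQL internals.
    """
    return hashlib.md5(query.encode("utf-8")).hexdigest()


a = "SELECT COUNT(*) FROM widgets"
b = "SELECT  COUNT(*) FROM widgets"            # one extra space
c = "SELECT /* note */ COUNT(*) FROM widgets"  # added comment

print(cache_key(a) == cache_key(a))  # True: identical text, identical key
print(cache_key(a) == cache_key(b))  # False: whitespace changes the key
print(cache_key(a) == cache_key(c))  # False: a comment changes the key
```

Under this model, any difference in the text, however cosmetic, yields a different key and therefore a separate cache entry.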

But consider the application that’s grown over time to where no one developer has a good grasp of data access patterns. When slow performance demands attention, you will have to reconstruct a model of those patterns, because the story very likely changed while you were paying attention elsewhere. Histograms of query counts and execution times are useful. To distinguish the same query being issued from different parts of the application, it’s common to instrument the query by adding a comment. For example,

SELECT /* summary view */ COUNT(*) FROM widgets;

Comments have the additional benefit of being visible when you notice that the server is crawling and ask MySQL to show you which queries are currently executing.
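Once queries carry a comment tag, building a histogram of query counts per call site is straightforward. The sketch below assumes you already have the query strings in a list (for example, pulled from a query log); `sample_log`, `tag_of`, `histogram`, and the "admin report" tag are made up for illustration.

```python
import re
from collections import Counter

# Matches the first /* ... */ comment in the query and captures its text.
TAG_RE = re.compile(r"/\*\s*(.*?)\s*\*/")


def tag_of(query: str) -> str:
    """Return the comment tag of a query, or a placeholder if it has none."""
    match = TAG_RE.search(query)
    return match.group(1) if match else "(untagged)"


def histogram(queries):
    """Count how many times each tag appears in the given queries."""
    return Counter(tag_of(q) for q in queries)


# Hypothetical log contents, for illustration only.
sample_log = [
    "SELECT /* summary view */ COUNT(*) FROM widgets;",
    "SELECT /* summary view */ COUNT(*) FROM widgets;",
    "SELECT /* admin report */ COUNT(*) FROM widgets;",
    "SELECT COUNT(*) FROM widgets;",
]

counts = histogram(sample_log)
for tag, count in counts.most_common():
    print(f"{count:4d}  {tag}")
```

The same grouping key could be used to sum execution times instead of counts, if the log records them.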

But if you start to inject dynamic information (e.g., timestamps or process IDs) into the comments, the “same” query hashes differently each time, and the benefit of having a query cache is lost. For no readily apparent reason, things just run a bit slower.
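A toy simulation shows the effect. It assumes a cache keyed on the exact query text, as described above; `ToyQueryCache`, `fake_execute`, and the per-request counter are hypothetical stand-ins, not anything MySQL provides.

```python
import hashlib
import itertools


class ToyQueryCache:
    """Toy stand-in for a query cache keyed on the exact query text."""

    def __init__(self):
        self.store = {}
        self.hits = 0
        self.misses = 0

    def run(self, query, execute):
        key = hashlib.md5(query.encode("utf-8")).hexdigest()
        if key in self.store:
            self.hits += 1
        else:
            self.misses += 1
            self.store[key] = execute(query)  # only executed on a miss
        return self.store[key]


def fake_execute(query):
    return 42  # pretend result set


static_cache = ToyQueryCache()
dynamic_cache = ToyQueryCache()
request_ids = itertools.count(1)  # hypothetical per-request ID

for _ in range(5):
    # Static comment: the query text is identical on every request.
    static_cache.run(
        "SELECT /* summary view */ COUNT(*) FROM widgets", fake_execute
    )
    # Dynamic comment: the text differs on every request.
    dynamic_cache.run(
        f"SELECT /* summary view, req {next(request_ids)} */ COUNT(*) FROM widgets",
        fake_execute,
    )

print(static_cache.hits, static_cache.misses)    # 4 1
print(dynamic_cache.hits, dynamic_cache.misses)  # 0 5
```

In this model the fix is simply to keep the comment text constant per call site and record the dynamic details somewhere other than the query string.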

For more on query caching (including how to defeat it intentionally, and why), see the MySQL documentation, or chapter 5 in High Performance MySQL.

Speaking of Testing…

Monday, April 17th, 2006

Elisabeth Hendrickson reminds me that there are a few spaces open in her Getting a Grip on Exploratory Testing workshop next week in Mountain View.

Testing is one of those practices that has surprising depths, and surprises in the depths. Many testers never get beyond doing an adequate job of it, which leaves expectations, and respect for the skill, low. That’s a shame, because good testers can save a project, and a company, a lot of pain and embarrassment.

Test First, Really.

Monday, April 17th, 2006

From the “mistakes I make that you can learn from” file:

When changing legacy code, find and run the tests before you start.

Otherwise you might find yourself burning up an hour or two trying to figure out how your perfectly innocuous changes managed to break some seemingly unrelated test, only to find, after carefully backing out your changes, that the test was failing before you started.

So get a baseline, even if you’re writing new tests.
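The baseline idea boils down to a set difference: record which tests fail before you touch anything, then only worry about failures that weren't already there. A minimal sketch, assuming you can collect failing test names from your test runner; `new_failures` and the test names are hypothetical.

```python
def new_failures(baseline_failures, current_failures):
    """Return the tests that fail now but did not fail in the baseline run."""
    return sorted(set(current_failures) - set(baseline_failures))


# Hypothetical results: one test was already failing before any changes.
baseline = {"test_invoice_totals"}
after_change = {"test_invoice_totals", "test_widget_count"}

print(new_failures(baseline, after_change))  # ['test_widget_count']
```

Anything that shows up in both runs was broken before you started, and is not your change's fault.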

For whatever reason, I have to relearn this lesson every few months. But at least I remember to run the tests before declaring ‘done’.

Another Zero Inbox

Saturday, April 1st, 2006

Ever since reading Getting Things Done, lo these many years ago, I’ve been trying to keep my email inbox size at no more than a screenful of messages (about 25). But it kept blowing right past that limit. I’d beat it down, and it would bloat right back, burying stuff that needed attention. Time for a zero-tolerance approach [43folders.com].

It took about 45 minutes to shrink my inbox from 331 messages to empty. Here’s how I did it:

Step | Elapsed | Inbox
Find a block of quiet time. Sit down with the laptop, a stopwatch, and a cup of coffee. | 0:00.00 | 331
Deal with the low-hanging fruit. Move all mailing-list digest messages to the @ToRead folder. File the obviously filable stuff and discard the obviously discardable. | 0:09.02 | 232
Next up, a blob of conference-related items. File some, discard some, and move the rest to a new @Conference folder. | 0:11.00 | 51
Filter through the remaining messages, moving some into @Followup or @Pending, replying to a few that were long overdue for replies, and filing or discarding the rest. | 0:22.05 | 0

Zero! (Except for the 7 spams that arrived while I was shoveling, but that’s what spam filters are for.)

GTD fans will note the use of “@” prepended to folder names. It serves as a “you’re not done here yet” reminder, and has the secondary effect of having those folders appear at the top of the list, so that all of the “work” stays together and stays visible.

Now comes the harder work. There are 17 messages in @Followup to follow up on, 168 messages in @Conference to sort through to separate the ones that need to move to @Followup from those that need to be filed or discarded, and 1257 mailing list digests in @ToRead (most of which I won’t). But with all of the work together in one place, rather than scattered in among unsorted clutter, I now stand a chance of knowing exactly how far behind I am (which is better than having my imagination whispering “you are so screwed” into my ear).

This is the first time since taking this laptop out of the box that I’ve seen an empty inbox. It’s so… blank. So… unbearably light.