Dave W. Smith This could all be a lot simpler.

My Technology Radar for 2010

ThoughtWorks recently published their Technology Radar for 2010. I liked the format, and borrowed it to organize and prioritize my own Technology Radar.

Here’s where I plan to invest learning time this year.

My Tech Radar for 2010

1. I’m continuing to actively explore how non-trivial relational data models can be mapped onto distributed data stores. I’ve seen relational databases pushed to their limits, and need to look beyond that. There’s a gap. The non-relational (e.g., NoSQL) work I’ve seen so far hasn’t gotten beyond simple data models. What to do about large, transactional schemas remains a puzzle worth investigating.

If this moves from ‘assess’ to ‘trial’, it’ll probably involve Google’s BigTable, by way of Google App Engine.

2. When building web apps, making them look good is my weak point. It’s time to get better. That means some deliberate design experiments and exercise time with CSS, followed by cleaning up the UI on a Google App Engine hack from last year.

3. I’ve been using git as a CVS/Subversion replacement, and not much more. It’s time now to get serious and master the rest of git, and to level up my branch fu.

4. Some of the data analysis problems that I run into could benefit from a good statistics package. R seems like a useful tool to get acquainted with. I can see this getting bumped from ‘assess’ to ‘trial’.

5. The Silicon Valley Patterns Group’s current track is on Haskell. So far, it’s been mind-bending in the same way that Smalltalk was on first encounter. There’s depth to Haskell that’s worth experimenting with, though I don’t yet see using Haskell for production work.

6. Colleagues are doing some very cool stuff with Clojure, a LISP for the JVM. My plate is too full to be spending time with Clojure right now, so it’s on the radar as a hold.

7. Arduino is there for the fun of building something tangible, and for the opportunity to re-learn some basic electronics. This is a hold-over from last year, and is the most likely thing to remain undone at the end of this year.

As with any plan, this will likely be completely upended by year-end.


Out with the old theme, in with the temporary one

‘New blog theme’ has been on my TO DO list for a lot longer than I’d like to admit. I’ve made several starts at building one from scratch, but an honest look at priorities pushed that off the back burner. So, I picked a simple, clean theme from elsewhere as a starting point, and will fix breakage and iterate as time allows.


Scoring My 2009 Predictions

For the past few years, my reading group has started the year by making predictions. These were mine:

1. This year I will finally see a Zune in the wild.

Unless there’s one at the New Year’s Eve party tonight, this gets scored as WRONG.

I’ve been puzzled and amused that the huge amount of marketing and advertising money that Microsoft has poured into the Zune has yielded a complete absence of the devices in my corner of the world, except on a few store shelves. Granted, I live near the Apple mothership, but I do travel, and do notice what people are plugged in to. When I asked around, friends who had seen Zunes had only seen them in the possession of friends of theirs who work for Microsoft.

2. There will be a very public failure of ‘Cloud Computing’.

This was a ‘shooting fish in a barrel’ prediction, since something, somewhere, was bound to happen. I was thinking along the lines of a cascading set of accidents involving a backhoe and a washed out bridge taking a data center offline, but the Microsoft/Danger/Sidekick fiasco lost a huge amount of ‘trust us, it’s safe in the cloud’ user data. This prediction gets scored as RIGHT.

3. Congress will step in, on the wrong side, to correct the Bilski ruling.

In In re Bilski, the Court of Appeals for the Federal Circuit, in a moment of apparent sanity, threw a wet blanket on business practice patents. The ruling has since been used to toss out claims in several software patents.

I guessed that, since so much money was on the line and the players who stood to lose had so much political clout, Congress would act. I underestimated both how long it would take for challenges to work their way through the courts, and how distracted Congress appears to have been by the huge food fight they’re engaged in.

I still think that Congress will get involved, but it’ll be a few years before this one makes it back on to my list of predictions. For 2009, unless I missed something in the news, this prediction was WRONG.


The View From Jerry’s Shoulders

In the software community, when we speak of the giants on whose shoulders we aspire to perch, names like Turing, Dijkstra, Wirth, and Knuth are often mentioned: names associated with the Computer Science side of software development. And there are giants on the hardware side: von Neumann, and, if you want to go a lot farther back, Babbage.

There’s also a side of software development that doesn’t get enough attention, but is nonetheless critically important: The side that’s about people. Until some distant future when software development is fully automated, it will remain an endeavor that includes humans. And, by extension, our imperfections and foibles.

One of the giants on the people side of software is Gerald M. (Jerry) Weinberg. If you’re not familiar with Jerry’s writings, I envy the discoveries you have ahead.

Full disclosure: I’ve been a student of Jerry’s for many years, first through his books, then through his workshops, and finally through working with him on the AYE Conference. His teachings, wisdom, and subtle wit have had a profound impact on my career and life.

Where to start reading Weinberg depends on where you are in your career.

I came to know of Jerry’s work through the chance discovery of The Psychology of Computer Programming, a book that introduced many to the idea that there’s much more to programming than languages and algorithms. It’s a timeless book; human nature evolves much more slowly than technology. This is where I learned about ‘goal displacement’, which you’ve seen if you’ve ever been puzzled about why it’s become more important to some developer that a system runs fast than that it works right. I’m not sure this book is the best place to start if you haven’t read Weinberg, but do read it at some point.

If you’re just starting out, Becoming a Technical Leader is full of excellent advice for planning and managing the external and internal aspects of your career. The idea that there’s internal stuff to be managed is a surprise to some, but if you’re going to be a leader, you’re going to have to learn to cope with a whole range of emotions, including frustration, anger, and hopefully elation. It’s worth re-reading every five years or so, if only to see how far you’ve come. This is the book that I gave away the most copies of when I was a manager.

If you’re a bit further along, Are Your Lights On? How to Figure Out What the Problem Really Is (with Don Gause) can help you avoid solving the wrong problems. Exploring Requirements: Quality Before Design will change the way you look at requirements. Though both were written before Agile, both still apply, since much can be lost in imperfect communication between customers and developers. The perils of mind reading remain even when there’s a Product Owner colocated with the team.

If you aspire to lead projects, any or all of the volumes in Jerry’s four-volume “Quality Software Management” series are a great investment. Volume 1, Systems Thinking, is one of the best introductions to this very powerful technique that I’ve found (Senge’s The Fifth Discipline being the other). Volume 3, Congruent Action, is especially good if your career path leads to management. It’s all about dealing with people effectively.

If you consult or advise for a living, The Secrets of Consulting: A Guide to Giving and Getting Advice Successfully, about the external aspects of consulting, is full of deep and useful advice. If you use the services of consultants, this book can help you get more value out of that relationship.

Perfect Software: And Other Illusions about Testing is in my reading queue, so I can’t recommend it yet, but a friend who has read it left the better part of a pad of sticky notes in his copy of the book.

Finally, Weinberg on Writing: The Fieldstone Method has some very good advice on writing for those of us technical folk who have to squeeze writing in to small blocks of time. You won’t find the writing techniques that you were given in school, and may wish that you’d had Jerry’s book instead.

That should be enough to get you started. Your next step is to pick one to add to your reading queue.

Since it’s pretty clear that I’m biased about Jerry’s work, check out the reviews of Jerry’s books on Amazon.com.


Finding Boundaries

Re-reading Martin Fowler’s writeup on rake, Using the Rake Build Language, I found this buried gem:

Often when you come across something new it can be a good idea to overuse it in order to find out its boundaries. This is a quite reasonable learning strategy. It’s also why people always tend to overuse new technologies or techniques in the early days. People often criticize this but it’s a natural part of learning. If you don’t push something beyond its boundary of usefulness how do you find where that boundary is? The important thing is to do so in a relatively controlled environment so you can fix things when you find the boundary.

Two thoughts:

First, I’m not doing enough pushing of boundaries when playing with new things, tending instead to test-drive new technologies “on road” rather than off. Solving familiar problems in familiar ways but with a new tool is a good way to get a feeling for the tool, but it shortchanges learning. Perhaps I’ve let the set of problems I’m tackling become too narrow.

The second thought is that “a relatively controlled environment” is both key and often ignored. I’ve seen plenty of examples of people overusing new technologies in production code bases. If that’s you, please get yourself a side project to experiment in. Finding a science experiment buried in a code base can cause a major headache. There’s a time and a place for boundary-pushing experiments.


Study material for understanding distributed data stores

Here’s a quick starting point to help people who’ve grown up with SQL databases sort out what the “SQL doesn’t scale! No More SQL!” noise is all about, and what the world of distributed data stores looks like. Part of this is from material I used for a discussion track at the Silicon Valley Patterns Group (a study group for serious techies).

A motivation behind the “No SQL” movement is the observation that relational databases don’t scale, that distributed data stores do, and that distributed data tables are a better fit for a large class of applications that have been straining to force their data into SQL schemas. You may have noticed a progression from selectively denormalizing relational schemas for performance, to actively avoiding JOINs for more performance, to using an in-memory key/value cache for even more performance, to sharding data for even more performance. A logical step along this path is ditching SQL in favor of distributed in-memory key/value caches (i.e., distributed hash tables, or DHTs). There’s an implied “where this makes sense”, but that’s a discussion for a later post.
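To make the sharding step of that progression concrete, here is a minimal sketch of a key/value store split across several in-memory shards. Everything in it (`shard_for`, `ShardedStore`, the `user:42` key) is made up for illustration and isn't taken from any particular product. It assumes each shard would live on its own machine; plain dicts stand in for those machines here.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash.

    Python's built-in hash() is salted per process for strings, so a
    digest is used to keep the mapping the same across runs.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedStore:
    """Toy key/value store; each dict stands in for a separate server."""

    def __init__(self, num_shards: int):
        self.shards = [{} for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str, default=None):
        return self.shards[shard_for(key, len(self.shards))].get(key, default)


store = ShardedStore(num_shards=4)
# A denormalized record: everything needed for one lookup lives under one key,
# so no JOIN across shards is required to read it.
store.put("user:42", {"name": "Ada", "orders": ["book", "pen"]})
print(store.get("user:42"))
```

One caveat about this sketch: with modulo sharding, changing the number of shards remaps most keys, which is one reason larger systems tend to use something like consistent hashing instead.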

The jumping off point in understanding distributed data stores is the ‘CAP Theorem’, which states that, when building distributed systems, one can choose no more than two of Consistency, Availability, and Partition Tolerance (the ability for the system to keep going when pieces can’t talk to one another). For large distributed systems, partitions are a given (due to temporary routing problems or longer fiber cuts), leaving a choice between Consistency and Availability. Systems with SQL back-ends (and designed with ACID compliance in mind) have typically chosen Consistency, and will suffer unavailability until consistency can be guaranteed. Large e-commerce systems often choose Availability, taking on the responsibility for coping with data that may become inconsistent.
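The Consistency-versus-Availability choice can be shown with a toy model. This is only a sketch under loud assumptions: two replicas, synchronous replication when the link is up, and a `partitioned` flag standing in for a real network failure. The names `Replica` and `TwoNodeStore` are hypothetical.

```python
class Replica:
    """One copy of the data."""

    def __init__(self, name: str):
        self.name = name
        self.data = {}


class TwoNodeStore:
    """Two replicas joined by a link that can be cut to simulate a partition."""

    def __init__(self, prefer: str):
        if prefer not in ("consistency", "availability"):
            raise ValueError("prefer must be 'consistency' or 'availability'")
        self.prefer = prefer
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False

    def write(self, replica: Replica, key: str, value) -> None:
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.prefer == "consistency":
                # Consistency chosen: refuse the write rather than diverge.
                raise RuntimeError("unavailable: cannot reach peer replica")
            # Availability chosen: accept locally; the replicas now disagree
            # and something must reconcile them once the partition heals.
            replica.data[key] = value
            return
        replica.data[key] = value
        other.data[key] = value


cp = TwoNodeStore("consistency")
cp.write(cp.a, "cart", ["book"])
cp.partitioned = True
try:
    cp.write(cp.a, "cart", ["book", "pen"])
    cp_result = "accepted"
except RuntimeError:
    cp_result = "rejected"

ap = TwoNodeStore("availability")
ap.write(ap.a, "cart", ["book"])
ap.partitioned = True
ap.write(ap.a, "cart", ["book", "pen"])

print(cp_result, cp.a.data == cp.b.data)  # rejected True
print(ap.a.data == ap.b.data)             # False
```

The second store keeps taking orders during the partition, at the price of having to cope with the inconsistency later.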

Werner Vogels’ paper Eventually Consistent – Revisited is a good introduction to the CAP Theorem and Amazon’s approach to it. Vogels covers some of the same ground in a video presentation. Read the first or watch the second. A basic understanding of CAP is essential.

With CAP under your belt, I recommend watching Todd Lipcon’s Intro Session at the NoSQL conference (the first two video links). He packs a lot of useful information into an hour.

From there, it’s on to specifics. Two influential distributed data stores are Amazon’s Dynamo (a distributed hash table) and Google’s BigTable (a distributed data store with more structure than a simple hash table).

Werner Vogels has a good article on Dynamo that delves far enough into Dynamo’s implementation to give a general idea of how to build a distributed hash table.
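Dynamo partitions keys across nodes with consistent hashing. The sketch below illustrates that general technique only; it is not Dynamo’s code. The class name `HashRing`, the choice of 8 virtual nodes per server, and the use of MD5 for ring positions are all assumptions made for the example.

```python
import bisect
import hashlib


def ring_position(label: str) -> int:
    """Place a label on the ring using a stable hash."""
    return int(hashlib.md5(label.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent-hash ring with a few virtual nodes per physical node."""

    def __init__(self, nodes, vnodes: int = 8):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (position, node name)
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        for i in range(self.vnodes):
            bisect.insort(self._ring, (ring_position(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [(pos, name) for pos, name in self._ring if name != node]

    def node_for(self, key: str) -> str:
        """Return the owner: the first virtual node clockwise from the key."""
        if not self._ring:
            raise LookupError("ring is empty")
        positions = [pos for pos, _ in self._ring]
        idx = bisect.bisect_right(positions, ring_position(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"key{i}" for i in range(200)]
before = {k: ring.node_for(k) for k in keys}
ring.add("node-d")
moved = [k for k in keys if ring.node_for(k) != before[k]]
# Only keys that now belong to the new node change owners.
print(len(moved), "of", len(keys), "keys moved")
```

The property worth noticing is that adding a node only takes keys from its neighbors on the ring; every other key stays where it was.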

Dynamo inspired Project Voldemort, an open source implementation started at LinkedIn.

BigTable inspired Project Cassandra, an open source implementation started at Facebook, designed by one of Dynamo’s authors. Follow up on this if you have data structuring needs beyond what can fit into simple hash tables.
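To give a feel for “more structure than a simple hash table”: the BigTable paper describes its data model as a sparse, sorted map from (row key, column, timestamp) to a value. The following is a rough in-memory sketch of that shape, not BigTable’s or Cassandra’s API. `SparseTable` and its methods are hypothetical names, and the row and column names are invented.

```python
from collections import defaultdict


class SparseTable:
    """Toy map of (row key, column, timestamp) -> value, held in memory."""

    def __init__(self):
        # row -> column -> list of (timestamp, value), newest first
        self._rows = defaultdict(lambda: defaultdict(list))

    def put(self, row: str, column: str, value, timestamp: int) -> None:
        cells = self._rows[row][column]
        cells.append((timestamp, value))
        cells.sort(key=lambda cell: cell[0], reverse=True)

    def get(self, row: str, column: str, default=None):
        """Return the most recent value for a cell, or default if absent."""
        cells = self._rows.get(row, {}).get(column)
        return cells[0][1] if cells else default

    def scan(self, start_row: str, end_row: str):
        """Yield (row, latest columns) for start_row <= row < end_row, in order."""
        for row in sorted(self._rows):
            if start_row <= row < end_row:
                latest = {col: cells[0][1] for col, cells in self._rows[row].items()}
                yield row, latest


table = SparseTable()
table.put("com.example/index", "contents:html", "<html>v1</html>", timestamp=1)
table.put("com.example/index", "contents:html", "<html>v2</html>", timestamp=2)
table.put("com.example/index", "anchor:other.org", "Example", timestamp=1)
table.put("org.other/home", "contents:html", "<html>other</html>", timestamp=1)

print(table.get("com.example/index", "contents:html"))    # <html>v2</html>
print([row for row, _ in table.scan("com", "con")])       # ['com.example/index']
```

Rows need not share columns, older versions of a cell are kept, and sorted row keys make range scans possible. None of that comes for free from a plain hash table.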

The videos from the NoSQL conference cover a few other players in the distributed data store space.

There’s a lot more good material out there. This is just one way to get started.

