Robot Has No Heart

Xavier Shay blogs here


Evolution of a graph

Recently I have wanted to chart some cost data I collected on various foods. As a baseline for discussion, here is a very vanilla Excel-type graph, reminiscent of ones I am certain you have seen in PowerPoint presentations:

This is not a good graph for several reasons:

  • Only provides a general overview of the data – some foods are cheaper, some more expensive, so what?
  • Labels feel cramped and ugly.
  • The grid is too prominent and distracting, without being very helpful – you can’t read accurate values from it.

The biggest problem is that it doesn’t “invite the eye to compare”. It doesn’t leave an impact. The first step to addressing this is to revisit the data – it’s quite possible you just have boring data. In this case, I improved the data by coding it according to whether it is vegetarian or not.

Version 2

For the next iteration of this graph, I colored the graph to highlight the vegetarian aspect of the food. To address the other issues, I moved the labels into the legend, and completely removed the grid, instead displaying the values directly on the graph. This technique works due to the low number of data points. You can think of it as “enhancing” the table rather than displaying a high level overview of it. Also, a serif font (Georgia) was used.

This is certainly an improvement, but it still has its flaws:

  • There are 8 different colors, which distract from the data, and the vegetarian data is muted.
  • It is much harder to identify the food with the data point, now that the labels have been moved into the legend.

Final

I iterated again, moving the labels back down to the x-axis, which in addition to solving the identification problem, allowed me to drop back down to 2 colours. In our initial graph this felt cramped, so I added some more whitespace and also kept the serif font from the last iteration.
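The two-colour idea can be sketched in a few lines. This is only an illustration in Python, not the code behind tufte-graph: the rows, the colour codes and the `bar_specs` helper are all hypothetical placeholders. Each bar gets its label for the x-axis, its value as text to print directly on the bar, and one of just two colours depending on the vegetarian coding.

```python
# Hypothetical placeholder rows: (label, cost, is_vegetarian).
# These are NOT the figures from the post.
FOODS = [
    ("food A", 1.0, True),
    ("food B", 2.5, False),
    ("food C", 4.0, True),
]

VEG_COLOUR = "#6b8e23"    # arbitrary highlight colour (assumption)
OTHER_COLOUR = "#bbbbbb"  # arbitrary muted colour (assumption)


def bar_specs(rows, height=100):
    """Return one dict per bar: x-axis label, value text, colour, pixel height."""
    tallest = max(cost for _, cost, _ in rows)
    specs = []
    for label, cost, is_veg in rows:
        specs.append({
            "label": label,                    # goes under the bar on the x-axis
            "value_text": f"{cost:.2f}",       # printed on the bar instead of a grid
            "colour": VEG_COLOUR if is_veg else OTHER_COLOUR,
            "height": round(height * cost / tallest),
        })
    return specs


for spec in bar_specs(FOODS):
    print(spec)
```

Keeping the colour decision in one place makes it easy to swap in a palette later without touching the rest of the drawing code.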

This version of the graph speaks much louder. It’s easier on the eye, and the conclusion I want to draw from the data is clearly expressed. I am using this graph (with proper references and notes) on a new information site I’m working on – it’s far from complete but you can follow along on github if you’re interested.

Tools

The first graph was made with the OpenOffice spreadsheet, the second with a hacked version of flot for jQuery. The final graph was made with a new jQuery plugin I wrote called tufte-graph. There is a meta-lesson here: I spent hours hacking different JS libraries to try to get them working exactly how I wanted, but in the end the quickest solution was to just write what I needed.

I use Colour Lovers to find nice colour palettes. Works much better than trying random RGB codes.

Final word

Spend time on your graphs. A picture is worth a thousand words. Graphs are too often neglected, and it doesn’t take much effort to make them really shine.

Data is fun

This is a story about a graph.

Inspiration struck just before sunrise one Sunday morning. 8 of us, too tired to sleep, decided to construct a relationship map of the local swing dancing scene. Naturally, the discussion turned to relationships on a micro level … who dances with who, who asks who, and the like, a topic quickly abandoned since gossip is much more readily available data at 5am. But the seed was sown and my mind was compelled to tend it. On Monday I borrowed a copy of Tufte’s The Visual Display of Quantitative Information from work and, well, if you don’t feel like drawing a graph after reading that book there is something wrong with you.

Collection

On the following Wednesday I packed up my laptop and set off to brat pack (my performance troupe) rehearsal. Innocuously planted in the line of other machines waiting to play music or show off videos, my iSight went unnoticed as it snapped a picture of the dance floor every second during social dancing, weaving them together into a little over 1 minute of footage.

That Friday, after a few too many post-work beers at the local and in an appropriate data collection mood, I reviewed the footage and created a two column table: lead on the left, follow on the right, one row per song. The low quality of the iSight made identifying couples towards the rear of the hall tricky, but the tendency of dancers to wear distinctly colored clothes made it possible.
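A table like that is easy to summarise before drawing anything. Here is a minimal sketch in Python (not the Processing code used for the graph), assuming each row records one lead/follow pairing; the names in `DANCES` and the `tally` helper are hypothetical placeholders, not the real data.

```python
from collections import Counter

# Hypothetical placeholder rows: (lead, follow), one per recorded pairing.
DANCES = [
    ("lead 1", "follow 1"),
    ("lead 2", "follow 2"),
    ("lead 1", "follow 2"),
    ("lead 1", "follow 1"),
]


def tally(dances):
    """Count how often each pair danced and how many dances each person had."""
    pair_counts = Counter(dances)                           # weight of each connecting line
    lead_counts = Counter(lead for lead, _ in dances)       # dances per lead
    follow_counts = Counter(follow for _, follow in dances) # dances per follow
    return pair_counts, lead_counts, follow_counts


pairs, leads, follows = tally(DANCES)
for (lead, follow), count in pairs.most_common():
    print(f"{lead} + {follow}: {count}")
```

The pair counts are what a network drawing needs for its connecting lines, and the per-dancer totals are the sort of number a small histogram beside each dancer could show.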

Presentation

A brief stint of research led me to Processing, a Java environment for creating neat data visuals. I would have preferred something with Ruby, but you take what you can get. My Java was a bit rusty, and the collection handling was downright clumsy compared to what I was used to in Ruby, but after a Saturday of hacking I had something I’m quite proud of. Behold, the “dancing network” of brat pack for the 15th August:

Brat Pack Dancing Network

I tried to apply many of Tufte’s ideas in the creation of this graph. It was initially presented vertically, but I rotated it so it was easier to compare the histograms. Chart-junk is kept to a minimum: only the horizontal lines representing each dancer are non-data carrying, and the connecting lines were deliberately thinned and lightened to make interpreting the myriad of partnerships easier. Labels use a serif font, provide scale information and, except for one, are all presented horizontally.

Looking forward, I’d like to collect some richer data – both more of the same and also extra information like tempo of song – to incorporate into the graph. I suspect the best way to do this would be to record normal video rather than timelapse, to both grab the audio and also make identifying partnerships easier.
