Cutting Room Floor: Data Visualizations

I recently took a class on information visualization as part of my MSc at Oxford. For my final project, I created a series of visualizations based on data from Panos Ipeirotis’ 2010 demographic survey of workers on Amazon Mechanical Turk (link to paper + data). Unfortunately (fortunately?) for me, the final visualization had to fit on an A4 sheet of paper, so I had to excise a few graphs. Here’s what didn’t make the cut.

Bar chart of workers' tenure on MTurk when it is or isn't the primary income source

A classic bar chart, illustrating how long respondents had been on MTurk, split by whether it was their primary income source. I created all these graphs with Excel. Apart from the colors, which I changed to match MTurk's color scheme, this is pretty much an out-of-the-box Excel graph. Not a bad thing, but ultimately not part of the story I wanted to tell in my final visualization.

HITs completed per week by workers who find the tasks entertaining

This is an example of a more minimalist bar chart. As Edward Tufte recommends in his book, The Visual Display of Quantitative Information, you should minimize the amount of non-data ink in your graphs, as it can be distracting and doesn’t necessarily make the chart easier to read.

The chart shows how many HITs per week were completed by respondents who work on MTurk because they find the tasks entertaining. I cut this graph because my visualization was meant for a general audience who may not be familiar with MTurk. For the graph to make any sense, they would need to know what a HIT is and how long one typically takes to complete; this would have been too difficult to explain within the visualization.

HITs completed per week by primary income

This is a tornado (or butterfly) chart showing the difference in the number of HITs completed on MTurk per week depending on whether MTurk is a worker’s primary income source. The other charts can be generated automatically and then modified, but Excel can’t handle a double x-axis, so you have to trick it into making this one by means of careful math and several different online tutorials. What I like about this chart is that you can pretty easily see the difference between the two distributions. This was slightly harder to see when the chart looked like the first example, with the bars next to each other. I wound up cutting this for the same reason as the second chart.
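For the curious, here is a rough Python sketch of the kind of math such a trick usually involves. It is one common approach and not necessarily the exact steps I followed in Excel: negate one series so its bars grow leftward from a shared zero line, then label the axis with absolute values so the minus signs never show. The function names and the counts below are made up for illustration and are not taken from the survey data.

```python
def mirror_series(values):
    """Negate a series so its bars extend to the left of a shared zero axis."""
    return [-v for v in values]


def axis_label(value):
    """Label an axis tick by magnitude, hiding the sign added by mirroring."""
    return str(abs(value))


def text_butterfly(categories, left_values, right_values):
    """Render a rough text butterfly chart, one '#' per respondent."""
    mirrored = mirror_series(left_values)
    width = max(left_values)  # widest left bar sets the left column width
    lines = []
    for cat, left, right in zip(categories, mirrored, right_values):
        left_bar = ("#" * abs(left)).rjust(width)  # right-align toward the axis
        right_bar = "#" * right
        lines.append(f"{left_bar} | {cat.center(7)} | {right_bar}")
    return lines


# Placeholder numbers, NOT from the survey: HITs-per-week buckets and
# respondent counts for the two groups.
buckets = ["1-5", "6-20", "21-50"]
primary_income = [2, 5, 8]
not_primary_income = [9, 6, 3]

chart_lines = text_butterfly(buckets, primary_income, not_primary_income)
for line in chart_lines:
    print(line)
```

In a spreadsheet, the same idea would mean plotting the negated column as an ordinary bar series and applying a number format that hides the minus sign.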

There were actually a few more than these three. Aside from the difficulties of the butterfly chart, making charts and graphs is pretty fun.