If you’re in the business of conveying information to people, you probably want to engage them enough that they seek out more information and deepen their understanding of the data. That’s the premise Nick Diakopoulos is exploring with some interactive visualizations of demographic data.
Nick (a former FXPAL Intern) is exploring the design space of interactive, semi-automated visualizations that can be put together quickly and yet leverage the kinds of interaction design characteristic of computer games.
The idea is to enable a journalist to illustrate a story with an interactive data set that would engage readers and cause them to learn more about the data. The challenges include how to create an engaging experience without the kind of authoring process that goes into creating games, and how to translate that engagement into learning.
Nick’s visualizations (which you can try here) are based on the distributions of various kinds of data over the map of the US. The interactions are interesting, but I think more work remains to be done to make them more engaging and more informative.
My take on improving engagement is to add a competitive aspect. Rather than deriving the reward structure of the game from the data, it might be better to award points for solving tasks and have the system rank you against other recent players. Of course this requires a more sophisticated back end that can keep track of scores, but these are not too hard to build now.
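To give a sense of how little back end the ranking idea needs, here is a minimal sketch in Python. It assumes scores live in memory and that "recent" means the last N submissions; the class name RecentLeaderboard, the window parameter, and the player names are all made up for illustration and are not part of Nick's system.

```python
from collections import deque


class RecentLeaderboard:
    """Keeps only the most recent submissions and ranks each new score against them."""

    def __init__(self, window=100):
        # A bounded deque silently drops the oldest entry once it is full.
        self.scores = deque(maxlen=window)

    def submit(self, player, points):
        """Record a score and return (rank, number of recent players)."""
        self.scores.append((player, points))
        # Rank is 1 plus the number of recent scores strictly higher than this one.
        better = sum(1 for _, p in self.scores if p > points)
        return better + 1, len(self.scores)


# Hypothetical players and scores, with a deliberately tiny window.
board = RecentLeaderboard(window=3)
board.submit("ann", 50)
board.submit("bob", 80)
rank, total = board.submit("cat", 60)
```

A real deployment would need persistence and some notion of time rather than a fixed count, but the ranking logic itself stays about this simple.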
To make the interaction more informative, the tasks should be designed so that they cause people to form and then test hypotheses. Nick’s data includes such health-related variables as adult smoking, binge drinking, diabetes and obesity rates (among others). One class of interactions that might be both challenging and revealing of how unevenly these kinds of variables tend to be distributed is finding regional outliers: states or counties whose reported rates are significantly different from those of their neighbors. The task could be made more interesting by starting with a coarse sample of the data (state level, for example), then drilling down into more specific regions, and finally into counties. My guess is that state-level averages mask a lot of county-level variability, which might make for some interesting sleuthing.
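One simple way to define a regional outlier, sketched below in Python, is to compare each region's rate with the average rate of its neighbors and flag the ones where the gap exceeds some threshold. This is only one possible definition; the function name, the adjacency table, the threshold, and the rates are all invented for illustration and do not come from Nick's data.

```python
from statistics import mean


def regional_outliers(rates, neighbors, threshold):
    """Return {region: gap} for regions whose rate differs from the mean
    rate of their neighbors by more than `threshold` (same units as rates)."""
    flagged = {}
    for region, rate in rates.items():
        # Only consider neighbors for which a rate was actually reported.
        adjacent = [rates[n] for n in neighbors.get(region, []) if n in rates]
        if not adjacent:
            continue  # nothing to compare against
        gap = rate - mean(adjacent)
        if abs(gap) > threshold:
            flagged[region] = gap
    return flagged


# Made-up rates (percent) for four mutually adjacent regions.
rates = {"A": 20.0, "B": 21.0, "C": 19.0, "D": 30.0}
neighbors = {
    "A": ["B", "C", "D"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["A", "B", "C"],
}
outliers = regional_outliers(rates, neighbors, threshold=5.0)
```

The same function could be run first on states and then on the counties within a selected state, which is the drill-down described above.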
A second class of hypotheses meaningful for these kinds of data revolves around correlations: Where are two variables (e.g., soda consumed and diabetes rates) correlated (positively? negatively?), and where are they not? Is there another factor that can predict the correlation? These are basic data analysis questions that might be useful to teach to students and to newspaper readers alike.
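The "where are they correlated, and where not" question can be sketched as a Pearson correlation computed separately for each region. The Python below is a minimal illustration under assumed names: the record fields ("region", "soda", "diabetes") and all of the numbers are hypothetical, not drawn from Nick's data set.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient, or None when it is undefined."""
    n = len(xs)
    if n != len(ys) or n < 2:
        return None
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return None  # a constant variable has no defined correlation
    return cov / sqrt(vx * vy)


def correlation_by_region(records, x_key, y_key):
    """Group records by their 'region' field and correlate x_key with y_key in each group."""
    groups = {}
    for rec in records:
        groups.setdefault(rec["region"], []).append((rec[x_key], rec[y_key]))
    return {
        region: pearson([p[0] for p in pts], [p[1] for p in pts])
        for region, pts in groups.items()
    }


# Made-up county records: one region trends together, the other in opposite directions.
records = [
    {"region": "North", "soda": 1.0, "diabetes": 2.0},
    {"region": "North", "soda": 2.0, "diabetes": 4.0},
    {"region": "North", "soda": 3.0, "diabetes": 6.0},
    {"region": "South", "soda": 1.0, "diabetes": 6.0},
    {"region": "South", "soda": 2.0, "diabetes": 4.0},
    {"region": "South", "soda": 3.0, "diabetes": 2.0},
]
by_region = correlation_by_region(records, "soda", "diabetes")
```

A coefficient near +1 or -1 in one region and near 0 in another is exactly the kind of contrast a reader could be asked to find and then try to explain with a third variable.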
Another way to increase engagement is to localize the data to the region from which the person is accessing it (or to a region the user selects), so that readers can explore more regional perspectives. Starting with a person’s home town or county might make the data more meaningful and might provoke more informed hypothesis-testing.
It will be interesting to see what results Nick obtains from his explorations, and how these lessons can be translated into tools that journalists and instructors can use without significant training.