Ten Questions to Ask when Creating a Visualization
All 10QViz Posts
This post is motivated by a new interactive visualization tool provided by glue solutions, inc. that allows for visual exploration of how the IHME models of the COVID-19 pandemic have evolved over time.
In the body of the post, we take a look at how uncertainty is represented in the original Institute for Health Metrics and Evaluation (IHME) deaths per day graphs for COVID-19, and then at the graphical features of the new interactive tool for exploring the IHME models’ predictive history. An essay online at the Prediction Project site offers context on why exploring the history of the IHME models graphically is so interesting.
➡ At this link, you can create your own graphical comparisons of the IHME deaths/day model by clicking on the three dots at the top-right of each graphic there and downloading a PNG
➡ To join the discussion, please comment using the Disqus tool at the bottom of this post, which permits you to upload graphics with your comment.
The IHME Daily Deaths Display
Here’s an annotated sample of a typical IHME plot for the whole United States, with a very clean design, in which a solid red curve shows the record of past deaths, and a dashed red curve shows the model’s forecast of deaths/day, going into the future. The shaded red band shows a 95% confidence interval, illustrating the uncertainty in the model prediction. On the IHME web site, users can hover over the plot to read off the forecast, with uncertainty information, for dates in the future, or actual recorded deaths for dates in the past.
The glue solutions interactive tool for creating historical comparisons of IHME forecasts
Here is an annotated sample of output from the glue solutions interactive tool, for the United States. In the glue solutions visualizations, in order to distinguish data from model, actual deaths/day are shown as red points, and all model information is shown in blue. Shading for the (95%) confidence interval works exactly as in the original IHME visualizations, but is shown in blue. The average forecasts are shown as solid blue lines within the blue shaded uncertainty bands.
A graphical choice has been made not to assign different colors to different dates. A multi-color option, for a small number of overlain models, does make it easier to distinguish which model is which, but as more and more models are overlain, the color in overlap regions turns to mud. Choosing instead just one color (blue) for every model allows users to see a region of ever-darker blue, corresponding to the region of the graph where past models agree, as more and more models are added.
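The "ever-darker blue" effect follows from ordinary alpha compositing: each semi-transparent band lets a fraction of what lies beneath show through, so stacking bands multiplies those fractions. The sketch below works through that arithmetic. The opacity value of 0.15 is an assumption chosen for illustration, not a value taken from the glue solutions tool.

```python
def stacked_opacity(alpha: float, n_layers: int) -> float:
    """Combined opacity of n identical semi-transparent layers drawn on top of each other.

    Each layer lets a fraction (1 - alpha) of what is beneath it show through,
    so n layers together let (1 - alpha) ** n show through.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if n_layers < 0:
        raise ValueError("n_layers must be non-negative")
    return 1.0 - (1.0 - alpha) ** n_layers


# Assumed per-band opacity, for illustration only.
ASSUMED_BAND_ALPHA = 0.15

for n in (1, 2, 5, 10, 20):
    opacity = stacked_opacity(ASSUMED_BAND_ALPHA, n)
    print(f"{n:2d} overlapping bands -> combined opacity {opacity:.2f}")
```

With a single hue, the combined opacity only ever increases as bands are added, so darker regions can be read as "more models agree here". With several hues the same overlap would blend the colors together, which is the "mud" described above.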
The glue solutions site presenting the interactive tool also reflects choices about how users can interact with these graphics. A time slider has been added outside of the graphic, and a date selector has been added on the right. Constraints on these graphical choices were imposed by the Vega visualization grammar. (For example, the developer of the tool, Jonathan Foster, would have liked to add color in the graph title, a slider within the graph, and more responsive features for mobile, but Vega could not allow those features when this graphic was made.)
TL;DR: Throughout history, people have recorded events on timelines. This post is about the remarkably varied design space of timelines as explanation tools.
We begin our timeline adventure with a primer on spatial metaphors for time.
Joseph Priestley’s “Chart of Biography” (1765)
Perhaps the most common representation for time is linear: the “time as an arrow” metaphor. Consider Joseph Priestley’s “Chart of Biography”, published over 250 years ago. In Priestley’s design, time is mapped from left to right (the dates are BCE). You can see the lifespans of a number of historical figures, offset vertically to avoid overdrawing. Otherwise there’s no meaning to the vertical placement aside from faceting the data with statesmen at the bottom and philosophers at the top. This left-to-right linear representation of time remains popular today, but it is certainly not the only way to explain a set of events.
A timeline design space with three dimensions: representation, scale, and layout. Different combinations of these serve different communicative intents.
Alternative representations of time
Radial representations are especially effective when explaining and highlighting natural cycles and events that repeat, such as biological life cycles or the seasons of the year. However, time is both linear and cyclic, something that repeats and yet coils upwards or forwards. Thus, spirals are certainly another way to represent time, though one that is less common than the line or the circle. Yet another representation for time is the grid, often manifesting as a calendar. Like radial representations, calendars are good for showing repeating events and deviations from patterns of events, especially when these patterns correspond to conventions of weeks and months.
One of Mark Twain’s whimsical curved timelines (1914).
The last category of representation doesn’t conform to any specific shape. Mark Twain drew whimsical curved timelines for remembering dates along annotated curves. Many timeline infographics that we see today still have this freeform board-game-like appearance.
What is a timeline for? Timelines as explanation tools
These different spatial metaphors for time are one dimension of a timeline design space. Before introducing the second dimension, consider the following question: What is a timeline for? Why do we draw these in the first place?
Basically, a timeline can be used to explain “what happened when?”. If you unpack this question, a timeline can answer a number of more detailed questions. In what sequence did events occur? How long were they? Did event A and event B co-occur? And when did they occur relative to some baseline event? These questions relate to the second dimension of a timeline design space: time scale.
Frames of reference: Alternative time scales
As an example, consider Visualising Painters’ Lives by Accurat, depicting the lives of notable 20th century painters. In these timelines, each lifespan, artistic period, travel, and romantic conquest for several famous painters appear along a common chronological scale.
This video shows a simplified timeline representing the career of Salvador Dalí, inspired by Accurat’s project. You can see his artistic periods like Surrealism and Dada and his travel to places like Paris. As the video progresses, the career of Matisse is shown alongside Dalí’s. Using a common chronological scale, you will notice that they were born at different times. You will see when their respective Cubist and Abstract periods took place and how they might have influenced one another. When the scale changes to one relative to their birth dates, you can compare the age at which they started their art careers and how long they lived. You can also spot similarities like how they both traveled to Paris in their early thirties. Finally, the aspect of chronological duration disappears altogether, and what remains is simply a sequential ordering of their artistic periods. Ultimately, each of these transformations resulted in a different timeline, telling a different story.
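The three time scales described here can be thought of as simple transformations of the same event data. The following is a minimal sketch under stated assumptions: the `Event` type, the function names, and all of the years are hypothetical placeholders made up for illustration, not real biographical data and not code from the timeline research project.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    label: str
    start: int  # position on the scale (a calendar year on a chronological scale)
    end: int


def to_relative(events: List[Event], baseline_year: int) -> List[Event]:
    """Relative scale: measure every event in years since a baseline, such as a birth year."""
    return [Event(e.label, e.start - baseline_year, e.end - baseline_year) for e in events]


def to_sequential(events: List[Event]) -> List[Event]:
    """Sequential scale: discard durations and keep only the order in which events began."""
    ordered = sorted(events, key=lambda e: e.start)
    return [Event(e.label, i, i + 1) for i, e in enumerate(ordered)]


# Hypothetical painters with made-up years (chronological scale).
painter_a = [Event("Period X", 1920, 1930), Event("Trip abroad", 1932, 1933)]
painter_b = [Event("Period X", 1890, 1905), Event("Trip abroad", 1902, 1903)]

print(to_relative(painter_a, baseline_year=1900))  # ages, not calendar years
print(to_relative(painter_b, baseline_year=1870))
print(to_sequential(painter_b))                    # order only, no durations
```

On the chronological scale the two made-up careers barely overlap. On the relative scale both trips land at the same age, which is the kind of similarity the relative view is meant to surface.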
A third dimension of the timeline design space is layout, or how to draw one or more timelines within a page or display. You could draw a single timeline. You could facet the events into multiple timelines, such as by drawing one timeline per famous painter. Or you could wrap a single timeline into meaningful segments of time, like a decade or a century.
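As a rough illustration of these layout options, faceting and segmenting can both be expressed as ways of grouping one list of events. This is a sketch only: the function names, the dictionary fields, and the sample events are assumptions invented for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List

EventDict = Dict[str, object]


def facet(events: List[EventDict], key: Callable[[EventDict], Hashable]) -> Dict[Hashable, List[EventDict]]:
    """Faceted layout: split the events into one timeline per facet value."""
    timelines = defaultdict(list)
    for event in events:
        timelines[key(event)].append(event)
    return dict(timelines)


def segment_by_decade(events: List[EventDict]) -> Dict[int, List[EventDict]]:
    """Segmented layout: wrap a single timeline into rows, one row per decade."""
    rows = defaultdict(list)
    for event in events:
        decade = (int(event["year"]) // 10) * 10
        rows[decade].append(event)
    return dict(sorted(rows.items()))


# Hypothetical events with made-up years.
events = [
    {"who": "Painter A", "label": "First exhibition", "year": 1921},
    {"who": "Painter B", "label": "Moves abroad", "year": 1935},
    {"who": "Painter A", "label": "New period begins", "year": 1938},
]

print(facet(events, key=lambda e: e["who"]))  # one timeline per painter
print(segment_by_decade(events))              # one row per decade
```

A single (unified) timeline is simply the ungrouped list, and the two groupings could be combined by segmenting each facet in turn.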
The origin of this timeline design space
This design space arose from a research project led by Matthew Brehmer in collaboration with Bongshin Lee, Nathalie Henry Riche, Benjamin Bach, and Tamara Munzner. The team collected and categorized 145 timelines and timeline visualization tools from various sources, which helped them identify the dimensions of the design space. Next, they verified that the design space could be used to label 118 additional timelines from different sources. They also implemented points in the design space with 28 event datasets. These datasets varied in a number of ways: the number of events, the temporal extent of the events, and the rate of event co-occurrence.
This process of categorization and implementation also led the researchers to identify 20 viable points in the design space. These points are combinations of representation, scale, and layout that are purposeful in terms of their communicative intent, interpretable in terms of which perceptual task the viewer is expected to make, and generalizable across a range of timeline datasets. This thumbnail gallery acts as a visual index for these points in the design space, and at timelinesrevisited.github.io, you will find each of these example designs in detail along with a description of what narrative or communicative intent they serve.
Considerations for storytelling with timelines
So now that you have all of these design choices, how do you use these design choices to craft explanations with timelines? How do you combine different points in this design space? This is important because despite the variety of ways that we visually represent and scale time, existing timeline presentation tools limit us to linear representations and chronological time scales. Existing tools also tend toward a chronological narrative. Some tools show the entire timeline as a static image, and viewers are therefore likely to begin at the start of the timeline. Alternatively, other tools reveal events one at a time in chronological order.
For some stories, a chronological introduction of events makes sense, while for others it does not. Consider the painters example, in which the career of Matisse adds context to the career of Dalí. Additionally, to achieve expressive narrative design, you can make use of animation, highlighting, and annotation to incrementally reveal parts of a narrative, and allow the viewer to make new comparisons.
In this video, you will encounter a set of radial timelines depicting a typical 24 hours in the lives of 26 writers, artists, composers, and the like: when they work, eat, sleep, exercise, and do other activities. A good starting point is to ask: when do creative people create? Are people similar in this regard? You can also ask about the relationship between sleep and creativity. What about variation and creativity? A chronological scale isn’t the best way to convey the number or heterogeneity of activities. Instead, it’s better to use a sequential scale to highlight these aspects. And to determine who varied the most and least, a linear representation is perhaps better than a radial one. Toward the end of the video, a chronological scale returns. This scale allows you to compare timelines just by scanning up and down, to spot synchronicities such as who works or sleeps at the same time of day. It also invites you to compare your own daily rhythm to those of these creative people.
Conclusion
Despite the apparent simplicity of the question of “What happened when?”, this post hopefully relayed the richness of the timeline design space. You have different visual representations and different time scales at your disposal that serve different communicative purposes. This design space grows even richer when you use dynamic storytelling elements like incremental reveal, selective highlighting, animated transitions, and an annotation layer comprised of labels and captions (metadata for events or for the timeline itself, respectively).
I tend to think of information graphics as a continuum, with figurative representations at one end and abstract representations on the other (see the first Figure below). In the world of science visualization, you could argue that the full continuum can also be referred to as data visualizations. After all, essentially all of our work is rooted in data collection at some stage in the process: from bone length measurements in dinosaur reconstructions, to meticulously documented laboratory experiments that build up to a more complete understanding of processes like photosynthesis, to representations of mathematical expressions (like Feynman diagrams), to straight-up plotting of the raw data itself, in chart form.
Outside of the world of science visualization, it may be more useful to think of the continuum like this:
When I flip through old issues of Scientific American, it strikes me that many artists worked across the full spectrum. But as a graphics editor at the magazine now, I find myself maintaining discrete freelance pools for each of the different points along the continuum.
Perhaps this is an artifact of my own biases, but it occurs to me that this increased specialization may also be in part due to the shifting tools of each of these areas. When the primary tool for developing representative illustrations, explanatory diagrams, and data visualizations for print magazines was pen and ink, an artist could become a master at pen and ink, then explore different methods of problem solving in each of these areas.
Since desktop publishing became ubiquitous and digital rendering tools diversified and became more widely available, it seems to me that the simple act of choosing a primary tool starts to define the edges of the artist’s scope. As an art director, I find myself specifically looking for 3D artists to build physical objects; folks who home in on composition and the flow of information by iterating with tools like Adobe Illustrator for explanatory diagrams; and data designers who build solutions with code for visualizing large datasets.
Each of these tools, mediums, styles and genres takes lots of time to master, and tends to favor certain portions of the continuum. Many of the conferences I attend and communities that I engage with seem to reinforce these divisions by focusing on the tools. And it seems that artists who span more than one of these orbs are harder and harder to find.
Perhaps that way of thinking about things is a bit overly dramatic. The reality is probably much more like this.
And perhaps this is a completely natural and fine state to be in—particularly since the primary tools for these different sub-disciplines have bifurcated over time. And perhaps there isn’t value in trying to force discrete clusters to reconnect.
That said, I argue that even if you don’t have the desire to work across the full continuum—or the time to dedicate to becoming proficient across the full continuum—there is lots to be learned from each of these clusters, and I’d love to see more cross pollination of ideas between them. I think we’d all benefit, if things looked more like this:
And even better, if things looked like this:
I’m not arguing that everyone along this continuum should learn how to code. Or that everyone along the continuum should build clay models and paint from life. I’m arguing that we can, and should, learn how science visualizers from across the full spectrum think through and solve problems.
Included in Stephen Few’s very interesting visualization blog (perceptual edge) is the provocatively titled “Save the Pies for Dessert” post. Pie charts are notoriously bad for perceptually judging magnitude. Here is an annotated excerpt from Few’s post, giving just one example of how hard it can be to judge scale using pie charts…
Bonus: An amusing pie chart which shows the shadow illusion featured in the Categories question is this fascinating little image. Once you see the pyramid, you cannot unsee it:
I have not yet managed to identify the original maker of this (sort of) meme; if you know, please tell me in the comments. The image above is copied from Rebecca Barter here: http://www.rebeccabarter.com/blog/2015-07-23-pie/
Rainbow colors are pretty, and many of us like them. However, go to any visualization-related conference, and you’ll hear a lot of ‘rainbow-hate’. Where does that come from? Below is an excellent example that shows how rainbow color tables might mislead and make us see categories, or patterns, that might not be there. The image below is featured in a blog post titled “How The Rainbow Color Map Misleads” by Robert Kosara, in his wonderful visualization and visual communication blog eagereyes:
There is more to the art and science of choosing colors. Here is another informative post by Lisa Charlotte Rost: https://blog.datawrapper.de/colors/
Consider this infographic about imprisonment, from this article on the American Legislative Exchange Council blog. Most people would look at it and find it very engaging and attractive, which it is. But, as a visualization expert, one wonders whether the odd coloring variations in the outer ring of the main figure and in the “Juvenile” block at right, which just show how the larger wedges (categories) divide up more finely (into sub-categories), wouldn’t be better shown in a Tree Map, using the ideas about showing hierarchical categories proposed by Ben Shneiderman in the 1990s. A Tree Map version of these data would almost certainly show the area of sub-categories and categories relative to each other (context) better than the snazzy graphic shown here.
In 2007, for their paper entitled Towards a Periodic Table of Visualization Methods for Management, Lengler & Eppler created the static graphic shown here. Later, Chris Wallace created an interactive JavaScript version, now hosted at the excellent visual-literacy.org web site.
This fun table has SO much to teach us about visualization–just think about all the 10QViz “Questions” to which it directly relates…
Primarily, the highly successful table is about “explaining” what’s meant by different kinds of visualization, but it (especially the interactive display mode) also lets a user “explore” how those visualizations relate to each other. The table’s two-dimensional layout mimics the periodic table of the elements, familiar to most viewers (cf. “Who?”), and suggests (and labels) categorical groupings (shown in patterns of color and position). The interactivity adds hugely rich “metadata,” by way of examples for each cell in the table.
Reference for the Original Table: “Towards A Periodic Table of Visualization Methods for Management”
Lengler R., Eppler M. (2007). Towards A Periodic Table of Visualization Methods for Management. IASTED Proceedings of the Conference on Graphics and Visualization in Engineering (GVE 2007), Clearwater, Florida, USA. More on the history of the interactive version of the table can be found in this blog.
For hundreds of years, scientists have published their results in scientific journals that were printed on paper. Today, though, most journals have gone entirely online. Articles are less and less frequently printed out and read on paper, so why should they still look and function exactly the way they did in the 1600s?
Josh Peek and I, along with our colleagues, wrote a fully online paper presenting The ‘Paper’ of the Future back in 2014, which highlights (with embedded demonstrations) many of the technologies available to scientists publishing today, and in the near future. One particularly important technology (“3DPDF”) discussed in that paper of the “future” was actually first deployed in a Nature article by my “Astronomical Medicine” collaborators and me, way back in 2009.
Our challenge was to show the difference between two “segmentation” techniques used to define salient structures inside of star-forming regions. The science isn’t important here (sorry). What’s important is that we wanted to offer the “reader” multiple, interactive, views of high-dimensional data, inside of a journal article.
To see the PDF in action, take a look at this video, or download the “nature_demo” file and open it, on any Mac or PC, with an Adobe PDF viewer of any kind (not Preview).
Other authors (e.g. Peek 2012) have since published methods for creating these 3D PDFs using free software, and a (perhaps too small!) number of authors have now embedded these 3D images inside scholarly articles. Even though interactive images are clearly seen to add value to articles, they are not (yet) widely used. 3D PDF as a format may be short-lived, as articles move more and more to a fully online environment, where other (e.g. JavaScript-based) technologies can offer superior options. BUT, the general idea of embedding data and interactive views of it, be they “3D” or not, is extremely valuable, and we will return to it in future posts. For now, go have a look at The ‘Paper’ of the Future (Goodman et al. 2014).
We knew this might be the first question you’d ask! One of the 10QViz founders (Alyssa Goodman) is quite involved in making new visualization software, and in the scientific software world in general, so she promised to make posts about software from time to time. At present, Alyssa’s pet project is “glue,” which is a Python-based (but GUI-enabled) tool for exploring diverse data sets using linked views. She’ll make a separate post on glue soon; its utility applies across all 10 Questions.