Do you want to show or explore pre-existing, known, human-interpretable categories?
For example: state boundaries or other geographic phenomena in 2D, galaxy limits in 3D, organs within an MRI scan, or nameable objects (such as “trees,” “birds,” “refrigerators”).
Note that this entry is closely related to Patterns, which refers to statistical clusters and other previously-unknown patterns that emerge from data exploration.
When it is appropriate to categorize information, and a goal is to explore or explain if or how objects or phenomena fall into a category, the way visual variables are used will have a dramatic impact on how a visualization is interpreted. Inappropriate use of visual variables can hide existing categories, or mislead by suggesting categories that are simply wrong or do not exist. Decisions or conclusions based on misleading or wrong categorizations can be costly, so when we rely on visualizations to make inferences, it is important to be aware of visual variables’ potential to mislead or misinform. Below we feature examples based on two visual variables, color and shadow, to demonstrate how they might interfere with categorical recognition tasks.
When deciding on colors for a visualization, three things will change what we see: the number of colors perceived; the color scheme (qualitative, sequential or diverging*); and the color distance (‘temperature’ difference between two shades of the same color). Tools like Color Oracle can simulate how your color choices will look to people with color-perception differences, such as color blindness (which affects nearly 10% of men). In general, when using color to show categories, a good rule of thumb is to use no more than about seven colors (actually 7±2). Rogowitz & Treinish’s 1996 classic (below) offers a great example of how color and cognition interact.
Example 1 for categories (color): ‘Find the 7 differences’
Below, the same data are visualized using two mathematically equivalent color schemes (read more here), which have dramatically different perceptual consequences (source: Rogowitz & Treinish, 1996, borrowed from here). Both maps show elevation categories, as marked by the legends, in meters above and below sea level. ‘Rainbow’ colors (left) obscure information, making it impossible to discern distance above and below sea level. In the right-hand panel, using a salient divergence point (sea level) results in colors above and below sea level being dramatically different, allowing elevation categories to be easily and intuitively understood.
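To make the idea of a salient divergence point concrete, here is a minimal Python sketch. It is not the palette used by Rogowitz & Treinish: the anchor colors, the ±6000 m range, and the function names are all assumptions chosen for illustration. It compares a scheme with a hard break at sea level against a single continuous ramp (standing in for any scheme with no break at sea level, not the rainbow palette itself).

```python
def lerp_color(c0, c1, t):
    """Linearly interpolate between two RGB tuples, with t in [0, 1]."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c0, c1))

# Illustrative anchor colors (assumptions, not a published palette).
DEEP_SEA = (8, 48, 107)        # dark blue
SHALLOW_SEA = (198, 219, 239)  # light blue
LOWLAND = (35, 139, 69)        # green
HIGHLAND = (140, 81, 10)       # brown

def clamp(value, low, high):
    return max(low, min(high, value))

def sea_level_break_color(elev_m, min_m=-6000.0, max_m=6000.0):
    """Map elevation (meters) to RGB, switching color families at 0 m."""
    elev_m = clamp(elev_m, min_m, max_m)
    if elev_m < 0:
        t = (elev_m - min_m) / (0.0 - min_m)  # 0 at the deepest point, 1 at sea level
        return lerp_color(DEEP_SEA, SHALLOW_SEA, t)
    t = elev_m / max_m                        # 0 at sea level, 1 at the highest point
    return lerp_color(LOWLAND, HIGHLAND, t)

def single_ramp_color(elev_m, min_m=-6000.0, max_m=6000.0):
    """Map elevation to RGB along one continuous ramp with no break."""
    elev_m = clamp(elev_m, min_m, max_m)
    t = (elev_m - min_m) / (max_m - min_m)
    return lerp_color(DEEP_SEA, HIGHLAND, t)

def rgb_difference(c0, c1):
    """Sum of absolute per-channel differences (a crude, non-perceptual measure)."""
    return sum(abs(a - b) for a, b in zip(c0, c1))

# Compare the colors assigned to points 1 m below and 1 m above sea level.
break_gap = rgb_difference(sea_level_break_color(-1), sea_level_break_color(1))
ramp_gap = rgb_difference(single_ramp_color(-1), single_ramp_color(1))
print("gap with sea-level break:", break_gap)
print("gap with single ramp:", ramp_gap)
```

With the break, two points just either side of sea level get clearly different colors, so the ‘above’ and ‘below’ categories are visible at a glance; with the single ramp they are nearly indistinguishable. Note that the RGB difference used here is only a rough stand-in for perceived color distance.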
In some cases, visualizations are shady (sorry, could not resist); I mean, they rely on shadows to illustrate 3D, a concept named “shape from shading” in the related literature. Shadows and shading are used a lot to represent 3D shapes, for example in terrain representation. In this case, the virtual light source should be placed at the ‘upper-left’ (north-west, or even better north-north-west) side of the image. Consider the example below.
Example 2 for categories (shadow): ‘disillusioned’
Below we see the same data visualized using two light directions (source: Biland and Çöltekin, 2016, from here). One can perhaps see this as another example of (nominal) elevation categories, where the goal is to identify landforms that are higher or lower, such as a ridge vs. a valley.
Original image caption by Biland and Çöltekin, 2016: “The same digital elevation model (DEM) is hillshaded under a) incident light from 337.5°, and b) from 157.5°. The marked landform (ABC) is a valley. Most observers perceive it correctly as a valley in the left (a), and inversely as a ridge in the right (b). The elevation angle of the light source is set to 45° in both image”.
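The inversion described in the caption can be reproduced with a few lines of arithmetic. The Python sketch below is not the authors' method or any GIS package's hillshade routine; it is a simple Lambertian (surface normal dot light direction) model, and the idealized north-south valley and ridge with 45° side slopes are hypothetical landforms invented for illustration. The azimuths (337.5° and 157.5°) and the 45° light elevation are taken from the caption.

```python
import math

def light_vector(azimuth_deg, altitude_deg):
    """Unit vector pointing toward the light (x = east, y = north, z = up).

    Azimuth is measured clockwise from north; altitude is the angle above
    the horizon.
    """
    az = math.radians(azimuth_deg)
    alt = math.radians(altitude_deg)
    return (math.sin(az) * math.cos(alt),
            math.cos(az) * math.cos(alt),
            math.sin(alt))

def hillshade(dz_dx, dz_dy, azimuth_deg, altitude_deg=45.0):
    """Lambertian brightness in [0, 1] for a surface with the given slopes."""
    lx, ly, lz = light_vector(azimuth_deg, altitude_deg)
    # The upward normal of a surface z(x, y) is (-dz/dx, -dz/dy, 1), normalized.
    norm = math.sqrt(dz_dx ** 2 + dz_dy ** 2 + 1.0)
    return max(0.0, (-dz_dx * lx - dz_dy * ly + lz) / norm)

# Hypothetical landforms running north-south, described only by the
# east-west slope on each flank: (dz/dx on west flank, dz/dx on east flank).
VALLEY = (-1.0, 1.0)  # ground descends toward the central axis
RIDGE = (1.0, -1.0)   # ground ascends toward the central axis

def flank_shades(landform, azimuth_deg):
    """Return (west flank, east flank) brightness for a landform."""
    return tuple(hillshade(slope, 0.0, azimuth_deg) for slope in landform)

valley_nnw = flank_shades(VALLEY, 337.5)  # light from north-north-west
valley_sse = flank_shades(VALLEY, 157.5)  # light from south-south-east
ridge_nnw = flank_shades(RIDGE, 337.5)

print("valley lit from 337.5:", valley_nnw)
print("valley lit from 157.5:", valley_sse)
print("ridge  lit from 337.5:", ridge_nnw)
```

In this model, a valley lit from 157.5° produces the same pattern of bright and dark flanks as a ridge lit from 337.5°. The shading alone cannot tell the two apart, so a viewer who assumes light from the upper left will tend to read the valley in panel (b) as a ridge.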
* It appears that there is a debate on ColorBrewer palettes. We will invite a guest post on this issue, stay tuned!
Last revised: 7th of July 2018, Arzu Çöltekin, edited 19 February 2019 by Alyssa Goodman.