Categories

Do you want to show or explore pre-existing, known, human-interpretable categories?

e.g., state boundaries or other geographic phenomena in 2D, galaxy limits in 3D, organs within an MRI scan, nameable objects (such as “trees,” “birds,” or “refrigerators”).
Note that this entry is closely related to Patterns, which refers to statistical clusters and other previously unknown patterns that emerge from data exploration.


Assuming that it is reasonable to think in categories, and that the goal is to show or explore objects or phenomena that fall into the same category, the way we use the visual variables will have a dramatic impact on how the scene is interpreted. Misusing the visual variables can hide existing categories, or mislead the viewer into identifying categories that are simply wrong or do not exist. Insights gained from such an experience, and the decisions made based on those insights, can be costly. Thus, when we rely on visualizations to make inferences, it is important to be aware of the impact of visual variables. Below we feature examples based on two of them (color and shadows), and demonstrate how they might interfere with (categorical) recognition tasks.

When we decide on colors for our visualization, several choices will change what we see: the number of colors we use (the common advice is, if you can choose, to stick to 7±2), whether we use a qualitative, sequential, or diverging* color scheme (also called a “color map” or “color ramp”), and the color distance we use (e.g., the ‘temperature’ difference between two shades of the same color). Also remember to use a simulator (such as Color Oracle) to check whether your visualization is accessible to those with color vision deficiencies (~8% of men, ~1% of women). See Rogowitz & Treinish’s 1996 classic below for a great example.

Example 1 for categories (color): ‘Find the 7 differences’
Below we see the same data visualized using two ‘mathematically identical’ color schemes, which have dramatically different perceptual consequences (source: Rogowitz & Treinish, 1996, borrowed from here). Both maps show elevation categories, as marked by the legends, in meters above and below sea level. The ‘rainbow’ colors (left) occlude information: we cannot see the larger categories (above vs. below sea level), and some categories are lost (see the annotations). Using a divergence point, that is, using different colors for points above and below zero (right), makes the elevation categories much easier to recognize.
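The divergence-point idea is easy to reproduce in code. The sketch below (a minimal pure-NumPy illustration; the function name and the elevation values are made up for this example, not taken from the paper) maps elevations to [0, 1] so that sea level always lands at the midpoint of a diverging color ramp; matplotlib’s `TwoSlopeNorm` implements the same idea.

```python
import numpy as np

def diverging_norm(values, vmin, vcenter, vmax):
    """Map values to [0, 1] so that `vcenter` lands exactly at 0.5.

    Feeding the result to a diverging color ramp puts the neutral
    middle color at the divergence point (e.g. sea level), so the
    categories above and below zero get visibly different hues.
    """
    return np.interp(values, [vmin, vcenter, vmax], [0.0, 0.5, 1.0])

# Illustrative elevations in meters, below and above sea level
elev = np.array([-500.0, -100.0, 0.0, 1500.0, 3000.0])
normed = diverging_norm(elev, vmin=-500, vcenter=0, vmax=3000)
# Sea level (0 m) maps to 0.5, the neutral middle of the ramp
```

A plain sequential or rainbow ramp, by contrast, spreads its colors evenly over the data range, so zero lands at an arbitrary hue and the above/below-sea-level distinction is lost.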

Read more about this visualization in an excerpt from the authors themselves, explaining what you see (and what you do not see) in the image above. [Read it now.]

In some cases, visualizations are shady (sorry, could not resist); that is, they rely on shadows to illustrate 3D, a concept known as “shape from shading” in the related literature. Shadows and shading are used a lot to represent 3D shapes, for example in terrain representation. In this case, the virtual light source should be placed at the ‘upper left’ (northwest, or even better north-northwest) of the image. Consider the example below.

Example 2 for categories (shadow): ‘disillusioned’
Below we see the same data visualized using two light directions (source: Biland and Çöltekin, 2016, from here). One can see this as another example of (nominal) elevation categories, where the goal is to identify landforms that are higher or lower, such as a ridge vs. a valley.

Original image caption by Biland and Çöltekin, 2016: “The same digital elevation model (DEM) is hillshaded under a) incident light from 337.5°, and b) from 157.5°. The marked landform (ABC) is a valley. Most observers perceive it correctly as a valley in the left (a), and inversely as a ridge in the right (b). The elevation angle of the light source is set to 45° in both images”.
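The light-direction effect described in the caption above can be simulated numerically. The sketch below (pure NumPy; the function and the toy terrain are illustrative assumptions, not the authors’ code) implements the standard analytic hillshading formula and lights the same surface from the two azimuths used in the figure, 337.5° and 157.5°. The brightness of a given slope roughly inverts between the two, which is exactly what flips the perceived valley into a ridge.

```python
import numpy as np

def hillshade(dem, azimuth_deg, altitude_deg=45.0):
    """Analytic hillshading of a DEM; returns brightness in [0, 1].

    Standard formula: cos(zenith)*cos(slope)
                    + sin(zenith)*sin(slope)*cos(azimuth - aspect).
    Assumes rows run north-south and columns west-east.
    """
    zenith = np.radians(90.0 - altitude_deg)
    # Convert compass azimuth (clockwise from north) to a math angle
    az = np.radians((360.0 - azimuth_deg + 90.0) % 360.0)
    dzdy, dzdx = np.gradient(dem)            # derivatives along rows, columns
    slope = np.arctan(np.hypot(dzdx, dzdy))  # steepness of each cell
    aspect = np.arctan2(dzdy, -dzdx)         # slope orientation
    shade = (np.cos(zenith) * np.cos(slope)
             + np.sin(zenith) * np.sin(slope) * np.cos(az - aspect))
    return np.clip(shade, 0.0, 1.0)

# Toy DEM: a plane rising toward the east, i.e. a west-facing slope
dem = np.tile(np.arange(50.0), (50, 1))
nnw = hillshade(dem, azimuth_deg=337.5)  # light as in panel (a)
sse = hillshade(dem, azimuth_deg=157.5)  # light as in panel (b)
# The west-facing slope is lit under NNW light, shaded under SSE light
```

With the 45° elevation angle from the caption, the same slope comes out bright under one azimuth and dark under the other, illustrating why a light source from the ‘wrong’ direction can invert the perceived landform.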

* It appears that there is a debate on ColorBrewer palettes. We will invite a guest post on this issue, stay tuned!

Last revised: 7th of July 2018, Arzu Çöltekin