Do you want to identify new, previously unknown or undefined patterns?
e.g., statistical clusters, new visual groupings, or unusual patterns (like stripes, holes, periodic behavior, or oddities/anomalies) in data displays. Note that this question is closely related to Categories, which refers to pre-ordained, human-interpretable categories, like geographic boundaries.
Sometimes, you don’t know exactly what you’re looking for in advance. A central goal of exploratory data analysis is to enhance an investigator’s ability to notice and investigate unexpected patterns. Often these patterns give new insight into the phenomenon under investigation; at other times they reveal imperfections or bias in a data set.
Here, we use interactive linked-view visualization as an example of the importance of patterns. Today, many people in business or geography are familiar with products like Spotfire, Tableau, DataDesk or ArcGIS, which allow their users to display tabular and mapping data in many graphical displays at once, and to highlight salient groupings of data points or regions to reveal how patterns in one display manifest in another. Below, we use the open-source glue linked-view package to give an example.
Example: “glue”, a Python-based exploratory linked visualization environment (freely available at glueviz.org)
The image below is a screenshot from a short video (included below) that explains glue’s overall functionality. In the screenshot, three data displays, drawn from two data sets (see the “Data” listing to the upper left) linked by glue, are open at once. Dots shown in red and black mark positions of stars tabulated in the “W5 catalog” data set, and the “W5 image”, shown in greyscale, shows infrared emission in the region of the Sky covered by the catalog. Using this combination of linked data displays in glue, a user can see that the histogram in the lower right panel has two peaks. This double-peaked shape is a “pattern” according to the definition we are using here. By highlighting (with a cursor drag) the right-hand peak in the histogram, a user will see, in real time, where the same selected cataloged stars are in the x-y graph of stellar colors (to the upper right) and also on the Sky (as shown on the image). The x-y graph shows a clear tendency for the selected points to cluster together along both x and y (another pattern), and the Sky image also reveals a pattern in these stars’ positions that is meaningful to an astronomer.
Above is the “explanatory glue” (static image with annotations).
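The linked selection described above (highlight one peak of a histogram, then see the same rows in the other displays) can be sketched in a few lines of plain Python. This is only an illustration of the idea, not glue’s API: the catalog is synthetic rather than the W5 data, and the column names (“color”, “x”, “y”), the helper functions, and the brush range of 1.5 to 3.5 are all assumptions made for the example.

```python
import random


def make_catalog(n_per_group=200, seed=0):
    """Build a synthetic two-population catalog (hypothetical data, not W5)."""
    rng = random.Random(seed)
    rows = []
    # Two populations that differ in "color" and in position.
    for center_color, center_pos in ((0.5, 10.0), (2.5, 30.0)):
        for _ in range(n_per_group):
            rows.append({
                "color": rng.gauss(center_color, 0.2),
                "x": rng.gauss(center_pos, 2.0),
                "y": rng.gauss(center_pos, 2.0),
            })
    return rows


def histogram(values, n_bins, lo, hi):
    """Count values in n_bins equal-width bins covering [lo, hi)."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v < hi:
            index = min(int((v - lo) / width), n_bins - 1)
            counts[index] += 1
    return counts


def select_range(rows, column, lo, hi):
    """Return one boolean per row: the 'brush' dragged over one display."""
    return [lo <= row[column] <= hi for row in rows]


catalog = make_catalog()

# View 1: a histogram of "color"; with this data it shows two peaks.
counts = histogram([row["color"] for row in catalog], n_bins=12, lo=-0.5, hi=3.5)

# Brush the right-hand peak (assumed range for this synthetic data).
mask = select_range(catalog, "color", 1.5, 3.5)

# View 2: the same mask picks out the same rows in the x-y display.
selected_xy = [(row["x"], row["y"]) for row, keep in zip(catalog, mask) if keep]
mean_x = sum(x for x, _ in selected_xy) / len(selected_xy)

print("histogram counts:", counts)
print("rows selected:", len(selected_xy), "mean x of selection:", round(mean_x, 1))
```

The key design point is that the selection is stored once, as a per-row mask, and every display reads from it. That shared mask is what lets a pattern found in one display show up immediately in the others.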
Video, from https://www.youtube.com/watch?v=qO3RQiRjWA4 as of July 2018:
Last revised: 13th of July 2018, Alyssa Goodman