1. Data Visualization - An introduction
Prof Jan Aerts
Biodata Visualization and Analysis
ESAT/SCD
University of Leuven
Belgium
twitter: @jandot
Google+: +Jan Aerts
jan.aerts@esat.kuleuven.be
http://biovizanlab.wordpress.com
http://saaientist.blogspot.com
4. “A good sketch is better than a long speech” (Napoleon)
shows: size of the army, geographical coordinates, direction that the army
was traveling, location of the army with respect to certain dates, temperature
along the path of the retreat
9. What I use as a definition:
“computer-based visualization systems providing visual representations of
datasets intended to help people carry out some task more effectively.” (T
Munzner)
12. Why do we visualize data?
• record information
• blueprints, photographs,
seismographs, ...
• analyze data to support reasoning
• develop & assess hypotheses
• discover errors in data
• expand memory
• find patterns (see Snow’s cholera map)
• communicate information
• share & persuade
• collaborate & revise
33. • huge space of design alternatives => many tradeoffs
• many possibilities known to be ineffective
• avoid random walk through parameter space
• avoid some of our past mistakes
• extensive experimentation has already been done
• guidelines continue to evolve
• we reflect on lessons learned in design studies
• iterative refinement usually wise
35. How do we get from data to visualization? We need to understand:
• properties of the data
• properties of the image
• the rules mapping data to image
39. Semiology of graphics
• Jacques Bertin, Gauthier-Villars 1967, EHESS 1998
• semiology = study of signs and sign processes, likeness, analogy, metaphor,
symbolism, signification, and communication (Wikipedia)
• visual encoding:
• what - points, lines, areas (also patterns, trees/networks, grids)
• where - positional: XY (1D, 2D, 3D)
• how - retinal: Z (size, lightness, texture, colour, orientation, shape)
• when - temporal: animation
40. “marks” - geometric primitives
“channels” - control appearance of marks
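One way to make the split between marks and channels concrete is to write an encoding down as plain data. The sketch below is only an illustration: the `Encoding` class, the field names and the sample records are all hypothetical and do not come from any plotting library.

```python
from dataclasses import dataclass, field


@dataclass
class Encoding:
    """A mark type plus a mapping from visual channel to data field."""
    mark: str                                     # geometric primitive: "point", "line" or "area"
    channels: dict = field(default_factory=dict)  # channel name -> data field name

    def apply(self, record):
        """Return the visual attributes of the mark drawn for one data record."""
        attrs = {"mark": self.mark}
        for channel, data_field in self.channels.items():
            attrs[channel] = record[data_field]
        return attrs


# Made-up records, purely for illustration.
records = [
    {"a": 1.0, "b": 2.0, "count": 10, "group": "g1"},
    {"a": 3.0, "b": 1.5, "count": 40, "group": "g2"},
]

# Positional channels (x, y) plus two retinal channels (size, hue).
scatter = Encoding(
    mark="point",
    channels={"x": "a", "y": "b", "size": "count", "hue": "group"},
)

marks = [scatter.apply(record) for record in records]
```

Each record becomes one mark, and every channel controls one aspect of how that mark looks. Swapping `"size"` for another channel changes the picture without touching the data.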
41. Gestalt laws - interplay between parts and the
whole (Kurt Koffka)
series of principles
Election results Florida:
• black = Bush
• white = Gore
43. Gestalt - Principle of Simplicity
We perceive every pattern in the way that makes its structure as simple as
possible.
44. Gestalt - Principle of Proximity
Things that are close to each other are seen as belonging together (=>
clusters)
45. Gestalt - Principle of Similarity
Things that are similar in some way are perceived as belonging together.
47. Gestalt - Principle of Connectedness
Things that are connected are perceived as belonging together. This encoding
is stronger than similarity, shape, colour, and size.
48. Gestalt - Principle of Good Continuation
Objects that are arranged in a straight or smooth line tend to be seen as a
unit.
49. Gestalt - Principle of Common Fate
Objects that move in the same direction tend to be seen as a unit.
56. Pre-attentive vision
= ability of the low-level human visual system to rapidly identify certain basic
visual properties
• some features “pop out”
• used for:
• target detection
• boundary detection
• counting/estimation
• ...
• visual system takes over => all cognitive power available for interpreting the
figure, rather than needing part of it for processing the figure
58. Limitations of preattentive vision
1. Combining pre-attentive features does not always work => the viewer has to
resort to “serial search” (most channel pairs; all channel triplets)
e.g. is there a red square in this picture?
2. Speed depends on the channel used (pick one that works well for
categorical data; see “accuracy” further on)
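The red-square example can be sketched in code. This is a deliberately simplified model, not a description of how the visual system works: the hypothetical helper `pops_out` treats a target as pre-attentively detectable only if a single channel separates it from every distractor.

```python
def pops_out(target, distractors):
    """True if one single channel distinguishes the target from all distractors."""
    return any(
        all(target[channel] != d[channel] for d in distractors)
        for channel in target
    )


red_square = {"colour": "red", "shape": "square"}

# Feature search: every distractor is blue, so colour alone finds the target.
feature_display = [
    {"colour": "blue", "shape": "square"},
    {"colour": "blue", "shape": "circle"},
]

# Conjunction search: each distractor shares one feature with the target,
# so neither colour nor shape is enough on its own.
conjunction_display = [
    {"colour": "red", "shape": "circle"},
    {"colour": "blue", "shape": "square"},
]

feature_result = pops_out(red_square, feature_display)          # True
conjunction_result = pops_out(red_square, conjunction_display)  # False
```

In the second display the viewer has to inspect items one by one, which is the “serial search” case.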
60. Language of graphics
• graphics = sign system:
• each mark (point, line, area) represents a data element
• choose visual variables to encode relationships between data elements
• difference, similarity, order, proportion
• only position supports all relationships (see later)
• huge range of alternatives for data with many attributes
• find images that express & effectively convey the information
61. Which encoding should I use?
• From a huge list of possibilities, you have to choose the best one.
• Principle of Consistency
• properties of the representation should match properties of the data (e.g.
pie chart: area vs radius)
• Principle of Importance Ordering
• encode the most important piece of information in the most “effective”
way (i.e. spatial position)
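One reading of the “area vs radius” remark: if a value is mapped to the radius of a circular mark, the area grows with the square of the value and exaggerates differences. A minimal sketch, assuming we want area proportional to the value (the function name and the `area_per_unit` parameter are illustrative):

```python
import math


def radius_for_value(value, area_per_unit=1.0):
    """Radius of a circle whose area equals value * area_per_unit."""
    return math.sqrt(value * area_per_unit / math.pi)


small = radius_for_value(10)
large = radius_for_value(40)

radius_ratio = large / small       # about 2: the radius only doubles
area_ratio = (large / small) ** 2  # about 4: the area matches the 4x ratio in the data
```

Scaling the radius directly by the value would instead give a 16x area ratio for a 4x difference in the data.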
63. Stevens’ psychophysical law
= proposed relationship between the magnitude of a physical stimulus and its
perceived intensity or strength
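The law is usually written as a power function: perceived magnitude = k * intensity ^ exponent. The sketch below assumes that form. The exponents are approximate, commonly quoted values and should be read as assumptions, not as figures from these slides.

```python
def perceived_magnitude(intensity, exponent, k=1.0):
    """Power-law form of the relationship: k * intensity ** exponent."""
    return k * intensity ** exponent


# Approximate, commonly quoted exponents (assumptions for illustration only).
EXPONENTS = {"length": 1.0, "area": 0.7}

# How much bigger does a stimulus look when its physical size doubles?
doubling = {
    channel: perceived_magnitude(2.0, exponent) / perceived_magnitude(1.0, exponent)
    for channel, exponent in EXPONENTS.items()
}
```

With an exponent of 1, doubling a length looks like doubling. With an exponent below 1, a doubled area looks less than twice as large, so area-encoded differences tend to be underestimated.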
64. Accuracy of quantitative perceptual tasks
how much (quantitative) vs what/where (qualitative)
Mackinlay
66. Accuracy of quantitative perceptual tasks
“power of the plane”
67. Accuracy of quantitative perceptual tasks
grouping: see Gestalt laws
70. Colour space
• = mathematical model to talk about colour
• RGB (red-green-blue)
• most common, but less useful for choosing colours
• HSV (hue-saturation-value)
• more useful: its channels are closer to how we describe colour
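The two colour spaces can be compared with Python's standard `colorsys` module, which works on values between 0 and 1. The helper `desaturate` is a hypothetical name used only for this sketch.

```python
import colorsys


def desaturate(rgb, factor):
    """Scale the saturation of an RGB colour while keeping hue and value."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    return colorsys.hsv_to_rgb(h, s * factor, v)


pure_red = (1.0, 0.0, 0.0)

# In HSV, pure red is hue 0, full saturation, full value.
red_hsv = colorsys.rgb_to_hsv(*pure_red)  # (0.0, 1.0, 1.0)

# Halving one HSV channel gives a paler red of the same hue.
pale_red = desaturate(pure_red, 0.5)      # (1.0, 0.5, 0.5)
```

In HSV, “same colour but paler” means changing one number. In RGB the same change touches two of the three components.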
78. Dynamic data
• animation is good sometimes, but often not:
• we can only follow 3-4 visual cues simultaneously
• change in “mental map”
• change blindness (e.g. http://nivea.psycho.univ-paris5.fr/CBMovies/BarnTrackFlickerMovie.gif)