1. Data Visualization
The Ideas of Edward Tufte
David Giard
MCTS, MCSD, MCSE, MCDBA
blog: DavidGiard.com
tv: TechnologyAndFriends.com
twitter: @DavidGiard
e-mail: DavidGiard@DavidGiard.com
2.
       I               II              III              IV
    x      y       x      y       x      y       x      y
  10.0   8.04    10.0   9.14    10.0   7.46     8.0   6.58
   8.0   6.95     8.0   8.14     8.0   6.77     8.0   5.76
  13.0   7.58    13.0   8.74    13.0  12.74     8.0   7.71
   9.0   8.81     9.0   8.77     9.0   7.11     8.0   8.84
  11.0   8.33    11.0   9.26    11.0   7.81     8.0   8.47
  14.0   9.96    14.0   8.10    14.0   8.84     8.0   7.04
   6.0   7.24     6.0   6.13     6.0   6.08     8.0   5.25
   4.0   4.26     4.0   3.10     4.0   5.39    19.0  12.50
  12.0  10.84    12.0   9.13    12.0   8.15     8.0   5.59
   7.0   4.82     7.0   7.26     7.0   6.42     8.0   7.91
   5.0   5.68     5.0   4.74     5.0   5.72     8.0   6.89
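These four data sets are widely known as Anscombe's quartet. The point of the slide is easier to see with a quick calculation: the summary statistics of the four sets are nearly identical, even though their plots look nothing alike. The sketch below is only an illustration, not part of the original deck. It uses the values exactly as printed on the slide, and the helper names (`pearson_r`, `summarize`) are hypothetical.

```python
from statistics import mean, variance

# x values shared by data sets I, II, and III, as printed on the slide.
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]

# (x, y) columns per data set, copied from the table as printed.
DATASETS = {
    "I": (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                     7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                      6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                       6.08, 5.39, 8.15, 6.42, 5.72]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.59, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = sum((x - mx) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)


def summarize(xs, ys):
    """Collect the summary statistics that hide the differences."""
    return {
        "mean_x": mean(xs),
        "var_x": variance(xs),  # sample variance (n - 1 denominator)
        "mean_y": mean(ys),
        "r": pearson_r(xs, ys),
    }


SUMMARY = {name: summarize(xs, ys) for name, (xs, ys) in DATASETS.items()}

for name, s in SUMMARY.items():
    print(f"{name:>3}: mean_x={s['mean_x']:.2f} var_x={s['var_x']:.2f} "
          f"mean_y={s['mean_y']:.2f} r={s['r']:.3f}")
```

Every set has a mean x of 9 and an x variance of 11, with a mean y of about 7.5 and a correlation of about 0.82. Only a plot reveals how different the sets are.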
3. [Figure: four scatter plots, one per data set (I, II, III, IV); x-axis 0 to 20, y-axis 0 to 10]
23. Connecticut Traffic Deaths, Before (1955) and After (1956) Stricter Enforcement by the Police Against Cars Exceeding Speed Limit
[Figure: line chart with two points, "Before stricter enforcement" (1955) and "After stricter enforcement" (1956); y-axis 275 to 325]
26. Traffic Deaths per 100,000 Persons in Connecticut, Massachusetts, Rhode Island, and New York, 1951-1959
[Figure: line chart with one series each for NY, MA, CT, and RI; x-axis 1951 to 1959, y-axis 6 to 16]
27. Principles of Graphical Integrity
• Data Representations proportional to Data
• #Dimensions in graph = #Dimensions in data
• Deflated (real) dollars, instead of nominal dollars
• Provide context
43. Principles
• Above all else, show the data
• Maximize the Data-Ink ratio, within reason
• Erase non-data-ink
• Erase redundant data-ink
• Revise and edit
73. Slope Graph
Source: The Atlantic, June 30, 2012
74. Takeaways
• Maintain Graphical Integrity
• Maximize Data-Ink Ratio, within reason
• Avoid Chartjunk and Ducks
• Use Multifunctioning Graphical Elements, if possible
• Keep Labels with data
• Maximize Data Density
84. David’s Speaking Schedule
Date    Event                            Location          Topic(s)
Sep 15  Code Camp NYC                    New York, NY      Effective Data Visualization
Sep 22  SQL Saturday                     Kalamazoo, MI     Effective Data Visualization
Sep 25  SoftwareGR                       Grand Rapids, MI  TBA
Oct 13  Tampa Code Camp                  Tampa, FL         TBA
Nov 7   Ann Arbor Computing Society      Ann Arbor, MI     How I Learned to Stop Worrying and Love jQuery
Feb 21  Greater Lansing .NET User Group  Okemos, MI        How To Use Azure Storage
Editor's Notes
Hand-drawn graph from the 1880s, showing the Paris train schedule. Attributed to the French engineer Ibry. Source: E.J. Marey, La Méthode Graphique (Paris, 1885)
William Playfair (1759-1823). 3 series over time:
- Wheat prices
- Labor wages
- Monarch
From the 1960 census:
- Number of families per county with very low income (<$3,000)
- Number of families per county with very high income (>$10,000)
Charles Joseph Minard, French engineer, 1781-1870. "It may well be the best statistical graphic ever." (Tufte)
- Tan line = Napoleon's 1812 march to Moscow (the army shrinks from 422,000 men to 100,000 men)
- Black line = Napoleon's retreat to Poland (from 100,000 men to 10,000 men)
- Width of the lines represents the size of the army
- The retreat line is linked to the lower graph, which shows dates and temperatures (a very cold winter)
- Auxiliary troop movements are shown
- Crossing the Berezina River was a disaster
Variables:
- Size of army
- Location
- Direction of movement
- Temperature
- Dates
From the New York Times, 1978. Fuel economy standards increased by 53%; the graphic shows an increase of 783%. Lie factor = 14.8
From the Los Angeles Times, 1979. Lie factor = 3.8 (the horizontal spacing of the X-axis is also wrong)
Time, 1979. 1-dimensional data is shown as 3-dimensional objects. An increase of 454% is shown as a volume increase of 27,000%. Lie factor ≈ 59 (27,000 / 454), a record!
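The lie factors quoted in these notes follow Tufte's definition: the size of the effect shown in the graphic divided by the size of the effect in the data. A minimal sketch of that arithmetic, using only the percentages quoted in the notes above (the function name `lie_factor` is hypothetical):

```python
def lie_factor(shown_change_pct, actual_change_pct):
    """Ratio of the change a graphic depicts to the change in the data.

    A value near 1.0 means the graphic is proportional to the data;
    values far above 1.0 mean the graphic exaggerates the effect.
    """
    if actual_change_pct == 0:
        raise ValueError("actual change must be non-zero")
    return shown_change_pct / actual_change_pct


# New York Times, 1978: the data rises 53%, the graphic rises 783%.
nyt_1978 = lie_factor(783, 53)

# Time, 1979: the data rises 454%, the drawn volume rises 27,000%.
time_1979 = lie_factor(27_000, 454)

print(f"NYT 1978:  {nyt_1978:.1f}")
print(f"Time 1979: {time_1979:.1f}")
```

Dividing the figures as quoted gives about 14.8 for the New York Times graphic and about 59 for the Time graphic.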
Source: Sunday Times (London), 1979
New York Times, 1978
Data-ink = ink that directly shows the data; erasing it would lose data.
All else = decorations, metadata, and redundant data.
Data-ink ratio = proportion of a graphic's ink devoted to the non-redundant display of data-information
= 1.0 - proportion of the graphic that can be erased without loss of data-information
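The two forms of the data-ink ratio given above are equivalent, which a small calculation makes concrete. This is only a sketch: the ink amounts are made-up numbers chosen for illustration (not measurements of any real graphic), and the function name `data_ink_ratio` is hypothetical.

```python
def data_ink_ratio(data_ink, total_ink):
    """Share of a graphic's ink that carries non-redundant data."""
    if total_ink <= 0:
        raise ValueError("total ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data ink must lie between 0 and total ink")
    return data_ink / total_ink


# Hypothetical graphic: 1,200 units of ink, of which 300 show data.
total_ink = 1200
data_ink = 300
erasable_ink = total_ink - data_ink  # ink that can go without losing data

ratio = data_ink_ratio(data_ink, total_ink)
ratio_via_erasable = 1.0 - erasable_ink / total_ink

print(ratio, ratio_via_erasable)
```

Both expressions give the same value, so erasing non-data-ink raises the ratio toward 1.0.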
Duck-shaped building in Flanders, NY. 3 types of chartjunk:
1) Unintentional optical art
2) Grid
3) Self-promoting graphical duck
Moiré effect: the graphic appears to vibrate or shimmer
Source: Executive Office of the President, Office of Management and Budget, 1973
Source: JASA
Source: Maps and Diagrams by F.J. Monkhouse and H.R. Wilkinson, 1971
Source: Fluctuations in the Great Fisheries of Northern Europe by Johan Hjort, 1914
Source: Sémiologie Graphique by Jacques Bertin, 1973
Source: The Visual Display of Quantitative Information by Edward Tufte