One of the mainstays of a modern software toolkit is Excel 2016, from Microsoft Office 2016. By reputation, Excel is a beginner’s tool that self-respecting data analysts would bypass, but Excel is fairly high-powered: it can take up to 1,048,576 rows of data per worksheet (roughly 1.05 million), contains complex statistical analysis capabilities (without the need for scripting), and enables rich data visualizations. It also has a number of rich add-ons that extend its analytical and data visualization functionality, and it works as a great bridging tool to more complex types of statistical analysis.
This session walks participants through some basic built-in data visualizations in Excel 2016, including pie charts and doughnuts, bar charts, tree maps and sunburst diagrams, cluster diagrams, spider (radar) charts, scattergraphs, and others. This session will cover how data structures and desired emphases will determine the options for particular data visualizations.
In this session, participants will
review how to load a data table,
read the general data in a data table (or worksheet),
process or clean the data as needed,
use the Recommended Charts feature,
decide which built-in data visualizations to use, and
consider how to add relevant data visualization elements (including data labels, background grids, axis labels, and titles) for a coherent and effective data visualization.
Also, participants will help co-build data visualizations from open-source and other datasets.
4. Presentation Order
• Sourcing Datasets
• Reading General Data in a Data Table / Worksheet
• Processing or Cleaning Data
• Using the Recommended Charts Feature in Excel 2016
• Selecting Data Visualization Types
• Column, line, pie, bar, area, X Y (scatter), stock, surface, radar, treemap,
sunburst, histogram, box & whisker, waterfall, & combo
• Going “Off-Script” within Excel
• Some Common Mistakes
5. Presentation Order (cont.)
• Adding Relevant Data Visualization Elements
• Processing Graph Visualizations Outside of Excel 2016
• Add-ins to Excel 2016
• Streamgraphs, #hashtag networks on microblogging sites (on Twitter), related
tags networks (on Flickr), article-article networks on MediaWiki (on
Wikipedia), and others
• Data Visualization Standards
• A Note about Data
7. Sourcing Datasets
• Downloading public datasets from sites like data.gov
• Capturing the back data (provenance information) about how the publicly released datasets were produced and released
• Extracting data from online data portals (research sites, survey sites,
learning management systems, social media platforms, and others)
and converting those files into something readable in databases and
Excel
8. Sourcing Datasets (cont.)
• Downloading data from social media platforms
• These may include Facebook poststreams, Twitter tweetstreams, scraped
images from the Web and any image sharing sites, scraped videos from the
Web and video-sharing sites, articles from Wikipedia, #hashtag networks from
Twitter, keyword networks from Twitter, related tags networks from Flickr,
email networks from email systems, and others
• These datasets include both structured and semi-structured data
9. Sourcing Datasets (cont.)
• Autogenerating data…
• From online research suites (often used to test surveys)
• From graph visualization tools (to see what randomized graphs look like)
• Creating datasets in other software programs and saving out in a file
format readable by Excel
• Creating data manually in an Excel worksheet
• Capturing data in Excel using third-party data downloaders, and
others
10. Data Analytics Suites
• Some datasets may be exported from data analytics suites.
• SPSS, RapidMiner Studio, R, Python, and other tools may be used for high-level statistical analysis and machine learning. However, their data visualization tools may be focused more on conveying data than on producing presentation-quality data visualizations.
• The underlying data may be exported in a form that Excel can use…in
order to create the data visualizations. (Excel has a lot of analytics
capabilities built-in, too, but complex analytics likely require
processing in other software programs.)
11. Data Capture and Pre-Processing
• Prior to importing data into Excel, it is likely that the data will need to be pre-processed / cleaned for accuracy.
• All data are changed with every touch of technology:
• Software may be used to extract or capture data (such as from social media platforms). There are limits to APIs, which virtually all restrict the rate and the amount of data that can be captured for free.
• Software may be used to convert manual coding into digital coding (transcoding).
• Software may be used to turn unstructured and semi-structured data into quantitative data tables (such as text analytics applications). The reverse is common, too: taking quantitative data and turning it into semi-structured forms (as visuals).
• Software may be used to create synthetic or faux data that meets particular
requirements (such as a random network graph).
12. Data Everywhere and Fungible
• In other words, it is possible to datafy a lot of things.
• There is data everywhere…
• It is possible to turn most data into information and into something at least somewhat useful.
14. Structured Data
• Structured data is labeled by row and column headers
• Such data is categorizable by type and by common characteristics and functions
• Rows tend to be data records (with unique identifiers in Column A)
• Columns tend to be variables and attributes
• Data types include the following: General, Number, Currency, Accounting, Date, Time, Percentage, Fraction, Scientific, Text, Special, and Custom
• Each cell is labeled by data type, and these types affect how the software handles the data
15. “Unstructured” or “Semi-Structured” Data
• Text sets, bags of words
• Image files
• Audio files
• Video files
• Multimedia, and others
• Tend to be multi-dimensional and / or high-dimensional data
• Tend to be somewhat inherently structured based on the data type (language has some inherent structure; imagery may be defined within 2D or 3D space, etc.), thus the preference for “semi-structured” among word purists
• Tend to be various file types, with different file extensions
16. Basic “Structured” Data Structures
• Column A tends to contain unique identifiers for the row data
• Row 1 tends to contain all the column headers
• Column headers tend to be written in CamelCase format
• Each row of the row data except the first (header) row contains an individual record
• Each column contains a single variable
• Each column tends to contain data of a certain type, such as string / text, numerical, percentage, date, and others
• Some of the data is human readable, and some is not (for example, a value displays as #### when its column is too narrow)
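The conventions above can be sketched with a small stdlib-only example; the table contents and column names here are hypothetical, invented for illustration:

```python
import csv
import io

# A minimal "structured" table following the conventions above:
# Row 1 holds CamelCase column headers; Column A holds unique identifiers;
# each later row is one record; each column holds one variable of one type.
raw = """RecordID,RespondentAge,SatisfactionScore
R001,34,4.5
R002,27,3.8
R003,41,4.9
"""

reader = csv.DictReader(io.StringIO(raw))
records = list(reader)

# Each row parses into one record keyed by the header names.
print(records[0]["RecordID"])                  # R001
print(float(records[2]["SatisfactionScore"]))  # 4.9
```

The same layout is what Excel (and most analysis tools) expect when importing a .csv file.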
17. Coding “Structured” Data
• Structured data generally has a long history of conventional statistical
approaches to analysis, to identify patterns in the data.
• There are simple counts.
• There are measures of central tendency for parametric datasets.
• There are tools for observing and measuring associations.
• There are tools for observing and measuring causation-based associations.
• There are tools to compare observed data vs. expected data, and measures of
statistical significance.
• There are tools to support experimental setups, to compare control groups
with experimental groups.
• There are tools to measure confidence in statistical findings.
18. Basic “Unstructured” and “Semi-Structured”
Data Structures
• Language data tends to have an inherent structure based on how
evolved languages originate and change over time.
• Image data tends to have an inherent structure based on image
features: image sizes, orientation, main subject matter, colors,
resolution, and other factors.
• Audio data tends to have an inherent structure by voiceprint (and / or
waveform), occurrences in time, sound frequencies, and other
factors.
• Video data tends to have an inherent structure by frame-based imagery (frames per second), waveforms, and other factors.
19. Coding “Unstructured” and “Semi-Structured”
Data
• So-called “unstructured” or “semi-structured” data are coded in a
variety of different ways.
• One approach is with a priori coding, or using an extant model, conceptual
framework, or other structure to create a codebook against which the data
are coded.
• Another general approach is with “emergent” coding, which starts with the
raw data and results in an evolved codebook.
• Then, there are many combinations of the two above approaches.
20. Coding “Unstructured” and “Semi-Structured”
Data (cont.)
• Such unstructured / semi-structured data are multi-dimensional, so
they can be analyzed in a variety of different ways and are somewhat
robust against having a certain interpretation stick and predominate
over others.
• Data are generally polysemous or multi-meaninged.
• There are public text corpora that have been created for broad-scale
use in the testing of software tools, programs, algorithms, and
processes for text analysis, in order to be able to have comparable
and competitive analyses.
21. Coding “Unstructured” and “Semi-Structured”
Data (cont.)
• There are some text corpora which are non-consumptive, which
means only the top-level statistics and other metrics about a text set
are available, but the underlying texts (the actual data) themselves
are not. “Shadow” datasets are made accessible for the queries, but
to avoid the risk of re-identification of original copyrighted
manuscripts, the original manuscripts in their original order are not
made available. (Google Books Ngram Viewer is a well known
example.)
22. Coding “Unstructured” and “Semi-Structured”
Data (cont.)
• Such data may be coded by humans alone, computer alone, or a
cyborg-ian mix
• Advances in computer vision (object identification, sentiment analysis
of images, predictivity of “what happens next” in a video sequence)
and other capabilities have extended computer capabilities at coding
such data
26. Unlinked or Linked Data Tables
Flat Files
• Data tables treated as single
stand-alone files that may be
assessed alone or queried in
relation to other files
Linked Files
• Data tables treated as
interconnected and related files
that may be queried across data
tables and fields
28. Some Common Questions for Data Processing
or Cleaning
Structured Data
• How should missing data be
handled? (Should empty cells
mean deleting the whole record?
Should empty cells be filled with
N/A? Should empty cells be filled
with randomly-generated contents
based on the other data in the set?
Should empty cells be zeroed out?)
• How should repeated data be
handled?
Unstructured or Semi-Structured Data
• How should scraped imagery that
consists of a corrupted file be
handled? Should these be omitted?
Should these be kept and partially
coded?
• In an image set, how should
different versions of an image be
coded? Should that be counted
multiple times? What if the image
is re-inscribed and reused by
others in new ways?
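As a rough stdlib-only sketch of three of the missing-data options raised above (delete the record, fill with N/A, zero out), using a small hypothetical table:

```python
# Hypothetical records; the empty string marks a missing cell.
records = [
    {"ID": "R1", "Score": "4"},
    {"ID": "R2", "Score": ""},   # missing value
    {"ID": "R3", "Score": "5"},
]

# Option 1: delete the whole record with the empty cell.
dropped = [r for r in records if r["Score"] != ""]

# Option 2: fill empty cells with "N/A".
filled = [dict(r, Score=r["Score"] or "N/A") for r in records]

# Option 3: zero out empty cells.
zeroed = [dict(r, Score=r["Score"] or "0") for r in records]

print(len(dropped))        # 2
print(filled[1]["Score"])  # N/A
print(zeroed[1]["Score"])  # 0
```

Which option is right depends on the dataset and the research question; the code only shows that each policy is mechanically simple once chosen.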
29. Some Common Questions for Data Processing
or Cleaning (cont.)
Structured Data
• In a set of parametric data, how should extreme outliers be handled? Should they be omitted, so as not to skew a curve? Should they be treated differently than the other data in the set?
Unstructured or Semi-Structured Data
• In a multi-lingual text set, how should all the languages besides the base language be handled? Should these language inputs be handled manually? Should these non-base-language inputs be translated into the base language for machine analysis?
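One common (though by no means the only) way to flag extreme outliers in parametric data is a z-score cutoff; a minimal sketch with hypothetical values:

```python
import statistics

# Hypothetical parametric data with one extreme outlier (250).
values = [10, 12, 11, 13, 12, 11, 250]

mean = statistics.mean(values)
sd = statistics.stdev(values)  # sample standard deviation

# Flag points more than 2 standard deviations from the mean.
outliers = [v for v in values if abs(v - mean) > 2 * sd]
kept = [v for v in values if abs(v - mean) <= 2 * sd]

print(outliers)   # [250]
print(len(kept))  # 6
```

Whether flagged points are omitted or treated separately remains the researcher's call, as the questions above emphasize; the cutoff only surfaces candidates.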
30. Some Common Questions for Data Processing
or Cleaning (cont.)
Structured Data
• There may be benefits to
combining multiple open-source
datasets, which each have insights
to contribute to the study of a
particular issue. The variables are
not exactly mappable to each
other though. How should such
datasets be melded? How should
the mixed dataset be described?
How should the original datasets
be credited?
Unstructured or Semi-Structured Data
• In a mixed multi-modal dataset
of various multimedia contents,
there is a lot of room for
interpretation and subjectivity.
What tools should be designed
to aid in creating consistency in
the coding and interpretations?
31. Some Common Questions for Data Processing
or Cleaning (cont.)
Structured Data
• The original labeling of online
data from an online research
suite is too verbose and
complex. Renaming the
variables is important to enable
easier data processing and
easier setup of data tables.
What is a legitimate process of
renaming variables for accuracy
and efficiency?
Unstructured or Semi-Structured Data
• Machine coding enables faster processing of various types of unstructured and semi-structured data. However, machine coding also introduces some degree of ambiguity and “noise.” How should the use of computers to code be balanced against human-based insights?
32. Some Common Questions for Data Processing
or Cleaning (cont.)
Structured Data
• In combining manual coding for a team-coded project, there are some new codes that were not part of the original codebook. Should these new codes be included in the similarity analysis computation for a Cohen’s Kappa / Kappa Coefficient?
Unstructured or Semi-Structured Data
• In a particular study, there is a
set of videos that has been
hacked and taken from a
company. The videos are
relevant to the research and
would offer value, but they are
not legally available. Should
these videos be used, or should
they be expunged from the
study set?
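For reference, Cohen’s Kappa compares two coders’ observed agreement against chance agreement; a minimal stdlib sketch with hypothetical codes (whether new, off-codebook codes belong in the computation is the policy question above, which the arithmetic itself does not settle):

```python
from collections import Counter

def cohens_kappa(coder1, coder2):
    """Cohen's Kappa for two coders' labels on the same items."""
    n = len(coder1)
    observed = sum(a == b for a, b in zip(coder1, coder2)) / n
    c1, c2 = Counter(coder1), Counter(coder2)
    # Chance agreement: product of each coder's marginal proportions,
    # summed over the categories both coders used.
    expected = sum((c1[k] / n) * (c2[k] / n) for k in c1.keys() & c2.keys())
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned by two coders to four items.
kappa = cohens_kappa(["A", "A", "B", "B"], ["A", "A", "B", "A"])
print(round(kappa, 2))  # 0.5
```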
33. Some Common Questions for Data Processing
or Cleaning (cont.)
Structured Data
• An online survey system has
accidentally captured
respondent identifier
information during the normal
course of the data capture. The
demographic data may be used
for deeper analytics. Should this
data be used? How so? Why or
why not?
Unstructured or Semi-Structured Data
• Scraped online data come from a
variety of sources, and the source
citations may be hard to find.
There are some tools that enable
reverse image searches, but other
search tools are more painstaking
to use, particularly for video
searches. How much effort should
be put into having proper and
correct citations for the original
sources?
34. Some Common Questions for Data Processing
or Cleaning (cont.)
Structured Data
• In many fields, original datasets
have to be published out and
shared at the time of publication.
In the process of releasing data, a
researcher has to go through a
process of de-identification…and
has to work hard to ensure that
the data may not be re-identified.
How much due diligence should a
researcher go through to protect
the participants of his / her / their
study?
Unstructured or Semi-Structured Data
• There are machine-generated transcripts available for videos hosted on social video-sharing sites. The transcripts are sometimes improved with human coding, but in many cases the transcripts are not directly fixed and so include various mistakes. Should these transcripts be corrected before they are coded for research? Or should they go in, mistakes and all, even if this means that some garble is included?
35. Some Common Questions for Data Processing
or Cleaning (cont.)
Structured Data
• In some cases, conceptual data
may be applied to communicate
theories, models, and
frameworks. Also, there may be
synthetic or faux data. How
should one communicate the
fact that this data is conceptual?
Unstructured or Semi-Structured Data
• In an image set, there will be images of various types: photos, screen grabs of virtual worlds, screen stills of videos, diagrams, drawings, scans of documents, and other types of visuals. How should the various modalities be addressed?
36. General Points about Data Processing /
Cleaning
• There should be clear principles and rationales for how data is
handled. These should be clearly documented.
• Generally, data processing should not be lossy (lose information).
• Data processing should be selective but non-destructive.
• Data processing should not result in undue skew or bias to the
original data.
• Data processing should not result in data leakage, confidentiality compromises, re-identification of research participants, or any compromise of data privacy.
37. General Points about Data Processing / Cleaning (cont.)
• There should be clear steps and processes applied to data processing, and these should be documented. If there are deviations from this processing, those should be recorded as well.
• A raw set of all data should be preserved in its initial pristine state
before any data processing or data cleaning is done. This is to ensure
that there is a pristine master set against which to re-extract a new
set for other processing…and also a master against which to compare
cleaned datasets.
• If this data processing is done by machine, the “macros” should be
documented.
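One lightweight way to confirm that a preserved master set is still pristine is to record a cryptographic hash when it is archived and re-check it later; a sketch, with hypothetical file contents:

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """SHA-256 fingerprint of a raw dataset's bytes."""
    return hashlib.sha256(data).hexdigest()

# Record the hash when the raw master set is first archived.
master = b"id,score\nR1,4\nR2,5\n"
archived_hash = fingerprint(master)

# Later: an unchanged copy verifies; a modified copy does not.
print(fingerprint(master) == archived_hash)                     # True
print(fingerprint(b"id,score\nR1,4\nR2,9\n") == archived_hash)  # False
```

This does not replace documenting the processing steps; it only gives an objective check that the pristine master was never altered.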
39. Accessing the “Recommended Charts”
Feature
• To access this feature, highlight the desired data to map (from the dataset), click the “Insert” tab, and select “Recommended Charts.”
40. About the “Recommended Charts” Feature
• This “recommended charts” feature offers some cognitive scaffolding
to new data visualizers by suggesting possible data visualizations.
• This feature seems to offer the simplest options first, even after a user may have gone with more complex data visualizations for similarly structured data.
41. Selecting the Right Amount of Data to Map
• Every selected cell of data, even an empty one, contains meaning in the data visualization, and it will be represented there.
• The selected data should be structurally coherent.
• In other words, the positioning of the respective cells should convey to the software how the data
visualization should be drawn. Part of data preparation involves the positioning of the data in a
correct structure.
• It is possible to transpose elements on an axis and make other changes once the image is drawn,
but it’s preferable to have the data structured correctly.
• The selected variables in a dataset should be interconnected. If the data is not
interconnected somehow—by meaning or by potential association—then it would be
harder to justify having the same information in the same visualization.
• Too much data will mean that Excel cannot draw the graph. Too little information will
mean that the data visualization is not clear.
• Data labels are usually handled, in part, in the data table itself. Those should be correctly
set up, with proper spelling, proper capitalization, proper CamelCase (if used), and
parallel construction.
42. Accessing the “All Charts” Feature
• There is one tab that offers some “Recommended Charts”. The tab
next to it offers “All Charts.”
• Both are interactive and selectable.
43. The “Recommended Charts” Feature
• This Excel feature assesses the types of data in the dataset or
worksheet and proposes a few data visualizations that may best
represent that data.
• Sometimes, one needs to restart the software to get this to work.
• Some other software tools (like IBM Watson) will actually
preliminarily analyze the data and suggest aspects of the data to focus
on for human analysts.
44. The “Recommended Charts” Feature (cont.)
• If too much data have been highlighted, then a message will be
shown. It will read in part “Recommendations are not available for
the data you selected. To choose a chart type, click All Charts.”
• Some reasons why a chart may not be identifiable include the
following:
• no numbers that are summarizable
• data from multiple worksheets
• too many data cells
• data containing defined names (“range names”), columns defined as variables with particular characteristics
• such as combinations of various columns linked by mathematical functions into a new variable
45. The “Recommended Charts” Feature (cont.)
• This feature is a generalized one and does not include deep or unique
or insider insights about the underlying data.
• This means that the suggestions made may not be optimal for the dataset or
the context of the researcher.
• Researcher objectives will also affect the selection of the optimal data visualizations (and data visualization sequences).
51. Abstracting Core Descriptive Functions in
Data Visualizations
• Proportionality (“intensity”)
  • Frequency counts
  • Pie charts, bar charts, intensity matrices, area charts, radar diagrams, histograms
• Changes over time
  • Frequency counts over time
  • Line graphs, scattergraphs
• Hierarchical relationships
  • Word networks, frequency word counts, topic modeling in word sets (text corpora)
  • Word network graphs, dendrograms, sunburst diagrams
• Descriptive statistics
  • Distribution; central tendency (mean, median, mode); dispersion (standard deviation, min-max range, variance)
  • Bar charts, curves
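The descriptive statistics listed above map directly onto Python’s stdlib `statistics` module; a sketch with hypothetical values:

```python
import statistics

# Hypothetical dataset.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(statistics.mean(data))       # 5
print(statistics.median(data))     # 4.5
print(statistics.mode(data))       # 4
print(statistics.pstdev(data))     # 2.0 (population standard deviation)
print(statistics.pvariance(data))  # 4
print(max(data) - min(data))       # 7 (min-max range)
```

These are the same quantities that underlie the bar charts and curves named above, whether computed in Excel or elsewhere.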
52. Abstracting Core Descriptive Functions in
Data Visualizations (cont.)
• Social relationships
  • Intercommunications, follower-followee relationships
  • Social networks
• Physical-spatial relationships
  • Events occurring in space, locations
  • Geographical maps
53. Abstracting Core Analytical Functions in Data
Visualizations: Deductive, Inductive, Inferential
• Data relationships
  • Associations, causations
  • Scattergraphs, line graphs, line plots
• Text analysis
  • Word frequency counts, text queries, topic modeling (unsupervised theme extraction), sentiment analysis
  • Cluster diagrams, word clouds, word trees, dendrograms, bar charts, intensity matrices
54. Filtering Data
• The “Sort & Filter” option enables users to select a column or segment of a column to alphabetize, to sort numerically (from most to least or least to most), or to sort by date (from most recent to least recent, or vice versa), and so on.
55. Filtering Data (cont.)
• A “Sort Warning” window asks whether the user wants to “Expand
the selection” or just “Continue with the current selection”
• Generally, the selection should be expanded. This means that the entire row
of data will move with however the selection moves. The data will still be
correct and of-a-piece.
• If not, only the selection will be sorted, and all the other row data will be left
in their prior positions. If there is a very limited issue that is being addressed
and the whole dataset is pristine and accurate elsewhere, then just
continuing with the current selection may be the right choice.
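The “Expand the selection” behavior, keeping each whole row together while sorting on one column, can be sketched with hypothetical rows:

```python
# Hypothetical rows: (ID, Name, Score). Sorting on Score with the
# selection "expanded" keeps every full row intact, mirroring Excel's
# "Expand the selection" choice in the Sort Warning dialog.
rows = [
    ("R1", "Ann", 72),
    ("R2", "Bo", 95),
    ("R3", "Cy", 61),
]

by_score_desc = sorted(rows, key=lambda r: r[2], reverse=True)
print([r[0] for r in by_score_desc])  # ['R2', 'R1', 'R3']
```

Sorting only the Score column by itself would be the equivalent of “Continue with the current selection”: the other cells would stay put, and the rows would no longer be of-a-piece.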
57. What Follows
• In the following section are the main types of data graphs enabled in
Excel with its built-in charting features.
• Each section begins with the type of chart and some general
characteristics, followed by examples.
• A majority of the examples are drawn with open-source real-world data. One
data visualization was created using synthetic data for effects, and that
visualization has been labeled as being created using faux data.
• The data may have been processed using other tools, but the graphs
themselves were all created in Excel.
• On the same slide as the graph or directly after each graph is a table
with the underlying data, to help viewers understand the connection
between the data and the visualization.
58. Column Charts
• Column graphs tend to be vertical (vs. horizontal).
• In other words, they tend to align with the orientation of a column.
• Column graphs may be 2D or 3D.
• The common shapes representing data are rectangles.
• Columns may be stacked.
• Related columns may be clustered.
• Columns may be summed to 100% in “100% stacked column (or bar)
charts.”
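Behind a “100% stacked” chart, each row is normalized so its segments sum to 100; a sketch using sentiment counts from one of this deck’s sample data tables:

```python
# Sentiment counts for one group (from the deck's sample data table).
counts = {"Very Negative": 210, "Moderately Negative": 305,
          "Moderately Positive": 399, "Very Positive": 245}

total = sum(counts.values())
percentages = {k: round(100 * v / total, 1) for k, v in counts.items()}

print(percentages["Very Negative"])      # 18.1
print(round(sum(percentages.values())))  # 100
```

Excel performs this normalization automatically when the 100% stacked chart type is chosen; the raw counts stay untouched in the worksheet.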
60. Data Structure for the Vertical Column or Bar
Chart on the Prior Slide
pronoun | ppron | i | we | you | shehe | they | ipron | article | prep | auxverb | adverb | conj | negate
2.47 | 0.36 | 0.07 | 0.08 | 0.07 | 0.00 | 0.15 | 2.10 | 5.77 | 10.78 | 3.74 | 1.69 | 4.07 | 0.52
62. Data Structure for the Stacked Bar Chart in
the Prior Slide
Group | Very Negative | Moderately Negative | Moderately Positive | Very Positive
SpaceX Public Group FB | 210 | 305 | 389 | 245
Tesla Motors Club FB | 12 | 20 | 45 | 23
63.
Group | Selfies | Dronies
Babies | 7 | 3
Children | 63 | 21
Teens | 20 | 9
20s and 30s | 943 | 168
40s | 49 | 10
50s | 25 | 7
60s | 12 | 2
70s and older | 1 | 4
Mixed Age | 224 | 116
Unknowable | 27 | 185
64. Line Charts
• Line graphs tend to be horizontal
• Line graphs may represent changes over time
• In such cases, time is represented on the x-axis, and some variable with a
numerical measure (counts, percentages, frequencies, intensities) is
represented in the y-axis
• Time units should be consistent
• Line graphs with time on the x-axis may be enhanced with a drawn
“trendline” to indicate directionality of the phenomena over the
studied / observed time frame and into the near future.
• Comparative line graphs may show multiple related factors (variables)
interacting over time with each other
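A drawn trendline is typically an ordinary least-squares fit; the slope and intercept can be computed directly. A sketch with hypothetical points that lie exactly on a line, so the fit is exact:

```python
# Hypothetical time-series points lying exactly on y = 2x + 1.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]

n = len(xs)
mx = sum(xs) / n
my = sum(ys) / n

# Ordinary least-squares slope and intercept.
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
intercept = my - slope * mx

print(slope, intercept)  # 2.0 1.0
```

A positive slope corresponds to the upward directionality a trendline conveys over the observed time frame.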
65. Line Charts (cont.)
• Line graphs may have two different variables with one represented on the
x-axis and one on the y
• The line itself then may show some association between the two variables (which
should be continuous variables)
• The associations may be negative or positive
• The associations may be more complex and curvilinear (not staying consistent one way or
another over time)
• Or there may be no apparent association
• Where a bar graph (the prior one) suggests a discretization (and “space”)
between the bar elements, a line graph suggests more nuance and some
continuity in the variables (and less space or no space between variables,
expressed as a dotted line or a continuous line).
68.
Group | Very Negative | Moderately Negative | Moderately Positive | Very Positive
SpaceX Public Group FB | 210 | 305 | 399 | 245
Tesla Motors Club FB | 12 | 20 | 45 | 23
69. Pie Charts
• Pie charts are used to represent (related) proportions of a whole.
Proportions are determined numerically—by raw counts or
percentages, usually.
• The respective proportions are represented as “slices.”
• Pie charts may be 2D or 3D.
• One version of a “pie chart” is a doughnut, which is a circular
proportional representation.
• “Exploding” pie charts have sections that are pulled out from the
main pie as a point-of-emphasis.
73. Bar Charts
• Bar charts use rectangular shapes (and the sizes of these shapes) to
indicate quantities and intensities.
• Bar charts may have bars be either horizontal or vertical.
• Bar charts may be 2D or 3D.
• The bars may be stacked; they may be clustered.
• Bars may be summed to 100% in “100% stacked column (or bar)
charts.”
• Note: The bar chart types are as follows: vertical stacked bar chart,
100% stacked horizontal bar chart, and a Pareto chart.
76. Data Structure for the 100% Bar Chart on the Prior Slide

Group | Very Negative | Moderately Negative | Moderately Positive | Very Positive
SpaceX Public Group FB | 210 | 305 | 399 | 245
Tesla Motors Club FB | 12 | 20 | 45 | 23
78. Data Structure and Source for Stacked Bar
Charts in Prior Slide
• Data from “Comparative Analysis of 4-H Enrollment and U.S. Census
School Data”
• conditional data distribution
• REEIS Report
• July 2010
• 4H38-Comparitive(sic)-Analysis-of-4H-Enrolment(sic)-US-census-
school-grade-data.xlsx
Region Name: <All>   State Name: <All>

Grade | Kindergarten | 1st | 2nd | 3rd | 4th | 5th | 6th | 7th | 8th | 9th | 10th | 11th | 12th
4-H Enrollment | 3,091,210 | 3,090,230 | 3,733,658 | 5,327,058 | 6,851,803 | 6,383,579 | 4,522,690 | 2,955,880 | 2,465,910 | 1,625,339 | 1,444,406 | 1,258,010 | 989,248
US Census | 27,624,237 | 27,945,407 | 28,086,468 | 27,494,293 | 28,503,328 | 28,461,965 | 28,102,045 | 27,757,994 | 27,640,711 | 27,576,100 | 27,584,713 | 27,644,003 | 27,744,150
79.
Main Time Zones
Alaska | 3
Arizona | 7
Asuncion | 1
Auckland | 1
Baghdad | 1
Bangkok | 1
Beijing | 1
Bogota | 1
Brasilia | 2
Bucharest | 1
Central America | 1
Central Time (US & Canada) | 34765
Copenhagen | 1
Eastern Time (US & Canada) | 130
Harare | 1
Hawaii | 3
Indiana (East) | 2
Islamabad | 1
Jerusalem | 1
La Paz | 1
London | 1
Mid-Atlantic | 1
Moscow | 1
Mountain Time (US & Canada) | 59
Nairobi | 1
Pacific Time (US & Canada) | 46
Paris | 1
Rome | 3
Seoul | 2
Tokyo | 1
Total | 35041
80. Area Charts
• Area charts are built from line charts.
• In these, the areas under the respective lines are filled in with certain
colors and / or textures to indicate quantitative data (frequencies,
amounts, intensities, etc.).
• The data may be composed of one record or multiple comparable records.
• The areas are usually somewhat transparent to enable visualizing
other related data records to enable comparisons of quantities.
86. Data Structure and Source for Area Chart in Prior Slide
• Comparison of search frequencies for “selfie” and “selfie guy” on Google Search from 2004 – 2017 (June)
• Selection of two columns of less-correlated, less co-varying web search activity data from the related downloadable .csv file
• Correlations are over time with normalized data (z-scores) over weekly and monthly intervals in the time period
• Extracted from Google Correlate data
87. X Y (Scatter) Charts
• Scatter graphs (aka scatter plots or scatter diagrams) capture two sets
of point data.
• On the respective x-axis and y-axis, different variables are
represented. These variables are often continuous (vs. discrete) ones.
• Sometimes, lines are drawn through the data to help in visualizing
positive associations (the increase in one results in the increase in the
other), negative associations (the increase in one results in the
decrease in the other), no relations, or curvilinear relations (more
complex associations than linear ones).
• Of course, correlations do not mean causation per se.
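The association a scattergraph hints at can be quantified as a Pearson correlation coefficient; a stdlib sketch with hypothetical data (and, as noted above, correlation is not causation):

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))  # ~ 1.0 (perfect positive)
print(pearson([1, 2, 3, 4], [8, 6, 4, 2]))  # ~ -1.0 (perfect negative)
```

Values near zero would correspond to a scattergraph with no apparent linear association; curvilinear relations can score near zero too, which is one reason the plot itself still matters.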
93. Stock Charts
• Stock graphs (sometimes referred to as OHLC or “open-high-low-close” charts) show the ups and downs in stock valuations over time.
• Stock graphs are referred to as “OHLC” because the data structure is as follows: identifier (whether stock or date or some other identifier), open, high, low, and close.
• The open is the valuation of a stock at the open of the stock session. The high
describes the highest value of the stock in the day-long trading period. The
low refers to the lowest value of the stock in the trading period. The close
defines the closing value in that time period.
94. Stock Charts (cont.)
• The three examples were created from the online Nasdaq historical
data site. Their “quotes” tab enables access to historical prices of
stocks, and only recent datasets were used for the following: The
Boeing Company (BA), Alphabet, Inc. (GOOG), and Tesla, Inc. (TSLA).
• Because all the visualizations are from a single source and of a single type, visual variety was introduced through Excel’s variations on this graph type.
• http://www.nasdaq.com/symbol/ba/historical
• http://www.nasdaq.com/symbol/goog/historical
• http://www.nasdaq.com/symbol/tsla/historical
98. Surface Charts
• Surface graphs are 3-dimensional (3D) graphs with x, y, and z axes.
• The setup for a surface graph requires some early data processing,
not just three sets of data.
• The assumption behind the data in a surface graph is that x and y are
independent variables, and the values should be numeric.
99. Surface Charts (cont.)
The data should be structured as a matrix or what some call a “mesh” because this information will be the underlying data behind the 3D contour.
To build this mesh, the x-axis should be one row of data, and the y-axis should be one column of data. The z-axis (which is the height of various points of the mesh) is drawn as an intersection between x and y (in the green area).
Sparse matrices (those with a lot of empty cells, null values, or zeroes) do not work as well as fully defined ones.
Note that each cell’s formula should refer back to the y-column and the x-row with absolute references ($column letter and $row number).
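The mesh-building steps above can also be done programmatically before the data reaches Excel. This hypothetical Python helper lays out x values across the first row, y values down the first column, and z values at the intersections, then writes a .csv that Excel could open (the function, file name, and z-formula are illustrative assumptions):

```python
import csv

def build_mesh(xs, ys, f):
    """Build a surface-chart 'mesh': x across the top row, y down the
    first column, and z = f(x, y) at each intersection."""
    header = [""] + list(xs)                            # first row: x-axis values
    rows = [[y] + [f(x, y) for x in xs] for y in ys]    # y value, then z cells
    return [header] + rows

xs = [0, 1, 2, 3]
ys = [0, 1, 2]
mesh = build_mesh(xs, ys, lambda x, y: x * y + 1)       # illustrative z function

with open("mesh.csv", "w", newline="") as fh:
    csv.writer(fh).writerows(mesh)
```

A fully defined mesh like this avoids the sparse-matrix problem noted above.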
100. Surface Charts (cont.)
• Surface charts enable the visualizing of some interaction between the
data represented in the x-axis and the y-axis.
• The colors of the surface chart (represented as bands) represent
similar values.
• 3D surface charts may be depicted as wireframe contours, aerial view
contour charts, and others.
• 3D surface charts may be viewed to see overall data patterns. They
may be used to visualize equations. They may be used to find
optimum combinations between two sets of data (represented on the
x and y axes).
101. Surface Charts (cont.)
• 3D visualizations are difficult for people to use because data may be
occluded or difficult to see.
• Data labels are important; legends are important.
• The positioning of the visualization is important.
• The labeling of the three axes is important, so people know what is
represented.
• Excel enables all the above.
• The background behind the data and how the 3D data visualization
was arrived at will be important to help users contextualize the
visualization.
107. Radar Charts
• Radar graphs, also known as spider graphs / charts, show quantitative
measures on axes emanating from a center point.
• Each axis represents a variable.
• In total, the radar graph represents a dataset across multivariate features.
• Radar graphs may be used to compare multiple underlying datasets,
assuming that these are somehow comparable.
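One common way to make multiple datasets “somehow comparable” on shared radar axes is min-max scaling onto a [0, 1] range; a minimal sketch (the helper name is hypothetical):

```python
def minmax_scale(values):
    """Rescale a series to the [0, 1] range so differently-scaled
    radar axes can be compared on a common footing."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]
```

Each axis’s variable would be scaled separately before plotting.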
112. Treemap Charts
• Treemap diagrams are rectangular diagrams which convey frequency
in terms of spatial area of smaller rectangles fitted inside the space.
• Treemap diagrams, if they include nested rectangles within the larger
rectangles, are hierarchy charts because they capture the
relationships of the higher vs. the lower levels.
• By convention, the largest rectangles (indicating highest counts by
category) are to the left, and the smallest are to the right.
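The proportional-area idea behind treemaps can be sketched as a small calculation: each category gets area in proportion to its count, ordered largest-first per the convention above (the helper name and sample values are hypothetical):

```python
def treemap_areas(counts, total_area=100.0):
    """Give each category an area proportional to its count, largest first."""
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, total_area * n / total) for name, n in ordered]
```

Actual rectangle placement (the “squarifying” of the layout) is handled by the charting tool itself.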
114.
Word Count
9465008123 5035
amazon 2916
2017 2861
https 2712
com 2063
just 873
like 783
get 771
10154846242183124 734
one 712
order 698
time 650
now 637
company 568
please 541
prime 489
amzn 474
www 473
day 458
10155257276294339 444
know 430
united 402
see 400
states 400
new 396
delivery 389
http 383
seattle 372
customer 369
status 364
even 360
sorry 359
retail 358
122 355
service 355
33207 350
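Word-frequency tables like the one above can be produced with a short script before the data ever reaches Excel; a minimal Python sketch (the function name and sample text are hypothetical):

```python
from collections import Counter

def word_counts(text):
    """Count word frequencies after lowercasing and stripping punctuation."""
    words = (w.strip(".,!?\"'").lower() for w in text.split())
    return Counter(w for w in words if w)
```

The resulting counts can be pasted into a worksheet as two columns (word, count) for charting.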
116. Sunburst Charts
• Sunburst diagrams originated from pie charts. In sunburst diagrams,
variables are depicted as portions of a circular ring.
• Sunbursts are a form of hierarchical chart, which show upper and
lower level interrelationships between elements, such as topics and
sub-topics.
• The elements closest inside the circle are the top-level topics. Farther
out are the sub-topics, sub-sub-topics, and so on. (Or, some may
prefer child topics, grandchild topics, great grandchild topics.) It’s
the differentiation between the levels of information that makes this
a hierarchical chart.
118. Data Structure of the
Sunburst Diagram in
the Prior Slide
Nodes Sub-nodes No. Coding References
account account access 7
account account business days 4
account account details 3
account account info 2
account account information 9
account account issues 3
account account specialist 12
account account specialist email 3
account account today 1
account amazon associate account 3
account bank account 8
account checking account 1
account createspace account 2
account email account 26
Note the hierarchy with the “nodes” and “sub-nodes”.
Note the alphabetization in both text (string) columns.
Note the frequency counts in the “No. Coding References” column.
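The nodes / sub-nodes / counts structure above maps naturally onto a nested dictionary; a sketch using a few rows from the table (the helper name is hypothetical):

```python
def build_hierarchy(rows):
    """Nest (node, sub-node, count) rows into a two-level dictionary."""
    tree = {}
    for node, sub, count in rows:
        tree.setdefault(node, {})[sub] = count
    return tree

# A few rows from the account / sub-node table above.
rows = [
    ("account", "account access", 7),
    ("account", "account specialist", 12),
    ("account", "email account", 26),
]
tree = build_hierarchy(rows)
```

The same two-level nesting is what the sunburst’s inner ring (nodes) and outer ring (sub-nodes) represent visually.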
119.
Name Sources References
beautiful 1 782
day 1 4
employment 1 8
event 1 8
everyone 1 4
flags 1 5
friendly reminder 1 12
good 1 256
great photos 1 175
holiday 1 14
holiday festivities 1 7
holiday lights 1 4
home 1 162
job 1 8
listing 1 7
morning 1 4
offices 1 10
online 1 5
photo 2 453
picture 1 384
place 1 414
post 1 13
road 1 328
state 2 184
state offices 1 8
sunset 1 183
today 1 16
town 1 203
trip 1 361
120.
✔ ✔ apps 1
✔ ✔ game 1
✔ ✔ income jaction 1
✔ ✔ play store 1
delivery date estimated delivery date 8
delivery date false delivery date 2
delivery delivery persons 2
delivery delivery service 2
delivery delivery vehicle 2
delivery estimated delivery date 8
delivery fake delivery log 5
delivery false delivery date 2
delivery outsourced delivery 1
delivery perfect delivery performance 2
delivery poor delivery experience 1
gift gift cards 2
gift great client gift 1
mail mail box 2
mail mail room 3
mail provided prayer rooms 1
office apartments office 1
office post office 2
order confirmation e-mail order confirmation e-mail 11
order current order isnt 3
order order confirmation e-mail 11
order order status 1
service delivery service 2
service design services 2
service seller support service 1
shipping amzl shipping 2
shipping day shipping 1
shipping free shipping 1
121. Histogram Charts
• Histogram charts show the frequency distribution of numerical data
over the comprehensive range of possible values. These are counts of
how many times a certain score appears.
• As such, they give a sense of the density of the data.
• Histograms are generally applied to continuous data. For categorical
data, regular bar charts with spaces between the bars are often used.
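The binning behind a histogram can be sketched in a few lines; this hypothetical helper counts how many values fall into each fixed-width bin:

```python
def histogram_bins(values, bin_width):
    """Count how many values fall into each fixed-width bin; keys are
    the lower edges of the bins."""
    counts = {}
    for v in values:
        b = int(v // bin_width) * bin_width
        counts[b] = counts.get(b, 0) + 1
    return dict(sorted(counts.items()))
```

Excel’s histogram chart performs an equivalent binning internally, with the bin width adjustable in the axis options.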
126. Data Structure for the Theme Histogram in the Prior Slide
Themes: A: company, B: engine, C: engineering, D: landing, E: launch, F: mission, G: pad, H: real, I: rocket, J: space, K: spacex, L: stage, M: station, N: system, O: test, P: time, Q: units, R: vehicle, S: work
1: Internals (1) SpaceX: 43, 66, 34, 39, 142, 30, 37, 29, 105, 101, 51, 37, 31, 49, 50, 35, 27, 34, 43
127. Box & Whisker Charts
• Box and whisker diagrams enable the visualization of groups of numerical
data in quartiles (data broken into 25% or one-fourth segments). The
boxes in the boxplots show the range of values in quartiles for that
variable.
• The whiskers—or the lines running from the boxes—show the variability
outside the upper and lower quartiles. The longer the lines, the greater
the variability above the quartile ranges.
• The data mapped in box plots are not assumed to be parametric, so there
is no assumption of underlying statistical distributions.
• Lines within the boxes may indicate the median or midpoint where half the
data is above and half the data is below.
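The quartile structure behind a box plot can be sketched as a five-number summary (minimum, Q1, median, Q3, maximum); a minimal Python version using one common quartile convention (other conventions, including Excel’s options, differ slightly):

```python
from statistics import median

def five_number_summary(values):
    """Minimum, Q1, median, Q3, maximum -- the backbone of a box plot."""
    s = sorted(values)
    mid = len(s) // 2
    lower = s[:mid]                                   # half below the median
    upper = s[mid + 1:] if len(s) % 2 else s[mid:]    # half above the median
    return s[0], median(lower), median(s), median(upper), s[-1]
```

The box spans Q1 to Q3, the inner line sits at the median, and the whiskers extend toward the minimum and maximum (with outliers shown separately).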
128. Box & Whisker Charts (cont.)
• Skewness shows the tendency of the data, i.e., whether more scores
trend high or trend low.
• A short box means low dispersion or spread (not a large variety in
numbers)…while a long box means high dispersion or spread (a large
variety of numbers).
• Outliers are indicated as dots beyond the whiskers.
• The boxes in box & whisker diagrams may be vertical or horizontal.
132. Data Structure for the Box & Whisker Plot in the Prior Slide (partial snippet)
Hospital Referral Region Description / Total Discharges / Average Covered Charges / Average Total Payments / Average Medicare Payments
AL - Dothan 91 $32,963.07 $5,777.24 $4,763.73
AL - Birmingham 14 $15,131.85 $5,787.57 $4,976.71
AL - Birmingham 24 $37,560.37 $5,434.95 $4,453.79
AL - Birmingham 25 $13,998.28 $5,417.56 $4,129.16
AL - Birmingham 18 $31,633.27 $5,658.33 $4,851.44
AL - Montgomery 67 $16,920.79 $6,653.80 $5,374.14
AL - Birmingham 51 $11,977.13 $5,834.74 $4,761.41
AL - Birmingham 32 $35,841.09 $8,031.12 $5,858.50
AL - Huntsville 135 $28,523.39 $6,113.38 $5,228.40
AL - Birmingham 34 $75,233.38 $5,541.05 $4,386.94
AL - Birmingham 14 $67,327.92 $5,461.57 $4,493.57
AL - Dothan 45 $39,607.28 $5,356.28 $4,408.20
AL - Birmingham 43 $22,862.23 $5,374.65 $4,186.02
AL - Birmingham 21 $31,110.85 $5,366.23 $4,376.23
AL - Mobile 15 $25,411.33 $5,282.93 $4,383.73
AL - Huntsville 27 $9,234.51 $5,676.55 $4,509.11
AL - Mobile 27 $15,895.85 $5,930.11 $3,972.85
AL - Tuscaloosa 31 $19,721.16 $6,192.54 $5,179.38
AL - Mobile 18 $10,710.88 $4,968.00 $3,898.88
AL - Birmingham 33 $51,343.75 $5,996.00 $4,962.45
AL - Birmingham 29 $55,219.31 $5,710.31 $4,471.68
AL - Mobile 66 $14,948.15 $5,550.90 $4,219.90
AL - Birmingham 19 $73,846.21 $4,987.26 $3,944.42
AK - Anchorage 23 $34,805.13 $8,401.95 $6,413.78
AZ - Phoenix 11 $34,803.81 $7,768.90 $6,951.45
AZ - Tucson 40 $24,474.75 $6,799.85 $5,764.87
AZ - Phoenix 18 $28,571.61 $9,133.00 $8,008.11
AZ - Tucson 12 $35,968.50 $6,506.50 $5,379.83
AZ - Tucson 42 $26,294.52 $6,083.42 $4,903.33
AZ - Phoenix 28 $26,771.78 $7,140.85 $6,133.57
AZ - Phoenix 20 $29,967.80 $6,978.75 $5,969.55
AZ - Phoenix 15 $27,349.40 $11,026.33 $9,056.06
AZ - Phoenix 18 $59,443.83 $8,487.44 $7,422.66
133. Waterfall Charts
• Waterfall diagrams (aka “flying bricks chart” or “Mario chart,” or
“bridge” in finance) capture intermediate positive or negative
valuations of something—such as products or services, housing, or
stocks.
• The x-axis may be time, or it may be a variable.
• The y-axis is some sort of measure.
• In some charts, the starting and ending values are shown as full bars,
while the intermediate values float (as floating steps) to various
heights depending on their varying values.
• A waterfall chart may show valuation variance over time.
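The intermediate rises and falls that a waterfall chart plots can be derived from a plain series of valuations; a hypothetical Python sketch (the column names mirror a Base / Fall / Rise / Change layout):

```python
def waterfall_columns(prices):
    """Derive base / fall / rise / change columns from a price series."""
    rows, prev = [], prices[0]
    for p in prices:
        change = round(p - prev, 10)   # round to dodge float artifacts
        rows.append({"base": p,
                     "fall": -change if change < 0 else 0,
                     "rise": change if change > 0 else 0,
                     "change": change})
        prev = p
    return rows
```

Each day’s value lands in either the fall or the rise column, which is exactly what the floating intermediate bars encode.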
134. Waterfall Charts (cont.)
• This graph displays “the cumulative effect of sequentially introduced
positive or negative values” (“Waterfall chart,” Mar. 2017).
• There is a non-naïve assumption that what has occurred before may have
near-term effects on what follows (or is part of a larger affecting trend).
• The depicted variables exist in a context and are in co-relationship.
138. Data Structure for the Waterfall Chart in the Prior Slide
Dates Base Fall Rise Total Changes
4/3/2017 3.95 0 0 0
4/4/2017 3.95 0 0 0
4/5/2017 3.75 0.2 0 -0.2
4/6/2017 3.75 0 0 0
4/7/2017 3.75 0 0 0
4/10/2017 3.8 0 0.05 0.05
4/11/2017 3.85 0 0.05 0.05
4/12/2017 3.7 0.15 0 -0.15
4/13/2017 3.45 0 0.25 0.25
4/17/2017 3.5 0 0.05 0.05
4/18/2017 3.4 0.1 0 -0.1
4/19/2017 3.4 0 0 0
4/20/2017 3.4 0 0 0
4/21/2017 3.5 0 0.1 0.1
4/24/2017 3.5 0 0 0
4/25/2017 3.4 0.1 0 -0.1
This one was made with
the stacked vertical
column chart feature.
These are still not quite
presenting correctly, but
they’re close… The data
is from the Nasdaq
Historical Quotes tool.
139. Combo Chart
• Combination graphs are those which mix data and present the
findings in creative interlinked ways (optimally for new insights).
• Combining data requires finesse because there are ways to introduce
errors when mixing data. Data types may not align. Measures may
not be accurately matched. Some data may be redundant. Etc.
• There are many ways to create these.
• Some of the earlier charts may be “combination” ones as well because of the
integration of multiple variables and / or multiple datasets.
142. 3D Maps Geographical Imagery
• The 3D Maps imagery is related to locational mapping on a digital 3D
globe.
• There should be at least one to two columns of locational information
based on standard names for cities, states (or provinces), and
countries. Regional names are also recognized.
• The spellings of the names, though, should be standard to the tool.
• There may be other columns of related quantitative data related to
the respective locations. This may be time data, demographic data,
or various other relevant information.
143. 3D Maps Geographical Imagery (cont.)
• To set up data for 3D imagery, set up some locations: city,
state/province, country, and say, years of residence.
• Highlight the data.
• Go to Insert -> 3D Maps.
• Adjust the fields for the look-and-feel.
• The maps are interactive (rotatable) and zoomable.
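The data setup described above might look like the following as a .csv written from Python; the place names and years are invented sample values, and the file name is an assumption:

```python
import csv

# Hypothetical sample rows for a 3D Maps exercise: locational columns
# with standard spellings, plus one quantitative column.
rows = [
    ("City", "State/Province", "Country", "Years of Residence"),
    ("Manhattan", "Kansas", "United States", 5),
    ("Seattle", "Washington", "United States", 3),
]
with open("locations.csv", "w", newline="") as fh:
    csv.writer(fh).writerows(rows)
```

Opening this file in Excel, highlighting the data, and choosing Insert -> 3D Maps would then plot the quantities at the recognized locations.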
145. Data Structure for the 3D Image in the Prior
Slide
City State Country Years of Residence
146. Some Tips for Creating Data Visualizations in
Excel 2016
• Do a mental walk-through of the underlying data.
• Consider what it is you want to communicate.
• Create a number of versions of the data visualizations. Experiment
broadly.
• Add data visualization details.
• Add surrounding information to ensure that the data visualization fits
the context.
147. Going “Off-Script” within Excel
Going with data visualization templates in Excel is a very fast way to portray
structured data.
However, there are some creative ways to re-visualize data in Excel by using existing
capabilities.
148. (1) A Composite Multi-Graph Image
• Let’s say that there is a need to create multiple graphs that are
interrelated and need to be exported as one file.
• Simply click on the outside borders of each of the elements, go to the
Page Layout tab, and click on Group. This will treat all the elements as
one group, and will enable clicking on just one part of the image to
“copy” the entire one into a photo editing tool.
• If the elements are not treated as one, then it will be difficult to export the
composite graphs as one with a screen grab (since a screen may not contain
the entire composite image).
• Piecemeal copy-and-paste exports will mean that the elements have to be
recomposed in a tool like Microsoft Visio, with the attendant challenges of
getting everything to align.
149. (2) Back-to-Back Bar Charts
• Begin with a set of relatively comparable data with the same variables
being compared (with a numerical measure).
• Assess the data with a shared measure.
• In Excel, create two separate horizontal bar charts.
• If the results are quite different, rework the horizontal axes to have
the same maximum number (so the two sides have a comparable
base).
• Add data labels for clarity of the bars.
150. (2) Back-to-Back Bar Charts (cont.)
• Create a name label for the data visualization using a text box.
• For one of the two horizontal bar charts, in the “Format Axis,” reverse
the order of the values.
• For the one with reversed values, delete the vertical axis with the
numbers.
• Create a text box with the variables centered.
• Strive to align the two bar charts. (This is easier said than done
because the horizontal bars are not the same thickness necessarily if
the numbers are quite different.)
151. (2) Back-to-Back Bar Charts (cont.)
• Add a white background to the image, so that the Excel cells do not
show up.
• If further cleanup work is needed, drop the image into Photoshop or
another image editing tool, and clean up the image before placing the
image.
• Once an Excel graph is made into an image, it is no longer machine-
readable or screen-reader accessible, so informationally equivalent alt
text should be included to ride along with the image.
152. (2) A Rough Example of a Back-to-Back Bar
Chart
153. (3) A Stacked Pyramid Chart
• Create a list of frequency data.
• Highlight the frequency data, and sort from largest to smallest. Be
sure to extend the selection, so the data labels move with the correct
frequency amounts.
• Intersperse lines between each row, and put in a placeholder amount
(say, 100 for the amount).
• Highlight the data, and insert a 3D 100% stacked column chart.
• Highlight the data columns and right-click. In the Format Data Series
window, select “Full Pyramid.”
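The intersperse-placeholder step above can be sketched in code; this hypothetical helper inserts a spacer row (with a placeholder amount of 100) between each pair of data rows, and the labels in the example are invented:

```python
def intersperse_placeholders(rows, spacer=100):
    """Insert a placeholder row between data rows; the spacer layers are
    later given 'No fill' to create visual gaps in the stacked pyramid."""
    out = []
    for label, value in rows:
        out.append((label, value))
        out.append(("", spacer))
    return out[:-1]   # no spacer needed after the last data row
```

The spacer value can later be adjusted (per the step above) to tune the apparent gap between pyramid layers.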
154. (3) A Stacked Pyramid Chart (cont.)
• With the chart highlighted, go to the Design tab, and click “Switch
Row/Column.” The separate columns will coalesce into one pyramid.
• Click on the left axis (100% to 0%), and select “Format Axis.” In the
“Format Axis” window at the right, select “Values in reverse order.”
• In the chart area, select any visual elements that are not desired, and
press Delete to remove them.
• Click the “plus” at the right of the chart and add elements that are desired
(such as a Legend).
• Adjust the size of the separators from 100 to another consistent number to
create the sense of space between the reverse pyramid elements.
155. (3) A Stacked Pyramid Chart (cont.)
• Right-click one of the placeholder layers in the visualization, and go to
the Format Data Series window. In the “Fill” tab, select “No fill.” Do
this for each of the placeholder layers to give a sense of physical
distance between each of the actual data layers.
• The “Enrollment Summary by College” data in the following table
comes from the Office of the Registrar at Kansas State University, at
http://www.k-state.edu/registrar/statistics/colleges.html. This is
from 2016.
• This data visualization type may align with sequential or pipeline data
as well as others.
158. Some Common Mistakes
• Not ensuring that the underlying data behind a data visualization is
correct
• A lack of alignment and fit between the underlying data and the data
visualization form
• Going with a data visualization only because the software seems to enable
it…but not working through the visualization to make sure that it makes sense
both visually and data-wise
• Confusing rates with actual measures, and others
• Combining non-comparable data types
• Having data in a cell which is not identified by accurate type (such as
“date” information as “general” data or “number” information as
“text” data)
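The cell-type mistake noted above (“number” information stored as “text”) can be caught programmatically before charting; a hypothetical Python sketch:

```python
def coerce_numeric(cell):
    """Detect 'number stored as text' and coerce it to a float;
    return None if the cell is genuinely non-numeric."""
    if isinstance(cell, (int, float)):
        return float(cell)
    try:
        # Strip common formatting debris before converting.
        return float(str(cell).replace(",", "").replace("$", ""))
    except ValueError:
        return None
```

Cells that come back as None warrant a manual look, since a chart built on text-typed numbers will silently misbehave.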
159. Some Common Mistakes (cont.)
• An incoherent data visualization enabling a wide variety of
misinterpretations (or conflicting data in a data visualization)
• Insufficient data visualization context
• Poor labeling of data: insufficient labels, inaccurate labels, non-
neutral language, illegibility, and / or others
• Not spell checking data visualizations
• Not studying the conventions of the data visualization
• Assuming that viewers have the same level of background knowledge
as the creator of the data visualization
160. Some Common Mistakes (cont.)
• Excess data in the data visualization (such as extra decimal places on
whole numbers, leaving many .00s)
• A 2D or 3D data visualization with excessive data and data element
occlusion
• A 4D data visualization with pacing that is too fast or too slow (or
which does not enable viewer pacing or control)
161. One Main Realization
• The work to conduct the research and to acquire the actual data takes
about 95% of the effort and time, and drawing the data visualization
takes about 5% of the effort…but the data visualization piece is also
critical (because a lot can be compromised with improper drawing of
the data visualization).
162. Adding Relevant Data
Visualization Elements
Data visualizations should be as simple as possible, with no extraneous elements
that do not contribute to the overall meaning of the chart.
163. Common Data Visualization Elements
• A clear noun-phrase title
• Labels for the x- and y-axes (and sometimes y1 and y2
axes)
• Data labels
• Gridlines
• A data table (for some data visualizations)
• A legend
• Error bars
• Trendlines, and others
164. Graph Styles
• Various style versions of the target graph
• Background styles
• Object handling
• Texturing of objects and shapes
• Font types and styles
• 2D vs. 3D, and others
165. Range of Color Palettes
• Ability to add a variety of colors in
palettes that are aesthetically
pleasing and of sufficient contrast
for visual accessibility
• Color palettes may be selected by
dominant colors
166. To Change Graph Colors…
• To change the colors of the plot, highlight the
plot.
• In the Design tab of the ribbon, select “Change
Colors.” A dropdown menu will enable the
selection from a variety of color palettes.
The palettes are divided into two sections:
colorful (polychromatic and contrastive) and
monochromatic (different shades of a
particular color, often in gradients).
167. To Select Custom Colors…
• Custom colors may be applied to particular elements. Just right click
the element, and change the fill color.
169. Dropdown Menus with Additional Options
• Users have a high level of
control for the look, feel, and
function of the chart / graph
elements.
170. MS Excel’s Page Layout Features
• Excel has a variety of layout features that may enable in-graph
editing.
• Some of the features of this tab include the following:
• Pre-built themes
• Backgrounds
• Scaling and sizing
• Gridlines
• Arrangements (bringing forward, sending back)
• Auto alignment choices
• Grouping, and others
173. Several Main Ways to Export Excel Charts
Copy and Paste as a Linked Graph
• Can export data visualizations as a
copy and paste (which will maintain
the link to the original file—as long as
all the respective files’ locations are
not changed)
• Copied and pasted charts will
maintain an alpha channel behind
visual elements (so there is an
invisible layer with 100%
transparency)
• Colors of the data graphs will change
in PowerPoint based on the applied
design styles and color palettes
Save as Template
• Can export data templates for
use later on
175. Several Main Ways to Export Excel Charts (cont.)
Copy and Paste as an Image into a
Digital Image Editing Software Program
• Can copy the graph by clicking on its outer edges, pressing CTRL + C
(to save the image to the Windows clipboard), pasting into Adobe
Photoshop, changing the resolution, contrast, and aspect ratio as
needed, and then exporting / saving the image as a .png, .tif, .jpg,
.gif, or some other format
Copy and Paste as an Image into a
Diagramming / Drawing Software Program
• Can copy the graph as an image into a diagramming / drawing software
program (like Microsoft Visio), add image overlays, and then output in
the proprietary file format and then as a digital image
176. Microsoft Visio
• For example, MS Visio offers the following: pre-made templates,
forms, containers, call-outs, connectors, and others
• There are overlays of shapes, text boxes, lines, fonts, and others
• Shapes may be highlighted and operations may be applied to them: union,
combine, fragment, intersect, subtract, join, trim, and others…through an
activate-able Developer tab
• To offer more control, users have gridlines, drag-able guidelines,
automated positioning and alignment, grouping features, aspect-ratio
controls, and others
• Color-based themes and variants
184. What are Add-ins?
• Add-ins are software programs built to function with Excel to add
various types of functionalities: data analytics, data visualizations, QR
code generation, expanded export file types, and others
• Add-ins / add-ons are helpful because they add functionality to
software that is already somewhat familiar
185. Where Can One Find Add-ins for Excel?
• Some of the Excel add-ins are from Microsoft Research and may be
activated within the tool.
• Some are available for download from the Office Store (such as a free
Streamgraph drawing add-on that creates area charts that vary over time).
• Others are related to software programs (like Acrobat PDF) and
enable richer ways to share / interchange file types.
• Some are downloadable from CodePlex and GitHub (like Network
Overview, Discovery and Exploration for Excel or “NodeXL”), for social
media platform data extraction, network analysis, network graph
drawing, and other capabilities.
186. Where Can One Find Add-ins for Excel? (cont.)
• The steps for accessing add-ins differ by type.
• Those that are built into the tool will require mere activation, if that.
• Those that come with other software programs will also require mere activation, if that.
• Some will require a download and some installation.
• Some will require a download but may auto-install.
187. Activating Add-ins
• In Excel, click the File tab.
• Click Options. The Excel Options window opens.
• Click “Add-ins” in the left menu.
• A list of available add-ins will display in the window, in several
categories:
• Active Application Add-ins
• Inactive Application Add-ins
• Document Related Add-ins
• Disabled Application Add-ins
188. Activating Add-ins (cont.)
• Select an add-in of interest, and click “Go” at the bottom.
• An “Add-in” window will open allowing the user to check certain
boxes to activate or to uncheck boxes to de-activate.
• Click “OK” once the selections are decided.
• These are global settings, and the add-ins should be available for
future uses of Excel.
192. About Data
• Data…
• Has to be collected somewhere advertently or inadvertently
• Has to be practically applied in some way (strategic, tactical, other)
• May be pre-labeled or post-labeled
• Structured data datasets include the following:
• What a thing is (data type) and generally how it relates to everything else
• Dataset metadata include the following:
• How the data was collected (hopefully with high standards and finesse)
• When the data was collected
• Who collected the data and how the source should be cited
193. About Data (cont.)
• Dataset metadata may be captured in data dictionaries if the dataset
is a larger one
• The fact that data is in the same set means there is some relatedness
whether you can see it or not (or you may have brought unrelated
contents into a dataset and are seeing relations that may not exist)
• Handling data requires finesse:
• Data handling should be back-stopped by protected raw datasets which are
left pristine and unprocessed (so researchers can always grab another set to
process differently)
• How you clean and handle it matters (handling can introduce artifacts,
mistakes, and skews)
• Researchers can’t afford to be sloppy or unthinking
194. About Data (cont.)
• Having access to a data table or a dataset can give a deceptive sense
of understanding
• Data has to be understood from a deep background in the subject matter
• Data has to be understood in the context of larger sets of data that may be
cross-referenced expertly
• Fragmentary data reveals in some cases and obfuscates in others
196. Some Common Standards for Data
Visualizations
• Data accuracy (underlying data;
proper contextualization; source
citations; disambiguation;
correction of errors; non-
manipulation of data consumers;
differentiation between
empirical, conceptual, and
synthetic data)
• Intellectual property protections
(copyright)
• Privacy protections (protection
against re-identification of de-
identified data)
• Proper crediting of all sources
• Accessibility through file
versioning, alt texting, access to
underlying databases, and
captioning
197. Some Common Standards for Data
Visualizations (cont.)
• Human and machine readability
of data tables
• Contextualizing
198. Contact and Conclusion
• Dr. Shalin Hai-Jew
• iTAC
• Kansas State University
• 785-532-5262
• shalin@k-state.edu
• Note:
• The data sources have generally been cited close to the data visualization.
• The presenter has no relationship to any of the software makers.