Grammar of graphics (그래픽 문법)
Outline (개요)
• Motivating example (동기부여를 위한 예제)
• example of research question (연구 질문 예시)
• mpg data (Miles per Gallon, 기름량당 주행거리)
• ggplot example
• Math review (수학 리뷰)
• function mapping (함수 대응)
• Dimension (차원) & Co-ordinate system (좌표계)
• Grammar of graphics (그래픽 문법)
• aesthetic mapping (미학적 대응)
• facet (면)
• geometric object (기하학적 개체)
• Statistical transformations (통계적 변환)
• Position adjustments (위치 조정)
• Coordinate systems (좌표계)
• The layered grammar of graphics (층화된 그래픽 문법)
Understand (이해하다): data exploration (데이터 탐색)
Transform (변환하다) & Visualize (시각화하다) & Model(모형을 만들다)
First example)
engine size (엔진 크기) vs. fuel usage (연료 소모량)
• Research question (연구 질문)
• Do cars with big engines use more fuel than cars with small engines?
Built-in data) mpg data
Cf) Miles per Gallon (MPG): how many miles a car travels on 1 gallon of fuel
mpg
#> # A tibble: 234 x 11
#> manufacturer model displ year cyl trans drv cty hwy fl class
#> <chr> <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
#> 1 audi a4 1.8 1999 4 auto(… f 18 29 p comp…
#> 2 audi a4 1.8 1999 4 manua… f 21 29 p comp…
#> 3 audi a4 2 2008 4 manua… f 20 31 p comp…
#> 4 audi a4 2 2008 4 auto(… f 21 30 p comp…
#> 5 audi a4 2.8 1999 6 auto(… f 16 26 p comp…
#> 6 audi a4 2.8 1999 6 manua… f 18 26 p comp…
#> # ... with 228 more rows
help(mpg)
mpg {ggplot2} R Documentation
Fuel economy data from 1999 and 2008 for 38 popular models of car
Description
This dataset contains a subset of the fuel economy data that the EPA makes available on http://fueleconomy.gov. It contains only models which
had a new release every year between 1999 and 2008 - this was used as a proxy for the popularity of the car.
Usage
mpg
Format
A data frame with 234 rows and 11 variables
manufacturer
model
model name
displ
engine displacement, in litres
year
year of manufacture
cyl
number of cylinders
trans
type of transmission
drv
f = front-wheel drive, r = rear wheel drive, 4 = 4wd
cty
city miles per gallon
hwy
highway miles per gallon
fl
fuel type
class
"type" of car
Let's plot first in a 2-D plane(2차원 평면): x- & y-axis(축)
x = displ, y = hwy
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy))
template (템플릿, 주형)
ggplot(data = <DATA>) +
<GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))
function mapping (함수 대응)
Dimension (차원)
Co-ordinate system (좌표계)
function (함수)
f:X→Y
f maps(대응시키다) X into Y
https://en.wikipedia.org/wiki/Function_(mathematics)
Dimension (차원)
Co-ordinate system (좌표계)
https://en.wikipedia.org/wiki/Dimension
Grammar of graphics (그래픽 문법)
aesthetic mapping (미학적 대응)
ggplot2
• ggplot2 is based on the grammar of graphics (그래픽 문법), the idea that you can build every graph from the same components: a data set, a coordinate system (좌표계), and geoms (기하, 도형), the visual marks that represent data points.
• To display (화면에 표현하다) values (값), map (대응시키다) variables (변수) in the data to visual properties of the geom, called aesthetics (미학), such as size (크기), color (색), and x and y locations (위치, 좌표).
Let's plot first in a 2-D plane(2차원 평면): x- & y-axis(축)
x = displ, y = hwy
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy))
Let's map (대응시키다) a 3rd dimension (차원) in color(색상)
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, color = class))
Let's map (대응시키다) a 3rd dimension (차원) in size (크기)
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, size = class))
Let's map (대응시키다) a 3rd dimension (차원) in alpha (transparency, 투명도)
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, alpha = class))
Let's map (대응시키다) a 3rd dimension (차원) in shape(모양)
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, shape = class))
A calculated variable (계산한 변수) as a 3rd dimension (차원)
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, color = displ < 5))
Cf) Common problem: syntax error~!
e.g.) Location of "+"
ggplot(data = mpg)
+ geom_point(mapping = aes(x = displ, y = hwy))
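A minimal sketch of the fix, assuming ggplot2 is installed: R treats a line as finished as soon as it parses as a complete expression, so the `+` has to end the first line rather than start the second. The object name `p` is only for illustration.

```r
library(ggplot2)

# The `+` ends the line, so R knows the expression continues on the next line.
p <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))

# Printing the object draws the plot.
print(p)
```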
Cf) Common problem
aesthetic mapping (미학적 대응) is for variables (변수) in the data
e.g.) "blue" is not a variable
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, color = "blue"))
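One way to get blue points, sketched under the assumption that ggplot2 is loaded: set `color` as an argument of the geom function, outside `aes()`. Inside `aes()`, "blue" is treated as a data value with a single level, so the points get the default palette colour and a legend instead. The object name `p_blue` is illustrative.

```r
library(ggplot2)

# color is set manually, outside aes(), so every point is drawn in blue.
p_blue <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy), color = "blue")

print(p_blue)
```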
Grammar of graphics (그래픽 문법)
facet (면)
Let's map (대응시키다) a 3rd dimension (차원) in color(색상)
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, color = class))
facet (면): subplots that each display one subset of the data
facet_wrap(~VariableName)
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy)) +
facet_wrap(~ class, nrow = 2)
facet (면): subplots that each display one subset of the data
facet_grid(VariableName4Row ~ VariableName4Column)
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy)) +
facet_grid(drv ~ cyl)
facet (면): subplots that each display one subset of the data
facet_grid(VariableName4Row ~ VariableName4Column)
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy)) +
facet_grid(drv ~ .)
facet (면): subplots that each display one subset of the data
facet_grid(VariableName4Row ~ VariableName4Column)
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy)) +
facet_grid(. ~ cyl)
facet (면): subplots that each display one subset of the data
facet_grid(VariableName4Row ~ VariableName4Column)
-> use a variable with more unique (고유) levels (단계, 범주)
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy)) +
facet_grid(trans ~ drv)
facet (면): subplots that each display one subset of the data
facet_grid(VariableName4Row ~ VariableName4Column)
-> use a variable with more unique (고유) levels (단계, 범주)
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy)) +
facet_grid(drv ~ trans)
More on faceting
Grammar of graphics (그래픽 문법)
geometric object (기하학적 개체)
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
geom_point (점)
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy))
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
geom_smooth (매끄러운)
ggplot(data = mpg) +
geom_smooth(mapping = aes(x = displ, y = hwy))
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
geom_smooth (매끄러운)
ggplot(data = mpg) +
geom_smooth(mapping = aes(x = displ, y = hwy, group = drv))
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
geom_smooth (매끄러운)
ggplot(data = mpg) +
geom_smooth(mapping = aes(x = displ, y = hwy, linetype = drv))
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
geom_smooth (매끄러운)
ggplot(data = mpg) +
geom_smooth(mapping = aes(x = displ, y = hwy, color = drv), show.legend = FALSE)
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
geom_point (점) & geom_smooth (매끄러운)
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy)) +
geom_smooth(mapping = aes(x = displ, y = hwy))
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
geom_point (점) & geom_smooth (매끄러운)
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
geom_point() +
geom_smooth()
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
geom_point (점) & geom_smooth (매끄러운)
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
geom_point(mapping = aes(color = class)) +
geom_smooth()
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
geom_point (점) & geom_smooth (매끄러운)
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
geom_point(mapping = aes(color = class)) +
geom_smooth(data = dplyr::filter(mpg, class == "subcompact"), se = FALSE)
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
geom_point (점) & geom_smooth (매끄러운)
ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = drv)) +
geom_point() +
geom_smooth(se = FALSE)
More on geometric object
Grammar of graphics (그래픽 문법)
Statistical transformations (통계적 변환)
Built-in data) diamonds data
> diamonds
# A tibble: 53,940 x 10
carat cut color clarity depth table price x y z
<dbl> <ord> <ord> <ord> <dbl> <dbl> <int> <dbl> <dbl> <dbl>
1 0.23 Ideal E SI2 61.5 55 326 3.95 3.98 2.43
2 0.21 Premium E SI1 59.8 61 326 3.89 3.84 2.31
3 0.23 Good E VS1 56.9 65 327 4.05 4.07 2.31
4 0.290 Premium I VS2 62.4 58 334 4.2 4.23 2.63
5 0.31 Good J SI2 63.3 58 335 4.34 4.35 2.75
6 0.24 Very Good J VVS2 62.8 57 336 3.94 3.96 2.48
7 0.24 Very Good I VVS1 62.3 57 336 3.95 3.98 2.47
8 0.26 Very Good H SI1 61.9 55 337 4.07 4.11 2.53
9 0.22 Fair E VS2 65.1 61 337 3.87 3.78 2.49
10 0.23 Very Good H VS1 59.4 61 338 4 4.05 2.39
# ... with 53,930 more rows
> help(diamonds)
diamonds {ggplot2} R Documentation
Prices of 50,000 round cut diamonds
Description
A dataset containing the prices and other attributes of almost 54,000 diamonds. The variables are as follows:
Usage
diamonds
Format
A data frame with 53940 rows and 10 variables:
price
price in US dollars ($326–$18,823)
carat
weight of the diamond (0.2–5.01)
cut
quality of the cut (Fair, Good, Very Good, Premium, Ideal)
color
diamond colour, from J (worst) to D (best)
clarity
a measurement of how clear the diamond is (I1 (worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (best))
x
length in mm (0–10.74)
y
width in mm (0–58.9)
z
depth in mm (0–31.8)
depth
total depth percentage = z / mean(x, y) = 2 * z / (x + y) (43–79)
table
width of top of diamond relative to widest point (43–95)
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
geom_bar
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut))
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
geom_bar
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, colour = cut))
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
geom_bar
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, fill = cut))
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
geom_bar: how is the y-axis calculated?
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut))
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
geom_bar() = stat_count()
ggplot(data = diamonds) +
stat_count(mapping = aes(x = cut))
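The counts behind the bar heights can be inspected directly. The sketch below uses `layer_data()` from ggplot2 to pull out the data frame that the stat computed, and compares it with a plain `table()` of the same variable; the object names `p` and `computed` are illustrative.

```r
library(ggplot2)

p <- ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut))

# layer_data() returns the layer's data after the statistical transformation.
computed <- layer_data(p)
computed[, c("x", "count")]

# The same counts, computed without going through a ggplot2 stat.
table(diamonds$cut)
```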
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
geom_bar(): ..prop..
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, y = ..prop.., group = 1))
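In ggplot2 3.4.0 and later, the `..prop..` notation is deprecated in favour of `after_stat()`. A hedged equivalent of the call above, assuming a recent ggplot2 version (the object name `p_prop` is illustrative):

```r
library(ggplot2)

# after_stat(prop) refers to the `prop` variable computed by stat_count().
# group = 1 makes the proportions relative to all diamonds, not to each bar.
p_prop <- ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut, y = after_stat(prop), group = 1))

print(p_prop)
```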
The data after the statistical transformation
demo <- tribble(
~cut, ~freq,
"Fair", 1610,
"Good", 4906,
"Very Good", 12082,
"Premium", 13791,
"Ideal", 21551
)
demo
# # A tibble: 5 x 2
# cut freq
# <chr> <dbl>
# 1 Fair 1610
# 2 Good 4906
# 3 Very Good 12082
# 4 Premium 13791
# 5 Ideal 21551
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
The data after the statistical transformation
ggplot(data = demo) +
geom_bar(mapping = aes(x = cut, y = freq), stat = "identity")
More on statistical transformation
Grammar of graphics (그래픽 문법)
Position adjustments (위치 조정)
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
geom_bar
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, fill = cut))
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
geom_bar
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, fill = clarity))
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
geom_bar
ggplot(data = diamonds, mapping = aes(x = cut, fill = clarity)) +
geom_bar(alpha = 1/5, position = "identity")
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
geom_bar
ggplot(data = diamonds, mapping = aes(x = cut, color = clarity)) +
geom_bar(fill = NA, position = "identity")
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
geom_bar
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, fill = clarity), position = "fill")
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
geom_bar
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, fill = clarity), position = "dodge")
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
geom_point
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy))
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
geom_point
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy), position = "jitter")
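Jitter helps here because `displ` and `hwy` are rounded, so many cars share the same position and their points are drawn on top of each other. The sketch below counts the distinct positions and then uses `geom_jitter()`, the shorthand for `geom_point(position = "jitter")`; the `width` and `height` values are arbitrary choices for illustration.

```r
library(ggplot2)

# Fewer distinct (displ, hwy) pairs than rows means some points overlap.
n_rows <- nrow(mpg)
n_positions <- nrow(unique(mpg[, c("displ", "hwy")]))
c(rows = n_rows, distinct_positions = n_positions)

# A small amount of random noise in x and y separates overlapping points.
p_jitter <- ggplot(data = mpg) +
  geom_jitter(mapping = aes(x = displ, y = hwy), width = 0.05, height = 0.3)

print(p_jitter)
```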
More on position adjustments
Grammar of graphics (그래픽 문법)
Coordinate systems (좌표계)
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
Coordinate systems (좌표계)
ggplot(data = mpg, mapping = aes(x = class, y = hwy)) +
geom_boxplot()
Plot in a 2-D plane(2차원 평면): x- & y-axis(축)
Coordinate systems (좌표계)
ggplot(data = mpg, mapping = aes(x = class, y = hwy)) +
geom_boxplot() + coord_flip()
More on coordinate systems
The layered grammar of graphics (층화된 그래픽 문법)
The Grammar of Graphics
• Wilkinson L. The grammar of graphics. 2nd ed. Springer. 2006.
library(ggplot2)
• Wickham H. ggplot2: elegant graphics for data analysis. Springer; 2016.
• https://github.com/hadley/ggplot2-book
• https://github.com/tidyverse/ggplot2
ggplot2 syntax
the grammar of graphics
= a formal system for building plots
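Putting the components from the earlier sections together, a plot can be described by a data set, a geom, aesthetic mappings, a stat, a position adjustment, a coordinate system, and a faceting scheme. A sketch with one explicit choice for each, using the diamonds data (the particular choices and the name `p_full` are illustrative, not a recommendation):

```r
library(ggplot2)

p_full <- ggplot(data = diamonds) +                 # data
  geom_bar(                                         # geom
    mapping = aes(x = cut, fill = clarity),         # aesthetic mappings
    stat = "count",                                 # statistical transformation
    position = "fill"                               # position adjustment
  ) +
  coord_flip() +                                    # coordinate system
  facet_wrap(~ color)                               # facets

print(p_full)
```

Leaving out `stat`, `position`, the coordinate function, or the facet function falls back to the defaults for the chosen geom, which is why the shorter calls in the earlier slides work.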
REFERENCES
#1. RStudio Official Documentations (Help & Cheat Sheet)
Free Webpage) https://www.rstudio.com/resources/cheatsheets/
#2. Wickham, H. and Grolemund, G., 2016. R for data science:
import, tidy, transform, visualize, and model data. O'Reilly.
Free Webpage) https://r4ds.had.co.nz/
Cf) Tidyverse syntax (www.tidyverse.org), rather than base R syntax
Cf) Hadley Wickham: Chief Scientist at RStudio. Adjunct
Professor of Statistics at the University of Auckland, Stanford
University, and Rice University

 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girlCall Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 

r for data science 2. grammar of graphics (ggplot2) clean -ref

  • 1. Grammar of graphics Outline • Motivating example • example of research question • mpg data (miles per gallon: distance driven per unit of fuel) • ggplot example • Math review • function mapping • Dimension & Coordinate system • Grammar of graphics • aesthetic mapping • facet • geometric object • Statistical transformations • Position adjustments • Coordinate systems • The layered grammar of graphics
  • 2. Understand: data exploration Transform & Visualize & Model
  • 3. First example) engine size vs. fuel usage • Research question • Do cars with big engines use more fuel than cars with small engines?
  • 4. Built-in data) mpg data Cf) Miles per Gallon (MPG): how many miles a car travels per gallon of fuel mpg #> # A tibble: 234 x 11 #> manufacturer model displ year cyl trans drv cty hwy fl class #> <chr> <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr> #> 1 audi a4 1.8 1999 4 auto(… f 18 29 p comp… #> 2 audi a4 1.8 1999 4 manua… f 21 29 p comp… #> 3 audi a4 2 2008 4 manua… f 20 31 p comp… #> 4 audi a4 2 2008 4 auto(… f 21 30 p comp… #> 5 audi a4 2.8 1999 6 auto(… f 16 26 p comp… #> 6 audi a4 2.8 1999 6 manua… f 18 26 p comp… #> # ... with 228 more rows help(mpg)
  • 5. mpg {ggplot2} R Documentation Fuel economy data from 1999 and 2008 for 38 popular models of car Description This dataset contains a subset of the fuel economy data that the EPA makes available on http://fueleconomy.gov. It contains only models which had a new release every year between 1999 and 2008 - this was used as a proxy for the popularity of the car. Usage mpg Format A data frame with 234 rows and 11 variables manufacturer model model name displ engine displacement, in litres year year of manufacture cyl number of cylinders trans type of transmission drv f = front-wheel drive, r = rear wheel drive, 4 = 4wd cty city miles per gallon hwy highway miles per gallon fl fuel type class "type" of car
  • 6. Let's plot first in a 2-D plane: x- & y-axis x = displ, y = hwy ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy))
  • 7. template ggplot(data = <DATA>) + <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))
  • 8. function mapping Dimension Coordinate system
  • 9. function f:X→Y f maps X into Y https://en.wikipedia.org/wiki/Function_(mathematics)
  • 10. Dimension Coordinate system https://en.wikipedia.org/wiki/Dimension
  • 11. Grammar of graphics aesthetic mapping
  • 12. ggplot2 • ggplot2 is based on the grammar of graphics, the idea that you can build every graph from the same components: a data set, a coordinate system, and geoms (visual marks that represent data points). • To display values, map variables in the data to visual properties of the geom (aesthetics) like size, color, and x and y locations.
  • 13. Let's plot first in a 2-D plane: x- & y-axis x = displ, y = hwy ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy))
  • 14. Let's map a 3rd dimension to color ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, color = class))
  • 15. Let's map a 3rd dimension to size ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, size = class))
  • 16. Let's map a 3rd dimension to alpha (transparency) ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, alpha = class))
  • 17. Let's map a 3rd dimension to shape ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, shape = class))
  • 18. Let's map a 3rd dimension to shape ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, shape = class))
  • 19.
  • 20. A calculated variable as a 3rd dimension ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, color = displ < 5))
  • 21.
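  • 22. Cf) Common problem: syntax error e.g.) Location of "+": it must come at the end of a line, not at the start of the next one ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy))
  • 23. Cf) Common problem: aesthetic mapping is for variables in the data e.g.) "blue" is not a variable, so inside aes() it is treated as a one-level category instead of setting the point color; to color every point blue, set color = "blue" outside aes() ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, color = "blue"))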
  • 24. Grammar of graphics facet
  • 25. Let's map a 3rd dimension to color ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, color = class))
  • 26. facet: subplots that each display one subset of the data facet_wrap(~VariableName) ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy)) + facet_wrap(~ class, nrow = 2)
  • 27. facet: subplots that each display one subset of the data facet_grid(VariableName4Row ~ VariableName4Column) ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy)) + facet_grid(drv ~ cyl)
  • 28. facet: subplots that each display one subset of the data facet_grid(VariableName4Row ~ VariableName4Column) ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy)) + facet_grid(drv ~ .)
  • 29. facet: subplots that each display one subset of the data facet_grid(VariableName4Row ~ VariableName4Column) ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy)) + facet_grid(. ~ cyl)
  • 30. facet: subplots that each display one subset of the data facet_grid(VariableName4Row ~ VariableName4Column) ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy)) + facet_grid(. ~ cyl)
  • 31. facet: subplots that each display one subset of the data facet_grid(VariableName4Row ~ VariableName4Column) -> use a variable with more unique levels ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy)) + facet_grid(trans ~ drv)
  • 32. facet: subplots that each display one subset of the data facet_grid(VariableName4Row ~ VariableName4Column) -> use a variable with more unique levels ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy)) + facet_grid(drv ~ trans)
  • 34. Grammar of graphics geometric object
  • 35. Plot in a 2-D plane: x- & y-axis geom_point (points) ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy))
  • 36. Plot in a 2-D plane: x- & y-axis geom_smooth (smoothed line) ggplot(data = mpg) + geom_smooth(mapping = aes(x = displ, y = hwy))
  • 37. Plot in a 2-D plane: x- & y-axis geom_smooth (smoothed line) ggplot(data = mpg) + geom_smooth(mapping = aes(x = displ, y = hwy, group = drv))
  • 38. Plot in a 2-D plane: x- & y-axis geom_smooth (smoothed line) ggplot(data = mpg) + geom_smooth(mapping = aes(x = displ, y = hwy, linetype = drv))
  • 39. Plot in a 2-D plane: x- & y-axis geom_smooth (smoothed line) ggplot(data = mpg) + geom_smooth(mapping = aes(x = displ, y = hwy, color = drv), show.legend = FALSE)
  • 40. Plot in a 2-D plane: x- & y-axis geom_point (points) & geom_smooth (smoothed line) ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy)) + geom_smooth(mapping = aes(x = displ, y = hwy))
  • 41. Plot in a 2-D plane: x- & y-axis geom_point (points) & geom_smooth (smoothed line) ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) + geom_point() + geom_smooth()
  • 42. Plot in a 2-D plane: x- & y-axis geom_point (points) & geom_smooth (smoothed line) ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) + geom_point(mapping = aes(color = class)) + geom_smooth()
  • 43. Plot in a 2-D plane: x- & y-axis geom_point (points) & geom_smooth (smoothed line) ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) + geom_point(mapping = aes(color = class)) + geom_smooth(data = dplyr::filter(mpg, class == "subcompact"), se = FALSE)
  • 44. Plot in a 2-D plane: x- & y-axis geom_point (points) & geom_smooth (smoothed line) ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = drv)) + geom_point() + geom_smooth(se = FALSE)
  • 46. Grammar of graphics Statistical transformations
  • 47. Built-in data) diamonds data > diamonds # A tibble: 53,940 x 10 carat cut color clarity depth table price x y z <dbl> <ord> <ord> <ord> <dbl> <dbl> <int> <dbl> <dbl> <dbl> 1 0.23 Ideal E SI2 61.5 55 326 3.95 3.98 2.43 2 0.21 Premium E SI1 59.8 61 326 3.89 3.84 2.31 3 0.23 Good E VS1 56.9 65 327 4.05 4.07 2.31 4 0.290 Premium I VS2 62.4 58 334 4.2 4.23 2.63 5 0.31 Good J SI2 63.3 58 335 4.34 4.35 2.75 6 0.24 Very Good J VVS2 62.8 57 336 3.94 3.96 2.48 7 0.24 Very Good I VVS1 62.3 57 336 3.95 3.98 2.47 8 0.26 Very Good H SI1 61.9 55 337 4.07 4.11 2.53 9 0.22 Fair E VS2 65.1 61 337 3.87 3.78 2.49 10 0.23 Very Good H VS1 59.4 61 338 4 4.05 2.39 # ... with 53,930 more rows > help(diamonds)
  • 48. diamonds {ggplot2} R Documentation Prices of 50,000 round cut diamonds Description A dataset containing the prices and other attributes of almost 54,000 diamonds. The variables are as follows: Usage diamonds Format A data frame with 53940 rows and 10 variables: price price in US dollars ($326–$18,823) carat weight of the diamond (0.2–5.01) cut quality of the cut (Fair, Good, Very Good, Premium, Ideal) color diamond colour, from J (worst) to D (best) clarity a measurement of how clear the diamond is (I1 (worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (best)) x length in mm (0–10.74) y width in mm (0–58.9) z depth in mm (0–31.8) depth total depth percentage = z / mean(x, y) = 2 * z / (x + y) (43–79) table width of top of diamond relative to widest point (43–95)
  • 49. Plot in a 2-D plane: x- & y-axis geom_bar ggplot(data = diamonds) + geom_bar(mapping = aes(x = cut))
  • 50. Plot in a 2-D plane: x- & y-axis geom_bar ggplot(data = diamonds) + geom_bar(mapping = aes(x = cut, colour = cut))
  • 51. Plot in a 2-D plane: x- & y-axis geom_bar ggplot(data = diamonds) + geom_bar(mapping = aes(x = cut, fill = cut))
  • 52. Plot in a 2-D plane: x- & y-axis geom_bar: how is the y-axis calculated? ggplot(data = diamonds) + geom_bar(mapping = aes(x = cut))
  • 53. Plot in a 2-D plane: x- & y-axis geom_bar: how is the y-axis calculated?
  • 54. Plot in a 2-D plane: x- & y-axis geom_bar() = stat_count() ggplot(data = diamonds) + stat_count(mapping = aes(x = cut))
  • 55. Plot in a 2-D plane: x- & y-axis geom_bar(): ..prop.. ggplot(data = diamonds) + geom_bar(mapping = aes(x = cut, y = ..prop.., group = 1))
  • 56. The data after the statistical transformation demo <- tribble( ~cut, ~freq, "Fair", 1610, "Good", 4906, "Very Good", 12082, "Premium", 13791, "Ideal", 21551 ) demo # # A tibble: 5 x 2 # cut freq # <chr> <dbl> # 1 Fair 1610 # 2 Good 4906 # 3 Very Good 12082 # 4 Premium 13791 # 5 Ideal 21551
  • 57. Plot in a 2-D plane: x- & y-axis The data after the statistical transformation ggplot(data = demo) + geom_bar(mapping = aes(x = cut, y = freq), stat = "identity")
  • 58. More on statistical transformation
  • 59. Grammar of graphics Position adjustments
  • 60. Plot in a 2-D plane: x- & y-axis geom_bar ggplot(data = diamonds) + geom_bar(mapping = aes(x = cut, fill = cut))
  • 61. Plot in a 2-D plane: x- & y-axis geom_bar ggplot(data = diamonds) + geom_bar(mapping = aes(x = cut, fill = clarity))
  • 62. Plot in a 2-D plane: x- & y-axis geom_bar ggplot(data = diamonds, mapping = aes(x = cut, fill = clarity)) + geom_bar(alpha = 1/5, position = "identity")
  • 63. Plot in a 2-D plane: x- & y-axis geom_bar ggplot(data = diamonds, mapping = aes(x = cut, color = clarity)) + geom_bar(fill = NA, position = "identity")
  • 64. Plot in a 2-D plane: x- & y-axis geom_bar ggplot(data = diamonds) + geom_bar(mapping = aes(x = cut, fill = clarity), position = "fill")
  • 65. Plot in a 2-D plane: x- & y-axis geom_bar ggplot(data = diamonds) + geom_bar(mapping = aes(x = cut, fill = clarity), position = "dodge")
  • 66. Plot in a 2-D plane: x- & y-axis geom_point ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy))
  • 67. Plot in a 2-D plane: x- & y-axis geom_point ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy), position = "jitter")
  • 68. More on statistical transformation
  • 69. Grammar of graphics Coordinate systems
  • 70. Plot in a 2-D plane: x- & y-axis Coordinate systems ggplot(data = mpg, mapping = aes(x = class, y = hwy)) + geom_boxplot()
  • 71. Plot in a 2-D plane: x- & y-axis Coordinate systems ggplot(data = mpg, mapping = aes(x = class, y = hwy)) + geom_boxplot() + coord_flip()
  • 73. The layered grammar of graphics
  • 74. The Grammar of Graphics • Wilkinson L. The grammar of graphics. 2ed. Springer. 2006.
  • 75. library(ggplot2) • Wickham H. ggplot2: elegant graphics for data analysis. Springer; 2016. • https://github.com/hadley/ggplot2-book • https://github.com/tidyverse/ggplot2
  • 77. the grammar of graphics = a formal system for building plots
  • 78. the grammar of graphics = a formal system for building plots
  • 79. the grammar of graphics = a formal system for building plots
  • 80. REFERENCES #1. RStudio Official Documentation (Help & Cheat Sheets) Free Webpage) https://www.rstudio.com/resources/cheatsheets/ #2. Wickham, H. and Grolemund, G., 2016. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O'Reilly. Free Webpage) https://r4ds.had.co.nz/ Cf) Tidyverse syntax (www.tidyverse.org), rather than base R syntax Cf) Hadley Wickham: Chief Scientist at RStudio. Adjunct Professor of Statistics at the University of Auckland, Stanford University, and Rice University
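
The layered grammar that slides 77-79 describe as "a formal system for building plots" can be sketched as one plot that fills every slot at once: data, geom, mappings, stat, position, coordinate system, and facet. The particular combination below (dodged bars of cut by clarity, flipped, faceted by color) is an illustrative assumption and not a plot from the deck; it uses only the built-in diamonds data and ggplot2 functions that appear on earlier slides.

```r
library(ggplot2)

# One plot that names each component of the layered grammar explicitly.
p <- ggplot(data = diamonds) +              # data
  geom_bar(                                 # geom
    mapping = aes(x = cut, fill = clarity), # aesthetic mappings
    stat = "count",                         # stat (geom_bar's default, spelled out)
    position = "dodge"                      # position adjustment
  ) +
  coord_flip() +                            # coordinate system
  facet_wrap(~ color, nrow = 2)             # facets (illustrative choice of variable)

print(p)
```

Omitting stat, position, coord_flip() or facet_wrap() falls back to the defaults (count, stack, Cartesian coordinates, a single panel), which is why the shorter calls on the earlier slides also work.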