Quantitative research methods
Data analysis workflow
Statistical Software
Installing R and RStudio
Getting help
Introduction into R
Part 1A
Richard L. Zijdeman
2016-06-15
Richard L. Zijdeman Introduction into R
1 Quantitative research methods
2 Data analysis workflow
3 Statistical Software
4 Installing R and RStudio
5 Getting help
Quantitative research methods
Why
To answer descriptive and explanatory questions on populations
Workflow: PTE
problem (research question)
theory (hypothesis)
empirical test . . . with loops between T-E and P-T-E
Research Questions
descriptive (to what extent. . . )
comparative (comparing two entities)
trend (comparison over time)
explanatory (focus on mechanism at hand)
Theory
deductive reasoning
explanans
general mechanism
condition
explanandum (hypothesis)
Empirical test
sample vs. population
random vs. stratified samples
testing technique, e.g.:
T-test, correlation, regression
Software required for faster analysis
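As a rough illustration of the testing techniques named above, the sketch below runs a t-test, a correlation and a regression on R's built-in mtcars data. The dataset and the variables (mpg, hp, am) are assumptions chosen for this example only.

```r
# Illustrative only: mtcars ships with base R
data(mtcars)

# t-test: does mean fuel use differ between automatic (0) and manual (1) cars?
tt <- t.test(mpg ~ am, data = mtcars)

# correlation between horsepower and miles per gallon
r <- cor(mtcars$hp, mtcars$mpg)

# linear regression of miles per gallon on horsepower
fit <- lm(mpg ~ hp, data = mtcars)

print(tt$p.value)
print(r)
print(coef(fit))
```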
Data analysis workflow
Empirical testing has its own workflow
Grolemund & Wickham, 2016, Creative Commons
Attribution-NonCommercial-NoDerivs 4.0.
Statistical Software
The dangers of analysing with spreadsheets
(e.g. MS Excel)
tempting to input and clean data and analyse in the same sheet
difficult to track cleaning rules
defaults mess up your data (e.g. 01200 -> 1200)
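The leading-zero problem is not unique to spreadsheets: R also guesses column types by default. The difference is that a script lets you state the rule explicitly. The sketch below uses a made-up two-row CSV (the column names id and code are assumptions) to show both behaviours.

```r
# A tiny CSV held in a string; the 'code' column has leading zeros
csv_text <- "id,code\n1,01200\n2,00450"

# Default: R guesses the column type, so "01200" becomes the number 1200
guessed <- read.csv(text = csv_text)

# Explicit rule: read 'code' as text so the leading zeros survive
kept <- read.csv(text = csv_text,
                 colClasses = c(id = "integer", code = "character"))

print(guessed$code)
print(kept$code)
```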
Why use syntax (scripting)
Efficiency (really)
Quality (error checking)
Replicability
Communication
R
R is open source, which is good and bad:
anybody can contribute (check, improve, create code)
free of charge
but: R depends on collective action
cannot ‘demand’ support
sprawl of packages
RStudio
browser for R
provides easy access to:
scripts
data
plots
manual
Installing R and RStudio
Download R
Instructions via http://www.r-project.org
Choose a CRAN mirror
http://cran.r-project.org/mirrors.html
close, but active too!
Romania hasn’t gone (yet!)
Click on ‘Download R for Windows’
Follow usual installation procedure
Double click on R
You should now have a working session!
Close the session, do not save workspace image
Packages and libraries
base R (core product)
additional packages
CRAN repository
spread through ‘mirrors’
choose a local, but active mirror
Github
packages not on CRAN
development versions of CRAN libraries
RStudio
RStudio is found on http://www.rstudio.com
Download the version for your OS (e.g. windows)
http://www.rstudio.com/products/rstudio/download/
Install by double clicking on the downloaded file
Start RStudio by double clicking on the icon
You do not need to start R, before starting RStudio
Getting help
Built-in help: “?”
?[function] / ?[package]
e.g. “?plot” or “?graphics”
check the index for user guides and vignettes
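A few ways to reach the built-in help from the console, as a minimal sketch. All functions used are part of a standard R installation; which topics and vignettes are actually available depends on your setup.

```r
help("plot")                  # same as ?plot: help page for a function
help(package = "graphics")    # index of all help topics in a package
apropos("^read\\.")           # functions whose name starts with "read."
example("mean")               # run the examples from a help page
vignette(package = "grid")    # list the vignettes that come with a package
```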
CRAN website
Manuals
R FAQ
R Journal
Online communities
Stack Overflow
Instance of Stack Exchange
Reputation-based Q&A
Specific lists for packages, e.g.:
ggplot2
R-sig-mixed-models
Asking a question, getting an answer
Search the web: others must have had this problem too
If you raise a question:
be polite
be concise
short background
reproducible example
describe your efforts so far
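What a short, reproducible example could look like, as a sketch. The data frame and the problem (a mean that returns NA) are invented for illustration.

```r
# A small, self-contained example that others can paste into their session
df <- data.frame(x = c(1, 2, NA), y = c("a", "b", "c"))

# dput() prints R code that recreates the object exactly
dput(df)

# State what you tried and what you expected
mean(df$x)                 # gives NA, but a number was expected
mean(df$x, na.rm = TRUE)   # what you were after

# Report your R version and loaded packages
sessionInfo()
```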
Introducing RStudio and R
Introducing base R
Data visualization using ggplot2
Introduction into R
Part 1B
Richard L. Zijdeman
2016-06-15
1 Introducing RStudio and R
2 Introducing base R
3 Data visualization using ggplot2
Introducing RStudio and R
RStudio
RStudio is sort of a ‘viewer’ on R
helps to organize input and output:
editor (upper left)
console (lower left)
environment (upper right)
output (lower right)
R script
series of commands to manipulate data
always save your script, NEVER change your data
original data + script = reproducible research
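A minimal sketch of this idea: the script reads the raw file, applies documented cleaning rules and writes the result to a new file. The file names, columns and cleaning rules are hypothetical, and the raw file is created here only so the example runs on its own.

```r
# Hypothetical file names, placed in a temporary folder for this sketch
raw_file   <- file.path(tempdir(), "raw_data.csv")
clean_file <- file.path(tempdir(), "clean_data.csv")

# Stand-in for the original data you received
write.csv(data.frame(name = c(" Anna", "BOB "), age = c(34, -1)),
          raw_file, row.names = FALSE)

# The script documents every cleaning rule
raw <- read.csv(raw_file, stringsAsFactors = FALSE)
clean <- raw
clean$name <- tolower(trimws(clean$name))   # rule 1: tidy up names
clean$age[clean$age < 0] <- NA              # rule 2: negative ages are missing

# Write the result to a NEW file; the raw file stays as it was
write.csv(clean, clean_file, row.names = FALSE)
```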
Packages
Build your R system using packages
‘Base R’ is basic. Add packages for your specific needs
Packages are found on servers, called ‘mirrors’
Make sure to select a mirror first
https://cran.r-project.org/mirrors.html
## To set the mirror for this session, type:
options(repos = c(CRAN = "http://cran.xl-mirror.nl"))
## replace http://... with your favorite mirror
## (add the line to your .Rprofile to make it permanent)
Packages for book (see 1.4.2)
pkgs <- c(
"broom", "dplyr", "ggplot2", "jpeg", "jsonlite",
"knitr", "Lahman", "microbenchmark", "png", "pryr",
"purrr", "rcorpora", "readr", "stringr", "tibble",
"tidyr"
)
install.packages(pkgs)
R Session
contains scripts, data, functions
can be saved as a ‘workspace image’
prefer not to:
sessions are usually cluttered
only useful if running script takes time
Suggested tweak:
Options: uncheck “Restore .RData into workspace at startup”
Options: Save workspace to .RData on exit, select ‘never’
Introducing base R
base R: assignment and print()
‘attach’ values to an object (e.g. a variable)
x <- 5
y <- 4
z <- x * y
print(z)
## [1] 20
base R: assignment and print() (II)
Try and imagine the potential of assignment
x <- c(4, 3, 2, 1, 0, 27, 34, 35)
# c for concatenate values
y <- -1
z <- x*y
print(z)
## [1] -4 -3 -2 -1 0 -27 -34 -35
base R: data.frame
basically a table
contains columns (variables)
contains rows (cases)
“flat table” in Kees’ terminology
my.df <- data.frame(x,z)
str(my.df) # show STRucture
## 'data.frame': 8 obs. of 2 variables:
## $ x: num 4 3 2 1 0 27 34 35
## $ z: num -4 -3 -2 -1 0 -27 -34 -35
There’s much more, but let’s keep that for tomorrow
Data visualization using ggplot2
Visualizing your data
Not just for analyses!
Data quality
representativeness
missing data
plot() in base R
library(help = "datasets") # all datasets in R
?mtcars # show help on mtcars dataset
df <- mtcars # mtcars is a dataset, not a function
str(mtcars) # display STRucture of an object
plot(mtcars$hp, mtcars$mpg)
plot(df)
plot() is like . . .
plot() is like LaTeX:
Forge it any way you want
Heterogeneous approach though
Takes quite some time to get it right
ggplot() as alternative
ggplot is but one of many graph packages
ggplot is nice because of:
similar approach to various types of graphs
easy build up for basic graphs
can get quite complex too
(but cannot do it all)
ggplot() and the canvas metaphor
ggplot() consists of two elements
canvas
(multiple) layers of paint
mapping and geom layers
ggplot() consists of two elements
canvas:
data
mapping (aesthetic)
(multiple) layers of paint
geom layers
ggplot(data = <DATASET>,
mapping = aes(x = <X-VAR>, y = <Y-VAR>)) +
geom_<TYPE>()
our first ggplot
install.packages("ggplot2")
library(ggplot2)
df <- mtcars
ggplot(data = df, aes(x = hp, y = mpg)) +
geom_point()
geom_ features
? geom_point
install.packages("ggplot2")
library(ggplot2)
df <- mtcars
ggplot(data = df, aes(x = hp, y = mpg)) +
geom_point(fill = "white", colour = "blue",
shape = 21, size = 4)
Adding characteristics to your plot
Add variables to explain a pattern
ggplot(data = df, aes(x = hp, y = mpg)) +
geom_point(aes(colour = wt), size = 4)
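NB: notice the difference?
ggplot(data = df, aes(x = hp, y = mpg)) +
geom_point(aes(colour = wt, size = 4))
# size inside aes() is mapped like a variable (and gets a legend),
# it is no longer a fixed point size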
Multiple geom’s
Add variables to explain a pattern
ggplot(data = df, aes(x = hp, y = mpg)) +
geom_point(aes(colour = as.factor(am)),
size = 6) + # increase size bc overlap
geom_point(aes(shape = as.factor(vs)),
size = 3)
# vs: whether the engine is V-shaped (0) or straight (1)
Adding facets
Facets help reduce complexity
ggplot(data = df, aes(x = hp, y = mpg)) +
geom_point(aes(colour = as.factor(am)),
size = 4) +
facet_wrap( ~ vs)
Things to consider with geom(_point)
fill only works where shape actually can be filled
consider order of geoms
mind overlap:
decrease size
use alpha
use ‘open’ shapes
geom_jitter
ggplot and titles
Various ways to add titles to axes and stuff
Can get quite complex
Here’s the basics
ggplot(data = df, aes(x = hp, y = mpg)) +
geom_point() +
labs(title = "Nice graph", x = "Horse Power",
y = "Miles per Gallon" )
Themes and size
ggplot(data = df, aes(x = hp, y = mpg)) +
geom_point() +
labs(title = "Nice graph", x = "Horse Power",
y = "Miles per Gallon" ) +
theme_bw(base_size = 16)
Much more to learn
not just about ggplot()
axes
legend (guides)
geoms
also about dataviz in general
general do’s and don’ts
which problem fits which graph
it’s a science! (Graph theory)
Data wrangling
bit about NA
Introduction into R
Part 2A, 2B
Richard L. Zijdeman
2016-06-16
1 Data wrangling
2 bit about NA
Data wrangling
Grolemund & Wickham, 2016, Creative Commons
Attribution-NonCommercial-NoDerivs 4.0.
dplyr package
# install.packages("dplyr") # 1 time only
library(dplyr)
install.packages("nycflights13")
library(nycflights13)
print(flights)
tibble or data_frame vs data.frame
str(mtcars)
class(mtcars)
mtcars_tbl <- as_tibble(mtcars) # as_data_frame() is deprecated
str(mtcars_tbl)
class(mtcars_tbl)
filter
filter(mtcars, am == 1, vs == 0)
some.cars <- filter(mtcars, am == 1, vs == 0)
some.cars
(some.cars2 <- filter(mtcars, am == 1, vs == 0))
filter and using or
filter(mtcars, gear == 3 | gear == 4)
# !! not like this:
filter(mtcars, gear == 3 | 4)
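Why the second form goes wrong can be checked in base R, without dplyr. In the sketch below, the bare 4 is treated as TRUE, so the condition holds for every row. The %in% alternative is an addition for illustration, not something the slide shows.

```r
gear <- mtcars$gear

# Wrong: 4 on its own counts as TRUE, so the condition is TRUE for every car
wrong <- gear == 3 | 4
sum(wrong)

# Right: spell out both comparisons ...
right <- gear == 3 | gear == 4

# ... or use %in% for a set of values
also_right <- gear %in% c(3, 4)

sum(right)
identical(right, also_right)
```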
bit about NA
Arrange
arrange(flights, dep_time)
arrange(flights, year, month, day) # ascending order
arrange(flights, desc(day))
# NB: missing values come at end
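A short base R sketch of how NA behaves, using a made-up vector. By default, base R's order() also puts missing values last, in ascending as well as descending order.

```r
x <- c(3, NA, 1, 2)

NA == NA                         # NA: two unknowns cannot be compared
is.na(x)                         # the way to test for missing values
mean(x)                          # NA: one unknown makes the mean unknown
mean(x, na.rm = TRUE)            # drop missing values first
x[order(x)]                      # ascending, NA at the end
x[order(x, decreasing = TRUE)]   # descending, NA still at the end
```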
Select
df <- select(flights, year, month, day)
names(flights)
df <- select(flights, tailnum:dest)
df <- select(flights, -(tailnum:dest))
df
df <- select(flights, starts_with("arr_"))
df <- select(flights, ends_with("e"))
df <- select(flights, contains("a"))
rename, mutate and transmute
df <- rename(flights, Y_ear = year)
df <- mutate(flights, year1 = year+1)
select(df, year, year1)
df <- mutate(flights, year1 = year + 1, year2 = year1+1)
select(df, contains("year"))
df <- transmute(flights, year1 = year + 1, year2 = year1+1)
# only maintains the newly created variables
group_by
by_day <- group_by(flights, year, month, day)
summarise(by_day)
cars <- mtcars
cars <- as_tibble(mtcars)
summarise(cars, mean_hp = mean(hp, na.rm = TRUE))
mean(cars$hp, na.rm = TRUE)
the pipe: %>%
cars_grp <- group_by(cars, carb)
class(cars)
class(cars_grp)
summarise(cars_grp, mmpg = mean(mpg, na.rm = TRUE))
cars_grp_sum <- summarise(cars_grp,
mmpg = mean(mpg, na.rm = TRUE),
count = n())
cars_grp_sum
plot <- ggplot(cars_grp_sum,
aes(x = carb, y = mmpg,
label = carb)) +
geom_point(aes(size = count)) +
geom_text(colour = "cyan")
plot
more pipe, adding a filter
cars_grp_sum3 <- cars %>%
group_by(carb) %>%
summarise(mmpg = mean(mpg, na.rm = TRUE),
count = n()) %>%
filter(count > 3)
ggplot(cars_grp_sum3, aes(x = carb, y = mmpg, label = carb)) +
geom_point(aes(size = count)) +
geom_text(colour = "cyan") +
labs(title = "figure with %>% and count > 3")
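To see what the pipe does, it helps to write the same steps once as nested calls and once piped. The sketch below assumes dplyr is installed. The object names nested and piped are made up for this comparison.

```r
library(dplyr)

cars <- as_tibble(mtcars)

# Nested calls: read from the inside out
nested <- filter(
  summarise(group_by(cars, carb),
            mmpg = mean(mpg, na.rm = TRUE),
            count = n()),
  count > 3)

# Piped: read from top to bottom; x %>% f(y) is the same as f(x, y)
piped <- cars %>%
  group_by(carb) %>%
  summarise(mmpg = mean(mpg, na.rm = TRUE),
            count = n()) %>%
  filter(count > 3)

all.equal(nested, piped)
```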
Session management
Basic data manipulation
Introduction into R
Part 3A
Richard L. Zijdeman
2016-06-17
1 Session management
2 Basic data manipulation
Session management
Maintaining your workspace
Grolemund & Wickham, 2016, Creative Commons
Attribution-NonCommercial-NoDerivs 4.0.
Setting up a session
clear your Environment
check sessionInfo() for loaded packages
detach obsolete packages under ‘other attached packages’
set your working directory (“\\” on Windows and “/” on Linux/Mac)
load libraries (install new ones)
load your data
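On path separators, a minimal sketch: file.path() builds paths with "/", which R accepts on every operating system, so scripts stay portable. The Windows folder below is hypothetical.

```r
# file.path() joins folder and file names with "/"
csv_path <- file.path("Datafiles_HSN", "HSN_marriages.csv")
print(csv_path)

# On Windows, a backslash inside a string must be doubled ...
win_path <- "C:\\Users\\me\\data"   # hypothetical folder
# ... but forward slashes work there too
same_path <- "C:/Users/me/data"

getwd()   # the current working directory
```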
Example session setup
rm(list = ls())
sessionInfo() # check for other attached packages
detach("package:nycflights13", unload = TRUE)
setwd("/Users/RichardZ/Dropbox/Summer school 2016/Richard Zijdeman/")
getwd() # to see whether you're in the right directory
dir() # shows what's in your directory
Loading your data
read.table() (generic function)
read.csv()
library(foreign) # e.g. SPSS and Stata
library(readxl) # fast excel-package
Reading in data
Different functions for different files:
Base R: read.table() (read.csv())
foreign package: read.spss(), read.dta(), read.dbf()
readxl
alternative packages:
xlsx (Java required)
gdata (perl-based)
openxlsx package: read.xlsx()
read.csv()
file: your file, including directory
header: variable names or not?
sep: separator
read.csv default: “,”
read.csv2 default: “;”
skip: number of rows to skip
nrows: total number of rows to read
stringsAsFactors
encoding (e.g. “latin1” or “UTF-8”)
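The arguments above can be tried out on a small file written on the spot. The file contents and column names below are invented for illustration.

```r
# Write a small semicolon-separated file with a junk first line
tmp <- tempfile(fileext = ".csv")
writeLines(c("junk line to skip",
             "name;age",
             "Anna;34",
             "Bob;41",
             "Carla;29"),
           tmp)

people <- read.csv(tmp,
                   header = TRUE,             # first line read holds the names
                   sep = ";",                 # read.csv2() has this as default
                   skip = 1,                  # skip the junk line
                   nrows = 2,                 # read only two data rows
                   stringsAsFactors = FALSE,  # keep text as character
                   encoding = "UTF-8")
str(people)
```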
read_excel from readxl package
path: your file, including directory
sheet: name or number of sheet
col_names: col names in 1st row?
col_types: specify type
na: what’s the sign for missing values
skip: how many rows to skip before data starts
Example session loading your csv data
# setwd() to set your working directory
hmar100 <- read.csv("./Datafiles_HSN/HSN_marriages.csv",
stringsAsFactors = FALSE,
encoding = "latin1",
header = TRUE,
nrows = 100) # just first 100 rows
Example session loading your excel data
# setwd() to set your working directory
install.packages("readxl")
library("readxl")
hmar <- read_excel("./Datafiles_HSN/HSN_marriages_awful.xls
col_names = TRUE,
skip = 3) # empty lines not counted!!!
Basic data manipulation
Change case of text
tolower()
toupper()
tolower("CaN we pleASe jUSt have LOWER cases?")
names(hmar) <- tolower(names(hmar))
length()
Used to count how many instances there are
length(names(hmar))
# shows number of variables in hmar
Basic statistical techniques
Introduction into R
Part 3B
Richard L. Zijdeman
2016-06-17
1 Basic statistical techniques
Basic statistical techniques
Box and whisker plot
Distribution of data
Median: 50% of the cases above and below
Box: 1st and 3rd quartile
Interquartile range (IQR): Q3-Q1
Outliers (Tukey, 1977):
x < Q1 - 1.5*IQR
x > Q3 + 1.5*IQR
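The outlier rule above can be worked out by hand in base R. The sketch below applies it to horsepower in the built-in mtcars data, a dataset chosen here for illustration. It uses quantile() with its default settings.

```r
hp <- mtcars$hp

q1  <- quantile(hp, 0.25, names = FALSE)   # 1st quartile
q3  <- quantile(hp, 0.75, names = FALSE)   # 3rd quartile
iqr <- q3 - q1                             # same as IQR(hp)

lower <- q1 - 1.5 * iqr                    # lower fence
upper <- q3 + 1.5 * iqr                    # upper fence

# Values outside the fences are flagged as outliers
outliers <- hp[hp < lower | hp > upper]

print(c(median = median(hp), Q1 = q1, Q3 = q3, IQR = iqr))
print(outliers)
```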
p <- ggplot(hmar, aes(sign_groom, age_groom))
p + geom_boxplot()
hmar <- mutate(hmar, sign_groomD = (sign_groom == "h" & !(i
p <- ggplot(hmar, aes(sign_groomD, age_groom))
p + geom_boxplot()
hmar <- mutate(hmar, sign_groomD = (sign_groom == "h" & !(i
p <- ggplot(hmar, aes(sign_groomD, age_groom))
p + geom_boxplot() + geom_jitter(shape = 24, width = 0.2)
library(stats)
var.test(age_groom ~ sign_groomD, data = hmar)
t.test(age_groom ~ sign_groomD, data = hmar)
# NB: always check for variances
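One way to let the variance check feed into the t-test, sketched on the built-in mtcars data (car weight by transmission type) because the HSN file is not available here. The 0.05 cut-off is a conventional choice, used as an assumption. Note that t.test() runs Welch's test, which does not assume equal variances, unless var.equal = TRUE is set.

```r
# Step 1: F test for equal variances between the two groups
vt <- var.test(wt ~ am, data = mtcars)

# Step 2: pick the t-test variant accordingly
equal_var <- vt$p.value > 0.05   # assumed cut-off
tt <- t.test(wt ~ am, data = mtcars, var.equal = equal_var)

print(vt$p.value)
print(tt$method)    # which variant of the t-test was run
print(tt$p.value)
```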
A small PTE project
Look at the variables in the HSN files
Think of a research question
Provide a general mechanism and hypothesis
Plot your results
Richard L. Zijdeman Introduction into R

More Related Content

What's hot

R programming for data science
R programming for data scienceR programming for data science
R programming for data scienceSovello Hildebrand
 
A short tutorial on r
A short tutorial on rA short tutorial on r
A short tutorial on rAshraf Uddin
 
1 R Tutorial Introduction
1 R Tutorial Introduction1 R Tutorial Introduction
1 R Tutorial IntroductionSakthi Dasans
 
R programming groundup-basic-section-i
R programming groundup-basic-section-iR programming groundup-basic-section-i
R programming groundup-basic-section-iDr. Awase Khirni Syed
 
R language tutorial
R language tutorialR language tutorial
R language tutorialDavid Chiu
 
1.3 introduction to R language, importing dataset in r, data exploration in r
1.3 introduction to R language, importing dataset in r, data exploration in r1.3 introduction to R language, importing dataset in r, data exploration in r
1.3 introduction to R language, importing dataset in r, data exploration in rSimple Research
 
R programming slides
R  programming slidesR  programming slides
R programming slidesPankaj Saini
 
R programming presentation
R programming presentationR programming presentation
R programming presentationAkshat Sharma
 
Scalable Data Analysis in R -- Lee Edlefsen
Scalable Data Analysis in R -- Lee EdlefsenScalable Data Analysis in R -- Lee Edlefsen
Scalable Data Analysis in R -- Lee EdlefsenRevolution Analytics
 
A Workshop on R
A Workshop on RA Workshop on R
A Workshop on RAjay Ohri
 
2 it unit-1 start learning r
2 it   unit-1 start learning r2 it   unit-1 start learning r
2 it unit-1 start learning rNetaji Gandi
 
Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
Is Revolution R Enterprise Faster than SAS? Benchmarking Results RevealedIs Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
Is Revolution R Enterprise Faster than SAS? Benchmarking Results RevealedRevolution Analytics
 
Big Data Analytics with R
Big Data Analytics with RBig Data Analytics with R
Big Data Analytics with RGreat Wide Open
 
R Programming Overview
R Programming Overview R Programming Overview
R Programming Overview dlamb3244
 
Why R? A Brief Introduction to the Open Source Statistics Platform
Why R? A Brief Introduction to the Open Source Statistics PlatformWhy R? A Brief Introduction to the Open Source Statistics Platform
Why R? A Brief Introduction to the Open Source Statistics PlatformSyracuse University
 

What's hot (20)

R for data analytics
R for data analyticsR for data analytics
R for data analytics
 
R programming for data science
R programming for data scienceR programming for data science
R programming for data science
 
A short tutorial on r
A short tutorial on rA short tutorial on r
A short tutorial on r
 
1 R Tutorial Introduction
1 R Tutorial Introduction1 R Tutorial Introduction
1 R Tutorial Introduction
 
R programming groundup-basic-section-i
R programming groundup-basic-section-iR programming groundup-basic-section-i
R programming groundup-basic-section-i
 
R tutorial
R tutorialR tutorial
R tutorial
 
R language tutorial
R language tutorialR language tutorial
R language tutorial
 
1.3 introduction to R language, importing dataset in r, data exploration in r
1.3 introduction to R language, importing dataset in r, data exploration in r1.3 introduction to R language, importing dataset in r, data exploration in r
1.3 introduction to R language, importing dataset in r, data exploration in r
 
R programming
R programmingR programming
R programming
 
R programming slides
R  programming slidesR  programming slides
R programming slides
 
R programming presentation
R programming presentationR programming presentation
R programming presentation
 
Scalable Data Analysis in R -- Lee Edlefsen
Scalable Data Analysis in R -- Lee EdlefsenScalable Data Analysis in R -- Lee Edlefsen
Scalable Data Analysis in R -- Lee Edlefsen
 
A Workshop on R
A Workshop on RA Workshop on R
A Workshop on R
 
2 it unit-1 start learning r
2 it   unit-1 start learning r2 it   unit-1 start learning r
2 it unit-1 start learning r
 
R programming
R programmingR programming
R programming
 
Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
Is Revolution R Enterprise Faster than SAS? Benchmarking Results RevealedIs Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
 
Big Data Analytics with R
Big Data Analytics with RBig Data Analytics with R
Big Data Analytics with R
 
R Programming Overview
R Programming Overview R Programming Overview
R Programming Overview
 
Big data analytics using R
Big data analytics using RBig data analytics using R
Big data analytics using R
 
Why R? A Brief Introduction to the Open Source Statistics Platform
Why R? A Brief Introduction to the Open Source Statistics PlatformWhy R? A Brief Introduction to the Open Source Statistics Platform
Why R? A Brief Introduction to the Open Source Statistics Platform
 

Similar to Basic introduction into R

How to get started with R programming
How to get started with R programmingHow to get started with R programming
How to get started with R programmingRamon Salazar
 
Best corporate-r-programming-training-in-mumbai
Best corporate-r-programming-training-in-mumbaiBest corporate-r-programming-training-in-mumbai
Best corporate-r-programming-training-in-mumbaiUnmesh Baile
 
Introduction to R and R Studio
Introduction to R and R StudioIntroduction to R and R Studio
Introduction to R and R StudioRupak Roy
 
Unit1_Introduction to R.pdf
Unit1_Introduction to R.pdfUnit1_Introduction to R.pdf
Unit1_Introduction to R.pdfMDDidarulAlam15
 
R and Python, A Code Demo
R and Python, A Code DemoR and Python, A Code Demo
R and Python, A Code DemoVineet Jaiswal
 
R programming for psychometrics
R programming for psychometricsR programming for psychometrics
R programming for psychometricsDiane Talley
 
An introduction to R is a document useful
An introduction to R is a document usefulAn introduction to R is a document useful
An introduction to R is a document usefulssuser3c3f88
 
Up your data game: How to use R to wrangle, analyze, and visualize data faste...
Up your data game: How to use R to wrangle, analyze, and visualize data faste...Up your data game: How to use R to wrangle, analyze, and visualize data faste...
Up your data game: How to use R to wrangle, analyze, and visualize data faste...Charles Guedenet
 
A Gentle Introduction to Tidy Statistics in R.pdf
A Gentle Introduction to Tidy Statistics in R.pdfA Gentle Introduction to Tidy Statistics in R.pdf
A Gentle Introduction to Tidy Statistics in R.pdfVickyAlers
 
Introduction to R - Lab slides for UGA course FANR 6750
Introduction to R - Lab slides for UGA course FANR 6750Introduction to R - Lab slides for UGA course FANR 6750
Introduction to R - Lab slides for UGA course FANR 6750richardchandler
 
Introduction to the R Statistical Computing Environment
Introduction to the R Statistical Computing EnvironmentIntroduction to the R Statistical Computing Environment
Introduction to the R Statistical Computing Environmentizahn
 
Functional Programming in R
Functional Programming in RFunctional Programming in R
Functional Programming in RDavid Springate
 

Basic introduction into R

  • 1. Quantitative research methods Data analysis workflow Statistical Software Installing R and RStudio Getting help Introduction into R Part 1A Richard L. Zijdeman 2016-06-15 Richard L. Zijdeman Introduction into R
  • 2. 1 Quantitative research methods, 2 Data analysis workflow, 3 Statistical Software, 4 Installing R and RStudio, 5 Getting help
  • 3. Quantitative research methods
  • 4. Why
      To answer descriptive and explanatory questions on populations
  • 5. Workflow: PTE
      problem (research question)
      theory (hypothesis)
      empirical test
      . . . with loops between T-E and P-T-E
  • 6. Research Questions
      descriptive (to what extent . . . )
      comparative (comparing two entities)
      trend (comparison over time)
      explanatory (focus on mechanism at hand)
  • 7. Theory
      deductive reasoning
      explanans
      general mechanism
      condition
      explanandum (hypothesis)
  • 8. Empirical test
      sample vs. population
      random vs. stratified samples
      testing technique, e.g.: T-test, correlation, regression
      Software required for faster analysis
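The three testing techniques named on the slide can be tried directly in R. The following is only a minimal sketch on the built-in mtcars data; the choice of variables is an assumption made for illustration and does not come from the slides.

```r
# Sketch on the built-in mtcars data; the variable choices are illustrative.
data(mtcars)

# T-test: do automatic (am == 0) and manual (am == 1) cars differ in mpg?
tt <- t.test(mpg ~ am, data = mtcars)

# Correlation between horsepower and fuel efficiency
r <- cor(mtcars$hp, mtcars$mpg)

# Linear regression of mpg on horsepower and weight
fit <- lm(mpg ~ hp + wt, data = mtcars)

print(tt$p.value)
print(r)
print(coef(fit))
```

Each call returns an object that can be inspected further, for example with summary(fit).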
  • 9. Data analysis workflow
  • 10. Empirical testing has its own workflow
      Grolemund & Wickham, 2016, Creative Commons Attribution-NonCommercial-NoDerivs 4.0.
  • 11. Statistical Software
  • 12. The dangers of analysing with spreadsheets (e.g. MS Excel)
      tempting to input and clean data and analyse in the same sheet
      difficult to track cleaning rules
      defaults mess up your data (e.g. 01200 -> 1200)
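The leading-zero problem is not unique to spreadsheets: R also guesses column types on import unless told otherwise. A small sketch, assuming a hypothetical two-row CSV with a code column, shows how declaring the type in the script keeps the codes intact and documents the rule.

```r
# Hypothetical two-row file with codes that start with a zero
csv_text <- "code,value\n01200,10\n00450,20"

# Default: the column type is guessed to be numeric, so leading zeros are lost
guessed <- read.csv(text = csv_text)

# Explicit: declare the column as character to keep the codes intact
kept <- read.csv(text = csv_text, colClasses = c(code = "character"))

print(guessed$code)
print(kept$code)
```

Because the rule lives in the script, it can be tracked and rerun, which is the point the slide makes against cleaning data by hand in a sheet.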
  • 13. Why use syntax (scripting)
      Efficiency (really)
      Quality (error checking)
      Replicability
      Communication
  • 14. R
      R is open source, which is good and bad:
      anybody can contribute (check, improve, create code)
      free of charge
      but:
      R depends on collective action
      cannot ‘demand’ support
      sprawl of packages
  • 15. RStudio
      browser for R
      provides easy access to: scripts, data, plots, manual
  • 16. Installing R and RStudio
  • 17. Download R
      Instructions via http://www.r-project.org
      Choose a CRAN mirror http://cran.r-project.org/mirrors.html
      close, but active too! Romania hasn’t gone (yet!)
      Click on ‘Download R for Windows’
      Follow usual installation procedure
      Double click on R
      You should now have a working session!
      Close the session, do not save workspace image
  • 18. Packages and libraries
      base R (core product)
      additional packages
      CRAN repository: spread through ‘mirrors’; choose a local, but active mirror
      GitHub: packages not on CRAN; development versions of CRAN libraries
  • 19. RStudio
      RStudio is found on http://www.rstudio.com
      Download the version for your OS (e.g. Windows) http://www.rstudio.com/products/rstudio/download/
      Install by double clicking on the downloaded file
      Start RStudio by double clicking on the icon
      You do not need to start R before starting RStudio
  • 20. Getting help
  • 21. Built-in help: “?”
      ?[function] / ?[package]
      e.g. “?plot” or “?graphics”
      check the index for user guides and vignettes
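Besides "?", base R offers a few other ways into the help system. The sketch below lists some of them; the help-page calls are left as comments because they open a viewer when run interactively, and the search pattern is only an example.

```r
# Equivalent ways to open a help page (run these interactively)
# ?plot
# help("plot")
# help(package = "graphics")   # index of a package
# vignette()                    # list installed vignettes

# Search by pattern when the exact function name is unknown
matches <- apropos("^read\\.")  # attached objects whose name starts with "read."
print(matches)

# Show the arguments of a function without leaving the console
print(args(read.csv))
```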
  • 22. CRAN website
      Manuals
      R FAQ
      R Journal
  • 23. Online communities
      Stack Overflow: instance of Stack Exchange, reputation-based Q&A
      Specific lists for packages, e.g.: ggplot2, R-sig-mixed-models
  • 24. Asking a question, getting an answer
      Search the web: others must have had this problem too
      If you raise a question:
      be polite
      be concise
      short background
      reproducible example
      debrief your efforts so far
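One common way to build a reproducible example is to share a tiny piece of data as R code together with the session details. This is a sketch of that habit, not a required format; the three-row subset of mtcars is an arbitrary choice.

```r
# Small, self-contained data that others can paste into their own session
small <- head(mtcars[, c("mpg", "hp")], 3)

# dput() prints R code that recreates the object
dput(small)

# Report the R version and loaded packages alongside the question
print(sessionInfo())
```

Pasting the dput() output into the question lets readers rebuild the data without downloading a file.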
  • 25. Introducing RStudio and R Introducing base R Data visualization using ggplot2 Introduction into R Part 1B Richard L. Zijdeman 2016-06-15 Richard L. Zijdeman Introduction into R
  • 26. 1 Introducing RStudio and R, 2 Introducing base R, 3 Data visualization using ggplot2
  • 27. Introducing RStudio and R
  • 28. RStudio
      RStudio is sort of a ‘viewer’ on R
      helps to organize input and output:
      editor (upper left)
      console (lower left)
      environment (upper right)
      output (lower right)
  • 29. R script
      series of commands to manipulate data
      always save your script, NEVER change your data
      original data + script = reproducible research
  • 30. Packages
      Build your R system using packages
      ‘Base R’ is basic. Add packages for your specific needs
      Packages are found on servers, called ‘mirrors’
      Make sure to select a mirror first: https://cran.r-project.org/mirrors.html
      ## To set the mirror for the current session, type:
      options(repos = structure(c(CRAN = "http://cran.xl-mirror.nl")))
      ## replace http://... with your favorite mirror
      ## (to make the setting permanent, add the line to your .Rprofile)
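To check that a mirror setting took effect, the repos option can be read back. The sketch below uses https://cloud.r-project.org, which is an assumption on my part (it is CRAN's redirecting address, not the mirror named on the slide).

```r
# Set the CRAN repository for this session and keep the old setting
old <- options(repos = c(CRAN = "https://cloud.r-project.org"))

# Check which repository install.packages() would now use
print(getOption("repos"))

# options(old) would restore the previous setting
```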
  • 31. Packages for book (see 1.4.2)
      pkgs <- c(
        "broom", "dplyr", "ggplot2", "jpeg", "jsonlite",
        "knitr", "Lahman", "microbenchmark", "png", "pryr",
        "purrr", "rcorpora", "readr", "stringr", "tibble", "tidyr"
      )
      install.packages(pkgs)
  • 32. R Session
      contains scripts, data, functions
      can be saved as a ‘workspace image’
      prefer not to:
      sessions are usually cluttered
      only useful if running script takes time
      Suggested tweak:
      Options: uncheck “Restore .RData into workspace at startup”
      Options: Save workspace to .RData on exit, select ‘never’
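When a script really does take long to run, one alternative to saving the whole workspace is to store only the slow result. The sketch below assumes a stand-in computation and a temporary file; in practice the file would live in the project folder.

```r
# Stand-in for a slow computation (hypothetical example)
slow_result <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

# Save just this one object; tempfile() keeps the sketch self-contained
cache_file <- tempfile(fileext = ".rds")
saveRDS(slow_result, cache_file)

# In a later session: load the object under any name you like
restored <- readRDS(cache_file)
print(all.equal(slow_result, restored))
```

This keeps the session uncluttered while still avoiding a rerun of the expensive step.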
  • 33. Introducing base R
  • 34. base R: assignment and print()
      ‘attach’ values to an object (e.g. a variable)
      x <- 5
      y <- 4
      z <- x * y
      print(z)
      ## [1] 20
  • 35. base R: assignment and print() (II)
      Try and imagine the potential of assignment
      x <- c(4, 3, 2, 1, 0, 27, 34, 35) # c for concatenate values
      y <- -1
      z <- x * y
      print(z)
      ## [1]  -4  -3  -2  -1   0 -27 -34 -35
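The "potential" hinted at here is that R applies an operation to every element of a vector at once. A brief sketch with the same x, where the threshold of 10 is an arbitrary choice for illustration:

```r
x <- c(4, 3, 2, 1, 0, 27, 34, 35)

# Arithmetic is applied to every element at once
doubled <- x * 2

# Comparisons return one TRUE/FALSE per element
is_large <- x > 10

# Logical vectors can be used to select elements
large_values <- x[is_large]

print(doubled)
print(large_values)
print(mean(x))
```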
  • 36. base R: data.frame
      basically a table
      contains columns (variables)
      contains rows (cases)
      “flat table” in Kees’ terminology
      my.df <- data.frame(x, z)
      str(my.df) # show STRucture
      ## 'data.frame': 8 obs. of 2 variables:
      ##  $ x: num 4 3 2 1 0 27 34 35
      ##  $ z: num -4 -3 -2 -1 0 -27 -34 -35
      There’s much more, but let’s keep that for tomorrow
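A few everyday operations on the data frame from the slide can be sketched as follows. The new column name x_squared is made up for this illustration.

```r
x <- c(4, 3, 2, 1, 0, 27, 34, 35)
z <- x * -1
my.df <- data.frame(x, z)

dim(my.df)             # number of rows and columns
head(my.df, 3)         # first three rows
my.df$x                # one column, as a vector
my.df[my.df$x > 10, ]  # rows that meet a condition

# Add a new column computed from an existing one
my.df$x_squared <- my.df$x^2
str(my.df)
```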
  • 37. Data visualization using ggplot2
  • 38. Visualizing your data
      Not just for analyses!
      Data quality
      representativeness
      missing data
  • 39. plot() in base R
      library(help = "datasets") # all datasets in R
      ?mtcars # show help on mtcars dataset
      df <- mtcars # mtcars is a dataset, not a function
      str(mtcars) # display STRucture of an object
      plot(mtcars$hp, mtcars$mpg)
      plot(df)
  • 40. plot() is like . . .
      plot() is like LaTeX:
      Forge it in any way you want
      Heterogeneous approach though
      Takes quite some time to get it right
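To give an impression of how a base plot is "forged" step by step, here is a sketch that adds labels, a point style and a trend line. The labels, colour and the use of a temporary PDF file are choices made for this example only.

```r
# Write to a temporary PDF so the sketch also runs without a screen
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)

plot(mtcars$hp, mtcars$mpg,
     main = "Fuel efficiency by horsepower",
     xlab = "Horsepower",
     ylab = "Miles per gallon",
     pch = 19,           # solid circles
     col = "steelblue")

# Add a dashed linear trend line on top of the points
abline(lm(mpg ~ hp, data = mtcars), lty = 2)

invisible(dev.off())
```

In an interactive session the pdf() and dev.off() lines can be dropped, and the plot appears in the RStudio output pane.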
  • 41. ggplot() as alternative
      ggplot is but one of many graph packages
      ggplot is nice because of:
      similar approach to various types of graphs
      easy build-up for basic graphs
      can get quite complex too (but cannot do it all)
  • 42. ggplot() and the canvas metaphor
      ggplot() consists of two elements:
      canvas
      (multiple) layers of paint
  • 43. mapping and geom layers
      ggplot() consists of two elements:
      canvas: data mapping (aesthetic)
      (multiple) layers of paint: geom layers
      ggplot(data = <DATASET>,
             mapping = aes(x = <X-VAR>, y = <Y-VAR>)) +
        geom_<TYPE>()
  • 44. our first ggplot
      install.packages("ggplot2")
      library(ggplot2)
      df <- mtcars
      ggplot(data = df, aes(x = hp, y = mpg)) +
        geom_point()
  • 45. geom_ features
      ?geom_point
      install.packages("ggplot2")
      library(ggplot2)
      df <- mtcars
      ggplot(data = df, aes(x = hp, y = mpg)) +
        geom_point(fill = "white", colour = "blue",
                   shape = 21, size = 4)
  • 46. Adding characteristics to your plot
      Add variables to explain a pattern
      ggplot(data = df, aes(x = hp, y = mpg)) +
        geom_point(aes(colour = wt), size = 4)
      NB: notice the difference?
      ggplot(data = df, aes(x = hp, y = mpg)) +
        geom_point(aes(colour = wt, size = 4))
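The difference between the two calls is whether size is set as a fixed value or mapped as if it were data. The sketch below stores both plots under hypothetical names (p_set, p_mapped) so they can be printed and compared side by side.

```r
library(ggplot2)
df <- mtcars

# Set: size sits outside aes(), so every point simply gets size 4
p_set <- ggplot(data = df, aes(x = hp, y = mpg)) +
  geom_point(aes(colour = wt), size = 4)

# Mapped: size = 4 sits inside aes(), so ggplot2 treats it as a variable,
# passes it through the size scale and adds a size legend
p_mapped <- ggplot(data = df, aes(x = hp, y = mpg)) +
  geom_point(aes(colour = wt, size = 4))

# print(p_set); print(p_mapped)  # run interactively to compare
```

As a rule of thumb, anything that should vary with the data goes inside aes(), and anything constant goes outside it.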
  • 47. Multiple geoms
      Add variables to explain a pattern
      ggplot(data = df, aes(x = hp, y = mpg)) +
        geom_point(aes(colour = as.factor(am)),
                   size = 6) + # increase size because of overlap
        geom_point(aes(shape = as.factor(vs)), size = 3)
      # vs: engine shape, V-shaped (0) or straight (1)
  • 48. Adding facets
      Facets help reduce complexity
      ggplot(data = df, aes(x = hp, y = mpg)) +
        geom_point(aes(colour = as.factor(am)), size = 4) +
        facet_wrap( ~ vs)
  • 49. Things to consider with geom(_point)
      fill only works where shape actually can be filled
      consider order of geoms
      mind overlap:
      decrease size
      use alpha
      use ‘open’ shapes
      geom_jitter
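Three of the overlap remedies listed above can be sketched as follows. Plotting mpg against cyl is an assumption chosen because many cars share the same cylinder count, so points pile up; the alpha, width and size values are arbitrary.

```r
library(ggplot2)
df <- mtcars

# Alpha: semi-transparent points, so overlapping points appear darker
p_alpha <- ggplot(data = df, aes(x = cyl, y = mpg)) +
  geom_point(alpha = 0.4, size = 4)

# Jitter: add a little random horizontal noise to separate the points
p_jitter <- ggplot(data = df, aes(x = cyl, y = mpg)) +
  geom_jitter(width = 0.2, height = 0, size = 3)

# Open shape: shape 1 is a hollow circle
p_open <- ggplot(data = df, aes(x = cyl, y = mpg)) +
  geom_point(shape = 1, size = 4)

# print(p_alpha); print(p_jitter); print(p_open)  # run interactively
```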
  • 50. ggplot and titles
      Various ways to add titles to axes and stuff
      Can get quite complex
      Here’s the basics
      ggplot(data = df, aes(x = hp, y = mpg)) +
        geom_point() +
        labs(title = "Nice graph",
             x = "Horse Power",
             y = "Miles per Gallon")
  • 51. Themes and size
      ggplot(data = df, aes(x = hp, y = mpg)) +
        geom_point() +
        labs(title = "Nice graph",
             x = "Horse Power",
             y = "Miles per Gallon") +
        theme_bw(base_size = 16)
  • 52. Much more to learn
      not just about ggplot():
      axes
      legend (guides)
      geoms
      also about dataviz in general:
      general do’s and don’ts
      which problem fits which graph
      it’s a science! (Graph theory)
  • 53. Data wrangling bit about NA Introduction into R Part 2A, 2B Richard L. Zijdeman 2016-06-16 Richard L. Zijdeman Introduction into R
  • 54. 1 Data wrangling, 2 bit about NA
  • 55. Data wrangling
  • 56. Grolemund & Wickham, 2016, Creative Commons Attribution-NonCommercial-NoDerivs 4.0.
  • 57. dplyr package
      # install.packages("dplyr") # 1 time only
      library(dplyr)
      install.packages("nycflights13")
      library(nycflights13)
      print(flights)
  • 58. tibble or data_frame vs data.frame
      str(mtcars)
      class(mtcars)
      mtcars_tbl <- as_tibble(mtcars) # as_data_frame() is deprecated
      str(mtcars_tbl)
      class(mtcars_tbl)
  • 59. filter
      filter(mtcars, am == 1, vs == 0)
      some.cars <- filter(mtcars, am == 1, vs == 0)
      some.cars
      (some.cars2 <- filter(mtcars, am == 1, vs == 0)) # outer brackets: assign and print
  • 60. filter and using or
      filter(mtcars, gear == 3 | gear == 4)
      # !! not like this:
      filter(mtcars, gear == 3 | 4)
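The reason the second form fails is that the bare 4 is treated as TRUE, so the condition holds for every row. A short sketch that makes this visible and shows %in% as a compact alternative:

```r
library(dplyr)

# 4 on its own counts as TRUE, so the condition is TRUE for every row
always_true <- mtcars$gear == 3 | 4
print(all(always_true))

# %in% is a compact way to say "gear is 3 or gear is 4"
three_or_four <- filter(mtcars, gear %in% c(3, 4))
print(nrow(three_or_four))
print(nrow(mtcars))
```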
  • 61. bit about NA
  • 62. Arrange
      arrange(flights, dep_time)
      arrange(flights, year, month, day) # ascending order
      arrange(flights, desc(day))
      # NB: missing values come at end
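A few basic facts about NA can be sketched on a small made-up data frame (the columns id and delay are hypothetical and stand in for something like the flights data).

```r
library(dplyr)

# Hypothetical data with one missing delay
d <- data.frame(id = 1:4, delay = c(12, NA, 3, 7))

# NA is contagious: comparisons and sums involving NA return NA
print(NA > 5)
print(sum(d$delay))

# Test for missing values with is.na(), not with == NA
print(is.na(d$delay))

# Many functions can skip missing values on request
print(mean(d$delay, na.rm = TRUE))

# arrange() puts NA last, in ascending and in descending order
asc <- arrange(d, delay)
desc_order <- arrange(d, desc(delay))
print(asc)
print(desc_order)
```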
  • 63. Data wrangling bit about NA Select df <- select(flights, year, month, day) names(flights) df <- select(flights, tailnum:dest) df <- select(flights, -(tailnum:dest)) df df <- select(flights, starts_with("arr_")) df <- select(flights, ends_with("e")) df <- select(flights, contains("a")) Richard L. Zijdeman Introduction into R
  • 64. rename
        df <- rename(flights, Y_ear = year)
        df <- mutate(flights, year1 = year + 1)
        select(df, year, year1)
        df <- mutate(flights, year1 = year + 1, year2 = year1 + 1)
        select(df, contains("year"))
        df <- transmute(flights, year1 = year + 1, year2 = year1 + 1)
        # only maintains the newly created variables
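The difference between the three verbs is easiest to see from the resulting column names. A small sketch, assuming dplyr is installed; the `small` data frame is a made-up stand-in for flights.

```r
library(dplyr)

# Hypothetical stand-in for the flights data
small <- data.frame(year = c(2013, 2014), month = c(1, 2))

renamed    <- rename(small, Y_ear = year)                         # new name = old name
mutated    <- mutate(small, year1 = year + 1, year2 = year1 + 1)  # keeps existing columns
transmuted <- transmute(small, year1 = year + 1, year2 = year1 + 1)  # keeps only new columns

names(renamed)     # "Y_ear" "month"
names(mutated)     # "year" "month" "year1" "year2"
names(transmuted)  # "year1" "year2"
```

Note that year2 can refer to year1 within the same call, because the new variables are created in order.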
  • 65. group_by
        by_day <- group_by(flights, year, month, day)
        summarise(by_day)
        cars <- mtcars
        cars <- as_data_frame(mtcars)
        summarise(cars, mean_hp = mean(hp, na.rm = TRUE))
        mean(cars$hp, na.rm = TRUE)
  • 66. the pipe: %>%
        library(ggplot2)  # needed for the plot at the end
        cars_grp <- group_by(cars, carb)
        class(cars)
        class(cars_grp)
        summarise(cars_grp, mmpg = mean(mpg, na.rm = TRUE))
        cars_grp_sum <- summarise(cars_grp,
                                  mmpg = mean(mpg, na.rm = TRUE),
                                  count = n())
        cars_grp_sum
        plot <- ggplot(cars_grp_sum, aes(x = carb, y = mmpg, label = carb)) +
          geom_point(aes(size = count)) +
          geom_text(colour = "cyan")
        plot
  • 67. more pipe, adding a filter
        cars_grp_sum3 <- cars %>%
          group_by(carb) %>%
          summarise(mmpg = mean(mpg, na.rm = TRUE), count = n()) %>%
          filter(count > 3)
        ggplot(cars_grp_sum3, aes(x = carb, y = mmpg, label = carb)) +
          geom_point(aes(size = count)) +
          geom_text(colour = "cyan") +
          labs(title = "figure with %>% and count > 3")
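What the pipe does can be checked by writing the same computation twice, once nested and once piped. This is a sketch assuming dplyr is installed; it uses `as_tibble()` rather than the older `as_data_frame()` and leaves out the plot.

```r
library(dplyr)

cars <- as_tibble(mtcars)

# Nested calls: read from the inside out
nested <- filter(
  summarise(group_by(cars, carb),
            mmpg = mean(mpg, na.rm = TRUE),
            count = n()),
  count > 3
)

# Piped: x %>% f(y) is f(x, y), so the steps read top to bottom
piped <- cars %>%
  group_by(carb) %>%
  summarise(mmpg = mean(mpg, na.rm = TRUE), count = n()) %>%
  filter(count > 3)

all.equal(nested, piped)  # TRUE
piped$carb                # 1 2 4
```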
  • 68. Introduction into R, Part 3A. Richard L. Zijdeman, 2016-06-17
  • 69. Outline
        1 Session management
        2 Basic data manipulation
  • 70. Session management
  • 71. Maintaining your workspace
        (Figure credit: Grolemund & Wickham, 2016, Creative Commons Attribution-NonCommercial-NoDerivs 4.0.)
  • 72. Setting up a session
        clear your Environment
        check sessionInfo() for loaded packages
        detach obsolete packages under ‘other attached packages’
        set your directory ("\\" on Windows and "/" for Linux/Mac)
        load libraries (install new ones)
        load your data
  • 73. Example session setup
        rm(list = ls())
        sessionInfo()  # check for other attached packages
        detach("package:nycflights13", unload = TRUE)
        setwd("/Users/RichardZ/Dropbox/Summer school 2016/Richard Zijdeman/")
        getwd()  # to see whether you're in the right directory
        dir()    # shows what's in your directory
  • 74. Loading your data
        read.table()  (generic function)
        read.csv()
        library(foreign)  # e.g. SPSS and Stata
        library(readxl)   # fast excel-package
  • 75. Reading in data
        Different functions for different files:
        Base R: read.table() (read.csv())
        foreign package: read.spss(), read.dta(), read.dbf()
        readxl
        alternative packages:
            xlsx (Java required)
            gdata (Perl-based)
            openxlsx package: read.xlsx()
  • 76. read.csv()
        file: your file, including directory
        header: variable names or not?
        sep: separator
            read.csv default: ","
            read.csv2 default: ";"
        skip: number of rows to skip
        nrows: total number of rows to read
        stringsAsFactors
        encoding (e.g. "latin1" or "UTF-8")
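Several of these arguments can be tried without a file on disk, because the `read.csv()` family also accepts a `text` argument. The sketch below uses made-up, semicolon-separated content with a decimal comma, which is the case `read.csv2()` is meant for.

```r
# Hypothetical file content: one junk line, a header row, three data rows
csv_text <- paste("exported from a hypothetical registry",
                  "id;name;age",
                  "1;Anna;23,5",
                  "2;Bram;31,0",
                  "3;Cato;27,25",
                  sep = "\n")

people <- read.csv2(text = csv_text,
                    skip = 1,                  # drop the junk line before the header
                    header = TRUE,             # first remaining row holds variable names
                    nrows = 2,                 # read only the first two data rows
                    stringsAsFactors = FALSE)  # keep names as character

names(people)  # "id" "name" "age"
people$age     # 23.5 31.0 (decimal comma converted)
```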
  • 77. read_excel from readxl package
        path: your file, including directory
        sheet: name or number of sheet
        col_names: col names in 1st row?
        col_types: specify type
        na: what’s the sign for missing values
        skip: how many rows to skip before data starts
  • 78. Example session loading your csv data
        # setwd() to set your working directory
        hmar100 <- read.csv("./Datafiles_HSN/HSN_marriages.csv",
                            stringsAsFactors = FALSE,
                            encoding = "latin1",
                            header = TRUE,
                            nrows = 100)  # just first 100 rows
  • 79. Example session loading your excel data
        # setwd() to set your working directory
        install.packages("readxl")
        library("readxl")
        # NB: the file path on the next line is cut off at the slide edge in the source
        hmar <- read_excel("./Datafiles_HSN/HSN_marriages_awful.xls
                           col_names = TRUE,
                           skip = 3)  # empty lines not counted!!!
  • 80. Basic data manipulation
  • 81. Change case of text
        tolower()
        toupper()
        tolower("CaN we pleASe jUSt have LOWER cases?")
        names(hmar) <- tolower(names(hmar))
  • 82. length()
        Used to count how many instances there are
        length(names(hmar))  # shows number of variables in hmar
  • 83. Introduction into R, Part 3B. Richard L. Zijdeman, 2016-06-17
  • 84. Outline
        1 Basic statistical techniques
  • 85. Basic statistical techniques
  • 86. Box and whisker plot
        Distribution of data
        Median: 50% of the cases above and below
        Box: 1st and 3rd quartile
        Interquartile range (IQR): Q3 - Q1
        Outliers (Tukey, 1977):
            x < Q1 - 1.5*IQR
            x > Q3 + 1.5*IQR
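The outlier rule can be applied by hand in base R. This is a small sketch on made-up numbers; it uses the default `quantile()` method, which may give slightly different quartiles than the hinges a boxplot function uses internally.

```r
# Hypothetical ages with one extreme value
x <- c(2, 3, 4, 5, 6, 7, 50)

q1  <- quantile(x, 0.25, names = FALSE)  # 1st quartile: 3.5
q3  <- quantile(x, 0.75, names = FALSE)  # 3rd quartile: 6.5
iqr <- q3 - q1                           # interquartile range: 3

lower <- q1 - 1.5 * iqr  # -1
upper <- q3 + 1.5 * iqr  # 11

outliers <- x[x < lower | x > upper]
outliers  # 50
```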
  • 87. Basic statistical techniques
        p <- ggplot(hmar, aes(sign_groom, age_groom))
        p + geom_boxplot()
  • 88. Basic statistical techniques
        # NB: the mutate() line is cut off at the slide edge in the source
        hmar <- mutate(hmar, sign_groomD = (sign_groom == "h" & !(i
        p <- ggplot(hmar, aes(sign_groomD, age_groom))
        p + geom_boxplot()
  • 89. Basic statistical techniques
        # NB: the mutate() line is cut off at the slide edge in the source
        hmar <- mutate(hmar, sign_groomD = (sign_groom == "h" & !(i
        p <- ggplot(hmar, aes(sign_groomD, age_groom))
        p + geom_boxplot() +
          geom_jitter(shape = 24, width = 0.2)
  • 90. Basic statistical techniques
        library(stats)  # part of base R, attached by default
        var.test(age_groom ~ sign_groomD, data = hmar)
        t.test(age_groom ~ sign_groomD, data = hmar)
        # NB: always check for variances
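One way to act on the variance check is to feed its result into the t-test. The sketch below uses the built-in mtcars data in place of the HSN marriages file; the 0.05 cut-off is a common convention assumed here, not something stated on the slide.

```r
# Compare fuel consumption (mpg) between automatic (am = 0) and manual (am = 1) cars

# Step 1: F test for equality of the two group variances
vt <- var.test(mpg ~ am, data = mtcars)

# Step 2: treat variances as equal only if the F test does not reject at 5%
equal_var <- vt$p.value > 0.05

# Step 3: t-test; var.equal = FALSE (the default) gives the Welch version
tt <- t.test(mpg ~ am, data = mtcars, var.equal = equal_var)

vt$p.value
tt$method
tt$p.value
```

Leaving `var.equal` at its default of FALSE is the more cautious choice when the variance test is borderline.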
  • 91. A small PTE project
        Look at the variables in the HSN files
        Think of a research question
        Provide a general mechanism and hypothesis
        Plot your results