Introduction to R and RStudio
Jeff Witmer
9 March 2016
R is
A (not ideal) programming language
A collection of 6,700 packages (as of June 2015, so more now)
A software package for statistical computing and graphics
A work environment
Widely used
Powerful
Free
Some history
R was based on S, with code written in C
R was created in the 1990s by Ross Ihaka and Robert Gentleman
S was developed at Bell Labs, starting in the 1970s
S largely was used to make good graphs – not an easy thing
in 1975. R, like S, is quite good for graphing.
For lots of examples, see http://rgraphgallery.blogspot.com/
or http://www.r-graph-gallery.com/
(Or, for more detail, see http://docs.ggplot2.org/current/)
See ggplot2-cheatsheet-2.0.pdf
A few simple graphs using the ggplot2 package
An example of graphing using the GGally package in R
Who uses R?
RStudio is
A gift, from J.J. Allaire (Macalester College, ‘91) to the world
An Integrated Development Environment (IDE) for R
Free – unless you want the newest version, with more
bells and whistles, and you are not eligible for the
educational discount (= free)
An easy (easier) way to use R
Available as a desktop product or, as used at OC, run
off of a file server.
RStudio supports RPubs – see http://rpubs.com/jawitmer
RStudio screen shot
R is object-oriented
e.g., MyModel <- lm(wt ~ ht, data = mydata)
then hist(MyModel$residuals)
Note: lm(wt ~ ht*age + log(bp), data = mydata) regresses
wt on ht, age, the ht-by-age interaction, and log(bp).
There is no need to create the interaction or the log(bp)
variable outside of the lm() command.
Comparing nested models:
mod1 <- lm(wt ~ ht*age + log(bp), data = mydata)
mod2 <- lm(wt ~ ht + log(bp), data = mydata)
anova(mod2, mod1) gives a nested F-test
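A minimal sketch of the same nested-model comparison, runnable as is. The built-in mtcars data stands in for mydata, and the variables mpg, wt, hp, and disp are illustrative choices, not taken from the slides.

```r
# Built-in mtcars stands in for mydata; the variables are illustrative only.
full    <- lm(mpg ~ wt * hp + log(disp), data = mtcars)  # wt, hp, wt:hp, log(disp)
reduced <- lm(mpg ~ wt + log(disp), data = mtcars)       # drops hp and wt:hp

# The interaction and the log term are built by the formula itself.
names(coef(full))

# The fitted model is an object; its pieces are reached with $.
summary(full$residuals)

# Nested F-test: do hp and wt:hp jointly improve the fit?
cmp <- anova(reduced, full)
print(cmp)
```

The reduced model goes first in anova(), so the test row reports the two extra terms in the full model.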
R as a programming language
If you want R to be (relatively) fast, take advantage of
vector operations; e.g., use the replicate command
(rather than a loop) or the tapply function.
E.g., replicate(n=25,addingLines(n=10)) calls the
addingLines function (something I wrote) 25 times.
> with(Dabbs, tapply(testosterone, occupation, mean))
   Actor       MD Minister     Prof
    12.7     11.6      8.4     10.6
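A small sketch of both idioms, using base R only. The function one_run() is a hypothetical stand-in for the author's addingLines(), and the built-in iris data stands in for the Dabbs data frame.

```r
# Hypothetical stand-in for addingLines(): one simulated value per call.
one_run <- function(n) mean(rnorm(n))

set.seed(1)
sims <- replicate(25, one_run(n = 10))  # 25 calls, no explicit loop
length(sims)

# Group means with tapply(); iris stands in for the Dabbs data.
group_means <- with(iris, tapply(Sepal.Length, Species, mean))
round(group_means, 1)
```

replicate() re-evaluates its expression on every repetition, which is why each call to one_run() draws fresh random numbers.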
If you want to know how to do something in R
See the “Minimal R.pdf” handout
Go to the Quick-R.com page (http://www.statmethods.net/)
Google “How do I do xxx in R?”
A standing joke among R users is that the answer
is always “There are many ways to do that in R.”
See http://swirlstats.com/
See https://www.datacamp.com/home
Speaking of many ways to do something in R…
(1) mean(mydata$ht)
(2) with(mydata, mean(ht))
(3) mean(ht, data=mydata) (works only with the mosaic package loaded)
However
(1) plot(mydata$ht,mydata$wt) works
(2) with(mydata, plot(ht,wt)) works
(3) plot(ht, wt, data=mydata) does not work
(3a) plot(wt~ht, data=mydata) works
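The contrast can be checked directly in base R. This sketch uses the built-in women data (columns height and weight) in place of mydata; that substitution is an assumption made for illustration.

```r
# Built-in 'women' data (columns height, weight) stands in for mydata.
m1 <- mean(women$height)
m2 <- with(women, mean(height))
stopifnot(identical(m1, m2))

# plot() has a formula method, so data= works with y ~ x ...
plot(weight ~ height, data = women)

# ... but with bare names, R looks for 'height' outside the data frame and fails.
res <- try(plot(height, weight, data = women), silent = TRUE)
inherits(res, "try-error")
```

Wrapping the failing call in try() lets the script continue, so the error can be inspected rather than halting the session.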
The mosaic package (Kaplan, Pruim, Horton) was created
to make R easy to use for intro stats.
mosaic package syntax:
goal(y ~ x|z, data=mydata)
E.g.: tally(~sex, data=HELPrct)
E.g.: t.test(age ~ sex, data=HELPrct)
E.g.: favstats(age ~ substance|sex, data=HELPrct)
E.g.: t.test(age ~ sex, data=HELPrct)$p.value
See MinimalR-2pages.pdf
The mosaic package mPlot() command makes graphing easy.
mPlot(SaratogaHouses)
The openintro package edaPlot() command makes exploring
data graphically easy to do. edaPlot(SaratogaHouses)
The mosaic, tidyr, and dplyr packages handle SQL-type
work: merging files, extracting subsets, etc.
data(NCHS) #loads in the NCHS data frame
newNCHS <- NCHS %>% sample_n(size=5000) %>%
  filter(age > 18) #takes a sample of size 5000,
#extracts only the rows for which age > 18, and saves
#the result in newNCHS
See data-wrangling-cheatsheet.pdf
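A runnable sketch of the same sample-then-filter pipeline, assuming the dplyr package is installed. The built-in mtcars data and the mpg > 18 condition are illustrative substitutes for NCHS and age > 18.

```r
library(dplyr)  # assumed to be installed

set.seed(42)
# Sample first, then filter, in the same order as the slide; mtcars stands in for NCHS.
small <- mtcars %>%
  sample_n(size = 20) %>%
  filter(mpg > 18)

nrow(small)          # at most 20, because filtering happens after sampling
all(small$mpg > 18)
```

Because the filter runs after the sample is drawn, the result usually has fewer rows than the sample size; swapping the two steps would filter first and then sample from the remaining rows.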
I use R, and the do() command in the mosaic package, for
simulations.
data(FirstYearGPA) #loads in the data frame
FY <- FirstYearGPA #rename the data frame
lm(GPA ~ SATM, data=FY) #gives 0.0012 as slope
lm(GPA ~ SATM, data=FY)$coeff[2] #just look at the slope
do(3)*lm(GPA ~ shuffle(SATM), data=FY)$coeff[2] #break link b/w GPA and SATM
null.dist <- do(1000)*lm(GPA ~ shuffle(SATM), data=FY)$coeff[2] #1000 random slopes
histogram(null.dist$SATM, v=0.0012) #look at the 1000 slopes
with(null.dist, tally(abs(SATM.)>=0.0012)) #How many are far from zero?
with(null.dist, tally(abs(SATM.)>=0.0012, format='prop')) #What proportion are far from zero?
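For readers without the mosaic package, the same permutation idea can be sketched in base R. Here replicate() and sample() are swapped in for do() and shuffle(), and the built-in women data stands in for FirstYearGPA. This is an illustration of the technique, not the author's code.

```r
set.seed(2016)
# Observed slope; the built-in 'women' data stands in for FirstYearGPA.
obs_slope <- coef(lm(weight ~ height, data = women))[2]

# sample() permutes height, breaking its link with weight (the role of shuffle()).
null_slopes <- replicate(1000, coef(lm(weight ~ sample(height), data = women))[2])

# Proportion of shuffled slopes at least as far from zero as the observed one.
p_value <- mean(abs(null_slopes) >= abs(obs_slope))
p_value
```

Taking the mean of a logical vector gives the proportion of TRUE values, which plays the role of tally(..., format='prop').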
plot(jitter(Win,amount=.05)~SaveDiff,data=LaXdata)
Predict.Plot(modelDiff,pred.var="SaveDiff",DrawDiff=-11, ShotDiff=6,
TODiff=-3, ClearPctDiff=0.0952, ShotGoalDiff=1, GroundDiff=5,
add=TRUE,plot.args=list(col='blue')) #OCWLaX game vs BW
Using Predict.Plot to show Pr(win) as SaveDiff varies,
for a fixed set of values for six other predictors.
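A base-R sketch of the same kind of picture, using predict() on a grid instead of Predict.Plot. The model, the built-in mtcars data, and the fixed value hp = 150 are all illustrative assumptions; am (0/1) plays the role of Win and wt the role of SaveDiff.

```r
# Logistic model on built-in mtcars: am (0/1) plays the role of Win.
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial)

# Vary wt over a grid while holding hp fixed at an arbitrary illustrative value.
grid <- data.frame(wt = seq(1.5, 5.5, by = 0.1), hp = 150)
grid$p <- predict(fit, newdata = grid, type = "response")

# Jittered 0/1 outcomes with the fitted probability curve drawn on top.
plot(jitter(am, amount = 0.05) ~ wt, data = mtcars)
lines(p ~ wt, data = grid, col = "blue")
```

Setting type = "response" returns probabilities rather than log-odds, so the curve sits on the same 0-to-1 scale as the jittered outcomes.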

Editor's Notes

  1. R is an interpreted language, but with much of it compiled in C.
  2. plot(wt~ht, data=mydata) passes plot() a formula, so the data= argument is used to look up wt and ht; plot(ht, wt, data=mydata) passes bare variable names, which R cannot find outside the data frame.
  3. myx=data.frame(DrawDiff=-11, ShotDiff=6, TODiff=-3, SaveDiff=0, ClearPctDiff=0.0952, ShotGoalDiff=1, GroundDiff=5)
     predict.glm(modelDiff, myx, type="response") #gives 0.896