R is an open source programming language for statistical analysis and graphics. It can be used from the command line or desktop clients, and integrates with databases and web applications. R supports many types of statistical analysis like regression, clustering, and time series analysis. It also enables interactive graphics and visualizations through packages like ggplot2 and Shiny. R can handle date/time semantics and is useful for tasks like decomposing time series data and finding patterns in datasets through clustering algorithms.
Intro to R for Data Analysis
1. An Intro to R
...or who survived the sinking of the Titanic?
daniel@dakoller.net - @dakoller
for Google Extended IO 2013/Munich
Thursday, 16 May 2013
2. R is...
• an open source, interpreted programming language for statistical computing.
• As such, it competes with software packages such as SPSS, SAS and, in part, MATLAB.
• You can access it from the command line, from a number of desktop & web clients, and from APIs.
• You can use the results of R computations in databases, web applications and graphics.
• R supports linear & nonlinear regression, clustering, time series analysis, classification, extensive graphics ...and much more via packages!
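As a small illustration of the regression support listed above, here is a minimal sketch using only base R. The choice of the built-in mtcars data set and of the variables mpg and wt is an assumption made for this example and does not come from the slides.

```r
# Fit a simple linear regression on the built-in mtcars data set:
# fuel economy (mpg) modelled as a function of car weight (wt).
fit <- lm(mpg ~ wt, data = mtcars)

# Named vector with the intercept and the slope for wt.
coefs <- coef(fit)
print(coefs)

# Share of the variance in mpg explained by the model.
r_squared <- summary(fit)$r.squared
print(r_squared)
```

Calling summary(fit) prints the full table of estimates, standard errors and p-values.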
3. Graphics in R
• R standard graphics
• Up-to-date graphics with ggplot2 1)
• Interactive web graphics with Shiny 2)
1) http://www.r-bloggers.com/maps-with-ggplot2/ - 2) http://glimmer.rstudio.com/winston/stocks/
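The standard graphics mentioned on this slide need no extra packages. The sketch below draws a scatter plot with a fitted line; the mtcars data set, the labels and the temporary output file are assumptions chosen for illustration, not taken from the talk.

```r
# Write the plot to a temporary PDF so the sketch also runs without a display.
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)

# Scatter plot of car weight against fuel economy (built-in mtcars data).
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight", ylab = "Miles per gallon",
     main = "mtcars: weight vs. fuel economy", pch = 19)

# Overlay the least-squares regression line.
abline(lm(mpg ~ wt, data = mtcars), col = "red")

# Close the device so the file is written to disk.
invisible(dev.off())
```

In an interactive session such as RStudio, the pdf() and dev.off() calls can be dropped and the plot appears in the plot pane.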
4. Time series analysis with R
• R can handle the semantics of dates & times very well.
• Needed to separate seasonal and other special influences from the data to find the real trend.
1) http://www.stat.pitt.edu/stoffer/tsa3/R_toot.htm - Plot: decomposed time series
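Separating the seasonal influence from the trend can be sketched with the decompose() function from base R. The AirPassengers data set and the multiplicative model are assumptions for this example; the plot referenced on the slide may have been produced differently.

```r
# AirPassengers: monthly airline passenger totals, a time series that ships with R.
# A multiplicative model is assumed because the seasonal swings grow with the level.
parts <- decompose(AirPassengers, type = "multiplicative")

# Components: parts$trend, parts$seasonal and parts$random.
# parts$figure holds one seasonal factor per month.
seasonal_factors <- parts$figure
print(round(seasonal_factors, 2))

# Draw the observed, trend, seasonal and random panels when run interactively.
if (interactive()) plot(parts)
```

The trend component has missing values at both ends of the series because it is estimated with a centered moving average.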
5. Clustering with R
• Needed to find commonalities in datasets
• Also needed to find "similar" items.
• Plots support visual interpretation: ClusPlot, dendrogram, decision trees
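Finding groups of similar items and showing them as a dendrogram can be sketched with hierarchical clustering from base R. The USArrests data set, complete linkage and the cut into four groups are assumptions made for illustration only.

```r
# USArrests: arrest statistics for the 50 US states (ships with R).
# Standardize the columns so that no single variable dominates the distances.
scaled <- scale(USArrests)

# Pairwise Euclidean distances between states, then agglomerative clustering.
distances <- dist(scaled)
tree <- hclust(distances, method = "complete")

# Cut the tree into 4 groups; the result is a named vector with one label per state.
groups <- cutree(tree, k = 4)
print(table(groups))

# Draw the dendrogram when run interactively.
if (interactive()) plot(tree)
```

States that share a label in groups are the "similar" items; the dendrogram shows at which distance they were merged.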
7. ...in RStudio
• data(Titanic)
• summary(Titanic)
• Number of cases in table: 2201
• Number of factors: 4 (Class, Sex, Age, and the outcome Survived: Yes/No)
• mosaicplot(Titanic, color = TRUE)
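To answer the title question numerically rather than only visually, the Titanic table can be collapsed to survival rates per passenger class. This is a sketch using base R; the variable names by_class and rates are hypothetical and not part of the original demo.

```r
data(Titanic)

# Titanic is a 4-dimensional contingency table: Class x Sex x Age x Survived.
# Collapse it to Class x Survived by summing over Sex and Age.
by_class <- margin.table(Titanic, margin = c(1, 4))

# Convert the counts to row proportions, so each class sums to 1.
rates <- prop.table(by_class, margin = 1)

# Share of survivors per class.
print(round(rates[, "Yes"], 2))
```

Changing the margin to c(2, 4) gives the same breakdown by sex instead of class.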
9. Using R with Google APIs/Tools
• Analyze Google Analytics with R: https://code.google.com/p/r-google-analytics/
• Use Google Chart tools with R data: https://code.google.com/p/google-motion-charts-with-r/
10. Resources
• R on Wikipedia: http://www.quickiwiki.com/en/R_(programming_language)
• RStudio, desktop client for Windows, Mac OS X and Linux: http://www.rstudio.com/ide/
• Shiny, R web app framework for interactive apps: http://www.rstudio.com/shiny/
• Quick-R: http://www.statmethods.net/
• inside-R, community/blog site about R: http://www.inside-r.org/
• R-Bloggers, feed of R-relevant blogs: http://www.r-bloggers.com/