From niche bloggers to multinational corporations, site owners of every size want to monitor their web traffic and how its patterns change over time.
Google Analytics is the most widely used tool for tracking this kind of data. Its UI offers a wide range of reports and several types of visualization.
Moreover, the Analytics API, together with the corresponding R packages, now opens up more options for custom web analyses.
This talk covers the following:
• What is web analytics? How does it work?
• Interfacing with the Analytics Reporting API via an R package (RGA)
• Practical analytics applications with R
• Discussion
30. 9/21/2015 Web Analytics with R
file:///D:/Projects/R_project_temp/RGA/SimGAR.html 30/38
The Data
## Source: local data frame [6 x 16]
##
## dateHour minute sourceMedium operatingSystem subContinent
## 1 2015030100 00 facebook / display Windows Southern Europe
## 2 2015030100 01 facebook / display Macintosh Southern Europe
## 3 2015030100 01 google / cpc Windows Northern Europe
## 4 2015030100 01 google / cpc iOS Southern Europe
## 5 2015030100 02 facebook / display Macintosh Southern Europe
## 6 2015030100 02 facebook / display Windows Western Europe
## Variables not shown: pageDepth (chr), daysSinceLastSession (chr), sessions
## (dbl), percentNewSessions (dbl), transactions (dbl), transactionRevenue
## (dbl), bounceRate (dbl), avgSessionDuration (dbl), pageviewsPerSession
## (dbl), hits (dbl), Visitor (chr)
Data preparation
• Session data made "almost" granular
• Invalid sessions removed
• Extra dimension added (user type)
• Highly correlated variables removed
• Data split into train and test sets
• Day of the week extracted from the date
• Days since last session placed in buckets
• Date converted to weekday or weekend
• dateHour split into its two component variables (date and hour)
• Geography split between the top sub-continents and Other
• Hour converted to AM or PM
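Several of the preparation steps above can be sketched in base R. This is only an illustration, not the code used for the talk: the toy rows, the helper names `prepare_sessions` and `drop_correlated`, the bucket boundaries, the list of "top" sub-continents, the 0.9 correlation cutoff and the 75/25 split are all assumptions. Steps whose rules the slides do not spell out (invalid sessions, the user type dimension) are left out.

```r
# Toy rows shaped like the slide's data; all values are made up for illustration.
sessions <- data.frame(
  dateHour = c("2015030100", "2015030213", "2015030722", "2015030409"),
  subContinent = c("Southern Europe", "Northern Europe",
                   "Western Europe", "Eastern Asia"),
  daysSinceLastSession = c("0", "3", "12", "45"),
  sessions = c(1, 2, 3, 4),
  hits = c(2, 4, 6, 9),
  stringsAsFactors = FALSE
)

# Hypothetical helper: derives the date/time, recency and geography features.
prepare_sessions <- function(df, top_regions) {
  # dateHour has the form YYYYMMDDHH: split it into a date and an hour.
  df$date <- as.Date(substr(df$dateHour, 1, 8), format = "%Y%m%d")
  df$hour <- as.integer(substr(df$dateHour, 9, 10))

  # Hour converted to AM or PM.
  df$amPm <- ifelse(df$hour < 12, "AM", "PM")

  # Day of the week and weekday/weekend flag (wday: 0 = Sunday ... 6 = Saturday).
  wday <- as.POSIXlt(df$date)$wday
  df$dayOfWeek <- c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")[wday + 1]
  df$dayType <- ifelse(wday %in% c(0, 6), "weekend", "weekday")

  # Days since last session placed in buckets (boundaries are an assumption).
  days <- as.integer(df$daysSinceLastSession)
  df$recency <- cut(days, breaks = c(-Inf, 0, 7, 30, Inf),
                    labels = c("0", "1-7", "8-30", "31+"))

  # Geography: keep the top sub-continents, lump everything else into "Other".
  df$region <- ifelse(df$subContinent %in% top_regions, df$subContinent, "Other")
  df
}

# Hypothetical helper: for each highly correlated pair of numeric columns,
# keep the column that appears first and drop the later one.
drop_correlated <- function(df, cutoff = 0.9) {
  num <- names(df)[vapply(df, is.numeric, logical(1))]
  cm <- abs(cor(df[num]))
  cm[upper.tri(cm, diag = TRUE)] <- 0  # look at each pair once
  drop <- num[apply(cm, 1, function(r) any(r > cutoff))]
  df[setdiff(names(df), drop)]
}

top_regions <- c("Southern Europe", "Northern Europe", "Western Europe")
prepared <- prepare_sessions(sessions, top_regions)
reduced <- drop_correlated(prepared)

# Train/test split; the seed and the 75/25 ratio are arbitrary choices.
set.seed(42)
train_idx <- sample(nrow(reduced), size = floor(0.75 * nrow(reduced)))
train <- reduced[train_idx, ]
test <- reduced[-train_idx, ]
```

In this toy example `hits` is almost perfectly correlated with `sessions`, so `drop_correlated` removes it while keeping `sessions`. On real data the cutoff, and which member of a correlated pair to keep, would need to be chosen with the modelling goal in mind.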