Model Automation in R
Using MASS, randomForest, forecast, and caret
Who is Will Johnson?
● Database Manager at Uline (Pleasant Prairie)
● MS Predictive Analytics (2015)
● Operating www.LearnByMarketing.com
○ R tutorials, thoughts on analysis.
Learn By
Marketing.com
Agenda
1. What is Model Automation
2. Pros and Cons of Model Automation
3. Decision Trees and Random Forests {randomForest}
4. Stepwise Regression {MASS}
5. Auto.Arima for time series {forecast}
6. Hyperparameter Search {caret}
What is Model Automation?
Hypothesis Space
vs
Hyperparameter Space
Pros and Cons of Model Automation
PROS:
● You Don’t Have to Think!
● “Faster” Iterations.
● See what’s “Important”
CONS:
● You Don’t Have to Think!
● Jellybeans
Decision Trees
● Gini Index + Entropy
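As a rough illustration of the two impurity measures named on this slide, the sketch below computes the Gini index and entropy of a node, plus the impurity decrease from one split. The helper names and the toy labels are hypothetical, not from the talk, and this is not how any particular tree package implements its split search.

```r
# Class proportions of a label vector
class_props <- function(labels) {
  as.numeric(table(labels)) / length(labels)
}

# Gini index: 1 - sum(p^2); equals 0 for a pure node
gini <- function(labels) {
  p <- class_props(labels)
  1 - sum(p^2)
}

# Entropy in bits: -sum(p * log2(p)); equals 0 for a pure node
entropy <- function(labels) {
  p <- class_props(labels)
  p <- p[p > 0]  # treat 0 * log2(0) as 0
  -sum(p * log2(p))
}

# Decrease in impurity when a parent node is split into two children
impurity_decrease <- function(parent, left, right, impurity = gini) {
  n <- length(parent)
  impurity(parent) -
    (length(left) / n) * impurity(left) -
    (length(right) / n) * impurity(right)
}

# Hypothetical toy node: 4 "yes" and 4 "no"
parent <- c("yes", "yes", "yes", "yes", "no", "no", "no", "no")
left   <- c("yes", "yes", "yes", "no")
right  <- c("yes", "no", "no", "no")

gini(parent)                            # 0.5
entropy(parent)                         # 1
impurity_decrease(parent, left, right)  # 0.125
```

A tree builder would evaluate many candidate splits and keep the one with the largest decrease.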
randomForest
● Mean Decrease in Gini Index
library(randomForest)
# dat: data frame containing the response y and its predictors
rf <- randomForest(y ~ ., data = dat)
rf$importance   # variable names + importance
varImpPlot(rf)  # visualization
Stepwise Regression
● AIC
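To make the AIC criterion concrete, here is a minimal sketch that computes it by hand for two linear models on the built-in mtcars data and compares it with R's own AIC(). The model formulas and the helper name aic_by_hand are illustrative choices, not taken from the slides.

```r
# Two nested linear models on the built-in mtcars data
fit_small <- lm(hp ~ wt, data = mtcars)
fit_large <- lm(hp ~ wt + cyl + disp, data = mtcars)

# AIC = 2k - 2 * log-likelihood, where k counts the estimated parameters
aic_by_hand <- function(fit) {
  ll <- logLik(fit)
  k  <- attr(ll, "df")  # coefficients plus the residual variance
  2 * k - 2 * as.numeric(ll)
}

aic_by_hand(fit_small)
AIC(fit_small)          # matches the hand computation

# Lower AIC is preferred: compare the two candidates
c(small = AIC(fit_small), large = AIC(fit_large))

# extractAIC() uses a different additive constant for lm fits,
# so its values differ from AIC() but rank the models identically
AIC(fit_small) - extractAIC(fit_small)[2]
AIC(fit_large) - extractAIC(fit_large)[2]
```

Because the two scales differ only by a constant when the data are fixed, the values printed by a stepwise search can look different from AIC() without changing which model wins.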
Stepwise Regression
library(MASS)
# Assumption: mt is the built-in mtcars data (its columns match the formula below)
mt <- mtcars
mod <- lm(hp ~ ., data = mt)
# Step backward: remove one variable at a time
stepAIC(mod, direction = "backward", trace = TRUE)
# Get the independent variables (excluding the dependent variable hp)
indep_vars <- paste(setdiff(names(mt), "hp"), collapse = " + ")
# Turn those variable names into a one-sided formula
upper_form <- formula(paste("~", indep_vars))
# ~mpg + cyl + disp + drat + wt + qsec + vs + am + gear + carb
# Create a model using only the intercept
mod_lower <- lm(hp ~ 1, data = mt)
# Step forward: add one variable at a time
stepAIC(mod_lower, direction = "forward",
        scope = list(upper = upper_form, lower = ~1))
# Step forward or backward at each step, starting from the intercept-only model
stepAIC(mod_lower, direction = "both",
        scope = list(upper = upper_form, lower = ~1))
Auto.Arima
● Time Series models.
● AutoRegressive…
● Moving Averages…
● With Differencing!
library(forecast)
library(fpp)  # provides the elecequip data set
data("elecequip")
# Keep the first 180 observations for fitting; the rest are held out
ee <- elecequip[1:180]
# stationary = TRUE restricts the search to stationary models
model <- auto.arima(ee, stationary = TRUE)
#         ar1      ma1      ma2     ma3  intercept
#      0.8428  -0.6571  -0.1753  0.6353    95.7265
# s.e. 0.0431   0.0537   0.0573  0.0561     3.2223
plot(forecast(model, h = 10))
# Overlay the 10 held-out actual values in red
lines(x = 181:190, y = elecequip[181:190],
      type = 'l', col = 'red')
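The "With Differencing!" bullet can be illustrated in base R. This is a small sketch on a simulated series (the trend slope, length, and seed are arbitrary assumptions): differencing replaces each value with its change from the previous one, which removes a linear trend, and the d in ARIMA(p, d, q) counts how many times this is applied.

```r
set.seed(1)

# Hypothetical series: linear trend plus noise, so the mean is not constant
n <- 200
trend_series <- 0.5 * seq_len(n) + rnorm(n)

# First difference: y[t] - y[t-1]
d1 <- diff(trend_series)

length(d1)  # one observation shorter than the original
mean(d1)    # close to the trend slope of 0.5

# Differencing is reversible: cumulative sums rebuild the original series
rebuilt <- cumsum(c(trend_series[1], d1))
all.equal(rebuilt, trend_series)
```

With stationary = TRUE, as in the slide's call, auto.arima skips this step and only searches models with d = 0.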
train {caret}
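library(caret)
# 10-fold cross-validation, repeated 10 times
tctrl <- trainControl(method = "repeatedcv", number = 10,
                      repeats = 10)
# Candidate values for rpart's complexity parameter cp
rpart_opts <- expand.grid(cp = seq(0.0, 0.01, by = 0.001))
# data: data frame with the response y; train_log presumably flags the training rows
rpart_model <- train(y ~ ., data = data, method = "rpart",
                     metric = "Kappa", trControl = tctrl,
                     tuneGrid = rpart_opts, subset = train_log)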
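For intuition about what a grid search with cross-validation does, here is a hand-rolled base R sketch. It is not caret's implementation: the model (polynomial regression of mpg on hp in mtcars), the grid, the fold count, and the RMSE metric are all assumptions chosen to keep the example dependency-free.

```r
set.seed(42)
dat <- mtcars
k <- 5

# Randomly assign each row to one of k folds
folds <- sample(rep(seq_len(k), length.out = nrow(dat)))

# Candidate hyperparameter values, playing the role of a tuning grid
grid <- data.frame(degree = 1:4)

# Cross-validated RMSE for one candidate polynomial degree
cv_rmse <- function(degree) {
  fold_rmse <- sapply(seq_len(k), function(i) {
    train_rows <- dat[folds != i, ]
    test_rows  <- dat[folds == i, ]
    fit  <- lm(mpg ~ poly(hp, degree), data = train_rows)
    pred <- predict(fit, newdata = test_rows)
    sqrt(mean((test_rows$mpg - pred)^2))
  })
  mean(fold_rmse)
}

# Evaluate every grid point and keep the one with the lowest error
grid$rmse <- sapply(grid$degree, cv_rmse)
best <- grid[which.min(grid$rmse), ]

print(grid)
print(best)
```

caret's train() wraps the same loop (resample, fit per grid point, aggregate a metric, pick the best) behind one call and then refits the chosen setting on the training data.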
Recap
library(randomForest) varImpPlot()
library(MASS) stepAIC()
library(forecast) auto.arima()
library(caret) train()
Questions?
