9 Variable Selection
Variable selection is fundamental in statistical modeling. Many methods have been developed
to select the variables that significantly explain the response variable. Classical methods
include backward, forward, and their combination (so-called stepwise) selection, as well as
best subset selection. More recently, new approaches have been introduced, such as ridge
regression, the Least Absolute Shrinkage and Selection Operator (LASSO), and their
combination, the so-called Elastic Net. Another recent variable selection method is
Smoothly Clipped Absolute Deviation (SCAD). This chapter describes the application of
LASSO, Elastic Net, and SCAD in the linear regression model and in the logit model
described in Chapter 8.


9.1 LASSO

9.1.1 LASSO in linear regression model

The linear regression model has already been introduced in Chapter 3 and Chapter 7, and
in (3.50), as
$$y = X\beta + \varepsilon,$$
where $y$ $(n \times 1)$ is the vector of observations of the response variable, $X$ $(n \times p)$
is the data matrix of the $p$ explanatory variables, and $\varepsilon$ is the vector of errors.
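Suppose $E(y \mid X) = X\beta$ and $\beta = \{\beta_1, \ldots, \beta_p\}$; the LASSO estimate $\hat{\beta}$ is defined by
$$\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} \left( y_i - x_i^{\top}\beta \right)^2 \quad \text{s.t.} \quad \sum_{j=1}^{p} |\beta_j| \le t. \tag{9.1}$$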
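The constrained problem (9.1) is usually solved in its equivalent penalized (Lagrangian) form, in which a penalty weight replaces the bound $t$. The following is a minimal sketch of one common approach, cyclic coordinate descent with soft-thresholding, and not necessarily the algorithm used elsewhere in this book. The function names, the penalty weight `lam`, the factor 0.5 in front of the sum of squares, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(z, gamma):
    """Soft-thresholding operator: sign(z) * max(|z| - gamma, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise 0.5 * sum_i (y_i - x_i' b)^2 + lam * sum_j |b_j|.

    This is the penalised counterpart of (9.1); `lam` (hypothetical name)
    plays the role of the bound t, with larger lam meaning smaller t.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)          # x_j' x_j for every column j
    residual = y - X @ beta
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue                   # an all-zero column carries no information
            # Inner product of column j with the residual that excludes beta_j.
            rho = X[:, j] @ residual + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            residual += X[:, j] * (beta[j] - new_bj)   # keep residual = y - X beta
            beta[j] = new_bj
    return beta


# Simulated data (an assumption for illustration): only variables 1 and 4 matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
true_beta = np.array([3.0, 0.0, 0.0, -2.0, 0.0])
y = X @ true_beta + 0.1 * rng.normal(size=100)

beta_hat = lasso_coordinate_descent(X, y, lam=20.0)
t_implied = np.abs(beta_hat).sum()         # the bound t in (9.1) matched by this lam
print(beta_hat, t_implied)
```

The value `t_implied` makes the link back to (9.1) explicit: the penalized solution for a given `lam` also solves the constrained problem with $t$ equal to the sum of the absolute fitted coefficients.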



9.1.2 LASSO in logit model

As described in Chapter 8 and (8.19), the logit model (with intercept) for a binary response
is defined as
$$\log \frac{p(x_i)}{1 - p(x_i)} = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij},$$
and its log-likelihood function is
$$\log L(\beta_0, \beta) = \sum_{i=1}^{n} \left[ y_i \log p(x_i) + (1 - y_i) \log\{1 - p(x_i)\} \right]. \tag{9.2}$$
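For computation it can help to express (9.2) directly in terms of the linear predictor. The short derivation below is a sketch; the shorthand $\eta_i$ is introduced here for convenience and does not appear in the original text.

```latex
\begin{align*}
\eta_i &= \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij},
\qquad p(x_i) = \frac{e^{\eta_i}}{1 + e^{\eta_i}}, \\
\log L(\beta_0, \beta)
  &= \sum_{i=1}^{n} \left[ y_i \log \frac{p(x_i)}{1 - p(x_i)} + \log\{1 - p(x_i)\} \right] \\
  &= \sum_{i=1}^{n} \left[ y_i \eta_i - \log\left(1 + e^{\eta_i}\right) \right].
\end{align*}
```

The last line follows because the log-odds equal $\eta_i$ by the model definition and $1 - p(x_i) = 1 / (1 + e^{\eta_i})$.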

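The penalized log-likelihood for the logit model using LASSO is as follows:
$$\max_{\beta_0, \beta} \left\{ \frac{1}{n} \sum_{i=1}^{n} \ell_i(\beta_0, \beta) - \lambda \sum_{j=1}^{p} |\beta_j| \right\}, \tag{9.3}$$
where $\ell_i(\beta_0, \beta)$ denotes the contribution of observation $i$ to the log-likelihood (9.2) and $\lambda \ge 0$ is the penalty parameter.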
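One way to maximize a criterion of the form (9.3) numerically is proximal gradient ascent: take a gradient step on the averaged log-likelihood, then soft-threshold the slope coefficients. The sketch below is illustrative only and is not claimed to be the algorithm used by any particular software package. The function names, the step-size rule, the iteration count, and the simulated data are assumptions.

```python
import numpy as np


def soft_threshold(z, gamma):
    """Soft-thresholding operator: sign(z) * max(|z| - gamma, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)


def lasso_logit(X, y, lam, n_iter=5000):
    """Maximise (1/n) * log L(b0, b) - lam * sum_j |b_j| by proximal gradient.

    The intercept b0 is left unpenalised, as in (9.3).
    """
    n, p = X.shape
    beta0, beta = 0.0, np.zeros(p)
    # Step size 1/L: the curvature of the averaged negative log-likelihood
    # is bounded by ||[1, X]||_2^2 / (4 n), since p(1 - p) <= 1/4.
    design = np.column_stack([np.ones(n), X])
    step = 4.0 * n / np.linalg.norm(design, 2) ** 2
    for _ in range(n_iter):
        eta = beta0 + X @ beta
        prob = 0.5 * (1.0 + np.tanh(0.5 * eta))   # numerically stable logistic function
        grad0 = np.mean(prob - y)                 # d/d(beta0) of -(1/n) log L
        grad = X.T @ (prob - y) / n               # d/d(beta)  of -(1/n) log L
        beta0 -= step * grad0
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta0, beta


# Simulated binary data (an assumption for illustration).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
eta_true = -0.5 + X @ np.array([2.0, 0.0, 0.0, -1.5])
y = (rng.random(200) < 1.0 / (1.0 + np.exp(-eta_true))).astype(float)

b0_hat, beta_hat = lasso_logit(X, y, lam=0.05)
print(b0_hat, beta_hat)
```

Increasing `lam` shrinks more coefficients exactly to zero; for a sufficiently large value only the intercept remains, and it then equals the log-odds of the sample proportion of ones.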
