5th International Summer School
Achievements and Applications of Contemporary Informatics,
Mathematics and Physics
National University of Technology of the Ukraine
Kiev, Ukraine, August 3-15, 2010




PARAMETER ESTIMATION FOR SEMIPARAMETRIC MODELS
WITH CMARS AND ITS APPLICATIONS

Fatma YERLIKAYA-ÖZKURT
Institute of Applied Mathematics, METU, Ankara, Turkey

Gerhard-Wilhelm WEBER
Institute of Applied Mathematics, METU, Ankara, Turkey
Faculty of Economics, Business and Law, University of Siegen, Germany
Center for Research on Optimization and Control, University of Aveiro, Portugal
Universiti Teknologi Malaysia, Skudai, Malaysia

Pakize TAYLAN
Department of Mathematics, Dicle University, Diyarbakır, Turkey
Outline

• Introduction
• Estimation for Generalized Linear Model (GLM)
• Generalized Partial Linear Model (GPLM)
• Estimation for GPLM
  – Least-Squares Estimation with Tikhonov Regularization
  – CMARS Method
• Penalized Residual Sum of Squares (PRSS) for GLM with MARS
• Tikhonov Regularization for GLM with MARS
• An Alternative Solution for Tikhonov Regularization Problem with CQP
• Solution Methods
• Application
• Conclusion
Introduction

The class of Generalized Linear Models (GLMs) has gained popularity as a statistical modeling tool.

This popularity is due to:

• the flexibility of GLMs in addressing a variety of statistical problems,
• the availability of software (Stata, SAS, S-PLUS, R) to fit the models.

The class of GLMs extends traditional linear models by allowing:

• the mean of a dependent variable to depend on a linear predictor through a possibly nonlinear link function,
• the probability distribution of the response to be any member of an exponential family of distributions.

Many widely used statistical models belong to the GLM class:

o linear models with normal errors,
o logistic and probit models for binary data,
o log-linear models for multinomial data.
Introduction

Many other useful statistical models, e.g., with

• Poisson or binomial,
• Gamma or normal distributions,

can be formulated as GLMs by the selection of an appropriate link function and response probability distribution.

A GLM looks as follows:

$$\eta_i = H(\mu_i) = x_i^T\beta;$$

• $\mu_i = E(Y_i)$: expected value of the response variable $Y_i$,
• $H$: smooth monotonic link function,
• $x_i$: observed value of the explanatory variables for the i-th case,
• $\beta$: vector of unknown parameters.

A small illustrative fit follows.
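To make the setup concrete, here is a minimal sketch (not from the original slides) of fitting a Poisson GLM with a log link in Python, assuming the statsmodels package is available; all data and variable names are hypothetical.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))              # explanatory variables x_i
beta_true = np.array([0.5, -0.3, 0.2])
mu = np.exp(X @ beta_true)                 # log link: H(mu) = log(mu) = x_i^T beta
y = rng.poisson(mu)                        # Poisson response (exponential family)

model = sm.GLM(y, sm.add_constant(X), family=sm.families.Poisson())
result = model.fit()                       # maximum likelihood estimate of beta
print(result.params)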
Introduction

• Assumptions: the $Y_i$ are independent and can have any distribution from the exponential family, with density

$$Y_i \sim f_{Y_i}(y_i, \theta_i, \phi) = \exp\left( \frac{y_i\theta_i - b_i(\theta_i)}{a_i(\phi)} + c_i(y_i, \phi) \right) \quad (i = 1, 2, \dots, n).$$

• $a_i$, $b_i$, $c_i$ are arbitrary "scale" parameters, and $\theta_i$ is called a natural parameter.

• General expressions for the mean and variance of the dependent variable $Y_i$:

$$\mu_i = E(Y_i) = b_i'(\theta_i), \qquad \mathrm{Var}(Y_i) = V(\mu_i)\,\phi, \qquad V(\mu_i) = b_i''(\theta_i)/\omega_i, \quad a_i(\phi) := \phi/\omega_i.$$
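For instance (a standard identification, added here for illustration), the Poisson distribution with mean $\mu_i$ is a member of this family:

$$f_{Y_i}(y_i, \theta_i, \phi) = \frac{\mu_i^{y_i} e^{-\mu_i}}{y_i!} = \exp\big( y_i \log\mu_i - \mu_i - \log y_i! \big),$$

so that $\theta_i = \log\mu_i$, $b_i(\theta_i) = e^{\theta_i}$, $a_i(\phi) = 1$, and $c_i(y_i, \phi) = -\log y_i!$; indeed, $b_i'(\theta_i) = e^{\theta_i} = \mu_i$ recovers the mean formula above.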
Estimation for GLM

• Estimation and inference for GLMs are based on the theory of maximum likelihood estimation.

• Least-squares approach:

$$l(\beta) := \sum_{i=1}^{n} \big( y_i\theta_i - b_i(\theta_i) + c_i(y_i, \phi) \big).$$

• The dependence of the right-hand side on $\beta$ is solely through the dependence of the $\theta_i$ on $\beta$.
Generalized Partial Linear Models (GPLMs)

• Particular semiparametric models are the Generalized Partial Linear Models (GPLMs):

They extend the GLMs in that the usual parametric terms are augmented by a single nonparametric component:

$$E(Y \mid X, T) = G\big( X^T\beta + \gamma(T) \big);$$

• $\beta$ is an m-vector of parameters, and $\gamma(\cdot)$ is a smooth function, which we try to estimate by CMARS.

• Assumption: the m-dimensional random vector $X$ represents (typically discrete) covariates, and the q-dimensional random vector $T$ consists of continuous covariates; this comes from a decomposition of the explanatory variables.

Other interpretations of $T$: role of the environment, expert opinions, Wiener processes, etc.
Estimation for GPLM

• There are different kinds of estimation methods for GPLMs.
• Generally, the estimation methods for the model

$$E(Y \mid X, T) = G\big( X^T\beta + \gamma(T) \big)$$

are based on kernel methods, with test procedures on the correct specification of this model.

• Now, we concentrate on special types of GPLM estimation based on

------- the newly developed data mining method CMARS,

and

------- least-squares estimation with Tikhonov regularization.
Least-Squares Estimation with Tikhonov Regularization

• The general model

$$E(Y \mid X, T) = G\big( X^T\beta + \gamma(T) \big)$$

can be considered as a semiparametric generalized linear model and can be written as follows:

$$H(\mu) = \eta(X, T) = X^T\beta + \gamma(T) = \sum_{j=1}^{m} X_j\beta_j + \gamma(T),$$

with observation values $y_i, x_i, t_i$ $(i = 1, 2, \dots, n)$,

$$\mu_i = G(\eta_i) \quad \text{and} \quad \eta_i = H(\mu_i) = x_i^T\beta + \gamma(t_i),$$

where $\gamma(\cdot)$ is a smooth function.

• For the estimation of the parametric part, we apply linear least squares with Tikhonov regularization.
The Least-Squares Estimation with Tikhonov Regularization

The process is as follows:

Firstly, we apply linear least squares on the given data to find a vector $\beta_{\mathrm{preproc}}$:

$$(*) \qquad Y = X^T\beta_{\mathrm{preproc}} + \varepsilon.$$

Equivalently, the model form is

$$y = \beta_0 + \sum_{j=1}^{m} X_j\beta_j + \varepsilon.$$

The method of least squares is used for estimating the regression coefficients,

$$\beta_{\mathrm{preproc}} = (\beta_0, \beta_1, \beta_2, \dots, \beta_m)^T \quad \text{in} \quad y = \beta_0 + \sum_{j=1}^{m} X_j\beta_j + \varepsilon,$$

to minimize the residual sum of squares (RSS). A small sketch of this step follows.
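A minimal numpy sketch of this preprocessing step (an illustration, not the authors' code):

import numpy as np

def fit_preproc(X, y):
    # Estimate beta_preproc = (beta_0, beta_1, ..., beta_m)^T by ordinary least
    # squares, minimizing the residual sum of squares ||y - X1 beta||_2^2.
    X1 = np.column_stack([np.ones(len(y)), X])   # prepend intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta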
The Least-Squares Estimation with Tikhonov Regularization

• Tikhonov regularization yields an approximate solution of equation (*) by minimizing the quadratic functional

$$(**) \qquad \min_{\beta_{\mathrm{preproc}}} \; \big\| y - X^T\beta_{\mathrm{preproc}} \big\|_2^2 + \lambda \big\| L\beta_{\mathrm{preproc}} \big\|_2^2,$$

where $\lambda$ is a regularization parameter trading off the first and the second part.

The terms $y$ and $\beta_{\mathrm{preproc}}$ represent the response vector and the unknown coefficients.
• The latter are obtained by solving the Tikhonov regularization problem (**).

• Generally, Tikhonov regularization involves higher-order regularization terms, which can be solved using the generalized singular value decomposition (GSVD).
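For the standard form of (**), a minimal numpy sketch (illustrative only; for higher-order L, GSVD-based solvers such as those in the Regularization Toolbox are preferable):

import numpy as np

def tikhonov(X, y, L, lam):
    # Minimize ||y - X beta||_2^2 + lam * ||L beta||_2^2 via the normal equations.
    A = X.T @ X + lam * (L.T @ L)
    return np.linalg.solve(A, X.T @ y)

# Example (ridge regression): L = identity,
# beta_preproc = tikhonov(X1, y, np.eye(X1.shape[1]), lam=1.0)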
The Least-Squares Estimation with Tikhonov Regularization

• After getting the regression coefficients, we subtract the linear least-squares model (without intercept) from the corresponding responses:

$$y - \sum_{j=1}^{m} X_j\beta_j =: \hat{y}.$$

• Doing this at the input data, the resulting values $\hat{y}$ are our new responses.

• Then, based on these new data, we find the knots for the nonparametric part with MARS.

• Again consider the model:

$$\eta_i = H(\mu_i) = x_i^T\beta + \gamma(t_i),$$

where $\gamma(\cdot)$ is a smooth function which we try to estimate by CMARS, an alternative technique to the well-known data mining tool multivariate adaptive regression splines (MARS).
CMARS Method

• What is MARS?

• Multivariate adaptive regression splines (MARS) was developed in 1991 by Jerome Friedman.

• MARS builds flexible models by introducing piecewise linear regressions.

• MARS uses expansions in piecewise linear basis functions of the form

$$c^+(x, \tau) = [+(x - \tau)]_+, \qquad c^-(x, \tau) = [-(x - \tau)]_+, \qquad [q]_+ := \max\{0, q\}.$$

• Set of basis functions:

$$\wp := \big\{ (X_j - \tau)_+, \; (\tau - X_j)_+ \;\big|\; \tau \in \{x_{1,j}, x_{2,j}, \dots, x_{N,j}\}, \; j = 1, 2, \dots, p \big\}.$$
[Figure: the pair of hinge functions $c^-(x, \tau) = [-(x - \tau)]_+$ and $c^+(x, \tau) = [+(x - \tau)]_+$ meeting at the knot $\tau$ — the basic elements in the regression with MARS.]
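These hinge functions are straightforward to write in code; the following is an illustrative numpy sketch:

import numpy as np

def c_plus(x, tau):
    # c+(x, tau) = [ +(x - tau) ]_+
    return np.maximum(x - tau, 0.0)

def c_minus(x, tau):
    # c-(x, tau) = [ -(x - tau) ]_+
    return np.maximum(tau - x, 0.0)

# Every observed value x_{i,j} of the j-th variable may serve as a knot tau.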
CMARS Method

• Thus, we can represent $\gamma(t_i)$ by a linear combination, successively built up from basis functions and the intercept $\theta_0$, such that

$$\eta_i = H(\mu_i) = x_i^T\beta + \theta_0 + \sum_{m=1}^{M} \theta_m \psi_m(t_i).$$

Here, the $\psi_m$ $(m = 1, 2, \dots, M)$ are basis functions from $\wp$ or products of two or more such functions; interaction basis functions are created by multiplying an existing basis function with a truncated linear function involving a new variable. The $\theta_m$ are the unknown coefficients for the m-th basis function $(m = 1, 2, \dots, M)$ or for the constant 1 $(m = 0)$.

• Provided the observations are represented by the data $t_i$ $(i = 1, 2, \dots, N)$:

$$\psi_m(t) := \prod_{j=1}^{K_m} \big[ s_j^m \cdot (t_{\kappa_j^m} - \tau_j^m) \big]_+,$$

where $K_m$ is the number of truncated linear functions multiplied in the m-th basis function, $t_{\kappa_j^m}$ is the input variable corresponding to the j-th truncated linear function, $\tau_j^m$ is the corresponding knot, and $s_j^m \in \{\pm 1\}$.
CMARS Method

The MARS algorithm for estimating the model function consists of two sub-algorithms:

I. Forward stepwise algorithm:

• Search for the basis functions.
• At each step, the split that minimizes some "lack of fit" criterion, among all the possible splits on each basis function, is chosen.
• The process stops when a user-specified value $M_{\max}$ is reached. At the end of this process we have a large expression in $Y$.
• This model typically overfits the data, so a backward deletion procedure is applied.

II. Backward stepwise algorithm:

• Prevents over-fitting by decreasing the complexity of the model without degrading the fit to the data.

• Proposition: We do not employ the backward stepwise algorithm to estimate the function.

In its place, as an alternative, we propose to use penalty terms in addition to the least-squares estimation in order to control the lack of fit from the viewpoint of the complexity of the estimation.
The Penalized Residual Sum of Squares for GLM with MARS

• Let us consider the equation

$$\eta_i = H(\mu_i) = x_i^T\beta + \gamma(t_i),$$

where $\theta = (\theta_0, \theta_1, \dots, \theta_M)^T$ collects the unknown coefficients and $\gamma(t_i) = \theta_0 + \sum_{m=1}^{M} \theta_m \psi_m(t_i)$. Let us use the penalized residual sum of squares with $M_{\max}$ basis functions having been accumulated in the forward stepwise algorithm. For the GLM model with MARS, PRSS has the following form:

$$\mathrm{PRSS} := \sum_{i=1}^{N} \big( \eta_i - x_i^T\beta - \gamma(t_i) \big)^2 + \sum_{m=1}^{M_{\max}} \lambda_m \sum_{\substack{|\boldsymbol{\alpha}|=1,2 \\ \boldsymbol{\alpha}=(\alpha_1,\alpha_2)^T}} \; \sum_{\substack{r<s \\ r,s \in V(m)}} \int \theta_m^2 \big[ D_{r,s}^{\boldsymbol{\alpha}} \psi_m(\boldsymbol{t}^m) \big]^2 \, d\boldsymbol{t}^m,$$

where

$$V(m) := \{ \kappa_j^m \mid j = 1, 2, \dots, K_m \}, \qquad \boldsymbol{t}^m := (t_{m_1}, t_{m_2}, \dots, t_{m_{K_m}})^T,$$

$$|\boldsymbol{\alpha}| := \alpha_1 + \alpha_2 \;\; \text{with } \alpha_1, \alpha_2 \in \{0, 1\}, \qquad D_{r,s}^{\boldsymbol{\alpha}} \psi_m(\boldsymbol{t}^m) := \frac{\partial^{|\boldsymbol{\alpha}|} \psi_m}{\partial^{\alpha_1} t_r^m \, \partial^{\alpha_2} t_s^m} (\boldsymbol{t}^m).$$
The Penalized Residual Sum of Squares for GLM with MARS

• Our optimization problem is based on the tradeoff between both accuracy, i.e., a small sum of error squares, and not too high a complexity.

• This tradeoff is established through the penalty parameters $\lambda_m$.

• Let us approximate PRSS by discretizing the high-dimensional integrals. Then, PRSS becomes

$$\mathrm{PRSS} \approx \sum_{i=1}^{N} \big( \eta_i - x_i^T\beta - \theta^T\psi(d_i) \big)^2 + \sum_{m=1}^{M_{\max}} \lambda_m \theta_m^2 \sum_{i=1}^{(N+1)^{K_m}} \; \sum_{\substack{|\boldsymbol{\alpha}|=1,2 \\ r<s, \; r,s \in V(m)}} \big[ D_{r,s}^{\boldsymbol{\alpha}} \psi_m(\hat{\boldsymbol{t}}_i^m) \big]^2 \, \Delta\hat{\boldsymbol{t}}_i^m,$$

where

$$\psi(d_i) = \big( 1, \psi_1(t_i^1), \dots, \psi_M(t_i^M), \psi_{M+1}(t_i^{M+1}), \dots, \psi_{M_{\max}}(t_i^{M_{\max}}) \big)^T, \qquad d_i := (t_i^1, t_i^2, \dots, t_i^M, t_i^{M+1}, \dots, t_i^{M_{\max}})^T,$$

and the evaluation points

$$\hat{\boldsymbol{t}}^m := \big( \tilde{t}_{l_1, m_1}, \tilde{t}_{l_2, m_2}, \dots, \tilde{t}_{l_{K_m}, m_{K_m}} \big)^T, \qquad (l_j)_{j=1,2,\dots,K_m} \in \{0, 1, 2, \dots, N+1\}^{K_m},$$

run over a grid built from the input values, with volume elements

$$\Delta\hat{\boldsymbol{t}}_i^m := \prod_{j=1}^{K_m} \big( \tilde{t}_{l_j+1, m_j} - \tilde{t}_{l_j, m_j} \big).$$
Tikhonov Regularization for GLM with MARS

• For a short representation, we can rewrite the approximate relation as

$$\mathrm{PRSS} \approx \big\| \eta - X\beta - \psi(d)\theta \big\|_2^2 + \sum_{m=1}^{M_{\max}} \lambda_m \sum_{i=1}^{(N+1)^{K_m}} L_{im}^2 \theta_m^2,$$

where $\psi(d) = \big( \psi(d_1), \dots, \psi(d_N) \big)^T$ and

$$L_{im} := \Bigg( \sum_{\substack{|\boldsymbol{\alpha}|=1,2 \\ r<s, \; r,s \in V(m)}} \big[ D_{r,s}^{\boldsymbol{\alpha}} \psi_m(\hat{\boldsymbol{t}}_i^m) \big]^2 \, \Delta\hat{\boldsymbol{t}}_i^m \Bigg)^{1/2}.$$

• We can write PRSS as

$$\mathrm{PRSS} \approx \big\| \eta - X^*\theta^* \big\|_2^2 + \sum_{m=1}^{M_{\max}} \lambda_m \sum_{i=1}^{(N+1)^{K_m}} L_{im}^2 \theta_m^2,$$

where $X^* = \big( X \;\; \psi(d) \big)$ is a block matrix constructed from the $(N \times p)$-matrix $X$ and the $(N \times (M_{\max}+1))$-matrix $\psi(d)$, and $\theta^* = (\beta^T, \theta^T)^T$ is the vector constructed from the $\beta$ and $\theta$ vectors.

• Then, we deal with the linear system of equations $X^*\theta^* \approx \eta$, approximately. A sketch of the assembly follows.
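A minimal numpy sketch of this assembly (illustrative; psi_d denotes the hypothetical N x (M_max + 1) matrix of basis-function values):

import numpy as np

def assemble(X, psi_d):
    # X* = (X  psi(d)); theta* stacks the p entries of beta and the entries of theta.
    return np.hstack([X, psi_d])

# X_star = assemble(X, psi_d)
# theta_star, *_ = np.linalg.lstsq(X_star, eta, rcond=None)   # unregularized solve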
Tikhonov Regularization for GLM with MARS

• We approach our problem PRSS as a Tikhonov regularization problem by using the same penalty parameter $\lambda_m =: \phi^2$ for each derivative term:

$$\mathrm{PRSS} \approx \big\| \eta - X^*\theta^* \big\|_2^2 + \phi^2 \big\| L^*\theta^* \big\|_2^2.$$

Here, the high-dimensional matrix $L^* = (R^* \;\; L)$, where $R^*$ is an $((M_{\max}+1) \times p)$-matrix with entries being first or second derivatives of $\psi$.

• We can easily note that our Tikhonov regularization problem has multiple objective functions, through a linear combination of $\| \eta - X^*\theta^* \|_2^2$ and $\| L^*\theta^* \|_2^2$. We select the solution such that it minimizes both the first objective $\| \eta - X^*\theta^* \|_2^2$ and the second objective $\| L^*\theta^* \|_2^2$ in the sense of a compromise (tradeoff) solution.
An Alternative Solution for Tikhonov Regularization Problem with CQP

• We can solve the Tikhonov regularization problem for MARS by continuous optimization techniques, especially conic quadratic programming (CQP).

• We formulate PRSS as a CQP problem:

$$(I^*) \qquad \min_{z, \theta^*} \; z, \qquad \text{subject to} \quad \big\| X^*\theta^* - \eta \big\|_2 \le z, \quad \big\| L^*\theta^* \big\|_2 \le \sqrt{M}.$$

In general:

$$\min_x \; c^T x, \qquad \text{subject to} \quad \big\| D_i x - d_i \big\|_2 \le p_i^T x - q_i \quad (i = 1, 2, \dots, k). \qquad (\mathrm{CQP})$$

Here,

$$c = (1, 0^T_{M_{\max}+1+p})^T, \quad u = (z, \theta^{*T})^T, \quad D_1 = (0_N, X^*), \quad d_1 = \eta, \quad p_1 = (1, 0, \dots, 0)^T, \quad q_1 = 0,$$

$$D_2 = (0_{M_{\max}+1}, L^*), \quad d_2 = 0_{M_{\max}+1}, \quad p_2 = 0_{M_{\max}+p+2}, \quad q_2 = -\sqrt{M}.$$

A sketch of this conic program in code follows.
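A minimal sketch of $(I^*)$ with the cvxpy modeling package (illustrative; MOSEK, used in the application section, is one possible SOCP solver, and all names are hypothetical):

import cvxpy as cp

def solve_cqp(X_star, eta, L_star, sqrt_M):
    theta_star = cp.Variable(X_star.shape[1])
    z = cp.Variable()
    constraints = [
        cp.norm(X_star @ theta_star - eta, 2) <= z,       # accuracy cone
        cp.norm(L_star @ theta_star, 2) <= sqrt_M,        # complexity cone
    ]
    prob = cp.Problem(cp.Minimize(z), constraints)
    prob.solve()                      # e.g., prob.solve(solver=cp.MOSEK)
    return theta_star.value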
An Alternative Solution for Tikhonov Regularization Problem with CQP

• We first reformulate $(I^*)$ as a primal problem:

$$\min_{z, \theta^*} \; z,$$

such that

$$\begin{pmatrix} 0_N & X^* \\ 1 & 0^T_{M_{\max}+1+p} \end{pmatrix} \begin{pmatrix} z \\ \theta^* \end{pmatrix} + \begin{pmatrix} -\eta \\ 0 \end{pmatrix} \in L^{N+1}, \qquad \begin{pmatrix} 0_{M_{\max}+1} & L^* \\ 0 & 0^T_{M_{\max}+1+p} \end{pmatrix} \begin{pmatrix} z \\ \theta^* \end{pmatrix} + \begin{pmatrix} 0_{M_{\max}+1} \\ \sqrt{M} \end{pmatrix} \in L^{M_{\max}+2},$$

with ice-cream (or second-order, or Lorentz) cones:

$$L^{N+1} := \Big\{ x = (x_1, x_2, \dots, x_{N+1})^T \in \mathbb{R}^{N+1} \;\Big|\; x_{N+1} \ge \sqrt{x_1^2 + x_2^2 + \dots + x_N^2} \Big\} \quad (N \ge 1).$$
An Alternative Solution for Tikhonov Regularization Problem with CQP

• The corresponding dual problem is

$$\max \; (\eta^T, 0)\,\omega_1 + \big( 0^T_{M_{\max}+1}, -\sqrt{M} \big)\,\omega_2,$$

such that

$$\begin{pmatrix} 0_N^T & 1 \\ X^{*T} & 0_{M_{\max}+1+p} \end{pmatrix} \omega_1 + \begin{pmatrix} 0^T_{M_{\max}+1} & 0 \\ L^{*T} & 0_{M_{\max}+1+p} \end{pmatrix} \omega_2 = \begin{pmatrix} 1 \\ 0_{M_{\max}+1+p} \end{pmatrix},$$

$$\omega_1 \in L^{N+1}, \quad \omega_2 \in L^{M_{\max}+2}.$$
Solution Methods

• CQPs belong to the well-structured convex problems.

• Interior point algorithms:

  – exploit the structure of the problem,
  – yield better complexity bounds,
  – exhibit much better practical performance.
Application

• GPLMs with CMARS and the parameter estimation for them have been presented and investigated in detail. Now, a numerical example for this study will be given.

• Two data sets are used in our applications:

  • Concrete Compressive Strength Data Set
  • Concrete Slump Test Data Set

• The data sets are obtained from the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/).
Application

• Salford Systems software is used for the MARS application; for CMARS, code is written in MATLAB, and the MOSEK software is used to solve the CQP problem within CMARS. For Tikhonov regularization, the Regularization Toolbox in MATLAB is used.

• All test data sets are also compared according to performance measures such as Root Mean Square Error (RMSE), correlation coefficient (r), R², and adjusted R²; a small sketch of these measures follows.
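A minimal numpy sketch of these measures, assuming hypothetical arrays y (observed) and y_hat (fitted) and q explanatory variables:

import numpy as np

def performance(y, y_hat, q):
    n = len(y)
    rmse = np.sqrt(np.mean((y - y_hat) ** 2))            # Root Mean Square Error
    r = np.corrcoef(y, y_hat)[0, 1]                      # correlation coefficient
    r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - np.mean(y)) ** 2)
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - q - 1)        # adjusted R^2
    return rmse, r, r2, adj_r2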
Application

• To compare the performances of the Tikhonov regularization, CMARS, and GPLM models, let us look at the performance measure values for both data sets.

[Table: performance measure values of the models for both data sets, arranged from worse to better.]

Evaluation of the models based on the performance values:

• CMARS performs better than Tikhonov regularization with respect to all the measures for both data sets.

• On the other hand, GLM with CMARS (GPLM) performs better than both Tikhonov regularization and CMARS with respect to all the measures for both data sets.
Outlook

• Important new class of GPLMs: having written the Tikhonov regularization task for GLM using MARS as a CQP problem, we will call it CGLMARS:

$$E(Y \mid X, T) = G\big( X^T\beta + \gamma(T) \big), \quad \text{e.g.,}$$

$$\mathrm{GPLM}(X, T) \;=\; \mathrm{LM}(X) \;+\; \mathrm{MARS}(T).$$
References


[1] Aster, A., Borchers, B., and Thurber, C., Parameter Estimation and Inverse Problems, Academic
     Press, 2004.
[2] Craven, P., and Wahba, G., Smoothing noisy data with spline functions, Numer. Math. 31 (1979), 377-403.
[3] De Boor, C., Practical Guide to Splines, Springer Verlag, 2001.
[4] Dongarra, J.J., Bunch, J.R., Moler, C.B., and Stewart, G.W., Linpack User’s Guide, Philadelphia,
     SIAM, 1979.
[5] Friedman, J.H., Multivariate adaptive regression splines, (1991), The Annals of Statistics
    19, 1, 1-141.
[6] Green, P.J., and Yandell, B.S., Semi-Parametric Generalized Linear Models, Lecture Notes in
     Statistics, 32 (1985).
[7] Hastie, T.J., and Tibshirani, R.J., Generalized Additive Models, New York, Chapman and Hall, 1990.
[8] Kincaid, D., and Cheney, W., Numerical Analysis: Mathematics of Scientific computing, Pacific
     Grove, 2002.
[9] Müller, M., Estimation and testing in generalized partial linear models – a comparative study, Statistics
     and Computing 11 (2001), 299-309.
[10] Nelder, J.A., and Wedderburn, R.W.M., Generalized linear models, Journal of the Royal Statistical
     Society A 135 (1972), 370-384.
[11] Nemirovski, A., Lectures on modern convex optimization, Israel Institute of Technology
     http://iew3.technion.ac.il/Labs/Opt/opt/LN/Final.pdf.
References


[12] Nesterov, Y.E , and Nemirovskii, A.S., Interior Point Methods in Convex Programming,
     SIAM, 1993.
[13] Ortega, J.M., and Rheinboldt, W.C., Iterative Solution of Nonlinear Equations in Several
     Variables, Academic Press, New York, 1970.
[14] Renegar, J., Mathematical View of Interior Point Methods in Convex Programming, SIAM,
     2000.
[15] Scheid, F., Numerical Analysis, McGraw-Hill Book Company, New York, 1968.
[16] Taylan, P., Weber, G.-W., and Beck, A., New approaches to regression by generalized
     additive and continuous optimization for modern applications in finance, science and
     technology, Optimization 56, 5-6 (2007), pp. 1-24.
[17] Taylan, P., Weber, G.-W., and Liu, L., On foundations of parameter estimation for
     generalized partial linear models with B-splines and continuous optimization, in the
     proceedings of PCO 2010, 3rd Global Conference on Power Control and Optimization,
     February 2-4, 2010, Gold Coast, Queensland, Australia.
[18] Weber, G.-W., Akteke-Öztürk, B., İşcanoğlu, A., Özöğür, S., and Taylan, P., Data Mining:
     Clustering, Classification and Regression, four lectures given at the Graduate Summer
     School on New Advances in Statistics, Middle East Technical University, Ankara, Turkey,
     August 11-24, 2007 (http://www.statsummer.com/).
[19] Wood, S.N., Generalized Additive Models, An Introduction with R, New York, Chapman
     and Hall, 2006.
Thank you very much for your attention!

More Related Content

What's hot

Lesson 14: Derivatives of Logarithmic and Exponential Functions
Lesson 14: Derivatives of Logarithmic and Exponential FunctionsLesson 14: Derivatives of Logarithmic and Exponential Functions
Lesson 14: Derivatives of Logarithmic and Exponential Functions
Matthew Leingang
 
Sienna 3 bruteforce
Sienna 3 bruteforceSienna 3 bruteforce
Sienna 3 bruteforce
chidabdu
 
Matrix factorization
Matrix factorizationMatrix factorization
Matrix factorization
rubyyc
 
Chapter 3 projection
Chapter 3 projectionChapter 3 projection
Chapter 3 projection
NBER
 

What's hot (20)

Tro07 sparse-solutions-talk
Tro07 sparse-solutions-talkTro07 sparse-solutions-talk
Tro07 sparse-solutions-talk
 
ABC and empirical likelihood
ABC and empirical likelihoodABC and empirical likelihood
ABC and empirical likelihood
 
MCMSki III (poster)
MCMSki III (poster)MCMSki III (poster)
MCMSki III (poster)
 
Monte-Carlo method for Two-Stage SLP
Monte-Carlo method for Two-Stage SLPMonte-Carlo method for Two-Stage SLP
Monte-Carlo method for Two-Stage SLP
 
Lesson 14: Derivatives of Logarithmic and Exponential Functions
Lesson 14: Derivatives of Logarithmic and Exponential FunctionsLesson 14: Derivatives of Logarithmic and Exponential Functions
Lesson 14: Derivatives of Logarithmic and Exponential Functions
 
Monash University short course, part II
Monash University short course, part IIMonash University short course, part II
Monash University short course, part II
 
Stability of adaptive random-walk Metropolis algorithms
Stability of adaptive random-walk Metropolis algorithmsStability of adaptive random-walk Metropolis algorithms
Stability of adaptive random-walk Metropolis algorithms
 
Sienna 3 bruteforce
Sienna 3 bruteforceSienna 3 bruteforce
Sienna 3 bruteforce
 
А.Н. Ширяев "Обзор современных задач об оптимальной обстановке"
А.Н. Ширяев "Обзор современных задач об оптимальной обстановке"А.Н. Ширяев "Обзор современных задач об оптимальной обстановке"
А.Н. Ширяев "Обзор современных задач об оптимальной обстановке"
 
Curve fitting
Curve fittingCurve fitting
Curve fitting
 
Theory of Relations (2)
Theory of Relations (2)Theory of Relations (2)
Theory of Relations (2)
 
R180304110115
R180304110115R180304110115
R180304110115
 
Statistical inference for agent-based SIS and SIR models
Statistical inference for agent-based SIS and SIR modelsStatistical inference for agent-based SIS and SIR models
Statistical inference for agent-based SIS and SIR models
 
Matrix factorization
Matrix factorizationMatrix factorization
Matrix factorization
 
Mixture Models for Image Analysis
Mixture Models for Image AnalysisMixture Models for Image Analysis
Mixture Models for Image Analysis
 
ABC in Venezia
ABC in VeneziaABC in Venezia
ABC in Venezia
 
Desm s mathematics
Desm s mathematicsDesm s mathematics
Desm s mathematics
 
Sequential Monte Carlo algorithms for agent-based models of disease transmission
Sequential Monte Carlo algorithms for agent-based models of disease transmissionSequential Monte Carlo algorithms for agent-based models of disease transmission
Sequential Monte Carlo algorithms for agent-based models of disease transmission
 
Chapter 3 projection
Chapter 3 projectionChapter 3 projection
Chapter 3 projection
 
Optimization tutorial
Optimization tutorialOptimization tutorial
Optimization tutorial
 

Viewers also liked (6)

Predicting Real-valued Outputs: An introduction to regression
Predicting Real-valued Outputs: An introduction to regressionPredicting Real-valued Outputs: An introduction to regression
Predicting Real-valued Outputs: An introduction to regression
 
Evolution of regression ols to gps to mars
Evolution of regression   ols to gps to marsEvolution of regression   ols to gps to mars
Evolution of regression ols to gps to mars
 
Presentation Machine Learning
Presentation Machine LearningPresentation Machine Learning
Presentation Machine Learning
 
Introduction to MARS (1999)
Introduction to MARS (1999)Introduction to MARS (1999)
Introduction to MARS (1999)
 
Tokyowebmining41
Tokyowebmining41Tokyowebmining41
Tokyowebmining41
 
Data Science - Part XV - MARS, Logistic Regression, & Survival Analysis
Data Science -  Part XV - MARS, Logistic Regression, & Survival AnalysisData Science -  Part XV - MARS, Logistic Regression, & Survival Analysis
Data Science - Part XV - MARS, Logistic Regression, & Survival Analysis
 

Similar to Parameter Estimation for Semiparametric Models with CMARS and Its Applications

IGARSS2011 FR3.T08.3 BenDavid.pdf
IGARSS2011 FR3.T08.3 BenDavid.pdfIGARSS2011 FR3.T08.3 BenDavid.pdf
IGARSS2011 FR3.T08.3 BenDavid.pdf
grssieee
 
Graph theoretic approach to solve measurement placement problem for power system
Graph theoretic approach to solve measurement placement problem for power systemGraph theoretic approach to solve measurement placement problem for power system
Graph theoretic approach to solve measurement placement problem for power system
IAEME Publication
 
Presentation cm2011
Presentation cm2011Presentation cm2011
Presentation cm2011
antigonon
 
Presentation cm2011
Presentation cm2011Presentation cm2011
Presentation cm2011
antigonon
 
Dimension Reduction Introduction & PCA.pptx
Dimension Reduction Introduction & PCA.pptxDimension Reduction Introduction & PCA.pptx
Dimension Reduction Introduction & PCA.pptx
RohanBorgalli
 

Similar to Parameter Estimation for Semiparametric Models with CMARS and Its Applications (20)

IGARSS2011 FR3.T08.3 BenDavid.pdf
IGARSS2011 FR3.T08.3 BenDavid.pdfIGARSS2011 FR3.T08.3 BenDavid.pdf
IGARSS2011 FR3.T08.3 BenDavid.pdf
 
Session II - Estimation methods and accuracy Li-Chun Zhang Discussion: Sess...
Session II - Estimation methods and accuracy   Li-Chun Zhang Discussion: Sess...Session II - Estimation methods and accuracy   Li-Chun Zhang Discussion: Sess...
Session II - Estimation methods and accuracy Li-Chun Zhang Discussion: Sess...
 
IST module 1
IST module 1IST module 1
IST module 1
 
Random Matrix Theory and Machine Learning - Part 4
Random Matrix Theory and Machine Learning - Part 4Random Matrix Theory and Machine Learning - Part 4
Random Matrix Theory and Machine Learning - Part 4
 
machine learning.pptx
machine learning.pptxmachine learning.pptx
machine learning.pptx
 
Principle of Maximum Entropy
Principle of Maximum EntropyPrinciple of Maximum Entropy
Principle of Maximum Entropy
 
The International Journal of Engineering and Science (The IJES)
The International Journal of Engineering and Science (The IJES)The International Journal of Engineering and Science (The IJES)
The International Journal of Engineering and Science (The IJES)
 
Degree presentation: Indirect Inference Applied to Financial Econometrics
Degree presentation: Indirect Inference Applied to Financial EconometricsDegree presentation: Indirect Inference Applied to Financial Econometrics
Degree presentation: Indirect Inference Applied to Financial Econometrics
 
Graph theoretic approach to solve measurement placement problem for power system
Graph theoretic approach to solve measurement placement problem for power systemGraph theoretic approach to solve measurement placement problem for power system
Graph theoretic approach to solve measurement placement problem for power system
 
Unbiased Markov chain Monte Carlo
Unbiased Markov chain Monte CarloUnbiased Markov chain Monte Carlo
Unbiased Markov chain Monte Carlo
 
2008 JSM - Meta Study Data vs Patient Data
2008 JSM - Meta Study Data vs Patient Data2008 JSM - Meta Study Data vs Patient Data
2008 JSM - Meta Study Data vs Patient Data
 
A bit about мcmc
A bit about мcmcA bit about мcmc
A bit about мcmc
 
Varaiational formulation fem
Varaiational formulation fem Varaiational formulation fem
Varaiational formulation fem
 
Mk slides.ppt
Mk slides.pptMk slides.ppt
Mk slides.ppt
 
Logistic Regression.pptx
Logistic Regression.pptxLogistic Regression.pptx
Logistic Regression.pptx
 
An overview of Hidden Markov Models (HMM)
An overview of Hidden Markov Models (HMM)An overview of Hidden Markov Models (HMM)
An overview of Hidden Markov Models (HMM)
 
Presentation cm2011
Presentation cm2011Presentation cm2011
Presentation cm2011
 
Presentation cm2011
Presentation cm2011Presentation cm2011
Presentation cm2011
 
Dimension Reduction Introduction & PCA.pptx
Dimension Reduction Introduction & PCA.pptxDimension Reduction Introduction & PCA.pptx
Dimension Reduction Introduction & PCA.pptx
 
Data Structures - Lecture 1 [introduction]
Data Structures - Lecture 1 [introduction]Data Structures - Lecture 1 [introduction]
Data Structures - Lecture 1 [introduction]
 

More from SSA KPI

Germany presentation
Germany presentationGermany presentation
Germany presentation
SSA KPI
 
Grand challenges in energy
Grand challenges in energyGrand challenges in energy
Grand challenges in energy
SSA KPI
 
Engineering role in sustainability
Engineering role in sustainabilityEngineering role in sustainability
Engineering role in sustainability
SSA KPI
 
Consensus and interaction on a long term strategy for sustainable development
Consensus and interaction on a long term strategy for sustainable developmentConsensus and interaction on a long term strategy for sustainable development
Consensus and interaction on a long term strategy for sustainable development
SSA KPI
 
Competences in sustainability in engineering education
Competences in sustainability in engineering educationCompetences in sustainability in engineering education
Competences in sustainability in engineering education
SSA KPI
 
Introducatio SD for enginers
Introducatio SD for enginersIntroducatio SD for enginers
Introducatio SD for enginers
SSA KPI
 

More from SSA KPI (20)

Germany presentation
Germany presentationGermany presentation
Germany presentation
 
Grand challenges in energy
Grand challenges in energyGrand challenges in energy
Grand challenges in energy
 
Engineering role in sustainability
Engineering role in sustainabilityEngineering role in sustainability
Engineering role in sustainability
 
Consensus and interaction on a long term strategy for sustainable development
Consensus and interaction on a long term strategy for sustainable developmentConsensus and interaction on a long term strategy for sustainable development
Consensus and interaction on a long term strategy for sustainable development
 
Competences in sustainability in engineering education
Competences in sustainability in engineering educationCompetences in sustainability in engineering education
Competences in sustainability in engineering education
 
Introducatio SD for enginers
Introducatio SD for enginersIntroducatio SD for enginers
Introducatio SD for enginers
 
DAAD-10.11.2011
DAAD-10.11.2011DAAD-10.11.2011
DAAD-10.11.2011
 
Talking with money
Talking with moneyTalking with money
Talking with money
 
'Green' startup investment
'Green' startup investment'Green' startup investment
'Green' startup investment
 
From Huygens odd sympathy to the energy Huygens' extraction from the sea waves
From Huygens odd sympathy to the energy Huygens' extraction from the sea wavesFrom Huygens odd sympathy to the energy Huygens' extraction from the sea waves
From Huygens odd sympathy to the energy Huygens' extraction from the sea waves
 
Dynamics of dice games
Dynamics of dice gamesDynamics of dice games
Dynamics of dice games
 
Energy Security Costs
Energy Security CostsEnergy Security Costs
Energy Security Costs
 
Naturally Occurring Radioactivity (NOR) in natural and anthropic environments
Naturally Occurring Radioactivity (NOR) in natural and anthropic environmentsNaturally Occurring Radioactivity (NOR) in natural and anthropic environments
Naturally Occurring Radioactivity (NOR) in natural and anthropic environments
 
Advanced energy technology for sustainable development. Part 5
Advanced energy technology for sustainable development. Part 5Advanced energy technology for sustainable development. Part 5
Advanced energy technology for sustainable development. Part 5
 
Advanced energy technology for sustainable development. Part 4
Advanced energy technology for sustainable development. Part 4Advanced energy technology for sustainable development. Part 4
Advanced energy technology for sustainable development. Part 4
 
Advanced energy technology for sustainable development. Part 3
Advanced energy technology for sustainable development. Part 3Advanced energy technology for sustainable development. Part 3
Advanced energy technology for sustainable development. Part 3
 
Advanced energy technology for sustainable development. Part 2
Advanced energy technology for sustainable development. Part 2Advanced energy technology for sustainable development. Part 2
Advanced energy technology for sustainable development. Part 2
 
Advanced energy technology for sustainable development. Part 1
Advanced energy technology for sustainable development. Part 1Advanced energy technology for sustainable development. Part 1
Advanced energy technology for sustainable development. Part 1
 
Fluorescent proteins in current biology
Fluorescent proteins in current biologyFluorescent proteins in current biology
Fluorescent proteins in current biology
 
Neurotransmitter systems of the brain and their functions
Neurotransmitter systems of the brain and their functionsNeurotransmitter systems of the brain and their functions
Neurotransmitter systems of the brain and their functions
 

Recently uploaded

The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
heathfieldcps1
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 

Recently uploaded (20)

Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxSKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 
Dyslexia AI Workshop for Slideshare.pptx
Dyslexia AI Workshop for Slideshare.pptxDyslexia AI Workshop for Slideshare.pptx
Dyslexia AI Workshop for Slideshare.pptx
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 

Parameter Estimation for Semiparametric Models with CMARS and Its Applications

  • 1. 5th International Summer School Achievements and Applications of Contemporary Informatics, Mathematics and Physics National University of Technology of the Ukraine Kiev, Ukraine, August 3-15, 2010 PARAMETER ESTIMATION FOR SEMIPARAMETRIC MODELS WITH CMARS AND ITS APPLICATIONS Fatma YERLIKAYA-ÖZKURT Institute of Applied Mathematics, METU, Ankara,Turkey Gerhard-Wilhelm WEBER Institute of Applied Mathematics, METU, Ankara,Turkey Faculty of Economics, Business and Law, University of Siegen, Germany Center for Research on Optimization and Control, University of Aveiro, Portugal Universiti Teknologi Malaysia, Skudai, Malaysia Pakize TAYLAN Department of Mathematics, Dicle University, Diyarbakır, Turkey
  • 2. Outline • Introduction • Estimation for Generalized Linear Model (GLM) • Generalized Partial Linear Model (GPLM) • Estimation for GPLM – Least-Squares Estimation with Tikhonov Regularization – CMARS Method • Penalized Residual Sum of Squares (PRSS) for GLM with MARS • Tikhonov Regularization for GLM with MARS • An Alternative Solution for Tikhonov Regularization Problem with CQP • Solution Methods • Application • Conclusion
  • 3. Introduction The class of Generalized Linear Models (GLMs) has gained popularity as a statistical modeling tool. This popularity is due to: • The flexibility of GLM in addressing a variety of statistical problems, • The availability of software (Stata, SAS, S-PLUS, R) )to fit the models. The class of GLM is an extension of traditional linear models allows:  The mean of a dependent variable to depend on a linear predictor by a nonlinear link function......  The probability distribution of the response, to be any member of an exponential family of distributions.  Many widely used statistical models belong to GLM: o linear models with normal errors, o logistic and probit models for binary data, o log-linear models for multinomial data.
  • 4. Introduction Many other useful statistical models such as with • Poisson, binomial, • Gamma or normal distributions, can be formulated as GLM by the selection of an appropriate link function and response probability distribution. A GLM looks as follows: i  H ( i )  xiT  ; • i  E(Yi ) : expected value of the response variable Yi , • H: smooth monotonic link function, • xi : observed value of explanatory variable for the i-th case, •  : vector of unknown parameters.
  • 5. Introduction • Assumptions: Yi are independent and can have any distribution from exponential family density Yi ~ fY ( yi ,i , ) i  y   b ( )   exp  i i i i  ci ( yi , )  (i  1, 2,..., n),  ai ( )  • ai , bi , ci are arbitrary “scale” parameters, and i is called a natural parameter. • General expressions for mean and variance of dependent variable Yi : i  E (Yi )  bi' (i ), Var (Yi )  V ( i ) , V ( i )  bi" (i ) i , ai ( ) :  / i .
  • 6. Estimation for GLM • Estimation and inference for GLM is based on the theory of • Maximum Likelihood Estimation • Least–Squares approach: n l (  ) :  (  y  i 1 i i i  bi (i )   ci ( yi , )). • The dependence of the right-hand side on  is solely through the dependence of the i on  .
  • 7. Generalized Partial Linear Models (GPLMs) • Particular semiparametric models are the Generalized Partial Linear Models (GPLMs) : They extend the GLMs in that the usual parametric terms are augmented by a single nonparametric component:  E Y X , T   G X T    T  ;          m  is a vector of parameters, T • and    is a smooth function, which we try to estimate by CMARS. • Assumption: m-dimensional random vector X which represents (typically discrete) covariates, q-dimensional random vector T of continuous covariates, which comes from a decomposition of explanatory variables. Other interpretations of    : role of the environment, expert opinions, Wiener processes, etc..
Estimation for GPLM

• There are different kinds of estimation methods for GPLM. Generally, the estimation methods for the model E(Y | X, T) = G(Xᵀβ + γ(T)) are based on kernel methods and on test procedures for the correct specification of this model.
• Here, we concentrate on special types of GPLM estimation based on
  – the newly developed data mining method CMARS, and
  – least-squares estimation with Tikhonov regularization.
Least-Squares Estimation with Tikhonov Regularization

• The general model E(Y | X, T) = G(Xᵀβ + γ(T)) can be considered as a semiparametric generalized linear model and written as

    H(\mu) = \eta(X, T) = X^T \beta + \gamma(T) = \sum_{j=1}^{m} X_j \beta_j + \gamma(T),

  with observation values (y_i, x_i, t_i) (i = 1, 2, ..., n), where μ_i = G(η_i), η_i = H(μ_i) = x_iᵀβ + γ(t_i), and γ(·) is a smooth function.
• For the estimation of the parametric part, we apply linear least squares with Tikhonov regularization.
The Least-Squares Estimation with Tikhonov Regularization

The process is as follows. Firstly, we apply linear least squares on the given data to find a vector β^preproc:

    (*) \quad y = X^T \beta^{\mathrm{preproc}} + \varepsilon.

Equivalently, the model form is

    y = \beta_0 + \sum_{j=1}^{m} X_j \beta_j + \varepsilon.

The method of least squares is used for estimating the regression coefficients β^preproc = (β_0, β_1, β_2, ..., β_m)ᵀ so as to minimize the residual sum of squares (RSS).
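A minimal sketch of this preprocessing step (our illustration; the function name is hypothetical):

    import numpy as np

    def least_squares_preproc(X, y):
        # Prepend an intercept column and minimize the RSS ||y - X1 beta||^2.
        X1 = np.column_stack([np.ones(len(y)), X])
        beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
        return beta  # (beta_0, beta_1, ..., beta_m)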
The Least-Squares Estimation with Tikhonov Regularization

• Tikhonov regularization yields an approximate solution of equation (*) by minimizing the quadratic functional

    (**) \quad \min_{\beta^{\mathrm{preproc}}} \; \left\| y - X \beta^{\mathrm{preproc}} \right\|_2^2 + \lambda \left\| L \beta^{\mathrm{preproc}} \right\|_2^2,

  where λ > 0 is a regularization parameter that balances the first and the second term, and y and β^preproc denote the response vector and the unknown coefficients.
• The coefficients are obtained by solving the Tikhonov regularization problem (**); see the sketch below.
• Generally, Tikhonov regularization involves higher-order regularization terms, which can be handled using the generalized singular value decomposition (GSVD).
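Problem (**) has the closed-form normal-equations solution sketched here (a minimal illustration, assuming numpy; for large or ill-conditioned problems one would use the GSVD instead):

    import numpy as np

    def tikhonov(X, y, lam, L=None):
        # Solve min ||y - X b||^2 + lam ||L b||^2 via the normal equations
        # (X^T X + lam L^T L) b = X^T y.
        if L is None:
            L = np.eye(X.shape[1])  # standard-form regularization (L = I)
        return np.linalg.solve(X.T @ X + lam * (L.T @ L), X.T @ y)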
The Least-Squares Estimation with Tikhonov Regularization

• After obtaining the regression coefficients, we subtract the linear least-squares model (without intercept) from the corresponding responses:

    \tilde{y} = y - \sum_{j=1}^{m} X_j \hat{\beta}_j.

• Doing this at the input data, the resulting values ỹ are our new responses. Then, based on these new data, we find the knots for the nonparametric part with MARS.
• Again consider the model η_i = H(μ_i) = x_iᵀβ + γ(t_i), where γ(·) is a smooth function which we try to estimate by CMARS, an alternative technique to the well-known data mining tool multivariate adaptive regression splines (MARS).
CMARS Method

• What is MARS? Multivariate adaptive regression splines (MARS) was developed in 1991 by Jerome Friedman.
• MARS builds flexible models by introducing piecewise linear regressions; it uses expansions in piecewise linear basis functions of the form

    c^{+}(x, \tau) = [+(x - \tau)]_{+}, \qquad c^{-}(x, \tau) = [-(x - \tau)]_{+}, \qquad [q]_{+} := \max\{0, q\}.

• Set of basis functions:

    \wp := \left\{ (X_j - \tau)_{+},\; (\tau - X_j)_{+} \;\middle|\; \tau \in \{x_{1,j}, x_{2,j}, \dots, x_{N,j}\},\; j \in \{1, 2, \dots, p\} \right\}.

[Figure: the hinge functions c^-(x, τ) and c^+(x, τ) meeting at the knot τ — the basic elements in the regression with MARS.]
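A minimal sketch of such a hinge pair (our illustration):

    import numpy as np

    def hinge_pair(x, tau):
        # The two MARS basis functions paired at knot tau: [x - tau]_+ and [tau - x]_+.
        return np.maximum(0.0, x - tau), np.maximum(0.0, tau - x)

Since the candidate knots τ are the observed values of each variable X_j, the candidate set ℘ contains one such pair per (observation, variable) combination.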
CMARS Method

• Thus, we can represent γ(t_i) by a linear combination which is successively built up from basis functions and the intercept θ_0, such that

    \eta_i = H(\mu_i) = x_i^T \beta + \theta_0 + \sum_{m=1}^{M} \theta_m \psi_m(t_i).

• Here, ψ_m (m = 1, 2, ..., M) are basis functions from ℘ or products of two or more such functions; interaction basis functions are created by multiplying an existing basis function with a truncated linear function involving a new variable. The θ_m are the unknown coefficients for the m-th basis function (m = 1, 2, ..., M) or for the constant 1 (m = 0).
• Provided the observations are represented by the data t_i (i = 1, 2, ..., N), the basis functions take the product form

    \psi_m(t^m) := \prod_{j=1}^{K_m} \left[ s_j^m \left( t_{\kappa_j^m} - \tau_j^m \right) \right]_{+}.
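Such a product basis function can be evaluated row-wise on the data, as in the following sketch (our illustration; the argument names are hypothetical):

    import numpy as np

    def mars_basis(T, signs, knots, var_idx):
        # psi_m(t) = prod_j [ s_j * (t_{kappa_j} - tau_j) ]_+ on an (N x q) data matrix T.
        out = np.ones(T.shape[0])
        for s, tau, k in zip(signs, knots, var_idx):
            out *= np.maximum(0.0, s * (T[:, k] - tau))
        return out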
CMARS Method

The MARS algorithm for estimating the model function consists of two algorithms:
I. Forward stepwise algorithm:
• Search for the basis functions; at each step, the split that minimizes some "lack of fit" criterion among all possible splits on each basis function is chosen.
• The process stops when a user-specified value M_max is reached. At the end of this process we have a large expression in Y; this model typically overfits the data, so a backward deletion procedure is applied.
II. Backward stepwise algorithm:
• Prevents over-fitting by decreasing the complexity of the model without degrading the fit to the data.
• Proposition: we do not employ the backward stepwise algorithm to estimate the function. In its place, we propose to use penalty terms in addition to the least-squares estimation in order to control the lack of fit from the viewpoint of the complexity of the estimation.
The Penalized Residual Sum of Squares for GLM with MARS

• Let us consider the equation η_i = H(μ_i) = x_iᵀβ + γ(t_i), where θ = (θ_0, θ_1, ..., θ_M)ᵀ and γ(t_i) = θᵀψ(t_i). We use the penalized residual sum of squares with M_max basis functions having been accumulated in the forward stepwise algorithm. For the GLM model with MARS, PRSS has the following form:

    \mathrm{PRSS} := \sum_{i=1}^{N} \left( \eta_i - x_i^T \beta - \theta^T \psi(t_i) \right)^2
      + \sum_{m=1}^{M_{\max}} \lambda_m \sum_{\substack{|\alpha|=1 \\ \alpha=(\alpha_1,\alpha_2)}}^{2}
        \sum_{\substack{r<s \\ r,s \in V(m)}} \int \left( D_{r,s}^{\alpha} \psi_m(t^m) \right)^2 dt^m,

  where V(m) := { κ_j^m | j = 1, 2, ..., K_m }, t^m := (t_{m_1}, t_{m_2}, ..., t_{m_{K_m}})ᵀ, |α| := α_1 + α_2 with α_1, α_2 ∈ {0, 1}, and

    D_{r,s}^{\alpha} \psi_m(t^m) := \frac{\partial^{|\alpha|} \psi_m}{\partial^{\alpha_1} t_r^m \, \partial^{\alpha_2} t_s^m}(t^m).
The Penalized Residual Sum of Squares for GLM with MARS

• Our optimization problem is based on the tradeoff between accuracy, i.e., a small sum of error squares, and not too high a complexity. This tradeoff is established through the penalty parameters λ_m.
• We approximate PRSS by discretizing the high-dimensional integrals; then PRSS becomes

    \mathrm{PRSS} \approx \sum_{i=1}^{N} \left( \eta_i - x_i^T \beta - \theta^T \psi(d_i) \right)^2
      + \sum_{m=1}^{M_{\max}} \lambda_m \sum_{i=1}^{(N+1)^{K_m}} \sum_{\substack{|\alpha|=1 \\ \alpha=(\alpha_1,\alpha_2)}}^{2}
        \sum_{\substack{r<s \\ r,s \in V(m)}} \left( D_{r,s}^{\alpha} \psi_m(\hat{t}_i^m) \right)^2 \Delta \hat{t}_i^m,

  where d_i := (t_i^1, t_i^2, ..., t_i^M, t_i^{M+1}, ..., t_i^{M_max})ᵀ, ψ(d_i) := (1, ψ_1(t_i^1), ..., ψ_M(t_i^M), ψ_{M+1}(t_i^{M+1}), ..., ψ_{M_max}(t_i^{M_max}))ᵀ, and the evaluation points \hat{t}_i^m, indexed by (σ_j)_{j=1,2,...,K_m} with σ_j ∈ {0, 1, 2, ..., N+1}, run over a grid built from the sorted sample values, with volume elements

    \Delta \hat{t}_i^m := \prod_{j=1}^{K_m} \left( t_{\sigma_j + 1}^{\kappa_j^m} - t_{\sigma_j}^{\kappa_j^m} \right).
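To make the discretization concrete, the following one-dimensional sketch (our simplification of the multivariate grid above) approximates ∫ (f''(t))² dt by a Riemann sum with finite-difference second derivatives over the sorted sample values:

    import numpy as np

    def penalty_term(f, t):
        # Approximate int (f''(t))^2 dt over sorted sample values t.
        t = np.sort(t)
        h = np.diff(t)                  # grid spacings Delta t
        fv = f(t)
        # Central finite difference for f'' at interior points
        # (exact only for a uniform grid; a rough illustration otherwise).
        d2 = (fv[2:] - 2.0 * fv[1:-1] + fv[:-2]) / h[:-1] ** 2
        return np.sum(d2 ** 2 * h[:-1])  # Riemann sum of the squared derivative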
Tikhonov Regularization for GLM with MARS

• For a short representation, we can rewrite the approximate PRSS as

    \mathrm{PRSS} \approx \left\| \eta - X^{*} \alpha^{*} \right\|_2^2
      + \sum_{m=1}^{M_{\max}} \sum_{i=1}^{(N+1)^{K_m}} \lambda_m \left( L_{im} \theta_m \right)^2,

  where

    L_{im} := \left[ \sum_{\substack{|\alpha|=1 \\ \alpha=(\alpha_1,\alpha_2)}}^{2} \sum_{\substack{r<s \\ r,s \in V(m)}} \left( D_{r,s}^{\alpha} \psi_m(\hat{t}_i^m) \right)^2 \Delta \hat{t}_i^m \right]^{1/2},
    \qquad \psi(d) := \left( \psi(d_1), \dots, \psi(d_N) \right)^T.

• Here, X* = (X, ψ(d)) is a block matrix constructed from the (N × p) matrix X and the (N × (M_max + 1)) matrix ψ(d), and α* = (βᵀ, θᵀ)ᵀ is the vector composed of the β and θ vectors.
• Then, we deal with the linear system of equations η ≈ X*α*.
Tikhonov Regularization for GLM with MARS

• We approach our PRSS problem as a Tikhonov regularization problem by using the same penalty parameter λ = λ_m (=: φ²) for each derivative term:

    \mathrm{PRSS} \approx \left\| \eta - X^{*} \alpha^{*} \right\|_2^2 + \varphi^2 \left\| L^{*} \alpha^{*} \right\|_2^2.

  Here, the high-dimensional matrix L* = (R*, L), where R* is an (M_max + 1) × p matrix with entries being first or second derivatives of ψ.
• We can easily note that our Tikhonov regularization problem has multiple objective functions, combined linearly: we select the solution that minimizes both the first objective ||η − X*α*||₂² and the second objective ||L*α*||₂² in the sense of a compromise (tradeoff) solution.
An Alternative Solution for the Tikhonov Regularization Problem with CQP

• We can solve the Tikhonov regularization problem for MARS by continuous optimization techniques, especially conic quadratic programming (CQP).
• We formulate PRSS as a CQP problem:

    (\mathrm{I}^{*}) \quad \min_{z, \alpha^{*}} \; z, \quad \text{subject to} \quad \left\| \eta - X^{*} \alpha^{*} \right\|_2 \le z, \quad \left\| L^{*} \alpha^{*} \right\|_2 \le \sqrt{M}.

• In general:

    (\mathrm{CQP}) \quad \min_{x} \; c^T x, \quad \text{subject to} \quad \left\| D_i x - d_i \right\|_2 \le p_i^T x - q_i \quad (i = 1, 2, \dots, k).

• In our case: c = (1, 0ᵀ_{M_max+1+p})ᵀ, u = (z, α*ᵀ)ᵀ, D₁ = (0_N, X*), d₁ = η, p₁ = (1, 0, ..., 0)ᵀ, q₁ = 0, D₂ = (0_{M_max+1}, L*), d₂ = 0_{M_max+1}, p₂ = 0_{M_max+p+2}, and q₂ = −√M.
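A minimal sketch of (I*) in the cvxpy modeling language (our illustration; X_star, L_star, eta, and the bound M are placeholder inputs), which hands the resulting second-order cone program to a conic solver such as MOSEK:

    import numpy as np
    import cvxpy as cp

    def solve_cqp(X_star, L_star, eta, M):
        # min z  subject to  ||eta - X* a||_2 <= z,  ||L* a||_2 <= sqrt(M).
        a = cp.Variable(X_star.shape[1])
        z = cp.Variable()
        constraints = [cp.norm(eta - X_star @ a, 2) <= z,
                       cp.norm(L_star @ a, 2) <= np.sqrt(M)]
        cp.Problem(cp.Minimize(z), constraints).solve()
        return a.value, z.value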
An Alternative Solution for the Tikhonov Regularization Problem with CQP

• We first reformulate (I*) as a primal problem:

    \min_{z, \alpha^{*}} \; z, \quad \text{such that}

    \chi := \begin{pmatrix} 0_N & X^{*} \\ 1 & 0^T_{M_{\max}+1+p} \end{pmatrix} \begin{pmatrix} z \\ \alpha^{*} \end{pmatrix} + \begin{pmatrix} -\eta \\ 0 \end{pmatrix},
    \qquad
    \zeta := \begin{pmatrix} 0_{M_{\max}+1} & L^{*} \\ 0 & 0^T_{M_{\max}+1+p} \end{pmatrix} \begin{pmatrix} z \\ \alpha^{*} \end{pmatrix} + \begin{pmatrix} 0_{M_{\max}+1} \\ \sqrt{M} \end{pmatrix},

    \chi \in L^{N+1}, \quad \zeta \in L^{M_{\max}+2},

  with the ice-cream (or second-order, or Lorentz) cones

    L^{N+1} := \left\{ x = (x_1, x_2, \dots, x_{N+1})^T \in \mathbb{R}^{N+1} \;\middle|\; x_{N+1} \ge \sqrt{x_1^2 + x_2^2 + \dots + x_N^2} \right\} \quad (N \ge 1).
An Alternative Solution for the Tikhonov Regularization Problem with CQP

• The corresponding dual problem is

    \max \; (\eta^T, 0)\, \omega_1 + \left( 0^T_{M_{\max}+1}, -\sqrt{M} \right) \omega_2

    \text{such that} \quad (0^T_N, 1)\, \omega_1 + \left( 0^T_{M_{\max}+1}, 0 \right) \omega_2 = 1,

    \left( X^{*T}, 0_{M_{\max}+1+p} \right) \omega_1 + \left( L^{*T}, 0_{M_{\max}+1+p} \right) \omega_2 = 0_{M_{\max}+1+p},

    \omega_1 \in L^{N+1}, \quad \omega_2 \in L^{M_{\max}+2}.
Solution Methods

• CQPs belong to the well-structured convex problems.
• Interior point algorithms:
  – exploit the structure of the problem,
  – yield better complexity bounds,
  – exhibit much better practical performance.
Application

• GPLMs with CMARS and the parameter estimation for them have been presented and investigated in detail. Now, a numerical example for this study will be given.
• Two data sets are used in our applications:
  – Concrete Compressive Strength Data Set,
  – Concrete Slump Test Data Set.
• Both data sets are obtained from the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/).
Application

• Salford Systems software is used for the MARS application; for CMARS, a code is written in MATLAB, and the MOSEK software is used to solve the CQP problem in CMARS. For Tikhonov regularization, the Regularization Toolbox in MATLAB is used.
• All test data sets are also compared according to performance measures such as Root Mean Square Error (RMSE), Correlation Coefficient (r), R², and Adjusted R²; a sketch of these measures follows below.
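A minimal sketch of the four performance measures (our illustration; p denotes the number of predictors in the fitted model):

    import numpy as np

    def performance(y, y_hat, p):
        # RMSE, correlation coefficient r, R^2, and adjusted R^2 of a fit.
        n = len(y)
        rmse = np.sqrt(np.mean((y - y_hat) ** 2))
        r = np.corrcoef(y, y_hat)[0, 1]
        ss_res = np.sum((y - y_hat) ** 2)
        ss_tot = np.sum((y - y.mean()) ** 2)
        r2 = 1.0 - ss_res / ss_tot
        adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
        return rmse, r, r2, adj_r2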
Application

• To compare the performances of the Tikhonov regularization, CMARS, and GPLM models, let us look at the performance measure values for both data sets.

[Table: performance measure values for each model and data set, arranged from worse to better.]

Evaluation of the models based on the performance values:
• CMARS performs better than Tikhonov regularization with respect to all the measures for both data sets.
• GLM with CMARS (GPLM) performs better than both Tikhonov regularization and CMARS with respect to all the measures for both data sets.
Outlook

• An important new class of GPLMs: having written the Tikhonov regularization task for GLM using MARS as a CQP problem, we will call it CGLMARS:

    E(Y \mid X, T) = G\left( X^T \beta + \gamma(T) \right), \qquad \text{e.g.,} \quad \mathrm{GPLM}(X, T) = \mathrm{LM}(X) + \mathrm{MARS}(T).
References

[1] Aster, R., Borchers, B., and Thurber, C., Parameter Estimation and Inverse Problems, Academic Press, 2004.
[2] Craven, P., and Wahba, G., Smoothing noisy data with spline functions, Numerische Mathematik 31 (1979), 377-403.
[3] De Boor, C., A Practical Guide to Splines, Springer-Verlag, 2001.
[4] Dongarra, J.J., Bunch, J.R., Moler, C.B., and Stewart, G.W., LINPACK Users' Guide, SIAM, Philadelphia, 1979.
[5] Friedman, J.H., Multivariate adaptive regression splines, The Annals of Statistics 19, 1 (1991), 1-141.
[6] Green, P.J., and Yandell, B.S., Semi-parametric generalized linear models, Lecture Notes in Statistics 32 (1985).
[7] Hastie, T.J., and Tibshirani, R.J., Generalized Additive Models, Chapman and Hall, New York, 1990.
[8] Kincaid, D., and Cheney, W., Numerical Analysis: Mathematics of Scientific Computing, Pacific Grove, 2002.
[9] Müller, M., Estimation and testing in generalized partial linear models – a comparative study, Statistics and Computing 11 (2001), 299-309.
[10] Nelder, J.A., and Wedderburn, R.W.M., Generalized linear models, Journal of the Royal Statistical Society A 135 (1972), 370-384.
[11] Nemirovski, A., Lectures on Modern Convex Optimization, Israel Institute of Technology, http://iew3.technion.ac.il/Labs/Opt/opt/LN/Final.pdf.
References (continued)

[12] Nesterov, Y.E., and Nemirovskii, A.S., Interior Point Methods in Convex Programming, SIAM, 1993.
[13] Ortega, J.M., and Rheinboldt, W.C., Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970.
[14] Renegar, J., A Mathematical View of Interior-Point Methods in Convex Optimization, SIAM, 2001.
[15] Scheid, F., Numerical Analysis, McGraw-Hill, New York, 1968.
[16] Taylan, P., Weber, G.-W., and Beck, A., New approaches to regression by generalized additive models and continuous optimization for modern applications in finance, science and technology, Optimization 56, 5-6 (2007), 675-698.
[17] Taylan, P., Weber, G.-W., and Liu, L., On foundations of parameter estimation for generalized partial linear models with B-splines and continuous optimization, in: Proceedings of PCO 2010, 3rd Global Conference on Power Control and Optimization, Gold Coast, Queensland, Australia, February 2-4, 2010.
[18] Weber, G.-W., Akteke-Öztürk, B., İşcanoğlu, A., Özöğür, S., and Taylan, P., Data Mining: Clustering, Classification and Regression, four lectures given at the Graduate Summer School on New Advances in Statistics, Middle East Technical University, Ankara, Turkey, August 11-24, 2007 (http://www.statsummer.com/).
[19] Wood, S.N., Generalized Additive Models: An Introduction with R, Chapman and Hall, New York, 2006.
Thank you very much for your attention!