GENERALIZED LINEAR MODEL
Name: Idrees Waris
Reg. No.: 3095
Semester: 4th
Course: QTIA
Course Facilitator: Sir Imtiaz Arif

MAIN POINTS TO BE DISCUSSED IN GZLM
- What GZLM (or GRZ) is and why to use it (History and Explanation)
- When to use GZLM (Assumptions)
- How to use GZLM in SPSS (Statistical Procedure)

What is the Generalized Linear Model (GZLM)?
The generalized linear model (GZLM) is a generalization of the general linear model (GLM), which covers ANOVA/ANCOVA and MANOVA/MANCOVA models as well as regression models. GZLM allows for dependent variables with non-normal distributions and for many link functions other than the identity. It supports not only traditional regression models but also logistic models for binary dependents, log-linear analysis of count data, Poisson regression for count data, gamma regression, complementary log-log models for interval-censored survival data, and many others.

HISTORY
The generalized linear model was first discussed by John Nelder and Robert Wedderburn in a 1972 article. An overview can be found in the article by Gill (2001).

Difference between the General Linear Model (GLM) and the Generalized Linear Model (GZLM)
General linear model (GLM)
The general linear model (GLM) is a flexible statistical model that incorporates normally distributed dependent variables and categorical or continuous independent variables. GLM enables you to accommodate designs with empty cells, interpret the results more readily using profile plots of estimated means, and customize the linear model so that it directly addresses your research questions. Anyone who regularly fits linear models, whether univariate, multivariate or repeated measures, will find the GLM procedure very useful.
General equation: Y = b₀ + b₁X₁ + b₂X₂ + … + bₖXₖ + e

GZLM Extensions
Correlated or clustered data:
- Generalized estimating equations (GEEs)
- Generalized linear mixed models (GLMMs)
- Hierarchical generalized linear models (HGLMs)
- Generalized additive models (GAMs)

Components of GZLM
There are 3 components of a generalized linear model (GZLM):
1. Random Component: identify the response variable (Y) and specify/assume a probability distribution for it.
2. Systematic Component: specify what the explanatory or predictor variables are (e.g., X1, X2, etc.). These variables enter in a linear manner: α + β1X1 + β2X2 + ... + βkXk
3. Link Function: specify the relationship between the mean or expected value of the random component (i.e., E(Y)) and the systematic component.

Random Component
Let N = sample size and suppose that we have observations Y1, Y2, ..., YN on our response variable and that the observations are all independent. The Y's are discrete variables, where Y is either:
Counts (including cells of a contingency table):
- Number of people who die from AIDS during a given time period.
- Number of times a child tries to take a toy away from another child.
- Number of patents generated by firms.
These responses have a Poisson distribution.
Dichotomous (binary) with a fixed number of trials:
- success/failure
- correct/incorrect
- agree/disagree
- academic/non-academic program
These responses have a Binomial distribution.

Systematic Component
As in ordinary regression, we are modeling means. The focus is on the expected value of our response variable, E(Y) = μ. We want to investigate whether and how μ varies as a function of the levels of our predictor or explanatory variables, the X's.
The systematic component of the model consists of a set of explanatory variables and some linear function of them:
α + β1x1 + β2x2 + β3x3 + ... + βkxk
This linear combination of our explanatory variables is referred to as a "linear predictor". This part of the model is very much like what you know from ordinary linear regression.

The Link Function
- "Left-hand" side of the equation/model: the random component, that is, E(Y) = μ
- "Right-hand" side of the equation: the systematic component, that is, α + β1x1 + β2x2 + ... + βkxk
We now need to "link" the two sides. How is μ = E(Y) related to α + β1x1 + β2x2 + ... + βkxk? We do this using a "link function" g(μ):
g(μ) = α + β1x1 + β2x2 + ... + βkxk

More about the Link Function
The link function provides the relationship between the linear predictor and the mean of the distribution function.
Important things about g(.):
- The function g(.) is "monotone": as the systematic part gets larger, μ gets larger (or smaller).
- The relationship between E(Y) and the systematic part can be non-linear.
Some common links are:
1. Identity link (ordinary regression, ANOVA, ANCOVA): E(Y) = α + βx
2. Log link, which is often used when Y is nonnegative (i.e., 0 ≤ Y): log(E(Y)) = log(μ) = α + βx. This yields a "loglinear" model.
3. Logit link, which is often used when 0 ≤ μ ≤ 1 (e.g., when the response is dichotomous/binary and we are interested in a probability): log(μ/(1 − μ)) = α + βx

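The three links listed above, together with their inverses, can be sketched in a few lines of Python. This is only an illustration: the function names and the coefficient values (alpha, beta, x) are hypothetical and do not come from the slides.

```python
import math

# Each link g maps the mean mu onto the scale of the linear predictor;
# each inverse maps the linear predictor eta back to the mean.

def identity_link(mu):
    return mu

def log_link(mu):
    return math.log(mu)                 # defined for mu > 0

def logit_link(mu):
    return math.log(mu / (1.0 - mu))    # defined for 0 < mu < 1

def inverse_identity(eta):
    return eta

def inverse_log(eta):
    return math.exp(eta)

def inverse_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients, chosen only for illustration.
alpha, beta, x = -1.0, 0.5, 2.0
eta = alpha + beta * x                  # linear predictor alpha + beta * x

mean_identity = inverse_identity(eta)   # mean under the identity link
mean_log = inverse_log(eta)             # mean under the log link (always positive)
mean_logit = inverse_logit(eta)         # mean under the logit link (between 0 and 1)
```

The same linear predictor gives a different mean under each link, which is why the link has to suit the range of the response: any real number for the identity, positive values for the log, and probabilities for the logit.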
Link Function: The Canonical Links

When? (ASSUMPTIONS)
Not assumed: GZLM/GEE, compared to GLM, do not assume a normally distributed dependent variable (or normally distributed independents), nor linearity between the predictors and the dependent, nor homogeneity of variance over the range of the dependent variable.
Assumptions and related considerations:
- Linearity of the link function
- Absence of high multicollinearity
- Centered data
- Data distribution
- Independent vs. correlated data
- Data levels
- Missing data

How to run GZLM in SPSS
Model Types (predefined common model types)
Scale Response
- Linear: specifies Normal as the distribution and Identity as the link function.
- Gamma with log link: specifies Gamma as the distribution and Log as the link function.
Ordinal Response
- Ordinal logistic: specifies Multinomial (ordinal) as the distribution and Cumulative logit as the link function.
- Ordinal probit: specifies Multinomial (ordinal) as the distribution and Cumulative probit as the link function.
Counts
- Poisson loglinear: specifies Poisson as the distribution and Log as the link function.
- Negative binomial with log link: specifies Negative binomial (with a value of 1 for the ancillary parameter) as the distribution and Log as the link function. To have the procedure estimate the value of the ancillary parameter, specify a custom model with Negative binomial distribution and select Estimate value in the Parameter group.
Binary Response or Events/Trials Data
- Binary logistic: specifies Binomial as the distribution and Logit as the link function.
- Binary probit: specifies Binomial as the distribution and Probit as the link function.
- Interval censored survival: specifies Binomial as the distribution and Complementary log-log as the link function.
Mixture
- Tweedie with log link: specifies Tweedie as the distribution and Log as the link function.
- Tweedie with identity link: specifies Tweedie as the distribution and Identity as the link function.
Custom
- Specify your own combination of distribution and link function.

Model Types (8 custom distributions)
- Normal
- Inverse Gaussian
- Gamma
- Multinomial
- Binomial
- Poisson
- Negative Binomial
- Tweedie

DISTRIBUTIONS (figures): Normal, Inverse Gaussian, Gamma, Binomial, Poisson, Negative Binomial

Distributions
Tweedie
The Tweedie distribution requires a parameter, p, which the researcher enters to determine the shape of the distribution:
- p = 0: normal distribution
- p = 1: Poisson distribution
- 1 < p < 2: for continuous data with exact zeros (the default in SPSS is 1.5)
- p = 2: gamma distribution
- p > 2: for positive continuous data
Multinomial
The dependent variable has a finite number of categories, has text string values, or is ordinal. The distribution among categories (not shown) is arbitrary.

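One way to read the Tweedie cases listed above: in the Tweedie family, p is the power in the variance function, Var(Y) = φ · μ^p, where φ is a dispersion parameter. The sketch below only evaluates that formula. The function name and the example values are illustrative assumptions, not part of the slides or of SPSS.

```python
def tweedie_variance(mu, p, phi=1.0):
    """Variance implied by the Tweedie power variance function: phi * mu**p."""
    return phi * mu ** p

# Illustrative mean; the three special cases named on the slide.
mu = 4.0
var_normal_like = tweedie_variance(mu, p=0)    # constant variance, as for the normal
var_poisson_like = tweedie_variance(mu, p=1)   # variance equal to the mean, as for Poisson
var_gamma_like = tweedie_variance(mu, p=2)     # variance proportional to mean squared, as for gamma
var_compound = tweedie_variance(mu, p=1.5)     # the SPSS default power mentioned above
```

Values of p between 1 and 2 therefore sit between the Poisson and gamma mean-variance relationships, which fits their use for continuous data with exact zeros.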
15 custom link functions
Normal, Gamma, Inverse Gaussian, Poisson and Tweedie distributions:
- Identity
- Log
- Power
Negative binomial distributions:
- Negative binomial
Binomial distributions:
- Logit
- Probit
- Complementary log-log
- Negative log-log
- Log complement
- Odds power
Multinomial distributions:
- Cumulative logit
- Cumulative probit
- Cumulative Cauchit
- Cumulative complementary log-log
- Cumulative negative log-log

Data for Analysis
Use the ships data file (ships.sav) from the SPSS 18.0 sample files to study the effect of
- Ship type
- Year of construction
- Period of operation
on the number of damage incidents.
To run a Generalized Linear Models analysis, from the menus choose:
Analyze > Generalized Linear Models > Generalized Linear Models...

Analyze > Generalized Linear Models > Generalized Linear Models...
- Type of Model tab: specify the distribution of the dependent variable (DV) and the link function.
- Response tab: select a dependent variable.
- Predictors tab: select factors and covariates for use in predicting the dependent variable. Factors are categorical predictors and can be numeric or string; covariates are scale predictors and must be numeric.
- Model tab: specify model effects using the selected factors and covariates.
- Remaining tabs: Estimation, Statistics, EM Means, Save, Export.

The Type of Model tab appears:
- Select Poisson loglinear as the type of model. This specifies a Poisson distribution with a log link function.
Click the Response tab:
- Select Number of damage incidents as the dependent variable.
Click the Predictors tab:
- Select Ship type, Year of construction, and Period of operation as factors.
- Select Logarithm of aggregate months of service as the offset.
- Click Options, select Descending as the category order for factors, and click Continue.
Click the Model tab:
- Select type (Ship type), construction (Year of construction), and operation (Period of operation) as main effects in the model.
Click the Estimation tab:
- Select Pearson chi-square as the method for estimating the scale parameter.
Click the EM Means tab:
- Select type (Ship type) and construction (Year of construction) as terms to display means for, and select Pairwise as the contrast for each.
- Select Compute means for linear predictor as the scale.
- Select Sequential Sidak as the adjustment method.
Click the Save tab:
- Select Predicted value of linear predictor and Standardized deviance residual. These values are saved to the active dataset and can help you diagnose any problems with the model fit.
Click OK to run the analysis.

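To make the role of the offset concrete, here is a minimal sketch of a Poisson log-linear fit, log(μ) = log(exposure) + α + β·x, using plain Newton-Raphson. It is not the SPSS algorithm and it does not use the ships data: the function name, the four made-up observations and the single binary factor are all assumptions chosen for illustration.

```python
import math

def fit_poisson_rate_model(y, x, exposure, iterations=25):
    """Fit log(mu_i) = log(exposure_i) + alpha + beta * x_i by Newton-Raphson."""
    offset = [math.log(t) for t in exposure]   # the offset enters with coefficient 1
    alpha = math.log(sum(y) / sum(exposure))   # start from the intercept-only fit
    beta = 0.0
    for _ in range(iterations):
        mu = [math.exp(o + alpha + beta * xi) for o, xi in zip(offset, x)]
        # Score vector (gradient of the Poisson log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        # Fisher information matrix [[h00, h01], [h01, h11]].
        h00 = sum(mu)
        h01 = sum(mi * xi for mi, xi in zip(mu, x))
        h11 = sum(mi * xi * xi for mi, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system information * step = score.
        alpha += (h11 * g0 - h01 * g1) / det
        beta += (h00 * g1 - h01 * g0) / det
    return alpha, beta

# Made-up data: damage counts, a 0/1 factor, and months of service.
incidents = [2, 4, 9, 15]
group = [0, 0, 1, 1]
months = [100.0, 200.0, 150.0, 250.0]

alpha_hat, beta_hat = fit_poisson_rate_model(incidents, group, months)
base_rate = math.exp(alpha_hat)    # incidents per month in group 0
rate_ratio = math.exp(beta_hat)    # group 1 rate divided by group 0 rate
```

With a single binary factor the fitted rates coincide with the pooled rates in each group (6 incidents over 300 months and 24 over 400 months in this toy data), which shows what the offset does: the model describes incidents per month of service rather than raw counts.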
Scatter Plot
To produce a scatter plot of Standardized Deviance Residual by Predicted Value of the Linear Predictor, from the menus choose:
Graphs > Chart Builder...
- Select the Scatter/Dot gallery and choose Simple Scatter.
- Select Standardized Deviance Residual as the y variable and Predicted Value of the Linear Predictor as the x variable.
- Click OK.

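For orientation, the quantity behind the plotted residual can be sketched as follows. This computes the raw (unstandardized) Poisson deviance residual; a standardized version is typically obtained by further dividing by a term involving the scale parameter and the leverage, which is not shown here. The function name and the example values are illustrative assumptions.

```python
import math

def poisson_deviance_residual(y, mu):
    """Signed square root of one observation's contribution to the Poisson deviance."""
    # By convention y * log(y / mu) is taken as 0 when y == 0.
    log_term = y * math.log(y / mu) if y > 0 else 0.0
    contribution = 2.0 * (log_term - (y - mu))
    # Guard against tiny negative values caused by rounding.
    return math.copysign(math.sqrt(max(contribution, 0.0)), y - mu)

# Illustrative observed counts paired with fitted means.
residual_exact_fit = poisson_deviance_residual(5, 5.0)   # observed equals fitted
residual_below = poisson_deviance_residual(0, 2.0)       # observed below fitted
residual_above = poisson_deviance_residual(3, 1.0)       # observed above fitted
```

In a well-fitting model these residuals scatter around zero with no visible pattern against the predicted value of the linear predictor, which is what the scatter plot above is meant to check.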
Research Papers and Theses for Understanding
- Development of an Accident Prediction Model using GLIM (Generalized Log-linear Model) and EB method: A case of Seoul (Korea)
- Log-Linear Models by Noah A. Smith
- Fitting Tweedie's Compound Poisson Model to Insurance Claims Data: Dispersion Modeling
- On the Distribution of Discounted Loss Reserves Using Generalized Linear Models by Gordon K. Smyth (December 2001)
- The application of overdispersion and Generalized Estimating Equations (GEE) in repeated categorical data (for understanding overdispersion, Poisson, negative binomial and GEE)
- Clustering of foot-based pitch contours in expressive speech by Esther Klabbers and Jan P. H. van Santen
- Collaborative filtering with interlaced generalized linear models by Nicolas Delannay and Michel Verleysen
- Generalized Linear Models with Regularization by Mee Young Park, dissertation, Stanford University (September 2006)

Future Of Sample Report 2024 | Redacted VersionFuture Of Sample Report 2024 | Redacted Version
Future Of Sample Report 2024 | Redacted VersionMintel Group
 
8447779800, Low rate Call girls in Rohini Delhi NCR
8447779800, Low rate Call girls in Rohini Delhi NCR8447779800, Low rate Call girls in Rohini Delhi NCR
8447779800, Low rate Call girls in Rohini Delhi NCRashishs7044
 
Kenya Coconut Production Presentation by Dr. Lalith Perera
Kenya Coconut Production Presentation by Dr. Lalith PereraKenya Coconut Production Presentation by Dr. Lalith Perera
Kenya Coconut Production Presentation by Dr. Lalith Pereraictsugar
 

Recently uploaded (20)

IoT Insurance Observatory: summary 2024
IoT Insurance Observatory:  summary 2024IoT Insurance Observatory:  summary 2024
IoT Insurance Observatory: summary 2024
 
Islamabad Escorts | Call 03070433345 | Escort Service in Islamabad
Islamabad Escorts | Call 03070433345 | Escort Service in IslamabadIslamabad Escorts | Call 03070433345 | Escort Service in Islamabad
Islamabad Escorts | Call 03070433345 | Escort Service in Islamabad
 
Ms Motilal Padampat Sugar Mills vs. State of Uttar Pradesh & Ors. - A Milesto...
Ms Motilal Padampat Sugar Mills vs. State of Uttar Pradesh & Ors. - A Milesto...Ms Motilal Padampat Sugar Mills vs. State of Uttar Pradesh & Ors. - A Milesto...
Ms Motilal Padampat Sugar Mills vs. State of Uttar Pradesh & Ors. - A Milesto...
 
Call Us ➥9319373153▻Call Girls In North Goa
Call Us ➥9319373153▻Call Girls In North GoaCall Us ➥9319373153▻Call Girls In North Goa
Call Us ➥9319373153▻Call Girls In North Goa
 
Call US-88OO1O2216 Call Girls In Mahipalpur Female Escort Service
Call US-88OO1O2216 Call Girls In Mahipalpur Female Escort ServiceCall US-88OO1O2216 Call Girls In Mahipalpur Female Escort Service
Call US-88OO1O2216 Call Girls In Mahipalpur Female Escort Service
 
Global Scenario On Sustainable and Resilient Coconut Industry by Dr. Jelfina...
Global Scenario On Sustainable  and Resilient Coconut Industry by Dr. Jelfina...Global Scenario On Sustainable  and Resilient Coconut Industry by Dr. Jelfina...
Global Scenario On Sustainable and Resilient Coconut Industry by Dr. Jelfina...
 
Buy gmail accounts.pdf Buy Old Gmail Accounts
Buy gmail accounts.pdf Buy Old Gmail AccountsBuy gmail accounts.pdf Buy Old Gmail Accounts
Buy gmail accounts.pdf Buy Old Gmail Accounts
 
Annual General Meeting Presentation Slides
Annual General Meeting Presentation SlidesAnnual General Meeting Presentation Slides
Annual General Meeting Presentation Slides
 
8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR
8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR
8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR
 
Innovation Conference 5th March 2024.pdf
Innovation Conference 5th March 2024.pdfInnovation Conference 5th March 2024.pdf
Innovation Conference 5th March 2024.pdf
 
APRIL2024_UKRAINE_xml_0000000000000 .pdf
APRIL2024_UKRAINE_xml_0000000000000 .pdfAPRIL2024_UKRAINE_xml_0000000000000 .pdf
APRIL2024_UKRAINE_xml_0000000000000 .pdf
 
Enjoy ➥8448380779▻ Call Girls In Sector 18 Noida Escorts Delhi NCR
Enjoy ➥8448380779▻ Call Girls In Sector 18 Noida Escorts Delhi NCREnjoy ➥8448380779▻ Call Girls In Sector 18 Noida Escorts Delhi NCR
Enjoy ➥8448380779▻ Call Girls In Sector 18 Noida Escorts Delhi NCR
 
No-1 Call Girls In Goa 93193 VIP 73153 Escort service In North Goa Panaji, Ca...
No-1 Call Girls In Goa 93193 VIP 73153 Escort service In North Goa Panaji, Ca...No-1 Call Girls In Goa 93193 VIP 73153 Escort service In North Goa Panaji, Ca...
No-1 Call Girls In Goa 93193 VIP 73153 Escort service In North Goa Panaji, Ca...
 
Independent Call Girls Andheri Nightlaila 9967584737
Independent Call Girls Andheri Nightlaila 9967584737Independent Call Girls Andheri Nightlaila 9967584737
Independent Call Girls Andheri Nightlaila 9967584737
 
Marketplace and Quality Assurance Presentation - Vincent Chirchir
Marketplace and Quality Assurance Presentation - Vincent ChirchirMarketplace and Quality Assurance Presentation - Vincent Chirchir
Marketplace and Quality Assurance Presentation - Vincent Chirchir
 
India Consumer 2024 Redacted Sample Report
India Consumer 2024 Redacted Sample ReportIndia Consumer 2024 Redacted Sample Report
India Consumer 2024 Redacted Sample Report
 
8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR
 
Future Of Sample Report 2024 | Redacted Version
Future Of Sample Report 2024 | Redacted VersionFuture Of Sample Report 2024 | Redacted Version
Future Of Sample Report 2024 | Redacted Version
 
8447779800, Low rate Call girls in Rohini Delhi NCR
8447779800, Low rate Call girls in Rohini Delhi NCR8447779800, Low rate Call girls in Rohini Delhi NCR
8447779800, Low rate Call girls in Rohini Delhi NCR
 
Kenya Coconut Production Presentation by Dr. Lalith Perera
Kenya Coconut Production Presentation by Dr. Lalith PereraKenya Coconut Production Presentation by Dr. Lalith Perera
Kenya Coconut Production Presentation by Dr. Lalith Perera
 

Final generalized linear modeling by idrees waris iugc

  • 1. GENERALIZED LINEAR MODEL. NAME: IDREES WARIS, REG NO: 3095, SEMESTER: 4TH, COURSE: QTIA, COURSE FACILITATOR: SIR IMTIAZ ARIF
  • 2. MAIN POINTS TO BE DISCUSSED IN GZLM: What is GZLM or GRZ and why use GZLM (History and Explanation); When to use GZLM (Assumptions); How to use GZLM in SPSS (Statistical Procedure)
  • 3. What is Generalized linear model (GZLM)? The Generalized Linear Model is a generalization of the general linear model (GLM) discussed separately with regard to Anova/Ancova and Manova/Mancova models, as well as regression models. GZLM allows for dependent variables with non-normal distributions and for many link functions other than identity. GZLM supports not only traditional regression models but also logistic models for binary dependents, log-linear analysis of count data, Poisson regression for count data, gamma regression, complementary log-log models for interval-censored survival data, and many others.
  • 4. HISTORY: The Generalized Linear Model was first discussed by John Nelder and Robert Wedderburn in a 1972 article. An overview can be found in the article by Gill (2001).
  • 5. Difference between General linear model (GLM) and Generalized linear model (GZLM). General linear model (GLM): The general linear model (GLM) is a flexible statistical model that incorporates normally distributed dependent variables and categorical or continuous independent variables. GLM enables you to accommodate designs with empty cells, more readily interpret the results using profile plots of estimated means, and customize the linear model so that it directly addresses the research questions you ask. Anyone who regularly fits linear models, whether univariate, multivariate or repeated measures, will find the GLM procedure to be very useful. General Equation: Y = b₀ + b₁X₁ + b₂X₂ + … + bₖXₖ + e
  • 6. GZLM Extensions. For correlated or clustered data: Generalized Estimating Equations (GEEs); Generalized Linear Mixed Models (GLMMs); Hierarchical Generalized Linear Models (HGLMs); Generalized Additive Models (GAMs)
  • 7. Components of GZLM: There are 3 components of a generalized linear model (or GLM): 1. Random Component: identify the response variable (Y) and specify/assume a probability distribution for it. 2. Systematic Component: specify what the explanatory or predictor variables are (e.g., X1, X2, etc.). These variables enter in a linear manner: α + β1X1 + β2X2 + . . . + βkXk. 3. Link Function: specify the relationship between the mean or expected value of the random component (i.e., E(Y)) and the systematic component.
  • 8. Random Component: Let N = sample size and suppose that we have Y1, Y2, . . . , YN observations on our response variable and that the observations are all independent. Examples where Y is a discrete variable: Counts (including cells of a contingency table), e.g. the number of people who die from AIDS during a given time period, the number of times a child tries to take a toy away from another child, or the number of patents generated by firms. These responses have a Poisson distribution. Dichotomous (binary) outcomes with a fixed number of trials, e.g. success/failure, correct/incorrect, agree/disagree, academic/non-academic program. These responses have a Binomial distribution.
  • 9. Systematic Component: As in ordinary regression, we are modeling means. The focus is on the expected value of our response variable, E(Y) = μ. We want to investigate whether and how μ varies as a function of the levels of our predictor or explanatory variables, the X’s. The systematic component of the model consists of a set of explanatory variables and some linear function of them: β0 + β1x1 + β2x2 + β3x3 + . . . + βkxk. This linear combination of our explanatory variables is referred to as a “linear predictor”. This part of the model is very much like what you know from ordinary linear regression.
  • 10. The Link Function: “Left hand” side of an equation/model: the random component; that is, E(Y) = μ. “Right hand” side of the equation: the systematic component; that is, α + β1x1 + β2x2 + . . . + βkxk. We now need to “link” the two sides. How is μ = E(Y) related to α + β1x1 + β2x2 + . . . + βkxk? We do this using a “Link Function” g(μ): g(μ) = α + β1x1 + β2x2 + . . . + βkxk
  • 11. More about the Link Function: The link function provides the relationship between the linear predictor and the mean of the distribution function. Important things about g(.): This function g(.) is “monotone”: as the systematic part gets larger, μ gets larger (or smaller). The relationship between E(Y) and the systematic part can be non-linear. Some common links are: 1. Identity (ordinary regression, ANOVA, ANCOVA): E(Y) = α + βx. 2. Log link, which is often used when Y is nonnegative (i.e., 0 ≤ Y): log(E(Y)) = log(μ) = α + βx. This yields a “loglinear” model. 3. Logit link, which is often used when 0 ≤ μ ≤ 1 (e.g., when the response is dichotomous/binary and we’re interested in a probability): log(μ/(1 − μ)) = α + βx
  • 13. When? (ASSUMPTIONS) Not assumed: GZLM/GEE, compared to GLM, do not assume a normally distributed dependent variable (or normally distributed independents), nor linearity between the predictors and the dependent, nor homogeneity of variance for the range of the dependent variable. Points to consider: Linearity of the link function; Absence of high multicollinearity; Centered data; Data distribution; Independent vs. correlated data; Data levels; Missing data
  • 14. How to run GZLM in SPSS: Model Types (common model types already provided). Scale Response. Linear. Specifies Normal as the distribution and Identity as the link function. Gamma with log link. Specifies Gamma as the distribution and Log as the link function. Ordinal Response. Ordinal logistic. Specifies Multinomial (ordinal) as the distribution and Cumulative logit as the link function. Ordinal probit. Specifies Multinomial (ordinal) as the distribution and Cumulative probit as the link function. Counts. Poisson loglinear. Specifies Poisson as the distribution and Log as the link function. Negative binomial with log link. Specifies Negative binomial (with a value of 1 for the ancillary parameter) as the distribution and Log as the link function. To have the procedure estimate the value of the ancillary parameter, specify a custom model with Negative binomial distribution and select Estimate value in the Parameter group.
  • 15. Model Types continued… Binary Response or Events/Trials Data. Binary logistic. Specifies Binomial as the distribution and Logit as the link function. Binary probit. Specifies Binomial as the distribution and Probit as the link function. Interval censored survival. Specifies Binomial as the distribution and Complementary log-log as the link function. Mixture. Tweedie with log link. Specifies Tweedie as the distribution and Log as the link function. Tweedie with identity link. Specifies Tweedie as the distribution and Identity as the link function. Custom. Specify your own combination of distribution and link function.
  • 16. Model Types (8 custom distributions): Normal, Inverse Gaussian, Gamma, Multinomial, Binomial, Poisson, Negative Binomial, Tweedie
  • 20. Distributions. Tweedie: The Tweedie distribution requires a parameter, p, which the researcher enters to determine the shape of the distribution: p=0: normal distribution; p=1: Poisson distribution; 1<p<2: for continuous data with exact zeros (the default in SPSS is 1.5); p=2: gamma distribution; p>2: for positive continuous data. Multinomial: The dependent has a finite number of categories, has text string values, or is ordinal. The distribution among categories, not shown, is arbitrary.
  • 21. 15 custom link functions. Normal, Gamma, Inverse Gaussian, Poisson and Tweedie distributions: Identity, Log, Power
  • 22. Negative binomial distribution: Negative binomial. Binomial distribution: Logit, Probit, Complementary log-log, Negative log-log, Log complement, Odds power. Multinomial distribution: Cumulative logit, Cumulative probit, Cumulative Cauchit, Cumulative complementary log-log, Cumulative negative log-log
  • 23. Data for Analysis: Take the data from the SPSS 18.0 sample file ships.sav. The aim is to study the effect of Ship type, Year of construction and Period of operation on the Number of damage incidents. To run a Generalized Linear Models analysis, from the menus choose: Analyze > Generalized Linear Models > Generalized Linear Models...
  • 24. Analyze > Generalized Linear Models > Generalized Linear Models... Type of Model tab: specify the DV distribution and link function. On the Response tab, select a dependent variable. On the Predictors tab, select factors and covariates for use in predicting the dependent variable. (Factors are categorical predictors and can be numeric or string; covariates are scale predictors and must be numeric.) On the Model tab, specify model effects using the selected factors and covariates. The remaining tabs are Estimation, Statistics, EM Means, Save, and Export.
  • 25. The Type of Model tab will appear: Select Poisson loglinear as the type of model. This specifies a Poisson distribution with a log link function. Click the Response tab: Select Number of damage incidents as the dependent variable. Click the Predictors tab: Select Ship type, Year of construction, and Period of operation as factors. Select Logarithm of aggregate months of service as the offset. Click Options. Select Descending as the category order for factors. Click Continue. Click OK.
  • 26. Click the Model tab. Select type (Ship type), construction (Year of construction), and operation (Period of operation) as main effects in the model. Click the Estimation tab. Select Pearson chi-square as the method for estimating the scale parameter. Click the EM Means tab. Select type (Ship type) and construction (Year of construction) as terms to display means for, and select Pairwise as the contrast for each. Select Compute means for linear predictor as the scale. Select Sequential Sidak as the adjustment method. Click the Save tab. Select Predicted value of linear predictor and Standardized deviance residual. These values are saved to the active dataset and can help you diagnose any problems with the model fit.
  • 27. Scatter Plot: To produce a scatter plot of Standardized Deviance Residual by Predicted Value of the Linear Predictor, from the menus choose: Graphs > Chart Builder... Select the Scatter/Dot gallery and choose Simple Scatter. Select Standardized Deviance Residual as the y variable and Predicted Value of the Linear Predictor as the x variable. Click OK.
  • 28. Research Papers and Thesis for Understanding: Development of an Accident Prediction Model using GLIM (Generalized Log-linear Model) and EB method: A case of Seoul (Korea); Log-Linear Models by Noah A. Smith; Fitting Tweedie’s Compound Poisson Model to Insurance Claims Data: Dispersion Modeling On the Distribution of Discounted Loss Reserves Using Generalized Linear Models by Gordon K. Smyth (December 2001); The application of over dispersion and (GEE) Generalized Estimating Equations in repeated categorical data (for understanding over dispersion, Poisson, negative binomial and GEE); Clustering of foot-based pitch contours in expressive speech by Esther Klabbers and Jan P. H. van Santen; Collaborative filtering with interlaced generalized linear models by Nicolas Delannay, Michel Verleysen; DISSERTATION OF STANFORD UNIVERSITY: GENERALIZED LINEAR MODELS WITH REGULARIZATION by Mee Young Park (September, 2006)
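
For readers without SPSS, the Poisson loglinear model with an offset described in slides 23 to 26 can be sketched in plain Python. This is only an illustration under stated assumptions: the data below are made up (they are not the ships.sav values), there is a single numeric predictor rather than three factors, and the function name `fit_poisson_loglinear` is hypothetical. The fit uses iteratively reweighted least squares, a standard way of fitting generalized linear models; it is not claimed to be the exact routine SPSS runs.

```python
import math

def fit_poisson_loglinear(x, y, exposure, iterations=25):
    """Fit log(mu_i) = log(exposure_i) + b0 + b1 * x_i by IRLS."""
    offset = [math.log(t) for t in exposure]
    # Start from the intercept-only fit: overall event rate, zero slope.
    b0 = math.log(sum(y) / sum(exposure))
    b1 = 0.0
    for _ in range(iterations):
        eta = [o + b0 + b1 * xi for o, xi in zip(offset, x)]  # linear predictor
        mu = [math.exp(e) for e in eta]                       # inverse log link
        # Working response with the offset removed; working weights equal mu.
        z = [e - o + (yi - m) / m for e, o, yi, m in zip(eta, offset, y, mu)]
        s_w = sum(mu)
        s_wx = sum(m * xi for m, xi in zip(mu, x))
        s_wxx = sum(m * xi * xi for m, xi in zip(mu, x))
        s_wz = sum(m * zi for m, zi in zip(mu, z))
        s_wxz = sum(m * xi * zi for m, xi, zi in zip(mu, x, z))
        # Solve the 2x2 weighted normal equations by Cramer's rule.
        det = s_w * s_wxx - s_wx ** 2
        b0 = (s_wxx * s_wz - s_wx * s_wxz) / det
        b1 = (s_w * s_wxz - s_wx * s_wz) / det
    return b0, b1

# Hypothetical data: a period code, incident counts, and months of service.
x = [0, 0, 1, 1, 2, 2]
y = [2, 3, 6, 7, 12, 15]
exposure = [100, 150, 120, 140, 110, 130]

b0, b1 = fit_poisson_loglinear(x, y, exposure)
print("intercept:", b0, "slope:", b1)
print("rate ratio per unit of x:", math.exp(b1))
```

The offset plays the same role as "Logarithm of aggregate months of service" in the SPSS dialog: it enters the linear predictor with a fixed coefficient of 1, so the coefficients describe incident rates per month of service rather than raw counts.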

Editor's Notes

  1. Correlated or clustered data. The standard GLM assumes that the observations are uncorrelated. Extensions have been developed to allow for correlation between observations, as occurs for example in longitudinal studies and clustered designs. Generalized estimating equations (GEEs) allow for the correlation between observations without the use of an explicit probability model for the origin of the correlations, so there is no explicit likelihood. They are suitable when the random effects and their variances are not of inherent interest, as they allow for the correlation without explaining its origin. The focus is on estimating the average response over the population ("population-averaged" effects) rather than the regression parameters that would enable prediction of the effect of changing one or more components of X on a given individual. GEEs are usually used in conjunction with Huber-White standard errors. Generalized linear mixed models (GLMMs) are an extension to GLMs that includes random effects in the linear predictor, giving an explicit probability model that explains the origin of the correlations. The resulting "subject-specific" parameter estimates are suitable when the focus is on estimating the effect of changing one or more components of X on a given individual. GLMMs are a particular type of multilevel model (mixed model). 
In general, fitting GLMMs is more computationally complex and intensive than fitting GEEs. Hierarchical generalized linear models (HGLMs) are similar to GLMMs apart from two distinctions: the random effects can have any distribution in the exponential family, whereas current GLMMs nearly always have normal random effects; and they are not as computationally intensive, as instead of integrating out the random effects they are based on a modified form of likelihood known as the hierarchical likelihood or h-likelihood. The theoretical basis and accuracy of the methods used in HGLMs have been the subject of some debate in the statistical literature. As of 2008, the method is only available in one statistical software package, namely
  2. Canonical means: conforming to a general rule; reduced to the simplest or clearest scheme possible; the simplest form of a matrix (specifically the form of a square matrix that has zero off-diagonals).
  3. Assumptions. Not assumed. GZLM/GEE, compared to GLM, do not assume a normally distributed dependent variable (or normally distributed independents), nor linearity between the predictors and the dependent, nor homogeneity of variance for the range of the dependent variable. Linearity of the link function. The researcher still must select a link function such that there is a linear relation between the linear predictor (the right-hand side of the model equation) and the link function of the dependent. For example, in logistic regression models, one must check to see if there is "linearity in the logit," meaning the linear predictor is in fact linearly related to the logit of the dependent. Absence of high multicollinearity. As in other linear models, presence of high multicollinearity among the independents will inflate standard errors and confound interpretation of the relative contributions of the independents. Centered data. As in regression, centering may be necessary either to reduce multicollinearity or to make interpretation of coefficients meaningful. Centering is almost always recommended for independent variables which are components of interaction terms in a logistic model. Data distribution. The independents may be of any distribution so long as linearity of the link function is maintained. The dependent may assume any of a wide variety of distributions, including normal, inverse normal (inverse Gaussian), binomial, multinomial, and Poisson. Independent vs. correlated data. In GZLM, observations are assumed to be independent. In GEE, observations are assumed to be independent between subjects within any given time period, cluster, or repeated measures factor, but are assumed to be dependent within the same subject across repeated measures factors. That is, GEE assumes between-subjects independence and within-subjects dependence. 
In practical terms, if there are repeated measures, GEE, not GZLM, should be chosen and the repeated measures properly specified. Data levels. In both GZLM and GEE, the dependent (response) variable may be binary, counts, scale, or events-in-trials. "Scale" means interval or ordinal if it can be assumed that ordinal values represent ordered categories with a meaningful metric, so that distance comparisons between values are appropriate. For ordinal response variables, the model type should be multinomial (ordinal) logistic or ordinal probit. Factors are categorical. Covariates are scale variables. Weight and offset variables, if any, are also assumed to be scale. In GEE, the variables specified as repeated measures within-subjects effects, or the variables used to define subjects, cannot be used as dependent variables. Missing data. As in other forms of statistical analysis, missing data can lead to biased coefficients unless data are missing completely at random.
  4. Interpretation: There seems to be a Shopping style effect; on average, "biweekly" customers spend $378.52, while "weekly" customers spend $404.55, and "often" customers spend $406.76. There also appears to be a Gender effect; on average, males in the sample spend $430.30 compared to $365.66 for females. Lastly, there may be an interaction effect between Gender and Shopping style, because the mean differences in amount spent by shopping style vary between genders. For example, "biweekly" male customers tend to spend more than "often" male customers, but this trend is reversed for "biweekly" and "often" female customers. The N column in the table shows there are unequal cell sizes. Most customers prefer to shop on a weekly basis. The standard deviations appear relatively homogeneous, although you should check Levene's test and the spread-versus-level plots to be sure. Dependent variable: Amount spent. This table tests the null hypothesis that the variance of the error term is constant across the cells defined by the combination of factor levels. Since the significance value of the test, 0.330, is greater than 0.10, there is no reason to believe that the equal variances assumption is violated. Thus, the small differences in group standard deviations observed in the descriptive statistics table are due to random variation. The spread-versus-level plot is a scatterplot of the cell means and standard deviations from the descriptive statistics table. It provides a visual test of the equal variances assumption, with the added benefit of helping you to assess whether violations of the assumption are due to a relationship between the cell means and standard deviations. There is no apparent pattern in this plot, so there is no indication of such a relationship here. The tests of between-subjects effects help you to determine the significance of a factor. 
However, they do not indicate how the levels of a factor differ. The post hoc tests show the differences in model-predicted means for each pair of factor levels. The first column displays the different post hoc tests. The next two columns display the pair of factor levels being tested. When the significance value for the difference in Amount spent for a pair of factor levels is less than 0.05, an asterisk (*) is printed by the difference. In this case, there do not appear to be significant differences in the spending habits of "biweekly", "weekly", or "often" customers. Tamhane's T2 is generally more appropriate than Tukey's HSD when there are unequal cell sizes, but the results in this case are largely the same. The confidence intervals for Tamhane's T2 are only slightly wider than those for Tukey's HSD. Since the results of these two tests are not very different, it is safe for you to look at the results of the homogeneous subsets, which are available for Tukey's HSD but not for Tamhane's T2. The homogeneous subsets table takes the results of the post hoc tests and shows them in a more easily interpretable form. In the subset columns, the factor levels that do not have significantly different effects are displayed in the same column. In this example, the first subset contains the "biweekly", "weekly", and "often" customers. These are all the customers, so there are no other subsets. The post hoc tests suggest that efforts at enticing customers to shop more often than usual are wasted because they will not spend significantly more. However, the post hoc test results do not account for the levels of other factors, thus ignoring the possibility of an interaction effect with Gender seen in the descriptive statistics table. 
See the estimated marginal means to see how this might change your conclusions. This table displays the model-estimated marginal means and standard errors of Amount spent at the factor combinations of Gender and Shopping style. This table is useful for exploring the possible interaction effect between these two factors. In this example, a male customer who makes purchases weekly is expected to spend about $440.96, while one who makes purchases more often is expected to spend $407.77. A female customer who makes purchases weekly is expected to spend $361.72, while one who makes purchases more often is expected to spend $405.72. Thus, there is a difference between "weekly" and "often" customers, depending upon the gender of the customer. This fact suggests an interaction effect between Gender and Shopping style. If there were no interaction, you would expect the difference between shopping styles to remain constant for male and female customers. The interaction can be seen more easily in the profile plots. The profile plot is a visual representation of the marginal means table. The factor levels of Shopping style are shown along the horizontal axis. Separate lines are produced for each level of Gender. Alternately, the factor levels of Gender could be shown along the horizontal axis, with separate lines produced for each level of Shopping style. If there were no interaction effect, the lines in the plot would be parallel. Instead, you can see that the difference in spending between "weekly" and "often" customers is greater for female customers, as the line for male customers slopes downward and that for female customers slopes upward. This is a strong interaction effect and is unlikely to be due to chance, but you should check the tests of between-subjects effects for confirmation of its significance. This is an analysis of variance table. 
Each term in the model, plus the model as a whole, is tested for its ability to account for variation in the dependent variable. Note that variable labels are not displayed in this table. The significance value for each term, except STYLE, is less than 0.05. Therefore each term, except STYLE, is statistically significant. The partial eta squared statistic reports the "practical" significance of each term, based upon the ratio of the variation (sum of squares) accounted for by the term, to the sum of the variation accounted for by the term and the variation left to error. Larger values of partial eta squared indicate a greater amount of variation accounted for by the model term, to a maximum of 1. Here the individual terms, while statistically significant, do not have great effect on the value of Amount spent. The GLM Univariate procedure is useful for modeling the linear relationship between a dependent scale variable and one or more categorical and scale predictors. If you have only one factor, you can alternatively use the One-Way ANOVA procedure. If you only have covariates, use the Linear Regression procedure for more model-building, residual-checking, and output options.
  5. 15 custom link functions.• Identity. f(x)=x. The dependent variable is not transformed. This link can be used with any distribution.• Complementary log-log. f(x)=log(−log(1−x)). This is appropriate only with the binomial distribution.• Cumulative Cauchit. f(x)=tan(π(x−0.5)), applied to the cumulative probability of each category of the response. This is appropriate only with the multinomial distribution.• Cumulative complementary log-log. f(x)=ln(−ln(1−x)), applied to the cumulative probability of each category of the response. This is appropriate only with the multinomial distribution.• Cumulative logit. f(x)=ln(x/(1−x)), applied to the cumulative probability of each category of the response. This is appropriate only with the multinomial distribution.• Cumulative negative log-log. f(x)=−ln(−ln(x)), applied to the cumulative probability of each category of the response. This is appropriate only with the multinomial distribution.• Cumulative probit. f(x)=Φ⁻¹(x), applied to the cumulative probability of each category of the response, where Φ⁻¹ is the inverse standard normal cumulative distribution function. This is appropriate only with the multinomial distribution.• Log. f(x)=log(x). This link can be used with any distribution.• Log complement. f(x)=log(1−x). This is appropriate only with the binomial distribution.• Logit. f(x)=log(x/(1−x)). This is appropriate only with the binomial distribution.• Negative binomial. f(x)=log(x/(x+k⁻¹)), where k is the ancillary parameter of the negative binomial distribution. This is appropriate only with the negative binomial distribution.• Negative log-log. f(x)=−log(−log(x)). This is appropriate only with the binomial distribution.• Odds power. f(x)=[(x/(1−x))^α−1]/α, if α ≠ 0. f(x)=log(x/(1−x)), if α=0. α is the required number specification and must be a real number. This is appropriate only with the binomial distribution.• Probit. f(x)=Φ⁻¹(x), where Φ⁻¹ is the inverse standard normal cumulative distribution function. 
This is appropriate only with the binomial distribution.• Power. f(x)=x^α, if α ≠ 0. f(x)=log(x), if α=0. α is the required number specification and must be a real number. This link can be used with any distribution.
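
As a rough illustration of the binomial link functions listed in this note, the sketch below pairs several of them with their inverses and checks that each pair round-trips a probability. The dictionary name `LINKS` and its layout are assumptions made for this example; only the formulas come from the list above, and the inverses are derived algebraically from them.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the probit link

# Each entry maps a link name to (link, inverse link).
# The link takes a probability in (0, 1); the inverse takes a linear predictor.
LINKS = {
    "logit": (lambda p: math.log(p / (1 - p)),
              lambda eta: 1 / (1 + math.exp(-eta))),
    "probit": (_STD_NORMAL.inv_cdf,
               _STD_NORMAL.cdf),
    "complementary log-log": (lambda p: math.log(-math.log(1 - p)),
                              lambda eta: 1 - math.exp(-math.exp(eta))),
    "negative log-log": (lambda p: -math.log(-math.log(p)),
                         lambda eta: math.exp(-math.exp(-eta))),
    "log complement": (lambda p: math.log(1 - p),
                       lambda eta: 1 - math.exp(eta)),
}

for name, (link, inverse) in LINKS.items():
    for p in (0.1, 0.5, 0.9):
        eta = link(p)
        # Applying the inverse link should recover the original probability.
        assert abs(inverse(eta) - p) < 1e-9, name
    print(name, "at p = 0.9 gives", link(0.9))
```

All five links map a probability to an unbounded or half-bounded scale, but they differ in symmetry: logit and probit treat p and 1 − p symmetrically, while the complementary log-log and negative log-log links do not.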