Logistic Regression in Case-Control Study using R, a statistical tool
Satish Gupta
What is R?
• R is a free, open-source statistical programming language and environment.
• The language is very powerful for writing programs.
• Many statistical functions are already built in.
• Contributed packages extend its functionality to cutting-edge research.
Getting Started
• Go to www.r-project.org
• Downloads: CRAN (Comprehensive R Archive Network)
• Set your mirror: choose a location close to you.
• Select your platform: Windows 95 or later, MacOS or UNIX
Basic operators and calculations
Comparison operators
• equal: ==
• not equal: !=
• greater/less than: > <
• greater/less than or equal: >= <=
Example: 1 == 1 # Returns TRUE
Basic operators and calculations
Logical operators
• AND: &
x <- 1:10; y <- 10:1 # Creates the sample vectors 'x' and 'y'.
x > y & x > 5 # Returns TRUE where both comparisons are TRUE.
• OR: |
x == y | x != y # Returns TRUE where at least one comparison is TRUE.
• NOT: !
!(x > y) # '!' negates a logical vector. It binds more loosely than '>', so !x > y gives the same result.
Basic operators and calculations
Calculations
• Four basic arithmetic operations: addition, subtraction, multiplication and division
1 + 1; 1 - 1; 1 * 1; 1 / 1 # Returns the results of basic arithmetic calculations.
• Calculations on vectors
x <- 1:10; sum(x); mean(x); sd(x); sqrt(x) # For the vector x: its sum, mean, standard deviation and element-wise square roots.
x <- 1:10; y <- 1:10; x + y # Adds the vectors x and y element by element.
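The operators above combine naturally. A minimal sketch, reusing the sample vectors x and y from the slides, of how a logical vector can be counted and used to pick elements (the object names both, n_true and vector_summary are made up for this illustration):

```r
x <- 1:10
y <- 10:1

# Element-wise AND: TRUE where x exceeds y and x exceeds 5
both <- x > y & x > 5

# TRUE counts as 1, so sum() gives the number of TRUE elements
n_true <- sum(both)

# A logical vector can be used as an index to keep matching elements
selected <- x[both]

# Several summaries of the same vector collected in one named vector
vector_summary <- c(total = sum(x), average = mean(x), spread = sd(x))
print(selected)
print(vector_summary)
```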
R-Graphics
R provides comprehensive graphics utilities for visualizing and exploring scientific data. These include:
• Scatter plots
• Line plots
• Bar plots
• Pie charts
• Heatmaps
• Venn diagrams
• Density plots
• Box plots
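A few of these plot types can be sketched with base R alone. The data below are simulated purely for illustration (marker and status are hypothetical names, not variables from the lung cancer data set):

```r
set.seed(1)

# Simulated measurements for 50 controls and 50 cases (hypothetical data)
marker <- rnorm(100, mean = 70, sd = 10)
status <- factor(rep(c("control", "case"), each = 50))

# Box plot of the measurement by group
boxplot(marker ~ status, ylab = "marker", main = "Box plot by status")

# Density plot of the measurement
plot(density(marker), main = "Density plot")

# Scatter plot of the measurement against observation order
plot(seq_along(marker), marker, xlab = "index", ylab = "marker")

# Bar plot of group sizes
counts <- table(status)
barplot(counts, main = "Bar plot of group sizes")
```

When run non-interactively, base R writes these plots to a file (Rplots.pdf by default) instead of opening a window.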
Data handling in R
• Load data: mydata = read.csv("/path/mydata.csv")
• See data on screen: print(mydata)
• See top part of data: head(mydata)
• Specific rows and columns of data: mydata[1:10, 1:3]
• To get the type of data: class(mydata)
• Changing the class of data: newdata = as.matrix(mydata)
• Summary of data: summary(mydata)
• Selecting (KEEPING) variables (columns)
newdata = mydata[c(1, 3:5)]
Data handling in R
• Selecting observations
newdata = subset(mydata, age >= 20 | age < 10, select = c(ID, weight))
newdata = subset(mydata, sex == "Male" & age > 25, select = weight:income)
• Excluding (DROPPING) variables (columns)
newdata = mydata[c(-3, -5)]
mydata$v3 = NULL
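To try the selection commands without a file on disk, here is a small sketch with a made-up data frame. The column names follow the slides (ID, age, sex, weight, income); the values are invented for illustration:

```r
# Hypothetical data; only the column names come from the slides
mydata <- data.frame(
  ID     = 1:6,
  age    = c(8, 15, 22, 30, 41, 9),
  sex    = c("Male", "Female", "Male", "Male", "Female", "Female"),
  weight = c(25, 50, 70, 82, 64, 28),
  income = c(0, 0, 1200, 2500, 3100, 0)
)

# Rows where age is at least 20 or below 10, keeping two columns
young_or_adult <- subset(mydata, age >= 20 | age < 10, select = c(ID, weight))

# Men older than 25, keeping the columns from weight through income
men_over_25 <- subset(mydata, sex == "Male" & age > 25, select = weight:income)

# Keep columns 1, 3, 4 and 5 by position
kept <- mydata[c(1, 3:5)]

# Drop columns 3 and 5 by position
dropped <- mydata[c(-3, -5)]

print(young_or_adult)
print(men_over_25)
```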
R-Library
• R offers many tools, distributed as "packages", for different kinds of analysis, including genetic and genomic data.
• Depending on where a package is hosted, it can be installed from one of two sources:
Using CRAN (Comprehensive R Archive Network):
install.packages("package_name")
Using Bioconductor:
install.packages("BiocManager")
BiocManager::install("package_name") # Replaces the older source("http://bioconductor.org/biocLite.R"); biocLite("package_name")
R-Library
• To load a package:
library() # Lists all packages that are installed on the system.
library(genetics) # Loads the "genetics" package for genetic data analysis.
library(help = genetics) # Lists all functions/objects of the "genetics" package.
?function_name # Opens the documentation of a function.
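A common pattern is to install a package only when it is missing and then load it. A minimal sketch, using the survival package as the example because it ships with standard R installations (the variable names pkg and is_attached are chosen for this illustration):

```r
pkg <- "survival"  # example package; any CRAN package name could be used

# Install only if the package is not already available
# (install.packages() needs a configured CRAN mirror and network access)
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

# Load the package by name held in a character variable
library(pkg, character.only = TRUE)

# Confirm that the package is attached to the search path
is_attached <- paste0("package:", pkg) %in% search()
print(is_attached)
```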
What is Logistic Regression?
• Logistic regression describes the relationship between a dichotomous response variable and a set of explanatory variables.
• Logistic regression is often used because the relationship between the dependent variable (DV, a discrete variable) and a predictor is non-linear.
Logistic Regression
• A general model:
$$\operatorname{logit}(p_{\text{disease}}) = \log\left(\frac{p_{\text{disease}}}{1 - p_{\text{disease}}}\right) = \beta_0 + \beta_1 X_1 + \dots + \beta_J X_J$$
Where:
p_disease is the probability that an individual has a particular disease.
β0 is the intercept
β1, β2 … βJ are the coefficients (effects) of genetic factors
X1, X2 … XJ are the variables of genetic factors
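The model can be checked numerically for a single predictor. In this sketch the coefficient values are invented for illustration; plogis() and qlogis() are base R's inverse logit and logit functions:

```r
# Hypothetical coefficients, chosen only to illustrate the formula
beta0 <- -1.0
beta1 <- 0.8
x1 <- c(0, 1, 2)

# Right-hand side of the model: the log-odds of disease
log_odds <- beta0 + beta1 * x1

# Inverse logit turns log-odds into probabilities between 0 and 1
p_disease <- plogis(log_odds)

# Applying the logit to the probabilities recovers the log-odds
recovered <- qlogis(p_disease)

# A one-unit increase in x1 multiplies the odds by exp(beta1)
odds <- p_disease / (1 - p_disease)
odds_ratio <- odds[2] / odds[1]

print(p_disease)
print(c(odds_ratio = odds_ratio, exp_beta1 = exp(beta1)))
```

This is why software reports exp(β) as the odds ratio for a predictor.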
Assumptions
• Logistic regression does not assume normality, a linear relationship with the outcome, or homogeneity of variance for the independent variables. It does assume that the log-odds of the outcome are linear in the predictors.
• Because it does not impose these requirements, it is preferred to discriminant analysis when the data do not satisfy these assumptions.
Questions
• What is the relative importance of each predictor variable?
• How does each predictor variable affect the outcome?
• Does a predictor variable make the solution better or worse, or have no effect?
• Are there interactions among predictors?
• Does adding interactions among predictors (continuous or categorical) improve the model?
• What is the strength of association between the outcome variable and a set of predictors?
• In model comparison a non-significant difference is often the desired result, so strength of association is reported even for non-significant effects.
Types of Logistic Regression
• Unconditional logistic regression
• Conditional logistic regression
** Rules of thumb
• Use conditional logistic regression if matching has been done, and unconditional if there has been no matching.
• When in doubt, use conditional, because it always gives unbiased results. The unconditional method is said to overestimate the odds ratio if it is not appropriate.
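The two approaches can be compared on simulated 1:1 matched data. This is a sketch under stated assumptions: the data are invented, exposed is a hypothetical binary exposure, and the simulation only illustrates the syntax, not the bias described above. clogit() and strata() come from the survival package. For 1:1 matching, the conditional odds ratio equals the ratio of discordant pairs, which the sketch also computes by hand.

```r
library(survival)  # provides clogit() and strata()

set.seed(42)
n_sets <- 200

# One case (Status = 1) followed by one control (Status = 0) per matched set
matched <- data.frame(
  Matset = rep(seq_len(n_sets), each = 2),
  Status = rep(c(1, 0), times = n_sets)
)

# Hypothetical binary exposure, simulated as more common among cases
matched$exposed <- rbinom(nrow(matched), 1,
                          ifelse(matched$Status == 1, 0.6, 0.4))

# Conditional model: compares case and control within each matched set
cond_fit <- clogit(Status ~ exposed + strata(Matset), data = matched)

# Unconditional model: ignores the matching
uncond_fit <- glm(Status ~ exposed, family = binomial, data = matched)

or_conditional <- exp(coef(cond_fit))[["exposed"]]
or_unconditional <- exp(coef(uncond_fit))[["exposed"]]

# Hand check for 1:1 matching: ratio of discordant pairs
case_exp <- matched$exposed[matched$Status == 1]
ctrl_exp <- matched$exposed[matched$Status == 0]
n_case_only <- sum(case_exp == 1 & ctrl_exp == 0)
n_ctrl_only <- sum(case_exp == 0 & ctrl_exp == 1)

print(c(conditional = or_conditional,
        discordant_ratio = n_case_only / n_ctrl_only,
        unconditional = or_unconditional))
```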
Data Format
Status  Matset  Se_Quartiles  GPX1  GPX4  SEP15  TXN2
1       1       <60           CT    TT    AG     AG
0       1       >60 – 70      CC    CC    GG     GG
1       2       <60           TT    CC    AG     AA
0       2       >70 – 80      CC    CT    GG     GG
1       3       >80           CC    CC    AA     AA
0       3       >60 – 70      CT    TT    GG     GG
1       4       <60           CC    CC    AA     AG
0       4       >70 – 80      TT    TT    GG     GG
1       5       >80           CC    CC    AG     AA
0       5       <60           CC    CC    GG     GG
1       6       >70 – 80      CT    TT    AA     AA
0       6       >80           CC    CC    GG     AG
1       7       >60 – 70      TT    CC    AA     AG
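A data frame in this layout can be built by hand to see how R treats the columns. The sketch below copies the first six rows of four columns from the table. The category labels are written with a plain hyphen (">60-70"), which is an assumption made here to keep the code ASCII-only. Setting the factor levels explicitly makes the first level the reference category in later models.

```r
# First three matched sets from the table above (four of the seven columns)
lung_demo <- data.frame(
  Status       = c(1, 0, 1, 0, 1, 0),
  Matset       = c(1, 1, 2, 2, 3, 3),
  Se_Quartiles = c("<60", ">60-70", "<60", ">70-80", ">80", ">60-70"),
  GPX1         = c("CT", "CC", "TT", "CC", "CC", "CT")
)

# Explicit level order: the first level becomes the reference category
lung_demo$Se_Quartiles <- factor(lung_demo$Se_Quartiles,
                                 levels = c("<60", ">60-70", ">70-80", ">80"))
lung_demo$GPX1 <- factor(lung_demo$GPX1, levels = c("CC", "CT", "TT"))

# Each matched set should contain exactly one case and one control
set_by_status <- table(lung_demo$Matset, lung_demo$Status)
print(set_by_status)
print(levels(lung_demo$Se_Quartiles))
```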
Data and Library loading
• Load and use data in R (using lung cancer data from PLoS One 2013, 8(3):e59051).
lung = read.csv("/path/lung.csv", sep = "\t", header = TRUE)
• Load the library and use the data for analysis
library(epicalc)
use(lung)
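The sep = "\t" argument tells read.csv() that columns are separated by tabs. A small self-contained sketch, reading tab-separated text from a string instead of a file (the contents of raw_text are made up for illustration):

```r
# Tab-separated text standing in for a file on disk (hypothetical contents)
raw_text <- "Status\tMatset\tGPX1\n1\t1\tCT\n0\t1\tCC\n1\t2\tTT\n0\t2\tCC\n"

# read.csv() with an explicit tab separator
demo <- read.csv(text = raw_text, sep = "\t", header = TRUE)

# read.delim() uses a tab separator and a header row by default
same <- read.delim(text = raw_text)

str(demo)
```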
Data Analysis
• Performing conditional logistic regression (Case vs. Control):
clogit_lung = clogit(Status ~ Se_Quartiles + strata(Matset), data = .data)
clogistic.display(clogit_lung)
                        OR(95%CI)             P(Wald's test)   P(LR-test)
Quartiles: ref.=<60                                            <0.001
   >60 – 70             0.4 (0.15 – 1.09)     0.074
   >70 – 80             0.11 (0.03 – 0.33)    <0.001
   >80                  0.10 (0.03 – 0.34)    <0.001
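The odds ratios, confidence intervals and Wald P values in such a table can be derived from the fitted coefficients and their standard errors. The sketch below does this on simulated matched data. The data frame demo, the variable quartile and its levels Q1 to Q4 are hypothetical, and this is not necessarily how clogistic.display computes its output.

```r
library(survival)  # provides clogit() and strata()

set.seed(7)
n_sets <- 150

# Simulated 1:1 matched sets (hypothetical data)
demo <- data.frame(
  Matset = rep(seq_len(n_sets), each = 2),
  Status = rep(c(1, 0), times = n_sets)
)

# Four-level exposure, with cases simulated as more often in the lowest level
q_levels <- c("Q1", "Q2", "Q3", "Q4")
case_draw <- sample(q_levels, nrow(demo), replace = TRUE,
                    prob = c(0.4, 0.3, 0.2, 0.1))
ctrl_draw <- sample(q_levels, nrow(demo), replace = TRUE)
demo$quartile <- factor(ifelse(demo$Status == 1, case_draw, ctrl_draw),
                        levels = q_levels)

fit <- clogit(Status ~ quartile + strata(Matset), data = demo)

# Coefficients are log odds ratios relative to the reference level Q1
est <- coef(fit)
se <- sqrt(diag(vcov(fit)))
z <- qnorm(0.975)

or_table <- cbind(
  OR     = exp(est),
  lower  = exp(est - z * se),          # lower 95% Wald limit
  upper  = exp(est + z * se),          # upper 95% Wald limit
  p_wald = 2 * pnorm(-abs(est / se))   # two-sided Wald P value
)
print(round(or_table, 3))
```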
Data Analysis
• Performing conditional logistic regression (Case vs. Control):
clogit_lung = clogit(Status ~ GPX1 + strata(Matset), data = .data)
clogistic.display(clogit_lung)
                  OR(95%CI)             P(Wald's test)   P(LR-test)
GPX1: ref.=CC                                            0.032
   CT             0.44 (0.22 – 0.86)    0.017
   TT             0.42 (0.13 – 1.38)    0.151
Data Analysis
• Performing conditional logistic regression (Case vs. Control):
clogit_lung = clogit(Status ~ Se_Quartiles + GPX1 + strata(Matset), data = .data)
clogistic.display(clogit_lung)
                        crude OR(95%CI)       adj. OR(95%CI)        P(Wald's test)   P(LR-test)
Quartiles: ref.=<60                                                                  <0.001
   >60 – 70             0.4 (0.15 – 1.09)     0.32 (0.11 – 0.96)    0.042
   >70 – 80             0.11 (0.03 – 0.33)    0.09 (0.02 – 0.3)     <0.001
   >80                  0.1 (0.03 – 0.34)     0.05 (0.01 – 0.23)    <0.001
GPX1: ref.=CC                                                                        0.006
   CT                   0.44 (0.22 – 0.86)    0.26 (0.11 – 0.65)    0.004
   TT                   0.42 (0.13 – 1.38)    0.44 (0.09 – 2.18)    0.313
Quartiles is the environmental factor; GPX1 is the genetic factor.
Data Analysis
• Performing unconditional logistic regression (Case vs. Control):
ulogit_lung = glm(Status ~ Se_Quartiles, family = binomial, data = .data)
logistic.display(ulogit_lung)
                        OR(95%CI)             P(Wald's test)   P(LR-test)
Quartiles: ref.=<60                                            <0.001
   >60 – 70             0.41 (0.17 – 1.02)    0.054
   >70 – 80             0.13 (0.05 – 0.34)    <0.001
   >80                  0.17 (0.07 – 0.42)    <0.001
Data Analysis
• Performing unconditional logistic regression (Case vs. Control):
ulogit_lung = glm(Status ~ GPX1, family = binomial, data = .data)
logistic.display(ulogit_lung)
                  OR(95%CI)             P(Wald's test)   P(LR-test)
GPX1: ref.=CC                                            0.034
   CT             0.45 (0.24 – 0.85)    0.014
   TT             0.44 (0.14 – 1.36)    0.156
Data Analysis
• Performing unconditional logistic regression (Case vs. Control):
ulogit_lung = glm(Status ~ Se_Quartiles + GPX1, family = binomial, data = .data)
logistic.display(ulogit_lung)
                        crude OR(95%CI)       adj. OR(95%CI)        P(Wald's test)   P(LR-test)
Quartiles: ref.=<60                                                                  <0.001
   >60 – 70             0.41 (0.17 – 1.02)    0.43 (0.17 – 1.08)    0.074
   >70 – 80             0.13 (0.05 – 0.34)    0.13 (0.05 – 0.34)    <0.001
   >80                  0.17 (0.07 – 0.42)    0.15 (0.06 – 0.39)    <0.001
GPX1: ref.=CC                                                                        0.024
   CT                   0.45 (0.24 – 0.85)    0.40 (0.20 – 0.80)    0.01
   TT                   0.44 (0.14 – 1.36)    0.42 (0.12 – 1.41)    0.161
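Crude and adjusted odds ratios can also be obtained directly from glm() fits. The sketch below uses simulated, unmatched data. The variables exposure and genotype and the helper odds_ratios() are hypothetical and written for this illustration. Wald intervals from confint.default() are used, which may differ slightly from what logistic.display reports.

```r
set.seed(11)
n <- 400

# Simulated predictors (hypothetical data)
demo <- data.frame(
  exposure = factor(sample(c("low", "high"), n, replace = TRUE),
                    levels = c("low", "high")),
  genotype = factor(sample(c("CC", "CT", "TT"), n, replace = TRUE,
                           prob = c(0.5, 0.35, 0.15)),
                    levels = c("CC", "CT", "TT"))
)

# Outcome simulated from a logistic model with protective effects
linear_part <- -0.2 - 1.0 * (demo$exposure == "high") -
  0.7 * (demo$genotype == "CT")
demo$Status <- rbinom(n, 1, plogis(linear_part))

# Crude model: one predictor. Adjusted model: both predictors.
crude <- glm(Status ~ exposure, family = binomial, data = demo)
adjusted <- glm(Status ~ exposure + genotype, family = binomial, data = demo)

# Hypothetical helper: odds ratios with 95% Wald limits, intercept dropped
odds_ratios <- function(fit) {
  out <- exp(cbind(OR = coef(fit), confint.default(fit)))
  out[rownames(out) != "(Intercept)", , drop = FALSE]
}

crude_or <- odds_ratios(crude)
adjusted_or <- odds_ratios(adjusted)

# Likelihood-ratio test: does adding genotype improve the crude model?
lr_test <- anova(crude, adjusted, test = "Chisq")

print(round(crude_or, 2))
print(round(adjusted_or, 2))
print(lr_test)
```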
Something More
• Changing the default reference
GPX1 = relevel(GPX1, ref = "TT")
pack()
• Saving the result
result = clogistic.display(clogit_lung)
write.csv(result$table, file = "path/result.csv")
write.table(result$table, file = "path/result.xls", sep = "\t")
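Both steps can be tried on a toy factor. In this sketch the genotype values are made up and the output goes to a temporary file, so nothing is written to a real path:

```r
# Hypothetical genotype factor; levels sort alphabetically, so "CC" is first
genotype <- factor(c("CC", "CT", "TT", "CC", "CT"))
default_reference <- levels(genotype)[1]

# relevel() moves the chosen level to the front, making it the reference
genotype_tt <- relevel(genotype, ref = "TT")

# Small table of counts per level, in the new level order
level_counts <- data.frame(
  level = levels(genotype_tt),
  count = as.vector(table(genotype_tt))
)

# Save as tab-separated text and read it back to confirm the contents
out_file <- tempfile(fileext = ".tsv")
write.table(level_counts, file = out_file, sep = "\t",
            row.names = FALSE, quote = FALSE)
read_back <- read.delim(out_file)

print(read_back)
```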
Summary: regression models
• Regression models can be used to describe the average effect of predictors on outcomes in your data set.
• They can tell you how likely it is that an effect is just due to chance.
• They can look at each predictor "adjusting for" the others (estimating what would happen if all others were held constant).
Thanks to,
Prof. Virasakdi Chongsuvivatwong
Epidemiology Unit,
Faculty of Medicine,
Prince of Songkla University, Thailand

Logistic Regression in Case-Control Study

  • 1. Logistic Regression in Case- Control study using – A statistical tool Satish Gupta
  • 2. What is R?  The R statistical programming language is a free open source package.  The language is very powerful for writing programs.  Many statistical functions are already built in.  Contributed packages expand the functionality to cutting edge research.
  • 3. Getting Started  Go to www.r-project.org  Downloads: CRAN (Comprehensive R Archive Network)  Set your Mirror: location close to you.  Select Windows 95 or later, MacOS or UNIX platforms
  • 5. Basic operators and calculations Comparison operators  equal: ==  not equal: !=  greater/less than: > <  greater/less than or equal: >= <= Example: 1 == 1 # Returns TRUE
  • 6. Basic operators and calculations Logical operators  AND: & x <- 1:10; y <- 10:1 # Creates the sample vectors 'x' and 'y'. x > y & x > 5 # Returns TRUE where both comparisons return TRUE.  OR: | x == y | x != y # Returns TRUE where at least one comparison is TRUE.  NOT: ! !x > y # The '!' sign returns the negation (opposite) of a logical vector.
  • 7. Basic operators and calculations Calculations  Four basic arithmetic functions: addition, subtraction, multiplication and division 1 + 1; 1 - 1; 1 * 1; 1 / 1 # Returns results of basic arithmetic calculations.  Calculations on vectors x <- 1:10; sum(x); mean(x), sd(x); sqrt(x) # Calculates for the vector x its sum, mean, standard deviation and square root. x <- 1:10; y <- 1:10; x + y # Calculates the sum for each element in the vectors x and y.
  • 8. R-Graphics R provides comprehensive graphics utilities for visualizing and exploring scientific data. It includes:  Scatter plots  Line plots  Bar plots  Pie charts  Heatmaps  Venn diagrams  Density plots  Box plots
  • 9. Data handling in R  Load data: mydata = read.csv(“/path/mydata.csv”)  See data on screen: data(mydata)  See top part of data: head(mydata)  Specific number of rows and column of data: mydata[1:10,1:3]  To get a type of data: class(mydata)  Changing class of data: newdata = as.matrix(mydata)  Summary of data: summary(mydata)  Selecting (KEEPING) variables (columns) newdata = mydata[c(1,3:5)]
  • 10. Data handling in R  Selecting observations newdata= subset(mydata, age>=20 | age <10, select=c(ID, weight) newdata= subset(mydata, sex==“Male” & age >25, select=weight:income)  Excluding (DROPPING) variables (columns) newdata = mydata[c(-3,-5)] mydata$v3 = NULL
  • 11. R-Library  There are many tools defined as “package” are present in R for different kind of analysis including data from genetics and genomics.  Depending upon the availability of library, it can be downloaded from two sources Using CRAN (Comprehensive R Archive Network) as: install.packages(“package_name”) Using Bioconductor as: source("http://bioconductor.org/biocLite.R") biocLite(“package_name”)
  • 12. R-Library  To load a package, library() #Lists all libraries/packages that are available on a system. library(genetics) #Package for genetics data analysis library(help=genetics) #Lists all functions/objects of “genetics” package ?function #Opens documentation of a function
  • 13. What is Logistic Regression?  Logistic regression describes the relationship between a dichotomous response variable and a set of explanatory variables.  Logistic regression is often used because the relationship between the DV (a discrete variable) and a predictor is non-linear.
  • 14.  A General Model: Logistic Regression JJ disease disease disease XX p p p βββ +++= − = 110) 1 log()logit( Where: pdisease is the probability that an individual has a particular disease. β0 is the intercept β1, β2 …βJ are the coefficients (effects) of genetic factors X1, X2 …XJ are the variables of genetic factors
  • 15. Assumptions
      Logistic regression does not assume normality, a linear relationship with the outcome, or homogeneity of variance for the independent variables. (It does assume that the log-odds of the outcome are linear in the predictors.)
      Because it does not impose these requirements, it is preferred to discriminant analysis when the data do not satisfy these assumptions.
  • 16. Questions
      What is the relative importance of each predictor variable?
      How does each predictor variable affect the outcome?
      Does a predictor variable make the solution better or worse, or does it have no effect?
      Are there interactions among predictors?
      Does adding interactions among predictors (continuous or categorical) improve the model?
      What is the strength of association between the outcome variable and a set of predictors?
      Often in model comparison you want non-significant differences, so strength of association is reported even for non-significant effects.
  • 17. Types of Logistic Regression
      Unconditional logistic regression
      Conditional logistic regression
      Rules of thumb:
      Use conditional logistic regression if matching has been done, and unconditional logistic regression if there has been no matching.
      When in doubt, use the conditional method, because it gives unbiased results whether or not the data are matched. The unconditional method is said to overestimate the odds ratio when it is applied to matched data.
  • 18. Data Format
      Status  Matset  Se_Quartiles  GPX1  GPX4  SEP15  TXN2
      1       1       <60           CT    TT    AG     AG
      0       1       >60 – 70      CC    CC    GG     GG
      1       2       <60           TT    CC    AG     AA
      0       2       >70 – 80      CC    CT    GG     GG
      1       3       >80           CC    CC    AA     AA
      0       3       >60 – 70      CT    TT    GG     GG
      1       4       <60           CC    CC    AA     AG
      0       4       >70 – 80      TT    TT    GG     GG
      1       5       >80           CC    CC    AG     AA
      0       5       <60           CC    CC    GG     GG
      1       6       >70 – 80      CT    TT    AA     AA
      0       6       >80           CC    CC    GG     AG
      1       7       >60 – 70      TT    CC    AA     AG
  • 19. Data and Library loading
      Load the data in R (using lung cancer data from PLoS One 2013, 8(3):e59051):
      lung = read.csv("/path/lung.csv", sep = "\t", header = TRUE)
      Load the library and attach the data for analysis:
      library(epicalc)
      use(lung)
  • 20. Data Analysis
      Performing conditional logistic regression (Case vs. Control):
      clogit_lung = clogit(Status ~ Se_Quartiles + strata(Matset), data = .data)
      clogistic.display(clogit_lung)

                               OR (95% CI)          P (Wald's test)  P (LR-test)
      Quartiles: ref. = <60                                          <0.001
        >60 – 70               0.4 (0.15 – 1.09)    0.074
        >70 – 80               0.11 (0.03 – 0.33)   <0.001
        >80                    0.10 (0.03 – 0.34)   <0.001
  • 21. Data Analysis
      Performing conditional logistic regression (Case vs. Control):
      clogit_lung = clogit(Status ~ GPX1 + strata(Matset), data = .data)
      clogistic.display(clogit_lung)

                               OR (95% CI)          P (Wald's test)  P (LR-test)
      GPX1: ref. = CC                                                0.032
        CT                     0.44 (0.22 – 0.86)   0.017
        TT                     0.42 (0.13 – 1.38)   0.151
  • 22. Data Analysis
      Performing conditional logistic regression (Case vs. Control):
      clogit_lung = clogit(Status ~ Se_Quartiles + GPX1 + strata(Matset), data = .data)
      clogistic.display(clogit_lung)

                               crude OR (95% CI)    adj. OR (95% CI)     P (Wald's test)  P (LR-test)
      Quartiles: ref. = <60                                                               <0.001
        >60 – 70               0.4 (0.15 – 1.09)    0.32 (0.11 – 0.96)   0.042
        >70 – 80               0.11 (0.03 – 0.33)   0.09 (0.02 – 0.3)    <0.001
        >80                    0.1 (0.03 – 0.34)    0.05 (0.01 – 0.23)   <0.001
      GPX1: ref. = CC                                                                     0.006
        CT                     0.44 (0.22 – 0.86)   0.26 (0.11 – 0.65)   0.004
        TT                     0.42 (0.13 – 1.38)   0.44 (0.09 – 2.18)   0.313

      (Quartiles is the environmental factor; GPX1 is the genetic factor.)
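For readers without the lung data set, the conditional analysis can be sketched on simulated matched pairs. Everything in the example below is an assumption made for illustration: the data frame sim, the binary exposed variable, and the exposure probabilities are invented, and the odds ratio is read from summary() rather than from clogistic.display(). The clogit() and strata() functions come from the survival package.

```r
library(survival)  # provides clogit() and strata()

set.seed(1)
n_sets <- 100

# Simulated 1:1 matched data (hypothetical, not the lung cancer data):
# each matched set holds one case (Status = 1) and one control (Status = 0).
sim <- data.frame(
  Matset = rep(seq_len(n_sets), each = 2),
  Status = rep(c(1, 0), times = n_sets)
)

# Binary exposure, made more common among controls so it looks protective.
sim$exposed <- rbinom(nrow(sim), 1, ifelse(sim$Status == 1, 0.3, 0.6))

# strata(Matset) tells clogit() which rows belong to the same matched set.
fit <- clogit(Status ~ exposed + strata(Matset), data = sim)

# Odds ratio with its 95% confidence limits, on the odds-ratio scale.
or_table <- summary(fit)$conf.int
or_table
```

Only matched sets in which the case and the control differ in exposure contribute to the estimate, which is why the matching variable enters through strata() instead of as an ordinary covariate.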
  • 23. Data Analysis
      Performing unconditional logistic regression (Case vs. Control):
      ulogit_lung = glm(Status ~ Se_Quartiles, family = binomial, data = .data)
      logistic.display(ulogit_lung)

                               OR (95% CI)          P (Wald's test)  P (LR-test)
      Quartiles: ref. = <60                                          <0.001
        >60 – 70               0.41 (0.17 – 1.02)   0.054
        >70 – 80               0.13 (0.05 – 0.34)   <0.001
        >80                    0.17 (0.07 – 0.42)   <0.001
  • 24. Data Analysis
      Performing unconditional logistic regression (Case vs. Control):
      ulogit_lung = glm(Status ~ GPX1, family = binomial, data = .data)
      logistic.display(ulogit_lung)

                               OR (95% CI)          P (Wald's test)  P (LR-test)
      GPX1: ref. = CC                                                0.034
        CT                     0.45 (0.24 – 0.85)   0.014
        TT                     0.44 (0.14 – 1.36)   0.156
  • 25. Data Analysis
      Performing unconditional logistic regression (Case vs. Control) with both factors in the model:
      ulogit_lung = glm(Status ~ Se_Quartiles + GPX1, family = binomial, data = .data)
      logistic.display(ulogit_lung)

                               crude OR (95% CI)    adj. OR (95% CI)     P (Wald's test)  P (LR-test)
      Quartiles: ref. = <60                                                               <0.001
        >60 – 70               0.41 (0.17 – 1.02)   0.43 (0.17 – 1.08)   0.074
        >70 – 80               0.13 (0.05 – 0.34)   0.13 (0.05 – 0.34)   <0.001
        >80                    0.17 (0.07 – 0.42)   0.15 (0.06 – 0.39)   <0.001
      GPX1: ref. = CC                                                                     0.024
        CT                     0.45 (0.24 – 0.85)   0.40 (0.20 – 0.80)   0.01
        TT                     0.44 (0.14 – 1.36)   0.42 (0.12 – 1.41)   0.161
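The unconditional analysis can also be reproduced in base R without the display helper. The sketch below uses simulated, unmatched data; the genotype frequencies and effect sizes are assumptions chosen only to give the model something to estimate. It shows one common way to turn glm() coefficients into odds ratios with Wald confidence limits.

```r
set.seed(2)
n <- 400

# Simulated unmatched data (hypothetical): a three-level genotype factor
# with "CC" as the reference level and lower disease odds for CT and TT.
genotype <- factor(
  sample(c("CC", "CT", "TT"), n, replace = TRUE, prob = c(0.5, 0.35, 0.15)),
  levels = c("CC", "CT", "TT")
)
log_odds <- 0.5 + c(CC = 0, CT = -0.8, TT = -0.8)[as.character(genotype)]
status <- rbinom(n, 1, plogis(log_odds))

fit <- glm(status ~ genotype, family = binomial)

# Exponentiate the coefficients and their Wald limits to get ORs with 95% CIs.
or_table <- exp(cbind(OR = coef(fit), confint.default(fit)))
round(or_table, 2)

# Likelihood-ratio test for the genotype term as a whole.
anova(fit, test = "Chisq")
```

The row labelled (Intercept) holds the odds of disease in the reference group, not an odds ratio; the remaining rows compare each genotype with the reference.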
  • 26. Something More
      Changing the default reference level:
      GPX1 = relevel(GPX1, ref = "TT")
      pack()
      Saving the result:
      result = clogistic.display(clogit_lung)
      write.csv(result$table, file = "path/result.csv")
      write.table(result$table, file = "path/result.xls", sep = "\t")
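The effect of relevel() and of a tab-separated export can be checked in base R. In this small sketch the genotype vector is hypothetical, the two odds ratios are copied from the GPX1 slide purely as sample values, and a temporary file stands in for the result path.

```r
# Hypothetical genotype vector; "CC" is the default reference level because
# factor levels are sorted alphabetically.
GPX1 <- factor(c("CC", "CT", "TT", "CT", "CC", "TT"))
levels(GPX1)

# relevel() moves the chosen level to the front; the others keep their order.
GPX1_tt <- relevel(GPX1, ref = "TT")
levels(GPX1_tt)

# Write a small table as tab-separated text to a temporary file.
tab <- data.frame(term = c("CT", "TT"), OR = c(0.44, 0.42))
out <- tempfile(fileext = ".txt")
write.table(tab, file = out, sep = "\t", row.names = FALSE, quote = FALSE)
readLines(out)
```

Note that write.csv() always writes comma-separated output and ignores a sep argument, so write.table() is the function to use for tab-separated files.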
  • 27. Summary: regression models
      Regression models can be used to describe the average effect of predictors on outcomes in your data set.
      They can tell you how likely it is that an effect is due to chance alone.
      They can look at each predictor while "adjusting for" the others (estimating what would happen if all the others were held constant).
  • 28. Thanks to, Prof. Virasakdi Chongsuvivatwong Epidemiology Unit, Faculty of Medicine, Prince of Songkla University, Thailand

Editor's Notes

  1. Coefficients are calculated by maximum likelihood estimation (MLE).
  2. In order to test hypotheses in logistic regression, we have used the likelihood ratio test and the Wald test.
  3. If a confidence interval for a difference in means includes 0, we can say that there is no significant difference between the means of the two populations at the given level of confidence. The width of the confidence interval gives some idea of how uncertain we are about the difference in the means; a very wide interval may indicate that more data should be collected before anything definite can be said. For an odds ratio, a confidence interval that includes 1.0 means that the association between the exposure and the outcome could have arisen by chance alone and is not statistically significant.
  4. Specifying family = binomial selects the variance and link functions: the variance is binomial and the link is the logit function.
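The notes on the Wald test and on confidence intervals can be illustrated with a short calculation. The log odds ratio and standard error below are made-up numbers, not values from the slides; the sketch shows how a Wald interval is built on the log-odds scale and then exponentiated, which is why "includes 0" for a coefficient corresponds to "includes 1.0" for an odds ratio.

```r
# Illustrative numbers only: a log odds ratio and its standard error.
beta <- -0.82
se   <-  0.35

z <- qnorm(0.975)                       # about 1.96 for a 95% interval
ci_log_or <- beta + c(-1, 1) * z * se   # interval on the log-odds scale
or    <- exp(beta)
ci_or <- exp(ci_log_or)                 # interval on the odds-ratio scale

# Wald test p-value for H0: beta = 0 (equivalently, OR = 1).
p_wald <- 2 * pnorm(-abs(beta / se))

# The interval for the log odds ratio excludes 0 exactly when
# the interval for the odds ratio excludes 1.
excludes_null <- ci_or[2] < 1 || ci_or[1] > 1
```

The likelihood ratio test mentioned in the notes compares the fitted model with the model that omits the term, so it can disagree slightly with the Wald p-value in small samples.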