Lecture 4
Survey Research & Design in Psychology
James Neill, 2017
Creative Commons Attribution 4.0
Correlation
Image source: http://commons.wikimedia.org/wiki/File:Gnome-power-statistics.svg, GPL
2
Howitt & Cramer (2014)
● Ch 7: Relationships between two or more
variables: Diagrams and tables
● Ch 8: Correlation coefficients: Pearson
correlation and Spearman’s rho
● Ch 11: Statistical significance for the
correlation coefficient: A practical introduction
to statistical inference
● Ch 15: Chi-square: Differences between
samples of frequency data
● Note: Howitt and Cramer doesn't cover point bi-serial correlation
Readings
3
1. Covariation
2. Purpose of correlation
3. Linear correlation
4. Types of correlation
5. Interpreting correlation
6. Assumptions / limitations
Overview
4
Covariation
5
The world is made of
co-variations
e.g., pollen and bees
e.g., study and grades
e.g., nutrients and growth
6
We observe
covariations in
the psycho-
social world.
We can measure our
observations and
analyse the extent to
which they co-occur.
e.g., depictions of
violence in the
environment.
e.g., psychological states
such as stress
and depression.
7
Co-variations are the basis
of more complex models.
8
Purpose of correlation
9
The underlying purpose of
correlation is to help address the
question:
What is the
• relationship or
• association or
• shared variance or
• co-relation
between two variables?
Purpose of correlation
10
Purpose of correlation
Other ways of expressing the
underlying correlational question
include:
To what extent do variables
• covary?
• depend on one another?
• explain one another?
11
Linear correlation
12
r = -.76
Linear correlation
Extent to which two variables have a
simple linear (straight-line) relationship.
Image source: http://commons.wikimedia.org/wiki/File:Scatterplot_r%3D-.76.png, Public domain
13
Linear correlation
The linear relation between two
variables is indicated by a correlation:
• Direction: Sign (+ / -) indicates
direction of relationship (+ve or -ve slope)
• Strength: Size indicates strength
(values closer to -1 or +1 indicate greater strength)
• Statistical significance: p indicates
likelihood that a relationship at least this strong
could have occurred by chance if there were no
relationship in the population
14
Types of relationships
• No relationship (r ~ 0)
(X and Y are independent)
• Linear relationship
(X and Y are dependent)
–As X ↑s, so does Y (r > 0)
–As X ↑s, Y ↓s (r < 0)
• Non-linear relationship
15
Types of correlation
To decide which type of
correlation to use, consider
the levels of measurement
for each variable.
16
Types of correlation
• Nominal by nominal:
Phi (Φ) / Cramer’s V, Chi-square
• Ordinal by ordinal:
Spearman’s rank / Kendall’s Tau b
• Dichotomous by interval/ratio:
Point bi-serial rpb
• Interval/ratio by interval/ratio:
Product-moment or Pearson’s r
17
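Types of correlation and LOM
(graph and statistic for each pairing of levels of measurement)
• Nominal by nominal: Clustered bar chart; Chi-square, Phi (φ) or Cramer's V
• Nominal by ordinal: ⇐ Recode
• Nominal by interval/ratio: Clustered bar chart or scatterplot; Point bi-serial correlation (rpb)
• Ordinal by ordinal: Clustered bar chart or scatterplot; Spearman's Rho or Kendall's Tau
• Ordinal by interval/ratio: ⇐⇑ Recode
• Interval/ratio by interval/ratio: Scatterplot; Product-moment correlation (r)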
18
Nominal by nominal
19
Nominal by nominal
correlational approaches
• Contingency (or cross-tab) tables
– Observed frequencies
– Expected frequencies
– Row and/or column %s
– Marginal totals
• Clustered bar chart
• Chi-square
• Phi (φ) / Cramer's V
20
Contingency tables
● Bivariate frequency tables
● Marginal totals (blue)
● Observed cell frequencies (red)
21
Contingency table: Example
BLUE = Marginal totals
RED = Cell frequencies
22
Contingency table: Example
●Expected counts are the cell frequencies that should
occur if the variables are not correlated.
●Chi-square is based on the squared differences
between the actual and expected cell counts.
$$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$
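The chi-square calculation described on this slide can be sketched in a few lines. This is a minimal illustration, not the lecture's own analysis: the function name `chi_square` and the 2 x 2 counts are hypothetical, and the expected count for each cell is taken as (row total x column total) / N.

```python
def chi_square(observed):
    """Pearson chi-square for a table of observed counts (a list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if the two variables were not correlated
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

# Hypothetical counts (NOT the lecture's smoking by snoring data)
table = [[30, 20],
         [20, 30]]
chi2, df = chi_square(table)
print(round(chi2, 2), df)
```

With these made-up counts every expected cell is 25, so each cell contributes 5² / 25 = 1 to the total.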
23
Cell percentages
Row and/or column cell percentages can also be useful
e.g., ~60% of smokers snore, whereas only ~30%
of non-smokers
snore.
24
Clustered bar graph
Bivariate bar graph of frequencies or percentages.
The category
axis bars are
clustered (by
colour or fill
pattern) to
indicate the
second variable’s
categories.
25
~60% of
snorers are
smokers,
whereas only
~30% of non-
snorers are
smokers.
26
Pearson chi-square test
27
Write-up: χ²(1, 188) = 8.07, p = .004
Pearson chi-square test:
Example
Smoking (2) x Snoring (2)
28
Chi-square distribution: Example
The critical value for chi-square with 1 df
and a critical alpha of .05 is 3.84
29
Phi (φ) & Cramer’s V
Phi (φ)
• Use for 2 x 2, 2 x 3, 3 x 2 analyses
e.g., Gender (2) & Pass/Fail (2)
Cramer’s V
• Use for 3 x 3 or greater analyses
e.g., Favourite Season (4) x Favourite
Sense (5)
(non-parametric measures of correlation)
30
Phi (φ) & Cramer’s V: Example
χ²(1, 188) = 8.07, p = .004, φ = .21
Note that the sign is ignored here (because nominal coding is arbitrary,
the researcher should explain the direction of the relationship)
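The reported φ can be checked from the reported chi-square and sample size. The sketch below assumes the standard relations φ = √(χ² / N) and Cramer's V = √(χ² / (N × (k − 1))), where k is the smaller of the number of rows and columns; the function names are illustrative only.

```python
import math

def phi_coefficient(chi2, n):
    """Phi from a chi-square value and the total sample size."""
    return math.sqrt(chi2 / n)

def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V; equals phi when the smaller table dimension is 2."""
    return math.sqrt(chi2 / (n * (min(n_rows, n_cols) - 1)))

# Values taken from the write-up on this slide
phi = phi_coefficient(8.07, 188)
print(round(phi, 2))  # 0.21
```

Because the square root is always positive, the sign is lost, which is why the direction has to be described in words.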
31
Ordinal by ordinal
32
Ordinal by ordinal
correlational approaches
• Spearman's rho (rs)
• Kendall tau (τ)
• Alternatively, use nominal by
nominal techniques (i.e., recode the
variables or treat them as having a lower
level of measurement)
33
Graphing ordinal by ordinal data
• Ordinal by ordinal data is difficult to
visualise because it's non-parametric,
with many points.
• Consider using:
–Non-parametric approaches
(e.g., clustered bar chart)
–Parametric approaches
(e.g., scatterplot with line of best fit)
34
Spearman’s rho (rs) or
Spearman's rank order correlation
• For ranked (ordinal) data
–e.g., Olympic Placing correlated with
World Ranking
• Uses product-moment correlation
formula
• Interpretation is adjusted to consider
the underlying ranked scales
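To illustrate the point that Spearman's rho uses the product-moment formula, the sketch below ranks each variable and then applies Pearson's r to the ranks. The helper names and the placings are hypothetical, and tied values are given the mean of their ranks, which is a common convention assumed here.

```python
def pearson_r(x, y):
    """Product-moment correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson's r applied to the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical placings for five athletes
olympic_placing = [1, 2, 3, 4, 5]
world_ranking = [2, 1, 4, 3, 5]
rho = spearman_rho(olympic_placing, world_ranking)
print(round(rho, 2))  # 0.8
```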
35
Kendall’s Tau (τ)
• Tau a
–Does not take joint ranks into account
• Tau b
–Takes joint ranks into account
–For square tables
• Tau c
–Takes joint ranks into account
–For rectangular tables
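The difference between tau a and tau b can be sketched by counting concordant and discordant pairs. This is an illustrative implementation under the usual textbook definitions (tau a divides by all pairs; tau b shrinks the denominator for pairs tied on one variable). The function name and data are hypothetical, and tau c is not covered.

```python
import math
from itertools import combinations

def kendall_tau(x, y):
    """Return (tau_a, tau_b) for paired observations."""
    concordant = discordant = only_x_tied = only_y_tied = 0
    n_pairs = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        n_pairs += 1
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: counts towards neither
        elif dx == 0:
            only_x_tied += 1
        elif dy == 0:
            only_y_tied += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    s = concordant - discordant
    tau_a = s / n_pairs  # ignores ties
    tau_b = s / math.sqrt((concordant + discordant + only_x_tied) *
                          (concordant + discordant + only_y_tied))
    return tau_a, tau_b

# Hypothetical ranks with one tie in y
tau_a, tau_b = kendall_tau([1, 2, 3, 4], [1, 2, 2, 4])
print(round(tau_a, 3), round(tau_b, 3))
```

When there are no tied ranks the two values coincide.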
36
Ordinal correlation example
37
Ordinal correlation example
38
Ordinal correlation example
τb = -.07, p = .298
39
Dichotomous by
scale (interval/ratio)
40
Point-biserial correlation (rpb)
• One dichotomous & one
interval/ratio variable
–e.g., belief in god (yes/no) and
number of countries visited
• Calculate as for Pearson's
product-moment r
• Adjust interpretation to consider
the direction of the dichotomous
scales
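A small sketch of "calculate as for Pearson's product-moment r": code the dichotomy as 0/1 and apply the ordinary formula. The data are invented for illustration. The second calculation uses the mean-difference form of the point-biserial formula (with the population SD), which is an addition here to show the two routes agree.

```python
def pearson_r(x, y):
    """Product-moment correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data: 0 = No, 1 = Yes
believes = [0, 0, 0, 1, 1, 1]
countries = [5, 8, 6, 4, 7, 3]
r_pb = pearson_r(believes, countries)

# Same value from the group means: (M1 - M0) / SD * sqrt(p * q)
group0 = [c for b, c in zip(believes, countries) if b == 0]
group1 = [c for b, c in zip(believes, countries) if b == 1]
n = len(countries)
grand_mean = sum(countries) / n
sd_pop = (sum((c - grand_mean) ** 2 for c in countries) / n) ** 0.5
p, q = len(group1) / n, len(group0) / n
r_from_means = ((sum(group1) / len(group1) - sum(group0) / len(group0))
                / sd_pop * (p * q) ** 0.5)
print(round(r_pb, 3), round(r_from_means, 3))
```

The sign depends on which category is coded 1, so swapping the coding flips the sign but not the size.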
41
Point-biserial correlation (rpb):
Example
Those who report that they
believe in God also report
having travelled to slightly
fewer countries (rpb = -.10) but
this difference could have
occurred by chance (p > .05),
thus H0 is not rejected.
Do not believe Believe
42
Point-biserial correlation (rpb):
Example
0 = No
1 = Yes
43
Scale (interval/ratio) by
Scale (interval/ratio)
44
Scatterplot
• Plot each pair of observations (X, Y)
–x = predictor variable (independent; IV)
–y = criterion variable (dependent; DV)
• By convention:
–IV on the x (horizontal) axis
–DV on the y (vertical) axis
• Direction of relationship:
–+ve = trend from bottom left to top right
–-ve = trend from top left to bottom right
45
Scatterplot showing relationship between
age & cholesterol with line of best fit
Upward slope = positive correlation
46
• The correlation between 2
variables is a measure of the
degree to which pairs of numbers
(points) cluster together around a
best-fitting straight line
• Line of best fit: y = a + bx
• Check for:
– outliers
– linearity
Line of best fit
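The line y = a + bx can be estimated by ordinary least squares, which is assumed here as the fitting method since the slide does not name one. The function name and data are illustrative.

```python
def line_of_best_fit(x, y):
    """Least-squares intercept (a) and slope (b) for y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
    a = mean_y - b * mean_x  # the line passes through the point of means
    return a, b

# Hypothetical points lying exactly on y = 1 + 2x
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
a, b = line_of_best_fit(x, y)
print(a, b)
```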
47
What's wrong with this scatterplot?
X-axis should be
IV (predictor)
Y-axis should be
DV (outcome)
48
Scatterplot example:
Strong positive (.81)
Q: Why is infant
mortality positively
linearly associated
with the number of
physicians (with the
effects of GDP
removed)?
A: Because more
doctors tend to be
deployed to areas
with infant mortality
(socio-economic
status aside).
49
Scatterplot example:
Weak positive (.14)
Outlier –
exaggerating
the correlation
50
Scatterplot example:
Moderately strong negative (-.76)
A: Having sufficient
Vitamin D (via sunlight)
lowers risk of cancer.
However, UV light
exposure increases risk
of skin cancer.
Q: Why is there
a strong negative
correlation
between solar
radiation and
breast cancer?
51
Pearson product-moment correlation (r)
● The product-moment
correlation is the
standardised covariance.
52
Covariance
• Variance shared by 2 variables
• Covariance reflects the
direction of the relationship:
+ve cov indicates +ve relationship
-ve cov indicates -ve relationship
● Covariance is unstandardised.
$$\mathrm{Cov}_{XY} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{N - 1}$$
The numerator is the sum of the cross products.
N - 1 for the sample; N for the population
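As a sketch of the covariance formula, the function below sums the cross products of the deviations and divides by N - 1 (sample) or N (population). The function name, the `sample` flag, and the data are illustrative assumptions.

```python
def covariance(x, y, sample=True):
    """Mean cross product of deviations; N - 1 for a sample, N for a population."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross_products = [(xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)]
    return sum(cross_products) / (n - 1 if sample else n)

# Hypothetical scores
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
cov_sample = covariance(x, y)
cov_population = covariance(x, y, sample=False)
print(cov_sample, cov_population)
```

Here the cross products sum to 6, giving 6 / 4 for the sample version and 6 / 5 for the population version.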
53
Covariance: Cross-products
[Figure: scatterplot of Y1 by X1 with regions labelled as +ve cross products or -ve cross products]
54
Covariance → Correlation
• Size depends on the
measurement scale → Can’t compare
covariance across different scales of
measurement (e.g., age by weight in kilos versus
age by weight in grams).
• Therefore, standardise
covariance (divide by the cross-product of
the SDs) → correlation
• Correlation is an effect size - i.e.,
standardised measure of strength of linear
relationship
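The kilos versus grams point can be demonstrated directly. In this sketch (invented ages and weights, illustrative function names) the covariance is 1000 times larger when weight is recorded in grams, while the correlation is unchanged.

```python
def covariance(x, y):
    """Sample covariance (N - 1 in the denominator)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def correlation(x, y):
    """Covariance divided by the product of the two SDs."""
    sd_x = covariance(x, x) ** 0.5
    sd_y = covariance(y, y) ** 0.5
    return covariance(x, y) / (sd_x * sd_y)

# Hypothetical data
age = [20, 30, 40, 50, 60]
weight_kg = [60, 72, 68, 80, 75]
weight_g = [w * 1000 for w in weight_kg]

cov_kg, cov_g = covariance(age, weight_kg), covariance(age, weight_g)
r_kg, r_g = correlation(age, weight_kg), correlation(age, weight_g)
print(cov_kg, cov_g)                 # covariance depends on the units
print(round(r_kg, 3), round(r_g, 3)) # correlation does not
```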
55
The covariance between X and Y is
1.2. The SD of X is 2 and the SD of
Y is 3. The correlation is:
a. 0.2
b. 0.3
c. 0.4
d. 1.2
Covariance, SD, and correlation:
Example quiz question
Answer:
1.2 / (2 x 3) = 0.2
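Written out as a worked equation, using the numbers from the quiz question (the symbols SD_X and SD_Y are shorthand introduced here for the two standard deviations):

```latex
r_{XY} = \frac{\mathrm{Cov}_{XY}}{SD_X \times SD_Y}
       = \frac{1.2}{2 \times 3}
       = \frac{1.2}{6}
       = 0.2
```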
56
Hypothesis testing
Almost all correlations are not 0.
So, hypothesis testing seeks to
answer:
• What is the likelihood that an
observed relationship between two
variables is “true” or “real”?
• What is the likelihood that an
observed relationship is simply due
to chance?
57
Significance of correlation
• Null hypothesis (H0): ρ = 0
i.e., no “true” relationship in the
population
• Alternative hypothesis (H1): ρ ≠ 0
i.e., there is a real relationship in the
population
• Initially, assume H0 is true, and then
evaluate whether the data support H1.
• ρ (rho) = population product-moment
correlation coefficient
58
How to test the null hypothesis
• Select a critical value (alpha (α));
commonly .05
• Use a 1- or 2-tailed test; 1-tailed if
hypothesis is directional
• Calculate correlation and its p value.
Compare to the critical alpha value.
• If p < critical alpha, correlation is
statistically significant, i.e., if H0 were true, a
relationship at least this strong would arise from
random sampling variability less than critical alpha of the time.
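One way to make the logic of the test concrete is a permutation test, sketched below. Note this is a swapped-in technique for illustration, not the parametric test that statistical packages such as SPSS report: y is shuffled many times to mimic H0, and p is the share of shuffles giving a correlation at least as large in size as the observed one. The function names, data, seed, and number of permutations are all assumptions.

```python
import random

def pearson_r(x, y):
    """Product-moment correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def permutation_p_value(x, y, n_permutations=2000, seed=0):
    """Two-tailed p: share of shuffles with |r| >= the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # breaks any real pairing between x and y
        if abs(pearson_r(x, shuffled)) >= observed:
            count += 1
    # Add 1 to both so the estimate is never exactly zero
    return (count + 1) / (n_permutations + 1)

# Hypothetical, strongly related data
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]
alpha = 0.05
p = permutation_p_value(x, y)
print(p < alpha)  # True means "statistically significant" at this alpha
```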
59
Correlation – SPSS output
Correlations: Cigarette Consumption per Adult per Day by CHD Mortality per 10,000
Pearson Correlation: .713**
Sig. (2-tailed): .000
N: 21
**. Correlation is significant at the 0.01 level (2-tailed).
60
Errors in hypothesis testing
• Type I error:
decision to reject H0 when H0
is true
• Type II error:
decision to not reject H0 when H0
is false
• A significance test outcome depends on the
statistical power which is a function of:
–Effect size (r)
–Sample size (N)
–Critical alpha level (αcrit)
61
Significance of correlation
df (N - 2)    critical r (p = .05)
5             .67
10            .50
15            .41
20            .36
25            .32
30            .30
50            .23
200           .11
500           .07
1000          .05
The higher the
N, the smaller
the correlation
required for a
statistically
significant result
– why?
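One way to see why: the sampling error of r shrinks as N grows, so a smaller r is enough to stand out from chance. The sketch below approximates the critical r with the Fisher z transformation (standard error 1 / √(N − 3)). This is an approximation rather than the exact t-based table, and treating the table as one-tailed is an assumption made here because that reproduces values close to those listed.

```python
import math
from statistics import NormalDist

def approx_critical_r(df, alpha=0.05, tails=1):
    """Approximate critical r via Fisher's z; df = N - 2."""
    n = df + 2
    z_crit = NormalDist().inv_cdf(1 - alpha / tails)
    # Convert the critical z back to the r scale
    return math.tanh(z_crit / math.sqrt(n - 3))

for df in (5, 10, 20, 50, 200, 1000):
    print(df, round(approx_critical_r(df), 2))
```

The approximation is roughest for very small samples and converges on the tabled values as df increases.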
62
Scatterplot showing a confidence
interval for a line of best fit
63
US States 4th
Academic Achievement by SES
Scatterplot showing a confidence
interval for a line of best fit
Image source: http://www.chriscorrea.com/2005/poverty-and-performance-on-the-naep/
64
If the correlation between Age and
Performance is statistically
significant, it means that:
a. there is an important relationship between the
variables
b. the true correlation between the variables in
the population is equal to 0
c. the true correlation between the variables in
the population is not equal to 0
d. getting older causes you to do poorly on tests
Practice quiz question:
Significance of correlation
65
Interpreting correlation
66
Coefficient of Determination (r²)
• CoD = The proportion of
variance in one variable that
can be accounted for by
another variable.
• e.g., r = .60, r² = .36
or 36% of shared
variance
67
Interpreting correlation
(Cohen, 1988)
• A correlation is an effect size
• Rule of thumb:
Strength     r          r²
Weak:        .1 - .3    1 - 9%
Moderate:    .3 - .5    10 - 25%
Strong:      > .5       > 25%
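The rule of thumb and the coefficient of determination can be combined in a small helper. This is a sketch only: the function names are invented, the label for values below .1 is an assumption (the table does not name that band), and a value of exactly .3 is placed in the moderate band by choice.

```python
def cohen_strength(r):
    """Label the size of r using the rule-of-thumb bands above."""
    size = abs(r)  # direction is ignored when judging strength
    if size > 0.5:
        return "strong"
    if size >= 0.3:
        return "moderate"
    if size >= 0.1:
        return "weak"
    return "negligible"  # assumed label; not named in the table

def shared_variance_percent(r):
    """Coefficient of determination (r squared) as a percentage."""
    return 100 * r ** 2

for r in (0.60, -0.35, 0.14):
    print(r, cohen_strength(r), round(shared_variance_percent(r), 1))
```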
68
Size of correlation (Cohen, 1988)
WEAK (.1 - .3)
MODERATE (.3 - .5)
STRONG (> .5)
69
Interpreting correlation
(Evans, 1996)
Strength       r            r²
very weak      0 - .19      (0 to 4%)
weak           .20 - .39    (4 to 16%)
moderate       .40 - .59    (16 to 36%)
strong         .60 - .79    (36 to 64%)
very strong    .80 - 1.00   (64 to 100%)
70
[Figure: scatterplot of Y1 by X1, with the X1 axis running from 0 to 40]
Correlation of this scatterplot = -.9
Scale has no effect
on correlation.
71
[Figure: scatterplot of Y1 by X1, with the X1 axis running from 0 to 100]
Correlation of this scatterplot = -.9
Scale has no effect
on correlation.
72
What do you estimate the correlation of this
scatterplot of height and weight to be?
a. -.5
b. -1
c. 0
d. .5
e. 1
[Figure: scatterplot with axes HEIGHT (166 to 176) and WEIGHT (65 to 73)]
73
What do you estimate the correlation of this
scatterplot to be?
a. -.5
b. -1
c. 0
d. .5
e. 1
[Figure: scatterplot with axes Y (2 to 14) and X (4.4 to 5.6)]
74
What do you estimate the correlation of this
scatterplot to be?
a. -.5
b. -1
c. 0
d. .5
e. 1
[Figure: scatterplot with axes Y (4 to 6) and X (2 to 14)]
75
Write-up: Example
“Number of children and marital
satisfaction were inversely related
(r (48) = -.35, p < .05), such that
contentment in marriage tended
to be lower for couples with more
children. Number of children
explained approximately 12% of
the variance in marital
satisfaction, a small-moderate
effect.”
76
Assumptions and
limitations
(Pearson product-moment
linear correlation)
77
Assumptions and limitations
1. Levels of measurement
2. Normality
3. Linearity
   a. Effects of outliers
   b. Non-linearity
4. Homoscedasticity
5. No range restriction
6. Homogeneous samples
7. Correlation is not causation
8. Dealing with multiple correlations
78
Normality
• The X and Y data should be sampled
from populations with normal distributions
• Do not overly rely on any single indicator
of normality; use histograms, skewness
and kurtosis (e.g., within -1 and +1)
• Inferential tests of normality (e.g.,
Shapiro-Wilk) are overly sensitive when
sample is large
79
Effect of outliers
• Outliers can disproportionately
increase or decrease r.
• Options
–compute r with & without outliers
–get more data for outlying values
–recode outliers as having more
conservative scores
–transformation
–recode variable into lower level of
measurement and a non-parametric
approach
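The first option, computing r with and without outliers, can be sketched as follows. The data are invented so that one extreme case inflates the correlation; the function name is illustrative.

```python
def pearson_r(x, y):
    """Product-moment correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data: six ordinary cases plus one extreme case
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 3, 2, 3, 2]
outlier_x, outlier_y = 20, 10

r_without = pearson_r(x, y)
r_with = pearson_r(x + [outlier_x], y + [outlier_y])
print(round(r_without, 2), round(r_with, 2))
```

Reporting both values lets the reader judge how much a single case is driving the result.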
80
Age & self-esteem
(r = .63)
[Figure: scatterplot of SE (0 to 10) by AGE (10 to 80)]
81
Age & self-esteem
(outliers removed) r = .23
[Figure: scatterplot of SE (1 to 9) by AGE (10 to 40)]
82
Non-linear relationships
Check
scatterplot
Can a linear
relationship
‘capture’ the
lion’s share of
the variance?
If so, use r.
[Figure: scatterplot examples; surviving axis labels are X2 (0 to 40) and Y2 (0 to 60)]
Yes – straight-line appropriate
No – straight-line inappropriate
83
Non-linear relationships
If non-linear, consider:
• Does a linear relation help?
• Use a non-linear mathematical
function to describe the
relationship between the variables
• Transforming variables to “create”
linear relationship
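Two small demonstrations of these points, using invented data: a perfectly U-shaped relationship yields r = 0 even though Y is completely determined by X, and a square-root transformation of a quadratic relationship (for positive X only) makes it exactly linear. The choice of a square-root transform is just one example of a linearising transformation.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# U-shaped relationship: y = x squared over a symmetric range
x_sym = [-3, -2, -1, 0, 1, 2, 3]
y_sym = [v ** 2 for v in x_sym]
r_sym = pearson_r(x_sym, y_sym)

# Curved but monotonic relationship, before and after transforming y
x_pos = [1, 2, 3, 4, 5, 6]
y_pos = [v ** 2 for v in x_pos]
r_raw = pearson_r(x_pos, y_pos)
r_transformed = pearson_r(x_pos, [math.sqrt(v) for v in y_pos])

print(round(r_sym, 2), round(r_raw, 3), round(r_transformed, 3))
```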
84
Scedasticity
• Homoscedasticity refers to even
spread of observations about a line
of best fit
• Heteroscedasticity refers to uneven
spread of observations about a line
of best fit
• Assess visually and with Levene's
test
85
Scedasticity
Image source:
https://commons.wikimedia.org/wiki/File:Homoscedasticity.png
Image source:
https://commons.wikimedia.org/wiki/File:Heteroscedasticity.png
86
Scedasticity
Image source: Unknown
87
Range restriction
• Range restriction is when the
sample contains a restricted (or
truncated) range of scores
–e.g., level of hormone X and age < 18
might have linear relationship
• If range is restricted, be cautious
about generalising beyond the
range for which data is available
–e.g., level of hormone X may not
continue to increase linearly with age
after age 18
88
Range restriction
89
Heterogeneous samples
● Sub-samples (e.g.,
males & females) may
artificially increase
or decrease overall
r.
● Solution - calculate
r separately for
sub-samples &
overall; look for
differences
[Figure: scatterplot of H1 (130 to 190) by W1 (50 to 80)]
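The effect of pooling sub-samples can be sketched with invented weights and heights. In this hypothetical data the relationship within each group is weak (and in opposite directions), yet the overall r is strong because the groups differ in their means on both variables.

```python
def pearson_r(x, y):
    """Product-moment correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical sub-samples (weight in kg, height in cm)
female_weight, female_height = [50, 55, 60, 65], [160, 158, 162, 160]
male_weight, male_height = [70, 75, 80, 85], [175, 177, 173, 175]

r_female = pearson_r(female_weight, female_height)
r_male = pearson_r(male_weight, male_height)
r_overall = pearson_r(female_weight + male_weight,
                      female_height + male_height)
print(round(r_female, 2), round(r_male, 2), round(r_overall, 2))
```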
90
Scatterplot of Same-sex &
Opposite-sex Relations by Gender
[Figure: scatterplot of Same Sex Relations by Opp Sex Relations, with separate markers for SEX (female, male)]
♂ r = .67, r² = .45
♀ r = .52, r² = .27
91
Scatterplot of Weight and
Self-esteem by Gender
[Figure: scatterplot of SE by WEIGHT, with separate markers for SEX (male, female)]
♂ r = .50
♀ r = -.48
92
Correlation is not causation, e.g.:
correlation between ice cream consumption and crime,
but actual cause is temperature
Image source: Unknown
93
Correlation is not causation, e.g.:
Stop global warming: Become a pirate
As the number of
pirates has reduced,
global temperature
has increased.
Image source: Unknown
94
Scatterplot matrices
organise scatterplots
and correlations
amongst several
variables at once.
However, they are not
sufficiently detailed for
more than about five
variables at a time.
Dealing with several correlations
Image source: Unknown
95
Correlation matrix:
Example of an APA Style
Correlation Table
96
Summary
97
Summary: Correlation
1. The world is made of covariations.
2. Covariations are the building blocks of
more complex multivariate
relationships.
3. Correlation is a standardised measure
of the covariance (extent to which two
phenomena co-relate).
4. Correlation does not prove
causation - may be opposite causality,
bi-directional, or due to other variables.
98
Summary:
Types of correlation
• Nominal by nominal:
Phi (Φ) / Cramer’s V, Chi-square
• Ordinal by ordinal:
Spearman’s rank / Kendall’s Tau b
• Dichotomous by interval/ratio:
Point bi-serial rpb
• Interval/ratio by interval/ratio:
Product-moment or Pearson’s r
99
Summary:
Correlation steps
1. Choose measure of correlation
and graphs based on levels of
measurement.
2. Check graphs (e.g., scatterplot):
–Linear or non-linear?
–Outliers?
–Homoscedasticity?
–Range restriction?
–Sub-samples to consider?
100
Summary:
Correlation steps
3. Consider
–Effect size (e.g., Φ, Cramer's V, r, r²)
–Direction
–Inferential test (p)
4. Interpret/Discuss
–Relate back to hypothesis
–Size, direction, significance
–Limitations e.g.,
• Heterogeneity (sub-samples)
• Range restriction
• Causality?
101
Summary:
Interpreting correlation
• Coefficient of determination
–Correlation squared
–Indicates % of shared variance
Strength     r          r²
Weak:        .1 - .3    1 - 9%
Moderate:    .3 - .5    10 - 25%
Strong:      > .5       > 25%
102
Summary:
Assumptions & limitations
1. Levels of measurement
2. Normality
3. Linearity
4. Homoscedasticity
5. No range restriction
6. Homogeneous samples
7. Correlation is not causation
8. Dealing with multiple correlations
103
References
Evans, J. D. (1996). Straightforward statistics for the
behavioral sciences. Pacific Grove, CA: Brooks/Cole
Publishing.
Howell, D. C. (2007). Fundamental statistics for the
behavioral sciences. Belmont, CA: Wadsworth.
Howell, D. C. (2010). Statistical methods for
psychology (7th ed.). Belmont, CA: Wadsworth.
Howitt, D. & Cramer, D. (2011). Introduction to
statistics in psychology (5th ed.). Harlow, UK:
Pearson.
104
Open Office Impress
● This presentation was made using
Open Office Impress.
● Free and open source software.
● http://www.openoffice.org/product/impress.html

Mais conteúdo relacionado

Mais procurados

Mais procurados (20)

Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Multiple regression presentation
Multiple regression presentationMultiple regression presentation
Multiple regression presentation
 
Correlation and partial correlation
Correlation and partial correlationCorrelation and partial correlation
Correlation and partial correlation
 
Introduction to correlation and regression analysis
Introduction to correlation and regression analysisIntroduction to correlation and regression analysis
Introduction to correlation and regression analysis
 
Regression analysis.
Regression analysis.Regression analysis.
Regression analysis.
 
Karl pearson's coefficient of correlation (1)
Karl pearson's coefficient of correlation (1)Karl pearson's coefficient of correlation (1)
Karl pearson's coefficient of correlation (1)
 
Multiple Correlation - Thiyagu
Multiple Correlation - ThiyaguMultiple Correlation - Thiyagu
Multiple Correlation - Thiyagu
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Analysis of variance anova
Analysis of variance anovaAnalysis of variance anova
Analysis of variance anova
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
multiple regression
multiple regressionmultiple regression
multiple regression
 
Analysis of variance
Analysis of varianceAnalysis of variance
Analysis of variance
 
Spearman rank correlation coefficient
Spearman rank correlation coefficientSpearman rank correlation coefficient
Spearman rank correlation coefficient
 
Multiple linear regression
Multiple linear regressionMultiple linear regression
Multiple linear regression
 
Correlation and Regression
Correlation and Regression Correlation and Regression
Correlation and Regression
 
Simple linear regression
Simple linear regressionSimple linear regression
Simple linear regression
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regression
 
Correlation analysis
Correlation analysis Correlation analysis
Correlation analysis
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regression
 

Semelhante a Correlation

Correlation and Regression
Correlation and RegressionCorrelation and Regression
Correlation and Regressionjasondroesch
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regressionANCYBS
 
Correlation and Regression.pptx
Correlation and Regression.pptxCorrelation and Regression.pptx
Correlation and Regression.pptxJayaprakash985685
 
Correlation and Regression Analysis.pptx
Correlation and Regression Analysis.pptxCorrelation and Regression Analysis.pptx
Correlation and Regression Analysis.pptxasemzkgmu
 
Unit 1 Correlation- BSRM.pdf
Unit 1 Correlation- BSRM.pdfUnit 1 Correlation- BSRM.pdf
Unit 1 Correlation- BSRM.pdfRavinandan A P
 
T17 correlation
T17 correlationT17 correlation
T17 correlationkompellark
 
Educ eval ppt correlation
Educ eval ppt correlationEduc eval ppt correlation
Educ eval ppt correlationcampionjelmar
 
Correlation _ Regression Analysis statistics.pptx
Correlation _ Regression Analysis statistics.pptxCorrelation _ Regression Analysis statistics.pptx
Correlation _ Regression Analysis statistics.pptxkrunal soni
 
Lect w8 w9_correlation_regression
Lect w8 w9_correlation_regressionLect w8 w9_correlation_regression
Lect w8 w9_correlation_regressionRione Drevale
 
marketing research & applications on SPSS
marketing research & applications on SPSSmarketing research & applications on SPSS
marketing research & applications on SPSSANSHU TIWARI
 
P G STAT 531 Lecture 9 Correlation
P G STAT 531 Lecture 9 CorrelationP G STAT 531 Lecture 9 Correlation
P G STAT 531 Lecture 9 CorrelationAashish Patel
 
Regression and Co-Relation
Regression and Co-RelationRegression and Co-Relation
Regression and Co-Relationnuwan udugampala
 
Class 9 Covariance & Correlation Concepts.pptx
Class 9 Covariance & Correlation Concepts.pptxClass 9 Covariance & Correlation Concepts.pptx
Class 9 Covariance & Correlation Concepts.pptxCallplanetsDeveloper
 
Lesson 8 Linear Correlation And Regression
Lesson 8 Linear Correlation And RegressionLesson 8 Linear Correlation And Regression
Lesson 8 Linear Correlation And RegressionSumit Prajapati
 
Biostatistics lecture notes 7.ppt
Biostatistics lecture notes 7.pptBiostatistics lecture notes 7.ppt
Biostatistics lecture notes 7.pptletayh2016
 

Semelhante a Correlation (20)

Correlation and Regression
Correlation and RegressionCorrelation and Regression
Correlation and Regression
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regression
 
Correlation and Regression.pptx
Correlation and Regression.pptxCorrelation and Regression.pptx
Correlation and Regression.pptx
 
Correlation and Regression Analysis.pptx
Correlation and Regression Analysis.pptxCorrelation and Regression Analysis.pptx
Correlation and Regression Analysis.pptx
 
Unit 1 Correlation- BSRM.pdf
Unit 1 Correlation- BSRM.pdfUnit 1 Correlation- BSRM.pdf
Unit 1 Correlation- BSRM.pdf
 
T17 correlation
T17 correlationT17 correlation
T17 correlation
 
Simple linear regressionn and Correlation
Simple linear regressionn and CorrelationSimple linear regressionn and Correlation
Simple linear regressionn and Correlation
 
Correlation
CorrelationCorrelation
Correlation
 
Educ eval ppt correlation
Educ eval ppt correlationEduc eval ppt correlation
Educ eval ppt correlation
 
13943056.ppt
13943056.ppt13943056.ppt
13943056.ppt
 
Correlation _ Regression Analysis statistics.pptx
Correlation _ Regression Analysis statistics.pptxCorrelation _ Regression Analysis statistics.pptx
Correlation _ Regression Analysis statistics.pptx
 
Lect w8 w9_correlation_regression
Lect w8 w9_correlation_regressionLect w8 w9_correlation_regression
Lect w8 w9_correlation_regression
 
correlation.ppt
correlation.pptcorrelation.ppt
correlation.ppt
 
marketing research & applications on SPSS
marketing research & applications on SPSSmarketing research & applications on SPSS
marketing research & applications on SPSS
 
P G STAT 531 Lecture 9 Correlation
P G STAT 531 Lecture 9 CorrelationP G STAT 531 Lecture 9 Correlation
P G STAT 531 Lecture 9 Correlation
 
Statistics ppt
Statistics pptStatistics ppt
Statistics ppt
 
Regression and Co-Relation
Regression and Co-RelationRegression and Co-Relation
Regression and Co-Relation
 
Class 9 Covariance & Correlation Concepts.pptx
Class 9 Covariance & Correlation Concepts.pptxClass 9 Covariance & Correlation Concepts.pptx
Class 9 Covariance & Correlation Concepts.pptx
 
Lesson 8 Linear Correlation And Regression
Lesson 8 Linear Correlation And RegressionLesson 8 Linear Correlation And Regression
Lesson 8 Linear Correlation And Regression
 
Biostatistics lecture notes 7.ppt
Biostatistics lecture notes 7.pptBiostatistics lecture notes 7.ppt
Biostatistics lecture notes 7.ppt
 

Mais de James Neill

Personal control beliefs
Personal control beliefsPersonal control beliefs
Personal control beliefsJames Neill
 
Goal setting and goal striving
Goal setting and goal strivingGoal setting and goal striving
Goal setting and goal strivingJames Neill
 
Implicit motives
Implicit motivesImplicit motives
Implicit motivesJames Neill
 
Psychological needs
Psychological needsPsychological needs
Psychological needsJames Neill
 
Extrinsic motivation
Extrinsic motivationExtrinsic motivation
Extrinsic motivationJames Neill
 
Physiological needs
Physiological needsPhysiological needs
Physiological needsJames Neill
 
Motivated and emotional brain
Motivated and emotional brainMotivated and emotional brain
Motivated and emotional brainJames Neill
 
Motivation in historical perspective
Motivation in historical perspectiveMotivation in historical perspective
Motivation in historical perspectiveJames Neill
 
Motivation and emotion unit outline
Motivation and emotion unit outlineMotivation and emotion unit outline
Motivation and emotion unit outlineJames Neill
 
Development and evaluation of the PCYC Catalyst outdoor adventure interventio...
Development and evaluation of the PCYC Catalyst outdoor adventure interventio...Development and evaluation of the PCYC Catalyst outdoor adventure interventio...
Development and evaluation of the PCYC Catalyst outdoor adventure interventio...James Neill
 
Individual emotions
Individual emotionsIndividual emotions
Individual emotionsJames Neill
 
How and why to edit wikipedia
How and why to edit wikipediaHow and why to edit wikipedia
How and why to edit wikipediaJames Neill
 
Going open (education): What, why, and how?
Going open (education): What, why, and how?Going open (education): What, why, and how?
Going open (education): What, why, and how?James Neill
 
Introduction to motivation and emotion 2013
Introduction to motivation and emotion 2013Introduction to motivation and emotion 2013
Introduction to motivation and emotion 2013James Neill
 
Summary and conclusion - Survey research and design in psychology
Summary and conclusion - Survey research and design in psychologySummary and conclusion - Survey research and design in psychology
Summary and conclusion - Survey research and design in psychologyJames Neill
 
The effects of green exercise on stress, anxiety and mood
The effects of green exercise on stress, anxiety and moodThe effects of green exercise on stress, anxiety and mood
The effects of green exercise on stress, anxiety and moodJames Neill
 
Multiple linear regression II
Multiple linear regression IIMultiple linear regression II
Multiple linear regression IIJames Neill
 
Conclusion and review
Conclusion and reviewConclusion and review
Conclusion and reviewJames Neill
 

Mais de James Neill (20)

Self
SelfSelf
Self
 
Personal control beliefs
Personal control beliefsPersonal control beliefs
Personal control beliefs
 
Mindsets
MindsetsMindsets
Mindsets
 
Goal setting and goal striving
Goal setting and goal strivingGoal setting and goal striving
Goal setting and goal striving
 
Implicit motives
Implicit motivesImplicit motives
Implicit motives
 
Psychological needs
Psychological needsPsychological needs
Psychological needs
 
Extrinsic motivation
Extrinsic motivationExtrinsic motivation
Extrinsic motivation
 
Physiological needs
Physiological needsPhysiological needs
Correlation

  • 1. Lecture 4 Survey Research & Design in Psychology James Neill, 2017 Creative Commons Attribution 4.0 Correlation Image source: http://commons.wikimedia.org/wiki/File:Gnome-power-statistics.svg, GPL
  • 2. 2 Howitt & Cramer (2014) ● Ch 7: Relationships between two or more variables: Diagrams and tables ● Ch 8: Correlation coefficients: Pearson correlation and Spearman’s rho ● Ch 11: Statistical significance for the correlation coefficient: A practical introduction to statistical inference ● Ch 15: Chi-square: Differences between samples of frequency data ● Note: Howitt and Cramer doesn't cover point bi-serial correlation Readings
  • 3. 3 1. Covariation 2. Purpose of correlation 3. Linear correlation 4. Types of correlation 5. Interpreting correlation 6. Assumptions / limitations Overview
  • 5. 5 The world is made of co-variations e.g., pollen and bees e.g., study and grades e.g., nutrients and growth
  • 6. 6 We observe covariations in the psycho- social world. We can measure our observations and analyse the extent to which they co-occur. e.g., depictions of violence in the environment. e.g., psychological states such as stress and depression.
  • 7. 7 Co-variations are the basis of more complex models.
  • 9. 9 The underlying purpose of correlation is to help address the question: What is the • relationship or • association or • shared variance or • co-relation between two variables? Purpose of correlation
  • 10. 10 Purpose of correlation Other ways of expressing the underlying correlational question include: To what extent do variables • covary? • depend on one another? • explain one another?
  • 12. 12 r = -.76 Linear correlation Extent to which two variables have a simple linear (straight-line) relationship. Image source: http://commons.wikimedia.org/wiki/File:Scatterplot_r%3D-.76.png, Public domain
  • 13. 13 Linear correlation The linear relation between two variables is indicated by a correlation: • Direction: Sign (+ / -) indicates direction of relationship (+ve or -ve slope) • Strength: Size indicates strength (values closer to -1 or +1 indicate greater strength) • Statistical significance: p indicates likelihood that the observed relationship could have occurred by chance
  • 14. 14 Types of relationships • No relationship (r ~ 0) (X and Y are independent) • Linear relationship (X and Y are dependent) –As X ↑s, so does Y (r > 0) –As X ↑s, Y ↓s (r < 0) • Non-linear relationship
  • 15. 15 Types of correlation To decide which type of correlation to use, consider the levels of measurement for each variable.
  • 16. 16 Types of correlation • Nominal by nominal: Phi (Φ) / Cramer’s V, Chi-square • Ordinal by ordinal: Spearman’s rank / Kendall’s Tau b • Dichotomous by interval/ratio: Point bi-serial rpb • Interval/ratio by interval/ratio: Product-moment or Pearson’s r
  • 17. 17 Types of correlation and LOM Scatterplot Product- moment correlation (r) Interval/Ratio ⇐⇑Recode Clustered bar chart or scatterplot Spearman's Rho or Kendall's Tau Ordinal Clustered bar chart or scatterplot Point bi-serial correlation (rpb ) ⇐ Recode Clustered bar- chart Chi-square, Phi (φ) or Cramer's V Nominal Int/RatioOrdinalNominal
  • 19. 19 Nominal by nominal correlational approaches • Contingency (or cross-tab) tables – Observed frequencies – Expected frequencies – Row and/or column %s – Marginal totals • Clustered bar chart • Chi-square • Phi (φ) / Cramer's V
  • 20. 20 Contingency tables ● Bivariate frequency tables ● Marginal totals (blue) ● Observed cell frequencies (red)
  • 21. 21 Contingency table: Example BLUE = Marginal totals RED = Cell frequencies
  • 22. 22 Contingency table: Example ● Expected counts are the cell frequencies that should occur if the variables are not correlated. ● Chi-square is based on the squared differences between the actual and expected cell counts: $\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$
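The expected counts and the chi-square sum described on this slide can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the 2 x 2 counts are hypothetical and are not the lecture's smoking/snoring data.

```python
def expected_counts(observed):
    """Expected cell counts if the variables are unrelated:
    (row total * column total) / N for each cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(observed):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )


# Hypothetical counts: rows = smoker / non-smoker, columns = snores / does not snore
table = [[30, 20], [20, 30]]
print(expected_counts(table))       # every expected count is 25.0
print(round(chi_square(table), 2))  # 4.0
```

With these made-up counts each cell differs from its expected count by 5, so each cell contributes 25 / 25 = 1 and the statistic is 4.0, which exceeds the 1 df critical value of 3.84 quoted later in the lecture.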
  • 23. 23 Cell percentages Row and/or column cell percentages can also be useful e.g., ~60% of smokers snore, whereas only ~30% of non-smokers snore.
  • 24. 24 Clustered bar graph Bivariate bar graph of frequencies or percentages. The category axis bars are clustered (by colour or fill pattern) to indicate the second variable’s categories.
  • 25. 25 ~60% of snorers are smokers, whereas only ~30% of non-snorers are smokers.
  • 27. 27 Write-up: χ2 (1, 188) = 8.07, p = .004 Pearson chi-square test: Example Smoking (2) x Snoring (2)
  • 28. 28 Chi-square distribution: Example The critical value for chi-square with 1 df and a critical alpha of .05 is 3.84
  • 29. 29 Phi (φ) & Cramer’s V Phi (φ) • Use for 2 x 2, 2 x 3, 3 x 2 analyses e.g., Gender (2) & Pass/Fail (2) Cramer’s V • Use for 3 x 3 or greater analyses e.g., Favourite Season (4) x Favourite Sense (5) (non-parametric measures of correlation)
  • 30. 30 Phi (φ) & Cramer’s V: Example χ2 (1, 188) = 8.07, p = .004, φ = .21 Note that the sign is ignored here (because nominal coding is arbitrary, the researcher should explain the direction of the relationship)
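The φ value in this write-up can be reproduced from the reported chi-square and sample size. The sketch below assumes the standard textbook formulas (φ = √(χ² / N), and Cramer's V = √(χ² / (N × (k − 1))) where k is the smaller of the number of rows and columns); these formulas are not stated on the slides, and the function names are illustrative.

```python
import math


def phi(chi2, n):
    """Phi coefficient for a 2 x 2 table (sign is ignored)."""
    return math.sqrt(chi2 / n)


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V; reduces to phi when the smaller dimension is 2."""
    k = min(n_rows, n_cols)
    return math.sqrt(chi2 / (n * (k - 1)))


# Values from the write-up: chi-square(1, 188) = 8.07
print(round(phi(8.07, 188), 2))              # 0.21
print(round(cramers_v(8.07, 188, 2, 2), 2))  # 0.21 (same as phi for 2 x 2)
```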
  • 32. 32 Ordinal by ordinal correlational approaches • Spearman's rho (rs ) • Kendall tau (τ) • Alternatively, use nominal by nominal techniques (i.e., recode the variables or treat them as having a lower level of measurement)
  • 33. 33 Graphing ordinal by ordinal data • Ordinal by ordinal data is difficult to visualise because it's non-parametric, with many points. • Consider using: –Non-parametric approaches (e.g., clustered bar chart) –Parametric approaches (e.g., scatterplot with line of best fit)
  • 34. 34 Spearman’s rho (rs ) or Spearman's rank order correlation • For ranked (ordinal) data –e.g., Olympic Placing correlated with World Ranking • Uses product-moment correlation formula • Interpretation is adjusted to consider the underlying ranked scales
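The statement that Spearman's rho uses the product-moment formula can be illustrated by ranking each variable and then computing an ordinary Pearson correlation on the ranks. This is a rough sketch: the helper names and the placing/ranking data are invented for illustration, and tied values are given their average rank (a common convention, assumed here).

```python
import math


def pearson_r(x, y):
    """Product-moment correlation from deviation scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank from 1 upwards; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho = Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: five athletes' Olympic placing and world ranking
olympic_placing = [1, 2, 3, 4, 5]
world_ranking = [2, 1, 4, 3, 5]
print(round(spearman_rho(olympic_placing, world_ranking), 2))  # 0.8
```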
  • 35. 35 Kendall’s Tau (τ) • Tau a –Does not take joint ranks into account • Tau b –Takes joint ranks into account –For square tables • Tau c –Takes joint ranks into account –For rectangular tables
  • 40. 40 Point-biserial correlation (rpb ) • One dichotomous & one interval/ratio variable –e.g., belief in god (yes/no) and number of countries visited • Calculate as for Pearson's product-moment r • Adjust interpretation to consider the direction of the dichotomous scales
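A small sketch of the two points on this slide: the point-biserial correlation is just Pearson's r with the dichotomous variable coded numerically, and its sign depends on which category is coded 1. The data and variable names below are hypothetical.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation from deviation scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: 0 = does not believe, 1 = believes
believes = [0, 0, 0, 1, 1, 1]
countries_visited = [5, 7, 6, 4, 3, 5]

r_pb = pearson_r(believes, countries_visited)

# Reversing the arbitrary 0/1 coding flips the sign but not the size
reversed_coding = [1 - b for b in believes]
r_pb_reversed = pearson_r(reversed_coding, countries_visited)

print(round(r_pb, 2), round(r_pb_reversed, 2))  # -0.77 0.77
```

Because the coding is arbitrary, the write-up should say in words which group scored higher rather than rely on the sign alone.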
  • 41. 41 Point-biserial correlation (rpb ): Example Those who report that they believe in God also report having travelled to slightly fewer countries (rpb = -.10) but this difference could have occurred by chance (p > .05), thus H0 is not rejected. Do not believe Believe
  • 44. 44 Scatterplot • Plot each pair of observations (X, Y) –x = predictor variable (independent; IV) –y = criterion variable (dependent; DV) • By convention: –IV on the x (horizontal) axis –DV on the y (vertical) axis • Direction of relationship: –+ve = trend from bottom left to top right –-ve = trend from top left to bottom right
  • 45. 45 Scatterplot showing relationship between age & cholesterol with line of best fit Upward slope = positive correlation
  • 46. 46 • The correlation between 2 variables is a measure of the degree to which pairs of numbers (points) cluster together around a best-fitting straight line • Line of best fit: y = a + bx • Check for: – outliers – linearity Line of best fit
  • 47. 47 What's wrong with this scatterplot? X-axis should be IV (predictor) Y-axis should be DV (outcome)
  • 48. 48 Scatterplot example: Strong positive (.81) Q: Why is infant mortality positively linearly associated with the number of physicians (with the effects of GDP removed)? A: Because more doctors tend to be deployed to areas with infant mortality (socio-economic status aside).
  • 49. 49 Scatterplot example: Weak positive (.14) Outlier – exaggerating the correlation
  • 50. 50 Scatterplot example: Moderately strong negative (-.76) A: Having sufficient Vitamin D (via sunlight) lowers risk of cancer. However, UV light exposure increases risk of skin cancer. Q: Why is there a strong negative correlation between solar radiation and breast cancer?
  • 51. 51 Pearson product-moment correlation (r) ● The product-moment correlation is the standardised covariance.
  • 52. 52 Covariance • Variance shared by 2 variables • Covariance reflects the direction of the relationship: +ve cov indicates +ve relationship; -ve cov indicates -ve relationship ● Covariance is unstandardised. $\mathrm{Cov}_{XY} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{N - 1}$, where the numerator is the sum of cross-products. Divide by N - 1 for the sample; N for the population.
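The covariance formula can be written out directly as a sum of cross-products. The sketch below uses made-up scores, and the `sample` flag is an illustrative way to switch between the N - 1 and N denominators.

```python
def covariance(x, y, sample=True):
    """Sum of cross-products of deviation scores, divided by
    N - 1 (sample) or N (population)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return cross_products / (n - 1 if sample else n)


# Hypothetical scores
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

print(covariance(x, y))                # 1.5 (sum of cross-products 6, divided by 4)
print(covariance(x, y, sample=False))  # 1.2 (divided by 5)
```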
  • 53. 53 Covariance: Cross-products [Scatterplot of Y1 against X1, with regions labelled as +ve or -ve cross products]
  • 54. 54 Covariance → Correlation • Size depends on the measurement scale → Can’t compare covariance across different scales of measurement (e.g., age by weight in kilos versus age by weight in grams). • Therefore, standardise covariance (divide by the cross-product of the SDs) → correlation • Correlation is an effect size - i.e., standardised measure of strength of linear relationship
  • 55. 55 The covariance between X and Y is 1.2. The SD of X is 2 and the SD of Y is 3. The correlation is: a. 0.2 b. 0.3 c. 0.4 d. 1.2 Covariance, SD, and correlation: Example quiz question Answer: 1.2 / (2 x 3) = 0.2 (option a)
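The quiz answer, and the earlier point that covariance depends on the measurement scale while correlation does not, can both be checked with a short sketch. The age and weight values are invented for illustration; only the 1.2, 2 and 3 come from the quiz question.

```python
import statistics


def covariance(x, y):
    """Sample covariance: sum of cross-products divided by N - 1."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return cross_products / (len(x) - 1)


def correlation(x, y):
    """Covariance standardised by the product of the two SDs."""
    return covariance(x, y) / (statistics.stdev(x) * statistics.stdev(y))


def correlation_from_summary(cov_xy, sd_x, sd_y):
    return cov_xy / (sd_x * sd_y)


# Quiz question: covariance 1.2, SDs 2 and 3
print(round(correlation_from_summary(1.2, 2, 3), 2))  # 0.2

# Hypothetical data: the same weights expressed in kilos and in grams
age = [20, 30, 40, 50, 60]
weight_kg = [60.0, 72.0, 65.0, 80.0, 78.0]
weight_g = [w * 1000 for w in weight_kg]

print(covariance(age, weight_kg), covariance(age, weight_g))  # differ by a factor of 1000
print(round(correlation(age, weight_kg), 3) == round(correlation(age, weight_g), 3))  # True
```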
  • 56. 56 Hypothesis testing Almost all correlations are not 0. So, hypothesis testing seeks to answer: • What is the likelihood that an observed relationship between two variables is “true” or “real”? • What is the likelihood that an observed relationship is simply due to chance?
  • 57. 57 Significance of correlation • Null hypothesis (H0): ρ = 0 i.e., no “true” relationship in the population • Alternative hypothesis (H1): ρ ≠ 0 i.e., there is a real relationship in the population • Initially, assume H0 is true, and then evaluate whether the data support H1. • ρ (rho) = population product-moment correlation coefficient
  • 58. 58 How to test the null hypothesis • Select a critical value (alpha (α)); commonly .05 • Use a 1- or 2-tailed test; 1-tailed if hypothesis is directional • Calculate correlation and its p value. Compare to the critical alpha value. • If p < critical alpha, the correlation is statistically significant, i.e., if H0 were true, a correlation at least this large would arise from random sampling variability less than critical alpha of the time.
  • 59. 59 Correlation – SPSS output [Correlations table: Cigarette Consumption per Adult per Day by CHD Mortality per 10,000; Pearson Correlation = .713**, Sig. (2-tailed) = .000, N = 21. ** Correlation is significant at the 0.01 level (2-tailed).]
  • 60. 60 Errors in hypothesis testing • Type I error: decision to reject H0 when H0 is true • Type II error: decision to not reject H0 when H0 is false • A significance test outcome depends on the statistical power which is a function of: –Effect size (r) –Sample size (N) –Critical alpha level (αcrit )
  • 61. 61 Significance of correlation Critical r at p = .05, by df (N - 2): df 5 = .67; df 10 = .50; df 15 = .41; df 20 = .36; df 25 = .32; df 30 = .30; df 50 = .23; df 200 = .11; df 500 = .07; df 1000 = .05. The higher the N, the smaller the correlation required for a statistically significant result – why?
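One way to approach the "why?" is through the usual t test for a correlation, t = r √(df / (1 − r²)), which can be rearranged to give the critical r for a given df: r = t / √(t² + df). The sketch below assumes this standard relationship (it is not shown on the slides) and hard-codes a few one-tailed critical t values at α = .05 from a standard t table; the tabled r values on the slide appear consistent with a one-tailed test, but that is an inference rather than something the slide states.

```python
import math

# One-tailed critical t values at alpha = .05 (standard t table; assumed here)
T_CRIT_ONE_TAILED_05 = {5: 2.015, 10: 1.812, 20: 1.725}


def t_statistic(r, df):
    """t value for testing H0: rho = 0, with df = N - 2."""
    return r * math.sqrt(df / (1 - r ** 2))


def critical_r(t_crit, df):
    """Smallest r that reaches the critical t for this df."""
    return t_crit / math.sqrt(t_crit ** 2 + df)


for df, t_crit in T_CRIT_ONE_TAILED_05.items():
    print(df, round(critical_r(t_crit, df), 2))
# 5 0.67
# 10 0.5
# 20 0.36
```

Because df sits in the denominator, the same critical t corresponds to a smaller and smaller r as N grows: with more data, sampling variability shrinks, so a weaker correlation can still be distinguished from zero.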
  • 62. 62 Scatterplot showing a confidence interval for a line of best fit
  • 63. 63 US States 4th Academic Achievement by SES Scatterplot showing a confidence interval for a line of best fit Image source: http://www.chriscorrea.com/2005/poverty-and-performance-on-the-naep/
  • 64. 64 If the correlation between Age and Performance is statistically significant, it means that: a. there is an important relationship between the variables b. the true correlation between the variables in the population is equal to 0 c. the true correlation between the variables in the population is not equal to 0 d. getting older causes you to do poorly on tests Practice quiz question: Significance of correlation
  • 66. 66 Coefficient of Determination (r2 ) • CoD = The proportion of variance in one variable that can be accounted for by another variable. • e.g., r = .60, r2 = .36 or 36% of shared variance
  • 67. 67 Interpreting correlation (Cohen, 1988) • A correlation is an effect size • Rule of thumb: Strength r r2 Weak: .1 - .3 1 - 9% Moderate: .3 - .5 10 - 25% Strong: >.5 > 25%
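The coefficient of determination and the rule-of-thumb labels can be expressed as two small helper functions. This is a sketch only: how the boundary values (.1, .3, .5) are assigned and the "negligible" label for |r| < .1 are assumptions, since the slide does not specify them.

```python
def coefficient_of_determination(r):
    """Proportion of variance shared by the two variables."""
    return r ** 2


def cohen_strength(r):
    """Rule-of-thumb label based on the absolute size of r.
    Boundary handling and the 'negligible' label are assumptions."""
    size = abs(r)
    if size >= 0.5:
        return "strong"
    if size >= 0.3:
        return "moderate"
    if size >= 0.1:
        return "weak"
    return "negligible"


r = 0.60
print(round(coefficient_of_determination(r), 2), cohen_strength(r))  # 0.36 strong
print(cohen_strength(-0.35))  # moderate (the sign does not affect strength)
```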
  • 68. 68 Size of correlation (Cohen, 1988) WEAK (.1 - .3) MODERATE (.3 - .5) STRONG (> .5)
  • 69. 69 Interpreting correlation (Evans, 1996) Strength r r2 very weak 0 - .19 (0 to 4%) weak .20 - .39 (4 to 16%) moderate .40 - .59 (16 to 36%) strong .60 - .79 (36% to 64%) very strong .80 - 1.00 (64% to 100%)
  • 70. 70 [Scatterplot of Y1 against X1] Correlation of this scatterplot = -.9. Scale has no effect on correlation.
  • 72. 72 What do you estimate the correlation of this scatterplot of height and weight to be? a. -.5 b. -1 c. 0 d. .5 e. 1 [Scatterplot with axes WEIGHT and HEIGHT]
  • 73. 73 What do you estimate the correlation of this scatterplot to be? a. -.5 b. -1 c. 0 d. .5 e. 1 [Scatterplot with axes X and Y]
  • 74. 74 What do you estimate the correlation of this scatterplot to be? a. -.5 b. -1 c. 0 d. .5 e. 1 [Scatterplot with axes X and Y]
  • 75. 75 Write-up: Example “Number of children and marital satisfaction were inversely related (r (48) = -.35, p < .05), such that contentment in marriage tended to be lower for couples with more children. Number of children explained approximately 12% of the variance in marital satisfaction, a small-moderate effect.”
  • 77. 77 Assumptions and limitations 1. Levels of measurement 2. Normality 3. Linearity (a. Effects of outliers; b. Non-linearity) 4. Homoscedasticity 5. No range restriction 6. Homogeneous samples 7. Correlation is not causation 8. Dealing with multiple correlations
  • 78. 78 Normality • The X and Y data should be sampled from populations with normal distributions • Do not overly rely on any single indicator of normality; use histograms, skewness and kurtosis (e.g., within -1 and +1) • Inferential tests of normality (e.g., Shapiro-Wilk) are overly sensitive when the sample is large
  • 79. 79 Effect of outliers • Outliers can disproportionately increase or decrease r. • Options –compute r with & without outliers –get more data for outlying values –recode outliers as having more conservative scores –transformation –recode variable into lower level of measurement and a non-parametric approach
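The first option, computing r with and without outliers, might look like the sketch below. The data are hypothetical (they are not the age and self-esteem data shown in the next slides), with a single extreme case added to five otherwise moderately related points.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation from deviation scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores; the last case (20, 20) is an extreme outlier
x = [1, 2, 3, 4, 5, 20]
y = [3, 1, 4, 2, 5, 20]

r_with = pearson_r(x, y)
r_without = pearson_r(x[:-1], y[:-1])

print(round(r_with, 2), round(r_without, 2))  # 0.98 0.5
```

Reporting both values lets the reader see how much of the apparent relationship rests on one case.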
  • 80. 80 Age & self-esteem (r = .63) [Scatterplot of self-esteem (SE) against AGE]
  • 81. 81 Age & self-esteem (outliers removed) r = .23 [Scatterplot of self-esteem (SE) against AGE]
  • 82. 82 Non-linear relationships Check scatterplot. Can a linear relationship ‘capture’ the lion’s share of the variance? If so, use r. [Scatterplot panels (axes X2 and Y2) labelled “Yes – straight-line appropriate” and “No – straight-line inappropriate”]
  • 83. 83 Non-linear relationships If non-linear, consider: • Does a linear relation help? • Use a non-linear mathematical function to describe the relationship between the variables • Transforming variables to “create” linear relationship
  • 84. 84 Scedasticity • Homoscedasticity refers to even spread of observations about a line of best fit • Heteroscedasticity refers to uneven spread of observations about a line of best fit • Assess visually and with Levene's test
  • 87. 87 Range restriction • Range restriction is when the sample contains a restricted (or truncated) range of scores –e.g., level of hormone X and age < 18 might have linear relationship • If range is restricted, be cautious about generalising beyond the range for which data is available –e.g., level of hormone X may not continue to increase linearly with age after age 18
  • 89. 89 Heterogeneous samples ● Sub-samples (e.g., males & females) may artificially increase or decrease overall r. ● Solution - calculate r separately for sub-samples & overall; look for differences [Scatterplot with axes W1 and H1]
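Calculating r separately for sub-samples and overall can be sketched as follows. The weight and self-esteem scores are invented and deliberately extreme (perfect within-group relationships in opposite directions) so that the masking effect is easy to verify by hand; real data would be noisier.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation from deviation scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical sub-samples: (weight, self-esteem) pairs
males = [(70, 4), (80, 5), (90, 6), (100, 7)]
females = [(50, 7), (60, 6), (70, 5), (80, 4)]


def r_for(pairs):
    weights, esteem = zip(*pairs)
    return pearson_r(weights, esteem)


r_males = r_for(males)
r_females = r_for(females)
r_overall = r_for(males + females)

print(round(r_males, 2), round(r_females, 2), round(abs(r_overall), 2))  # 1.0 -1.0 0.0
```

Here the overall correlation is zero even though each sub-sample shows a strong relationship, which is why the sub-sample correlations are worth checking before interpreting the overall r.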
  • 90. 90 Scatterplot of Same-sex & Opposite-sex Relations by Gender [Scatterplot of Same Sex Relations against Opp Sex Relations, points marked by SEX] ♂ r = .67, r2 = .45; ♀ r = .52, r2 = .27
  • 91. 91 Scatterplot of Weight and Self-esteem by Gender [Scatterplot of self-esteem (SE) against WEIGHT, points marked by SEX] ♂ r = .50; ♀ r = -.48
  • 92. 92 Correlation is not causation e.g., correlation between ice cream consumption and crime, but actual cause is temperature Image source: Unknown
  • 93. 93 Correlation is not causation e.g., Stop global warming: Become a pirate. As the number of pirates has reduced, global temperature has increased. Image source: Unknown
  • 94. 94 Scatterplot matrices organise scatterplots and correlations amongst several variables at once. However, they are not sufficiently detailed for more than about five variables at a time. Dealing with several correlations Image source: Unknown
  • 95. 95 Correlation matrix: Example of an APA Style Correlation Table
  • 97. 97 Summary: Correlation 1. The world is made of covariations. 2. Covariations are the building blocks of more complex multivariate relationships. 3. Correlation is a standardised measure of the covariance (extent to which two phenomena co-relate). 4. Correlation does not prove causation - may be opposite causality, bi-directional, or due to other variables.
  • 98. 98 Summary: Types of correlation • Nominal by nominal: Phi (Φ) / Cramer’s V, Chi-square • Ordinal by ordinal: Spearman’s rank / Kendall’s Tau b • Dichotomous by interval/ratio: Point bi-serial rpb • Interval/ratio by interval/ratio: Product-moment or Pearson’s r
  • 99. 99 Summary: Correlation steps 1. Choose measure of correlation and graphs based on levels of measurement. 2. Check graphs (e.g., scatterplot): –Linear or non-linear? –Outliers? –Homoscedasticity? –Range restriction? –Sub-samples to consider?
  • 100. 100 Summary: Correlation steps 3. Consider –Effect size (e.g., Φ, Cramer's V, r, r2 ) –Direction –Inferential test (p) 4. Interpret/Discuss –Relate back to hypothesis –Size, direction, significance –Limitations e.g., • Heterogeneity (sub-samples) • Range restriction • Causality?
  • 101. 101 Summary: Interpreting correlation • Coefficient of determination –Correlation squared –Indicates % of shared variance Strength r r2 Weak: .1 - .3 1 - 9% Moderate: .3 - .5 10 - 25% Strong: > .5 > 25%
  • 102. 102 Summary: Assumptions & limitations 1. Levels of measurement 2. Normality 3. Linearity 4. Homoscedasticity 5. No range restriction 6. Homogeneous samples 7. Correlation is not causation 8. Dealing with multiple correlations
  • 103. 103 References Evans, J. D. (1996). Straightforward statistics for the behavioral sciences. Pacific Grove, CA: Brooks/Cole Publishing. Howell, D. C. (2007). Fundamental statistics for the behavioral sciences. Belmont, CA: Wadsworth. Howell, D. C. (2010). Statistical methods for psychology (7th ed.). Belmont, CA: Wadsworth. Howitt, D. & Cramer, D. (2011). Introduction to statistics in psychology (5th ed.). Harlow, UK: Pearson.
  • 104. 104 Open Office Impress ● This presentation was made using Open Office Impress. ● Free and open source software. ● http://www.openoffice.org/product/impress.html

Editor's Notes

  1. 7126/6667 Survey Research & Design in Psychology Semester 1, 2017, University of Canberra, ACT, Australia James T. Neill Home page: http://en.wikiversity.org/wiki/Survey_research_and_design_in_psychology Lecture page: http://en.wikiversity.org/wiki/Survey_research_and_design_in_psychology/Lectures/Correlation Image source: http://commons.wikimedia.org/wiki/File:Gnome-power-statistics.svg Image author: GNOME project Image license: GPL, http://en.wikipedia.org/wiki/GNU_General_Public_License Description: Overviews non-parametric and parametric approaches to (bivariate) linear correlation. This lecture is accompanied by a computer-lab based tutorial, the notes for which are available here: http://en.wikiversity.org/wiki/Survey_research_and_design_in_psychology/Tutorials/Correlation
  2. This lecture explains: the purpose of using parametric and non-parametric correlational analyses (i.e., what question(s) are we trying to answer); the nature of covariation (what does it mean if two variables covary or “vary together”); correlational analyses; types of answers (what are we looking for? what can we conclude?); types of correlation, and how to select appropriate correlational measures and graphical techniques based on the variables' level of measurement; interpretation of correlational relations and graphs; assumptions / limitations; dealing with several correlations.
  3. Image source: http://commons.wikimedia.org/wiki/File:Bee_1_by_andy205.jpg Image author: http://www.sxc.hu/profile/andy205 License: Copyright, can be used for any purpose as stated on the image author's profile Responses to a single variable will vary (i.e., they will be distributed across a range). Responses to two or more variables may covary, i.e., they may share some variation e.g., when the value of one variable is high, the other also tends to be relatively high (or low). The world is made of covariation! Look around – look closely - everywhere we look, there are patterns of covariation, i.e., when two or more states tend to co-occur - e.g., higher rainfall tends to be associated with lusher plant growth. From the distribution of stars to the behaviour of ants, we can observe predictable co-occurrence of phenomena (they tend to occur together) - e.g., when students study harder, they tend to perform better on assessment tasks.
  4. Image source: http://www.flickr.com/photos/notsogoodphotography/420751815/ By notsogoodphotography: http://www.flickr.com/photos/notsogoodphotography/ License: CC-by-A 2.0 http://creativecommons.org/licenses/by/2.0/deed.en We observe covariations in the psycho-social world. We can measure our observations.
  5. Image source: http://www.flickr.com/photos/kimota/146865077/ By gualtiero: http://www.flickr.com/photos/kimota/ License: CC-by-SA 2.0 http://creativecommons.org/licenses/by-sa/2.0/deed.en The covariation between a set of variables provides the underlying building blocks for all the major types of statistical modelling. If you can understand bivariate covariation (the shared variation of two variables) then you are well on your way to being able to understand multivariate analyses and modelling techniques. Correlation underlies more advanced methods in this course: - internal consistency - linear and multiple regression - factor analysis One of the themes I’ve been trying to convince you of so far in this course is that a solid grasp of the power of supposedly ‘basic’ statistics is extremely valuable for both: - reporting descriptive statistics - as a basis for understanding advanced statistical methods Correlation is one of the most useful underlying basic statistics.
  6. This is a basic, fundamental bivariate question. Really, the only other question we can ask is about differences between two variables. However, these two questions are essentially the same, just framed in different ways, a point we will come back to in later lectures on analysing differences.
  7. Image source: http://commons.wikimedia.org/wiki/File:Scatterplot_r%3D-.76.png Image author: Online Statistics Education: A Multimedia Course of Study Project Leader: David M. Lane, Rice University Image license: Public domain
  8. See also: http://en.wikiversity.org/wiki/Survey_research_and_design_in_psychology/Tutorials/Correlation/Types_of_correlation_and_level_of_measurement http://wilderdom.com/courses/surveyresearch/tutorials/2/#Types
  9. Image source: http://www.health.state.pa.us/hpa/stats/techassist/risknotion.htm
  10. Image author: James Neill, 2007 Image license: Creative Commons Attribution 2.5 Available through SPSS Crosstabs From the Quick Fun Survey Data – Tutorial 2
  11. Image author: James Neill, 2007 Image license: Creative Commons Attribution 2.5 Available through SPSS Crosstabs - Cells From the Quick Fun Survey Data – Tutorial 2
  12. Image author: James Neill, 2007 Image license: Creative Commons Attribution 2.5 Available through SPSS Crosstabs - Cells From the Quick Fun Survey Data – Tutorial 2 - Correlation
  13. Image source: http://www.burningcutlery.com/derek/bargraph/ http://www.burningcutlery.com/derek/bargraph/cluster.png Image license: Appears to be GPL, http://www.burningcutlery.com/derek/bargraph/gpl.txt
  14. Image author: James Neill, 2007 Image license: Creative Commons Attribution 2.5 Available through SPSS Graphs – Bar Chart - Clustered From the Quick Fun Survey Data – Tutorial 2 - Correlation
  15. Check that the minimum expected cell size is >= 5. Image source: http://en.wikiversity.org/wiki/Pearson%27s_chi-square_test Image license: CC by SA 3.0, Creative Commons Attribution/Share-Alike License, http://creativecommons.org/licenses/by-sa/3.0/ Previously used: http://www.geography-site.co.uk/pages/skills/fieldwork/statimage/chisqu.gif
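The expected cell size check can be sketched in a few lines. This is a minimal illustration, not the SPSS Crosstabs procedure itself; the function name `expected_counts` and the 2 x 2 table of frequencies are hypothetical and not taken from the Quick Fun Survey data.

```python
def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical 2 x 2 table of observed frequencies (illustrative only)
observed = [[20, 10],
            [5, 15]]

expected = expected_counts(observed)
min_expected = min(min(row) for row in expected)

# The rule of thumb in the note: every expected count should be at least 5
assumption_met = min_expected >= 5
print(expected, min_expected, assumption_met)
```

Note that the check applies to the expected counts, not the observed ones: the observed table above contains a 5, but the smallest expected count is 10.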
  16. Image author: James Neill, 2007 Image license: Creative Commons Attribution 2.5 From the Quick Fun Survey Data – Tutorial 2 - Correlation
  17. Image source: Unknown Chi-square obtained (1 df) = 10.26 > Chi-square critical value for critical alpha .05 (1 df) = 3.84, therefore the sample data was unlikely to have come from a population in which the two variables are unrelated, i.e., reject the null hypothesis that there is no relationship between the variables.
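The comparison of the obtained and critical chi-square values can also be expressed as p-values. The sketch below relies on the fact that, for 1 df only, the upper-tail probability of a chi-square value x equals erfc(sqrt(x / 2)); the function name `p_value_1df` is an assumption made for illustration, and the two input values are the ones quoted in the note.

```python
import math


def p_value_1df(chi2):
    """Upper-tail p-value for a chi-square statistic with exactly 1 df."""
    return math.erfc(math.sqrt(chi2 / 2))


p_obtained = p_value_1df(10.26)  # obtained chi-square from the note
p_critical = p_value_1df(3.84)   # critical value for alpha = .05, 1 df

# The obtained value lies beyond the critical value, so p < .05
reject_null = p_obtained < 0.05
print(round(p_obtained, 4), round(p_critical, 3), reject_null)
```

For tables with more than 1 df this shortcut does not hold, and a chi-square table or statistical package is needed instead.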
  18. Image author: James Neill, 2007 Image license: Creative Commons Attribution 2.5 Available through SPSS Crosstabs - Statistics
  19. Correlations still range -1 to +1
  20. Available through SPSS Crosstabs or Correlate – Bivariate Reference: http://en.wikipedia.org/wiki/Kendall_tau_rank_correlation_coefficient
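As a rough illustration of what Kendall's tau measures, the sketch below counts concordant and discordant pairs and computes tau-a, the simplest variant. This is an assumption for teaching purposes: statistical packages typically report tau-b, which adjusts for ties, although the two agree when there are no ties. The function name and the ranks are hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1   # pair is ordered the same way on both variables
        elif product < 0:
            discordant += 1   # pair is ordered in opposite ways
    n_pairs = len(x) * (len(x) - 1) / 2
    return (concordant - discordant) / n_pairs


# Hypothetical ranks for five cases on two ordinal variables (no ties)
x_ranks = [1, 2, 3, 4, 5]
y_ranks = [2, 1, 4, 3, 5]

tau = kendall_tau_a(x_ranks, y_ranks)
print(tau)
```

Here 8 of the 10 pairs are concordant and 2 are discordant, giving tau = (8 - 2) / 10.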
  21. Image author: James Neill, 2007 Image license: Creative Commons Attribution 2.5 Available through SPSS Crosstabs - Statistics
  22. Image author: James Neill, 2007 Image license: Creative Commons Attribution 2.5 Available through SPSS Crosstabs - Statistics
  23. Image author: James Neill, 2007 Image license: Creative Commons Attribution 2.5 Available through SPSS Crosstabs - Statistics
  24. Image author: James Neill, 2007 Image license: Creative Commons Attribution 2.5 On the x-axis, 0 = believe; 1 = don’t believe
  25. Image author: James Neill, 2007 Image license: Creative Commons Attribution 2.5 Is there a relationship between belief in God and the number of countries visited? From the Quick Fun Survey data – Tutorial 2
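A relationship between a dichotomous variable and a continuous variable of this kind is usually summarised with a point bi-serial correlation, which is simply Pearson's r computed with the dichotomy coded 0/1. The sketch below assumes that approach; the data are invented for illustration and are not the Quick Fun Survey results, so the size and direction of the coefficient carry no meaning.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation from deviation scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Hypothetical data: 0 = believe, 1 = don't believe (coding from the note)
belief = [0, 0, 0, 1, 1, 1]
countries_visited = [2, 4, 3, 6, 8, 7]

r_pb = pearson_r(belief, countries_visited)
print(round(r_pb, 3))
```

Swapping the 0/1 coding would flip the sign of the coefficient but leave its size unchanged, which is why the coding needs to be reported alongside the result.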
  26. Image source: Unknown Image author: Unknown Image license: Unknown
  27. Image source: http://www.flickr.com/photos/brandi666/495013956/ IV should be treated as X and DV as Y, although this is not always distinct.
  28. Image source: Howell (Fundamentals)
  29. Image source: Howell (Fundamentals)
  30. Image source: Howell (Fundamentals)
  31. Image source: Wikiversity
  32. Recall that variance is the sum of (X - meanX) squared, divided by N - 1. The covariance denominator is N for the population and N - 1 for the sample.
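The parallel between variance and covariance can be sketched as follows, using the sample (N - 1) denominator. The function names and scores are hypothetical, chosen only to keep the arithmetic easy to follow by hand.

```python
def sample_variance(x):
    """Sum of squared deviations from the mean, divided by N - 1."""
    n = len(x)
    mean_x = sum(x) / n
    return sum((xi - mean_x) ** 2 for xi in x) / (n - 1)


def sample_covariance(x, y):
    """Sum of cross-products of deviations, divided by N - 1."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)


# Hypothetical scores for five cases
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

var_x = sample_variance(x)        # 10 / 4
cov_xy = sample_covariance(x, y)  # 6 / 4
print(var_x, cov_xy)
```

The variance of X is just the covariance of X with itself, so `sample_covariance(x, x)` returns the same value as `sample_variance(x)`.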
  33. Image author: James Neill, 2007 Image license: Creative Commons Attribution 2.5 When would covXY be large and positive? When would covXY be large and negative?
  34. b. .15 because the SD’s are 2 and 3 respectively, which will result in the covariance being standardized to less than half its value for the correlation.
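The standardisation step behind this answer is r = covXY / (SDx * SDy). In the sketch below the SDs of 2 and 3 come from the note, but the covariance of 0.9 is an assumed value, picked only because it reproduces the .15 answer; the covariance shown on the slide itself is not given in these notes.

```python
def correlation_from_covariance(cov_xy, sd_x, sd_y):
    """Standardise a covariance into a correlation: r = cov / (SDx * SDy)."""
    return cov_xy / (sd_x * sd_y)


assumed_cov = 0.9  # ASSUMPTION: illustrative value, not taken from the slide
r = correlation_from_covariance(assumed_cov, sd_x=2, sd_y=3)
print(round(r, 2))
```

With SDs of 2 and 3 the covariance is divided by 6, so the correlation is always one sixth of the covariance, whatever the covariance is.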
  35. Research hypothesis - predict that there will be a negative relationship between weight & self-esteem, such that as weight increases self-esteem decreases. Null hypothesis: no relationship between weight & SE.
  36. PASW automatically provides a sig. test when using Correlate – Bivariate. Also try this handy calculator: Significance of the Difference Between an Observed Correlation Coefficient and a Hypothetical Value of rho http://faculty.vassar.edu/lowry/rpop.html Alternatively, look up r tables with df = N - 2.
  37. Image source: Howell (Fundamentals)
  38. The size of correlation required to be significant decreases as N increases – why? To test the hypothesis that the population correlation is zero (rho = 0) use: t = [r * SQRT(n - 2)] / [SQRT(1 - r^2)] where r is the correlation coefficient and n is the sample size. Look up the t value in a table of the distribution of t, for (n - 2) degrees of freedom. If the computed t value is as high as or higher than the table t value, then conclude that r is significantly different from 0. http://www2.chass.ncsu.edu/garson/pa765/correl.htm#signif
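The formula in this note can be sketched directly, which also helps answer the "why?" question: for a fixed r, t grows with the square root of n - 2. The function name `t_for_r` and the example values of r and n are illustrative assumptions.

```python
import math


def t_for_r(r, n):
    """t statistic for testing rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# The same correlation (r = .30) evaluated at three hypothetical sample sizes
for n in (10, 30, 100):
    print(n, n - 2, round(t_for_r(0.30, n), 2))
```

The computed t is then compared against a t table with n - 2 degrees of freedom, as the note describes; the sketch deliberately stops short of computing a p-value.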
  39. Image license: Unknown A confidence interval can be calculated for a line of best fit (or correlation). If the lower and upper confidence limits are both above 0 or both below 0, then it indicates a significant difference from no correlation (zero). For more information, see: http://onlinestatbook.com/chapter8/correlation_ci.html pp. 260-261, Howell (Methods) How do I set confidence limits on my correlation coefficients? SPSS does not compute confidence limits directly. http://www2.chass.ncsu.edu/garson/pa765/correl.htm#limits
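One common way to set confidence limits on a correlation by hand is the Fisher z transformation: convert r to z, build the interval on the z scale with standard error 1 / sqrt(n - 3), then convert back. The sketch below assumes that method and a 95% interval (critical z = 1.96); the function name and the example values of r and n are hypothetical.

```python
import math


def correlation_ci(r, n, z_crit=1.96):
    """Approximate confidence limits for r via the Fisher z transformation."""
    z = math.atanh(r)               # r -> Fisher z
    se = 1 / math.sqrt(n - 3)       # standard error on the z scale
    lower = math.tanh(z - z_crit * se)  # back-transform each limit to r
    upper = math.tanh(z + z_crit * se)
    return lower, upper


# Hypothetical example: r = .50 from a sample of 30
lower, upper = correlation_ci(0.50, 30)
excludes_zero = lower > 0 or upper < 0
print(round(lower, 2), round(upper, 2), excludes_zero)
```

Because the limits are back-transformed, the interval is not symmetric around r, and it always stays within -1 to +1.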
  40. Image source: http://www.chriscorrea.com/2005/poverty-and-performance-on-the-naep/
  41. c. the true correlation between Age and Performance is not equal to 0
  42. Image author: James Neill, 2007 Image license: Creative Commons Attribution 2.5 References Allen &amp; Bennett, 2010, p. 173 Howell, 2010, p. 344
  43. Image author: James Neill, 2007 Image license: Creative Commons Attribution 2.5
  44. Evans, J. D. (1996). Straightforward statistics for the behavioral sciences . Pacific Grove, CA: Brooks/Cole Publishing.
  45. Image author: James Neill, 2007 Image license: Creative Commons Attribution 2.5
  46. Image author: James Neill, 2007 Image license: Creative Commons Attribution 2.5 This plot is the same as the previous one - the x-axis display has simply been rescaled. Also see the effect of linear transformation on r: http://davidmlane.com/hyperstat/A69920.html
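The point that rescaling an axis leaves the correlation unchanged can be checked numerically. In this sketch the data and the transformation (multiply by 10, add 3) are arbitrary illustrative choices; any linear transformation with a positive multiplier behaves the same way.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation from deviation scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Hypothetical scores
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_rescaled = [10 * xi + 3 for xi in x]   # linear transformation of X
x_reversed = [-xi for xi in x]           # negative multiplier

r_original = pearson_r(x, y)
r_rescaled = pearson_r(x_rescaled, y)    # same value as r_original
r_reversed = pearson_r(x_reversed, y)    # same size, opposite sign
print(round(r_original, 3), round(r_rescaled, 3), round(r_reversed, 3))
```

A negative multiplier reverses the direction of scoring, so it flips the sign of r while leaving its magnitude intact.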
  47. Image author: James Neill, 2007 Image license: Creative Commons Attribution 2.5 Answer: c. 0
  48. Image author: James Neill, 2007 Image license: Creative Commons Attribution 2.5 Answer: c. 0 (no variation in X scores)
  49. Also see end of Howell (Fundamentals) Ch9 for an example write-up
  50. Also see Howell (Methods) p. 263 http://www2.chass.ncsu.edu/garson/pa765/correl.htm
  51. Evans, J. D. (1996). Straightforward statistics for the behavioral sciences . Pacific Grove, CA: Brooks/Cole Publishing.
  52. Image author: James Neill, 2007 Image license: Creative Commons Attribution 2.5
  53. Image author: James Neill, 2007 Image license: Creative Commons Attribution 2.5
  54. Image author: James Neill, 2007 Image license: Creative Commons Attribution 2.5 Statistical tests are available for non-linear relationships, but it is best to start with a visual examination of the plotted data.
  55. Evans, J. D. (1996). Straightforward statistics for the behavioral sciences . Pacific Grove, CA: Brooks/Cole Publishing.
  56. Image source: https://commons.wikimedia.org/wiki/File:Homoscedasticity.png Image source: https://commons.wikimedia.org/wiki/File:Heteroscedasticity.png Image license: https://creativecommons.org/licenses/by-sa/3.0/deed.en
  57. Image source: Unknown Image license: Unknown Homoscedasticity is guaranteed when the two variables are jointly (bivariate) normally distributed. Heteroscedasticity weakens but does not invalidate correlational analysis. http://www.pfc.cfs.nrcan.gc.ca/profiles/wulder/mvstats/homosced_e.html http://davidmlane.com/hyperstat/A121947.html
  58. Image source: http://davidmlane.com/hyperstat/A68809.html Image license: Unknown
  59. Image author: James Neill, 2007 Image license: Creative Commons Attribution 2.5
  60. Image author: James Neill, 2007 Image license: Creative Commons Attribution 2.5
  61. Image author: James Neill, 2007 Image license: Creative Commons Attribution 2.5
  62. Image source: Unknown
  63. Image source: Unknown
  64. Image source: Unknown
  65. Image source: This lecture Image author: James Neill Image license: Creative Commons Attribution 3.0, CC-by-A 3.0