Kappa

Kappa is a measure of agreement between two or more clinicians or methods
classifying the same cases into two or more diagnostic categories.

Why not use % agreement?

Because a substantial amount of agreement can occur purely by chance.

Kappa can be defined as the proportion of agreement remaining after chance
agreement is removed (see the formula below).

A Kappa of 0 occurs when agreement is no better than chance.
A Kappa of 1 indicates perfect agreement.
A negative Kappa means there is less agreement than you'd expect by chance
(very rare).

Categories may be nominal or ordinal.
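In symbols, Kappa = (Po - Pe) / (1 - Pe), where Po is the observed proportion of
agreement and Pe is the proportion of agreement expected by chance. The count
formula used in the worked example below is an equivalent form of this.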
How is it calculated?

Patient ID   Psychiatrist   Psychologist
1            1              2
2            2              2
3            2              2
4            3              3
5            3              3
6            3              3
7            3              3
8            3              4
9            3              4
10           4              4
11           4              4
12           4              3
Cross-tabulation of the ratings (rows = Psychologist, columns = Psychiatrist;
cell entries are numbers of patients):

Category   1   2   3   4
   1       0   0   0   0
   2       1   2   0   0
   3       0   0   4   1
   4       0   0   2   2
Steps

1. Add the agreements on the diagonal: 2 + 4 + 2 = 8
2. For each category, multiply the number of times one judge used it by the
   number of times the other judge used it:
      (1 x 0) + (2 x 3) + (6 x 5) + (3 x 4)
3. Add these products up: 0 + 6 + 30 + 12 = 48
4. Apply the formula

Kappa = (N x agreements - total from step 3) / (N² - total from step 3)

where N is the number of patients rated.




Which gives: Kappa = (12 x 8 - 48) / (12² - 48) = (96 - 48) / (144 - 48) = 48 / 96 = 0.50
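As a check on the arithmetic, here is a minimal Python sketch (not part of the
original slides) that reproduces the worked example using the same count formula:

# Ratings for the 12 patients, copied from the table above
psychiatrist = [1, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4]
psychologist = [2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 3]

n = len(psychiatrist)  # N = 12 patients
categories = sorted(set(psychiatrist) | set(psychologist))

# Step 1: count exact agreements (the diagonal of the cross-tabulation)
agreements = sum(a == b for a, b in zip(psychiatrist, psychologist))  # 8

# Steps 2-3: multiply each judge's usage of every category, then sum the products
chance_term = sum(psychiatrist.count(c) * psychologist.count(c) for c in categories)  # 48

# Step 4: Kappa = (N x agreements - step 3 total) / (N squared - step 3 total)
kappa = (n * agreements - chance_term) / (n ** 2 - chance_term)
print(kappa)  # 0.5

The same value is returned by sklearn.metrics.cohen_kappa_score(psychiatrist,
psychologist) if scikit-learn is available.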
How large should Kappa be?

Landis & Koch (1977) suggested


0.0 – 0.20 = no or slight agreement

0.21 – 0.40 = fair

0.41 – 0.60 = moderate

0.61 – 0.80 = good

> 0.80 = very good
Weighted Kappa

In ordinary Kappa, all disagreements are treated equally. Weighted Kappa takes
the magnitude of the discrepancy into account (often the more useful choice when
the categories are ordinal) and is often higher than unweighted Kappa.
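As an illustration (a sketch using scikit-learn, not part of the original slides),
both versions can be computed for the twelve ratings above; the linearly weighted
value is higher than the unweighted 0.50 because most of the disagreements are
only one category apart:

from sklearn.metrics import cohen_kappa_score

psychiatrist = [1, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4]
psychologist = [2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 3]

print(cohen_kappa_score(psychiatrist, psychologist))                    # unweighted: 0.50
print(cohen_kappa_score(psychiatrist, psychologist, weights="linear"))  # linear weights: about 0.62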
N.B. Be careful with Kappa if the prevalence of one of the categories is very
low (< 10%); this will underestimate the level of agreement.

Example:

If 2 judges are very accurate (95%), a Kappa of 0.61 at a prevalence of 10% will
drop to

 • 0.45 if prevalence is 5%
 • 0.14 if prevalence is 1%.
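The effect can be demonstrated with a small sketch (illustrative counts, not the
figures quoted above): two 2 x 2 tables in which the raters agree on 90% of the
cases, but Kappa collapses when one category becomes rare.

def kappa_from_table(table):
    """Unweighted Kappa from a square table of counts (rows = rater A, columns = rater B)."""
    n = sum(sum(row) for row in table)
    k = len(table)
    p_o = sum(table[i][i] for i in range(k)) / n                          # observed agreement
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_e = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2   # chance agreement
    return (p_o - p_e) / (1 - p_e)

balanced = [[45, 5], [5, 45]]  # rarer category in about 50% of cases, 90% agreement
rare     = [[1, 5], [5, 89]]   # rarer category in about 6% of cases, still 90% agreement
print(kappa_from_table(balanced))  # about 0.80
print(kappa_from_table(rare))      # about 0.11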
