3Measurements of health and disease_MCTD.pdf


  1. Data summarization: Measures of Central Tendency and Dispersion [MCTD]. 1/2/2023. Emiru Merdassa (MSc, Assistant Professor)
  2. Learning Objectives. By the end of this session, the students will be able to compute and interpret:
     • Mean, median and mode
     • Range (R)
     • Variance and standard deviation
     • Coefficient of variation (C.V)
     • Interquartile range
  3. Introduction
     • Compiling and presenting data in tabular or graphical form does not give complete information about the data collected.
     • We need to "summarize" the entire data set in a single figure that gives an overall idea of the data.
     • Summary measures describe the data in terms of where the values are concentrated and how much variability exists among them.
     • We use these summary figures to draw conclusions about the reference population from which the sample was drawn.
  4. I. Arithmetic Mean
     • It is the average of the data: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
     • Example: for a random sample of 10 ages, $\bar{x} = \frac{42 + 28 + 28 + 61 + 31 + 23 + 50 + 38 + 32 + 37}{10} = \frac{370}{10} = 37$
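The calculation on this slide can be checked with a few lines of Python. This is only a minimal sketch; the variable names are illustrative and do not come from the slides.

```python
# Ages from the random sample of size 10 on the arithmetic mean slide.
ages = [42, 28, 28, 61, 31, 23, 50, 38, 32, 37]

# Arithmetic mean: sum of the observations divided by their count.
total = sum(ages)
mean_age = total / len(ages)

print(total)     # 370
print(mean_age)  # 37.0
```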
  5. Properties of the Mean
     • Uniqueness: for a given set of data there is one and only one mean.
     • Simplicity: it is easy to understand and to compute.
     • Affected by extreme values, since all values enter into the computation.
     • Example: assume the values are 115, 110, 119, 117, 121 and 126. The mean = 118. But assume that the values are 75, 75, 80, 80 and 280. The mean is again 118, a value that is not representative of the set of data as a whole.
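The effect of the extreme value 280 can be illustrated with Python's standard `statistics` module. The sketch below compares the mean with the median for the two data sets in the example; the comparison with the median is an added illustration, not something the slide computes.

```python
import statistics

typical = [115, 110, 119, 117, 121, 126]
skewed = [75, 75, 80, 80, 280]  # contains one extreme value

# Both sets have the same mean ...
mean_typical = statistics.mean(typical)
mean_skewed = statistics.mean(skewed)

# ... but the median of the second set stays near the bulk of the data.
median_skewed = statistics.median(skewed)

print(mean_typical, mean_skewed)  # 118 118
print(median_skewed)              # 80
```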
  6. Median
     • It is the middle value in the ordering of all data values from smallest to largest.
     • For the same random sample, the ordered observations are 23, 28, 28, 31, 32, 37, 38, 42, 50, 61.
     • Since n = 10, the median is the 5.5th observation, i.e. the average of the 5th and 6th ordered values: (32 + 37)/2 = 34.5
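The "middle value" rule can be sketched as a small function. It uses the ten ages from the arithmetic mean slide; the helper name `median` is illustrative.

```python
def median(values):
    """Middle value of the sorted data; the average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

ages = [42, 28, 28, 61, 31, 23, 50, 38, 32, 37]
ordered_ages = sorted(ages)
median_age = median(ages)

print(ordered_ages)  # [23, 28, 28, 31, 32, 37, 38, 42, 50, 61]
print(median_age)    # 34.5
```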
  7. Median (continued). Properties of the Median:
     • Uniqueness: for a given set of data there is one and only one median.
     • Simplicity: it is easy to calculate.
     • It is not affected by extreme values as the mean is.
  8. Mode
     • It is the value which occurs most frequently.
     • If all values are different, there is no mode.
     • Sometimes there is more than one mode.
     • Example: for the same random sample, the value 28 is repeated two times, so it is the mode.
     • Properties of the mode: sometimes it is not unique; it may be used for describing qualitative data.
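The three cases on this slide (one mode, several modes, no mode) can be sketched with a frequency count. The helper `modes` and the two small extra lists are hypothetical illustrations.

```python
from collections import Counter

def modes(values):
    """Return every most-frequent value; an empty list if all values are different."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # all values occur once: no mode
    return sorted(v for v, c in counts.items() if c == top)

ages = [42, 28, 28, 61, 31, 23, 50, 38, 32, 37]

print(modes(ages))          # [28]
print(modes([1, 1, 2, 2]))  # [1, 2]  (more than one mode)
print(modes([1, 2, 3]))     # []      (no mode)
```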
  9. Exercises. Calculate 1) arithmetic mean, 2) median, 3) mode, 4) range, 5) IQR and 6) standard deviation using the following data.
     • Ages of women in clinic: 23, 31, 55, 43, 55, 19, 17, 44, 43, 37
  10. Measures of Spread/Dispersion
  11. Measures of Spread (continued)
     • Measures of spread are: range (R); variance and standard deviation; coefficient of variation (C.V); interquartile range.
     • Measures of relative position (quantiles and percentiles)
  12. Introduction
     • Knowledge of central tendency alone is not sufficient for a complete understanding of a distribution.
     • Measures of spread tell us how far apart or how close together the data points in a sample are.
     • Measures of variability are measures of spread that tell us how much the data points vary around the sample average.
  13. Range (R)
     • Range = largest value − smallest value
     • Note: the range depends on only two values and is highly sensitive to outliers.
     • Data: 43, 66, 61, 64, 65, 38, 59, 57, 57, 50. Find the range.
     • Range = 66 − 38 = 28
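A minimal Python check of the range calculation, using the ten values from this slide (variable names are illustrative):

```python
data = [43, 66, 61, 64, 65, 38, 59, 57, 57, 50]

# The range uses only the two extreme observations.
data_range = max(data) - min(data)

print(max(data), min(data))  # 66 38
print(data_range)            # 28
```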
  14. Variance
     • It measures dispersion relative to the scatter of the values about their mean.
     • a) Sample variance ($S^2$): $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$, where $\bar{x}$ is the sample mean.
     • Find the sample variance of the ages, with $\bar{x} = 56$.
     • Solution: $S^2 = \frac{(43 - 56)^2 + (66 - 56)^2 + \dots + (50 - 56)^2}{10 - 1} = \frac{810}{9} = 90$
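The worked solution can be reproduced step by step. The sketch below assumes the ages are the ten values listed on the range slide (their mean is 56, matching the slide); variable names are illustrative.

```python
data = [43, 66, 61, 64, 65, 38, 59, 57, 57, 50]

n = len(data)
mean = sum(data) / n                                # 56.0
sum_sq_dev = sum((x - mean) ** 2 for x in data)     # sum of squared deviations
sample_variance = sum_sq_dev / (n - 1)              # divide by n - 1 for a sample

print(mean)             # 56.0
print(sum_sq_dev)       # 810.0
print(sample_variance)  # 90.0
```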
  15. Standard Deviation
     • It is the square root of the variance.
     • a) Sample standard deviation: $SD = \sqrt{S^2}$
     • b) Population standard deviation: $\sigma = \sqrt{\sigma^2}$
  16. Standard Deviation (SD). Three data sets with the same mean but different spread:
     • 7, 7, 7, 7, 7, 7: mean = 7, SD = 0
     • 7, 8, 7, 7, 7, 6: mean = 7, SD = 0.63
     • 3, 2, 7, 8, 13, 9: mean = 7, SD ≈ 4.05
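The three data sets can be compared with `statistics.stdev`, which computes the sample standard deviation (divisor n − 1). This is a sketch that assumes the eighteen values on the slide form three groups of six, which is the grouping consistent with the stated means.

```python
import statistics

groups = {
    "A": [7, 7, 7, 7, 7, 7],
    "B": [7, 8, 7, 7, 7, 6],
    "C": [3, 2, 7, 8, 13, 9],
}

# Same mean in every group, but increasing sample standard deviation.
summary = {
    name: (statistics.mean(values), round(statistics.stdev(values), 2))
    for name, values in groups.items()
}

for name, (mean, sd) in summary.items():
    print(name, mean, sd)
```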
  17. Measures of Dispersion (continued). Consider the following two sets of data:
     • A: 177, 193, 195, 209, 226; mean = 200
     • B: 192, 197, 200, 202, 209; mean = 200
     • Two or more sets may have the same mean and/or median and still be quite different.
  18. Measures of Dispersion (continued). A measure of dispersion conveys information regarding the amount of variability present in a set of data. Note:
     1. If all the values are the same, there is no dispersion.
     2. If the values are different, there is dispersion.
     3. If the values are close to each other, the amount of dispersion is small.
     4. If the values are widely scattered, the dispersion is greater.
  19. Standard deviation
     • Caution must be exercised when using the standard deviation as a comparative index of dispersion.
     • Weights of newborn elephants (kg): 929, 853, 878, 939, 895, 972, 937, 841, 801, 826; n = 10, $\bar{X}$ = 887.1, SD = 56.50
     • Weights of newborn mice (kg): 0.72, 0.42, 0.63, 0.31, 0.59, 0.38, 0.79, 0.96, 1.06, 0.89; n = 10, $\bar{X}$ = 0.68, SD = 0.255
     • It is incorrect to say that elephants show greater variation in birth weight than mice merely because their standard deviation is higher.
  20. The Coefficient of Variation (C.V)
     • It is a measure used to compare the dispersion in two sets of data, independent of the unit of measurement.
     • $CV = \frac{S}{\bar{X}} \times 100$, where $S$ is the sample standard deviation and $\bar{X}$ is the sample mean.
  21. Coefficient of Variation
     • The coefficient of variation expresses the standard deviation relative to the mean.
     • Newborn elephants: n = 10, $\bar{X}$ = 887.1, SD = 56.50, CV = 0.0637 (6.37%)
     • Newborn mice: n = 10, $\bar{X}$ = 0.68, SD = 0.255, CV = 0.375 (37.5%)
     • Note: mice show greater relative variation in birth weight.
  22. Example: Suppose two samples of human males yield the following data, and we wish to know which is more variable.

     |                    | Sample 1     | Sample 2     |
     |--------------------|--------------|--------------|
     | Age                | 25-year-olds | 11-year-olds |
     | Mean weight        | 145 pounds   | 80 pounds    |
     | Standard deviation | 10 pounds    | 10 pounds    |

     • Solution: C.V (Sample 1) = (10/145) × 100 = 6.9; C.V (Sample 2) = (10/80) × 100 = 12.5
     • The weights of the 11-year-olds (Sample 2) therefore show more relative variation, even though both samples have the same standard deviation.
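The comparison of the two samples can be sketched as a small function. The name `coefficient_of_variation` is illustrative; it simply applies CV = SD / mean × 100 to the summary figures given in the example.

```python
def coefficient_of_variation(sd, mean):
    """CV as a percentage: standard deviation relative to the mean."""
    return sd / mean * 100

cv_sample1 = coefficient_of_variation(sd=10, mean=145)  # 25-year-olds
cv_sample2 = coefficient_of_variation(sd=10, mean=80)   # 11-year-olds

print(round(cv_sample1, 1))  # 6.9
print(round(cv_sample2, 1))  # 12.5
```

Because the CV is a ratio of two quantities in the same unit, the result is unit free, which is why it can be compared across groups with very different means.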
  23. When to use the coefficient of variation
     • When the comparison groups have very different means.
     • When different units of measurement are involved, e.g. group 1 is measured in mm and group 2 in gm (the CV is suitable for comparison because it is unit free).
     • In such cases, the SD should not be used for comparison.
  24. Measures of Relative Position
     • They locate the relative position of an observation in relation to the other observations.
     • Percentiles divide the data set into 100 equal groups.
     • Suppose a data set is arranged in ascending (or descending) order. The pth percentile is a number such that p% of the observations fall below it and (100 − p)% of the observations fall above it.
     • For example, 45% of observations are below the 45th percentile and 55% of observations are above it.
  25. Example: 20th percentile
  26. Percentile
     • Data: 13, 11, 10, 13, 11, 10, 8, 12, 9, 9, 8, 9
     • What is the percentile rank for 12?
     • Solution: first, arrange the values from smallest to largest. The ordered set is 8, 8, 9, 9, 9, 10, 10, 11, 11, 12, 13, 13.
     • The number of values below 12 is 9 and the total number of values in the data set is 12. Thus, using the formula $\text{percentile} = \frac{(\text{number of values below } X) + 0.5}{\text{total number of values}} \times 100$, the corresponding percentile is $\frac{9 + 0.5}{12} \times 100 \approx 79$.
     • That is, the value 12 corresponds to approximately the 79th percentile.
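A minimal sketch of the percentile-rank calculation follows. It assumes the common "number below plus 0.5" formula, which is the one that reproduces the 79th percentile stated on the slide; the function name `percentile_rank` is illustrative.

```python
def percentile_rank(values, x):
    """Percentile rank of x: (count of values below x + 0.5) / n * 100."""
    below = sum(1 for v in values if v < x)
    return (below + 0.5) / len(values) * 100

data = [13, 11, 10, 13, 11, 10, 8, 12, 9, 9, 8, 9]

below_12 = sum(1 for v in data if v < 12)
rank_12 = percentile_rank(data, 12)

print(below_12)        # 9
print(round(rank_12))  # 79
```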
  27. Measures of Relative Position (continued)
     1. Tertiles: two points that divide and order a sample variable into three categories, each containing a third of the population (e.g., high, medium, low).
  28. Measures of Relative Position (continued)
     2. Quartiles: three points that divide and order a sample variable into four categories, each containing a fourth of the population. The 25th, 50th, and 75th percentiles of a variable are used to categorize it into quartiles.
     3. Quintiles: four points that divide and order a sample variable into five categories, each containing a fifth of the population. The 20th, 40th, 60th, and 80th percentiles of a variable are used to categorize it into quintiles.
  29. Measures of Relative Position (continued)
     4. Deciles: nine points that divide and order a sample variable into ten categories, each containing a tenth of the population. The 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th, and 90th percentiles of a variable are used to categorize it into deciles.
  30. Quartile
     • Quartiles are the values that divide a list of numbers into quarters: put the list of numbers in order, then cut the list into four equal parts.
     • Example: 5, 7, 4, 4, 6, 2, 8
     • Put them in order: 2, 4, 4, 5, 6, 7, 8
     • Cut the list into quarters: Quartile 1 (Q1) = 4; Quartile 2 (Q2), which is also the median, = 5; Quartile 3 (Q3) = 7
  31. Interquartile Range
     • IQR = 3rd quartile − 1st quartile = 75th percentile − 25th percentile
     • In the ordered data, Q3 is the value at position 3(n+1)/4 and Q1 is the value at position (n+1)/4.
     • It is robust to outliers and covers the middle 50% of observations.
     • For the data 2, 4, 4, 5, 6, 7, 8, the interquartile range is IQR = Q3 − Q1 = 7 − 4 = 3.
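The quartiles and IQR for the seven-value example can be sketched with `statistics.quantiles` (Python 3.8+). Its default "exclusive" method places the quartiles at positions (n+1)/4, 2(n+1)/4 and 3(n+1)/4 in the ordered data, which matches the position rule on this slide; other software may use different conventions and give slightly different quartiles.

```python
import statistics

data = [5, 7, 4, 4, 6, 2, 8]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(sorted(data))  # [2, 4, 4, 5, 6, 7, 8]
print(q1, q2, q3)    # 4.0 5.0 7.0
print(iqr)           # 3.0
```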
  32. Exercise. The incubation period of smallpox in 9 patients was found to be 14, 13, 11, 15, 10, 7, 9, 12 and 10. Find:
     1. Mean, median and mode
     2. The best measure of central tendency (MCT) to recommend
     3. Range and IQR
     4. $S^2$ and SD
     5. C.V
  33. Thank you!
