
Standard deviation and variance

1.
REPRESENTATION AND SUMMARY OF DATA

VARIABILITY OF DATA

Each of these sets of numbers has a mean of 7 but the spread of each set is different:
(a) 7, 7, 7, 7, 7
(b) 4, 6, 6.5, 7.2, 11.3
(c) -193, -46, 28, 69, 177

There is no variability in set (a), but the numbers in set (c) are obviously much more spread out than those in set (b). There are various ways of measuring the variability or spread of a distribution, two of which are described here.

The range

The range is based entirely on the extreme values of the distribution.

Range = highest value - lowest value

In (a) the range = 7 - 7 = 0
In (b) the range = 11.3 - 4 = 7.3
In (c) the range = 177 - (-193) = 370

Note that there are also ranges based on particular observations within the data and these percentile and quartile ranges are considered on page 68.

THE STANDARD DEVIATION, $s$, AND THE VARIANCE, $s^2$

The standard deviation, $s$, is a very important and useful measure of spread. It gives a measure of the deviations of the readings from the mean, $\bar{x}$. It is calculated using all the values in the distribution. To calculate $s$:
• for each reading $x$, calculate $x - \bar{x}$, its deviation from the mean,
• square this deviation to give $(x - \bar{x})^2$ and note that, irrespective of whether the deviation was positive or negative, this is now positive,
• find $\sum (x - \bar{x})^2$, the sum of all these values,
• find the average by dividing the sum by $n$, the number of readings; this gives $\dfrac{\sum (x - \bar{x})^2}{n}$ and is known as the variance,
• finally take the positive square root of the variance to obtain the standard deviation, $s$.

The standard deviation, $s$, of a set of $n$ numbers, with mean $\bar{x}$, is given by

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$$

Each of the three sets of numbers on the previous page has mean 7, i.e. $\bar{x} = 7$.

(a) For the set 7, 7, 7, 7, 7
Since $x - \bar{x} = 7 - 7 = 0$ for every reading, $s = 0$, indicating that there is no deviation from the mean.
2.
A CONCISE COURSE IN A-LEVEL STATISTICS

(b) For the set 4, 6, 6.5, 7.2, 11.3

$$\sum (x - \bar{x})^2 = (4 - 7)^2 + (6 - 7)^2 + (6.5 - 7)^2 + (7.2 - 7)^2 + (11.3 - 7)^2 = 28.78$$

$$s = \sqrt{\frac{28.78}{5}} = 2.4 \text{ (1 d.p.)}$$

(c) For the set -193, -46, 28, 69, 177

$$\sum (x - \bar{x})^2 = (-193 - 7)^2 + (-46 - 7)^2 + (28 - 7)^2 + (69 - 7)^2 + (177 - 7)^2 = 75\,994$$

$$s = \sqrt{\frac{75\,994}{5}} = 123.3 \text{ (1 d.p.)}$$

Notice that set (c) has a much higher standard deviation than set (b), confirming that it is much more spread about the mean.

Remember that
Standard deviation = $\sqrt{\text{variance}}$
Variance = (standard deviation)$^2$

NOTE:
• The standard deviation gives an indication of the lowest and highest values of the data as follows. In most distributions, the bulk of the distribution lies within two standard deviations of the mean, i.e. within the interval $\bar{x} \pm 2s$ or $(\bar{x} - 2s, \bar{x} + 2s)$. This helps to give an idea of the spread of the data.
• The units of standard deviation are the same as the units of the data.
• Standard deviations are useful when comparing sets of data; the higher the standard deviation, the greater the variability in the data.

Example 1.22
Two machines, A and B, are used to pack biscuits. A random sample of ten packets was taken from each machine and the mass of each packet was measured to the nearest gram and noted. Find the standard deviation of the masses of the packets taken in the sample from each machine. Comment on your answer.

Machine A (mass in g): … [eight values illegible in the source], 202, 205
Machine B (mass in g): 192, 194, 195, 198, 200, 201, 203, 204, 206, 207

Solution 1.22

Machine A: $\bar{x} = \dfrac{\sum x}{n} = \dfrac{2000}{10} = 200$

Machine B: $\bar{x} = \dfrac{\sum x}{n} = \dfrac{2000}{10} = 200$

Since the mean mass for each machine is 200, $x - \bar{x} = x - 200$.
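3.
To calculate $s$, put the data into a table:

[Working table of $x - 200$ and $(x - 200)^2$ for each machine; not recoverable from the source]

Machine A: $s^2 = \dfrac{\sum (x - 200)^2}{10} = 5.6$, so $s = \sqrt{5.6} = 2.37$ (2 d.p.)

Machine B: $s^2 = \dfrac{\sum (x - 200)^2}{10} = 24$, so $s = \sqrt{24} = 4.90$ (2 d.p.)

Machine A: s.d. = 2.37 g (2 d.p.)
Machine B: s.d. = 4.90 g (2 d.p.)

Machine A has less variation, indicating that it is more reliable than machine B.

Alternative form of the formula for standard deviation

The formula given above is sometimes difficult to use, especially when $\bar{x}$ is not an integer, so an alternative form is often used. This is derived as follows:

$$
\begin{aligned}
s^2 &= \frac{1}{n}\sum (x - \bar{x})^2 \\
&= \frac{1}{n}\sum (x^2 - 2\bar{x}x + \bar{x}^2) \\
&= \frac{1}{n}\left(\sum x^2 - 2\bar{x}\sum x + \sum \bar{x}^2\right) \\
&= \frac{\sum x^2}{n} - 2\bar{x}\,\frac{\sum x}{n} + \frac{n\bar{x}^2}{n} \\
&= \frac{\sum x^2}{n} - 2\bar{x}(\bar{x}) + \bar{x}^2 \qquad \text{since } \frac{\sum x}{n} = \bar{x} \\
&= \frac{\sum x^2}{n} - \bar{x}^2
\end{aligned}
$$

Alternative format for standard deviation

$$s = \sqrt{\frac{\sum x^2}{n} - \bar{x}^2} \qquad \text{where } \bar{x} = \frac{\sum x}{n}$$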
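The equivalence of the two forms can also be checked numerically. The following is a minimal Python sketch, not part of the original text: the function names `sd_definition` and `sd_alternative` are illustrative, and the data are sets (b) 4, 6, 6.5, 7.2, 11.3 and (c) -193, -46, 28, 69, 177 from the start of the section. Both functions divide by $n$, as the formulas above do.

```python
import math


def sd_definition(values):
    """Standard deviation from the definition: sqrt(sum((x - mean)^2) / n)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


def sd_alternative(values):
    """Alternative form: sqrt(sum(x^2) / n - mean^2)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum(x * x for x in values) / n - mean ** 2
    # Floating-point rounding can leave a tiny negative value when the
    # true variance is 0, so clamp before taking the square root.
    return math.sqrt(max(variance, 0.0))


set_b = [4, 6, 6.5, 7.2, 11.3]
set_c = [-193, -46, 28, 69, 177]

for label, data in (("b", set_b), ("c", set_c)):
    print(label, round(sd_definition(data), 1), round(sd_alternative(data), 1))
# prints:
# b 2.4 2.4
# c 123.3 123.3
```

The alternative form needs only the running totals $\sum x$ and $\sum x^2$, which is why it suits hand calculation. In floating-point arithmetic it can lose precision when the mean is large compared with the spread, so the definition form is the safer choice in software.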
4.
NOTE: It is useful to remember that $\dfrac{\sum x^2}{n} - \bar{x}^2$ can be thought of as 'the mean of the squares minus the square of the mean'.

Example 1.23
The mean of the five numbers 2, 3, 5, 6, 8 is 4.8. Calculate the standard deviation.

Solution 1.23

Method 1 - using $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}}$

$x$      $x - \bar{x}$      $(x - \bar{x})^2$
2        -2.8        7.84
3        -1.8        3.24
5         0.2        0.04
6         1.2        1.44
8         3.2       10.24
                    22.80

$s^2 = \dfrac{22.80}{5} = 4.56$
$s = \sqrt{4.56} = 2.14$ (2 d.p.)

Method 2 - using $s = \sqrt{\dfrac{\sum x^2}{n} - \bar{x}^2}$

$x$      $x^2$
2         4
3         9
5        25
6        36
8        64
        138

$s^2 = \dfrac{138}{5} - (4.8)^2 = 4.56$
$s = \sqrt{4.56} = 2.14$ (2 d.p.)

The working for method 2 is less involved.

Using the calculator to find the standard deviation

The standard deviation can be found directly using the calculator in SD mode. The numbers are entered in the same way as when you are finding the mean.

To find the standard deviation of the five numbers 2, 3, 5, 6, 8 used in Example 1.23:
• Set SD mode [key presses illegible in the source]
• Clear memories [key presses illegible in the source]
• Input data: enter 2, 3, 5, 6, 8, pressing DATA after each number
• To obtain $s$ = 2.135... [key presses illegible in the source; red letters on third row of calculator]
• You can check $\bar{x} = 4.8$, $\sum x = 24$, $\sum x^2 = 138$, $n = 5$
• To clear SD mode [key presses illegible in the source]
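A software analogue of SD mode, offered here as an assumption rather than anything in the original text, is Python's standard `statistics` module. Its `pstdev` and `pvariance` functions divide by $n$, matching the formula for $s$ used above; `stdev` and `variance` divide by $n - 1$ and so give different answers. A minimal sketch using the numbers from Example 1.23:

```python
import statistics

data = [2, 3, 5, 6, 8]

mean = statistics.mean(data)
variance = statistics.pvariance(data)  # population variance: divides by n
s = statistics.pstdev(data)            # square root of the population variance

print(mean, round(variance, 2), round(s, 2))
# prints: 4.8 4.56 2.14

# The same running totals the calculator lets you check.
n = len(data)
sum_x = sum(data)
sum_x2 = sum(x * x for x in data)
print(n, sum_x, sum_x2)
# prints: 5 24 138
```

The unrounded value of `s` begins 2.135, in agreement with the calculator display quoted above.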
5.
When data are in the form of a frequency distribution, the formula for $s$ is

$$s = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}}$$

or in the alternative form

$$s = \sqrt{\frac{\sum fx^2}{\sum f} - \bar{x}^2}$$

where $\bar{x}$ is the mean.

Consider again the data given in Example 1.19, on page 32, which shows the number of children in 20 families. The mean is 2.9.

[Table: number of children per family, $x$, against frequency, $f$; entries illegible in the source]

You could use one of these three methods for finding the standard deviation. Method 2 is more popular than Method 1.

Method 1 - using $s = \sqrt{\dfrac{\sum f(x - \bar{x})^2}{\sum f}}$

[Working table with columns $x$, $f$, $x - 2.9$ and $f(x - 2.9)^2$; entries illegible in the source]

$s^2 = \dfrac{\sum f(x - 2.9)^2}{\sum f} = \dfrac{29.80}{20} = 1.49$
$s = \sqrt{1.49} = 1.22$ (2 d.p.)

The standard deviation of the number of children per family is 1.22 (2 d.p.).

Method 2 - using $s = \sqrt{\dfrac{\sum fx^2}{\sum f} - \bar{x}^2}$

[Working table with columns $x$, $f$ and $fx^2$; entries illegible in the source]

$\sum f = 20$, $\sum fx^2 = 198$
6.
$s^2 = \dfrac{\sum fx^2}{\sum f} - (2.9)^2 = \dfrac{198}{20} - (2.9)^2 = 1.49$
$s = \sqrt{1.49} = 1.22$ (2 d.p.)

The standard deviation is 1.22 (2 d.p.), as before.

Method 3 - using the calculator in SD mode.

This time you need to take account of the frequencies, and this is done in exactly the same way as when finding the mean:

[Key-sequence table for Casio and Sharp calculators: set SD mode, clear memories, input data (do this in the order $x \times f$); model numbers and key presses illegible in the source. Red letters on third row of calculator.]

Therefore the standard deviation is 1.22 (2 d.p.), as before.

In a grouped frequency distribution, the mid-interval value is taken as representative of the interval, as in the following example.

Example 1.24

[Histogram of the times taken to complete the test; axis values not recoverable from the source]
7.
An intelligence test was taken by 115 candidates. For each candidate the time taken to complete the test was recorded, and the times were summarised in a histogram (see diagram). Write down the frequency for each of the class intervals 0-1, 1-2, 2-3, 3-5 and 5-10 minutes. Calculate estimates of the mean and standard deviation of the times taken to complete the test. (C)

Solution 1.24

Frequency = frequency density × interval width.

Note that the interval 2-3, for example, represents 2 ≤ time < 3.

To calculate estimates for the mean and standard deviation, use mid-interval values, $x$.

$\bar{x} = \dfrac{\sum fx}{\sum f} = \dfrac{437.5}{115} = 3.8$ (2 s.f.)

$s = \sqrt{\dfrac{\sum fx^2}{\sum f} - \bar{x}^2} = \sqrt{\dfrac{2238.75}{115} - \bar{x}^2} = 2.2$ (2 s.f.)

The mean time is 3.8 minutes and the standard deviation is 2.2 minutes.

[You could have calculated these directly using the calculator in SD mode. Check them yourself.]

If you are given summary information, rather than the raw data or frequency distribution, you cannot use the calculator in SD mode. You will have to use the formulae to calculate the mean and standard deviation, as in the following example.

Example 1.25
(a) Cartons of orange juice are advertised as containing 1 litre. A random sample of 100 cartons gave the following results for the volume, $x$.
$\sum x = 101.4$, $\sum x^2 = 102.83$
Calculate the mean and the standard deviation of the volume of orange juice in these 100 cartons.
8.
(b) A machine is supposed to cut lengths of rod 50 cm long. A sample of 20 rods gave the following results for the length, $x$.
$\sum fx = 997$, $\sum fx^2 = 49\,711$
(i) Calculate the mean length of the 20 rods.
(ii) Calculate the variance of the lengths of the 20 rods. State the units of the variance in your answer.

Solution 1.25

(a) $\sum x = 101.4$, $\sum x^2 = 102.83$, $n = 100$

$\bar{x} = \dfrac{\sum x}{n} = \dfrac{101.4}{100} = 1.014$

The mean volume is 1.014 litres.

$s = \sqrt{\dfrac{102.83}{100} - 1.014^2} = 0.0101\ldots$

The standard deviation of the volume is 0.010 litres (2 s.f.).

(b) $\sum fx = 997$, $\sum fx^2 = 49\,711$, $\sum f = 20$

(i) $\bar{x} = \dfrac{\sum fx}{\sum f} = \dfrac{997}{20} = 49.85$

The mean length of the rods is 49.85 cm.

(ii) Variance $= \dfrac{\sum fx^2}{\sum f} - \bar{x}^2 = \dfrac{49\,711}{20} - 49.85^2 = 0.5275$

The variance is 0.5275 cm².

Exercise 1f Mean and standard deviation

1. Do not use the statistical program on your calculator for this question.
(i) For each of the following sets of numbers, calculate the mean and the standard deviation. Try using both forms of the formula for the standard deviation in parts (a) to (c). In parts (d) to (f) choose one of the methods.
(a) 2, 4, 5, 6, 8
(b) 6, 8, 9, 11
(c) 17, 14, 17, 23, 29
(d) 5, 13, 7, 9, 16, 15
(e) 4.6, 2.7, 3.1, 0.5, 6.2
(f) 200, 203, 206, 207, 209
(ii) Now check your answers using your calculator in SD (STAT) mode.

2. The table shows the weekly wages in £ of each of 100 factory workers.
(a) Draw a histogram to illustrate this information.
(b) Calculate the mean wage and the standard deviation.
9.
3. Do this question (a) without using SD mode, (b) using SD mode on your calculator.
The score for a round of golf for each of 50 club members was noted. Find the mean score for a round and the standard deviation.

Score, $x$      Frequency, $f$
66               2
67               5
68              10
69              12
70               9
71               6
72               4
73               2

4. The scores in an IQ test for 50 candidates are shown in the table. Find the mean score and the standard deviation.

Score        Frequency
100-106      [illegible]
107-113      13
114-120      24
121-127      [illegible]
128-134      4

5. The stemplot shows the times, recorded to the nearest second, of 12 people in a race. Calculate the mean time and the standard deviation.

Stem   Leaf
1      2 3
1      5 5 6
1      7 9 9
2      0 1
Key: 1 | 5 means 15 seconds
[Stemplot incomplete in the source: only 10 of the 12 leaves are legible]

6. A vertical line graph for a set of data is shown below. Calculate the mean and standard deviation of the data.
[Vertical line graph; not recoverable from the source]

7. The following table shows the duration of 40 telephone calls from an office via the switchboard.
(a) Obtain an estimate of the mean length of a telephone call and the standard deviation.
(b) Illustrate the data graphically.
[Table: duration in minutes against number of calls; entries illegible in the source]
(O & C)

8. For a set of ten numbers $\sum x = 290$ and $\sum x^2 = 8469$. Find the mean and the variance.

9. For a set of nine numbers $\sum (x - \bar{x})^2 = 234$. Find the standard deviation of the numbers.

10. For a set of nine numbers $\sum (x - \bar{x})^2 = 60$ and $\sum x^2 = 285$. Find the mean of the numbers.

11. A group of 20 people played a game. The table below shows the frequency distribution of their scores.
[Table of scores, including an unknown score $x$, and their frequencies; entries illegible in the source]
Given that the mean score is 5, find
(a) the value of $x$,
(b) the variance of the distribution. (C Additional)

12. From the information given about each of the following sets of data, work out the missing values in the table:
[Table with columns $n$, $\sum x$, $\sum x^2$, $\bar{x}$, $s$ and rows (a) to (d); entries illegible in the source]

13. At a bird observatory, migrating willow warblers are caught, measured and ringed before being released. The histogram below illustrates the lengths, in millimetres, of the willow warblers caught during one migration season.
10.
[Histogram: lengths of willow warblers, horizontal axis Length (mm); other details not recoverable from the source]

(a) Explain how the histogram shows that the total number of willow warblers caught at the observatory during the migration season is 118.
(b) State briefly how it may be deduced from the histogram (without any calculation) that an estimate of the mean length is 111 mm. Explain briefly why this value may not be the true mean length of the willow warblers caught.
(c) Given that the lengths, $x$ mm, of the willow warblers caught during this migration season were such that $\sum x = 13\,099$ and $\sum x^2 = 1\,455\,506$, calculate the standard deviation of the lengths. (C)

14. For a particular set of observations … = 20, … the values of the … [remainder illegible in the source]

15. For a given frequency distribution $\sum f(x - \bar{x})^2 = 182.3$, $\sum fx^2 = 1025$, $\sum f = 30$. Find the mean of the distribution.

16. The speeds of cars passing a speed camera are shown in the histogram. Calculate estimates of the mean speed and the standard deviation.
[Histogram: speeds of cars, horizontal axis Speed (m.p.h.); other details not recoverable from the source]

Calculations involving the mean and standard deviation

Example 1.26
(a) Calculate the mean and the standard deviation of the four numbers 2, 3, 6, 9.
(b) Two numbers, $a$ and $b$, are to be added to this set of four numbers, such that the mean is increased by 1 and the variance is increased by 2.5. Find $a$ and $b$. (C Additional)