Use_of_computer_in_data_analysis.ppt


  1. 1. Use of computer in data analysis DR SA Balogun ACC HOU Research
  2. 2. Introduction • A computer is an electronic device that accepts data in a prescribed format, processes and stores the data and the results, and releases the information in a prescribed format. • In recent times it has become imperative for road safety practitioners to have a working knowledge of computers. • This is because road safety data are often voluminous and have more than two dimensions, which neither the human brain nor ordinary statistical methods can cope with.
  3. 3. Fig. 36: Components of a computer (diagram)
      Hardware
        CPU: CU, MM; RAM (read/write memory, loses its data when the computer is switched off); ROM (cannot be altered); EPROM (Erasable Programmable ROM); EAROM
        Peripherals
          Auxiliary memory: floppy disk (5¼-inch, 1.2 MB; 3½-inch, 1.44 MB, 2HD), flash, hard disk (C drive, 1.2 GB to over 80 GB), CD-ROM
          Input/output (transport), terminal: keyboard; VDU or monitor, monochrome (black and white) or colour display: CGA (Colour Graphics Adapter), EGA (Enhanced Graphics Adapter), VGA (Video Graphics Array), SVGA (Super Video Graphics Array), LCD (Liquid Crystal Display)
          Printer: dot matrix, inkjet/DeskJet, LaserJet
      Software
        Programs
        Operating systems: DOS, Windows 98, 2000, Millennium, XP, Unix, Linux
        Languages: low level and high level; FORTRAN (Formula Translation), BASIC (Beginner's All-purpose Symbolic Instruction Code), COBOL (Common Business-Oriented Language); others are Pascal, C/C++, Ada
        File extensions of programs: .EXE, .COM (command), .BAT; others are INT, SYS etc.
        Types of program: source program (raw code), object program (code translation)
        Algorithm
        Types of computer programming: system programming; application programming (Word, Excel, SPSS etc.)
  4. 4. INTRODUCTION • I shall concentrate on the last two application systems: • Excel • Statistical Package for the Social Sciences (SPSS)
  5. 5. EXCEL • The electronic spreadsheet is used for many business applications such as accounting, financial analysis, payroll processing, statistical analysis and budgeting. • It consists of rows and columns. • The intersection of a row and a column is known as a cell; a group of cells makes up a worksheet, and worksheets are kept together in a workbook. Each cell has an address. • The sum of the contents of cell C5 and a cell in column B can be written in another cell, C3. The cell displays the result, while the formula behind it remains visible when C3 is selected, which is why a spreadsheet is useful for auditing.
  6. 6. Examples of spreadsheet packages are: • Microsoft Excel • Lotus 1-2-3 • Quattro Pro, etc.
  7. 7. Operations in the spreadsheet environment • Financial operations, • mathematical and trigonometric functions, • date and time functions, • database management, • logical functions (AND, OR, NOT), as well as • statistical operations (such as average, beta distribution, chi test, confidence level, correlation and regression). • To see these, start Excel, click fx (Insert Function) on the toolbar, choose the Statistical and Financial categories and look at at least three functions.
  8. 8. Starting and closing Excel • Click the Start button, then Programs, then Excel. You can also double-click the Excel icon. • Other activities: a wide range of activities similar to those in Word can be performed in Excel. • These include entering text, numbers, dates and percentages; correcting mistakes either by overwriting or by using undo (Ctrl+Z) and redo (Ctrl+Y); and opening a workbook from the task pane, along with other uses of the task pane. • The Print dialogue box opens with default settings, so one has to change them as desired.
  9. 9. Auto fill • It saves time when copying text, numbers etc. in a spreadsheet. • To do this, click the desired cell (e.g. Jan), make sure the cursor changes to a plus sign, then drag the cursor over all the cells you wish to fill. • The other cells then read Feb, March etc. A formula can also be copied to other cells by this method. • A cell reference in the formula can be made absolute (constant) by pressing F4 after typing it, which adds $ signs as in $B$3, and confirming the formula before auto fill is carried out.
  10. 10. Column or row • Selecting a column or row: click the letter heading the column or the number heading the row. • Inserting a row or column: select the row, click Insert, then Rows. • Changing column width and row height: place the mouse on the border below the row or to the right of the column you wish to resize, then double-click.
  11. 11. Cut, copy and paste • Select the cell(s), press Ctrl+X (cut) or Ctrl+C (copy), click the destination cell, then press Ctrl+V. Microsoft Office XP has an Office Clipboard that can hold up to 24 copied items. • To use it, press Ctrl+C twice to display the clipboard, select the item to paste and choose Paste in the clipboard. • Cells can also be moved or copied to a new location by selecting them (holding down the Ctrl key in the case of a copy), positioning the cursor at the edge of the selection so that it changes to a cross sign, and dragging to the new location. • Zoom is found in the View menu.
  12. 12. Auto sum • It is an easy way to add figures. • To do this, click the empty answer cell next to the figures to be added, click the AutoSum icon, and then press Enter. • Alternatively, to do this with a formula, select the answer cell, type =SUM( then click the first cell to insert its reference, type a colon, click the last cell and press Enter. • Note that +, -, *, /, () and ^ stand for addition, subtraction, multiplication, division, brackets and raising to a power respectively. • Where they occur together, the BODMAS priority applies: brackets are evaluated first, then powers, then multiplication and division, and finally addition and subtraction.
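The priority rule can be checked outside the spreadsheet. The following is only a minimal Python sketch with hypothetical cell values (b2, b3 and b4 are not from the slides); Python writes the power operator as ** where Excel uses ^.

```python
# Hypothetical cell values, chosen only for illustration.
b2, b3, b4 = 2, 3, 4

# Excel: =B2+B3*B4   (multiplication is done before addition)
no_brackets = b2 + b3 * b4

# Excel: =(B2+B3)*B4 (brackets are evaluated first)
with_brackets = (b2 + b3) * b4

# Excel: =B2*B3^2    (the power is evaluated before multiplication)
power_first = b2 * b3 ** 2

# Excel: =SUM(B2:B4) (what AutoSum writes for a range)
total = sum([b2, b3, b4])

print(no_brackets, with_brackets, power_first, total)
```

One difference is worth keeping in mind when moving between the two: Excel applies a leading minus sign before a power, so =-2^2 gives 4, whereas Python evaluates -2 ** 2 as -4.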
  13. 13. Sheet manipulation • Inserting a sheet: right-click a sheet tab and select Worksheet from the options displayed, or select Insert from the menu and then Worksheet. • Deleting a sheet works the same way. • Renaming a sheet: double-click the desired sheet tab, type the new name and click any cell. • Moving a sheet: click the tab of the desired sheet and drag it to the new location, or select the sheet, click Edit, then Move or Copy, and select the new location. • Copying a sheet: select the sheet, hold down the Ctrl key, then click and drag the sheet to the new location. • Creating formulae across sheets: this is used to get the grand total of data. • For instance, if RTA figures by month for each year occupy one sheet each, the grand total for many years can be obtained on a final sheet by typing =SUM(, clicking the first sheet tab, holding down the Shift key and clicking the last sheet tab, then clicking the required cell and pressing Enter. • By this, the sheets are grouped together. • If the sheets are not next to each other, the Ctrl key is used instead of Shift in the above exercise.
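The cross-sheet grand total can be pictured as adding the same cell over a run of adjacent sheets. The sketch below is only an illustration in Python: the workbook dictionary, the sheet names and the figures are hypothetical, and sum_across_sheets is a made-up helper, not an Excel or library function.

```python
# Hypothetical workbook: one sheet per year, each holding monthly RTA counts by cell.
workbook = {
    "2005": {"B2": 12, "B3": 9},
    "2006": {"B2": 15, "B3": 11},
    "2007": {"B2": 10, "B3": 14},
}

def sum_across_sheets(book, first, last, cell):
    """Add one cell over a contiguous run of sheets, from first to last inclusive."""
    names = list(book)  # dicts keep insertion order, standing in for the tab order
    start, stop = names.index(first), names.index(last)
    return sum(book[name][cell] for name in names[start:stop + 1])

grand_total_b2 = sum_across_sheets(workbook, "2005", "2007", "B2")
print(grand_total_b2)
```

In Excel the equivalent for these assumed sheet names would be a 3-D reference of the form =SUM('2005:2007'!B2).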
  14. 14. Sending a spreadsheet to Word • Follow the select, copy and paste procedure; a smart tag appears after pasting. • You are at liberty to keep the source formatting, to match the destination (Word) style, or to maintain the link to Excel, in which case any change in the Excel file is automatically reflected in the Word table. • It is also possible to paste into Word with the Excel menus available for use. • This is done by selecting Paste Special from the Edit menu and then selecting Excel Worksheet Object. • Afterwards, double-click the spreadsheet within the Word document and the Excel functions appear.
  15. 15. Function • Functions can be reached from the drop-down arrow of the AutoSum icon or from the Insert menu. • Click the cell where the answer is wanted, then choose the function and click OK. • An IF function asks Excel to test whether something is true or false. • The condition is stated first, followed by a comma, then what the computer should return if it is true, another comma, and what it should return if it is false, e.g. =IF(B6>$B$3,"NO","YES"). COUNTIF works in a similar way for counting cells that meet a condition. • Operators that can be used apart from equal to (=) are <, >, <>, >= and <=. • It is sometimes necessary to nest several IFs together so that Excel picks the applicable answer, as in =IF(B2>65,"Over 65",IF(B2>45,"45-65",IF(B2>24,"25-45","Under 25"))). • In this case Excel returns the answer for the first condition that is true, and if none is true it returns "Under 25".
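The nested IF for age groups behaves like a chain of if/elif tests that stops at the first true condition. A minimal Python sketch of the same logic follows; the function name age_group is an illustrative assumption, and the cut-offs are the ones in the slide's formula.

```python
def age_group(age):
    """Mirror the nested IF: the first condition that is true decides the label."""
    if age > 65:
        return "Over 65"
    elif age > 45:
        return "45-65"
    elif age > 24:
        return "25-45"
    else:
        return "Under 25"

for age in (70, 65, 45, 24):
    print(age, age_group(age))
```

Because each test uses "greater than", a value sitting exactly on a cut-off (65, 45 or 24) falls into the lower group, which is also how the Excel formula treats it.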
  16. 16. Excel as a database • Filtering is another word for querying the database to find information. • To do this, choose AutoFilter from Filter in the Data menu; drop-down arrows appear on all the headings. • Click the arrow of the desired heading and filter with the chosen criteria. • For instance, the RTA table shown below can be filtered to show the locations of fatal accidents and their causes. • A filter can be customized to find, say, days that begin with "T".
      Location      Type of RTA  Cause  Sex of victim  Age of victim  Hour of occurrence  Season  Day  Date
      Abuja-Lokoja  F            DGD    M              45             1300                Rain    Mon  25th
      etc.          S            SPV    M              etc.           etc.                Dry     etc. etc.
      Further rows on the slide continue in the same way, with RTA types S, F and M and seasons Rain and Dry.
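Filtering amounts to keeping only the rows that satisfy a criterion. The Python sketch below illustrates this under stated assumptions: the records are hypothetical (only the first row's values and the column idea come from the slide's table), F is assumed to denote a fatal accident, and auto_filter and custom_filter are made-up helper names.

```python
# Hypothetical RTA records; "Route A" to "Route C" are invented for illustration.
records = [
    {"location": "Abuja-Lokoja", "rta_type": "F", "cause": "DGD", "day": "Mon"},
    {"location": "Route A", "rta_type": "S", "cause": "SPV", "day": "Tue"},
    {"location": "Route B", "rta_type": "F", "cause": "SPV", "day": "Thu"},
    {"location": "Route C", "rta_type": "M", "cause": "DGD", "day": "Sat"},
]

def auto_filter(rows, **criteria):
    """Keep rows whose columns equal every given value, as AutoFilter does."""
    return [r for r in rows if all(r[col] == val for col, val in criteria.items())]

def custom_filter(rows, column, prefix):
    """Keep rows whose text in the column begins with the prefix."""
    return [r for r in rows if r[column].startswith(prefix)]

# Locations and causes of fatal accidents (assuming F means fatal).
fatal_sites = [(r["location"], r["cause"]) for r in auto_filter(records, rta_type="F")]

# Days that begin with "T".
t_days = [r["day"] for r in custom_filter(records, "day", "T")]

print(fatal_sites)
print(t_days)
```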
  17. 17. Drawing a diagram • Click the Drawing toolbar, click Insert Diagram, select the desired diagram and click OK. • Enter the words into the diagram. To add a shape, resize the diagram or change it to another type, click Insert Shape, Layout or Change To respectively on the Diagram toolbar.
  18. 18. EXCEL FOR STATISTICS • A lecture on the different methods of data analysis will be given by another lecturer, but as a refresher, see the different areas of analysis below.
  19. 19. The Statistical Model
  20. 20. STATISTICS IN EXCEL
      Confidence
        Description: 50 vehicles with an average speed of 30 km/h, SD = 2.5
        Data: A2 = 0.05 (significance), A3 = 2.5 (SD), A4 = 50 (sample size)
        =CONFIDENCE(A2,A3,A4) gives 0.69, so the mean lies between 29.3 and 30.7 km/h
      Binomial (failure/success of an outcome, e.g. the probability of the next baby being male)
        Data: A2 = 6 (no crash/success at a black spot), A3 = 10 (number of trips/independent trials), A4 = 0.5 (probability of success)
        =BINOMDIST(A2,A3,A4,FALSE)
      Probability
        X (A2:A5): 0, 1, 2, 3; Probability (B2:B5): 0.2, 0.3, 0.1, 0.4
        =PROB(A2:A5,B2:B5,2) gives the probability that x = 2 (0.1)
        =PROB(A2:A5,B2:B5,1,3) gives the probability that x is between 1 and 3 (0.8)
      Frequency (how often)
        Scores (A2:A10): 79, 85, 78, 85, 50, 81, 95, 88, 97; Bins (B2:B4): 70, 79, 89
        =FREQUENCY(A2:A10,B2:B4)
      Z score (show that the sample mean will be greater than the average observation)
        Data in A2:A11 (values as printed on the slide, order uncertain: 7, 9, 6, 8, 6, 5, 4, 2, 1, 2)
        =ZTEST(A2:A11,4) gives the one-tail probability for x = 4 (0.0906)
        =2*MIN(ZTEST(A2:A11,4),1-ZTEST(A2:A11,4)) gives the two-tail probability (0.1812)
        =ZTEST(A2:A11,6) gives the one-tail probability for x = 6 (0.863)
        =2*MIN(ZTEST(A2:A11,6),1-ZTEST(A2:A11,6)) gives the two-tail probability (0.274)
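The arithmetic behind three of these functions can be reproduced with the Python standard library. This is a sketch of the underlying formulas, not of Excel's implementation: confidence, binom_pmf and prob are illustrative helper names, and the confidence margin is assumed to use the normal distribution, margin = z × SD / √n.

```python
from math import comb, sqrt
from statistics import NormalDist

def confidence(alpha, sd, n):
    """Half-width of a normal-theory confidence interval for a mean."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    return z * sd / sqrt(n)

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def prob(xs, ps, lower, upper=None):
    """Sum the probabilities of the x values lying between lower and upper."""
    if upper is None:
        upper = lower
    return sum(p for x, p in zip(xs, ps) if lower <= x <= upper)

# Confidence: 50 vehicles, mean speed 30 km/h, SD 2.5, significance 0.05.
margin = confidence(0.05, 2.5, 50)
interval = (30 - margin, 30 + margin)

# Binomial: exactly 6 successes in 10 trials with success probability 0.5.
p_six_of_ten = binom_pmf(6, 10, 0.5)

# Probability table from the slide.
xs, ps = [0, 1, 2, 3], [0.2, 0.3, 0.1, 0.4]
p_two = prob(xs, ps, 2)
p_one_to_three = prob(xs, ps, 1, 3)

print(round(margin, 2), [round(v, 1) for v in interval])
print(round(p_six_of_ten, 4), round(p_two, 1), round(p_one_to_three, 1))
```

The margin of about 0.69 reproduces the 29.3 to 30.7 km/h interval quoted on the slide.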
  21. 21. STATISTICS IN EXCEL
      Chi test (verify a hypothesis)
        Observed (A2:B4), rows Agree, Neutral, Disagree: Men 58, 11, 10; Women 35, 25, 23
        Expected (A6:B8): Men 45.35, i.e. (79 x 93/162), etc.; Women etc.
        =CHITEST(A2:B4,A6:B8)
      T test (test that two samples are from the same population)
        The data could be 1. paired (to/fro), 2. of equal variance (homoscedasticity), 3. of unequal variance (heteroscedasticity)
        Data 1 (A2:A10): 3, 4, 5, 8, 9, 1, 2, 4, 5; Data 2 (B2:B10): 6, 19, 3, 2, 14, 4, 5, 17, 1
        =TTEST(A2:A10,B2:B10,2,1) paired test = 0.196
      Correlation (compares two properties)
        Vehicle speed (A2:A6): 3, 2, 4, 5, 6; Fuel consumed (B2:B6): 9, 7, 12, 15, 17
        =CORREL(A2:A6,B2:B6) gives 0.997
      F test (tests the significance of whether two samples have different variances)
        Test score of untrained (A2:A6): 6, 7, 9, 15, 21; Test score of trained (B2:B6): 20, 28, 31, 38, 40
        =FTEST(A2:A6,B2:B6)
      Pearson (ranges from -1 to 1 and reflects a linear relationship)
        Independent, driving experience (A2:A6): 9, 7, 5, 3, 1; Dependent, number of crashes (B2:B6): 10, 6, 1, 5, 3
        =PEARSON(A2:A6,B2:B6) gives 0.699
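The correlation figures on this slide can be reproduced from the definition of the Pearson coefficient, r = Sxy / √(Sxx · Syy). The Python sketch below is an illustration of that formula rather than of Excel's code; pearson_r is an assumed helper name, and the expected count uses the slide's own 79 x 93/162 (column total times row total over grand total).

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: co-variation of x and y scaled by their spreads."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Correlation example: vehicle speed against fuel consumed.
r_speed_fuel = pearson_r([3, 2, 4, 5, 6], [9, 7, 12, 15, 17])

# Pearson example: driving experience against number of crashes.
r_exp_crash = pearson_r([9, 7, 5, 3, 1], [10, 6, 1, 5, 3])

# Chi test: expected count for men who agree.
expected_men_agree = 79 * 93 / 162

print(round(r_speed_fuel, 3), round(r_exp_crash, 3), round(expected_men_agree, 2))
```

CORREL and PEARSON compute the same coefficient, which is why one helper serves both examples.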
  22. 22. STATISTICS IN EXCEL (Regression)
      Forecast (predict the future from existing values)
        Fuel consumed, Y (A2:A6): 6, 7, 9, 15, 21; Vehicle speed, X (B2:B6): 20, 28, 31, 38, 40
        =FORECAST(30,A2:A6,B2:B6) predicts y given x = 30
      Growth (predict exponential, i.e. curved, growth using existing data)
        Location (A2:A7): 11, 12, 13, 14, 15, 16; Traffic count (B2:B7): 33,100; 47,300; 69,000; 102,000; 150,000; 220,000
        =GROWTH(B2:B7,A2:A7,A9:A10)
      Trend (use the least squares method to fit a linear value of Y for a given X)
        Month: 1 to 12; Revenue/RTA, as listed on the slide: 145,290; 144,000; 143,230; 141,890; 141,120; 139,900; 139,100; 138,130; 137,300; 135,790; 135,000; 133,890
        Formula as shown on the slide: =TREND(A2:A13,A15:A19). With the months in A2:A13 and the values in B2:B13, the full form would be =TREND(B2:B13,A2:A13,A15:A19).
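Both the forecast and the growth prediction rest on a least-squares line; the growth case fits the line to the logarithm of the counts. The Python sketch below shows that idea under stated assumptions: fit_line and growth are illustrative helper names, and the new x value 17 for the growth example is assumed, since the slide does not show what cells A9:A10 contain.

```python
from math import exp, log

def fit_line(xs, ys):
    """Least-squares slope and intercept of y on x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Forecast: fuel consumed (y) from vehicle speed (x), predicted at x = 30.
speed = [20, 28, 31, 38, 40]
fuel = [6, 7, 9, 15, 21]
slope, intercept = fit_line(speed, fuel)
fuel_at_30 = intercept + slope * 30

# Growth: fit a straight line to log(count), then transform back.
location = [11, 12, 13, 14, 15, 16]
count = [33100, 47300, 69000, 102000, 150000, 220000]
g_slope, g_intercept = fit_line(location, [log(c) for c in count])

def growth(x):
    """Exponential-curve prediction for a new x."""
    return exp(g_intercept + g_slope * x)

count_at_17 = growth(17)  # 17 is an assumed new x value

print(round(fuel_at_30, 2), round(count_at_17))
```

The fitted line gives roughly 10.6 for fuel consumed at a speed of 30, which is the kind of value FORECAST returns for this data.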
  23. 23. SPSS • The three stages in the use of SPSS in data analysis are: • 1. Data entry and data encoding • 2. Data analysis • 3. Data interpretation
  24. 24. Parametric Test
      1. Summary statistics using Frequencies
      2. Summary statistics using Descriptives
      3. Exploratory data analysis
      4. Analysis of cross-classifications using Crosstabs: to study nominal-by-nominal relationships; ordinal-by-ordinal relationships; to measure the relative risk of events; to measure agreement
      5. The Summarize procedure
      6. The Means procedure
      7. The OLAP Cubes procedure
      8. T test
         - One-sample T test: a) each machine as a separate sample; b) sample mean against a known value
         - Paired-samples T test
         - Independent-samples T test: a) determining groups in an independent-samples T test; b) testing two independent sample means; c) using a cut point to define
  25. 25. Parametric Test
      9. One-way ANOVA: testing equality of group means; performing one-way ANOVA; all possible comparisons between means; robust ANOVA
      10. GLM Univariate
      11. Partial correlation
      12. Linear regression
      13. Ordinal regression
      14. Curve estimation: to model the law of diminishing returns; to model viral growth
      15. Discriminant analysis: to assess credit risk; to classify customers
      16. Factor analysis: data reduction; structure detection
      17. Two-step cluster analysis: to classify
      18. Hierarchical cluster analysis: used to classify; used to study relationships
      19. K-means cluster analysis: to classify customers
  26. 26. Non-Parametric Test
      20. Chi-square test: a) testing independence; b) testing a specific range; c) customising expected values
         - Binomial test: a) comparing several distributions; b) using a cut point to define the samples
         - Runs test: a) examining usability of test results; b) testing multiple cut points
         - One-sample Kolmogorov-Smirnov test: a) testing goodness of fit
         - Two independent samples test: a) using Mann-Whitney to test; b) using the two-sample Kolmogorov-Smirnov test to compare distributions
         - Test for several independent samples: a) using the median test to detect group differences; b) using Kruskal-Wallis to test ordinal outcomes
         - Two related samples test: a) testing a sample median against a known value; b) using the McNemar test in a pre-post design
         - Test for several related samples: a) e.g. testing usability of a website; b) using the Friedman test on related
  27. 27.
      21. Multiple response analysis
      22. Ratio statistics
      23. ROC curve: used to evaluate performance; used to choose between competing classification schemes
      24. Measure of reliability in scale problems: used to analyze survey items; used to analyze inter-rater agreement
      25. Control chart: used to monitor; used to track the proportion of defective units
      26. Scoring data with a predictive model
      27. Select predictors: used to mine a customer database
      28. Naïve Bayes: used for prediction, selection and classification; used to classify respondents
  28. 28. Modeling
      21. Regression models option
         - Binary logistic regression: a) to assess credit risk
         - Multinomial logistic regression: a) used to profile consumers of packaged goods; b) used to classify customers; c) used to analyse a 1-1 matched case-control study
         - Nonlinear regression: a) used to model the law of diminishing returns; b) used to model viral growth
         - Probit analysis: a) used to test promotional effects on sales
         - Weight estimation: a) to model the cost of mall construction
         - Two-stage least-squares regression: a) used to model cross sales
  29. 29. Modeling
      22. Advanced models option
         - Multivariate general linear modeling: a) GLM Multivariate to profile the difference in amount spent by two groups; b) GLM Repeated Measures to measure the effect of each promotion on sales
         - Variance components: a) used to analyse product test results from multiple markets
         - Linear mixed models: a) used to analyse product test results from multiple markets; b) used to analyse repeated measurements, e.g. weight and alcohol level after a meal for 6 months; c) used to model random effects and repeated measures, e.g. banning helmets in a few states and using different enforcement strategies to find the best
  30. 30. Modeling
            d) used to fit a random coefficient model to find the change before and after treatment
         - Generalized linear models: a) to model a Poisson distribution of RT cases; b) to fit a gamma regression of insurance claims on RTC; c) to analyze interval-censored survival data of events whose time of occurrence is unknown, e.g. RTC
         - Generalized estimating equations: a) to fit a repeated measures logistic regression, e.g. the effect of an observed behavior on the subject
         - Loglinear modeling (relationship between categorical variables)
  31. 31. Modeling
            a) to model accident rate, e.g. age and gender risk factors; b) paired data: response of a subject before and after treatment; c) logit loglinear analysis to model one or more categorical variables against one or more predictors, e.g. 800 drivers asked which of 3 helmets they like best
         - Life tables: a) e.g. to find the relationship with time spent before becoming a licensed driver
         - Kaplan-Meier survival analysis: a) to study the distribution of time to an event, e.g. the time taken for a drug to affect driving
         - Cox regression: a) to model the time to a specific event based on given covariates
  32. 32. Modeling
      23. Complex samples option
         - Sampling wizard: a) obtaining a sample from a full sampling frame; b) obtaining samples from a partial frame; c) sampling with probability proportional to size (PPS)
         - Analysis wizard: a) used to ready NHIS data; b) used when sampling weights are not in the data files; c) tabulation; d) descriptives
         - Complex samples: a) frequencies; b) descriptives; c) crosstabs; d) ratios; e) GLM; f) logistic regression; g) ordinal regression
  33. 33.
      24. Trends option (forecasting and modeling)
         - Bulk forecasting with the Expert Modeler: a) using the Expert Modeler to determine significant predictors
         - Bulk forecasting by applying saved models: a) experimenting with predictors by applying saved models
         - Seasonal decomposition
         - Spectral plots
      25. Categories option
         - Categorical regression: a) e.g. effect of socio-economic factors on driving habits
         - Categorical principal components analysis: a) e.g. effect of socio-economic factors on the driving habits of different vehicle modes
         - Nonlinear canonical correlation analysis: a) e.g. to find similarities in the socio-economic factors
  34. 34.
         - Correspondence analysis: a) analysis from a crosstab
         - Multiple correspondence analysis: a) e.g. effect of socio-economic factors on the driving habits of different vehicle types in the mode
         - Multidimensional scaling
         - Multidimensional unfolding
      26. Conjoint option: used to model carpet cleaner preference, e.g. effect of socio-economic factors on driving habits
      27. Trees option: to evaluate credit risk
      28. Data preparation option
  35. 35. Scale/numeric variable within categories. What kind of display do you want?
      - Tables. What kind of summary? Descriptive: Analyze > Reports > Case Summaries, select the variable, display cases
      - Chart/graph. What kind of chart? Box: Graphs > Chart Builder > Gallery > Boxplot, drop in the variables
  36. 36. Compare groups for significant differences
      - Data in categories (ordinal): Analyze > Descriptive Statistics > Crosstabs, select the variables
      - Scale/numeric data divided into groups
         - One group, e.g. comparing violations of 100 km/h along 8 different roads; 16 values are obtained: Data > Split File, select road
         - Two groups/variables
            - One scale/numeric variable divided into two unrelated groups. Which independent-samples test do you want?
               - A test that assumes the data are normally distributed within groups, e.g. finding the accuracy of 2 types of traffic counter on the reaction time of vehicles moving at a hypothetical speed of 100 km/h; the vehicles were randomly assigned to trained and untrained drivers and made to travel an equal distance: Analyze > Compare Means > Independent-Samples T Test
               - A test that does not assume the data are normally distributed within groups: 2 Independent Samples test
            - Two scale/numeric variables. Which paired-samples test do you want?
               - Assumes both variables are normally distributed: Analyze > Compare Means > Paired-Samples T Test, select the paired variables
               - Does not assume both variables are normally distributed: Analyze > Nonparametric Tests > 2 Related Samples, select the paired variables
         - Three or more groups. How many grouping (factor) variables do you have?
            - One, e.g. revenue for three groups defined by region
               - A test that assumes the data are normally distributed within groups, e.g. the influence of speed (independent variable) on fatal and serious accident cases: Analyze > One-Way ANOVA
               - Data not normally distributed (normality is skewed): Analyze > Nonparametric Tests > K Independent Samples
            - Two or more, e.g. revenue for groups defined by division within each region: Analyze > GLM > Univariate; put the scale variable as the dependent variable
  37. 37. Identify significant relationships between variables. What kind of data do you have?
      - Data in categories (nominal/ordinal): Analyze > Descriptive Statistics > Crosstabs
      - Ordinal/rank order or non-normal: Analyze > Correlate > Bivariate, Spearman/Kendall
      - Scale/numeric (interval/ratio). How many variables do you want to evaluate?
         - Two (or multiple pairs of variables)
            - Tables and numbers: Analyze > Correlate > Bivariate, Pearson, e.g. compare driving experience with causes of accidents such as indiscipline and speed
            - Charts and graphs. How many pairs of variables do you want to look at?
               - One: Graphs > Chart Builder > Gallery > Scatter
               - Two or more pairs of variables: Graphs > Chart Builder > Scatter
         - Two, controlling for the effects of one or more additional variables: Analyze > Correlate > Partial
         - Exactly three (3D scatterplot): Graphs > Chart Builder > Gallery > Scatter, 3D
         - One dependent variable and two or more independent (predictor) variables: Analyze > Regression > Linear; select a scale variable as dependent and two or more scale variables as independent
         - Ordinal dependent and scale or categorical independent variables: Analyze > Regression > Ordinal; select the ordinal dependent variable, the categorical variables as factors and the others as covariates
         - Multivariate
  38. 38. Identify groups of similar cases. What kind of data do you have?
      - Scale, numeric (interval/ratio)
         - Identify groups
            - Fewer than 200 cases: Analyze > Classify > Hierarchical Cluster, select all the scale variables to use
            - 200 or more cases: Analyze > Classify > K-Means Cluster, select the scale variables to be used, specify the number of clusters etc.
         - Identify characteristics of known groups: Analyze > Classify > Discriminant; select the categorical grouping variable, define the range to specify the categories of interest, then select the scale independent variables (number of crashes and experience, age, income etc.)
      - Categorical (nominal/ordinal) or a mix of scale and categorical variables to use in cluster analysis
      - Identify groups of similar variables: Analyze > Data Reduction > Factor, select the scale variables for factor analysis
  39. 39. TESTING • This involves hypothesizing. • Usually a hypothesis of no significant difference (Ho) is set. • Ho is rejected if the calculated value is more than the table value at the 95% confidence level, i.e. the 0.05 (5%) level of significance (except in one non-parametric test known as the Wilcoxon test). • Rejection of the hypothesis means that the result is unlikely to be due to chance alone. • The two levels used in hypothesizing are the confidence level and the level of significance.
  40. 40. Testing These are the levels within which our errors are constrained. They also compare figures from the sample to the population. Although the two levels are used interchangeably, they are nonetheless different. While the confidence level is expressed as 90%, 95% or 99%, the level of significance is expressed as 0.10, 0.05 or 0.01 respectively. A 99% confidence level, or 0.01 level of significance, means accepting a 1 in 100 chance of wrongly rejecting Ho, i.e. being right 99 times in 100.
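The link between the two levels, and the rule of comparing a calculated value with a table value, can be sketched in Python. This is an illustration under stated assumptions: it uses a two-tailed z test, so the table value comes from the standard normal distribution, and significance_level, critical_z and reject_h0 are made-up helper names; other tests use other tables.

```python
from statistics import NormalDist

def significance_level(confidence_percent):
    """Level of significance that pairs with a confidence level, e.g. 95 -> 0.05."""
    return round(1 - confidence_percent / 100, 10)

def critical_z(confidence_percent):
    """Two-tailed table value of z for the given confidence level."""
    alpha = 1 - confidence_percent / 100
    return NormalDist().inv_cdf(1 - alpha / 2)

def reject_h0(calculated_z, confidence_percent=95):
    """Reject Ho when the calculated value exceeds the table value."""
    return abs(calculated_z) > critical_z(confidence_percent)

for level in (90, 95, 99):
    print(level, significance_level(level), round(critical_z(level), 3))

print(reject_h0(2.5), reject_h0(1.2), reject_h0(2.5, 99))
```

A calculated value of 2.5 leads to rejection at the 95% level but not at the 99% level, which shows how a stricter level narrows the room for error.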
  41. 41. Conclusion • Both Excel and SPSS are useful in data analysis. • The way data are arranged in SPSS is different from that of Excel. • While SPSS data do not require any preliminary summary, Excel data sometimes require preliminary summaries. • The explanations in this paper do not fully cover everything about the two application programmes. • The more each of you continues to practise analysis with the above software, the more versatile you become in it: PRACTICE MAKES PERFECT. • Thank you
