Analyzing Quantitative Data

A Simple Guide to the Analysis of
Quantitative Data

An Introduction with hypotheses,
illustrations and references

By

Paul Andrew Bourne

Health Research Scientist, the University of the West Indies,
Mona Campus

Department of Community Health and Psychiatry
Faculty of Medical Sciences
The University of the West Indies, Mona Campus, Kingston, Jamaica

© Paul Andrew Bourne 2009

A Simple Guide to the Analysis of Quantitative Data: An Introduction with hypotheses,
illustrations and references

The copyright of this text is vested in Paul Andrew Bourne, and the Department of Community
Health and Psychiatry is the publisher. No chapter may be reproduced wholly or in part without
the express written permission of both author and publisher.

All rights reserved. Published April, 2009

The University of the West Indies, Mona Campus, Kingston, Jamaica.

National Library of Jamaica Cataloguing in Publication Data

A catalogue record for this book is available from the National Library of Jamaica

ISBN 978-976-41-0231-1 (pbk)

The covers were designed and the photograph taken by Paul Andrew Bourne

Table of Contents

Preface
Menu bar – Contents of the Menu bar in SPSS
Function – Purposes of the different things on the menu bar
Mathematical symbols (numeric operations) in SPSS
Listing of Other Symbols
The whereabouts of some SPSS functions, or commands
Disclaimer
Coding Missing Data
Computing Date of Birth
List of Figures
List of Tables
How do I obtain access to the SPSS PROGRAM?

1. INTRODUCTION
1.1.0a Steps in the analysis of hypothesis
1.1.1a Operational definitions of a variable
1.1.1b Typologies of variable
1.1.1 Levels of measurement
1.1.3 Conceptualizing descriptive and inferential statistics

2. DESCRIPTIVE STATISTICS ANALYZED
2.1.1 Interpreting data based on their levels of measurement
2.1.2 Treating missing (i.e. non-response) cases

3. HYPOTHESES: INTRODUCTION
3.1.1 Definitions of Hypotheses
3.1.2 Typologies of Hypothesis
3.1.3 Directional and non-Directional Hypotheses
3.1.4 Outliers (i.e. skewness)
3.1.5 Statistical approaches for treating skewness

4. Hypothesis 1 [using Cross tabulations and Spearman ranked ordered correlation]

A1. Physical and social factors and instructional resources will directly influence the
academic performance of students who will write the Advanced Level Accounting
Examination;
A2. Physical and social factors and instructional resources positively influence the
academic performance of students who write the Advanced level Accounting
examination and that the relationship varies according to gender;

B1. Pass successes in Mathematics, Principles of Accounts and English Language at the
Ordinary/CXC General level will positively influence success on the Advanced level
Accounting examination;
B2. Pass successes in Mathematics, Principles of Accounts and English Language at the
Ordinary.

5. Hypothesis 2 [using Crosstabulations]

There is a relationship between religiosity, academic performance, age and marijuana
smoking among post-primary school students, and this relationship varies based on
gender.

6. Hypothesis 3 [Paired Sample t-test]

There is a statistical difference between the pre-Test and the post-Test scores.

7. Hypothesis 4 [using Pearson Product Moment Correlation]

Ho: There is no statistical relationship between expenditure on social programmes (i.e.
public expenditure on education and health) and levels of development in a country; and
H1: There is a statistical association between expenditure on social programmes (i.e.
public expenditure on education and health) and levels of development in a country

8. Hypothesis 5 [using Logistic Regression]

The health care seeking behaviour of Jamaicans is a function of educational level,
poverty, union status, illnesses, duration of illnesses, gender, per capita consumption,
ownership of health insurance policy, and injuries. [Health Care Seeking Behaviour =
f(educational levels, poverty, union status, illnesses, duration of illnesses, gender, per
capita consumption, ownership of health insurance policy, injuries)]

9. Hypothesis 6 [using Linear Regression]

There is a negative correlation between access to tertiary level education and
poverty, controlled for sex, age, area of residence, household size, and educational level
of parents

10. Hypothesis 7 [using Pearson Product Moment Correlation Coefficient and
Crosstabulations]

There is an association between the introduction of the Inventory Readiness Test and
the Performance of Students in Grade 1

11. Hypothesis 8 [using Spearman rho]

People who perceive themselves to be in the upper and middle classes are more likely
than those in the lower (or working) class to strongly believe that acts of
incivility are caused only by persons in garrison communities

12. Hypothesis 9

Various cross tabulations

13. Hypothesis 10 [using Pearson and Crosstabulations]

There is no statistical difference between the typology of workers in the construction
industry and how they view the top 10 productivity outcomes

14. Hypothesis 11 [using Crosstabulations and Linear Regression]

Determinants of the academic performance of students

15. Hypothesis 12 [using Spearman ranked ordered correlation]

People who perceive themselves to be within the lower social status (i.e. class) are
more likely to be uncivil than those of the upper classes.

16. Data Transformation

Recoding
Dummying variables
Summing similar variables
Data reduction

Glossary

Reference

Appendices
Appendix 1 – Labeling non-responses
Appendix 2 – Statistical errors in data
Appendix 3 – Research Design
Appendix 4 – Example of Analysis Plan
Appendix 5 – Assumptions in regression
Appendix 6 – Steps in running a bivariate cross tabulation
Appendix 7 – Steps in running a trivariate cross tabulation
Appendix 8 – What is placed in a cross tabulations table, using the above SPSS output
Appendix 9 – How to run a Regression in SPSS
Appendix 10 – Running Regression in SPSS
Appendix 11a – Interpreting strength of associations
Appendix 11b – Interpreting strength of association
Appendix 12 – Selecting cases
Appendix 13 – ‘Undo’ selecting cases
Appendix 14 – Weighting cases
Appendix 15 – ‘Undo’ weighting cases
Appendix 15 – Statistical symbolisms
Appendix 16 – Converting from ‘string’ to ‘numeric’ data
    Apparatus One – Converting from string data to numeric data
    Apparatus Two – Converting from alphabetic and numeric data to all ‘numeric’ data
Appendix 17 – Steps in running Spearman rho
Appendix 18 – Steps in running Pearson’s Product Moment Correlation
Appendix 19 – Sample sizes and their appropriate sampling error
Appendix 20 – Calculating sample size from sampling error(s)
Appendix 21 – Sample sizes and their sampling errors
Appendix 22 – Sample sizes and their sampling errors
Appendix 23 – If conditions
Appendix 24 – The meaning of ρ value
Appendix 25 – Explaining Kurtosis and Skewness
Appendix 26 – Sampled Research Papers

PREFACE

One of the complexities for many undergraduate students and first-time researchers is how
to blend their socialization with the systematic rigours of scientific inquiry. For some, the
socialization process has embedded in them hunches, faith, family authority and even
‘hearsay’ as acceptable modes of establishing the existence of certain phenomena. These are
not principles or approaches rooted in academic theorizing or critical thinking. Despite
overwhelming scientific evidence gathered through empiricism, some perspectives that
students hold are difficult to change even after they have been falsified, as students still want
to hold ‘true’ to their previous ways of gaining knowledge. Even when time clearly shows those
beliefs to be obsolete or even ‘mythological’, students tend to adhere to information that they
garnered in their early socialization. The difficulty with objectivism is not the ‘truths’ that it
claims to provide, or how we must relate to these realities; it is how young researchers
abandon their preferred socialization in favour of research findings. Furthermore, the difficulty
for humans, and even more so for upcoming scholars, is how to reconcile their socialization with
research findings in the presence of empiricism.
Against this background, social researchers must understand that ethics
must govern the reporting of their findings, irrespective of the results and of their own value
systems. Ethical principles, in social or natural research, are not ‘good’ because of their inherent
construction, but because they protect the subjects (participants) from researchers
who may think the study’s contribution outweighs any harm that the interviewees may
suffer from taking part in the study. Then, there is the issue of confidentiality, which sometimes
might conflict with the personal situations faced by the researcher. Put simply, I
suggest that which takes precedence depends on the code of conduct that guides the profession.
Hence, undergraduate students should be made aware that findings must
be reported without any form of alteration. This then gives rise to the question, ‘How do we
systematically investigate social phenomena?’

The age-old debate over the correctness of quantitative versus qualitative research
will not be explored in this work, as such a debate is obsolete and rehashing it here would be
pointless. Instead, this textbook offers illustrations of how to analyze
quantitative data without including any qualitative interpretation techniques. I believe that the
problems students face in interpreting statistical (i.e. quantitative) data must be
addressed, as the complexities are many but can be overcome in a short time with assistance.

My rationale for using ‘hypotheses’ as the premise upon which to build an analysis is
embedded in the logic of how to explore social or natural happenings. I know that
hypothesis testing is not the only approach to examining current germane realities, but it is
one that uses more ‘pure’ science techniques than other approaches.

Hypothesis testing is not simply about the null hypothesis, Ho (no statistical relationship),
or the alternative hypothesis, Ha; it is a systematic approach to the investigation of observable
phenomena. To help undergraduate students recognize the rich annals of
hypothesis testing and how paramount it is to the discovery of social facts, I
recommend that we begin by reading Thomas S. Kuhn (The Structure of Scientific Revolutions),
Emile Durkheim (the study on suicide), W.E.B. Du Bois (The Philadelphia Negro) and the
works of Garth Lipps, which clearly depict the knowledge base garnered from its usage.

In writing this book, I tried not to assume that readers have grasped the intricacies of
quantitative data analysis; as such, I have provided the apparatus and the solutions that are
needed in analyzing data from stated hypotheses. The purpose of this approach is for junior
researchers to thoroughly understand the materials while recognizing the importance of
hypothesis testing in scientific inquiry.

Paul Andrew Bourne, Dip Ed, BSc, MSc, PhD
Health Research Scientist
The University of the West Indies
Mona-Jamaica.

ACKNOWLEDGEMENT

This textbook would not have materialized without the assistance of a number of people
(scholars, associates, and students) who took time from their busy schedules to guide,
proofread and make invaluable suggestions on the initial manuscript. Some of the individuals
who offered their help include Drs. Ikhalfani Solan, Samuel McDaniel and Lawrence
Nicholson, who proofread the manuscript and made suggestions as to its appropriateness,
simplicity and reach to those it intends to serve. Furthermore, Mr. Maxwell S. Williams is
largely responsible for planting the idea in my mind for a book of this nature. Special thanks
must be extended to Mr. Douglas Clarke, an associate, who directed my thoughts in times of
frustration and bewilderment, and on occasion gave me insight into the material and how it
could be made better for the students.

In addition, I would like to extend my heartiest appreciation to Professor Anthony
Harriott and Dr. Lawrence Powell, both of the Department of Government, UWI, Mona-Jamaica,
who are my mentors and who provided me with guidance and scope for the material,
and also offered their expert advice on the initial manuscript.

Also, I would like to take this opportunity to acknowledge all the students of
Introduction to Political Science (GT24M) of the class of 2006/07 who used the introductory
manuscript and made suggestions for its improvement, in particular Ms. Nina Mighty.

Menu Bar

Content:

A social researcher should not only be cognizant of the statistical techniques and modalities of
his/her discipline, but also needs a comprehensive grasp of the various
functions within the ‘menu’ of the SPSS program. What does the ‘menu bar’ contain, and what
are the functions of its contents?

The ‘menu bar’ contains the following:

- File
- Edit
- View
- Data
- Transform
- Analyze
- Graphs
- Utilities
- Add-ons
- Window
- Help

The functions of the various contents of the ‘menu bar’ are explored below.

Box 1: Menu Function

Menu Bar

Functions: Purposes of the different things on the menu bar

File – This menu deals with the different functions associated with files, such as (i) opening …,
(ii) reading …, (iii) saving …, and (iv) exiting.

Edit – This menu stores functions such as (i) copying, (ii) pasting, (iii) finding, and (iv)
replacing.

View – Within this menu lie functions that are screen related.

Data – This menu operates several functions, such as (i) defining, (ii) configuring, (iii)
entering data, (iv) sorting, (v) merging files, (vi) selecting and weighting cases, and
(vii) aggregating files.

Transform – This menu is concerned with previously entered data, including (i) recoding,
(ii) computing, (iii) reordering, and (iv) addressing missing cases.

Analyze – This menu houses all forms of data analysis apparatus, available with a simple click
on the Analyze command.

Graphs – Creation of graphs or charts can begin with a click on the Graphs command.

Utilities – This menu deals with sophisticated ways of making complex data operations easier, as
well as simply viewing the description of the entered data.

MATHEMATICAL SYMBOLS (NUMERIC OPERATIONS) IN SPSS

NUMERIC OPERATIONS    FUNCTIONS

+                     Add
-                     Subtract
*                     Multiply
/                     Divide
**                    Raise to a power
()                    Order of operations
<                     Less than
>                     Greater than
<=                    Less than or equal to
>=                    Greater than or equal to
=                     Equal
~=                    Not equal to
&                     And: both relations must be true
|                     Or: either relation may be true
~                     Negation: true becomes false, false becomes true

Box 2: Mathematical symbols and their Meanings
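
For readers who also work outside SPSS, the operators in Box 2 can be mirrored in a general-purpose language. The following is a minimal sketch in Python, not SPSS syntax; the variable names (age, income) and the helper is_eligible are hypothetical and serve only to show how arithmetic, relational and logical operators combine.

```python
# Python counterparts of the SPSS operators in Box 2 (illustrative only).
# SPSS `~=` corresponds to Python `!=`, `&` to `and`, `|` to `or`, `~` to `not`.

def is_eligible(age, income):
    """Hypothetical rule: aged 18 to 64, or any non-zero income."""
    working_age = (age >= 18) and (age < 65)   # SPSS: (age >= 18) & (age < 65)
    has_income = income != 0                   # SPSS: income ~= 0
    return working_age or has_income           # SPSS: working_age | has_income

# Arithmetic: parentheses set the order of operations, ** raises to a power.
squared_deviation = (7 - 4) ** 2     # subtraction first, then the power
without_parentheses = 7 - 4 ** 2     # the power is evaluated before the subtraction
mean_of_two = (10 + 20) / 2

# Negation turns true into false and false into true.
negated = not is_eligible(30, 0)
```

The two arithmetic lines differ only in their parentheses, which is the point of the ‘order of operations’ row in Box 2.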

LISTING OF OTHER SYMBOLS

SYMBOLS                          MEANINGS

YRMODA (i.e. yr., month, day)    Date of birth (e.g. 1968, 12, 05)
a                                Y intercept
b                                Coefficient of slope (or regression)
f                                Frequency
n                                Sample size
N                                Population
R                                Coefficient of correlation, Spearman’s
r                                Coefficient of correlation, Pearson’s
Sy                               Standard error of estimate
W or Wt                          Weight
µ                                Mu or population mean
β                                Beta coefficient
3 or χ                           Measure of skewness
∑                                Summation
σ                                Standard deviation
χ2                               Chi-square; this is the value used to test for
                                 goodness of fit
CC                               Coefficient of Contingency
fa                               Frequency of class interval above modal group
fb                               Frequency of class interval below modal group
X                                A single value or variable
R̄ (R with a bar above)           Adjusted r, which is the coefficient of correlation
                                 corrected for the number of cases
X̄ or Ȳ (X or Y with a bar)       Arithmetic mean of X or Y
RND                              Round off to the nearest integer
SYSMIS                           This denotes system-missing values
MISSING                          All missing values
Type I Error                     Claiming that events are related (or means are
                                 different) when they are not
Type II Error                    Assuming that events are not related (or means are
                                 not different) when they are
Φ                                Phi coefficient
r2                               The proportion of variation in the dependent variable
                                 explained by the independent variable(s)
P(A)                             Probability of event A
P(A/B)                           Probability of event A given that event B has happened
CV                               Coefficient of variation
SE                               Standard error
O                                Observed frequency
X                                Independent (explanatory, predictor) variable in
                                 regression
Y                                Dependent (outcome, response, criterion) variable in
                                 regression
df                               Degree of freedom
t                                Symbol for the t ratio (the critical ratio that follows
                                 a t distribution)
R2                               Squared multiple correlation in multiple regression
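
Several of the symbols listed above are simple functions of one another. The sketch below, written in Python rather than SPSS, computes a few of them for a small made-up sample. The data values are hypothetical, and two readings are assumptions rather than statements from the listing: SE is taken to mean the standard error of the mean, and the standard deviation is the sample version with n - 1 in the denominator.

```python
import math
import statistics

scores = [52, 61, 58, 70, 64]        # hypothetical sample

n = len(scores)                      # n: sample size
mean = statistics.mean(scores)       # arithmetic mean of X
sd = statistics.stdev(scores)        # sample standard deviation (divides by n - 1)
cv = sd / mean                       # CV: coefficient of variation
se = sd / math.sqrt(n)               # SE: standard error of the mean (assumed reading)
df = n - 1                           # df: degrees of freedom for one sample mean

print(f"n={n} mean={mean} sd={sd:.3f} cv={cv:.3f} se={se:.3f} df={df}")
```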

FURTHER INFORMATION ON TYPE I and TYPE II Error

                                      The real world:
                                      the null hypothesis is really …
Finding from your survey              True              False

You found that the null
hypothesis is:            True        No Problem        Type 2 Error
                          False       Type 1 Error      No Problem
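
The four cells of the table above can be read as a small decision rule. The following Python sketch is only an illustration of that table; the function name classify_decision and its arguments are hypothetical. ‘Rejecting’ the null hypothesis corresponds to the survey finding that it is false.

```python
def classify_decision(null_is_true, null_rejected):
    """Return the cell of the Type I / Type II table for one study."""
    if null_is_true and null_rejected:
        return "Type 1 Error"    # the survey finds a relationship that is not there
    if not null_is_true and not null_rejected:
        return "Type 2 Error"    # the survey misses a relationship that is there
    return "No Problem"          # the finding agrees with the real world

# Walk through all four combinations of real world and survey finding.
for truth in (True, False):
    for rejected in (False, True):
        print(truth, rejected, classify_decision(truth, rejected))
```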

THE WHEREABOUTS OF SOME SPSS FUNCTIONS

Functions or Commands                   Whereabouts in SPSS (the process in arriving at
                                        various commands)

Mean, mode, median, standard            Analyze → Descriptive Statistics → Frequencies →
deviation, skewness or kurtosis,        Statistics
range, minimum or maximum

Chi-square                              Analyze → Descriptive Statistics → Crosstabs

Pearson’s Moment Correlation            Analyze → Correlate → Bivariate

Spearman’s rho                          Analyze → Correlate → Bivariate
                                        (ensure that you deselect Pearson’s, and select
                                        Spearman’s rho)

Linear Regression                       Analyze → Regression → Linear

Logistic Regression                     Analyze → Regression → Binary Logistic

Discriminant Analysis                   Analyze → Classify → Discriminant

Mann-Whitney U Test                     Analyze → Nonparametric Tests → 2 Independent Samples

Independent-samples t-test              Analyze → Compare Means → Independent-Samples T Test

Wilcoxon matched-pairs test or          Analyze → Nonparametric Tests → 2 Related Samples
Wilcoxon signed-rank test

t-test                                  Analyze → Compare Means

Paired-samples t-test                   Analyze → Compare Means → Paired-Samples T Test

One-sample t-test                       Analyze → Compare Means → One-Sample T Test

One-way analysis of variance            Analyze → Compare Means → One-Way ANOVA

Factor Analysis                         Analyze → Data Reduction → Factor

Descriptive (for a single metric        Analyze → Descriptive Statistics → Descriptives
variable)

Graphs: pie chart, bar charts,          Graphs → (select the appropriate type)
histogram

Scatter plots                           Graphs → Scatter…

Weighting cases                         Data → Weight Cases… → select ‘Weight cases by’

Selecting cases                         Data → Select Cases… → ‘If condition is satisfied’ → If…

Replacing missing values                Transform → Replace Missing Values…

Box 3: The whereabouts of some SPSS Functions

Disclaimer

I am a trained Demographer and, as such, I have undertaken an extensive review of
various aspects of the SPSS program. However, I would like to make it unequivocally clear
that this text does not represent the SPSS (Statistical Product and Service Solutions, formerly
Statistical Package for the Social Sciences) brand. This text is not sponsored or approved by
SPSS, and so any errors in it are not the responsibility of the brand.
SPSS is a registered trademark of SPSS Inc. If you need more
information on the SPSS program or other related products, enquiries may be forwarded to:
SPSS UK Ltd., First Floor, St. Andrews House, West Street, Woking GU21 1EB, United
Kingdom.

Coding Missing Data

The coding of data for survey research is not limited to responses, as we also need to code
missing data. Several codes indicate missing values, and the researcher should know them
and the context in which each applies in the coding process. ‘No answer’ in a survey
indicates something apart from the respondent’s refusal to answer or failure to remember an
answer. The fundamental issue here is that there is no information for the respondent, as the
information is missing.

Table: Missing Data codes for Survey Research

Question                              Refused answer   Didn’t know answer   No answer recorded
Less than 6 categories                7                8                    9
More than 7 and less than 3 digits    97               98                   99
More than 3 digits                    997              998                  999

Note

Less than 6 categories – when a question is asked of a respondent, the options (or responses)
may be many. In this case, if the question has 6 options or fewer, refusal can be coded 7,
didn’t know 8, and no answer 9.

Some researchers do not make a distinction between the missing categories, and 999 (or 99) is
used in all cases of missing values.
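
To make the coding scheme concrete, the sketch below separates substantive answers from coded missing values before any statistic is computed. It is written in Python rather than SPSS and uses the 7/8/9 codes from the table above. The response values and the helper split_valid_and_missing are hypothetical.

```python
# Missing-data codes for a question with fewer than 6 categories (from the table).
MISSING_CODES = {7: "refused", 8: "did not know", 9: "no answer recorded"}

def split_valid_and_missing(responses, missing_codes=MISSING_CODES):
    """Separate substantive answers from coded missing values."""
    valid = [r for r in responses if r not in missing_codes]
    missing = {}
    for r in responses:
        if r in missing_codes:
            reason = missing_codes[r]
            missing[reason] = missing.get(reason, 0) + 1   # count each reason
    return valid, missing

responses = [1, 3, 7, 2, 9, 9, 5, 8]    # made-up answers on a 1-5 scale
valid, missing = split_valid_and_missing(responses)
print(valid)      # substantive answers only
print(missing)    # how often each missing reason occurred
```

Keeping the three reasons apart, rather than collapsing them into a single 99 or 999, preserves the distinction between refusal, not knowing, and an unrecorded answer.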

Computing Date of Birth – If you are only given year of birth

Step 1

First, select ‘Transform’, and then ‘Compute’.


Step 2

On selecting ‘compute variable’, SPSS displays the ‘Compute Variable’ dialogue box.


Step 3

In the ‘target variable’ box, write the word that the researcher wants to use to represent
the idea.


Step 4

If the SPSS version is later than 12.0 (i.e. 13–17), the next step is to select ‘All’ in the
‘function group’ dialogue box.

In order to convert year of birth to actual ‘age’, select ‘Xdate.Year’.


Step 5

Having selected ‘Xdate.Year’, use the arrow to take it into the ‘Numeric Expression’
dialogue box.

Replace the ‘?’ mark with the variable in the dataset.
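
The arithmetic behind these steps can be sketched outside SPSS. The Python function below assumes that age is obtained by subtracting the year of birth from a reference year, which the steps above imply but do not state. The function name age_from_birth_year and the reference_year parameter are hypothetical. Because only the year is known, the result can be off by one for respondents whose birthday has not yet passed.

```python
from datetime import date

def age_from_birth_year(birth_year, reference_year=None):
    """Approximate age in years when only the year of birth is known."""
    if reference_year is None:
        reference_year = date.today().year    # extract the year from today's date
    if birth_year > reference_year:
        raise ValueError("birth year lies after the reference year")
    return reference_year - birth_year

# A respondent born in 1968, with the data analysed in 2009.
print(age_from_birth_year(1968, reference_year=2009))
```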

LISTING OF FIGURES AND TABLES

Listing of Figures

Figure 1.1.1: Flow Chart: How to Analyze Quantitative Data?

Figure 1.1.2: Properties of a Variable.

Figure 1.1.3: Illustration of Dichotomous Variables

Figure 1.1.4: Ranking of the Levels of Measurement

Figure 1.1.5: Levels of Measurement

Figure 2.1.0: Steps in Analyzing Non-Metric Data

Figure 2.1.1: Respondents’ Gender

Figure 2.1.2: Respondents’ Gender

Figure 2.1.3: Social Class of Respondents

Figure 2.1.4: Social Class of Respondents

Figure 2.1.5: Steps in Analyzing Metric Data

Figure 2.1.6: ‘Running’ SPSS for a Metric Variable

Figure 4.1.1: Age - Descriptive Statistics

Figure 4.1.2: Gender of Respondents

Figure 4.1.3: Respondent’s parent educational level

Figure 4.1.4: Parental/Guardian Composition for Respondents

Figure 4.1.5: Home Ownership of Respondent’s Parent/Guardian

Figure 4.1.6: Respondents’ Affected by Mental and/or Physical Illnesses

Figure 4.1.7: Suffering from mental illnesses

Figure 4.1.8: Affected by at least one Physical Illnesses

Figure 4.1.9: Dietary Consumption for Respondents

Figure 6.1.2: Typology of Previous School

Figure 6.1.3: Skewness of Examination i (i.e. Test i)

Figure 6.1.4: Skewness of Examination ii (i.e. Test ii)

Figure 6.1.5: Perception of Ability

Figure 6.1.6: Self-perception

Figure 6.1.7: Perception of task

Figure 6.1.8: Perception of utility

Figure 6.1.9: Class environment influence on performance

Figure 6.1.10: Perception of Ability



Figure 6.1.13: Perception of task

Figure 6.1.14: Perception of Utility

Figure 6.1.15: Class Environment influence on Performance

Figure 7.1.1: Frequency distribution of total expenditure on health as % of GDP

Figure 7.1.2: Frequency distribution of total expenditure on education as % of GNP

Figure 7.1.3: Frequency distribution of the Human Development Index

Figure 7.1.4: Running SPSS for social expenditure on social programme

Figure 7.1.5: Running bivariate correlation for social expenditure on social programme

Figure 7.1.6: Running bivariate correlation for social expenditure on social programme

Figure 13.1.1: Categories that describe Respondents’ Position

Figure 13.1.2: Company’s Annual Work Volume

Figure 13.1.3: Company’s Labour Force – ‘on an average per year’

Figure 13.1.4: Respondents’ main Area of Construction Work

Figure 13.1.5: Percentage of work ‘self-performed’ in contrast to ‘sub-contracted’

Figure 13.1.6: Percentage of work ‘self-performed’ in contrast to ‘sub-contracted’

Figure 13.1.7: Years of Experience in Construction Industry

Figure 13.1.8: Geographical Area of Employment

Figure 13.1.9: Duration of service with current employer

Figure 13.1.10: Productivity changes over the past five years

Figure 14.1.1: Characteristic of Sampled Population

Figure 14.1.2: Employment Status of Respondents

Listing of Tables

Table 1.1.1: Synonyms for the different Levels of measurement

Table 1.1.2: Appropriateness of Graphs, from different Levels of measurement

Table 1.1.3: Levels of measurement with examples and other characteristics

Table 1.1.4: Levels of measurement, and measures of central tendency and measures of
variability

Table 1.1.5: Combinations of Levels of measurement, and types of statistical Test which are
applicable

Table 1.1.6a: Statistical Tests and their Levels of Measurement

Table 1.1.6b:

Table 2.1.1a: Gender of Respondents

Table 2.1.1b: General happiness

Table 2.1.2: Social Status

Table 2.1.3: Descriptive Statistics on the Age of the Respondents

Table 2.1.4: “From the following list, please choose what the most important characteristic of
democracy …are for you”

Table 4.1.1: Respondents’ Age

Table 4.1.2 (a) Univariate Analysis of the explanatory Variables

Table 4.1.2(b): Univariate Analysis of explanatory

Table 4.1.2 (c): Univariate Analysis of explanatory

Table 4.1.3: Bivariate Relationships between academic performance and subjective Social
Class (n=99)

Table 4.1.4: Bivariate Relationships between comparative academic performance and
subjective Social Class (n=108)
Table 4.1.5: Bivariate Relationships between academic performance and physical exercise (n=
111)

Table 4.1.6 (i): Bivariate Relationships between academic performance and instructional
materials (n=113)

Table 4.1.6 (ii) Relationship between academic performance and materials among students
who will be writing the A’ Level Accounting Examination, 2004

Table 4.1.7: Bivariate Relationships between academic performance and Class attendance (n=
106)

Table 4.1.8: Bivariate Relationship between academic performance and attendance

Table 4.1.9: Bivariate Relationships between academic performance and breakfast
consumption, (n=114)

Table 4.1.10: Relationship between academic performances and breakfasts consumption
among A’ Level Accounting students, controlling for Gender

Table 4.1.11: Bivariate Relationships between academic performance and
migraine (n=116)

Table 4.1.12: Bivariate Relationships between academic performance and mental illnesses,
(n=116)

Table 4.1.13: Bivariate Relationships between academic performance and physical illnesses,
(n=116)

Table 4.1.14: Bivariate Relationships between academic performance and illnesses (n=116)

Table 4.1.15. Bivariate Relationships between current academic performance and past
performance in CXC/GCE English language Examination, (n= 112)

Table 4.1.16: Bivariate Relationships between academic performance and past performance in
CXC/GCE English language Examination, controlling for Gender

Table 4.1.17: Bivariate Relationships between academic performance and past performance in
CXC/GCE Mathematics Examination n=

Table 4.1.18 (i): Bivariate Relationships between academic performance and past performance
in CXC/GCE principles of accounts Examination (n= 114)

Table 4.1.19 (ii): Bivariate Relationships between academic performance and past
performance in CXC/GCEPOA Examination, controlling for Gender

Table 4.1.20: Bivariate Relationships between academic performance and Self-Concept (n=
112)

Table 4.1.21: Bivariate Relationships between academic performance and Dietary
Requirements (n=116)

Table 4.1.22: Summary of Tables

Table 5.1.1: Frequency and percent Distributions of explanatory model Variables

Table 5.1.2: Relationship between Religiosity and Marijuana Smoking (n=7,869)

Table 5.1.3: Relationship between Religiosity and Marijuana Smoking controlled for Gender

Table 5.1.4: Relationship between Age and marijuana smoking (n=7,948)

Table 5.1.5: Relationship between marijuana smoking and Age of Respondents, controlled
for sex

Table 5.1.6: Relationship between academic performances and marijuana smoking,
(n=7,808)

Table 5.1.7: Relationship between academic performances and marijuana smoking,
controlled for Gender

Table 5.1.8: Summary of Tables

Table 6.1.1: Age Profile of respondent

Table 6.1.2: Examination Scores

Table 6.1.3(a): Class Distribution by Gender

Table 6.1.3(b): Class Distribution by Age Cohorts

Table 6.1.3(c): Pre-Test Score by Typology of Group

Table 6.1.4: Comparison of Examination I and Examination II

Table 6.1.5: Comparison Across the Group by Tests

Table 6.1.6: Analysis of Factors influence on Test ii Scores

Table 6.1.7: Cross-Tabulation of Test ii Scores and Factors

Table 6.1.8: Bivariate Relationship between student’s Factors and Test ii Scores

Table 7.1.1: Descriptive Statistics - total expenditure on public health (as Percentage of GNP,
HRD, 1994)

Table 7.1.2: Descriptive Statistics of expenditure on public education (as Percentage of GNP,
HRD, 1994)

Table 7.1.3: Descriptive Statistics of Human Development (proxy for development)

Table 7.1.4: Bivariate Relationships between dependent and independent Variables

Table 7.1.5: Summary of Hypotheses Analysis

Table 8.1.1: Age Profile of Respondents (n = 16,619)

Table 8.1.2: Logged Age Profile of Respondents (n = 16,619)

Table 8.1.3: Household Size (all individuals) of Respondents

Table 8.1.4: Union Status of the sampled Population (n=16,619)

Table 8.1.5: Other Univariate Variables of the Explanatory Model

Table 8.1.6: Variables in the Logistic Equation

Table 8.1.7: Classification Table

Table 8.1.1: Univariate Analyses

Table 8.1.2: Frequency Distribution of Educational Level by Quintile

Table 8.1.3: Frequency Distribution of Jamaica’s Population by Quintile and Gender

Table 8.1.4: Frequency Distribution of Educational Level by Quintile

Table 8.1.5: Frequency Distribution of Pop. Quintile by Household Size

Table 8.1.6: Bivariate Analysis of access to Tertiary Edu. and Poverty Status

Table 8.1.7: Bivariate Analysis of access to Tertiary Edu. and Geographic Locality of
Residents

Table 8.1.8: Bivariate Analysis of geographic locality of residents and poverty Status

Table 8.1.9: Bivariate Relationship between access to tertiary level education by Gender

Table 8.1.10: Bivariate Relationship between Access to Tertiary Level Education by Gender
controlled for Poverty Status

Table 8.1.11: Regression Model Summary

Table 10.1.1: Univariate Analysis of Parental Information

Table 10.1.2: Descriptive on Parental Involvement

Table 10.1.3: Univariate Analysis of Teacher’s Information

Table 10.1.4: Univariate Analysis of ECERS-R Profile

Table 10.1.5: Bivariate Analysis of Self-reported Learning Environment and Mastery on
Inventory Test

Table 10.1.6: Relationship between Educational Involvement, Psychosocial and Environment
involvement and Inventory Test

Table 10.1.6: Relationship between Educational Involvement, Psychosocial and Environment
Involvement and Inventory Test

Table 10.1.8: School Type by Inventory Readiness Score

Table 11.1.1: Incivility and Subjective Social Status

Table 12.1.2: Have you or someone in your family known of an act of Corruption in the last 12 months?

Table 12.1.3: Gender of Respondent

Table 12.1.4: In what Parish do you live?

Table 12.1.5: Suppose that you, or someone close to you, have been a victim of a crime. What would
you do...?

Table 12.1.6: What is your highest level of Education?

Table 12.1.7: In terms of Work, which of these best describes your Present situation?

Table 12.1.8: Which best represents your Present position in Jamaica Society?

Table 12.1.9: Age on your last Birthday?

Table 12.1.10: Age categorization of Respondents


Table 12.1.11: Suppose that you, or someone close to you, have been a victim of a crime. what would
you do... by Gender of respondent Cross Tabulation

Table 12.1.12: If involved in a dispute with neighbour and repeated discussions have not made a
difference, would you...? by Gender of respondent Cross Tabulation

Table 12.1.13: Do you believe that corruption is a serious problem in Jamaica? by Gender of
respondent Cross Tabulation

Table 12.1.14: have you or someone in your family known of an act of corruption in the last 12
months? by Gender of respondent Cross Tabulation

Table 14.1.1: Marital Status of Respondents

Table 14.1.2: Marital Status of Respondents by Gender

Table 14.1.3: Marital Status by Gender by Age cohort

Table 14.1.4: Marital Status by Gender by Age Cohort

Table 14.1.5: Educational Level by Gender by Age Cohorts

Table 14.1.6: Income Distribution of Respondents

Table 14.1.7: Parental Attitude Toward School

Table 14.1.8: Parent Involving Self

Table 14.1.9: School Involving Parent

Table 14.1.8: Regression Model Summary

Table 15.1.1: Correlations

Table 15.1.2: Cross Tabulation between incivility and social status


How do I obtain access to the SPSS PROGRAM?

Step One:

To access the SPSS program, the student should select ‘START’ at the bottom left-hand
corner of the computer monitor, and then select ‘All Programs’ (see below).

Select ‘START’ and then ‘All Programs’


Step Two:

The next step is to select ‘SPSS for Windows’. Once ‘SPSS for Windows’ has been chosen,
a dialogue box appears to its right with the following options: SPSS for Windows;
SPSS 12.0 (or 13.0 … or 15.0); SPSS Map Geo-dictionary Manager Ink; and, last,
SPSS Manager.

Select ‘SPSS for Windows’


Step Three:

Having done step two, the student will select SPSS 12.0 (or 13.0, or 14.0 or 15.0) for
Windows, as this is the program with which he/she will be working.

Select SPSS 12.0 (or 13.0, or 14.0 or 15.0) for Windows


Step Four:

On selecting ‘SPSS for Windows’ in step three, the dialogue box below appears. The
next step is to select ‘OK’, which results in what appears in step five.

Select
‘OK’


What should I now do? The student should then select the ‘inner red box’ with the ‘X’.

Select the ‘inner red box’ with the ‘X’.


Step Six:

This is what the SPSS spreadsheet looks like (see Figure below).


Step Seven:

What is the difference here? Look to the bottom left-hand corner of the spreadsheet
and you will see two terms: (1) ‘Data View’ and (2) ‘Variable View’. ‘Data View’
accommodates the entering of the data (i.e. responses from the questionnaires) once
the template has been established in ‘Variable View’. Ergo, the student must ensure
that he/she has established the template in ‘Variable View’ before any typing can be
done in ‘Data View’.

Data View: Observe what the ‘Data View’ window looks like.

Variable View: Observe what the ‘Variable View’ window looks like.

CHAPTER 1

1.1.0a: INTRODUCTION

This book is a response to an associate’s request for material that would provide simple
illustrations of ‘How to analyze quantitative data in the Social Sciences from actual
hypotheses’. He contended that all the currently available textbooks, despite providing
some degree of analysis of quantitative data, fail to provide actual illustrations of cases
in which hypotheses are given and a comprehensive assessment is made to answer issues
surrounding the appropriate univariate, bivariate and/or multivariate processes of
analysis. Hence, I began a quest to peruse the textbooks that presently exist in ‘Research
Methods in Social Sciences’, ‘Research Methods in Political Sciences’, ‘Introductory
Statistics’, ‘Statistical Methods’, ‘Multivariate Statistics’, and ‘Course materials on
Research Methods’, which revealed that a void existed in this regard.

Hence, I have consulted a plethora of academic sources in order to formulate this text.
In wanting to fulfill my friend’s request comprehensively, I have used a number of datasets
that I have analyzed over the past 6 years, along with key terminologies which are
applicable to understanding the various hypotheses.

I am cognizant that a need exists to provide some information on ‘Simple Quantitative
Data Analysis’, but this text is in keeping with the demand to make available materials
that aid the interpretation of ‘quantitative data’, and it is not intended to unveil any new
material in the discipline. The rationale behind this textbook is embedded in the simple
reality that many undergraduate students are faced with the complex task of ‘how to
choose the most appropriate statistical test’. This becomes problematic for them, as the
issue of wanting to complete an assignment, and knowing that it is properly done, will
plague the pupil. The answer to this question lies in three fundamental issues: (1) the
nature of the variables (continuous or discrete); (2) the purpose of the analysis (is it mere
description, or is it to provide statistical inference?); and/or (3) whether any of the
independent variables are covariates². Nevertheless, the materials provided here are
drawn from a range of research projects, which will give new information on particular
topics, from the hypothesis to the univariate analysis and the bivariate or multivariate
analyses.

² “If the effects of some independent variables are assessed after the effects of other
independent variables are statistically removed…” (Tabachnick and Fidell 2001, 17)


1.1.0b: STEPS IN ANALYZING A HYPOTHESIS

One of the challenges faced by a social researcher is how to succinctly conceptualize (i.e.
define) his/her variables, which will also be operationalized (measured) for the purpose of
the study. Having written a hypothesis, the researcher should identify the number of
variables which are present, and then distinguish the dependent from the independent
variables. Following this, he/she should recognize the level of measurement to which each
variable belongs, and then decide which statistical test is appropriate based on the
combination of the levels of measurement of the variables. The figure below is a flow chart
depicting the steps in analyzing data when given a hypothesis.

This text was produced in response to the need for a simple book which would address the
concerns of undergraduate students who must analyze a hypothesis. Among the issues
raised in this book are (1) the systematic steps involved in analyzing a hypothesis,
(2) definitions of a hypothesis, (3) typologies of hypothesis, (4) conceptualization of a
variable, (5) types of variables, (6) levels of measurement, (7) illustration of how to
perform SPSS operations on the description of different levels of measurement and
inferential statistics, (8) Type I and II errors, (9) arguments on the treatment of missing
data as well as outliers, (10) how to transform selected quantitative data, and (11) other
pertinent matters.

The primary reason behind the use of many of the illustrations, conceptualizations and
peripheral issues rests squarely on the fact that the reader should gain a thorough
understanding of how the entire process is done, and of the rationale for the method used.


ANALYZING QUANTITATIVE DATA

STEP ONE: Write your hypothesis.
STEP TWO: Identify the variables from the hypothesis.
STEP THREE: Define and operationalize each variable selected from the hypothesis.
STEP FOUR: Decide on the level of measurement for each variable.
STEP FIVE: Decide which variable is the DV, and which the IV.
STEP SIX: Check for skewness and/or outliers in metric variables.
STEP SEVEN: Do descriptive statistics for the chosen variables.
STEP EIGHT: If statistical association, causality or predictability is needed, continue; if not, stop!
STEP NINE: If statistical inference is needed, look at the combination of DV and IV(s).
STEP TEN: Choose the appropriate statistical test based on the combination of DV and IVs.
STEP ELEVEN: Having used the test, analyze the data carefully, based on the statistical test.

FIGURE 1.1.1: FLOW CHART: HOW TO ANALYZE QUANTITATIVE DATA?

This entire text is about ‘how to analyze quantitative data from a hypothesis’. Figure 1.1.1
may give the impression that the research process begins with a hypothesis, but this is not
the case. Nevertheless, because this monograph emphasizes the interpretation of
hypotheses, it starts from an actual hypothesis. Thus, before I provide you with
operational definitions of variables, I will provide some contextualization of ‘what is a
variable?’; then the steps will be worked out.


1.1.1a: DEFINITIONS OF A VARIABLE

Undergraduates and first-time researchers should be aware that quantitative data analysis is
primarily based on (1) empirical literature, (2) typologies of variables within the hypothesis,
(3) conceptualization and operationalization of the variables, and (4) the level of
measurement for each variable. It should be noted that defining a variable is not simply the
collation of a group of words because we feel so inclined; each variable requires two critical
characteristics if it is to be defined properly (see Figure 1.1.2).

PROPERTIES OF A VARIABLE

MUTUAL EXCLUSIVITY          EXHAUSTIVENESS

FIGURE 1.1.2: PROPERTIES OF A VARIABLE.

In order to provide a comprehensive outlook on a variable, I will use the definitions of
various scholars so as to give a clear understanding of what it is.

“Variables are empirical indicators of the concepts we are researching. Variables, as their
name implies, have the ability to take on two or more values...The categories of each variable
must have two requirements. They should be both exhaustive and mutually exclusive. By
exhaustive, we mean that the categories of each variable must be comprehensive enough that it
is possible to categorize every observation” (Babbie, Halley, and Zaino 2003, 11).

“…Exclusive refers to the fact that every observation should fit into only one category”
(Babbie, Halley and Zaino 2003, 12)

“A variable is therefore something which can change and can be measured.” (Boxill, Chambers
and Wint 1997, 22)


“The definition of a variable, then, is any attribute or characteristic of people, places, or events
that takes on different values.” (Furlong, Lovelace, Lovelace 2000, 42)

“A variable is a characteristic or property of an individual population unit” (McClave, Benson
and Sincich 2001, 5)

“Variable. A concept or its empirical measure that can take on multiple values” (Neuman
2003, 547).

“Variables are, therefore, the quantification of events, people, and places in order to measure
observations which are categorical (i.e. nominal and ordinal data) and non-categorical (i.e.
metric) in an attempt to be informed about the observation in reality. Each variable must fulfil
two basic conditions – (i) exhaustiveness – the variable must be so defined that all tenets are
captured, as it is comprehensive enough to include all the observations, and (ii) mutual
exclusivity – the variable should be so defined that it applies to one event and one event only
(i.e. every observation should fit into only one category)” (Bourne 2007).

One of the difficulties of social research is not the identification of a variable or variables
in the study but rather the conceptualization and, oftentimes, the operationalization of the
chosen construct. Thus, whereas the conceptualization (i.e. the definition) of the variable
may (or may not) be complex, it is the question of ‘how do you measure such a concept (i.e.
variable)?’ which oftentimes poses the problem for researchers. This must be done properly,
bearing in mind the attributes of a variable, because it is this operational definition which
you will be testing in the study (see Typologies of Variables, below). Thus, the testing of
hypotheses is embedded within variables and within the empirical literature that is used to
guide present studies. Hypothesis testing is a technique that is frequently employed by
demographers, statisticians, economists and psychologists, to name a few practitioners, who
are concerned with the testing of theories, the verification of truths, and the modification of
social realities within a particular time, space and setting. With this being said, researchers
must ensure that a variable is properly defined, so that the stated phenomenon is what is
actually defined and measured.


1.1.1b TYPOLOGIES of VARIABLE (examples, using Figure 1.1.2, above)

Health care seeking behaviour: is defined as people visiting a health practitioner or health

consultant such as doctor, nurse, pharmacist or healer for care and/ or advice.

Levels of education: This is denominated into the number of years of formal schooling that

one has completed.

Union status – It is a social arrangement between or among individuals. This arrangement

may include ‘conjugal’ or a social state for an individual.

Gender: A sociological state of being male or female.

Per capita income: This is used as a proxy for the income of the individual by analyzing the
consumption pattern.

Ownership of Health insurance: Individuals who possess one or more insurance policies.

Injuries: A state of being physically hurt. The examples here are incidences of disability,

impairments, chronic or acute cuts and bruises.

Illness: A state of unwellness.

Age: The number of years lived up to the last birthday.

Household size - The number of individuals who share at least one common meal, use
common sanitary conveniences and live within the same dwelling.

Now that the premise has been formed in regard to the definition of a variable, the next
step in the process is to identify the category to which each variable belongs. Thus, the
researcher needs to know the level of measurement for each variable: nominal, ordinal,
interval or ratio (see 1.1.2a).


1.1.2a: LEVELS OF MEASUREMENT³: Examples and definitions

Nominal - The naming of events, peoples, institutions, and places, which are coded numerically
by the researcher because the variable has no inherent numerical attributes. This
variable may be either (i) dichotomous, or (ii) non-dichotomous.

Dichotomous variable – The categorization of a variable, which has only two sub-
groupings - for example, gender – male and female; capital punishment –
permissive and restrictive; religious involvement – involved and not involved.

Non-dichotomous variable – The naming of events which span more than two
sub-categories (example Counties in Jamaica – Cornwall, Middlesex and Surrey;
Party Identification – Democrat, Independent, Republican; Ethnicity – Caucasian,
Blacks, Chinese, Indians; Departments in the Faculty of Social Sciences –
Management Studies, Economics, Sociology, Psychology and Social Work,
Government; Political Parties in Jamaica – Peoples’ National Party (PNP),
Jamaica Labour Party (JLP), and the National Democratic Movement (NDM);
Universities in Jamaica – University of the West Indies; University of
Technology, Jamaica; Northern Caribbean University; University College of the
Caribbean; et cetera)

Ordinal - Rank-categorical variables: Variables which name categories that, by their very
nature, indicate a position or arrange the attributes in some rank order. Examples
are: (i) Level of Educational Institutions – Primary/Preparatory, All-Age,
Secondary/High, Tertiary; (ii) Attitude toward gun control – strongly oppose,
oppose, favour, strongly favour; (iii) Social status – upper-upper, upper-middle,
middle-middle, lower-middle, lower class; (iv) Academic achievement – A, B, C, D, F.

Interval
or ratio - These variables share all the characteristics of a nominal and an ordinal variable,
along with an equal distance between each category and, in the case of a ratio
variable, a ‘true’ zero value (for example: age; weight; height; temperature;
fertility; votes in an election; mortality; population; population growth;
migration rates).

Now that the definitions and illustrations have been provided for the levels of measurement,
the student should understand the relative position of these measures (see 1.1.2b).

³ Stanley S. Stevens is credited with the development of the typologies of scales (levels of
measurement): (i) nominal, (ii) ordinal, (iii) interval and (iv) ratio (see Stevens 1946, 1948,
1968; Downie and Heath 1970).


Dichotomy (or dichotomous variable)

Typologies of book:   Fictional / Non-fictional
Gender:               Male / Female
Science:              Pure / Applied
                      Alive / Dead
                      Induction / Deduction
                      Burial / Non-burial
                      Parametric statistics / Non-parametric statistics
                      Religious service / Non-religious service
                      Decomposed / Non-decomposed
                      Use primary data / Use secondary data

Figure 1.1.3: Illustration of dichotomous variables


1.1.2b: RANKING LEVELS OF MEASUREMENT

RATIO (highest)

INTERVAL

ORDINAL

NOMINAL (lowest)

Figure 1.1.4: Ranking of the levels of measurement

The very nature of a level of measurement allows for (or does not allow for) data
manipulation. If the level of measurement is nominal (for example fiction and non-fiction
books), then the researcher does not have a choice in the reconstruction of this variable,
because there is no level below it. If the level of measurement, however, is ordinal (for
example no formal education, primary, secondary and tertiary), then one may decide to use
a lower level of measurement (for example below secondary and above secondary). The
same is possible with an interval variable: the social scientist may want to use one level
down, ordinal, or two levels down, nominal. The same is true of a ratio variable. Thus, the
further one goes up the pyramid, the more scope exists for data transformation.
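The downward recoding described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the education categories follow the example in the text, while the function names `collapse_education` and `age_band`, the decision to place ‘secondary’ in the upper group, and the age cut-off points are hypothetical choices made for the sketch and are not taken from SPSS or from this book.

```python
# Ordinal categories from the text, listed from lowest to highest.
EDUCATION_LEVELS = ["no formal education", "primary", "secondary", "tertiary"]


def collapse_education(level):
    """Recode an ordinal education level into a dichotomy (one level down).

    Assumption: 'secondary' is grouped with the upper category.
    """
    rank = EDUCATION_LEVELS.index(level)  # raises ValueError for unknown labels
    cut = EDUCATION_LEVELS.index("secondary")
    return "below secondary" if rank < cut else "secondary and above"


def age_band(age):
    """Recode a ratio variable (age in years) into an ordinal variable.

    The cut-off points are hypothetical and chosen only for illustration.
    """
    if age < 0:
        raise ValueError("age cannot be negative")
    if age < 30:
        return "under 30"
    if age < 60:
        return "30 to 59"
    return "60 and over"


if __name__ == "__main__":
    for level in EDUCATION_LEVELS:
        print(level, "->", collapse_education(level))
    for age in (18, 45, 72):
        print(age, "->", age_band(age))
```

The recoding runs in one direction only: once only the collapsed categories are stored, the original values cannot be recovered, which is why a nominal variable leaves no scope for further reduction.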


Table 1.1.1: Synonyms for the different Levels of measurement

Levels of Measurement        Other terms

Nominal                      Categorical; qualitative; discrete⁴
Ordinal                      Qualitative; discrete; rank-ordered; categorical
Interval/Ratio               Numerical; continuous⁵; quantitative; scale; metric; cardinal

Table 1.1.2: Appropriateness of Graphs for different levels of measurement

Levels of Measurement        Bar chart   Pie chart   Histogram   Line graph

Nominal                      √           √           __          __
Ordinal                      √           √           __          __
Interval/Ratio (or metric)   __          __          √           √

⁴ Discrete variables take on a finite and usually small number of values, and there is no
smooth transition from one value or category to the next (e.g. gender, social class, types of
community, undergraduate courses).
⁵ Continuous variables are measured on a scale that changes values smoothly rather than in steps.


Table 1.1.3: Levels of measurement⁶ with Examples and Other Characteristics

                          Nominal                Ordinal              Interval         Ratio

Examples                  Gender                 Social class         Temperature      Age
                          Religion               Preference           Shoe size        Height
                          Political parties      Level of education   Life span        Weight
                          Race/Ethnicity         Gender equity                         Reaction time
                          Political ideologies   Levels of fatigue                     Income; score on an exam
                                                 Noise level                           Fertility; population of a country
                                                 Job satisfaction                      Population growth; crime rates

Mathematical properties   Identity               Identity             Identity         Identity
                          ____                   Magnitude            Magnitude        Magnitude
                          ____                   ____                 Equal interval   Equal interval
                          ____                   ____                 ____             True zero

Mathematical              None                   Ranking              Addition;        Addition;
operation(s)                                                          Subtraction      Subtraction;
                                                                                       Division;
                                                                                       Multiplication

Compiled: Paul A. Bourne, 2007; a modification of Furlong, Lovelace and Lovelace 2000, 74

⁶ “Levels of measurement concern the essential nature of a variable, and it is important to know this because it determines what one can do with a variable”
(Burnham, Gilland, Grant and Layton-Henry 2004, 114)


Table 1.1.4: Levels of measurement, Measures of Central Tendency and Measures of Variability

Levels of Measurement   Measures of central tendency   Measures of variability

                        Mean    Mode    Median         Mean deviation   Standard deviation

Nominal                 NA      √       NA             NA               NA
Ordinal                 NA      √       √              NA               NA
Interval/Ratio⁷         √       √       √              √                √

NA denotes Not Applicable

⁷ The ratio variable is the highest level of measurement, with nominal being first (i.e. lowest); ordinal, second; and interval, third.
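As a rough companion to Table 1.1.4, the sketch below returns only those summary measures that the table marks as applicable for a given level of measurement. It uses Python's standard `statistics` module; the function name `summarize` and the level labels `"nominal"`, `"ordinal"` and `"metric"` are assumptions introduced for the illustration, and ordinal data are assumed to be entered as numeric rank codes.

```python
from statistics import mean, median, median_low, mode, stdev


def summarize(values, level):
    """Return the measures Table 1.1.4 marks as applicable for `level`.

    level: "nominal", "ordinal" (numeric rank codes) or "metric"
    (interval/ratio). These labels are illustrative only.
    """
    if level not in {"nominal", "ordinal", "metric"}:
        raise ValueError("unknown level of measurement: " + level)

    summary = {"mode": mode(values)}  # the mode applies at every level
    if level == "ordinal":
        # median_low always returns one of the observed rank codes
        summary["median"] = median_low(values)
    if level == "metric":
        centre = mean(values)
        summary["median"] = median(values)
        summary["mean"] = centre
        summary["mean_deviation"] = sum(abs(v - centre) for v in values) / len(values)
        summary["standard_deviation"] = stdev(values)  # sample standard deviation
    return summary


if __name__ == "__main__":
    print(summarize(["male", "female", "male"], "nominal"))
    print(summarize([1, 2, 2, 3, 4], "ordinal"))
    print(summarize([2, 4, 4, 6], "metric"))
```

Asking for a mean of a nominal variable is simply not offered by the function, which mirrors the NA cells of the table.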


Table 1.1.5: Combinations of Levels of measurement, and types of Statistical test which are applicable⁸

Levels of Measurement                       Statistical Test

Dependent variable   Independent variable

Nominal              Nominal                Chi-square
Nominal              Ordinal                Chi-square; Mann-Whitney
Nominal              Interval/ratio         Binomial distribution; ANOVA; Logistic Regression;
                                            Kruskal-Wallis; Discriminant Analysis
Ordinal              Nominal                Chi-square
Ordinal              Ordinal                Chi-square; Spearman rho
Ordinal              Interval/ratio         Kruskal-Wallis H; ANOVA
Interval/ratio       Nominal                ANOVA
Interval/ratio       Ordinal
Interval/ratio       Interval/ratio         Pearson r; Multiple Regression;
                                            Independent-sample t test

Table 1.1.5 shows which statistical test applies to each combination of levels of measurement. For example, a nominal dependent
variable combined with a nominal independent variable calls for the chi-square test.

⁸ One of the fundamental issues in analyzing quantitative data is not merely to combine and then interpret data, but to use each variable appropriately. This
is further explained below.


STATISTICAL TESTS AND THEIR LEVELS OF MEASUREMENT

Test                         Independent variable                    Dependent variable

Chi-Square (χ²)              Nominal, Ordinal                        Nominal, Ordinal
Mann-Whitney U test          Dichotomous                             Nominal, Ordinal
Kruskal-Wallis H test        Non-dichotomous, Ordinal                Ordinal, or skewed⁹ Metric
Pearson’s r                  Normally distributed¹⁰ Metric           Normally distributed Metric
Linear Regression            Normally distributed Metric, dummy      Normally distributed Metric
Independent Samples T-test   Dichotomous                             Normally distributed Metric
ANOVA                        Nominal, Ordinal (non-dichotomous¹¹)    Normally distributed Metric
Logistic regression          Metric, dummy                           Dichotomous (skewed values or otherwise)
Discriminant analysis        Metric, dummy                           Dichotomous (normally distributed value)

Notes to Table 1.1.6b

Chi-Square (χ²)              Used to test for associations between two variables
Mann-Whitney U test          Used to determine differences between two groups
Kruskal-Wallis H test        Used to determine differences between three or more groups
Pearson’s r                  Used to determine strength and direction of a relationship between two values
Linear Regression            Used to determine strength and direction of a relationship between two or more values
Independent Samples T-test   Used to determine difference between two groups
ANOVA                        Used to determine difference between three or more groups
Logistic regression          Used to predict relationship between many values
Discriminant analysis        Used to predict relationship between many values

⁹ Skewness indicates that there is a ‘pileup’ of cases to the left or right tail of the distribution.
¹⁰ Normality is observed whenever the values of skewness and kurtosis are zero.
¹¹ Non-dichotomous (i.e. polytomous) denotes having many (i.e. several) categories.
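The table of statistical tests above can also be read as a simple lookup. The sketch below transcribes its rows into a Python list and filters them by the type of the independent and dependent variable. It is a reading aid rather than a decision rule: the function name `candidate_tests` and the shortened type labels (for example `"normal metric"` for a normally distributed metric variable) are assumptions made for this illustration, and the rows are copied as the table gives them.

```python
# Each row: (test, independent-variable types, dependent-variable types),
# transcribed from the table above with shortened, hypothetical labels.
TESTS = [
    ("Chi-square", {"nominal", "ordinal"}, {"nominal", "ordinal"}),
    ("Mann-Whitney U", {"dichotomous"}, {"nominal", "ordinal"}),
    ("Kruskal-Wallis H", {"non-dichotomous", "ordinal"}, {"ordinal", "skewed metric"}),
    ("Pearson's r", {"normal metric"}, {"normal metric"}),
    ("Linear regression", {"normal metric", "dummy"}, {"normal metric"}),
    ("Independent-samples t-test", {"dichotomous"}, {"normal metric"}),
    ("ANOVA", {"nominal", "ordinal"}, {"normal metric"}),
    ("Logistic regression", {"metric", "dummy"}, {"dichotomous"}),
    ("Discriminant analysis", {"metric", "dummy"}, {"dichotomous"}),
]


def candidate_tests(independent, dependent):
    """List the tests whose row accepts this pair of variable types."""
    return [
        name
        for name, independent_types, dependent_types in TESTS
        if independent in independent_types and dependent in dependent_types
    ]


if __name__ == "__main__":
    print(candidate_tests("nominal", "nominal"))
    print(candidate_tests("dichotomous", "normal metric"))
    print(candidate_tests("dummy", "dichotomous"))
```

Where more than one test is returned, the notes to the table (differences between groups versus prediction, for instance) indicate which one suits the purpose of the analysis.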


LEVELS OF MEASUREMENT AND THEIR MEASURES OF
ASSOCIATION

LEVELS OF MEASUREMENT

NOMINAL                    ORDINAL             INTERVAL/RATIO

Lambda                     Gamma               Pearson’s r
Cramer’s V                 Somers’ D
Contingency coefficient    Kendall’s tau-b
Phi                        Kendall’s tau-c

Figure 1.1.5: Levels of measurement

Lambda (λ) – This is a measure of the statistical relationship between two nominal
variables.

Phi (Φ) – This is a measure of association between two dichotomous variables (i.e. a
dichotomous dependent and a dichotomous independent variable):
$\Phi = \sqrt{\chi^2 / N}$

Cramer’s V (V) – This is a measure of association between two nominal variables:
$V = \sqrt{\chi^2 / [N(k - 1)]}$, where $k$ is the smaller of the number of rows and the
number of columns. In the event that there is a dichotomous dependent and a
dichotomous independent variable, V is identical to phi.

Gamma (γ) – This is used to measure the statistical association between an ordinal and
another ordinal variable.

Contingency coefficient (cc) – This is used for association in which the matrix is more
than 2 × 2 (i.e. 2 categories for the dependent and 2 for the independent), for example
2 × 3; 3 × 2; 3 × 3 …: $cc = \sqrt{\chi^2 / (\chi^2 + N)}$

Pearson’s r – This is used for non-skewed metric variables:
$r = \dfrac{n\sum xy - \sum x \sum y}{\sqrt{[\,n\sum x^2 - (\sum x)^2\,]\,[\,n\sum y^2 - (\sum y)^2\,]}}$
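The formulas above for phi, Cramer’s V, the contingency coefficient and Pearson’s r can be checked by hand with a short Python sketch. Nothing here comes from SPSS: the function names are hypothetical, the contingency table is assumed to be given as a list of rows of observed counts, and k in Cramer’s V is taken to be the smaller of the number of rows and columns.

```python
from math import sqrt


def chi_square(table):
    """Return (chi-square, N) for a contingency table given as rows of counts."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * column_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n


def phi(table):
    chi2, n = chi_square(table)
    return sqrt(chi2 / n)


def cramers_v(table):
    chi2, n = chi_square(table)
    k = min(len(table), len(table[0]))  # assumption: smaller table dimension
    return sqrt(chi2 / (n * (k - 1)))


def contingency_coefficient(table):
    chi2, n = chi_square(table)
    return sqrt(chi2 / (chi2 + n))


def pearson_r(x, y):
    """Pearson's r from the computational formula given in the text."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


if __name__ == "__main__":
    observed = [[10, 20], [20, 10]]  # hypothetical 2 x 2 table of counts
    print(phi(observed), cramers_v(observed), contingency_coefficient(observed))
    print(pearson_r([1, 2, 3], [2, 4, 6]))
```

For the 2 × 2 table in the example, phi and Cramer’s V coincide, as the text states for two dichotomous variables.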


1.1.3: CONCEPTUALIZING DESCRIPTIVE AND INFERENTIAL
STATISTICS

Research is not done in isolation from the reality of the wider society. Thus, the social
researcher needs to understand whether his/her study is descriptive and/or inferential, as
this guides the selection of certain statistical tools. Furthermore, an understanding of the
two constructs dictates which tools the analyst will employ, as there is a clear demarcation
between descriptive and inferential statistics. In order to convey this distinction, I will
provide a number of authors’ perspectives on each term.

“Descriptive statistics describe samples of subjects in terms of variables or combination

of variables” (Tabachnick and Fidell 2001, 7)

“Numerical descriptive measures are commonly used to convey a mental image of
pictures, objects, tables and other phenomenon. The two most common numerical
descriptive measures are: measures of central tendencies and measures of variability”
(McDaniel 1999, 29; see also Watson, Billingsley, Croft and Huntsberger 1993, 71)

“Techniques such as graphs, charts, frequency distributions, and averages may be used
for description and these have much practical use” (Yamane 1973, 2; see also Blaikie
2003, 29; Crawshaw and Chambers 1994, Chapter 1)

“Descriptive statistics – statistics which help in organizing and describing data, including
showing relationships between variables” (Boxill, Chambers and Wint 1997, 149)


“We’ll see that there are two areas of statistics: descriptive statistics, which focuses on

developing graphical and numeral summaries that describes some…phenomenon, and

inferential statistics, which uses these numeral summaries to assist in making…

decisions” (McClave, Benson and Sincich 2001, 1)

“Descriptive statistics utilizes numerical and graphical methods to look for patterns in a

data set, to summarize the information revealed in a data set, and to present the

information in a convenient form” (McClave, Benson and Sincich 2001, 2)

“Inferential statistics utilizes sample data to make estimates, decisions, predictions, or

other generalizations about a larger set of data” (McClave, Benson and Sincich 2001, 2)

“The phrase statistical inference will appear often in this book. By this we mean, we

want to “infer” or learn something about the real world by analyzing a sample of data.

The ways in which statistical inference are carried out include: estimating…parameters;

predicting…outcomes, and testing…hypothesis …” (Hill, Griffiths and Judge 2001, 9).

Inferential statistics is not only about ‘causal’ relationships; King, Keohane and Verba
argue that inference is categorized into two broad areas: (1) descriptive, and (2) causal
inference. Descriptive inference speaks to the description of a population from what the
sample makes possible. Burnham, Gilland, Grant and Layton-Henry (2004) state that:

“Causal inferences differ from descriptive ones in one very significant way: they
take a ‘leap’ not only in terms of description, but in terms of some specific causal
process [i.e. predictability of the variables]” (Burnham, Gilland, Grant and Layton-
Henry 2004, 148).

In order that this textbook can be helpful and simple, I will provide operational
definitions of concepts as well as illustrations of particular terminologies, along with the
appropriateness of statistical techniques based on the typologies of variable and the level
of measurement (see Tables 1.1.1 – 1.1.6).


CHAPTER 2

2.1.0: DESCRIPTIVE STATISTICS

The interpretation of quantitative data commences with an overview (i.e. background

information on survey or study – this is normally demographic information) of the

general dataset in an attempt to provide a contextual setting of the research (descriptive

statistics, see above), upon which any association may be established (inferential

statistics, see above). Hence, this chapter provides the reader with the analysis of

univariate data (descriptive statistics), with appropriate illustration of how various levels

of measurement may be interpreted, and/or diagrams chosen based on their suitability.

A variable may be non-metric (i.e. nominal or ordinal) or metric (i.e. scale,
interval/ratio). It is on this premise that particular descriptive statistics are provided.
In keeping with this background, I will begin with non-metric and then move to metric
data. The first part of this chapter provides a thorough outline of how nominal and/or
ordinal variables are analyzed; the second part analyzes metric variables.


HOW TO DO DESCRIPTIVE STATISTICS FOR A NON-METRIC VARIABLE?

STEP ONE: Ensure that the variable is non-metric (e.g. gender, general happiness).
STEP TWO: Select Analyze.
STEP THREE: Select descriptive statistics.
STEP FOUR: Select frequency.
STEP FIVE: Select the non-metric variable.
STEP SIX: Select statistics at the end.
STEP SEVEN: Select mode, or mode and median (depending on whether the variable is nominal or ordinal, respectively).
STEP EIGHT: Select Chart.
STEP NINE: Choose bar or pie graphs.
STEP TEN: Select paste or OK.
STEP ELEVEN: Analyze the output (use Table 2.1.1a).

Figure 2.1.0: Steps in Analyzing Non-metric data


2.1.1a: INTERPRETING NON-METRIC (or Categorical) DATA

NOMINAL VARIABLE (when there are no missing cases)

Table 2.1.1a: Gender of respondents

Gender       Frequency   Percent   Valid Percent

Male         150         69.4      69.4
Female       66          30.6      30.6

Total        216         100.0     100.0

Identifying Non-missing Cases: When there are no differences between the values in the
percent column and those in the valid percent column, there are no missing cases.

How is the table analyzed? Of the sampled population (n=216)¹², 69.4% were males
compared to 30.6% females.

¹² The total number of persons interviewed for the study. It is advisable that valid percents are used in
descriptive statistics, as there may be instances when missing cases are present within the dataset, which
makes the percent figures different from those of the valid percent (Table 2.1.1b).


NOMINAL VARIABLE (when there are missing cases)

Table 2.1.1b: General Happiness

General Happiness   Frequency   Percent   Valid Percent

Very happy          467         30.8      31.1
Pretty happy        872         57.5      58.0
Not too happy       165         10.9      11.0
Missing Cases       13          0.9       -

Total               1,517       100.0     100.0

Identifying Missing Cases: Missing data (which indicate that some of the respondents did
not answer the specified question) show up as a disparity between the values for percent
and those for valid percent. In this case, 13 of 1,517 respondents did not answer the
question on ‘general happiness’. In cases where there is a difference between the two
aforementioned columns (i.e. percent and valid percent), the student should remember to
use the valid percent. The rationale behind the use of the valid percent is simple: the
research is about those persons who have answered, and they are captured in the valid
percent column. Hence, it is recommended that the student use the valid percent column
at all times in analyzing quantitative data.

Interpretation: Of the sampled population (n=1,517), the response rate is 99.1%
(n=1,504)¹³. Of the valid responses (n=1,504), 31.1% (n=467) indicated that they were
‘very happy’ and 58.0% (n=872) reported being ‘pretty happy’, compared to 11.0%
(n=165) who said ‘not too happy’.

¹³ Because missing cases are within the dataset (13 or 0.9%), there is a difference between percent and valid
percent. Thus, care should be taken when analyzing data. This is overcome when the valid percents are
used.
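The difference between the percent and valid percent columns is plain arithmetic, and it can be reproduced outside SPSS. The sketch below is a minimal Python illustration using the frequencies from Table 2.1.1b; the function name `frequency_table` and the layout of its result are assumptions made for this example, not SPSS output.

```python
def frequency_table(counts, missing):
    """Compute percent and valid percent for each category.

    counts:  mapping of category -> frequency among those who answered
    missing: number of respondents who did not answer the question
    """
    valid = sum(counts.values())   # respondents who answered
    total = valid + missing        # everyone who was interviewed
    rows = {}
    for category, frequency in counts.items():
        rows[category] = {
            "frequency": frequency,
            "percent": round(100 * frequency / total, 1),
            "valid_percent": round(100 * frequency / valid, 1),
        }
    response_rate = round(100 * valid / total, 1)
    return rows, response_rate


if __name__ == "__main__":
    happiness = {"Very happy": 467, "Pretty happy": 872, "Not too happy": 165}
    rows, response_rate = frequency_table(happiness, missing=13)
    for category, row in rows.items():
        print(category, row)
    print("Response rate (%):", response_rate)
```

Percent divides each frequency by all 1,517 respondents, whereas valid percent divides by the 1,504 who answered, which is why the two columns differ only when there are missing cases.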


Owing to the typology of the variable (i.e. nominal), this may be presented graphically by
either a pie graph or a bar graph.

Pie graph

[Pie graph: Male, 69.4%; Female, 30.6%]

Figure 2.1.1: Respondents’ gender

OR

Bar graph

[Bar graph: categories Male and Female; vertical axis scaled from 0 to 70]

Figure 2.1.2: Respondents’ gender


ORDINAL VARIABLE

Table 2.1.2: Subjective (or self-reported) Social Class

Social class     Frequency   Percent   Valid Percent

Lower                  100      46.3            46.3
Middle                 104      48.1            48.1
Upper                   12       5.6             5.6

Total                  216     100.0           100.0

Interpreting the Data in Table 2.1.2:

When the respondents were asked to select what best described their social standing, of the
sampled population (n=216), 46.3% reported lower (working) class and 48.1% reported
middle class, compared to 5.6% who said upper middle class. Based on the typology of the
variable (i.e. ordinal), the graphical options are (i) pie graph and/or (ii) bar graph.

Note: In cases where there is no difference between the percent column and the valid
percent column, researchers seldom report both columns. The column normally reported
is valid percent, as this provides the information on those persons who have actually
responded to the specified question. When it is reported, however, it is labelled
‘percent’ rather than ‘valid percent’.


[Bar graph: Lower class 46.3, Middle class 48.1, Upper middle class 5.6; vertical axis scaled from 0 to 50]

Figure 2.1.3: Social class of respondents

Or

[Pie graph: Lower class 46.3, Middle class 48.1, Upper middle class 5.6]

Figure 2.1.4: Social class of respondents


2.1.1b: STEPS IN INTERPRETING A METRIC VARIABLE
METRIC (i.e. scale or interval/ratio)

HOW TO DO DESCRIPTIVE STATISTICS FOR A METRIC VARIABLE?

STEP ONE: Know the metric variable (Age)
STEP TWO: Select Analyze
STEP THREE: Select descriptive statistics
STEP FOUR: Select frequency
STEP FIVE: Select the metric variable
STEP SIX: Select statistics at the end
STEP SEVEN: Select mean, standard deviation, skewness
STEP EIGHT: Select Chart
STEP NINE: Choose histogram with normal curve
STEP TEN: Select paste or ok
STEP ELEVEN: Analyze the output (use Table 2.1.3)

Figure 2.1.5: Steps in Analyzing Metric data
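
The three statistics selected in the steps above (mean, standard deviation and skewness) can also be computed by hand to see what they measure. The following is a minimal Python sketch under stated assumptions: the ages are hypothetical values invented for this illustration and are not the book's data, and `sample_skewness` is an illustrative helper that uses the adjusted sample skewness formula, which is assumed here (not confirmed by the text) to correspond to the figure a statistics package reports.

```python
import statistics

# Hypothetical ages, invented purely for illustration (not the book's data).
ages = [18, 21, 21, 23, 25, 26, 30, 34, 41, 57]


def sample_skewness(values):
    """Adjusted sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s) ** 3).

    s is the sample standard deviation (n - 1 in the denominator).
    The values must not all be identical, otherwise s is zero.
    """
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least three values")
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in values)


mean_age = statistics.mean(ages)
sd_age = statistics.stdev(ages)
skew_age = sample_skewness(ages)

print(f"Mean age:           {mean_age:.1f}")
print(f"Standard deviation: {sd_age:.2f}")
print(f"Skewness:           {skew_age:.2f}")
```

A positive skewness, as these hypothetical ages produce, indicates a longer tail toward the older ages. This is the same feature that the histogram with a normal curve makes visible.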
