DBA6000

Quantitative Business Research Methods

Rob J Hyndman
© Rob J Hyndman, 2008.


Professor Rob Hyndman
Department of Econometrics and Business Statistics
Monash University (Clayton campus)
VIC 3800.


Email: Rob.Hyndman@buseco.monash.edu.au
Telephone: (03) 9905 2358
www.robhyndman.info
Contents


Preface

1   Research design
    1.1 Statistics in research
    1.2 Organizing a quantitative research study
    1.3 Some quantitative research designs
    1.4 Data structure
    1.5 The survey process
    Appendix A: Case studies

2   Data collection
    2.1 Introduction
    2.2 Data collecting instruments
    2.3 Errors in statistical data
    2.4 Questionnaire design
    2.5 Data processing
    2.6 Sampling schemes
    2.7 Scale development
    Appendix B: Case studies

3   Data summary
    3.1 Summarising categorical data
    3.2 Summarizing numerical data
    3.3 Summarising two numerical variables
    3.4 Measures of reliability
    3.5 Normal distribution

4   Computing and quantitative research
    4.1 Data preparation
    4.2 Using a statistics package
    4.3 Further reading
    4.4 SPSS exercise

5   Significance
    5.1 Proportions
    5.2 Numerical differences

6   Statistical models and regression
    6.1 One numerical explanatory variable
    6.2 One categorical explanatory variable
    6.3 Several explanatory variables
    6.4 Comparing regression models
    6.5 Choosing regression variables
    6.6 Multicollinearity
    6.7 SPSS exercises

7   Significance in regression
    7.1 Statistical model
    7.2 ANOVA tables and F-tests
    7.3 t-tests and confidence intervals for coefficients
    7.4 Post-hoc tests
    7.5 SPSS exercises

8   Dimension reduction
    8.1 Factor analysis
    8.2 Further reading

9   Data analysis with a categorical response variable
    9.1 Chi-squared test
    9.2 Logistic and multinomial regression
    9.3 SPSS exercises

10 A survey of statistical methodology

11 Further methods
   11.1 Classification and regression trees
   11.2 Structural equation modelling
   11.3 Time series models
   11.4 Rank-based methods

12 Presenting quantitative research
   12.1 Numerical tables
   12.2 Graphics
   Appendix: Good graphs for better business

13 Readings




DBA6000: Quantitative Business Research Methods                                                                                                                     4
Preface

Subject convenor

Professor Rob J Hyndman
B.Sc.(Hons), Ph.D., A.Stat
Department of Econometrics and Business Statistics
Location:   Room 671, Menzies Building, Clayton.
Phone:      (03) 9905 2358
Email:      Rob.Hyndman@buseco.monash.edu.au
WWW:        http://www.robhyndman.info



Objectives

On completion of this subject, students should have:

   • the necessary quantitative skills to conduct high quality independent research related to
     business administration;
   • comprehensive grounding in a number of quantitative methods of data production and
     analysis;
   • been introduced to quantitative data analysis through a practical research activity.



Synopsis

This unit considers the quantitative research methods used in studying business, management
and organizational analysis. Topics to be covered:

  1. research design including experimental designs, observational studies, case studies,
     longitudinal analysis and cross-sectional analysis;
  2. data collection including designing data collection instruments, sampling strategies and
     assessing the appropriateness of archival data for a research purpose;
  3. data analysis including graphical and numerical techniques for the exploration of large
     data sets and a survey of advanced statistical methods for modelling the relationships
     between variables;
  4. communication of quantitative research; and
  5. the use of statistical software packages such as SPSS in research.

The effective use of several quantitative research methods will be illustrated through reading
research papers drawn from several disciplines.



References

None of these are required texts—they provide useful background material if you want to read
further. Huck (2007) is excellent on interpreting statistical results in academic papers. Pallant
(2007) is very helpful when using SPSS and in giving advice on how to write up research results.
Use Wild and Seber (2000) if you need to brush up on your basic statistics; it contains lots of
helpful advice and interesting examples.

  1. HUCK, S.W. (2007) Reading statistics and research, 5th ed. Allyn & Bacon: Boston, MA.
  2. PALLANT, J. (2007) SPSS survival manual, 3rd ed. Allen & Unwin.
  3. DE VAUS, D. (2002) Analyzing social science data. SAGE Publications: London.
  4. WILD, C.J., & SEBER, G.A.F. (2000) Chance encounters: a first course in data analysis and
     inference. John Wiley & Sons: New York.



Timetable

               17 July          Introduction/Chapter 1
               24 July          Chapter 2
               31 July          Chapter 3
               7 August         Chapter 4                  SPSS tutorial
               14 August        Chapter 5
               21 August        Chapter 6
               28 August        Chapter 7                  SPSS tutorial
               4 September      Chapters 8–9               SPSS tutorial
               11 September     Chapter 10
               18 September     Chapters 11–12             First assignment due
               25 September     No class
               2 October        No class
               9 October                                   SPSS tutorial
               16 October       Oral presentations         Second assignment due






Assessment
  1. A written report presenting and critiquing a research paper which uses quantitative
     research methods. 45%
        • It can be a published research paper from a scholarly journal, or a company report.
          It must contain substantial quantitative research. It must be approved in advance.
        • Your report should include comments on the research questions addressed, the
          appropriateness of the data used, how the data were collected, the method of analysis
          chosen, and the conclusions drawn.
        • Length: 4000–5000 words excluding tables and graphs.
        • Due: 17 September
  2. A written report presenting some original quantitative analysis of a suitable multivariate
     data set. 45%
        • You may use your own data, or use data that I will provide. The data set must
          include at least four variables. It can be data from your workplace.
        • Your report should include comments on the research questions addressed, the
          appropriateness of the data used, how the data were collected, the method of analysis
          chosen, and the conclusions drawn.
        • You may use any statistical computing package or Excel for analysis.
        • Length: 4000–5000 words excluding tables and graphs.
        • Due: 15 October
  3. A 20-minute oral presentation of one of the above reports. 10%
        • On either 8 or 15 October.


Assignment marking scheme

   •   Research questions addressed: 6%
   •   Appropriateness of data: 6%
   •   Data collection: 6%
   •   Description of statistical methods used: 6%
   •   Suitability of statistical methods: 6%
   •   Discussion of statistical results: 8%
   •   Conclusions (are they supported/valid?): 7%


Choosing a paper for Assignment 1

Choose something you are interested in. For example, it can be an article you are reading as
part of your other DBA studies or something you have read as part of your professional life.

The following journals contain some articles that would be suitable. There are also many others.

   •   Australian Journal of Management
   •   International Journal of Human Resource Management
   •   Journal of Advertising
   •   Journal of Applied Management Studies
   •   Journal of Management
   •   Journal of Management Accounting Research




   •   Journal of Management Development
   •   Journal of Managerial Issues
   •   Journal of Marketing
   •   Management Decision

You can obtain online copies for some of these via the Monash Voyager Catalogue. Hard copies
should be in the Monash library.

Things to look for:

   • it should involve some substantial data analysis;
   • it should involve more than summary statistics (e.g., a regression model, or some chi-
     squared tests);
   • it should not use sophisticated statistical methods that are beyond this subject (e.g., avoid
     factor analysis and structural equation models).

All papers should be approved by Rob Hyndman before you begin work on the assignment.


Choosing a data set for Assignment 2

   • Choose something you know about. The best data analyses involve a mix of good
     knowledge of the data context and good use of statistical methodology.
   • Don’t try to do too much. One response variable with 3–5 explanatory variables is usually
     sufficient. Resist the temptation to write a long treatise!
   • You will find it easier if the response variable is numeric. Analysing categorical response
     variables with several explanatory variables can be tricky.
   • Be clear about the purpose of your analysis. State some explicit objectives or hypotheses,
     and address them via your statistical analysis.
   • Think about what you include. A few well-chosen graphics that tell a story are better than
     pages of computer output that mean very little.
   • Start early. Even before we cover much methodology, you can do some basic data
     summaries and think about the key questions you want to address.
   • All data sets should be approved by Rob Hyndman before you begin work on the
     assignment.



Readings

Most weeks we will read a case study from a research journal and discuss the analysis. Please
read these in advance. We will discuss them in the third hour. You cannot use a paper we
have discussed for your first assessment task. If you have a suggestion of a paper that may be
suitable for class discussion, please let me know.




CHAPTER 1: Research design

1.1 Statistics in research
            “Statistics is the study of making sense of data.”   Ott and Mendenhall
            “The key principle of statistics is that the analysis of observations
            doesn’t depend only on the observations but also on how they were
            obtained.”                                                Anonymous

   • Data beat anecdotes
     “For example” proves nothing.                                           (Hebrew proverb)
   • Data beat intuition
     “Belief is no substitute for arithmetic.”                                 (Henry Spencer)
   • Data beat “expert” opinion
     “When information becomes unavailable, the expert comes into his own.” (A.J. Liebling)


1.1.1 Statistics answers questions using data

   •   Do pollutants cause asthma?
   •   Do transaction volumes on the stock market react to price changes?
   •   Does deregulation reduce unemployment?
   •   Does fluoride reduce tooth decay?

A definition

Statistical Analysis: Mysterious, sometimes bizarre, manipulations performed upon the
collected data of an experiment in order to obscure the fact that the results have no generalizable
meaning for humanity. Commonly, computers are used, lending an additional aura of unreality
to the proceedings.
                                                                              (Source unknown)




                              97.3% of all statistics are made up.

Part 1. Research design


1.1.2 Some statistics stories

The Challenger disaster

[Figure: number of O-rings damaged (0 to 2) plotted against ambient temperature at launch (55 to 80).]




Charlie’s chooks
[Figure: Y, percentage mortality (4 to 14), plotted against X, percentage Tegel birds (0 to 100).]






Risk factors for heart disease

A doctor wants to investigate who is most at risk for coronary-related deaths. He selects 12
patients at random from his clinic and records their age, blood pressure and drug used. He
also records whether they eventually died from heart disease or not.

 Age   BP   Drug    L/D
  18   68    1       D
  20   64    2       L
  22   72    1       D
  25   67    2       L
  29   80    –       D
  33   70    –       D
  34   86    1       D
  36   85    –       D
  37   73    2       L
  39   82    –       L
  41   90    1       D
  45   87    2       L

                              Drug    Lived   Died   % lived
                                 1        0      4        0%
                                 2        4      0     100%
                                 –        1      3      25%
                              Total       5      7

Drug 1 looks bad, 2 looks good.
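
The summary table can be reproduced directly from the twelve patient records. The following is a minimal Python sketch, not part of the original notes; it assumes the "–" entries in the Drug column can be represented as `None`, and the helper name `summarise` is purely illustrative.

```python
from collections import Counter

# (age, blood pressure, drug, outcome) for the 12 patients in the table above.
# Assumption: drug is None where the table shows "–";
# outcome is "L" (lived) or "D" (died).
patients = [
    (18, 68, 1, "D"),
    (20, 64, 2, "L"),
    (22, 72, 1, "D"),
    (25, 67, 2, "L"),
    (29, 80, None, "D"),
    (33, 70, None, "D"),
    (34, 86, 1, "D"),
    (36, 85, None, "D"),
    (37, 73, 2, "L"),
    (39, 82, None, "L"),
    (41, 90, 1, "D"),
    (45, 87, 2, "L"),
]

# Count patients in each (drug, outcome) cell of the cross-tabulation.
counts = Counter((drug, outcome) for _, _, drug, outcome in patients)


def summarise(drug):
    """Return (lived, died, percentage lived) for one drug group."""
    lived = counts[(drug, "L")]
    died = counts[(drug, "D")]
    return lived, died, 100 * lived / (lived + died)


summary = {drug: summarise(drug) for drug in (1, 2, None)}

for drug, (lived, died, pct) in summary.items():
    label = "–" if drug is None else drug
    print(f"Drug {label}: lived={lived}, died={died}, % lived={pct:.0f}%")
```

With only four patients in each group, and no information about how the drugs were allocated, these percentages describe the sample rather than establish that either drug works.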






1.1.3 Causation and association




Smoking and Lung Cancer

There is a strong positive correlation between smoking and lung cancer. There are several
possible explanations.

   • Causal hypothesis: Smoking causes lung cancer.
   • Genetic hypothesis: There is a hereditary trait which predisposes people to both nicotine
     addiction and lung cancer.
   • Sloppy lifestyle hypothesis: Smoking is most prevalent amongst people who also drink
     too much, don’t exercise, eat unhealthy food, etc.

Postnatal care

Mothers who return home from hospital soon after birth do better than those who stay in
hospital longer.

   • Causation hypothesis: Hospital is harmful and/or home is helpful.
   • Common response hypothesis: Mothers return home early because they are coping well.
   • Confounding hypothesis: Mothers return home early if there is someone at home to help.

University applicants

                                           Male   Female   Total
                                  Accept    70       40     110
                                  Reject   100      100     200
                                  Total    170      140     310

Is there evidence of discrimination?

Course: Introduction to bean counting

                                           Male   Female   Total
                                  Accept    60      20      80
                                  Reject    60      20      80
                                  Total    120      40      160





Course: Advanced welding

                                        Male    Female    Total
                              Accept     10        20      30
                              Reject     40       80       120
                              Total      50       100      150
                This is an example of Simpson’s Paradox. Simpson’s
                Paradox occurs when the association between variables
                changes, or is even reversed, when data from several
                groups are combined.
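
The arithmetic behind the university applicants example can be checked with a short Python sketch (an illustration added here, using only the counts from the three tables above; the variable and function names are hypothetical).

```python
# Counts from the course tables above: {course: {sex: (accepted, rejected)}}
applicants = {
    "Introduction to bean counting": {"Male": (60, 60), "Female": (20, 20)},
    "Advanced welding": {"Male": (10, 40), "Female": (20, 80)},
}


def acceptance_rate(accepted, rejected):
    """Proportion of applicants who were accepted."""
    return accepted / (accepted + rejected)


# Acceptance rate for each sex within each course.
by_course = {
    course: {sex: acceptance_rate(*counts) for sex, counts in groups.items()}
    for course, groups in applicants.items()
}

# Acceptance rate for each sex after pooling the two courses.
combined = {}
for sex in ("Male", "Female"):
    accepted = sum(groups[sex][0] for groups in applicants.values())
    rejected = sum(groups[sex][1] for groups in applicants.values())
    combined[sex] = acceptance_rate(accepted, rejected)

for course, rates in by_course.items():
    print(course, {sex: f"{rate:.1%}" for sex, rate in rates.items()})
print("Combined", {sex: f"{rate:.1%}" for sex, rate in combined.items()})
```

Within each course the acceptance rate is the same for men and women (50% for bean counting, 20% for welding). The combined rates differ (70/170 for men against 40/140 for women) only because most men applied to the course with the higher acceptance rate and most women applied to the course with the lower one.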

Other examples of Simpson’s paradox

   • Average tax rate has increased with time even though rate in every income category has
     decreased. Why?
   • Ave. female salary of B.Sc. graduates is lower than ave. male salary. Why?


     Causality or association?


       1. A positive correlation between blood pressure and income is observed. Does
          this indicate a causal connection?
       2. In a survey in 1960, it was found that for males aged 25–34 there was a positive
          correlation between years of school completed and height. Does going to
          school longer make a man taller?
       3. The same survey showed a negative correlation between age and educational
          level for persons aged over 25. Why?
       4. Students at fee-paying private schools perform better on average in VCE than
          students at government-funded schools. Why?


Some subtle differences

   • Distinguish between causation and association, prediction and causation, and prediction
     and explanation.
   • Note the difference between deterministic and probabilistic causation.






1.2 Organizing a quantitative research study

As a quick check, ask the following questions:

  1. What is your hypothesis (your research question)?

  2. What is already known about the problem (literature review)?

  3. What sort of design is best suited to studying your hypothesis? (method)

  4. What data will you collect to test your hypothesis? (sample)

  5. How will you analyse these data? (data analysis)

  6. What will you do with the results of the study? (communication)

These questions are broken down in more detail below. (These are mostly taken from Rubin et
al. (1990), and have also appeared in Balnaves and Caputi (2001).)


1.2.1 Hypothesis

   •   What is the goal of the research?
   •   What is the problem, issue, or critical focus to be researched?
   •   What are the important terms? What do they mean?
   •   What is the significance of the problem?
   •   Do you want to test a theory?
   •   Do you want to extend a theory?
   •   Do you want to test competing theories?
   •   Do you want to test a method?
   •   Do you want to replicate a previous study?
   •   Do you want to correct previous research that was conducted in an inadequate manner?
   •   Do you want to resolve inconsistent results from earlier studies?
   •   Do you want to solve a practical problem?
   •   Do you want to add to the body of knowledge in another manner?


1.2.2 Review of literature

   •   What does previous research reveal about the problem?
   •   What is the theoretical framework for the investigation?
   •   Are there complementary or competing theoretical frameworks?
   •   What are the hypotheses and research questions that have emerged from the literature
       review?






1.2.3 Method

   • What methods or techniques will be used to collect the data? (This holds for applied and
     non-applied research)
   • What procedures will be used to apply the methods or techniques?
   • What are the limitations of these methods?
   • What factors will affect the study’s internal and external validity?
   • Will any ethical principles be jeopardized?


1.2.4 Sample

   •   Who (what) will provide (constitute) the data for the research?
   •   What is the population being studied?
   •   Who will be the participants for the research?
   •   What sampling technique will be used?
   •   What materials and information are necessary to conduct the research?
   •   How will they be obtained?
   •   What special problems can be anticipated in acquiring needed materials and information?
   •   What are the limitations in the availability and reporting of materials and information?


1.2.5 Data analysis

   •   How will data be analysed?
   •   What statistics will be used?
   •   What criteria will be used to determine whether hypotheses are supported?
   •   What was discovered (about the goal, data, method, and data analysis) as a result of
       doing preliminary work (if conducted)?


1.2.6 Communication

   •   How will the final research report be organised? (Outline)
   •   What sources have you examined thus far that pertain to your study? (Reference list)
   •   What additional information does the reader need?
   •   What time frame (deadlines) have you established for collecting, analysing and
       presenting data? (Timetable)



1.3 Some quantitative research designs
   • Case study: questionnaire, interview, observation. Best for exploratory work and
     hypothesis generation. Limited quantitative analysis possible.
   • Survey: questionnaire, interview, observation. Best if sample is random.
   • Experiment: questionnaire, interview, observation. Best for demonstrating causality.






1.3.1 Cross-sectional vs longitudinal analysis

All designs can be either cross-sectional or longitudinal.

   • Cross-sectional design involves data collection for one time only.
   • Longitudinal design involves successive data collection over a period of time. Necessary
     if you want to study changes over time.


1.3.2 Case study designs

   • involves intense involvement with a few cases rather than limited involvement with
     many cases
   • can’t generalize results easily
   • useful in exploring ideas and generating hypotheses


1.3.3 Survey designs

   • most popular in business/management research
   • useful when you cannot control the things you want to study
   • difficult to get random and representative samples


1.3.4 Experimental designs

   •   requires a control group to allow for the placebo effect
   •   requires the experimenter to control all variables other than the variable of interest
   •   requires randomization to groups
   •   allows causation to be tested


       Which research design would you use?

       Hypotheses:
         1. Women believe they are better at managing than men.
         2. Children who listen to poetry in early childhood make better progress in
            learning to read than those who do not.
         3. A business will run more efficiently if no person is directly responsible for more
            than five other people.
         4. There are inherent advantages in businesses staying small.
         5. Employees with postgraduate qualifications have shorter job expectancy than
            employees without postgraduate qualifications.

       What data would you collect in each case?






1.4 Data structure

1.4.1 Populations and samples

A population is the entire collection of ‘things’ in which we are interested. A sample is a subset of
a population. We wish to make an inference about a population of interest based on information
obtained from a sample from that population.

EXAMPLES:

   • You measure the profit/loss of 50 public hospitals in Victoria, randomly selected.
     Population:
     Sample:
     Points of interest:
   • Sales on 500 products from one company for the last 5 years are analysed.
     Population:
     Sample:
     Points of interest:


1.4.2 Cases and variables

Think about your data in terms of cases and variables.

   • A case is the unit about which you are taking measurements. E.g., a person, a business.
   • A variable is a measurement taken on each case.
     E.g., age, score on test, grade-level, income.
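One way to picture this layout is as a table with one row per case and one column per variable. The sketch below uses invented businesses and invented variable names purely for illustration; none of these values come from a real data set.

```python
# Hypothetical data: each dict is one case (a business), each key is a variable.
cases = [
    {"business": "A", "employees": 12, "industry": "retail", "profit": 40.5},
    {"business": "B", "employees": 250, "industry": "health", "profit": -3.2},
    {"business": "C", "employees": 48, "industry": "retail", "profit": 11.0},
]

n_cases = len(cases)               # number of rows (cases)
variables = list(cases[0].keys())  # column names (variables)

# One variable measured across all cases, i.e. a single column.
profits = [case["profit"] for case in cases]

print(n_cases, variables, profits)
```

Most statistical software expects data in this shape, so it helps to decide what a case is before any data are collected.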


1.4.3 Types of Data

The ways of organizing, displaying and analysing data depend on the type of data we are
investigating.

   • Categorical Data (also called nominal or qualitative)

      e.g. sex, race, type of business, postcode
      Averages don’t make sense. Ordered categories are called ordinal data.

   • Numerical Data (also called scale, interval and ratio)

      e.g. income, test score, age, weight, temperature, time.
      Averages make sense.

Note that we sometimes treat numerical data as categories. (e.g. three age groups.)
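The distinction can be sketched in a few lines. The postcodes, ages and the cut-offs for the three age groups below are all made up for illustration; the point is only that counts suit categorical data, averages suit numerical data, and a numerical variable can be recoded into ordered categories.

```python
from collections import Counter
from statistics import mean

# Hypothetical sample: postcode is categorical, age is numerical.
postcodes = ["3800", "3168", "3800", "3000", "3800"]
ages = [23, 35, 41, 58, 67]

# Categorical data: summarise with counts, not averages.
postcode_counts = Counter(postcodes)
most_common_postcode = postcode_counts.most_common(1)[0][0]

# Numerical data: an average is meaningful.
mean_age = mean(ages)

def age_group(age):
    """Recode a numerical age into one of three ordered (ordinal) groups.
    The cut-offs 30 and 50 are arbitrary choices for illustration."""
    if age < 30:
        return "under 30"
    if age < 50:
        return "30-49"
    return "50 and over"

age_groups = [age_group(a) for a in ages]
```

Note that a postcode looks like a number, but averaging postcodes would be meaningless, which is why it is stored here as text.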






1.4.4 Response and explanatory variables

Response variable: measures the outcome of a study. Also called the dependent variable.

Explanatory variable: attempts to explain the variation in the observed outcomes.
     Also called an independent variable.
           Many statistical problems can be thought of in terms of a response
           variable and one or more explanatory variables.

   • Study of profit/loss in Victorian hospitals.
     Response variable:
     Explanatory variables:

   • Monthly sales of 500 products
     Response variable:
     Explanatory variables: competitor advertising.



1.5 The survey process
1. Planning a survey
      State the objectives: In order to state the objectives we often need to ask questions such as:
         • What is the survey’s exact purpose?
         • What do we not know and want to know?
         • What inferences do we need to draw?
      Begin by developing a specific list of information needs. Then write focused survey ques-
      tions.
2. Design the sampling procedure
      Identify the target population: Whom are we drawing conclusions about?
      Select a sampling scheme: Examples: simple random sampling, stratified random sampling,
            systematic sampling, and cluster sampling.
3. Select a survey method
      Decide how to collect the data: personal interviews, telephone interviews, mailed ques-
      tionnaires, diaries, . . .
4. Develop the questionnaire
      Write the questionnaire. Decide on the wording, types of questions, and other issues.
5. Pretest the questionnaire
      Select a very small sample from the sampling frame. Conduct the survey and see what
      goes wrong. Correct any problems before carrying out the full-scale study.
6. Conduct the survey
      Run the survey in an efficient and time-effective manner.
7. Analyze the data
      Gather the results and determine outcomes.
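The four sampling schemes named in step 2 can be sketched as follows. This is a minimal illustration using only the standard library, with an invented sampling frame of 100 units in 4 regions; the function names are hypothetical, and real surveys usually need more care (unequal stratum sizes, weights, and so on).

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical sampling frame: 100 units, each belonging to one of 4 regions.
frame = [{"id": i, "region": "R%d" % (i % 4)} for i in range(100)]

def simple_random_sample(units, n):
    """Every subset of size n is equally likely."""
    return rng.sample(units, n)

def stratified_sample(units, key, n_per_stratum):
    """Draw a simple random sample separately within each stratum."""
    strata = {}
    for u in units:
        strata.setdefault(u[key], []).append(u)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def systematic_sample(units, n):
    """Random start, then every k-th unit, where k = len(units) // n."""
    k = len(units) // n
    start = rng.randrange(k)
    return units[start::k][:n]

def cluster_sample(units, key, n_clusters):
    """Randomly choose whole clusters and keep every unit in them."""
    clusters = sorted({u[key] for u in units})
    chosen = set(rng.sample(clusters, n_clusters))
    return [u for u in units if u[key] in chosen]

srs = simple_random_sample(frame, 20)
strat = stratified_sample(frame, "region", 5)
syst = systematic_sample(frame, 20)
clus = cluster_sample(frame, "region", 2)
```

Here the same variable (region) is used once as a stratum and once as a cluster: stratified sampling takes a few units from every region, while cluster sampling takes every unit from a few regions.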






Appendix A: Case studies

Injury management in NSW

Four injury management pilots (IMP) running during 2001:

   • private hospitals and nursing homes within NSW;

   • all industry groups within the Central West NSW region;

   • two insurance companies (QBE and EML).

We wish to do a statistical comparison of the injury management pilots with the current stan-
dard injury management arrangements.

Performance measures

   • incidence of specific payment types
   • duration of claims
   • number of claims
   • proportion of claimants in receipt of weekly benefits at 4, 8, 13 and 26 weeks.
   • costs for claimants at 4, 8, 13 and 26 weeks.
        – medical, rehabilitation, physiotherapy, chiropractic
        – weekly-benefits
   • timeliness
        – number of days from injury to agent notification
        – number of days from injury to first payment

Some potential driving variables

   •   age
   •   gender
   •   injury type
   •   agency (e.g., powered tools)
   •   severity of injury
   •   medical interventions
   •   employer size
   •   insuring agency
   •   weekly pay at time of injury
   •   industry (ANZSIC code)
   •   occupation (ASCO code)

   • Driving variables affect the performance measures.
   • Variations between groups in key driver variables can induce apparent differences be-
     tween groups. This is then confused with any real differences due to the programs being
     evaluated.
   • Therefore any comparisons of groups of employees should either eliminate the effect of
     drivers or try to measure the effect of the drivers.




The ideal design!

Ideally, we would use a randomized control trial. This eliminates the effect of driving vari-
ables.

   • The control group would be employees on the old IM system.
   • The treatment group would be employees in the new IMP.
   • Employees would be randomly allocated to the two groups.
   • Statistical comparisons between the two groups would show differences between the old
     IM system and the new IMP.
   • This random allocation would prevent any systematic differences between those in the
     IMP and those not in the IMP.
   • Such a scheme is impracticable.

The actual design

We have to use pseudo-control groups and eliminate differences between the control and IMP
groups using statistical models.

   • All injuries within the specified industry group, geographical region or insurer will be
     subject to the new IMP during 2001.
   • The pseudo-controls will be the equivalent groups of employees in 2000 who are not
     subject to the new IMP.

Problem of confounding

   • If there are differences between the IMP and the control, are they due to the different IM
     program or the different group?

Solution:

   • adjust for as many driving variables as possible;
   • compare similar groups not subject to the IMP.

Comparisons undertaken

IMP group: Private hospitals/nursing homes in NSW 2001
    Pseudo-control: Private hospitals/nursing homes 2000

IMP group: Central West NSW region 2001
    Pseudo-controls: Central West NSW region 2000

IMP group: Insurance company 2001
    Pseudo-control: Insurance company 2000

Non-IMP group: Comparable industry group 2001
    Pseudo-controls: Comparable industry group 2000

Non-IMP group: Comparable NSW region 2001
    Pseudo-controls: Comparable NSW region 2000



We do not directly compare:

   • private hospitals/nursing homes with other industry groups;
   • Central West NSW region with other geographical regions.

Instead, we compare the change between 2000 and 2001 in each industry group and each geo-
graphical region.

How to interpret the results. . .

   • If all 2001 groups are different from the 2000 groups after taking into account all drivers,
     then it is likely there are changes between years not reflected in the drivers. We won’t be
     able to attribute any changes to the IMP.

   • If all IMP 2001 groups are different from the 2000 groups after taking into account all drivers,
     but the non-IMP 2001 groups are not different from the 2000 groups, then it is likely the
     changes between years are due to the IMP.
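The logic of comparing changes rather than levels can be sketched numerically. This kind of comparison is often called a difference-in-differences; that label, the figures below and the assumption that the means have already been adjusted for the driving variables are all illustrative and do not come from the study itself.

```python
# Hypothetical mean claim durations (weeks), assumed already adjusted for drivers.
mean_duration = {
    ("IMP", 2000): 10.0,
    ("IMP", 2001): 8.0,
    ("non-IMP", 2000): 10.5,
    ("non-IMP", 2001): 10.0,
}

def change(group):
    """Change from 2000 to 2001 within one group."""
    return mean_duration[(group, 2001)] - mean_duration[(group, 2000)]

imp_change = change("IMP")             # change where the pilot ran
background_change = change("non-IMP")  # change between years without the pilot

# Part of the IMP change not explained by the general year-to-year shift.
imp_effect_estimate = imp_change - background_change
```

If the non-IMP groups had changed by as much as the IMP groups, the estimate would be close to zero and the change could not be attributed to the IMP.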






Needlestick injuries

You are interested in the number and severity of needle stick injuries amongst health workers
involved in blood donation and transfusion. Work in groups of three to carefully define the
objectives of your survey. You will need to specify

   •   the objective of the survey
   •   what data are to be collected
   •   the target population
   •   the survey population
   •   the sample
   •   the data collection method
   •   potential errors which could occur in your survey.




Palliative care referrals

A few years ago, I helped the Health Department with a survey on palliative care. As part
of the study, it was necessary to study the ‘referral’ pattern for palliative care providers: how
many patients they send to hospital (for inpatient or outpatient treatment); how many they
refer to consultants for specialist comment; how many to community health programs; and so
on.

Possible sampling schemes:

  1. sample a group of palliative care practitioners and study their referral patterns;
  2. sample a group of palliative care patients and study their referral patterns.

Discuss the possible advantages and disadvantages of the two schemes.




Chapter 2. Data collection

2.1 Introduction

           “You don’t have to eat the whole ox to know that the meat is tough.”
                                                                              Samuel Johnson


Sampling is very familiar to all of us, because we often reach conclusions about phenomena
on the basis of a sample of such phenomena. You may test a swimming pool’s temperature by
dipping your toe in the water or the performance of a new vehicle by a short test drive. These
are among the countless small samples that we rely on when making personal decisions. We
tend to use haphazard methods in picking our sample and risk substantial sampling error.

Research also usually reaches its conclusions on the basis of sampling, but the methods used
must adhere to certain rules that are going to be discussed. The goal in obtaining data through
survey sampling is to use a sample to make precise inferences about the target population. We
want to be highly confident about our inferences. It is important to have a substantial grasp
of sampling theory to appraise the reliability and validity of the conclusions drawn from the
sample taken.



2.2 Data collecting instruments

The choice of data collection instrument is crucial to the success of the survey. When deter-
mining an appropriate data collection method, many factors need to be taken into account,
including complexity or sensitivity of the topic, response rate required, time or money avail-
able for the survey and the population that is to be targeted. Some of the most common data
collection methods are described in the following sections.




Part 2. Data collection


2.2.1 Interviewer enumerated surveys

Interviewer enumerated surveys involve a trained interviewer going to the potential respon-
dent, asking the questions and recording the responses.

The advantages of using this methodology are:
   •   provides better data quality
   •   special questioning techniques can be used
   •   greater rapport established with the respondent
   •   allows more complex issues to be included
   •   produces higher response rates
   •   more flexibility in explaining things to respondents
   •   greater success in dealing with language problems

The disadvantages of using this methodology are:
   •   expensive to conduct
   •   training for interviewers is required
   •   more intrusive for the respondent
   •   interviewer bias may become a source of error


2.2.2 Web surveys

Web surveys are increasingly popular, although care must be taken to avoid sample selection
bias and multiple responses from an individual.

The advantages of this methodology are:
   •   cheap to administer
   •   private and confidential
   •   easy to use conditional questions and to prompt if no response or inappropriate response.
   •   can build in live checking.
   •   can provide multiple language versions

The disadvantages of this methodology are:
   •   respondent bias may become a source of error
   •   not everyone has access to the internet
   •   language and interface must be very simple
   •   cannot build up a rapport with respondents
   •   resolution of queries is difficult
   •   only appropriate when straightforward data can be collected


2.2.3 Mail surveys

Self-enumeration mail surveys are where the questionnaire is left with the respondent to com-
plete.

The advantages of this methodology are:



   • cheaper to administer
   • more private and confidential
   • in some cases does not require interviewers

The disadvantages of this methodology are:
   •   difficult to follow-up non-response
   •   respondent bias may become a source of error
   •   response rates are much lower
   •   language must be very simple
   •   problems with poor English and literacy skills
   •   cannot build up a rapport with respondents
   •   resolution of queries is difficult
   •   only appropriate when straightforward data can be collected


2.2.4 Telephone surveys

A telephone survey is the process where a potential respondent is phoned and asked the survey
questions over the phone.

The advantages of this methodology are:
   • cheap to administer
   • convenient for interviewers and respondents

The disadvantages of this methodology are:
   •   interviews easily terminated by respondent
   •   cannot use prompt cards to provide alternatives for answers
   •   burden placed on interviewers and respondents
   •   sample is biased towards households with phones


2.2.5 Diaries

Diaries can be used as a format for a survey. In these surveys respondents are directed to record
the required information over a predetermined period in the diary, book or booklet supplied.

The advantages of this methodology are:
   • high quality and detailed data from the completed diaries
   • more private and confidential circumstances for the respondent
   • does not require interviewers

The disadvantages of this methodology are:
   •   response rates are lower and the diaries are rarely completed well
   •   language must be simple
   •   can only include relatively simple concepts
   •   cannot build up a rapport
   •   cannot explain the purpose of survey items to respondents




                                                       Face-to-face   Telephone      Mail
        Response rates                                 Good           Good           Good

        Representative samples
        Avoidance or refusal bias                      Good           Good           Poor
        Control over who completes the questionnaire   Good           Good           Satisfactory
        Gaining access to the selected person          Satisfactory   Good           Good
        Locating the selected person                   Satisfactory   Good           Good

        Effects on questionnaire design
        Ability to handle:
           Long questionnaires                         Good           Satisfactory   Satisfactory
           Complex questions                           Good           Poor           Satisfactory
           Boring questions                            Good           Satisfactory   Poor
           Item non-response                           Good           Good           Satisfactory
           Filter questions                            Good           Good           Satisfactory
           Question sequence control                   Good           Good           Poor
           Open ended questions                        Good           Good           Poor

        Quality of answers
        Minimize socially desirable responses          Poor           Satisfactory   Good
        Ability to avoid distortion due to
           Interviewer characteristics                 Poor           Satisfactory   Good
           Interviewer opinions                        Satisfactory   Satisfactory   Good
           Influence of other people                    Satisfactory   Good           Poor
        Allows opportunities to consult                Satisfactory   Poor           Good
        Avoids subversion                              Poor           Satisfactory   Good

        Implementing the survey
        Ease of finding suitable staff                  Poor           Good           Good
        Speed                                          Poor           Good           Satisfactory
        Cost                                           Poor           Satisfactory   Good

Table 2.1: Advantages and disadvantages of three methods of data collection. Table taken from de Vaus
(2001) who adapted it from Dillman (1978).


2.2.6 Ideas for increasing response rates

   1.   Provide a reward.
   2.   Follow up systematically.
   3.   Keep it short.
   4.   Choose an interesting topic.






2.2.7 Archival data

Rather than collecting your own data, you may use some existing data. If you do, keep the
following points in mind.

Available information Is there sufficient documentation of the original research proposal for
     which the data were collected? If not, there may be hidden problems in re-using the data.

Geographical area Are the data relevant to the geographical area you are studying? e.g., what
    country, city, state or other area does the archive data cover?

Time period Are the data relevant to the time period you are studying? Does your research
     area cover recent events, or is it historical or does it look at changes over a specified range
     of time? Most data are at least a year old before they are released to the public.

Population What population do you wish to study? This can refer to a group or groups of
    people, particular events, official records, etc. In addition you should consider whether
    you will look at a specific sample or subset of people, events, records, etc.

Context Does the archival data contain the information relevant to your research area?



2.3 Errors in statistical data

In sample surveys there are two types of error that can occur:

   • sampling error, which arises because only a part of the population is used to represent
     the whole population; and
   • non-sampling error, which can occur at any stage of a sample survey.

It is important to be aware of these errors so that they can be minimized.


2.3.1 Sampling error

Sampling error is the error we make in selecting samples that are not representative of the
population. Since it is practically impossible for a smaller segment of a population to be exactly
representative of the population, some degree of sampling error will be present whenever we
select a sample. It is important to consider sampling error when publishing survey results as
it gives an indication of the accuracy of the estimate and therefore reflects the importance that
can be placed on interpretations.

If sampling principles are carefully applied within the constraints of available resources, sam-
pling error can be accurately measured and kept to a minimum. Sampling error is affected
by:

   • sample size
   • variability within the population
   • sampling scheme



Generally larger sample sizes decrease sampling error. To halve the sampling error the sample
size has to be increased fourfold. In fact, sampling error can be completely eliminated by
increasing the sample size to include every element in the population.

Population variability also affects the error: more variable populations give rise to larger
errors, because estimates calculated from different samples are more likely to vary widely.
The effect of variability within the population can be reduced by increasing the sample
size to make it more representative of the target population.
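These statements can be checked against the usual formula for the standard error of a sample mean under simple random sampling, sigma / sqrt(n), multiplied by the finite population correction sqrt((N - n) / (N - 1)) when sampling without replacement. The formula is brought in here as a standard result rather than taken from these notes, and the function name and numbers are illustrative only.

```python
import math

def standard_error(sigma, n, population_size=None):
    """Standard error of a sample mean under simple random sampling.

    sigma: population standard deviation
    n: sample size
    population_size: if given, apply the finite population correction
                     for sampling without replacement.
    """
    se = sigma / math.sqrt(n)
    if population_size is not None:
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se

# Quadrupling the sample size halves the standard error.
ratio = standard_error(10.0, 400) / standard_error(10.0, 100)

# Doubling the population variability doubles the standard error.
variability_ratio = standard_error(20.0, 100) / standard_error(10.0, 100)

# A complete enumeration (n equal to the population size) leaves no sampling error.
census_se = standard_error(10.0, 5000, population_size=5000)
```

The square root is why precision is expensive: each further halving of the sampling error requires four times as many respondents.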


2.3.2 Non-sampling error

Non-sampling error can be defined as those errors in a survey that are not sampling errors.
Non-sampling error is any error not caused by the fact that we have only selected part of
the population in the survey. Even if we were to undertake a complete enumeration of the
population, non-sampling errors might remain. In fact, as the size of the sample increases, the
non-sampling errors may get larger, because of such factors as a possible increase in the
non-response rate, interviewer errors, and data processing errors.

For the most part we cannot measure the effect that non-sampling errors will have on the re-
sults. Because of their nature, these errors may not be totally eliminated. Perhaps the biggest
source of non-sampling error is a poorly designed questionnaire. The questionnaire can in-
fluence the response rate achieved in the survey, the quality of responses obtained and conse-
quently the conclusions drawn from survey results.

Some common sources of non-sampling error are discussed in the following paragraphs.

Target Population
     Failure to identify clearly who is to be surveyed. This can result in an inadequate sam-
     pling frame; imprecise definitions of concepts and poor coverage rules.
Non-response
    A non-response error occurs when the respondents do not reflect the sampling frame.
    This could occur when the people who do not respond to the survey differ from the people
    who do respond. This often occurs in voluntary response polls. For example,
    suppose that in an air bag study we asked respondents to call a 0018 number to
    be interviewed. Because a 0018 call cost $2 per minute, many drivers may not respond.
    Furthermore, those who do respond may be the people who have had bad experiences
    with air bags. Thus the final sample of respondents may not even represent the sampling
    frame.
     For example,
        • telephone polls miss those people without phones
        • household surveys miss homeless, prisoners, students in colleges, etc.
        • train surveys only target public transport users and tend to include regular public
          transport users.
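A small simulation shows how voluntary response can distort an estimate, loosely following the air bag example. Everything in it is assumed for illustration: the population size, the 20% true rate of bad experiences, and the response probabilities of 0.5 and 0.05.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical population of 10,000 drivers; about 20% have had a bad experience.
population = [rng.random() < 0.2 for _ in range(10_000)]

def responds(bad_experience):
    """Assumed response behaviour: unhappy drivers are far more likely to call."""
    p = 0.5 if bad_experience else 0.05
    return rng.random() < p

respondents = [person for person in population if responds(person)]

true_rate = sum(population) / len(population)
observed_rate = sum(respondents) / len(respondents)

print(round(true_rate, 2), round(observed_rate, 2))
```

Under these assumed probabilities, the share of respondents reporting a bad experience is expected to be about 0.1 / 0.14, or roughly 70%, far above the true 20%. Collecting more responses of the same kind would not remove this bias.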






        Manufacturers and advertising agencies often use interviews at shopping malls to
        gather information about the habits of consumers and the effectiveness of ads. A
        sample of mall shoppers is fast and cheap. “Mall interviewing is being propelled
        primarily as a budget issue”, one expert told the New York Times. But people con-
        tacted at shopping malls are not representative of the entire population. They are
        richer, for example, and more likely to be teenagers or retired. Moreover, mall inter-
        viewers tend to select neat safe looking individuals from the stream of customers.
        Decisions based on mall interviews may not reflect the preferences of all consumers.


        In 1991 it was claimed that data showed that right-handed persons live on average
        almost a decade longer than left-handed or ambidextrous persons. The investigators
        had compared mean ages at death of people who were reported by survivors to be left,
        right or mixed handed.
           • What is the problem?
The questionnaire
     Poorly designed questionnaires with mistakes in wording, content or layout may make it
     difficult to record accurate answers. The most effective methods of designing a question-
     naire are discussed in Section 2.4. If these principles are followed it will help reduce the
     non-sampling error associated with the questionnaire.
Interviewers
      If an interviewer is used to administer the survey, their work has the potential to produce
      non-sampling error. This can be due to the personal characteristics of the interviewer.
      For example, an elderly person will often be more comfortable giving information to a
      female interviewer. Other factors which could cause error are the interviewer’s opinions
      and characteristics which may influence the respondent’s answers.

     In 1968, one year after a major racial disturbance in Detroit, a sample of black resi-
     dents was asked:
           Do you personally feel that you can trust most white people, some white people,
           or none at all?
     Of those interviewed by whites, 35% answered “Most”, while only 7% of those in-
     terviewed by blacks gave this answer. Many questions were asked in this study.
     Only on some topics, particularly black-white trust or hostility, did the race of the
     interviewer have a strong effect on the answers given. The interviewer was a large
     source of non-sampling error in this study.

Respondents
    Respondents can also be a source of non-sampling error. They may refuse to answer ques-
    tions, or provide inaccurate information to protect themselves. They may have memory
    lapses and/or lack of motivation to answer the questionnaire, particularly if the ques-
    tionnaire is lengthy, overly complicated or of a sensitive nature. Respondent fatigue is a
    very important factor.

     Social desirability bias refers to the effect where respondents will provide answers which
     they think are more acceptable, or which they think the interviewer wants to hear. For
     example, respondents may state that they have a higher income than is actually the case
     if they feel this will increase their status.




     Respondents may refuse to answer a question which they find embarrassing or choose
     a response which prevents them from continuing with the questions. For example, if
     asked the question: “Are you taking oral contraceptive pills for any reason?”, and know-
     ing that if they respond “Yes” they will be asked for more details, respondents who are
     embarrassed by the question are likely to answer “No”, even if this is incorrect.

     Fatigue can be a problem in surveys which require a high level of commitment for respon-
     dents. The level of accuracy and detail supplied may decrease as respondents become
     tired of recording all information. Sometimes interviewer fatigue can also be a problem,
     particularly when the interviewers have a large number of interviews to conduct.

Processing and collection
     Processing and collection errors can be a source of non-sampling error. For example,
     the results from the survey may be entered incorrectly. The time of year the survey is
     enumerated can produce non-sampling error. For example, if the survey is conducted in
     the school holidays, potential respondents with school children could possibly be away
     or hard to contact.

The Shere Hite surveys

In 1987, Shere Hite published a best-selling book called Women and Love. The author distributed
100,000 questionnaires through various women’s groups, asking questions about love, sex, and
relations between women and men. She based her book on the 4.5% of questionnaires that were
returned.

   • 95% said they were unhappily married
   • 91% of those who were divorced said that they had initiated the divorce

What are the problems with this research?

     Exercise 1: In Case 2, it was necessary to study the ‘referral’ pattern for palliative
          care providers: how many patients they send to hospital (for inpatient or out-
          patient treatment); how many they refer to consultants for specialist comment;
          how many to community health programs; and so on. Two alternative sam-
          pling schemes are available: sample a group of palliative care practitioners
          and study their referral patterns; or sample a group of palliative care patients
          and study their referral patterns. Discuss the possible advantages and disad-
          vantages of the two schemes.



2.4 Questionnaire design

2.4.1 Introduction

The purpose of a questionnaire is to obtain specific information with tolerable accuracy and
completeness. Before the questionnaire is designed, the collection objectives should be defined.
These include:





   •   clarifying the objectives of the survey
   •   determining who is to be interviewed
   •   defining the content
   •   justifying the content
   •   prioritizing the data that are to be collected. This is important as it makes it easier to
       discard items if the survey, once developed, is too lengthy.

Careful consideration should be given to the content, wording and format of the questionnaire
as one of the largest sources of non-sampling error is poor questionnaire design. This error can
be minimized by considering the objectives of the survey and the required output, and then
devising a list of questions that will accurately obtain the information required.


2.4.2 Content of the questionnaire

Relevant questions

It is important to ask only questions that are directly related to the objectives of a survey as a
means of minimizing the burden placed on respondents. The concept of a fatigue point, which
occurs when respondents can no longer be bothered answering questions, should be recognized,
and questions designed so that the respondent is through the form before this point is reached.

Towards the end of long questionnaires, respondents may give less thought to their answers
and concentrate less on the instructions and questions, thereby decreasing the accuracy of in-
formation they provide. Very long questionnaires can also lead the respondent to refuse to
complete the questionnaire. Hence it is necessary to ensure only relevant questions are asked.

Reliable questions

It is important to include questions in a questionnaire that can be easily answered. This objec-
tive can be achieved by adhering to the following techniques.

Appropriate recall If information is requested by recall, the events should be sufficiently recent
     or familiar to respondents. People tend to remember what they should have done, have
     selective memories, and shift activities from around the reference period into it.
     Minimizing the need for recall improves the accuracy of response.

Common reference periods To make it easier for the respondent to answer, use reference periods
    which match those of the respondent’s records.

Results justify efforts The amount of effort to which a respondent goes to obtain the data must
      be worth it. It is reasonable to accept a respondent’s estimate when calculating the exact
      figures would make little difference to the outcome.

Filtering Respondents should not be asked questions they cannot answer. Filter questions should
       be asked to exclude respondents from irrelevant questions.
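A filter question works like a branch in the questionnaire. The sketch below uses an invented four-question form about car ownership; the question ids and wording are hypothetical and serve only to show how a filter routes respondents past questions that do not apply to them.

```python
def run_questionnaire(answers):
    """Walk through a tiny hypothetical questionnaire with one filter question.

    answers maps question ids to the respondent's replies; returns the list of
    question ids that were actually asked.
    """
    asked = ["Q1"]                  # Q1: Do you own a car? (filter question)
    if answers.get("Q1") == "yes":
        asked.append("Q2")          # Q2: How old is your car?
        asked.append("Q3")          # Q3: How often do you drive it?
    asked.append("Q4")              # Q4: asked of everyone
    return asked

car_owner_path = run_questionnaire({"Q1": "yes"})
non_owner_path = run_questionnaire({"Q1": "no"})
```

On a paper form the same routing is written as an instruction such as "If No, go to Q4".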






2.4.3 Types of questions

Factual questions
     Information is required from these questions rather than an opinion. For example,
     respondents could be asked about behaviour patterns (e.g., When did you last visit a
     General Practitioner?).

Classification or demographic questions
     These are used to gain a profile of the population that has been surveyed and provide
     important data for analysis.

Opinion questions
    Rather than facts, these questions seek opinion. There are many problems associated with
    opinion questions:

        • a respondent may not have an opinion/attitude towards the subject so the response
          may be provided without much thought;
        • opinion questions are very sensitive to changes in wording;
        • it is impossible to check the validity of responses to opinion questions.

Hypothetical questions
    The “What would you do if . . . ?” type of question. The problems with these questions
    are similar to those with opinion questions. You can never be certain how valid any answer
    to a hypothetical question is likely to be.


2.4.4 Answer formats

Questions can generally be classified as one of two types, open or closed, depending on the
amount of freedom allowed in answering the question. When deciding which type of question
to use, consideration should be given to the kind of information sought, ease of processing the
response, and the availability of the resources of time, money, and personnel.

Open questions

Open questions allow the respondents to answer the question in their own words. These
questions allow many possible answers, and they can collect exact values from a wide range of
possible values. Hence, open questions are used when the list of responses is very long and not
obvious.

The major disadvantage of open questions is that they are far more demanding than closed
questions, both to answer and to process. These questions are most commonly used where a wide
range of responses is expected. Also, the answers to these questions depend on the
respondents’ ability to write or speak as much as on their knowledge. Two respondents might
have the same knowledge and opinions, but their answers may seem different because of their
varying abilities.





     Question                                               Format

     Which country makes the best cars?                     Open ended
     ...............................................


     Which country makes the best cars?                     Multiple choice questions
          1. USA 2. Germany 3. Japan


     Which country makes the best cars?                     Partially closed questions
          1. USA 2. Germany 3. Japan
          4. Other (please specify)


     For the list provided, indicate which brand/s of       Checklist questions
     cars you have owned?
            1. Ford 2. Toyota 3. BMW


     I believe Japanese cars are less reliable than         Likert scale (opinion) questions
     European cars.
      Strongly Agree   Agree    No opinion   Disagree   Strongly disagree
             1            2         3            4              5



Closed questions

Closed questions ask the respondents to choose an answer from the alternatives provided.
These questions should be used when the full range of responses is known. Closed questions
are far easier to process than open questions. The main disadvantage of closed questions is that
the reasons behind a particular selection cannot be determined.

There are a number of types of closed questions.

   • Limited choice questions require the respondent to choose one of two mutually exclusive
     answers, for example yes/no.
   • Multiple choice questions require the respondent to choose from a number of responses
     provided.
   • Checklist questions allow a respondent to choose more than one of the responses pro-
     vided.
   • Partially closed questions provide a list of alternatives where the last alternative is “Other,
     please specify”. These questions are useful when it is difficult to list all possible choices.
   • Opinion (Likert) scale questions seek to locate a respondent’s opinion on a rating scale
     with a limited number of points. For example, a five-point scale measuring strong and
     weak attitudes would ask the respondent whether they strongly agree/agree/are
     neutral/disagree/strongly disagree with a particular statement of opinion, whereas a
     three-point scale would only measure whether they agree, disagree or are neutral.
     Opinion scales of this sort are called Likert scales.
     Five point scales are best because:
        –
        –
        –

Response Categories

When questions have categories provided, it is important that every response is catered for.
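
For numeric answers, one way to check that every response is catered for is to verify that the categories neither overlap nor leave gaps. The sketch below is only an illustration: the function name check_categories and both sets of categories are hypothetical and assume a question that asks for a whole-number count.

```python
import math

def check_categories(categories, lowest=0):
    """Check integer response categories for gaps and overlaps.

    categories: list of (label, low, high) with inclusive bounds;
    high may be math.inf for an open-ended top category.
    Returns a list of problem descriptions (empty if none are found).
    """
    problems = []
    expected = lowest  # smallest value not yet covered by any category
    for label, low, high in sorted(categories, key=lambda c: c[1]):
        if low > expected:
            problems.append(f"gap: no category for {expected} to {low - 1}")
        elif low < expected:
            problems.append(
                f"overlap: {label!r} starts at {low}, "
                f"but values up to {expected - 1} are already covered"
            )
        expected = max(expected, high + 1)
    if expected != math.inf:
        problems.append(f"gap: no category for {expected} or more")
    return problems

# Hypothetical categories for a weekly count question.
good = [("None", 0, 0), ("1–3", 1, 3), ("4–6", 4, 6), ("More than 6", 7, math.inf)]
# Faulty version: 3 appears in two categories and there is no top category.
faulty = [("None", 0, 0), ("1–3", 1, 3), ("3–6", 3, 6)]

good_problems = check_categories(good)
faulty_problems = check_categories(faulty)
```

Here good_problems is empty, while faulty_problems reports one overlap and one gap. A check of this kind covers only numeric ranges; categories for opinions or descriptions still need to be reviewed by hand and through pretesting.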

Number of Categories
    The quality of the data can be influenced if there are too few categories as the respondent
    may have difficulty finding one which accurately describes their situation. If there are too
    many categories the respondent may also have difficulty finding one which accurately
    describes their situation.

Don’t Know A ‘Don’t Know’ category can be included so respondents are not forced to make
      decisions or express attitudes that they would not normally make or hold. Excluding the
      option is not usually good; however, it is hard to predict the effect of including it. The
      decision of whether or not to include a ‘Don’t Know’ option depends, to a large extent,
      on the subject matter.
          I was gratified to be able to answer promptly, and I did. I said I didn’t know.
                                                        Mark Twain, Life on the Mississippi


2.4.5 Wording of questions

Language

Questions which employ complex or technical language or jargon can confuse or irritate
respondents. Respondents who do not understand the question may be unwilling to appear
ignorant by asking the interviewer to explain the question or, if an interviewer is not present,
may not answer or may answer incorrectly.

Ambiguity

If ambiguous words or phrases are included in a question, the meaning may be interpreted
differently by different people. This will introduce errors in the data since different respondents
will virtually be answering different questions.

For example “Why did you fly to New Zealand on Qantas airlines?”. Most might interpret
this question as was intended, but it contains three possible questions, so the response might
concern any of these:

   • I flew (rather than another mode of travel) because . . .
   • I went to New Zealand because . . .
   • I selected Qantas because . . .





Double-barreled questions

When one question contains two concepts, it is known as a double-barreled question. For
example, “How often do you go grocery shopping and do you enjoy it?”.

Each concept in the question may have a different answer, or one concept may not be relevant,
so respondents may be unsure how to respond. The interpretation of the answers to these
questions is almost impossible. Double-barreled questions should be split into two or more
separate questions.

Leading questions

Questions which lead respondents to answers can introduce error. For example, the question
“How many days did you work last week?”, if asked without first determining whether
respondents did in fact work in the previous week, is a leading question. It implies that
the person would have been at work. Respondents may answer incorrectly to avoid telling the
interviewer that they were not working.

Unbalanced questions
“Are you in favour of euthanasia?” is an unbalanced question because it provides only one
alternative. It can be reworded to “Do you favour or not favour euthanasia?” to give respondents
more than one alternative.
Similarly, the use of a persuasive tone can affect the respondent’s answers. Wording should be
chosen carefully to avoid a tone that may produce bias in responses.

Recall/memory error
Respondents tend to remember what should have been done rather than what was done. The
quality of data collected from recall questions is influenced by the importance of the event to
the respondent and the length of time since the event took place. Subjects of greater interest or
importance to the respondent, or events which happen infrequently, will be remembered over
longer periods and more accurately. Minimizing the recall period also helps to reduce memory
bias.
Telescoping is a specific type of memory error. This occurs if the respondent reports events
as occurring either earlier or later than they actually occurred. Error arises when respondents
include details of an event which actually occurred outside the specified reference period.

Sensitive questions
Questions on topics which respondents may see as embarrassing or highly sensitive can produce
inaccurate answers. If respondents are required to answer questions with information
that might seem socially undesirable, they may provide the interviewer with responses they
believe are more ‘acceptable’. If placed at the beginning of the questionnaire, such questions
could lead to non-response if respondents are unwilling to continue with the remaining questions.
For example, “Approximately how many cans of beer do you consume each week, on aver-
age?”
            1. None
            2. 1–3 cans
            3. 4–6 cans
            4. More than 6
A respondent might answer response 2 or 3 rather than admit to consuming the greatest quan-
tity on the scale. Consider extending the range of choices far beyond what is expected. The
respondent can select an answer closer to the middle and feel more in the normal range.

     In 1980, the New York Times/CBS News Poll asked a random sample of Americans
     about abortion. When asked “Do you think there should be an amendment to the
     Constitution prohibiting abortions, or should not there be such an amendment?”
     29% were in favour and 62% were opposed. The rest of the sample were uncer-
     tain. The same people were later asked a different question: “Do you believe there
     should be an amendment to the Constitution protecting the life of the unborn child,
     or should not there be such an amendment?” Now 50% were in favour and only
     39% were opposed.

Acquiescence

This situation arises when there is a long series of questions for which respondents answer
with the same response category. Respondents get used to providing the same answer and
may answer inaccurately.


2.4.6 Questionnaire format

Including an introduction

It can be advantageous to include an introductory statement or explanation at the beginning of
a survey. The introduction may include such information as the purpose of the survey or the
scope of collection. It will aid the respondent when answering the questions if they know why
the information is being sought. The respondent should be given a context in which to frame
his or her answers. An assurance of confidentiality will provide respondents with confidence
that the results will not be obtained by unwanted parties.

Question and page numbers

To ensure that the questionnaire can be easily administered by interviewer or respondents, the
pages of the questionnaire and the questions should be numbered consecutively with a simple
numbering system. Question numbering is a way of providing sign-posts along the way. They
help if remedial action is required later, and you want to refer the interviewer or respondent
back to a particular place.

Sequencing

The questions in a questionnaire should follow an order which is logical and smoothly flows
from one question to the next. The questionnaire layout should have the following character-
istics.




Related questions grouped
      Questions which are related should be grouped together and where necessary placed into
      sections. Sections should contain an introductory heading or statement.

     If possible, question ordering should try and anticipate the order in which respondents
     will supply information. It shows good survey design if a question not only prompts an
     answer but also prompts an answer to a question following shortly.

Question ordering
     It is important to be aware that earlier questions can influence the responses of later ques-
     tions, so the order of questions should be carefully decided. In attitudinal questions, it
     is important to avoid conditioning respondents in an early question which could then
     bias their responses to later questions. For example, you should ask about awareness of
     a concept before any other mention of the concept.

Respondent motivation

Whenever possible, start the questionnaire with easy and pleasant questions to promote inter-
est in the survey and give the respondent confidence in their ability to complete the survey.
The opening questions should ensure that the particular respondent is a member of the survey
population.

Questions that are perceived as irritating or obtrusive tend to get a low response rate and
may effectively trigger a refusal from the respondent. These questions need to be carefully
positioned in a questionnaire where they are least likely to be sensitive.

It is also important that respondents are only asked relevant questions. Respondents may become
annoyed and lose interest if this does not occur. Include filter questions to direct
respondents to skip questions which do not apply to them. Filter questions often identify
sub-populations. For example,

                 “Do you usually speak English at home?”       Yes (Go to Q34)
                                                               No (Go to Q10)
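
For computer-assisted questionnaires, the routing implied by a filter question can be written as a small lookup table. The following is a minimal sketch under stated assumptions, not the design of any particular survey package: the question number Q9 for the filter question, the 40-question form, and the function next_question are all hypothetical.

```python
def next_question(current, answer, routing, order):
    """Return the id of the next question to ask, or None at the end of the form.

    routing maps a filter question id to {answer: question id to jump to};
    questions without a routing entry simply lead to the next one in order.
    """
    jumps = routing.get(current, {})
    if answer in jumps:
        return jumps[answer]
    position = order.index(current)
    if position + 1 < len(order):
        return order[position + 1]
    return None

# Assumed form of 40 questions, Q1 to Q40, asked in numerical order.
order = [f"Q{i}" for i in range(1, 41)]

# Assumption: the filter question "Do you usually speak English at home?"
# is Q9, so "Yes" skips Q10 to Q33 and "No" continues with Q10.
routing = {"Q9": {"Yes": "Q34", "No": "Q10"}}

after_yes = next_question("Q9", "Yes", routing, order)
after_no = next_question("Q9", "No", routing, order)
```

Keeping the routing in one table, rather than scattered through the form logic, makes it easier to review the skips during pretesting.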

Questionnaire layout

The questionnaire layout should be aesthetically pleasing, so the layout does not contribute to
respondent fatigue. Things that can interfere with the answering of a questionnaire are: unclear
instructions and questions, insufficient space to provide answers, hard-to-read text, difficulty
in understanding language, and back-tracking through the form. Many of these things result from
bad form design and are avoidable.

Only include essentials on the questionnaire form. Keep the amount of ink on the form to the
minimum necessary for the form to work properly. Anything that is not necessary contributes
to the fatigue point of the respondent and to the subsequent detriment of the data quality.





General layout

Consistency of layout: If consistency and logical patterns are introduced into the form design, it
     eases the form filler’s task. Patterns that can be useful are:

         •   white spaces for responses
         •   using the same question type throughout the form
         •   using the same layout throughout the form
         •   using a different style, consistently, for instructions or directions.

Type Size: A font size between 10 and 12 is considered the best in most circumstances. If the
      respondent does not have perfect vision, or ideal working conditions, small fonts can
      cause problems.

Use of all upper-case text: It is best to avoid upper case text. Upper case text has been shown to
      be hard to read, especially where large amounts of text are involved. Words lose their
      shape when in upper case, becoming converted to rectangles. Text in upper case should
      be reserved for titles or for emphasis, but this can often be done just as well using other
      methods, such as bold, italics, or slightly larger type size.

Line length: As the eye has a clear focus range of only a few degrees, lines should be kept short.
      It takes the eye several movements to scan a line of text. If more than 2 or 3 such
      movements occur then the eye can become fatigued. There is also a tendency for the eye to
      lose track of which line it is reading. This leads to backtracking through the text or
      misinterpretation.

Character and line spacing: It is very important to leave enough space on a form for answers.
     Research has shown that forms requiring handwritten responses need a distance
     of 7–8 mm between lines and a width of 4–5 mm for each possible character.

Response layout

Obtaining responses: A popular way of obtaining responses is using tick boxes. However, it is
     usually preferable to use a labelled list (e.g., a, b, c, . . . ) and ask respondents to circle their
     response. This makes coding and data entry easier.

      If a written response is required it is best to provide empty answer spaces, with lines
      made up of dots.

Positioning of responses: Vertical alignment of responses is preferred to horizontal alignment. It
      is easier to read up and down the list, and select the correct box, than read across the page
      and locate an item in a horizontal string. Captions to the left of the answer box are easier
      for respondents to complete.

Order of response options: The order of response options is important because it
     can be a source of bias. The options presented first may be selected because they make
     an impact on respondents or because respondents lose concentration and do not hear or
     read the remaining options. The last options may be chosen because they are easily recalled,
     particularly if respondents are faced with a long list of options. Long or complex response
     options may also make recall more difficult and increase the effects due to the order of
     response options.

Prompt card: If the questionnaire is interviewer based, and a number of response options are
     given for some questions, then a prompt card may be appropriate. A prompt card is a list
     of possible responses to a question, displayed on a separate card which is shown by the
     interviewer to assist respondents. This helps to decrease error resulting from respondents
     being unable to remember all the options read out. However, respondents with poor
     eyesight, migrants with limited English, or adults with literacy problems will experience
     difficulties in answering accurately.

     Exercise 2: (Case 2) The questionnaire on pages 47–48 was an early draft of the
          questionnaire prepared by the client. The questionnaire on pages 49–51 is a
          later draft of the questionnaire after I had provided the client with some advice.
          See if you can determine why each of the changes has been made. How could
          you further improve the questionnaire?


2.4.7 Pretesting the questionnaire

A pretest of a questionnaire should be considered mandatory. Although the designer of the
questionnaire would have reviewed the drafted questionnaire meticulously on all points of
good design, it is still likely to contain faults. Normally, a number of these emerge when the
form is used in the field, because the researcher did not completely anticipate what would take
place. The only way that these faults may be fully detected is by actually administering the
survey with the types of respondents who would be sampled in the study.

Each type of testing is used at a different stage of survey development and aims to test different
aspects of the survey.

Skirmishing
     Skirmishing is the process of informally testing questionnaire design with groups of re-
     spondents. The questionnaire is basically unstructured and is tested with a group of
     people who can provide feedback on issues such as each question’s frame of reference,
     the level of knowledge needed to answer the questions, the range of likely answers to
     questions and how answers are formulated by respondents. Skirmishing is also used to
     detect flaws or awkward wording of questionnaires as well as testing alternative designs.
     At this stage we may use open-ended response categories to work out likely responses.
     The questionnaire should be redrafted after skirmishing.

Focus groups
     A skirmish tests the questionnaire design against general respondents whilst focus groups
     concentrate on a specific audience. For example, a survey studying the effects of living
     on unemployment benefits could have a group of unemployed people as a focus group.

     A focus group can be used to test questions directed at small sub-populations. For example,
     if we were looking at community services we may have a filter question to target
     disabled people. Since there may not be many disabled people chosen in the sample, we need
     to test the questions on a focus group of disabled people, which is a biased sample.




Observational studies
    Respondents complete a draft questionnaire in the presence of an observer during an
    observational study. Whilst completing the form the respondents explain their under-
    standing of the questions and the method required in providing the information. These
    studies can be a means of identifying problem questions through observations, questions
    asked by the respondents, or the time taken to complete a particular question. Data avail-
    ability and the most appropriate person to supply the information can also be gauged
    through observational studies. The form is being tested and not the respondent and this
    should be stressed to the respondent.

Pilot testing
      Pilot testing involves formally testing a questionnaire or survey with a small representative
      sample of respondents. Semi-closed questions are usually used in pilot testing to
      gather a range of likely responses which are used to develop a more highly structured
      questionnaire with closed questions. Pilot testing is used to identify any problems
      associated with the form, such as questionnaire format, length, and question wording, and
      allows comparison of alternative versions of a questionnaire.



2.5 Data processing

Data processing involves translating the answers on a questionnaire into a form that can be
manipulated to produce statistics. In general, this involves coding, editing, data entry, and
monitoring the whole data processing procedure. The main aim of checking the various stages
of data processing is to produce a file of data that is as error free as possible.


2.5.1 Data coding

Up to this point, the questionnaire has been considered mainly as a means of communication
with the respondent. Just as important, the questionnaire is a working document for the trans-
fer of data on to a computer file. Consequently it is important to design the questionnaire to
facilitate data entry.

Unless all the questions on a questionnaire are “closed” questions, some degree of coding is
required before the survey data can be sent for punching. The appropriate codes should be de-
vised before the questionnaires are processed, and are usually based on the results of pretesting.

Coding consists of labelling the responses to questions (using numerical or alphabetic codes) in
order to facilitate data entry and manipulation. Codes should be simple and easy to apply.
For example, if Question 1 has four responses then those four responses could be given
the codes a, b, c, and d. The advantage of coding is the simple storage of data as a few-digit
code, compared to lengthy alphabetical descriptions which almost certainly will not be easy to
categorize.
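
As an illustration of a coding frame, the sketch below maps written answers to the codes a, b, c, and d, and sets aside anything it cannot code for a coder to review. It is a minimal example under stated assumptions: the response labels, the frame CODING_FRAME, and the function code_response are hypothetical and not taken from any survey in these notes.

```python
# Hypothetical coding frame: question id -> {response label: code}.
CODING_FRAME = {
    "Q1": {"USA": "a", "Germany": "b", "Japan": "c", "Other": "d"},
}

def code_response(question, answer, frame=CODING_FRAME):
    """Return the code for a written answer, or None if it is not in the frame.

    Matching ignores case and surrounding spaces, so that " germany"
    and "Germany" receive the same code.
    """
    codes = {label.lower(): code for label, code in frame[question].items()}
    return codes.get(answer.strip().lower())

raw_answers = ["Japan", " germany", "USA", "Sweden"]
coded = [code_response("Q1", answer) for answer in raw_answers]

# Answers with no matching code are kept in full for a coder to deal with later.
uncoded = [answer for answer, code in zip(raw_answers, coded) if code is None]
```

In this example "Sweden" is not in the frame, so it is returned in uncoded rather than being forced into a category.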

Coding is relatively expensive in terms of resource effort. However, improvements are always
being sought by developing automated techniques to cover this task. Other options include the
use of self coding, where respondents answer with the appropriate code, or having the interviewer
perform the coding task.

Before the interviewing begins, the coding frame for most questions can be devised. That is, the
likely responses are obvious from previous similar surveys or thorough pilot testing, allowing
those responses and relevant codes to be printed on the questionnaire. An “Other (Please
Specify)” answer code is often added to the end of a question with space for interviewers to
write the answer. The standard instruction to interviewers in doubt about any precodes is that
they should write the answers on the questionnaire in full so that they can be dealt with by a
coder later.


2.5.2 Data entry

Ensure that the questionnaire is designed so data entry personnel have minimal handling of
pages. For example, all codes should be on the left (or right) hand side of the page. It is
advisable to use trained data entry people to enter the data. It is quicker and more reliable and
therefore more cost effective.



2.6 Sampling schemes

When you have a clear idea of the aims of the survey and the data requirements, the degree of
accuracy required, and have considered the resources and time available, you are in a position
to make a decision about the size and the form of collection of sampling units.

The two qualities most desired in a sample (besides that of providing the appropriate findings),
are its representativeness and stability. Sample units may be selected in a variety of ways. The
sampling schemes fall into two general types: probability and non-probability methods.


2.6.1 Non-probability samples

If the probability of selection for each unit is unknown, or cannot be calculated, the sample is
called a non-probability sample. For non-probability samples, since there is no control over rep-
resentativeness of the sample, it is not possible to accurately evaluate the precision of estimates
(i.e., closeness of estimates under repeated sampling of the same size). However, where time
and financial constraints make probability sampling infeasible, or where knowing the level of
accuracy in the results is not an important consideration, non-probability samples do have a
role to play. Non-probability samples are inexpensive, easy to run and no frame is required.
This form of sampling is popular amongst market researchers and political pollsters as a lot of
their surveys are based on a pre-determined sample of respondents of certain categories.

One common method of non-probability sampling is voluntary response polling. A general
appeal is made (often via television) for people to contact the researcher with their opinion.
Voluntary response samples are rarely useful because they over-represent people with strong
opinions, most often negative opinions.





2.6.2 Probability sampling schemes

Probability sampling schemes are those in which the population elements have a known chance
of being selected for inclusion in a sample. Probability sampling rigorously adheres to a pre-
cisely specified system that permits no arbitrary or biased selection. There are four main types
of probability sampling schemes.

Simple Random Sample: If a sample of size n is drawn from a population of size N in
    such a way that every possible sample of size n has the same chance of being selected,
    the sampling procedure is called simple random sampling. The sample thus obtained
    is called a simple random sample. This is the simplest form of probability sample to
    analyse.

Stratified Random Sample: A stratified random sample is one obtained by separating the pop-
      ulation elements into non-overlapping groups, called strata, and then selecting a simple
      random sample from each stratum. This can be useful when a population is naturally
      divided into several groups. If the results on each stratum vary greatly, then it is possi-
      ble to obtain more efficient estimators (and therefore more precise results) than would be
      possible without stratification.

Systematic Sample: A sample obtained by randomly selecting one element from the first k
     elements in the frame and every kth element thereafter is called a 1-in-k systematic sample,
     with a random start. This is obviously a simple method if there is a list of elements in
     the frame. Systematic sampling will provide better results than simple random sampling
     when the elements within a systematic sample are more variable than the population as a
     whole. This can occur when the frame is ordered.

Cluster Sample: A cluster sample is a probability sample in which each sampling unit is a
     collection, or cluster, of elements. The population is divided into clusters and one or
     more of the clusters is chosen at random and sampled. Sometimes the entire cluster is
     sampled; on other occasions a simple random sample is taken within each chosen cluster.
     Cluster sampling is usually done for administrative convenience, and is especially useful
     if the population has a hierarchical structure.
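
The four schemes can be sketched with Python's standard random module. This is an illustrative sketch only: the frame of 100 numbered units, the two strata, the ten clusters, the sample sizes, and the function names are all assumptions made for the example.

```python
import random

def simple_random_sample(frame, n, rng):
    """Draw n units so that every possible sample of size n is equally likely."""
    return rng.sample(frame, n)

def stratified_sample(strata, n_per_stratum, rng):
    """Draw an independent simple random sample from each stratum."""
    return {name: rng.sample(units, n_per_stratum[name])
            for name, units in strata.items()}

def systematic_sample(frame, k, rng):
    """1-in-k systematic sample: random start in the first k units, then every kth."""
    start = rng.randrange(k)
    return frame[start::k]

def cluster_sample(clusters, n_clusters, rng):
    """Choose whole clusters at random and take every unit in each chosen cluster."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

rng = random.Random(2008)        # fixed seed so the example can be repeated
frame = list(range(1, 101))      # hypothetical frame: units numbered 1 to 100

# Hypothetical strata and clusters built from the same frame.
strata = {"small": frame[:60], "large": frame[60:]}
clusters = {f"c{i}": frame[i * 10:(i + 1) * 10] for i in range(10)}

srs = simple_random_sample(frame, 10, rng)
strat = stratified_sample(strata, {"small": 6, "large": 4}, rng)
syst = systematic_sample(frame, 10, rng)
clus = cluster_sample(clusters, 2, rng)
```

Each call returns 10 or 20 units, but the sets of samples that can arise differ: the systematic scheme can produce only k = 10 distinct samples, and the cluster scheme always returns complete clusters.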

A comparison of these four sampling schemes appears in the table on the following page.

     Example (Case 2): A few years ago, I advised the Department of Health and Com-
     munity Services on a survey of palliative care patients in Victoria.
      Objective:           To estimate the proportion of palliative care patients in Vic-
                           torian hospitals.
      Difficulties:         What is a “palliative care patient”? Proportion of what?
      Target population: Patients in acute beds at the time of the survey?
      Survey population: All patients in acute beds in Victorian hospitals except for
                           very small (< 10 bed) country hospitals.
      Sampling scheme:     Stratified (hospital types) and clustered (hospitals). Random
                           selection of hospitals within each stratum. Total coverage
                           of patients in the selected hospitals.
      Sample:              All patients in the 18 hospitals selected out of 115 hospitals
                           in Victoria.





    Scheme                How to select sample           Strengths/Weaknesses

    Simple Random         Assign numbers to elements     • The basic building block
    Sample                in sampling. Use a random      • Simple, but often costly.
                          number table or random         • Cannot use unless we can
                          number generator to select       assign a number to each
                          sample.                          element in a target
                                                           population.

    Stratified Sample     Divide population into         • With proper strata, can
                          groups that are similar          produce very accurate
                          within and different between     estimates.
                          on the variable of interest.   • Less costly than simple
                          Use random numbers to            random sampling.
                          select the sample from each    • Must stratify target
                          stratum.                         population correctly.

    Systematic Sample     Select every kth element       • Produces very accurate
                          from a list after a random       estimates when elements
                          start.                           in a population exhibit
                                                           order.
                                                         • Used when simple
                                                           random or stratified
                                                           sampling is impractical:
                                                           e.g., the population size is
                                                           not known.
                                                         • Simplifies the selection
                                                           process.
                                                         • Do not use with periodic
                                                           populations.

    Cluster Sample        Randomly choose clusters       • With proper clusters, can
                          and sample all elements          produce very accurate
                          within each cluster.             estimates.
                                                         • Useful when sampling
                                                           frame unavailable or
                                                           travel costs high.
                                                         • Must cluster target
                                                           population correctly.
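
The four selection rules in the table can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not a survey-quality implementation: the population of 100 numbered elements, the two strata, the ten clusters and all function names are hypothetical and chosen only to make the example self-contained.

```python
import random


def simple_random_sample(population, n, rng):
    """Select n elements at random, without replacement."""
    return rng.sample(population, n)


def stratified_sample(strata, n_per_stratum, rng):
    """Select n_per_stratum elements at random from every stratum."""
    sample = []
    for name in sorted(strata):
        sample.extend(rng.sample(strata[name], n_per_stratum))
    return sample


def systematic_sample(population, k, rng):
    """Select every kth element of a list after a random start."""
    start = rng.randrange(k)  # random start among the first k elements
    return population[start::k]


def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose clusters and take all elements within each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [element for name in chosen for element in clusters[name]]


rng = random.Random(1)            # fixed seed for reproducibility
population = list(range(1, 101))  # hypothetical elements numbered 1..100

srs = simple_random_sample(population, 10, rng)

# Hypothetical strata: the lower and upper halves of the list.
strata = {"low": population[:50], "high": population[50:]}
strat = stratified_sample(strata, 5, rng)

syst = systematic_sample(population, 10, rng)

# Hypothetical clusters: ten consecutive blocks of ten elements.
clusters = {f"c{i}": population[i * 10:(i + 1) * 10] for i in range(10)}
clus = cluster_sample(clusters, 2, rng)
```

Note that the systematic sample needs only the list and the interval k, while the cluster sample needs only a list of clusters. This matches the table's remarks about when each scheme is practical.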




Quentative research method

  • 1. DBA6000 Quantitative Business Research Methods Rob J Hyndman
  • 2. c Rob J Hyndman, 2008. Professor Rob Hyndman Department of Econometrics and Business Statistics Monash University (Clayton campus) VIC 3800. Email: Rob.Hyndman@buseco.monash.edu.au Telephone: (03) 9905 2358 www.robhyndman.info
  • 3. Contents Preface 5 1 Research design 9 1.1 Statistics in research . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 1.2 Organizing a quantitative research study . . . . . . . . . . . . . . . . . . . . . . . . . . 14 1.3 Some quantitative research designs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 1.4 Data structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 1.5 The survey process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 Appendix A: Case studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 2 Data collection 23 2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 2.2 Data collecting instruments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 2.3 Errors in statistical data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 2.4 Questionnaire design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 2.5 Data processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 2.6 Sampling schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 2.7 Scale development . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 Appendix B: Case studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 3 Data summary 53 3.1 Summarising categorical data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 3.2 Summarizing numerical data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 3.3 Summarising two numerical variables . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 3.4 Measures of reliability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65 3.5 Normal distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
68 4 Computing and quantitative research 70 4.1 Data preparation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 4.2 Using a statistics package . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 4.3 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 4.4 SPSS exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 5 Significance 77 5.1 Proportions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 3
  • 4. 5.2 Numerical differences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 6 Statistical models and regression 88 6.1 One numerical explanatory variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88 6.2 One categorical explanatory variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 6.3 Several explanatory variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 6.4 Comparing regression models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102 6.5 Choosing regression variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103 6.6 Multicollinearity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105 6.7 SPSS exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 7 Significance in regression 107 7.1 Statistical model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 7.2 ANOVA tables and F-tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 7.3 t-tests and confidence intervals for coefficients . . . . . . . . . . . . . . . . . . . . . . 108 7.4 Post-hoc tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 7.5 SPSS exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 8 Dimension reduction 112 8.1 Factor analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112 8.2 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118 9 Data analysis with a categorical response variable 119 9.1 Chi-squared test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 9.2 Logistic and multinomial regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122 9.3 SPSS exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
123 10 A survey of statistical methodology 124 11 Further methods 131 11.1 Classification and regression trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131 11.2 Structural equation modelling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133 11.3 Time series models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134 11.4 Rank-based methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134 12 Presenting quantitative research 135 12.1 Numerical tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135 12.2 Graphics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136 Appendix: Good graphs for better business . . . . . . . . . . . . . . . . . . . . . . . . . . . 141 13 Readings 145 DBA6000: Quantitative Business Research Methods 4
  • 5. Preface

Subject convenor
Professor Rob J Hyndman B.Sc.(Hons), Ph.D., A.Stat
Department of Econometrics and Business Statistics
Location: Room 671, Menzies Building, Clayton.
Phone: (03) 9905 2358
Email: Rob.Hyndman@buseco.monash.edu.au
WWW: http://www.robhyndman.info

Objectives
On completion of this subject, students should have:
• the necessary quantitative skills to conduct high quality independent research related to business administration;
• comprehensive grounding in a number of quantitative methods of data production and analysis;
• been introduced to quantitative data analysis through a practical research activity.

Synopsis
This unit considers the quantitative research methods used in studying business, management and organizational analysis. Topics to be covered:
1. research design including experimental designs, observational studies, case studies, longitudinal analysis and cross-sectional analysis;
2. data collection including designing data collection instruments, sampling strategies and assessing the appropriateness of archival data for a research purpose;
3. data analysis including graphical and numerical techniques for the exploration of large data sets and a survey of advanced statistical methods for modelling the relationships between variables;
4. communication of quantitative research; and
5. the use of statistical software packages such as SPSS in research.
The effective use of several quantitative research methods will be illustrated through reading research papers drawn from several disciplines.
  • 6.

References
None of these are required texts; they provide useful background material if you want to read further. Huck (2007) is excellent on interpreting statistical results in academic papers. Pallant (2007) is very helpful when using SPSS and in giving advice on how to write up research results. Use Wild and Seber (2000) if you need to brush up on your basic statistics; it contains lots of helpful advice and interesting examples.
1. HUCK, S.W. (2007) Reading statistics and research. 5th ed., Allyn & Bacon: Boston, MA.
2. PALLANT, J. (2007) SPSS survival manual. 3rd ed., Allen & Unwin.
3. DE VAUS, D. (2002) Analyzing social science data. SAGE Publications: London.
4. WILD, C.J., & SEBER, G.A.F. (2000) Chance encounters: a first course in data analysis and inference. John Wiley & Sons: New York.

Timetable
17 July        Introduction/Chapter 1
24 July        Chapter 2
31 July        Chapter 3
7 August       Chapter 4        SPSS tutorial
14 August      Chapter 5
21 August      Chapter 6
28 August      Chapter 7        SPSS tutorial
4 September    Chapter 8–9      SPSS tutorial
11 September   Chapter 10
18 September   Chapter 11–12    First assignment due
25 September   No class
2 October      No class
9 October      SPSS tutorial
16 October     Oral presentations    Second assignment due
  • 7.

Assessment
1. A written report presenting and critiquing a research paper which uses quantitative research methods. 45%
   • It can be a published research paper from a scholarly journal, or a company report. It must contain substantial quantitative research. It must be approved in advance.
   • Your report should include comments on the research questions addressed, the appropriateness of the data used, how the data were collected, the method of analysis chosen, and the conclusions drawn.
   • Length: 4000–5000 words excluding tables and graphs.
   • Due: 17 September
2. A written report presenting some original quantitative analysis of a suitable multivariate data set. 45%
   • You may use your own data, or use data that I will provide. The data set must include at least four variables. It can be data from your workplace.
   • Your report should include comments on the research questions addressed, the appropriateness of the data used, how the data were collected, the method of analysis chosen, and the conclusions drawn.
   • You may use any statistical computing package or Excel for analysis.
   • Length: 4000–5000 words excluding tables and graphs.
   • Due: 15 October
3. A 20-minute oral presentation of one of the above reports. 10%
   • On either 8 or 15 October.

Assignment marking scheme
• Research questions addressed: 6%
• Appropriateness of data: 6%
• Data collection: 6%
• Description of statistical methods used: 6%
• Suitability of statistical methods: 6%
• Discussion of statistical results: 8%
• Conclusions (are they supported/valid?): 7%

Choosing a paper for Assignment 1
Choose something you are interested in. For example, it can be an article you are reading as part of your other DBA studies or something you have read as part of your professional life. The following journals contain some articles that would be suitable. There are also many others.
• Australian Journal of Management
• International Journal of Human Resource Management
• Journal of Advertising
• Journal of Applied Management Studies
• Journal of Management
• Journal of Management Accounting Research
  • 8.
• Journal of Management Development
• Journal of Managerial Issues
• Journal of Marketing
• Management Decision
You can obtain online copies for some of these via the Monash Voyager Catalogue. Hard copies should be in the Monash library.
Things to look for:
• it should involve some substantial data analysis;
• it should involve more than summary statistics (e.g., a regression model, or some chi-squared tests);
• it should not use sophisticated statistical methods that are beyond this subject (e.g., avoid factor analysis and structural equation models).
All papers should be approved by Rob Hyndman before you begin work on the assignment.

Choosing a data set for Assignment 2
• Choose something you know about. The best data analyses involve a mix of good knowledge of the data context as well as good use of statistical methodology.
• Don’t try to do too much. One response variable with 3–5 explanatory variables is usually sufficient. Resist the temptation to write a long treatise!
• You will find it easier if the response variable is numeric. Analysing categorical response variables with several explanatory variables can be tricky.
• Be clear about the purpose of your analysis. State some explicit objectives or hypotheses, and address them via your statistical analysis.
• Think about what you include. A few well-chosen graphics that tell a story are better than pages of computer output that mean very little.
• Start early. Even before we cover much methodology, you can do some basic data summaries and think about the key questions you want to address.
• All data sets should be approved by Rob Hyndman before you begin work on the assignment.

Readings
Most weeks we will read a case study from a research journal and discuss the analysis. Please read these in advance. We will discuss them in the third hour. You cannot use a paper we have discussed for your first assessment task.
If you have a suggestion of a paper that may be suitable for class discussion, please let me know.
  • 9. CHAPTER 1
Research design

1.1 Statistics in research
“Statistics is the study of making sense of data.”
    Ott and Mendenhall
“The key principle of statistics is that the analysis of observations doesn’t depend only on the observations but also on how they were obtained.”
    Anonymous
• Data beat anecdotes
  “For example” proves nothing. (Hebrew proverb)
• Data beat intuition
  “Belief is no substitute for arithmetic.” (Henry Spencer)
• Data beat “expert” opinion
  “When information becomes unavailable, the expert comes into his own.” (A.J. Liebling)

1.1.1 Statistics answers questions using data
• Do pollutants cause asthma?
• Do transaction volumes on the stock market react to price changes?
• Does deregulation reduce unemployment?
• Does fluoride reduce tooth decay?

A definition
Statistical Analysis: Mysterious, sometimes bizarre, manipulations performed upon the collected data of an experiment in order to obscure the fact that the results have no generalizable meaning for humanity. Commonly, computers are used, lending an additional aura of unreality to the proceedings. (Source unknown)

97.3% of all statistics are made up.
  • 10. Part 1. Research design

1.1.2 Some statistics stories

The Challenger disaster
[Figure: number of O-rings damaged plotted against ambient temperature at launch.]

Charlie’s chooks
[Figure: Y, percentage mortality, plotted against X, percentage Tegel birds.]
  • 11.

Risk factors for heart disease
A doctor wants to investigate who is most at risk for coronary-related deaths. He selects 12 patients at random from his clinic and records their age, blood pressure and drug used. He also records whether they eventually died from heart disease or not.

Age   BP   Drug   L/D
18    68   1      D
20    64   2      L
22    72   1      D
25    67   2      L
29    80   –      D
33    70   –      D
34    86   1      D
36    85   –      D
37    73   2      L
39    82   –      L
41    90   1      D
45    87   2      L

Drug    Lived   Died   % lived
1       0       4      0%
2       4       0      100%
–       1       3      25%
Total   5       7

Drug 1 looks bad, 2 looks good.
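The summary table above can be rebuilt from the twelve patient records with a simple cross-tabulation. The following is a minimal sketch in Python; the variable and function names (`patients`, `outcome_by_drug`) are illustrative choices and not part of the original notes, and "-" is used to stand for "no drug".

```python
from collections import Counter

# Patient records transcribed from the table above:
# (age, blood pressure, drug, outcome), where drug is "1", "2" or "-" (no drug)
# and outcome is "L" (lived) or "D" (died).
patients = [
    (18, 68, "1", "D"), (20, 64, "2", "L"), (22, 72, "1", "D"),
    (25, 67, "2", "L"), (29, 80, "-", "D"), (33, 70, "-", "D"),
    (34, 86, "1", "D"), (36, 85, "-", "D"), (37, 73, "2", "L"),
    (39, 82, "-", "L"), (41, 90, "1", "D"), (45, 87, "2", "L"),
]


def outcome_by_drug(records):
    """Return {drug: (lived, died, percent_lived)} for each drug group."""
    counts = Counter((drug, outcome) for _, _, drug, outcome in records)
    summary = {}
    for drug in sorted({d for _, _, d, _ in records}):
        lived = counts[(drug, "L")]
        died = counts[(drug, "D")]
        summary[drug] = (lived, died, 100.0 * lived / (lived + died))
    return summary


summary = outcome_by_drug(patients)
for drug, (lived, died, pct) in summary.items():
    print(f"Drug {drug}: lived={lived}, died={died}, % lived={pct:.0f}%")
```

With only four patients per group, a table like this describes the sample but says little on its own about why the groups differ.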
  • 12.

1.1.3 Causation and association

Smoking and Lung Cancer
There is a strong positive correlation between smoking and lung cancer. There are several possible explanations.
• Causal hypothesis: Smoking causes lung cancer.
• Genetic hypothesis: There is a hereditary trait which predisposes people to both nicotine addiction and lung cancer.
• Sloppy lifestyle hypothesis: Smoking is most prevalent amongst people who also drink too much, don’t exercise, eat unhealthy food, etc.

Postnatal care
Mothers who return home from hospital soon after birth do better than those who stay in hospital longer.
• Causation hypothesis: Hospital is harmful and/or home is helpful.
• Common response hypothesis: Mothers return home early because they are coping well.
• Confounding hypothesis: Mothers return home early if there is someone at home to help.

University applicants
         Male   Female   Total
Accept   70     40       110
Reject   100    100      200
Total    170    140      310
Is there evidence of discrimination?

Course: Introduction to bean counting
         Male   Female   Total
Accept   60     20       80
Reject   60     20       80
Total    120    40       160
  • 13.

Course: Advanced welding
         Male   Female   Total
Accept   10     20       30
Reject   40     80       120
Total    50     100      150

This is an example of Simpson’s Paradox. Simpson’s Paradox occurs when the association between variables is reversed when data from several groups are combined.

Other examples of Simpson’s paradox
• Average tax rate has increased with time even though the rate in every income category has decreased. Why?
• Average female salary of B.Sc. graduates is lower than average male salary. Why?

Causality or association?
1. A positive correlation between blood pressure and income is observed. Does this indicate a causal connection?
2. In a survey in 1960, it was found that for 25–34 year old males there was a positive correlation between years of school completed and height. Does going to school longer make a man taller?
3. The same survey showed a negative correlation between age and educational level for persons aged over 25. Why?
4. Students at fee paying private schools perform better on average in VCE than students at government funded schools. Why?

Some subtle differences
• Distinguish between: causation & association, prediction & causation, prediction & explanation.
• Note the difference between deterministic and probabilistic causation.
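Returning to the university applicants example, the acceptance rates can be computed per course and for the combined data to see where the apparent discrimination comes from. This is a minimal sketch; the counts are taken from the two course tables, while the names `courses`, `acceptance_rate` and `combined_counts` are illustrative assumptions.

```python
# Applicant counts from the two course tables, stored as (accepted, rejected).
courses = {
    "Introduction to bean counting": {"Male": (60, 60), "Female": (20, 20)},
    "Advanced welding": {"Male": (10, 40), "Female": (20, 80)},
}


def acceptance_rate(accepted, rejected):
    """Proportion of applicants who were accepted."""
    return accepted / (accepted + rejected)


def combined_counts(tables):
    """Add up (accepted, rejected) counts across all courses for each sex."""
    totals = {}
    for by_sex in tables.values():
        for sex, (accepted, rejected) in by_sex.items():
            a, r = totals.get(sex, (0, 0))
            totals[sex] = (a + accepted, r + rejected)
    return totals


per_course = {
    course: {sex: acceptance_rate(*counts) for sex, counts in by_sex.items()}
    for course, by_sex in courses.items()
}
overall = {sex: acceptance_rate(*counts)
           for sex, counts in combined_counts(courses).items()}

for course, rates in per_course.items():
    print(course, {sex: f"{rate:.0%}" for sex, rate in rates.items()})
print("Combined", {sex: f"{rate:.0%}" for sex, rate in overall.items()})
```

Within each course men and women are accepted at the same rate (50% and 20%), yet the combined rates differ (about 41% for men and 29% for women) because most women applied to the course with the lower acceptance rate. In this particular data set the association disappears within courses rather than reversing.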
  • 14.

1.2 Organizing a quantitative research study
As a quick check, ask the following questions:
1. What is your hypothesis (your research question)?
2. What is already known about the problem (literature review)?
3. What sort of design is best suited to studying your hypothesis? (method)
4. What data will you collect to test your hypothesis? (sample)
5. How will you analyse these data? (data analysis)
6. What will you do with the results of the study? (communication)
These questions are broken down in more detail below. (These are mostly taken from Rubin et al. (1990), and have also appeared in Balnaves and Caputi (2001).)

1.2.1 Hypothesis
• What is the goal of the research?
• What is the problem, issue, or critical focus to be researched?
• What are the important terms? What do they mean?
• What is the significance of the problem?
• Do you want to test a theory?
• Do you want to extend a theory?
• Do you want to test competing theories?
• Do you want to test a method?
• Do you want to replicate a previous study?
• Do you want to correct previous research that was conducted in an inadequate manner?
• Do you want to resolve inconsistent results from earlier studies?
• Do you want to solve a practical problem?
• Do you want to add to the body of knowledge in another manner?

1.2.2 Review of literature
• What does previous research reveal about the problem?
• What is the theoretical framework for the investigation?
• Are there complementary or competing theoretical frameworks?
• What are the hypotheses and research questions that have emerged from the literature review?
  • 15.

1.2.3 Method
• What methods or techniques will be used to collect the data? (This holds for applied and non-applied research)
• What procedures will be used to apply the methods or techniques?
• What are the limitations of these methods?
• What factors will affect the study’s internal and external validity?
• Will any ethical principles be jeopardized?

1.2.4 Sample
• Who (what) will provide (constitute) the data for the research?
• What is the population being studied?
• Who will be the participants for the research?
• What sampling technique will be used?
• What materials and information are necessary to conduct the research?
• How will they be obtained?
• What special problems can be anticipated in acquiring needed materials and information?
• What are the limitations in the availability and reporting of materials and information?

1.2.5 Data analysis
• How will data be analysed?
• What statistics will be used?
• What criteria will be used to determine whether hypotheses are supported?
• What was discovered (about the goal, data, method, and data analysis) as a result of doing preliminary work (if conducted)?

1.2.6 Communication
• How will the final research report be organised? (Outline)
• What sources have you examined thus far that pertain to your study? (Reference list)
• What additional information does the reader need?
• What time frame (deadlines) have you established for collecting, analysing and presenting data? (Timetable)

1.3 Some quantitative research designs
• Case study: questionnaire, interview, observation. Best for exploratory work and hypothesis generation. Limited quantitative analysis possible.
• Survey: questionnaire, interview, observation. Best if sample is random.
• Experiment: questionnaire, interview, observation. Best for demonstrating causality.
  • 16.

1.3.1 Cross-sectional vs longitudinal analysis
All designs can be either cross-sectional or longitudinal.
• Cross-sectional design involves data collection for one time only.
• Longitudinal design involves successive data collection over a period of time. Necessary if you want to study changes over time.

1.3.2 Case study designs
• involves intense involvement with a few cases rather than limited involvement with many cases
• can’t generalize results easily
• useful in exploring ideas and generating hypotheses

1.3.3 Survey designs
• most popular in business/management research
• useful when you cannot control the things you want to study
• difficult to get random and representative samples

1.3.4 Experimental designs
• requires control group to allow for the placebo effect
• requires the experimenter to control all variables other than the variable of interest
• requires randomization to groups
• allows causation to be tested

Which research design would you use?
Hypotheses:
1. Women believe they are better at managing than men.
2. Children who listen to poetry in early childhood make better progress in learning to read than those who do not.
3. A business will run more efficiently if no person is directly responsible for more than five other people.
4. There are inherent advantages in businesses staying small.
5. Employees with postgraduate qualifications have shorter job expectancy than employees without postgraduate qualifications.
What data would you collect in each case?
  • 17.

1.4 Data structure

1.4.1 Populations and samples
A population is the entire collection of ‘things’ in which we are interested.
A sample is a subset of a population.
We wish to make an inference about a population of interest based on information obtained from a sample from that population.
EXAMPLES:
• You measure the profit/loss of 50 public hospitals in Victoria, randomly selected.
  Population:
  Sample:
  Points of interest:
• Sales on 500 products from one company for the last 5 years are analysed.
  Population:
  Sample:
  Points of interest:

1.4.2 Cases and variables
Think about your data in terms of cases and variables.
• A case is the unit about which you are taking measurements. E.g., a person, a business.
• A variable is a measurement taken on each case. E.g., age, score on test, grade-level, income.

1.4.3 Types of Data
The ways of organizing, displaying and analysing data depend on the type of data we are investigating.
• Categorical Data (also called nominal or qualitative)
  e.g. sex, race, type of business, postcode
  Averages don’t make sense. Ordered categories are called ordinal data.
• Numerical Data (also called scale, interval and ratio)
  e.g. income, test score, age, weight, temperature, time.
  Averages make sense.
Note that we sometimes treat numerical data as categories (e.g. three age groups).
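As an illustration of cases, variables and data types, the sketch below stores each case as one record and summarises a categorical variable by counts and a numerical variable by its average. All values are hypothetical and invented purely for illustration; the variable names and the size-group cut-offs are assumptions, not part of the notes.

```python
from collections import Counter
from statistics import mean

# Hypothetical cases: one record per business. All values are invented.
cases = [
    {"business_type": "retail", "postcode": "3800", "employees": 12},
    {"business_type": "retail", "postcode": "3168", "employees": 45},
    {"business_type": "manufacturing", "postcode": "3800", "employees": 230},
    {"business_type": "services", "postcode": "3000", "employees": 8},
    {"business_type": "retail", "postcode": "3000", "employees": 61},
]


def summarise_categorical(rows, variable):
    """Categorical variable: count how often each category occurs."""
    return Counter(row[variable] for row in rows)


def summarise_numerical(rows, variable):
    """Numerical variable: an average is meaningful."""
    return mean(row[variable] for row in rows)


def size_group(employees):
    """Treat a numerical variable as three ordered categories (assumed cut-offs)."""
    if employees < 20:
        return "small"
    if employees < 200:
        return "medium"
    return "large"


type_counts = summarise_categorical(cases, "business_type")
mean_employees = summarise_numerical(cases, "employees")
size_counts = Counter(size_group(row["employees"]) for row in cases)

print(type_counts)
print(mean_employees)
print(size_counts)
```

Postcodes are stored as text here on purpose: they look like numbers, but averaging them would be meaningless, which is why they are listed under categorical data.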
  • 18.

1.4.4 Response and explanatory variables
Response variable: measures the outcome of a study. Also called the dependent variable.
Explanatory variable: attempts to explain the variation in the observed outcomes. Also called an independent variable.
Many statistical problems can be thought of in terms of a response variable and one or more explanatory variables.
• Study of profit/loss in Victorian hospitals.
  Response variable:
  Explanatory variables:
• Monthly sales of 500 products
  Response variable:
  Explanatory variables: competitor advertising.

1.5 The survey process
1. Planning a survey
   State the objectives: In order to state the objectives we often need to ask questions such as:
   • What is the survey’s exact purpose?
   • What do we not know and want to know?
   • What inferences do we need to draw?
   Begin by developing a specific list of information needs. Then write focused survey questions.
2. Design the sampling procedure
   Identify the target population: Whom are we drawing conclusions about?
   Select a sampling scheme: Examples: simple random sampling, stratified random sampling, systematic sampling, and cluster sampling.
3. Select a survey method
   Decide how to collect the data: personal interviews, telephone interviews, mailed questionnaires, diaries, . . .
4. Develop the questionnaire
   Write the questionnaire. Decide on the wording, types of questions, and other issues.
5. Pretest the questionnaire
   Select a very small sample from the sampling frame. Conduct the survey and see what goes wrong. Correct any problems before carrying out the full-scale study.
6. Conduct the survey
   Run the survey in an efficient and time-effective manner.
7. Analyze the data
   Gather the results and determine outcomes.
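The four sampling schemes named in step 2 can be sketched in a few lines of Python using only the standard library. This is a simplified illustration under stated assumptions: the sampling frame is a hypothetical list of 100 unit IDs, the grouping into four "regions" is invented, and the function names are not from the notes. Real surveys usually need more care (for example, unequal stratum sizes and weighting).

```python
import random


def simple_random_sample(frame, n, rng):
    """Every subset of n units is equally likely."""
    return rng.sample(frame, n)


def systematic_sample(frame, n, rng):
    """Random start, then every k-th unit, where k = len(frame) // n."""
    step = len(frame) // n
    start = rng.randrange(step)
    return [frame[start + i * step] for i in range(n)]


def stratified_sample(frame, stratum_of, n_per_stratum, rng):
    """Simple random sample of fixed size within every stratum."""
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for key in sorted(strata):
        sample.extend(rng.sample(strata[key], n_per_stratum))
    return sample


def cluster_sample(frame, cluster_of, n_clusters, rng):
    """Randomly choose whole clusters and keep every unit in them."""
    clusters = {}
    for unit in frame:
        clusters.setdefault(cluster_of(unit), []).append(unit)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for key in chosen for unit in clusters[key]]


def region(unit):
    """Hypothetical grouping: four regions of 25 units each."""
    return unit // 25


rng = random.Random(2008)    # fixed seed so the sketch is reproducible
frame = list(range(100))     # hypothetical sampling frame of 100 unit IDs

srs = simple_random_sample(frame, 10, rng)
systematic = systematic_sample(frame, 10, rng)
stratified = stratified_sample(frame, region, 3, rng)
clustered = cluster_sample(frame, region, 2, rng)
```

Stratified sampling takes a few units from every group, whereas cluster sampling takes every unit from a few groups; the same grouping function is used for both here only to make that contrast visible.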
  • 19.

Appendix A: Case studies

Injury management in NSW
Four injury management pilots (IMP) running during 2001:
• private hospitals and nursing homes within NSW;
• all industry groups within the Central West NSW region;
• two insurance companies (QBE and EML).
We wish to do a statistical comparison of the injury management pilots with the current standard injury management arrangements.

Performance measures
• incidence of specific payment types
• duration of claims
• number of claims
• proportion of claimants in receipt of weekly benefits at 4, 8, 13 and 26 weeks.
• costs for claimants at 4, 8, 13 and 26 weeks.
  – medical, rehabilitation, physiotherapy, chiropractic
  – weekly-benefits
• timeliness
  – number of days from injury to agent notification
  – number of days from injury to first payment

Some potential driving variables
• age
• gender
• injury type
• agency (e.g., powered tools)
• severity of injury
• medical interventions
• employer size
• insuring agency
• weekly pay at time of injury
• industry (ANZSIC code)
• occupation (ASCO code)

• Driving variables affect the performance measures.
• Variations between groups in key driver variables can induce apparent differences between groups. This is then confused with any real differences due to the programs being evaluated.
• Therefore any comparisons of groups of employees should either eliminate the effect of drivers or try to measure the effect of the drivers.
  • 20.

The ideal design!
Ideally, we would use a randomized control trial. This eliminates the effect of driving variables.
• The control group would be employees on the old IM system.
• The treatment group would be employees in the new IMP.
• Employees would be randomly allocated to the two groups.
• Statistical comparisons between the two groups would show differences between the old IM system and the new IMP.
• This random allocation would prevent any systematic differences between those in the IMP and those not in the IMP.
• Such a scheme is impracticable.

The actual design
We have to use pseudo-control groups and eliminate differences between the control and IMP groups using statistical models.
• All injuries within the specified industry group, geographical region or insurer will be subject to the new IMP during 2001.
• The pseudo-controls will be the equivalent groups of employees in 2000 who are not subject to the new IMP.

Problem of confounding
• If there are differences between the IMP and the control, are they due to the different IM program or the different group?
Solution:
• adjust for as many driving variables as possible;
• compare similar groups not subject to the IMP.

Comparisons undertaken
IMP group: Private hospitals/nursing homes in NSW 2001
Pseudo-control: Private hospitals/nursing homes 2000

IMP group: Central West NSW region 2001
Pseudo-controls: Central West NSW region 2000

IMP group: Insurance company 2001
Pseudo-control: Insurance company 2000

Non-IMP group: Comparable industry group 2001
Pseudo-controls: Comparable industry group 2000

Non-IMP group: Comparable NSW region 2001
Pseudo-controls: Comparable NSW region 2000
  • 21.

We do not directly compare:
• private hospitals/nursing homes with other industry groups;
• Central West NSW region with other geographical regions.
Instead, we compare the change between 2000 and 2001 in each industry group and each geographical region.

How to interpret the results. . .
• If all 2001 groups are different from the 2000 groups after taking into account all drivers, then it is likely there are changes between years not reflected in the drivers. We won’t be able to attribute any changes to the IMP.
• If all IMP 2001 groups are different from the 2000 groups after taking into account all drivers, but the non-IMP 2001 groups are not different from the 2000 groups, then it is likely the changes between years are due to the IMP.
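The logic of comparing year-on-year changes rather than raw group levels can be sketched as a "difference in differences" calculation. This is an assumption-laden simplification: it ignores the adjustment for driving variables that the actual design relies on, the function names are illustrative, and the numbers are hypothetical values invented only to show the arithmetic.

```python
def change(value_2000, value_2001):
    """Change in a performance measure between 2000 and 2001."""
    return value_2001 - value_2000


def difference_in_differences(imp_2000, imp_2001, non_imp_2000, non_imp_2001):
    """Change in the IMP group minus the change in the comparable non-IMP group.

    A value near zero suggests the IMP group changed no more than the
    comparison group did over the same period.
    """
    return change(imp_2000, imp_2001) - change(non_imp_2000, non_imp_2001)


# Hypothetical mean claim durations in weeks (invented for illustration only).
imp_change = change(10.0, 8.0)          # IMP group, 2000 to 2001
non_imp_change = change(10.5, 10.0)     # comparable non-IMP group
effect = difference_in_differences(10.0, 8.0, 10.5, 10.0)

print(imp_change, non_imp_change, effect)
```

In this invented case both groups improved, so only the extra improvement in the IMP group (1.5 weeks here) would be a candidate for attribution to the program, and only if the driving variables had also been accounted for.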
  • 22.

Needlestick injuries
You are interested in the number and severity of needle stick injuries amongst health workers involved in blood donation and transfusion. Work in groups of three to carefully define the objectives of your survey. You will need to specify
• the objective of the survey
• what data are to be collected
• the target population
• the survey population
• the sample
• the data collection method
• potential errors which could occur in your survey.

Palliative care referrals
A few years ago, I helped the Health Department with a survey on palliative care. As part of the study, it was necessary to study the ‘referral’ pattern for palliative care providers: how many patients they send to hospital (for inpatient or outpatient treatment); how many they refer to consultants for specialist comment; how many to community health programs; and so on.
Possible sampling schemes:
1. sample a group of palliative care practitioners and study their referral patterns;
2. sample a group of palliative care patients and study their referral patterns.
Discuss the possible advantages and disadvantages of the two schemes.
  • 23. CHAPTER 2
Data collection

2.1 Introduction
“You don’t have to eat the whole ox to know that the meat is tough.”
    Samuel Johnson

Sampling is very familiar to all of us, because we often reach conclusions about phenomena on the basis of a sample of such phenomena. You may test a swimming pool’s temperature by dipping your toe in the water or the performance of a new vehicle by a short test drive. These are among the countless small samples that we rely on when making personal decisions. We tend to use haphazard methods in picking our sample and risk substantial sampling error.

Research also usually reaches its conclusions on the basis of sampling, but the methods used must adhere to certain rules that are going to be discussed. The goal in obtaining data through survey sampling is to use a sample to make precise inferences about the target population. We want to be highly confident about our inferences. It is important to have a substantial grasp of sampling theory to appraise the reliability and validity of the conclusions drawn from the sample taken.

2.2 Data collecting instruments
The choice of data collection instrument is crucial to the success of the survey. When determining an appropriate data collection method, many factors need to be taken into account, including complexity or sensitivity of the topic, response rate required, time or money available for the survey and the population that is to be targeted. Some of the most common data collection methods are described in the following sections.
  • 24. Part 2. Data collection

2.2.1 Interviewer enumerated surveys
Interviewer enumerated surveys involve a trained interviewer going to the potential respondent, asking the questions and recording the responses.
The advantages of using this methodology are:
• provides better data quality
• special questioning techniques can be used
• greater rapport established with the respondent
• allows more complex issues to be included
• produces higher response rates
• more flexibility in explaining things to respondents
• greater success in dealing with language problems
The disadvantages of using this methodology are:
• expensive to conduct
• training for interviewers is required
• more intrusive for the respondent
• interviewer bias may become a source of error

2.2.2 Web surveys
Web surveys are increasingly popular, although care must be taken to avoid sample selection bias and multiple responses from an individual.
The advantages of this methodology are:
• cheap to administer
• private and confidential
• easy to use conditional questions and to prompt if no response or inappropriate response
• can build in live checking
• can provide multiple language versions
The disadvantages of this methodology are:
• respondent bias may become a source of error
• not everyone has access to the internet
• language and interface must be very simple
• cannot build up a rapport with respondents
• resolution of queries is difficult
• only appropriate when straightforward data can be collected

2.2.3 Mail surveys
Self-enumeration mail surveys are where the questionnaire is left with the respondent to complete.
The advantages of this methodology are:
  • 25.
• cheaper to administer
• more private and confidential
• in some cases does not require interviewers
The disadvantages of this methodology are:
• difficult to follow up non-response
• respondent bias may become a source of error
• response rates are much lower
• language must be very simple
• problems with poor English and literacy skills
• cannot build up a rapport with respondents
• resolution of queries is difficult
• only appropriate when straightforward data can be collected

2.2.4 Telephone surveys
A telephone survey is the process where a potential respondent is phoned and asked the survey questions over the phone.
The advantages of this methodology are:
• cheap to administer
• convenient for interviewers and respondents
The disadvantages of this methodology are:
• interviews easily terminated by respondent
• cannot use prompt cards to provide alternatives for answers
• burden placed on interviewers and respondents
• biased sample through households with phones

2.2.5 Diaries
Diaries can be used as a format for a survey. In these surveys respondents are directed to record the required information over a predetermined period in the diary, book or booklet supplied.
The advantages of this methodology are:
• high quality and detailed data from the completed diaries
• more private and confidential circumstances for the respondent
• does not require interviewers
The disadvantages of this methodology are:
• response rates are lower and the diaries are rarely completed well
• language must be simple
• can only include relatively simple concepts
• cannot build up a rapport
• cannot explain the purpose of survey items to respondents
  • 26.

                                                  Face-to-face   Telephone      Mail
Response rates                                    Good           Good           Good
Representative samples
  Avoidance or refusal bias                       Good           Good           Poor
  Control over who completes the questionnaire    Good           Good           Satisfactory
  Gaining access to the selected person           Satisfactory   Good           Good
  Locating the selected person                    Satisfactory   Good           Good
Effects on questionnaire design
  Ability to handle:
    Long questionnaires                           Good           Satisfactory   Satisfactory
    Complex questions                             Good           Poor           Satisfactory
    Boring questions                              Good           Satisfactory   Poor
    Item non-response                             Good           Good           Satisfactory
    Filter questions                              Good           Good           Satisfactory
    Question sequence control                     Good           Good           Poor
    Open ended questions                          Good           Good           Poor
Quality of answers
  Minimize socially desirable responses           Poor           Satisfactory   Good
  Ability to avoid distortion due to
    Interviewer characteristics                   Poor           Satisfactory   Good
    Interviewer opinions                          Satisfactory   Satisfactory   Good
    Influence of other people                     Satisfactory   Good           Poor
  Allows opportunities to consult                 Satisfactory   Poor           Good
  Avoids subversion                               Poor           Satisfactory   Good
Implementing the survey
  Ease of finding suitable staff                  Poor           Good           Good
  Speed                                           Poor           Good           Satisfactory
  Cost                                            Poor           Satisfactory   Good

Table 2.1: Advantages and disadvantages of three methods of data collection. Table taken from de Vaus (2001) who adapted it from Dillman (1978).

2.2.6 Ideas for increasing response rates
1. Provide reward.
2. Systematic follow up.
3. Keep it short.
4. Interesting topic.
  • 27.

2.2.7 Archival data
Rather than collecting your own data, you may use some existing data. If you do, keep the following points in mind.
Available information: Is there sufficient documentation of the original research proposal for which the data were collected? If not, there may be hidden problems in re-using the data.
Geographical area: Are the data relevant to the geographical area you are studying? E.g., what country, city, state or other area do the archive data cover?
Time period: Are the data relevant to the time period you are studying? Does your research area cover recent events, or is it historical, or does it look at changes over a specified range of time? Most data are at least a year old before they are released to the public.
Population: What population do you wish to study? This can refer to a group or groups of people, particular events, official records, etc. In addition you should consider whether you will look at a specific sample or subset of people, events, records, etc.
Context: Do the archival data contain the information relevant to your research area?

2.3 Errors in statistical data
In sample surveys there are two types of error that can occur:
• sampling error, which arises because only a part of the population is used to represent the whole population; and
• non-sampling error, which can occur at any stage of a sample survey.
It is important to be aware of these errors so that they can be minimized.

2.3.1 Sampling error
Sampling error is the error we make in selecting samples that are not representative of the population. Since it is practically impossible for a smaller segment of a population to be exactly representative of the population, some degree of sampling error will be present whenever we select a sample.
It is important to consider sampling error when publishing survey results as it gives an indication of the accuracy of the estimate and therefore reflects the importance that can be placed on interpretations.

If sampling principles are carefully applied within the constraints of available resources, sampling error can be accurately measured and kept to a minimum. Sampling error is affected by:
• sample size
• variability within the population
• sampling scheme
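As a rough numerical illustration of the first two factors, the sketch below uses the textbook formula for the standard error of a sample mean under simple random sampling, sigma / sqrt(n), with an optional finite population correction. The function name and the numbers are illustrative assumptions, not part of the course material.

```python
import math


def standard_error(sigma, n, population_size=None):
    """Standard error of a sample mean under simple random sampling.

    sigma           -- population standard deviation (assumed known here)
    n               -- sample size
    population_size -- if given, apply the finite population correction
    """
    se = sigma / math.sqrt(n)
    if population_size is not None:
        # Correction factor shrinks to 0 when the sample is the whole population.
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se


# Assumed population standard deviation of 10 units.
for n in (100, 400, 1600):
    print(n, standard_error(10.0, n))

# More variable population, same sample size: larger error.
print(standard_error(20.0, 100))

# Complete enumeration of a population of 1000: no sampling error.
print(standard_error(10.0, 1000, population_size=1000))
```

Under these assumptions, each fourfold increase in the sample size halves the standard error, doubling the population variability doubles it, and sampling every element of the population reduces it to zero.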
  • 28.
Generally, larger sample sizes decrease sampling error. To halve the sampling error, the sample size has to be increased fourfold. In fact, sampling error can be completely eliminated by increasing the sample size to include every element in the population.

The population variability also affects the error: more variable populations give rise to larger errors, as the estimates calculated from different samples are more likely to vary. The effect of the variability within the population can be reduced by increasing the sample size to make the sample more representative of the target population.

2.3.2 Non-sampling error

Non-sampling error is any error in a survey that is not caused by the fact that we have only selected part of the population. Even if we were to undertake a complete enumeration of the population, non-sampling errors might remain. In fact, as the size of the sample increases, the non-sampling errors may get larger, because of such factors as a possible increase in the non-response rate, interviewer errors, and data processing errors.

For the most part we cannot measure the effect that non-sampling errors will have on the results. Because of their nature, these errors may not be totally eliminated. Perhaps the biggest source of non-sampling error is a poorly designed questionnaire. The questionnaire can influence the response rate achieved in the survey, the quality of responses obtained and consequently the conclusions drawn from survey results.

Some common sources of non-sampling error are discussed in the following paragraphs.

Target population
Failure to identify clearly who is to be surveyed. This can result in an inadequate sampling frame, imprecise definitions of concepts, and poor coverage rules.

Non-response
A non-response error occurs when the respondents do not reflect the sampling frame.
This could occur when the people who do not respond to the survey differ from the people who do respond. This often occurs in voluntary response polls. For example, suppose that in an air bag study we asked respondents to call a 0018 number to be interviewed. Because a 0018 call costs $2 per minute, many drivers may not respond. Furthermore, those who do respond may be the people who have had bad experiences with air bags. Thus the final sample of respondents may not even represent the sampling frame. For example,
• telephone polls miss those people without phones;
• household surveys miss homeless people, prisoners, students in colleges, etc.;
• train surveys only target public transport users and tend to over-represent regular public transport users.
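The air bag example can be put into numbers with a small calculation. The sketch below is only an illustration: the population share and the two response probabilities are invented assumptions, chosen to show how a group that is keen to respond can dominate the final sample.

```python
def respondent_share(pop_share, p_respond_group, p_respond_rest):
    """Expected share of a group among the people who actually respond.

    pop_share       -- share of the group in the sampling frame
    p_respond_group -- probability that a member of the group responds
    p_respond_rest  -- probability that anyone else responds
    """
    responding_group = pop_share * p_respond_group
    responding_rest = (1 - pop_share) * p_respond_rest
    return responding_group / (responding_group + responding_rest)


# Assumed figures: 10% of drivers had a bad air bag experience;
# 50% of them call in, but only 5% of the other drivers do.
share = respondent_share(0.10, 0.50, 0.05)
print(f"Share of respondents with a bad experience: {share:.1%}")

# If both groups were equally likely to respond, the share would stay at 10%.
print(f"With equal response rates: {respondent_share(0.10, 0.30, 0.30):.1%}")
```

With these assumed figures, a group that makes up one tenth of the frame supplies more than half of the respondents, so any estimate based on the respondents alone would be badly biased.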
  • 29.
Manufacturers and advertising agencies often use interviews at shopping malls to gather information about the habits of consumers and the effectiveness of ads. A sample of mall shoppers is fast and cheap. “Mall interviewing is being propelled primarily as a budget issue”, one expert told the New York Times. But people contacted at shopping malls are not representative of the entire population. They are richer, for example, and more likely to be teenagers or retired. Moreover, mall interviewers tend to select neat, safe-looking individuals from the stream of customers. Decisions based on mall interviews may not reflect the preferences of all consumers.

In 1991 it was claimed that data showed that right-handed persons live on average almost a decade longer than left-handed or ambidextrous persons. The investigators had compared mean ages at death of people who appeared to be survivors as left, right or mixed handed.
• What is the problem?

The questionnaire
Poorly designed questionnaires with mistakes in wording, content or layout may make it difficult to record accurate answers. The most effective methods of designing a questionnaire are discussed in Section 2.4. Following these principles will help reduce the non-sampling error associated with the questionnaire.

Interviewers
If an interviewer is used to administer the survey, their work has the potential to produce non-sampling error. This can be due to the personal characteristics of the interviewer. For example, an elderly person will often be more comfortable giving information to a female interviewer. Other factors which could cause error are the interviewer’s opinions and characteristics, which may influence the respondent’s answers.

In 1968, one year after a major racial disturbance in Detroit, a sample of black residents was asked:
Do you personally feel that you can trust most white people, some white people, or none at all?
Of those interviewed by whites, 35% answered “Most”, while only 7% of those interviewed by blacks gave this answer. Many questions were asked in this study. Only on some topics, particularly black-white trust or hostility, did the race of the interviewer have a strong effect on the answers given. The interviewer was a large source of non-sampling error in this study.

Respondents
Respondents can also be a source of non-sampling error. They may refuse to answer questions, or provide inaccurate information to protect themselves. They may have memory lapses and/or lack motivation to answer the questionnaire, particularly if the questionnaire is lengthy, overly complicated or of a sensitive nature. Respondent fatigue is a very important factor.

Social desirability bias refers to the effect where respondents provide answers which they think are more acceptable, or which they think the interviewer wants to hear. For example, respondents may state that they have a higher income than is actually the case if they feel this will increase their status.
  • 30.
Respondents may refuse to answer a question which they find embarrassing or choose a response which prevents them from continuing with the questions. For example, if asked the question: “Are you taking oral contraceptive pills for any reason?”, and knowing that if they respond “Yes” they will be asked for more details, respondents who are embarrassed by the question are likely to answer “No”, even if this is incorrect.

Fatigue can be a problem in surveys which require a high level of commitment from respondents. The level of accuracy and detail supplied may decrease as respondents become tired of recording all information. Sometimes interviewer fatigue can also be a problem, particularly when the interviewers have a large number of interviews to conduct.

Processing and collection
Processing and collection errors can be a source of non-sampling error. For example, the results from the survey may be entered incorrectly. The time of year the survey is enumerated can also produce non-sampling error. For example, if the survey is conducted in the school holidays, potential respondents with school children could be away or hard to contact.

The Shere Hite surveys
In 1987, Shere Hite published a best-selling book called Women and Love. The author distributed 100,000 questionnaires through various women’s groups, asking questions about love, sex, and relations between women and men. She based her book on the 4.5% of questionnaires that were returned.
• 95% said they were unhappily married
• 91% of those who were divorced said that they had initiated the divorce
What are the problems with this research?

Exercise 1: In Case 2, it was necessary to study the ‘referral’ pattern for palliative care providers: how many patients they send to hospital (for inpatient or outpatient treatment); how many they refer to consultants for specialist comment; how many to community health programs; and so on.
Two alternative sampling schemes are available: sample a group of palliative care practitioners and study their referral patterns; or sample a group of palliative care patients and study their referral patterns. Discuss the possible advantages and disadvantages of the two schemes.

2.4 Questionnaire design

2.4.1 Introduction

The purpose of a questionnaire is to obtain specific information with tolerable accuracy and completeness. Before the questionnaire is designed, the collection objectives should be defined. These include:
  • 31.
• clarifying the objectives of the survey
• determining who is to be interviewed
• defining the content
• justifying the content
• prioritizing the data that are to be collected. This is important as it makes it easier to discard items if the survey, once developed, is too lengthy.

Careful consideration should be given to the content, wording and format of the questionnaire, as one of the largest sources of non-sampling error is poor questionnaire design. This error can be minimized by considering the objectives of the survey and the required output, and then devising a list of questions that will accurately obtain the information required.

2.4.2 Content of the questionnaire

Relevant questions
It is important to ask only questions that are directly related to the objectives of a survey as a means of minimizing the burden placed on respondents. The concept of a fatigue point, which occurs when respondents can no longer be bothered answering questions, should be recognized, and questions designed so that the respondent is through the form before this point is reached. Towards the end of long questionnaires, respondents may give less thought to their answers and concentrate less on the instructions and questions, thereby decreasing the accuracy of the information they provide. Very long questionnaires can also lead the respondent to refuse to complete the questionnaire. Hence it is necessary to ensure only relevant questions are asked.

Reliable questions
It is important to include questions in a questionnaire that can be easily answered. This objective can be achieved by adhering to the following techniques.

Appropriate recall
If information is requested by recall, the events should be sufficiently recent or familiar to respondents. People tend to remember what they should have done, have selective memories, and move activities that surround the reference period into it.
Minimizing the need for recall improves the accuracy of response.

Common reference periods
To make it easier for the respondent to answer, use reference periods which match those of the respondent’s records.

Results justify efforts
The amount of effort to which a respondent goes to obtain the data must be worth it. It is reasonable to accept a respondent’s estimate when calculating the exact figures would make little difference to the outcome.

Filtering
Respondents should not be asked questions they cannot answer. Filter questions should be asked to exclude respondents from irrelevant questions.
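For a computer-assisted or online questionnaire, filtering can be expressed as simple routing rules. The sketch below is a minimal illustration under assumed conditions: the three questions, their identifiers and the data layout are all hypothetical.

```python
# Hypothetical questionnaire. Each question names the question that follows it;
# a filter question maps each possible answer to a different next question.
QUESTIONS = {
    "Q1": {"text": "Do you own a car?", "next": {"Yes": "Q2", "No": "Q3"}},
    "Q2": {"text": "Which brand of car do you own?", "next": "Q3"},
    "Q3": {"text": "What is your age group?", "next": None},
}


def route(answers, start="Q1"):
    """Return the ids of the questions a respondent is actually asked."""
    asked = []
    current = start
    while current is not None:
        asked.append(current)
        nxt = QUESTIONS[current]["next"]
        if isinstance(nxt, dict):
            # Filter question: the answer decides where to go next.
            nxt = nxt[answers[current]]
        current = nxt
    return asked


print(route({"Q1": "Yes", "Q2": "Toyota", "Q3": "25-34"}))  # all three questions
print(route({"Q1": "No", "Q3": "25-34"}))                   # Q2 is skipped
```

A respondent who answers “No” to the filter question is never shown the question about car brands, which is the behaviour the filtering principle asks for.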
  • 32.
2.4.3 Types of questions

Factual questions
Information is required from these questions rather than an opinion. For example, respondents could be asked about behaviour patterns (e.g., When did you last visit a General Practitioner?).

Classification or demographic questions
These are used to gain a profile of the population that has been surveyed and provide important data for analysis.

Opinion questions
Rather than facts, these questions seek opinion. There are many problems associated with opinion questions:
• a respondent may not have an opinion/attitude towards the subject so the response may be provided without much thought;
• opinion questions are very sensitive to changes in wording;
• it is impossible to check the validity of responses to opinion questions.

Hypothetical questions
The “What would you do if . . . ?” type of question. The problems with these questions are similar to those of opinion questions. You can never be certain how valid any answer to a hypothetical question is likely to be.

2.4.4 Answer formats

Questions can generally be classified as one of two types, open or closed, depending on the amount of freedom allowed in answering the question. When deciding which type of question to use, consideration should be given to the kind of information sought, ease of processing the response, and the availability of the resources of time, money, and personnel.

Open questions
Open questions allow the respondents to answer the question in their own words. These questions allow many possible answers and can collect exact values from a wide range of possible values. Hence, open questions are used when the list of responses is very long and not obvious.

The major disadvantage of open questions is that they are far more demanding than closed questions, both to answer and to process. These questions are most commonly used where a wide range of responses is expected.
Also, the answers to these questions depend on the respondent’s ability to write or speak as much as on their knowledge. Two respondents might have the same knowledge and opinions, but their answers may seem different because of their varying abilities.
  • 33.
Question                                              Format

Which country makes the best cars?                    Open ended
...............................................

Which country makes the best cars?                    Multiple choice questions
  1. USA
  2. Germany
  3. Japan

Which country makes the best cars?                    Partially closed questions
  1. USA
  2. Germany
  3. Japan
  4. Other (please specify)

For the list provided, indicate which brand/s of      Checklist questions
cars you have owned?
  1. Ford
  2. Toyota
  3. BMW

I believe Japanese cars are less reliable than        Likert scale (opinion) questions
European cars.
  Strongly agree   Agree   No opinion   Disagree   Strongly disagree
        1            2         3           4               5

Closed questions
Closed questions ask the respondents to choose an answer from the alternatives provided. These questions should be used when the full range of responses is known. Closed questions are far easier to process than open questions. The main disadvantage of closed questions is that the reasons behind a particular selection cannot be determined.

There are a number of types of closed questions.
• Limited choice questions require the respondent to choose one of two mutually exclusive answers. For example, yes/no.
• Multiple choice questions require the respondent to choose from a number of responses provided.
• Checklist questions allow a respondent to choose more than one of the responses provided.
• Partially closed questions provide a list of alternatives where the last alternative is “Other, please specify”. These questions are useful when it is difficult to list all possible choices.
• Opinion (Likert) scale questions seek to locate a respondent’s opinion on a rating scale with a limited number of points. For example, a five point scale measure of strong and weak attitudes would ask the respondent whether they strongly agree/agree/are neutral/disagree/strongly disagree with a particular statement of opinion.
  • 34.
  A three point scale, by contrast, would only measure whether they agree, disagree or are neutral. Opinion scales of this sort are called Likert scales. Five point scales are best because:
  –
  –
  –

Response categories
When questions have categories provided, it is important that every response is catered for.

Number of categories
The quality of the data can be affected if there are too few categories, as the respondent may have difficulty finding one which accurately describes their situation. If there are too many categories the respondent may also have difficulty finding one which accurately describes their situation.

Don’t Know
A ‘Don’t Know’ category can be included so respondents are not forced to make decisions or express attitudes that they would not normally make. Excluding the option is not usually good; however, it is hard to predict the effect of including it. The decision of whether or not to include a ‘Don’t Know’ option depends, to a large extent, on the subject matter.

I was gratified to be able to answer promptly, and I did. I said I didn’t know.
Mark Twain, Life on the Mississippi

2.4.5 Wording of questions

Language
Questions which employ complex or technical language or jargon can confuse or irritate respondents. Respondents who do not understand the question may be unwilling to appear ignorant by asking the interviewer to explain the question or, if an interviewer is not present, may not answer or may answer incorrectly.

Ambiguity
If ambiguous words or phrases are included in a question, the meaning may be interpreted differently by different people. This will introduce errors in the data since different respondents will effectively be answering different questions.

For example, “Why did you fly to New Zealand on Qantas airlines?”. Most might interpret this question as was intended, but it contains three possible questions, so the response might concern any of these:
• I flew (rather than another mode of travel) because . . .
• I went to New Zealand because . . .
• I selected Qantas because . . .
  • 35.
Double-barreled questions
When one question contains two concepts, it is known as a double-barreled question. For example, “How often do you go grocery shopping and do you enjoy it?”.

Each concept in the question may have a different answer, or one concept may not be relevant, so respondents may be unsure how to respond. The interpretation of the answers to these questions is almost impossible. Double-barreled questions should be split into two or more separate questions.

Leading questions
Questions which lead respondents to answers can introduce error. For example, the question “How many days did you work last week?”, if asked without first determining whether respondents did in fact work in the previous week, is a leading question. It implies that the person would have been at work. Respondents may answer incorrectly to avoid telling the interviewer that they were not working.

Unbalanced questions
“Are you in favour of euthanasia?” is an unbalanced question because it provides only one alternative. It can be reworded to ‘Do you favour or not favour euthanasia?’, to give respondents more than one alternative.

Similarly, the use of a persuasive tone can affect the respondent’s answers. Wording should be chosen carefully to avoid a tone that may produce bias in responses.

Recall/memory error
Respondents tend to remember what should have been done rather than what was done. The quality of data collected from recall questions is influenced by the importance of the event to the respondent and the length of time since the event took place. Subjects of greater interest or importance to the respondent, or events which happen infrequently, will be remembered over longer periods and more accurately. Minimizing the recall period also helps to reduce memory bias.

Telescoping is a specific type of memory error. This occurs if the respondent reports events as occurring either earlier or later than they actually occurred.
Error occurs when respondents include details of an event which actually occurred outside the specified reference period.

Sensitive questions
Questions on topics which respondents may see as embarrassing or highly sensitive can produce inaccurate answers. If respondents are required to answer questions with information that might seem socially undesirable, they may provide the interviewer with responses they believe are more ‘acceptable’. If such questions are placed at the beginning of the questionnaire, they could lead to non-response if respondents are unwilling to continue with the remaining questions.

For example, “Approximately how many cans of beer do you consume each week, on average?”
1. None
  • 36.
2. 1–3 cans
3. 4–6 cans
4. More than 6

A respondent might answer response 2 or 3 rather than admit to consuming the greatest quantity on the scale. Consider extending the range of choices far beyond what is expected. The respondent can then select an answer closer to the middle and feel more in the normal range.

In 1980, the New York Times/CBS News Poll asked a random sample of Americans about abortion. When asked “Do you think there should be an amendment to the Constitution prohibiting abortions, or should not there be such an amendment?” 29% were in favour and 62% were opposed. The rest of the sample were uncertain. The same people were later asked a different question: “Do you believe there should be an amendment to the Constitution protecting the life of the unborn child, or should not there be such an amendment?” Now 50% were in favour and only 39% were opposed.

Acquiescence
This situation arises when there is a long series of questions for which respondents answer with the same response category. Respondents get used to providing the same answer and may answer inaccurately.

2.4.6 Questionnaire format

Including an introduction
It can be advantageous to include an introductory statement or explanation at the beginning of a survey. The introduction may include such information as the purpose of the survey or the scope of collection. It will aid the respondent when answering the questions if they know why the information is being sought. The respondent should be given a context in which to frame his or her answers. An assurance of confidentiality will provide respondents with confidence that the results will not be obtained by unwanted parties.

Question and page numbers
To ensure that the questionnaire can be easily administered by interviewers or respondents, the pages of the questionnaire and the questions should be numbered consecutively with a simple numbering system.
Question numbers provide sign-posts along the way. They help if remedial action is required later, and you want to refer the interviewer or respondent back to a particular place.

Sequencing
The questions in a questionnaire should follow an order which is logical and flows smoothly from one question to the next. The questionnaire layout should have the following characteristics.
  • 37.
Related questions grouped
Questions which are related should be grouped together and where necessary placed into sections. Sections should contain an introductory heading or statement. If possible, question ordering should try to anticipate the order in which respondents will supply information. It shows good survey design if a question not only prompts an answer but also prompts an answer to a question following shortly.

Question ordering
It is important to be aware that earlier questions can influence the responses to later questions, so the order of questions should be carefully decided. In attitudinal questions, it is important to avoid conditioning respondents in an early question which could then bias their responses to later questions. For example, you should ask about awareness of a concept before any other mention of the concept.

Respondent motivation
Whenever possible, start the questionnaire with easy and pleasant questions to promote interest in the survey and give the respondent confidence in their ability to complete the survey. The opening questions should ensure that the particular respondent is a member of the survey population.

Questions that are perceived as irritating or obtrusive tend to get a low response rate and may effectively trigger a refusal from the respondent. These questions need to be carefully positioned in a questionnaire where they are least likely to be sensitive.

It is also important that respondents are only asked relevant questions. Respondents may become annoyed and lose interest if this does not occur. Include filter questions to direct respondents to skip questions which do not apply to them. Filter questions often identify sub-populations. For example,
“Do you usually speak English at home?”
  Yes (Go to Q34)
  No (Go to Q10)

Questionnaire layout
The questionnaire layout should be aesthetically pleasing, so the layout does not contribute to respondent fatigue.
Things that can interfere with the answering of a questionnaire are: unclear instructions and questions, insufficient space to provide answers, hard-to-read text, difficulty in understanding language, and back-tracking through the form. Many of these things result from bad form design and are avoidable.

Only include essentials on the questionnaire form. Keep the amount of ink on the form to the minimum necessary for the form to work properly. Anything that is not necessary contributes to the fatigue of the respondent and to the subsequent detriment of the data quality.
  • 38.
General layout

Consistency of layout: If consistency and logical patterns are introduced into the form design, it eases the form filler’s task. Patterns that can be useful are:
• white spaces for responses
• using the same question type throughout the form
• using the same layout throughout the form
• using a different style, consistently, for instructions or directions.

Type size: A font size between 10 and 12 points is considered the best in most circumstances. If the respondent does not have perfect vision, or ideal working conditions, small fonts can cause problems.

Use of all upper-case text: It is best to avoid upper-case text. Upper-case text has been shown to be hard to read, especially where large amounts of text are involved. Words lose their shape when in upper case, becoming converted to rectangles. Text in upper case should be reserved for titles or for emphasis, but this can often be done just as well using other methods, such as bold, italics, or slightly larger type size.

Line length: As the eye has a clear focus range of only a few degrees, lines should be kept short. It takes the eye several movements to scan a line of text. If more than 2 or 3 such movements occur then the eye can become fatigued. There is a tendency for the eye to lose track of which line it is reading. This leads to backtracking through the text or misinterpretation.

Character and line spacing: It is very important to leave enough space on a form for answers. Research has shown that forms requiring hand-written responses need a distance of 7–8mm between lines and a width of 4–5mm for each possible character.

Response layout

Obtaining responses: A popular way of obtaining responses is using tick boxes. However, it is usually preferable to use a labelled list (e.g., a, b, c, . . . ) and ask respondents to circle their response. This makes coding and data entry easier.
If a written response is required it is best to provide empty answer spaces, with lines made up of dots.

Positioning of responses: Vertical alignment of responses is preferred to horizontal alignment. It is easier to read up and down the list, and select the correct box, than to read across the page and locate an item in a horizontal string. Captions to the left of the answer box are easier for respondents to complete.

Order of response options: The order of responses needs to be considered, as it can be a source of bias. The options presented first may be selected because they make an impact on respondents or because respondents lose concentration and do not hear or read the remaining options. The last options may be chosen because they were easily recalled, particularly if respondents are faced with a long list of options. Long or complex response options may also make recall more difficult and increase the effects due to the order of response options.
  • 39.
Prompt card: If the questionnaire is interviewer based, and a number of response options are given for some questions, then a prompt card may be appropriate. A prompt card is a list of possible responses to a question, displayed on a separate card which is shown by the interviewer to assist respondents. This helps to decrease error resulting from respondents being unable to remember all the options read out. However, respondents with poor eyesight, migrants with limited English or adults with literacy problems will experience difficulties in answering accurately.

Exercise 2: (Case 2) The questionnaire on pages 47–48 was an early draft of the questionnaire prepared by the client. The questionnaire on pages 49–51 is a later draft of the questionnaire after I had provided the client with some advice. See if you can determine why each of the changes has been made. How could you further improve the questionnaire?

2.4.7 Pretesting the questionnaire

A pretest of a questionnaire should be considered mandatory. Although the designer of the questionnaire would have reviewed the drafted questionnaire meticulously on all points of good design, it is still likely to contain faults. Normally, a number of these emerge when the form is used in the field, because the researcher did not completely anticipate what would take place. The only way that these faults may be fully detected is by actually administering the survey with the types of respondents who would be sampled in the study.

Each type of testing is used at a different stage of survey development and aims to test different aspects of the survey.

Skirmishing
Skirmishing is the process of informally testing questionnaire design with groups of respondents.
The questionnaire is basically unstructured and is tested with a group of people who can provide feedback on issues such as each question’s frame of reference, the level of knowledge needed to answer the questions, the range of likely answers to questions and how answers are formulated by respondents. Skirmishing is also used to detect flaws or awkward wording of questions as well as to test alternative designs. At this stage we may use open-ended response categories to work out likely responses. The questionnaire should be redrafted after skirmishing.

Focus groups
A skirmish tests the questionnaire design against general respondents whilst focus groups concentrate on a specific audience. For example, a survey studying the effects of living on unemployment benefits could have a group of unemployed people as a focus group.

A focus group can be used to test questions directed at small sub-populations. For example, if we were looking at community services we may have a filter question to target disabled people. Since there may not be many disabled people chosen in the sample, we need to test the questions on a focus group of disabled people, which is a biased sample.
  • 40.
Observational studies
Respondents complete a draft questionnaire in the presence of an observer during an observational study. Whilst completing the form the respondents explain their understanding of the questions and the method required in providing the information. These studies can be a means of identifying problem questions through observations, questions asked by the respondents, or the time taken to complete a particular question. Data availability and the most appropriate person to supply the information can also be gauged through observational studies. The form is being tested and not the respondent, and this should be stressed to the respondent.

Pilot testing
Pilot testing involves formally testing a questionnaire or survey with a small representative sample of respondents. Semi-closed questions are usually used in pilot testing to gather a range of likely responses which are used to develop a more highly structured questionnaire with closed questions. Pilot testing is used to identify any problems associated with the form, such as questionnaire format, length and question wording, and allows comparison of alternative versions of a questionnaire.

2.5 Data processing

Data processing involves translating the answers on a questionnaire into a form that can be manipulated to produce statistics. In general, this involves coding, editing, data entry, and monitoring the whole data processing procedure. The main aim of checking the various stages of data processing is to produce a file of data that is as error free as possible.

2.5.1 Data coding

Up to this point, the questionnaire has been considered mainly as a means of communication with the respondent. Just as important, the questionnaire is a working document for the transfer of data on to a computer file. Consequently it is important to design the questionnaire to facilitate data entry.
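One way to picture a questionnaire designed for data entry is to pair each printed response with a short code before any forms are processed. The sketch below is a minimal illustration with assumptions: the question, the letter codes and the function name are hypothetical, and answers that match no printed option are kept verbatim under an “Other” code so that they can be dealt with later.

```python
# Hypothetical coding frame for one partially closed question:
# "Which country makes the best cars?"
CODING_FRAME = {
    "USA": "a",
    "Germany": "b",
    "Japan": "c",
}
OTHER_CODE = "d"  # "Other (please specify)"


def code_response(answer):
    """Return (code, verbatim_text) for one raw answer.

    Answers matching a printed option get that option's code; anything else
    gets the "Other" code and the text is kept for later manual coding.
    """
    cleaned = answer.strip()
    for label, code in CODING_FRAME.items():
        if cleaned.lower() == label.lower():
            return code, None
    return OTHER_CODE, cleaned


raw_answers = ["Japan", " germany ", "Sweden"]
print([code_response(a) for a in raw_answers])
```

Storing a one-letter code rather than the full text keeps the data file compact, while the verbatim “Other” answers remain available for a coder to review.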
Unless all the questions on a questionnaire are “closed” questions, some degree of coding is required before the survey data can be sent for punching. The appropriate codes should be devised before the questionnaires are processed, and are usually based on the results of pretesting. Coding consists of labelling the responses to questions (using numerical or alphabetic codes) in order to facilitate data entry and manipulation. Codes should be simple and easy to apply. For example, if Question 1 has four responses then those four responses could be given the codes a, b, c, and d. The advantage of coding is that data can be stored simply as a few-digit code, rather than as lengthy alphabetical descriptions which almost certainly will not be easy to categorize.

Coding is relatively expensive in terms of resource effort. However, improvements are always being sought by developing automated techniques to cover this task. Other options include self coding, where respondents answer with the appropriate code, or having the interviewer perform the coding task.

Before the interviewing begins, the coding frame for most questions can be devised. That is, the likely responses are obvious from previous similar surveys or thorough pilot testing, allowing those responses and relevant codes to be printed on the questionnaire. An “Other (Please Specify)” answer code is often added to the end of a question with space for interviewers to write the answer. The standard instruction to interviewers in doubt about any precodes is that they should write the answers on the questionnaire in full so that they can be dealt with by a coder later.

2.5.2 Data entry

Ensure that the questionnaire is designed so that data entry personnel have minimal handling of pages. For example, all codes should be on the left (or right) hand side of the page. It is advisable to use trained data entry people to enter the data. It is quicker and more reliable and therefore more cost effective.

2.6 Sampling schemes

When you have a clear idea of the aims of the survey and the data requirements, the degree of accuracy required, and have considered the resources and time available, you are in a position to make a decision about the size and the form of collection of sampling units. The two qualities most desired in a sample (besides that of providing the appropriate findings) are its representativeness and stability. Sample units may be selected in a variety of ways. The sampling schemes fall into two general types: probability and non-probability methods.

2.6.1 Non-probability samples

If the probability of selection for each unit is unknown, or cannot be calculated, the sample is called a non-probability sample. For non-probability samples, since there is no control over representativeness of the sample, it is not possible to accurately evaluate the precision of estimates (i.e., closeness of estimates under repeated sampling of the same size).
However, where time and financial constraints make probability sampling infeasible, or where knowing the level of accuracy in the results is not an important consideration, non-probability samples do have a role to play. Non-probability samples are inexpensive, easy to run, and require no frame. This form of sampling is popular amongst market researchers and political pollsters as a lot of their surveys are based on a pre-determined sample of respondents of certain categories.

One common method of non-probability sampling is voluntary response polling. A general appeal is made (often via television) for people to contact the researcher with their opinion. Voluntary response samples are rarely useful because they over-represent people with strong opinions, most often negative ones.
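The over-representation in voluntary response polls can be illustrated with a small calculation. The sketch below is only an illustration: the opinion scale, the population shares and the response rates are all made-up numbers, chosen so that people with strong (especially negative) opinions are more likely to respond.

```python
# Hypothetical population: opinion scores from -2 (strongly negative)
# to +2 (strongly positive), with the share of people holding each view.
population_share = {-2: 0.10, -1: 0.20, 0: 0.40, 1: 0.20, 2: 0.10}

# Hypothetical chance that a person with each opinion bothers to respond.
response_rate = {-2: 0.30, -1: 0.05, 0: 0.01, 1: 0.03, 2: 0.15}


def respondent_share(population_share, response_rate):
    """Share of each opinion among the people who actually respond."""
    weights = {k: population_share[k] * response_rate[k] for k in population_share}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def mean_score(share):
    """Average opinion score under a given distribution of opinions."""
    return sum(score * p for score, p in share.items())


respondents = respondent_share(population_share, response_rate)
population_mean = mean_score(population_share)
respondent_mean = mean_score(respondents)
strong_in_population = population_share[-2] + population_share[2]
strong_among_respondents = respondents[-2] + respondents[2]
```

With these assumed numbers the population is balanced (mean score of zero), yet the respondents lean clearly negative, and people with strong opinions make up about 69% of respondents although they are only 20% of the population.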
2.6.2 Probability sampling schemes

Probability sampling schemes are those in which the population elements have a known chance of being selected for inclusion in a sample. Probability sampling rigorously adheres to a precisely specified system that permits no arbitrary or biased selection. There are four main types of probability sampling schemes.

Simple Random Sample: If a sample of size n is drawn from a population of size N in such a way that every possible sample of size n has the same chance of being selected, the sampling procedure is called simple random sampling. The sample thus obtained is called a simple random sample. This is the simplest form of probability sample to analyse.

Stratified Random Sample: A stratified random sample is one obtained by separating the population elements into non-overlapping groups, called strata, and then selecting a simple random sample from each stratum. This can be useful when a population is naturally divided into several groups. If the results differ greatly from one stratum to another, then it is possible to obtain more efficient estimators (and therefore more precise results) than would be possible without stratification.

Systematic Sample: A sample obtained by randomly selecting one element from the first k elements in the frame and every kth element thereafter is called a 1-in-k systematic sample, with a random start. This is obviously a simple method if there is a list of elements in the frame. Systematic sampling will provide better results than simple random sampling when the elements within a systematic sample have larger variance than the population as a whole. This can occur when the frame is ordered.

Cluster Sample: A cluster sample is a probability sample in which each sampling unit is a collection, or cluster, of elements. The population is divided into clusters and one or more of the clusters is chosen at random and sampled.
Sometimes the entire cluster is sampled; on other occasions a simple random sample is taken from within each chosen cluster. Cluster sampling is usually done for administrative convenience, and is especially useful if the population has a hierarchical structure.

A comparison of these four sampling schemes appears in the table following the example below.

Example (Case 2): A few years ago, I advised the Department of Health and Community Services on a survey of palliative care patients in Victoria.

Objective: To estimate the proportion of palliative care patients in Victorian hospitals.
Difficulties: What is a “palliative care patient”? Proportion of what?
Target population: Patients in acute beds at the time of the survey?
Survey population: All patients in acute beds in Victorian hospitals except for very small (< 10 bed) country hospitals.
Sampling scheme: Stratified (hospital types) and clustered (hospitals). Random selection of hospitals within each stratum. Total coverage of patients in the selected hospitals.
Sample: All patients in the 18 hospitals selected out of 115 hospitals in Victoria.
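The sampling scheme in Case 2 can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure actually used in the survey: the three hospital types, the number of hospitals in each type, and the allocation of the 18 sampled hospitals across types are all hypothetical, since the notes give only the totals (18 out of 115).

```python
import random

rng = random.Random(42)  # fixed seed so the selection is repeatable

# Hypothetical frame: 115 hospitals split into three made-up hospital types.
# The real strata and their sizes are not given in the notes.
stratum_sizes = {"metropolitan": 40, "regional": 35, "country": 40}
hospitals = {
    stratum: [f"{stratum}-{i:02d}" for i in range(1, size + 1)]
    for stratum, size in stratum_sizes.items()
}

# Hypothetical allocation of the 18 sampled hospitals across the strata.
allocation = {"metropolitan": 7, "regional": 5, "country": 6}

# Stage 1: simple random sample of hospitals (clusters) within each stratum.
selected = {
    stratum: rng.sample(hospitals[stratum], allocation[stratum])
    for stratum in hospitals
}


def patients_to_survey(patients_by_hospital, selected):
    """Stage 2: total coverage, so every patient in a selected hospital is surveyed."""
    chosen = {h for hs in selected.values() for h in hs}
    return [p for h, ps in patients_by_hospital.items() if h in chosen for p in ps]


n_selected = sum(len(hs) for hs in selected.values())
```

Here `patients_to_survey` expects a dictionary mapping each hospital name to its list of patients; it simply returns every patient in the selected hospitals, which is what "total coverage" means in this design.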
Comparison of the four sampling schemes

Scheme: Simple Random Sample
How to select sample: Assign numbers to elements in sampling. Use a random number table or random number generator to select sample.
Strengths/Weaknesses:
• The basic building block.
• Simple, but often costly.
• Cannot use unless we can assign a number to each element in a target population.

Scheme: Stratified Sample
How to select sample: Divide population into groups that are similar within and different between on the variable of interest. Use random numbers to select the sample from each stratum.
Strengths/Weaknesses:
• With proper strata, can produce very accurate estimates.
• Less costly than simple random sampling.
• Must stratify target population correctly.

Scheme: Systematic Sample
How to select sample: Select every kth element from a list after a random start.
Strengths/Weaknesses:
• Produces very accurate estimates when elements in a population exhibit order.
• Used when simple random or stratified sampling is impractical: e.g., the population size is not known.
• Simplifies the selection process.
• Do not use with periodic populations.

Scheme: Cluster Sample
How to select sample: Randomly choose clusters and sample all elements within each cluster.
Strengths/Weaknesses:
• With proper clusters, can produce very accurate estimates.
• Useful when sampling frame unavailable or travel costs high.
• Must cluster target population correctly.
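The selection steps for the four schemes can be sketched with Python's standard `random` module. This is a minimal sketch under stated assumptions: the frame of 100 numbered elements, the two strata, the ten clusters and the sample sizes are all hypothetical and serve only to show how each selection rule works.

```python
import random


def simple_random_sample(frame, n, rng):
    """Draw n elements so that every possible sample of size n is equally likely."""
    return rng.sample(frame, n)


def systematic_sample(frame, k, rng):
    """1-in-k systematic sample: random start in the first k elements, then every kth."""
    start = rng.randrange(k)
    return frame[start::k]


def stratified_sample(strata, n_per_stratum, rng):
    """Simple random sample of a fixed size from each non-overlapping stratum."""
    return {name: rng.sample(units, n_per_stratum) for name, units in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Choose whole clusters at random and keep every element within them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}


rng = random.Random(2008)    # fixed seed so the run is repeatable
frame = list(range(1, 101))  # hypothetical frame: elements numbered 1 to 100

srs = simple_random_sample(frame, 10, rng)
sys_sample = systematic_sample(frame, 10, rng)

# Hypothetical strata: the lower and upper halves of the frame.
strata = {"low": frame[:50], "high": frame[50:]}
strat = stratified_sample(strata, 5, rng)

# Hypothetical clusters: ten consecutive blocks of ten elements.
clusters = {f"block{i}": frame[i * 10:(i + 1) * 10] for i in range(10)}
clus = cluster_sample(clusters, 2, rng)
```

Note that the simple random, systematic and stratified functions all need a numbered list of elements, whereas the cluster function only needs a list of clusters. That is why cluster sampling remains usable when no frame of individual elements is available.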