Slides from the International Advanced School on Empirical Software Engineering 2015, held as part of the Empirical Software Engineering International Week in Beijing. The slides are posted with the permission of the main organiser Roel Wieringa.
5. 4. Methodology (the study of research methods)
a. Notion of conceptual framework; statements about them
b. Notion of generalization; statements about them
3. Theory (statement about many research results)
a. Conceptual framework
b. Generalization
2. Research questions (what, how, when, where, …, why) aimed at generalizable knowledge, research method, and research result
1. Practice domain: SW, methods, tools, processes (as is / to be)
(Figure annotations: looking at research from the sky; general knowledge is the gold we are after; hard work to grow knowledge; grass roots)
• Everything on the slides in this talk, except the examples, is at level 4.
• The examples on these slides contain explicit level indications.
• The separate example slides report on research that contains levels 2 and 3.
• The reported research studies some aspect of level 1.
6. Agenda
Time Topic
09:00 – 10:30 Opening and Introduction
10:30 – 11:00 Coffee break
11:00 – 12:30 Inferring Theories from Data
12:30 – 13:30 Lunch
13:30 – 15:00 Designing Research based on Theories
15:00 – 15:30 Coffee break
15:30 – 16:30 Hands-on Working Session and Q&A
16:30 – 17:00 Wrap up (all)
8. Scientific theories
• A theory is a belief that there is a pattern in phenomena
• A scientific theory is a theory that
– Has survived tests against experience
• Observation, measurement
• Possibly experiment, simulation, trials
– Has survived criticism by critical peers
• Anonymous peer review
• Publication
• Replication
9. Examples (level 3)
• Theory of cognitive dissonance
• Theory of electromagnetism
• The Balance theorem in social networks
• Theories X, Y, Z, and W of (project) management
• Technology Acceptance Model
• Hannay et al. “A Systematic Review of Theory Use in Software Engineering Experiments”. IEEE TSE 33(2), February 2007
• Lim et al. “Theories Used in Information Systems Research: Identifying Theory Networks in Leading IS Journals”. ICIS 2009, paper 91.
• Non-examples
– Speculations based on imagination rather than fact: Conspiracy theories
about who killed John Kennedy
– Opinions that cannot be refuted: The Dutch lost the World Championship
because they play like prima donnas
12. The structure of scientific theories
1. Conceptual framework
– Constructs used to express beliefs about patterns in phenomena
– E.g. the concepts of beamforming, of multi-agent planning, of data location compliance. (level 3)
2. Generalizations
– Statements in terms of these concepts that express beliefs about patterns in phenomena.
– E.g. the relation between angle of incidence and phase difference,
– a statement about delay reduction at airports. (level 3)
• Generalizations have a scope, a.k.a. the target of generalization
13. The structure of design theories
1. Conceptual framework
2. Generalizations
– Artifact specification X Context assumptions → Effects
– Effects satisfy a requirement to some extent
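As a rough illustration, the schema above can be written down as a small data structure; the sketch below is hypothetical, and all names and example values in it are invented.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class DesignTheory:
    """Hypothetical encoding of: Artifact specification X Context assumptions -> Effects."""
    artifact_spec: str              # what the designed artifact is specified to be/do
    context_assumptions: List[str]  # conditions assumed to hold where the artifact is used
    predicted_effects: List[str]    # effects expected when the spec and assumptions both hold
    requirement: str                # stakeholder requirement the effects should satisfy

    def summary(self) -> str:
        # "Effects satisfy a requirement to some extent": the extent is a judgement,
        # not something a data structure can decide, so we only state the claim.
        return (f"If {self.context_assumptions} hold and '{self.artifact_spec}' is used, "
                f"then {self.predicted_effects} are expected, which should (to some extent) "
                f"satisfy '{self.requirement}'.")

# Invented example instance, for illustration only.
theory = DesignTheory(
    artifact_spec="daily stand-up meeting format",
    context_assumptions=["co-located team", "fewer than 10 developers"],
    predicted_effects=["earlier detection of blocked tasks"],
    requirement="keep the project on schedule",
)
print(theory.summary())
```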
14. Two kinds of conceptual structures
1. Architectural structures: Class of systems, components with capabilities, interactions
– E.g. entities, (de)composition, taxonomies, cardinality, events, processes, procedures, constraints, … (level 4)
– Useful for case-based research (observational case studies, case experiments, simulations, technical action research)
– Typically qualitative
2. Statistical structures: Population, variables with probability distributions, relations among variables (see the sketch below)
– Useful for sample-based research (surveys, statistical difference-making experiments)
– Typically quantitative
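As an illustration of item 2 above, a minimal sketch of a statistical conceptual structure made concrete: a hypothetical population, two variables with distributions, and an assumed relation among them. All numbers and names are invented.

```python
import random

random.seed(1)

# Hypothetical statistical structure: a population of software projects with
# two variables (team_size, defect_density) and an assumed relation between them.
def sample_project():
    team_size = random.randint(2, 20)                # variable with a probability distribution
    noise = random.gauss(0, 0.5)
    defect_density = 1.0 + 0.1 * team_size + noise   # assumed relation among the variables
    return team_size, defect_density

# A sample drawn from the population, and one sample statistic.
projects = [sample_project() for _ in range(100)]
mean_density = sum(d for _, d in projects) / len(projects)
print(f"mean defect density in the sample: {mean_density:.2f}")
```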
15. • Prechelt: What is a theory, the structure of
theories
• Vriezekolk: The structure of theories
• Méndez: The structure of theories
17. Uses of a conceptual framework
• Framing a problem or artifact: choosing which concepts to
use
– Using the theory of infectious diseases to understand a patient’s symptoms
– Using concepts of force & energy to understand the behavior of a machine
– Using the concept of a coordination gatekeeper to understand a distributed SE project (all three examples at level 1)
• Describe a problem or specify an artifact: using the concepts
• Generalize about the problem or artifact
• Analyze a problem or artifact (i.e. analyze the framework)
18. Functions of generalizations
• Explanation: explain phenomena by identifying causes, mechanisms, or reasons
• Prediction: state what will happen in the future
• Design: use generalizations to justify a design choice
19. • Prechelt: the use of theories
• Vriezekolk: the use of theories
• Méndez: the use of theories
20. Usability of theories
• When is a design theory
Context assumptions X Artifact design → Effects
usable by a practitioner?
1. He/she is able to recognize the Context assumptions,
2. and to acquire/build the Artifact under the constraints of practice,
3. the Effects will indeed occur, and
4. he/she can observe this, and
5. they will contribute to stakeholder goals / satisfy requirements
• The practitioner has to assess the risk that each of these fails
21. • Prechelt: the usability of theories
• Vriezekolk: the usability of theories
• Méndez: the usability of theories
22. Agenda
Time Topic
09:00 – 10:30 Opening and Introduction
10:30 – 11:00 Coffee break
11:00 – 12:30 Inferring Theories from Data
12:30 – 13:30 Lunch
13:30 – 15:00 Designing Research based on Theories
15:00 – 15:30 Coffee break
15:30 – 16:30 Hands-on Working Session and Q&A
16:30 – 17:00 Wrap up (all)
25. • Architectural explanation must be the basis of the
analogic generalization;
• Otherwise, we engage in wishful/magical thinking
– You have observed that some small companies did not put a customer representative on site in an agile project;
– you explain this as a result of tight resources (level 3);
– you generalize by analogy that this will happen in (almost)
all small companies (level 3).
[Diagram: Data → (description) → Observations → (abduction, architectural) → Explanations → (analogy, architectural) → Generalizations]
26. Sample-based inference
• Descriptive inference: Describe sample statistics
• Statistical inference: Generalize to population parameters
• Abductive inference: Provide an explanation
• Analogic inference: Expand the scope of a theory based on similarity
[Diagram: Data → (description) → Observations → (statistical inference) → Generalizations; Observations → (abduction) → Explanations → (analogy) → Generalizations]
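A small numeric sketch of the first two inference steps, using invented data: descriptive inference summarizes the sample, and statistical inference generalizes to a population parameter (here via an approximate confidence interval). Abductive and analogic inference remain argumentative rather than computational steps.

```python
import math
import statistics

# Invented sample: defect-fix times (hours) observed in 12 projects.
sample = [4.1, 5.3, 3.8, 6.0, 4.7, 5.1, 4.9, 5.6, 4.4, 5.0, 4.2, 5.8]

# Descriptive inference: describe sample statistics.
mean = statistics.mean(sample)
sd = statistics.stdev(sample)
print(f"sample mean = {mean:.2f}, sample sd = {sd:.2f}")

# Statistical inference: generalize to the population parameter.
# Approximate 95% confidence interval for the population mean,
# assuming normality; 2.20 is the t-value for 11 degrees of freedom.
half_width = 2.20 * sd / math.sqrt(len(sample))
print(f"95% CI for the population mean: [{mean - half_width:.2f}, {mean + half_width:.2f}]")

# Abductive inference (why is it like this?) and analogic inference
# (does it also hold for similar populations?) are argued, not computed.
```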
27. • Causal explanations can be supported by sample-based designs (treatment group / control group)
• Generalization from a population to similar populations must be based on architectural explanation
– In an experiment with a sample of students you observe a difference between treatment group and control group;
– by randomness you generalize to the population of students;
– your explanation: this difference is caused by the treatment (level 3);
– in turn explained by cognitive processes of students (level 3);
– generalized by analogy to novice software engineers (level 3).
[Diagram: Data → (description) → Observations → (statistical inference) → Generalizations; Observations → (abduction, causal & architectural) → Explanations → (analogy, architectural) → Generalizations]
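A minimal sketch, with invented scores, of the statistical step in such a treatment-group/control-group comparison. The causal explanation (the difference is caused by the treatment) and the architectural explanation needed for wider generalization are separate, argued steps.

```python
from math import sqrt
from statistics import mean, stdev

# Invented scores from a hypothetical student experiment.
treatment = [72, 68, 75, 80, 70, 77, 74, 69]
control   = [65, 62, 70, 66, 64, 68, 63, 67]

# Statistical inference: Welch's t statistic for the difference in group means.
n1, n2 = len(treatment), len(control)
m1, m2 = mean(treatment), mean(control)
standard_error = sqrt(stdev(treatment) ** 2 / n1 + stdev(control) ** 2 / n2)
t = (m1 - m2) / standard_error
print(f"mean difference = {m1 - m2:.1f}, Welch t = {t:.2f}")

# A large |t| supports generalizing the difference to the sampled population
# (here: students). Generalizing by analogy to novice software engineers
# additionally needs the architectural explanation (the students' cognitive
# processes) discussed on the slide.
```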
28. • Vriezekolk: Inferring theories from data
• Méndez: inferring theories from data
• Prechelt: Applying/inferring theories to/from
data
29. Agenda
Time Topic
09:00 – 10:30 Opening and Introduction
10:30 – 11:00 Coffee break
11:00 – 12:30 Inferring Theories from Data
12:30 – 13:30 Lunch
13:30 – 15:00 Designing Research based on Theories
15:00 – 15:30 Coffee break
15:30 – 16:30 Hands-on Working Session and Q&A
16:30 – 17:00 Wrap up (all)
31. The research setup
• In experiments we are interested in the effect of the
treatment on the OoS
– Requires the capability to apply treatment and control
• In observational studies we are interested in the structure and
dynamics of the OoS itself
– Only weak support for causality
[Diagram, research setup: population; sample of objects of study, each representing one or more population elements; treatment instruments; measurement instruments]
32. • Case-based designs
– provide architectural explanations
– generalize by architectural analogy
– Nondeterminism across cases is not quantified
• Sample-based designs
– Collect sample statistics
– Infer properties of a distribution over the population
– May be purely descriptive!
– Possibly a causal explanation
– To generalize further, an architectural explanation is needed too
– Nondeterminism within the population is quantified, but not across analogous populations
33. Field versus lab
• If a phenomenon cannot be (re)produced in the lab, it can only be investigated in the field
• Which of the following designs can be done in a lab?
– No treatment (observational study): observational case study (case-based inference); survey (sample-based inference)
– Treatment (experimental study): single-case mechanism experiment (e.g. simulation, test of an individual OoS) or technical action research (e.g. test with a client, pilot project) for case-based inference; statistical difference-making experiment (treatment group / control group designs) for sample-based inference
37. Hands-on Working Session
1. What is your research question?
2. Describe a research setup to answer it
3. What inferences do you plan to base on this setup?
Groups of 3
• 15:30 Each person first drafts a flipchart with his/her answers for own research
• 15:45 Each group member comments on the two flipcharts of others in his/her group, in particular on:
– Are the answers clear?
– Are the answers defensible?
• 16:30 Each person finalizes (for now) his/her flipchart
• 16:31 Paste to the wall. See what you can learn from other designs.
• 16:45 Plenary wrap-up
41. • International online survey of requirements engineering professionals’ opinions about causes and effects of RE problems
• Research questions
– RQ 1 What are the expectations on a good RE?
– RQ 2 How is RE defined, applied, and controlled?
– RQ 3 How is RE continuously improved?
– RQ 4 Which contemporary problems exist in RE, and what implications
do they have?
– RQ 5 Are there observable patterns of expectations, status quo, and
problems in RE?
• Observational research
47. • The conceptual structure of social mechanisms in
the previous two slides is architectural:
– Components
– Interactions
• Conceptual structure of the causal theories on
the next slides is statistical:
– Variables
– Distribution over population
51. Usability of theories
• The theory of 34 hypotheses is not intended to be used by professionals to improve their practice. Consider the theory “improving RE skills reduces requirements incompleteness”:
1. Professional is able to recognize the Context assumptions
– Yes: recognizable when there is requirements engineering
2. Capable of acquiring/building the Artifact under the constraints of practice
– That depends on the available budget (time, money) for RE training
3. The effects will indeed occur
– That depends on the training, and on other factors causing requirements incompleteness
4. He/she can observe this
– Hard to say whether requirements are more complete
5. They will contribute to stakeholder goals/satisfy requirements
– Hard to say whether RE completeness will contribute to stakeholder goals
52. Inferring theories from data
– Description
• Interpretation of the answers of the respondents
• Descriptive statistics
– Statistical inference
• No statistical inference
– Abductive inference
• The assumed explanation of the respondents’ answers is that they base them on experience
– Analogic inference
• Other professionals will answer similarly; but possibly different
across countries/cultures
58. • Theory 2, proposed by Prechelt and Pepper based
on the case study:
– R1: …
– …
– R5: There is no affordable method to assess the
reliability of the results of MSR in DICA
– R6: The reliability of MSR results in DICA is low
– R5 and R6 are the major reasons why MSR is not used
for DICA
• Artifact: MSR
• Context: organizations that develop web
applications for a long period of time, confuse defects with issues, and have no dedicated staff to maintain bug trackers (sect 8.1)
(Margin notes: descriptive generalizations; rational explanation of a phenomenon (= architectural explanation, where some components are actors that have goals and may have reasons for actions))
61. Usability of theories
1. Professional is able to recognize the Context assumptions
– yes
2. Capable of acquiring/building the Artifact under the constraints of practice
– Prechelt & Pepper: considerable effort in their case
3. The effects will indeed occur
– No evidence that reliable information about processes will be produced
4. He/she can observe this
– No: considerable uncertainty whether the effects have occurred
5. They will contribute to stakeholder goals/satisfy
requirements
– No evidence that process improvements will occur
62. Applying existing theories to data and
Inferring new or updated theories from data
• Description
– Case descriptions of every step
– Interpretation of every step in terms of R1 – R6
• Statistical inference
– Not possible from a case
– (but there is one inside this case to investigate the
relation between defect descriptions and issue
descriptions)
• Abductive inference
– Explanation of non-use in terms of R1 – R6
– Rational explanation in terms of reasons of actors
• Analogic inference
– Descriptions and explanation generalized by analogy
– Discussion of external validity
(Margin note: How did it happen? Existing theory 1 was assumed, and falsified. New theory 2 emerged from the data and from opinions of actors in the OoS. Or were the propositions R1–R6 specified before the case study was started?)
63. The research setup
[Research setup diagram, annotated for this case:]
• One complex object of study: Infopark and its software repositories
• Population it represents: other software development organizations and their repositories
• Treatment: the 4-step procedure listed in sect 2.3, performed by Pepper at Infopark
• Treatment instruments: MSR tools
• Measurement instruments: MSR tools providing data; Pepper’s work notes; Pepper’s memory (sect 8.3)
• Sources of evidence (p. 5): context information, raw data of the version archive and bugtracker, analysis steps taken and not taken, issues and arguments of those steps, data provided by MSR tools, Infopark’s interpretation of the outcomes of the steps
66. • Lab experiment to test reliability of a method,
RASTER, to assess risk of telecom availability
– Research question: How reliable is RASTER?
– Research setup: Six groups of three students each
had to estimate likelihood and impact of a list of
non-availability risks for an email service, using
the RASTER method
69. The use of theories
• “RASTER x Professionals → risk assessments”
– Frame a phenomenon: risk assessments are made by professionals
– Describe it: describe telco infrastructure architecture and its
vulnerabilities
– Specify a treatment: use RASTER to assess risks
– Analyze it: Trace risks to architecture components
– Generalize about it: claim that other professionals would find the
same risks of similar telco architectures
– Predict an effect: predict that this will happen in the next project
– Explain an effect: Explain assessments in terms of RASTER method and
ToA
70. Usability of theories
1. Professional is able to recognize the Context assumptions
– Yes
2. Capable of acquiring/building the Artifact under the constraints of practice
– RASTER requires relatively little training; RA is expensive, but not due to RASTER
3. The effects will indeed occur
– Has been shown in experiments and pilots
4. He/she can observe this
– Plain for all to see
5. They will contribute to stakeholder goals/satisfy requirements
– Goal is to obtain accurate and reliable assessments
71. Inferring theories from data
– Description
• Outcomes of the RAs on paper
• Krippendorff’s alpha to measure interrater agreement
• Outcome of exit questionnaires to assess sources of variability
– Statistical inference
• Sample non-random, and too small
– Abductive inference
• Observed variability explained by
1. lack of expert knowledge,
2. differences in assumptions,
3. difficulty of choosing between adjacent ordinal values for likelihood
– Analogic inference
• 1 and 2 absent/reduced in the field, so less variability there
• 3 motivates improvement of the method to reduce this phenomenon
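A sketch of how an interrater-agreement coefficient such as the one mentioned above can be computed: the nominal-scale variant of Krippendorff's alpha on invented ratings. The study itself may well have used an ordinal variant and, of course, different data.

```python
from collections import Counter
from itertools import permutations

def krippendorff_alpha_nominal(ratings_by_unit):
    """Nominal-scale Krippendorff's alpha.

    ratings_by_unit: one list of category labels per unit (e.g. per risk item),
    one label per rater. Units with fewer than two ratings are not pairable
    and are skipped.
    """
    coincidences = Counter()                    # coincidence matrix o_ck
    for unit in ratings_by_unit:
        m = len(unit)
        if m < 2:
            continue
        for c, k in permutations(unit, 2):      # all ordered pairs of ratings in a unit
            coincidences[(c, k)] += 1.0 / (m - 1)

    totals = Counter()                          # marginal totals n_c per category
    for (c, _), weight in coincidences.items():
        totals[c] += weight
    n = sum(totals.values())                    # number of pairable values

    observed = sum(w for (c, k), w in coincidences.items() if c != k)
    expected = sum(totals[c] * totals[k] for c in totals for k in totals if c != k) / (n - 1)
    return 1.0 - observed / expected            # alpha = 1 - D_o / D_e

# Invented data: 3 raters judging the likelihood class of 4 risks.
ratings = [["low", "low", "medium"],
           ["high", "high", "high"],
           ["medium", "low", "medium"],
           ["low", "low", "low"]]
print(f"alpha = {krippendorff_alpha_nominal(ratings):.2f}")
```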
72. The research setup
[Research setup diagram, annotated for this case:]
• Population: RA professionals in telco, doing RA in a quiet room
• Sample of objects of study: self-selected sample of students, in a quiet room
• Treatment: application of RASTER to a small case
• Treatment instruments: oral instruction, written case description, and RASTER help
• Measurement instruments: personal observation, exit questionnaire, RASTER forms
• Similarities and dissimilarities! Both were used to reason from sample to population:
1. Theory of variability formulated;
2. Designed a research setup that minimized the impact of these sources;
3. Explained observed variation in terms of this theory;
4. Used this to generalize to the population and to improve RASTER