2. Empirical SE Questions?
• Questions similar to those an anthropologist
might ask during first contact with a previously
unknown culture.
– How do people learn to program?
– Can the future success of a programmer be predicted
by personality tests?
– Does the choice of programming language affect
productivity?
– Can the quality of code be measured?
– Can data mining predict the location of software
bugs?
– …….
Greg Wilson, Jorge Aranda, Empirical Software Engineering, American Scientist
https://www.americanscientist.org/issues/pub/empirical-software-engineering
3. Empirical Software Engineering
• Evidence about a particular aspect of SE
– Activities, processes, technologies
– Best practices, Lessons learned
– Increased knowledge
– Showing a new perspective on a particular
body of knowledge.
– ….
5. Systematic Literature Review (SLR)
• Finding evidence from (scientific) literature
– Do it in a systematic way
• State a question
• Find the answer
Based on
Barbara Kitchenham, Evidence-Based Software Engineering and Systematic Reviews
www.scm.keele.ac.uk/ease/ease05_bk.ppt
6. Systematic (Literature) Review
• Questions
– What are the current problems in a specific area?
– For a specific problem, what are the reported solutions?
– What are the newest results in a particular area?
– Which combinations of two or more areas exist?
• Important!
– The questions should be interesting from a research point of
view
– The questions should be attractive to readers
– The questions should be general enough to support
sufficiently general conclusions
– The questions should be specific enough to yield
sufficiently specific findings
7. Systematic Review Procedure
• Support Evidence-based paradigm
– Start from a well-defined question
• Step 1
– Define a repeatable strategy for searching the
literature
• Step 2
– Critically assess relevant literature
• Step 3
– Synthesise literature
• Step 4
Ref: Barbara Kitchenham, Evidence-Based Software Engineering and Systematic Reviews
8. Systematic Review Process
• Plan Review
– Develop Review Protocol
– Validate Review Protocol
• Conduct Review
– Identify Relevant Research
– Select Primary Studies
– Assess Study Quality
– Extract Required Data
– Synthesise Data
• Document Review
– Write Review Report
– Validate Report
9. Showing the SLR through an example
Example 1:
A systematic review of software architecture evolution research.
Hongyu Pei Breivold, Ivica Crnkovic, Magnus Larsson,
Information & Software Technology 54(1): 16-40 (2012)
• Software evolvability
– “the ability of a system to accommodate changes in its
requirements throughout the system’s lifespan with the
least possible cost while maintaining architectural
integrity”
• Interest: evolvability through software architecture
10. Evolvability property model
[Figure: evolvability property model. Relation labels recoverable from the diagram: Evolvability "is refined to" 1..* subcharacteristics; subcharacteristics "is refined to" 1..* measuring attributes; measuring attributes "measured by" 1..* metrics; "reason about" QoS.]
Question: which are the subcharacteristics?
Evolvability subcharacteristics
• Analyzability
• Architectural Integrity
• Changeability
• Portability
• Extensibility
• Testability
• Domain-specific attributes
1/22/2013
11. Showing the SLR through an example
Example 2:
15 Years of CBSE Symposium: Impact on the Research Community
Josip Maras, Luka Lednicki, Ivica Crnkovic
ACM/SigSoft Component-based Software Engineering Symposium 2012
• Interest: What is the impact of CBSE
Symposium publications?
12. CBSE events
1998 – Tokyo
1999 – Los Angeles
2000 – Limerick
2001 – Toronto
2002 – Orlando
2003 – Portland
2004 – Edinburgh
2005 – St. Louis
2006 – Västerås
2007 – Boston
2008 – Karlsruhe
2009 – E. Stroudsburg
2010 – Prague
2011 – Boulder
2012 – Bertinoro
Timeline labels (in slide order): Workshop@ICSE, Initiation, Focus, Symposium@ICSE, Broadening Scope, Symposium!@ICSE, CompArch, Collaboration phase
Other events named on the timeline: QoSA, WCOP, ISARCS, (WICSA)
2013-01-22 CBSE 2012 - Bertinoro, Italy
14. Developing the Protocol
• Review protocol
– Specifies methods to be used for a systematic review
– Predefined protocol
• Reduces researcher bias by reducing opportunity for
– Selection of papers driven by researcher expectations
– Changing the research question to fit the results of the searches
– Good practice for any empirical study
15. Protocol Contents – 1/3
• Background
– Rationale for survey
• Research question
– Critical to define this before starting the research
– Strategy used to search for primary sources
16. Protocol Contents – 2/3
• Strategy to find primary studies
– Search terms/keywords
– Identify resources, databases, journals, conferences
– Procedures for storing references
– How publication bias will be handled
• Grey literature
• Direct approach to active researchers
– How completeness will be determined
• Useful to have the baseline paper to set start date
• Selection Strategy
– Inclusion/exclusion criteria
• Handling multiple papers on one experiment
• Quality assessment criteria
17. Protocol Contents – 3/3
• Data extraction
– What data will be extracted from each primary source
– How to handle missing information
– How data extraction reliability will be addressed
• Usually multiple reviewers
– Where data will be stored
• Procedures for data synthesis
– Formats for summarising data
– Measures and analysis if meta-analysis is proposed
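The data extraction items above can be sketched as a small extraction form. This is only an illustration: the field names (`research_topic`, `validation_type`, `sample_size`) and the helper functions are hypothetical, not part of any published protocol. Missing information is recorded explicitly as `None`, and reliability is checked by comparing the forms two reviewers filled in for the same study.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ExtractionRecord:
    """One reviewer's extraction form for one primary study (hypothetical fields)."""
    study_id: str
    reviewer: str
    research_topic: Optional[str] = None   # None means "not reported in the study"
    validation_type: Optional[str] = None
    sample_size: Optional[int] = None


def missing_fields(record: ExtractionRecord) -> List[str]:
    """List the fields for which the study gave no information."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


def disagreements(a: ExtractionRecord, b: ExtractionRecord) -> List[str]:
    """List the fields on which two reviewers extracted different values."""
    if a.study_id != b.study_id:
        raise ValueError("records describe different studies")
    return [
        f.name
        for f in fields(a)
        if f.name not in ("study_id", "reviewer")
        and getattr(a, f.name) != getattr(b, f.name)
    ]


# Placeholder data, not taken from any real study.
first = ExtractionRecord("P1", "reviewer_a", "component models", "case study")
second = ExtractionRecord("P1", "reviewer_b", "component models", "experiment")
```

Fields returned by `disagreements` would be settled in discussion between the reviewers before the data is stored.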
18. [Figure: SLR workflow. Research questions → Search keywords and Resources/Database → Search → Studies → filtering (Inclusion/Exclusion criteria) → Primary Studies. Primary Studies feed analysis (Statistical data) and synthesis (New findings). Legend: activity, artifact.]
19. Example 1 (Software Architecture Evolution)
Research questions
1. What approaches have been reported regarding the analysis and achievement
of software evolvability at the architectural level?
2. What are the main research topics covered in the scientific literature regarding
analysis and achievement of evolvability-related quality attributes?
3. …..
4. What is the impact of the studies on the research community and practice?
20. Example 1 (Software Architecture Evolution)
Research questions Search Keywords Resources/Database
Search keywords:
S1: software architecture AND evolvability
S2: software architecture AND maintainability
S3: software architecture AND extensibility
S4: software architecture AND adaptability
S5: software architecture AND flexibility
S6: software architecture AND changeability
S7: software architecture AND modifiability
S8: software architecture AND analyzability
Databases & Resources:
• ACM Digital Library
• IEEE Xplore
• ScienceDirect – Elsevier
• SpringerLink
• Wiley InterScience
• ISI Web of Science
• SCOPUS
• (Google Scholar)
Keywords should reflect the questions and the underlying theory/model
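The search strings S1 to S8 all pair one base term with one quality attribute, so they can be generated rather than typed by hand. The sketch below assumes a generic `"phrase" AND term` syntax; each database has its own query language, so the output would still need adapting per source. The function name is hypothetical.

```python
BASE_TERM = "software architecture"

# Quality attributes taken from the search keywords S1..S8 above.
QUALITY_TERMS = [
    "evolvability",
    "maintainability",
    "extensibility",
    "adaptability",
    "flexibility",
    "changeability",
    "modifiability",
    "analyzability",
]


def build_search_strings(base, terms):
    """Return {"S1": '"base" AND term', ...} in the order the terms are given."""
    return {
        f"S{i}": f'"{base}" AND {term}'
        for i, term in enumerate(terms, start=1)
    }


queries = build_search_strings(BASE_TERM, QUALITY_TERMS)

for label, query in queries.items():
    print(f"{label}: {query}")
```

Keeping the term list in one place makes it easier to show that the keywords follow from the research questions, and to rerun the search if a term is added.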
21. Example 2 (CBSE publications)
Research questions
Questions
Impact
- Number of publications, total, per year, geographical distribution
- citation index
- Indirect impact: backward citations, Impact of the authors
- What is the maturity level of CBSE?
Topics of interest
Which research topics were the most present at CBSE?
What kind of research results were presented?
What types of validation did the publications have?
22. Example 2 (CBSE publications)
Research questions Search Keywords Resources/Database
Questions:
Impact
- Publications
- citation index
- Indirect impact
Topics of interest
Search keywords:
No search keywords - all CBSE papers
Databases & Resources:
• CBSE Proceedings
• SpringerLink
• ACM Digital Library
• Google Scholar
• Web search
23. Search Studies filtering Primary Studies
Example 1: Primary studies selection process
Inclusion Criteria
English peer-reviewed studies that provide answers to the research
questions.
Studies that focus on software evolution.
Studies that focus on software architecture analysis and/or software quality
analysis related to software evolvability.
Studies are published up to and including the first two quarters of 2010.
Exclusion Criteria
Studies are not in English.
Studies that are not related to the research questions.
Studies whose claims are unjustified or ad hoc statements rather than
based on evidence.
Duplicated studies.
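One way to make these criteria repeatable is to record, for every candidate study, which exclusion criteria it triggers. The sketch below is an assumption about how that could be encoded: the `Candidate` fields are hypothetical, and relevance and evidence are reviewer judgements entered by hand, not something the code decides. The criterion about focus on software evolution and architecture or quality analysis is folded into the relevance flag here.

```python
from dataclasses import dataclass

# "Up to and including the first two quarters of 2010", as (year, quarter).
CUTOFF = (2010, 2)


@dataclass(frozen=True)
class Candidate:
    """A candidate study with the reviewer's judgements (hypothetical fields)."""
    title: str
    language: str
    peer_reviewed: bool
    year: int
    quarter: int                   # 1..4
    relevant_to_questions: bool    # reviewer judgement
    evidence_based: bool           # reviewer judgement


def exclusion_reasons(study):
    """Return the list of criteria the study fails; an empty list means include."""
    reasons = []
    if study.language != "English":
        reasons.append("not in English")
    if not study.peer_reviewed:
        reasons.append("not peer-reviewed")
    if (study.year, study.quarter) > CUTOFF:
        reasons.append("published after cutoff")
    if not study.relevant_to_questions:
        reasons.append("not related to the research questions")
    if not study.evidence_based:
        reasons.append("claims not based on evidence")
    return reasons


# Placeholder candidates, not real publications.
kept = Candidate("Placeholder A", "English", True, 2009, 3, True, True)
late = Candidate("Placeholder B", "English", True, 2010, 4, True, True)
```

Storing the reasons, rather than a bare yes/no, lets the review report state how many studies were dropped by each criterion.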
24. Search Studies filtering Primary Studies
Example 1: Primary studies selection process
25. Search Studies filtering Primary Studies
Example 1: Primary studies selection process
• Activities:
– Run the search strings in the databases and export the
results to EndNote
• Tedious work – different query languages and different
export functionality
– Extraction of the information in a suitable form for
reading and selecting, removing duplicates, etc.
• Goal:
– To get a reasonable number of studies (<500, >20)
• May require refinement of the questions
– Achieve reliability – select the most significant
literature
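Removing duplicates across database exports and checking the size of the result can be sketched as follows. Matching on a normalized title is an assumption made for illustration; real exports may need DOI or author matching as well. The record format (a dict with a `"title"` key) and the function names are hypothetical.

```python
import re


def normalize_title(title):
    """Lower-case the title and collapse punctuation and spacing differences."""
    return re.sub(r"[^a-z0-9]+", " ", title.casefold()).strip()


def merge_exports(*exports):
    """Merge several lists of records, keeping the first record per normalized title."""
    merged = {}
    for records in exports:
        for record in records:
            merged.setdefault(normalize_title(record["title"]), record)
    return list(merged.values())


def within_target(count, low=20, high=500):
    """Check the rule of thumb from the slide: more than 20 and fewer than 500 studies."""
    return low < count < high


# Placeholder exports from two databases with one overlapping entry.
export_a = [{"title": "Placeholder Study: One"}, {"title": "Placeholder Study Two"}]
export_b = [{"title": "placeholder study - one"}, {"title": "Placeholder Study Three"}]
merged = merge_exports(export_a, export_b)
```

If `within_target(len(merged))` is false, the slide's advice applies: refine the questions or the keywords and search again.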
26. Search Studies filtering Primary Studies
Example 2 (CBSE publications)
• All publications are primary studies
– 318 studies
• Activities
– Extract publications and create a relational
database
– Populate database
– Provide “Query and View” interactive web-based
application for fast reading and publication
classification
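A minimal sketch of such a publication database, using Python's built-in `sqlite3` module. The schema, table names, and facet names are assumptions made for illustration and are not the schema used in the CBSE study; the rows are placeholders.

```python
import sqlite3

# Hypothetical schema: one row per publication, one row per (publication, facet).
SCHEMA = """
CREATE TABLE publication (
    ref   TEXT PRIMARY KEY,
    title TEXT NOT NULL,
    year  INTEGER NOT NULL
);
CREATE TABLE classification (
    ref   TEXT NOT NULL REFERENCES publication(ref),
    facet TEXT NOT NULL,
    value TEXT NOT NULL,
    PRIMARY KEY (ref, facet)
);
"""


def open_database(path=":memory:"):
    """Create the tables in a new database and return the connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def classify(conn, ref, facet, value):
    """Store a reviewer's classification; reclassifying replaces the old value."""
    conn.execute(
        "INSERT OR REPLACE INTO classification (ref, facet, value) VALUES (?, ?, ?)",
        (ref, facet, value),
    )


def count_by_value(conn, facet):
    """Return {value: number of publications} for one facet."""
    rows = conn.execute(
        "SELECT value, COUNT(*) FROM classification "
        "WHERE facet = ? GROUP BY value ORDER BY value",
        (facet,),
    )
    return dict(rows.fetchall())


conn = open_database()
conn.executemany(
    "INSERT INTO publication VALUES (?, ?, ?)",
    [("P1", "Placeholder title A", 2004), ("P2", "Placeholder title B", 2008)],
)
classify(conn, "P1", "evaluation_type", "Experiments")
classify(conn, "P2", "evaluation_type", "Experiments")
classify(conn, "P2", "evaluation_type", "Industrial case study")  # reclassified
```

A "query and view" front end would sit on top of queries like `count_by_value`.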
27. analysis Statistical data
• Data extracted from the studies – “objective data”
– Distribution of studies with respect to
• Year of publications
• Authors and research communities
• Sources of publications
• Citation distribution, the most cited studies
• Analysis support
– Manual, or writing your own software
– Help from some tools/portals
• Google Scholar
• Publish or Perish
• Mendeley,…
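For the "writing own software" option, the distributions listed above need little more than a counter. The sketch below assumes each study is a dict with hypothetical keys `year`, `source`, and `citations`; the data is placeholder data.

```python
from collections import Counter


def count_by(studies, key):
    """Distribution of studies over one attribute, sorted by attribute value."""
    return dict(sorted(Counter(study[key] for study in studies).items()))


def most_cited(studies, n):
    """The n studies with the highest citation counts, highest first."""
    return sorted(studies, key=lambda study: study["citations"], reverse=True)[:n]


# Placeholder records, not real publications.
studies = [
    {"ref": "P1", "year": 2004, "source": "Conference", "citations": 30},
    {"ref": "P2", "year": 2004, "source": "Journal", "citations": 5},
    {"ref": "P3", "year": 2008, "source": "Conference", "citations": 12},
]

print(count_by(studies, "year"))
print(count_by(studies, "source"))
print([study["ref"] for study in most_cited(studies, 2)])
```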
28. analysis Statistical data
Example 1: statistical data
29. analysis Statistical data
Example 1: statistical data
31. analysis Statistical data
Example 2: statistical data
Top 10 citations (Ref: study, #citations)
• S04-02: Bruneton, Eric; Coupaye, Thierry; Leclercq, Matthieu; Quema, Vivien; Stefani, Jean-Bernard; An Open Component Model and its Support in Java, 2004 (306)
• S99-1: PORE Procurement-Oriented Requirements Engineering Method for the Component-Based Systems Engineering Development Paradigm, 1999 (118)
• S98-18: Aoyama, Mikio; New Age of Software Development: How Component-Based Software Engineering Changes the Way of Software Development? 1998 (115)
• S03-3: Cervantes, Humberto; Hall, Richard S; Automating Service Dependency Management in a Service-Oriented Component Model; 2003 (103)
• S02-0: Chen, Shiping; Liu, Yan; Gorton, Ian; Performance Prediction of Component-based Applications, 2002 (77)
• S05-13: Lau, Kung-kiu; Elizondo, Velasco, Perla; Wang, Zheng; Exogenous Connectors for Software Components, 2005 (68)
• S06-25: Sentilles, Severine; Vulgarakis, Aneta; Bures, Tomas; Carlson, Jan; Crnkovic, Ivica; A Component Model for Control-Intensive Embedded Systems; 2008 (65)
• S08-16: Seinturier, Lionel; Pessemier, Nicolas; Duchien, Laurence; Coupaye, Thierry; A Component Model Engineered with Components and Aspects, 2006 (65)
• S98-10: Kruchten, Philippe; Modeling Component Systems with the Unified Modeling Language, 1998 (63)
[Chart: Citation of papers that cited top 10 papers. Labels and values in slide order.]
Refs: S04-2 S00-9 S03-1 S04-9 S99-1 S04-26 S03-3 S02-0 S04-19 S06-25 S98-18 S02-08 S04-5 S06-13 S05-13
Values: 2294 1984 909 899 840 832 817 810 646 555 543 455 454 450 447
CBSE references outside CBSE events from CBSE authors (#Citations)
The most influential authors from CBSE (citations of the related work)
• C Szyperski, Component software: beyond object-oriented programming, 1998, 2002 (6594)
• GT. Heineman, WT. Councill, Component-based software engineering: putting the pieces together, 2001 (924)
• I Crnkovic, M Larsson, Building reliable component-based systems, 2002 (623)
• T Coupaye et al, The fractal component model and its support in Java, Software: Practice, 2006 (443)
• RH Reussner et al, Reliability prediction for component-based software architectures, Journal of Systems and Software 66 (3), 241-252 (189)
32. synthesis New findings
Procedures for data synthesis
• Goal: synthesize the information into new knowledge
– Based on a theory previously established
• Validation of the theory
• Description of some specific characteristics of the theory
– Grounded theory
• Build up a theory from the reading & analysis
– Manual
– Using some tools – the most frequent words, Concordance
• The most difficult part
– Requires experience and knowledge in the subject
– Requires a kind of validation/review
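The tool support mentioned above (most frequent words, concordance) can be approximated in a few lines. This is a rough sketch under simple assumptions: words are runs of ASCII letters, the stop-word list is a tiny hypothetical sample, and the concordance is a plain keyword-in-context window. It is not a substitute for reading the studies.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real analysis would use a fuller one.
STOPWORDS = {"a", "an", "and", "in", "of", "the", "to"}


def tokenize(text):
    """Split text into lower-case words made of letters only."""
    return re.findall(r"[a-z]+", text.lower())


def most_frequent_words(texts, n):
    """Return the n most frequent non-stop-words across all texts."""
    counts = Counter(
        word for text in texts for word in tokenize(text) if word not in STOPWORDS
    )
    return counts.most_common(n)


def concordance(text, keyword, width=2):
    """Return each occurrence of keyword with `width` words of context per side."""
    words = tokenize(text)
    return [
        " ".join(words[max(0, i - width): i + width + 1])
        for i, word in enumerate(words)
        if word == keyword.lower()
    ]


# Placeholder abstracts, not taken from real studies.
abstracts = [
    "Component models support evolvability.",
    "Evolvability of component architectures.",
]
```

Frequent terms only suggest candidate categories; the classification itself still needs the experience and review noted above.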
33. synthesis New findings
Example 1 (Software Architecture Evolution)
Classification of 82 studies
• Quality Considerations during Design: 15 studies
– Quality Attribute Requirement Focused: 7 studies
– Quality Attribute Scenario Focused: 2 studies
– Influencing Factor Focused: 6 studies
• Quality Evaluation at Architectural Level: 22 studies
– Experience Based: 5 studies
– Scenario Based: 7 studies
– Metric Based: 10 studies
• Economic Valuation: 11 studies
• Architectural Knowledge Management: 18 studies
• Modeling Techniques: 16 studies
34. synthesis New findings
Example 1 (Software Architecture Evolution)
Maturity classification:
• Basic research
• Concept formulation
• Development and extension
• Internal use
• External use
• Popularization
35. synthesis New findings
Example 2 (CBSE publications)
[Pie chart: Research Area]
• Component models
• Component technologies
• Extra-functional properties
• Composition & predictability
• Software Architecture
• Lifecycle
• Domains
• Methodology
[Pie chart: Result characteristics]
• Procedures or techniques
• Qualitative/descriptive models
• Analytic models
• Notations or tools
• Specific solutions
• Answers or judgments
• Reports
• Empirical models
Slice percentages as extracted, in slide order (assignment to slices not recoverable): 15%, 24%, 12%, 7%, 15%, 13%, 8%, 6%, 1%, 19%, 36%, 2%, 3%, 9%, 12%, 18%
36. synthesis New findings
Example 2 (CBSE publications)
[Pie chart: Evaluation Type]
• Not presented
• Academic case study
• Simple examples
• Experiments
• Industrial case study
• Formal specification
• Literature review (chart label: Literature comparison)
Slice percentages as extracted, in slide order (assignment to slices not recoverable): 1%, 7%, 16%, 19%, 18%, 39%
[Stacked chart: Research Maturity, 0% to 100% per year, 98 to 12]
• External enhancement and exploration
• Internal enhancement and exploration
• Development and extension
• Concept formulation
37. Validation issues
0. Is your approach OK?
– Do you have the right questions?
– Is the procedure feasible?
1. How can you ensure that you have selected
the right studies?
2. How can you ensure that your analysis and
synthesis are right?
38. The right studies?
1. Are the selected sources appropriate?
– The selection of databases is important (fortunately,
there are not many)
– Is Google/Google Scholar appropriate as a
source?
2. Have you missed some important
studies? Do you have too many unimportant
studies?
39. Studies selection
• Several researchers involved in the process
[Figure: the studies selected using automatic queries are filtered by several researchers, giving Selected studies A, B and C; comparison and discussion of these selections lead to the final list.]
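When two researchers screen the same candidate list, the comparison step can be quantified before the discussion. Cohen's kappa is one common agreement measure; it is not named in the slides and is used here as an assumption. It compares observed agreement p_o with chance agreement p_e: kappa = (p_o - p_e) / (1 - p_e). The decisions below are placeholder data.

```python
def cohens_kappa(decisions_a, decisions_b):
    """Cohen's kappa for two reviewers' decisions on the same ordered list of studies."""
    if len(decisions_a) != len(decisions_b) or not decisions_a:
        raise ValueError("both reviewers must rate the same non-empty list")
    n = len(decisions_a)
    # Observed agreement: share of studies with identical decisions.
    observed = sum(a == b for a, b in zip(decisions_a, decisions_b)) / n
    # Chance agreement: product of each reviewer's marginal rate, summed per label.
    labels = set(decisions_a) | set(decisions_b)
    expected = sum(
        (decisions_a.count(label) / n) * (decisions_b.count(label) / n)
        for label in labels
    )
    if expected == 1.0:
        return 1.0  # both reviewers used a single identical label throughout
    return (observed - expected) / (1 - expected)


def to_discuss(decisions_a, decisions_b):
    """Indices of studies on which the two reviewers disagree."""
    return [i for i, (a, b) in enumerate(zip(decisions_a, decisions_b)) if a != b]


# True = include, False = exclude; placeholder decisions for four studies.
reviewer_a = [True, True, False, False]
reviewer_b = [True, False, False, False]
kappa = cohens_kappa(reviewer_a, reviewer_b)
```

The studies returned by `to_discuss` are the ones to settle in the discussion step; with three or more reviewers a multi-rater measure would be needed instead.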
41. Synthesis/Findings Validation
a) Your analysis/synthesis is based on a theory/model
– Existing classification, ontology
– Previous research results
– Extension/refinement of existing theories
b) You build your theory/model from scratch
– Iterative process: building & validation
[Figure: Synthesis, Validation by a third person, Discussion]
42. Reporting results
• Several levels of information
– Raw source information
– Extensive detailed technical report
– Research papers (Journal, Conference) – reference
to source data, technical report
43. Write an SLR paper 1/2
• Intro
– Motivation – the most important
• Why the question is interesting
• What is the main question
• The overall method used
• The questions, the search keywords, source of information
• Selection process, data storage
• Selected studies
• Refer to the most important studies
• Provide statistics and comment on them
44. Write an SLR paper 2/2
• Synthesis – Important
– The findings (short description in general)
– The findings related to the studies (classification/grouping of the
studies)
• Discussion
– Additional findings, remarks, statistics from the studies related to the
findings
• Validation
– Threats to validity
– Validation procedures (this can be specified in the methods part)
• Conclusion
• List of primary studies
• References
45. Some Research Databases
• SCOPUS (http://www.scopus.com/home.url)
• ACM Digital Library (http://portal.acm.org)
• Compendex (http://www.engineeringvillage.com)
• IEEE Xplore (http://www.ieee.org/web/publications/xplore/)
• ScienceDirect – Elsevier (http://www.elsevier.com)
• SpringerLink (http://www.springerlink.com)
• Wiley InterScience (http://www3.interscience.wiley.com)
• ISI Web of Science (http://www.isiknowledge.com)
46. References for the systematic review
Kitchenham, Barbara. Procedures for Performing Systematic Reviews, Joint Technical
Report, Keele University TR/SE-0401 and NICTA 0400011T.1, July 2004.
Australian National Health and Medical Research Council. How to review the evidence:
systematic identification and review of the scientific literature, 2000. ISBN 1864960329.
Australian National Health and Medical Research Council. How to use the evidence:
assessment and application of scientific evidence. February 2000, ISBN 0 642 43295 2.
Cochrane Collaboration. Cochrane Reviewers’ Handbook. Version 4.2.1. December
2003.
Glass, R.L., Vessey, I., Ramesh, V. Research in software engineering: an analysis of the
literature. IST 44, 2002, pp. 491-506.
Magne Jørgensen and Kjetil Moløkken. How large are Software Cost Overruns? Critical
Comments on the Standish Group’s CHAOS
Reports, http://www.simula.no/publication_one.php?publication_id=711, 2004.
Magne Jørgensen. A Review of Studies on Expert Estimation of Software Development
Effort. Journal of Systems and Software, Vol 70, Issues 1-2, 2004, pp. 37-60.
47. References for the systematic review
Khan, Khalid, S., ter Riet, Gerben., Glanville, Julia., Sowden, Amanda, J.
and Kleijnen, Jo. (eds) Undertaking Systematic Review of Research on
Effectiveness. CRD’s Guidance for those Carrying Out or Commissioning
Reviews. CRD Report Number 4 (2nd Edition), NHS Centre for Reviews and
Dissemination, University of York, ISBN 1 900640 20 1, March 2001.
Pai, Madhukar, McCulloch, Michael, Gorman, Jennifer
D., Pai, Nitika, Enanoria, Wayne, Kennedy, Gail, Tharyan, Prathap, Colford,
John M. Jnr. Systematic reviews and meta-analysis: An illustrated, step-by-
step guide. The National Medical Journal of India, 17(2) 2004, pp. 86-95.
Sackett, D.L., Straus, S.E., Richardson, W.S., Rosenberg, W., and
Haynes, R.B. Evidence-Based Medicine: How to Practice and Teach
EBM, Second Edition, Churchill Livingstone: Edinburgh, 2000.