Research seminar slides at URJC, June 6. Briefly: social analysis; in more detail: static analysis and co-evolution (joint work with Landman, Vinju, Muske; Businge).
6. Story 1: Women, men
and software
Bastiaan Heemskerck, Men and women, some carrying baskets, ca. 1700, KMSKB, Brussel
7. Bogdan Vasilescu, Daryl Posnett, Baishakhi Ray, Mark G. J. van den Brand, Alexander Serebrenik, Premkumar T. Devanbu,
Vladimir Filkov: Gender and Tenure Diversity in GitHub Teams. CHI 2015: 3789-3798
8. Bogdan Vasilescu, Vladimir Filkov, Alexander Serebrenik:
Perceptions of Diversity on GitHub: A User Survey. CHASE@ICSE 2015: 50-56
9. Parastou Tourani, Bram Adams, Alexander Serebrenik: Code of Conduct in Open Source Projects. SANER 2017, pp. 24-33.
10. Story 2: How do they
communicate?
Jos Manders. Communication. ca. 1965. SMAK - Stedelijk Museum voor Actuele Kunst. Ghent
11. Bogdan Vasilescu, Alexander Serebrenik, Premkumar T. Devanbu, Vladimir Filkov: How social Q&A sites are changing
knowledge sharing in open source software communities. CSCW 2014: 342-354
Bin Lin, Alexey Zagalsky, Margaret-Anne D. Storey, Alexander Serebrenik: Why Developers Are Slacking Off: Understanding
How Software Teams Use Slack. CSCW Companion 2016: 333-336
12. Nicole Novielli, Fabio Calefato, Filippo Lanubile: The challenges of sentiment detection in the social programmer ecosystem.
SSE@SIGSOFT FSE 2015: 33-40
Robbert Jongeling, Proshanta Sarkar, Subhajit Datta, Alexander Serebrenik. On Negative Results when using Sentiment
Analysis Tools for Software Engineering Research. Empirical Software Engineering (accepted)
13. Daviti Gachechiladze, Filippo Lanubile, Nicole Novielli, Alexander Serebrenik. Anger and Its Direction in Collaborative
Software Development. 39th International Conference on Software Engineering, New Ideas and Emerging Results 2017
21. Static analysis (compile time)
• All possible behaviours
• Inherently imprecise: precision requires complex abstractions & postprocessing
• Correctness guaranteed (under reasonable (?) assumptions)
• Slow
• Applicable to incomplete programs
Dynamic analysis (runtime analysis)
• Behaviours observed during the execution: results are incomplete, generalisation is a problem
• Precise
• Correctness guaranteed for the behaviours observed
• Fast
27. <…> in practice, soundness is commonly eschewed:
we are not aware of a single realistic whole-program
analysis tool <…> that does not purposely make
unsound choices. Similarly, virtually all published whole-
program analyses are unsound and omit conservative
handling of common language features when applied
to real programming languages.
Benjamin Livshits, Manu Sridharan, Yannis Smaragdakis, Ondrej Lhoták, José Nelson Amaral, Bor-Yuh Evan Chang, Samuel
Z. Guyer, Uday P. Khedker, Anders Møller, Dimitrios Vardoulakis: In defense of soundiness: a manifesto. Commun. ACM 58(2):
44-46 (2015)
28. Benjamin Livshits, Manu Sridharan, Yannis Smaragdakis, Ondrej Lhoták, José Nelson Amaral, Bor-Yuh Evan Chang, Samuel
Z. Guyer, Uday P. Khedker, Anders Møller, Dimitrios Vardoulakis: In defense of soundiness: a manifesto. Commun. ACM 58(2):
44-46 (2015)
Language | Example of commonly ignored features | Consequences of not modelling these features
C/C++ | setjmp/longjmp | ignores arbitrary side-effects to the heap
C/C++ | effects of pointer arithmetic | “manufactured” pointers
Java | reflection | can render much of the codebase invisible for analysis
Java | JNI | “invisible” code may create invisible side-effects in programs
JavaScript | eval, dynamic code loading | missing execution
JavaScript | data flow through the DOM | missing dataflow in programs
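To make the Java reflection row concrete, here is a minimal sketch (not taken from the manifesto) of a call that exists only at run time. The class name InvisibleCallDemo and its methods are hypothetical, chosen for illustration. Because the method name is assembled from two strings, an analysis that follows only explicit calls would see no call edge to hidden().

```java
import java.lang.reflect.Method;

// Hypothetical illustration class, not part of any tool discussed in the slides.
final class InvisibleCallDemo {

    // Never called directly anywhere in this file.
    public static String hidden() {
        return "reached via reflection";
    }

    // The target name is concatenated at run time, so the callee is not
    // visible as an explicit call in the source.
    static String callByName(String prefix, String suffix) {
        try {
            Method m = InvisibleCallDemo.class.getMethod(prefix + suffix);
            return (String) m.invoke(null); // static method, so no receiver
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflective call failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callByName("hid", "den"));
    }
}
```

A tool that ignores reflection would treat hidden() as dead code here, which is the kind of "invisible" codebase the table refers to.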
31. SWAT - Software Analysis And Transformation
Davy Landman, Alexander Serebrenik, Jurgen J. Vinju: Challenges for static analysis of Java reflection: literature review and
empirical study. ICSE 2017: 507-518. Distinguished paper award.
32.
Empirical evidence
• Complex reflection is everywhere in Java
• 462 Java projects in a representative and clean corpus
• 78% of Java projects have hard reflective code
• Known limitations have significant impact (4% - 54%)
• Existing soundy assumptions validated, more assumptions motivated
Actionable results
• Researchers: high impact suggestions
• Practitioners: adapt code for robustness
Answers to research questions
1. What is Java reflection?
2. How often is Java reflection used, and how?
3. What do static analysis tools do to resolve reflection?
4. What are limitations of static analysis tools?
5. How often does real Java code challenge limitations of static analysis?
33.
Q1: What is Java reflection?
34.
Q1: What is Java reflection?
“Hard” “Easy”
37.
Q2: How often is reflection used?
• Corpus of 461 (cleaned) Java projects
• Maximize representativeness [55]
• Clean [clone detection]
• Parse & resolve [Rascal, Eclipse JDT]
• Categorize [see Q1]
38. [Figure: share of projects using reflection]
39.
Q3: What do analysis tools do?
• Extended structured literature review
• 4K PDFs
• Semi-automatic full text analysis
• Filtering from 4K via 514 and 50 to 33 PDFs
• Annotating
• Categorizing
42. October 2015: 4K PDFs
pdf2text: 36 documents fail
478 documents match at least one rule
514 = 478 + 36 documents to be read
43. 514 documents
Not relevant:
• not about Java
• not about static analysis
• reflection is only recognised as a limitation
• reflection is handled with an external tool
• …
50 documents
• thesis → conference paper
• conference paper → journal article
• no publication → remove
39 documents
• Read + check citations
• 4 new papers missing from 4K
• 10 irrelevant
33 documents
46. Q4: What are the limitations? And Q5: how do these relate to real code?
• Collect and categorize what the analysis papers self-report:
• Optimistic ‘soundy’ assumptions about code
• Known limitations of the algorithms
• What is their damage in the corpus?
• Method:
• Recognize and count counterexamples
• Apply AST patterns to the entire corpus
• Rascal metaprogramming language
48. Advice for software engineers: make your code more robust now
1. Do not factor reflection into type polymorphic methods
2. Never use dynamic proxies
3. Use local variables/fields for meta object storage
4. Avoid loops over collections of meta objects
5. Test for preconditions instead of waiting for exceptions
Suggestions for static analysis researchers and Java language designers
1. Reflection API improvements to restrict arbitrary interactions (i.e. using lambdas)
2. Infer information from downcasts more aggressively
3. Make soundy assumptions about dynamic proxies: the “oblivious wrapper proxy”
4. Model common “goto patterns” with exceptions around reflection
5. Soundily assume boundedness and unorderedness of meta object collections
6. Apply dynamic language analysis techniques to methods which have reflection
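As a rough illustration of two of the engineering tips (meta objects kept in local variables, preconditions tested instead of waiting for exceptions), consider the sketch below. It is one possible reading of the advice, not code from the paper; the class RobustReflectionDemo and its methods are hypothetical.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical illustration class; names are assumptions made for this sketch.
final class RobustReflectionDemo {

    public static String greet() {
        return "hello";
    }

    static String invokeGreet() {
        // Meta objects live in local variables and the method name is a
        // constant string, so the reflective target is easy to trace.
        Class<?> target = RobustReflectionDemo.class;
        Method greet;
        try {
            greet = target.getMethod("greet");
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException("greet() is missing", e);
        }

        // Test the preconditions explicitly rather than letting invoke()
        // or the cast below fail with an exception.
        boolean callable = Modifier.isStatic(greet.getModifiers())
                && greet.getParameterCount() == 0
                && greet.getReturnType() == String.class;
        if (!callable) {
            return "unsupported";
        }

        try {
            return (String) greet.invoke(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflective call failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(invokeGreet());
    }
}
```

The checked exceptions of the reflection API still have to be caught, but normal control flow no longer depends on them.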
49.
This work has been supported by the NWO TOPGO grant #612.001.011 “Domain-Specific Languages: A Big Future for Small Programs”
@jurgenvinju @davylandman @aserebrenik
Please use these artefacts for yourselves, or contact us for discussion about:
- the new soundy assumptions are a prioritized work list (*)
- the corpus is a way to validate relevance for new ideas in static analysis [3]
- tell us why we were wrong (replicate it) [63]
To the anonymous reviewers and to the members of IFIP WG 2.4
Software Implementation Technology, including Anders Møller, …
53. What are possible
approaches for handling
the static analysis alarms?
J. Kamperman, “Automated software inspection: A new approach to increased software quality and productivity,” Reasoning
Inc., White paper, 2002
54. 84 = 7 x 4 x 3 different search strings.
150 top results (total 12600 papers)
49 relevant papers
Tukaram Muske, Alexander Serebrenik: Survey of Approaches for Handling Static Analysis Alarms. SCAM 2016: 157-166
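The numbers on this slide can be reproduced with a small sketch, assuming the 84 search strings arise as all combinations of three groups of 7, 4 and 3 terms. The slide gives only the group sizes, so the terms below are placeholders and the class SearchStringDemo is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; the actual search terms are not on the slide.
final class SearchStringDemo {

    // Placeholder terms such as "a1", "a2", ... stand in for the real ones.
    static List<String> placeholders(String prefix, int n) {
        List<String> terms = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            terms.add(prefix + i);
        }
        return terms;
    }

    // One search string per combination of a term from each group.
    static List<String> combine(List<String> a, List<String> b, List<String> c) {
        List<String> queries = new ArrayList<>();
        for (String x : a) {
            for (String y : b) {
                for (String z : c) {
                    queries.add(x + " " + y + " " + z);
                }
            }
        }
        return queries;
    }

    public static void main(String[] args) {
        List<String> queries = combine(
                placeholders("a", 7), placeholders("b", 4), placeholders("c", 3));
        int topResultsPerQuery = 150;
        System.out.println(queries.size() + " search strings");
        System.out.println(queries.size() * topResultsPerQuery + " results to screen");
    }
}
```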
58. Tukaram Muske, Alexander Serebrenik: Survey of Approaches for Handling Static Analysis Alarms. SCAM 2016: 157-166
guaranteed not to miss bugs (do not introduce false negatives)
61. 11 releases: 1.0 - 3.7
Numerous Eclipse third-party plug-ins
Will my plugin survive the
next Eclipse release?
movement
evolution
62. Eclipse SDK
3rd party plugin
API
• documented
• stable [in non-breaking releases]
• use encouraged
non-API
• “internal” in the name
• undocumented
• unstable
• use discouraged
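The "internal in the name" convention can be checked mechanically. The sketch below is a simplified heuristic, not the classification procedure used in the cited studies: it assumes that a type is non-API when one segment of its qualified name is exactly "internal". The class ApiClassifier is hypothetical.

```java
// Hypothetical helper; the exact rule used in the studies may differ.
final class ApiClassifier {

    // Returns true when any dot-separated segment of the name is "internal".
    static boolean isNonApi(String qualifiedName) {
        for (String segment : qualifiedName.split("\\.")) {
            if (segment.equals("internal")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isNonApi("org.eclipse.jdt.internal.core.JavaProject"));
        System.out.println(isNonApi("org.eclipse.jdt.core.JavaCore"));
    }
}
```

Matching whole segments rather than substrings avoids flagging names that merely contain the letters "internal" inside a longer word.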
63. Eclipse SDK
3rd party
plugin
44% of 512 plugins use non-APIs (at least in one version)
Plugins depending on non-APIs are larger and use more SDK functionality
J. Businge, A. Serebrenik, M.G.J. van den Brand. Eclipse API usage: The Good and The Bad, SQM 2012, pages 54–62.
J. Businge, A. Serebrenik, M.G.J. van den Brand: Analyzing the Eclipse API Usage: Putting the Developer in the Loop.
CSMR 2013: 37-46
64. Eclipse SDK
3rd party
plugin
Main reason for compatibility breakage
Especially: recent non-APIs
J. Businge, A. Serebrenik, M.G.J. van den Brand: Survival of Eclipse third-party plug-ins. ICSM 2012: 368-377
J. Businge, A. Serebrenik, M.G.J. van den Brand: Compatibility Prediction of Eclipse Third-Party Plug-ins in New Eclipse
Releases. SCAM 2012: 164-173
65. Eclipse SDK
3rd party
plugin
Do not use recent non-APIs!
Let them “stabilise”!