Characterizing and Predicting Which Bugs Get Reopened
- 1. Characterizing and Predicting Which Bugs Get Reopened
Thomas Zimmermann
Nachiappan Nagappan
Microsoft Research
Philip J. Guo
Stanford University
Brendan Murphy
Microsoft Research
© Microsoft Corporation
- 2. A bug’s life
Picture on the right via http://www.bugzilla.org/docs/2.18/html/lifecycle.html
- 3. Final part of a trilogy…
Which bugs are fixed? (ICSE 2010)
Bug reassignments (CSCW 2011)
Bug reopens, this paper (ICSE 2012 SEIP)
- 4. …and partly a remake
Emad Shihab, Akinori Ihara, Yasutaka Kamei,
Walid M. Ibrahim, Masao Ohira, Bram Adams,
Ahmed E. Hassan, Ken-ichi Matsumoto:
Predicting Re-opened Bugs: A Case Study on
the Eclipse Project. WCRE 2010: 249-258
- 5. Shihab et al.: Predicting Reopened Bugs (WCRE 2010)
Four dimensions: work habits, bug report, bug fix, and team
Predicted reopened bugs with a precision of 62.9% and a recall of 84.5% (decision trees)
Top node analysis found that the bug report dimension was most influential
This paper: Characterizing Reopened Bugs
- 6. Shihab et al.: Predicting Reopened Bugs (WCRE 2010)
Four dimensions: work habits, bug report, bug fix, and team
Predicted reopened bugs with a precision of 62.9% and a recall of 84.5% (decision trees)
Top node analysis found that the bug report dimension was most influential
This paper: Characterizing Reopened Bugs
Partial replication of Shihab et al.
New measurements: organizational and geographic distance, reputation, how found
Qualitative component on the causes of bug reopens (identified with a survey)
Descriptive models (logistic regression)
- 7. Methodology
Qualitative survey
• “In your experience, what are reasons why a bug would be reopened multiple times?”
• 358 out of 1,773 responded. Card sort.
Quantitative analysis
• All bug reports for Windows Vista and Windows 7
• Logistic regression model for reopened bugs
Manual inspection
• Random sample of reopened bugs
• 20 bug reports
- 9. Causes of bug reopens
Not FIXED
• Related to root cause
– Bugs difficult to reproduce
– Developers misunderstood root cause
– Bug had insufficient information
• Related to priority
– Priority of the bug increased
FIXED
• Regression bugs
• Process-related
- 10. #1: Difficult to reproduce
“The bug is hard to reproduce and so the fix was made without
being able to fully verify it. A good example is a customer who
reports something. We think we see the issue in house and fix
that. It turns out we saw something different…”
“Bugs which are difficult to reproduce generally get
re-activated multiple times. At first, developers will
give a simple repro attempt before resolving bugs
'Not repro'. But if the bug opener is able to reproduce
the issue again, or perhaps comes up with better repro
instructions, then the developer will pay more
attention the second time the bug is activated.”
“Heisenbugs”
- 11. #2: Misunderstood root cause
“The bug is tracking an unidentified symptom and it takes a
while to fully root cause. This comes up a lot with memory
leaks: there will be an unknown memory leak in a component
and the owning team plays whack-a-mole with the code
defects to remove memory issues one-by-one.”
“Not fixing the root cause and only
addressing symptoms. Without root
cause understood for the bug a
patch/hack can often be done that
will then be reactivated.”
- 12. #3: Insufficient information
“Poor bug quality. If the bug wasn't described well enough, or
not enough diagnostic info was there, the dev will guess and
fix *something* in order to make the bug go away. What they
fix isn't always what the person who filed the bug ran into.”
“If a bug report does not accurately convey
enough information about what is actually
wrong (i.e. it describes incorrect behavior
but neglects to mention data loss) or if the
bug does not convey a dependency (such as
another team relying on a fix), a bug may be
de-prioritized and resolved without fixing.”
- 13. #4: Increased priority
“Bugs are closed because one person or triage team believes
the bug is not worthy of fixing (i.e. too risky, don't care, etc.),
but then a few days later a VP or external customer reports
the same issue, then the bug has a higher priority.”
“Other reason is lack of business justification or too
late in product cycle; reopened when sufficient
justification exists or new cycle begins.”
“One team may feel an issue is critical while
the other does not see it as important enough,
and instead of carrying a discussion, the bug
is bounced around.”
- 14. #5: Regression bugs
“First attempt at fix was flawed in some way, and wasn't
caught because of lack of testing or unknown related
scenario regression.”
“I've seen cases in the past where it was
thought that a bug was fixed only to find that
a corner case had been missed.”
“I've also seen cases where the bug was
only being hit due to a timing issue and
something changed that affected the
timing and the bug disappeared again.”
- 15. #6: Process-related bugs
“Sometimes bugs are reopened due to a misunderstanding
of process. e.g. dev resolves bug when fix is submitted, but
tester reactivates because bug still repros (because fix has
not yet reached tester).”
“Bug is verified fixed in a feature of
developer’s branch and the fix takes
too long to hit the main branch.”
- 16. #6: Process-related bugs
“First of all, I don’t like the model where we
reactivate bugs that were Fixed but the issue
was not resolved. Logically it makes sense,
but tracking the thread of the issue through
multiple checkins & reactivates can be hell if
it happens more than once or twice. I would
prefer a model where once a checkin has
been made for a bug, that bug is done! New
issues, or issues that linger despite a previous
fix, should/would be tracked in a new bug.”
- 17. What factors correlate with bug reopens?
- 18. Does the source of a bug (how it was found) influence the likelihood of bug reopens?
Values are multiples of the reopen rate for all bugs (P for Vista, Q for Win7). Rows near the top are less likely to be reopened; rows near the bottom are more likely to be reopened.
Bug source                  Vista    Win7
Reopen rate for all bugs    P        Q
Code analysis tools         0.52P    0.73Q
Human review                0.85P    0.66Q
Ad-hoc testing              0.87P    0.99Q
Internal user               1.12P    0.97Q
Component testing           1.13P    0.81Q
System testing              1.21P    1.46Q
Customer                    1.33P    1.12Q
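The relative rates in the table above can be reproduced with a small calculation. The following is a minimal sketch, not the authors' tooling: the function name, the record layout, and the toy records are all made up for illustration, and the numbers bear no relation to the Windows data.

```python
from collections import defaultdict

def relative_reopen_rates(bugs):
    """Return {source: reopen rate for that source / reopen rate for all bugs}.

    `bugs` is an iterable of (source, was_reopened) pairs (hypothetical layout).
    """
    totals = defaultdict(int)
    reopened = defaultdict(int)
    for source, was_reopened in bugs:
        totals[source] += 1
        if was_reopened:
            reopened[source] += 1
    overall = sum(reopened.values()) / sum(totals.values())
    if overall == 0:
        raise ValueError("no reopened bugs; relative rates are undefined")
    return {s: (reopened[s] / totals[s]) / overall for s in totals}

# Toy records, invented for this example only.
toy = (
    [("code analysis", False)] * 9 + [("code analysis", True)]
    + [("customer", False)] * 7 + [("customer", True)] * 3
)
rates = relative_reopen_rates(toy)
for source, ratio in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{source}: {ratio:.2f} x overall rate")
```

A value below 1 corresponds to the "less likely to be reopened" end of the table and a value above 1 to the "more likely" end.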
- 19. Does opener reputation influence the
likelihood of bug reopens?
For each bug, calculate opener’s reputation by
aggregating over all bugs in the past.
Hooimeijer and Weimer: Modeling bug report quality. ASE 2007.
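One way to aggregate over past bugs is sketched below. It assumes reputation is the number of the opener's earlier bugs that were fixed, divided by the number of earlier bugs opened plus one; that formula is my reading of the related work (Guo et al., ICSE 2010) and is not stated on the slide. The record fields and toy data are hypothetical.

```python
from collections import defaultdict

def opener_reputation(bugs):
    """Map bug id -> opener reputation at the time the bug was opened.

    Assumed formula: fixed_so_far / (opened_so_far + 1), counted over the
    opener's earlier bugs. Simplification: a bug's final "fixed" status is
    treated as already known when the opener files the next bug.
    """
    opened = defaultdict(int)
    fixed = defaultdict(int)
    scores = {}
    for bug in sorted(bugs, key=lambda b: b["opened_at"]):
        who = bug["opener"]
        scores[bug["id"]] = fixed[who] / (opened[who] + 1)
        opened[who] += 1
        if bug["fixed"]:
            fixed[who] += 1
    return scores

# Toy bug history, invented for this example only.
toy_bugs = [
    {"id": 1, "opener": "alice", "opened_at": 1, "fixed": True},
    {"id": 2, "opener": "alice", "opened_at": 2, "fixed": True},
    {"id": 3, "opener": "alice", "opened_at": 3, "fixed": False},
    {"id": 4, "opener": "bob", "opened_at": 4, "fixed": False},
]
scores = opener_reputation(toy_bugs)
```

The "+ 1" in the denominator keeps a first-time opener at a reputation of 0 and stops a single fixed bug from producing a perfect score.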
- 20. Does opener reputation influence the likelihood of bug reopens?
[Figure: likelihood of bug reopens by opener reputation, with regions annotated “more likely to be reopened” and “less likely to be reopened”]
- 21. Does organizational and geographic distance influence the likelihood of bug reopens?
Values are multiples of the baseline row (X for Vista, R for Win7).
                                                           Vista    Win7
Organizational distance: opened by and initially assigned to …
… the same person                                          X        R
… someone with the same manager                            1.13X    0.96R
… someone with a different manager                         1.37X    1.07R
Geographic distance: opened by and initially assigned to …
… the same person                                          X        R
… someone in the same building                             1.27X    0.93R
… someone in a different building but in the same country  1.45X    1.00R
… someone in a different country                           1.52X    1.14R
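The distance categories above could be derived from employee records along the following lines. This is a sketch under assumptions: the `Person` fields and the category strings are hypothetical, and the slides do not say how the authors extracted this information.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Person:
    # Hypothetical record layout for an opener or assignee.
    name: str
    manager: str
    building: str
    country: str

def organizational_distance(opener, assignee):
    """Classify the opener / initial assignee pair by reporting line."""
    if opener.name == assignee.name:
        return "same person"
    if opener.manager == assignee.manager:
        return "same manager"
    return "different manager"

def geographic_distance(opener, assignee):
    """Classify the opener / initial assignee pair by location."""
    if opener.name == assignee.name:
        return "same person"
    # Compare countries first so equal building names in different
    # countries are not mistaken for the same building.
    if opener.country != assignee.country:
        return "different country"
    if opener.building == assignee.building:
        return "same building"
    return "different building, same country"

# Toy people, invented for this example only.
ann = Person("ann", "mgr1", "B1", "US")
raj = Person("raj", "mgr1", "B2", "US")
li = Person("li", "mgr2", "B1", "CN")
```

For example, `organizational_distance(ann, raj)` falls in the "same manager" row, while `geographic_distance(ann, li)` falls in the "different country" row.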
- 22. Does organizational and geographic distance influence the likelihood of bug reopens?
Values are multiples of the baseline row (Y for Vista, S for Win7).
Assignment history                                                                     Vista    Win7
Assigned to opener at some point in time                                               Y        S
Never assigned to opener, but assigned to someone with the same manager as opener      0.54Y    0.39S
Never assigned to anyone with same manager                                             0.27Y    0.34S
Never assigned to opener, but assigned to someone in the same building                 0.41Y    0.37S
Never assigned to anyone in same building, but assigned to someone in the same country 0.31Y    0.43S
Never assigned to anyone in the same country                                           0.20Y    0.20S
- 23. Descriptive statistical analysis
• All pre- and post-release bug reports for Windows
Vista and Windows 7 until July 2009
• Logistic regression model to characterize
– Probability that a bug will be reopened
• Logistic regression model to characterize
– Probability that a bug will be fixed after the bug
has been reopened
– Probability that a bug will be fixed
(Guo et al., ICSE 2010)
• Same factors as in Guo et al., ICSE 2010
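For reference, a logistic regression model of this kind has the general form below. This is the standard textbook formulation rather than anything taken from the slides; the symbols $\beta_i$ (coefficients) and $x_i$ (factor values) are illustrative notation.

```latex
\[
\log\frac{P(\text{reopened})}{1 - P(\text{reopened})}
  = \beta_0 + \sum_{i=1}^{k} \beta_i x_i
\]
```

Under this form, a positive coefficient for a factor raises the modeled probability of a reopen as the factor increases, and a negative coefficient lowers it.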
- 24.
Factor                                   Reopen (Vista)
Bug source (categorical):
  Human review                           not significant
  Code analysis tool                     -0.503
  Component testing                      0.238
  Ad-hoc testing                         (baseline)
  System testing                         0.204
  Customer                               0.239
  Internal user                          not significant
Reputation of bug opener                 -0.266
Reputation of 1st assignee               not significant
Opened by temporary employee             0.178
Initial severity level                   0.127
Severity upgraded?                       0.331
Opener / any assignee same manager?      0.721
Opener / any assignee same building?     0.468
Num. editors                             0.236
Num. assignee building                   0.090
Num. component path changes              -0.160
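Logistic regression coefficients are easier to read as odds ratios, obtained by exponentiating each coefficient. The sketch below applies that standard conversion to three coefficients copied from the table above. The conversion is general statistics and does not appear on the slides, and the variable names are illustrative.

```python
import math

# Coefficients copied from the Reopen (Vista) column above.
vista_reopen_coefficients = {
    "Code analysis tool": -0.503,
    "Severity upgraded?": 0.331,
    "Opener / any assignee same manager?": 0.721,
}

def odds_ratio(coefficient):
    """Multiplicative change in the odds of a reopen for a one-unit
    increase in the factor, with the other factors held fixed."""
    return math.exp(coefficient)

odds_ratios = {
    name: odds_ratio(beta) for name, beta in vista_reopen_coefficients.items()
}
for name, ratio in odds_ratios.items():
    direction = "increase" if ratio > 1 else "decrease"
    print(f"{name}: odds ratio {ratio:.2f} ({direction})")
```

An odds ratio below 1 corresponds to a "decrease" entry and one above 1 to an "increase" entry. For the categorical bug source, the ratio is relative to the ad-hoc testing baseline.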
- 25.
Factor                                   Reopen (Vista)
Bug source (categorical):
  Human review                           not significant
  Code analysis tool                     decrease (-0.503)
  Component testing                      increase (0.238)
  Ad-hoc testing                         (baseline)
  System testing                         increase (0.204)
  Customer                               increase (0.239)
  Internal user                          not significant
Reputation of bug opener                 -0.266
Reputation of 1st assignee               not significant
Opened by temporary employee             0.178
Initial severity level                   0.127
Severity upgraded?                       0.331
Opener / any assignee same manager?      0.721
Opener / any assignee same building?     0.468
Num. editors                             0.236
Num. assignee building                   0.090
Num. component path changes              -0.160
- 26.
Factor                                   Reopen (Vista)
Bug source (categorical):
  Human review                           not significant
  Code analysis tool                     decrease (-0.503)
  Component testing                      increase (0.238)
  Ad-hoc testing                         (baseline)
  System testing                         increase (0.204)
  Customer                               increase (0.239)
  Internal user                          not significant
Reputation of bug opener                 decrease
Reputation of 1st assignee               not significant
Opened by temporary employee             increase
Initial severity level                   increase
Severity upgraded?                       increase
Opener / any assignee same manager?      increase
Opener / any assignee same building?     increase
Num. editors                             increase
Num. assignee building                   increase
Num. component path changes              decrease
- 27. Which *reopened* bugs get fixed?
vs.
Which bugs get fixed?
- 28.
Factor                                   Fixed When Reopened (Vista)   Fixed (Vista) [Guo, ICSE 2010]
Bug source (categorical):
  Human review                           0.377                         0.511
  Code analysis tool                     not significant               0.357
  Component testing                      -0.160                        0.065
  Ad-hoc testing
  System testing                         not significant               -0.129
  Customer                               -0.498                        -0.347
  Internal user                          -0.465                        -0.454
Reputation of bug opener                 1.632                         2.193
Reputation of 1st assignee               1.651                         2.463
Opened by temporary employee             -0.144                        -0.125
Initial severity level                   not significant               0.033
Severity upgraded?                       not significant               0.256
Opener / any assignee same manager?      not significant               0.676
Opener / any assignee same building?     not significant               0.270
Num. editors                             0.127                         0.240
Num. assignee building                   -0.213                        -0.257
Num. component path changes              -0.162                        -0.232
Num. re-opens                            n/a                           -0.135
- 31. Lessons learned
• Improve reproducibility of bug reports
• Provide better tools to identify root cause
• Better estimate initial priorities
• Reduce the complexity of branching
(bugs were “verified” in the wrong branch)
- 32. Thank you!
Partial replication of Shihab et al.
New measurements: organizational and geographic distance, reputation, how found
Qualitative component on the causes of bug reopens (survey): root cause, priority, process
Descriptive models based on logistic regression
http://research.microsoft.com/ese