Anúncio
Anúncio

Mais conteúdo relacionado

Similar a Mining Source Code Improvement Patterns from Similar Code Review Works(20)

Mais de 奈良先端大 情報科学研究科(20)

Anúncio

Mining Source Code Improvement Patterns from Similar Code Review Works

  1. Mining Source Code Improvement Patterns from Similar Code Review Yuki Ueda1, Takashi Ishio1, Akinori Ihara2, Kenichi Matsumoto1 1Nara Institute of Science and Technology 2Wakayama University 13th International Workshop on Software Clones (IWSC’19)
  2. Background Approach Result Summary Contents • Goal:Reduce Code Review Cost • Approach:Code Improvement Pattern Detection That Appeared Review • Evaluation: Measure Patterns’ Frequency and Accuracy 2
  3. Background Approach Result Summary Code review process: Reviewers suggest code fix Patch Author Reviewer Project 3 - i=key + i=dic[“key”] Patch Background (1) Submit
  4. Background Approach Result Summary Code review process: Reviewers suggest code fix Patch Author Reviewer Project 4 - i=key + i=dic[“key”] Patch You should fix (1) Submit (2) Review, Fix suggestion Background
  5. Background Approach Result Summary Code review process: Reviewers suggest code fix 5 - i=key + i=dic[“key”] - i=key + i_=_dic[“KEY”] (3) Integrate Patch Author Reviewer Project(1) Submit (2) Review, Fix suggestion Reviewed Patch (Integrated Patch) Pre-Review Patch (Initial Patch) Background
  6. Background Approach Result Summary Problem: Reviewers need to check several times 6 - i=key + i=dic[“key”] - i=key + i_=_dic[“KEY”] (2)〜(n) Review,Fix suggestion (n) Integrate Patch Author Reviewer Project(1) Submit Reviewed Patch (Integrated Patch) Pre-Review Patch (Initial Patch) String should be lower Waste space Background
  7. Background Approach Result Summary Goal: Reduce Similar Review Automatically 7 Auto Review System (2) Review,Fix suggestion (3) Review request Patch Author Reviewer(1) Submit Similar patch is fixed in the past like.. Background
  8. Background Approach Result Summary Approach: Detect Pattern from Reviewed Patch Diff 8 ”key” , it will be “KEY” Pattern: i=dic[“key”] i=dic[“KEY”] Dataset: i=dic[“key”] i=dic[“KEY”]i=dic[“key”] Pre-Review Patch i=dic[“KEY”] Reviewed Patch Approach If patch has Detect
  9. Background Approach Result Summary Approach: Detect Pattern from Reviewed Patch Diff 9 Patch Author Auto Review System print(“key”) print(“KEY”) ”key” , it will be “KEY” Pattern: If patch has Use Dataset: i=dic[“key”] i=dic[“KEY”]i=dic[“key”] i=dic[“KEY”]i=dic[“key”] Pre-Review Patch i=dic[“KEY”] Reviewed Patch Approach
  10. Background Approach Result Summary Detect Code Improved Pattern (1/2): Divide Patch Diff to Chunk 10 - if i␣==␣0: + if i==0: break - i=dic[“key”] + i=dic.get(“key”) - i=dic[“key”] + i=dic.get(“key”) - if i␣==␣0: + if i==0: Approach
  11. Background Approach Result Summary Detect Code Improved Pattern (1/2): Get Pattern by Sequential Pattern Mining 11 - i=dic[“key”] + i=dic.get(“key”) - [i=dic - [ + .get(i=dic - [i=dic i=dic - [ + .get(i=dic “key” - ] + ) Length 2 Length3 Length4 Approach
  12. Background Approach Result Summary Detect Code Improved Pattern (1/2): Get Pattern by Sequential Pattern Mining 12 - i=dic[“key”] + i=dic.get(“key”) - [i=dic - [ + .get(i=dic - [i=dic - ] i=dic + ) - [ + .get(i=dic “key” Length 2 Length3 Length4 Keep Frequently Appeared and Longer Patterns Approach
  13. Background Approach Result Summary Pattern Evaluation 13 i=dic + .get( - ] Appeared Time: + )(e.g. Pattern i=dic .get( ] ) Pre-Reviewed Patches that have Reviewed Patches that have ) Number of Patch Pairs Approach
  14. Background Approach Result Summary Pattern Evaluation 14 Appeared Time: .get( ) Pre-Reviewed Patches that have Reviewed Patches that have Number of Patch Pairs i=dic[“key”] i=dic.get(“key”) i=dic[”KEY”] Count NOT Count e.g.): Pre-Reviewed Patch i=dic + .get( - ] + )(e.g. Pattern ) i=dic ] Approach
  15. Background Approach Result Summary 15 Appeared Time: .get( ) Pre-Reviewed Patches that have Reviewed Patches that have Number of Patch Pairs Accuracy: .get( ) Pre-Reviewed Patches that have Reviewed Patches that have Ratio of Patch Pairs Pattern Evaluation i=dic ] i=dic ] i=dic + .get( - ] + )(e.g. Pattern ) Approach
  16. Background Approach Result Summary Target 16 Project OpenStack Language Python3 Time Period 2011-2016 # Patches 173,749 # Chunks for Detect Pattern 555,050 # Chunks for Evaluate Pattern 61,673 Result
  17. Background Approach Result Summary 8 Frequently Appeared Pattern 17 self.stbout() self.stubs.Set() Why?: Support for OpenStacks‘ library dependency changes Result
  18. Background Approach Result Summary 8 Frequently Appeared Pattern 18 assertEquals() assertEqual() Why?: Support for Python 2 to 3 changes self.stbout() self.stubs.Set() Why?: Support for OpenStacks‘ library dependency changes xrange() range() Result
  19. Background Approach Result Summary 8 Frequently Appeared Pattern 19 assertEquals() assertEqual() Why?: Support for Python 2 to 3 changes assertTrue(x in array) Why?: Improve readability assertIn(x, array) xrange() range() self.stbout() self.stubs.Set() Why?: Support for OpenStacks‘ library dependency changes Result
  20. Background Approach Result Summary 8 Frequently Appeared Pattern 20 assertEquals() assertEqual() Why?: Support for Python2 to 3 changes assertTrue(x in array) Why?: Improve readability assertIn(x, array) - xrange() + range() self.stbout() self.stubs.Set() Why?: Support for OpenStacks‘ library dependency changes Thresholds: Appeared time > 300 Accuracy > 10% Total 8 patterns Cover: 32.3% (19,940/ 61,673) similar patches Accuracy: 45.9% Result
  21. Background Approach Result Summary Patterns are discussed on StackOverflow 21 - assertEquals() + assertEqual() Why?: Support for Python2 to 3 changes - assertTrue(x in array) Why?: Improve readability + assertIn(x, array) - xrange() + range() - self.stbout() + self.stubs.Set() Why?: Support for OpenStacks‘ library dependency changes Result
  22. Background Approach Result Summary 修正しました For Automatically Code Review: Work as GitHub Bot 22 Patch authorBot I fixed Reviewer OK Sample URL: https://github.com/Ikuyadeu/ExtentionTest/pull/9 Result
  23. Background Approach Result Summary vs Other Tool (1 / 2) Static Analysis Tool FOO=0 foo_=_0 23 Bad name Waste space Static Analysis Tool (pylint) Fix based on Language Other tools: ESlint, Pmd, checkstyle Result
  24. Background Approach Result Summary vs Other Tool (1 / 2) Static Analysis Tool FOO=0 foo_=_0 24 Static Analysis Tool (pylint) Fix based on Language This research: Project-specific changes self.stbout() xrange() self.stubs.Set() range() Old library dependency Language definition Result
  25. Background Approach Result Summary vs Other Tool (2 / 2) 25 Choose best rule set from large rule set • Invalid-name • Bad-continuation • Wrong-import-order • Invalid-name • Bad-continuation • Wrong-import-order IntelliCode Result
  26. Background Approach Result Summary vs Other Tool (2 / 2) 26 Choose best rule set from large rule set Find NEW pattern set from history • Invalid-name • Bad-continuation • Wrong-import-order • Invalid-name • Bad-continuation • Wrong-import-order • disk2disk_api • stubs.Set2stub_out • assert-equals2equal IntelliCode This Study Result
  27. Background Approach Result Summary vs Other Tool (2 / 2) 27 Choose best rule set from large rule set Find NEW pattern set from history • Invalid-name • Bad-continuation • Wrong-import-order • Invalid-name • Bad-continuation • Wrong-import-order • disk2disk_api • stubs.Set2stub_out • assert-equals2equal IntelliCode This Study Support project-specific problem Support change of environment Result
  28. Background Approach Result Summary Future Work • Which pattern should bot choose? Most appeared pattern, High accuracy pattern • Compare with Other Projects and Languages’ Patterns • Evaluate by submitting pull request, and get ratio of Accepted / Submitted pull request 28 Summary
  29. 📧ueda.yuki.un7@is.naist.jp
Anúncio