SlideShare a Scribd company logo
1 of 13
Download to read offline
Supporting Program
Comprehension with Source
   Code Summarization
     Sonia Haiduc*, Jairo Aponte**, Andrian Marcus*

                    ICSE NIER 2010



 *                                          **
Developers read source code

• Before performing maintenance on a
  system, developers need to understand
  its source code

• During comprehension, programmers
  search and browse the code
Skimming vs. reading code
• Skimming (Starke’09): quickly reading the names of
  software artifacts
  + Fast
  – Insufficient information
  – Shallow understanding

• Reading in depth
   – Slow
   – Too much information
   + Deeper understanding
Code summaries

• Automatically generated, short, yet accurate
  descriptions of source code entities

• They give more information than just the
  header or the name of an artifact

• Significantly shorter and faster to read than
  the source code they summarize
What should we summarize?
• Code
   –   Packages
   –   Classes
   –   Methods
   –   Method sequences
   –   Etc.

• Other artifacts
   – Bug reports (ICSE 2010 - S. Rastakar, G. Murphy, G. Murray)
   – E-mails
   – Etc.
What should we include
         in code summaries?

• Semantic information
  – What does the source code do?
  – Identifiers and comments that capture the main concepts


• Structural information
  – How does the code work?
  – Class relationships, callers and callees, members of a
    class, etc.
Description: VFS virtual file system read write
              mkdir directory path save      +
Internal classes: DirectoryEntry             +
Methods: listDirectory, mkdir, constructPath +
Fields: WRITE_CAP, READ_CAP, lock            +
Sub-classes: FileVFS, FavoritesVFS           +
Other: ...
How should we generate
        code summaries?

• Semantic information: automatic text
  summarization
  – Machine Learning
  – Discourse-based approaches
  – Term-based Text Retrieval techniques


• Structural information: static analysis
How can we evaluate code
          summaries?

• How good are the automatic summaries
  when compared to manual ones?

• How useful are the automatic code
  summaries for SE tasks?
Preliminary evaluation

• Compared automatic code summaries
  with developer code summaries

• 6 developers, 12 methods in ATunes

• Used only lexical information – 5 most
  relevant terms
Results
• Automatic source code summaries good in
  reflecting developers’ summaries

• Text Retrieval techniques work as well on
  source code as on natural language in reflecting
  human summaries

• Developers make use of structural information in
  their code summaries:
  – Method name terms
  – Class name terms
  – Formal parameter types terms
What are we doing now?

• What type and how much structural
  information should be included in code
  summaries?
• How do developers generate summaries?
• Are different summaries needed for
  different tasks?
• How useful are the code summaries for
  SE tasks?, etc.
In summary…
• Automatic code summaries:
  –   Short yet accurate descriptions of source code
  –   Can reduce the effort of program comprehension
  –   Embed both semantic and structural information
  –   Can be generated for a variety of software entities

• Visit my poster
  (HINT: look for the huge and colorful one)
• www.cs.wayne.edu/~severe and
  www.cs.wayne.edu/~shaiduc
• sonja@wayne.edu

More Related Content

What's hot

Algorithms and Application Programming
Algorithms and Application ProgrammingAlgorithms and Application Programming
Algorithms and Application Programming
ahaleemsl
 
Chap 1-dhamdhere system programming
Chap 1-dhamdhere system programmingChap 1-dhamdhere system programming
Chap 1-dhamdhere system programming
TanzoGamerz
 
Performance Evaluation List
Performance Evaluation ListPerformance Evaluation List
Performance Evaluation List
Ievgen Kuzminov
 

What's hot (15)

Algorithms and Application Programming
Algorithms and Application ProgrammingAlgorithms and Application Programming
Algorithms and Application Programming
 
Euro python 2015 writing quality code
Euro python 2015   writing quality codeEuro python 2015   writing quality code
Euro python 2015 writing quality code
 
Mca 108
Mca 108Mca 108
Mca 108
 
Chap 1-dhamdhere system programming
Chap 1-dhamdhere system programmingChap 1-dhamdhere system programming
Chap 1-dhamdhere system programming
 
Intelligent Hiring with Resume Parser and Ranking using Natural Language Proc...
Intelligent Hiring with Resume Parser and Ranking using Natural Language Proc...Intelligent Hiring with Resume Parser and Ranking using Natural Language Proc...
Intelligent Hiring with Resume Parser and Ranking using Natural Language Proc...
 
IRJET- Querying Database using Natural Language Interface
IRJET-  	  Querying Database using Natural Language InterfaceIRJET-  	  Querying Database using Natural Language Interface
IRJET- Querying Database using Natural Language Interface
 
Topic modeling
Topic modelingTopic modeling
Topic modeling
 
Resume
ResumeResume
Resume
 
Resume parser
Resume parserResume parser
Resume parser
 
Mca 204
Mca 204Mca 204
Mca 204
 
Ramakeerthi_1+yr_resume
Ramakeerthi_1+yr_resumeRamakeerthi_1+yr_resume
Ramakeerthi_1+yr_resume
 
Performance Evaluation List
Performance Evaluation ListPerformance Evaluation List
Performance Evaluation List
 
Intro lecture infs429
Intro lecture infs429Intro lecture infs429
Intro lecture infs429
 
Python - code quality and production monitoring
Python - code quality and production monitoringPython - code quality and production monitoring
Python - code quality and production monitoring
 
Project report
Project reportProject report
Project report
 

Similar to Supporting program comprehension with source code summarization icse nier 2010

Page 18Goal Implement a complete search engine. Milestones.docx
Page 18Goal Implement a complete search engine. Milestones.docxPage 18Goal Implement a complete search engine. Milestones.docx
Page 18Goal Implement a complete search engine. Milestones.docx
smile790243
 
"Hands Off! Best Practices for Code Hand Offs"
"Hands Off!  Best Practices for Code Hand Offs""Hands Off!  Best Practices for Code Hand Offs"
"Hands Off! Best Practices for Code Hand Offs"
Naomi Dushay
 
Introducing Systems Analysis Design Development
Introducing Systems Analysis Design DevelopmentIntroducing Systems Analysis Design Development
Introducing Systems Analysis Design Development
bsadd
 
Xen Project Contributor Training - Part 1 introduction v1.0
Xen Project Contributor Training - Part 1 introduction v1.0Xen Project Contributor Training - Part 1 introduction v1.0
Xen Project Contributor Training - Part 1 introduction v1.0
The Linux Foundation
 

Similar to Supporting program comprehension with source code summarization icse nier 2010 (20)

Research software identification - Catherine Jones
Research software identification - Catherine JonesResearch software identification - Catherine Jones
Research software identification - Catherine Jones
 
Tips to kick-start your Software Engineering Career - Ferdous Mahmud Shaon
Tips to kick-start your Software Engineering Career - Ferdous Mahmud ShaonTips to kick-start your Software Engineering Career - Ferdous Mahmud Shaon
Tips to kick-start your Software Engineering Career - Ferdous Mahmud Shaon
 
Tips to Kick-start your Software Engineering Career
Tips to Kick-start your Software Engineering CareerTips to Kick-start your Software Engineering Career
Tips to Kick-start your Software Engineering Career
 
Code Inspection
Code InspectionCode Inspection
Code Inspection
 
Towards Reusable Research Software
Towards Reusable Research SoftwareTowards Reusable Research Software
Towards Reusable Research Software
 
Page 18Goal Implement a complete search engine. Milestones.docx
Page 18Goal Implement a complete search engine. Milestones.docxPage 18Goal Implement a complete search engine. Milestones.docx
Page 18Goal Implement a complete search engine. Milestones.docx
 
The Final Frontier
The Final FrontierThe Final Frontier
The Final Frontier
 
Dice.com Bay Area Search - Beyond Learning to Rank Talk
Dice.com Bay Area Search - Beyond Learning to Rank TalkDice.com Bay Area Search - Beyond Learning to Rank Talk
Dice.com Bay Area Search - Beyond Learning to Rank Talk
 
"Hands Off! Best Practices for Code Hand Offs"
"Hands Off!  Best Practices for Code Hand Offs""Hands Off!  Best Practices for Code Hand Offs"
"Hands Off! Best Practices for Code Hand Offs"
 
Automatic and rapid generation of massive knowledge repositories from data
Automatic and rapid generation of massive knowledge repositories from dataAutomatic and rapid generation of massive knowledge repositories from data
Automatic and rapid generation of massive knowledge repositories from data
 
Introducing Systems Analysis Design Development
Introducing Systems Analysis Design DevelopmentIntroducing Systems Analysis Design Development
Introducing Systems Analysis Design Development
 
Software citation
Software citationSoftware citation
Software citation
 
Introducing systems analysis, design & development Concepts
Introducing systems analysis, design & development ConceptsIntroducing systems analysis, design & development Concepts
Introducing systems analysis, design & development Concepts
 
Autopsy 3.0 - Open Source Digital Forensics Conference
Autopsy 3.0 - Open Source Digital Forensics ConferenceAutopsy 3.0 - Open Source Digital Forensics Conference
Autopsy 3.0 - Open Source Digital Forensics Conference
 
Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas
Hire a Machine to Code - Michael Arthur Bucko & Aurélien NicolasHire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas
Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas
 
CS6007 information retrieval - 5 units notes
CS6007   information retrieval - 5 units notesCS6007   information retrieval - 5 units notes
CS6007 information retrieval - 5 units notes
 
Information Architecture Explained
Information Architecture ExplainedInformation Architecture Explained
Information Architecture Explained
 
Object Pascal Clean Code Guidelines Proposal (at EKON 22)
Object Pascal Clean Code Guidelines Proposal (at EKON 22)Object Pascal Clean Code Guidelines Proposal (at EKON 22)
Object Pascal Clean Code Guidelines Proposal (at EKON 22)
 
Xen Project Contributor Training - Part 1 introduction v1.0
Xen Project Contributor Training - Part 1 introduction v1.0Xen Project Contributor Training - Part 1 introduction v1.0
Xen Project Contributor Training - Part 1 introduction v1.0
 
APIs and SDKs: Breaking into and Succeeding in a Specialty Market
APIs and SDKs: Breaking into and Succeeding in a Specialty MarketAPIs and SDKs: Breaking into and Succeeding in a Specialty Market
APIs and SDKs: Breaking into and Succeeding in a Specialty Market
 

Recently uploaded

Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
kauryashika82
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
QucHHunhnh
 

Recently uploaded (20)

Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
PROCESS RECORDING FORMAT.docx
PROCESS      RECORDING        FORMAT.docxPROCESS      RECORDING        FORMAT.docx
PROCESS RECORDING FORMAT.docx
 
Spatium Project Simulation student brief
Spatium Project Simulation student briefSpatium Project Simulation student brief
Spatium Project Simulation student brief
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxSKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 

Supporting program comprehension with source code summarization icse nier 2010

  • 1. Supporting Program Comprehension with Source Code Summarization Sonia Haiduc*, Jairo Aponte**, Andrian Marcus* ICSE NIER 2010 * **
  • 2. Developers read source code • Before performing maintenance on a system, developers need to understand its source code • During comprehension, programmers search and browse the code
  • 3. Skimming vs. reading code • Skimming (Starke’09): quickly reading the names of software artifacts + Fast – Insufficient information – Shallow understanding • Reading in depth – Slow – Too much information + Deeper understanding
  • 4. Code summaries • Automatically generated, short, yet accurate descriptions of source code entities • They give more information than just the header or the name of an artifact • Significantly shorter and faster to read than the source code they summarize
  • 5. What should we summarize? • Code – Packages – Classes – Methods – Method sequences – Etc. • Other artifacts – Bug reports (ICSE 2010 - S. Rastakar, G. Murphy, G. Murray) – E-mails – Etc.
  • 6. What should we include in code summaries? • Semantic information – What does the source code do? – Identifiers and comments that capture the main concepts • Structural information – How does the code work? – Class relationships, callers and callees, members of a class, etc.
  • 7. Description: VFS virtual file system read write mkdir directory path save + Internal classes: DirectoryEntry + Methods: listDirectory, mkdir, constructPath + Fields: WRITE_CAP, READ_CAP, lock + Sub-classes: FileVFS, FavoritesVFS + Other: ...
  • 8. How should we generate code summaries? • Semantic information: automatic text summarization – Machine Learning – Discourse-based approaches – Term-based Text Retrieval techniques • Structural information: static analysis
  • 9. How can we evaluate code summaries? • How good are the automatic summaries when compared to manual ones? • How useful are the automatic code summaries for SE tasks?
  • 10. Preliminary evaluation • Compared automatic code summaries with developer code summaries • 6 developers, 12 methods in ATunes • Used only lexical information – 5 most relevant terms
  • 11. Results • Automatic source code summaries good in reflecting developers’ summaries • Text Retrieval techniques work as well on source code as on natural language in reflecting human summaries • Developers make use of structural information in their code summaries: – Method name terms – Class name terms – Formal parameter types terms
  • 12. What are we doing now? • What type and how much structural information should be included in code summaries? • How do developers generate summaries? • Are different summaries needed for different tasks? • How useful are the code summaries for SE tasks?, etc.
  • 13. In summary… • Automatic code summaries: – Short yet accurate descriptions of source code – Can reduce the effort of program comprehension – Embed both semantic and structural information – Can be generated for a variety of software entities • Visit my poster (HINT: look for the huge and colorful one) • www.cs.wayne.edu/~severe and www.cs.wayne.edu/~shaiduc • sonja@wayne.edu