Supporting Program Comprehension with Source Code Summarization (ICSE NIER 2010)


One of the main challenges faced by today’s developers is keeping up with the staggering amount of source code that needs to be read and understood. In order to help developers with this problem and reduce the costs associated with it, one solution is to use simple textual descriptions of source code entities that developers can grasp easily, while capturing the code semantics precisely. We propose an approach to automatically determine such descriptions, based on automated text summarization technology and structural information.


Supporting Program
Comprehension with Source
Code Summarization
Sonia Haiduc*, Jairo Aponte**, Andrian Marcus*

ICSE NIER 2010


Developers read source code

• Before performing maintenance on a
system, developers need to understand
its source code

• During comprehension, programmers
search and browse the code

Skimming vs. reading code
• Skimming (Starke’09): quickly reading the names of
software artifacts
+ Fast
– Insufficient information
– Shallow understanding

• Reading in depth
– Slow
– Too much information
+ Deeper understanding

Code summaries

• Automatically generated, short, yet accurate
descriptions of source code entities

• They give more information than just the
header or the name of an artifact

• Significantly shorter and faster to read than
the source code they summarize

What should we summarize?
• Code
– Packages
– Classes
– Methods
– Method sequences
– Etc.

• Other artifacts
– Bug reports (ICSE 2010 - S. Rastkar, G. Murphy, G. Murray)
– E-mails
– Etc.

What should we include
in code summaries?

• Semantic information
– What does the source code do?
– Identifiers and comments that capture the main concepts

• Structural information
– How does the code work?
– Class relationships, callers and callees, members of a
class, etc.
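
As a rough illustration of what "structural information" can look like in practice, the sketch below collects base classes, fields, methods, and callees for each class in a small source snippet. It is only a sketch under stated assumptions: it uses Python's standard `ast` module on Python code as a stand-in, and the sample classes, the `structural_facts` helper, and its output layout are hypothetical names chosen for illustration, not something taken from the slides.

```python
import ast

# Hypothetical sample code; the class and member names are illustrative only.
SOURCE = '''
class VFS:
    READ_CAP = 1
    WRITE_CAP = 2

    def list_directory(self, path):
        return self.construct_path(path)

    def construct_path(self, path):
        return path


class FileVFS(VFS):
    def mkdir(self, path):
        return self.construct_path(path)
'''


def structural_facts(source):
    """Collect base classes, fields, methods, and called names per class."""
    facts = {}
    for node in ast.parse(source).body:
        if not isinstance(node, ast.ClassDef):
            continue
        info = {"bases": [], "fields": [], "methods": [], "calls": {}}
        # Only simple-name bases (e.g. "VFS") are recorded in this sketch.
        info["bases"] = [b.id for b in node.bases if isinstance(b, ast.Name)]
        for item in node.body:
            if isinstance(item, ast.Assign):
                # Class-level assignments are treated as fields.
                info["fields"] += [t.id for t in item.targets
                                   if isinstance(t, ast.Name)]
            elif isinstance(item, ast.FunctionDef):
                info["methods"].append(item.name)
                callees = set()
                for sub in ast.walk(item):
                    if isinstance(sub, ast.Call):
                        func = sub.func
                        if isinstance(func, ast.Attribute):
                            callees.add(func.attr)   # e.g. self.construct_path
                        elif isinstance(func, ast.Name):
                            callees.add(func.id)     # e.g. plain function call
                info["calls"][item.name] = sorted(callees)
        facts[node.name] = info
    return facts


facts = structural_facts(SOURCE)
for class_name, info in facts.items():
    print(class_name, info)
```

Callees are matched by name only, so this is a purely syntactic approximation; a real static analysis would resolve types and call targets.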

Description: VFS virtual file system read write mkdir directory path save
Internal classes: DirectoryEntry
Methods: listDirectory, mkdir, constructPath
Fields: WRITE_CAP, READ_CAP, lock
Sub-classes: FileVFS, FavoritesVFS
Other: ...

How should we generate
code summaries?

• Semantic information: automatic text
summarization
– Machine Learning
– Discourse-based approaches
– Term-based Text Retrieval techniques

• Structural information: static analysis
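
To make the term-based option concrete, here is a minimal sketch that splits identifiers and comments into terms and ranks them with tf-idf, one common text retrieval weighting. The slides do not say which weighting or technique was used, so tf-idf is an assumption here, and the `split_terms` and `top_terms` helpers, the stop-word list, and the three sample methods are all hypothetical.

```python
import math
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not from the slides).
STOP_WORDS = {"the", "a", "of", "to", "and", "if", "return",
              "self", "def", "in", "for", "not"}


def split_terms(text):
    """Split identifiers and comments into lowercase terms."""
    terms = []
    for word in re.findall(r"[A-Za-z]+", text):
        # Break camelCase such as "listDirectory" into "list", "Directory";
        # runs of capitals such as "VFS" stay together.
        parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", word)
        terms += [p.lower() for p in parts]
    return [t for t in terms if t not in STOP_WORDS and len(t) > 1]


def top_terms(documents, name, k=5):
    """Rank the terms of documents[name] by tf-idf against the whole corpus."""
    term_lists = {n: split_terms(text) for n, text in documents.items()}
    doc_freq = Counter()
    for terms in term_lists.values():
        doc_freq.update(set(terms))
    n_docs = len(documents)
    tf = Counter(term_lists[name])
    scores = {t: c * math.log(n_docs / doc_freq[t]) for t, c in tf.items()}
    # Sort by descending score, then alphabetically so ties are deterministic.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical corpus: one "document" per method (signature, comment, body).
METHODS = {
    "mkdir": "def mkdir(path): # create the directory at path\n"
             "    makeDirectory(path)",
    "listDirectory": "def listDirectory(path): # list the files in a directory\n"
                     "    return readEntries(path)",
    "save": "def save(buffer, path): # write the buffer to a file\n"
            "    writeFile(buffer, path)",
}

print(top_terms(METHODS, "save", k=3))
```

Terms that occur in every method, such as `path` in this corpus, get a weight of zero, which is why they drop out of the summary even though they are frequent.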

How can we evaluate code
summaries?

• How good are the automatic summaries
when compared to manual ones?

• How useful are the automatic code
summaries for SE tasks?

Preliminary evaluation

• Compared automatic code summaries
with developer code summaries

• 6 developers, 12 methods in aTunes

• Used only lexical information – 5 most
relevant terms
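
One simple way to compare a 5-term automatic summary with a developer-written one is term overlap. The slides do not state which measure was used in the evaluation, so the precision, recall, and Jaccard scores below are an assumption, and the `term_overlap` helper and both term lists are made up for illustration.

```python
def term_overlap(automatic, manual):
    """Compare two term-based summaries as sets of terms."""
    auto, human = set(automatic), set(manual)
    shared = auto & human
    union = auto | human
    return {
        # Share of automatic terms that the developer also chose.
        "precision": len(shared) / len(auto) if auto else 0.0,
        # Share of developer terms that the automatic summary recovered.
        "recall": len(shared) / len(human) if human else 0.0,
        # Symmetric overlap between the two summaries.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical summaries of one method (not data from the study).
automatic_summary = ["buffer", "file", "write", "save", "path"]
developer_summary = ["save", "buffer", "file", "disk"]

scores = term_overlap(automatic_summary, developer_summary)
print(scores)
```

Averaging such scores over all methods and developers would give one number per technique, though exact matching ignores synonyms and word variants.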

Results
• Automatic source code summaries are good at
reflecting developers’ summaries

• Text Retrieval techniques work as well on
source code as on natural language in reflecting
human summaries

• Developers make use of structural information in
their code summaries:
– Method name terms
– Class name terms
– Formal parameter type terms

What are we doing now?

• What type and how much structural
information should be included in code
summaries?
• How do developers generate summaries?
• Are different summaries needed for
different tasks?
• How useful are the code summaries for
SE tasks?
• Etc.

In summary…
• Automatic code summaries:
– Short yet accurate descriptions of source code
– Can reduce the effort of program comprehension
– Embed both semantic and structural information
– Can be generated for a variety of software entities

• Visit my poster
(HINT: look for the huge and colorful one)
• www.cs.wayne.edu/~severe and
www.cs.wayne.edu/~shaiduc
• sonja@wayne.edu
