CODES: mining sourCe cOde Descriptions from developeErs diScussions
Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora
S. Panichella, J. Aponte, M. Di Penta, A. Marcus, G. Canfora - ICPC 2012
Such communications are…
• unstructured
• usually not explicitly meant to describe specific parts of the source code.
Example: Eclipse (Java System), Bug Report and Mailing List
CLASS: MainMethodSearchEngine
METHOD: searchMainMethods
..........................
Problem seems to come from searchMainMethods(IProgressMonitor, IJavaSearchScope, boolean) in org.eclipse.jdt.internal.debug.ui.launcher.MainMethodSearchEngine, there's a call to addSubTypes(List, IProgressMonitor, IJavaSearchScope) if includesSubtypes flag is ON. This add all types sub-types as soon as the given scope encloses them without testing if sub-types have a main !
..........................
Highlighting on the slide: parameters in orange; keywords (e.g. "call", "method") in green.
CODES: Approach for Mining Method Descriptions
• Step 1: Downloading Stack Overflow (SO) discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering
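Steps 2 and 3 can be illustrated with a minimal sketch. It assumes the discussions have already been downloaded (Step 1 is not shown); the `Post` record, the blank-line paragraph splitting, and the exact regular expression are hypothetical choices made for illustration, not the tool's actual implementation.

```python
import re
from dataclasses import dataclass


@dataclass
class Post:
    """Hypothetical record for an already-downloaded SO answer."""
    body: str
    votes: int


def extract_paragraphs(body: str) -> list[str]:
    """Step 2 (assumed rule): split a post body on blank lines."""
    chunks = re.split(r"\n\s*\n", body)
    return [chunk.strip() for chunk in chunks if chunk.strip()]


def trace_to_method(posts: list[Post], method_name: str) -> list[str]:
    """Step 3: keep paragraphs that mention `method_name` followed by '('."""
    # \b prevents a match inside a longer identifier.
    pattern = re.compile(r"\b" + re.escape(method_name) + r"\s*\(")
    traced = []
    for post in posts:
        # The slides only state that posts with 0 votes are discarded.
        if post.votes == 0:
            continue
        for paragraph in extract_paragraphs(post.body):
            if pattern.search(paragraph):
                traced.append(paragraph)
    return traced
```

Matching on the method name followed by an opening parenthesis is one simple way to reduce accidental matches on ordinary English words; a real tracer would likely also check the parameter list.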
DEMO
http://youtu.be/Rnc5ni1AAzc
CODES Results:
System                 Mined Descriptions                   True Positives   False Positives
Apache Lucene 2.9.0    9,343 (about 20% of the methods)     84               16
Hibernate 3.5.0        10,608 (about 28% of the methods)    91               9
We sampled 100 descriptions from each project
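The sampled counts translate directly into a precision figure per project. The short sketch below assumes the two columns are counts within each 100-description sample; the variable and function names are illustrative only.

```python
# Counts copied from the results table (100 sampled descriptions per project).
samples = {
    "Apache Lucene 2.9.0": {"true_positives": 84, "false_positives": 16},
    "Hibernate 3.5.0": {"true_positives": 91, "false_positives": 9},
}


def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of sampled descriptions judged to be correct."""
    return true_positives / (true_positives + false_positives)


precisions = {name: precision(**counts) for name, counts in samples.items()}

for system, value in precisions.items():
    print(f"{system}: precision = {value:.2f}")
```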
CODES: mining sourCe cOde Descriptions from developeErs diScussions - ICPC 2014

Editor's Notes

  1. The core approach behind CODES is based on the approach defined in our previous paper at ICPC 2012, «Mining Source Code Descriptions from Developer Communications». That work was motivated by the conviction that documentation is very often scarce, incomplete, and out of date. Mining source code descriptions can therefore be very important for re-documenting code or complementing code comments. We argue that mailing lists and issue trackers can be a useful source of information to help understand source code. Thus, in that paper we defined an approach that mines Java method descriptions from developer discussions in mailing lists and issue trackers.
  2. Indeed, bug reports and mailing lists often contain source code descriptions at different levels of abstraction. In this example of an Eclipse (a Java system) bug report, we can see a good description of a method of the Java class “MainMethodSearchEngine”. Such examples motivated our previous approach to mining method descriptions. However, such descriptions are also frequently present in discussions on Stack Overflow; thus, we adapted our approach to this question-and-answer site.
  3. In the same way, we can find a similar description in an email of Apache Lucene (another Java system). It is important to note that such “USEFUL” descriptions very often contain relevant keywords like “call/invoke”.
  4. CODES stands for “mining source code descriptions from developers' discussions”. It starts by selecting a Java method (or methods) to re-document and finds related descriptions on Stack Overflow (SO). CODES consists of 5 steps. Step 1: downloading SO discussions relying on its REST interface and tracing them onto classes. Step 2: extracting paragraphs from those discussions. Step 3: tracing paragraphs onto methods (using regular expressions) and discarding discussions or answers with 0 votes. Step 4: applying heuristic-based filtering, which verifies that a paragraph meets some patterns and keeps paragraphs containing syntactic descriptions, descriptions of method parameters, descriptions related to method invocations, and so on. Step 5: finally, similarity-based filtering, in which we verify the agreement between the method descriptions found on SO and the source code by computing the textual similarity between the source code and the method descriptions and discarding descriptions with a similarity lower than 0.4.
  5. Improvements aim at increasing the precision while keeping the method coverage as high as possible. We aim at further validating the proposed approach on a larger set of systems, and at investigating how to enhance CODES in terms of usability and with new features, e.g., for re-documenting classes or packages.
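The two filtering steps described in note 4 can be sketched as follows. This is only an illustration under stated assumptions. The notes do not name the similarity measure, so cosine similarity over term counts is swapped in here. The keyword list is a hypothetical stand-in for the real heuristics, built from the “call/invoke” examples in note 3. Only the 0.4 threshold comes from the notes.

```python
import math
import re
from collections import Counter

# Hypothetical keyword list; the notes only give "call/invoke" as examples.
KEYWORDS = ("call", "invoke")
SIMILARITY_THRESHOLD = 0.4  # threshold stated in the notes


def tokenize(text: str) -> list[str]:
    """Split text into lower-case terms, breaking camelCase identifiers."""
    terms = []
    for word in re.findall(r"[A-Za-z]+", text):
        # e.g. "searchMainMethods" -> "search", "Main", "Methods"
        terms.extend(re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word))
    return [term.lower() for term in terms]


def passes_heuristics(paragraph: str) -> bool:
    """Step 4 (simplified): keep paragraphs with a term starting with a keyword."""
    return any(term.startswith(KEYWORDS) for term in tokenize(paragraph))


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between term-count vectors (assumed measure)."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def keep_description(paragraph: str, method_source: str) -> bool:
    """Step 5: discard descriptions whose similarity is lower than 0.4."""
    return (
        passes_heuristics(paragraph)
        and cosine_similarity(paragraph, method_source) >= SIMILARITY_THRESHOLD
    )
```

Splitting identifiers into their camelCase parts lets a natural-language paragraph share terms with the source code, which is what makes a purely textual comparison between the two meaningful.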