A Metric for Code Readability

Ray Buse

2
Readability
“The quality that enables the observer to correctly
perceive the message”
Metrics for Natural Language
• Flesch-Kincaid Grade Level
• Gunning-Fog Index
• SMOG Index
• Automated Readability Index
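
As a point of reference for the natural-language metrics listed above, here is a minimal Python sketch of the Flesch-Kincaid Grade Level, using the standard formula 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. The function names and the vowel-group syllable counter are assumptions made for illustration; real implementations typically count syllables with a dictionary or better heuristics.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count runs of consecutive vowels.

    This heuristic is an assumption of this sketch, not part of the metric.
    """
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level of an English text."""
    # Sentences are split on runs of '.', '!' or '?'; empty pieces are dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


if __name__ == "__main__":
    print(flesch_kincaid_grade("The cat sat on the mat."))
```

Note that the metric depends only on surface counts (words, sentences, syllables), which is the same spirit as the "local" features used for code later in the talk.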

5
Readability and Software
Code maintenance = 70% of lifecycle cost.
And most maintenance effort is spent reading code!
But do we have any way to gain some level of assurance in code readability?

6
Hypothesis
Employing a simple set of local features, we can
derive, from a set of human judgments, an accurate
model of readability for code.
• To what extent do humans agree on code readability?
• We know readability is important, but can we create a predictive model of it?
• What could such a model teach us?
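
One way to make the first question concrete is to correlate each annotator's scores with the mean score of all the other annotators. The Python sketch below does this with Pearson correlation. The choice of Pearson, the function names, and the data layout (one list of scores per annotator, snippets in the same order) are assumptions of this sketch; the slides do not say which agreement statistic the authors used.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


def agreement_with_others(scores):
    """Map each annotator to the correlation between their scores and
    the per-snippet mean of every other annotator's scores.

    `scores` is a dict: annotator name -> list of scores, one per snippet.
    """
    if len(scores) < 2:
        raise ValueError("need at least two annotators")
    result = {}
    for name, own in scores.items():
        others = [s for other, s in scores.items() if other != name]
        mean_others = [sum(col) / len(col) for col in zip(*others)]
        result[name] = pearson(own, mean_others)
    return result
```

Averaging the returned values gives a single number for how well a typical annotator agrees with the rest of the group. A model's agreement with the mean human score can then be judged against that number.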

7
Outline
• Acquiring Human Readability Judgments
• Extracting a Model
• Model Performance
• Correlation with External Notions of Software Quality
• Readability and the Software Lifecycle

14
Features
We choose “local” code features
• Line length
• Length of identifier names
• Comment density
• Blank lines
• Presence of numbers
• [and 20 others]
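
The five features named above can each be computed from the text of a snippet alone. The Python sketch below shows one possible way to do it. The feature names, the small keyword set, and the assumption of `//` line comments in a Java-like language are illustrative choices, not the authors' actual extractor; the remaining 20 features are not listed on the slide and are not guessed at here.

```python
import re

# Small keyword set so that keywords are not counted as identifiers.
# Java-like and deliberately incomplete (an assumption of this sketch).
KEYWORDS = {"if", "else", "for", "while", "return", "int", "void",
            "class", "new", "public", "private", "static"}


def local_features(snippet: str) -> dict:
    """Compute a few surface-level features of a code snippet."""
    lines = snippet.splitlines()
    if not lines:
        raise ValueError("snippet must contain at least one line")
    n = len(lines)
    # Drop everything after '//' so comment words are not counted as code.
    # This naive split would also cut string literals that contain '//'.
    code_text = "\n".join(line.split("//", 1)[0] for line in lines)
    identifiers = [t for t in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", code_text)
                   if t not in KEYWORDS]
    numbers = re.findall(r"\b\d+(?:\.\d+)?\b", code_text)
    return {
        "avg_line_length": sum(len(line) for line in lines) / n,
        "max_line_length": max(len(line) for line in lines),
        "avg_identifier_length": (sum(len(i) for i in identifiers) / len(identifiers)
                                  if identifiers else 0.0),
        "comment_line_ratio": sum(1 for line in lines if "//" in line) / n,
        "blank_line_ratio": sum(1 for line in lines if not line.strip()) / n,
        "numbers_per_line": len(numbers) / n,
    }


if __name__ == "__main__":
    example = "int total = 0; // running sum\n\nfor (int i = 0; i < 10; i++) total += i;"
    for name, value in local_features(example).items():
        print(f"{name}: {value:.2f}")
```

Because every feature is a simple count over lines and tokens, no parsing or type information is needed, which is what makes the features "local".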

19
Conclusions
We can automatically judge readability about as
well as the “average” human can
This notion of readability shows significant
correlation with:
• Version Changes
• The output of a bug finder
• Self-reported program maturity
We may also learn more about software readability
by looking at the predictive power of our model’s
features

1. A Metric for Software Readability. Ray Buse ∙ Westley Weimer. ISSTA 2008

8. Snippet Sniper Demo

11. Scoring Data

12. Score Distribution

13. Setup

15. Model Performance

16. External Notions of Quality

17. Software Lifecycle

18. Software Lifecycle 2

20. Questions?