2.
Readability
“The quality that enables the observer to correctly perceive the message”
Metrics for Natural Language
Flesch-Kincaid Grade Level
Gunning-Fog Index
SMOG Index
Automated Readability Index
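Two of these metrics reduce to closed-form formulas over simple text counts. The sketch below uses the commonly published coefficients for the Automated Readability Index and the Flesch-Kincaid Grade Level. The function names, the regex-based word and sentence splitting, and the vowel-group syllable heuristic are illustrative assumptions, not something taken from the talk.

```python
import re

def _words(text):
    # Assumption: a "word" is a run of letters, digits, or apostrophes.
    return re.findall(r"[A-Za-z0-9']+", text)

def _sentence_count(text):
    # Assumption: each run of '.', '!' or '?' ends one sentence (minimum 1).
    return max(1, len(re.findall(r"[.!?]+", text)))

def _syllables(word):
    # Crude heuristic: one syllable per group of consecutive vowels.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def automated_readability_index(text):
    words = _words(text)
    if not words:
        raise ValueError("text contains no words")
    chars = sum(len(w) for w in words)
    return (4.71 * chars / len(words)
            + 0.5 * len(words) / _sentence_count(text)
            - 21.43)

def flesch_kincaid_grade(text):
    words = _words(text)
    if not words:
        raise ValueError("text contains no words")
    syllables = sum(_syllables(w) for w in words)
    return (0.39 * len(words) / _sentence_count(text)
            + 11.8 * syllables / len(words)
            - 15.59)
```

Both scores rise with longer words and longer sentences, so short, plain sentences score lower (easier) than dense ones.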
5.
Readability and Software
Code maintenance = 70% of lifecycle cost.
And most maintenance effort is spent reading code!
But do we have any way to gain some level of assurance in code readability?
6.
Hypothesis
Employing a simple set of local features, we can derive, from a set of human judgments, an accurate model of readability for code.
To what extent do humans agree on code readability?
We know readability is important, but can we create a predictive model of it?
What could such a model teach us?
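To make the hypothesis concrete, here is a minimal sketch of fitting a model to human judgments. The slides do not say which learner was used, so logistic regression trained by batch gradient descent is a stand-in chosen for simplicity. The two features, their scaling, and the six labeled rows are made-up toy values, not data from the study.

```python
import math

def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_readable(weights, x):
    # weights holds one coefficient per feature, followed by the bias term.
    z = sum(w * xi for w, xi in zip(weights, x)) + weights[-1]
    return _sigmoid(z)

def train_readability_model(features, labels, lr=0.5, epochs=5000):
    """Fit logistic-regression weights by batch gradient descent on log loss."""
    dim = len(features[0])
    weights = [0.0] * (dim + 1)
    for _ in range(epochs):
        grad = [0.0] * (dim + 1)
        for x, y in zip(features, labels):
            err = predict_readable(weights, x) - y
            for i, xi in enumerate(x):
                grad[i] += err * xi
            grad[dim] += err  # gradient for the bias term
        weights = [w - lr * g / len(features) for w, g in zip(weights, grad)]
    return weights

# Toy, made-up training set: [avg line length / 100, comment density].
# Label 1 = annotators judged the snippet readable, 0 = not readable.
toy_features = [[0.30, 0.30], [0.40, 0.20], [0.35, 0.25],
                [0.90, 0.00], [0.80, 0.05], [1.00, 0.00]]
toy_labels = [1, 1, 1, 0, 0, 0]

toy_weights = train_readability_model(toy_features, toy_labels)
```

The learned coefficients also hint at the last question on the slide: the sign and size of each weight suggest which features push a snippet toward or away from a "readable" judgment.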
7.
Outline
Acquiring Human Readability Judgments
Extracting a Model
Model Performance
Correlation with External Notions of Software Quality
Readability and the Software Lifecycle
14.
Features
We choose “local” code features
Line length
Length of identifier names
Comment density
Blank lines
Presence of numbers
[and 20 others]
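The listed features can all be computed line by line, without parsing the program. A rough sketch follows. The function name, the feature keys, and the `comment_prefix` parameter are hypothetical, and the tokenization is deliberately naive: keywords count as identifiers, and a comment marker inside a string literal is treated as a comment.

```python
import re

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")

def local_features(snippet, comment_prefix="//"):
    """Compute a few local, per-snippet features from raw source text."""
    lines = snippet.splitlines()
    if not lines:
        raise ValueError("empty snippet")
    n = len(lines)

    code_parts = []
    comment_lines = 0
    for line in lines:
        # Split off a trailing line comment so it does not pollute token counts.
        code, sep, _ = line.partition(comment_prefix)
        if sep:
            comment_lines += 1
        code_parts.append(code)
    code_text = "\n".join(code_parts)

    identifiers = IDENTIFIER.findall(code_text)
    numbers = NUMBER.findall(code_text)
    avg_ident = sum(map(len, identifiers)) / len(identifiers) if identifiers else 0.0

    return {
        "avg_line_length": sum(len(l) for l in lines) / n,
        "max_line_length": max(len(l) for l in lines),
        "avg_identifier_length": avg_ident,
        "comment_density": comment_lines / n,   # fraction of lines with a comment
        "blank_line_ratio": sum(1 for l in lines if not l.strip()) / n,
        "number_count": len(numbers),
    }
```

Because every feature depends only on the text of the snippet itself, the same function works on fragments that would not compile on their own.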
19.
Conclusions
We can automatically judge readability about as well as the “average” human can
This notion of readability shows significant correlation with:
Version changes
The output of a bug finder
Self-reported program maturity
We may also learn more about software readability by looking at the predictive power of our model’s features