Supporting the Maintenance of Identifier Names: A Holistic Approach to High-Quality Automated Identifier Naming
Anthony Peruma
June 28, 2022
B. Thomas Golisano College of Computing and Information Sciences
Ph.D. Dissertation Presentation
Dissertation Committee: Dr. Mohamed Mkaouer, Dr. Mehdi Mirakhorli & Dr. Marcos Zampieri
Dissertation Advisor: Dr. Christian Newman
Dissertation Defense Chair: Dr. Robert Glick
Agenda
01 Introduction
➢ Context
➢ Current Research & Challenges
02 Research
➢ Goal & Research Questions
➢ Completed Studies
➢ Overall Findings
03 Conclusion
➢ Future Work
➢ Summary
“Any fool can write code that a computer can understand. Good programmers write code that humans can understand.”
~ Martin Fowler, 1999
Introduction
The importance of identifier names
Identifier Names
Lexical tokens that uniquely identify elements
in the source code (classes, methods, etc.)
Names account for 70% of the characters in the
code base
Software Maintenance
Every software system undergoes maintenance
(corrective, adaptive, preventive, perfective)
Consumes 60%–80% of an organization's
resources
Program Comprehension
A precursor to any maintenance task
Developers spend 58% of their time on
program comprehension activities
Poor code readability impacts time and quality
Lexical tokens that uniquely identify entities
class name
attribute name
parameter name
method name
variable name
Responsible for saving/writing
results of an operation
Responsible for writing the output
which comes in as a parameter
Some poor-quality names are easy to spot…
Unreadable method name
Generic name
… others are not so straightforward!
Readable attribute name
Anti-Pattern: Collection data type, singular identifier name
Readable method name
Anti-Pattern: method name suggests transformation, but no return type
Introduction
Current Research & Challenges
IDENTIFIER NAMING IS HARD
CODING STANDARDS & STYLE GUIDES
Provide heuristics about the overall readability of a class. They do not produce
strong names, nor can they provide lexical structure recommendations.
RENAMING
Renames account for over 40% of the rework developers perform.
Renames do not guarantee a strong name.
Challenges with renaming
A “rename chain”: multiple instances of developers renaming an identifier
Is the final name high-quality?
NAME RECOMMENDATION MODELS
Models are prescriptive, not descriptive. A model is built from existing
code styles and does not account for pre-existing poor identifiers.
Models work only on method names. Context sensitivity is a challenge.
NLP TECHNOLOGY
Current technology is built for English prose, not source code (e.g., Stanford
POS tagger); domain and technology terms pose a challenge.
Names are diverse, and so are the developers who craft
these names, so a one-stop solution is very challenging!
Over 30 years of research in identifier naming
Multiple Research Streams
Identifier Renaming
Identifier Name Quality
Naming styles, metrics, models, linguistic anti-patterns,
grammar patterns
Challenges with current approaches
Name quality is a threat to downstream approaches
Even with over 30 years of research, we do not have a way to measure strong identifier names.
Research
GOAL: Improving the developer code comprehension experience through novel automated
mechanisms in identifier name appraisals and recommendations
Grammar Patterns
A grammar pattern is the sequence of part-of-speech tags assigned to
individual words within an identifier
Part-of-speech is a category to which a word is assigned in accordance with
its syntactic functions
• In English, the main parts of speech are noun, pronoun, adjective, determiner, verb, adverb,
preposition, conjunction, and interjection
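int dynamic_Table_Index;
  dynamic = Noun Modifier (NM), Table = Noun Modifier (NM), Index = Noun (N)
  Grammar pattern: NM NM N
void save_As_Quadratic_Png();
  save = Verb (V), As = Preposition (P), Quadratic = Noun Modifier (NM), Png = Noun (N)
  Grammar pattern: V P NM N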
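The definition above can be illustrated with a minimal Python sketch that splits an identifier into words and looks up a tag for each one. The lexicon `TOY_LEXICON` and the helper names `split_identifier` and `grammar_pattern` are hypothetical and exist only for this illustration; a real tagger must decide tags from context (the same word can be a noun modifier in one name and a head noun in another), which a fixed lookup table cannot do.

```python
import re

# Hypothetical hand-written lexicon (word -> part-of-speech tag).
# It covers only the two example identifiers from the slide.
TOY_LEXICON = {
    "dynamic": "NM", "table": "NM", "index": "N",
    "save": "V", "as": "P", "quadratic": "NM", "png": "N",
}


def split_identifier(name):
    """Split an identifier on underscores and camelCase boundaries."""
    words = []
    for part in name.split("_"):
        # Acronym runs, capitalized or lowercase words, or digit runs.
        words.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [word.lower() for word in words]


def grammar_pattern(name, lexicon=TOY_LEXICON):
    """Return the tag sequence for a name; unknown words are tagged '?'."""
    return " ".join(lexicon.get(word, "?") for word in split_identifier(name))


print(grammar_pattern("dynamic_Table_Index"))    # NM NM N
print(grammar_pattern("save_As_Quadratic_Png"))  # V P NM N
```

The lookup table is the weak point of this sketch: it is context-insensitive, which is one reason the deck argues for a specialized identifier tagger.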
Research Questions
• RQ 1: How effectively, in terms of correctness, can
grammar patterns be automatically generated for identifier
names?
• RQ 2: To what extent did the automated identifier naming
mechanism positively or negatively influence naming
practices?
• RQ 3: What are the primary challenges in appraising and
recommending the semantic structure of identifier names,
and how can these be improved?
Incorporating automated support for identifier name
maintenance into the developer workflow
[Plugin screenshots. Callouts: a squiggly line indicates a naming problem; summary of all naming problems; selected identifier; problem summary; detected and recommended grammar pattern; problem explanation]
Research Focus Areas
Identifier Name Evolution
Identifier Name Tool Development
Developer Workflow Integration
• Rename Prevalence
• Semantic Evolution
• Contextualization
• Grammar Patterns
• Abbreviation Expansion
• Rename Semantic Detection
• Linguistic Anti-Pattern Detection
• Identifier Part-of-Speech Tagger
• Developer Experience
Research
Completed Studies
An empirical investigation of how and why
developers rename identifiers
INTRODUCTION & GOAL
Current work in the field does not examine the evolution of the name
Most of these studies do not provide empirical data – mostly conceptual
The study extends a portion of the work done by Arnaoudova et al. to a much larger number of systems
Lays the groundwork for understanding how the semantics of a name evolves
Goal: Explore the volume of rename refactoring operations developers apply and changes to the
structure of the renamed identifier names
Peruma, A., Mkaouer, M. W., Decker, M. J., & Newman, C. D. (2018, September). An empirical investigation of how and why developers rename
identifiers. In Proceedings of the 2nd International Workshop on Refactoring (pp. 26-33).
METHODOLOGY
Empirical study on 3,795 open-source Java systems
• RefactoringMiner to mine rename refactoring operations
• Rename Taxonomy – determines the type of form change and semantic
change made to an identifier’s name
• NLP Tools – including NLTK to determine the semantics of a name
• Topic Modeling – on the rename refactoring commit messages to
determine
Data Overview:
1M+ refactoring operations → 43.36% rename refactorings
KEY FINDINGS & TAKEAWAYS
Renames form the bulk of the rework developers perform when refactoring their code
Developers mostly perform simple renames – either add or remove a single term in a name
Developers frequently narrow the meaning of a name during a rename
There is a strong correlation between grammar changes and meaning preservation of the identifier's name
Topic modeling of rename commit messages results in high-level topics - difficult to pinpoint the
developer’s intention
• Exploratory study showing the viability of using the semantic structure of names to determine the quality of the name
• Scope for constructing specialized NLP tools for software engineering artifacts
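The finding that most renames add or remove a single term can be illustrated with a small Python sketch that compares the terms of the old and new names. This is a simplified stand-in, not the rename taxonomy used in the study; the function names `split_terms` and `classify_rename` and the returned labels are hypothetical.

```python
import re
from collections import Counter


def split_terms(name):
    """Split a snake_case or camelCase name into lowercase terms."""
    terms = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [term.lower() for term in terms]


def classify_rename(old_name, new_name):
    """Label a rename by how many terms were added and removed."""
    old_terms = Counter(split_terms(old_name))
    new_terms = Counter(split_terms(new_name))
    added = sum((new_terms - old_terms).values())
    removed = sum((old_terms - new_terms).values())
    if added + removed == 0:
        return "no term change"
    if added + removed == 1:
        return "simple: single term added" if added else "simple: single term removed"
    return "complex: %d added, %d removed" % (added, removed)


print(classify_rename("table_index", "dynamic_table_index"))
print(classify_rename("getUserName", "getName"))
print(classify_rename("user_list", "users"))
```

Comparing term multisets ignores term order and casing changes, so a pure reordering or restyling is reported as "no term change" in this sketch.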
Contextualizing rename decisions using
refactorings, commit messages, and data types
INTRODUCTION & GOAL
Existing research on identifier naming does not investigate how names evolve and how these
changes correlate with changes made to source code
Help determine when/how to rename identifiers and to understand more about developer naming
mental models
Goal: Understand how surrounding code and development activities influence the structure and
meaning of an identifier’s name
• Data Types – have strong influence over the data and behavior represented by an identifier
• Refactorings – changes made before or after a rename have a relationship with the rename itself
Peruma, A., Mkaouer, M. W., Decker, M. J., & Newman, C. D. (2020). Contextualizing rename decisions using refactorings, commit messages, and data
types. Journal of Systems and Software, 169, 110704.
METHODOLOGY
• RefactoringMiner to mine 28 refactoring operation types in the
source code
• Rename Taxonomy – determines the type of form change and semantic
change made to an identifier’s name
• Static Analysis – extracts the data type for an identifier
• NLP Tools – including NLTK to determine the semantics of a name
• Developer Experience – measured by the number of commits
a developer performed on the source code
Data Overview:
748,001 commits → 711,495 refactoring operations → 53.51% of
refactorings are renames
KEY FINDINGS & TAKEAWAYS
Novice developers perform rename refactorings more frequently than other types of refactoring
operations
A Rename Attribute usually follows a Move Attribute
When a rename follows another rename, the developer reverts to the original name
Developers frequently change the semantic meaning of an identifier name after a refactoring
Renames that involve a change to the type name tend to also involve identifiers with names
exactly matching their type
A data type change to a collection causes a name to change to plural
• Name quality tools should consider the experience of the developer when presenting results
• Incorporation of code & name relationship heuristics (e.g., data types and plurality) into automated name appraisals
and recommendations
Narrowing the name of the type narrows the
meaning of the identifier’s name
Collection data types are associated with plural
identifier names
On the generation, structure, and semantics of
grammar patterns in source code identifiers
Newman, C. D., AlSuhaibani, R. S., Decker, M. J., Peruma, A., Kaushik, D., Mkaouer, M. W., & Hill, E. (2020). On the generation, structure, and semantics
of grammar patterns in source code identifiers. Journal of Systems and Software, 170, 110740.
INTRODUCTION & GOAL
Existing work focuses on a specific type of identifier (class or method) or does not focus on real-world names
Current NLP techniques are not trained to be applied to software
Understanding this connection between name and behavior is challenging for humans and tools
Goal: Study the structure, semantics, diversity, and generation of grammar patterns, including
establishing and exploring the common and diverse grammar pattern structures found in identifiers
METHODOLOGY
Manually curated gold set of grammar patterns
• 20 open-source systems (Java, C++, C)
• Statistically significant sample of 1,335 identifier names (95% confidence level; 6% confidence interval)
• class names, function names, parameter names & attribute names
• Annotated and reviewed by the authors – every identifier reviewed by two annotators
• assigned part-of-speech tags for each word in the name
• Comparison against 3 part-of-speech taggers (Stanford, SWUM, POSSE)
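As a worked check of the sampling figures above, the standard sample-size formula for a proportion, n = z² · p(1 − p) / e², can be evaluated in a few lines of Python. The sketch assumes a very large population, the most conservative proportion p = 0.5, and z = 1.96 for a 95% confidence level. Reading 1,335 as five equally sized samples is an assumption that the slide does not state (it lists four categories); the arithmetic is shown only because the numbers line up.

```python
import math


def sample_size(z, margin, p=0.5):
    """Sample size for estimating a proportion in a very large population."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# 95% confidence level (z = 1.96) and a 6% confidence interval.
per_sample = sample_size(z=1.96, margin=0.06)
print(per_sample)      # 267
print(per_sample * 5)  # 1335
```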
KEY FINDINGS & TAKEAWAYS
Identified five patterns by looking at how frequently they occurred in the annotated dataset
• Most common: noun phrase (NM+ N) pattern
• Most common for methods: verb phrase (V NM+ N)
SWUM had the most agreement with the annotated dataset, followed by POSSE and Stanford
• SWUM: 67.8%, POSSE: 24.7%, Stanford: 26.6%
Part-of-speech taggers still require significant improvements to be effective on identifiers
• Construction of a specialized identifier name part-of-speech tagger
• Incorporation of common grammar patterns for each identifier type in name appraisals and recommendations
int dynamic_Table_Index;
  NM NM N (Noun Modifier, Noun Modifier, Noun)
void save_As_Quadratic_Png();
  V P NM N (Verb, Preposition, Noun Modifier, Noun)
Noun Phrase – Common for identifiers that are not functions, not collections,
and not boolean types
Verb Phrase – Common for function identifiers or
identifiers with a boolean type
Using grammar patterns to interpret test
method name evolution
Peruma, A., Hu, E., Chen, J., AlOmar, E. A., Mkaouer, M. W., & Newman, C. D. (2021, May). Using grammar patterns to interpret test method name
evolution. In 2021 IEEE/ACM 29th International Conference on Program Comprehension (ICPC) (pp. 335-346). IEEE.
INTRODUCTION & GOAL
The purpose of unit test code differs from that of production code, and so do its identifier names
Test method names are constructed to describe both the entity being tested and the
actions taken by the test
Most existing studies focus on production code (and its identifier names) → findings do not generalize
to test code
Goal: Understand how test method names are structured, how they evolve in structure and meaning,
and how the structure/meaning of these names relates to statically verifiable code behavior
METHODOLOGY
• RefactoringMiner – to mine 28 refactoring operation types in the source code
• Test Suites – 12,010 JUnit test files that had undergone a Rename Method refactoring
• Manual Annotation – the authors annotated part-of-speech tags for a statistically significant sample of 632 random test
method names
• Rename Taxonomy – determines the type of form change and semantic change made to an identifier’s name
KEY FINDINGS & TAKEAWAYS
• Test names have a structure that differs from production names; some of this structure can be leveraged
to provide test-specific recommendations
• Existence of grammar patterns that include determiners, prepositions, and adverbs (e.g., +VM+, +DT+, N
V+, V V N P+)
• Methods with a noun phrase grammar pattern (N) are extremely rare; hence, such names are likely poor test method names
• Grammar pattern prefixes are stable; they do not change very often during rename activities
• Test method name refactorings tend to change the meaning of terms in the name; this contrasts with
production names, which tend to narrow in meaning
• Code quality tools/techniques should treat test identifiers differently from production identifiers
• Incorporation of common test method grammar patterns when appraising and recommending names
Verb Verb
Noun
Dual Verb Phrase With Preposition – Preposition
identifies the relationship between the nouns
Divided Verb Phrase – Verb enclosed within nouns, where
the verb is the action applied to the secondary noun
Verb Noun Preposition
Verb Noun Noun
Tool Development
Naming Violation Detection
• Detects 19 types of linguistic anti-patterns
• Provides an explanation of the violation
• Analyzes C# & Java source code
• Supports project-specific customizations
• Average precision: 75.27%
• Open-source
Ensemble Part-of-Speech Tagger
• Tagger uses machine learning and the output from multiple part-of-speech taggers to annotate natural language text
• The ensemble uses three state-of-the-art part-of-speech taggers: SWUM, POSSE, and Stanford
• Accuracy of 86%; outperforms Stanford by 51%
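The idea of combining several taggers can be sketched in Python with a plain majority vote. This is a deliberately simpler technique than the one on the slide, which feeds the taggers' output into a machine-learning model. The function names `ensemble_tag` and `tag_identifier`, the fallback choice, and the per-word votes in the usage lines are all hypothetical.

```python
from collections import Counter


def ensemble_tag(votes, fallback="swum"):
    """Pick the tag most taggers agree on; otherwise trust the fallback tagger.

    votes maps a tagger name to the tag it proposed for one word.
    """
    tag, count = Counter(votes.values()).most_common(1)[0]
    if count > len(votes) / 2:
        return tag
    return votes[fallback]


def tag_identifier(per_word_votes):
    """Combine the votes for every word of an identifier into one pattern."""
    return " ".join(ensemble_tag(votes) for votes in per_word_votes)


# Hypothetical tagger output for the words of dynamic_Table_Index.
votes = [
    {"swum": "NM", "posse": "NM", "stanford": "N"},
    {"swum": "NM", "posse": "N", "stanford": "V"},
    {"swum": "N", "posse": "N", "stanford": "N"},
]
print(tag_identifier(votes))  # NM NM N
```

A learned model can weigh taggers differently per word position or identifier type, which a fixed vote cannot; the sketch only shows where the individual outputs are merged.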
Peruma, A., Arnaoudova, V., & Newman, C. D. (2021, September). Ideal:
An open-source identifier name appraisal tool. In 2021 IEEE
International Conference on Software Maintenance and Evolution
(ICSME) (pp. 599-603). IEEE.
Newman, C. D., Decker, M. J., Alsuhaibani, R., Peruma, A., Mkaouer, M.,
Mohapatra, S., ... & Hill, E. (2021). An ensemble approach for
annotating source code identifiers with part-of-speech tags. IEEE
Transactions on Software Engineering.
Research
Overall Findings
RQ 1: How effectively, in terms of
correctness, can grammar patterns
be automatically generated for
identifier names?
Common identifier naming patterns
Noun Phrase – NM* N
Zero or more noun-modifiers appear to the left of a head-noun
Example: int dynamic_Table_Index; (NM NM N)
Plural Noun Phrase – NM* NPL
Identical to Noun Phrase, except the head-noun is plural
Example: String[] method_Name_Prefixes; (NM NM NPL)
Verb Phrase – V NM* (N|NPL)
The addition of a verb to a noun phrase creates a verb phrase
Example: bool create_metadata_array(); (V NM N)
Verb Pattern – V+
One or more verbs with no noun phrase
Example: void sort(); (V)
Prepositional Phrase – P NM* (N|NPL)
A noun or verb-phrase with a leading preposition
Example: String to_string(); (P N)
Prepositional w/ Verb – V P NM* (N|NPL)
Prepositional phrase with leading verb
Example: string convert_to_php_namespace(); (V P NM N)
Prepositional w/ Noun – NM* N P NM* (N|NPL)
Prepositional phrase with leading noun phrase
Example: long query_Timeout_In_Milliseconds; (NM N P NPL)
Noun w/ Determiner – V* DT NM* (N|NPL)
Noun phrase with leading determiner
Example: String[] all_Open_Indices; (DT NM NPL)
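The patterns on this slide use a regular-expression-like notation over part-of-speech tags, so matching a tag sequence against them can be sketched in Python with the standard `re` module. The pairing of pattern names with notations follows the slide's examples; the helper names `to_regex` and `classify` are hypothetical, and the tag sequence is assumed to come from an external tagger.

```python
import re

# Pattern names and notations as they appear on the slide.
PATTERNS = [
    ("Noun Phrase", "NM* N"),
    ("Plural Noun Phrase", "NM* NPL"),
    ("Verb Phrase", "V NM* (N|NPL)"),
    ("Verb Pattern", "V+"),
    ("Prepositional Phrase", "P NM* (N|NPL)"),
    ("Prepositional w/ Verb", "V P NM* (N|NPL)"),
    ("Prepositional w/ Noun", "NM* N P NM* (N|NPL)"),
    ("Noun w/ Determiner", "V* DT NM* (N|NPL)"),
]


def to_regex(pattern):
    """Translate the slide notation into a regex over space-terminated tags."""
    pieces = []
    for element in pattern.split():
        quantifier = element[-1] if element[-1] in "*+" else ""
        body = element.rstrip("*+").strip("()")
        # Each tag must be followed by a space, so "N" never matches "NM".
        pieces.append("(?:(?:%s) )%s" % (body, quantifier))
    return re.compile("".join(pieces))


COMPILED = [(name, to_regex(pattern)) for name, pattern in PATTERNS]


def classify(tags):
    """Return the names of all patterns that the whole tag sequence matches."""
    text = "".join(tag + " " for tag in tags)
    return [name for name, regex in COMPILED if regex.fullmatch(text)]


print(classify(["NM", "NM", "N"]))     # ['Noun Phrase']
print(classify(["V", "P", "NM", "N"]))  # ['Prepositional w/ Verb']
```

Terminating every tag with a space is a design choice that keeps the translation purely textual while avoiding prefix clashes between N, NM, and NPL.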
Common identifier naming rules
CLASS
Rule(s):
• (Plural) Noun Phrase: NM* N(PL) (e.g., class StringUtility)
METHOD
Rule(s):
• Verb Phrase/Pattern: V NM* N(PL) | V+
• Event Handler or Casting: (.*) P NM* N(PL)
• Looping or Find/Contains: V* DT NM* N(PL)
VARIABLE & PARAMETER
Rule(s):
• Bool Type: V NM* N(PL)
• Non-Collection Type: NM* N
• Collection Type: NM* NPL
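A minimal Python sketch of how these rules could drive an appraisal: each identifier kind maps to the grammar patterns the slide allows for it, and a name is flagged when its tag sequence matches none of them. The dictionary keys, the function name `appraise`, and the hand-translated regular expressions are hypothetical; the tags are assumed to come from an external part-of-speech tagger.

```python
import re

# Slide rules translated by hand into regexes over space-terminated tags.
RULES = {
    "class": [r"(NM )*(N|NPL) "],
    "method": [
        r"V (NM )*(N|NPL) ",          # verb phrase
        r"(V )+",                     # verb pattern
        r"(\S+ )*P (NM )*(N|NPL) ",   # event handler or casting
        r"(V )*DT (NM )*(N|NPL) ",    # looping or find/contains
    ],
    "bool": [r"V (NM )*(N|NPL) "],
    "non_collection": [r"(NM )*N "],
    "collection": [r"(NM )*NPL "],
}


def appraise(kind, tags):
    """Return True when the tag sequence follows a rule for this kind."""
    text = "".join(tag + " " for tag in tags)
    return any(re.fullmatch(rule, text) for rule in RULES[kind])


# A collection-typed variable with a singular head noun is flagged.
print(appraise("collection", ["NM", "N"]))         # False
print(appraise("collection", ["NM", "NM", "NPL"]))  # True
print(appraise("method", ["V", "NM", "N"]))        # True
```

The first usage line corresponds to the anti-pattern shown earlier in the deck: a collection data type paired with a singular identifier name.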
Analyzing the quality of names using grammar patterns
Identifier Phrase Structure != Human Language Phrase Structure
Off-the-shelf NLP tools underperform when analyzing source code
It is challenging to automatically determine the meaning of words in an identifier
and how these words interact with one another
Grammar patterns allow a more efficient analysis by broadly categorizing words
into their corresponding part-of-speech
The Ensemble Tagger is a specialized part-of-speech tagger with a high accuracy
and outperforms state-of-the-art taggers
Developers mostly agree with the proposed grammar pattern heuristics to
appraise identifier names
RQ 2: To what extent did the
automated identifier naming
mechanism positively or negatively
influence naming practices?
Plugin for IntelliJ IDEA
• Construction of an IntelliJ plugin that provides real-time appraisals and
recommendations for identifier grammar patterns
• Utilizes the part-of-speech tagger to generate the identifier’s part-of-speech tags
• The tagger is exposed as a web service that is called from the plugin
[Plugin screenshot. Callouts: selected identifier; problem summary; detected and recommended grammar pattern; problem explanation; a squiggly line indicates a naming problem; summary of all naming problems in the file]
IDE plugin user study
• User study with undergraduate and graduate students
• 20 participants in total
• Two groups of equal size:
• Group A – utilized the plugin
• Group B – control group
• Participants reviewed pre-defined code snippets in IntelliJ IDEA and corrected identifier names
• Code snippets included string manipulation methods and a simple object-oriented program
• Pre- and post-questionnaires
Quantitative participant feedback
• 80% of participants rated the priority they place on part-of-speech tags as either High Priority or Essential
• 80% of participants rated the convenience of having a grammar pattern recommendation tool as either Convenient or Very Convenient
• 70% of participants rated their ability to interpret the recommendations as either Easy or Very Easy
• 90% of participants rated the accuracy of the plugin's recommendations as either Satisfied or Very Satisfied
Qualitative participant feedback
NEGATIVE
• IDE at times is slow or hangs
• The plugin occasionally takes time to
update
• Part-of-speech tags are not known to
everyone
• Not all recommendations are accurate
POSITIVE
• The plugin forces the user to think about
the quality of the identifier's name
• Ensures consistency in identifier naming
• Good resource for novice developers
• Explanation and examples are helpful
• Most of the recommendations were
satisfactory
ENHANCEMENTS
• More examples on recommended patterns
• Definitions for part-of-speech tags
• The UI can improve to make it easier to
navigate to identifiers in the code
RQ 3: What are the primary
challenges in appraising and
recommending the semantic
structure of identifier names, and
how can these be improved?
Types of challenges encountered conducting this research
Tools and Technologies
Prior Research Studies
Lack of specialized tools for software artifacts
Due to the diversity of systems, not all name appraisal and recommendation tools incorporate all
naming rules – leading to inaccurate results (i.e., not a one-stop solution)
The Ensemble Tagger misannotates specific grammar patterns – performs poorly on names
having preambles and elongated verb phrase patterns
Names are diverse and subjective – context plays an essential part in evaluating the quality of a
name; context lies in the code surrounding the identifier
Existing well-established NLP tools (e.g., WordNet, NLTK) perform poorly on software engineering
artifacts, such as source code
Current code quality tools/approaches (e.g., Checkstyle) focus on the styling of a name
Rename recommendation models are prescriptive – they do not provide a rationale for the
recommendation
Dearth of empirical data
Most studies focus on the readability of a name (e.g., readability models look at name styling or
the readability of entire files)
Readability does not always correlate to understandability
Readable names might not accurately reflect intended behavior
Developers are diverse – experience impacts naming and comprehension activities
Lack of empirical studies on how developers structure and comprehend identifier names
Names are composed of diverse words – including abbreviations, acronyms and digits
These tokens also convey meanings, but studies on how and why they are used by developers are
lacking and therefore inhibit our overall understanding of a name
Summary of overall research findings
01 Grammar Pattern Name Appraisal & Recommendation
At a conceptual level, grammar patterns reflect both the linguistic and program behavior and make it possible to provide accurate name appraisals and recommendations
02 Developer Workflow Integration
Developers find an IDE plugin incorporating grammar pattern name appraisals and recommendations both valuable and usable
03 Primary Challenges With Grammar Patterns
Current tools/technologies have shortcomings and do not provide a one-stop solution; a dearth of empirical studies hinders the comparison of findings
Conclusion
Expanding the knowledge on identifier naming
Pattern Detection
• Detect patterns in specific types of systems/code
• Further the understanding of name-code relationships
Developer Experience
• Insight from professional and novice developers on the characteristics of high-quality names
Tool Development
• Incremental improvements to existing tools
• Improving NLP techniques to better understand code
Summary
• High-quality identifiers are essential for program comprehension
• I study the evolution of names and investigate their relationship with statically detectable code behavior
• My work provides developers with tools to craft and maintain high-quality identifier names in their projects
• This is a long-term initiative that will continue post-graduation and into my academic career
Acknowledgements
Advisor: Dr. Christian Newman
Committee: Dr. Mohamed Mkaouer, Dr. Mehdi Mirakhorli, Dr. Marcos Zampieri
Chair: Dr. Robert Glick
Faculty & Staff:
Department of Computing and Information Sciences
Department of Software Engineering
Collaborators, Colleagues & SE Sr. Design Teams
Friends & Family
Anthony Peruma
https://www.peruma.me
Supporting the Maintenance of Identifier Names: A Holistic Approach to High-Quality Automated Identifier Naming

More Related Content

Similar to Supporting the Maintenance of Identifier Names: A Holistic Approach to High-Quality Automated Identifier Naming

Thesis+of+laleh+eshkevari.ppt
Thesis+of+laleh+eshkevari.pptThesis+of+laleh+eshkevari.ppt
Thesis+of+laleh+eshkevari.pptPtidej Team
 
Butler
ButlerButler
Butleranesah
 
Using Grammar Patterns to Interpret Test Method Name Evolution
Using Grammar Patterns to Interpret Test Method Name EvolutionUsing Grammar Patterns to Interpret Test Method Name Evolution
Using Grammar Patterns to Interpret Test Method Name EvolutionUniversity of Hawai‘i at Mānoa
 
130817 latifa guerrouj - context-aware source code vocabulary normalization...
130817   latifa guerrouj - context-aware source code vocabulary normalization...130817   latifa guerrouj - context-aware source code vocabulary normalization...
130817 latifa guerrouj - context-aware source code vocabulary normalization...Ptidej Team
 
Natural Language Processing Through Different Classes of Machine Learning
Natural Language Processing Through Different Classes of Machine LearningNatural Language Processing Through Different Classes of Machine Learning
Natural Language Processing Through Different Classes of Machine Learningcsandit
 
Data analysis – using computers
Data analysis – using computersData analysis – using computers
Data analysis – using computersNoonapau
 
Paper id 28201441
Paper id 28201441Paper id 28201441
Paper id 28201441IJRAT
 
Data analysis – using computers
Data analysis – using computersData analysis – using computers
Data analysis – using computersNoonapau
 
Data analysis – using computers for presentation
Data analysis – using computers for presentationData analysis – using computers for presentation
Data analysis – using computers for presentationNoonapau
 
Supporting program comprehension with source code summarization
Supporting program comprehension with source code summarizationSupporting program comprehension with source code summarization
Supporting program comprehension with source code summarizationMasud Rahman
 
IRJET- Semantic Question Matching
IRJET- Semantic Question MatchingIRJET- Semantic Question Matching
IRJET- Semantic Question MatchingIRJET Journal
 
Analysing the concept of quality in model-driven engineering literature: a sy...
Analysing the concept of quality in model-driven engineering literature: a sy...Analysing the concept of quality in model-driven engineering literature: a sy...
Analysing the concept of quality in model-driven engineering literature: a sy...Fáber D. Giraldo
 
2014 IEEE JAVA SOFTWARE ENGINEERING PROJECT Repent analyzing the nature of id...
2014 IEEE JAVA SOFTWARE ENGINEERING PROJECT Repent analyzing the nature of id...2014 IEEE JAVA SOFTWARE ENGINEERING PROJECT Repent analyzing the nature of id...
2014 IEEE JAVA SOFTWARE ENGINEERING PROJECT Repent analyzing the nature of id...IEEEBEBTECHSTUDENTSPROJECTS
 
Knowledge Graph and Similarity Based Retrieval Method for Query Answering System
Knowledge Graph and Similarity Based Retrieval Method for Query Answering SystemKnowledge Graph and Similarity Based Retrieval Method for Query Answering System
Knowledge Graph and Similarity Based Retrieval Method for Query Answering SystemIRJET Journal
 
Thesis+of+zohreh+sharafi.ppt
Thesis+of+zohreh+sharafi.pptThesis+of+zohreh+sharafi.ppt
Thesis+of+zohreh+sharafi.pptPtidej Team
 
Model Manipulation for End-User Modelers
Model Manipulation for End-User ModelersModel Manipulation for End-User Modelers
Model Manipulation for End-User ModelersVlad Acretoaie
 
[2015/2016] RESEARCH in software engineering
[2015/2016] RESEARCH in software engineering[2015/2016] RESEARCH in software engineering
[2015/2016] RESEARCH in software engineeringIvano Malavolta
 
Question Classification using Semantic, Syntactic and Lexical features
Question Classification using Semantic, Syntactic and Lexical featuresQuestion Classification using Semantic, Syntactic and Lexical features
Question Classification using Semantic, Syntactic and Lexical featuresIJwest
 
Question Classification using Semantic, Syntactic and Lexical features
Question Classification using Semantic, Syntactic and Lexical featuresQuestion Classification using Semantic, Syntactic and Lexical features
Question Classification using Semantic, Syntactic and Lexical featuresdannyijwest
 
Named Entity Recognition using Tweet Segmentation
Named Entity Recognition using Tweet SegmentationNamed Entity Recognition using Tweet Segmentation
Named Entity Recognition using Tweet SegmentationIRJET Journal
 

Supporting the Maintenance of Identifier Names: A Holistic Approach to High-Quality Automated Identifier Naming

  • 1. Supporting the Maintenance of Identifier Names: A Holistic Approach to High- Quality Automated Identifier Naming Anthony Peruma June 28, 2022 B. Thomas Golisano College of Computing and Information Sciences Ph.D. Dissertation Presentation Dissertation Committee: Dr. Mohamed Mkaouer, Dr. Mehdi Mirakhorli & Dr. Marcos Zampieri Dissertation Advisor: Dr. Christian Newman Dissertation Defense Chair: Dr. Robert Glick
  • 2. Agenda ~ Martin Fowler, 1999 “Any fool can write code that a computer can understand. Good programmers write code that humans can understand.” 01 Introduction ➢ Context ➢ Current Research & Challenges 02 Research ➢ Goal & Research Questions ➢ Completed Studies ➢ Overall Findings 03 Conclusion ➢ Future Work ➢ Summary
  • 4. The importance of identifier names 4 Identifier Names Lexical tokens that uniquely identify elements in the source code (classes, methods, etc.) Names account for 70% of characters in the code base Software Maintenance Every software system undergoes maintenance (corrective, adaptive, preventive, perfective) Consumes 60% - 80% of organization resources Program Comprehension A precursor to any maintenance task Developers spend 58% of their time on program comprehension activities Poor code readability impacts time and quality
  • 5. Lexical tokens that uniquely identify entities 5 class name attribute name parameter name method name variable name Responsible for saving/writing results of an operation Responsible for writing the output which comes in as a parameter
  • 6. Some poor-quality names are easy to spot… 6 Unreadable method name Generic name
  • 7. … others are not so straightforward! 7 Readable attribute name Anti-Pattern: Collection data type, singular identifier name Readable method name Anti-Pattern: method name suggests transformation, but no return type
  • 9. 9 IDENTIFIER NAMING IS HARD CODING STANDARDS & STYLE GUIDES Provide heuristics about the overall readability of a class. They do not produce strong names, nor can they provide lexical structure recommendations. RENAMING Renames account for over 40% of the rework developers perform. Renames do not guarantee a strong name.
  • 10. Challenges with renaming 10 A “rename chain” - multiple instances of developers renaming an identifier Is the final name high-quality?
  • 11. 11 IDENTIFIER NAMING IS HARD CODING STANDARDS & STYLE GUIDES Provide heuristics about the overall readability of a class. They do not produce strong names, nor can they provide lexical structure recommendations. NAME RECOMMENDATION MODELS Models are prescriptive, not descriptive. Model is built based on the existing code styles and does not consider pre-existing poor identifiers. Works only on method names. Context sensitivity is a challenge. RENAMING Renames account for over 40% of the rework developers perform. Renames do not guarantee a strong name.
  • 12. 12 IDENTIFIER NAMING IS HARD CODING STANDARDS & STYLE GUIDES Provide heuristics about the overall readability of a class. They do not produce strong names, nor can they provide lexical structure recommendations. NAME RECOMMENDATION MODELS Models are prescriptive, not descriptive. Model is built based on the existing code styles and does not consider pre-existing poor identifiers. Works only on method names. Context sensitivity is a challenge. RENAMING Renames account for over 40% of the rework developers perform. Renames do not guarantee a strong name. NLP TECHNOLOGY Current technology is built for English prose, not source code (e.g., Stanford POS tagger); domain/technology terms pose a challenge. Names are diverse, and so are the developers who craft these names -- a one-stop solution is very challenging!
  • 13. Over 30 years of research in identifier naming 13 Multiple Research Streams Identifier Renaming Identifier Name Quality Naming styles, metrics, models, linguistic anti-patterns, grammar patterns Challenges with current approaches Name quality is a threat to downstream approaches Even with over 30 years of research, we do not have a way to measure strong identifier names.
  • 15. Improving the developer code comprehension experience through novel automated mechanisms in identifier name appraisals and recommendations GOAL
  • 16. Grammar Patterns 16 A grammar pattern is the sequence of part-of-speech tags assigned to individual words within an identifier. Part-of-speech is a category to which a word is assigned in accordance with its syntactic functions. • In English, the main parts of speech are noun, pronoun, adjective, determiner, verb, adverb, preposition, conjunction, and interjection. Examples: int dynamic_Table_Index; has the pattern Noun Modifier (NM), Noun Modifier (NM), Noun (N). void save_As_Quadratic_Png(); has the pattern Verb (V), Preposition (P), Noun Modifier (NM), Noun (N).
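The grammar-pattern idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `TOY_LEXICON`, `split_identifier`, and `grammar_pattern` are hypothetical names, and the hand-written lexicon stands in for a real part-of-speech tagger. It is not the tagger developed in the dissertation.

```python
import re

# Hypothetical, hand-written lexicon covering only the two slide examples.
# A real tagger would be trained on annotated identifiers.
TOY_LEXICON = {
    "dynamic": "NM", "table": "NM", "index": "N",
    "save": "V", "as": "P", "quadratic": "NM", "png": "N",
}

# Matches acronym runs (HTTP), capitalized or lowercase words, and digit runs.
WORD_RE = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def split_identifier(name):
    """Split an identifier on underscores and camelCase boundaries."""
    words = []
    for chunk in name.split("_"):
        words.extend(WORD_RE.findall(chunk))
    return [w.lower() for w in words]


def grammar_pattern(name, lexicon=TOY_LEXICON):
    """Return the sequence of tags for each word; '?' marks unknown words."""
    return " ".join(lexicon.get(word, "?") for word in split_identifier(name))


if __name__ == "__main__":
    for identifier in ("dynamic_Table_Index", "save_As_Quadratic_Png"):
        print(identifier, "->", grammar_pattern(identifier))
```

The lookup is deliberately context-free, which is exactly where a lexicon falls short: a word such as "index" can be a noun or a verb depending on its position in the name.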
  • 17. Research Questions • RQ 1: How effectively, in terms of correctness, can grammar patterns be automatically generated for identifier names? • RQ 2: To what extent did the automated identifier naming mechanism positively or negatively influence naming practices? • RQ 3: What are the primary challenges in appraising and recommending the semantic structure of identifier names, and how can these be improved? 17
  • 18. Incorporating automated support for identifier name maintenance into the developer workflow 18
  • 19. Incorporating automated support for identifier name maintenance into the developer workflow 19 Squiggly Line Indicates A Naming Problem Summary of All Naming Problems
  • 20. Incorporating automated support for identifier name maintenance into the developer workflow 20 Selected Identifier Problem Summary Detected & Recommended Grammar Pattern Problem Explanation
  • 21. Research Focus Areas 21 Identifier Name Evolution Identifier Name Tool Development Developer Workflow Integration • Rename Prevalence • Semantic Evolution • Contextualization • Grammar Patterns • Abbreviation Expansion • Rename Semantic Detection • Linguistic Anti-Pattern Detection • Identifier Part-of-Speech Tagger • Developer Experience
  • 23. An empirical investigation of how and why developers rename identifiers INTRODUCTION & GOAL Current work in the field does not examine the evolution of the name Most of these studies do not provide empirical data – mostly conceptual The study extends a portion of the work done by Arnaoudova et al. to a much larger number of systems Lays the groundwork for understanding how the semantics of a name evolves Goal: Explore the volume of rename refactoring operations developers apply and changes to the structure of the renamed identifier names 23 Peruma, A., Mkaouer, M. W., Decker, M. J., & Newman, C. D. (2018, September). An empirical investigation of how and why developers rename identifiers. In Proceedings of the 2nd International Workshop on Refactoring (pp. 26-33).
  • 24. An empirical investigation of how and why developers rename identifiers METHODOLOGY Empirical study on 3,795 open-source Java systems • RefactoringMiner to mine rename refactoring operations • Rename Taxonomy – determines the type of form and semantic change to an identifier’s name • NLP Tools – including NLTK to determine the semantics of a name • Topic Modeling – on the rename refactoring commit messages to determine Data Overview: 1M+ refactoring operations → 43.36% rename refactorings 24 Peruma, A., Mkaouer, M. W., Decker, M. J., & Newman, C. D. (2018, September). An empirical investigation of how and why developers rename identifiers. In Proceedings of the 2nd International Workshop on Refactoring (pp. 26-33).
  • 25. An empirical investigation of how and why developers rename identifiers KEY FINDINGS & TAKEAWAYS Renames form the bulk of the rework developers perform when refactoring their code Developers mostly perform simple renames – either add or remove a single term in a name Narrowing the meaning of the name is frequently done by the developer during a rename A strong correlation between grammar changes and meaning preservation of the identifier's name Topic modeling of rename commit messages results in high-level topics - difficult to pinpoint the developer’s intention 25 Peruma, A., Mkaouer, M. W., Decker, M. J., & Newman, C. D. (2018, September). An empirical investigation of how and why developers rename identifiers. In Proceedings of the 2nd International Workshop on Refactoring (pp. 26-33). • Exploratory study showing the viability of using the semantic structure of names to determine the quality of the name • Scope for constructing specialized NLP tools for software engineering artifacts
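The finding that most renames add or remove a single term can be illustrated by comparing the term lists of the old and new names. The sketch below is a simplified stand-in, not the rename taxonomy used in the study: `classify_rename` and its category labels are hypothetical, and only the form of the name is compared, not its meaning.

```python
import re
from collections import Counter

# Matches acronym runs, capitalized or lowercase words, and digit runs.
WORD_RE = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def split_terms(name):
    """Split an identifier into lowercase terms (underscores and camelCase)."""
    terms = []
    for chunk in name.split("_"):
        terms.extend(WORD_RE.findall(chunk))
    return [t.lower() for t in terms]


def classify_rename(old_name, new_name):
    """Label a rename by how many terms were added or removed (hypothetical labels)."""
    old_terms = Counter(split_terms(old_name))
    new_terms = Counter(split_terms(new_name))
    added = sum((new_terms - old_terms).values())
    removed = sum((old_terms - new_terms).values())
    if added == 0 and removed == 0:
        return "formatting-or-reorder"   # same terms, different style or order
    if removed == 0:
        return "add-term" if added == 1 else "add-terms"
    if added == 0:
        return "remove-term" if removed == 1 else "remove-terms"
    return "change-terms"                # terms both added and removed


if __name__ == "__main__":
    print(classify_rename("getUser", "getActiveUser"))
    print(classify_rename("saveOutputFile", "saveFile"))
```

Deciding whether such a change narrows or broadens the meaning of the name would need semantic information on top of this form-level comparison.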
  • 26. INTRODUCTION & GOAL Contextualizing rename decisions using refactorings, commit messages, and data types Existing research on identifer naming does not investigate how names evolve and how these changes correlate with changes made to source code Help determine when/how to rename identifiers and to understand more about developer naming mental models Goal: Understand how surrounding code and development activities influence the structure and meaning of an identifier’s name • Data Types – have strong influence over the data and behavior represented by an identifier • Refactorings – changes made before or after a rename have a relationship with the rename itself 26 Peruma, A., Mkaouer, M. W., Decker, M. J., & Newman, C. D. (2020). Contextualizing rename decisions using refactorings, commit messages, and data types. Journal of Systems and Software, 169, 110704.
  • 27. METHODOLOGY Contextualizing rename decisions using refactorings, commit messages, and data types • RefactoringMiner to mine 28 refactoring operation types in the source code • Rename Taxonomy – determines the type of form and semantic change an identifier’s name • Static Analysis – extract the data type for an identifier • NLP Tools – including NLTK to determine the semantics of a name • Developer Experience – measured using the amounts of commits performed on source code Data Overview: 748,001 commits  711,495 refactoring operations  53.51% refactorings are renames 27 Peruma, A., Mkaouer, M. W., Decker, M. J., & Newman, C. D. (2020). Contextualizing rename decisions using refactorings, commit messages, and data types. Journal of Systems and Software, 169, 110704.
  • 28. Contextualizing rename decisions using refactorings, commit messages, and data types KEY FINDINGS & TAKEAWAYS • Novice developers perform rename refactorings more frequently than other types of refactoring operations • A Rename Attribute usually follows a Move Attribute • When a rename follows another rename, the developer reverts to the original name • Developers frequently change the semantic meaning of an identifier's name after a refactoring • Renames that involve a change to the type name tend to also involve identifiers whose names exactly match their type • A data type change to a collection causes a name to change to plural
  • 29. Contextualizing rename decisions using refactorings, commit messages, and data types KEY FINDINGS & TAKEAWAYS • Narrowing the name of the type narrows the meaning of the identifier's name • Collection data types are associated with plural identifier names • Takeaway: name quality tools should consider the experience of the developer when presenting results • Takeaway: incorporate code and name relationship heuristics (e.g., data types and plurality) into automated name appraisals and recommendations
  • 30. On the generation, structure, and semantics of grammar patterns in source code identifiers INTRODUCTION & GOAL • Existing work focuses on a specific type of identifier (class or method) or does not focus on real-world names • Current NLP techniques are not trained to be applied to software • Understanding the connection between name and behavior is challenging for humans and tools • Goal: study the structure, semantics, diversity, and generation of grammar patterns, including establishing and exploring the common and diverse grammar pattern structures found in identifiers Newman, C. D., AlSuhaibani, R. S., Decker, M. J., Peruma, A., Kaushik, D., Mkaouer, M. W., & Hill, E. (2020). On the generation, structure, and semantics of grammar patterns in source code identifiers. Journal of Systems and Software, 170, 110740.
  • 31. On the generation, structure, and semantics of grammar patterns in source code identifiers METHODOLOGY Manually curated gold set of grammar patterns • 20 open-source systems (Java, C++, C) • Statistically significant sample of 1,335 identifier names (95% confidence level; 6% confidence interval) • Class names, function names, parameter names, and attribute names • Annotated and reviewed by the authors; every identifier was reviewed by two annotators • Part-of-speech tags were assigned to each word in the name • Comparison against 3 part-of-speech taggers (Stanford, SWUM, POSSE)
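The slide gives the confidence level and interval but not how the sample size was derived. As a rough sketch, assuming the standard Cochran formula for a large population with maximum variability (p = 0.5), the minimum sample for one group can be computed as below. The function name `sample_size` is hypothetical, and whether the authors used this exact formula is an assumption.

```python
import math
from statistics import NormalDist


def sample_size(confidence_level, margin_of_error, proportion=0.5):
    """Cochran's formula n = z^2 * p * (1 - p) / e^2, rounded up.

    Assumes a large (effectively infinite) population; no finite-population
    correction is applied.
    """
    # Two-sided z-score, e.g. about 1.96 for a 95% confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    n = (z ** 2) * proportion * (1 - proportion) / (margin_of_error ** 2)
    return math.ceil(n)
```

Under these assumptions, `sample_size(0.95, 0.06)` gives 267 names per group. Note that 1,335 = 5 × 267; the slide does not say whether the sample was drawn per group, so this is only an observation about the arithmetic.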
  • 32. On the generation, structure, and semantics of grammar patterns in source code identifiers KEY FINDINGS & TAKEAWAYS • Five patterns were identified based on how frequently they occurred in the annotated dataset • Most common: noun phrase (NM+ N) pattern • Most common for methods: verb phrase (V NM+ N) • SWUM had the most agreement with the annotated dataset, followed by POSSE and Stanford (SWUM: 67.8%, POSSE: 24.7%, Stanford: 26.6%) • Part-of-speech taggers still require significant improvements to be effective on identifiers
  • 33. On the generation, structure, and semantics of grammar patterns in source code identifiers KEY FINDINGS & TAKEAWAYS • Construction of a specialized identifier name part-of-speech tagger • Incorporation of common grammar patterns for each identifier type in name appraisals and recommendations • Noun Phrase: common for identifiers that are not functions, not collections, and not boolean types. Example: int dynamic_Table_Index; with dynamic = Noun Modifier (NM), Table = Noun Modifier (NM), Index = Noun (N) • Verb Phrase: common for function identifiers or identifiers with a boolean type. Example: void save_As_Quadratic_Png(); with save = Verb (V), As = Preposition (P), Quadratic = Noun Modifier (NM), Png = Noun (N)
  • 34. Using grammar patterns to interpret test method name evolution INTRODUCTION & GOAL • The purpose of unit test code differs from that of production code, and so do its identifier names • Test method names are constructed to describe both the entity being tested and the actions taken by the test • Most existing studies focus on production code (and its identifier names) → findings do not generalize to test code • Goal: understand how test method names are structured, how they evolve in structure and meaning, and how the structure/meaning of these names relates to statically verifiable code behavior Peruma, A., Hu, E., Chen, J., AlOmar, E. A., Mkaouer, M. W., & Newman, C. D. (2021, May). Using grammar patterns to interpret test method name evolution. In 2021 IEEE/ACM 29th International Conference on Program Comprehension (ICPC) (pp. 335-346). IEEE.
  • 35. Using grammar patterns to interpret test method name evolution METHODOLOGY • RefactoringMiner: mines 28 refactoring operation types in the source code • Test suites: 12,010 JUnit test files that had undergone a Rename Method refactoring • Manual annotation: the authors annotated part-of-speech tags for a statistically significant random sample of 632 test method names • Rename taxonomy: determines the type of form and semantic change an identifier's name undergoes
  • 36. Using grammar patterns to interpret test method name evolution KEY FINDINGS & TAKEAWAYS • Test names have a structure that differs from production names; some of this structure can be leveraged to provide test-specific recommendations • Grammar patterns exist that include determiners, prepositions, and adverbs (e.g., +VM+, +DT+, N V+, V V N P+) • Methods with a noun phrase grammar pattern (N) are extremely rare and hence likely indicate poor test method names • Grammar pattern prefixes are stable; they do not change very often during rename activities • Test method name refactorings tend to change the meaning of terms in the name; this contrasts with production names, which tend to narrow in meaning
  • 37. Using grammar patterns to interpret test method name evolution KEY FINDINGS & TAKEAWAYS • Code quality tools/techniques should treat test identifiers differently from production identifiers • Incorporation of common test method grammar patterns when appraising and recommending names • Dual Verb Phrase With Preposition: the preposition identifies the relationship between the nouns • Divided Verb Phrase: a verb enclosed within nouns, where the verb is the action applied to the secondary noun [Figure: example test method names annotated with Verb, Noun, and Preposition tags]
  • 38. Tool Development • Naming Violation Detection (IDEAL): detects 19 types of linguistic anti-patterns; provides an explanation of the violation; analyzes C# and Java source code; supports project-specific customizations; average precision of 75.27%; open-source • Ensemble Part-of-Speech Tagger: uses machine learning and the output from multiple part-of-speech taggers to annotate natural language text; the ensemble uses three state-of-the-art part-of-speech taggers (SWUM, POSSE, and Stanford); accuracy of 86%; outperforms Stanford by 51% Peruma, A., Arnaoudova, V., & Newman, C. D. (2021, September). Ideal: An open-source identifier name appraisal tool. In 2021 IEEE International Conference on Software Maintenance and Evolution (ICSME) (pp. 599-603). IEEE. Newman, C. D., Decker, M. J., Alsuhaibani, R., Peruma, A., Mkaouer, M., Mohapatra, S., ... & Hill, E. (2021). An ensemble approach for annotating source code identifiers with part-of-speech tags. IEEE Transactions on Software Engineering.
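To make the idea of combining several taggers concrete, here is a deliberately simplified sketch. The Ensemble Tagger described on the slide uses machine learning over the taggers' outputs; the code below swaps that for plain majority voting, which is not the published approach. The function name `majority_vote`, the tie-breaking rule, and the tagger outputs in the usage note are all hypothetical.

```python
from collections import Counter


def majority_vote(outputs, tie_breaker="SWUM"):
    """Combine per-word tags from several taggers by majority vote.

    outputs maps a tagger name to its list of tags, one tag per word.
    When no tag wins outright, the tie_breaker tagger's tag is used
    (an assumption for this sketch, not part of the real ensemble).
    """
    lengths = {len(tags) for tags in outputs.values()}
    if len(lengths) != 1:
        raise ValueError("all taggers must tag the same number of words")

    result = []
    for i in range(lengths.pop()):
        counts = Counter(tags[i] for tags in outputs.values())
        (top_tag, top_count), *rest = counts.most_common()
        if rest and rest[0][1] == top_count:
            # No clear winner for this word: defer to the designated tagger.
            top_tag = outputs[tie_breaker][i]
        result.append(top_tag)
    return result
```

With made-up outputs such as `{"SWUM": ["V", "NM", "N"], "POSSE": ["V", "N", "N"], "Stanford": ["N", "NM", "N"]}`, each word receives the tag that at least two taggers agree on. A learned model can weigh taggers differently per context, which simple voting cannot.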
  • 40. RQ 1: How effectively, in terms of correctness, can grammar patterns be automatically generated for identifier names?
  • 41. Common identifier naming patterns
      • Noun Phrase, NM* N: zero or more noun-modifiers appear to the left of a head-noun. Example: int dynamic_Table_Index; (NM NM N)
      • Plural Noun Phrase, NM* NPL: identical to Noun Phrase, except the head-noun is plural. Example: String[] method_Name_Prefixes; (NM NM NPL)
      • Verb Pattern, V+: one or more verbs with no noun phrase. Example: void sort(); (V)
      • Verb Phrase, V NM* (N|NPL): the addition of a verb to a noun phrase creates a verb phrase. Example: bool create_metadata_array(); (V NM N)
      • Prepositional Phrase, P NM* (N|NPL): a noun or verb phrase with a leading preposition. Example: String to_string(); (P N)
      • Prepositional w/ Verb, V P NM* (N|NPL): prepositional phrase with a leading verb. Example: string convert_to_php_namespace(); (V P NM N)
      • Prepositional w/ Noun, NM* N P NM* (N|NPL): prepositional phrase with a leading noun phrase. Example: long query_Timeout_In_Milliseconds; (NM N P NPL)
      • Noun w/ Determiner, V* DT NM* (N|NPL): noun phrase with a leading determiner. Example: String[] all_Open_Indices; (DT NM NPL)
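The patterns on this slide are regular expressions over part-of-speech tags, so a tag sequence can be classified mechanically. The sketch below assumes the eight patterns pair with the eight pattern names as Noun Phrase (NM* N), Plural Noun Phrase (NM* NPL), Verb Pattern (V+), Verb Phrase (V NM* (N|NPL)), Prepositional Phrase (P NM* (N|NPL)), Prepositional w/ Verb (V P NM* (N|NPL)), Prepositional w/ Noun (NM* N P NM* (N|NPL)), and Noun w/ Determiner (V* DT NM* (N|NPL)). It also assumes the tags have already been produced by a tagger. The helper names `compile_pattern` and `classify` are hypothetical.

```python
import re

# Pattern name -> grammar pattern, as read from the slide.
PATTERNS = {
    "Noun Phrase": "NM* N",
    "Plural Noun Phrase": "NM* NPL",
    "Verb Pattern": "V+",
    "Verb Phrase": "V NM* (N|NPL)",
    "Prepositional Phrase": "P NM* (N|NPL)",
    "Prepositional w/ Verb": "V P NM* (N|NPL)",
    "Prepositional w/ Noun": "NM* N P NM* (N|NPL)",
    "Noun w/ Determiner": "V* DT NM* (N|NPL)",
}


def compile_pattern(pattern):
    """Turn 'NM* (N|NPL)' into a regex over tags that each end with a space."""
    parts = []
    for token in pattern.split():
        quantifier = token[-1] if token[-1] in "*+" else ""
        body = token[:len(token) - len(quantifier)].strip("()")
        # Each tag is matched together with its trailing space, so 'N'
        # can never match the start of 'NM' or 'NPL'.
        parts.append(f"(?:(?:{body}) ){quantifier}")
    return re.compile("".join(parts))


COMPILED = {name: compile_pattern(p) for name, p in PATTERNS.items()}


def classify(tags):
    """Return the names of all patterns that the whole tag sequence matches."""
    encoded = "".join(tag + " " for tag in tags)
    return [name for name, regex in COMPILED.items() if regex.fullmatch(encoded)]
```

For instance, the tags NM NM N for `dynamic_Table_Index` classify as a Noun Phrase, and V P NM N for `convert_to_php_namespace` classify as Prepositional w/ Verb. An empty result means the name fits none of the common patterns, which a tool could surface for review.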
  • 42. Common identifier naming rules
      • CLASS: (Plural) Noun Phrase: NM* N(PL) (e.g., class StringUtility)
      • METHOD: Verb Phrase/Pattern: V NM* N(PL) | V+; Event Handler or Casting: (.*) P NM* N(PL); Looping or Find/Contains: V* DT NM* N(PL)
      • VARIABLE & PARAMETER: Bool Type: V NM* N(PL); Non-Collection Type: NM* N; Collection Type: NM* NPL
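As a minimal sketch of how the variable and parameter rules could drive an appraisal, the code below picks the expected pattern from the declared type and checks a tag sequence against it. It assumes the tags and the type information (boolean, collection) are supplied by other tooling; `appraise_variable` and `VARIABLE_RULES` are hypothetical names and not part of the plugin or IDEAL.

```python
import re

# Expected grammar pattern per variable kind, following the slide's rules.
# Each entry holds the human-readable pattern and a regex over
# space-separated tags.
VARIABLE_RULES = {
    "boolean": ("V NM* N(PL)", re.compile(r"V (?:NM )*(?:N|NPL)")),
    "collection": ("NM* NPL", re.compile(r"(?:NM )*NPL")),
    "other": ("NM* N", re.compile(r"(?:NM )*N")),
}


def appraise_variable(tags, is_boolean=False, is_collection=False):
    """Return (conforms, expected_pattern) for a variable's tag sequence."""
    if is_boolean:
        kind = "boolean"
    elif is_collection:
        kind = "collection"
    else:
        kind = "other"
    expected, regex = VARIABLE_RULES[kind]
    conforms = regex.fullmatch(" ".join(tags)) is not None
    return conforms, expected
```

A collection named with a singular head-noun, tagged NM N, fails the check and the expected pattern NM* NPL can be shown to the developer as the recommendation.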
  • 43. Analyzing the quality of names using grammar patterns • Identifier phrase structure != human language phrase structure • Off-the-shelf NLP tools underperform when analyzing source code • It is challenging to automatically determine the meaning of words in an identifier and how these words interact with one another • Grammar patterns allow a more efficient analysis by broadly categorizing words into their corresponding part of speech • The Ensemble Tagger is a specialized part-of-speech tagger with high accuracy that outperforms state-of-the-art taggers • Developers mostly agree with the proposed grammar pattern heuristics for appraising identifier names
  • 44. RQ 2: To what extent did the automated identifier naming mechanism positively or negatively influence naming practices?
  • 45. Plugin for IntelliJ IDEA • Construction of an IntelliJ plugin that provides real-time appraisals and recommendations for identifier grammar patterns • Utilizes the part-of-speech tagger to generate the identifier's part-of-speech tags • The tagger is exposed as a web service that is called from the plugin
  • 46. [Figure: plugin user interface. Callouts: Selected Identifier; Problem Summary; Detected & Recommended Grammar Pattern; Problem Explanation; Squiggly Line Indicates a Naming Problem; Summary of All Naming Problems in the File]
  • 47. IDE plugin user study • User study with undergraduate and graduate students • 20 participants in total, in two groups of equal size: Group A utilized the plugin; Group B was the control group • Participants reviewed pre-defined code snippets in IntelliJ IDEA and corrected identifier names • Code snippets included string manipulation methods and a simple object-oriented program • Pre- and post-questionnaires
  • 48. Quantitative participant feedback 80% 80% 70% 90% of participants rated the priority they place on part-of-speech tags as either High Priority or Essential of participants rated the convenience of having a grammar pattern recommendation tool as either Convenient or Very Convenient of participants rated their ability to interpret the recommendations as either Easy or Very Easy of participants rated the accuracy of the plugin's recommendations as either Satisfied or Very Satisfied
  • 49. Qualitative participant feedback • NEGATIVE: the IDE at times is slow or hangs; the plugin occasionally takes time to update; part-of-speech tags are not known to everyone; not all recommendations are accurate • POSITIVE: the plugin forces the user to think about the quality of the identifier's name; ensures consistency in identifier naming; good resource for novice developers; explanations and examples are helpful; most of the recommendations were satisfactory • ENHANCEMENTS: more examples of recommended patterns; definitions for part-of-speech tags; the UI can be improved to make it easier to navigate to identifiers in the code
  • 50. RQ 3: What are the primary challenges in appraising and recommending the semantic structure of identifier names, and how can these be improved?
  • 51. Types of challenges encountered conducting this research: Tools and Technologies; Prior Research Studies
  • 52. Lack of specialized tools for software artifacts • Due to the diversity of systems, not all name appraisal and recommendation tools incorporate all naming rules, leading to inaccurate results (i.e., not a one-stop solution) • The Ensemble Tagger misannotates specific grammar patterns; it performs poorly on names having preambles and elongated verb phrase patterns • Names are diverse and subjective; context plays an essential part in evaluating the quality of a name, and that context lies in the code surrounding the identifier • Existing well-established NLP tools (e.g., WordNet, NLTK) perform poorly on software engineering artifacts, such as source code • Current code quality tools/approaches (e.g., checkstyle) focus on the styling of a name • Rename recommendation models are prescriptive; they do not provide a rationale for the recommendation
  • 53. Dearth of empirical data • Most studies focus on the readability of a name, e.g., readability models look at name styling or the readability of entire files • Readability does not always correlate with understandability • Readable names might not accurately reflect intended behavior • Developers are diverse; experience impacts naming and comprehension activities • There is a lack of empirical studies on how developers structure and comprehend identifier names • Names are composed of diverse words, including abbreviations, acronyms, and digits • These tokens also convey meaning, but studies on how and why developers use them are lacking, which inhibits our overall understanding of a name
  • 54. Summary of overall research findings • 01 Grammar Pattern Name Appraisal & Recommendation: at a conceptual level, grammar patterns reflect both linguistic and program behavior, making it possible to provide accurate name appraisals and recommendations • 02 Developer Workflow Integration: developers find an IDE plugin incorporating grammar pattern name appraisals and recommendations both valuable and usable • 03 Primary Challenges With Grammar Patterns: current tools/technologies have shortcomings and do not provide a one-stop solution; a dearth of empirical studies hinders the comparison of findings
  • 56. Expanding the knowledge on identifier naming • Pattern Detection: detect patterns in specific types of systems/code; further the understanding of name-code relationships • Developer Experience: insight from professional and novice developers on the characteristics of high-quality names • Tool Development: incremental improvements to existing tools; improving NLP techniques to better understand code
  • 57. Summary • High-quality identifiers are essential for program comprehension • I study the evolution of names and investigate their relationship with statically detectable code behavior • My work provides developers with tools to craft and maintain high-quality identifier names in their projects • This is a long-term initiative that will continue post-graduation and into my academic career
  • 58. Ph.D. 100% Acknowledgements • Advisor: Dr. Christian Newman • Committee: Dr. Mohamed Mkaouer, Dr. Mehdi Mirakhorli & Dr. Marcos Zampieri • Chair: Dr. Robert Glick • Faculty & Staff: Department of Computing and Information Sciences; Department of Software Engineering • Collaborators, Colleagues & SE Sr. Design Teams • Friends & Family Supporting the Maintenance of Identifier Names: A Holistic Approach to High-Quality Automated Identifier Naming Anthony Peruma https://www.peruma.me June 28, 2022