The Application of Parameterized Hierarchy Templates for Automated Program Code Defect-Fixing
Artyom Aleksyuk, Vladimir Itsykson, Peter the Great Saint Petersburg Polytechnic University, Saint Petersburg
12 - 14 November 2015
Tools and Methods of Program Analysis in St. Petersburg
TMPA-2015: The Application of Parameterized Hierarchy Templates for Automated Program Code Defect-Fixing
1. application of hierarchical parameterized templates for automated software error correction
Application of hierarchical parameterized template technology
for automated correction of errors in program code
Artyom Aleksyuk, Vladimir Itsykson
Nov 12, 2015
Peter the Great St. Petersburg Polytechnic University
2. introduction
• Wide use of software systems
• Important areas
• Validation and verification of software
• Static analysis
• Why not try to fix found bugs?
3. existing approaches and tools
• IntelliJ IDEA - Structural Search and Replace
• Uses templates to describe replacements
• Tightly coupled with IDEA UI and code model
• AutoFix-E: Automated Debugging of Programs with Contracts
• Juzi: A Tool for Repairing Complex Data Structures
• Corrects data structures using symbolic execution
• GenProg - Genetic programming for code repair
• A very promising tool and approach
• Requires a lot of unit tests
• Grail, Axis, AFix
• Dedicated to repairing multithreaded programs
4. task
The main task is to develop an automated system which fixes code
with the help of a static analyzer.
The designed system consists of:
• Static analyzer interface module
• Code modification module
• Set of corrections
5. requirements
The developed system must meet the following requirements:
• It should work with minimal user involvement
• Modifications should be correct, i.e. the system shouldn’t alter
code logic in any way and should make only those modifications
which are described in the template
• It should be universal
• Code formatting and comments should be preserved
• The system should support the latest versions of the programming
language
• It should be extensible
6. static analyzer
FindBugs was chosen as the static analyzer.
• Easy to interchange information about warnings
• Mostly signature-based
The system must use templates to describe code replacements.
7. code modification approaches
• Manual AST modification (for example, using a JavaParser library)
• The most universal approach
• Low extensibility - requires writing new code for each new correction
• DSL for code modification?
• Template-based code modification technology (D.A. Timofeev,
master’s thesis, 2010)
• Uses templates to describe code modifications
• Templates are written in a language based on Java
• Allows using variables (“selectors”) in templates
• Supports only Java 1.5 (JRE/JDK 1.5 was released in 2004!)
• Doesn’t preserve code formatting and comments
• Sometimes behaves incorrectly (just buggy :( )
8. difficulties
A badly behaving automatic software repair system can skip a
required code region, modify inappropriate code, or even make a
wrong correction.
General reasons for that:
• Static analyzer mistake
• Static analyzer interface bottleneck
• Incorrect template match
• Improper modification
Ways to overcome the last problem
• Code review
• Unit testing
• Other suitable verification and validation methods
10. bugs examples
1. Reading text files without explicitly specifying an encoding
(the platform default is used)
2. String comparison via the == operator
3. Missing null check in the equals() method
4. Missing argument type check in the equals() method
5. Use of constructors for wrapper classes
6. toString() method call on an array
7. Use of approximate values of mathematical constants
8. JVM termination via a System.exit() call when handling errors
9. Returning null from the toString() method
10. Array comparison using the equals() method
11. Comparing the return value of compareTo() for equality with a
specific constant
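Three of the listed defects (2, 6 and 10) can be shown next to their conventional fixes. This is only an illustrative sketch: the class `BugExamples` and its method names are hypothetical and are not taken from the tool or its templates.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of the presented tool.
class BugExamples {

    // Bug 2: == compares object references, not string contents.
    static boolean sameTextBuggy(String a, String b) {
        return a == b;
    }

    // Fix: compare contents with equals().
    static boolean sameTextFixed(String a, String b) {
        return a.equals(b);
    }

    // Bug 6: toString() on an array yields a type tag and hash code,
    // not the elements.
    static String describeBuggy(int[] values) {
        return values.toString();
    }

    // Fix: Arrays.toString() renders the elements.
    static String describeFixed(int[] values) {
        return Arrays.toString(values);
    }

    // Bug 10: equals() on an array is inherited from Object and
    // therefore compares references.
    static boolean sameArrayBuggy(int[] a, int[] b) {
        return a.equals(b);
    }

    // Fix: Arrays.equals() compares element by element.
    static boolean sameArrayFixed(int[] a, int[] b) {
        return Arrays.equals(a, b);
    }

    public static void main(String[] args) {
        String s1 = new String("abc");
        String s2 = new String("abc");
        System.out.println(sameTextBuggy(s1, s2)); // false
        System.out.println(sameTextFixed(s1, s2)); // true
        System.out.println(describeFixed(new int[] {1, 2, 3})); // [1, 2, 3]
        System.out.println(sameArrayBuggy(new int[] {1, 2}, new int[] {1, 2})); // false
        System.out.println(sameArrayFixed(new int[] {1, 2}, new int[] {1, 2})); // true
    }
}
```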
11. replacement templates
Template language = Java + selectors.
Selectors are written as #identifier expressions.
Example: string comparison using ==. Before:
#a == #b
After:
#b.equals(#a)
Absence of a null pointer check. Before:
boolean equals(Object obj) {
    #expr;
}
After:
boolean equals(Object obj) {
    if (obj == null) { return false; }
    #expr;
}
12. queries
Ability to specify requirements for selectors
1. Type of tree node
2. Range of values
3. Quantity of matched nodes
4. Complex queries via XPath
Example:
[before]
#array.toString()
[after]
Arrays.toString(#array)
[query]
array is Identifier
array quantity 1
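The sectioned layout of the example above could be read into memory roughly as follows. This is a minimal sketch, not the tool's parser: the class `TemplateSections` is hypothetical, and the rule "a bracketed line starts a section, the following non-blank lines belong to it" is an assumption inferred from the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the section syntax is inferred from the example template.
class TemplateSections {

    // Splits template text into named sections. A line of the form "[name]"
    // opens a section; every following non-blank line is added to it.
    static Map<String, List<String>> parse(String text) {
        Map<String, List<String>> sections = new LinkedHashMap<>();
        List<String> current = null;
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith("[") && line.endsWith("]")) {
                current = new ArrayList<>();
                sections.put(line.substring(1, line.length() - 1), current);
            } else if (current != null) {
                current.add(line);
            }
        }
        return sections;
    }

    public static void main(String[] args) {
        String template = String.join("\n",
            "[before]",
            "#array.toString()",
            "[after]",
            "Arrays.toString(#array)",
            "[query]",
            "array is Identifier",
            "array quantity 1");
        Map<String, List<String>> sections = parse(template);
        System.out.println(sections.get("before")); // [#array.toString()]
        System.out.println(sections.get("query"));  // [array is Identifier, array quantity 1]
    }
}
```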
13. development
• A FindBugs report is just an XML document; it is read using the
standard Java DOM parser
• Each template consists of three or four .INI-like sections:
[before], [after], [type] and optionally [query]. One template can
fix multiple bug types, and one bug type can be fixed by multiple
templates.
• Improved template matching code
• Selector queries
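Reading a report with the standard DOM parser might look like the sketch below. The class `ReportReader` is hypothetical, and the element and attribute names (`BugInstance`, `SourceLine`, `type`, `sourcepath`, `start`) follow the small sample embedded in the code; treat them as assumptions about the report layout rather than a full description of the FindBugs format.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical reader; element and attribute names are assumptions
// based on the sample report in main().
class ReportReader {

    // Returns one "type @ path:line" entry per BugInstance element.
    // Only the first SourceLine descendant is used; a real report may
    // nest several of them.
    static List<String> readWarnings(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse report", e);
        }
        List<String> warnings = new ArrayList<>();
        NodeList bugs = doc.getElementsByTagName("BugInstance");
        for (int i = 0; i < bugs.getLength(); i++) {
            Element bug = (Element) bugs.item(i);
            NodeList lines = bug.getElementsByTagName("SourceLine");
            if (lines.getLength() == 0) {
                continue; // no location, nothing to fix
            }
            Element line = (Element) lines.item(0);
            warnings.add(bug.getAttribute("type") + " @ "
                + line.getAttribute("sourcepath") + ":" + line.getAttribute("start"));
        }
        return warnings;
    }

    public static void main(String[] args) {
        // Illustrative sample, not output captured from a real run.
        String sample =
            "<BugCollection>"
            + "<BugInstance type=\"ES_COMPARING_STRINGS_WITH_EQ\">"
            + "<SourceLine sourcepath=\"org/example/Foo.java\" start=\"42\" end=\"42\"/>"
            + "</BugInstance>"
            + "</BugCollection>";
        System.out.println(readWarnings(sample));
    }
}
```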
14. improved template matching code
Pattern Matching in Trees
Additional complexity because of selectors
Each selector can include any number of nodes
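The extra complexity can be seen in a small backtracking matcher. This is a sketch under stated assumptions, not the tool's algorithm: `SyntaxNode` and `TreeMatcher` are hypothetical, nodes are compared by label only, and a selector among siblings is assumed to bind one or more consecutive nodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical minimal tree node; labels starting with '#' are selectors.
class SyntaxNode {
    final String label;
    final List<SyntaxNode> children;

    SyntaxNode(String label, SyntaxNode... children) {
        this.label = label;
        this.children = Arrays.asList(children);
    }

    boolean isSelector() {
        return label.startsWith("#");
    }
}

class TreeMatcher {

    // Matches one pattern node against one tree node and records bindings.
    static boolean match(SyntaxNode pattern, SyntaxNode tree,
                         Map<String, List<SyntaxNode>> bindings) {
        if (pattern.isSelector()) {
            bindings.put(pattern.label, Collections.singletonList(tree));
            return true;
        }
        return pattern.label.equals(tree.label)
            && matchLists(pattern.children, 0, tree.children, 0, bindings);
    }

    // Matches sibling lists. A selector may absorb one or more consecutive
    // siblings; candidates are tried shortest first, with backtracking.
    static boolean matchLists(List<SyntaxNode> ps, int pi, List<SyntaxNode> ts, int ti,
                              Map<String, List<SyntaxNode>> bindings) {
        if (pi == ps.size()) {
            return ti == ts.size();
        }
        SyntaxNode p = ps.get(pi);
        if (p.isSelector()) {
            for (int end = ti + 1; end <= ts.size(); end++) {
                Map<String, List<SyntaxNode>> trial = new HashMap<>(bindings);
                trial.put(p.label, new ArrayList<>(ts.subList(ti, end)));
                if (matchLists(ps, pi + 1, ts, end, trial)) {
                    bindings.clear();
                    bindings.putAll(trial);
                    return true;
                }
            }
            return false;
        }
        if (ti == ts.size()) {
            return false;
        }
        // Work on a copy so bindings from a failed attempt are discarded.
        Map<String, List<SyntaxNode>> trial = new HashMap<>(bindings);
        if (match(p, ts.get(ti), trial) && matchLists(ps, pi + 1, ts, ti + 1, trial)) {
            bindings.clear();
            bindings.putAll(trial);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        SyntaxNode tree = new SyntaxNode("block",
            new SyntaxNode("s1"), new SyntaxNode("s2"), new SyntaxNode("return"));
        SyntaxNode pattern = new SyntaxNode("block",
            new SyntaxNode("#expr"), new SyntaxNode("return"));
        Map<String, List<SyntaxNode>> bindings = new HashMap<>();
        System.out.println(match(pattern, tree, bindings)); // true
        System.out.println(bindings.get("#expr").size());   // 2
    }
}
```

Because a selector can cover a variable number of siblings, the matcher has to try several split points per selector, which is what makes this harder than fixed-shape tree matching.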
16. development
• Ported to ANTLRv4
• Grammar written from scratch, now based on Java 7
• Selectors can be used nearly everywhere
• Transition from AST to CST (Parse tree)
• New way to transform the internal representation back to source
code (allows formatting and comments to be preserved)
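One common way to keep formatting and comments is to patch only the character ranges covered by matched nodes and copy everything else verbatim. The sketch below illustrates that general idea; it is an assumption, not a description of how the tool regenerates code, and `SourcePatcher` and `Edit` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of range-based rewriting; not the tool's implementation.
class SourcePatcher {

    // Replaces source[start, end) with text; offsets refer to the original source.
    static final class Edit {
        final int start;
        final int end;
        final String text;

        Edit(int start, int end, String text) {
            this.start = start;
            this.end = end;
            this.text = text;
        }
    }

    // Applies non-overlapping edits from the last to the first, so that the
    // offsets of the remaining edits stay valid. Untouched text, including
    // whitespace and comments, is copied unchanged.
    static String apply(String source, List<Edit> edits) {
        List<Edit> sorted = new ArrayList<>(edits);
        sorted.sort(Comparator.comparingInt((Edit e) -> e.start).reversed());
        StringBuilder out = new StringBuilder(source);
        for (Edit e : sorted) {
            out.replace(e.start, e.end, e.text);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String source = "if (a == b) { /* keep me */   run(); }";
        int start = source.indexOf("a == b");
        Edit edit = new Edit(start, start + "a == b".length(), "b.equals(a)");
        System.out.println(apply(source, Collections.singletonList(edit)));
        // if (b.equals(a)) { /* keep me */   run(); }
    }
}
```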
17. ci integration
A shell script designed to run as a Jenkins build step:
• Launch FindBugs and fetch a report from it
• Run FixMyCode
• Commit changes
A new branch is created each time. Developers should review the
modifications and merge them.
19. testing
Trying to fix bugs in a popular, widely used project.
JGraphT library:
• Maintained code base
• Uses Java 7 features
• Has plenty of unit tests (439)
• Medium-sized project (27K SLOC)
Results:
• 46 bugs found
• 14 errors were fixed
• 8 errors can’t be fixed because of FindBugs errors
• Other bugs need an appropriate replacement template
21. testing
Absence of a null pointer and argument type check:
Before:
@Override public boolean equals(Object obj)
{
    LabelsEdge otherEdge = (LabelsEdge) obj;
    if ((this.source == otherEdge.source)
        && (this.target == otherEdge.target))
    {
        return true;
    } else {
        return false;
    }
}
After:
@Override public boolean equals(Object obj)
{
    if (obj == null) {
        return false;
    }
    if (!obj.getClass().isInstance(this)) {
        return false;
    }
    LabelsEdge otherEdge = (LabelsEdge) obj;
    if ((this.source == otherEdge.source)
        && (this.target == otherEdge.target)
    )
    {
        return true;
    } else {
        return false;
    }
}
22. recap
• An extensible system that works nearly automatically was
developed
Source code can be fetched from
https://bitbucket.org/h31/fixmycode
• The template grammar was updated and extended
• A set of replacement templates was written
• The developed system can be used to maintain code
quality within Continuous Integration
• It can also be used to modernize legacy code
23. future direction of development
• First of all, make it a production-grade project (documentation,
code quality, stability)
• More powerful query types
• Support for other analyzers (Java PathFinder, etc.)
• Extending the tool to related tasks: performance improvement,
security enhancement