Detecting Software Modularity Violations, Sunny Wong, Yuanfang Cai, Miryung Kim, Michael Dalton, ICSE '11: Proceedings of the 2011 ACM and IEEE 33rd International Conference on Software Engineering
ICSE 2011 Research Paper on Modularity Violations
1. Detecting Software Modularity Violations
Sunny Wong,^* Yuanfang Cai,*
Miryung Kim,† and Michael Dalton*
* Drexel University
† University of Texas at Austin
^ Siemens Healthcare
Supported in part by NSF CCF‐0916891, CCF‐1043810, and DUE‐0837665
2. Motivation
The Essence of Modularity:
Allows for independent module evolution
[Parnas 72; Baldwin and Clark 00]
In reality, modules do not always change independently
Quick and dirty implementation leaves technical debt
Software evolves in a way that deviates from the original design
Modularity violation:
Components that are designed to evolve independently yet frequently change together in reality
Our goal:
Detect modularity violations
Slow down modularity decay
3. Limitations of Existing Approaches
Verification and validation
Modularity violations usually do not affect functionality
Not testable
Not problematic until maintenance
Traditional modularity analyses
Prevailing metrics (e.g., coupling, cohesion) do not measure independence of module evolution
Do not detect the mismatches between design and reality
Code smell analyses
Not all code smells cause modularity violations
Not all modularity violations are code smells
4. Approach Overview
Step 1: Find which modules should change together from their
design structure
Input: design model (e.g., UML, source code)
Clio finds modules from a derived design structure matrix (DSM)
[Baldwin and Clark 00]
Step 2: Find which modules actually change together in reality
Input: revision history
Clio finds logical coupling of components [Ying et al. 04]
Step 3: Discover recurring discrepancies between the output of step 1 and the output of step 2 as modularity violations
Clio compares which modules should change together with which modules actually change together
Recurring discrepancies are reported as modularity violations
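The comparison in step 3 can be illustrated with a minimal Python sketch. This is not Clio's implementation: the `expected` map (standing in for what a design-based analysis predicts in step 1), the file names, and the `unexpected_pairs` helper are all hypothetical, and a commit is simplified to the set of files it touches.

```python
from itertools import combinations

def unexpected_pairs(expected, commit):
    """Return pairs of files that changed together although the design
    does not predict it.

    expected: dict mapping a file to the set of files the design says may
              change with it (assumed output of step 1).
    commit:   set of files changed together in one revision (step 2 input).
    """
    pairs = []
    for x, y in combinations(sorted(commit), 2):
        predicted = y in expected.get(x, set()) or x in expected.get(y, set())
        if not predicted:
            pairs.append((x, y))
    return pairs

# Hypothetical design: Parser and Lexer are expected to change together.
expected = {"Parser.java": {"Lexer.java"}, "Lexer.java": {"Parser.java"}}

commits = [
    {"Parser.java", "Lexer.java"},                # matches the design
    {"Parser.java", "Cache.java"},                # unexpected co-change
    {"Parser.java", "Cache.java", "Lexer.java"},  # partly unexpected
]

per_commit = [unexpected_pairs(expected, c) for c in commits]
```

In this toy history, `Cache.java` changes with `Parser.java` in two commits even though the assumed design does not relate them, which is the kind of recurring discrepancy the slide describes.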
10. Identify Recurring Discrepancies as Modularity Violations
Example discrepancy sets: {a, b}, {a, b, c}, {a, b}
Discrepancy    Frequency
{a, b}         3
{a, c}         1
{a, b, c}      1
Clio takes a minimal frequency threshold as input and orders the discrepancies according to their frequencies.
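The frequency table can be reproduced with a short Python sketch. It assumes that a discrepancy's frequency is the number of observed discrepancy sets that contain it, which matches the counts on the slide; whether Clio counts in exactly this way is an assumption, and `rank_discrepancies` is a hypothetical helper name.

```python
def rank_discrepancies(candidates, observed, min_frequency):
    """Count how many observed sets contain each candidate, drop candidates
    below the threshold, and order the rest by descending frequency."""
    ranked = []
    for cand in candidates:
        freq = sum(1 for obs in observed if cand <= obs)  # subset test
        if freq >= min_frequency:
            ranked.append((cand, freq))
    ranked.sort(key=lambda item: -item[1])  # stable: ties keep input order
    return ranked

# Discrepancy sets and candidates taken from the slide's example.
observed = [frozenset("ab"), frozenset("abc"), frozenset("ab")]
candidates = [frozenset("ab"), frozenset("ac"), frozenset("abc")]

all_counts = rank_discrepancies(candidates, observed, min_frequency=1)
reported = rank_discrepancies(candidates, observed, min_frequency=2)
```

With a threshold of 2, only {a, b} would be reported in this example, since the other two candidates occur once.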
11. Evaluation
Evaluation questions
Q1: How accurate are the violations identified by Clio?
Q2: How early can Clio identify violations?
Q3: What are the characteristics of violations identified by Clio?
Manually confirm violations by looking forward in history
Refactoring in codebase
Developer recognition (e.g., change request)
Symptoms of code smells
Conservative confirmation
12. Evaluation Subjects
Eclipse JDT
10 releases (~3 years)
27806 commits in revision history
3458 modification requests
222 KSLOC in latest version
Hadoop Common
15 releases (~3 years)
3001 commits in revision history
490 modification requests
64 KSLOC in latest version
Experimental Settings
Minimal threshold of recurring discrepancies: 2
Length of sliding window for analysis: 5 releases
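As a rough illustration of the sliding-window setting, the sketch below enumerates windows of 5 consecutive releases. It assumes the window advances one release at a time, which the slide does not state; the release labels and the `sliding_windows` helper are hypothetical.

```python
def sliding_windows(releases, length):
    """Return every run of `length` consecutive releases, in order."""
    return [releases[i:i + length] for i in range(len(releases) - length + 1)]

# Hypothetical labels for a subject with 10 releases.
releases = [f"r{i}" for i in range(1, 11)]
windows = sliding_windows(releases, 5)
```

Under these assumptions, each window would be analyzed on its own, so a discrepancy has to recur within a span of 5 releases to reach the threshold.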
14. Q2: Timeliness of Modularity Violation Detection
Hadoop: Clio detects modularity violations, on average, 6 releases before developers identify the design problems
Eclipse: Clio detects modularity violations, on average, 5 releases before developers identify the design problems
15. Q3: Characteristics of Modularity Violations
Analyzed symptoms of modularity violations
Cyclic dependencies
Cloned code
Poor inheritance hierarchy
Unnamed coupling (43% in Hadoop, 16% in Eclipse)
Example of unnamed coupling
Several classes were identified as part of modularity violations but do not exhibit any symptoms of bad code smells.
Open modification request to “redesign/refactor” those classes because they are “hard to maintain, brittle, and merits some rework”
16. Summary
We define a modularity violation as a mismatch between the designed modular structure and the actual evolution path
Clio compares how modules should change together against how modules actually change together to discover modularity violations
Can detect modularity violations with 40% accuracy for Eclipse and 66% accuracy for Hadoop
Can identify modularity violations several releases before developers discover them
Symptoms of modularity violations observed in our study go beyond known bad smells