Every system is a legacy system: the moment a programmer writes a line of code, it becomes legacy. Therefore, even in relatively new systems, just as in long-lived ones, developers face a body of code that they need to understand and from which they need to extract architectural knowledge. Unfortunately, anecdotal evidence suggests that such knowledge tends to be tacit in nature, stored in people's heads, and inconsistently scattered across various software artifacts and repositories. Furthermore, architectural knowledge vaporizes over time. Given the size, complexity, and longevity of many projects, developers often lack comprehensive knowledge of architectural design decisions and consequently make changes in the code that inadvertently degrade the underlying design and compromise its qualities.
This technical briefing answers three fundamental questions about software architecture recovery: Why? What? and How? Through several examples, it articulates and synthesizes the technical forces and financial motivations that lead software companies to invest in software architecture recovery. It discusses what pieces of design knowledge can be recovered and, lastly, demonstrates a methodology, along with the required tools, for reconstructing architecture from implementation artifacts.
19. Detailed Example: An architectural view
Apache Hadoop Architecture
Master, Slave, HB (heartbeat)
Requirement # 1: A highly available system, where hardware failure can be the norm rather than the exception.
Decision # 1: Use the master-slave architectural style, where slave processes are replicated.
Decision # 2: Checkpoint updated data and bundle replicas (sent every 2 seconds) in order to meet performance goals.
Decision # 3: Use the heartbeat tactic to monitor the availability of task trackers and data nodes. The heartbeat must beat every 0.25 seconds to balance availability and performance.
Decision # 4: Use the "proxy handles failure" pattern to shield clients from failures and to support fault tolerance (i.e., service continues in the face of transient failure).
Requirement # 2: Provide secure services for the client.
Decision # 1: Mutual authentication with Kerberos RPC (SASL/GSSAPI) on RPC connections.
Decision # 2: Maintain an audit trail.
Decision # 3: Data encryption on RPC, on block data transfer, and on HTTP.
Decision # 4: Secure DataNode: DataNodes must authenticate themselves.
More Decisions: A non-trivial architecture is likely to be composed of hundreds, if not thousands, of architectural decisions.
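The heartbeat decision on this slide can be pictured with a small sketch. The Java class below is a hypothetical illustration, not Hadoop code: the class name `HeartbeatMonitorSketch`, its methods, and the 750 ms timeout (three missed beats at the 0.25-second interval quoted on the slide) are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the heartbeat tactic; not taken from the Hadoop code base.
final class HeartbeatMonitorSketch {
    private final long timeoutMillis;
    // Last time (in milliseconds) each node was heard from, keyed by node id.
    private final Map<String, Long> lastBeatMillis = new TreeMap<>();

    HeartbeatMonitorSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Called by the master whenever a heartbeat arrives from a node. */
    void recordHeartbeat(String nodeId, long nowMillis) {
        lastBeatMillis.put(nodeId, nowMillis);
    }

    /** Returns, in sorted order, the nodes that have been silent for longer than the timeout. */
    List<String> suspectedFailures(long nowMillis) {
        List<String> suspects = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastBeatMillis.entrySet()) {
            if (nowMillis - entry.getValue() > timeoutMillis) {
                suspects.add(entry.getKey());
            }
        }
        return suspects;
    }

    public static void main(String[] args) {
        // Assumed timeout: three missed beats at a 250 ms heartbeat interval.
        HeartbeatMonitorSketch monitor = new HeartbeatMonitorSketch(750);
        monitor.recordHeartbeat("dn1", 0);
        monitor.recordHeartbeat("dn2", 0);
        // dn1 keeps beating every 250 ms; dn2 goes silent after its first beat.
        for (long t = 250; t <= 1000; t += 250) {
            monitor.recordHeartbeat("dn1", t);
        }
        System.out.println(monitor.suspectedFailures(1000));
    }
}
```

The timeout is where the availability and performance trade-off surfaces: a shorter timeout detects failures sooner but is more likely to flag a node that is merely slow.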
20. Detailed Example: An architectural view
More Decisions: A non-trivial architecture is likely to be composed of hundreds, if not thousands, of architectural decisions.
Decision # 1: Separate each of these domains and run them on different threads.
Decision # 1: Keep primaries together, and replicas together, at all times to meet redundancy goals. (Also shown as Decision # 121.)
Decision # 1: Use thread pooling to execute recurrent mission operations.
Decision # 2: Checkpoint updated data and bundle replicas (sent every 2 seconds) in order to meet performance goals.
Decision # 3: Use the heartbeat tactic to monitor the availability of track manager primaries and secondaries. The heartbeat must beat every 0.25 seconds to balance availability and performance.
Decision # 4: Use the "proxy handles failure" pattern to shield clients from failures and to support fault tolerance (i.e., service continues in the face of transient failure).
Decision # 19: Separation of concerns: each task must run as a separate process on a separate processor.
Decision # 21: Active redundancy is implemented to maximize mean time between failures.
Decision # 29: Separate data buses should be used for data and for command operations. Each bus should use a different scheduling mechanism.
Decision # 32: The system must use active redundancy with graceful degradation to achieve maximum availability.
Decision # 52: Platform diversity plus N-version programming must be used to maximize reliability.
Decision # 66: Checkpointing is used to recover non-mission-critical operations.
Decision # 81: A voting mechanism is used to recover from failure of any of the sensors.
Decision # 101: Use heartbeat, voting, and simulation to detect faults. The FDIR module is responsible for all of these. (Also shown as Decisions # 91 and # 151.)
Decision # 113: Semantic-based scheduling and a task sequencer are used to provide real-time performance.
25. Why Software Architecture Recovery?
1- Software Comprehension
The Department of Defense (DoD) spends almost half of its post-deployment costs (47%) reverse engineering its own code. This process often involves identifying the underlying design intent in order to modify a legacy system, or understanding the code and evaluating the impact of introducing changes.
Staff turnover.
Buying a new software product and gaining insight into its internal qualities.
29. Why Software Architecture Recovery?
A big ball of mud: Apache Hadoop architecture
30. Why Software Architecture Recovery?
3- Renovating Architecture
e.g., from a legacy system to a cloud-based architecture.
A- Forward architecting to design the ideal architecture.
B- Reverse architecting the legacy system to discover the current architecture.
C- Rebalancing to create an architecture improvement plan.
32. The Ariane 5 system reused design specifications and code from its highly successful predecessor, the Ariane 4.
The Inertial Reference System performed a data conversion of a 64-bit floating point value related to the horizontal velocity and placed the result into a 16-bit signed integer variable.
Based on the operating characteristics of the Ariane 4, the design team felt it was physically impossible to have a horizontal velocity large enough to cause an arithmetic overflow of a 16-bit signed integer variable.
However, the reuse of this software in the Ariane 5 placed the code in a very different operating context, in which the specific design assumption relating to horizontal velocity was no longer valid.
The primary process crashed, the system switched to the redundant backup process, and the backup crashed for the same reason.
The Inertial Reference System then generated diagnostic output, which was incorrectly interpreted as flight control data by other portions of the flight control system. This faulty interpretation made the flight control system take actions that led to the self-destruction of the rocket.
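The conversion hazard described on this slide can be illustrated with a short Java sketch. It is an analogy under stated assumptions, not a reproduction of the flight software: the class and method names are hypothetical, and a plain Java cast silently wraps an out-of-range value, whereas the slide describes a process that crashed. The checked variant shows one way to make the hidden range assumption explicit.

```java
// Hypothetical illustration of narrowing a 64-bit float into a 16-bit signed integer.
final class NarrowingConversionSketch {

    /** Unchecked cast: Java narrows double to int and then keeps only the low 16 bits. */
    static short uncheckedToInt16(double value) {
        return (short) value;
    }

    /** Checked conversion: rejects values outside the 16-bit signed range instead of corrupting them. */
    static short checkedToInt16(double value) {
        if (Double.isNaN(value) || value < Short.MIN_VALUE || value > Short.MAX_VALUE) {
            throw new ArithmeticException("value does not fit in a 16-bit signed integer: " + value);
        }
        return (short) value;
    }

    public static void main(String[] args) {
        double withinOldEnvelope = 1000.0;   // assumed value that fits in 16 bits
        double beyondOldEnvelope = 40000.0;  // assumed value that exceeds 32767

        System.out.println(uncheckedToInt16(withinOldEnvelope)); // 1000
        System.out.println(uncheckedToInt16(beyondOldEnvelope)); // -25536 (silently wrapped)

        try {
            checkedToInt16(beyondOldEnvelope);
        } catch (ArithmeticException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Either way, the range assumption is a design decision that lives only in the code, which is exactly the kind of knowledge that is lost when software is reused in a new operating context.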
33. What to Recover?
Elliot J. Chikofsky, James H. Cross II, "Reverse Engineering and Design Recovery: A Taxonomy", IEEE Software, January 1990.
34. What to Recover?
Quality Function Deployment's House of Quality (figure). Regions of the figure: Functionalities; Qualities; Importance Rankings; Design Decisions; Relationships Between Qualities and Design Decisions; Costs/Feasibility; Engineering Measures; Trade-offs.
35. How to Recover the Architecture?
We need a chain of evidence to discover the decisions and reconstruct the architecture.
It's not about mining code; it's about mining software repositories, mining all assets:
Code, Models, Bugs, Issues, Changes, Documents, Specifications, People, Chats, Traces, Web Sites, Emails, ….
39. How to Achieve this?
Tactics are pervasive in fault-tolerant and/or high-performance systems.
Tactics seem to have an interesting relationship to change.
40. Use of Tactics in a Variety of Systems
Tactics tend to be found in safety-critical and/or other kinds of performance-centric systems.
41. Why Is It So Difficult?
There is no single way to implement an architectural tactic/decision.
Structural analysis fails.
❶ Direct communication with Configuration (found in Hadoop, Chat3, Smartfrog)
❷ Decorator pattern (found in Thera, JSRB, Rossume)
❸ Observer pattern (found in Amalgam)
Mehdi Mirakhorli, Jane Cleland-Huang, "Tracing Architectural Concerns in High Assurance Systems (NIER Track)", 33rd International Conference on Software Engineering, ICSE 2011, Honolulu, Hawaii, USA, May 2011.
44. Recovering Architectural Tactics
Mehdi Mirakhorli, Yonghee Shin, Jane Cleland-Huang and Murat Cinar, "A Tactic-Centric Approach for Automating Traceability of Quality Concerns", IEEE International Conference on Software Engineering (ICSE) 2012 (14% acceptance). ACM SIGSOFT Distinguished Paper Award.
45. Tactic Detection Approach
Training:
Normalizes the frequency with which term t occurs in the training document with respect to its length.
Computes the percentage of training documents of type Q containing term t.
Decreases the weight of terms that are project specific.
Classification:
Computes the likelihood that code snippet r traces to query q.
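The training and classification notes on this slide can be read as a term-weighting scheme followed by a scoring step. The Java sketch below is one plausible reading, not the authors' published formula: the class and method names are hypothetical, the three training factors are simply multiplied, the second factor is interpreted as the share of term-containing training documents that belong to the tactic, and the snippet score is the weight of matched indicator terms divided by the total indicator weight.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of indicator-term weighting and snippet scoring.
final class TacticTermWeightSketch {

    /** Weight of a term as an indicator of one tactic: the product of three factors (an assumption). */
    static double termWeight(String term,
                             List<List<String>> tacticDocs,
                             List<List<String>> otherDocs,
                             List<Set<String>> tacticProjectVocabularies) {
        if (tacticDocs.isEmpty() || tacticProjectVocabularies.isEmpty()) {
            return 0.0;
        }
        // Factor 1: term frequency normalized by document length, averaged over tactic documents.
        double normalizedFrequency = 0.0;
        int tacticDocsWithTerm = 0;
        for (List<String> doc : tacticDocs) {
            int occurrences = Collections.frequency(doc, term);
            if (occurrences > 0) {
                tacticDocsWithTerm++;
                normalizedFrequency += (double) occurrences / doc.size();
            }
        }
        if (tacticDocsWithTerm == 0) {
            return 0.0;
        }
        normalizedFrequency /= tacticDocs.size();

        // Factor 2: share of the training documents containing the term that belong to this tactic.
        int otherDocsWithTerm = 0;
        for (List<String> doc : otherDocs) {
            if (doc.contains(term)) {
                otherDocsWithTerm++;
            }
        }
        double tacticShare = (double) tacticDocsWithTerm / (tacticDocsWithTerm + otherDocsWithTerm);

        // Factor 3: share of tactic projects that use the term, so project-specific terms score low.
        int projectsWithTerm = 0;
        for (Set<String> vocabulary : tacticProjectVocabularies) {
            if (vocabulary.contains(term)) {
                projectsWithTerm++;
            }
        }
        double projectShare = (double) projectsWithTerm / tacticProjectVocabularies.size();

        return normalizedFrequency * tacticShare * projectShare;
    }

    /** Score of a snippet: weight of the indicator terms it contains over the total indicator weight. */
    static double snippetScore(Set<String> snippetTerms, Map<String, Double> indicatorWeights) {
        double matched = 0.0;
        double total = 0.0;
        for (Map.Entry<String, Double> entry : indicatorWeights.entrySet()) {
            total += entry.getValue();
            if (snippetTerms.contains(entry.getKey())) {
                matched += entry.getValue();
            }
        }
        return total == 0.0 ? 0.0 : matched / total;
    }
}
```

In this reading, a snippet would be labeled as implementing the tactic when its score exceeds a threshold chosen during training; the threshold itself is not given on the slide.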
55. Identification of major Components
Filtration based on the percentage of dependency
Class hierarchy
Softwarenaut: http://scg.unibe.ch/softwarenaut
Main components such as Node-Manager and Resource-Manager do not depend on each other.
62. Top-Down/Conceptual/Reflexion Model
Start with specifications and high-level knowledge about the architecture.
Formulate hypotheses and verify them against the source code.
Gail C. Murphy, David Notkin, and Kevin Sullivan. 1997. Extending and Managing Software Reflexion Models. Technical Report. University of British Columbia, Vancouver, BC, Canada.
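The hypothesize-and-verify loop on this slide can be sketched in a few lines of Java. This is a minimal illustration under assumptions: the class name, the "A->B" edge encoding, and the example component and class names are hypothetical. Source-level dependencies are lifted to component level through a mapping and then compared with the hypothesized model, yielding convergences (hypothesized and found), divergences (found but not hypothesized), and absences (hypothesized but not found).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a reflexion-model comparison.
final class ReflexionSketch {

    /** Lifts class-level dependencies to component level, dropping unmapped and intra-component edges. */
    static Set<String> lift(List<String[]> sourceDependencies, Map<String, String> classToComponent) {
        Set<String> lifted = new TreeSet<>();
        for (String[] dependency : sourceDependencies) {
            String from = classToComponent.get(dependency[0]);
            String to = classToComponent.get(dependency[1]);
            if (from != null && to != null && !from.equals(to)) {
                lifted.add(from + "->" + to);
            }
        }
        return lifted;
    }

    /** Edges that were hypothesized and also found in the source. */
    static Set<String> convergences(Set<String> hypothesized, Set<String> lifted) {
        Set<String> result = new TreeSet<>(hypothesized);
        result.retainAll(lifted);
        return result;
    }

    /** Edges found in the source that the hypothesis did not predict. */
    static Set<String> divergences(Set<String> hypothesized, Set<String> lifted) {
        Set<String> result = new TreeSet<>(lifted);
        result.removeAll(hypothesized);
        return result;
    }

    /** Edges that were hypothesized but have no support in the source. */
    static Set<String> absences(Set<String> hypothesized, Set<String> lifted) {
        Set<String> result = new TreeSet<>(hypothesized);
        result.removeAll(lifted);
        return result;
    }

    public static void main(String[] args) {
        Set<String> hypothesized = new TreeSet<>(Arrays.asList("UI->Core", "Core->Storage"));
        Map<String, String> mapping = new HashMap<>();
        mapping.put("ui.Form", "UI");
        mapping.put("core.Service", "Core");
        mapping.put("core.Helper", "Core");
        mapping.put("db.Store", "Storage");
        List<String[]> dependencies = Arrays.asList(
                new String[] {"ui.Form", "core.Service"},
                new String[] {"ui.Form", "db.Store"},
                new String[] {"core.Service", "core.Helper"});

        Set<String> lifted = lift(dependencies, mapping);
        System.out.println("convergences: " + convergences(hypothesized, lifted));
        System.out.println("divergences:  " + divergences(hypothesized, lifted));
        System.out.println("absences:     " + absences(hypothesized, lifted));
    }
}
```

Divergences and absences are the interesting output: each one either exposes a gap in the analyst's hypothesis or points to a place where the code has drifted from the intended architecture.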
65. Recover Functional Architecture
Functional architecture is a conceptual view of the architecture, depicting the key areas of functionality in the system and how they interact with each other.
71. Discovery Micro Design
Stage 1: Creating semi-formal definitions for each pattern of interest and its variants, based on common extensible feature types.
Stage 2: Detecting patterns by identifying their features, using the search technology that best fits each feature.
Slide used courtesy of Patrick Maeder
76. Architecture Breaker
Issues Reported: HADOOP-4584, HADOOP-178,…
DataNode.java sends separate messages to NameNode.java:
a) Heartbeat
b) Block reports
c) Blocks to be deleted
…
Developer #1: DataNode.java should send several messages to NameNode.java, such as block reports, heartbeats, blocks to be deleted, etc.
79. Architecture Breaker
Issues Reported: HADOOP-4584, HADOOP-178,…
DataNode.java sends a single merged message (a, b, c, …) to NameNode.java.
Piggy-backing: the heartbeat message rides on block reports.
Developer #2: So many messages; let's merge them by piggy-backing.
Design Decay & Compromising Availability: Block reports are usually delayed, so the system detects a DataNode failure and launches the recovery process while the DataNode is still alive.
80. Architecture Breaker
Issues Reported: HADOOP-4584, HADOOP-178,…
DataNode.java sends a single merged message (a, b, c, …) to NameNode.java:
if (t == 10 && BlockReady)
    report(b);
else
    report(Empty);
Developer #3: Every 10 seconds the DataNode reports data or sends an empty message as a heartbeat.
81. Architecture Breaker
Issues Reported: HADOOP-4584, HADOOP-178,…
DataNode.java sends a single merged message (a, b, c, …) to NameNode.java:
if (t == 2 && BlockReady)
    report(b);
else
    report(Empty);
Developer #4: Let's make it every 2 seconds.
Design Decay & Performance Tradeoff: Performance issues arise; there is a tradeoff between availability and performance.
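The sequence of changes in the Architecture Breaker slides can be made concrete with a small sketch. The Java below is a hypothetical simplification, not Hadoop code: the class name, the 5-second timeout, and the message send times are assumptions chosen for illustration. It shows why heartbeats that ride only on irregular block reports can make a live DataNode look dead, and how sending a report or an empty message on a fixed period closes the gap.

```java
// Hypothetical sketch of heartbeat piggy-backing and its effect on failure detection.
final class PiggybackSketch {

    enum Message { BLOCK_REPORT, EMPTY_HEARTBEAT }

    /** Periodic rule: each period, send a block report if one is ready, otherwise an empty message. */
    static Message messageForPeriod(boolean blockReady) {
        return blockReady ? Message.BLOCK_REPORT : Message.EMPTY_HEARTBEAT;
    }

    /** Longest silence, in seconds, that the receiver observes between consecutive messages. */
    static long longestSilence(long[] sendTimesSeconds) {
        long longest = 0;
        for (int i = 1; i < sendTimesSeconds.length; i++) {
            longest = Math.max(longest, sendTimesSeconds[i] - sendTimesSeconds[i - 1]);
        }
        return longest;
    }

    /** True if a live sender would be wrongly declared dead under the given timeout. */
    static boolean falseFailure(long[] sendTimesSeconds, long timeoutSeconds) {
        return longestSilence(sendTimesSeconds) > timeoutSeconds;
    }

    public static void main(String[] args) {
        long timeoutSeconds = 5; // assumed failure-detection timeout

        // Heartbeat rides only on block reports, which arrive irregularly (assumed times).
        long[] piggybackedOnly = {0, 3, 12, 14};
        // A message is sent every 2 seconds, whether or not a block report is ready.
        long[] periodic = {0, 2, 4, 6, 8, 10, 12, 14};

        System.out.println(falseFailure(piggybackedOnly, timeoutSeconds)); // true: a 9-second gap
        System.out.println(falseFailure(periodic, timeoutSeconds));        // false
        System.out.println(messageForPeriod(false));                       // EMPTY_HEARTBEAT
    }
}
```

Shortening the period reduces the risk of a false failure but increases the number of messages the receiver must process, which is the availability and performance tradeoff noted on the last slide of the sequence.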
82. Archie: A Smart IDE to Protect Architecture
The vision was initially presented in:
Mehdi Mirakhorli, Jane Cleland-Huang, "Using Tactic Traceability Information Models to Reduce the Risk of Architectural Degradation during System Maintenance", ICSM 2011.
83. Archie: A Smart IDE to Protect Architecture
Detect and monitor code snippets that implement key architectural decisions in the source code.
Proactively keep developers informed of underlying architectural decisions during maintenance activities.
Automatically trace external architecture specification documents to the source code or design model.
Perform change impact analysis of architectural concerns at both the code and design level.
84. Archie: A Smart IDE to Protect Architecture
Detect and monitor code snippets that implement key architectural decisions in the source code.
Decision Detector: A rigorously validated automated technique based on a combination of machine learning, structural analysis, and pattern matching techniques.
Why does it work? It is trained on sample source code from hundreds of open source projects.
Code Snippets
public boolean isAuditUserIdentifyPresent() {
    return this.auditUserIdentify != null;
}
public BigDecimal getAuditSequenceNumber() {
    return this.auditSequenceNumber;
}
87. Archie: A Smart IDE to Protect Architecture
Proactively keep developers informed of underlying architectural decisions during maintenance activities.
IDEs and compilers do well on syntactic issues and pay a little attention to semantics, but design rationale is not covered.
Archie has features for communicating architectural knowledge.
A visualization module depicts the seams of a software design, the driving requirements, the business goals, and the rationale behind the source code.
89. Archie: A Smart IDE to Protect Architecture
Perform change impact analysis of architectural concerns at both the code and design level.
An asynchronous event-based monitoring and notification infrastructure has been designed to proactively inform developers of underlying architectural decisions. An initial proof-of-concept experiment has been conducted.
92. Archie: A Smart IDE to Protect Architecture
Design Warnings
Perform change impact analysis of architectural concerns at both the code and design level.