Apache CloudStack (ACS) is among the best of breed in the current IaaS landscape. However, despite its high functional quality, its technical quality (the one you observe when you look "under the hood") is below industry average. Miguel Ferreira started working on ACS development in early 2014 and presents, in this talk, his perspective as a newcomer: it felt like working on a skyscraper in the early years of the XX century, without a safety net or rope. Together with his former colleague Dennis Bijlsma (SIG), he dived into the ACS code to find evidence supporting a change in the way ACS is maintained, one that combines the best of modern software engineering practices with cutting-edge innovation. Miguel proposes that the ACS community take charge of keeping to the highest standards of software technical quality, while continuing to foster creativity.
Working on a Skyscraper in the Early Years of the XX Century
1. CCCE 2014
WORKING ON A SKYSCRAPER IN THE EARLY
YEARS OF THE XX CENTURY
CCCE – 2014 – Budapest
Talk: Miguel Ferreira (SBP)
CloudStack evaluation: Dennis Bijlsma (SIG)
2. CCCE 2014
THIS TALK
It’s about how I’ve experienced CloudStack development
3. CCCE 2014
WHO AM I
Portuguese
Mission Critical Engineer at Schuberg Philis
Interests:
• Software maintainability
• Metrics-driven engineering
• Testing
18. CCCE 2014
WHAT SOFTWARE TAUGHT ME
Maintainable software is made of tiny blocks of code
Simple blocks, with meaningful names
That blend in together to create a consistent picture
Each piece is independently testable
19. CCCE 2014
MAINTAINABILITY IS THE ENABLER
Maintainability enables all other software quality characteristics
Security, Reliability, Performance, Portability, …
20. CCCE 2014
MAINTAINABLE SOFTWARE
Changes often
Changes fast
by Arthur John Picton, source http://www.flickr.com/, licensed under CC BY 2.0
21. CCCE 2014
UNIT-TESTS ENABLE MAINTAINABILITY
The single most effective feedback loop
Without it the risk of continuously grooming
code is prohibitive
Test levels (figure, top to bottom): System, Integration, Unit
26. CCCE 2014
CLOUDSTACK
What kind of software system is it?
“CloudStack is an open source software
platform that pools computing resources to build
public, private, and hybrid Infrastructure as a
Service (IaaS) clouds.”
27. CCCE 2014
IN OTHER WORDS
CloudStack coordinates the work of multiple computing systems in order to offer computing
resources in a seemingly integrated fashion
What kind of systems are we talking about?
• Hypervisors
• Storage devices
• Network devices
Pretty complex stuff
29. CCCE 2014
MY FIRST REACTION
Excitement to work on such an important piece of software
It must be really good, given the adoption
CloudStack is “the Bomb”!!
31. CCCE 2014
THEN I LOOKED UNDER THE HOOD
Code is cluttered
• Poor conventions (e.g. naming: ‘_*’, ‘s_*’)
• Commented out code
Poor domain model
• Long methods
• “God classes”
• Duplicated objects
• “Stringly typed” methods
Barely any unit-testing
Poor error handling
• “Pokémon” exception handling
Build system adoption was left halfway
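"Pokémon" exception handling means catching every exception ("gotta catch 'em all"), often with an empty catch block. The sketch below contrasts that style with a narrower one. All names in it (PortParser, parsePortPokemon, parsePort) are hypothetical and are not taken from the CloudStack code base.

```java
import java.util.Optional;

// Hypothetical example; none of these names come from CloudStack.
class PortParser {

    // "Pokémon" style: catches everything and swallows it.
    // A null input (NullPointerException) and a malformed number
    // (NumberFormatException) both silently turn into -1.
    static int parsePortPokemon(String value) {
        try {
            return Integer.parseInt(value.trim());
        } catch (Exception e) {
            // empty catch: the cause of the failure is lost
        }
        return -1;
    }

    // Narrower alternative: handle only the failure we expect and
    // make the "no valid port" outcome explicit in the return type.
    static Optional<Integer> parsePort(String value) {
        if (value == null) {
            return Optional.empty();
        }
        try {
            int port = Integer.parseInt(value.trim());
            if (port >= 1 && port <= 65535) {
                return Optional.of(port);
            }
            return Optional.empty();
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }
}
```

In the first method a programming error such as passing null is indistinguishable from bad user input, which is what makes this style expensive to debug.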
34. CCCE 2014
CLOUDSTACK WORK ENVIRONMENT
Is not a safe work environment
The feedback loop is way too long
It requires an extensive amount of debugging
It lacks unit-tests (which keep me sane)
35. CCCE 2014
TESTING CHALLENGES
CloudStack code is tightly coupled to specific hardware devices
It has a database
Plug-in architecture
Distributed development teams
36. CCCE 2014
HOW DO YOU TEST SUCH A SYSTEM?
There are other plug-in systems out there
Let's look at a study of test practices within the Eclipse project
37. CCCE 2014
TEST PRACTICES IN ECLIPSE
Test responsibilities
• “Tester and developer, that’s one person. (…)”
Unit testing
• “We have unit testing, and that’s where we put the main effort”
• “Ultimately, unit tests are our best friends, and everything else is already
difficult.”
• “What cannot be encapsulated is not tested.”
Influence of the plug-in architecture
• “The small glue code (..), that’s not tested, because it is just hard to test
that. And for these untested glue code parts we had the most bugs.”
38. CCCE 2014
GETTING BACK TO CLOUDSTACK
Why doesn’t it have more unit tests?
The code does not lend itself to
be unit-tested!
39. CCCE 2014
SOURCE CODE EVALUATION
According to a model used for certifying software maintainability
Focus on simple low-level source code metrics
Aggregation of metrics to system level respecting statistical properties
47. CCCE 2014
WHAT DOES A UNIT-TEST LOOK LIKE
Low complexity (barely any branching at all)
3 blocks
• Setup: define the input
• Execution: call the method under test
• Verification: check expectations
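The three blocks can be sketched as follows. This is a minimal illustration, assuming a hypothetical unit NetworkNames.normalize that is not part of CloudStack. It uses plain Java to stay dependency-free; with a test framework such as JUnit the test method would carry an @Test annotation and use the framework's assertions.

```java
import java.util.Locale;

// Hypothetical unit under test; not taken from CloudStack.
class NetworkNames {
    static String normalize(String name) {
        return name.trim().toLowerCase(Locale.ROOT);
    }
}

// The test has no branching beyond the final check.
class NetworkNamesTest {
    static void normalizeTrimsAndLowercases() {
        // Setup: define the input
        String input = "  Guest-Net  ";

        // Execution: call the method under test
        String result = NetworkNames.normalize(input);

        // Verification: check expectations
        if (!"guest-net".equals(result)) {
            throw new AssertionError("expected guest-net but got " + result);
        }
    }
}
```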
48. CCCE 2014
UPDATE GUEST NETWORK
com.cloud.network.NetworkServiceImpl
updateGuestNetwork(…)
• 11 parameters,
• 286 lines of code
• 15 blocks of code (with header comments)
Unit test
• Setup: build/mock/abstract 11 input parameters (if you consider that 10 of them can be null, that gives 2^10 = 1024 combinations!)
• Execution: call updateGuestNetwork(…)
• Verification: what to expect from a 200+ line long method?
Easy answer: an updated guest network!
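The size of the setup problem can be made concrete with a little arithmetic. The sketch below counts the null/non-null patterns for a method with independently nullable parameters, and compares that with a hypothetical split into smaller units. The split (5 units with 2 nullable parameters each) is an assumption for illustration only; nothing in the evaluation says updateGuestNetwork decomposes that way.

```java
// Illustrative arithmetic only; class and method names are hypothetical.
class NullCombinations {

    // Null/non-null patterns for n independently nullable parameters: 2^n.
    static long count(int nullableParameters) {
        return 1L << nullableParameters;
    }

    // If the work were split into independent units that are tested
    // separately, the patterns add up instead of multiplying.
    static long countWhenSplit(int units, int nullableParametersPerUnit) {
        return units * count(nullableParametersPerUnit);
    }
}
```

Here count(10) is 1024, while countWhenSplit(5, 2) is 20, which is why small, independently testable units keep the setup block manageable.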
49. CCCE 2014
UPDATE GUEST NETWORK
In which state was it? Was it a system network? Was it any other kind of network?
Did it already have a name? Display name? Custom ID?
Did the network offering change?
Changing from VPC to non-VPC?
Network domain, IP reservation, …
Update life-cycle: shutdown all elements, make the update, turn back on
51. CCCE 2014
APACHE IS ALL ABOUT THE COMMUNITY
There must be something the community can do
Wants:
• Better code, more tests, happier developers
Don’t wants:
• Strangle innovation, chase current developers away
52. CCCE 2014
LTS MODEL
Popular in several open-source projects
Allows for cutting-edge innovation
Allows for better structure and quality control
59. CCCE 2014
HOW TO MAKE IT WORK
Freeze the current release branch
Automate the code quality standard
• Measure unit size, complexity, coupling, duplication, etc.
• Measure unit-testing coverage
Keep the code as is, but
• Every time the code is touched it is refactored according to the defined quality standard
• No new code gets in the LTS branch without meeting the tech quality requirements
Maintain two groups of developers (LTS and Innovation)
• As time progresses swap people between the two
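An automated quality standard could take the shape of a gate that rejects touched units exceeding agreed thresholds. The sketch below is one possible shape under stated assumptions: the types UnitMetrics and QualityGate are hypothetical, and the thresholds used in the usage note are made up for illustration, not values proposed in the talk. A real gate would also cover complexity, coupling, duplication and unit-test coverage.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical measurement of one unit (method); not a real tool's API.
class UnitMetrics {
    final String name;
    final int linesOfCode;
    final int parameters;

    UnitMetrics(String name, int linesOfCode, int parameters) {
        this.name = name;
        this.linesOfCode = linesOfCode;
        this.parameters = parameters;
    }
}

// Hypothetical gate: thresholds are supplied by whoever defines the standard.
class QualityGate {
    private final int maxLinesOfCode;
    private final int maxParameters;

    QualityGate(int maxLinesOfCode, int maxParameters) {
        this.maxLinesOfCode = maxLinesOfCode;
        this.maxParameters = maxParameters;
    }

    // Returns the names of the touched units that exceed any threshold.
    List<String> violations(List<UnitMetrics> touchedUnits) {
        List<String> result = new ArrayList<>();
        for (UnitMetrics unit : touchedUnits) {
            if (unit.linesOfCode > maxLinesOfCode || unit.parameters > maxParameters) {
                result.add(unit.name);
            }
        }
        return result;
    }
}
```

With illustrative thresholds of 60 lines and 5 parameters, a unit measured like updateGuestNetwork (286 lines, 11 parameters) would be reported, and a change touching it would have to refactor it before entering the LTS branch.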
60. CCCE 2014
SUMMARY
CloudStack is functionally quite good, but the code is not keeping up
Feedback loops for developers are way too long
More and better unit testing will help to increase quality and developer satisfaction
Only the CloudStack community has the power to change the game
Schuberg Philis is more than interested in helping
63. CCCE 2014
CONTACTS
Miguel Ferreira, SBP, mferreira@schubergphilis.com, @miguel_f
Dennis Bijlsma, SIG, d.bijlsma@sig.eu
Ping us!
Editor's Notes
Made my first few €s with software.
Had my first paid development gig making small programs for research.
Had my first real development project, hired by the university for a government project
Image: https://www.flickr.com/photos/45648531@N00/3343486124/in/set-72157594166672630
Went into research myself.
Formally built and verified software models
Idea is to refine until code
Although I know people and projects where this happens, I never got there myself
Build the right software!
Image source: https://www.flickr.com/photos/pedromourapinheiro/5075612989
Got a job in the R&D department of a company specialized in software quality
In my mind software quality meant correct-by-construction…
The perspective of my new employer was totally different.
Image source: https://www.flickr.com/photos/bdesham/2432400623
Some code is created, much code is re-used
Image source: https://www.flickr.com/photos/bdesham/2432400623
We called it legacy…
Image source: https://www.flickr.com/photos/bdesham/2432400623
The mind shift for me was tremendous… instead of defining through formal-mathematics what good quality code is, I learned to observe existing software to learn what good software looks like!
http://legoquestkids.blogspot.nl/2011/02/30-pieces-quest-34.html
Pending permission
Empirical approach to software quality, using real-world software to understand what maintainable software looks like
Evaluation based purely on source code. No expert opinion!
Assess individual software elements (a.k.a. units), sort and categorize them
At this point I was building software to analyze/measure other software
Focused on building the software right (engineering)
Learning process: how to survive as a Software Engineer?
Learn from software itself what good quality software looks like!
Without the ability to change, a software system dies!
Image source: https://www.flickr.com/photos/profzucker/15179450153
Feedback even without executing tests
Especially in a system with distributed development teams
Especially in a system with runtime dependencies that are hard to get together
Image source: https://www.flickr.com/photos/aeu04117/2478514667/in/photostream/
CC
Things get quite strict and bureaucratic when software becomes a medical device
Then I landed at SBP as a Mission Critical Engineer
SBP uses CloudStack for its Mission Critical Cloud
100% functionally available cloud!
I landed in the toolkit team which is engaged in CloudStack development (Daan & Hugo, VPC refactoring, etc)
This was my first contact with CloudStack
Image source: https://www.flickr.com/photos/kristian20/400019662
I was already putting my Engineer hat on…
Image source: http://jalopnik.com/5182381/adventures-with-wildlife-surprise-underhood-opossum-nest
Pending permission
http://blog.codinghorror.com/new-programming-jargon/
Duplicated objects: HostDetailsDaoImpl is in two packages (1) engine/schema and (2) engine/orchestration
Stringly typed: HostVO constructor has 27 parameters
Pokémon:
1794 cases of catch exception
455 empty catches
CloudStack is functionally very good, best of breed! #cloudstackworks!
But the technical quality of the code falls behind!
Let it sink… I will come back to this in a moment
Plug-in architecture introduces integration challenge
Distributed development teams introduce communication challenge
And also a culture clash (?)
Lots and lots of small entities with high levels of duplication
We found cases of cyclic class dependencies
That makes it extremely tricky to test in isolation.
HostDetailsDaoImpl exists in both engine/schema and engine/orchestration
Considering also that in CloudStack code null is not simply an initialization value, it carries a lot of meaning throughout very long paths of execution
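One way to stop null from carrying hidden meaning is to give that meaning a type of its own. The sketch below assumes, purely for illustration, that a null argument means "leave this field unchanged"; the notes do not say which meanings null has in CloudStack. The Change type is hypothetical and does not exist in the CloudStack code base.

```java
import java.util.Objects;

// Hypothetical helper: states "leave as is" explicitly instead of passing null.
final class Change<T> {
    private final boolean present;
    private final T value;

    private Change(boolean present, T value) {
        this.present = present;
        this.value = value;
    }

    // No change requested.
    static <T> Change<T> unchanged() {
        return new Change<>(false, null);
    }

    // Change to a new, non-null value.
    static <T> Change<T> to(T value) {
        return new Change<>(true, Objects.requireNonNull(value));
    }

    // Returns the new value if a change was requested, else the current one.
    T applyTo(T current) {
        return present ? value : current;
    }
}
```

A call such as Change.to("new-name").applyTo(currentName) reads the same at every step of a long execution path, whereas a bare null has to be interpreted again by every method it passes through.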