This document provides an overview of different software engineering approaches for bioinformatics projects, including Waterfall, Agile, DevOps, eXtreme Programming, and tools like Git, GitHub, Jenkins, Docker, and Fabric. Waterfall is not well-suited for bioinformatics due to lack of flexibility, while Agile and DevOps methodologies allow for iterative development and integration of code changes. eXtreme Programming focuses on code maintainability through practices like test-driven development and continuous integration. Version control with Git/GitHub and configuration tools like Docker and Fabric can help manage reproducibility and infrastructure as code.
2. Progress of software development practice
Waterfall -> Agile -> DevOps
There is no silver bullet in software engineering, and each approach has its own advantages.
In bioinformatics, however, there is very little chance of adopting Waterfall.
6. Waterfall-style development
• A very old and popular style of product development
• Do not go back to a previous stage
• What we really want to make has to be clear up front
• Suited to large-scale development
“A few A-class architects and a mass of C-class programmers”
Ref: http://fireside.gamejolt.com/post/the-game-creation-process-part-2-designing-the-idea-viq5rk2t
8. Waterfall (spiral)
• Iteration of Waterfall
Advantage
• Tasks become clearer with each iteration
Disadvantage
• Time consuming
• Hard to determine how much to elaborate in the first iteration
ref: http://www.qmetry.com/spiral.html
10. Agile
• The antithesis of Waterfall
• Not a technique, but a philosophy
• One iteration lasts 1 to 4 weeks, with one feature per iteration.
Ref: https://www.linkedin.com/pulse/essential-resources-services-technologies-your-startup-jason-oh
11. Agile
Advantage
• Easy to adapt to changes
• Makes clear where we are and where we want to go
Disadvantage
• Refactoring becomes necessary -> CI (covered later)
• Communication cost -> works for no more than about 20 people
12. Difference between Agile and spiral
• Spiral: builds every feature in each iteration
• Agile: implements only one feature in each iteration
14. Scrum
One way of implementing Agile
Focus on communication among developers
• Make a list of the features we want to implement and update it constantly
• Each iteration is 30 days, and the software has to be deployable at the end
• 15-minute stand-up meeting every day
• No partitioning
16. eXtreme Programming (XP)
One way of implementing Agile
Focus on maintainability of code
• Test Driven Development (TDD)
• Pair Programming
• Joint ownership of code
• Continuous Integration (CI)
• Issue Tracking
18. Two purposes of software testing
• Tests for users
• Tests for developers
Agile focuses on running the tests every time we make a change to the source code.
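As a minimal sketch of a developer-facing test, the Python example below uses the standard `unittest` module. The `gc_content` function and its test cases are hypothetical and chosen only for illustration. In TDD the test class would be written first, and the function would then be filled in until the tests pass.

```python
import unittest


def gc_content(sequence):
    """Return the fraction of G and C bases in a DNA sequence (hypothetical example)."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    upper = sequence.upper()
    gc_count = upper.count("G") + upper.count("C")
    return gc_count / len(upper)


class GcContentTest(unittest.TestCase):
    # In TDD these tests are written before gc_content exists.
    def test_half_gc(self):
        self.assertAlmostEqual(gc_content("ATGC"), 0.5)

    def test_lowercase_input(self):
        self.assertAlmostEqual(gc_content("ggcc"), 1.0)

    def test_empty_sequence_is_rejected(self):
        with self.assertRaises(ValueError):
            gc_content("")
```

Saved in a file such as `test_gc.py` (an assumed name), the tests can be run after every change with `python -m unittest test_gc`.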
20. Git
• Distributed Version Control System (DVCS)
• Able to share the history of changes
• Cut a branch for every single feature or subproject
Ref: http://gotgroove.com/ecommerce-blog/guide-to-version-control-for-magento-using-git-and-beanstalk/
Mercurial (a simpler DVCS for Pythonistas) could be enough for some bioinformaticians, though.
21. Workflow using Git (≒ how to branch)
There are several branching practices, but the following are the principal rules
• 1 feature, 1 branch
• Master always has to be deployable
Ref: https://www.atlassian.com/ja/git/workflows#!workflow-gitflow
22. GitHub
• Hosting service for Git
• Filing an issue for every topic makes the project trackable
Coding -> Pull Request -> Review -> Merge
Following this flow makes the source code less dependent on any particular person
23. Workflow using Git & GitHub
Fork & clone -> Work in local repository -> Push -> Pull Request -> Code Review -> Merge
Ref: http://acrl.ala.org/techconnect/post/coding-collaboration-on-github
24. Workflow using Git & GitHub
Fork & clone -> Work in local repository -> Push -> Pull Request -> Code Review -> Merge
• Ticketing -> Issue Tracking
• Build test -> CI
Ref: http://acrl.ala.org/techconnect/post/coding-collaboration-on-github
26. Continuous Integration (CI)
• Run automated tests constantly
• Makes it easy to track down a problem
Jenkins: the CI tool
Ref: http://www.slideshare.net/whyme/jenkins-reviewbot
27. GitHub and CI tools
Run the tests on every push to the remote
A common combination is
GitHub + [Travis CI or Jenkins]
Ref: https://github.com/hltfbk/Excitement-Open-Platform/wiki/Developers
29. Practice for issue tracking
• The rough schedule is tracked with a Gantt chart or a burn down chart
Ref: https://en.wikipedia.org/wiki/Gantt_chart
Ref: http://chandoo.org/wp/2009/07/21/burn-down-charts/
• The more precise schedule is managed with tickets or issues (Redmine, GitHub + ZenHub)
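A burn down chart compares the work remaining after each day with an ideal straight line from the total down to zero. The sketch below computes both series in Python. The function names, the 5-day iteration, and the point values are made up for illustration.

```python
def ideal_burn_down(total_points, num_days):
    """Remaining work per day if work were completed at a constant rate."""
    return [total_points * (num_days - day) / num_days for day in range(num_days + 1)]


def actual_burn_down(total_points, completed_per_day):
    """Remaining work after each day, given the points completed on each day."""
    remaining = [total_points]
    for done in completed_per_day:
        remaining.append(remaining[-1] - done)
    return remaining


# Hypothetical 5-day iteration with 20 points of tickets.
ideal = ideal_burn_down(20, 5)                  # [20.0, 16.0, 12.0, 8.0, 4.0, 0.0]
actual = actual_burn_down(20, [2, 5, 3, 6, 4])  # [20, 18, 13, 10, 4, 0]
```

Days where the actual value sits above the ideal value indicate that the iteration is behind schedule.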
30. Ticket Driven Development (TiDD)
• Manage tasks centrally as tickets
• Make small tasks clear and trackable
Ref: http://itpro.nikkeibp.co.jp/article/COLUMN/20130927/507265/?SS=imgview&FD=55983188&ST=devops
Is a commonly used tool
32. DevOps
• Extends “Agile” from development to operations
That is:
• Reflect changes in the working system instantly when we update the code
Not only developing software, but developing a whole system.
33. Technologies for DevOps
• Virtualization using containers
• Configuration management tools (Fabric)
Ref: http://blog.xebialabs.com/2014/12/05/rocket-vs-docker-myth-simple-lightweight-enterprise-platform/
35. Typical situation in bioinformatics
• Small daily analysis on a laptop
• Realize the need for more computation power
• Move the pipeline to a high-performance server
• Able to use the cloud? Use CloudBioLinux or another VM image (from bioimg.org)
_人人人人人人人人人人_
> dependency hell<
 ̄Y^Y^Y^Y^Y^Y^Y^Y^Y^Y ̄
Software (or package) version differences
_人人人人人人人人人人_
> No Reproducibility<
 ̄Y^Y^Y^Y^Y^Y^Y^Y^Y^Y ̄
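One small step toward reproducibility is to record which versions were actually used for an analysis run. The sketch below uses only the Python standard library. The `collect_versions` helper and the package names passed to it are placeholders, not a prescribed tool.

```python
import json
import platform
import sys
from importlib import metadata


def collect_versions(package_names):
    """Record interpreter, OS, and package versions for an analysis run."""
    packages = {}
    for name in package_names:
        try:
            packages[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            packages[name] = None  # not installed in this environment
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": packages,
    }


# The package names below are placeholders; list whatever the pipeline imports.
manifest = collect_versions(["numpy", "biopython"])
print(json.dumps(manifest, indent=2, sort_keys=True))
```

Saving this output next to the results makes version differences between the laptop and the server visible, although it does not remove them.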
36. Container virtualization (Docker)
• Bundle all third-party software into one container
• Build once, run anywhere
• Version-controllable, with a GitHub-like hosting service
Easy to transport between servers
Develop the whole container as “software”
37. Progress of virtualization
chroot, cgroups: isolation of file and process space
• (chroot carries) the danger that one user depletes the computation resources
KVM, VirtualBox: OS virtualization
• Heavy
• Not easy to provision
• Hard to use base images
Docker tries to take advantage of both
38. Emergence of counterforces
• Security problem: becoming root is a must
• Dockerfile problem: some bugs around caching? Peculiar way of writing -> better to use Packer
• Portability problem: better to run on Linux (kernel version >= 3.8)
Cloudius OSv
Problems of Docker
Not user-friendly enough so far
Not enough community resources such as base images
Not mature enough to use
40. Infrastructure as code
• Maintain server configuration as code
• Ensure idempotency
• Easily transport pipelines between servers
Chef Zero (Ruby-based): complex
Fabric (Python-based): simple
• Chef requires users to remember fancy jargon
• CloudBioLinux supports Fabric
Better to start from Fabric
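Idempotent here means that applying the same configuration step twice leaves the server in the same state as applying it once. The sketch below illustrates the idea in plain Python with the standard library only. It is not Fabric's or Chef's API, and the `ensure_line` helper, the file name, and the path in the example line are hypothetical.

```python
import os
import tempfile


def ensure_line(path, line):
    """Append line to the file at path only if it is not already present.

    Returns True when the file was changed and False when it was already
    in the desired state, so repeated runs are safe.
    """
    text = ""
    if os.path.exists(path):
        with open(path) as handle:
            text = handle.read()
    if line in text.splitlines():
        return False
    # Start on a fresh line if the existing file lacks a trailing newline.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    with open(path, "a") as handle:
        handle.write(separator + line + "\n")
    return True


# Demonstration in a temporary directory (the file name is a placeholder).
with tempfile.TemporaryDirectory() as workdir:
    profile = os.path.join(workdir, "profile")
    setting = "export PATH=$PATH:/opt/pipeline/bin"
    first = ensure_line(profile, setting)   # changes the file
    second = ensure_line(profile, setting)  # already in place, no change
    with open(profile) as handle:
        content = handle.read()
```

The same check-then-change pattern is what keeps a provisioning task safe to re-run when a pipeline is moved to a new server.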