3 challenges for open source bioinformatics projects
1. Community Integration Democratization
Biopython: challenges
Brad Chapman
Peter Cock
Biopython contributors
http://biopython.org
10 July 2010
2.
3 challenges for successful open source
projects
Community
Integration
Democratization
3.
Distributed code access
4.
Recruiting and training
Google Summer of Code
2009 Eric Talevich
phyloXML; Bio.Phylo
Nick Matzke
Biogeographical Phylogenetics
2010 João Rodrigues
Structural biology; Bio.PDB
5.
Answering questions better
6.
Recognizing contributions
7.
Diversity of Python bioinformatics
8.
Interoperability
Avoid re-implementation
Convert core objects
Document workflows with multiple
libraries
Communicate better
10.
Documenting standards
11.
Making code easier to use
>>> from Bio import SeqIO
>>> # Index the file by record ID; records are parsed only when requested
>>> memory_dict = SeqIO.index("in.gb", "genbank")
>>> memory_dict.keys()
['Z78484.1', ..., 'Z78471.1']
>>> seq_record = memory_dict["Z78475.1"]
>>> print(seq_record.description)
P.supardii 5.8S rRNA gene and ITS1 and ITS2 DNA
>>> seq_record.seq
Seq('CGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGG...GGT',
IUPACAmbiguousDNA())
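The idea behind a dictionary-like index over a sequence file can be sketched in plain Python. This is only an illustration of the general technique (record where each entry starts, then seek to it on demand) and is not Biopython's implementation. The helper names build_fasta_index, fetch_sequence and demo, the use of FASTA rather than GenBank, and the toy records are all assumptions made for the example.

```python
import os
import tempfile


def build_fasta_index(path):
    """Map each record ID to the byte offset of its '>' header line."""
    offsets = {}
    with open(path, "rb") as handle:
        while True:
            position = handle.tell()
            line = handle.readline()
            if not line:
                break
            if line.startswith(b">"):
                # The ID is the first whitespace-delimited token after '>'
                record_id = line[1:].split()[0].decode("ascii")
                offsets[record_id] = position
    return offsets


def fetch_sequence(path, offsets, record_id):
    """Seek to one record and read only its sequence lines."""
    parts = []
    with open(path, "rb") as handle:
        handle.seek(offsets[record_id])
        handle.readline()  # skip the header line
        for line in handle:
            if line.startswith(b">"):
                break  # reached the next record
            parts.append(line.strip().decode("ascii"))
    return "".join(parts)


def demo():
    """Write a tiny made-up FASTA file, index it, and fetch one record."""
    with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as tmp:
        tmp.write(">seq1 first toy record\nACGT\nACGT\n")
        tmp.write(">seq2 second toy record\nGGCC\n")
        path = tmp.name
    try:
        offsets = build_fasta_index(path)
        return sorted(offsets), fetch_sequence(path, offsets, "seq2")
    finally:
        os.remove(path)
```

Only the ID-to-offset mapping is held in memory, so the cost of the index grows with the number of records rather than with the size of the file.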
12.
Challenges of big data
13.
Cloud: easier to distribute
On-demand computational resources like
Amazon EC2
Provide ready-to-go images
Biopython and many associated
bioinformatics libraries
Biological data
http://github.com/chapmanb/bcbb/tree/master/ec2/biolinux/
14.
Following up
Home http://biopython.org
Code http://github.com/biopython
BOSC Talk to Eric, Tiago or me