1. An Open Source Framework for Teaching Bioinformatics. Kam D. Dahlquist, Department of Biology; John David N. Dionisio, Department of Electrical Engineering & Computer Science; Loyola Marymount University. BOSC, Vienna, Austria, July 20, 2007.
3. Scientific Computing and the Digital Divide
   Scientists who come to computer science after being trained in a different primary discipline often have to rediscover, relearn, or keep up with work in computer science and software development in order to get the most out of their work.
   This causes unnecessary and unknowing repetition of past discoveries and errors.
   Tools or paradigms that are out of date in computer science and software engineering remain in place.
   At worst, software flaws slow or impede research.
   Wilson GV (2006) Where’s the real bottleneck in scientific computing? American Scientist 94:5–6.
   Baxter SM, Day SW, Fetrow JS, and Reisinger SJ (2006) Scientific software development is not an oxymoron. PLoS Computational Biology 2:e87.
4. The Disconnect Between Undergraduate Computer Science Training and the Expectations and Skill Sets Required for Industry and Research
   Undergraduate Training | Industry Expectation
   Work alone | Work in a team
   “Toy” programs and algorithms | Large, modular project
   Throwaway code | Code longevity (for better or worse)
5. inroads – The SIGCSE Bulletin, Volume 39, Number 2, 2007 June, pp. 70-74 http://recourse.cs.lmu.edu/
6. Official Open Source Definition (version 1.9)
   Free redistribution
   Source code
   Derived works
   Integrity of the author’s source code
   No discrimination against persons or groups
   No discrimination against fields of endeavor
   Distribution of license
   License must not be specific to a product
   License must not restrict other software
   License must be technology-neutral
23. LMU Bioinformatics Group (http://xmlpipedb.cs.lmu.edu)
   XSD-to-DB: Adam Carasso, Jeffrey Nicholas, Scott Spicer
   XMLPipeDBUtils: David Hoffman, Babak Naffas, Jeffrey Nicholas, Ryan Nakamoto
   UniProtDB: Joe Boyle, Joey Barrett
   GODB: Scott Spicer, Roberto Ruiz
   GenMAPP Builder: Joey Barrett, Jeffrey Nicholas, Scott Spicer
   Special Thanks: GenMAPP.org Development Group; Caskey L. Dickson; Wesley T. Citti; NSF CCLI Program (http://recourse.cs.lmu.edu)
   Kam D. Dahlquist, http://myweb.lmu.edu/kdahqui, [email_address]
   John David N. Dionisio, http://myweb.lmu.edu/dondi, [email_address]
Editor's Notes
Both of us are relatively new to LMU.
Dondi’s background is in medical informatics, data visualization, and person-computer interaction.
During my postdoc I served as project manager for GenMAPP; I want to extend GenMAPP's features, especially for other species.
I am not a software developer (the last computer science class I took was AP Pascal in high school), but I have had a lot of experience interacting with developers.
I’m proud of GenMAPP, especially that it is user-friendly for biologists and relatively bug-free (a result of my extensive testing).
However, I never would have stood up in this community to talk about it: although we believed strongly that GenMAPP should be free of charge, we were slow to make the source code available (it is now available on SourceForge).
It is only through my collaboration with Dondi that I have been educated as to what Open Source software development truly means (The Cathedral and the Bazaar).
This is the perfect forum for talking about our work: while I am using the fruits of XMLPipeDB for GenMAPP as first imagined, we designed the project so that its components are reusable for other purposes, and the bioinformatics developer community is our target audience.
Today's computer science master’s students were trained the old way.
What about math students and biology students coming to bioinformatics later?
American Scientist article on scientific computing.
Replicability.
Code longevity: you are stuck with the code that you have even if it’s not perfect; orphaned code.
No ego in code.
Unit testing + anyone can make any change anytime.
Student code + tests: run all student tests against all student code.
Does it build, compile, and run? A student can submit multiple times until the code passes these basic tests.
If pieces are already out there, we won’t rewrite them; we will reuse them.
Versioning code and uploading:
Concurrent Versions System (CVS) is version control software.
Subversion takes the good parts of CVS.
Perforce is not open source.
SourceForge uses CVS and Subversion.
SourceForge went closed source.
Originally we were going to use the open version of SourceForge, GForge, but now we will let students use SourceForge directly or the CS department server.
We mostly used the Eclipse development environment, but let the students choose their own development environment.
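The cross-testing scheme in the notes above (every student's tests run against every student's code) can be sketched roughly as follows. This is a minimal illustration in Python, not the course's actual infrastructure; the `Submission` type, the `cross_test` function, the student names, and the toy addition function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical types: an implementation is the function under test, and a
# test is a callable that returns True when the implementation passes.
Implementation = Callable[[int, int], int]
Test = Callable[[Implementation], bool]


@dataclass
class Submission:
    """One student's submission: their code plus the tests they wrote."""
    student: str
    implementation: Implementation
    tests: List[Test]


def cross_test(submissions: List[Submission]) -> Dict[Tuple[str, str], bool]:
    """Run every student's tests against every student's implementation.

    Returns a mapping from (test author, code author) to whether all of
    the test author's tests passed against that code.
    """
    results: Dict[Tuple[str, str], bool] = {}
    for tester in submissions:
        for coder in submissions:
            passed = True
            for test in tester.tests:
                try:
                    if not test(coder.implementation):
                        passed = False
                except Exception:
                    # A crash in the code under test counts as a failure.
                    passed = False
            results[(tester.student, coder.student)] = passed
    return results


# Toy example: two students implement addition; one has a bug.
alice = Submission(
    student="alice",
    implementation=lambda a, b: a + b,
    tests=[lambda add: add(2, 3) == 5],
)
bob = Submission(
    student="bob",
    implementation=lambda a, b: a - b,   # deliberate bug
    tests=[lambda add: add(0, 0) == 0],  # weak test that misses the bug
)

results = cross_test([alice, bob])
```

In this toy run, Alice's test exposes Bob's bug even though Bob's own test does not, which is the benefit of pooling everyone's tests.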
Archetypes:
Promoter: worst-case scenario of undergrad throwaway code, etc.
Quiet guys: didn’t make too much impact.
Just enough to get by (wanted to get reimbursed).
Mature, professional, management skills.
Categories: promoter/supporter/analyzer/supporter.
Knowledge and skill: some with more experience; all working at other “day jobs”.
Because of the range of skill levels and personality styles: “know your learners”.
This paradigm of collaborative work and independent motivation was still unfamiliar to some students (aerospace industry).
We were developing our vision of the project at the same time (nebulous phase) and exposed them to this process.
Ownership/responsibilities: not the place for heroics; don’t go to the cave and hide, people are waiting on you.
We are young, we didn’t have control over the people who joined, and we can self-critique our own code, BUT we still have a product! Even our weakest link contributed code that didn’t have to be rewritten.
Grading mentality.