Bah! Humbug! The embedded movies do not work. Gak.
This slide show was NOT presented during the FOAM meeting as the PC was being used to futz with the new Cloudman instance so I could use it for the demo.
2. Context: bioinformatic analyses
Big data; complex analyses
Repeatable, automated pipelines
Reproducibility real goal
Reproducibility is hard
3. Frameworks
Eg VGL
Local SOPs for biologists
Tools, canned workflows
Minimise opportunities for error
Maximise reproducibility
4. In real life
90/10 rule
Need to tweak SOPs
Trivial 'disposable' scripts
Not documented or curated
Not reliably available to re-run
“Dark script matter”
5. Dark Script Matter
Outside usual VCS/pipelines
Manual =/= reproducible
Necessary evil?
Platform extensions complex
Eg Galaxy – hours of work
6. Plan
Context: Reproducible analyses
Frameworks vs Dark Scripts
Alchemy: script to Galaxy tool
Demonstration
Summary
Conclusions
7. Galaxy Tool Factory
An installable Galaxy tool
Runs scripts: Python, R, Perl, sh
Generates new Galaxy tools
Tool code wraps the script
Minutes – not hours
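As a rough illustration of "tool code wraps the script", here is a minimal sketch in Python. It is not the Tool Factory's actual wrapper: the function name `run_wrapped_script` and the `interpreter script infile outfile` calling convention are assumptions made for this example.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_wrapped_script(interpreter, script_text, infile, outfile):
    """Save the pasted script to a temporary file and run it as
    `interpreter script infile outfile` (an assumed calling convention).
    Returns the exit code and anything the script wrote to stderr."""
    with tempfile.TemporaryDirectory() as workdir:
        script_path = Path(workdir) / "user_script"
        script_path.write_text(script_text)
        result = subprocess.run(
            [interpreter, str(script_path), str(infile), str(outfile)],
            capture_output=True,
            text=True,
        )
    return result.returncode, result.stderr
```

A non-zero exit code or text on stderr is what a wrapper like this would report back to the user for the rerun/fix cycle.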
8. Galaxy Tool Shed
Separate server
Stores/serves Galaxy tools
Admin can install to Galaxy
Mercurial VCS archives
Explicit tool versioning
Sharing and reproducibility
16. Use Redo button; Generate
When working right
Use Redo to save retyping
Select Generate option
Provide tool ID, help text
Execute
Expect a toolfactory.gz in history
Copy link (floppy disk icon)
17. What's in the toolshed.gz?
A gzip'd Mercurial tool repository (!)
Auto generated tool XML file
Auto generated tool Python wrapper
Functional test case - the sample data
Familiar Galaxy tool for all users
Executes your script over their data
Interoperably inside Galaxy
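To peek inside the generated archive before uploading it, something like the following may help. It is a minimal sketch that assumes the download is a gzip-compressed tar archive (the slides only say "gzip'd"); the function name `list_archive` is illustrative.

```python
import tarfile


def list_archive(path):
    """Return the sorted member names of a gzip-compressed tar archive.
    Assumes the file is a tar.gz; raises tarfile.ReadError otherwise."""
    with tarfile.open(path, "r:gz") as archive:
        return sorted(archive.getnames())
```

If the assumption holds, the listing should show the tool XML, the wrapper and the sample test data described above.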
18. Upload TS gzip to new repository
Upload to any tool shed
Create new repo; sensible name!
Choose Upload files to new repo
Paste URL (floppy disk save icon)
New tool ready to install
19. Install and Test New Tool
Back to Galaxy admin interface
Browse local tool shed
Choose new tool
Install to local Galaxy
Try it out
Run functional test
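The idea behind a functional test (run the tool on the sample input, compare with the saved sample output) can be sketched as below. This is not Galaxy's own test runner; `functional_test` and its arguments are hypothetical names used only for illustration.

```python
import filecmp
import subprocess


def functional_test(command, expected_output, produced_output):
    """Run `command` (a list of arguments) and report whether the file it
    produced is byte-for-byte identical to the expected output file."""
    completed = subprocess.run(command)
    if completed.returncode != 0:
        return False
    # shallow=False forces a content comparison rather than a stat() check.
    return filecmp.cmp(produced_output, expected_output, shallow=False)
```

An exact byte comparison is the strictest possible check; tools with timestamps or random elements in their output would need a looser one.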
20. Summary
GTF = script to tool in minutes
Integrated with Galaxy and TS
Simple workflow components
If needed, generate simple tool
Then add parameters manually
21. Tool Factory Operation Guide
Script (Python, R, Perl, sh): upload/paste
Sample input for functional test
Galaxy Tool Factory tool form: paste script; test run; check outputs; rerun/fix; generate TS gzip; copy download link for Tool Shed pasting
Create new repository; upload files: paste TS gzip link and upload
Install new tool from toolshed from Galaxy admin page; test; functional test
23. Galaxy Tool Factory
Generate a new Galaxy tool
From a Python, R, Perl or bash script
Using a Galaxy tool
Via a Tool Shed
# transpose a tabular input file and write as a tabular output file
ourargs = commandArgs(T)
inf = ourargs[1]
outf = ourargs[2]
inp = read.table(inf,head=F,row.names=NULL,sep='\t')
outp = t(inp)
write.table(outp,outf,quote=F, sep="\t",row.names=F,col.names=F)
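For comparison, the same transpose could be written in Python, another of the script languages the Tool Factory accepts. This is a minimal sketch, assuming the same convention as the R script (input path as first argument, output path as second); `transpose_tabular` is a name chosen for this example.

```python
import sys


def transpose_tabular(in_path, out_path):
    """Read a tab-separated file, swap rows and columns, and write the result."""
    with open(in_path) as handle:
        rows = [line.rstrip("\n").split("\t") for line in handle]
    with open(out_path, "w") as out:
        # zip(*rows) groups the i-th field of every row, i.e. the i-th column.
        # Ragged input is truncated to the shortest row.
        for column in zip(*rows):
            out.write("\t".join(column) + "\n")


if __name__ == "__main__" and len(sys.argv) == 3:
    transpose_tabular(sys.argv[1], sys.argv[2])
```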
24. Tool Factory Operation Guide
Script (R, Perl, Python): upload/paste
Sample input for functional test
Galaxy Tool Factory tool form: paste script; test run; check outputs; rerun/fix; generate TS gzip; copy download link for Tool Shed pasting
Create new repository; upload files: paste TS gzip link and upload
Install new tool from toolshed from Galaxy admin page; test; functional test