My presentation at the 5th Sequencing Finishing and Analysis in the Future meeting (SFAF -- http://www.lanl.gov/conferences/finishfuture/2010SFAF_Meeting_Guide.pdf), June 3, 2010
12. Swiss Army Knife of NGS Analysis SDK Intuitive GUI Traditional Bioinformatics Visualization Desktop Solutions Enterprise Solutions High Performance File Format Conversion Tools Integration Epigenomics Transcriptomics Genomics RNA-Seq miRNA Read Mapping De Novo Assembly SNP/DIP Detection ChIP-Seq
13. Why not use free tools? Are tools free or “free”? Tools vs solutions True cost of ownership Ease of Use Tools integration Support
14. Small RNA Analysis (in Beta soon) Identify and filter/trim adapters Annotate using miRBase and other resources - target species of interest Merge/group by mature, precursor/reference Fully integrated with expression analysis
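The adapter filter/trim step on this slide can be illustrated with a minimal Python sketch. It is not the CLC implementation: the function name `trim_3prime_adapter`, the `min_overlap` parameter, and the adapter string are placeholders chosen for illustration, and only exact matches are handled (no mismatches or quality scores).

```python
def trim_3prime_adapter(read, adapter, min_overlap=4):
    """Remove a 3' adapter from a read using exact matching only.

    Hypothetical helper for illustration; real trimmers also tolerate
    mismatches and use base qualities.
    """
    # Case 1: the whole adapter is present somewhere in the read.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    # Case 2: the read ends partway through the adapter, so look for the
    # longest adapter prefix (at least min_overlap bases) at the read's end.
    longest = min(len(adapter), len(read)) - 1
    for k in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    # No adapter found: return the read unchanged.
    return read


ADAPTER = "TGGAATTCTCGG"              # placeholder adapter sequence (assumption)
INSERT = "TAGCTTATCAGACTGATGTTGA"     # example small RNA insert

print(trim_3prime_adapter(INSERT + ADAPTER + "GTGC", ADAPTER))
print(trim_3prime_adapter(INSERT + ADAPTER[:6], ADAPTER))
```

Both calls print the 22-base insert. Reads that come back unchanged, or that are very short after trimming, would typically be filtered before annotation.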
15. De Novo Assembler Human assembly of 38x Illumina paired-end CLC Quality equivalent to ABySS CLC: 7 hrs, 1 node, 42 GB of RAM ABySS: 80 hrs, 21 nodes, 336 GB of RAM Metagenomics Assembly MetaHIT Dataset MH0041 40M 75bp paired end 3 hrs on desktop, 6 GB RAM Higher N50 and Total Contig Size than Reported
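For readers unfamiliar with the two assembly metrics named on this slide, here is a minimal Python sketch of how N50 and total contig size are usually computed from a list of contig lengths. The function name `n50` and the example lengths are illustrative only and are not taken from the talk.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list).

    N50 is the length of the contig at which the running total, taken
    from the longest contig downward, first reaches half of the total
    assembly size.
    """
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if 2 * running >= total:
            return length
    return 0


example = [100, 80, 60, 40, 20]   # made-up contig lengths
print("total contig size:", sum(example))
print("N50:", n50(example))
```

With these made-up lengths the total is 300 and the N50 is 80, because the two longest contigs together (180) are the first to cover at least half of the assembly.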
16. Viral Sequencing at JCVI (See Nadia Fedorova’s Poster!) Amplify and Barcode using SISPA, 454 + Illumina Sequencing Depth of coverage sometimes >1000x De novo Assembly of Consensus for all Segments For each segment: Map reads from each technology independently using best full length reference from NCBI, call variations Update reference with variations confirmed by multiple technologies Map reads using updated reference and all reads Convert to consed, analyze, order Sanger closure reactions Source: Jessica Hostetler, Nadia Fedorova, Tim Stockwell, Danny Katzel
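The "update reference with variations confirmed by multiple technologies" step can be sketched in Python as follows. This is an illustration under stated assumptions, not the JCVI pipeline: variants are represented as hypothetical `(0-based position, ref base, alt base)` tuples, only single-base substitutions are handled, and the function names are invented for the example.

```python
from collections import Counter


def confirmed_variants(calls_by_tech, min_techs=2):
    """Keep variants called independently by at least `min_techs` technologies."""
    counts = Counter()
    for calls in calls_by_tech.values():
        counts.update(set(calls))   # count each variant once per technology
    return sorted(v for v, n in counts.items() if n >= min_techs)


def apply_substitutions(reference, variants):
    """Apply single-base substitutions given as (0-based position, ref, alt)."""
    seq = list(reference)
    for pos, ref, alt in variants:
        if seq[pos] != ref:
            raise ValueError(f"reference mismatch at position {pos}")
        seq[pos] = alt
    return "".join(seq)


# Made-up calls from mapping each technology's reads independently.
calls = {
    "454": [(2, "G", "A"), (5, "T", "C")],
    "illumina": [(2, "G", "A"), (7, "A", "G")],
}
reference = "ACGTATGA"

confirmed = confirmed_variants(calls)
updated = apply_substitutions(reference, confirmed)
print(confirmed, updated)
```

Only the substitution at position 2 is seen by both technologies, so it is the only one applied. All reads would then be remapped against the updated reference, as the slide describes.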
17. Why CLC bio Tools? CLC handled hybrid sequencing technologies directly Very biased coverage confounded other assemblers that expect random arrival stats. CLC didn’t seem to suffer from biased coverage. Very accurate SNP calls in areas of deep coverage. Tim Stockwell Director of Viral Informatics J. Craig Venter Institute
18. Targeted Resequencing QC Assessment of targeted sequencing technology Coverage Statistics for Targeted Regions Very short schedule, limited bioinformatics staff Plug-in development leveraging CLC tools to automate the process and meet short deadline QC Report now available as plug-in
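As a rough illustration of the kind of coverage statistics a targeted-region QC report contains, here is a minimal Python sketch. It is not the CLC plug-in: the function name `region_coverage`, the half-open `(start, end)` interval convention, and the reported fields are assumptions made for the example.

```python
def region_coverage(region, reads, min_depth=1):
    """Summarize read depth over one target region.

    `region` and each read are half-open (start, end) intervals on the
    same reference sequence (an assumed convention for this sketch).
    """
    start, end = region
    if end <= start:
        raise ValueError("region must have positive length")
    depth = [0] * (end - start)
    for r_start, r_end in reads:
        # Count only the part of the read that overlaps the region.
        for pos in range(max(r_start, start), min(r_end, end)):
            depth[pos - start] += 1
    n = len(depth)
    covered = sum(1 for d in depth if d >= min_depth)
    return {
        "mean_depth": sum(depth) / n,
        "min_depth": min(depth),
        "fraction_at_min_depth": covered / n,
    }


# Made-up mapped read intervals around a 10-base target.
reads = [(95, 105), (100, 110), (108, 120)]
stats = region_coverage((100, 110), reads, min_depth=2)
print(stats)
```

A per-base list is fine for small targets. For whole-exome panels a sweep over sorted interval endpoints would avoid the inner loop.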
19. Professional Services Developing customized solutions Integration with LIMS, workflows, DB Bioinformatics Algorithm Development Cloud and Grid Integration Data Analysis
20. Questions? Saul A. Kravitz, PhD skravitz@clcbio.com (301)355-0813 Thank you for listening
Editor's Notes
GUI-driven tools and workflows
- miRNA workflow leveraging miRBase and other resources
Very fast, small memory footprint
SISPA = Sequence Independent Single Primer Amplification (if that needs spelling out) – amplifies and barcodes DNA molecules. Also, if people are interested, can mention availability of Danny Katzel’s cas2consed software.
Customization: Java plug-in architecture for Server and Workbench; optimized “Cell” command line tools for efficient HPC; wizard-based integration of customer tools; Server integration via SOAP and command line