The document describes ISMU, a pipeline for NGS data analysis and facilitating molecular breeding. ISMU version 1 focuses on SNP discovery between genotypes through mapping, assembly, and visualization. Version 2 applies identified SNPs to breeding through assay design, genotype calling, and analysis. The pipeline benchmarks open source programs, performs assembly and polymorphism detection between genotypes, and identifies parental lines for molecular breeding applications. It provides user-friendly interfaces for uploading data and visualizing results. Future plans include updating tools, extending pipeline capabilities, and linking with other databases and analysis systems.
NGS data analysis and molecular breeding pipeline
1. ISMU pipeline for NGS data analysis and facilitating molecular breeding
http://hpc.icrisat.cgiar.org/NGS/
2. Challenges
• Short read length of sequences
• Availability of many tools
• Platform dependency and command line driven
• No direct ways for prediction of SNPs between genotypes
• Quality scores vary depending on version and technology
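The point about quality scores can be made concrete with a minimal sketch. It assumes the two common encodings: Sanger/standard FASTQ stores Phred scores at ASCII offset 33, while early Solexa FASTQ stores Solexa-scaled scores at offset 64. The function names are hypothetical and the slides do not say how ISMU itself normalises qualities.

```python
import math


def sanger_char_to_phred(ch: str) -> int:
    """Decode one Sanger/standard FASTQ quality character (Phred, ASCII offset 33)."""
    return ord(ch) - 33


def solexa_char_to_score(ch: str) -> int:
    """Decode one early-Solexa FASTQ quality character (Solexa scale, ASCII offset 64)."""
    return ord(ch) - 64


def solexa_to_phred(q_solexa: float) -> float:
    """Convert a Solexa-scaled score to the Phred scale.

    Phred is -10*log10(p) and Solexa is -10*log10(p / (1 - p)),
    which gives Q_phred = 10*log10(10**(Q_solexa / 10) + 1).
    """
    return 10 * math.log10(10 ** (q_solexa / 10) + 1)


def decode_sanger(quality_string: str) -> list:
    """Decode a whole Sanger-encoded quality string into Phred scores."""
    return [sanger_char_to_phred(ch) for ch in quality_string]
```

The two scales nearly coincide for high-quality bases and diverge for low-quality ones, so reads from different platform versions should be decoded with the matching offset before any quality threshold is applied.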
3. ISMU version 1
• SNP discovery from NGS data
– Pipeline for mapping / assembling
– Calling SNPs between genotypes
– Visualisation
5. Objectives of NGS Pipeline
• Benchmark available open source short-read assembly and downstream analysis programs/software.
• Assembly and polymorphism detection between genotypes, and visualization
• Assay design (Illumina GoldenGate Assay), genotype calling, and visualization and analysis of SNP genotyping and haplotype data
• Identify and use parental lines in MABC or MARS
• Discovery of SNP markers for use in foreground and background selection of MABC or MARS.
• Documentation of the pipeline and the integrated software.
6. Control Flowchart
[Flowchart; box labels as extracted from the slide]
ICRISAT crops (Yes / No); Input data & validation; Upload reference & data; Mapping (Maq, Novo); Mapped reads; Assembly; Visualization; Consensus calling; Report SNPs
• Extract sequences with SNPs
• Design primers
• In silico validation by SNP2CAPS
Database; ADT Score; G.G Assay; Bead Studio; Flapjack
7. [Flowchart; labels as extracted from the slide]
Inputs: Genotype 1, Genotype 2
Programme: Maq, Novo
Standard Methodology: Mapping; Assembly; SNP calling against reference; ADT scoring; Reporting
SNP between genotypes: Remove duplicates; Check the inverse combination; Compare allele between genotypes; Base calling in 2nd genotype; Predicted SNPs against reference
Example record:
Chrom1  Pos  RefAllele  Gtyp1  Gtyp2
5       303  A          G      ?
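The labels on this slide suggest a comparison that starts from SNPs predicted against the reference in one genotype, looks up the base call in the second genotype, compares the alleles, repeats with the genotypes swapped, and removes duplicates. The following is a minimal sketch under that reading; the ordering of the steps, the data layout, and the name `snps_between_genotypes` are assumptions, not the pipeline's actual implementation.

```python
def snps_between_genotypes(ref_snps_1, ref_snps_2, calls_1, calls_2):
    """Report sites where two genotypes carry different alleles.

    ref_snps_N: {(chrom, pos): (ref_allele, alt_allele)} SNPs of genotype N
                predicted against the reference.
    calls_N:    {(chrom, pos): base} consensus base calls of genotype N.
    Returns sorted tuples (chrom, pos, ref_allele, allele_1, allele_2).
    """
    found = {}

    # Forward combination: SNPs of genotype 1 vs. base calls in genotype 2.
    for site, (ref, allele_1) in ref_snps_1.items():
        allele_2 = calls_2.get(site)  # None plays the role of '?' (no call)
        if allele_2 is not None and allele_2 != allele_1:
            found[site] = (ref, allele_1, allele_2)

    # Inverse combination: SNPs of genotype 2 vs. base calls in genotype 1.
    for site, (ref, allele_2) in ref_snps_2.items():
        allele_1 = calls_1.get(site)
        if allele_1 is not None and allele_1 != allele_2:
            # Keying by site removes duplicates found in both directions.
            found[site] = (ref, allele_1, allele_2)

    return [(chrom, pos, ref, a1, a2)
            for (chrom, pos), (ref, a1, a2) in sorted(found.items())]
```

With the example record from the slide (position 303, reference A, genotype 1 G), the site is reported only once genotype 2 has a confident base call that differs from G.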
9. Consensus Base Calling
Parameters (Default)
• Max number of mismatches <= 7
• Sum of mismatch scores <= 60
• Min mapping quality >= 0
• Read depth threshold >= 5
• Major base frequency threshold >= 0.75
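As an illustration of how these defaults could combine, here is a minimal sketch of a consensus call at a single position. It assumes the first three parameters filter individual reads and the last two are applied to the reads that remain; the `PileupRead` type and `call_consensus_base` function are hypothetical and not taken from the pipeline.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional

# Default thresholds as listed on the slide.
MAX_MISMATCHES = 7
MAX_MISMATCH_SCORE_SUM = 60
MIN_MAPPING_QUALITY = 0
MIN_READ_DEPTH = 5
MIN_MAJOR_BASE_FREQUENCY = 0.75


@dataclass(frozen=True)
class PileupRead:
    """One read covering the position (hypothetical record layout)."""
    base: str                 # base this read shows at the position
    mismatches: int           # number of mismatches in the read alignment
    mismatch_score_sum: int   # sum of qualities of the mismatched bases
    mapping_quality: int


def call_consensus_base(reads: Iterable[PileupRead]) -> Optional[str]:
    """Return the consensus base, or None if the position cannot be called."""
    # Read-level filters (assumed interpretation of the first three parameters).
    kept = [
        r.base.upper()
        for r in reads
        if r.mismatches <= MAX_MISMATCHES
        and r.mismatch_score_sum <= MAX_MISMATCH_SCORE_SUM
        and r.mapping_quality >= MIN_MAPPING_QUALITY
    ]
    # Position-level filters: enough depth and a dominant base.
    if len(kept) < MIN_READ_DEPTH:
        return None
    base, count = Counter(kept).most_common(1)[0]
    if count / len(kept) < MIN_MAJOR_BASE_FREQUENCY:
        return None
    return base
```

A position left uncalled (`None`) corresponds to the "?" shown for a genotype without a confident base.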
10. What if more than 2 genotypes?
Genotype1
Genotype2
Genotype3
Genotype4
     G1  G2  G3
G1   0   1   1
G2   0   0   1
G3   0   0   0
Number of genotype combinations = (n² – n)/2, where n is the number of genotypes
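The pair count can be checked with a short sketch that enumerates every unordered pair of genotypes, which corresponds to the 1 entries in the upper triangle of the matrix. The function names are illustrative only.

```python
from itertools import combinations


def genotype_pairs(genotypes):
    """List every unordered pair of genotypes to compare."""
    return list(combinations(genotypes, 2))


def pair_count(n: int) -> int:
    """Number of pairwise comparisons for n genotypes: (n^2 - n) / 2."""
    return (n * n - n) // 2


pairs = genotype_pairs(["Genotype1", "Genotype2", "Genotype3", "Genotype4"])
for first, second in pairs:
    print(first, "vs", second)
```

For the four genotypes listed on the slide this gives six comparisons; the 3 x 3 matrix shown covers the three pairs among G1, G2 and G3.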
11. NGS pipeline input data
• Reads format
– fna and qual
– (Standard/Sanger) Fastq
– SCARF format
– Solexa fastq, Solexa export
– AB SOLiD read format
– FASTA
• Reference sequence
– Chickpea transcript assembly
– Pearl millet transcript assembly
– Pigeonpea transcript assembly
– Medicago genome
– Sorghum genome
17. Pipeline Editions
Available in 2 Editions
1. Server Edition
2. Desktop Edition
18. Server Edition
• User friendly web interface
– Installation on the following Linux platforms
• Fedora 13
• CentOS 5
• Clients can be any OS with a web browser
• Communication resources
• SMTP (Email)
• Session specific job processing
– Avoid file overwriting
19. Desktop Edition
• All functionalities of Server Edition on a Desktop
• Supported OS
• Fedora 13
• RHEL 5
• Single command installation
• Available as an installable CD
20. Future plans
• Consideration of new tools to integrate / update, e.g. BWA, Bowtie
• Implementation of the extension to the pipeline
• Evaluate cloud computing and high performance computing cluster options
• Initiatives such as iPlant (discovery environment – genotype to phenotype)
21. Future Plans: ISMU v 2
• Identification of appropriate modules for MARS, GWS and GBS
• Integration of MARS and GWS module
• Linking of ISMU pipeline with DMS of IBP
• Documentation & Training of ISMU pipeline
23. Contributors
• Rajeev K. Varshney
• Abhishek Rathore
• Jayashree B
• Vivek Thakur
• R. Pradeep
• A. Bhanu Prakash
• Sarwar Azam
• G. Meenakshi
• David Marshall
• Iain Milne
• Jonathan Jones
• David Studholme
• Greg May
• Andrew Farmer
• Jimmy Woodward
• Dave Edwards