ISMU pipeline for NGS data
analysis and facilitating
molecular breeding
http://hpc.icrisat.cgiar.org/NGS/
• Short read length of sequences
• Availability of many tools
• Platform dependency and command line driven
• No direct way to predict SNPs between
genotypes
• Quality scores vary depending on version and
technology
Challenges
ISMU version 1
• SNP discovery from NGS data
– Pipeline for mapping / assembling
– Calling SNPs between genotypes
– Visualisation
ISMU version 2
• Application of identified SNPs to breeding
• Benchmark available open-source short-read
assembly and downstream analysis
software.
• Assembly and polymorphism detection between
genotypes and visualization
• Assay design (Illumina GoldenGate Assay), genotype
calling and visualization and analysis of SNP
genotyping and haplotype data
• Identify parental lines for use in MABC or
MARS
• Discovery of SNP markers for use in foreground and
background selection of MABC or MARS.
• Documentation of the pipeline and the integrated
software.
Objectives of NGS Pipeline
Control Flowchart
ICRISAT
CROPS
YesNo
Input Data & validation
Upload Reference
& data
Mapping (Maq,Novo)
Mapped reads
Assembly Visualization
Consensus calling
Report SNPs
• Extract sequences with SNPs
• Design primers
• In silico validation by SNP2CAPS
Database
ADT Score
G.G Assay
Bead Studio
Flapjack
Genotype 1 Genotype 2
Chrom1 Pos RefAllele Gtyp1 Gtyp2
5 303 A G ?
Maq Novo
Programme
SNP Bet Genotypes
Standard Methodology
Mapping Mapping
Assembly
SNP Calling against Reference
ADT Scoring
Reporting
Remove
duplicates
Check the inverse
combination
Compare allele between
genotypes
Base calling in 2nd genotype
Predicted SNPs against Reference
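The comparison step in this flow can be sketched in Python. This is one reading of the diagram rather than the ISMU code: SNPs called against the reference in one genotype are checked against the base called at the same site in the other genotype, and the inverse combination is then checked the same way. The function names, the dictionary layout and the base `A` assumed for Genotype 2 are hypothetical, chosen only to mirror the example row (chromosome 5, position 303).

```python
def snps_between(snps_a, consensus_b):
    """Sites where genotype A's SNP allele differs from genotype B's base.

    snps_a: {(chrom, pos): allele} SNPs of genotype A against the reference.
    consensus_b: {(chrom, pos): base} bases called in genotype B.
    Sites with no base call in B are skipped.
    """
    found = {}
    for site, allele_a in snps_a.items():
        base_b = consensus_b.get(site)
        if base_b is not None and base_b != allele_a:
            found[site] = (allele_a, base_b)
    return found


def snps_between_genotypes(snps_1, cons_1, snps_2, cons_2):
    """Combine both directions; values are (genotype 1 base, genotype 2 base)."""
    result = snps_between(snps_1, cons_2)
    # Inverse combination: SNPs seen only in genotype 2.
    for site, (allele_2, base_1) in snps_between(snps_2, cons_1).items():
        result.setdefault(site, (base_1, allele_2))
    return result


# Illustrative data: genotype 1 carries G at 5:303, reference allele is A.
snps_1 = {("5", 303): "G"}
cons_1 = {("5", 303): "G"}
snps_2 = {}                      # no SNP against the reference here
cons_2 = {("5", 303): "A"}       # assumed result of base calling in genotype 2
print(snps_between_genotypes(snps_1, cons_1, snps_2, cons_2))
```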
Customized Methodology
(Consensus Base Calling-cc)
ccMaq ccNovo
SNP Calling
Genotype 1 Genotype 2
Programme
Inhouse Script
ADT scoring
Genotype 2
fmaj=21/28
=0.75
Genotype 1
fmaj =38/40
=0.95
Mapping Mapping
Consensus Base Calling
Parameters (Default)
• Max number of mismatches <= 7
• Sum of mismatch scores <= 60
• Min mapping quality >= 0
• Read depth threshold >= 5
• Major base frequency threshold >= 0.75
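The read depth and major base frequency defaults can be read as a filter on the bases piled up at one position. A minimal Python sketch, assuming the bases covering a position have already been collected into a string; the name `call_consensus`, the `None` return for "no call" and the alleles in the example are hypothetical and not taken from the ISMU scripts. Only the counts (38 of 40, 21 of 28) come from the slide.

```python
from collections import Counter


def call_consensus(bases, min_depth=5, min_major_freq=0.75):
    """Return (base, fmaj) for one position, or None when no call is made.

    bases: string of bases observed at the position, one per mapped read.
    min_depth and min_major_freq mirror the default thresholds listed above.
    """
    depth = len(bases)
    if depth < min_depth:
        return None  # too few reads cover the position
    major_base, major_count = Counter(bases.upper()).most_common(1)[0]
    fmaj = major_count / depth
    if fmaj < min_major_freq:
        return None  # no base is frequent enough to be called
    return major_base, fmaj


# Genotype 1: 38 of 40 reads agree; Genotype 2: 21 of 28 reads agree.
print(call_consensus("G" * 38 + "A" * 2))
print(call_consensus("A" * 21 + "G" * 7))
```

Both example positions pass the defaults: the second sits exactly on the 0.75 threshold, which the sketch treats as inclusive.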
What if more than 2 genotypes?
Genotype1
Genotype2
Genotype3
Genotype4
G1 G2 G3
G1 0 1 1
G2 0 0 1
G3 0 0 0
Combination of genotypes = (n² – n)/2
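The pair count is the number of cells above the diagonal of the matrix, n(n - 1)/2. A short Python sketch enumerating the pairs with the standard library; the helper name `genotype_pairs` is hypothetical.

```python
from itertools import combinations


def genotype_pairs(genotypes):
    """List every unordered pair of genotypes to compare."""
    return list(combinations(genotypes, 2))


names = ["Genotype1", "Genotype2", "Genotype3", "Genotype4"]
pairs = genotype_pairs(names)
n = len(names)

# The number of pairs matches (n^2 - n) / 2.
assert len(pairs) == (n * n - n) // 2

for first, second in pairs:
    print(first, "vs", second)
```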
• Reads format
fna and qual
(Standard/Sanger)Fastq
SCARF format
Solexa fastq, Solexa export
AB SOLiD read format
FASTA
• Reference sequence
Chickpea transcript assembly
Pearl millet transcript assembly
Pigeonpea transcript assembly
Medicago genome
Sorghum genome
NGS pipeline input data
NGS pipeline (Input 1)
http://hpc.icrisat.cgiar.org/NGS/
NGS pipeline (Input 2)
NGS pipeline (Help page)
NGS pipeline (Results)
NGS pipeline (Visualisation)
Available in 2 Editions
1. Server Edition
2. Desktop Edition
Pipeline Editions
• User friendly web interface
– Installation on the following Linux platforms
• Fedora 13
• Cent OS 5
• Clients can be any OS with a web browser
• Communication resources
• SMTP (Email)
• Session specific job processing
– Avoids file overwriting
Server Edition
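One common way to keep concurrent jobs from overwriting each other's files is to give each job its own working directory. The Python sketch below illustrates that idea only; it is not the ISMU implementation, and the function name `make_job_dir` and the session identifiers are hypothetical.

```python
import tempfile
from pathlib import Path


def make_job_dir(base_dir, session_id):
    """Create and return a new, unique working directory for one job.

    The directory name starts with the session id; mkdtemp appends a
    random suffix, so two jobs never share a directory.
    """
    base = Path(base_dir)
    base.mkdir(parents=True, exist_ok=True)
    return Path(tempfile.mkdtemp(prefix=f"{session_id}_", dir=base))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        job_a = make_job_dir(root, "session42")
        job_b = make_job_dir(root, "session42")
        print(job_a.name, job_b.name)  # two different directory names
```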
Desktop Edition
• All functionalities of Server Edition on a Desktop
• Supported OS
• Fedora 13
• RHEL 5
• Single command installation
• Available in Installable CD
Future plans
• Consideration of new tools to integrate or
update, e.g. BWA, Bowtie
• Implementation of the extension to the
pipeline
• Evaluate cloud computing and high
performance computing cluster options
• Initiatives such as iPlant (discovery
environment – genotype to phenotype)
• Identification of
appropriate modules for
MARS, GWS and GBS
• Integration of MARS and
GWS module
• Linking of ISMU pipeline
with DMS of IBP
• Documentation & Training
of ISMU pipeline
Future Plans: ISMU v 2
Internet
Architecture
Reference
Sequences
Velvet
Perl Prog
Maq
Novo
CGI
SNP Database
Files
downloading
Dynamic
Querying
Assembly
Visualization
Input data
validation
NGS Data Analysis pipeline at ICRISAT
Apache Server
Hosting Web
Pages
SMTP
Server
• Rajeev K. Varshney
• Abhishek Rathore
• Jayashree B
• Vivek Thakur
• R. Pradeep
• A. Bhanu Prakash
• Sarwar Azam
• G.Meenakshi
• David Marshall
• Iain Milne
Contributors
• Jonathan Jones
• David Studholme
• Greg May
• Andrew Farmer
• Jimmy Woodward
• Dave Edwards

Slack Application Development 101 Slides
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 

NGS data analysis and molecular breeding pipeline

  • 1. ISMU pipeline for NGS data analysis and facilitating molecular breeding http://hpc.icrisat.cgiar.org/NGS/
  • 2. Challenges
      • Short read length of sequences
      • Availability of many tools
      • Platform dependency and command-line-driven tools
      • No direct way to predict SNPs between genotypes
      • Quality scores vary depending on version and technology
  • 3. ISMU version 1
      • SNP discovery from NGS data
          – Pipeline for mapping / assembling
          – Calling SNPs between genotypes
          – Visualisation
  • 4. ISMU version 2
      • Application of identified SNPs to breeding
  • 5. Objectives of NGS Pipeline
      • Benchmark available open-source short-read assembly and downstream analysis programs/software
      • Assembly, polymorphism detection between genotypes, and visualization
      • Assay design (Illumina GoldenGate Assay), genotype calling, and visualization and analysis of SNP genotyping and haplotype data
      • Identify parental lines for use in MABC or MARS
      • Discovery of SNP markers for use in foreground and background selection in MABC or MARS
      • Documentation of the pipeline and the integrated software
  • 6. Control Flowchart (flowchart; node labels as extracted)
      • ICRISAT crops: Yes / No
      • Input data & validation; upload reference & data
      • Mapping (Maq, Novo); mapped reads; assembly visualization
      • Consensus calling; report SNPs
      • Extract sequences with SNPs; design primers; in silico validation by SNP2CAPS
      • Database; ADT score; G.G (GoldenGate) assay; Bead Studio; Flapjack
  • 7. Standard Methodology (flowchart; node labels as extracted)
      • Programmes: Maq, Novo
      • Genotype 1 and Genotype 2, each: mapping, assembly, SNP calling against the reference
      • Predicted SNPs against reference
      • Base calling in 2nd genotype
      • Compare allele between genotypes
      • Check the inverse combination
      • Remove duplicates
      • SNPs between genotypes; ADT scoring; reporting
      • Example row:
          Chrom1  Pos  RefAllele  Gtyp1  Gtyp2
          5       303  A          G      ?
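
The comparison steps on this slide can be sketched in Python. This is only one reading of the flowchart, not the pipeline's actual script: the function name `snps_between_genotypes`, the dictionary layouts, and the second example site (position 410) are all assumptions made for illustration.

```python
def snps_between_genotypes(snps_g1, snps_g2, base_g1, base_g2):
    """Combine per-genotype SNPs (called against the reference) into
    SNPs between the two genotypes.

    snps_gX: {(chrom, pos): (ref_allele, called_allele)}  (hypothetical layout)
    base_gX: callable (chrom, pos) -> base call in genotype X, or None
    """
    found = {}
    # Pass 1: SNPs predicted in genotype 1, base call looked up in genotype 2.
    # Pass 2: the inverse combination.
    for snps, other_base, g1_first in ((snps_g1, base_g2, True),
                                       (snps_g2, base_g1, False)):
        for site, (ref, allele) in snps.items():
            other = other_base(site)
            if other is None or other == allele:
                continue  # no base call, or both genotypes carry the same allele
            g1, g2 = (allele, other) if g1_first else (other, allele)
            found[site] = (ref, g1, g2)  # keying by site removes duplicates
    return [(chrom, pos, *alleles) for (chrom, pos), alleles in sorted(found.items())]


# Illustrative data only: 5:303 mirrors the slide's example row,
# with the "?" in genotype 2 resolved by a base call of "A".
calls_g1 = {("5", 303): "G", ("5", 410): "C"}
calls_g2 = {("5", 303): "A", ("5", 410): "T"}
snps_g1 = {("5", 303): ("A", "G")}
snps_g2 = {("5", 410): ("C", "T")}

between = snps_between_genotypes(snps_g1, snps_g2, calls_g1.get, calls_g2.get)
for row in between:
    print(*row, sep="\t")
```

Keying the result by site means a position reported by both passes is stored once, which is how this sketch interprets "remove duplicates".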
  • 8. Customized Methodology (Consensus Base Calling, cc) (flowchart; node labels as extracted)
      • Programmes: ccMaq, ccNovo
      • Genotype 1 and Genotype 2, each: mapping
      • In-house script: SNP calling
      • ADT scoring
      • Genotype 1: fmaj = 38/40 = 0.95
      • Genotype 2: fmaj = 21/28 = 0.75
  • 9. Consensus Base Calling Parameters (Default)
      • Max number of mismatches <= 7
      • Sum of mismatch scores <= 60
      • Min mapping quality >= 0
      • Read depth threshold >= 5
      • Major base frequency threshold >= 0.75
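
A minimal Python sketch of how the read depth and major base frequency thresholds might be applied at a single position. It is not the pipeline's in-house script: `consensus_base` and its parameter names are hypothetical, and the base compositions below are invented to reproduce the 38/40 and 21/28 ratios from the previous slide. The mismatch and mapping quality filters apply to reads before this step and are not modelled here.

```python
from collections import Counter

# Default thresholds as listed on the slide; the names are assumptions.
MIN_READ_DEPTH = 5
MIN_MAJOR_BASE_FREQ = 0.75


def consensus_base(bases, min_depth=MIN_READ_DEPTH, min_fmaj=MIN_MAJOR_BASE_FREQ):
    """Return (base, fmaj) for the bases covering one position.

    base is None when the position fails the depth or frequency threshold.
    """
    depth = len(bases)
    if depth == 0:
        return None, 0.0
    base, count = Counter(bases).most_common(1)[0]
    fmaj = count / depth  # frequency of the most common (major) base
    if depth < min_depth or fmaj < min_fmaj:
        return None, fmaj
    return base, fmaj


# Invented pileups matching the ratios shown for the two genotypes.
genotype1 = "G" * 38 + "A" * 2   # fmaj = 38/40 = 0.95
genotype2 = "A" * 21 + "G" * 7   # fmaj = 21/28 = 0.75

print(consensus_base(genotype1))
print(consensus_base(genotype2))
```

With the defaults, both example positions pass: genotype 2 sits exactly on the 0.75 threshold, which the `>=` comparison accepts.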
  • 10. What if more than 2 genotypes?
      • Genotype1, Genotype2, Genotype3, Genotype4
      • Pairwise comparison matrix (as extracted):
              G1  G2  G3
          G1   0   1   1
          G2   0   0   1
          G3   0   0   0
      • Combinations of genotypes = (n² − n)/2
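
As a quick check of the formula, the genotype pairs can be enumerated in Python. The helper name `n_pairs` is an assumption for illustration; `itertools.combinations` is standard library.

```python
from itertools import combinations


def n_pairs(n):
    """Number of unordered genotype pairs: (n^2 - n) / 2."""
    return (n * n - n) // 2


genotypes = ["Genotype1", "Genotype2", "Genotype3", "Genotype4"]
pairs = list(combinations(genotypes, 2))

for a, b in pairs:
    print(a, "vs", b)
print(len(pairs), "pairs; formula gives", n_pairs(len(genotypes)))
```

For the four genotypes named on the slide this gives six comparisons.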
  • 11. NGS pipeline input data
      • Read formats
          – fna and qual
          – (Standard/Sanger) FASTQ
          – SCARF format
          – Solexa fastq, Solexa export
          – AB SOLiD read format
          – FASTA
      • Reference sequences
          – Chickpea transcript assembly
          – Pearl millet transcript assembly
          – Pigeonpea transcript assembly
          – Medicago genome
          – Sorghum genome
  • 12. NGS pipeline (Input 1) http://hpc.icrisat.cgiar.org/NGS/
  • 17. Pipeline Editions
      • Available in 2 editions
          1. Server Edition
          2. Desktop Edition
  • 18. Server Edition
      • User-friendly web interface
      • Installation on the following Linux platforms
          – Fedora 13
          – CentOS 5
      • Clients can be any OS with a web browser
      • Communication resources
          – SMTP (email)
      • Session-specific job processing, which avoids file overwriting
  • 19. Desktop Edition
      • All functionalities of Server Edition on a desktop
      • Supported OS
          – Fedora 13
          – RHEL 5
      • Single-command installation
      • Available on an installable CD
  • 20. Future plans
      • Consider new tools to integrate or update, e.g. BWA, Bowtie
      • Implement the extension to the pipeline
      • Evaluate cloud computing and high-performance computing cluster options
      • Initiatives such as iPlant (discovery environment, genotype to phenotype)
  • 21. Future Plans: ISMU v 2
      • Identification of appropriate modules for MARS, GWS and GBS
      • Integration of MARS and GWS module
      • Linking of ISMU pipeline with DMS of IBP
      • Documentation & training of ISMU pipeline
  • 22. NGS Data Analysis pipeline at ICRISAT: Architecture (diagram; labels as extracted)
      • Internet
      • Apache server hosting web pages; SMTP server
      • CGI; Perl programs
      • Maq; Novo; Velvet
      • Reference sequences; SNP database
      • Input data validation; assembly visualization; dynamic querying; file downloading
  • 23. Contributors
      • Rajeev K. Varshney
      • Abhishek Rathore
      • Jayashree B
      • Vivek Thakur
      • R. Pradeep
      • A. Bhanu Prakash
      • Sarwar Azam
      • G. Meenakshi
      • David Marshall
      • Iain Milne
      • Jonathan Jones
      • David Studholme
      • Greg May
      • Andrew Farmer
      • Jimmy Woodward
      • Dave Edwards