The NCBI Pathogen Analysis Pipeline
to Support
Real Time Sequencing of Foodborne Pathogens
William Klimke
GMI9
NCBI Pathogen Detection Pipeline
NCBI Submission Portal
BioSamples
SRA
GenBank
BioProject
NCBI Pathogen Pipeline
Kmer analysis
Genome Assembly
Genome Annotation
Genome Placement
Clustering
SNP analysis
Tree Construction
Reports
QC
sample_name
organism
strain/isolate
Category (attribute_package)
1a) Clinical/Host-associated
1a1) specific_host
1a2) isolation_source
1a3) host-disease
OR
1b) Environmental/Food/Other
1b1) isolation_source
collection_date
Geographic location
6a) geo_loc_name
OR
6b) lat_lon
collected by
Where
When
Who
What
minimal metadata
NCBI Biosample – Pathogen Template
(Foodborne Outbreaks)
https://submit.ncbi.nlm.nih.gov/subs/biosample/
https://www.ncbi.nlm.nih.gov/biosample/docs/
http://www.ncbi.nlm.nih.gov/projects/biosample/validate/
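The minimal metadata above can be checked before submission. The sketch below is a hypothetical pre-check, not the NCBI validator: field names are copied from the slide, while the function name, the "clinical"/"environmental" labels, and the reading of "strain/isolate" as "either one" are assumptions.

```python
# Hypothetical pre-submission check for the minimal pathogen metadata.
# Field spellings follow the slide; the rules below are assumptions.

ALWAYS_REQUIRED = ("sample_name", "organism", "collection_date", "collected by")

# Attribute packages: 1a) clinical/host-associated vs 1b) environmental/food/other.
PACKAGE_FIELDS = {
    "clinical": ("specific_host", "isolation_source", "host-disease"),
    "environmental": ("isolation_source",),
}


def missing_fields(record, category):
    """Return a sorted list of minimal-metadata fields absent from `record`."""
    if category not in PACKAGE_FIELDS:
        raise ValueError(f"unknown category: {category!r}")

    def present(key):
        return bool(str(record.get(key, "")).strip())

    missing = [f for f in ALWAYS_REQUIRED if not present(f)]
    missing += [f for f in PACKAGE_FIELDS[category] if not present(f)]
    # Assumed: either a strain or an isolate name is sufficient.
    if not (present("strain") or present("isolate")):
        missing.append("strain or isolate")
    # Geographic location: geo_loc_name OR lat_lon.
    if not (present("geo_loc_name") or present("lat_lon")):
        missing.append("geo_loc_name or lat_lon")
    return sorted(missing)


example = {
    "sample_name": "S1",
    "organism": "Salmonella enterica",
    "strain": "X-001",
    "collection_date": "2016-05-01",
    "collected by": "Example Lab",
    "isolation_source": "peach",
    "lat_lon": "38.9 N 77.0 W",
}
print(missing_fields(example, "environmental"))
```

The same record checked as "clinical" would report the host fields as missing, which mirrors the 1a/1b split in the template.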
NCBI Pathogen Detection Pipeline
Submissions (Jan – May, 2016)
Automated Bacterial Assembly
SRA Reads sample 1
Trim reads
(Ns, adaptor)
Reference
Distance tree
Find closest reference genome(s)
ArgoCA (Combined Assembly)
De novo assembly panel
Argo (Reference-assisted assembly)
SOAPdenovo
GS Assembler (Newbler)
MaSuRCA
Celera Assembler
Reads remapped to combined assembly
Contig fasta
Read placements (bam)
Quality profile
SPAdes
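The "Trim reads (Ns, adaptor)" step can be illustrated with a minimal sketch. This is not the pipeline's trimmer: the function name, the exact-match adaptor search, and the `min_overlap` parameter are assumptions chosen for illustration.

```python
def trim_read(seq, adaptor, min_overlap=5):
    """Remove a 3' adaptor (full or partial) and flanking Ns from one read.

    Exact string matching only; real trimmers also tolerate mismatches.
    """
    seq = seq.upper()
    adaptor = adaptor.upper()

    idx = seq.find(adaptor)
    if idx != -1:
        # Full adaptor present: keep only the sequence before it.
        seq = seq[:idx]
    else:
        # Partial adaptor: longest read suffix that equals an adaptor prefix.
        longest = min(len(adaptor) - 1, len(seq))
        for k in range(longest, min_overlap - 1, -1):
            if seq.endswith(adaptor[:k]):
                seq = seq[:-k]
                break

    # Strip uncalled bases from both ends.
    return seq.strip("N")


ADAPTOR = "AGATCGGAAGAGC"  # example adaptor sequence, chosen for the demo
print(trim_read("NNACGTACGTAGATCGGAAG", ADAPTOR))
```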
1. Initial partition of isolates within each species by kmer distances
2. Within each partition, blast comparison of all pairs of genomes
3. Single linkage clusters with at most 50 SNPs
4. Within clusters, SNPs with respect to one reference
5. Generate final SNP list and phylogenetic trees
Filtering:
• Base level
• Repeat
• Density
Problematic genomes are eliminated at various points along the way
SNP pipeline
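Step 3 above can be sketched with a union-find structure. This is a minimal illustration, assuming pairwise SNP distances are already available as a dictionary and that "at most 50 SNPs" is the linking threshold; the function and variable names are hypothetical.

```python
def single_linkage_clusters(isolates, snp_distances, max_snps=50):
    """Group isolates so that any pair within `max_snps` shares a cluster.

    `snp_distances` maps (isolate_a, isolate_b) tuples to SNP counts.
    Single linkage chains: A-B and B-C links put A and C together
    even if A and C themselves differ by more than `max_snps`.
    """
    parent = {name: name for name in isolates}

    def find(x):
        # Walk to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), dist in snp_distances.items():
        if dist <= max_snps:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = {}
    for name in isolates:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


distances = {
    ("A", "B"): 10,
    ("B", "C"): 45,
    ("A", "C"): 55,   # above threshold, but A and C are linked through B
    ("C", "D"): 200,
}
print(single_linkage_clusters(["A", "B", "C", "D"], distances))
```

The chaining behaviour is one reason a cluster can contain pairs that are more than 50 SNPs apart.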
High SNP density
Cumulative count of differences
Iterative density filtering (Richa Agarwala modification of Science. 2011 Jan 28;331(6016):430-4)
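The idea of iterative density filtering can be sketched as follows. This is not the method cited above: the window size, the `factor` multiplier, the `min_count` floor, and the uniform background model are all assumptions made to show the iterate-until-stable pattern.

```python
from bisect import bisect_right


def density_filter(positions, genome_length, window=1000, factor=5.0, min_count=3):
    """Drop SNPs that sit in unusually dense windows; repeat until stable.

    Each round re-estimates the background rate from the SNPs still kept,
    so removing one dense region can expose another in the next round.
    """
    kept = sorted(set(positions))
    while True:
        expected = len(kept) * window / genome_length
        threshold = max(min_count, factor * expected)
        dense = set()
        for i, start in enumerate(kept):
            # SNPs falling in the half-open window [start, start + window).
            j = bisect_right(kept, start + window - 1)
            if j - i >= threshold:
                dense.update(kept[i:j])
        if not dense:
            return kept
        kept = [p for p in kept if p not in dense]


background = [5000, 20000, 40000, 60000, 80000]
hotspot = [30000, 30010, 30020, 30030, 30040, 30050]  # e.g. a recombined region
print(density_filter(background + hotspot, genome_length=100000))
```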
Type                 Total targets in k-mer tree   Targets in clusters (single linkage <= 50 SNPs)
Salmonella           45297                         38794
Listeria             9621                          8135
E. coli & Shigella   13144                         6046
Campylobacter        2234                          1569
Acinetobacter        2179                          1299
Elizabethkingia      89                            74
Serratia             336                           227
Klebsiella           1194                          677
Total targets (May 2016)
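The share of targets that fall into a cluster follows directly from the two columns above. The short calculation below uses only the table's numbers; the variable and function names are illustrative.

```python
# (total targets in k-mer tree, targets in clusters), from the May 2016 table.
TARGETS = {
    "Salmonella": (45297, 38794),
    "Listeria": (9621, 8135),
    "E. coli & Shigella": (13144, 6046),
    "Campylobacter": (2234, 1569),
    "Acinetobacter": (2179, 1299),
    "Elizabethkingia": (89, 74),
    "Serratia": (336, 227),
    "Klebsiella": (1194, 677),
}


def clustered_fraction(total, in_clusters):
    """Fraction of targets that are in some single-linkage cluster."""
    return in_clusters / total


fractions = {name: clustered_fraction(*counts) for name, counts in TARGETS.items()}
grand_total = sum(total for total, _ in TARGETS.values())
grand_clustered = sum(clustered for _, clustered in TARGETS.values())

for name, frac in sorted(fractions.items(), key=lambda item: -item[1]):
    print(f"{name:20s} {frac:6.1%}")
print(f"{'All organisms':20s} {grand_clustered / grand_total:6.1%}")
```

In these numbers, E. coli & Shigella is the only group where fewer than half of the targets are in a cluster.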
http://www.ncbi.nlm.nih.gov/pathogens/
Results Available Now
Several rows are NULL: the target either is not in a cluster (check the last column) or is in a cluster without any other isolate of the opposite type.
Rows with a low SNP count are significant.
These isolates are all <10 SNPs apart, and they are all in the same cluster.
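The reading rules above (skip NULL rows, look at low SNP counts) can be sketched as a small filter. The column names `target`, `cluster`, and `min_snps_to_opposite_type` are hypothetical stand-ins, not the actual report headers, and the rows are made-up examples.

```python
def parse_snp_count(value):
    """Treat NULL or empty cells as 'no comparable isolate' (None)."""
    if value is None or value in ("NULL", ""):
        return None
    return int(value)


def close_matches(rows, max_snps=10):
    """Return targets whose nearest opposite-type isolate is < max_snps away."""
    hits = []
    for row in rows:
        distance = parse_snp_count(row["min_snps_to_opposite_type"])
        if distance is not None and distance < max_snps:
            hits.append(row["target"])
    return hits


# Made-up rows for illustration only.
rows = [
    {"target": "T1", "cluster": "C1", "min_snps_to_opposite_type": "3"},
    {"target": "T2", "cluster": "C1", "min_snps_to_opposite_type": "7"},
    {"target": "T3", "cluster": "NULL", "min_snps_to_opposite_type": "NULL"},
    {"target": "T4", "cluster": "C2", "min_snps_to_opposite_type": "38"},
]
print(close_matches(rows))
```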
NCBI Pathogen Detection SNP Pipeline: example 1 - stone fruit outbreak
http://www.cdc.gov/mmwr/preview/mmwrhtml/mm6410a6.htm?s_cid=mm6410a6_e#Fig
similar results to CDC wgMLST
MN chicken Kiev outbreak
NCBI Pathogen Detection SNP Pipeline: example 2 – chicken Kiev outbreak
NCBI Pathogen Detection SNP Pipeline Web viewer (coming soon):
example 3 – Elizabethkingia outbreak
wgMLST approach
• Complementary to SNP analysis e.g. consistency check
• Efficient for initial clustering of all isolates in species
• Generate loci using “essentially complete” RefSeq genomes
Organism        Number of loci   Genome in loci   Number of genomes   Major species
Acinetobacter   2420             58.25%           43/47               baumannii
Campylobacter   1257             68.36%           90/132              jejuni
Escherichia     2896             52.97%           159/165             coli
Klebsiella      4004             82.54%           67/82               pneumoniae
Listeria        2364             73.88%           73/81               monocytogenes
Salmonella      3469             66.98%           137/147             enterica
R&D: wgMLST
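A wgMLST comparison reduces each genome to an allele number per locus and counts the loci that differ. The sketch below assumes profiles are plain dictionaries with `None` for a locus that could not be called (for example because of an assembly error); the locus names and allele numbers are made up.

```python
def allele_distance(profile_a, profile_b):
    """Count differing alleles over loci called in both profiles.

    Returns (number of differing loci, number of loci compared).
    Loci missing or uncalled (None) in either profile are ignored.
    """
    shared = [
        locus
        for locus in profile_a
        if profile_a[locus] is not None and profile_b.get(locus) is not None
    ]
    differences = sum(1 for locus in shared if profile_a[locus] != profile_b[locus])
    return differences, len(shared)


# Made-up allele profiles for two isolates.
isolate_1 = {"locus1": 1, "locus2": 4, "locus3": None, "locus4": 2}
isolate_2 = {"locus1": 1, "locus2": 5, "locus3": 7, "locus5": 3}
print(allele_distance(isolate_1, isolate_2))
```

Returning the number of loci compared alongside the difference count makes it visible when a small distance rests on very few shared loci.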
• Fast & relatively simple
• Epidemiologists are familiar with it
• Good for initial clustering
• Different heuristics
• Can use special markers for e.g. serovars
• Still need to deal with assembly errors
• Recombination can still be a problem…
wgMLST – a complementary method
Loci are not independent
R&D: wgMLST
NCBI’s Role in Combating Antibiotic Resistant Bacteria
“Create a repository of resistant bacterial strains (an “isolate bank”) and maintain a well-curated reference database that describes the characteristics of these strains.”
“Develop and maintain a national sequence database of resistant pathogens.”
AMR efforts at NCBI
• With collaborators, build database of sequenced isolates with standardized AMR metadata (i.e. accept antibiograms) (2019 samples as of May 16 - http://www.ncbi.nlm.nih.gov/biosample/?term=antibiogram[filter])
• Collaborators include: CDC, WRAIR, FDA, B&W
• Stable, up-to-date database of AMR genes with standardized nomenclature
• Collaborators (CARD)
• RefSeq set released by June 2016
• Implement and validate tools for identifying AMR genes in new isolates
Antibiogram Fields
• Fields designed to find balance between comprehensiveness and ease of submission
• Data dictionaries based on outside expertise (ASM, CLSI) standardize input and minimize ‘data drift’
mcr-1 encoding organisms Total
E. coli 11
Salmonella 10
Antibiotic resistance
NCBI Outputs
Kmer tree
ftp://ftp.ncbi.nlm.nih.gov/pathogen/Results/
• Genome Workbench
• full SNP reports
• Integrated web-based interactive system*
• AMR reports*
• wgMLST*
Acknowledgements
Richa Agarwala
Azat Badretdin
Slava Brover
Joshua Cherry
Vyacheslav Chetvernin
Robert Cohen
Michael DiCuccio
Mike Feldgarden
Dan Haft
William Klimke
Alex Kotliarov
Arjun Prasad
Edward Rice
Kirill Rotmistrovskyy
This research was supported by the Intramural Research Program of the NIH, National Library of Medicine. http://www.ncbi.nlm.nih.gov
National Center for Biotechnology Information – National Library of Medicine – Bethesda MD 20892 USA
CDC
FDA/CFSAN
USDA-FSIS
PHE/FERA
NHGRI
NIAID
WRAIR
Broad
Wadsworth/MDH
Vendors: PacBio, Illumina, Roche
Stephen Sherry
Sergey Shiryev
Martin Shumway
Tatiana Tatusova
Igor Tolstoy
Chunlin Xiao
Leonid Zaslavsky
Alexander Zasypkin
Alejandro A. Schaffer
Lukas Wagner
Aleksandr Morgulis
David Lipman
James Ostell

The National Center for Biotechnology Information (NCBI) Pathogen Analysis Pipeline to Support Real Time Sequencing of Foodborne Pathogens

  • 1. The NCBI Pathogen Analysis Pipeline to Support Real Time Sequencing of Foodborne Pathogens. William Klimke, GMI9
  • 2. NCBI Pathogen Detection Pipeline
      NCBI Submission Portal: BioSamples, SRA, GenBank, BioProject
      NCBI Pathogen Pipeline: Kmer analysis, Genome Assembly, Genome Annotation, Genome Placement, Clustering, SNP analysis, Tree Construction, Reports, QC
  • 3. NCBI BioSample – Pathogen Template (Foodborne Outbreaks): minimal metadata (Where, When, Who, What)
      - sample_name
      - organism
      - strain/isolate
      - Category (attribute_package)
          1a) Clinical/Host-associated: 1a1) specific_host, 1a2) isolation_source, 1a3) host-disease
          OR
          1b) Environmental/Food/Other: 1b1) isolation_source
      - collection_date
      - Geographic location: 6a) geo_loc_name OR 6b) lat_lon
      - collected by
      https://submit.ncbi.nlm.nih.gov/subs/biosample/
      https://www.ncbi.nlm.nih.gov/biosample/docs/
      http://www.ncbi.nlm.nih.gov/projects/biosample/validate/
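The minimal-metadata rules on this slide can be read as a simple completeness check. The sketch below is illustrative only and is not NCBI's validator: the function name, the category labels "clinical" and "environmental", and the reading of "strain/isolate" as "either strain or isolate" are assumptions. The field names are copied from the slide.

```python
from __future__ import annotations

from typing import Mapping

# Field names are taken from the slide; how they are grouped is an assumption.
ALWAYS_REQUIRED = ("sample_name", "organism", "collection_date", "collected by")
CATEGORY_FIELDS = {
    "clinical": ("specific_host", "isolation_source", "host-disease"),
    "environmental": ("isolation_source",),
}
# Groups where any one member is enough (assumed reading of "OR" and "/").
ALTERNATIVES = (("strain", "isolate"), ("geo_loc_name", "lat_lon"))


def missing_minimal_metadata(record: Mapping[str, str], category: str) -> list[str]:
    """Return the minimal-metadata fields that are absent or empty in a record."""
    if category not in CATEGORY_FIELDS:
        raise ValueError(f"unknown category: {category!r}")

    def present(name: str) -> bool:
        return bool(record.get(name, "").strip())

    required = ALWAYS_REQUIRED + CATEGORY_FIELDS[category]
    missing = [name for name in required if not present(name)]
    for group in ALTERNATIVES:
        if not any(present(name) for name in group):
            missing.append(" or ".join(group))
    return missing


if __name__ == "__main__":
    example = {
        "sample_name": "S1",
        "organism": "Salmonella enterica",
        "strain": "X1",
        "isolation_source": "peach",
        "collection_date": "2016-05-01",
        "geo_loc_name": "USA: MN",
        "collected by": "FDA",
    }
    print(missing_minimal_metadata(example, "environmental"))
    print(missing_minimal_metadata(example, "clinical"))
```

A record that passes for the Environmental/Food/Other category can still fail for Clinical/Host-associated, because the latter additionally asks for specific_host and host-disease.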
  • 4. NCBI Pathogen Detection Pipeline Submissions (Jan – May, 2016)
  • 5. Automated Bacterial Assembly
      - SRA reads (sample 1)
      - Trim reads (Ns, adaptor)
      - Reference; distance tree; find closest reference genome(s)
      - Argo (reference-assisted assembly)
      - De novo assembly panel: SOAPdenovo, GS-assembler (newbler), MaSuRCA, Celera Assembler, SPAdes
      - ArgoCA (combined assembly)
      - Reads remapped to combined assembly
      - Contig fasta, read placements (bam), quality profile
  • 6. SNP pipeline
      1. Initial partition of isolates within each species by kmer distances
      2. Within each partition, blast comparison of all pairs of genomes
      3. Single linkage clusters with at most 50 SNPs
      4. Within clusters, SNPs with respect to one reference
      5. Generate final SNP list and phylogenetic trees
      Filtering: base level, repeat, density
      Problematic genomes are eliminated at various points along the way.
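Step 3 of the SNP pipeline can be illustrated with a small union-find sketch. It assumes that "single linkage clusters with at most 50 SNPs" means two isolates are linked whenever their pairwise distance is 50 SNPs or fewer, and that clusters are the connected groups that result. The function name and the input layout (a dict of pairwise distances) are hypothetical; this is not the NCBI implementation.

```python
from __future__ import annotations

from typing import Iterable, Mapping


def single_linkage_clusters(
    isolates: Iterable[str],
    snp_distance: Mapping[tuple[str, str], int],
    max_snps: int = 50,
) -> list[set[str]]:
    """Group isolates connected by chains of pairs that differ by <= max_snps."""
    names = list(isolates)
    parent = {name: name for name in names}

    def find(x: str) -> str:
        # Walk to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), distance in snp_distance.items():
        if distance <= max_snps:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups: dict[str, set[str]] = {}
    for name in names:
        groups.setdefault(find(name), set()).add(name)
    # Isolates with no close neighbour are not reported as clusters.
    return [members for members in groups.values() if len(members) > 1]


if __name__ == "__main__":
    distances = {("A", "B"): 10, ("B", "C"): 45, ("A", "C"): 80, ("C", "D"): 200}
    print(single_linkage_clusters("ABCD", distances))
```

In the example, A and C are 80 SNPs apart but still share a cluster because each is within 50 SNPs of B; this chaining is what distinguishes single linkage from stricter criteria.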
  • 7. High SNP density: cumulative count of differences. Iterative density filtering (Richa Agarwala modification of Science. 2011 Jan 28;331(6016):430-4).
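The slide does not spell out the density filter, so the following is only a generic sketch of the underlying idea: SNPs that pile up within a short stretch of the genome are treated as suspect and dropped. It shows a single sliding-window pass, not the iterative procedure named on the slide, and the window size, the threshold and the function name are all hypothetical.

```python
from __future__ import annotations

from typing import Iterable


def drop_dense_snps(
    positions: Iterable[int],
    window: int = 1000,
    max_in_window: int = 3,
) -> list[int]:
    """Remove SNP positions that fall in any window holding too many SNPs.

    A window is any run of SNPs spanning fewer than `window` bases. Both
    parameters are illustrative values, not ones given in the slides.
    """
    ordered = sorted(positions)
    dense: set[int] = set()
    left = 0
    for right in range(len(ordered)):
        # Shrink the window until it spans fewer than `window` bases.
        while ordered[right] - ordered[left] >= window:
            left += 1
        if right - left + 1 > max_in_window:
            dense.update(ordered[left:right + 1])
    return [p for p in ordered if p not in dense]


if __name__ == "__main__":
    snps = [100, 5000, 5010, 5020, 5030, 9000, 20000]
    print(drop_dense_snps(snps))
```

Plotting the cumulative count of differences along the genome gives the same picture visually: a dense region shows up as a steep step, whereas evenly spread SNPs give a steady slope.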
  • 8. Total targets (May 2016)
      Type                 Total targets in k-mer tree   Targets in clusters (single linkage <= 50 SNPs)
      Salmonella           45297                         38794
      Listeria             9621                          8135
      E. coli & Shigella   13144                         6046
      Campylobacter        2234                          1569
      Acinetobacter        2179                          1299
      Elizabethkingia      89                            74
      Serratia             336                           227
      Klebsiella           1194                          677
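The two columns are easier to compare as a fraction of targets that fall into a cluster. The short sketch below only restates the slide's counts and derives that ratio; the variable and function names are illustrative.

```python
# Counts copied from the slide: (total targets in k-mer tree, targets in clusters).
TARGETS = {
    "Salmonella": (45297, 38794),
    "Listeria": (9621, 8135),
    "E. coli & Shigella": (13144, 6046),
    "Campylobacter": (2234, 1569),
    "Acinetobacter": (2179, 1299),
    "Elizabethkingia": (89, 74),
    "Serratia": (336, 227),
    "Klebsiella": (1194, 677),
}


def clustered_fraction(counts: dict) -> dict:
    """Fraction of targets per organism group that sit in a <= 50 SNP cluster."""
    return {name: in_cluster / total for name, (total, in_cluster) in counts.items()}


if __name__ == "__main__":
    for name, fraction in clustered_fraction(TARGETS).items():
        print(f"{name:20s} {fraction:6.1%}")
    total = sum(t for t, _ in TARGETS.values())
    clustered = sum(c for _, c in TARGETS.values())
    print(f"{'All':20s} {clustered / total:6.1%}  ({clustered} of {total})")
```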
  • 12. Several rows are NULL: this means the target either is not in a cluster (check the last column) or is in a cluster without any other isolate of the opposite type. Rows with a low SNP count are significant. These isolates are all <10 SNPs, and they are all in the same cluster.
  • 13. NCBI Pathogen Detection SNP Pipeline: example 1 - stone fruit outbreak
  • 15. NCBI Pathogen Detection SNP Pipeline: example 2 – MN chicken Kiev outbreak
  • 16. NCBI Pathogen Detection SNP Pipeline Web viewer (coming soon): example 3 – Elizabethkingia outbreak
  • 17. R&D: wgMLST – wgMLST approach
      - Complementary to SNP analysis, e.g. consistency check
      - Efficient for initial clustering of all isolates in species
      - Generate loci using “essentially complete” RefSeq genomes
      Organism        Number of loci   Genome in loci   Number of genomes   Major species
      Acinetobacter   2420             58.25%           43/47               baumannii
      Campylobacter   1257             68.36%           90/132              jejuni
      Escherichia     2896             52.97%           159/165             coli
      Klebsiella      4004             82.54%           67/82               pneumoniae
      Listeria        2364             73.88%           73/81               monocytogenes
      Salmonella      3469             66.98%           137/147             enterica
  • 18. R&D: wgMLST – a complementary method
      - Fast & relatively simple
      - Epidemiologists are familiar with it
      - Good for initial clustering
      - Different heuristics
      - Can use special markers for e.g. serovars
      - Still need to deal with assembly errors
      - Recombination can still be a problem…
      - Loci are not independent
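As a rough illustration of why wgMLST is fast and simple, the comparison between two isolates reduces to counting loci whose allele calls differ. The sketch below assumes each isolate is represented as a mapping from locus name to allele identifier, with None marking a locus that could not be called (for example because of an assembly error). The representation and the function name are assumptions, not the NCBI implementation.

```python
from __future__ import annotations

from typing import Mapping, Optional


def allele_distance(
    profile_a: Mapping[str, Optional[int]],
    profile_b: Mapping[str, Optional[int]],
) -> tuple[int, int]:
    """Return (differing loci, compared loci) for two wgMLST allele profiles.

    Loci missing or uncalled (None) in either profile are skipped rather than
    counted as differences.
    """
    compared = [
        locus
        for locus, allele in profile_a.items()
        if allele is not None and profile_b.get(locus) is not None
    ]
    differing = sum(1 for locus in compared if profile_a[locus] != profile_b[locus])
    return differing, len(compared)


if __name__ == "__main__":
    isolate_1 = {"locus1": 4, "locus2": 7, "locus3": None, "locus4": 2}
    isolate_2 = {"locus1": 4, "locus2": 9, "locus3": 1, "locus4": 2}
    print(allele_distance(isolate_1, isolate_2))
```

Counting per locus rather than per base means a recombination event that changes many bases in one locus adds only one difference, though, as the slide notes, recombination can still be a problem and loci are not independent.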
  • 19. NCBI’s Role in Combating Antibiotic Resistant Bacteria “Create a repository of resistant bacterial strains (an “isolate bank”) and maintain a well-curated reference database that describes the characteristics of these strains.” “Develop and maintain a national sequence database of resistant pathogens.”
  • 20. AMR efforts at NCBI
      - With collaborators, build database of sequenced isolates with standardized AMR metadata (i.e. accept antibiograms) (2019 samples as of May 16: http://www.ncbi.nlm.nih.gov/biosample/?term=antibiogram[filter])
          - Collaborators include: CDC, WRAIR, FDA, B&W
      - Stable, up-to-date database of AMR genes with standardized nomenclature
          - Collaborators: CARD
          - RefSeq set released by June 2016
      - Implement and validate tools for identifying AMR genes in new isolates
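The antibiogram[filter] query from the slide can also be issued programmatically. The sketch below builds a request URL for the NCBI E-utilities ESearch endpoint against the biosample database. Using E-utilities here is a suggestion, not something the slides mention; the function name and the default retmax are illustrative, and the code only constructs the URL without contacting NCBI.

```python
from urllib.parse import urlencode

# Public E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def antibiogram_search_url(retmax: int = 20) -> str:
    """Build an ESearch URL for BioSample records that carry an antibiogram."""
    params = {
        "db": "biosample",
        "term": "antibiogram[filter]",  # same query term as the slide's link
        "retmax": retmax,
        "retmode": "json",
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    print(antibiogram_search_url())
```

Fetching the URL returns a result whose count field can be compared with the figure on the slide; the number will have grown since May 2016.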
  • 21. Antibiogram Fields • Fields designed to find balance between comprehensiveness and ease of submission • Data dictionaries based on outside expertise (ASM, CLSI) standardize input and minimize ‘data drift’
  • 22. Antibiotic resistance
      mcr-1 encoding organisms   Total
      E. coli                    11
      Salmonella                 10
  • 23. NCBI Outputs: ftp://ftp.ncbi.nlm.nih.gov/pathogen/Results/
      - Kmer tree
      - Genome Workbench
      - full SNP reports
      - Integrated web-based interactive system*
      - AMR reports*
      - wgMLST*
  • 24. Acknowledgements
      Richa Agarwala, Azat Badretdin, Slava Brover, Joshua Cherry, Vyacheslav Chetvernin, Robert Cohen, Michael DiCuccio, Mike Feldgarden, Dan Haft, William Klimke, Alex Kotliarov, Arjun Prasad, Edward Rice, Kirill Rotmistrovskyy, Stephen Sherry, Sergey Shiryev, Martin Shumway, Tatiana Tatusova, Igor Tolstoy, Chunlin Xiao, Leonid Zaslavsky, Alexander Zasypkin, Alejandro A. Schaffer, Lukas Wagner, Aleksandr Morgulis, David Lipman, James Ostell
      CDC, FDA/CFSAN, USDA-FSIS, PHE/FERA, NIHGRI, NIAID, WRAIR, Broad, Wadsworth/MDH
      Vendors: PacBio, Illumina, Roche
      This research was supported by the Intramural Research Program of the NIH, National Library of Medicine. http://www.ncbi.nlm.nih.gov
      National Center for Biotechnology Information – National Library of Medicine – Bethesda MD 20892 USA