U.S. Food and Drug Administration 
Institute for Genome Sciences 
Development of FDA MicroDB: 
A Regulatory-Grade 
Microbial Reference Database 
Heike Sichtig, Ph.D. 
Division of Microbiology Devices 
OIR/CDRH/FDA/HHS 
Heike.Sichtig@fda.hhs.gov 
Luke Tallon 
Genomics Resource Center 
Institute for Genome Sciences 
UMSOM 
ljtallon@som.umaryland.edu 
October 21-22, 2014 
NIST Workshop to Identify Standards Needed to Support Pathogen Identification via Next-Generation Sequencing, NIST, MD
Microbial NGS-Based Diagnostic Devices 
• OIR/DMD is working on a fast-tracked Draft Guidance 
• Held a Public Workshop on April 1, 2014: 
“Advancing Regulatory Science for High Throughput Sequencing 
Devices for Microbial Identification and Detection of Antimicrobial 
Resistance Markers” [FR Doc No: 2014-04940] 
• Workshop agenda, discussion paper and webcast online: 
http://www.fda.gov/MedicalDevices/NewsEvents/WorkshopsConferences/ucm386967.htm 
Objectives: 
1. Streamline/shorten clinical trials for microbial diagnosis/identification 
2. Establish a new comparator algorithm for assays developed using this 
new technology 
3. Develop regulatory science standards for microbial genome sequencing 
4. Investigate the regulatory science required for antimicrobial resistance 
determination through microbial genome sequence information.
Inter-Agency Working Group on Feasibility 
Approach: 
• Formed a diverse working group: FDA, NIH-NCBI, NIAID, DTRA, 
LLNL, and CDC 
• Conducted small pilot study to generate information to evaluate 
quality of existing sequences in the public domain (In Progress) 
• Identify the pre-existing high-quality deposits, and build from 
there 
• Will use information to set quality bar for sequence outputs for 
our ongoing sequencing efforts 
• Utilized existing standards (if available) for technical and isolate 
metadata; no need to re-invent 
• Attention given to connecting antimicrobial resistance 
phenotype to genomic deposits – clinical collection site
Looking ahead: Predictions for Reference Databases 
– Multiple levels of Reference DBs likely 
• “High quality” genomes only 
– For validation and clinical use 
• “High quality” + other available genomes 
– For testing and development 
• Requires definition of “high quality” that must include 
some draft genomes 
– Extensive screening required 
• Human and other hosts; chimeras 
• Artificial constructs 
– Separate bacterial, viral, fungal reference DBs 
– Publicly available (NCBI/EMBL/DDBJ) 
Courtesy of Tom Slezak
Current Need 
Robust, Standardized, and High Quality Microbial 
Sequence Database in the Public Sector 
Cover illustration 
(Copyright © 2009, American Society 
for Microbiology. All Rights Reserved.) 
• Representative Samples 
• Metadata 
• High quality raw sequences 
• Assemblies 
• Annotation 
• Public Domain
Latest NCBI GenBank Report on Bacterial Genome Growth 
[Figure: Bacterial Genomes Report. Count (0 to 25,000) plotted against Date (Jul-98 to Dec-14); series: #Genomes and #Real Species.] 
Courtesy of NCBI
Microbial Reference Database (MicroDB) ($1.67M) 
(Funding awarded by FDA/OCET) 
• Identify “gaps” and target sequencing efforts 
• All raw reads, assemblies, annotations, metadata sent to NCBI and 
accessible to the PUBLIC 
• Traceable results that could be reevaluated as necessary 
>600 Clinically Relevant and MCM Microorganisms 
Highly Controlled and Documented Approach 
Collaborations with Clinical Labs and Repositories 
• Children’s National Hospital 
• DoD Critical Reagents Program (CRP, USAMRIID) 
• FDA-CFSAN, FDA-CBER, FDA-CDER 
• DHS National Biodefense Analysis and 
Countermeasures Center (NBACC) 
• The Rockefeller University 
• Culture Collections: ATCC, DSMZ 
Sequencing Center (UMD IGS) 
• Hybrid Approach (PacBio and Illumina) 
• Deposit of Raw Reads at NCBI (SRA) 
• Deposit of Assemblies at NCBI 
• Deposit of Annotations at NCBI 
• FDA Interface to Access Data
MicroDB Requirements 
A. Extracted Genomic DNA (gDNA) 
– Extracted gDNA should be of high quality and purity, and at sufficient concentration to 
achieve a suitable yield to assure adequate depth and breadth of genomic coverage for 
the type of sequencing method employed. 
B. BioSample Metadata 
– A minimal description of the isolate source material is necessary for traceability. We are 
using 14 descriptors as outlined below. (Note: Minimal metadata is modeled in part after 
NCBI’s minimal pathogen template) 
– Unique ID, organism, strain/isolate, sample site, specimen type, host disease, collection 
date, collected by, patient age, gender, geographic location, AST method*, AST method 
manufacturer*, Antimicrobial Susceptibilities* 
C. Sequencing Data 
– The minimum requirement for sequencing data is that the generated raw reads should be 
deposited in NCBI’s Sequence Read Archive (SRA) and assemblies should be deposited 
at NCBI’s Assembly division. The availability of raw reads and assemblies will provide a 
pathway to re-analyze the data as newer technologies emerge. Furthermore, annotation 
data should be deposited when available. 
– Raw reads, assemblies, annotations* 
*not used as a criterion for exclusion
MicroDB Requirements 
D. Sequencing Metadata 
– A minimal description of the sequencing process is necessary for traceability. We are 
using 7 descriptors as outlined below including bioinformatics tool information for assembly 
and annotation, and genomic coverage information. 
– Library, platform, submitted by, fold coverage, pipeline, assembler, annotation tool* 
E. Suggested phenotypic metadata* 
– A description of the phenotypic information is suggested to create a link between the 
phenotypic traits of particular organisms and their genomic sequence. We are 
recommending 5 descriptors as outlined below (1-4 are also included in sections B and C). 
– Annotation, AST method, AST method manufacturer, antimicrobial susceptibilities, 
additional phenotypic data 
*not used as a criterion for exclusion
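The minimal BioSample metadata in section B lends itself to an automated completeness check. The sketch below is illustrative only and is not part of the MicroDB pipeline: the field keys are hypothetical snake_case renderings of the 14 descriptors (they are not official NCBI attribute names), and it assumes that the unstarred descriptors are mandatory while the three starred ones are optional.

```python
# Hypothetical keys for the 14 BioSample descriptors listed in section B.
# These are NOT official NCBI BioSample attribute names.
REQUIRED_FIELDS = (
    "unique_id", "organism", "strain_isolate", "sample_site",
    "specimen_type", "host_disease", "collection_date", "collected_by",
    "patient_age", "gender", "geographic_location",
)

# Starred descriptors: recorded when available, never grounds for exclusion.
OPTIONAL_FIELDS = (
    "ast_method", "ast_method_manufacturer", "antimicrobial_susceptibilities",
)


def missing_required(record):
    """Return the required descriptors that are absent or blank."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        # Treat None and whitespace-only strings as missing; 0 is a valid value.
        if value is None or str(value).strip() == "":
            missing.append(field)
    return missing


def is_eligible(record):
    """Assumed rule: a record passes when no required descriptor is missing."""
    return not missing_required(record)


if __name__ == "__main__":
    example = {field: "placeholder" for field in REQUIRED_FIELDS}
    example["collection_date"] = ""  # simulate a gap in the metadata
    print(missing_required(example))
```

A record that lacks only AST information still passes this check, which mirrors the footnote that starred descriptors are not used for exclusion.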
NCBI Submission Cases 
1. Children’s National Medical Center 
– Submit all data when available 
– Register sample metadata via BioSample 
– Submit raw reads and assemblies generated by IGS when available 
2. FDA/CFSAN 
– Collaborative agreement: Wait for genome announcements 
– Follow the same procedures as for case 1, but place a ‘6-month hold’ on 
data release; lift the hold when genome announcements are out 
3. Rockefeller University 
– Collaborative agreement: Wait for publication 
– Follow the same procedures as for case 1, but place a ‘6-month hold’ on 
data release; lift the hold when the publication is out 
Similar agreements are in place with other collaborators, depending 
on their needs 
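The three submission cases differ only in when the deposited data become public. A minimal sketch of that release rule follows; the function names are hypothetical, and it assumes that the hold runs from the submission date, expires on its own after the stated number of months, and is lifted early if the publication or genome announcement appears first. The slides do not confirm these details.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(start: date, months: int) -> date:
    """Shift a date forward by whole months, clamping to the month's last day."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def release_date(submitted: date, hold_months: int = 0,
                 published: Optional[date] = None) -> date:
    """Earliest date on which a submission becomes public (assumed rule).

    hold_months=0 models case 1 (release as soon as data are available);
    hold_months=6 models cases 2 and 3.
    """
    hold_ends = add_months(submitted, hold_months)
    if published is None:
        return hold_ends
    # Lift the hold early on publication, but never before submission.
    return max(submitted, min(hold_ends, published))
```

For example, `release_date(date(2014, 10, 21), hold_months=6)` gives the hold expiry date, while passing an earlier `published` date returns that date instead.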
Project Approach 
• Sequencing in large batches 
– Illumina HiSeq paired-end sequencing: >200x 
– PacBio long-insert SMRT P4-C2 sequencing: >80-100x 
• Assembly 
– PacBio only (HGAP, PBcR CA) 
– Illumina only (CA, MaSuRCA) 
– PacBio/Illumina hybrid (CA) 
– Minimal manual QA/QC & curation 
• Automated annotation 
• Base modification detection 
• Raw reads -> NCBI SRA 
• Assembled & annotated genomes -> GenBank 
– NCBI BioProject ID: PRJNA231221 
• FDA Web interface to aggregate data
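The coverage targets above can be checked with simple arithmetic: fold coverage is the total number of sequenced bases divided by the genome size. The sketch below is an illustration under stated assumptions, not the project's QC code. The function names and the `TARGETS` table are hypothetical, and the PacBio threshold uses 80x, the lower bound of the ">80-100x" figure.

```python
# Assumed minimum fold coverage per platform, taken from the slide.
# 80.0 is the lower bound of the ">80-100x" PacBio target.
TARGETS = {"illumina": 200.0, "pacbio": 80.0}


def fold_coverage(total_bases: int, genome_size_bp: int) -> float:
    """Fold coverage = total sequenced bases / genome size in bp."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return total_bases / genome_size_bp


def meets_target(platform: str, total_bases: int, genome_size_bp: int) -> bool:
    """True when coverage strictly exceeds the platform's assumed target."""
    return fold_coverage(total_bases, genome_size_bp) > TARGETS[platform]


if __name__ == "__main__":
    # Made-up yield for a 2.8 Mbp genome, for illustration only.
    print(fold_coverage(1_400_000_000, 2_800_000))  # 500.0
```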
Progress - Batch 1 
Rockefeller (50) 
• Uniform sample set 
– Staphylococcus aureus 
– 2.8 Mbp genome size 
– 32.8 %GC 
– Significant metadata 
CNH/CFSAN (41) 
• Diverse sample set 
– 18 genera represented 
– 2-8 Mbp genome size range 
– 38-67 %GC range 
(Images: Wikimedia Commons) 
NCBI BioProject: PRJNA231221
Rockefeller Samples 
• Sequencing 
– Avg Illumina cvg: 578x 
– Avg PacBio cvg: 185x 
– 1 or 2 SMRT cells each 
• Assembly: 
– 32 of 50 in single contig chromosome 
– Average contig count = 5 
– “Best” assembly: 
  • HGAP = 29 
  • CA hybrid = 21 
  • Most differences subtle 
• Annotation complete 
• Final QC & data submissions underway
CNH/CFSAN Samples 
• Sequencing 
– Avg Illumina cvg: 315x 
– Avg PacBio cvg: 167x 
  • 2 SMRT cells each 
• Assembly 
– 12 of 41 in single contig chromosome 
  • 29 in <= 5 contigs 
– Avg contig count = 4.5 
– Median contig count = 3 
– “Best” assembly (of 41): 
  • HGAP = 24 
  • PBcR CA = 14 
  • CA hybrid = 3 
• Annotation underway
Assembly QC & Curation 
[Figure: two dot plots of ROCK_290 assemblies against reference gi|374362062|gb|CP003033.1| (axes 0 to 2,500,000 bp), each with a %similarity scale (0 to 100). The query contig labels in the HGAP2 panel are garbled in the source.] 
ROCK_290 Celera8 ctg vs. ref (contig ctg7180000000002) 
• CA8, Ill/PB hybrid 
• Largest Ctg Len: 2,759,091 bp 
• Total asm Ctg Len: 2,770,822 bp 
ROCK_290 HGAP2 ctg vs. ref 
• HGAP2 
• Largest Ctg Len: 2,128,476 bp 
• Total asm Ctg Len: 2,802,621 bp
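One way to read the two ROCK_290 panels is to ask how much of each assembly sits in its largest contig. The sketch below computes that fraction from the lengths reported on the slide. The metric and the function name are illustrative choices made here, not a criterion stated by the authors. It gives roughly 99.6% for the CA8 hybrid assembly and 75.9% for HGAP2.

```python
def largest_contig_fraction(largest_ctg_bp: int, total_asm_bp: int) -> float:
    """Fraction of the total assembly length held in the largest contig."""
    if total_asm_bp <= 0 or not 0 <= largest_ctg_bp <= total_asm_bp:
        raise ValueError("lengths must satisfy 0 <= largest <= total, total > 0")
    return largest_ctg_bp / total_asm_bp


# (largest contig length, total assembled contig length) in bp, from the slide.
ROCK_290 = {
    "CA8 Illumina/PacBio hybrid": (2_759_091, 2_770_822),
    "HGAP2": (2_128_476, 2_802_621),
}

if __name__ == "__main__":
    for name, (largest, total) in ROCK_290.items():
        print(f"{name}: {largest_contig_fraction(largest, total):.1%}")
```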
Assembly QC & Curation 
[Figure: dot plot of HGAP2 contig scf7180000000002|quiver against reference gi|595636499|gb|CP007454.1| (axes 0 to 2,500,000 bp), with a %similarity scale (0 to 100).] 
• HGAP2 
• Largest Ctg Len: 2,764,709 bp 
• Total asm Ctg Len: 2,764,709 bp 
• 4 bp overlap? 
– 1X coverage TAAC 
– 1X coverage TAGC
Challenges & Opportunities 
• Sample acquisition & quality 
• Efficiency/throughput vs. accuracy/quality 
– Sequencing strategy 
– Assembly QA/QC & curation 
• Ever longer reads! 
– Reduced coverage -> higher efficiency sequencing 
– More “closed” genomes! 
• Small plasmids 
– SageELF & Illumina
Thank You 
FDA Micro Team 
Peyton Hobson, Brittany Goldberg, Kevin Snyder, Tamara Feldblyum, Uwe Scherf, Sally Hojvat 
Collaborators 
LLNL 
Tom Slezak 
NIH-NCBI 
Bill Klimke, Martin Shumway, David Lipman 
NIH-NIAID 
Vivien Dugan, Maria Giovani 
DTRA 
Matt Tobelmann, Chris Detter, Eric 
VanGieson, Nels Olsen 
CDC 
Duncan MacCannell 
FDA-CFSAN 
Maria Hoffmann, Cary Pirone, Andrea 
Ottessen, Marc Allard, Eric Brown 
NMRC 
Kim Bishop-Lilly, Ken Frey 
IGS@UMD 
Lisa Sadzewicz, Luke Tallon, Naomi 
Sengamalay, Al Godinez, Sandy 
Ott, Sushma Nagaraj, Claire Fraser 
Rockefeller University 
Bryan Utter, Douglas Deutsch 
Children’s National Medical Center 
Brittany Goldberg, Joseph Campos 
DOD-CRP 
Shanmuga Sozhamannan, Mike Smith 
DOD-USAMRIID 
Tim Minogue 
NBACC 
Adam Phillippy, Nick Bergman 
ATCC 
Liz Kerrigan 
DSMZ 
Cathrin Sproer

GenAI talk for Young at Wageningen University & Research (WUR) March 2024
 
Charateristics of the Angara-A5 spacecraft launched from the Vostochny Cosmod...
Charateristics of the Angara-A5 spacecraft launched from the Vostochny Cosmod...Charateristics of the Angara-A5 spacecraft launched from the Vostochny Cosmod...
Charateristics of the Angara-A5 spacecraft launched from the Vostochny Cosmod...
 
Q4-Mod-1c-Quiz-Projectile-333344444.pptx
Q4-Mod-1c-Quiz-Projectile-333344444.pptxQ4-Mod-1c-Quiz-Projectile-333344444.pptx
Q4-Mod-1c-Quiz-Projectile-333344444.pptx
 
DNA isolation molecular biology practical.pptx
DNA isolation molecular biology practical.pptxDNA isolation molecular biology practical.pptx
DNA isolation molecular biology practical.pptx
 
PLASMODIUM. PPTX
PLASMODIUM. PPTXPLASMODIUM. PPTX
PLASMODIUM. PPTX
 
AZOTOBACTER AS BIOFERILIZER.PPTX
AZOTOBACTER AS BIOFERILIZER.PPTXAZOTOBACTER AS BIOFERILIZER.PPTX
AZOTOBACTER AS BIOFERILIZER.PPTX
 
GLYCOSIDES Classification Of GLYCOSIDES Chemical Tests Glycosides
GLYCOSIDES Classification Of GLYCOSIDES  Chemical Tests GlycosidesGLYCOSIDES Classification Of GLYCOSIDES  Chemical Tests Glycosides
GLYCOSIDES Classification Of GLYCOSIDES Chemical Tests Glycosides
 
Science (Communication) and Wikipedia - Potentials and Pitfalls
Science (Communication) and Wikipedia - Potentials and PitfallsScience (Communication) and Wikipedia - Potentials and Pitfalls
Science (Communication) and Wikipedia - Potentials and Pitfalls
 
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdfKDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
 
Oxo-Acids of Halogens and their Salts.pptx
Oxo-Acids of Halogens and their Salts.pptxOxo-Acids of Halogens and their Salts.pptx
Oxo-Acids of Halogens and their Salts.pptx
 
bonjourmadame.tumblr.com bhaskar's girls
bonjourmadame.tumblr.com bhaskar's girlsbonjourmadame.tumblr.com bhaskar's girls
bonjourmadame.tumblr.com bhaskar's girls
 
Observational constraints on mergers creating magnetism in massive stars
Observational constraints on mergers creating magnetism in massive starsObservational constraints on mergers creating magnetism in massive stars
Observational constraints on mergers creating magnetism in massive stars
 
WEEK 4 PHYSICAL SCIENCE QUARTER 3 FOR G11
WEEK 4 PHYSICAL SCIENCE QUARTER 3 FOR G11WEEK 4 PHYSICAL SCIENCE QUARTER 3 FOR G11
WEEK 4 PHYSICAL SCIENCE QUARTER 3 FOR G11
 
Environmental acoustics- noise criteria.pptx
Environmental acoustics- noise criteria.pptxEnvironmental acoustics- noise criteria.pptx
Environmental acoustics- noise criteria.pptx
 
linear Regression, multiple Regression and Annova
linear Regression, multiple Regression and Annovalinear Regression, multiple Regression and Annova
linear Regression, multiple Regression and Annova
 
CHROMATOGRAPHY PALLAVI RAWAT.pptx
CHROMATOGRAPHY  PALLAVI RAWAT.pptxCHROMATOGRAPHY  PALLAVI RAWAT.pptx
CHROMATOGRAPHY PALLAVI RAWAT.pptx
 
Replisome-Cohesin Interfacing A Molecular Perspective.pdf
Replisome-Cohesin Interfacing A Molecular Perspective.pdfReplisome-Cohesin Interfacing A Molecular Perspective.pdf
Replisome-Cohesin Interfacing A Molecular Perspective.pdf
 
Explainable AI for distinguishing future climate change scenarios
Explainable AI for distinguishing future climate change scenariosExplainable AI for distinguishing future climate change scenarios
Explainable AI for distinguishing future climate change scenarios
 

Development of FDA MicroDB: A Regulatory-Grade Microbial Reference Database

  • 1. Development of FDA MicroDB: A Regulatory-Grade Microbial Reference Database
    U.S. Food and Drug Administration; Institute for Genome Sciences
    Heike Sichtig, Ph.D., Division of Microbiology Devices, OIR/CDRH/FDA/HHS, Heike.Sichtig@fda.hhs.gov
    Luke Tallon, Genomics Resource Center, Institute for Genome Sciences, UMSOM, ljtallon@som.umaryland.edu
    NIST Workshop to Identify Standards Needed to Support Pathogen Identification via Next-Generation Sequencing, NIST, MD, October 21-22, 2014
  • 2. Microbial NGS-Based Diagnostic Devices
    • OIR/DMD is working on a fast-tracked Draft Guidance
    • Public Workshop held on April 1, 2014: “Advancing Regulatory Science for High Throughput Sequencing Devices for Microbial Identification and Detection of Antimicrobial Resistance Markers” [FR Doc No: 2014-04940]
    • Workshop agenda, discussion paper and webcast online: http://www.fda.gov/MedicalDevices/NewsEvents/WorkshopsConferences/ucm386967.htm
    Objectives:
    1. Streamline/shorten clinical trials for microbial diagnosis/identification
    2. Establish a new comparator algorithm for assays developed using this new technology
    3. Develop regulatory science standards for microbial genome sequencing
    4. Investigate the regulatory science required for antimicrobial resistance determination through microbial genome sequence information
  • 3. Inter-Agency Working Group on Feasibility
    Approach:
    • Formed a diverse working group: FDA, NIH-NCBI, NIAID, DTRA, LLNL, and CDC
    • Conducted small pilot study to generate information to evaluate quality of existing sequences in the public domain (In Progress)
    • Identify the pre-existing high-quality deposits, and build from there
    • Will use information to set quality bar for sequence outputs for our ongoing sequencing efforts
    • Utilized existing standards (if available) for technical and isolate metadata – no need to re-invent
    • Attention given to connecting antimicrobial resistance phenotype to genomic deposits – clinical collection site
  • 4. Looking ahead: Predictions for Reference Databases
    – Multiple levels of Reference DBs likely
      • “High quality” genomes only
        – For validation and clinical use
      • “High quality” + other available genomes
        – For testing and development
      • Requires definition of “high quality” that must include some draft genomes
    – Extensive screening required
      • Human and other hosts; chimeras
      • Artificial constructs
    – Separate bacterial, viral, fungal reference DBs
    – Publicly available (NCBI/EMBL/DDBJ)
    Courtesy of Tom Slezak
  • 5. Current Need: Robust, Standardized, and High Quality Microbial Sequence Database in the Public Sector
    • Representative Samples
    • Metadata
    • High quality raw sequences
    • Assemblies
    • Annotation
    • Public Domain
    Cover illustration (Copyright © 2009, American Society for Microbiology. All Rights Reserved.)
  • 6. Latest NCBI GenBank Report on Bacterial Genome Growth
    [Figure: Bacterial Genomes Report. Count (0 to 25,000) versus date (Jul-98 to Dec-14); series: #Genomes, #Real Species.]
    Courtesy of NCBI
  • 7. Microbial Reference Database (MicroDB) ($1.67M; funding awarded by FDA/OCET)
    • Identify “gaps” and target sequencing efforts
    • All raw reads, assemblies, annotations, metadata sent to NCBI and accessible to the PUBLIC
    • Traceable results that could be reevaluated as necessary
    >600 Clinically Relevant and MCM Microorganisms
    Highly Controlled and Documented Approach
    Collaborations with Clinical Labs and Repositories
    • Children’s National Hospital
    • DoD Critical Reagents Program (CRP, USAMRIID)
    • FDA-CFSAN, FDA-CBER, FDA-CDER
    • DHS National Biodefense Analysis and Countermeasures Center (NBACC)
    • The Rockefeller University
    • Culture Collections: ATCC, DSMZ
    Sequencing Center (UMD IGS)
    • Hybrid Approach (PacBio and Illumina)
    • Deposit of Raw Reads at NCBI (SRA)
    • Deposit of Assemblies at NCBI
    • Deposit of Annotations at NCBI
    • FDA Interface to Access Data
  • 8. MicroDB Requirements
    A. Extracted Genomic DNA (gDNA)
      – Extracted gDNA should be of high quality and purity, and at sufficient concentration to achieve a suitable yield to assure adequate depth and breadth of genomic coverage for the type of sequencing method employed.
    B. BioSample Metadata
      – A minimal description of the isolate source material is necessary for traceability. We are using 14 descriptors as outlined below. (Note: Minimal metadata is modeled in part after NCBI’s minimal pathogen template.)
      – Unique ID, organism, strain/isolate, sample site, specimen type, host disease, collection date, collected by, patient age, gender, geographic location, AST method*, AST method manufacturer*, antimicrobial susceptibilities*
    C. Sequencing Data
      – The minimum requirement for sequencing data is that the generated raw reads should be deposited in NCBI’s Sequence Read Archive (SRA) and assemblies should be deposited at NCBI’s Assembly division. The availability of raw reads and assemblies will provide a pathway to re-analyze the data as newer technologies emerge. Furthermore, annotation data should be deposited when available.
      – Raw reads, assemblies, annotations*
    *not used as a criterion for exclusion
  • 9. MicroDB Requirements
    D. Sequencing Metadata
      – A minimal description of the sequencing process is necessary for traceability. We are using 7 descriptors as outlined below, including bioinformatics tool information for assembly and annotation, and genomic coverage information.
      – Library, platform, submitted by, fold coverage, pipeline, assembler, annotation tool*
    E. Suggested phenotypic metadata*
      – A description of the phenotypic information is suggested to create a link between the phenotypic traits of particular organisms and their genomic sequence. We are recommending 5 descriptors as outlined below (1-4 are also included in sections B and C).
      – Annotation, AST method, AST method manufacturer, antimicrobial susceptibilities, additional phenotypic data
    *not used as a criterion for exclusion
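    The descriptor lists in sections B and D amount to a completeness check on each record. A minimal sketch of such a check follows, assuming a record is a plain dictionary; the snake_case field names and the `missing_required` helper are illustrative renderings of the slide's descriptors, not NCBI BioSample attribute names or part of any FDA tooling. Starred descriptors are treated as optional because the slides state they are not used as a criterion for exclusion.

    ```python
    # Hypothetical field names derived from the slide's descriptor lists.
    BIOSAMPLE_REQUIRED = (
        "unique_id", "organism", "strain_isolate", "sample_site",
        "specimen_type", "host_disease", "collection_date", "collected_by",
        "patient_age", "gender", "geographic_location",
    )
    # Starred on the slide: recorded when available, never a reason to exclude.
    BIOSAMPLE_OPTIONAL = (
        "ast_method", "ast_method_manufacturer", "antimicrobial_susceptibilities",
    )
    SEQUENCING_REQUIRED = (
        "library", "platform", "submitted_by", "fold_coverage",
        "pipeline", "assembler",
    )
    SEQUENCING_OPTIONAL = ("annotation_tool",)


    def missing_required(record):
        """Return the required descriptors that are absent or empty in record."""
        required = BIOSAMPLE_REQUIRED + SEQUENCING_REQUIRED
        return [name for name in required if record.get(name) in (None, "")]
    ```

    A record would pass this sketch when `missing_required(record)` returns an empty list; optional descriptors never appear in the result.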
  • 10. NCBI Submission Cases
    1. Children’s National Medical Center
      – Submit all data when available
      – Register sample metadata via BioSample
      – Submit raw reads and assemblies generated by IGS when available
    2. FDA/CFSAN
      – Collaborative agreement: Wait for genome announcements
      – Follow same procedures as for 1 and put a ‘6 month hold’ to release data, lift hold when genome announcements are out
    3. Rockefeller University
      – Collaborative agreement: Wait for publication
      – Follow same procedures as for 1 and put a ‘6 month hold’ to release data, lift hold when publication is out
    Similar agreements in place with other collaborators depending on their needs
  • 11. Project Approach
    • Sequencing in large batches
      – Illumina HiSeq paired-end sequencing: >200x
      – PacBio long-insert SMRT P4-C2 sequencing: >80-100x
    • Assembly
      – PacBio only (HGAP, PBcR CA)
      – Illumina only (CA, MaSuRCA)
      – PacBio/Illumina hybrid (CA)
      – Minimal manual QA/QC & curation
    • Automated Annotation
    • Base modification detection
    • Raw reads -> NCBI SRA
    • Assembled & annotated genomes -> GenBank
      – NCBI BioProject ID: PRJNA231221
    • FDA Web interface to aggregate data
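    The coverage targets above (>200x Illumina, >80-100x PacBio) follow the usual definition of fold coverage: total sequenced bases divided by genome size. The sketch below works through that arithmetic. The function names are hypothetical, and the 100 bp read length in the usage note is an assumption chosen for illustration; the slides do not state read lengths.

    ```python
    import math


    def fold_coverage(total_bases, genome_size_bp):
        """Average fold coverage: sequenced bases divided by genome size."""
        if genome_size_bp <= 0:
            raise ValueError("genome_size_bp must be positive")
        return total_bases / genome_size_bp


    def reads_needed(target_coverage, genome_size_bp, read_length_bp):
        """Smallest number of reads of a fixed length that reaches the target."""
        if read_length_bp <= 0:
            raise ValueError("read_length_bp must be positive")
        return math.ceil(target_coverage * genome_size_bp / read_length_bp)
    ```

    For a 2.8 Mbp genome (the Staphylococcus aureus size reported for Batch 1) and an assumed 100 bp read length, `reads_needed(200, 2_800_000, 100)` gives 5,600,000 reads to reach 200x.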
  • 12. Progress - Batch 1
    Rockefeller (50)
    • Uniform sample set
      – Staphylococcus aureus
      – 2.8 Mbp genome size
      – 32.8 %GC
      – Significant metadata
    CNH/CFSAN (41)
    • Diverse sample set
      – 18 genera represented
      – 2 – 8 Mbp genome size range
      – 38 – 67 %GC range
    Images: Wikimedia Commons
    NCBI BioProject: PRJNA231221
  • 13. Rockefeller Samples
    • Sequencing
      – Avg Illumina cvg: 578x
      – Avg PacBio cvg: 185x
      – 1 or 2 SMRT cells each
    • Assembly:
      – 32 of 50 in single contig chromosome
      – Average contig count = 5
      – “Best” assembly:
        • HGAP = 29
        • CA hybrid = 21
        • Most differences subtle
    • Annotation complete
    • Final QC & data submissions underway
  • 14. CNH/CFSAN Samples
    • Sequencing
      – Avg Illumina cvg: 315x
      – Avg PacBio cvg: 167x
        • 2 SMRT cells each
    • Assembly
      – 12 of 41 in single contig chromosome
        • 29 in <= 5 contigs
      – Avg contig count = 4.5
      – Median contig count = 3
      – “Best” assembly (of 41):
        • HGAP = 24
        • PBcR CA = 14
        • CA hybrid = 3
    • Annotation underway
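    The assembly summary above reports a mean contig count (4.5) higher than the median (3), which is what a few highly fragmented assemblies would produce. The sketch below shows how such summary figures could be derived from per-sample contig counts. The function name and the counts in the usage note are made up for illustration; the per-sample values behind the slide are not given.

    ```python
    from statistics import mean, median


    def summarize_contig_counts(counts, threshold=5):
        """Summarize per-sample contig counts for a batch of assemblies."""
        if not counts:
            raise ValueError("counts must not be empty")
        return {
            "samples": len(counts),
            # Samples whose assembly has at most `threshold` contigs.
            "at_most_threshold": sum(1 for c in counts if c <= threshold),
            "mean": mean(counts),
            "median": median(counts),
        }
    ```

    With the hypothetical counts `[1, 2, 3, 4, 10]`, the mean is 4 and the median is 3: one fragmented assembly pulls the mean up while the median stays put.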
  • 15. Assembly QC & Curation
    [Figure: two contig-versus-reference plots for sample ROCK_290 against reference gi|374362062|gb|CP003033.1|; panel titles “ROCK_290 Celera8 ctg vs. ref” and “ROCK_290 HGAP2 ctg vs. ref”; axes run 0 to 2,500,000 bp; color scale: % similarity (0 to 100).]
    • CA8 – Ill/PB hybrid
      – Largest Ctg Len: 2,759,091 bp
      – Total asm Ctg Len: 2,770,822 bp
    • HGAP2
      – Largest Ctg Len: 2,128,476 bp
      – Total asm Ctg Len: 2,802,621 bp
  • 16. Assembly QC & Curation
    [Figure: contig scf7180000000002|quiver plotted against reference gi|595636499|gb|CP007454.1|; axes run 0 to 2,500,000 bp; color scale: % similarity (0 to 100). Annotations: “4bp overlap?”; “1X coverage TAAC”; “1X coverage TAGC”.]
    • HGAP2
      – Largest Ctg Len: 2,764,709 bp
      – Total asm Ctg Len: 2,764,709 bp
  • 17. Challenges & Opportunities
    • Sample acquisition & quality
    • Efficiency/throughput vs. accuracy/quality
      – Sequencing strategy
      – Assembly QA/QC & curation
    • Ever longer reads!
      – Reduced coverage -> higher efficiency sequencing
      – More “closed” genomes!
    • Small plasmids
      – SageELF & Illumina
  • 18. Thank You
    FDA Micro Team: Peyton Hobson, Brittany Goldberg, Kevin Snyder, Tamara Feldblyum, Uwe Scherf, Sally Hojvat
    Collaborators
    • LLNL: Tom Slezak
    • NIH-NCBI: Bill Klimke, Martin Shumway, David Lipman
    • NIH-NIAID: Vivien Dugan, Maria Giovani
    • DTRA: Matt Tobelmann, Chris Detter, Eric VanGieson, Nels Olsen
    • CDC: Duncan MacCannell
    • FDA-CFSAN: Maria Hoffmann, Cary Pirone, Andrea Ottessen, Marc Allard, Eric Brown
    • NMRC: Kim Bishop-Lilly, Ken Frey
    • IGS@UMD: Lisa Sadzewicz, Luke Tallon, Naomi Sengamalay, Al Godinez, Sandy Ott, Sushma Nagaraj, Claire Fraser
    • Rockefeller University: Bryan Utter, Douglas Deutsch
    • Children’s National Medical Center: Brittany Goldberg, Joseph Campos
    • DOD-CRP: Shanmuga Sozhamannan, Mike Smith
    • DOD-USAMRIID: Tim Minogue
    • NBACC: Adam Phillippy, Nick Bergman
    • ATCC: Liz Kerrigan
    • DSMZ: Cathrin Sproer