Expanding the Genome in a Bottle benchmark callsets with high-confidence small variant calls from long and linked read sequencing technologies
Justin Wagner (1), Nathan D. Olson (1), Lesley M. Chapman (1), Marc Salit (1,2,3), Justin M. Zook (1), and the Genome in a Bottle Consortium
(1) Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr., Gaithersburg, MD 20899; (2) Joint Initiative for Metrology in Biology, Stanford, CA 94305, USA; (3) Department of Bioengineering, Stanford University, Stanford, CA 94305

Overview
NIST hosts the Genome in a Bottle Consortium, which develops metrology infrastructure for characterization of human whole genome variant detection. Consortium products include:
• Characterization of seven broadly-consented human genomes including 2 son-mother-father trios released as Reference Materials (RMs)
• Reference data associated with RMs are benchmark variants and genomic regions covering, for example, 87.84% of assembled bases in chromosomes 1-22 in GRCh37 for the sample HG002
• Short read variant callers perform poorly in genomic locations with high homology such as segmental duplications and low-complexity repeat-rich regions
• Now utilizing PacBio long read data and 10X Genomics linked reads to expand the GIAB benchmark regions and reduce errors in current regions
• Initial results suggest linked and long reads might be able to add 139,480 benchmark SNPs and 16,081 insertions/deletions, mostly in regions difficult to map with short reads

Results from Adding Long and Linked Reads
Integration data for HG002 with GRCh37
Ongoing and Future work
Integration Pipeline Process
Benchmark includes more bases, variants, and segmental duplications in v4⍺
Comparison of Illumina GATK4 VCF against benchmark sets
• The SNP false-negative (FN) rate increases by roughly a factor of 10 from v3.3.2 to v4⍺ (recall for all SNPs falls from 0.9995 to 0.9947), almost entirely due to new benchmark variants in difficult-to-map regions (lowmap) and segmental duplications (segdups)
Performance in medically-relevant genes
• Top 5 genes by increase in benchmark variants from v3.3.2 to v4⍺: TSPEAR (37), TNXB (22), CYP21A2 (9), KANSL1 (9), SDHA (8)
• PMS2, from the ACMG59 gene list, has 2 more variants covered in the v4⍺ benchmark
Genome in a Bottle Consortium
| Platform | Characteristics | Alignment; Variant Calling |
| --- | --- | --- |
| PacBio | ~15Kbp reads; ~28x coverage | minimap2; GATK4 |
| PacBio | ~15Kbp reads; ~28x coverage | minimap2; DeepVariant |
| 10X | Linked reads; ~84x coverage | LongRanger Pipeline |
[Figure: schematic of the integration pipeline. Tracks shown for each input variant-calling method: PASS variants, filtered outliers, low/high coverage or low MQ (or low GQ for gVCF), difficult regions/SVs, and callable regions. Outputs: benchmark calls and benchmark regions. Example sites are labeled (1) concordant; (2) discordant, unresolved; (3) discordant, arbitrated; (4) concordant, not callable.]
1. Find sensitive variant calls and callable regions for each dataset, excluding difficult regions/SVs that are problematic for each type of data and variant caller.
2. Find “consensus” calls with support from 2+ technologies (and no other technologies disagree) using callable regions.
3. Use “consensus” calls to train a simple one-class model for each dataset and find “outliers” that are less trustworthy for each dataset.
4. Find benchmark calls by using callable regions and “outliers” to arbitrate between datasets when they disagree.
5. Find benchmark regions by taking the union of callable regions and subtracting uncertain variants.
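The consensus and arbitration steps can be illustrated on toy data. The following is a minimal sketch, not the GIAB integration pipeline itself: the data layout (one position-to-genotype dict per technology), the function name `integrate_site`, the hom-ref default, and the rule that arbitration succeeds when at least one non-outlier dataset remains and all remaining datasets agree are assumptions made for illustration.

```python
HOM_REF = "0/0"  # assumed genotype when a callable dataset reports no variant


def integrate_site(pos, calls, callable_sites, outliers):
    """Classify one site and return (status, genotype or None).

    calls:          {technology: {position: genotype}}
    callable_sites: {technology: set of callable positions}
    outliers:       set of (technology, position) pairs flagged as less trustworthy
    """
    # Only datasets in which the site is callable get a vote.
    votes = {
        tech: calls[tech].get(pos, HOM_REF)
        for tech in calls
        if pos in callable_sites[tech]
    }
    # Consensus needs support from 2+ technologies.
    if len(votes) < 2:
        return ("not callable", None)
    if len(set(votes.values())) == 1:
        return ("concordant", next(iter(votes.values())))
    # Datasets disagree: ignore calls flagged as outliers and look again.
    trusted = {t: gt for t, gt in votes.items() if (t, pos) not in outliers}
    if trusted and len(set(trusted.values())) == 1:
        return ("discordant arbitrated", next(iter(trusted.values())))
    return ("discordant unresolved", None)


# Toy inputs (hypothetical positions and genotypes).
calls = {
    "illumina": {100: "0/1", 200: "1/1", 300: "0/1"},
    "pacbio": {100: "0/1", 200: "0/1", 300: "1/1", 400: "1/1"},
    "10x": {100: "0/1", 200: "0/1", 400: "1/1"},
}
callable_sites = {
    "illumina": {100, 200, 300},
    "pacbio": {100, 200, 300, 400},
    "10x": {100, 200},
}
outliers = {("illumina", 200)}

results = {
    pos: integrate_site(pos, calls, callable_sites, outliers)
    for pos in (100, 200, 300, 400)
}
for pos, (status, genotype) in results.items():
    print(pos, status, genotype)
```

In this toy example the four positions fall into the four categories of the schematic: site 100 is concordant, site 200 is arbitrated once the flagged Illumina call is set aside, site 300 stays unresolved, and site 400 has matching calls but is callable in only one dataset.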
Variants in Medical Exome (genes from OMIM, HGMD, ClinVar, UniProt)
| Benchmark Regions | Variants |
| --- | --- |
| v3.3.2 | 8,209 |
| v4⍺ | 8,627 |
| Difficult Region Description | Method Excluded From |
| --- | --- |
| All candidate structural variant regions from the Son-Mother-Father Trio | All methods |
| All tandem repeats < 51bp in length | All methods except GATK from Illumina PCR-free and Complete Genomics |
| All tandem repeats > 51bp and < 200bp in length | All methods except GATK from Illumina PCR-free |
| All tandem repeats > 200bp in length | All methods |
| Perfect or imperfect homopolymers > 10bp | All methods except GATK from Illumina PCR-free |
| Segmental duplications from Eichler et al. | All methods except 10X Genomics linked reads and PacBio CCS |
| Segmental duplications > 10Kbp from self-chain mapping | All methods except 10X Genomics linked reads and PacBio CCS |
| Regions homologous to contigs in hs37d5 decoy | All methods except 10X Genomics linked reads and PacBio CCS |
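Excluding a different set of difficult regions for each method amounts to interval subtraction. The sketch below is a hedged illustration using half-open `(start, end)` tuples on a single toy chromosome; the helper names `merge` and `subtract`, the coordinates, and the method labels are hypothetical, and real pipelines would typically operate on BED files rather than Python lists.

```python
def merge(intervals):
    """Merge overlapping or adjacent half-open [start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def subtract(regions, excluded):
    """Return the parts of `regions` that are not covered by `excluded`."""
    excluded = merge(excluded)
    result = []
    for start, end in merge(regions):
        cursor = start
        for ex_start, ex_end in excluded:
            if ex_end <= cursor:
                continue  # exclusion lies entirely before the cursor
            if ex_start >= end:
                break  # exclusion lies entirely after this region
            if ex_start > cursor:
                result.append((cursor, ex_start))
            cursor = max(cursor, ex_end)
        if cursor < end:
            result.append((cursor, end))
    return result


# Toy difficult-region tracks (hypothetical coordinates).
difficult = {
    "segdup": [(40, 60)],
    "tandem_repeat_gt_200bp": [(80, 90)],
}
# Which tracks are excluded for which method, following the table:
# segmental duplications are kept for PacBio CCS, long tandem repeats
# are excluded for all methods.
excluded_for = {
    "illumina_gatk": ["segdup", "tandem_repeat_gt_200bp"],
    "pacbio_ccs": ["tandem_repeat_gt_200bp"],
}
raw_callable = {"illumina_gatk": [(0, 100)], "pacbio_ccs": [(0, 100)]}

callable_regions = {}
for method, tracks in excluded_for.items():
    to_exclude = [iv for track in tracks for iv in difficult[track]]
    callable_regions[method] = subtract(raw_callable[method], to_exclude)

print(callable_regions)
```

Because the segmental duplication track is not subtracted for PacBio CCS, that method keeps the interval `(40, 60)` as callable, which is how long and linked reads can extend the benchmark into regions excluded for short reads.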
| Subset | v3.3.2 Recall | v4⍺ Recall | v3.3.2 Precision | v4⍺ Precision |
| --- | --- | --- | --- | --- |
| All SNPs | 0.9995 | 0.9947 | 0.9981 | 0.9933 |
| Lowmap 100 bp | 0.9799 | 0.8464 | 0.9623 | 0.8717 |
| Lowmap 250 bp no mismatch | 0.9474 | 0.5522 | 0.8911 | 0.7180 |
| Segdups | 0.9982 | 0.9321 | 0.9910 | 0.9085 |
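The statement that the SNP false-negative rate rises by about a factor of 10 follows from the recall values above, since the FN rate is 1 minus recall. A short sketch of that arithmetic follows; the recall values are copied from the table, while the variable names are only illustrative.

```python
# (v3.3.2 recall, v4 alpha recall) for SNPs, copied from the table above.
recall = {
    "All SNPs": (0.9995, 0.9947),
    "Lowmap 100 bp": (0.9799, 0.8464),
    "Lowmap 250 bp no mismatch": (0.9474, 0.5522),
    "Segdups": (0.9982, 0.9321),
}


def fn_rate(recall_value):
    """False-negative rate = FN / (TP + FN) = 1 - recall."""
    return 1.0 - recall_value


# Fold change in FN rate when moving from the v3.3.2 to the v4 alpha benchmark.
fold_change = {
    subset: fn_rate(v4) / fn_rate(v3) for subset, (v3, v4) in recall.items()
}

for subset, (v3, v4) in recall.items():
    print(
        f"{subset}: FN rate {fn_rate(v3):.4f} -> {fn_rate(v4):.4f} "
        f"({fold_change[subset]:.1f}x)"
    )
```

For all SNPs this gives 0.0005 versus 0.0053, a ratio of about 10.6.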
| Metric | v3.3.2 | v4⍺ | Increase in v4⍺ |
| --- | --- | --- | --- |
| Number of bases covered | 2,358,060,765 | 2,442,494,334 | 84,433,569 |
| Percent of GRCh37 covered | 87.84% | 90.98% | 3.14% |
| SNPs | 3,046,933 | 3,186,536 | 139,603 |
| Indels | 465,670 | 482,172 | 16,502 |
| Number of bases covered in Segmental Duplications | 269,887 | 269,589,673 | 269,319,786 |
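The "Increase in v4⍺" column is the difference between the two benchmark versions. As a quick check of that arithmetic, the sketch below recomputes the differences from the counts in the table; the dictionary layout and names are only illustrative.

```python
# (v3.3.2 value, v4 alpha value) copied from the table above.
counts = {
    "Number of bases covered": (2_358_060_765, 2_442_494_334),
    "SNPs": (3_046_933, 3_186_536),
    "Indels": (465_670, 482_172),
    "Bases covered in segmental duplications": (269_887, 269_589_673),
}

# Absolute increase from v3.3.2 to v4 alpha for each metric.
increase = {name: v4 - v3 for name, (v3, v4) in counts.items()}

# Relative increase, as a percentage of the v3.3.2 value.
relative_increase = {
    name: 100.0 * (v4 - v3) / v3 for name, (v3, v4) in counts.items()
}

for name in counts:
    print(f"{name}: +{increase[name]:,} ({relative_increase[name]:.2f}%)")
```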
• Machine learning
- Multi-view classification, outlier detection, active learning
• Refine use of genome stratifications
• Add variant calls from raw PacBio and Oxford Nanopore data
• Improve benchmark for larger indels, homopolymers, and tandem repeats
• Explore graph-based methods to characterize MHC region
• Improve normalization of complex variants
In addition to the Genome in a Bottle v3.3.2 input data, which consisted of Illumina, Complete Genomics, Ion Torrent, 10X, and SOLiD technologies, v4⍺ includes PacBio CCS and new 10X linked read data.
[Figure: two examples comparing the v4⍺ and v3.3.2 benchmarks alongside Illumina, PacBio CCS, 10X, and ONT data. Panel titles: "v3.3.2 Error Excluded in v4⍺" and "Variant Added in v4⍺".]
New members welcome! Sign up for newsletters at www.genomeinabottle.org
We are recruiting members to test the v4⍺ benchmark; please email justin.zook@nist.gov
