1. Genome in a Bottle Consortium
August 2014
NIST, Gaithersburg, MD
Reference Materials for Clinical Applications of Human Genome
Sequencing
Marc Salit, Ph.D. and Justin Zook, Ph.D.
National Institute of Standards and Technology
Advances in Biological/Medical Measurement Science
(ABMS @ Stanford)
2. Genome in a Bottle
Consortium Development
• NIST met with sequencing technology
developers to assess standards needs
– Stanford, June 2011
• Open, exploratory workshop
– ASHG, Montreal, Canada
– October 2011
• Small, invitational workshop at NIST
to develop consortium for human
genome reference materials
– FDA, NCBI, NHGRI, NCI, CDC, Wash U,
Broad, technology developers, clinical
labs, CAP, PGP, Partners, ABRF, others
– developed draft work plan
– April 2012
• Open, public meeting at NIST to
formally establish consortium,
present draft work plan
– formed working groups
– identified candidate genomes
– established principles of:
• reference material selection
• characterization
• informatics
• performance metrics
– August 2012
• Open, public workshop at XGen
Congress
– March 2013
• Biannual workshops
– August 2013 at NIST
– January 2014 at Stanford
– August 2014 at NIST
– January 29-30 2015 at Stanford
• Website
– www.genomeinabottle.org
3. Well-characterized, stable RMs
• Obtain metrics for validation,
QC, QA, PT
• Determine sources and types
of bias/error
• Learn to resolve difficult
structural variants
• Improve reference genome
assembly
• Optimization
– integration of data from
multiple platforms
– sequencing and analysis
• Enable regulated applications
[Figure: Comparison of SNP calls for NA12878 on 2 platforms, 3 analysis methods]
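A comparison of SNP calls across platforms and analysis methods, like the one in this slide's figure, can be approximated as a set-overlap calculation. The sketch below is illustrative only: the call-set labels and variants are made-up toy data, and matching variants by exact (chromosome, position, ref, alt) is a simplifying assumption, not the consortium's method.

```python
from itertools import combinations

# Assumption: a variant is identified by (chromosome, 1-based position, ref, alt).
Variant = tuple[str, int, str, str]


def pairwise_concordance(call_sets: dict[str, set[Variant]]) -> dict[tuple[str, str], float]:
    """Return the Jaccard overlap (shared / union) for every pair of call sets."""
    result = {}
    for a, b in combinations(sorted(call_sets), 2):
        union = call_sets[a] | call_sets[b]
        shared = call_sets[a] & call_sets[b]
        # Two empty call sets are treated as fully concordant.
        result[(a, b)] = len(shared) / len(union) if union else 1.0
    return result


def shared_by_all(call_sets: dict[str, set[Variant]]) -> set[Variant]:
    """Return the variants that every call set reports."""
    sets = list(call_sets.values())
    return set.intersection(*sets) if sets else set()


# Toy data with hypothetical labels; not real NA12878 calls.
v1, v2 = ("chr1", 100, "A", "G"), ("chr1", 200, "C", "T")
v3, v4 = ("chr2", 300, "G", "A"), ("chr2", 400, "T", "C")
calls = {
    "platform_A/caller_1": {v1, v2, v3},
    "platform_A/caller_2": {v1, v2, v4},
    "platform_B/caller_3": {v1, v3, v4},
}

concordance = pairwise_concordance(calls)
consensus = shared_by_all(calls)
```

Variants missing from `consensus` are the discordant calls that a well-characterized reference material would help adjudicate.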
4. Measurement Process
Sample
gDNA isolation
Library Prep
Sequencing
Alignment/Mapping
Variant Calling
Confidence Estimates
Downstream Analysis
• gDNA reference
materials will be
developed to
characterize
performance of a part
of process
– materials will be
certified for their
variants against a
reference sequence,
with confidence
estimates
[Figure: generic measurement process]
5. Putting “Genomes” in Bottles
• NIST working with GiaB
to select genomes
• Current plan
– NA12878 HapMap
sample as Pilot sample
• part of 17-member
pedigree
– trios from PGP as more
complete set
• 2 trios, focus on children
• varying biogeographic
ancestry
[Figure: CEPH Utah Pedigree 1463. Grandparents: 12889, 12890, 12891, 12892. Parents: 12877, 12878. 11 children (birth order redacted): 12879, 12880, 12881, 12882, 12883, 12884, 12885, 12887, 12886, 12888, 12893.]
6. Genome in a Bottle Working Groups
Reference Material
Selection
& Design
Andrew Grupe,
Celera
•Develop prioritized list
of whole human
genomes for Reference
Materials
•Identify candidate
approaches and
materials for artificial
RMs
•Develop prioritized
list
Measurements for
Reference Material
Characterization
Mike Eberle, Illumina
•Develop consensus
plan for experimental
characterization of
Reference Materials
Bioinformatics,
Data Integration,
and Data
Representation
Steve Sherry, NCBI
•Develop plan for
integrating
experimental data and
forming consensus
variant calls and
confidence estimates
•Develop consensus
plan for data
representation
Performance Metrics
& Figures of Merit
Deanna Church,
Personalis
•User interface to the
Genome-in-a-Bottle
Reference Material
•“Dashboard”
•what an end user will
see and report to
understand and
describe the
performance of their
experiment
•variant call accuracy
•process performance
measures to enable
optimization
7. Update
Zook et al., Nature Biotechnology, 2014.
• methods to develop
SNP/indel call set
described in manuscript
• broad and quick
adoption of call set for
benchmarking
– struck a nerve
• use scenarios a
highlight of this
workshop
8. Preliminary uses of high-confidence
NIST-GIAB genotypes for NA12878
• NIST has released
several versions of high-
confidence genotypes
for its pilot RM
• These data are
presently being used for
benchmarking
– prior to release of RMs
– SNPs & indels
• ~77% of the genome
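Benchmarking against a high-confidence call set usually means counting agreements and disagreements only inside the regions where the benchmark is considered reliable. The following is a minimal sketch under stated assumptions: `Region`, `in_regions`, and `benchmark` are hypothetical names, the data are toy values, and variants are matched by exact (chromosome, position, ref, alt) with genotypes ignored. Real comparisons generally need variant normalization, especially for indels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """High-confidence interval in BED convention (0-based start, exclusive end)."""
    chrom: str
    start: int
    end: int


def in_regions(variant, regions):
    """True if a variant with a 1-based position falls inside any region."""
    chrom, pos, _ref, _alt = variant
    return any(r.chrom == chrom and r.start < pos <= r.end for r in regions)


def benchmark(query, truth, regions):
    """Count TP/FP/FN within the regions and derive precision and recall."""
    q = {v for v in query if in_regions(v, regions)}
    t = {v for v in truth if in_regions(v, regions)}
    tp, fp, fn = len(q & t), len(q - t), len(t - q)
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return {"tp": tp, "fp": fp, "fn": fn, "precision": precision, "recall": recall}


# Toy data; not real NA12878 genotypes or regions.
regions = [Region("chr1", 100, 200)]
truth = {("chr1", 150, "A", "G"), ("chr1", 180, "C", "T"), ("chr1", 500, "G", "A")}
query = {("chr1", 150, "A", "G"), ("chr1", 160, "T", "C"), ("chr1", 900, "G", "C")}

metrics = benchmark(query, truth, regions)
```

Calls outside the high-confidence regions are excluded from both counts, so these metrics describe performance only on the covered portion of the genome.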
9. Highlights
This workshop
• release plans for pilot RM
• characterization plans for
HG-002 – HG-005
• prioritization of next
genomes
• collaborations
– ABRF
– Global Alliance
Upcoming
• materials and methods to
support somatic variant
calling
• integrating structural
variants into the GIAB call
sets
• crowdsourcing analysis
• data analysis/code
jamboree
10. Agenda
Thursday
• Welcome and Status Update
• Charge to Working Groups
• Break
• Working Group Breakout
Discussions
• Lunch (in NIST cafeteria)
• FDA Perspective on Future
Needs
• ABRF interlaboratory NGS
study
• Break
• Working Group Breakout
Discussions continued
Friday
• Working groups meet if
needed
• Working Group leaders
present plans and discussion
• Break
• Discussion of NIST Reference
Material Development plans
• Discussion of Steering
Committee agenda items
• Everyone adjourns except
the steering committee
• Steering committee meets over
lunch
11. Agenda
Monday
• Breakfast and registration
• Welcome and Context Setting
• NIST RM Update and Status Report
• Charge to Working Groups
• Coffee Break
• Working Group Breakout Discussions
• Lunch (provided)
• Informal Working Group Reports
• Coffee Break
• Breakout Topical Discussions
– Topic #1: Moving beyond the 'easy'
variants and regions of the genome
– Topic #2: Selecting future genomes for
Reference Materials
Tuesday
• Breakfast and registration
• Use cases: Experiences using the pilot
Reference Material
• Discussion of plans to release pilot
Reference Material
• Coffee Break
• Working Group Breakout discussions
• Lunch (provided)
• Working Group leaders present plans
and discussion
• Steering committee Overview
• First meeting of the Steering
Committee (others adjourn)
Please Note
Slides will be made available on SlideShare after
the workshop (see genomeinabottle.org).
Tweets are welcome unless the speaker requests
otherwise. Please use #giab as the hashtag.
Plenary sessions are being broadcast as a
webinar. Questions from webinar attendees can
be submitted via chat.
12. NIST Reference Materials
Pilot RM - NA12878
• 8300 10 µg vials of NA12878
gDNA @ NIST 4/2013
– Available for sequencing by
GIAB participants
– target for release as NIST RM
12/2014
• SNPs, small indels
• Sequenced at 6 labs
– 4 technologies, multiple modes
• Received “Human Subjects
Approval” for release of
NA12878 as NIST RM
Personal Genome Project
• Ashkenazim trio DNA
received Feb 2014
• Asian son DNA received Feb
2014
– Parents’ cell lines at Coriell
• “Human subjects review”
approval for release of PGP
genomes as NIST RMs
• What future RMs are
needed?
13. Consenting Genomes for use as
Reference Materials
• Risk of re-identification
– this is a real risk
– privacy
– implications for family members
• Meaning of possibility of withdrawal
• Commercial application
– indirect, research
– direct, derived products
• PGP currently state of the art
– broad and direct
– test to demonstrate understanding
• “Wild West”
• Coriell MTA for PGP genomes now
explicitly permits commercial
redistribution/modification/…