A unique targeted sequencing service providing meaningful
             results, not insurmountable data
               Dr. Mike Evans — Chief Executive
Outline of presentation


• Delivering a unique next generation sequencing service —
  Dr Mike Evans, CEO
• Optimised bait design for targeted sequencing — Dr Volker Brenner,
  Head of Computational Biology
• Adding value through analysis — Dr Volker Brenner, Head of
  Computational Biology
• Summary
• Q&A
OGT - provides advanced clinical genetics solutions
    - develops innovative molecular diagnostics

• Founded by Ed Southern in 1995
• 64 people




       OGT Begbroke: Corporate offices and high-throughput labs
       OGT Southern Centre: Biomarker discovery
OGT’s key businesses


IP Licensing
40 licence relationships

Diagnostic Biomarkers
Genomic- and protein-based diagnostics

Clinical and Genomic Solutions
Cytogenetics products and genomic services

Technologies For Molecular Medicine
Clinical and Genomic Solutions


Addressing the challenges of high-throughput, high-resolution
molecular technologies:

• High equipment and staff training costs
• Short equipment lifespan
• Complex study design and processes (e.g. platform evaluation &
  selection)
• Vast amounts of data
    • Extensive computing infrastructure
    • Data analysis expertise and resource


     The solution: Genefficiency Genomic Services
Genefficiency™ — World’s leading aCGH service


High-quality data & complete reassurance

  • Experimental and array design expertise
  • High-throughput processing (>2000 samples / week)
  • Applications: aCGH-CNV, methylation, miRNA, gene expression
    analysis
  • Comprehensive data analysis services
  • >40 QC checks on each sample to ensure high-quality data
Independent accreditations


              • First Agilent High-Throughput Microarray Certified
                Service Provider

              • ISO 9001:2008 — Quality management systems (FS 561156)

              • ISO 27001:2005 — Information security (IS 561157)

              • ISO 17025:2005 — aCGH Laboratory services (4593)
Customer satisfaction…




                 “In order to characterise genetic variants,
                 reproducible performance and reliable processing
                 of the high resolution microarrays is essential. We
                 were pleased with OGT’s responsive approach
                 and attention to producing high quality data to tight
                 deadlines”

                 Dr Matt Hurles, Wellcome Trust Sanger Institute




  20,000 samples. 1,000 samples / week
OGT collaborators and customers
A world-class team


Our expert team deliver:
• Excellent project management and customer service
   • >600 projects to date
   • >50,000 samples
• Unparalleled expertise in study and probe design
• Advanced data analysis through a dedicated team of
  bioinformaticians
• Rapid turnaround times
• A wealth of experience of clinical and translational
  research projects
New Genefficiency Targeted Sequencing Services
Delivering discovery


Genefficiency Targeted Sequencing Services — designed to be different:

• Comprehensive — taking you from genomic DNA to filtered, qualified results
• Rigorously designed — project and probe design expertise maximises your
  likelihood of discovery
• Expert support — experienced team of biologists and bioinformaticians
• Dedication to quality — from sample to result, delivering reliable results
  every time
Delivering an integrated, comprehensive service




    1. Selection of most appropriate genomic regions for enrichment
    2. Capture, sample multiplexing and sequencing
    3. Data analysis and advanced filtering of variants

27/10/2011
Delivering expert project design


Step 1: Selection of most appropriate genomic regions for your project
        and budget



Whole exome
Pre-designed, validated whole exome capture probes
  Coding regions are “most likely” candidates for many disorders

Custom genomic regions
Expert custom design of capture probes for your regions of interest
  Flexibility to focus on regions of clinical significance or GWAS regions
Delivering class-leading technology

We have fully optimised the DNA capture and sequencing
methodologies, so you don’t have to!

Step 2: Performing the capture, sample multiplexing, library
   preparation and sequencing

• Options for sample indexing and multiplexing to minimise
  sequencing cost

• Depth of sequencing coverage to suit your samples and project

• Paired-end sequencing on the industry-leading Illumina HiSeq 2000
OGT delivers discovery, not just data

Step 3: Data analysis and advanced filtering of variants

• OGT’s dedicated analysis pipeline brings you beyond data, to a
  filtered list of variants relevant to your study




       SEQUENCE                FILTER              DISCOVER
Genefficiency Targeted Sequencing Services



The PLATFORM
   • Core sequencing platform: Illumina HiSeq 2000
   • Core sequence capture technology: Agilent SureSelect


The PEOPLE
   • Team of highly skilled molecular biologists and bioinformaticians
   • Core expertise in probe design
   • Successful development of advanced analysis solutions
Agenda


• Important Definitions and Terminologies

• Introduction to Targeted Enrichment

• Custom Bait Design
Definitions and terminologies

• Read length — The number of bases sequenced in a fragment


• Capture efficiency
  [Figure: reads relative to the region of interest, labelled off target, on target, off target]

• Paired end sequencing
  [Figure: region of interest with Fragment 1 and Fragment 2]

• Read depth — How many times has a base been sequenced?
Read depth required for mutation detection
Assuming no allelic bias, the theoretical read depth required to detect
heterozygous variation with a given accuracy can be calculated using a
binomial distribution.
    Calculations based on variation being seen in at least 2 reads
    • Should not be just one read, as this could be ‘noise’
    • Required observations could be a percentage of reads


    Depth Required    Het. Call Accuracy    Probability of Error    Quality
          11                99%                  1:100               Q20
          14                99.9%                1:1000              Q30
          18                99.99%               1:10000             Q40
          25                99.999%              1:100000            Q50

    •   Minimum capacity required = Region of interest (ROI) x required depth
    •   Q30 variant detection for a 15 kb ROI requires 210 kb of sequencing capacity
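
The depth table and the capacity rule above can be sketched in a few lines of Python. This is a minimal sketch, assuming the simplest reading of the slide: each read carries the variant allele with probability 0.5, and a call needs the variant in at least 2 reads. The function names are illustrative and not part of OGT's pipeline. Under these assumptions the sketch reproduces the 99%, 99.9% and 99.99% rows; for 99.999% it returns a smaller depth than the 25 listed, so the slide may apply an extra margin or a slightly different criterion.

```python
from math import comb


def detection_probability(depth: int, min_reads: int = 2,
                          allele_fraction: float = 0.5) -> float:
    """Probability that the variant allele is seen in at least `min_reads`
    of `depth` reads, under a binomial model (assumed, see lead-in)."""
    p_too_few = sum(
        comb(depth, k) * allele_fraction ** k * (1 - allele_fraction) ** (depth - k)
        for k in range(min_reads)
    )
    return 1.0 - p_too_few


def minimum_depth(accuracy: float, min_reads: int = 2) -> int:
    """Smallest read depth whose detection probability reaches `accuracy`."""
    depth = min_reads
    while detection_probability(depth, min_reads) < accuracy:
        depth += 1
    return depth


def required_capacity(roi_bases: int, depth: int) -> int:
    """Minimum sequencing capacity in bases: ROI size times required depth."""
    return roi_bases * depth


if __name__ == "__main__":
    for accuracy in (0.99, 0.999, 0.9999):
        print(f"{accuracy}: depth {minimum_depth(accuracy)}")
    print("15 kb ROI at depth 14:", required_capacity(15_000, 14), "bases")
```

Note that this capacity figure is a lower bound: it assumes every sequenced base lands on target and coverage is perfectly even.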
Why use targeted enrichment?

 Flexibility in choice of genomic loci
     • Allows capture of specific regions of interest for SNP and Indel detection


 Cost Effectiveness
     • Ideal for clinical applications
         • Specific candidate genes are targeted
         • Fine mapping post-GWAS
     • Cost Benefits
         • Enables multiplexing to fill capacity


 Streamlined Data Analysis
     • Reduced noise due to targeted specificity
Example of design bias — Insufficient coverage

Targeted gene sequencing can lead to some targets without the
required depth of coverage


[Figure: depth of coverage per target against the 14x (Q30) threshold, with inadequate coverage marked]

   *data kindly provided by C. Mattocks, National Genetics Reference Lab, Salisbury, UK
Solution: Intelligent design to improve coverage:

 Option 1:
 • Increase coverage by increasing depth of sequencing
 • Coverage of all targets proportionally increased
 • Increased cost of sequencing
 • Some bases still missed

 Option 2:
 • Intelligent design of capture probes increases coverage of under-represented loci
 • More even coverage of entire region, no loci missed (more likely to find mutations present)
 • No need to increase sequence depth overall (more cost effective)

 [Figure: coverage per target relative to the Q30 threshold]
Problems facing users


• Design tools not user friendly
• Design tools only good for draft design
• Potential sources of bias
    • Regions of interest too short
    • Bait thermodynamic behaviour
       • GC content
        • Melting Temperature
• Risk of Design Errors

• OGT’s extensive experience in designing probes for microarrays
  allows us to minimise bias and ensure evenness of coverage giving
  the best chance to identify mutations
OGT’s design pipeline — what we need from you


 •   Regions of Interest
     • Gene lists
     • Chromosomal locations

 •   Genome build version

 •   Data file format
     • Text, Excel, etc.
     • Consistent coordinate notation, e.g. chr1: 2247628-2248537
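
Consistent coordinate notation matters because region lists are parsed by software. The following is a minimal sketch of such a parser for the `chr1: 2247628-2248537` style shown above. The `Region` type, the regular expression and the tolerance for spaces and thousands separators are all assumptions made for illustration; they do not describe OGT's actual intake format.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int
    end: int


# Accepts "chr1:100-200" and "chr1: 100-200"; commas in numbers are tolerated.
_REGION_PATTERN = re.compile(
    r"^\s*(chr[0-9A-Za-z_]+)\s*:\s*([\d,]+)\s*-\s*([\d,]+)\s*$"
)


def parse_region(text: str) -> Region:
    """Parse one region string, raising ValueError on anything inconsistent."""
    match = _REGION_PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognised region: {text!r}")
    chrom, start_text, end_text = match.groups()
    start = int(start_text.replace(",", ""))
    end = int(end_text.replace(",", ""))
    if start > end:
        raise ValueError(f"start is after end in region: {text!r}")
    return Region(chrom, start, end)


if __name__ == "__main__":
    print(parse_region("chr1: 2247628-2248537"))
```

Rejecting malformed lines early, rather than guessing, is one way to reduce the design errors listed earlier in the deck.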




       1. Data   2. Draft Design   3. Singletons   4. Thermodynamics   5. Report
Run draft design

• Assess the output:
   • Coverage
   • Bait distribution
   • Repeat masking
   [Figure: draft bait layout over a region of interest, with repeat masking marked]
Custom baits improve coverage at region boundaries


 [Figure: read depth across a target region, OGT versus 1KG]

 OGT custom bait design gives increased read depth around edges of target regions.
Correction for singleton baits

• Review the draft design and identify any regions covered by a
  single bait
    • These regions span less than 120 bases
• Add additional singleton baits to the design
  [Figure: bait coverage before and after adding singleton baits]

• This ensures that small regions are captured as well as large
  regions
• Advantage — Improves evenness of capture across the design
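
The singleton check above can be sketched as a simple length test. This sketch assumes that regions are `(chrom, start, end)` tuples with 1-based inclusive coordinates and takes the 120-base threshold from the slide. The number of extra copies is a hypothetical parameter, since the slides do not state how many baits are added.

```python
from typing import Dict, List, Tuple

Region = Tuple[str, int, int]  # (chrom, start, end), 1-based inclusive (assumed)

BAIT_LENGTH = 120  # from the slide: singleton regions span fewer than 120 bases


def region_length(region: Region) -> int:
    _, start, end = region
    return end - start + 1


def singleton_regions(regions: List[Region]) -> List[Region]:
    """Regions short enough to be covered by a single bait."""
    return [r for r in regions if region_length(r) < BAIT_LENGTH]


def bait_copies(regions: List[Region], extra_copies: int = 1) -> Dict[Region, int]:
    """Copies of the bait for each singleton region: the original bait plus
    `extra_copies` (a hypothetical value chosen for illustration)."""
    return {r: 1 + extra_copies for r in singleton_regions(regions)}


if __name__ == "__main__":
    design = [("chr1", 2247628, 2248537), ("chr2", 100, 180)]
    print(singleton_regions(design))
    print(bait_copies(design))
```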
Custom approach ensures variant detection




 [Figure: SNP detection in OGT custom bait data versus 1KG whole exome data]

 Even at more than 50x coverage, whole exome sequencing does not accurately
 identify all SNPs.
 OGT custom bait design compared with 1000 Genomes (1KG) whole exome capture data.
Correction for bait thermodynamics
GC content
• Calculate GC content for all baits
• Identify those baits where GC content is extreme (for instance >65% or <40%)
• Add additional copies of these baits

Melting temperature (Tm)
• Calculate the Tm for all baits
• Identify those baits where Tm is extreme (e.g. > 75 °C)
• Add additional copies of these baits
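
The GC correction can be sketched as follows. This is a minimal illustration using the example thresholds from the slide (below 40% or above 65% counts as extreme). The number of copies for an extreme bait is a hypothetical choice, and the Tm correction is left out because the slides do not say which Tm model is used; in principle it would follow the same pattern with a Tm function in place of `gc_percent`.

```python
def gc_percent(sequence: str) -> float:
    """Percentage of G and C bases in a bait sequence."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty bait sequence")
    gc_count = sequence.count("G") + sequence.count("C")
    return 100.0 * gc_count / len(sequence)


def is_gc_extreme(sequence: str, low: float = 40.0, high: float = 65.0) -> bool:
    """True when GC content falls outside the [low, high] window."""
    gc = gc_percent(sequence)
    return gc < low or gc > high


def copies_for_bait(sequence: str, boosted_copies: int = 2) -> int:
    """One copy for ordinary baits, `boosted_copies` (assumed value) for
    baits with extreme GC content."""
    return boosted_copies if is_gc_extreme(sequence) else 1


if __name__ == "__main__":
    for bait in ("ATGCATGCAT", "GGGCGCGGCC", "ATATATTTAA"):
        print(bait, gc_percent(bait), copies_for_bait(bait))
```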

[Figure: region of interest with GC-extreme and Tm-extreme baits marked]
OGT custom bait designs help overcome GC issues




 [Figure: read depth across the region, OGT versus SureSelect]




 In a region with 70% GC content OGT custom bait design achieved a maximum read
 depth of 50x.
 The Agilent SureSelect 50Mb capture kit does not capture any reads in this region.
OGT custom bait designs help overcome GC issues




[Figure: relative capture of targets within a single gene, OGT versus SureSelect]




Relative capture of targets within a single gene. Agilent coverage is 20x for the target with no GC
content bias, and minimal for targets with a GC content of 65%.
In contrast OGT custom baits perform excellently in this region.
Customer report

  • Design Parameters

  • Depth of Coverage
     • On target / Off target
     • Regions not covered – and why not

  • Bait Details
     • Singletons
     • GC distribution
     • Tm distribution

  • Library Design
     • Baits generated

Summary

  • Custom design of regions for targeted sequencing offers
    significant flexibility for many applications

  • Expert probe design will ensure:

     • Better ‘evenness’ of coverage, which helps ensure no regions are
       missed and maximises the likelihood of variant detection

     • Improved overall capture efficiency and on-target performance,
       making downstream sequencing more cost effective

     • Increased capture efficiency for SNPs and Indels, which increases
       the likelihood of detection

     • Reduction of risk and better performance
Adding value through analysis


    • Introduction
    • NGS data analysis
       • Primary analysis
           •   Mapping and assembly
           •   Q score re-calibration
           •   NGS sequencing QC
           •   NGS alignment QC
       • Secondary analysis
           • SNP and Indel calling
           • Annotation and evaluation pipeline
           • SIFT and PolyPhen
    • Deliverables
    • Case study
    • Summary
The analysis challenge




[Figure: sequencer, then hard drive with ~4Gb per exome, then publication]

NGS -> Raw data -> Mapping -> Annotation -> Filtering -> Reporting
Raw data: FASTQ
(standard text representation of short reads)


FASTQ uses four lines per sequence.

    •   Line 1: '@' followed by a sequence identifier

    •   Line 2: raw sequence letters

    •   Line 3: '+' (and optional sequence identifier)

    •   Line 4: quality values for the sequence in Line 2. Must contain the same number of
        symbols as letters in the sequence.
        (The letters encode Phred Quality Scores from 0 to 93 using ASCII 33 to 126)



   Example
          @SEQ_ID
          GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
          +
          !''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65
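
The four-line layout described above can be read with a short parser. This is a minimal sketch that assumes each record occupies exactly four lines (no wrapped sequence lines) and that qualities use the ASCII-33 offset stated on the slide. The names `read_fastq` and `phred_scores` are illustrative and not taken from any library.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    identifier: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream that uses four lines per record."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\r\n")
        if not header:
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline().rstrip("\r\n")
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode a quality string into integer Phred scores."""
    return [ord(symbol) - offset for symbol in quality]


EXAMPLE = """@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65
"""

if __name__ == "__main__":
    for record in read_fastq(io.StringIO(EXAMPLE)):
        scores = phred_scores(record.quality)
        print(record.identifier, len(record.sequence), min(scores), max(scores))
```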
Phred quality scores


•   Phred is an accurate base-caller used for capillary traces (Ewing et al
    Genome Research 1998)
•   Each called base is given a quality score Q
•   Quality based on simple metrics (such as peak spacing) calibrated against a
    database of hand-edited data
•   QPhred = -10 * log10(estimated probability call is wrong)


    Phred Quality Score    Probability of incorrect base call    Base call accuracy
    10                     1 in 10                               90 %
    20                     1 in 100                              99 %
    30                     1 in 1000                             99.9 %
    40                     1 in 10000                            99.99 %

     Q30 often used as a threshold for useful sequence data
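
The formula and table above can be checked with a small conversion sketch. The function names are illustrative; the arithmetic is the definition Q = -10 * log10(p) and its inverse.

```python
from math import log10


def error_probability(quality: float) -> float:
    """Estimated probability that a base call with Phred score Q is wrong."""
    return 10 ** (-quality / 10)


def phred_quality(p_error: float) -> float:
    """Phred score for an estimated error probability (0 < p_error <= 1)."""
    if not 0 < p_error <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * log10(p_error)


if __name__ == "__main__":
    for q in (10, 20, 30, 40):
        p = error_probability(q)
        print(f"Q{q}: 1 in {round(1 / p)}, accuracy {100 * (1 - p):.2f} %")
```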
Primary analysis — Mapping and alignment




  Raw sequence files (FASTQ format)
    1. Mapping (BWA/Bowtie)
    2. Raw alignment files (SAM/BAM format)
    3. Local realignment around InDels (GATK)
    4. Duplicate marking (Picard)
    5. Quality score re-calibration (Picard)
    6. Analysis-ready alignment (SAM/BAM format)
Why mark duplicates and realign around indels?

[Figure: alignment before correction, showing 3 incorrect calls within 40bp]
NGS variant calling methods


  Option 1 - Hard filtering
  Example: SNP can only be called if
      • read depth >10
      • >35% of reads carry SNP

  +   Effective filtering
  +   Transparent to user
  –   Simplistic approach
  –   Will miss high quality calls that don’t pass threshold

  Option 2 - Statistical analysis
  Based on the quality scores of individual bases, the alignment and statistical probability models

  +   Robust
  +   Optimum balance of sensitivity and specificity due to the use of statistical models
  +   Fewer false positive and false negative SNP calls
  –   Requires correctly pre-processed data with reliable quality scores
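
Option 1 can be sketched directly from the example thresholds on the slide. This sketch assumes per-site counts of total reads and reads carrying the SNP are already available. The function name and parameters are illustrative and do not describe a specific caller.

```python
def passes_hard_filter(depth: int, alt_reads: int,
                       min_depth: int = 10,
                       min_alt_fraction: float = 0.35) -> bool:
    """Apply the slide's example rule: read depth > 10 and > 35% of reads
    carrying the SNP. Both comparisons are strict, as written on the slide."""
    if depth <= min_depth:
        return False
    return alt_reads / depth > min_alt_fraction


if __name__ == "__main__":
    # (depth, reads carrying the SNP)
    for depth, alt_reads in [(20, 8), (10, 5), (40, 10)]:
        print(depth, alt_reads, passes_hard_filter(depth, alt_reads))
```

The second case (depth 10 with half the reads carrying the SNP) is rejected purely on depth, which illustrates the listed drawback that fixed thresholds can discard plausible calls.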
Base quality score re-calibration


[Figure: base quality scores before recalibration and after recalibration]

Source: The Broad Institute
http://www.broadinstitute.org/files/shared/mpg/nextgen2010/nextgen_poplin.pdf
Primary analysis — Raw data and assembly QC




  Raw sequence files (FASTQ format)
    • Sequence QC check (FastQC), giving the raw data QC report
    1. Mapping (BWA/Bowtie)
    2. Raw alignment files (SAM/BAM format)
    3. Local realignment around InDels (GATK)
    4. Duplicate marking (Picard)
    5. Quality score re-calibration (Picard)
    6. Analysis-ready alignment (SAM/BAM format)
    • Alignment QC check (Picard), giving the alignment QC report
Secondary analysis
SNP and Indel calling, annotation and filtering

   1. Analysis-ready alignment (SAM/BAM format)
   2. Unified Genotyper (GATK): SNPs and InDels (VCF format)
   3. Variant Evaluation (OGT)
   4. Comprehensive interactive OGT report, alongside the sequence QC report
      and alignment QC report

   Variant evaluation asks:
   • Known variant?
   • Impact on gene expression?
   • Splicing affected?
   • Non-synonymous or frameshift mutation?
   • Impact on protein function?
   • How confident are we in the call?
   • Zygosity?
SNP/Indel classification
(standard analysis)
We check and annotate every single detected SNP and Indel against all human
Ensembl genes and transcripts and dbSNP

dbSNP annotation:
• Is the variant known?
• Obtain allele frequency

Does it affect any of the following?
• Promoter region
• UTR
• Splice sites or intronic region
• CDS
    •   Synonymous mutation
    •   Non-synonymous mutation
    •   Frameshift mutation
    •   Stop codon (truncated/elongated protein sequence)
    •   Overlap with protein domain
    •   Consequence on protein function predicted (SIFT & PolyPhen)
OGT Processing Overview
[Flowchart, reconstructed from the slide text]

Steps shown:
  • Gather all detected SNP/Indels
  • Split into variants not described in dbSNP and variants described in dbSNP
    (rare RS ID variations)
  • Mapped to promoter regions
  • Mapped to exons, splice sites or UTRs and protein domains
  • Non-synonymous coding variations
  • Variations with serious consequences to the protein sequence (SIFT)
  • Perform pairwise genome analysis
  • Filter out variants present in “baseline” genome (e.g. somatic tissue, healthy sibling)
  • Filter out variants present in any “baseline” exome (e.g. somatic tissue,
    healthy sibling) AND not all “case” exomes
  • Additional filtering and analysis
  • Study specific and additional in-depth filtering and analysis

Analysis levels:
  • Individual Genome Analysis (Standard Level)
  • Multi Genome Analysis, Data Gathering and Comparison (Advanced Level)
  • Tailored analysis based on client’s individual requirements (Expert Level)

Data -> Information
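
The baseline filtering step can be sketched with set operations. This is one possible reading of the slide, not OGT's implementation: keep variants shared by every "case" sample and drop anything seen in any "baseline" sample (for example a healthy sibling). The `Variant` tuple layout and the function name are assumptions made for illustration.

```python
from typing import Iterable, Set, Tuple

# (chromosome, position, reference allele, alternate allele); layout assumed
Variant = Tuple[str, int, str, str]


def candidate_variants(cases: Iterable[Set[Variant]],
                       baselines: Iterable[Set[Variant]]) -> Set[Variant]:
    """Variants present in all case samples and absent from all baselines."""
    case_sets = [set(sample) for sample in cases]
    if not case_sets:
        return set()
    shared = case_sets[0].intersection(*case_sets[1:])
    seen_in_baseline: Set[Variant] = set()
    for sample in baselines:
        seen_in_baseline |= set(sample)
    return shared - seen_in_baseline


if __name__ == "__main__":
    patient_1 = {("chr6", 100, "C", "T"), ("chr1", 5, "A", "G"), ("chr2", 9, "G", "A")}
    patient_2 = {("chr6", 100, "C", "T"), ("chr1", 5, "A", "G")}
    healthy_sibling = {("chr1", 5, "A", "G")}
    print(candidate_variants([patient_1, patient_2], [healthy_sibling]))
```

The coordinates in the usage example are made up. A real comparison would also need to account for zygosity and for sites with too little coverage in the baseline sample.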
NGS data delivery


[Figure: ship data on hard drive (or FTP); double click; file location & share results;
comprehensive HTML analysis report]
Analysis report: Summary section
Analysis report: QC section — Read QC
Analysis report: QC section — Alignment QC
Analysis section — Overview
The Variant Table View




  • Data display
  • Data export
The Variant Table View — External links
The Detailed Variant View
Predicted consequences on protein function
Alignment View of selected variant in IGV
OGT data processing ensures detection of insertions




 Detection of a 31bp insertion
OGT data processing ensures detection of deletions: Example 1




  Detection of an 84bp deletion
Detection of homozygous and heterozygous deletions




                                        Homozygous deletion




                                        Heterozygous deletion




                                        No deletion (reference sequence)
Interactive data filtering
Customer data: Analysis of consanguineous samples

  [Figure: pedigree with generation I (individuals 1 and 2) and generation II (individuals 1 and 2)]

  HACE1, Exon 11, c.994C>T, R332X (CGA -> TGA)
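
The variant notation above can be checked with a little arithmetic: coding position 994 is the first base of codon 332, and changing the first base of CGA (arginine) to T gives TGA, a stop codon. The helpers below are a minimal sketch written for this check, not part of any annotation tool.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_number(cds_position: int) -> int:
    """1-based codon index for a 1-based coding sequence position."""
    return (cds_position - 1) // 3 + 1


def offset_in_codon(cds_position: int) -> int:
    """0-based offset of the position within its codon."""
    return (cds_position - 1) % 3


def substitute(codon: str, offset: int, new_base: str) -> str:
    """Return the codon with the base at `offset` replaced."""
    return codon[:offset] + new_base + codon[offset + 1:]


if __name__ == "__main__":
    position = 994
    mutant = substitute("CGA", offset_in_codon(position), "T")
    print(codon_number(position), mutant, mutant in STOP_CODONS)
```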




  Data courtesy of Dr. Bernd Wollnik, Institute of Human Genetics, University Hospital of Cologne
Confirmation by Sanger sequencing
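  [Figure: Sanger sequencing traces for Control, Mother, Father, Patient 1 and Patient 2
  over residues H V F R I G P, with the R332X position marked X]
  [Figure: protein domain diagram with ANK1 (69-161), ANK1 (168-258) and HECT (602-909),
  with R332X marked]
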




  Data courtesy of Dr. Bernd Wollnik, Institute of Human Genetics, University Hospital of Cologne
Customer feedback...




           Analysis of Consanguineous Samples


        “Just wanted to let you know that we have probably identified the
              causative gene and mutation in the patient sample.

        The mutation is located in the middle of an 18 Mb homozygous
               stretch and is a homozygous nonsense mutation!!!

                 Wow, its going so nicely with your data!!!”


                                     Dr. Bernd Wollnik, Institute of Human Genetics,
                                             University Hospital of Cologne
Summary
OGT offers fast, accurate & powerful NGS analysis

Standard Analysis
• Robust statistical data analysis
• Comprehensive variant annotation
• Interactive filtering and prioritisation of data based on
    • chromosomal region
    • allele frequency / novelty
    • zygosity
    • confidence score and read depth
    • severity of mutation


Advanced Analysis
• Multi-genome comparison

Bespoke analysis
• Tailored to your specific requirements
Speak to one of our team or visit booth 713 to:

• Book a demonstration of our interactive analysis
  report. Hurry, limited availability

• Discuss your specific project requirements

• Take part in our short survey and have your
  chance to win an Amazon Kindle
Thank you
www.ogt.co.uk

ASHG sequencing workshop

  • 1. A unique targeted sequencing service providing meaningful results, not insurmountable data Dr. Mike Evans — Chief Executive
  • 2. Outline of presentation • Delivering a unique next generation sequencing service — Dr Mike Evans, CEO • Optimised bait design for targeted sequencing — Dr Volker Brenner, Head of Computational Biology • Adding value through analysis — Dr Volker Brenner, Head of Computational Biology • Summary • Q&A
  • 3. OGT - provides advanced clinical genetics solutions - develops innovative molecular diagnostics • Founded by Ed Southern in 1995 • 64 people • OGT Begbroke: Corporate offices and high-throughput labs • OGT Southern Centre: Biomarker discovery
  • 4. OGT’s key businesses: Technologies For Molecular Medicine • IP Licensing: 40 licence relationships • Diagnostic Biomarkers: Genomic- and protein-based diagnostics • Clinical and Genomic Solutions: Cytogenetics products and genomic services
  • 5. Clinical and Genomic Solutions Addressing the challenges of high-throughput, high-resolution molecular technologies: • High equipment and staff training costs • Short equipment lifespan • Complex study design and processes (e.g. platform evaluation & selection) • Vast amounts of data • Extensive computing infrastructure • Data analysis expertise and resource The solution: Genefficiency Genomic Services
  • 6. Genefficiency™ — World’s leading aCGH service High-quality data & complete reassurance • Experimental and array design expertise • High-throughput processing (>2000 samples / week) • Applications: aCGH-CNV, methylation, miRNA, gene expression analysis • Comprehensive data analysis services • >40 QC checks on each sample to ensure high-quality data
  • 7. Independent accreditations • First Agilent High-Throughput Microarray Certified Service Provider • ISO 9001:2008 — Quality management systems FS 561156 • ISO 27001:2005 — Information security IS 561157 • ISO 17025:2005 — aCGH Laboratory services 4593
  • 8. Customer satisfaction… “In order to characterise genetic variants, reproducible performance and reliable processing of the high resolution microarrays is essential. We were pleased with OGT’s responsive approach and attention to producing high quality data to tight deadlines.” Dr Matt Hurles, Wellcome Trust Sanger Institute. 20,000 samples. 1,000 samples / week
  • 10. A world-class team Our expert team deliver: • Excellent project management and customer service • >600 projects to date • >50,000 samples • Unparalleled expertise in study and probe design • Advanced data analysis through a dedicated team of bioinformaticians • Rapid turnaround times • A wealth of experience of clinical and translational research projects
  • 11. New Genefficiency Targeted Sequencing Services
  • 12. Delivering discovery Genefficiency Targeted Sequencing Services — designed to be different: • Comprehensive — taking you from genomic DNA to filtered, qualified results • Rigorously designed — project and probe design expertise maximises your likelihood of discovery • Expert support — experienced team of biologists and bioinformaticians • Dedication to quality — from sample to result, delivering reliable results every time
  • 13. Delivering an integrated, comprehensive service 1. Selection of most appropriate genomic regions for enrichment 2. Capture, sample multiplexing and sequencing 3. Data analysis and advanced filtering of variants (27/10/2011)
  • 14. Delivering expert project design Step 1: Selection of most appropriate genomic regions for your project and budget • Whole exome: Pre-designed, validated whole exome capture probes. Coding regions are “most likely” candidates for many disorders • Custom genomic regions: Expert custom design of capture probes for your regions of interest. Flexibility to focus on regions of clinical significance or GWAS regions
  • 15. Delivering class-leading technology We have fully optimised the DNA capture and sequencing methodologies, so you don’t have to! Step 2: Performing the capture, sample multiplexing, library preparation and sequencing • Options for sample indexing and multiplexing to minimise sequencing cost • Depth of sequencing coverage to suit your samples and project • Paired-end sequencing on the industry-leading Illumina HiSeq 2000
  • 16. OGT delivers discovery, not just data Step 3: Data analysis and advanced filtering of variants • OGT’s dedicated analysis pipeline brings you beyond data, to a filtered list of variants relevant to your study SEQUENCE FILTER DISCOVER
  • 17. Genefficiency Targeted Sequencing Services The PLATFORM • Core sequencing platform: Illumina HiSeq 2000 • Core sequence capture technology: Agilent SureSelect The PEOPLE • Team of highly skilled molecular biologists and bioinformaticians • Core expertise in probe design • Successful development of advanced analysis solutions
  • 18. Outline of presentation • Delivering a unique next generation sequencing service — Dr Mike Evans, CEO • Optimised bait design for targeted sequencing — Dr Volker Brenner, Head of Computational Biology • Adding value through analysis — Dr Volker Brenner, Head of Computational Biology • Summary • Q&A
  • 19. Agenda • Important Definitions and Terminologies • Introduction to Targeted Enrichment • Custom Bait Design
  • 20. Definitions and terminologies • Read length: the number of bases sequenced in a fragment • Capture efficiency (figure: reads on target and off target relative to the region of interest) • Paired end sequencing (figure: Fragment 1, Fragment 2) • Read depth: how many times has a base been sequenced?
  • 21. Read depth required for mutation detection Assuming no allelic bias, the theoretical read depth required to detect heterozygous variation with a given accuracy can be calculated using a binomial distribution. Calculations are based on the variation being seen in at least 2 reads • Should not be just one read as this could be ‘noise’ • Required observations could be a percentage of reads
      Depth required   Het. call accuracy   Probability of error   Quality
      11               99%                  1:100                  Q20
      14               99.9%                1:1000                 Q30
      18               99.99%               1:10000                Q40
      25               99.999%              1:100000               Q50
    • Minimum capacity required = Region of interest (ROI) x required depth • Q30 variant detection for 15Kb ROI requires 210Kb sequencing capacity
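
The read depth calculation on this slide can be sketched as follows, assuming each read carries the variant allele with probability 0.5 (no allelic bias) and that a call needs the variant in at least 2 reads. Function names are illustrative. Under these assumptions the sketch gives 11, 14 and 18 for the first three rows of the table; for the 99.999% row it gives 22 rather than the 25 shown, so that row may rest on a more conservative criterion that the slide does not state.

```python
from math import comb


def miss_probability(depth: int, min_reads: int = 2, allele_fraction: float = 0.5) -> float:
    """Probability that fewer than `min_reads` of `depth` reads carry the variant."""
    return sum(
        comb(depth, k) * allele_fraction**k * (1 - allele_fraction) ** (depth - k)
        for k in range(min_reads)
    )


def min_depth(max_error: float, min_reads: int = 2, allele_fraction: float = 0.5) -> int:
    """Smallest depth whose miss probability does not exceed `max_error`."""
    depth = min_reads
    while miss_probability(depth, min_reads, allele_fraction) > max_error:
        depth += 1
    return depth


def required_capacity(roi_bases: int, depth: int) -> int:
    """Minimum sequencing capacity in bases: region of interest x required depth."""
    return roi_bases * depth


if __name__ == "__main__":
    for max_error in (1e-2, 1e-3, 1e-4):
        print(max_error, min_depth(max_error))
    print(required_capacity(15_000, 14))  # the 15Kb ROI example at depth 14
```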
  • 22. Agenda • Important Definitions and Terminologies • Introduction to Targeted Enrichment • Custom Bait Design
  • 23. Why use targeted enrichment? Flexibility in choice of genomic loci • Allows capture of specific regions of interest for SNP and Indel detection Cost Effectiveness • Ideal for clinical applications • Specific candidate genes are targeted • Fine mapping post-GWAS • Cost Benefits • Enables multiplexing to fill capacity Streamlined Data Analysis • Reduced noise due to targeted specificity
  • 24. Example of design bias — Insufficient coverage Targeted gene sequencing can lead to some targets without the required depth of coverage Inadequate Coverage 14x (Q30) *data kindly provided by C. Mattocks National Genetics Reference Lab, Salisbury, UK
  • 25. Solution: Intelligent design to improve coverage: Option 1: • Increase coverage by increasing depth of sequencing • Coverage of all targets proportionally increased • Increased cost of sequencing overall • Some bases still missed (Q30) Option 2: • Intelligent design of capture probes increases under-represented loci • More even coverage of entire region, no loci missed (more likely to find mutations present) • No need to increase sequence depth (more cost effective)
  • 26. Agenda • Important Definitions and Terminologies • Introduction to Targeted Enrichment • Custom Bait Design
  • 27. Problems facing users • Design tools not user friendly • Design tools only good for draft design • Potential sources of bias • Regions of interest too short • Bait thermodynamic behaviour • GC content • Melting Temperature • Risk of Design Errors • OGT’s extensive experience in designing probes for microarrays allows us to minimise bias and ensure evenness of coverage giving the best chance to identify mutations
  • 28. OGT’s design pipeline: what we need from you • Regions of Interest • Gene lists • Chromosomal locations • Genome build version • Data file format • Text, Excel, etc. • Consistent, e.g. chr1: 2247628-2248537 • Pipeline steps: 1. Data 2. Draft Design 3. Singletons 4. Thermodynamics 5. Report
  • 29. Run draft design • Assess the output: • Coverage • Bait distribution • Repeat masking (figure labels: Region of Interest, Repeat masking) • Pipeline steps: 1. Data 2. Draft Design 3. Singleton Baits 4. Bait Thermodynamics 5. Report
  • 30. Custom baits improve coverage at region boundaries (figure panels: OGT, 1KG) OGT custom bait design gives increased read depth around edges of target regions.
  • 31. Correction for singleton baits • Review the draft design and identify any regions covered by a single bait • These regions span less than 120 bases • Add additional singleton baits to the design (figure panels: Before, After) • This ensures that small regions are captured as well as large regions • Advantage: improves evenness of capture across the design • Pipeline steps: 1. Data 2. Draft Design 3. Singleton Baits 4. Bait Thermodynamics 5. Report
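
The singleton check can be sketched by parsing regions in the `chr1: 2247628-2248537` style requested on slide 28 and flagging those shorter than 120 bases. This is a minimal sketch: the function names are hypothetical, and the coordinates are assumed to be 1-based and inclusive, which the slides do not specify.

```python
import re
from typing import List, Tuple

Region = Tuple[str, int, int]  # (chromosome, start, end)

REGION_RE = re.compile(r"^(chr\w+):\s*(\d+)-(\d+)$")
BAIT_LENGTH = 120  # regions shorter than this are covered by a single bait


def parse_region(text: str) -> Region:
    """Parse a region written as 'chr1: 2247628-2248537'."""
    match = REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised region: {text!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if end < start:
        raise ValueError(f"end before start in region: {text!r}")
    return chrom, start, end


def region_length(region: Region) -> int:
    """Length in bases, assuming 1-based inclusive coordinates."""
    return region[2] - region[1] + 1


def singleton_regions(regions: List[Region]) -> List[Region]:
    """Regions short enough to be covered by a single bait."""
    return [r for r in regions if region_length(r) < BAIT_LENGTH]
```

The flagged regions are the candidates for additional singleton baits; how many extra baits to add is a design decision the slides leave open.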
  • 32. Custom approach ensures variant detection (figure panels: OGT, 1KG) Even at more than 50x coverage, whole exome sequencing does not accurately identify all SNPs. OGT custom baits design compared with 1000 Genomes whole exome capture data.
  • 33. Correction for bait thermodynamics GC content: • Calculate GC content for all baits • Identify those baits where GC content is extreme (for instance >65% or <40%) • Add additional copies of these baits Tm: • Calculate the Tm for all baits • Identify those baits where Tm is extreme (e.g. > 75 °C) • Add additional copies of these baits (figure labels: Region of Interest, GC extreme, Tm extreme) • Pipeline steps: 1. Data 2. Draft Design 3. Singleton Baits 4. Bait Thermodynamics 5. Report
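
The GC step can be sketched as below, using the example thresholds from the slide (above 65% or below 40%). Function names are illustrative. The Tm step is left out because the slide does not say which melting temperature model is used.

```python
from typing import Dict, List

GC_HIGH = 0.65  # example upper threshold from the slide
GC_LOW = 0.40   # example lower threshold from the slide


def gc_fraction(sequence: str) -> float:
    """Fraction of G and C bases in a bait sequence."""
    if not sequence:
        raise ValueError("empty sequence")
    seq = sequence.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def is_gc_extreme(sequence: str, low: float = GC_LOW, high: float = GC_HIGH) -> bool:
    """True when GC content falls outside the [low, high] band."""
    gc = gc_fraction(sequence)
    return gc < low or gc > high


def flag_extreme_baits(baits: Dict[str, str]) -> List[str]:
    """Names of baits whose GC content is extreme and may need extra copies."""
    return [name for name, seq in baits.items() if is_gc_extreme(seq)]
```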
  • 34. OGT custom bait designs help overcome GC issues (figure panels: OGT, SureSelect) In a region with 70% GC content OGT custom bait design achieved a maximum read depth of 50x. The Agilent SureSelect 50Mb capture kit does not capture any reads in this region.
  • 35. OGT custom bait designs help overcome GC issues (figure panels: OGT, SureSelect) Relative capture of targets within a single gene. Agilent coverage is 20x for the target with no GC content bias, and minimal for targets with a GC content of 65%. In contrast OGT custom baits perform excellently in this region.
  • 36. Customer report • Design Parameters • Depth of Coverage • On target / Off target • Regions not covered, and why not • Bait Details • Singletons • GC distribution • Tm distribution • Library Design • Baits generated • Pipeline steps: 1. Data 2. Draft Design 3. Singleton Baits 4. Bait Thermodynamics 5. Report
  • 37. Summary • Custom design of regions for targeted sequencing offers significant flexibility for many applications • Expert probe design will ensure: • Better ‘evenness’ of coverage, which helps ensure no regions are missed and maximises the likelihood of variant detection • Improved overall capture efficiency and on-target performance, which means cost effective sequencing downstream • Increased capture efficiency of SNPs and Indels, which increases the likelihood of detection • Reduction of risk and better performance
  • 38. Adding value through analysis • Introduction • NGS data analysis • Primary analysis • Mapping and assembly • Q score re-calibration • NGS sequencing QC • NGS alignment QC • Secondary analysis • SNP and Indel calling • Annotation and evaluation pipeline • SIFT and PolyPhen • Deliverables • Case study • Summary
  • 39. The analysis challenge (figure: NGS sequencer, hard drive with ~4Gb raw data per exome, publication; steps in between: Mapping, Annotation, Filtering, Reporting)
  • 40. Raw data: FASTQ (standard text representation of short reads) FASTQ uses four lines per sequence. • Line 1: '@' followed by a sequence identifier • Line 2: raw sequence letters • Line 3: '+' (and optional sequence identifier) • Line 4: quality values for the sequence in Line 2. Must contain the same number of symbols as letters in the sequence. (The letters encode Phred Quality Scores from 0 to 93 using ASCII 33 to 126) Example:
      @SEQ_ID
      GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
      +
      !''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65
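
The four-line layout described on this slide can be read with a short parser. This is a minimal sketch: `FastqRecord` and `parse_fastq` are illustrative names, and the input is assumed to use exactly four lines per record with no wrapped sequence lines.

```python
from typing import Iterable, Iterator, List, NamedTuple


class FastqRecord(NamedTuple):
    identifier: str
    sequence: str
    qualities: List[int]  # Phred scores decoded from the quality line


def parse_fastq(lines: Iterable[str], offset: int = 33) -> Iterator[FastqRecord]:
    """Yield records from four-line FASTQ text, decoding ASCII 33-126 to Phred 0-93."""
    it = (line.rstrip("\r\n") for line in lines)
    for header in it:
        if not header:
            continue  # tolerate blank lines between records
        sequence = next(it, None)
        plus = next(it, None)
        quality = next(it, None)
        if sequence is None or plus is None or quality is None:
            raise ValueError("truncated FASTQ record")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        yield FastqRecord(header[1:], sequence, [ord(c) - offset for c in quality])
```

Passing an open file handle as `lines` works because file objects iterate line by line.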
  • 41. Phred quality scores • Phred is an accurate base-caller used for capillary traces (Ewing et al Genome Research 1998) • Each called base is given a quality score Q • Quality based on simple metrics (such as peak spacing) calibrated against a database of hand-edited data • QPhred = -10 * log10(estimated probability call is wrong)
      Phred Quality Score   Probability of incorrect base call   Base call accuracy
      10                    1 in 10                              90 %
      20                    1 in 100                             99 %
      30                    1 in 1000                            99.9 %
      40                    1 in 10000                           99.99 %
    Q30 often used as a threshold for useful sequence data
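
The QPhred formula on this slide converts directly into code. The function names below are illustrative only.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Invert Q = -10 * log10(p): the probability that a base call is wrong."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Q = -10 * log10(p) for an error probability p in (0, 1]."""
    if not 0 < p <= 1:
        raise ValueError("probability must be in (0, 1]")
    return -10 * math.log10(p)


if __name__ == "__main__":
    # Walk through the quality scores listed in the slide's table.
    for q in (10, 20, 30, 40):
        p = phred_to_error_probability(q)
        print(f"Q{q}: error probability {p:g}, accuracy {100 * (1 - p):g} %")
```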
  • 42. Adding value through analysis • Introduction • NGS data analysis • Primary analysis • Mapping and assembly • Q score re-calibration • NGS sequencing QC • NGS alignment QC • Secondary analysis • SNP and Indel calling • Annotation and evaluation pipeline • SIFT and PolyPhen • Deliverables • Case study • Summary
  • 43. Primary analysis: Mapping and alignment
      • Raw sequence files (FASTQ format)
      • Mapping (BWA/Bowtie)
      • Raw alignment (SAM/BAM format)
      • Local realignment (around InDels), quality score re-calibration and duplicate marking (GATK, Picard)
      • Analysis-ready alignment files (SAM/BAM format)
  • 44. Why mark duplicates and realign around indels? • Example shown: 3 incorrect calls within 40bp!
  • 45. Primary analysis: Mapping and alignment
      • Raw sequence files (FASTQ format)
      • Mapping (BWA/Bowtie)
      • Raw alignment (SAM/BAM format)
      • Local realignment (around InDels), quality score re-calibration and duplicate marking (GATK, Picard)
      • Analysis-ready alignment files (SAM/BAM format)
  • 46. NGS variant calling methods
      • Option 1 - Hard filtering
          • Example: a SNP can only be called if read depth >10 and >35% of reads carry the SNP
          + Effective filtering
          + Transparent to user
          – Simplistic approach
          – Will miss high quality calls that don’t pass the threshold
      • Option 2 - Statistical analysis
          • Based on quality scores of individual base pairs, the alignment and statistical probability models
          + Robust
          + Optimum balance of sensitivity and specificity due to the use of statistical models
          + Fewer false positive and false negative SNP calls
          – Requires correctly pre-processed data with reliable quality scores
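The hard-filtering example (Option 1) can be sketched in a few lines of Python. The thresholds are the ones quoted on the slide; the function name and its parameters are hypothetical and do not describe OGT's implementation.

```python
def passes_hard_filter(read_depth, alt_read_count, min_depth=10, min_alt_fraction=0.35):
    """Return True if a candidate SNP clears both fixed thresholds.

    read_depth: total number of reads covering the position
    alt_read_count: number of those reads carrying the SNP allele
    """
    if alt_read_count < 0 or alt_read_count > read_depth:
        raise ValueError("alt_read_count must be between 0 and read_depth")
    # Both comparisons are strict, matching ">10" and ">35%" on the slide.
    return read_depth > min_depth and alt_read_count / read_depth > min_alt_fraction


print(passes_hard_filter(read_depth=20, alt_read_count=8))   # 40% of 20 reads
print(passes_hard_filter(read_depth=10, alt_read_count=10))  # depth is not above 10
print(passes_hard_filter(read_depth=40, alt_read_count=12))  # only 30% of reads
```

The second call shows the weakness noted on the slide: a site where every read carries the variant is still rejected because the depth sits exactly on the threshold.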
  • 47. Base quality score re-calibration • Figure panels: before recalibration, after recalibration • Source: The Broad Institute, http://www.broadinstitute.org/files/shared/mpg/nextgen2010/nextgen_poplin.pdf
  • 48. Primary analysis: Raw data and assembly QC
      • Raw sequence files (FASTQ format)
      • Mapping (BWA/Bowtie)
      • Raw alignment (SAM/BAM format)
      • Local realignment (around InDels), quality score re-calibration and duplicate marking (GATK, Picard)
      • Analysis-ready alignment files (SAM/BAM format)
      • Sequence QC check (FastQC) -> Raw data QC Report
      • Alignment QC check (Picard) -> Alignment QC Report
  • 49. Secondary analysis: SNP and Indel calling, annotation and filtering
      • Analysis-ready alignment (SAM/BAM format) -> Unified Genotyper (GATK) -> SNPs and InDels (VCF format) -> Variant Evaluation (OGT)
      • Variant evaluation asks:
          • Known variant?
          • Impact on gene expression?
          • Splicing affected?
          • Non-synonymous or frameshift mutation?
          • Impact on protein function?
          • How confident are we in the call?
          • Zygosity?
      • Outputs: Sequence QC Report, Alignment QC Report, comprehensive interactive OGT Report
  • 50. SNP/Indel classification (standard analysis)
      We check and annotate every single detected SNP and Indel against all human Ensembl genes and transcripts and dbSNP.
      • dbSNP annotation:
          • Is the variant known?
          • Obtain allele frequency
      • Does it affect any of the following?
          • Promoter region
          • UTR
          • Splice sites or intronic region
          • CDS
          • Synonymous mutation
          • Non synonymous mutation
          • Frameshift mutation
          • Stop codon (truncated/elongated protein sequence)
          • Overlap with protein domain
          • Consequence on protein function predicted (SIFT & PolyPhen)
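The coding-sequence categories listed above (synonymous, non-synonymous, stop codon) can be illustrated with a minimal Python sketch that compares a reference codon with a variant codon using the standard genetic code. This is an illustration of the categories only, not OGT's annotation pipeline; the function name and the returned labels are assumptions.

```python
from itertools import product

# Standard genetic code in T, C, A, G order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon, alt_codon):
    """Classify a single-codon substitution by its effect on the protein."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop gained"  # truncated protein sequence
    if ref_aa == "*":
        return "stop lost"  # elongated protein sequence
    return "non-synonymous"


print(classify_codon_change("CGA", "CGG"))  # arginine to arginine
print(classify_codon_change("CGA", "CAA"))  # arginine to glutamine
print(classify_codon_change("CGA", "TGA"))  # arginine to stop
```

Frameshifts, splice-site effects and protein-function predictions (SIFT, PolyPhen) need transcript and protein context and are outside the scope of this sketch.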
  • 51. OGT Processing Overview (flow diagram, from data to information)
      • Individual Genome Analysis (Standard Level)
          • Gather all detected SNP/Indels
          • Mapped to promoter regions
          • Mapped to exons, splice sites or UTRs
          • Not described in dbSNP
          • Described in dbSNP: rare RS ID variations
          • Non-synonymous coding variations and protein domains
          • Variations with serious consequences to the protein sequence (SIFT)
      • Multi Genome Analysis, Data Gathering and Comparison (Advanced Level)
          • Perform pairwise genome analysis
          • Filter out variants present in “baseline” genome (e.g. somatic tissue, healthy sibling)
          • Filter out variants present in any “baseline” exome AND not all “case” exomes
      • Tailored analysis based on client’s individual requirements (Expert Level)
          • Study-specific additional in-depth filtering and analysis
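The multi-genome comparison step can be sketched with Python sets. The sketch assumes one reading of the diagram: keep only variants found in every "case" sample and in no "baseline" sample. The variant tuples, sample data and function name are hypothetical.

```python
def shared_case_only_variants(case_samples, baseline_samples):
    """Return variants present in all case samples and absent from every baseline.

    Each sample is a set of (chromosome, position, ref_allele, alt_allele) tuples.
    """
    if not case_samples:
        raise ValueError("at least one case sample is required")
    shared_by_cases = set.intersection(*(set(sample) for sample in case_samples))
    seen_in_baseline = set().union(*baseline_samples)
    return shared_by_cases - seen_in_baseline


# Hypothetical variant calls for two affected siblings and one healthy sibling.
patient_1 = {("chr1", 1000, "C", "T"), ("chr2", 500, "G", "A"), ("chr3", 42, "A", "G")}
patient_2 = {("chr1", 1000, "C", "T"), ("chr2", 500, "G", "A")}
healthy_sibling = {("chr2", 500, "G", "A")}

candidates = shared_case_only_variants([patient_1, patient_2], [healthy_sibling])
print(sorted(candidates))
```

Here the chr3 variant is dropped because only one patient carries it, and the chr2 variant is dropped because the healthy sibling carries it too.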
  • 52. NGS data delivery • Ship data: hard drive (or FTP) • Double click! • File location & share results • Comprehensive HTML analysis report
  • 54. Analysis report: QC section — Read QC
  • 55. Analysis report: QC section — Read QC
  • 56. Analysis report: QC section — Alignment QC
  • 57. Analysis report: QC section — Alignment QC
  • 59. The Variant Table View • Data display • Data export
  • 60. The Variant Table View — External links
  • 62. Predicted consequences on protein function
  • 63. Alignment View of selected variant in IGV
  • 64. OGT data processing ensures detection of insertions • Detection of a 31bp insertion
  • 65. OGT data processing ensures detection of deletions: Example 1 • Detection of an 84bp deletion
  • 66. Detection of homozygous and heterozygous deletions • Panels: homozygous deletion, heterozygous deletion, no deletion (reference sequence)
  • 68. Customer data: Analysis of consanguineous samples • Pedigree with generations I and II • HACE1 Exon 11, c.994C>T, R332X (CGA -> TGA) • Data courtesy of Dr. Bernd Wollnik, Institute of Human Genetics, University Hospital of Cologne
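The link between the coding position c.994 and the protein position R332 follows from simple arithmetic, sketched below in Python. It assumes coding-DNA numbering in which position 1 is the first base of the start codon; the function name is illustrative.

```python
def codon_for_cdna_position(cdna_position):
    """Map a 1-based coding-DNA position to (codon number, base within codon)."""
    if cdna_position < 1:
        raise ValueError("coding positions start at 1")
    zero_based = cdna_position - 1
    return zero_based // 3 + 1, zero_based % 3 + 1


codon_number, base_in_codon = codon_for_cdna_position(994)
print(codon_number, base_in_codon)
```

Position 994 is the first base of codon 332, which matches the slide: changing the first base of CGA from C to T gives TGA, a stop codon, hence R332X.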
  • 69. Confirmation by Sanger sequencing • Traces shown for Control, Mother, Father, Patient1 and Patient2 • Amino acid context H V F R I G P, with R332X (X) marked • Protein domain diagram: ANK1 (69-161), ANK1 (168-258), HECT (602-909) • Data courtesy of Dr. Bernd Wollnik, Institute of Human Genetics, University Hospital of Cologne
  • 70. Customer feedback... Analysis of Consanguineous Samples “Just wanted to let you know that we have probably identified the causative gene and mutation in the patient sample. The mutation is located in the middle of an 18 Mb homozygous stretch and is a homozygous nonsense mutation!!! Wow, its going so nicely with your data!!!” Dr. Bernd Wollnik, Institute of Human Genetics, University Hospital of Cologne
  • 71. Summary: OGT offers fast, accurate & powerful NGS analysis
      • Standard Analysis
          • Robust statistical data analysis
          • Comprehensive variant annotation
          • Interactive filtering and prioritisation of data based on:
              • chromosomal region
              • allele frequency / novelty
              • zygosity
              • confidence score and read depth
              • severity of mutation
      • Advanced Analysis
          • Multi-genome comparison
      • Bespoke analysis
          • Tailored to your specific requirements
  • 72. Outline of presentation • Delivering a unique next generation sequencing service — Dr Mike Evans, CEO • Optimised bait design for targeted sequencing — Dr Volker Brenner, Head of Computational Biology • Adding value through analysis — Dr Volker Brenner, Head of Computational Biology • Summary • Q&A
  • 73. Speak to one of our team or visit booth 713 to: • Book a demonstration of our interactive analysis report (hurry, limited availability) • Discuss your specific project requirements • Take part in our short survey for a chance to win an Amazon Kindle