Making your science powerful: an introduction to
NGS experimental design
7th May CSCR seminar
Dr Jelena Aleksic
The purpose of this talk
Bioinformaticians often get asked about experimental design, as
we have experience working with the data.
Good experimental design is absolutely crucial for NGS
applications, because of the size of the datasets.
The purpose of this talk is to introduce the basic concepts of
good NGS experimental design.
NGS Experimental Design
• Importance
• Specificity and power
• Accuracy, precision and bias
• Considerations for sample collection
• Considerations for sequencing
• Conclusions
Without a valid design, valid scientific conclusions cannot be drawn. This is
particularly important for NGS because of the size of the datasets.
Ronald A. Fisher
(1890-1962)
“To consult the statistician after an experiment is
finished is often merely to ask them to conduct a
post mortem examination. They can perhaps say what
the experiment died of.” (1938)
Good to think about early!
Value of Planning
Rushing into experiments without thoughtful planning invites
failure.
“Seventy percent of whether your experiment will
work is determined before you touch the first test
tube”
Tung-Tien Sun (2004).
Excessive trust in authorities and its influence on experimental design.
Nature Reviews Molecular Cell Biology
Still a serious issue in the field
- Almost 70% of all the human RNA-seq samples in GEO do not
have biological replicates (Feng et al. 2012)
- More unreplicated RNA-seq data were published than
replicated RNA-seq data in 2011 (Feng et al. 2012)
- ENCODE guidelines recommend two ChIP-seq biological
replicates (Landt et al. 2012), but this is controversial and
arguably too few
Important concepts
• Specificity: the proportion of reported results that are genuine,
as opposed to false positives (strictly speaking this is precision, i.e. 1 − FDR)
• Power (sensitivity): the proportion of genuine results that are
successfully identified by the experiment
e.g. At FDR 1%, about 1 in 100 reported results is expected to be a false positive.
e.g. At 20% power, the experiment will successfully identify
200 out of 1000 genuine differentially expressed genes at a
specified fold-change level.
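The two examples above are simple expected-value arithmetic. A minimal sketch follows; the function name and the idea of a list of 100 reported hits are assumptions made for illustration, not part of the talk.

```python
def expected_counts(n_reported, fdr, n_true_effects, power):
    """Return (expected false positives among the reported hits,
    expected number of genuine effects that get detected)."""
    expected_false_positives = n_reported * fdr
    expected_detected = n_true_effects * power
    return expected_false_positives, expected_detected


# Hypothetical numbers matching the slide: 100 reported hits at FDR 1%,
# and 1000 genuine differentially expressed genes at 20% power.
false_positives, detected = expected_counts(
    n_reported=100, fdr=0.01, n_true_effects=1000, power=0.20
)
print(f"expected false positives: {false_positives:.0f}")
print(f"expected genuine genes detected: {detected:.0f}")
```

These are expectations, so any single experiment may give slightly more or fewer than these counts.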
Adequately Powered
The power (sensitivity) of an experiment is the
probability that it can detect an effect, if it is present.
• Power is often overlooked.
• It is a probability: any value between 0% and 100%.
• Achieved by:
– Using appropriate numbers of animals / samples
(sample size)
– Controlling sources of variation
If increasing the sample size also increases the variability, the experiment will
not necessarily gain power.
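To illustrate how sample size and variability trade off, here is a rough sketch using a normal approximation to a two-sided, two-group comparison. This is an assumption made for illustration: it is not an RNA-seq-specific method (count data are usually modelled differently), and the function name, effect size and standard deviations are hypothetical.

```python
from statistics import NormalDist


def approx_power(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group z-test.

    effect: true difference between group means
    sd: standard deviation within each group (assumed equal and known)
    """
    z = NormalDist()
    standard_error = sd * (2.0 / n_per_group) ** 0.5
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(effect) / standard_error
    # Probability that the test statistic falls in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


power_small = approx_power(effect=1.0, sd=1.0, n_per_group=3)
power_large = approx_power(effect=1.0, sd=1.0, n_per_group=12)
# Four times the samples, but twice the variability: no gain in power.
power_large_noisy = approx_power(effect=1.0, sd=2.0, n_per_group=12)

for label, value in [
    ("n=3, sd=1", power_small),
    ("n=12, sd=1", power_large),
    ("n=12, sd=2", power_large_noisy),
]:
    print(f"{label}: power = {value:.2f}")
```

Under this approximation the standard error depends on sd / sqrt(n), so quadrupling the sample size while doubling the variability leaves the power unchanged.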
Adequately Powered
• Too many samples:
– wastes resources, e.g. animals
(unethical), money, time and
effort.
• Too few samples:
– may lack power and miss a
scientifically important effect.
[Figure: power (%) plotted against sample size per group (n)]
How much power is enough?
– Depends on the experiment purpose.
– To just get the top few targets = not much.
– To get a comprehensive list of targets = a lot.
– Rare transcripts / genomic variants = a lot.
– Also depends on variability of samples.
– In vitro studies = less variability
– In vivo = more variability, mixture of tissues
– Cancer = hugely variable, need a lot of power
How do I tell?
• It’s possible to calculate the required power of an experiment.
For RNA-seq, there’s even an online tool (and there are
various general calculators):
• Alternatively, use rules of thumb based on other people’s
systematic reviews of a particular experiment type.
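As a rough idea of what a general calculator does, the sketch below uses the textbook normal-approximation formula for comparing two group means, n = 2 (z_(1-alpha/2) + z_power)^2 sd^2 / effect^2. It is an assumption-laden illustration, not the online RNA-seq tool mentioned above, and the function name and example numbers are hypothetical.

```python
import math
from statistics import NormalDist


def samples_per_group(effect, sd, power=0.8, alpha=0.05):
    """Approximate samples per group for a two-sided, two-group comparison.

    effect: smallest difference between group means worth detecting
    sd: expected standard deviation within each group
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    z_power = z.inv_cdf(power)
    n = 2.0 * ((z_alpha + z_power) * sd / effect) ** 2
    return math.ceil(n)


# Halving the variability cuts the required sample size roughly four-fold.
n_noisy = samples_per_group(effect=1.0, sd=1.0)
n_clean = samples_per_group(effect=1.0, sd=0.5)
print("sd = 1.0:", n_noisy, "per group")
print("sd = 0.5:", n_clean, "per group")
```

The estimate of sd has to come from somewhere, typically a pilot experiment or published data on a similar system.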
How many replicates?
4 mice…
No, it should be
12 mice…
Look into my crystal ball……
The sensitivity, ability or power to detect changes
depends on the sample size.
Depends on the resources, the goals of the study, and the
reliability of the technology:
– How much money do you have?
– Can you handle all these samples without problems?
– What size of differences (effect size) do you want to detect?
– Is the sample an accurate representation of the population?
– Is it large enough to achieve meaningful results?
How many replicates?
What is the minimum number
of replicates?
Experiments with only one replicate break
statistics and make puppies cry.
• Unless:
– It’s a pilot experiment, e.g. to estimate data quality /
antibody specificity / quantity of particular transcripts etc.,
and you plan to do follow-ups (further reps or other
biological validation).
– It’s performed on a homogeneous and well-characterised
system where a lot of other data already exists, e.g. RNA-seq
of a much-used cell line.
– It’s a simple confirmatory experiment.
What is the minimum number
of replicates?
• Adequate minimum number of replicates for an
exploratory study = 3.
• The reasoning
– 3 data points per gene let you fit a statistical distribution
more accurately, and give a better estimate of variability
– This in turn improves both the specificity and the
sensitivity of the experiment.
Some labs do 4 replicates as standard, in case one fails.
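A small simulation can show why more replicates give a better estimate of variability. The sketch assumes normally distributed measurements with a true variance of 1, which is a simplification chosen for illustration (real expression data are not normal), and the function name is hypothetical.

```python
import random
import statistics


def spread_of_variance_estimates(n_replicates, n_sims=5000, seed=1):
    """Simulate many experiments and report how much the estimated
    variance fluctuates from one experiment to the next."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_sims):
        values = [rng.gauss(0.0, 1.0) for _ in range(n_replicates)]
        estimates.append(statistics.variance(values))
    return statistics.stdev(estimates)


spread = {n: spread_of_variance_estimates(n) for n in (2, 3, 6)}
for n, value in spread.items():
    print(f"{n} replicates: spread of variance estimates = {value:.2f}")
```

The spread shrinks as replicates are added, so the variability estimate from any one experiment becomes more trustworthy.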
Precise & Unbiased
PRECISION:
• Reproducibility of repeated
measurements.
• Precise estimation of the quantity of
interest.
[Figure: target diagram with labels "Precision", "Bias" and "Target"]
UNBIASED:
• Bias can affect accuracy.
• Should control for systematic differences between the measurement
and some “true” value (target).
• Doesn’t confound that estimate with a technical effect.
• Systematic variation (bias) leads to results being inaccurate.
• Random variation (chance) leads to results being imprecise.
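The distinction between systematic and random variation can be sketched with simulated measurements. Everything here is hypothetical: the true value, offsets and noise levels are made up, and the noise is assumed to be normal.

```python
import random
import statistics


def simulate_measurements(true_value, systematic_offset, noise_sd, n=1000, seed=0):
    """Repeated measurements with a fixed offset (bias) plus random noise."""
    rng = random.Random(seed)
    return [
        true_value + systematic_offset + rng.gauss(0.0, noise_sd)
        for _ in range(n)
    ]


def summarise(measurements, true_value):
    """Return (bias, imprecision): mean error and spread of the measurements."""
    bias = statistics.mean(measurements) - true_value
    imprecision = statistics.stdev(measurements)
    return bias, imprecision


TRUE_VALUE = 10.0
# Precise but biased: tight cluster, centred in the wrong place.
biased = summarise(simulate_measurements(TRUE_VALUE, 2.0, 0.1), TRUE_VALUE)
# Unbiased but imprecise: centred on the target, widely scattered.
noisy = summarise(simulate_measurements(TRUE_VALUE, 0.0, 2.0), TRUE_VALUE)

print(f"biased, precise:     bias = {biased[0]:.2f}, spread = {biased[1]:.2f}")
print(f"unbiased, imprecise: bias = {noisy[0]:.2f}, spread = {noisy[1]:.2f}")
```

Taking more measurements reduces the effect of the random scatter on the average, but it does nothing to remove the systematic offset.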
“A biased scientific result is no
different from a useless one”
Daniel Sarewitz
(Beware the creeping cracks of bias. Nature, 2012)
• Bias is avoided by:
• Correct selection of experimental units.
• Randomisation of the experimental units.
• Randomisation of the order in which
measurements are made.
• “Blinding” and the use of coded samples where
appropriate.
• Failure to randomise and blind can lead to
false positive and negative results
Controlling bias
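Randomisation and coding of samples are easy to do in software. The sketch below is one possible way, not a prescribed protocol: the function name, the mouse identifiers and the "S01"-style codes are all hypothetical.

```python
import random


def randomise_design(unit_ids, groups, seed=None):
    """Randomly allocate units to equally sized groups, randomise the
    processing order, and give each unit a coded label for blinding."""
    if len(unit_ids) % len(groups) != 0:
        raise ValueError("number of units must be a multiple of the number of groups")
    rng = random.Random(seed)

    # Random allocation of units to groups.
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)
    per_group = len(shuffled) // len(groups)
    allocation = {}
    for i, group in enumerate(groups):
        for unit in shuffled[i * per_group:(i + 1) * per_group]:
            allocation[unit] = group

    # Independent random order in which the units are processed or measured.
    processing_order = list(unit_ids)
    rng.shuffle(processing_order)

    # Coded labels follow the processing order, so they do not reveal the group.
    codes = {
        unit: f"S{position + 1:02d}"
        for position, unit in enumerate(processing_order)
    }
    return allocation, processing_order, codes


mice = [f"mouse_{i}" for i in range(1, 13)]
allocation, order, codes = randomise_design(mice, ["control", "treatment"], seed=42)
for unit in order:
    print(codes[unit], unit, allocation[unit])
```

Recording the seed keeps the allocation reproducible, while the person taking measurements only needs to see the coded labels.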
Confounding factors:
example
Plate1 Plate2 Plate3
Control Treatment 1 Treatment 2
RNA Extraction
The difference between Control, Treatment 1
and Treatment 2 is confounded by Plate
Confounding factors:
example 2
Day1, Plate 1 Day2, Plate 2 Day3, Plate 3
Control Treatment 1 Treatment 2
RNA Extraction
The difference between Control, Treatment 1
and Treatment 2 is confounded by day and plate.
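A design table can be checked for this kind of confounding before any samples are processed. The sketch below is a simple illustration, with a hypothetical function name, that flags a design in which no batch (plate, day, flow cell) contains more than one treatment.

```python
from collections import defaultdict


def fully_confounded(design):
    """design: iterable of (treatment, batch) pairs, one per sample.

    Returns True when every batch holds a single treatment, so that
    treatment and batch effects cannot be separated."""
    treatments_per_batch = defaultdict(set)
    for treatment, batch in design:
        treatments_per_batch[batch].add(treatment)
    return all(len(found) == 1 for found in treatments_per_batch.values())


treatments = ["Control", "Treatment 1", "Treatment 2"]
plates = ["Plate1", "Plate2", "Plate3"]

# The layout from the example: one treatment per plate.
one_treatment_per_plate = list(zip(treatments, plates))
# An alternative layout: every treatment appears on every plate.
blocked = [(treatment, plate) for plate in plates for treatment in treatments]

print("one treatment per plate:", fully_confounded(one_treatment_per_plate))
print("every treatment on every plate:", fully_confounded(blocked))
```

Putting every treatment on every plate (and on every day) means a plate or day effect hits all groups equally and can be accounted for in the analysis.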
Sources of potential bias
Issues can arise if any of the experimental steps are
applied in a systematically biased way between sample
and control. e.g. During:
- Sample collection procedure
- Molecular biology during the main experiment (e.g.
ChIP)
- Library prep
- Sequencing (different lanes on same flowcell is ok, but
different flow cells can produce different results)
http://www.the-scientist.com/blog/display/57558/
• A GWAS of 800 centenarians against controls found 150 SNPs which could predict
whether a person is a centenarian with 77% accuracy.
• Problem: they used different SNP chips for centenarians vs controls.
• Retracted in 2011, after an independent lab reviewed the data and QC was applied.
Samples to collect
• Adequate minimum number of replicates for an
exploratory study = 3.
• Controls are also essential.
• Biological replicates should be performed -
technical reps will not capture variation present in the
tissues / organisms.
–> This means you need at least 3 biological replicate
samples, and 3 biological replicate controls.
Biological or technical
replicates?
Choose Samples
Extract RNA
Quality Control
Convert to cDNA
Quality Control
Library prep
Washing
Sequencing
Analysis
[Figure: RNAseq processing workflow covering the steps above; one step is marked as biological replication and four as technical replication]
Biological replicate tricky
cases
• Some biological replicate cases are obvious: e.g.
tissue samples from different mice, blood samples
from different people, cell cultures from different
people.
• Others are less clear cut:
• e.g. Experiments using a single cell line
Cell line replicates
[Figure: three cell line replication schemes, each running from treatment to results, labelled "Bad example", "Better" and "Even better"]
None of these are biological replicates, but we do the best we can.
Performing cell line replicates
• The aim is to assess the replicability of a given
experiment, by performing it independently
• This means ideally they should be done on different
days, with fresh medium and reagents etc. However,
this might be impractical.
• It is at least important to perform the treatments
independently. e.g. To get 4 replicates, do 4 separate
transfections (or RNAi etc) for the experimental
sample, and 4 for the control.
Experimental Controls
• Controls are a very important part of your experiment,
because it’s very difficult to eliminate all of the
possible confounding variables and bias.
• Designing the experiment with controls in mind is
often more crucial than determining the
independent variable.
• Controls increase the statistical validity of your data.
• Two types: positive and negative.
Experimental Controls
• Placebo control
Mimic a procedure or treatment without the
actual use of the procedure or test
substance.
• e.g. same surgical procedure but
without X implanted
• Vehicle control
Used in studies where a substance is used to deliver an experimental
compound.
e.g. apply EtOH to cell lines on its own as a control, since it’s
used as a vehicle for delivering the drug tamoxifen.
• Baseline biological state
e.g. Wild type to observe the effects of a knockout.
ChIP-seq Controls
• What controls to use
– Either input DNA or IgG antibody are commonly used. Both
have pros and cons.
• Replicates
- The control should have as many replicates as the sample,
as it is also variable.
• Particularly important
- ChIP reactions are extremely noisy and variable.
Should I pool samples?
When working with very small amounts of tissue sample, pooling is a
good way of reducing the noise while keeping n (the number of
sequencing reactions) reasonably small.
• Better to pool the biological material (tissue, cells), not the purified
RNA or labeled cDNA. In this way, problems are far easier to spot.
• However, there’s no way to estimate variation between individuals
in a pool, which is sometimes important and often interesting.
• Beware of outliers: don’t include any sample that looks suspicious.
• In some studies comparing pooled and unpooled designs, the majority of
differentially expressed genes (DEGs) turned out to be extreme in only one individual.
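A toy example shows how a single extreme individual can drive a pooled result. The expression values are invented, and pooling is assumed to behave like averaging equal contributions from each individual, which is a simplification.

```python
import statistics

# Hypothetical expression values for one gene in four individuals per group.
control = [10.0, 11.0, 9.0, 10.0]
treated = [10.0, 11.0, 9.0, 50.0]  # one extreme individual

# A pool of equal amounts behaves roughly like the mean of the individuals.
pooled_fold_change = statistics.mean(treated) / statistics.mean(control)

# With unpooled samples we can see how many individuals actually changed.
n_elevated = sum(value > max(control) for value in treated)
median_fold_change = statistics.median(treated) / statistics.median(control)

print(f"pooled fold change: {pooled_fold_change:.2f}")
print(f"treated individuals above the control range: {n_elevated} of {len(treated)}")
print(f"median fold change across individuals: {median_fold_change:.2f}")
```

From the pool alone the gene looks clearly upregulated, whereas the individual values show that only one sample is responsible.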
Basic NGS parameters
• Library size / sequencing depth: the number of reads
obtained for a given sample (e.g. 20 million)
• Read length: bp length of each read in the library (e.g. 150
bp)
• Single end vs paired end: single end sequencing only
sequences one end of each DNA fragment, while paired end
sequences both ends.
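For genome sequencing, these three parameters combine into the usual back-of-envelope estimate of mean coverage: total sequenced bases divided by genome size. The sketch below assumes a round genome size of 3 Gb purely for illustration, and the function names are hypothetical. It does not apply directly to RNA-seq, where reads are spread unevenly across transcripts.

```python
import math


def mean_coverage(n_fragments, read_length, genome_size, paired_end=False):
    """Mean coverage = total sequenced bases / genome size."""
    reads_per_fragment = 2 if paired_end else 1
    return n_fragments * reads_per_fragment * read_length / genome_size


def fragments_needed(target_coverage, read_length, genome_size, paired_end=False):
    """Number of sequenced fragments needed to reach a target mean coverage."""
    reads_per_fragment = 2 if paired_end else 1
    return math.ceil(target_coverage * genome_size / (reads_per_fragment * read_length))


GENOME_SIZE = 3_000_000_000  # assumed round figure, roughly human-sized

single_end = mean_coverage(20_000_000, 150, GENOME_SIZE)
paired_end = mean_coverage(20_000_000, 150, GENOME_SIZE, paired_end=True)
needed_for_30x = fragments_needed(30, 150, GENOME_SIZE, paired_end=True)

print(f"20 million single-end 150 bp reads: {single_end:.1f}x")
print(f"20 million paired-end 150 bp fragments: {paired_end:.1f}x")
print(f"fragments needed for 30x, paired-end 150 bp: {needed_for_30x:,}")
```

This is an average only; difficult regions of the genome will typically end up with less coverage than the mean.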
How deeply should you
sequence
• Depends on the application: e.g. some parts of the genome
will be more tricky to sequence, so full coverage for genome
assembly may require very deep sequencing.
• RNA-seq differential expression: very roughly (for human
and mouse) - at least 10 million reads per sample is required,
and 30 million reads per sample should suit most applications.
Can sequence deeper if low expressed genes are particularly
important.
• If in doubt: you can run a pilot experiment and see how much
coverage and saturation you get, and potentially sequence
further after that.
Sequencing depth vs
replicates
Replicates are good! Even if you don’t sequence any more
deeply.
A study looking at RNAseq (Liu et al. 2013) found that adding
more replicates at 10 million reads library size was a more cost-
efficient way of improving the power of the experiment, than
adding more sequencing coverage. This was true for 3-7
replicates.
So, if you don’t have money for a lot of sequencing, you can still
multiplex a number of biological replicates!
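The trade-off is easy to work through for a given budget. In the sketch below the lane yield of 200 million reads is a made-up figure (check the specification of your own sequencer), and the function name is hypothetical; the per-sample depths are the rough 10 and 30 million figures quoted earlier in the talk.

```python
def max_replicates(lane_reads, reads_per_sample, n_conditions):
    """Largest number of replicates per condition that fits on one lane
    when all samples are multiplexed at the same depth."""
    return lane_reads // (reads_per_sample * n_conditions)


LANE_READS = 200_000_000  # hypothetical yield of one lane

shallow = max_replicates(LANE_READS, 10_000_000, n_conditions=2)
deep = max_replicates(LANE_READS, 30_000_000, n_conditions=2)

print(f"10 million reads per sample: {shallow} replicates per condition")
print(f"30 million reads per sample: {deep} replicates per condition")
```

With the same lane, sequencing less deeply leaves room for many more biological replicates per condition.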
What read length?
• The read length should match your sample fragment length:
you want it to be around the size of your fragments, or just a bit
shorter.
• Recommended for Illumina:
sRNAseq, ribosomal profiling = 50 bp read length
mRNAseq (fragment size ~80-200bp) = 150 bp read length
longer fragments = 250 bp
• For longer reads / niche applications: other sequencers also
exist. PacBio has an impressive read length (often >10 kb)
Single end vs paired end
• Single end: a bit cheaper, less sequencing required. Suitable for
most general purposes, such as differential expression analysis.
• Paired end: contains more length and positional information about
the sequenced fragment - can tell where it starts and stops.
• Applications for which you need paired end sequencing:
- splice junctions
- rearrangements: indels and inversions
Summary
• Experimental design is key to a successful genomics experiment.
• The required power of an experiment should be considered when
choosing the number of replicates.
• An adequate number of biological replicates (usually ≥ 3) is
crucial for a reliable statistical analysis.
• Appropriate controls are essential.
• Systematic bias can completely mess up an experiment and should
be controlled.
Acknowledgements
Thank you to Ela and everyone in the Frye lab!
Iwona Driskell
Sandra Blanco
Shobbir Hussain
Joana Flores
Martyna Popis
Roberto Bandiera
Abdul Sajini
Mahalia Page
Nikoletta Gkatza
Carolin Witte
And also the following colleagues for
excellent experimental design
discussions:
Sabine Dietmann
Patrick Lombard
Meryem Ralser
Bettina Fischer (CSBC)
Sarah Carl (CSBC)
Duncan Odom (CRUK CRI)
Audrey Fu (University of Chicago)
And thank you all for listening!
I would particularly like to thank Roslin Russell (CRUK) for letting
me use her course material slides as a basis for preparing this talk

Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxolyaivanovalion
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Researchmichael115558
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceDelhi Call girls
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...shambhavirathore45
 

Último (20)

Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 

Making your science powerful : an introduction to NGS experimental design

  • 1. Making your science powerful: an introduction to NGS experimental design 7th May CSCR seminar Dr Jelena Aleksic
  • 2. The purpose of this talk Bioinformaticians often get asked about experimental design, as we have experience working with the data. Good experimental design is absolutely crucial for NGS applications, because of the size of the datasets. The purpose of this talk is to introduce the basic concepts of good NGS experimental design.
  • 3. NGS Experimental Design • Importance • Specificity and power • Accuracy, precision and bias • Considerations for sample collection • Considerations for sequencing • Conclusions
  • 4. Without a valid design, valid scientific conclusions cannot be drawn. Also particularly important for NGS because of the size of the datasets.
  • 5. Ronald A. Fisher (1890-1962) “To consult the statistician after an experiment is finished is often merely to ask them to conduct a post mortem examination. They can perhaps say what the experiment died of.” (1938) Good to think about early!
  • 6. Value of Planning Rushing into experiments without thoughtful planning invites failure. “Seventy percent of whether your experiment will work is determined before you touch the first test tube” Tung-Tien Sun (2004). Excessive trust in authorities and its influence on experimental design. Nature Reviews Molecular Cell Biology
  • 7. Still a serious issue in the field - Almost 70% of all the human RNA-seq samples in GEO do not have biological replicates (Feng et al. 2012) - More unreplicated RNA-seq data were published than replicated RNA-seq data in 2011 (Feng et al. 2012) - ENCODE guidelines recommend two ChIP-seq biological replicates (Landt et al. 2012), but this is controversial and arguably too few
  • 8. NGS Experimental Design • Importance • Specificity and power • Accuracy, precision and bias • Considerations for sample collection • Considerations for sequencing • Conclusions
  • 9. Important concepts • Specificity: the proportion of results that are genuine, as opposed to false positives. e.g. At an FDR of 1%, on average 1 in 100 reported results will be a false positive. • Power (sensitivity): the proportion of genuine results that are successfully identified by the experiment. e.g. At 20% power, the experiment will successfully identify 200 out of 1000 genuinely differentially expressed genes at a specified fold-change level.
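The two examples on this slide can be combined into a short piece of arithmetic. The sketch below is illustrative only: the function name `expected_calls` is hypothetical, and it assumes the FDR is met exactly, so that false positives make up that fraction of all reported results.

```python
def expected_calls(n_true_effects, power, fdr):
    """Expected true and false positives for a given power and FDR.

    Assumes FDR = FP / (TP + FP) holds exactly, which is an idealisation.
    """
    true_pos = n_true_effects * power
    # Rearranging FDR = FP / (TP + FP) gives FP = TP * FDR / (1 - FDR).
    false_pos = true_pos * fdr / (1.0 - fdr)
    return true_pos, false_pos


# Numbers from the slide: 1000 genuine DE genes, 20% power, FDR 1%.
tp, fp = expected_calls(n_true_effects=1000, power=0.20, fdr=0.01)
print(f"expected true positives:  {tp:.0f}")
print(f"expected false positives: {fp:.1f}")
print(f"genuine effects missed:   {1000 - tp:.0f}")
```

At low power the false positives are not the main problem: most of the genuine effects are simply never found.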
  • 10. Adequately Powered The power (sensitivity) of an experiment is the probability that it can detect an effect, if it is present. • Power is often overlooked. • It is a probability: any value between 0% and 100%. • Achieved by: – Using appropriate numbers of animals / samples (sample size) – Controlling sources of variation. Increasing the sample size will not necessarily increase power if it also increases the variability.
  • 11. Adequately Powered • Too many samples: – wastes resources e.g. animals (unethical), money, time and effort. • Too few samples: – may lack power and miss a scientifically important effect. [Plot: power (%) against sample size per group (n)]
  • 12. How much power is enough? – Depends on the experiment purpose. – To just get the top few targets = not much. – To get a comprehensive list of targets = a lot. – Rare transcripts / genomic variants = a lot. – Also depends on variability of samples. – In vitro studies = less variability – In vivo = more variability, mixture of tissues – Cancer = hugely variable, need a lot of power
  • 13. How do I tell? • It’s possible to calculate the number of replicates needed for an experiment to reach a given power. For RNA-seq, there’s even an online tool, and there are various general calculators. • Alternatively, use rules of thumb based on other people’s systematic reviews of a particular experiment type.
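As a rough illustration of what a general calculator does, the sketch below uses the textbook normal approximation for comparing the means of two equally sized groups. It is not the RNA-seq tool mentioned on the slide, and it is not RNA-seq specific (count data are usually modelled differently). The function names and example numbers are assumptions made for illustration.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def samples_per_group(effect_size, sd, power=0.8, alpha=0.05):
    """Approximate n per group to detect a mean difference `effect_size`.

    Two-sided test, two groups of equal size, common standard deviation
    `sd`, normal approximation.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sd / effect_size) ** 2
    return math.ceil(n)


def approximate_power(n_per_group, effect_size, sd, alpha=0.05):
    """Approximate power for a given n per group (same assumptions)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    standard_error = sd * math.sqrt(2 / n_per_group)
    return _STD_NORMAL.cdf(effect_size / standard_error - z_alpha)


# Hypothetical numbers: detect a difference of one standard deviation.
print("n per group for 80% power:", samples_per_group(effect_size=1.0, sd=1.0))
for n in (3, 4, 12):
    print(f"n = {n:2d}: power ~ {approximate_power(n, 1.0, 1.0):.2f}")
# Doubling the variability raises the required n sharply.
print("n per group with sd doubled:", samples_per_group(effect_size=1.0, sd=2.0))
```

The approximation is optimistic for very small groups, so treat the output as a starting point rather than a guarantee.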
  • 14. How many replicates? 4 mice… No, it should be 12 mice… Look into my crystal ball…… The sensitivity, ability or power to detect changes depends on the sample size.
  • 15. How many replicates? Depends on the resources, the goals of the study, and the reliability of the technology: – How much money do you have? – Can you handle all these samples without problems? – What size of difference (effect size) do you want to detect? – Is the sample an accurate representation of the population? – Is it large enough to achieve meaningful results?
  • 16. What is the minimum number of replicates? Experiments with only one replicate break statistics and make puppies cry. • Unless: – It’s a pilot experiment, e.g. to estimate data quality / antibody specificity / quantity of particular transcripts etc., and you plan to do follow-ups (further reps or other biological validation). – It’s performed on a homogeneous and well-characterised system where a lot of other data already exists. e.g. RNA-seq of a much used cell line. – It’s a simple confirmatory experiment
  • 17. What is the minimum number of replicates? • Adequate minimum number of replicates for an exploratory study = 3. • The reasoning – 3 data points per gene lets you fit a statistical distribution more accurately, and gives a better estimate of variability – This in turn improves both the specificity and the sensitivity of the experiment. Some labs do 4 replicates as standard, in case one fails.
  • 18. NGS Experimental Design • Importance • Specificity and power • Accuracy, precision and bias • Considerations for sample collection • Considerations for sequencing • Conclusions
  • 19. Precise & Unbiased PRECISION: • Reproducibility of repeated measurements. • Precise estimation of the quantity of interest. UNBIASED: • Bias can affect accuracy. • Should control for systematic differences between the measure and some “true” value (target). • Doesn’t confound that estimate with a technical effect. • Systematic variation (bias) leads to results being inaccurate. • Random variation (chance) leads to results being imprecise. [Figure: target diagram illustrating precision and bias]
  • 20. “A biased scientific result is no different from a useless one” Daniel Sarewitz (Beware the creeping cracks of bias. Nature, 2012)
  • 21. Controlling bias • Bias is avoided by: • Correct selection of experimental units. • Randomisation of the experimental units. • Randomisation of the order in which measurements are made. • “Blinding” and the use of coded samples where appropriate. • Failure to randomise and blind can lead to false positive and false negative results.
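A minimal sketch of how the three randomisation and blinding steps on this slide could be scripted. The function name, sample identifiers and code format are hypothetical. Fixing and recording the seed keeps the allocation reproducible.

```python
import random


def randomise_and_code(sample_ids, treatments, seed):
    """Randomly allocate units to treatments, randomise the measurement
    order, and give every unit a blinded code.

    Returns (allocation, measurement_order, codes).
    """
    if len(sample_ids) % len(treatments) != 0:
        raise ValueError("units must divide evenly between treatments")
    rng = random.Random(seed)  # a recorded seed keeps the design reproducible

    # 1. Randomise which experimental unit receives which treatment.
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    per_group = len(shuffled) // len(treatments)
    allocation = {
        unit: treatments[i // per_group] for i, unit in enumerate(shuffled)
    }

    # 2. Randomise the order in which measurements are made.
    measurement_order = list(sample_ids)
    rng.shuffle(measurement_order)

    # 3. Blind the operator: codes carry no information about treatment.
    coded = list(sample_ids)
    rng.shuffle(coded)
    codes = {unit: f"S{idx:03d}" for idx, unit in enumerate(coded, start=1)}

    return allocation, measurement_order, codes


units = [f"mouse_{i}" for i in range(1, 7)]  # hypothetical unit names
allocation, order, codes = randomise_and_code(
    units, ["control", "treated"], seed=42
)
for unit in order:
    # The bench sheet shows only the code; the key is kept separately.
    print(codes[unit])
```

The person doing the measurements sees only the codes. The allocation table is kept by someone else until the analysis is done.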
  • 22. Confounding factors: example Plate1 Plate2 Plate3 Control Treatment 1 Treatment 2 RNA Extraction The difference between Control, Treatment 1 and Treatment 2 is confounded by Plate
  • 23. Confounding factors: example 2 Day1, Plate 1 Day2, Plate 2 Day3, Plate 3 Control Treatment 1 Treatment 2 RNA Extraction The difference between Control, Treatment 1 and Treatment 2 is confounded by day and plate.
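One common way out of the confounding shown in these two examples is a blocked design, in which every plate (or day) contains every condition. The sketch below is an illustration only: the function names are hypothetical, and it assumes each plate can hold one sample of each condition.

```python
import random
from collections import defaultdict


def blocked_design(conditions, batches, seed=0):
    """Put every condition in every batch, in a random position."""
    rng = random.Random(seed)
    design = []
    for batch in batches:
        order = list(conditions)
        rng.shuffle(order)  # randomise position within the batch
        design.extend(
            (batch, position, cond)
            for position, cond in enumerate(order, start=1)
        )
    return design


def batches_missing_conditions(design, conditions):
    """Return {batch: [missing conditions]} for batches lacking a condition.

    An empty result means batch and condition can be separated.
    """
    seen = defaultdict(set)
    for batch, _position, cond in design:
        seen[batch].add(cond)
    return {
        batch: sorted(set(conditions) - conds)
        for batch, conds in seen.items()
        if set(conditions) - conds
    }


conditions = ["Control", "Treatment 1", "Treatment 2"]
plates = ["Plate1", "Plate2", "Plate3"]

# The layout from the slide: one condition per plate, fully confounded.
confounded = [
    ("Plate1", 1, "Control"),
    ("Plate2", 1, "Treatment 1"),
    ("Plate3", 1, "Treatment 2"),
]
balanced = blocked_design(conditions, plates, seed=1)

print("confounded layout:", batches_missing_conditions(confounded, conditions))
print("blocked layout:   ", batches_missing_conditions(balanced, conditions))
```

With the blocked layout, plate (and day) can be included as a factor in the analysis instead of being indistinguishable from the treatment.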
  • 24. Sources of potential bias Issues can arise if any of the experimental steps are applied in a systematically biased way between sample and control. e.g. During: - Sample collection procedure - Molecular biology during the main experiment (e.g. ChIP) - Library prep - Sequencing (different lanes on same flowcell is ok, but different flow cells can produce different results)
  • 25. http://www.the-scientist.com/blog/display/57558/ • A GWAS of 800 centenarians against controls found 150 SNPs which can predict if a person is a centenarian with 77% accuracy. • Problem: they used different SNP chips for centenarians vs controls. • Retracted in 2011 after an independent lab reviewed the data and applied QC.
  • 26. NGS Experimental Design • Importance • Specificity and power • Accuracy, precision and bias • Considerations for sample collection • Considerations for sequencing • Conclusions
  • 27. Samples to collect • Adequate minimum number of replicates for an exploratory study = 3. • Controls are also essential. • Biological replicates should be performed - technical reps will not capture variation present in the tissues / organisms. –> This means you need at least 3 biological replicate samples, and 3 biological replicate controls.
  • 28. Biological or technical replicates? [Figure: RNA-seq processing workflow with the stages Choose samples, Extract RNA, Quality control, Convert to cDNA, Quality control, Library prep, Washing, Sequencing, Analysis; one stage is labelled "Biological Replication" and four are labelled "Technical Replication"]
  • 29. Biological replicate tricky cases • Some biological replicate cases are obvious: e.g. tissue samples from different mice, blood samples from different people, cell cultures from different people. • Others are less clear cut: • e.g. Experiments using a single cell line
  • 32. Cell line replicates [Figure: treatment-to-results designs labelled "Bad example", "Better" and "Even better"]
  • 33. Cell line replicates [Figure: treatment-to-results designs labelled "Bad example", "Better" and "Even better"] None of these are biological replicates - but we do the best we can.
  • 34. Performing cell line replicates • The aim is to assess the replicability of a given experiment, by performing it independently • This means ideally they should be done on different days, with fresh medium and reagents etc. However, this might be impractical. • It is at least important to perform the treatments independently. e.g. To get 4 replicates, do 4 separate transfections (or RNAi etc) for the experimental sample, and 4 for the control.
  • 35. Experiment Controls • A very important part of your experiment, because it’s very difficult to eliminate all of the possible confounding variables & bias. • Designing the experiment with controls in mind is often more crucial than determining the independent variable. • Controls increase the statistical validity of your data. • Two types: positive and negative.
  • 36. Experimental Controls • Placebo control: mimics a procedure or treatment without the actual use of the procedure or test substance. e.g. same surgical procedure but without X implanted. • Vehicle control: used in studies where a substance is used to deliver an experimental compound. e.g. apply EtOH to cell lines on its own as a control, since it’s used as the vehicle for delivering Tamoxifen. • Baseline biological state: e.g. wild type, to observe the effects of a knockout.
  • 37. ChIP-seq Controls • What controls to use – Either input DNA or IgG antibody are commonly used. Both have pros and cons. • Replicates - The control should have as many replicates as the sample, as it is also variable. • Particularly important - ChIP reactions are extremely noisy and variable.
  • 38. Should I pool samples? When working with very small amounts of tissue sample, pooling is a good way of reducing the noise while keeping n reasonably small (where n = no. of sequencing reactions). • Better to pool the biological material (tissue, cells), not the purified RNA or labeled cDNA. In this way, problems are far easier to spot. • However, there’s no way to estimate variation between individuals in a pool, which is sometimes important and often interesting. • Beware of outliers - don’t include any sample that looks suspicious. • In some studies (pooled vs unpooled designs) the majority of differentially expressed genes (DEGs) turn out to be extreme in only one individual.
  • 39. NGS Experimental Design • Importance • Specificity and power • Accuracy, precision and bias • Considerations for sample collection • Considerations for sequencing • Conclusions
  • 40. Basic NGS parameters • Library size / sequencing depth: the number of reads obtained for a given sample (e.g. 20 million) • Read length: bp length of each read in the library (e.g. 150 bp) • Single end vs paired end: single end sequencing only sequences one end of each DNA fragment, while paired end sequences both ends.
  • 41. How deeply should you sequence? • Depends on the application: e.g. some parts of the genome will be more tricky to sequence, so full coverage for genome assembly may require very deep sequencing. • RNA-seq differential expression: very roughly (for human and mouse), at least 10 million reads per sample are required, and 30 million reads per sample should suit most applications. You can sequence deeper if lowly expressed genes are particularly important. • If in doubt: you can run a pilot experiment and see how much coverage and saturation you get, and potentially sequence further after that.
  • 42. Sequencing depth vs replicates Replicates are good! Even if you don’t sequence any more deeply. A study looking at RNA-seq (Liu et al. 2013) found that adding more replicates at 10 million reads library size was a more cost-efficient way of improving the power of the experiment than adding more sequencing coverage. This was true for 3-7 replicates. So, if you don’t have money for a lot of sequencing, you can still multiplex a number of biological replicates!
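The trade-off between depth and replicates can be made concrete with a little arithmetic. In the sketch below, the lane yield of 300 million reads is a hypothetical figure chosen for illustration (check your own instrument and kit), and the function names are assumptions. The 10 and 30 million read depths come from the slides.

```python
import math


def samples_per_lane(reads_per_lane, reads_per_sample):
    """How many multiplexed samples fit in one lane at a target depth."""
    return reads_per_lane // reads_per_sample


def lanes_needed(n_samples, reads_per_lane, reads_per_sample):
    """Lanes required to give every sample the target depth."""
    per_lane = samples_per_lane(reads_per_lane, reads_per_sample)
    if per_lane == 0:
        raise ValueError("target depth exceeds the yield of a single lane")
    return math.ceil(n_samples / per_lane)


LANE_YIELD = 300_000_000  # hypothetical reads per lane, for illustration only

# Two conditions, 3 vs 7 biological replicates each, at 10M vs 30M reads.
for replicates in (3, 7):
    n_samples = 2 * replicates
    for depth in (10_000_000, 30_000_000):
        lanes = lanes_needed(n_samples, LANE_YIELD, depth)
        print(f"{replicates} reps/condition at {depth // 1_000_000}M reads: "
              f"{n_samples} samples, {lanes} lane(s)")
```

Under these assumed numbers, seven replicates per condition at 10 million reads fit in the same single lane as three replicates at 30 million, which is the kind of comparison the slide is pointing to.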
  • 43. What read length? • The read length should match your sample fragment length: you want it to be around the size of, and just a bit shorter than, your fragments. • Recommended for Illumina: – sRNAseq, ribosome profiling = 50 bp read length – mRNAseq (fragment size ~80-200 bp) = 150 bp read length – longer fragments = 250 bp • For longer reads / niche applications: other sequencers also exist. PacBio has an impressive read length (often >10 kb)
  • 44. Single end vs paired end • Single end: a bit cheaper, less sequencing required. Suitable for most general purposes, such as differential expression analysis. • Paired end: contains more length and positional information about the sequenced fragment - can tell where it starts and stops. • Applications for which you need paired end sequencing: - splice junctions - rearrangements: indels and inversions
  • 45. NGS Experimental Design • Importance • Specificity and power • Accuracy, precision and bias • Considerations for sample collection • Considerations for sequencing • Conclusions
  • 46. Summary • Experimental design is key to a successful genomics experiment. • The required power of an experiment should be considered when choosing the number of replicates. • An adequate number of biological replicates (usually >=3) is crucial for a reliable statistical analysis • Appropriate controls are essential • Systematic bias can completely mess up an experiment and should be controlled
  • 47. Acknowledgements Thank you to Ela and everyone in the Frye lab! Iwona Driskell Sandra Blanco Shobbir Hussain Joana Flores Martyna Popis Roberto Bandiera Abdul Sajini Mahalia Page Nikoletta Gkatza Carolin Witte And also the following colleagues for excellent experimental design discussions: Sabine Dietmann Patrick Lombard Meryem Ralser Bettina Fischer (CSBC) Sarah Carl (CSBC) Duncan Odom (CRUK CRI) Audrey Fu (University of Chicago) And thank you all for listening! I would particularly like to thank Roslin Russell (CRUK) for letting me use her course material slides as a basis for preparing this talk