COMPREHENSIVE CATALOG OF
STATISTICAL FORMULAE,
ALGORITHMS AND SOFTWARE – A
STEP TOWARDS GOOD STATISTICS
PRACTICE IN FORENSIC GENETICS
Nikita N. Khromov-Borisov,
Andrew G. Smolyanitsky
Forensic Medicine Bureau of Leningrad District
Saint Petersburg, Russia
Nikita.KhromovBorisov@gmail.com
Andrew.Smolyanitsky@yandex.ru
Quotations of the day
If your experiment requires statistics,
then you ought to have done a better experiment
Ernest Rutherford
Statistical thinking will one day be as necessary
for efficient citizenship as the ability to read and write
Herbert George Wells
Those who ignore Statistics are condemned to reinvent it
Bradley Efron
If Experimentation is the Queen of the Sciences,
then Statistical Methods must be regarded
as Guardians of the Royal Virtue
Myron Tribus
GSP – Good Statistics
Practice is what we need
Obviously, in their turn, statistical methods must be blameless and
perfect
So there is an urgent need for a comprehensive catalog of
carefully checked and approved formulae,
as well as the corresponding algorithms and software.
Unfortunately, some formulae were first published with errors, which are
then reproduced in subsequent sources.
Example of corrections:
Clayton T. M., Foreman L. A., Carracedo A.
FORENSIC SCIENCE INTERNATIONAL
Vol. 125: No. 2-3 p. 284-284, 2002
Motherless case in paternity testing by Lee et al.
Elements of the Statistical Design of
Experiment
Some formulae are in rare use or forgotten.
Example: Chakraborty’s formula for the sample size required
to reach the representativeness (reliability, “saturation”) of
the reference population samples. Human Biology 64 (1992)
141-159:
$$N_{\min} = \frac{\ln\left[1 - (1 - \alpha)^{1/r}\right]}{4 \ln(1 - P_{\min})}$$
Nmin - minimum number of independent individuals to be analyzed,
α - probability of error,
r - number of alleles revealed by the system,
Pmin - minimum allele frequency.
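The formula above can be evaluated directly. The following is a minimal sketch, not the authors' code: the function name `chakraborty_min_sample_size` is hypothetical, and the formula is taken exactly as printed on the slide (including the factor 4 in the denominator).

```python
from math import ceil, expm1, log, log1p


def chakraborty_min_sample_size(p_min, r, alpha):
    """N_min = ln[1 - (1 - alpha)^(1/r)] / (4 ln(1 - p_min)), as printed on the slide."""
    # 1 - (1 - alpha)^(1/r), computed in a numerically stable way for small alpha
    tail = -expm1(log1p(-alpha) / r)
    return log(tail) / (4.0 * log1p(-p_min))


# Evaluate the two corner cases used for each row of the sample-size table:
# (r = 2, alpha = 0.001) and (r = 25, alpha = 0.0001).
for p_min in (0.01, 0.005, 0.001):
    low = chakraborty_min_sample_size(p_min, r=2, alpha=0.001)
    high = chakraborty_min_sample_size(p_min, r=25, alpha=0.0001)
    print(p_min, ceil(low), ceil(high))
```

The two corner cases per row should land close to the ranges quoted in the table of minimum sample sizes, which is a quick way to check a transcription of the formula.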
Minimum sample sizes required
for the reference population data
Minimum allele    No. of     Error              Sample size,
frequency         alleles                       No. of individuals
Pmin              r          α                  Nmin
0.01              2 - 25     0.001 - 0.0001     190 - 310
0.005             2 - 25     0.001 - 0.0001     380 - 620
0.001             2 - 25     0.001 - 0.0001     1900 - 3100
Template for paternity testing
Mother          Child   Tested man                 PI
JK              JK      JJ, JK                     1/(pJ + pK)
JK              JK      JW                         0.5/(pJ + pK)
JJ              JJ      JJ                         1/pJ
JK              JJ      JJ                         1/pJ
KK              JK      JJ                         1/pJ
KW              JK      JJ                         1/pJ
(same pairs)            heterozygous JK, JW, JZ    0.5/pJ
Obligate paternal alleles are shown in red.
False genotyping is excluded.
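The paternity index (PI) entries in the template can be checked by enumerating Mendelian transmissions. This is a sketch under stated assumptions: Hardy-Weinberg proportions, no mutation, no genotyping error, and PI defined as P(child | mother, tested man is the father) / P(child | mother, random man). The function names and the allele frequencies are hypothetical.

```python
def transmit(genotype):
    """Return {allele: probability} for a gamete from a two-letter genotype such as 'JK'."""
    probs = {}
    for allele in genotype:
        probs[allele] = probs.get(allele, 0.0) + 0.5
    return probs


def paternity_index(mother, child, man, freq):
    """PI for a mother-child-tested man trio; freq maps allele -> population frequency."""
    target = sorted(child)
    x = 0.0  # P(child | mother, tested man is the father)
    y = 0.0  # P(child | mother, father is a random man)
    for m_allele, m_prob in transmit(mother).items():
        for f_allele, f_prob in transmit(man).items():
            if sorted((m_allele, f_allele)) == target:
                x += m_prob * f_prob
        for f_allele, f_freq in freq.items():
            if sorted((m_allele, f_allele)) == target:
                y += m_prob * f_freq
    # y is zero only if the mother and child are themselves incompatible
    return x / y


# Hypothetical allele frequencies, for illustration only
freq = {"J": 0.1, "K": 0.2, "W": 0.3, "Z": 0.4}
print(paternity_index("JK", "JK", "JJ", freq), 1 / (freq["J"] + freq["K"]))
print(paternity_index("KK", "JK", "JW", freq), 0.5 / freq["J"])
```

Each printed pair compares the enumerated value with the closed-form entry from the template, so a mismatch would flag a transcription error in either.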
C. C. Li and A. Chakravarti
alternatives
Paternity probability based on Nonexclusion
$$W = \frac{P_0}{P_0 + (1 - P_0)(1 - E_1)\cdots(1 - E_t)}$$
Ei – probability of exclusion for i-th test
P0 can be estimated from long-term records
Am. J. Hum. Genet. 43 (1) 197-205 (1988)
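A short sketch of the nonexclusion formula, assuming that P0 plays the role of the prior probability of paternity and that the t tests are independent. The function name and the numbers in the example are hypothetical.

```python
from math import prod


def nonexclusion_paternity_probability(p0, exclusion_probs):
    """W = P0 / (P0 + (1 - P0) * product over tests of (1 - E_i))."""
    # Probability that a non-father escapes exclusion in every test
    not_excluded = prod(1.0 - e for e in exclusion_probs)
    return p0 / (p0 + (1.0 - p0) * not_excluded)


# Hypothetical values: P0 = 0.5 and three tests with modest exclusion power
print(nonexclusion_paternity_probability(0.5, [0.5, 0.6, 0.7]))
```

With no tests at all the product is 1 and W reduces to P0, which is a useful sanity check.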
Coincidental DNA matches
Match probability as a property of a locus:
$$M_0 = 2\left(\sum_i p_i^2\right)^2 - \sum_i p_i^4$$
First principles: no prior knowledge is required
Li C.C. Hum. Biol. 68 (1996) 167-184
Won the Gabriel W. Lasker Award as the best paper
of the year in Human Biology
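The locus match probability can be verified against a brute-force sum of squared genotype frequencies. This sketch assumes Hardy-Weinberg proportions; the function names and the allele frequencies are hypothetical.

```python
from itertools import combinations_with_replacement


def match_probability(freqs):
    """M0 = 2 (sum p_i^2)^2 - sum p_i^4."""
    s2 = sum(p ** 2 for p in freqs)
    s4 = sum(p ** 4 for p in freqs)
    return 2.0 * s2 ** 2 - s4


def match_probability_by_enumeration(freqs):
    """Sum of squared genotype frequencies under Hardy-Weinberg proportions."""
    total = 0.0
    for i, j in combinations_with_replacement(range(len(freqs)), 2):
        genotype_freq = freqs[i] ** 2 if i == j else 2.0 * freqs[i] * freqs[j]
        total += genotype_freq ** 2
    return total


freqs = [0.2, 0.3, 0.5]  # hypothetical allele frequencies
print(match_probability(freqs), match_probability_by_enumeration(freqs))
```

Both functions should agree, because the sum over heterozygotes of (2 p_i p_j)^2 equals 2[(sum p_i^2)^2 - sum p_i^4].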
Rare allele frequency
estimation
Commonly, from a reference sample: p_i = n_i/N
When, however, a stain and suspect are independent homozygotes A_iA_i, then
p_i = (n_i + 4)/(N + 4)
If they are independent heterozygotes A_iA_j, then
p_i = (n_i + 2)/(N + 2) and p_j = (n_j + 2)/(N + 2)
Let the size of a reference sample be N = 1000 and the count of a rare
allele A_i be n_i = 1, so p_i = n_i/N = 0.001 and p_ii = 0.000001
If a stain and suspect are homozygotes A_iA_i, then p_i becomes
p_i = (1 + 4)/(1000 + 4) ≈ 0.005 and p_ii = 0.000025
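The worked example can be reproduced in a few lines. This sketch implements the adjustments exactly as written on the slide (adding 4 for homozygotes and 2 for heterozygotes to both numerator and denominator); the function name is hypothetical.

```python
def adjusted_allele_frequency(n_i, n_total, homozygous):
    """Allele frequency after adding the stain and suspect alleles, as given on the slide."""
    added = 4 if homozygous else 2
    return (n_i + added) / (n_total + added)


n_i, n_total = 1, 1000
naive = n_i / n_total
adjusted = adjusted_allele_frequency(n_i, n_total, homozygous=True)
# Homozygote genotype frequency is the squared allele frequency
print(naive, naive ** 2)
print(adjusted, adjusted ** 2)
```

For a rare allele the adjustment raises the homozygote frequency by more than an order of magnitude, which is the point of the slide's example.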
Forensic genetics software
Allelix http://www.allelix.net/
BDgen dbgen@yahoo.com.ar
DNAdacto, mDNAbase gavriley@krinc.ru
DNAmix, EasyDNA: EasyPA, EasyPAnt, EasyIN, EsayMISS/EasyKIN
http://www.hku.hk/statistics/staff/wingfung/countdown/dnamix.html
DNAmix2
ftp://statgen.ncsu.edu/pub/storey/DNAMIXv2/dos/dnamix2.exe
DNA-view http://dna-view.com/
EasyPat, Patern
http://www.uni-kiel.de/medinfo/mitarbeiter/krawczak/download/index.html
Familias
http://www.math.chalmers.se/~mostad/familias/familias.zip
FCalc bolon@caltech.edu www.its.caltech.edu/~bolon
GRAPE serge@star.net.
Identity, NewPat5 DadShare
http://www.zoo.cam.ac.uk/zoostaff/amos/
PARENTE http://www2.ujf-grenoble.fr/leca/download/PARENTE/PARENTE.zip
PATER2 spena@dcc.ufmg.br
PATRI
http://www.bscb.cornell.edu/Homepages/Rasmus_Nielsen/mdiv/mdiv.exe
PedCheck
http://watson.hgen.pitt.edu/register/docs/pedcheck.html
PowerMarker http://152.14.14.57/
PowerStats http://www.promega.com/geneticidtools/
ProbMax http://www.uoguelph.ca/~rdanzman/software/
Relative hhg2@columbia.edu
SPUR nina.fukshansky@gmx.de
STRLab http://strlab.co.za/
Population genetics software
Arlequin http://anthro.unige.ch/arlequin
CERVUS http://helios.bto.ed.ac.uk/evolgen
Con~Struct andy.overall@ed.ac.uk
FSTAT http://www.unil.ch/izea/softwares/fstat.html
FSTMET, HWMET http://www.reading.ac.uk/~snsbalng/
GDA http://lewis.eeb.uconn.edu/lewishome/software.html
GEN lazzeroni@stanford.edu
GENEPOP ftp://ftp.cefe.cnrs-mop.fr/genepop
GENEPOP on Web
http://wbiomed.curtin.edu.au/genepop/index.html
GENETIX http://www.univ-montp2.fr/~genetix/genetix.htm
GeneKonv http://www.rrz.uni-hamburg.de/OekoGenetik/software.htm
HWE
http://www.biology.ualberta.ca/old_site/jbrzusto/hwenj.html
PopGen32 http://www.ualberta.ca/~fyeh
Population http://www.cnrs-gif.fr/pge/bioinfo/populations/index.php?lang=fr
PowerMarker http://www.powermarker.net
PowerStats http://www.promega.com/geneticidtools/
TFPGA http://bioweb.usu.edu/mpmbio/tfpga.htm
Software online
Allelix http://www.allelix.net/
GENEPOP on Web
http://wbiomed.curtin.edu.au/genepop/index.html
ProfilerPlus Random Match Probability Calculator
http://www.csfs.ca/pplus/profiler.htm
Different tests (even exact) can lead to
different conclusions
Locus: vWA, Russian Caucasians, Hardy-Weinberg equilibrium test
Test                                      P-value or CL     Software
χ2                                        0.106             ChiHW, GDA, PowerMarker, etc.
Corrected χ2                              0.092             GEN
Fisher’s probability, Guo-Thompson alg.   0.026             Arlequin, GDA, GENEPOP, HWE, TFPGA, etc.
G2 asympt.                                0.163             POPGENE
Fis                                       0.141             FSTAT, GENETIX
Fis, 95% cred. lim.                       -0.044, -0.015    HWMET
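To illustrate how two closely related tests can disagree on the same data, the sketch below computes the Hardy-Weinberg chi-square test with and without a continuity correction. Everything here is an assumption for illustration: the locus is biallelic (vWA is not), the genotype counts are invented, and the Yates-type correction is only one possible form of a "corrected" chi-square, not necessarily the one used by GEN.

```python
from math import erfc, sqrt


def hwe_chi_square(n_aa, n_ab, n_bb, continuity=False):
    """Chi-square test of Hardy-Weinberg proportions at a biallelic locus (1 df)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)  # all must be > 0
    shift = 0.5 if continuity else 0.0
    stat = sum(max(abs(o - e) - shift, 0.0) ** 2 / e
               for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with 1 degree of freedom
    p_value = erfc(sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical genotype counts, not the data behind the table
print(hwe_chi_square(30, 40, 30))
print(hwe_chi_square(30, 40, 30, continuity=True))
```

With counts like these the uncorrected and corrected P-values fall on opposite sides of 0.05, so the conclusion depends on the choice of test.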
Algorithms
Modern software implements exact nonparametric
approaches and modern Bayesian ideology and
methodology.
Their implementation requires sophisticated
computational algorithms and facilities.
This raises new problems, e.g.
the convergence of procedures
based on Markov chain Monte Carlo (MCMC)
algorithms.
familiarize yourself with the method,
including convergence diagnostics
K. L. Ayres, D. J. Balding
P-values produced by the MCMC procedure
depend on the number of
randomization steps:
10^4 steps: P = 0.7815 ± 0.0008
10^5 steps: P = 0.2681 ± 0.0005
10^6 steps: P = 0.373 ± 0.012
10^7 steps: P = 0.424 ± 0.006
10^8 steps: P = 0.460 ± 0.003
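The dependence of a simulated P-value on the number of steps can be explored with a much simpler procedure than MCMC. The sketch below is a plain Monte Carlo permutation test for heterozygote deficiency at a biallelic locus, swapped in for the Guo-Thompson Markov chain algorithm; it is not the procedure that produced the figures above. The genotype counts, function name, and seed are hypothetical, and the binomial standard error is valid only because the shuffles are independent, which is not true of successive MCMC states.

```python
import random
from math import sqrt


def mc_het_deficiency_p(n_aa, n_ab, n_bb, steps, seed=0):
    """Monte Carlo P-value for heterozygote deficiency, with its binomial standard error."""
    rng = random.Random(seed)
    alleles = ["A"] * (2 * n_aa + n_ab) + ["B"] * (2 * n_bb + n_ab)
    hits = 0
    for _ in range(steps):
        rng.shuffle(alleles)
        # Pair consecutive alleles into genotypes and count heterozygotes
        het = sum(1 for k in range(0, len(alleles), 2)
                  if alleles[k] != alleles[k + 1])
        if het <= n_ab:
            hits += 1
    p = hits / steps
    return p, sqrt(p * (1.0 - p) / steps)


# Rerun with increasing numbers of steps and compare the estimates
for steps in (100, 1000, 10000):
    print(steps, mc_het_deficiency_p(30, 40, 30, steps))
```

If the estimates at successive step counts differ by much more than their reported errors, as in the list above, the run has not converged and the P-value should not be trusted.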
Conclusion
First principle of GSP
It should be good statistics practice
to analyze the data with different
statistical methods and investigate
their consistency.
Acknowledgements
We thank Drs. Karen L. Ayres and David J. Balding,
Laura C. Lazzeroni and Kenneth Lange, and John
Brzustowski
for kindly supplying the executables of their
programs (HWMET, GEN and HWE, respectively).
Many thanks to them and Drs. Angel Carracedo,
Laurent Excoffier, Jerome Goudet, Kejun Liu,
Tristan Marshall, Mark P. Miller, Eleanor Morgan,
Michel Raymond, Francois Rousset, Hans-Georg
Scheil, Bruce S. Weir and Dmitri Zaykin,
the authors of other programs and papers used in
this study, for helpful and fruitful discussion.
Sincere thanks
Drs.
Carsten Hohoff
Edwin Ehrlich
Kurt Trübner
for the invitation, help and financial
support


  • 1. COMPREHENSIVE CATALOG OF STATISTICAL FORMULAE, ALGORITHMS AND SOFTWARE – A STEP TOWARDS GOOD STATISTICS PRACTICE IN FORENSIC GENETICS Nikita N. Khromov-Borisov,Nikita N. Khromov-Borisov, Andrew G. Smolyanitsky Forensic Medicine Bureau of Leningrad District Saint Petersburg, Russia Nikita.KhromovBorisov@gmail.com Andrew.Smolyanitsky@yandex.ru
  • 2. Quotations of the day If your experiment requires statistics, then you ought to have done a better experiment Ernest Rutherford Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and writefor efficient citizenship as the ability to read and write Herbert George Wells Those who ignore Statistics are condemned to reinvent it Bradley Efron If Experimentation is the Queen of the Sciences, then Statistical Methods must be regarded as Guardians of the Royal Virtue Myron Tribus
  • 3. GSP – Good Statistics Practice is what we need Obviously, in their turn, statistical methods must be blameless and perfect So there is an urgent need for the comprehensive catalog of carefully checked and approved formulae as well as corresponding algorithms and software.as well as corresponding algorithms and software. Unfortunately, some of them are published initially with errors which are reproduced in subsequent sources. Example of corrections: Clayton T. M., Foreman L. A., Carracedo A. FORENSIC SCIENCE INTERNATIONAL Vol. 125: No. 2-3 p. 284-284, 2002 Motherless case in paternity testing by Lee et al.
  • 4. Elements of the Statistical Design of Experiment. Some formulae are rarely used or forgotten. Example: Chakraborty’s formula for the sample size required to reach representativeness (reliability, “saturation”) of the reference population samples, Human Biology 64 (1992) 141-159: $N_{\min} = \dfrac{\ln\left[1 - (1 - \alpha)^{1/r}\right]}{4 \ln(1 - P_{\min})}$, where $N_{\min}$ is the minimum number of independent individuals to be analyzed, $\alpha$ is the probability of error, $r$ is the number of alleles revealed by the system, and $P_{\min}$ is the minimum allele frequency.
  • 5. Minimum sample sizes required for the reference population data
      Columns: minimum allele frequency (Pmin); number of alleles (r); error (α); sample size, number of individuals (Nmin)
      Pmin     r        α                 Nmin
      0.01     2 - 25   0.001 - 0.0001    190 - 310
      0.005    2 - 25   0.001 - 0.0001    380 - 620
      0.001    2 - 25   0.001 - 0.0001    1900 - 3100
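
The tabulated ranges can be checked by evaluating the formula from slide 4 directly. The following is a minimal Python sketch, not the authors' software: the function name `chakraborty_min_sample` and its parameter names are hypothetical, and the factor 4 in the denominator is taken from the slide as written.

```python
import math

def chakraborty_min_sample(alpha, r, p_min):
    """Minimum number of individuals, using the formula as written on slide 4.

    alpha : probability of error
    r     : number of alleles revealed by the system
    p_min : minimum allele frequency
    """
    numerator = math.log(1.0 - (1.0 - alpha) ** (1.0 / r))
    denominator = 4.0 * math.log(1.0 - p_min)
    return numerator / denominator

# Smallest case (alpha = 0.001, r = 2) and largest case (alpha = 0.0001, r = 25)
# for each minimum allele frequency listed on slide 5.
for p_min in (0.01, 0.005, 0.001):
    low = chakraborty_min_sample(alpha=0.001, r=2, p_min=p_min)
    high = chakraborty_min_sample(alpha=0.0001, r=25, p_min=p_min)
    print(f"Pmin = {p_min}: about {low:.1f} to {high:.1f} individuals")
```

For Pmin = 0.01 this gives values just above 189 and 309, which agrees with the tabulated 190 - 310 if the table rounds upward; the other rows agree to similar rounding.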
  • 6. Template for paternity testing. Mother Child Tested man JK JK JJ JK JW PI 1/(pJ +pK) 0.5/(pJ +pK) JJ JJ JJ JWJJ JK KK KW JJ JJ JK JK JJ JJ JJ JJ JK JK JK JW JW JW JW JZ PI 1/pJ 0.5/pJ. Obligate paternal alleles are shown in red. False genotyping is excluded.
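
The four paternity index (PI) expressions in the template can be reproduced from first principles as the ratio P(child | mother, tested man is the father) / P(child | mother, a random man is the father). The sketch below assumes Mendelian transmission, no mutation and no genotyping error (as the slide states). The function names and the allele frequencies are hypothetical illustration values, not data from the presentation.

```python
def transmit(genotype):
    """Return {allele: probability} for the allele a parent passes to a child."""
    a, b = genotype
    return {a: 1.0} if a == b else {a: 0.5, b: 0.5}

def paternity_index(mother, child, tested_man, freq):
    """PI for a mother-child-tested man trio at one locus.

    Genotypes are two-character strings such as "JK";
    freq maps each allele to its population frequency.
    Assumes the child is compatible with the mother (nonzero denominator).
    """
    child_alleles = sorted(child)
    maternal = transmit(mother)
    paternal = transmit(tested_man)
    numerator = 0.0
    denominator = 0.0
    for m_allele, m_prob in maternal.items():
        # Numerator: the tested man supplies the paternal allele.
        for f_allele, f_prob in paternal.items():
            if sorted((m_allele, f_allele)) == child_alleles:
                numerator += m_prob * f_prob
        # Denominator: a random man supplies it with the population frequency.
        for f_allele, f_freq in freq.items():
            if sorted((m_allele, f_allele)) == child_alleles:
                denominator += m_prob * f_freq
    return numerator / denominator

# Hypothetical allele frequencies, for illustration only.
freq = {"J": 0.1, "K": 0.2, "W": 0.3, "Z": 0.4}
print(paternity_index("JK", "JK", "JJ", freq))  # 1/(pJ + pK)
print(paternity_index("JK", "JK", "JW", freq))  # 0.5/(pJ + pK)
print(paternity_index("JJ", "JJ", "JJ", freq))  # 1/pJ
print(paternity_index("JJ", "JJ", "JW", freq))  # 0.5/pJ
```

A tested man who carries no obligate paternal allele gives a numerator of zero, that is, an exclusion under these assumptions.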
  • 7. C. C. Li and A. Chakravarti alternatives. Paternity probability based on nonexclusion: $W = \dfrac{P_0}{P_0 + (1 - P_0)(1 - E_1)\cdots(1 - E_t)}$, where $E_i$ is the probability of exclusion for the $i$-th test. $P_0$ can be estimated from long-term records. Am. J. Hum. Genet. 43 (1) 197-205 (1988)
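
A short numerical illustration of the nonexclusion formula follows. It is a sketch only: the function name and the input values (a prior of 0.5 and two tests with exclusion probability 0.5 each) are hypothetical and chosen to make the arithmetic easy to follow.

```python
def paternity_probability_nonexclusion(p0, exclusion_probs):
    """W = P0 / (P0 + (1 - P0) * (1 - E_1) * ... * (1 - E_t))."""
    non_exclusion = 1.0
    for e in exclusion_probs:
        # Probability that a non-father escapes exclusion in every test so far.
        non_exclusion *= 1.0 - e
    return p0 / (p0 + (1.0 - p0) * non_exclusion)

# Hypothetical inputs: prior P0 = 0.5, two tests with E = 0.5 each.
w = paternity_probability_nonexclusion(0.5, [0.5, 0.5])
print(w)  # 0.5 / (0.5 + 0.5 * 0.25) = 0.8
```

With no tests the function returns the prior P0 unchanged, and each additional nonexcluding test moves W closer to 1.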
  • 8. Coincidental DNA matches. Match probability as a property of a locus: $M_0 = 2\left(\sum_i p_i^2\right)^2 - \sum_i p_i^4$. First principles: no prior knowledge is required. Li C.C. Hum. Biol. 68 (1996) 167-184. Won the Gabriel W. Lasker Award as the best paper of the year in Human Biology.
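
The closed form for M0 can be checked against a direct enumeration of genotypes. The sketch below assumes Hardy-Weinberg genotype proportions and two unrelated individuals, which is an assumption of this example rather than a statement from the slide. The function names and the allele frequencies are hypothetical.

```python
from itertools import combinations

def match_probability_closed_form(freqs):
    """M0 = 2 * (sum p_i^2)^2 - sum p_i^4."""
    s2 = sum(p ** 2 for p in freqs)
    s4 = sum(p ** 4 for p in freqs)
    return 2.0 * s2 ** 2 - s4

def match_probability_by_enumeration(freqs):
    """Sum of squared genotype frequencies under Hardy-Weinberg proportions."""
    homozygotes = sum((p ** 2) ** 2 for p in freqs)
    heterozygotes = sum((2.0 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes

# Hypothetical allele frequencies for a three-allele locus.
freqs = [0.5, 0.3, 0.2]
print(match_probability_closed_form(freqs))
print(match_probability_by_enumeration(freqs))
```

Both functions give 0.2166 (up to floating-point rounding) for these frequencies, since the heterozygote terms sum to $2[(\sum p_i^2)^2 - \sum p_i^4]$.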
  • 9. Rare allele frequency estimation. Commonly, from a reference sample: $p_i = n_i/N$. When, however, a stain and a suspect are independent homozygotes $A_iA_i$, then $p_i = (n_i + 4)/(N + 4)$. If they are independent heterozygotes $A_iA_j$, then $p_i = (n_i + 2)/(N + 2)$ and $p_j = (n_j + 2)/(N + 2)$. Example: let the size of the reference sample be $N = 1000$ and the count of a rare allele $A_i$ be $n_i = 1$, so $p_i = n_i/N = 0.001$ and $p_{ii} = p_i^2 = 0.000001$. If the stain and the suspect are homozygotes $A_iA_i$, then $p_i$ becomes $(1 + 4)/(1000 + 4) \approx 0.005$ and $p_{ii} \approx 0.000025$.
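
The worked example on this slide can be reproduced in a few lines. This is a sketch under the slide's own numbers (N = 1000, n_i = 1); the function name and the `added_copies` parameter are hypothetical labels for the 4 or 2 allele copies contributed by the stain and the suspect.

```python
def adjusted_allele_frequency(n_i, n_total, added_copies):
    """Allele frequency after adding the stain and suspect alleles to the sample.

    added_copies is 4 for two homozygotes AiAi and 2 for two heterozygotes AiAj.
    """
    return (n_i + added_copies) / (n_total + added_copies)

naive = 1 / 1000
homozygous_case = adjusted_allele_frequency(1, 1000, added_copies=4)
heterozygous_case = adjusted_allele_frequency(1, 1000, added_copies=2)

print(f"naive:        p_i = {naive:.6f}, p_ii = {naive ** 2:.8f}")
print(f"homozygous:   p_i = {homozygous_case:.6f}, p_ii = {homozygous_case ** 2:.8f}")
print(f"heterozygous: p_i = {heterozygous_case:.6f}")
```

The adjustment raises the homozygote genotype frequency by a factor of about 25 relative to the naive estimate, which is why it matters most for rare alleles.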
  • 10. Forensic genetics software
      Allelix: http://www.allelix.net/
      BDgen: dbgen@yahoo.com.ar
      DNAdacto, mDNAbase: gavriley@krinc.ru
      DNAmix, EasyDNA (EasyPA, EasyPAnt, EasyIN, EasyMISS/EasyKIN): http://www.hku.hk/statistics/staff/wingfung/countdown/dnamix.html
      DNAmix2: ftp://statgen.ncsu.edu/pub/storey/DNAMIXv2/dos/dnamix2.exe
      DNA-view: http://dna-view.com/
      EasyPat, Patern: http://www.uni-kiel.de/medinfo/mitarbeiter/krawczak/download/index.html
      Familias: http://www.math.chalmers.se/~mostad/familias/familias.zip
      FCalc: bolon@caltech.edu, www.its.caltech.edu/~bolon
      GRAPE: serge@star.net.
  • 11. Forensic genetics software (continued)
      Identity, NewPat5 DadShare: http://www.zoo.cam.ac.uk/zoostaff/amos/
      PARENTE: http://www2.ujf-grenoble.fr/leca/download/PARENTE/PARENTE.zip
      PATER2: spena@dcc.ufmg.br
      PATRI: http://www.bscb.cornell.edu/Homepages/Rasmus_Nielsen/mdiv/mdiv.exe
      PedCheck: http://watson.hgen.pitt.edu/register/docs/pedcheck.html
      PowerMarker: http://152.14.14.57/
      PowerStats: http://www.promega.com/geneticidtools/
      ProbMax: http://www.uoguelph.ca/~rdanzman/software/
      Relative: hhg2@columbia.edu
      SPUR: nina.fukshansky@gmx.de
      STRLab: http://strlab.co.za/
  • 12. Population genetics software
      Arlequin: http://anthro/unige.ch/arlequin
      CERVUS: http://helios.bto.ed.ac.uk/evolgen
      Con~Struct: andy.overall@ed.ac.uk
      FSTAT: http://www.unil.ch/izea/softwares/fstat.html
      FSTMET, HWMET: http://www.reading.ac.uk/~snsbalng/
      GDA: http://lewis.eeb.uconn.edu/lewishome/software.html
      GEN: lazzeroni@stanford.edu
      GENEPOP: ftp://ftp.cefe.cnrs-mop.fr/genepop
      GENEPOP on Web: http://wbiomed.curtin.edu.au/genepop/index.html
      GENETIX: http://www.univ-montp2.fr/~genetix/genetix.htm
      GeneKonv: http://www.rrz.uni-hamburg.de/OekoGenetik/software.htm
      HWE: http://www.biology.ualberta.ca/old_site/jbrzusto/hwenj.html
      PopGen32: Http://www.ualberta.ca/~fyeh
      Population: http://www.cnrs-gif.fr/pge/bioinfo/populations/index.php?lang=fr
      PowerMarker: http://www.powermarker.net
      PowerStats: http://www.promega.com/geneticidtools/
      TFPGA: http://bioweb.usu.edu/mpmbio/tfpga.htm
  • 13. Software online
      Allelix: http://www.allelix.net/
      GENEPOP on Web: http://wbiomed.curtin.edu.au/genepop/index.html
      ProfilerPlus Random Match Probability Calculator: http://www.csfs.ca/pplus/profiler.htm
  • 14. Different tests (even exact) can lead to different conclusions. Locus: vWA, Russian Caucasians, Hardy-Weinberg equilibrium test
      Test                                       P-value or CL     Software
      χ2                                         0.106             ChiHW, GDA, PowerMarker, etc.
      Corrected χ2                               0.092             GEN
      Fisher’s probability, Guo-Thompson alg.    0.026             Arlequin, GDA, GENEPOP, HWE, TFPGA, etc.
      G2 asympt.                                 0.163             POPGENE
      Fis                                        0.141             FSTAT, GENETIX
      Fis, 95% cred. lim.                        -0.044, -0.015    HWMET
  • 15. Algorithms. Modern software implements exact nonparametric approaches and modern Bayesian methodology. Implementing them requires sophisticated computational algorithms and computing resources. This raises new problems, e.g. the convergence of procedures based on Markov chain Monte Carlo (MCMC) algorithms.
  • 16. familiarize yourself with the method, including convergence diagnostics (K. L. Ayres, D. J. Balding). P-values produced by the MCMC procedure depend on the number of randomization steps:
      10^4 steps: P = 0.7815 ± 0.0008
      10^5 steps: P = 0.2681 ± 0.0005
      10^6 steps: P = 0.373 ± 0.012
      10^7 steps: P = 0.424 ± 0.006
      10^8 steps: P = 0.460 ± 0.003
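
To see how a simulated P-value and its reported error depend on the number of randomization steps, one can run a small permutation test of Hardy-Weinberg proportions. The sketch below is plain Monte Carlo with independent shuffles, not the Markov chain procedure whose results are quoted above, so it cannot reproduce the convergence problem itself; it only shows how the binomial standard error shrinks with more steps. The function names, the test statistic (deviation of the heterozygote count from its expectation) and the genotype counts are all hypothetical choices for this illustration.

```python
import math
import random

def heterozygote_count(alleles):
    """Pair consecutive alleles into genotypes and count the heterozygotes."""
    return sum(1 for i in range(0, len(alleles), 2) if alleles[i] != alleles[i + 1])

def monte_carlo_hwe_test(genotypes, n_steps, seed=1):
    """Estimate a permutation P-value and its binomial standard error."""
    rng = random.Random(seed)
    alleles = [allele for genotype in genotypes for allele in genotype]
    total = len(alleles)
    allele_counts = {}
    for allele in alleles:
        allele_counts[allele] = allele_counts.get(allele, 0) + 1
    # Expected heterozygote count when alleles are paired at random.
    p_same = sum(n * (n - 1) for n in allele_counts.values()) / (total * (total - 1))
    expected = len(genotypes) * (1.0 - p_same)
    observed_dev = abs(heterozygote_count(alleles) - expected)
    hits = 0
    for _ in range(n_steps):
        rng.shuffle(alleles)  # one independent random pairing of the alleles
        if abs(heterozygote_count(alleles) - expected) >= observed_dev:
            hits += 1
    p_value = hits / n_steps
    std_error = math.sqrt(p_value * (1.0 - p_value) / n_steps)
    return p_value, std_error

# Hypothetical biallelic sample: 27 AA, 46 AB, 27 BB.
sample = [("A", "A")] * 27 + [("A", "B")] * 46 + [("B", "B")] * 27
for n_steps in (100, 1000, 10000):
    p, se = monte_carlo_hwe_test(sample, n_steps)
    print(f"{n_steps} steps: P = {p:.3f} ± {se:.3f}")
```

With independent shuffles the estimates at different step counts should agree within a few standard errors. The MCMC figures on the slide differ by far more than their quoted errors, which is the kind of discrepancy that convergence diagnostics are meant to catch.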
  • 17. Conclusion. First principle of GSP: it should be good statistics practice to analyze the data with different statistical methods and investigate their consistency.
  • 18. Acknowledgements. We thank Drs. Karen L. Ayres and David J. Balding, Laura C. Lazzeroni and Kenneth Lange, and John Brzustowski for kindly supplying the executables of their programs (HWMET, GEN and HWE, respectively). Many thanks to them and to Drs. Angel Carracedo, Laurent Excoffier, Jerome Goudet, Kejun Liu, Tristan Marshall, Mark P. Miller, Eleanor Morgan, Michel Raymond, Francois Rousset, Hans-Georg Scheil, Bruce S. Weir and Dmitri Zaykin, the authors of other programs and papers used in this study, for helpful and fruitful discussion.
  • 19. Sincere thanks to Drs. Carsten Hohoff, Edwin Ehrlich and Kurt Trübner for the invitation, help and financial support.