Data Balancing for Phenotype Classification Based on SNPs
Marcel Brun (1,2), Virginia Ballarín (1)
(1) Facultad de Ingeniería, Universidad Nacional de Mar del Plata
(2) Agencia Nacional de Promoción Científica y Tecnológica, FONCyT, PICT 2006
1er Congreso Argentino de Bioinformática
Single Nucleotide Polymorphism (SNP)
SNPs
SNPs Categorization
Classification based on SNP data
SNP Data
Combinatorial search
Error tables for combinations of n SNPs as predictors of control/disease
Processed results in HTML pages and Excel datasheets
Calls – SNiPer HD

Excerpt of an error table (the first three columns are the SNP indices of each combination):
35 3 1
1  2  3    0  0    0.423  0.430  0.423  0.248
1  2  4    0  0    0.340  0.430  0.340  0.284
1  2  5    0  0    0.351  0.430  0.351  0.279
1  2  6    0  0    0.205  0.430  0.205  0.342
1  2  7    0  0    0.323  0.430  0.323  0.291
1  3  4    0  0    0.344  0.430  0.344  0.282
1  3  5    0  0    0.328  0.430  0.328  0.289
1  3  6    0  0    0.333  0.430  0.333  0.287
1  3  7    0  0    0.314  0.430  0.314  0.295
1  4  5    0  0    0.267  0.430  0.267  0.315
1  4  6    0  0    0.130  0.430  0.130  0.374
1  4  7    0  0    0.267  0.430  0.267  0.315
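The combinatorial search named on this slide can be sketched as follows. This is only an illustration under stated assumptions: the toy genotypes and labels are hypothetical, the function names `full_logic_error` and `rank_snp_sets` are invented here, and the score is a plain resubstitution error of a majority-vote lookup table, which may differ from the error estimate used to produce the table above.

```python
from collections import Counter, defaultdict
from itertools import combinations


def full_logic_error(genotypes, labels, snp_idx):
    """Resubstitution error of a majority-vote lookup table on the chosen SNPs."""
    cells = defaultdict(Counter)
    for row, label in zip(genotypes, labels):
        cells[tuple(row[i] for i in snp_idx)][label] += 1
    # Each genotype cell predicts its majority label; the rest are mistakes.
    mistakes = sum(sum(c.values()) - max(c.values()) for c in cells.values())
    return mistakes / len(labels)


def rank_snp_sets(genotypes, labels, n):
    """Score every combination of n SNPs and sort by increasing error."""
    n_snps = len(genotypes[0])
    scored = [
        (full_logic_error(genotypes, labels, idx), idx)
        for idx in combinations(range(n_snps), n)
    ]
    return sorted(scored)


# Hypothetical toy data: 6 samples, 4 SNPs; SNP index 2 determines the label.
genotypes = [
    ("AA", "AB", "BB", "AA"),
    ("AA", "AB", "AA", "AB"),
    ("AB", "BB", "BB", "AA"),
    ("AB", "BB", "AA", "AB"),
    ("BB", "AA", "BB", "BB"),
    ("BB", "AA", "AA", "BB"),
]
labels = ["case", "control", "case", "control", "case", "control"]

ranking = rank_snp_sets(genotypes, labels, 2)
for error, idx in ranking:
    print(idx, round(error, 3))
```

The number of combinations grows quickly with the number of SNPs and the set size, so an exhaustive pass like this is only practical for small n.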
Example: Bovine race classification based on SNPs

Selected SNP sets:
Set   SNP 1        SNP 2        SNP 3
4     BTA-117838   rs29019831   BTA-71641
29    BTA-140710   BTA-160695   rs29024708

Metric        Set 4     Set 29
Error         0.01864   0.00339
Sensitivity   0.989     0.999
Specificity   0.974     0.995
NPV           0.994     0.999
PPV           0.957     0.991
FNR           0.011     0.00098
FPR           0.022     0.0046
Mean TN       38.14     38.46
Mean TP       19.76     20.34
Mean FN       0.22      0.02
Mean FP       0.88      0.18
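For readers who want to relate the rate columns to the mean counts, here is a minimal sketch using the standard confusion-matrix definitions. The function name `metrics` is an assumption made for this example. Ratios computed from mean counts need not match rates that were averaged over repetitions, so small differences from the printed values are possible.

```python
def metrics(tp, tn, fp, fn):
    """Standard rates derived from (possibly averaged) confusion counts."""
    total = tp + tn + fp + fn
    return {
        "error": (fp + fn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "fpr": fp / (fp + tn),
        "fnr": fn / (fn + tp),
    }


# Mean counts taken from the two columns of the slide's table.
set_29 = metrics(tp=20.34, tn=38.46, fp=0.18, fn=0.02)
set_4 = metrics(tp=19.76, tn=38.14, fp=0.88, fn=0.22)

for name, m in (("29", set_29), ("4", set_4)):
    print(name, {k: round(v, 5) for k, v in m.items()})
```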
Discrete Full Logic for SNP-based classification
Example of decision table: each of the nine (SNP 1, SNP 2) genotype combinations (AA, AB, or BB at each SNP) is assigned an outcome. In the example, four combinations map to Control, three to Case, and two to Unknown.
Estimation of error rate for 2 SNPs from data: Discrete Full Logic
Statistical inference of the optimal function, using multi-resolution to "generalize"
[Tables: training samples S1 to S7 and test samples S8 to S13, each with SNP 1 and SNP 2 genotypes and a Case/Control phenotype; counts of cases and controls per genotype combination; decision table with undetermined cells marked ???, filled in by generalization]
2 mistakes on 6 ⇒ 66% accuracy
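The train-then-test procedure on this slide can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the samples are hypothetical (not the S1 to S13 values of the slide), the names `train_table` and `accuracy` are invented, and unseen genotype cells fall back to a fixed default label instead of the multi-resolution generalization the slide refers to.

```python
from collections import Counter, defaultdict


def train_table(samples):
    """Majority-vote decision table: genotype pair -> most frequent phenotype."""
    votes = defaultdict(Counter)
    for genotype, phenotype in samples:
        votes[genotype][phenotype] += 1
    return {g: c.most_common(1)[0][0] for g, c in votes.items()}


def accuracy(table, samples, default):
    """Fraction of samples classified correctly; unseen cells use `default`."""
    hits = sum(table.get(g, default) == y for g, y in samples)
    return hits / len(samples)


# Hypothetical training set: (SNP 1, SNP 2) genotypes with a phenotype.
train = [
    (("AA", "AB"), "control"),
    (("AA", "AA"), "case"),
    (("AB", "AA"), "control"),
    (("AA", "AA"), "case"),
    (("BB", "AA"), "case"),
    (("BB", "AA"), "case"),
    (("AA", "AB"), "control"),
]
# Hypothetical test set of six samples.
test = [
    (("AA", "AB"), "control"),
    (("AA", "AA"), "case"),
    (("AA", "AA"), "control"),  # mistake: table says case
    (("AB", "BB"), "control"),  # mistake: unseen cell, default is case
    (("BB", "AA"), "case"),
    (("AB", "AA"), "control"),
]

table = train_table(train)
acc = accuracy(table, test, default="case")
print(f"{acc:.0%}")
```

With these toy samples the table makes 2 mistakes on 6 test samples, the same ratio as the slide's example.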
Example of Truth table for a real case

SNP 1   SNP 6   SNP 41   # Case (Freq. Corrected)   # Control (Freq. Corrected)   Predicted
AA      AA      AA       2 (0.3 %)                  0 (0 %)                       Case
AA      AA      AB       0 (0 %)                    0 (0 %)                       Case
AA      AA      BB       0 (0 %)                    0 (0 %)                       Case
AA      AB      AA       28 (3.5 %)                 0 (0 %)                       Case
AA      AB      AB       11 (1.4 %)                 0 (0 %)                       Case
AA      AB      BB       0 (0 %)                    0 (0 %)                       Case
AA      BB      AA       22 (2.8 %)                 3 (0.8 %)                     Case
AA      BB      AB       11 (1.4 %)                 0 (0 %)                       Case
AA      BB      BB       0 (0 %)                    0 (0 %)                       Case
AB      AA      AA       5 (0.6 %)                  8 (2.2 %)                     Control
AB      AA      AB       5 (0.6 %)                  2 (0.5 %)                     Case
AB      AA      BB       1 (0.1 %)                  1 (0.3 %)                     Control
AB      AB      AA       60 (7.6 %)                 14 (3.8 %)                    Case
AB      AB      AB       35 (4.4 %)                 5 (1.4 %)                     Case
AB      AB      BB       3 (0.4 %)                  1 (0.3 %)                     Case
AB      BB      AA       49 (6.2 %)                 11 (3 %)                      Case
AB      BB      AB       31 (3.9 %)                 6 (1.6 %)                     Case
AB      BB      BB       3 (0.4 %)                  2 (0.5 %)                     Control
BB      AA      AA       8 (1 %)                    25 (6.8 %)                    Control
BB      AA      AB       9 (1.1 %)                  4 (1.1 %)                     Case
BB      AA      BB       1 (0.1 %)                  1 (0.3 %)                     Control
BB      AB      AA       41 (5.2 %)                 49 (13.2 %)                   Control
BB      AB      AB       15 (1.9 %)                 11 (3 %)                      Control
BB      AB      BB       2 (0.3 %)                  0 (0 %)                       Case
BB      BB      AA       31 (3.9 %)                 34 (9.2 %)                    Control
BB      BB      AB       21 (2.7 %)                 5 (1.4 %)                     Case
BB      BB      BB       2 (0.3 %)                  3 (0.8 %)                     Control
Total                    396                        185
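The "Freq. Corrected" percentages in this table are consistent with dividing each count by twice its class total (396 cases, 185 controls), so that each class sums to 50 %. The sketch below works under that assumption; the formula is inferred from the printed numbers rather than stated on the slide, and the names `corrected_freq` and `predict`, as well as the tie rule (ties go to Case, as in the rows with zero counts), are likewise assumptions.

```python
TOTAL_CASE = 396      # class totals from the last row of the table
TOTAL_CONTROL = 185


def corrected_freq(count, class_total):
    """Assumed correction: joint frequency with each class rescaled to weight 0.5."""
    return count / (2 * class_total)


def predict(n_case, n_control):
    """Pick the class with the larger corrected frequency (ties go to Case)."""
    f_case = corrected_freq(n_case, TOTAL_CASE)
    f_control = corrected_freq(n_control, TOTAL_CONTROL)
    return "Case" if f_case >= f_control else "Control"


# Row BB / AB / AB: 15 cases against 11 controls.
print(round(100 * corrected_freq(15, TOTAL_CASE), 1))     # case frequency in %
print(round(100 * corrected_freq(11, TOTAL_CONTROL), 1))  # control frequency in %
print(predict(15, 11))
```

The row with 15 cases and 11 controls shows why the correction matters: the raw counts favor Case, while the corrected frequencies (1.9 % against 3 %) favor Control, which is the prediction listed in the table.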
Feature Selection – Sets of 3 SNPs
[Figure: selected SNPs and their error at Step N and at Step N+1]
Now the Issues
Why Balancing
Why Balancing
[Figure: two classifier designs, with labels 48%, 2.5%, 6.6%, 15% and axis marks 0 and 2.9]
Error = 6.6%
Error = 14.2%
Why Balancing
Why is this bad?
Balancing
Balancing on discrete data
Unbalanced data, classifier design

Counts per genotype combination:
SNP 1   SNP 2   # 0   # 1
AA      AA      85    19
AA      AB      0     19
AA      BB      45    8
AB      AA      90    3
AB      BB      90    15
AB      AB      50    19
BB      AA      60    16
BB      BB      10    12
BB      AB      95    9
Total           525   120

Joint frequencies and classifier:
SNP 1   SNP 2   # 0      # 1      Classif
AA      AA      0.1318   0.0295   0
AA      AB      0.0000   0.0295   0
AA      BB      0.0698   0.0124   0
AB      AA      0.1395   0.0047   0
AB      BB      0.1395   0.0233   0
AB      AB      0.0775   0.0295   0
BB      AA      0.0930   0.0248   1
BB      BB      0.0155   0.0186   0
BB      AB      0.1473   0.0140   0
Total           0.814    0.1860
Balancing on discrete data
Scaling: class 0 frequencies X*0.5/0.814, class 1 frequencies X*0.5/0.1860

Unbalanced frequencies, classifier ψ: ERROR = 25 %, FPR = 11 %, FNR = 86 %
SNP 1   SNP 2   # 0      # 1      ψ
AA      AA      0.1318   0.0295   0
AA      AB      0.0000   0.0295   0
AA      BB      0.0698   0.0124   0
AB      AA      0.1395   0.0047   0
AB      BB      0.1395   0.0233   0
AB      AB      0.0775   0.0295   0
BB      AA      0.0930   0.0248   1
BB      BB      0.0155   0.0186   0
BB      AB      0.1473   0.0140   0
Total           0.814    0.1860

Balanced frequencies, classifier ψ_B: ERROR = 34 %, FPR = 23 %, FNR = 45 %
SNP 1   SNP 2   # 0      # 1      ψ_B
AA      AA      0.0810   0.0792   0
AA      AB      0.0000   0.0792   1
AA      BB      0.0429   0.0333   0
AB      AA      0.0857   0.0125   0
AB      BB      0.0857   0.0625   0
AB      AB      0.0476   0.0792   1
BB      AA      0.0571   0.0667   1
BB      BB      0.0095   0.0500   1
BB      AB      0.0905   0.0375   0
Total           0.5000   0.5000
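The rescaling on this slide can be sketched in a few lines. The frequencies are the unbalanced values from the slide, in the order AA/AA, AA/AB, AA/BB, AB/AA, AB/BB, AB/AB, BB/AA, BB/BB, BB/AB; the function names `balance` and `design` are assumptions. The `design` step is a plain plug-in rule (predict class 1 where its balanced frequency is larger), which is an assumption about how ψ_B is obtained.

```python
# Unbalanced joint frequencies per genotype cell (SNP 1 / SNP 2), from the slide.
f0 = [0.1318, 0.0000, 0.0698, 0.1395, 0.1395, 0.0775, 0.0930, 0.0155, 0.1473]
f1 = [0.0295, 0.0295, 0.0124, 0.0047, 0.0233, 0.0295, 0.0248, 0.0186, 0.0140]


def balance(f0, f1):
    """Rescale each class so that it carries total weight 0.5."""
    p0, p1 = sum(f0), sum(f1)
    return [x * 0.5 / p0 for x in f0], [x * 0.5 / p1 for x in f1]


def design(f0, f1):
    """Plug-in rule: predict class 1 wherever its frequency is the larger one."""
    return [1 if b > a else 0 for a, b in zip(f0, f1)]


b0, b1 = balance(f0, f1)
psi_b = design(b0, b1)
print([round(x, 4) for x in b0])
print([round(x, 4) for x in b1])
print(psi_b)
```

With these inputs the rule yields the same 0/1 pattern as the ψ_B column of the balanced table.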
Balancing on discrete data

Classifier ψ applied to the balanced frequencies: ERROR ψ = 49 % (against 34 % of ψ_B)
SNP 1   SNP 2   # 0      # 1      ψ
AA      AA      0.0810   0.0792   0
AA      AB      0.0000   0.0792   0
AA      BB      0.0429   0.0333   0
AB      AA      0.0857   0.0125   0
AB      BB      0.0857   0.0625   0
AB      AB      0.0476   0.0792   0
BB      AA      0.0571   0.0667   1
BB      BB      0.0095   0.0500   0
BB      AB      0.0905   0.0375   0
Total           0.5000   0.5000

Classifier ψ_B applied to the unbalanced frequencies: ERROR ψ_B = 26 % (against 25 % of ψ)
SNP 1   SNP 2   # 0      # 1      ψ_B
AA      AA      0.1318   0.0295   0
AA      AB      0.0000   0.0295   1
AA      BB      0.0698   0.0124   0
AB      AA      0.1395   0.0047   0
AB      BB      0.1395   0.0233   0
AB      AB      0.0775   0.0295   1
BB      AA      0.0930   0.0248   1
BB      BB      0.0155   0.0186   1
BB      AB      0.1473   0.0140   0
Total           0.814    0.1860
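The cross-comparison on this slide can be checked with a short sketch that evaluates each classifier under each distribution. The frequency vectors and the two 0/1 classifiers are copied from the slide in the order AA/AA, AA/AB, AA/BB, AB/AA, AB/BB, AB/AB, BB/AA, BB/BB, BB/AB; the function name `rates` is an assumption.

```python
# Joint frequencies per genotype cell, as printed on the slide.
unb0 = [0.1318, 0.0000, 0.0698, 0.1395, 0.1395, 0.0775, 0.0930, 0.0155, 0.1473]
unb1 = [0.0295, 0.0295, 0.0124, 0.0047, 0.0233, 0.0295, 0.0248, 0.0186, 0.0140]
bal0 = [0.0810, 0.0000, 0.0429, 0.0857, 0.0857, 0.0476, 0.0571, 0.0095, 0.0905]
bal1 = [0.0792, 0.0792, 0.0333, 0.0125, 0.0625, 0.0792, 0.0667, 0.0500, 0.0375]

psi = [0, 0, 0, 0, 0, 0, 1, 0, 0]    # classifier designed on unbalanced data
psi_b = [0, 1, 0, 0, 0, 1, 1, 1, 0]  # classifier designed on balanced data


def rates(classifier, f0, f1):
    """Error, FPR and FNR of a 0/1 lookup classifier under joint frequencies."""
    p0, p1 = sum(f0), sum(f1)
    fp = sum(a for a, d in zip(f0, classifier) if d == 1)  # class 0 called 1
    fn = sum(b for b, d in zip(f1, classifier) if d == 0)  # class 1 called 0
    return {"error": (fp + fn) / (p0 + p1), "fpr": fp / p0, "fnr": fn / p1}


for name, clf in (("psi", psi), ("psi_B", psi_b)):
    for data, (f0, f1) in (("unbalanced", (unb0, unb1)), ("balanced", (bal0, bal1))):
        r = rates(clf, f0, f1)
        print(name, data, {k: round(100 * v, 1) for k, v in r.items()})
```

From the rounded table entries, ψ comes out at about 25 % error on the unbalanced data and 49 % on the balanced data, and ψ_B at about 34 % on the balanced data. For ψ_B on the unbalanced data the rounded entries give roughly 27 %, close to the 26 % printed on the slide.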
Balancing - Simulations
Other Advantages
Other Advantages
[Figure: Non Balanced Design vs. Balanced Design]
Other Advantages
[Figure: Threshold Change vs. Balancing]
Example of Application
Victor L. Boyartchuk, Karl W. Broman, Rebecca E. Mosher, Sarah E.F. D'Orazio, Michael N. Starnbach & William F. Dietrich, "Multigenic control of Listeria monocytogenes susceptibility in mice", Nature Publishing Group, Brief Communications, 2001
Selected SNPs
Survival vs. No Survival Analysis
Results

Design            Error Rate   FPR     FNR
Classic Design    29.7%        50.9%   19.8%
Balanced Design   30.3%        51.4%   9.7%
Conclusions
Acknowledgments
UNMdP: Virginia Ballarin, Mariela Azul Gonzalez
INTA (Balcarce): Pablo Corva
FI-UNER: Inti Anabela Pagnuco
Agencia FONCyT: PICT 2313
TGen: Edward Dougherty, Dietrich Stephan
