Data balancing for phenotype classification based on SNPs
1. Data Balancing for Phenotype Classification Based on SNPs
Marcel Brun 1,2, Virginia Ballarín 1
1 Facultad de Ingeniería - Universidad Nacional de Mar del Plata
2 Agencia Nacional de Promoción Científica y Tecnológica – FONCyT - PICT 2006
1er Congreso Argentino de Bioinformática
5. Classification based on SNP data
SNP data
Combinatorial search
Error tables for combinations of n SNPs as predictors of control/disease
Processed results in HTML pages and Excel datasheets
Calls – SNiPer HD

Error table excerpt:
35 3 1
1 2 3   0 0   0.423 0.430 0.423 0.248
1 2 4   0 0   0.340 0.430 0.340 0.284
1 2 5   0 0   0.351 0.430 0.351 0.279
1 2 6   0 0   0.205 0.430 0.205 0.342
1 2 7   0 0   0.323 0.430 0.323 0.291
1 3 4   0 0   0.344 0.430 0.344 0.282
1 3 5   0 0   0.328 0.430 0.328 0.289
1 3 6   0 0   0.333 0.430 0.333 0.287
1 3 7   0 0   0.314 0.430 0.314 0.295
1 4 5   0 0   0.267 0.430 0.267 0.315
1 4 6   0 0   0.130 0.430 0.130 0.374
1 4 7   0 0   0.267 0.430 0.267 0.315
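The combinatorial search named on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy genotype data, and the use of a resubstitution error from a majority-vote lookup table are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def resubstitution_error(samples, labels, snp_idx):
    """Error of a majority-vote lookup table built on the chosen SNPs."""
    cells = defaultdict(Counter)
    for genotypes, label in zip(samples, labels):
        key = tuple(genotypes[i] for i in snp_idx)
        cells[key][label] += 1
    # Within each genotype cell, samples outside the majority class are errors.
    errors = sum(sum(c.values()) - max(c.values()) for c in cells.values())
    return errors / len(samples)


def rank_combinations(samples, labels, n):
    """Score every combination of n SNPs and sort by increasing error."""
    n_snps = len(samples[0])
    scored = [
        (resubstitution_error(samples, labels, idx), idx)
        for idx in combinations(range(n_snps), n)
    ]
    return sorted(scored)


# Hypothetical toy data: 6 samples, 4 SNPs (not taken from the slides).
samples = [
    ("AA", "AB", "AA", "BB"),
    ("AA", "AA", "AB", "BB"),
    ("AB", "AB", "AA", "AA"),
    ("AB", "AA", "AB", "AA"),
    ("BB", "AB", "AA", "AB"),
    ("BB", "AA", "AB", "AB"),
]
labels = ["control", "case", "control", "case", "control", "case"]

ranking = rank_combinations(samples, labels, 2)
for error, snps in ranking:
    print(snps, round(error, 3))
```

The number of combinations grows quickly with n, so an exhaustive ranking like this is only practical for small sets of SNPs.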
8. Estimation of error rate for 2 SNPs from data – Discrete
Full logic
Statistical inference of the optimal function, using multi-resolution to “generalize”

Train
Sample  SNP 1  SNP 2  Phenotype
S 1     AA     AB     Control
S 2     AA     AA     Case
S 3     AB     AA     Control
S 4     AA     AA     Case
S 5     BB     AA     Case
S 6     BB     AA     Case
S 7     AA     AB     Case
S 7     AA     AB     Control

Decision and generalization
SNP 1  SNP 2  # Cases  # Control  Decision  Generalization
AA     AA     2        0          Case      Case
AA     AB     1        2          Control   Control
AA     BB     0        0          ???       Case
AB     AA     1        0          Case      Case
AB     AB     0        0          ???       Case
AB     BB     0        0          ???       Case
BB     AA     2        0          Case      Case
BB     AB     0        0          ???       Case
BB     BB     0        0          ???       Case

Test
Sample  SNP 1  SNP 2  Phenotype
S 8     AA     AB     Control
S 9     AA     AA     Case
S 10    AB     BB     Control
S 11    AA     AA     Case
S 12    BB     AB     Case
S 13    AA     AA     Control

2 mistakes on 6: 66% accuracy
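The error estimate on this slide can be reproduced with a small sketch. It takes the per-genotype training counts and the six test samples as read from the slide. The multi-resolution generalization itself is not specified in the transcript, so the fallback used here for unobserved genotype pairs (the overall majority class of the training set) is only an assumed stand-in that happens to give the same decisions.

```python
# (SNP 1, SNP 2) -> (# cases, # controls), as read from the slide's count table.
train_counts = {
    ("AA", "AA"): (2, 0),
    ("AA", "AB"): (1, 2),
    ("AB", "AA"): (1, 0),
    ("BB", "AA"): (2, 0),
}


def decide(cell, counts, default):
    """Majority vote within a genotype cell; unobserved cells get the default."""
    cases, controls = counts.get(cell, (0, 0))
    if cases == 0 and controls == 0:
        return default  # assumed stand-in for the slide's generalization step
    return "Case" if cases >= controls else "Control"


# Assumed fallback: the majority class over the whole training set.
total_cases = sum(c for c, _ in train_counts.values())
total_controls = sum(k for _, k in train_counts.values())
default = "Case" if total_cases >= total_controls else "Control"

# Test samples: ((SNP 1, SNP 2), true phenotype).
test_set = [
    (("AA", "AB"), "Control"),
    (("AA", "AA"), "Case"),
    (("AB", "BB"), "Control"),
    (("AA", "AA"), "Case"),
    (("BB", "AB"), "Case"),
    (("AA", "AA"), "Control"),
]

mistakes = sum(
    decide(cell, train_counts, default) != truth for cell, truth in test_set
)
accuracy = 1 - mistakes / len(test_set)
print(mistakes, "mistakes on", len(test_set), "samples")
```

With these inputs the two misclassified test samples are the controls at genotype pairs AB/BB and AA/AA, matching the slide's 2 mistakes on 6.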
9. Example of truth table for a real case

SNP 1  SNP 6  SNP 41  # Control (Freq. Corrected)  # Case (Freq. Corrected)  Predicted
AA     AA     AA      0 (0 %)                      2 (0.3 %)                 Case
AA     AA     AB      0 (0 %)                      0 (0 %)                   Case
AA     AA     BB      0 (0 %)                      0 (0 %)                   Case
AA     AB     AA      0 (0 %)                      28 (3.5 %)                Case
AA     AB     AB      0 (0 %)                      11 (1.4 %)                Case
AA     AB     BB      0 (0 %)                      0 (0 %)                   Case
AA     BB     AA      3 (0.8 %)                    22 (2.8 %)                Case
AA     BB     AB      0 (0 %)                      11 (1.4 %)                Case
AA     BB     BB      0 (0 %)                      0 (0 %)                   Case
AB     AA     AA      8 (2.2 %)                    5 (0.6 %)                 Control
AB     AA     AB      2 (0.5 %)                    5 (0.6 %)                 Case
AB     AA     BB      1 (0.3 %)                    1 (0.1 %)                 Control
AB     AB     AA      14 (3.8 %)                   60 (7.6 %)                Case
AB     AB     AB      5 (1.4 %)                    35 (4.4 %)                Case
AB     AB     BB      1 (0.3 %)                    3 (0.4 %)                 Case
AB     BB     AA      11 (3 %)                     49 (6.2 %)                Case
AB     BB     AB      6 (1.6 %)                    31 (3.9 %)                Case
AB     BB     BB      2 (0.5 %)                    3 (0.4 %)                 Control
BB     AA     AA      25 (6.8 %)                   8 (1 %)                   Control
BB     AA     AB      4 (1.1 %)                    9 (1.1 %)                 Case
BB     AA     BB      1 (0.3 %)                    1 (0.1 %)                 Control
BB     AB     AA      49 (13.2 %)                  41 (5.2 %)                Control
BB     AB     AB      11 (3 %)                     15 (1.9 %)                Control
BB     AB     BB      0 (0 %)                      2 (0.3 %)                 Case
BB     BB     AA      34 (9.2 %)                   31 (3.9 %)                Control
BB     BB     AB      5 (1.4 %)                    21 (2.7 %)                Case
BB     BB     BB      3 (0.8 %)                    2 (0.3 %)                 Control
Total                 185                          396
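The corrected frequencies in this table are numerically consistent with dividing each count by twice its class total (185 controls, 396 cases), that is, with giving both classes an equal prior of 0.5. The sketch below reproduces a few of the percentages and predictions under that reading. The formula is inferred from the numbers rather than stated on the slide, and resolving ties in favor of Case is likewise an assumption based on the empty cells being labeled Case.

```python
TOTAL_CONTROL, TOTAL_CASE = 185, 396  # class totals from the table


def corrected_freq(count, class_total):
    """Cell frequency under equal class priors (inferred formula)."""
    return 0.5 * count / class_total


def predict(n_case, n_control):
    """Pick the class with the larger corrected frequency; ties go to Case."""
    f_case = corrected_freq(n_case, TOTAL_CASE)
    f_control = corrected_freq(n_control, TOTAL_CONTROL)
    return "Case" if f_case >= f_control else "Control"


# A few rows from the table: (SNP 1, SNP 6, SNP 41) -> (# control, # case).
rows = {
    ("AB", "AB", "AA"): (14, 60),
    ("BB", "AB", "AA"): (49, 41),
    ("BB", "AB", "AB"): (11, 15),
    ("AB", "BB", "BB"): (2, 3),
}

for genotype, (n_control, n_case) in rows.items():
    pct_control = round(100 * corrected_freq(n_control, TOTAL_CONTROL), 1)
    pct_case = round(100 * corrected_freq(n_case, TOTAL_CASE), 1)
    print(genotype, pct_control, pct_case, predict(n_case, n_control))
```

The rows BB/AB/AB and AB/BB/BB show the effect of the correction: cases outnumber controls in the raw counts, yet the predicted class is Control once the class sizes are equalized.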
10. Feature Selection – Sets of 3 SNPs
[Figure: selected SNPs and their error at Step N and at Step N+1]
17. Balancing on discrete data
Unbalanced data: classifier design

Counts
SNP 1  SNP 2  # 0  # 1
AA     AA     85   19
Total         525  120
Remaining genotype pairs (SNP 2, SNP 1): AB AA, BB AA, AA AB, BB AB, AB AB, AA BB, BB BB, AB BB
# 0: 0 45 90 90 50 60 10 95
# 1: 19 8 3 15 19 16 12 9

Relative frequencies and classification
SNP 1  SNP 2  # 0     # 1     Classif
AA     AA     0.1318  0.0295  0
Total         0.814   0.1860
Remaining genotype pairs (SNP 2, SNP 1): AB AA, BB AA, AA AB, BB AB, AB AB, AA BB, BB BB, AB BB
# 0: 0.0000 0.0698 0.1395 0.1395 0.0775 0.0930 0.0155 0.1473
# 1: 0.0295 0.0124 0.0047 0.0233 0.0295 0.0248 0.0186 0.0140
Classif: 0 0 0 0 0 1 0 0
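The contrast between an unbalanced and a balanced design for a single genotype cell can be sketched as follows. The class totals (525 and 120) and the AA/AA cell (85 and 19) follow from the slide's counts and relative frequencies. The second cell (60 and 16) is a hypothetical pairing chosen for illustration, and the function name and tie rule (ties go to class 0) are assumptions as well.

```python
TOTAL_0, TOTAL_1 = 525, 120  # class sizes implied by the slide's frequencies


def classify(n0, n1, balanced):
    """Label a genotype cell from its class counts.

    Unbalanced: compare raw counts (equivalently, joint frequencies).
    Balanced:   compare class-conditional frequencies, i.e. equal class priors.
    Ties are assigned to class 0 (an assumption).
    """
    if balanced:
        score_0, score_1 = n0 / TOTAL_0, n1 / TOTAL_1
    else:
        score_0, score_1 = n0, n1
    return 1 if score_1 > score_0 else 0


# AA/AA cell from the slide: 85 samples of class 0, 19 of class 1.
print(classify(85, 19, balanced=False), classify(85, 19, balanced=True))

# Hypothetical cell: 60 samples of class 0, 16 of class 1.
print(classify(60, 16, balanced=False), classify(60, 16, balanced=True))
```

In the hypothetical cell the raw counts favor class 0, but 16 of 120 is a larger share of class 1 than 60 of 525 is of class 0, so the balanced design assigns the cell to class 1.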
24. Example of Application
Victor L. Boyartchuk, Karl W. Broman, Rebecca E. Mosher, Sarah E.F. D’Orazio, Michael N. Starnbach & William F. Dietrich, “Multigenic control of Listeria monocytogenes susceptibility in mice”, Brief Communications, Nature Publishing Group, 2001
Selected SNPs
Survival vs. No Survival analysis
27. Acknowledgments
UNMdP: Virginia Ballarin, Mariela Azul Gonzalez
INTA (Balcarce): Pablo Corva
FI-UNER: Inti Anabela Pagnuco
Agencia FONCyT – PICT 2313
TGen: Edward Dougherty, Dietrich Stephan