Forest Learning based on the Chow-Liu Algorithm
and its Application to Genome Differential Analysis:
A Novel Mutual Information Estimation
Nov. 16-18, 2015
Joe Suzuki
(Osaka Univ.)
@joe_suzuki
Prof-joe
Road Map
• MI Estimation for Discrete (warm-up)
• MI Estimation for Discrete/Continuous (proposed)
• Experiment 1 (gene differential analysis)
• Experiment 2 (combination of SNP and gene)
• Concluding Remarks
How do you estimate MI given data?
For discrete data, a naïve way is to plug the relative frequencies into the definition of MI.
MI Estimation based on MDL (Suzuki, UAI93)
P. Liang and N. Srebro (2004), K. Panayidou (2010), and Edwards et al. (2010) revisited the same idea
Overestimation occurs for I_n
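The contrast between the naïve estimate and an MDL-style estimate can be sketched as follows. This is a minimal illustration, not the author's code: the function names are hypothetical, and the penalty term (α-1)(β-1) log n / (2n), with α and β the numbers of observed values of X and Y, is an assumed form of the MDL correction that the slides do not spell out.

```python
import math
from collections import Counter


def plugin_mi(xs, ys):
    """Naive (plug-in) MI estimate I_n from relative frequencies, in nats."""
    n = len(xs)
    cx, cy, cxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        c / n * math.log(c * n / (cx[x] * cy[y]))
        for (x, y), c in cxy.items()
    )


def mdl_mi(xs, ys):
    """Plug-in MI minus an assumed MDL penalty, clipped at zero."""
    n = len(xs)
    alpha, beta = len(set(xs)), len(set(ys))
    penalty = (alpha - 1) * (beta - 1) * math.log(n) / (2 * n)
    return max(plugin_mi(xs, ys) - penalty, 0.0)
```

Because the penalized value is clipped at zero, weakly dependent pairs can receive an estimate of exactly zero, which is what would let a forest (rather than a spanning tree) emerge in the structure-learning step.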
Chow-Liu (apply Kruskal), by distribution:
• known: approximation to trees with K-L minimum
• unknown: ML spanning trees (Chow-Liu), MDL forests (Suzuki 93)
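The Kruskal step can be sketched as below, assuming the pairwise MI estimates are already available as a dictionary keyed by node pairs. The function name and input layout are hypothetical. Stopping at non-positive weights is one way a forest can arise when the estimates (for example MDL-penalized ones) are allowed to be zero.

```python
def chow_liu_forest(num_nodes, weights):
    """Kruskal's algorithm on MI weights: maximum-weight forest.

    weights: dict mapping (u, v) node pairs to estimated MI values.
    Edges with weight <= 0 are never added, so the result may be a forest.
    """
    parent = list(range(num_nodes))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = []
    for (u, v), w in sorted(weights.items(), key=lambda kv: -kv[1]):
        if w <= 0:
            break  # remaining edges carry no estimated dependence
        ru, rv = find(u), find(v)
        if ru != rv:  # adding the edge does not create a cycle
            parent[ru] = rv
            edges.append((u, v))
    return edges
```

With strictly positive weights on a connected graph this returns a spanning tree; with zero weights some nodes stay disconnected.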
R package bnlearn, data set “Asia” (figure panels: Forest, Spanning Tree)
Correlation ≠ Independence
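A small worked example of this point, under the assumption that the slide refers to zero correlation not implying independence: with X uniform on {-1, 0, 1} and Y = X², the covariance is zero while the MI is positive. The data and function name are illustrative only.

```python
import math
from collections import Counter


def plugin_mi(xs, ys):
    """Plug-in MI estimate from relative frequencies, in nats."""
    n = len(xs)
    cx, cy, cxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        c / n * math.log(c * n / (cx[x] * cy[y]))
        for (x, y), c in cxy.items()
    )


xs = [-1, 0, 1]
ys = [x * x for x in xs]  # Y is a deterministic function of X

n = len(xs)
cov = sum(x * y for x, y in zip(xs, ys)) / n - (sum(xs) / n) * (sum(ys) / n)
mi = plugin_mi(xs, ys)
```

Here `cov` is zero although Y is fully determined by X, while `mi` equals the entropy of Y.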
X: Gaussian, Y: Discrete (Edwards et al. 2010)
close to Naïve Bayes
(ANOVA)
Causal Direction
A Gaussian variable should not lie between discrete variables (Edwards et al. 2010)
Discrete: SNP
Gaussian: Gene Expression
Proposed MI estimation
For each mesh (percentile), estimate the MI based on the quantized data
Choose the maximum MI estimate
n=1000, 8x8
occurrences of X,Y
occurrence of (X, Y)
Does not distinguish between discrete and continuous
For each u,v=1,2,…,s,
Continue to divide each interval in half
(avoid dividing the mass intervals)
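The procedure above can be sketched as follows. This is an illustration under stated assumptions rather than the author's implementation: a mesh at depth (u, v) is taken to mean 2^u by 2^v percentile bins, tied values are kept in one bin so that point masses are not split, and the per-mesh estimate uses an assumed MDL penalty (α-1)(β-1) log n / (2n). All function names are hypothetical.

```python
import math
from bisect import bisect_left
from collections import Counter


def quantize(values, k):
    """Percentile binning into at most k bins; equal values share a bin."""
    ordered = sorted(values)
    n = len(values)
    # bisect_left counts samples strictly below v, so ties map to one bin.
    return [bisect_left(ordered, v) * k // n for v in values]


def mdl_mi(xs, ys):
    """Plug-in MI of the quantized data minus an assumed MDL penalty."""
    n = len(xs)
    cx, cy, cxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    plug_in = sum(
        c / n * math.log(c * n / (cx[x] * cy[y]))
        for (x, y), c in cxy.items()
    )
    penalty = (len(cx) - 1) * (len(cy) - 1) * math.log(n) / (2 * n)
    return max(plug_in - penalty, 0.0)


def max_mesh_mi(xs, ys, s=3):
    """Maximum penalized MI over all meshes 2^u x 2^v, u, v = 1..s."""
    return max(
        mdl_mi(quantize(xs, 2 ** u), quantize(ys, 2 ** v))
        for u in range(1, s + 1)
        for v in range(1, s + 1)
    )
```

The same function accepts discrete or continuous inputs, since only the ordering of the values is used. Finer meshes capture more dependence but pay a larger penalty, so the maximum is typically attained at an intermediate resolution.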
What can be proved?
Convex
Experiment 1:
Genome expression profiling in breast cancer patients
• 58 samples with a p53 mutation and 192 without it
• 1000 genes
Why only Bonferroni and FDR rather than causality and regression?
Normality test: p-values are extremely low
MI distributions for all 1000 genes and for the 50 genes with the smallest p-values
• 20 seconds for MI values and 30 seconds for a forest (1000 nodes)
The class variable has only one connection with gene variables
We conclude that regression may be more appropriate than a graphical model.
Experiment 2:
300 gene expressions (continuous) and 300 SNPs (3 values)
• SNPs (HapMap) of 90 Utah residents with northern and western European ancestry
• R library (Bioconductor) GGData
ftp://ftp.sanger.ac.uk/pub/genevar/CEU_parents_norm_march2007.zip
200 genes and 200 SNPs
Causality among genes and SNPs can be explored!!
Insights we obtain from the experiment:
• In the real causality, SNPs and genes are not separated as Edwards assumed!!
• Both SNPs and gene expressions are hubs of the mixed network.
Variable: cardinality
SNP: 3 values
Gene expression: continuous
Summary
• MI estimation
• Application to Chow-Liu
• Gene Differential Analysis
• Causality among SNPs and Gene Expressions
Future Work
Beyond Forests:
• BNs with bounded TW
• MNs not necessarily forests
