1
David Amar, Tom Hait, and Ron Shamir
Blavatnik School of Computer Science
Tel Aviv University
2
Comparative genomics
 Standard expression experiments: cases vs. controls ->
differential genes -> interpretation
 Problems
 Small number of samples
 Non-specific signal
 Interpretation of a gene set / gene ranking
 Goal: find specific changes for a tested disease
 E.g., an up-regulated pathway
 Crucial for clinical studies
3
Previous integrative classification studies
 Huang et al. 2010 PNAS (9,160 samples); Schmid et al.
PNAS 2012 (3,030); Lee et al. Bioinformatics 2013 (~14,000)
 Multilabel classification
 Global expression patterns
 Only 1-3 platforms
 Many datasets were removed from GEO
 No “healthy” class (Huang); no diseases (Lee)
 Pathprint (Altschuler et al. 2013)
 Use pathways
 Tissue classification (as in Lee et al.)
4
Integrating pathways and molecular
profiles
 Enrichment tests
 Improves interpretability
 GSEA, GSA
 Rank-based
 Higher statistical power
 Classification
 Extract pathway features
 Example: given a pathway, remove its non-differential genes
 Not clear if prediction performance improves
compared to using genes (Staiger et al. 2013)
5
6
[Figure: method overview. Inputs: pathway databases (KEGG, Reactome, Biocarta, NCI), expression profiles (GSE, GDS, TCGA), platform data, and sample labels (disease terms from dataset and sample descriptions). Single sample analysis: the ranked genes/transcripts of sample j are converted to weighted ranks, giving a standardized profile from low to high expression. Single sample - single pathway analysis: for each pathway, compute the mean and SD. Outputs: pathway feature matrix XP (samples x pathway features) and label matrix Y.]
7
Single sample analysis
 Input: an expression profile of a sample
 A vector of real values for each patient
 Step 1: rank the genes
 Step 2: calculate a score for each gene
[Formula not extracted; its annotated terms: rank of gene g in sample s; total number of ranked genes]
(Yang et al. 2012,2013)
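The two steps above can be sketched as follows. The scoring formula on the slide did not survive extraction, so this sketch assumes the simplest rank-based score, rank divided by the total number of ranked genes; the weighted score of Yang et al. may well differ. The function name and the toy gene values are illustrative only.

```python
def rank_scores(profile):
    """Map each gene to rank / N (an assumed score, not the slide's formula)."""
    # Step 1: rank genes by expression, lowest first (rank 1 = lowest expression).
    ordered = sorted(profile, key=profile.get)
    n = len(ordered)
    # Step 2: score each gene from its rank and the total number of ranked genes.
    return {gene: (rank + 1) / n for rank, gene in enumerate(ordered)}


# Toy expression profile for one sample (values are made up).
sample = {"TP53": 7.2, "BRCA1": 3.1, "MYC": 9.8, "GAPDH": 12.4}
scores = rank_scores(sample)
```

Because only ranks are used, the scores are comparable across samples measured on different platforms, which is the motivation for single sample analysis here.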
8
Pathway features
 1723 pathways in total
 Covering 7842 genes
 Mean size: 36.35 (median 15)
 Score all genes that are in the pathway databases
 Pathway statistics:
 Mean score
 Standard deviation
 Skewness
 KS test
Pathway DBs: KEGG, Reactome, Biocarta, NCI
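A minimal sketch of the four pathway statistics for one sample and one pathway. The slide does not say what the KS test compares; this sketch assumes it compares the scores of pathway genes with the scores of all other scored genes, and it reports only the KS statistic, not a p-value. Function names and the toy scores are hypothetical.

```python
import statistics


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    d = 0.0
    for x in list(a) + list(b):
        cdf_a = sum(v <= x for v in a) / len(a)
        cdf_b = sum(v <= x for v in b) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def pathway_features(scores, pathway_genes):
    """Summarise the per-gene scores of one pathway in one sample."""
    inside = [s for g, s in scores.items() if g in pathway_genes]
    outside = [s for g, s in scores.items() if g not in pathway_genes]
    mean = statistics.fmean(inside)
    sd = statistics.pstdev(inside)
    # Population skewness; defined as 0 when all scores are identical.
    skew = 0.0 if sd == 0 else sum((x - mean) ** 3 for x in inside) / (len(inside) * sd ** 3)
    return {"mean": mean, "sd": sd, "skewness": skew, "ks": ks_statistic(inside, outside)}


# Toy per-gene scores for one sample; g3 and g4 form the pathway.
scores = {"g1": 0.1, "g2": 0.2, "g3": 0.9, "g4": 1.0, "g5": 0.5}
features = pathway_features(scores, {"g3", "g4"})
```

Repeating this for every pathway and every sample would fill one row of the pathway feature matrix per sample.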
9
Patient labels
 Unite ~180 datasets, >14,000 samples
 Public databases contain ‘free text’
 Problem: automatic mapping fails. Example:
 GDS4358: “lymph-node biopsies from classic Hodgkins lymphoma HIV- patients before ABVD chemotherapy”
 MetaMap top score: “HIV infections”
 Solution: manual analysis
 Read descriptions and papers
10
Current microarray data
 Data from GEO
 13,314 samples
 17 platforms
 Sample annotation
 Ignore terms with fewer than
 100 samples
 5 datasets
 48 disease terms
[Figure: label matrix Y (samples x disease terms, entries in {0,1}) alongside the pathway feature matrix XP (samples x pathway features)]
11
12
Multi-label classification algorithms
 Learn a single classifier for each disease
 Ignore class dependencies
 Adaptation: Bayesian Correction
 Learn single classifiers
 Correct errors using the Disease Ontology (DO) DAG
 Transformation: use the label power
sets and learn a multiclass model
 Using RF: multi-label trees
 Was better than most approaches in an
experimental study (Madjarov et al. 2012)
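To make the first and third strategies concrete, the sketch below shows only how the label matrix Y is reshaped before learning: one independent 0/1 target per disease, versus one multiclass target over label power sets. No classifier is trained, and the function names and toy matrix are illustrative.

```python
def binary_relevance_targets(Y):
    """One 0/1 target column per disease term; each is learned independently."""
    n_labels = len(Y[0])
    return [[row[j] for row in Y] for j in range(n_labels)]


def label_powerset_targets(Y):
    """Map each distinct label combination to one class of a multiclass problem."""
    class_of = {}
    targets = []
    for row in Y:
        key = tuple(row)
        class_of.setdefault(key, len(class_of))
        targets.append(class_of[key])
    return targets, class_of


# Rows are samples, columns are disease terms (toy data).
Y = [[1, 0, 0], [1, 1, 0], [0, 0, 0], [1, 1, 0]]
columns = binary_relevance_targets(Y)
classes, mapping = label_powerset_targets(Y)
```

The power-set transformation keeps label dependencies (samples 2 and 4 share a class) at the cost of a class count that can grow with every new label combination.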
13
How to validate a classifier?
 Use leave-dataset out cross-validation
 Global AUC scores: each prediction Pij vs the correct label Yij
 Disease based AUC scores: consider each column separately
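Leave-dataset-out cross-validation holds out all samples of one dataset at a time, so a model is never tested on samples from a dataset it has seen. A minimal sketch of the fold construction, with hypothetical dataset identifiers:

```python
def leave_dataset_out(dataset_of_sample):
    """Yield (held_out_dataset, train_indices, test_indices), one fold per dataset."""
    for held_out in sorted(set(dataset_of_sample)):
        test = [i for i, d in enumerate(dataset_of_sample) if d == held_out]
        train = [i for i, d in enumerate(dataset_of_sample) if d != held_out]
        yield held_out, train, test


# dataset_of_sample[i] is the dataset that sample i came from (made-up IDs).
dataset_of_sample = ["GSE_A", "GSE_A", "GSE_B", "GSE_C", "GSE_C"]
folds = list(leave_dataset_out(dataset_of_sample))
```

The predictions collected over all folds form the matrix P, which is then compared with Y either as a whole (global score) or column by column (disease-based score).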
14
[Figure: on the test set, the label matrix Y (samples x disease terms, entries in {0,1}) is compared with P (samples x disease terms, probabilities in [0,1]), the output of a multi-label learner]
A problem (!)
 What is in the background?
 For a disease D define:
 Positives: disease samples
 Negatives: direct controls
 Background controls
15
Example: 500 positives, 500 negatives, 10,000 background controls (BGCs)
[Figure: columns of Y and P for the disease]
Multistep validation
16
 It is recommended to use several scores (Lee et al. 2013)
 Measure global AUPR
 For each disease we calculate three scores
Measure | Used (additional) information
AUPR: check separation between positives and all others | Sick vs. not sick
ROC: test for separation between positives and negatives | Direct use of negatives
Meta-analysis p-value: calculate the overall separation significance within the original datasets | Mapping of samples to datasets
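A sketch of the first two per-disease scores, written from their standard definitions rather than from the authors' code. It treats background controls as samples that are neither positives nor direct controls for the disease, which is an assumption; the prediction values are made up. The meta-analysis p-value is not sketched because the slides do not name the method.

```python
def roc_auc(scores_pos, scores_neg):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))


def average_precision(scores, labels):
    """Area under the precision-recall curve, estimated as average precision."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for k, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / k  # precision at each recovered positive
    return total / hits


# Predicted probabilities for one disease (toy values).
positives = [0.9, 0.8]
negatives = [0.7, 0.3]          # direct controls
background = [0.85, 0.2, 0.1]   # background controls

# ROC uses only positives vs. direct controls.
roc = roc_auc(positives, negatives)
# AUPR uses positives vs. all others (negatives and background together).
all_scores = positives + negatives + background
all_labels = [1] * len(positives) + [0] * (len(negatives) + len(background))
aupr = average_precision(all_scores, all_labels)
```

In this toy case the ROC is perfect while the AUPR is lowered by a single high-scoring background sample, which is why the two scores are reported together.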
Performance results
17
[Figure: per-disease performance, showing positives vs. negatives ROC and AUPR; filled boxes mark a meta-analysis q-value < 0.001]
Performance results
18
8.5% improvement in recall and 12% in precision compared to Huang et al.
Validation on RNA-Seq
Data from TCGA: 1,699 samples
19
Pathway-Disease network
 Steps (for each of the selected diseases):
1. Disease-pathway edges
   a. RF importance: select the top features
   b. Test for disease relevance
2. Add edges between diseases
   a. Use the DO structure
3. Add edges between pathways
   a. Based on significant overlap in genes
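For the pathway-pathway edges in step 3, the slides do not name the overlap test. One common choice, assumed here, is a one-sided hypergeometric test over the universe of genes covered by the pathway databases. The function name and the toy gene sets are hypothetical.

```python
from math import comb


def overlap_pvalue(pathway_a, pathway_b, n_genes):
    """P(overlap >= observed) if pathway_b were drawn at random from n_genes genes."""
    a, b = len(pathway_a), len(pathway_b)
    k = len(pathway_a & pathway_b)
    total = comb(n_genes, b)
    # Sum the hypergeometric probabilities of every overlap size from k upwards.
    tail = sum(comb(a, i) * comb(n_genes - a, b - i) for i in range(k, min(a, b) + 1))
    return tail / total


# Two toy pathways of 5 genes each, sharing 3 genes, in a universe of 20 genes.
pathway_a = {"g1", "g2", "g3", "g4", "g5"}
pathway_b = {"g3", "g4", "g5", "g6", "g7"}
p = overlap_pvalue(pathway_a, pathway_b, n_genes=20)
```

With real data, `n_genes` would be the number of genes covered by the pathway databases, and the resulting p-values would need multiple-testing correction before edges are drawn.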
20
Cancer network
[Figure legend: Down, Up]
Cardiovascular disease
23
[Figure legend: Down, Up]
Gastric cancers
Summary
 Large scale integration
 Multi-label learning
 Careful validation
 Pathway based features as biomarkers
 Summary of the results in a network
 Currently
 Add genes: overcome missing values
 Shows improvement in validation
25
Acknowledgements
 Ron Shamir
 Tom Hait
Mais conteúdo relacionado

Mais procurados

Machine Learning in Biology and Why It Doesn't Make Sense - Theo Knijnenburg,...
Machine Learning in Biology and Why It Doesn't Make Sense - Theo Knijnenburg,...Machine Learning in Biology and Why It Doesn't Make Sense - Theo Knijnenburg,...
Machine Learning in Biology and Why It Doesn't Make Sense - Theo Knijnenburg,...Seattle DAML meetup
 
Колкер Е. An introduction to MOPED: Multi-Omics Profiling Expression Database
Колкер Е. An introduction to MOPED: Multi-Omics Profiling Expression DatabaseКолкер Е. An introduction to MOPED: Multi-Omics Profiling Expression Database
Колкер Е. An introduction to MOPED: Multi-Omics Profiling Expression Databasebigdatabm
 
Identification of novel potential anti cancer agents using network pharmacolo...
Identification of novel potential anti cancer agents using network pharmacolo...Identification of novel potential anti cancer agents using network pharmacolo...
Identification of novel potential anti cancer agents using network pharmacolo...Cresset
 
Network Pharmacology Tri-Con 022212
Network Pharmacology Tri-Con 022212Network Pharmacology Tri-Con 022212
Network Pharmacology Tri-Con 022212Philip Bourne
 
Project report-on-bio-informatics
Project report-on-bio-informaticsProject report-on-bio-informatics
Project report-on-bio-informaticsDaniela Rotariu
 
Novel network pharmacology methods for drug mechanism of action identificatio...
Novel network pharmacology methods for drug mechanism of action identificatio...Novel network pharmacology methods for drug mechanism of action identificatio...
Novel network pharmacology methods for drug mechanism of action identificatio...laserxiong
 
Sigma Xi 2021 Andrew Gao Presentation
Sigma Xi 2021 Andrew Gao PresentationSigma Xi 2021 Andrew Gao Presentation
Sigma Xi 2021 Andrew Gao PresentationAndrewGao12
 
AI in Bioinformatics
AI in BioinformaticsAI in Bioinformatics
AI in BioinformaticsAli Kishk
 
NetBioSIG2012 ugurdogrusoz-cbio
NetBioSIG2012 ugurdogrusoz-cbioNetBioSIG2012 ugurdogrusoz-cbio
NetBioSIG2012 ugurdogrusoz-cbioAlexander Pico
 
NetBioSIG2014-Talk by Traver Hart
NetBioSIG2014-Talk by Traver HartNetBioSIG2014-Talk by Traver Hart
NetBioSIG2014-Talk by Traver HartAlexander Pico
 
A Classification of Cancer Diagnostics based on Microarray Gene Expression Pr...
A Classification of Cancer Diagnostics based on Microarray Gene Expression Pr...A Classification of Cancer Diagnostics based on Microarray Gene Expression Pr...
A Classification of Cancer Diagnostics based on Microarray Gene Expression Pr...IJTET Journal
 
Evolution of Knowledge Discovery and Management
Evolution of Knowledge Discovery and Management Evolution of Knowledge Discovery and Management
Evolution of Knowledge Discovery and Management inscit2006
 
Pistoia Alliance-Elsevier Datathon
Pistoia Alliance-Elsevier DatathonPistoia Alliance-Elsevier Datathon
Pistoia Alliance-Elsevier DatathonPistoia Alliance
 
NetBioSIG2013-Talk Gang Su
NetBioSIG2013-Talk Gang SuNetBioSIG2013-Talk Gang Su
NetBioSIG2013-Talk Gang SuAlexander Pico
 
EnrichNet: Graph-based statistic and web-application for gene/protein set enr...
EnrichNet: Graph-based statistic and web-application for gene/protein set enr...EnrichNet: Graph-based statistic and web-application for gene/protein set enr...
EnrichNet: Graph-based statistic and web-application for gene/protein set enr...Enrico Glaab
 
Comparing Genetic Evolutionary Algorithms on Three Enzymes of HIV-1: Integras...
Comparing Genetic Evolutionary Algorithms on Three Enzymes of HIV-1: Integras...Comparing Genetic Evolutionary Algorithms on Three Enzymes of HIV-1: Integras...
Comparing Genetic Evolutionary Algorithms on Three Enzymes of HIV-1: Integras...CSCJournals
 
Uses of Artificial Intelligence in Bioinformatics
Uses of Artificial Intelligence in BioinformaticsUses of Artificial Intelligence in Bioinformatics
Uses of Artificial Intelligence in BioinformaticsPragya Pai
 

Mais procurados (20)

iOmics
iOmicsiOmics
iOmics
 
Machine Learning in Biology and Why It Doesn't Make Sense - Theo Knijnenburg,...
Machine Learning in Biology and Why It Doesn't Make Sense - Theo Knijnenburg,...Machine Learning in Biology and Why It Doesn't Make Sense - Theo Knijnenburg,...
Machine Learning in Biology and Why It Doesn't Make Sense - Theo Knijnenburg,...
 
Колкер Е. An introduction to MOPED: Multi-Omics Profiling Expression Database
Колкер Е. An introduction to MOPED: Multi-Omics Profiling Expression DatabaseКолкер Е. An introduction to MOPED: Multi-Omics Profiling Expression Database
Колкер Е. An introduction to MOPED: Multi-Omics Profiling Expression Database
 
Identification of novel potential anti cancer agents using network pharmacolo...
Identification of novel potential anti cancer agents using network pharmacolo...Identification of novel potential anti cancer agents using network pharmacolo...
Identification of novel potential anti cancer agents using network pharmacolo...
 
Network Pharmacology Tri-Con 022212
Network Pharmacology Tri-Con 022212Network Pharmacology Tri-Con 022212
Network Pharmacology Tri-Con 022212
 
Project report-on-bio-informatics
Project report-on-bio-informaticsProject report-on-bio-informatics
Project report-on-bio-informatics
 
Novel network pharmacology methods for drug mechanism of action identificatio...
Novel network pharmacology methods for drug mechanism of action identificatio...Novel network pharmacology methods for drug mechanism of action identificatio...
Novel network pharmacology methods for drug mechanism of action identificatio...
 
Sigma Xi 2021 Andrew Gao Presentation
Sigma Xi 2021 Andrew Gao PresentationSigma Xi 2021 Andrew Gao Presentation
Sigma Xi 2021 Andrew Gao Presentation
 
AI in Bioinformatics
AI in BioinformaticsAI in Bioinformatics
AI in Bioinformatics
 
NetBioSIG2012 ugurdogrusoz-cbio
NetBioSIG2012 ugurdogrusoz-cbioNetBioSIG2012 ugurdogrusoz-cbio
NetBioSIG2012 ugurdogrusoz-cbio
 
NetBioSIG2014-Talk by Traver Hart
NetBioSIG2014-Talk by Traver HartNetBioSIG2014-Talk by Traver Hart
NetBioSIG2014-Talk by Traver Hart
 
A Classification of Cancer Diagnostics based on Microarray Gene Expression Pr...
A Classification of Cancer Diagnostics based on Microarray Gene Expression Pr...A Classification of Cancer Diagnostics based on Microarray Gene Expression Pr...
A Classification of Cancer Diagnostics based on Microarray Gene Expression Pr...
 
Evolution of Knowledge Discovery and Management
Evolution of Knowledge Discovery and Management Evolution of Knowledge Discovery and Management
Evolution of Knowledge Discovery and Management
 
Pistoia Alliance-Elsevier Datathon
Pistoia Alliance-Elsevier DatathonPistoia Alliance-Elsevier Datathon
Pistoia Alliance-Elsevier Datathon
 
Bioinformatics Projects And Applications
Bioinformatics Projects And ApplicationsBioinformatics Projects And Applications
Bioinformatics Projects And Applications
 
NTU-2019
NTU-2019NTU-2019
NTU-2019
 
NetBioSIG2013-Talk Gang Su
NetBioSIG2013-Talk Gang SuNetBioSIG2013-Talk Gang Su
NetBioSIG2013-Talk Gang Su
 
EnrichNet: Graph-based statistic and web-application for gene/protein set enr...
EnrichNet: Graph-based statistic and web-application for gene/protein set enr...EnrichNet: Graph-based statistic and web-application for gene/protein set enr...
EnrichNet: Graph-based statistic and web-application for gene/protein set enr...
 
Comparing Genetic Evolutionary Algorithms on Three Enzymes of HIV-1: Integras...
Comparing Genetic Evolutionary Algorithms on Three Enzymes of HIV-1: Integras...Comparing Genetic Evolutionary Algorithms on Three Enzymes of HIV-1: Integras...
Comparing Genetic Evolutionary Algorithms on Three Enzymes of HIV-1: Integras...
 
Uses of Artificial Intelligence in Bioinformatics
Uses of Artificial Intelligence in BioinformaticsUses of Artificial Intelligence in Bioinformatics
Uses of Artificial Intelligence in Bioinformatics
 

Semelhante a NetBioSIG2014-Talk by David Amar

Critical appraisal of meta-analysis
Critical appraisal of meta-analysisCritical appraisal of meta-analysis
Critical appraisal of meta-analysisSamir Haffar
 
Single-Cell Sequencing for Drug Discovery: Applications and Challenges
Single-Cell Sequencing for Drug Discovery: Applications and ChallengesSingle-Cell Sequencing for Drug Discovery: Applications and Challenges
Single-Cell Sequencing for Drug Discovery: Applications and Challengesinside-BigData.com
 
Gene Profiling in Clinical Oncology - Slide 9 - F. André - Genomic evaluation...
Gene Profiling in Clinical Oncology - Slide 9 - F. André - Genomic evaluation...Gene Profiling in Clinical Oncology - Slide 9 - F. André - Genomic evaluation...
Gene Profiling in Clinical Oncology - Slide 9 - F. André - Genomic evaluation...European School of Oncology
 
Readmission of Diabetes Patients Report
Readmission of Diabetes Patients ReportReadmission of Diabetes Patients Report
Readmission of Diabetes Patients ReportHong Lu
 
Multivariate Analysis and Visualization of Proteomic Data
Multivariate Analysis and Visualization of Proteomic DataMultivariate Analysis and Visualization of Proteomic Data
Multivariate Analysis and Visualization of Proteomic DataUC Davis
 
Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009Ian Foster
 
Genetic predisposition to papillary thyroid cancer by Albert de la Chapelle, ...
Genetic predisposition to papillary thyroid cancer by Albert de la Chapelle, ...Genetic predisposition to papillary thyroid cancer by Albert de la Chapelle, ...
Genetic predisposition to papillary thyroid cancer by Albert de la Chapelle, ...OSUCCC - James
 
Norwegian clinical genetics analysis platform ”genAP”, Thomas Grünfeld and To...
Norwegian clinical genetics analysis platform ”genAP”, Thomas Grünfeld and To...Norwegian clinical genetics analysis platform ”genAP”, Thomas Grünfeld and To...
Norwegian clinical genetics analysis platform ”genAP”, Thomas Grünfeld and To...The Research Council of Norway, IKTPLUSS
 
Farmacoepi Course Leiden 0210 Part 2
Farmacoepi Course Leiden 0210   Part 2Farmacoepi Course Leiden 0210   Part 2
Farmacoepi Course Leiden 0210 Part 2RobHeerdink
 
Talk at Yale University April 26th 2011: Applying Computational Models for To...
Talk at Yale University April 26th 2011: Applying Computational Modelsfor To...Talk at Yale University April 26th 2011: Applying Computational Modelsfor To...
Talk at Yale University April 26th 2011: Applying Computational Models for To...Sean Ekins
 
Exploiting technical replicate variance in omics data analysis (RepExplore)
Exploiting technical replicate variance in omics data analysis (RepExplore)Exploiting technical replicate variance in omics data analysis (RepExplore)
Exploiting technical replicate variance in omics data analysis (RepExplore)Enrico Glaab
 
provenance of microarray experiments
provenance of microarray experimentsprovenance of microarray experiments
provenance of microarray experimentsHelena Deus
 
Analysis of Medication Possession Ratio for Improved Blood Pressure Control
Analysis of Medication Possession Ratio for Improved Blood Pressure ControlAnalysis of Medication Possession Ratio for Improved Blood Pressure Control
Analysis of Medication Possession Ratio for Improved Blood Pressure ControlHealth Informatics New Zealand
 
Bioinformatics in dermato-oncology
Bioinformatics in dermato-oncologyBioinformatics in dermato-oncology
Bioinformatics in dermato-oncologyJoaquin Dopazo
 
Clinical trial bms clinical trials methodology 17012018
Clinical trial bms   clinical trials methodology 17012018Clinical trial bms   clinical trials methodology 17012018
Clinical trial bms clinical trials methodology 17012018SoM
 
2010 smg training_cardiff_day2_session3_dwan_altman
2010 smg training_cardiff_day2_session3_dwan_altman2010 smg training_cardiff_day2_session3_dwan_altman
2010 smg training_cardiff_day2_session3_dwan_altmanrgveroniki
 
Collaborative Drug Discovery: A Platform For Transforming Neglected Disease R...
Collaborative Drug Discovery: A Platform For Transforming Neglected Disease R...Collaborative Drug Discovery: A Platform For Transforming Neglected Disease R...
Collaborative Drug Discovery: A Platform For Transforming Neglected Disease R...Sean Ekins
 
[DigiHealth 22] Budget friendly sample sizes for genomics research - Ognjen M...
[DigiHealth 22] Budget friendly sample sizes for genomics research - Ognjen M...[DigiHealth 22] Budget friendly sample sizes for genomics research - Ognjen M...
[DigiHealth 22] Budget friendly sample sizes for genomics research - Ognjen M...DataScienceConferenc1
 

Semelhante a NetBioSIG2014-Talk by David Amar (20)

Critical appraisal of meta-analysis
Critical appraisal of meta-analysisCritical appraisal of meta-analysis
Critical appraisal of meta-analysis
 
Metaanalysis copy
Metaanalysis    copyMetaanalysis    copy
Metaanalysis copy
 
Single-Cell Sequencing for Drug Discovery: Applications and Challenges
Single-Cell Sequencing for Drug Discovery: Applications and ChallengesSingle-Cell Sequencing for Drug Discovery: Applications and Challenges
Single-Cell Sequencing for Drug Discovery: Applications and Challenges
 
Gene Profiling in Clinical Oncology - Slide 9 - F. André - Genomic evaluation...
Gene Profiling in Clinical Oncology - Slide 9 - F. André - Genomic evaluation...Gene Profiling in Clinical Oncology - Slide 9 - F. André - Genomic evaluation...
Gene Profiling in Clinical Oncology - Slide 9 - F. André - Genomic evaluation...
 
Readmission of Diabetes Patients Report
Readmission of Diabetes Patients ReportReadmission of Diabetes Patients Report
Readmission of Diabetes Patients Report
 
Kishor Presentation
Kishor PresentationKishor Presentation
Kishor Presentation
 
Multivariate Analysis and Visualization of Proteomic Data
Multivariate Analysis and Visualization of Proteomic DataMultivariate Analysis and Visualization of Proteomic Data
Multivariate Analysis and Visualization of Proteomic Data
 
Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009
 
Genetic predisposition to papillary thyroid cancer by Albert de la Chapelle, ...
Genetic predisposition to papillary thyroid cancer by Albert de la Chapelle, ...Genetic predisposition to papillary thyroid cancer by Albert de la Chapelle, ...
Genetic predisposition to papillary thyroid cancer by Albert de la Chapelle, ...
 
Norwegian clinical genetics analysis platform ”genAP”, Thomas Grünfeld and To...
Norwegian clinical genetics analysis platform ”genAP”, Thomas Grünfeld and To...Norwegian clinical genetics analysis platform ”genAP”, Thomas Grünfeld and To...
Norwegian clinical genetics analysis platform ”genAP”, Thomas Grünfeld and To...
 
Farmacoepi Course Leiden 0210 Part 2
Farmacoepi Course Leiden 0210   Part 2Farmacoepi Course Leiden 0210   Part 2
Farmacoepi Course Leiden 0210 Part 2
 
Talk at Yale University April 26th 2011: Applying Computational Models for To...
Talk at Yale University April 26th 2011: Applying Computational Modelsfor To...Talk at Yale University April 26th 2011: Applying Computational Modelsfor To...
Talk at Yale University April 26th 2011: Applying Computational Models for To...
 
Exploiting technical replicate variance in omics data analysis (RepExplore)
Exploiting technical replicate variance in omics data analysis (RepExplore)Exploiting technical replicate variance in omics data analysis (RepExplore)
Exploiting technical replicate variance in omics data analysis (RepExplore)
 
provenance of microarray experiments
provenance of microarray experimentsprovenance of microarray experiments
provenance of microarray experiments
 
Analysis of Medication Possession Ratio for Improved Blood Pressure Control
Analysis of Medication Possession Ratio for Improved Blood Pressure ControlAnalysis of Medication Possession Ratio for Improved Blood Pressure Control
Analysis of Medication Possession Ratio for Improved Blood Pressure Control
 
Bioinformatics in dermato-oncology
Bioinformatics in dermato-oncologyBioinformatics in dermato-oncology
Bioinformatics in dermato-oncology
 
Clinical trial bms clinical trials methodology 17012018
Clinical trial bms   clinical trials methodology 17012018Clinical trial bms   clinical trials methodology 17012018
Clinical trial bms clinical trials methodology 17012018
 
2010 smg training_cardiff_day2_session3_dwan_altman
2010 smg training_cardiff_day2_session3_dwan_altman2010 smg training_cardiff_day2_session3_dwan_altman
2010 smg training_cardiff_day2_session3_dwan_altman
 
Collaborative Drug Discovery: A Platform For Transforming Neglected Disease R...
Collaborative Drug Discovery: A Platform For Transforming Neglected Disease R...Collaborative Drug Discovery: A Platform For Transforming Neglected Disease R...
Collaborative Drug Discovery: A Platform For Transforming Neglected Disease R...
 
[DigiHealth 22] Budget friendly sample sizes for genomics research - Ognjen M...
[DigiHealth 22] Budget friendly sample sizes for genomics research - Ognjen M...[DigiHealth 22] Budget friendly sample sizes for genomics research - Ognjen M...
[DigiHealth 22] Budget friendly sample sizes for genomics research - Ognjen M...
 

Mais de Alexander Pico

NRNB Annual Report 2018
NRNB Annual Report 2018NRNB Annual Report 2018
NRNB Annual Report 2018Alexander Pico
 
NRNB Annual Report 2017
NRNB Annual Report 2017NRNB Annual Report 2017
NRNB Annual Report 2017Alexander Pico
 
2016 Cytoscape 3.3 Tutorial
2016 Cytoscape 3.3 Tutorial2016 Cytoscape 3.3 Tutorial
2016 Cytoscape 3.3 TutorialAlexander Pico
 
NRNB Annual Report 2016: Overall
NRNB Annual Report 2016: OverallNRNB Annual Report 2016: Overall
NRNB Annual Report 2016: OverallAlexander Pico
 
Technology R&D Theme 3: Multi-scale Network Representations
Technology R&D Theme 3: Multi-scale Network RepresentationsTechnology R&D Theme 3: Multi-scale Network Representations
Technology R&D Theme 3: Multi-scale Network RepresentationsAlexander Pico
 
Technology R&D Theme 1: Differential Networks
Technology R&D Theme 1: Differential NetworksTechnology R&D Theme 1: Differential Networks
Technology R&D Theme 1: Differential NetworksAlexander Pico
 
Overall Vision for NRNB: 2015-2020
Overall Vision for NRNB: 2015-2020Overall Vision for NRNB: 2015-2020
Overall Vision for NRNB: 2015-2020Alexander Pico
 
2015 Cytoscape 3.2 Tutorial
2015 Cytoscape 3.2 Tutorial2015 Cytoscape 3.2 Tutorial
2015 Cytoscape 3.2 TutorialAlexander Pico
 
NetBioSIG2014-FlashJournalClub by Frank Kramer
NetBioSIG2014-FlashJournalClub by Frank KramerNetBioSIG2014-FlashJournalClub by Frank Kramer
NetBioSIG2014-FlashJournalClub by Frank KramerAlexander Pico
 
NetBioSIG2014-Talk by Salvatore Loguercio
NetBioSIG2014-Talk by Salvatore LoguercioNetBioSIG2014-Talk by Salvatore Loguercio
NetBioSIG2014-Talk by Salvatore LoguercioAlexander Pico
 
NetBioSIG2014-Intro by Alex Pico
NetBioSIG2014-Intro by Alex PicoNetBioSIG2014-Intro by Alex Pico
NetBioSIG2014-Intro by Alex PicoAlexander Pico
 
NetBioSIG2014-Talk by Tijana Milenkovic
NetBioSIG2014-Talk by Tijana MilenkovicNetBioSIG2014-Talk by Tijana Milenkovic
NetBioSIG2014-Talk by Tijana MilenkovicAlexander Pico
 
NetBioSIG2014-Talk by Yu Xia
NetBioSIG2014-Talk by Yu XiaNetBioSIG2014-Talk by Yu Xia
NetBioSIG2014-Talk by Yu XiaAlexander Pico
 
NetBioSIG2014-Keynote by Marian Walhout
NetBioSIG2014-Keynote by Marian WalhoutNetBioSIG2014-Keynote by Marian Walhout
NetBioSIG2014-Keynote by Marian WalhoutAlexander Pico
 
NetBioSIG2014-Talk by Ashwini Patil
NetBioSIG2014-Talk by Ashwini PatilNetBioSIG2014-Talk by Ashwini Patil
NetBioSIG2014-Talk by Ashwini PatilAlexander Pico
 
NetBioSIG2014-Talk by Hyunghoon Cho
NetBioSIG2014-Talk by Hyunghoon ChoNetBioSIG2014-Talk by Hyunghoon Cho
NetBioSIG2014-Talk by Hyunghoon ChoAlexander Pico
 
NetBioSIG2014-Talk by Gerald Quon
NetBioSIG2014-Talk by Gerald QuonNetBioSIG2014-Talk by Gerald Quon
NetBioSIG2014-Talk by Gerald QuonAlexander Pico
 
Visualization and Analysis of Dynamic Networks
Visualization and Analysis of Dynamic Networks Visualization and Analysis of Dynamic Networks
Visualization and Analysis of Dynamic Networks Alexander Pico
 
NRNB Annual Report 2013
NRNB Annual Report 2013NRNB Annual Report 2013
NRNB Annual Report 2013Alexander Pico
 
Introduction to WikiPathways
Introduction to WikiPathwaysIntroduction to WikiPathways
Introduction to WikiPathwaysAlexander Pico
 

Mais de Alexander Pico (20)

NRNB Annual Report 2018
NRNB Annual Report 2018NRNB Annual Report 2018
NRNB Annual Report 2018
 
NRNB Annual Report 2017
NRNB Annual Report 2017NRNB Annual Report 2017
NRNB Annual Report 2017
 
2016 Cytoscape 3.3 Tutorial
2016 Cytoscape 3.3 Tutorial2016 Cytoscape 3.3 Tutorial
2016 Cytoscape 3.3 Tutorial
 
NRNB Annual Report 2016: Overall
NRNB Annual Report 2016: OverallNRNB Annual Report 2016: Overall
NRNB Annual Report 2016: Overall
 
Technology R&D Theme 3: Multi-scale Network Representations
Technology R&D Theme 3: Multi-scale Network RepresentationsTechnology R&D Theme 3: Multi-scale Network Representations
Technology R&D Theme 3: Multi-scale Network Representations
 
Technology R&D Theme 1: Differential Networks
Technology R&D Theme 1: Differential NetworksTechnology R&D Theme 1: Differential Networks
Technology R&D Theme 1: Differential Networks
 
Overall Vision for NRNB: 2015-2020
Overall Vision for NRNB: 2015-2020Overall Vision for NRNB: 2015-2020
Overall Vision for NRNB: 2015-2020
 
2015 Cytoscape 3.2 Tutorial
2015 Cytoscape 3.2 Tutorial2015 Cytoscape 3.2 Tutorial
2015 Cytoscape 3.2 Tutorial
 
NetBioSIG2014-FlashJournalClub by Frank Kramer
NetBioSIG2014-FlashJournalClub by Frank KramerNetBioSIG2014-FlashJournalClub by Frank Kramer
NetBioSIG2014-FlashJournalClub by Frank Kramer
 
NetBioSIG2014-Talk by Salvatore Loguercio
NetBioSIG2014-Talk by Salvatore LoguercioNetBioSIG2014-Talk by Salvatore Loguercio
NetBioSIG2014-Talk by Salvatore Loguercio
 
NetBioSIG2014-Intro by Alex Pico
NetBioSIG2014-Intro by Alex PicoNetBioSIG2014-Intro by Alex Pico
NetBioSIG2014-Intro by Alex Pico
 
NetBioSIG2014-Talk by Tijana Milenkovic
NetBioSIG2014-Talk by Tijana MilenkovicNetBioSIG2014-Talk by Tijana Milenkovic
NetBioSIG2014-Talk by Tijana Milenkovic
 
NetBioSIG2014-Talk by Yu Xia
NetBioSIG2014-Talk by Yu XiaNetBioSIG2014-Talk by Yu Xia
NetBioSIG2014-Talk by Yu Xia
 
NetBioSIG2014-Keynote by Marian Walhout
NetBioSIG2014-Keynote by Marian WalhoutNetBioSIG2014-Keynote by Marian Walhout
NetBioSIG2014-Keynote by Marian Walhout
 
NetBioSIG2014-Talk by Ashwini Patil
NetBioSIG2014-Talk by Ashwini PatilNetBioSIG2014-Talk by Ashwini Patil
NetBioSIG2014-Talk by Ashwini Patil
 
NetBioSIG2014-Talk by Hyunghoon Cho
NetBioSIG2014-Talk by Hyunghoon ChoNetBioSIG2014-Talk by Hyunghoon Cho
NetBioSIG2014-Talk by Hyunghoon Cho
 
NetBioSIG2014-Talk by Gerald Quon
NetBioSIG2014-Talk by Gerald QuonNetBioSIG2014-Talk by Gerald Quon
NetBioSIG2014-Talk by Gerald Quon
 
Visualization and Analysis of Dynamic Networks
Visualization and Analysis of Dynamic Networks Visualization and Analysis of Dynamic Networks
Visualization and Analysis of Dynamic Networks
 
NRNB Annual Report 2013
NRNB Annual Report 2013NRNB Annual Report 2013
NRNB Annual Report 2013
 
Introduction to WikiPathways
Introduction to WikiPathwaysIntroduction to WikiPathways
Introduction to WikiPathways
 

Último

DIFFERENCE IN BACK CROSS AND TEST CROSS
DIFFERENCE IN  BACK CROSS AND TEST CROSSDIFFERENCE IN  BACK CROSS AND TEST CROSS
DIFFERENCE IN BACK CROSS AND TEST CROSSLeenakshiTyagi
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfSumit Kumar yadav
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisDiwakar Mishra
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPirithiRaju
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptxRajatChauhan518211
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfSumit Kumar yadav
 
GBSN - Biochemistry (Unit 1)


NetBioSIG2014-Talk by David Amar

  • 1. David Amar, Tom Hait, and Ron Shamir, Blavatnik School of Computer Science, Tel Aviv University
  • 3. Comparative genomics
      - Standard expression experiments: cases vs. controls -> differential genes -> interpretation
      - Problems
          - Small number of samples
          - Non-specific signal
          - Interpretation of a gene set / gene ranking
      - Goal: find specific changes for a tested disease
          - E.g., an up-regulated pathway
          - Crucial for clinical studies
  • 4. Previous integrative classification studies
      - Huang et al. 2010 PNAS (9,160 samples); Schmid et al. PNAS 2012 (3,030); Lee et al. Bioinformatics 2013 (~14,000)
          - Multilabel classification
          - Global expression patterns
          - Only 1-3 platforms
          - Many datasets were removed from GEO
          - No "healthy" class (Huang); no diseases (Lee)
      - Pathprint (Altschuler et al. 2013)
          - Use pathways
          - Tissue classification (as in Lee et al.)
  • 5. Integrating pathways and molecular profiles
      - Enrichment tests
          - Improve interpretability
          - GSEA/GSA
              - Rank based
              - Higher statistical power
      - Classification
          - Extract pathway features
          - Example: given a pathway, remove non-differential genes
          - Not clear if prediction performance improves compared to using genes (Staiger et al. 2013)
  • 7. [Pipeline diagram]
      - Inputs: pathways (KEGG, Reactome, Biocarta, NCI); expression profiles (GSE, GDS, TCGA); sample labels (disease, dataset/sample description)
      - Single sample analysis: platform data -> ranked genes/transcripts for sample j -> weighted ranks, $W_i = i\,e^{-i/k}$ -> standardized profile (low expression to high expression)
      - Single sample - single pathway analysis: for each pathway, mean and SD
      - Outputs: pathway feature matrix XP (samples x pathway features) and label matrix Y (samples)
  • 8. Single sample analysis
      - Input: an expression profile of a sample
          - A vector of real values for each patient
      - Step 1: rank the genes
      - Step 2: calculate a score for each gene (Yang et al. 2012, 2013)
          - The score uses the rank of gene g in sample s and the total number of ranked genes
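The two steps above can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: it assumes that rank 1 goes to the most highly expressed gene and that the per-gene score is the weighted rank r * exp(-r / k), where r is the rank of the gene in the sample and k is the total number of ranked genes. The function name `weighted_rank_scores` and the toy expression values are hypothetical.

```python
import math


def weighted_rank_scores(expression):
    """Map {gene: expression value} to {gene: weighted rank score}.

    Assumptions of this sketch (not confirmed by the slides):
    rank 1 is the most highly expressed gene, and the score is
    rank * exp(-rank / k) with k the number of ranked genes.
    """
    # Step 1: rank the genes within this single sample.
    ordered = sorted(expression, key=expression.get, reverse=True)
    k = len(ordered)  # total number of ranked genes
    # Step 2: turn each rank into a weighted score.
    return {
        gene: rank * math.exp(-rank / k)
        for rank, gene in enumerate(ordered, start=1)
    }


# Toy sample with made-up expression values.
sample = {"TP53": 8.1, "BRCA1": 5.4, "MYC": 9.7, "EGFR": 2.3}
scores = weighted_rank_scores(sample)
```

Because only the within-sample ordering enters the score, the result does not depend on the scale of the platform that produced the profile.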
  • 9. Pathway features
      - 1723 pathways in total
          - Covering 7842 genes
          - Mean size: 36.35 (median 15)
      - Score all genes that are in the pathway databases
      - Pathway statistics:
          - Mean score
          - Standard deviation
          - Skewness
          - KS test
      - Pathway DBs: KEGG, Reactome, Biocarta, NCI
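A rough sketch of how the four pathway statistics could be computed for one sample and one pathway, using only the Python standard library. Several choices here are assumptions, since the slide does not specify them: the standard deviation is the population SD, skewness is the third standardized moment, and the KS value is the two-sample KS statistic (without a p-value) comparing genes inside the pathway with all other scored genes. The names `pathway_features` and `ks_statistic` are hypothetical.

```python
import statistics
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between the ECDFs."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def pathway_features(scores, pathway_genes):
    """Summarize the per-gene scores of one sample over one pathway."""
    inside = [s for g, s in scores.items() if g in pathway_genes]
    outside = [s for g, s in scores.items() if g not in pathway_genes]
    if not inside or not outside:
        raise ValueError("need scored genes both inside and outside the pathway")
    mean = statistics.fmean(inside)
    sd = statistics.pstdev(inside)  # population SD (an assumption)
    if sd > 0:
        skew = sum((x - mean) ** 3 for x in inside) / len(inside) / sd ** 3
    else:
        skew = 0.0
    return {
        "mean": mean,
        "sd": sd,
        "skewness": skew,
        "ks": ks_statistic(inside, outside),
    }


# Toy per-gene scores for one sample and a three-gene pathway.
toy_scores = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 10.0, "E": 11.0}
features = pathway_features(toy_scores, {"A", "B", "C"})
```

Repeating this for every pathway and every sample would give one row of pathway features per sample.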
  • 10. Patient labels
      - Unite ~180 datasets, >14,000 samples
      - Public databases contain "free text"
      - Problem: automatic mapping fails. Example:
          - GDS4358: "lymph-node biopsies from classic Hodgkins lymphoma HIV- patients before ABVD chemotherapy"
          - MetaMap top score: "HIV infections"
      - Solution: manual analysis
          - Read descriptions and papers
  • 11. Current microarray data
      - Data from GEO
          - 13,314 samples
          - 17 platforms
      - Sample annotation
          - Ignore terms with fewer than:
              - 100 samples
              - 5 datasets
          - 48 disease terms
      - Matrices: XP (samples x pathway features); Y (samples x disease terms, values in {0,1})
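The term filter and the binary label matrix Y can be illustrated with a short sketch. It assumes that a term is kept only if it passes both thresholds (at least 100 samples and at least 5 datasets), which is one reading of the slide. The input layout, a list of (sample id, dataset id, set of disease terms) tuples, and the name `build_label_matrix` are hypothetical.

```python
from collections import defaultdict


def build_label_matrix(annotations, min_samples=100, min_datasets=5):
    """Filter rare disease terms and build a {0,1} label row per sample.

    annotations: iterable of (sample_id, dataset_id, set_of_terms).
    Returns (kept_terms, Y) where Y maps sample_id -> list of 0/1 labels.
    """
    annotations = list(annotations)
    samples_per_term = defaultdict(set)
    datasets_per_term = defaultdict(set)
    for sample_id, dataset_id, terms in annotations:
        for term in terms:
            samples_per_term[term].add(sample_id)
            datasets_per_term[term].add(dataset_id)
    # Keep a term only if it passes both thresholds (an assumption).
    kept = sorted(
        t for t in samples_per_term
        if len(samples_per_term[t]) >= min_samples
        and len(datasets_per_term[t]) >= min_datasets
    )
    Y = {
        sample_id: [int(t in terms) for t in kept]
        for sample_id, _, terms in annotations
    }
    return kept, Y


# Toy annotations with made-up identifiers and lowered thresholds.
toy = [
    ("s1", "D1", {"cancer"}),
    ("s2", "D1", {"cancer", "leukemia"}),
    ("s3", "D2", {"cancer"}),
    ("s4", "D2", set()),
]
terms, Y = build_label_matrix(toy, min_samples=2, min_datasets=2)
```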
  • 13. Multi-label classification algorithms
      - Learn a single classifier for each disease
          - Ignore class dependencies
      - Adaptation: Bayesian Correction
          - Learn single classifiers
          - Correct errors using the DO DAG
      - Transformation: use the label power sets and learn a multiclass model
      - Using RF: multi-label trees
          - Was better than most approaches in an experimental study (Madjarov et al. 2012)
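To make the difference between the first and third strategies concrete, the sketch below shows only the target transformations, not the learners themselves. Training one classifier per disease amounts to splitting Y into one binary target per column, while the label power set approach maps each distinct row of Y to a single multiclass label. The function names are hypothetical, and the Bayesian Correction and multi-label tree variants are not covered here.

```python
def binary_relevance_targets(Y):
    """One binary target vector per disease term (per column of Y)."""
    return [[row[j] for row in Y] for j in range(len(Y[0]))]


def label_powerset(Y):
    """Encode each distinct label combination as one multiclass label.

    Returns (encoded, decode) where decode maps a class id back to
    the original 0/1 label row.
    """
    class_of = {}
    encoded = []
    for row in Y:
        key = tuple(row)
        # Assign the next free class id the first time a combination appears.
        encoded.append(class_of.setdefault(key, len(class_of)))
    decode = {cid: list(key) for key, cid in class_of.items()}
    return encoded, decode


# Toy label matrix: 4 samples x 2 disease terms.
Y = [[1, 0], [0, 1], [1, 0], [1, 1]]
columns = binary_relevance_targets(Y)
classes, decode = label_powerset(Y)
```

The per-column targets ignore dependencies between terms, whereas the power set encoding keeps them at the cost of a class count that can grow with every new label combination.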
  • 14. How to validate a classifier?
      - Use leave-dataset-out cross-validation
      - Global AUC scores: each prediction Pij vs. the correct label Yij
      - Disease-based AUC scores: consider each column separately
      - Figure: the output of a multi-label learner on the test set, Y (samples x disease terms, {0,1}) and P (samples x probabilities, [0,1])
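The validation scheme above could be sketched as follows, using only the standard library. The fold generator holds out all samples of one dataset at a time, and the AUC is computed with the pairwise (Mann-Whitney) definition, counting ties as one half. All function names and the toy values are hypothetical; the classifier that would produce P on each held-out fold is left out.

```python
def auc(labels, scores):
    """ROC AUC as the probability that a positive outscores a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # undefined without both classes
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def leave_dataset_out(dataset_ids):
    """Yield (held_out_dataset, train_indices, test_indices) per dataset."""
    for held_out in sorted(set(dataset_ids)):
        test = [i for i, d in enumerate(dataset_ids) if d == held_out]
        train = [i for i, d in enumerate(dataset_ids) if d != held_out]
        yield held_out, train, test


def global_auc(Y, P):
    """Pool every (Yij, Pij) pair into one AUC."""
    return auc([y for row in Y for y in row], [p for row in P for p in row])


def per_disease_auc(Y, P):
    """One AUC per column (disease term)."""
    return [
        auc([row[j] for row in Y], [row[j] for row in P])
        for j in range(len(Y[0]))
    ]


# Toy test-set output: 3 samples x 2 disease terms.
Y = [[1, 0], [0, 1], [0, 0]]
P = [[0.9, 0.2], [0.1, 0.8], [0.3, 0.4]]
folds = list(leave_dataset_out(["D1", "D1", "D2"]))
g = global_auc(Y, P)
per_term = per_disease_auc(Y, P)
```

Splitting by dataset rather than by sample keeps samples from the same study out of both the training and the test side of a fold.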
  • 15. A problem (!)
      - What is in the background?
      - For a disease D define:
          - Positives: disease samples
          - Negatives: direct controls
          - Background controls (BGCs)
      - Example: 500 positives, 500 negatives, 10000 BGCs
  • 16. Multistep validation
      - It is recommended to use several scores (Lee et al. 2013)
      - Measure global AUPR
      - For each disease we calculate three scores:

        Measure | Used (additional) information
        AUPR: check separation between positives and all others | Sick vs. not sick
        ROC: test for separation between positives and negatives | Direct use of negatives
        Meta analysis p-value: calculate the overall separation significance within the original datasets (a p-value) | Mapping of samples to datasets
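The first two per-disease scores can be illustrated with a small sketch that treats background controls differently in each measure. AUPR is approximated here by average precision, with positives scored against all other samples; the ROC AUC uses only positives and direct negatives. The role labels "pos", "neg", and "bgc", the function names, and the toy probabilities are hypothetical, and the meta-analysis p-value is not included.

```python
def average_precision(labels, scores):
    """Average precision, a common estimate of the area under the PR curve."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for i, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / i  # precision at this positive's rank
    return total / hits if hits else None


def roc_auc(labels, scores):
    """Pairwise ROC AUC; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def disease_scores(roles, probs):
    """roles[i] is 'pos', 'neg' (direct control) or 'bgc' (background)."""
    # AUPR: positives vs. everything else, background controls included.
    aupr = average_precision([int(r == "pos") for r in roles], probs)
    # ROC: positives vs. direct negatives only.
    direct = [(r, p) for r, p in zip(roles, probs) if r != "bgc"]
    roc = roc_auc([int(r == "pos") for r, _ in direct], [p for _, p in direct])
    return {"aupr": aupr, "roc": roc}


# Toy predictions for one disease column.
result = disease_scores(
    ["pos", "neg", "bgc", "pos", "bgc"],
    [0.9, 0.2, 0.95, 0.6, 0.1],
)
```

In the toy data a background control receives the highest probability, which lowers the AUPR but leaves the positives vs. negatives ROC untouched.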
  • 17. Performance results
      - Figure axes: positives vs. negatives ROC; AUPR
      - Filled boxes: meta analysis q-value < 0.001
  • 18. Performance results
      - 8.5% improvement in recall and 12% in precision, compared to Huang et al.
  • 19. Validation on RNA-Seq
      - Data from TCGA: 1,699 samples
  • 20. Pathway-Disease network
      - Steps (for each of the selected diseases):
          1. Disease-pathway edges
              1. RF importance: select the top features
              2. Test for disease relevance
          2. Add edges between diseases
              1. Use the DO structure
          3. Add edges between pathways
              1. Based on significant overlap in genes
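Step 3 can be illustrated with a one-sided hypergeometric test on the gene overlap of two pathways. The slide only says "significant overlap", so the choice of test, the significance threshold `alpha`, and the absence of a multiple-testing correction are all assumptions of this sketch, as are the function names. The universe size would be the number of genes covered by the pathway databases.

```python
from itertools import combinations
from math import comb


def overlap_pvalue(genes_a, genes_b, universe_size):
    """P(overlap >= observed) when genes_b is drawn at random from the universe."""
    k = len(genes_a & genes_b)
    n_a, n_b = len(genes_a), len(genes_b)
    upper = min(n_a, n_b)
    tail = sum(
        comb(n_a, i) * comb(universe_size - n_a, n_b - i)
        for i in range(k, upper + 1)
    )
    return tail / comb(universe_size, n_b)


def pathway_edges(pathways, universe_size, alpha=0.05):
    """Return (pathway_a, pathway_b, p) for pairs with significant overlap."""
    edges = []
    for (name_a, a), (name_b, b) in combinations(sorted(pathways.items()), 2):
        p = overlap_pvalue(a, b, universe_size)
        if p <= alpha:
            edges.append((name_a, name_b, p))
    return edges


# Toy pathways over a universe of 6 genes (integers stand in for gene ids).
toy_pathways = {"P1": {1, 2, 3}, "P2": {1, 2, 3}, "P3": {4, 5, 6}}
edges = pathway_edges(toy_pathways, universe_size=6, alpha=0.1)
```

With many pathways the number of pairs grows quadratically, so a correction for multiple testing would normally be applied before drawing edges.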
  • 24. Summary
      - Large scale integration
      - Multi-label learning
      - Careful validation
      - Pathway based features as biomarkers
      - Summary of the results in a network
      - Currently
          - Add genes: overcome missing values
          - Shows improvement in validation