The document is about the cBio Cancer Genomics Portal, which is an open platform for exploring multidimensional cancer genomics data. The portal provides integrated access to genomic data types, clinical data, and biological pathways from cancer studies. It uses a web-based interface for iterative exploratory data analysis. Some key features include visualization of discrete genomic events, survival analysis, pathway analysis, and network analysis.
The cBio Cancer Genomics Portal: An Open Platform for Exploring Cancer Genomics Data
1. The cBio Cancer Genomics Portal: An Open Platform for
Exploring Multidimensional Cancer Genomics Data
Ethan Cerami, Ph.D.
Director, Cancer Informatics Development http://cbioportal.org
Computational Biology Center (cBio)
Memorial Sloan-Kettering Cancer Center
CBIIT Talk
May 23, 2012
Outline:
• Motivation: Pathway Analysis
• cBio Cancer Genomics Portal: Introduction, Examples of Usage, Network Analysis, Advanced Options, Web API / R Package
• TCGA Ecosystem & Future Plans
Friday, May 18, 12
2. The Cancer Genome Atlas (TCGA) Project
MSKCC Genome Data
Analysis Center (GDAC)
3. Pathway Analysis
Patient Cohort
Genomic Inputs (Genomic Alterations):
• Single Nucleotide Variants
• Small Insertions and Deletions
• Copy Number Alterations
• mRNA and microRNA Expression Changes
• DNA Methylation
Pathway Analysis:
• PI3K Pathway
• TP53 Pathway
• Copy number altered genes with correlated gene expression
• Epigenetically silenced genes
Pathway and Network Data:
• Metabolic Pathways
• Signaling Pathways
• Protein-Protein Interactions
• Regulatory Networks
• Drug-Target Networks
[Figure: example metabolic pathway diagram with chemical structures and EC numbers (N-Ac-Neuraminate, UDP-N-Ac-Muramate, N-Ac-Mannosamine-6-P, pyruvate)]
4. Comprehensive genomic characterization defines human glioblastoma genes and core pathways
The Cancer Genome Atlas Research Network
Nature 455, 1061-1068 (23 October 2008)
5. Homologous Repair (HR) Alterations
BRCA Altered Cases, N=103 (33%)
• BRCA1 and BRCA2: Germline Mutation, Somatic Mutation, Epigenetic Silencing via Hypermethylation
HR Pathway: 51% of cases altered
• DNA damage sensors
• ATM: 1% mutated
• ATR: <1% mutated
• BRCA1: 23% mutated / hypermethyl.
• EMSY: 8% amplified / mutated
• FA core complex: 5% mutated
• FANCD2: <1% mutated
• BRCA2: 11% mutated
• RAD51C: 3% hypermethyl.
• PTEN: 7% deleted
• HR-mediated repair
Patient Survival (log-rank test p-value: 0.0008602; x-axis: Months Survival, 0 to 150)
• BRCA Mutated [66]
• BRCA1 Epigenetically Silenced [33]
• BRCA Wildtype [212]
Integrated genomic analyses of ovarian carcinoma
The Cancer Genome Atlas Research Network
Nature 474, 609–615 (30 June 2011)
6. Pathway Analysis
Patient Cohort
Genomic Inputs (Genomic Alterations), labeled "cBio Cancer Genomics Portal":
• Single Nucleotide Variants
• Small Insertions and Deletions
• Copy Number Alterations
• mRNA and microRNA Expression Changes
• DNA Methylation
Pathway Analysis:
• PI3K Pathway
• TP53 Pathway
• Copy number altered genes with correlated gene expression
• Epigenetically silenced genes
Pathway and Network Data, labeled "Pathway Commons":
• Metabolic Pathways
• Signaling Pathways
• Protein-Protein Interactions
• Regulatory Networks
• Drug-Target Networks
[Figure: example metabolic pathway diagram with chemical structures and EC numbers (N-Ac-Neuraminate, UDP-N-Ac-Muramate, N-Ac-Mannosamine-6-P, pyruvate)]
7. cBio Cancer Genomics Portal Web-Based Interface for Iterative Exploratory Data Analysis
Input: Comprehensive Cancer Genomic Studies
Integration of Genomic Data Types, Clinical Data, and Biological Pathways:
• Mutations
• Copy Number
• mRNA Expression
• DNA Methylation
• Protein / Phosphoprotein
• Clinical Survival
• Biological Pathways
The cBio Cancer Genomics Portal provides:
• OncoPrint: Compact Visualization of Discrete Genomic Events (Gene A, Gene B, Gene C, ...)
• Survival Analysis
• Network Analysis
• Mutation Details: Predicted Functional Impact of Mutations
• Multidimensional Genomic Data Plots
• Other Reports
• Web-Service Interface, R-Package, MATLAB ToolBox
Outcomes: Biological Insight, Clinical Trial Design
Cerami, et al., Cancer Discovery (May, 2012)
8. cBio Portal in Context
cBio Portal in Context
• Other Portals available:
• TCGA Data Portal
• ICGC Data Portal
• UCSC Cancer Genome Browser
• cBio Portal:
• Supports Exploratory Data Analysis
• Lowers the barrier to access, specifically for biologists and clinical researchers
• Provides integrated access to data
9. Multiple Portals
Multiple Portals
• Public Portal: http://www.cbioportal.org/
  • Contains published TCGA studies + a few other studies.
  • Now also contains public copy number, mRNA, and RPPA data for all TCGA tumor types (everything except mutation data).
  • Open Access.
• TCGA Portal: http://cbio.mskcc.org/gdac-portal/
  • Contains all provisional TCGA data, updated monthly.
  • Requires a user name / password.
  • Register at: http://bit.ly/gdac-form.
• Stand-Up to Cancer (SU2C) Portal
10. 4-Step Web Interface
4-step web interface for querying a single cancer study
Tabs: Query, Download Data
Step 1: Select a Cancer Study or “All Cancer Studies”
  Select Cancer Study: Glioblastoma (TCGA)
  The Cancer Genome Atlas (TCGA) Glioblastoma project. 206 primary glioblastoma samples. Nature 2008. Raw data via the TCGA Data Portal.
Step 2: Select one or more genomic profiles (for example: Mutation and Copy Number Data)
  Select Genomic Profiles:
  • Mutations
  • Copy Number Data. Select one of the profiles below:
    • Putative copy-number alterations (GBM Pathways)
    • Putative copy-number alterations (RAE)
  • mRNA Expression z-Scores
Step 3: Select a Patient Set
  Select Patient/Case Set: All Complete Tumors (seq, mRNA, CNA)
Step 4: Enter a Gene or Gene Set
  Enter Gene Set: RB1 CDK4 CDKN2A
  Advanced: Onco Query Language (OQL)
  Or Select from Example Gene Sets: User-Defined List
Optional Arguments: optional argument to compute mutual exclusivity / co-occurrence between all pairs of genes. (Not recommended for more than 10 genes.)
Submit
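The optional mutual exclusivity / co-occurrence argument can be illustrated with a small Python sketch. The slide does not say which statistic the portal computes, so the log odds ratio and two-sided Fisher's exact test used here are assumptions (one common choice for 2x2 alteration tables), and the function names and toy cohort are hypothetical.

```python
from math import comb, log


def contingency_counts(gene_a, gene_b):
    """Count cases by alteration status of two genes.

    gene_a, gene_b: equal-length lists of booleans, one entry per case,
    True if the gene is altered in that case.
    """
    both = sum(1 for a, b in zip(gene_a, gene_b) if a and b)
    only_a = sum(1 for a, b in zip(gene_a, gene_b) if a and not b)
    only_b = sum(1 for a, b in zip(gene_a, gene_b) if b and not a)
    neither = len(gene_a) - both - only_a - only_b
    return both, only_a, only_b, neither


def log_odds_ratio(both, only_a, only_b, neither, pseudo=0.5):
    """Negative: tendency toward mutual exclusivity; positive: co-occurrence.

    A pseudocount (Haldane-Anscombe correction) avoids division by zero.
    """
    return log(((both + pseudo) * (neither + pseudo))
               / ((only_a + pseudo) * (only_b + pseudo)))


def fisher_exact_two_sided(both, only_a, only_b, neither):
    """Two-sided Fisher's exact test p-value for the 2x2 table."""
    n = both + only_a + only_b + neither
    altered_a = both + only_a
    altered_b = both + only_b

    def prob(k):
        # Hypergeometric probability of k cases altered in both genes.
        return (comb(altered_b, k) * comb(n - altered_b, altered_a - k)
                / comb(n, altered_a))

    k_min = max(0, altered_a + altered_b - n)
    k_max = min(altered_a, altered_b)
    p_observed = prob(both)
    # Sum all tables that are at most as likely as the observed one.
    return sum(prob(k) for k in range(k_min, k_max + 1)
               if prob(k) <= p_observed * (1 + 1e-9))


# Hypothetical toy cohort of six cases (not real TCGA data).
cdk4_altered = [True, True, True, False, False, False]
cdkn2a_altered = [False, False, False, True, True, True]

counts = contingency_counts(cdk4_altered, cdkn2a_altered)
print("counts:", counts)
print("log odds ratio:", log_odds_ratio(*counts))
print("p-value:", fisher_exact_two_sided(*counts))
```

Because every pair of genes gets its own test, the number of comparisons grows quadratically with the gene set, which is consistent with the slide's advice to keep the list to about 10 genes.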
11. Main Features
12. Key Abstraction: Discrete Genomic-Level Events
• Each gene within each sample is assigned multiple discrete genomic-level events:
  • Mutations: Mutated or WT.
  • Copy Number: Amplification, Homozygous Deletion, etc.
• Important caveats:
  • Portal does not provide confidence intervals for mutations.
  • Copy number calls (as determined by GISTIC or RAE) are putative.
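The discrete-event abstraction can be sketched as a small data structure: one record per gene per sample, from which an alteration frequency follows directly. This is an illustrative Python sketch, not the portal's implementation; the `Event` type, the copy number labels, and the toy samples are hypothetical.

```python
from collections import namedtuple

# One discrete record per gene per sample (hypothetical representation).
# mutated: True if mutated, False if wild type (WT).
# copy_number: "AMP", "HOMDEL", or None when no such call was made.
Event = namedtuple("Event", ["mutated", "copy_number"])

# Hypothetical toy data: gene -> sample id -> Event.
events = {
    "RB1": {
        "S1": Event(mutated=True, copy_number=None),
        "S2": Event(mutated=False, copy_number="HOMDEL"),
        "S3": Event(mutated=False, copy_number=None),
        "S4": Event(mutated=False, copy_number=None),
    },
    "CDK4": {
        "S1": Event(mutated=False, copy_number=None),
        "S2": Event(mutated=False, copy_number=None),
        "S3": Event(mutated=False, copy_number="AMP"),
        "S4": Event(mutated=False, copy_number=None),
    },
}


def sample_is_altered(event):
    """A sample counts as altered if any discrete event is present."""
    return event.mutated or event.copy_number in ("AMP", "HOMDEL")


def gene_alteration_frequency(gene_events):
    """Percentage of samples in which the gene has at least one event."""
    altered = sum(1 for e in gene_events.values() if sample_is_altered(e))
    return 100.0 * altered / len(gene_events)


for gene, per_sample in events.items():
    print(gene, gene_alteration_frequency(per_sample))
```

Note that the record carries no confidence value, which mirrors the caveat above: a call is either present or absent, and copy number calls remain putative.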
22. Cross-Cancer Queries
How do PI3K alterations vary across ovarian and endometrial cancers?
24. Pathway Commons
Data sources: MSKCC Cancer CellMap, NCI Nature PID, HumanCyc, Reactome, IMID, MINT, IntAct, HPRD, BioGrid
ID Mapping: UniProt, Entrez Gene, RefSeq
Data formats: BioPAX, PSI-MI
Access: Web Service, Pathway Web Site, Batch Download
http://www.pathwaycommons.org
Pathway Commons, a web resource for biological pathway data.
Cerami, et al., Nucleic Acids Res. 2011
26. Network View for BRCA1/BRCA2 in TCGA Ovarian Cancer
A. Network View for BRCA1/BRCA2 in TCGA Ovarian Cancer
B. Node Legend
• Thick Border: seed gene
• Thin Border: linker gene
• Copy Number: Amplification, Gain, Hemizygous Deletion, Homozygous Deletion
• mRNA Expression: Up-Regulated, Down-Regulated
• Mutation: Mutated
• Alteration Frequency (%): 0 to 100
C. Interaction Legend
• In Same Component
• Reacts With
• State Change
• Other
• Merged (multiple types)
D. Network Filtering, Cropping and Searching
• Filter Edges by Interaction Type and/or Data Source
• Filter Neighbors by Alteration (%)
• Show only selected, Hide selected, Show all
• Search by Gene Symbol
Collaboration with Ugur Dogrusoz, Bilkent University; separately funded by National Resource for Network Biology (NRNB) grant.
27. Recently Added: RPPA Analysis
Ovarian Cancer
Gene Set: PTEN
28. OncoQuery Language (OQL)
Columns: Steps 1-3, Step 4: Onco Query, Description, OncoPrint Output

A) Onco Query Examples: Copy Number and Mutations
Steps 1-3: User selects TCGA Ovarian Cancer, with genomic profiles Mutations (next-gen) and Putative CNA (GISTIC), and case set All Complete Tumors.
• Onco Query: RB1
  Default. Shows putative amplifications, homozygous deletions, and mutations.
• Onco Query: RB1: MUT
  Shows only mutations.
• Onco Query: RB1: HOMDEL MUT
  Shows putative homozygous deletions and mutations.

B) Onco Query Examples: mRNA Expression Data
Steps 1-3: User selects TCGA GBM, with genomic profile mRNA Expression (Z-Scores), and case set All Complete Tumors.
• Onco Query: PTEN
  Default. Shows mRNA up/down-regulation at least 2 standard deviations from the mean.
• Onco Query: PTEN: EXP < -1
  Shows only down-regulated mRNA events more than 1 standard deviation below the mean.

OncoPrint legend: Putative Copy Number Amplification, Putative Homozygous Deletion, mRNA up-regulation, mRNA down-regulation, Mutation
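The query forms on this slide can be mimicked with a short Python sketch. It handles only the four example shapes shown above (a bare gene, `MUT`, `HOMDEL MUT`, and `EXP < -1`) and is not the portal's OQL parser; the `GeneQuery` class, the `AMP` keyword for amplification, and the function names are assumptions made for illustration.

```python
import operator
import re
from dataclasses import dataclass, field

# Defaults as described on the slide: putative amplifications, homozygous
# deletions, and mutations; mRNA at least 2 standard deviations from the mean.
DEFAULT_DISCRETE = {"AMP", "HOMDEL", "MUT"}  # "AMP" keyword is an assumption
DEFAULT_Z = 2.0

OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}
EXP_RULE = r"EXP\s*(<=|>=|<|>)\s*(-?\d+(?:\.\d+)?)"


@dataclass
class GeneQuery:
    gene: str
    discrete: set = field(default_factory=set)      # e.g. {"MUT", "HOMDEL"}
    exp_rules: list = field(default_factory=list)   # e.g. [("<", -1.0)]


def parse_oql_line(line):
    """Parse one query line such as 'RB1: HOMDEL MUT' or 'PTEN: EXP < -1'."""
    gene, _, spec = line.partition(":")
    gene, spec = gene.strip().upper(), spec.strip()
    if not spec:
        # Bare gene symbol: fall back to the default rules.
        return GeneQuery(gene, set(DEFAULT_DISCRETE),
                         [(">=", DEFAULT_Z), ("<=", -DEFAULT_Z)])
    query = GeneQuery(gene)
    for op, value in re.findall(EXP_RULE, spec):
        query.exp_rules.append((op, float(value)))
    # Whatever remains after removing expression rules is a discrete keyword.
    remainder = re.sub(EXP_RULE, " ", spec)
    query.discrete = {token.upper() for token in remainder.split()}
    return query


def is_altered(query, mutated=False, cna=None, z_score=None):
    """Decide whether one sample counts as altered under the query."""
    if mutated and "MUT" in query.discrete:
        return True
    if cna is not None and cna in query.discrete:
        return True
    if z_score is not None:
        return any(OPS[op](z_score, threshold)
                   for op, threshold in query.exp_rules)
    return False


rb1 = parse_oql_line("RB1: HOMDEL MUT")
pten = parse_oql_line("PTEN: EXP < -1")
print(rb1)
print(is_altered(rb1, cna="AMP"), is_altered(pten, z_score=-1.5))
```

The sketch makes the design choice on the slide visible: a bare gene symbol expands to a default rule set, while any explicit qualifier replaces the defaults rather than adding to them.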
29. Endometrial Cancer: PIK3CA
[Figure: panels A, B, and C for PIK3CA in endometrial cancer]
30. Web Service API and R/MATLAB Packages
• Access via Web API
• Access via R Package and MATLAB Library
A) Example Query: Retrieve all Cancer Studies
http://www.cbioportal.org/public-portal/webservice.do?cmd=getCancerStudies
Output
cancer_type_id name description
tcga_gbm Glioblastoma (TCGA) ...
mskcc_prad Prostate Cancer (MSKCC) ...
mskcc_broad_sarc Sarcoma (MSKCC/Broad) ...
tcga_ova Serous Ovarian Cancer (TCGA) ...
B) Example Query: Retrieve Copy Number Data for CCNE1 in TCGA Ovarian Cancer
http://www.cbioportal.org/public-portal/webservice.do?cmd=getProfileData&case_set_id=ova_all&genetic_profile_id=ova_gistic&gene_list=CCNE1
Query parameters:
• cmd=getProfileData: Get Genomic Profile Data
• case_set_id=ova_all: Restrict to all TCGA Ovarian Cancer Samples
• genetic_profile_id=ova_gistic: Retrieve Copy Number (GISTIC) Data
• gene_list=CCNE1: Gene List
Output
GENE_ID COMMON TCGA-04-1331 TCGA-04-1332 TCGA-04-1336 TCGA-04-1337
898 CCNE1 1 1 0 0
Putative Copy Number Status
+2 Amplification
+1 Gain
0 Diploid
-1 Hemizygous Deletion
-2 Homozygous Deletion
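A client for the two example queries might look like the following Python sketch. It only builds the query URL and parses a response string, without making a network call. The URL and parameter names are copied from the slide; the tab-delimited layout, the handling of `#` comment lines, and the function names are assumptions, and the sample text reuses the four values shown in the slide's output.

```python
import csv
import io
from urllib.parse import urlencode

# Endpoint as shown on the slide (it may no longer be live).
BASE_URL = "http://www.cbioportal.org/public-portal/webservice.do"

# Putative copy number status codes, as listed on the slide.
GISTIC_STATUS = {
    2: "Amplification",
    1: "Gain",
    0: "Diploid",
    -1: "Hemizygous Deletion",
    -2: "Homozygous Deletion",
}


def build_query_url(cmd, **params):
    """Build a web service URL for a command and its parameters."""
    return BASE_URL + "?" + urlencode({"cmd": cmd, **params})


def parse_profile_data(text):
    """Parse getProfileData output into {gene: {case_id: status}}.

    Assumes tab-delimited text with a header row; lines starting with '#'
    are treated as comments (an assumption about the response format).
    """
    lines = [ln for ln in text.splitlines()
             if ln.strip() and not ln.startswith("#")]
    reader = csv.DictReader(io.StringIO("\n".join(lines)), delimiter="\t")
    result = {}
    for row in reader:
        result[row["COMMON"]] = {
            case_id: GISTIC_STATUS[int(value)]
            for case_id, value in row.items()
            if case_id not in ("GENE_ID", "COMMON")
        }
    return result


url = build_query_url(
    "getProfileData",
    case_set_id="ova_all",
    genetic_profile_id="ova_gistic",
    gene_list="CCNE1",
)

# Response text reconstructed from the values shown on the slide.
sample_response = (
    "GENE_ID\tCOMMON\tTCGA-04-1331\tTCGA-04-1332\tTCGA-04-1336\tTCGA-04-1337\n"
    "898\tCCNE1\t1\t1\t0\t0\n"
)

print(url)
print(parse_profile_data(sample_response))
```

To run it against a live server, the response body could be fetched with `urllib.request.urlopen(url)` and decoded before being passed to `parse_profile_data`; a real client would also need to handle missing or non-integer values.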
31. R and MATLAB Packages
• Access portal data within R via the CGDS-R package.
• Available via CRAN.
• Vignette and Reference PDF available.
R Package maintained by Anders Jacobsen; MATLAB package maintained by Erik Larsson.
32. Integrating with the Cancer Genome Atlas Project (TCGA)
[Diagram: Data Coordination Center (DCC), Broad GDAC Firehose, cBio Portal(s), TCGA Researchers, TCGA Disease Working Groups, All Data...]
33. TCGA Ecosystem
• Data Coordination Center (DCC) @ NCI: Central repository for all TCGA data.
• Analysis Working Groups: Generates freeze lists, subtypes, and other case lists.
• Firehose @ Broad: Pipeline for processing all TCGA data.
• Mutation Assessor @ cBio: Predicted functional consequences of mutations in cancer.
• cBio Portal @ MSKCC: Open platform for exploring, mining and visualizing TCGA data (example gene set: RB1, CDK4, CDKN2A).
• UCSC Cancer Genome Browser: Web portal for exploring TCGA genomic, clinical, and image data.
• Tools at ISB: Regulome Explorer, ...
• Oncotator @ Broad: Web application for annotating human genomic point mutations and indels with data relevant to cancer researchers.
• Integrative Genomics Viewer (IGV) @ Broad: High-performance visualization tool for interactive exploration of large, integrated genomic datasets.
Connections shown between components:
• Multidimensional genomic profiling data
• Freeze lists, subtypes, and other case lists
• Download of Firehose Data via DCC
• Web API
• User Cross Links (Beta)
• Web API for IGV and Network Visualization
Legend: Implemented, Work In Progress, Proposed / Planned
34. Planned Features
• Adding Drugs and Drug Targets to the network view.
• Adding clinical features and new sort features to the OncoPrint, e.g. group/sort by MSI-Status or Histological Grade, etc.
• Improved analysis and visualization of RPPA (collaboration with Gordon Mills).
• Integration of mutation and copy number algorithm results, e.g. MutSig and GISTIC.
• Full support for DNA methylation events.
• [your idea here...]
35. Open Source
• Portal software open source (GNU Lesser GPL).
• Available on Google code:
• http://code.google.com/p/cbio-cancer-genomics-portal/
• Amazon Machine Image (AMI) also available.
• Upstream pre-processing activities required before data can be imported into the portal:
• Mutation data finalization and format.
• Discrete copy number data, e.g. GISTIC algorithm.
• Case lists.
• Some of this is currently handled by the TCGA Broad Firehose.
36. Acknowledgements
• cBio Portal
  • Nikolaus Schultz
  • Benjamin Gross
  • Arthur Goldberg
  • Caitlin Byrne
  • Anders Jacobsen
  • Jianjiong Gao
  • Erik Larsson
  • Selcuk Onur Sumer, Bilkent University
  • Sinan Sonlu, Bilkent University
  • Ugur Dogrusoz, Bilkent University
  • Chris Sander
• Collaborators:
  • Broad Firehose Team
  • The TCGA Project Team
• Pathway Commons:
  • Benjamin Gross
  • Emek Demir
  • Igor Rodchenkov, U. Toronto
  • Ozgün Babur
  • Nadia Anwar
  • Nikolaus Schultz
  • Gary D. Bader, U. Toronto
  • Chris Sander