May 2016 NCI Cancer Center Directors meeting. Data Sharing and the Cancer Genomic Data Commons (GDC). Focus is on cancer genomic and clinical phenotype data.
4. NIH Genomic Data Sharing Policy
https://gds.nih.gov/
Went into effect January 25, 2015
NCI guidance:
http://www.cancer.gov/grants-training/grants-management/nci-policies/genomic-data
Requires public sharing of genomic data sets
8. Changing the conversation around data sharing
How do we find data, software, standards?
How can we make data, software, metadata accessible?
How do we reuse data standards?
How do we make more data machine readable?
Assumption:
Data sharing enhances reusability and reproducibility
10. Genomic Data Commons – Cheat Sheet
Is a Data Sharing Platform for Cancer Genomic Data
A place for well-characterized cancer genomic data
Spans from yeast and worm cancer biology to clinical trials
Is built on the latest computer science and computer engineering principles: object stores, graph databases
Is the foundation for a Cancer Knowledgebase
Enables the publication (through Digital Object Identifiers, DOIs) of data sets, analyses, and annotations of any data in the GDC
Every object in the GDC is machine readable and supports FAIR
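To make "machine readable and supports FAIR" concrete, here is a minimal Python sketch of what one such record could look like. It is an illustration only: the `DataObject` class, its field names, the placeholder DOI, and the example URL are all assumptions made for this sketch, not the GDC data model or API.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class DataObject:
    """Hypothetical metadata record; field names are illustrative, not the GDC schema."""
    object_id: str      # persistent identifier such as a DOI (Findable)
    access_url: str     # location of the bytes, e.g. in an object store (Accessible)
    data_format: str    # community file format (Interoperable)
    license: str        # terms of reuse (Reusable)
    links: dict = field(default_factory=dict)  # graph-style edges to related records


def missing_fair_fields(obj: DataObject) -> list:
    """Return the names of required fields that are empty."""
    required = ["object_id", "access_url", "data_format", "license"]
    return [name for name in required if not getattr(obj, name)]


# Placeholder values throughout; none of these identifiers are real.
record = DataObject(
    object_id="doi:10.0000/example",
    access_url="https://objects.example.org/abc123",
    data_format="BAM",
    license="",
    links={"derived_from": "sample-0001"},
)

print(json.dumps(asdict(record), indent=2))  # machine-readable serialization
print(missing_fair_fields(record))           # fields still needed before publication
```

In this sketch the `links` dictionary stands in for edges in a graph database, and `access_url` stands in for a pointer into an object store.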
11. The Cancer Genomic Data Commons (GDC) is an existing effort to standardize and simplify submission of genomic data to NCI and to follow the FAIR principles: Findable, Accessible, Interoperable, Reusable.
The GDC is part of the NIH Big Data to Knowledge (BD2K) initiative and an example of the NIH Commons.
Genomic Data Commons
Microattribution, nanopublications, and tracking of the use of data, annotations, and algorithms support the data/software/metadata life cycle. They provide credit for, and allow analysis of the impact of, data, software, analytics, algorithms, curation, and knowledge sharing.
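One way to picture usage tracking for credit is a simple tally over usage events. The Python sketch below is a toy illustration under stated assumptions: the event tuples, the identifiers, and the `credit_by_contributor` helper are all hypothetical and do not describe how the GDC records or attributes use.

```python
from collections import Counter

# Hypothetical usage events: (object identifier, credited contributor, kind of use).
events = [
    ("dataset-A", "lab-1", "download"),
    ("dataset-A", "lab-1", "cited"),
    ("algorithm-X", "lab-2", "run"),
    ("annotation-7", "curator-3", "cited"),
]


def credit_by_contributor(usage_events):
    """Count how many recorded uses are attributable to each contributor."""
    return Counter(contributor for _, contributor, _ in usage_events)


credit = credit_by_contributor(events)
print(credit.most_common())
```

The same tally could be grouped by object identifier or by kind of use to compare the impact of data sets, algorithms, and annotations.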
16. Development of the NCI Genomic Data Commons (GDC)
To Foster the Molecular Diagnosis and Treatment of Cancer
GDC
Bob Grossman PI
Univ. of Chicago
Ontario Inst. Cancer Res.
Leidos
17. Support the Precision Medicine Initiative
• Integrate GDC with Cloud
• Expand data model to include other data (e.g. imaging and proteomics)
The Genomic Data Commons and Cloud Pilots
21. Can we use the GDC to build a sustainability model for data consortia that include genomic data, such as GENIE and ORIEN, and make the data open?
22. How do we move toward a ‘universal consent’ and tissue/data acquisition protocol?
Assumption: this would simplify access to data
23. Does NCI need to, or should NCI, help Cancer Centers build a cadre of people who understand the GDC, can transform and deposit data in the GDC, and can contribute knowledge and algorithms to the GDC and Cloud Pilots?
If so, how?
24. What would an appropriate incentive for data deposition in the GDC look like for Cancer Centers?
What clinical data makes a case in the GDC most valuable?