1.
Platform for Imaging in Precision Medicine —
The PRISM Project (U24CA215109)
Fred Prior, PhD
Dept. of Biomedical Informatics
Univ. of Arkansas for Medical Sciences
Ashish Sharma, PhD
Dept. of Biomedical Informatics
Emory University
Joel H. Saltz, MD, PhD
Dept. of Biomedical Informatics
Stony Brook University
3.
TCIA Collections are
Growing Rapidly
• TCIA now has 5 curation
teams
• The latest team was established at
IROC/Ohio
• Based on current estimates the
volume of data managed by
TCIA will increase by 5X as we
complete currently planned
projects
4.
Challenges of
Data Variety
• TCIA has multiple ways to store
non-image data
• Non-image data is difficult to reuse
• In some cases (e.g., NLST) it is
used to create data cohorts
• It is often difficult to conduct studies
that make use of non-image data in
an integrative manner.
5.
User Requirements
• Clinical data management
• One uniform management strategy for all non-image data (clinical)
• Enhance data exploration, cohort identification, visual analytics
• Image feature management
• Featurebase for Radiomics and Pathomics features
• One data representation
• Enhanced and automated data curation
• Non-image data, pathology data, feature sets
• Increased community engagement.
• Enable efficient deployment and leverage
the on-demand and elastic nature of cloud computing
6.
PRISM: Platform for Imaging in
Precision Medicine
• The TCIA technology stack is being refactored and modernized into a set of
modular tools and microservices called PRISM
• PRISM will help streamline TCIA deployment and incorporate new tools for
analysis and management of images and imaging features with clinical
context to enrich TCIA’s datasets.
• A prototype for semantic integration to help explore and find TCIA non-image
data is in development.
• New tools for Pathomics data analysis and management from
Stony Brook and Emory are being integrated.
• Some new functionality will go into both TCIA and PRISM
8.
Semantic
Query in
PRISM
• Primary focus: Overcoming
problems created by
differences in representation
of a broad spectrum of
imaging and non-image
data.
• Help integrate diverse data,
making them more
accessible and usable, and
allowing queries that span
collections.
9.
Semantic Query
Activities
• Data is stored in a graph database
using RDF as the representation
language, alongside pre-existing
biomedical ontologies.
• Allows us to write SPARQL queries
that combine information from
multiple collections.
• Development of a user-friendly
search interface is in progress.
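To make the cross-collection query idea concrete, here is a minimal sketch that stores facts from two collections as subject-predicate-object triples and answers one question spanning both. It uses only the Python standard library as a stand-in for a real graph database and SPARQL engine. The collection names, subject IDs, and predicates (`inCollection`, `hasDiagnosis`) are hypothetical and are not TCIA's actual schema or ontology terms.

```python
# Minimal in-memory triple store; a stand-in for an RDF graph database.
# Every identifier below is a hypothetical example, not a TCIA term.
TRIPLES = {
    ("subject:A1", "inCollection", "collection:X"),
    ("subject:A1", "hasDiagnosis", "dx:lung_adenocarcinoma"),
    ("subject:A2", "inCollection", "collection:X"),
    ("subject:A2", "hasDiagnosis", "dx:glioblastoma"),
    ("subject:B1", "inCollection", "collection:Y"),
    ("subject:B1", "hasDiagnosis", "dx:lung_adenocarcinoma"),
}


def match(pattern, triples=TRIPLES):
    """Yield variable bindings for one triple pattern ('?name' marks a variable)."""
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable repeated inside one pattern must bind consistently.
                if binding.get(term, value) != value:
                    break
                binding[term] = value
            elif term != value:
                break
        else:
            yield binding


def query(patterns):
    """Join several triple patterns on their shared variables."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for sol in solutions:
            # Substitute variables that earlier patterns already bound.
            bound = tuple(sol.get(term, term) for term in pattern)
            for binding in match(bound):
                next_solutions.append({**sol, **binding})
        solutions = next_solutions
    return solutions


# Roughly equivalent SPARQL (prefixes omitted):
#   SELECT ?s ?c WHERE { ?s hasDiagnosis dx:lung_adenocarcinoma . ?s inCollection ?c }
rows = query([
    ("?s", "hasDiagnosis", "dx:lung_adenocarcinoma"),
    ("?s", "inCollection", "?c"),
])
result = sorted((row["?s"], row["?c"]) for row in rows)
```

Because both collections share the same predicates, a single pattern finds matching subjects in each; with real data, that shared vocabulary is what the ontologies are meant to provide.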
10.
Digital Pathology
Fastest-growing image modality in
TCIA. PRISM uses the QuIP stack
for digital pathology collections
• PathDB — Pathology metadata
management system that uses the
DICOM standards for specimen
management
• caMicroscope — Viewer that supports the
creation of annotations, and display of
segmentations, heatmaps, and human
markup
• FeatureScape: Visual analytic system to
explore pathomic features. It includes
microservices to load and manage
pathomic features, particularly those from
deep learning algorithms
11. Pathomic
Features
e.g. Importance of Immune System in
Cancer Treatment and Prognosis
● Tumor spatial context and cellular
heterogeneity are important in cancer
prognosis
● Spatial TIL densities in different tumor
regions have been shown to have high
prognostic value – they may be
superior to the standard TNM
classification
● Immune-related assays are used to
guide checkpoint inhibitor immunotherapy
decisions in several cancer types
● TCGA Pan Cancer Immune group
publications document strong
relationships with molecular measures
of immune tumor response
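As one way to picture "spatial TIL densities in different tumor regions", the sketch below computes the fraction of TIL-positive patches inside each labeled region of a patch-level map. The region labels (`core`, `margin`), the grid values, and the function name are illustrative assumptions; this is not the feature definition used in the cited work.

```python
from collections import defaultdict


def til_density_by_region(region_map, til_map):
    """Return the fraction of TIL-positive patches within each labeled region.

    region_map: 2D list of region labels per patch (None marks background).
    til_map:    2D list of 0/1 flags, 1 where a patch was classified TIL-positive.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for region_row, til_row in zip(region_map, til_map):
        for region, is_til in zip(region_row, til_row):
            if region is None:
                continue  # background patches do not count toward any region
            totals[region] += 1
            positives[region] += is_til
    return {region: positives[region] / totals[region] for region in totals}


# Toy 2x3 patch grid with hypothetical labels.
region_map = [
    ["core", "core", "margin"],
    ["core", "margin", None],
]
til_map = [
    [0, 1, 1],
    [0, 1, 0],
]
densities = til_density_by_region(region_map, til_map)
```

Keeping the region map separate from the TIL map lets the same TIL predictions be summarized against different region definitions.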
12. FeatureMap Deployment in PRISM (Explore and Manage Features)
We are also investigating how the lessons of Pathomic Feature
Management could be applied in radiology
13. Curation/POSDA Activities
Radiology Curation
○ POSDA is a suite of tools and workflows developed for curation of DICOM Objects
○ POSDA has powerful PHI scanning/remediation capabilities
○ POSDA maintains history of all scans/modifications to all files
○ We have prototyped extensions to scan and edit non-DICOM files
○ POSDA is on the VA’s list of OS tools that may be run on the VA network
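The scan, remediate, and keep-history pattern described above can be sketched in a few lines. This is a toy illustration over a plain dictionary of header attributes, not POSDA's implementation or API. The rule set (`PHI_ATTRIBUTES`, the date pattern) and the function names are assumptions made for the example.

```python
import re

# Hypothetical rule set: attributes to blank out, plus a pattern for date-like text.
PHI_ATTRIBUTES = {"PatientName", "PatientBirthDate", "ReferringPhysicianName"}
DATE_LIKE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def scan(attributes):
    """Return a sorted list of attribute names that look like they carry PHI."""
    findings = []
    for name, value in attributes.items():
        if name in PHI_ATTRIBUTES and value:
            findings.append(name)
        elif isinstance(value, str) and DATE_LIKE.search(value):
            findings.append(name)
    return sorted(findings)


def remediate(attributes, findings, history):
    """Blank each flagged attribute and append one history record per change."""
    cleaned = dict(attributes)  # leave the caller's copy untouched
    for name in findings:
        # A real system would keep this log under access control,
        # since the old value is itself PHI.
        history.append({"attribute": name, "old": cleaned[name], "new": ""})
        cleaned[name] = ""
    return cleaned


header = {
    "PatientName": "DOE^JANE",
    "Modality": "CT",
    "StudyDescription": "Follow-up after 2018-03-04 scan",
}
history = []
findings = scan(header)
cleaned = remediate(header, findings, history)
```

Rescanning `cleaned` should return no findings, and `history` records what changed, which mirrors the idea of keeping every scan and modification traceable.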
Pathology Curation
○ This effort will be coordinated with PathDB implementation
■ PathDB will be main source of pathology image metadata
○ We are working on extending the POSDA workflow and components to support
pathology image curation
○ Incorporating QuIP and PathDB with POSDA for data curation and loading
14.
Processing
at Scale
Hint: Docker is not
the silver bullet
Pipelines Can Work (e.g. Google
Genomics, DNANexus, NCI Cloud Resources,
Globus Genomics…)
o Think multiple steps not monolithic
executable
o Stages containerized or API endpoints
o Describe workflows (WDL, CWL……)
o Rely on orchestrators capable of running
pipelines on local/cloud/hybrid
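The "multiple steps, not a monolithic executable" point can be sketched as a list of small named stages run by a tiny orchestrator. The stage names and their toy logic are hypothetical; in practice each stage might be a container or an API endpoint, and the orchestration would be described in WDL or CWL rather than in Python.

```python
def run_pipeline(stages, payload):
    """Run named stages in order, feeding each stage's output to the next."""
    log = []
    for name, stage in stages:
        payload = stage(payload)
        log.append(name)  # record which stages completed
    return payload, log


# Hypothetical stages operating on a toy 1D "image".
def tile(image):
    """Split the image into tiles of at most two values."""
    return [image[i:i + 2] for i in range(0, len(image), 2)]


def extract_features(tiles):
    """Compute one feature (the mean) per tile."""
    return [sum(t) / len(t) for t in tiles]


def summarize(features):
    """Reduce per-tile features to a small summary record."""
    return {"n_tiles": len(features), "max_feature": max(features)}


stages = [
    ("tile", tile),
    ("extract_features", extract_features),
    ("summarize", summarize),
]
summary, log = run_pipeline(stages, [1, 3, 5, 7, 9])
```

Because each stage only depends on its input, a stage can be swapped, rerun, or moved to different hardware without touching the others.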
15.
Cloudy Pipelines
Computing on the Cloud
Containerized tools/stages. Describe
your imaging pipeline in WDL
(Workflow Description Language)
The API is responsible for pulling in
data, launching parallel pipelines, gathering
results, and notifying users
Leverages work from the Genomics
community
Works on Google Cloud. AWS
support underway
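The four responsibilities listed for the API (pull data, launch parallel pipelines, gather results, notify users) can be sketched with the standard library's thread pool. The function names, the fake data, and the `notify` callback are assumptions for illustration and do not reflect the actual CloudyPipelines API.

```python
from concurrent.futures import ThreadPoolExecutor


def pull_data(image_sizes):
    """Stand-in for fetching inputs: returns fake pixel lists keyed by image ID."""
    return {image_id: list(range(n)) for image_id, n in image_sizes.items()}


def run_one_pipeline(item):
    """Stand-in for one pipeline run: sums the pixels of a single image."""
    image_id, pixels = item
    return image_id, sum(pixels)


def handle_request(image_sizes, notify):
    """Pull data, run one pipeline per image in parallel, gather, then notify."""
    data = pull_data(image_sizes)
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = dict(pool.map(run_one_pipeline, data.items()))
    notify(f"{len(results)} pipelines finished")
    return results


messages = []  # a real service might send an email or a webhook instead
results = handle_request({"img-1": 3, "img-2": 4}, messages.append)
```

On a cloud backend the thread pool would be replaced by jobs submitted to a workflow engine, but the pull, fan out, gather, notify shape stays the same.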
16.
Cloudy Pipelines
Computing on the Cloud
Various Path & Rad algorithms and
pipelines are being served by
CloudyPipelines on Google Cloud
~3000 pathology images have been
processed in the past 6 months
An ongoing PhysioNet/CinC challenge
in which participants submit algorithms as
Dockerfiles.
Private test dataset
~250 submissions
~77 teams
Ends in Aug 2019
17.
Using Dataverse
in PRISM
• Encourage data attribution
• Every TCIA collection and
derived dataset gets a DOI
(versions, related dataset,
related publications …)
• DOIs are tracked
• Metrics to compute impact/
popularity
DATAVERSE — https://dataverse.org
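One simple form the impact metric could take is a count of how many publications cite each dataset DOI. The sketch below assumes a list of publication records with a `cites` field; the record layout and the placeholder DOIs (`10.0000/...`) are hypothetical and are not real TCIA identifiers or a Dataverse API.

```python
from collections import Counter


def citation_counts(publications):
    """Count, per dataset DOI, how many publications cite it at least once."""
    counts = Counter()
    for pub in publications:
        # set() ensures a DOI repeated within one paper is counted once.
        for doi in set(pub["cites"]):
            counts[doi] += 1
    return counts


# Placeholder records; DOIs are deliberately fake.
publications = [
    {"title": "Paper 1", "cites": ["10.0000/collection-a", "10.0000/collection-b"]},
    {"title": "Paper 2", "cites": ["10.0000/collection-a"]},
    {"title": "Paper 3", "cites": ["10.0000/collection-a", "10.0000/collection-a"]},
]
counts = citation_counts(publications)
ranking = counts.most_common()  # most-cited datasets first
```

Counting per DOI rather than per collection name is what makes versions and derived datasets separately attributable.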
19. Dissemination: Informatics Platform for Cancer-Related
Cognitive Impairment and Dementia Research
3U24CA215109-02S1
Aim 1: Deployable Modular technology for semantic integration of multimodal, multiscale data.
Aim 2: Validation of PRISM utility by cross-testing in a non-cancer research community.
20.
Supplement #1
Adapting and Deploying CNNs
for Outcome Prediction on
Rad/Path Data acquired from
TCGA Lung Cancer Collections
joint w/ Lee Cooper and team
Create a Survival Convolutional
Network Model for non-Small Cell
Lung Cancer (NSCLC) Pathology
that Integrates TIL Map Data.
21. Supplement #2
Enhanced Image Viewing
for TCIA/PRISM – a
PRISM/XNAT
collaboration
Aim 1. Integrate the I3CR’s
extension of the OHIF4 viewer into
the TCIA user interface.
Aim 2. Implement current TCIA
visualization algorithms for RT
objects within the OHIF viewer.
Aim 3. Implement TCIA submitting
site de-identification procedures in
XNAT to create and export TCIA
compatible data sets.
joint w/ Dan Marcus and team
22.
The PRISM Team
• Fred Prior, PhD
• Jonathan Bona, PhD
• Kirk Smith
• Lawrence Tarbox, PhD
• Mathias Brochhausen, PhD
• Roosevelt Dobbins
• Tracy Nolan
• William Bennett
• Ashish Sharma, PhD
• Annie Gu
• Mohanapriya Narapareddy
• Monjoy Saha, PhD
• Pradeeban Kathiravelu, PhD
• Joel Saltz, MD, PhD
• Erich Bremer
• Rajrisi Gupta, MD
• Tahsin Kurc, PhD
• Tammy DiPrima
• TJ Fitzgerald, MD
• Fran Laurie