Some Proposed Principles for
Interoperating Cloud Based Data Platforms
Robert L. Grossman
Center for Translational Data Science
University of Chicago and
Open Commons Consortium
NIH Workshop on
Cloud-Based Platforms Interoperability
October 3, 2019
Draft 1.5
Josh Denny (Vanderbilt), David Glazer (Verily Life Sciences), Robert L.
Grossman (University of Chicago), Benedict Paten (University of California
at Santa Cruz), Anthony Philippakis (Broad Institute)
Data Biosphere Principles
1. modular, composed of functional components with well-specified interfaces;
2. community-driven, created by many groups to foster a diversity of ideas;
3. open, developed under open-source licenses that enable extensibility and reuse, with users able to add custom, proprietary modules as needed; and
4. standards-based
Examples of Data Environments
[Figure labels: HCA, CRDC, AoU; Ingest, Store, Explore; Analysis Engine; Portals; Methods Repo; Workspaces; Use in cloud; Data Generators; Researchers]
Figure: Courtesy of Anthony Philippakis, Broad Institute
The question today: how do we go from building data commons to
building data ecosystems of interoperating data resources,
computational resources, and applications that explore, analyze,
visualize and share data and knowledge?
Cloud-based platforms
Cloud-based data ecosystems
of multiple platforms
http://bit.ly/222QYY
Some Problems Today
• Platforms that refuse to expose any API and instead require all users
to use their platform or application, usually for competitive reasons.
• Platforms that bring data from other resources and platforms into
their system, but don’t let your data out.
• Platforms that don’t interoperate with other systems that have the same
or greater security and compliance, and cite security and compliance as
the reason.
Incentives / Disincentives for Interoperating
[Figure labels: USG / NFP / For profits; Platform Builders / Platform Operators; Researchers / Research Consortiums; Patients / Data Generators; Patients Partnered Research; Many incentives to interoperate; Some incentives to interoperate; Fewer incentives to interoperate]
Let’s Distinguish: Technical Guidelines vs Operating Principles
• Common vision: we have a common vision of interoperating to
accelerate research, improve patient outcomes and leverage
resources.
• Operating principles include questions about which platforms can
interoperate, whether a platform will expose an API, whether a
platform will be open and support different applications or will be
closed and only support a single application, etc.
• Technical guidelines can follow technical best practices (e.g. use a
persistent digital ID not tied to a particular domain or location
within a domain) or standards (e.g. GA4GH TES).
It may be helpful to think of policies as on an orthogonal axis.
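The digital ID guideline above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory resolver; the class name, the ID format and the `example.org` URLs are illustrative only and are not taken from any particular standard or service.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalIDResolver:
    """Hypothetical resolver: maps a persistent ID to its current locations."""

    _locations: dict[str, list[str]] = field(default_factory=dict)

    def register(self, digital_id: str, url: str) -> None:
        # The same ID may point at several copies of the data object.
        self._locations.setdefault(digital_id, []).append(url)

    def relocate(self, digital_id: str, old_url: str, new_url: str) -> None:
        # Moving the data changes the location, never the ID.
        urls = self._locations[digital_id]
        urls[urls.index(old_url)] = new_url

    def resolve(self, digital_id: str) -> list[str]:
        # Unknown IDs resolve to an empty list rather than raising.
        return list(self._locations.get(digital_id, []))


resolver = DigitalIDResolver()
resolver.register("example-id-0001", "https://commons-a.example.org/objects/0001")
resolver.relocate(
    "example-id-0001",
    "https://commons-a.example.org/objects/0001",
    "https://cloud-b.example.org/data/0001",
)
```

Applications that store only `example-id-0001` keep working after the move, because they never recorded the domain or path.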
Principles To Support a Data Ecosystem
Please
• Use Digital IDs
• Interoperate with third party authentication and authorization services
• Expose your data through an API
• Expose your data model through an API
• Interoperate with other trusted data platforms with similar security & compliance
• Process authorized queries and computations from other systems and return the results (scatter / gather)
Please don’t
• Refuse to expose any API and instead require all users to use your platform or application
• Bring data from other resources and platforms into your system, but don’t let your data out.
• Refuse to interoperate with other systems with the same or greater security and compliance
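The scatter / gather item can be sketched roughly as follows. This is an illustration under stated assumptions, not a prescribed design: each platform is modeled as a local function, the platform names and records are made up, and authorization is assumed to have happened before the query is sent.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Each platform is modeled as a function that runs an already-authorized
# query locally and returns only an aggregate result (here, a count).
PlatformQuery = Callable[[str], int]


def make_platform(records: list[dict]) -> PlatformQuery:
    def run(diagnosis: str) -> int:
        return sum(1 for r in records if r["diagnosis"] == diagnosis)

    return run


def scatter_gather(platforms: dict[str, PlatformQuery], diagnosis: str) -> dict[str, int]:
    with ThreadPoolExecutor() as pool:
        # Scatter: send the same query to every platform in parallel.
        futures = {name: pool.submit(query, diagnosis) for name, query in platforms.items()}
        # Gather: collect the per-platform results.
        return {name: future.result() for name, future in futures.items()}


# Hypothetical platforms holding made-up records.
platforms = {
    "commons_a": make_platform([{"diagnosis": "AML"}, {"diagnosis": "CLL"}]),
    "commons_b": make_platform([{"diagnosis": "AML"}, {"diagnosis": "AML"}]),
}
counts = scatter_gather(platforms, "AML")
total = sum(counts.values())
```

Only the counts leave each platform; the underlying records stay where they are.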
Narrow Middle Architecture
*Robert L. Grossman, Progress Towards Cancer Data Ecosystems, The Cancer Journal: The Journal of Principles & Practice
of Oncology, May/June, 2018.
Architectures for Data Ecosystems
• A simple data ecosystem can be built when a data commons exposes an API that can support a collection
of third party applications that can access data from the commons.
• More complex data ecosystems arise when multiple data commons and data clouds can interoperate and
support a collection of third party applications by using a common set of core services (called framework
services) that provide support for authentication, authorization, digital IDs, metadata, importing,
exporting and harmonization of phenotype data, etc.
[Figure labels: Bioinformaticians curating and submitting data; Researchers analyzing data and making discoveries; cloud-based platforms; container-based workspaces; ML/AI apps; notebooks; data commons]
• Authentication
• Authorization
• Digital IDs
• Importing, exporting & harmonization of clinical data
• Can be multiple implementations that trust each other & interop
Towards a Definition of a Trusted Platform
• Before we discuss the operating principles, we need one definition. Let’s say that Platform A trusts Platform B (so that B is a trusted platform) if:
i) Platform B operates with a set of policies, procedures and controls that have been reviewed and approved by Platform A;
ii) the organizations associated with Platform A and Platform B have a formal signed agreement describing any costs, liabilities, intellectual property issues, data or data use limitations, etc. that may be associated with the interoperation of the two platforms.
• As an example, two data commons that both operate with FISMA Moderate security and compliance (or more generally follow NIST 800-53) and are operated by two different NIH Institutes or Centers would, in general, treat each other as trusted platforms.
• With this definition, two platforms would directly trust each other. At the end we look at more general trust relationships among members of a consortium or other larger organization.
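One way to read the definition above is as a predicate over two platforms. The sketch below assumes that conditions (i) and (ii) must both hold; the `Platform` class, its field names and the platform names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Platform:
    name: str
    # Condition (i): platforms whose policies, procedures and controls
    # this platform has reviewed and approved.
    approved_controls: set[str] = field(default_factory=set)
    # Condition (ii): platforms with which a formal signed agreement exists.
    signed_agreements: set[str] = field(default_factory=set)


def trusts(a: Platform, b: Platform) -> bool:
    """A trusts B only if both (i) and (ii) hold (assumed reading)."""
    return b.name in a.approved_controls and b.name in a.signed_agreements


def mutually_trusted(a: Platform, b: Platform) -> bool:
    return trusts(a, b) and trusts(b, a)


# Hypothetical example: A has completed both steps for B,
# but B has not yet reviewed A's controls.
commons_a = Platform("commons_a", approved_controls={"commons_b"}, signed_agreements={"commons_b"})
commons_b = Platform("commons_b", signed_agreements={"commons_a"})
```

Here `trusts(commons_a, commons_b)` holds while `trusts(commons_b, commons_a)` does not, which shows that trust in this reading is directional until both sides complete the review.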
Proposed Operating Principles (Draft 1.5)
1. Interoperate with other trusted platforms: if another trusted platform is part of your data ecosystem or wants to create an ecosystem with you, then interoperate with it.
2. Follow the golden rule of data resources: if you take someone else’s data, let them have access to your data (assuming you have, or can establish, a trust relationship with them).
3. Support the principle of least restrictive access: Provide another trusted platform access to your data in the least restrictive manner possible.
- With rare exceptions, a data resource should provide an API so that applications in other trusted platforms can access data directly.
- If this is not possible due to the sensitivity of your data, then support the ability for approved queries or analyses to be run over your data and the results returned. Sometimes this is called an analysis or query gateway.
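The two sub-points of principle 3 can be sketched as a simple decision plus an allow-list. This is a minimal illustration: the mode names, the `count_by_diagnosis` query and the records are hypothetical, and a real gateway would also need authentication, auditing and output checks.

```python
from collections import Counter
from enum import Enum


class AccessMode(Enum):
    DIRECT_API = "direct_api"        # applications read data through the API
    QUERY_GATEWAY = "query_gateway"  # only approved analyses run; results returned
    DENIED = "denied"                # no trust relationship in place


def least_restrictive_mode(requester_trusted: bool, data_too_sensitive: bool) -> AccessMode:
    """Pick the least restrictive mode that the data's sensitivity allows."""
    if not requester_trusted:
        return AccessMode.DENIED
    if data_too_sensitive:
        return AccessMode.QUERY_GATEWAY
    return AccessMode.DIRECT_API


def count_by_diagnosis(records: list[dict]) -> dict[str, int]:
    return dict(Counter(r["diagnosis"] for r in records))


# Hypothetical allow-list of analyses the data resource has approved.
APPROVED_QUERIES = {"count_by_diagnosis": count_by_diagnosis}


def run_gateway_query(name: str, records: list[dict]) -> dict[str, int]:
    # The gateway runs only approved analyses and returns their results.
    if name not in APPROVED_QUERIES:
        raise PermissionError(f"query {name!r} is not approved")
    return APPROVED_QUERIES[name](records)


records = [{"diagnosis": "AML"}, {"diagnosis": "AML"}, {"diagnosis": "CLL"}]
result = run_gateway_query("count_by_diagnosis", records)
```

The caller receives aggregate counts only; any query name outside the allow-list is rejected.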
4. Agree on standards, compete on implementations:
- It is important to open up your ecosystem to competition, lest it stagnate.
- What this principle means is that a platform should expose its data and resources via APIs so that other applications and systems can be part of its ecosystem.
- The sponsor of a data resource does not need to fund other systems or applications, but it is important not to implicitly create a monopoly by requiring all users of your data to use a particular application or system.
- Remember that not all researchers have the same requirements or the same preferences, and in general a mix of applications, systems and platforms is better than requiring the use of a single application or system.
5. Support patient partnered research: Support patient partnered
research so that individuals can provide their data and have control
over it within your system. If you cannot do this today, add this to your
platform roadmap.
Trusted Platforms
• A trust relationship between two resources in a data ecosystem requires agreements
between two organizations about a number of matters, including: security;
compliance; liability; data egress charges; and infrastructure costs.
• For this reason, a formal agreement between two different organizations or a memo
between two different units within an organization or agency is usually required.
• As an example, an Interconnection Security Agreement (ISA) between two platforms
would serve this purpose.
• A consortium of platforms can also sign formal agreements. One example is the Open
Commons Consortium agreements for the BloodPAC Consortium.
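The difference between bilateral and consortium agreements can be sketched as below. The sketch assumes that signing a consortium agreement makes every pair of members trusted, which the slides do not state explicitly; the function names and platform names are hypothetical.

```python
def is_trusted_pair(
    a: str,
    b: str,
    bilateral: set[frozenset[str]],
    consortia: list[set[str]],
) -> bool:
    """Two platforms are trusted if they share a bilateral agreement,
    or (assumption) both belong to the same consortium agreement."""
    if a == b:
        return False
    if frozenset((a, b)) in bilateral:
        return True
    return any(a in members and b in members for members in consortia)


def bilateral_agreements_needed(n_platforms: int) -> int:
    # Pairwise agreements grow quadratically: n * (n - 1) / 2.
    return n_platforms * (n_platforms - 1) // 2


# Hypothetical ecosystem: one bilateral agreement and one consortium.
bilateral = {frozenset(("commons_a", "commons_b"))}
consortia = [{"commons_b", "commons_c", "commons_d"}]
```

With four platforms, full pairwise trust would take six bilateral agreements, whereas a single consortium agreement covers all members at once.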
[Figure labels: Isolated platform; Bilateral trust relationships; Consortium trust relationships; Federated trust relationships]
For More Information
Robert L. Grossman, Some Proposed Principles for
Interoperating Data Commons, Medium, October 1, 2019,
http://bit.ly/222QYY
Robert L. Grossman
robert.grossman@uchicago.edu
@BobGrossman

Mais conteúdo relacionado

Mais procurados

BROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALS
BROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALSBROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALS
BROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALSMicah Altman
 
DataTags: Sharing Privacy Sensitive Data by Michael Bar-sinai
DataTags: Sharing Privacy Sensitive Data by Michael Bar-sinaiDataTags: Sharing Privacy Sensitive Data by Michael Bar-sinai
DataTags: Sharing Privacy Sensitive Data by Michael Bar-sinaidatascienceiqss
 
DataONE Education Module 08: Data Citation
DataONE Education Module 08: Data CitationDataONE Education Module 08: Data Citation
DataONE Education Module 08: Data CitationDataONE
 
FAIR Data Knowledge Graphs
FAIR Data Knowledge GraphsFAIR Data Knowledge Graphs
FAIR Data Knowledge GraphsTom Plasterer
 
DataONE Education Module 02: Data Sharing
DataONE Education Module 02: Data SharingDataONE Education Module 02: Data Sharing
DataONE Education Module 02: Data SharingDataONE
 
BioPharma and FAIR Data, a Collaborative Advantage
BioPharma and FAIR Data, a Collaborative AdvantageBioPharma and FAIR Data, a Collaborative Advantage
BioPharma and FAIR Data, a Collaborative AdvantageTom Plasterer
 
BROWN BAG TALK WITH MICAH ALTMAN, SOURCES OF BIG DATA FOR SOCIAL SCIENCES
BROWN BAG TALK WITH MICAH ALTMAN, SOURCES OF BIG DATA FOR SOCIAL SCIENCESBROWN BAG TALK WITH MICAH ALTMAN, SOURCES OF BIG DATA FOR SOCIAL SCIENCES
BROWN BAG TALK WITH MICAH ALTMAN, SOURCES OF BIG DATA FOR SOCIAL SCIENCESMicah Altman
 
Big Data Repository for Structural Biology: Challenges and Opportunities by P...
Big Data Repository for Structural Biology: Challenges and Opportunities by P...Big Data Repository for Structural Biology: Challenges and Opportunities by P...
Big Data Repository for Structural Biology: Challenges and Opportunities by P...datascienceiqss
 
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...Tom Plasterer
 
Dataset Catalogs as a Foundation for FAIR* Data
Dataset Catalogs as a Foundation for FAIR* DataDataset Catalogs as a Foundation for FAIR* Data
Dataset Catalogs as a Foundation for FAIR* DataTom Plasterer
 
DataONE Education Module 01: Why Data Management?
DataONE Education Module 01: Why Data Management?DataONE Education Module 01: Why Data Management?
DataONE Education Module 01: Why Data Management?DataONE
 
Dataverse, Cloud Dataverse, and DataTags
Dataverse, Cloud Dataverse, and DataTagsDataverse, Cloud Dataverse, and DataTags
Dataverse, Cloud Dataverse, and DataTagsMerce Crosas
 
Linked Data for Biopharma
Linked Data for BiopharmaLinked Data for Biopharma
Linked Data for BiopharmaTom Plasterer
 
Capsule Computing: Safe Open Science
Capsule Computing: Safe Open Science Capsule Computing: Safe Open Science
Capsule Computing: Safe Open Science Beth Plale
 
dkNET Webinar: FAIR Data & Software in the Research Life Cycle 01/22/2021
dkNET Webinar: FAIR Data & Software in the Research Life Cycle 01/22/2021dkNET Webinar: FAIR Data & Software in the Research Life Cycle 01/22/2021
dkNET Webinar: FAIR Data & Software in the Research Life Cycle 01/22/2021dkNET
 
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...Tom Plasterer
 
DataONE Education Module 03: Data Management Planning
DataONE Education Module 03: Data Management PlanningDataONE Education Module 03: Data Management Planning
DataONE Education Module 03: Data Management PlanningDataONE
 
Data Citation Implementation Guidelines By Tim Clark
Data Citation Implementation Guidelines By Tim ClarkData Citation Implementation Guidelines By Tim Clark
Data Citation Implementation Guidelines By Tim Clarkdatascienceiqss
 
DataONE Education Module 07: Metadata
DataONE Education Module 07: MetadataDataONE Education Module 07: Metadata
DataONE Education Module 07: MetadataDataONE
 
Bonazzi commons bd2 k ahm 2016 v2
Bonazzi commons bd2 k ahm 2016 v2Bonazzi commons bd2 k ahm 2016 v2
Bonazzi commons bd2 k ahm 2016 v2Vivien Bonazzi
 

Mais procurados (20)

BROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALS
BROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALSBROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALS
BROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALS
 
DataTags: Sharing Privacy Sensitive Data by Michael Bar-sinai
DataTags: Sharing Privacy Sensitive Data by Michael Bar-sinaiDataTags: Sharing Privacy Sensitive Data by Michael Bar-sinai
DataTags: Sharing Privacy Sensitive Data by Michael Bar-sinai
 
DataONE Education Module 08: Data Citation
DataONE Education Module 08: Data CitationDataONE Education Module 08: Data Citation
DataONE Education Module 08: Data Citation
 
FAIR Data Knowledge Graphs
FAIR Data Knowledge GraphsFAIR Data Knowledge Graphs
FAIR Data Knowledge Graphs
 
DataONE Education Module 02: Data Sharing
DataONE Education Module 02: Data SharingDataONE Education Module 02: Data Sharing
DataONE Education Module 02: Data Sharing
 
BioPharma and FAIR Data, a Collaborative Advantage
BioPharma and FAIR Data, a Collaborative AdvantageBioPharma and FAIR Data, a Collaborative Advantage
BioPharma and FAIR Data, a Collaborative Advantage
 
BROWN BAG TALK WITH MICAH ALTMAN, SOURCES OF BIG DATA FOR SOCIAL SCIENCES
BROWN BAG TALK WITH MICAH ALTMAN, SOURCES OF BIG DATA FOR SOCIAL SCIENCESBROWN BAG TALK WITH MICAH ALTMAN, SOURCES OF BIG DATA FOR SOCIAL SCIENCES
BROWN BAG TALK WITH MICAH ALTMAN, SOURCES OF BIG DATA FOR SOCIAL SCIENCES
 
Big Data Repository for Structural Biology: Challenges and Opportunities by P...
Big Data Repository for Structural Biology: Challenges and Opportunities by P...Big Data Repository for Structural Biology: Challenges and Opportunities by P...
Big Data Repository for Structural Biology: Challenges and Opportunities by P...
 
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...
 
Dataset Catalogs as a Foundation for FAIR* Data
Dataset Catalogs as a Foundation for FAIR* DataDataset Catalogs as a Foundation for FAIR* Data
Dataset Catalogs as a Foundation for FAIR* Data
 
DataONE Education Module 01: Why Data Management?
DataONE Education Module 01: Why Data Management?DataONE Education Module 01: Why Data Management?
DataONE Education Module 01: Why Data Management?
 
Dataverse, Cloud Dataverse, and DataTags
Dataverse, Cloud Dataverse, and DataTagsDataverse, Cloud Dataverse, and DataTags
Dataverse, Cloud Dataverse, and DataTags
 
Linked Data for Biopharma
Linked Data for BiopharmaLinked Data for Biopharma
Linked Data for Biopharma
 
Capsule Computing: Safe Open Science
Capsule Computing: Safe Open Science Capsule Computing: Safe Open Science
Capsule Computing: Safe Open Science
 
dkNET Webinar: FAIR Data & Software in the Research Life Cycle 01/22/2021
dkNET Webinar: FAIR Data & Software in the Research Life Cycle 01/22/2021dkNET Webinar: FAIR Data & Software in the Research Life Cycle 01/22/2021
dkNET Webinar: FAIR Data & Software in the Research Life Cycle 01/22/2021
 
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...
 
DataONE Education Module 03: Data Management Planning
DataONE Education Module 03: Data Management PlanningDataONE Education Module 03: Data Management Planning
DataONE Education Module 03: Data Management Planning
 
Data Citation Implementation Guidelines By Tim Clark
Data Citation Implementation Guidelines By Tim ClarkData Citation Implementation Guidelines By Tim Clark
Data Citation Implementation Guidelines By Tim Clark
 
DataONE Education Module 07: Metadata
DataONE Education Module 07: MetadataDataONE Education Module 07: Metadata
DataONE Education Module 07: Metadata
 
Bonazzi commons bd2 k ahm 2016 v2
Bonazzi commons bd2 k ahm 2016 v2Bonazzi commons bd2 k ahm 2016 v2
Bonazzi commons bd2 k ahm 2016 v2
 

Semelhante a Some Proposed Principles for Interoperating Cloud Based Data Platforms

NIH Data Summit - The NIH Data Commons
NIH Data Summit - The NIH Data CommonsNIH Data Summit - The NIH Data Commons
NIH Data Summit - The NIH Data CommonsVivien Bonazzi
 
tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...
tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...
tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...David Peyruc
 
Data, Data Everywhere: What's A Publisher to Do?
Data, Data Everywhere: What's  A Publisher to Do?Data, Data Everywhere: What's  A Publisher to Do?
Data, Data Everywhere: What's A Publisher to Do?Anita de Waard
 
A Framework for Geospatial Web Services for Public Health by Dr. Leslie Lenert
A Framework for Geospatial Web Services for Public Health by Dr. Leslie LenertA Framework for Geospatial Web Services for Public Health by Dr. Leslie Lenert
A Framework for Geospatial Web Services for Public Health by Dr. Leslie LenertWansoo Im
 
A Muilt-Keyword Ranked Based Search and Privacy Preservation of Distributed D...
A Muilt-Keyword Ranked Based Search and Privacy Preservation of Distributed D...A Muilt-Keyword Ranked Based Search and Privacy Preservation of Distributed D...
A Muilt-Keyword Ranked Based Search and Privacy Preservation of Distributed D...IRJET Journal
 
A Muilt-Keyword Ranked Based Search and Privacy Preservation of Distributed D...
A Muilt-Keyword Ranked Based Search and Privacy Preservation of Distributed D...A Muilt-Keyword Ranked Based Search and Privacy Preservation of Distributed D...
A Muilt-Keyword Ranked Based Search and Privacy Preservation of Distributed D...IRJET Journal
 
Unification Algorithm in Hefty Iterative Multi-tier Classifiers for Gigantic ...
Unification Algorithm in Hefty Iterative Multi-tier Classifiers for Gigantic ...Unification Algorithm in Hefty Iterative Multi-tier Classifiers for Gigantic ...
Unification Algorithm in Hefty Iterative Multi-tier Classifiers for Gigantic ...Editor IJAIEM
 
How FAIR is your data? Copyright, licensing and reuse of data
How FAIR is your data? Copyright, licensing and reuse of dataHow FAIR is your data? Copyright, licensing and reuse of data
How FAIR is your data? Copyright, licensing and reuse of dataARDC
 
Project 1 Template (Due on Week 4)Name.docx
Project 1 Template (Due on Week 4)Name.docxProject 1 Template (Due on Week 4)Name.docx
Project 1 Template (Due on Week 4)Name.docxsimonlbentley59018
 
Data Management and Horizon 2020
Data Management and Horizon 2020Data Management and Horizon 2020
Data Management and Horizon 2020Sarah Jones
 
ESG - HDS HCP Anywhere Easy, Secure, On-Premises File Sharing
ESG - HDS HCP Anywhere Easy, Secure, On-Premises File SharingESG - HDS HCP Anywhere Easy, Secure, On-Premises File Sharing
ESG - HDS HCP Anywhere Easy, Secure, On-Premises File SharingHitachi Vantara
 
Sitra rise of the pilots janne enberg
Sitra rise of the pilots janne enbergSitra rise of the pilots janne enberg
Sitra rise of the pilots janne enbergSitra / Hyvinvointi
 
AUTHORIZATION FRAMEWORK FOR MEDICAL DATA
AUTHORIZATION FRAMEWORK FOR MEDICAL DATA AUTHORIZATION FRAMEWORK FOR MEDICAL DATA
AUTHORIZATION FRAMEWORK FOR MEDICAL DATA ijmnct
 
VODAN Africa IN.pptx
VODAN Africa IN.pptxVODAN Africa IN.pptx
VODAN Africa IN.pptxGetu Tadele
 
SURVEY ON DYNAMIC DATA SHARING IN PUBLIC CLOUD USING MULTI-AUTHORITY SYSTEM
SURVEY ON DYNAMIC DATA SHARING IN PUBLIC CLOUD USING MULTI-AUTHORITY SYSTEMSURVEY ON DYNAMIC DATA SHARING IN PUBLIC CLOUD USING MULTI-AUTHORITY SYSTEM
SURVEY ON DYNAMIC DATA SHARING IN PUBLIC CLOUD USING MULTI-AUTHORITY SYSTEMijiert bestjournal
 
Shared Authority Based Privacy-preserving Authentication Protocol in Cloud Co...
Shared Authority Based Privacy-preserving Authentication Protocol in Cloud Co...Shared Authority Based Privacy-preserving Authentication Protocol in Cloud Co...
Shared Authority Based Privacy-preserving Authentication Protocol in Cloud Co...Migrant Systems
 
iaetsd Shared authority based privacy preserving protocol
iaetsd Shared authority based privacy preserving protocoliaetsd Shared authority based privacy preserving protocol
iaetsd Shared authority based privacy preserving protocolIaetsd Iaetsd
 

Semelhante a Some Proposed Principles for Interoperating Cloud Based Data Platforms (20)

NIH Data Summit - The NIH Data Commons
NIH Data Summit - The NIH Data CommonsNIH Data Summit - The NIH Data Commons
NIH Data Summit - The NIH Data Commons
 
Data Domain-Driven Design
Data Domain-Driven DesignData Domain-Driven Design
Data Domain-Driven Design
 
tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...
tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...
tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...
 
Data, Data Everywhere: What's A Publisher to Do?
Data, Data Everywhere: What's  A Publisher to Do?Data, Data Everywhere: What's  A Publisher to Do?
Data, Data Everywhere: What's A Publisher to Do?
 
A Framework for Geospatial Web Services for Public Health by Dr. Leslie Lenert
A Framework for Geospatial Web Services for Public Health by Dr. Leslie LenertA Framework for Geospatial Web Services for Public Health by Dr. Leslie Lenert
A Framework for Geospatial Web Services for Public Health by Dr. Leslie Lenert
 
A Muilt-Keyword Ranked Based Search and Privacy Preservation of Distributed D...
A Muilt-Keyword Ranked Based Search and Privacy Preservation of Distributed D...A Muilt-Keyword Ranked Based Search and Privacy Preservation of Distributed D...
A Muilt-Keyword Ranked Based Search and Privacy Preservation of Distributed D...
 
A Muilt-Keyword Ranked Based Search and Privacy Preservation of Distributed D...
A Muilt-Keyword Ranked Based Search and Privacy Preservation of Distributed D...A Muilt-Keyword Ranked Based Search and Privacy Preservation of Distributed D...
A Muilt-Keyword Ranked Based Search and Privacy Preservation of Distributed D...
 
Unification Algorithm in Hefty Iterative Multi-tier Classifiers for Gigantic ...
Unification Algorithm in Hefty Iterative Multi-tier Classifiers for Gigantic ...Unification Algorithm in Hefty Iterative Multi-tier Classifiers for Gigantic ...
Unification Algorithm in Hefty Iterative Multi-tier Classifiers for Gigantic ...
 
Security for Big Data
Security for Big DataSecurity for Big Data
Security for Big Data
 
How FAIR is your data? Copyright, licensing and reuse of data
How FAIR is your data? Copyright, licensing and reuse of dataHow FAIR is your data? Copyright, licensing and reuse of data
How FAIR is your data? Copyright, licensing and reuse of data
 
Project 1 Template (Due on Week 4)Name.docx
Project 1 Template (Due on Week 4)Name.docxProject 1 Template (Due on Week 4)Name.docx
Project 1 Template (Due on Week 4)Name.docx
 
Data Management and Horizon 2020
Data Management and Horizon 2020Data Management and Horizon 2020
Data Management and Horizon 2020
 
ESG - HDS HCP Anywhere Easy, Secure, On-Premises File Sharing
ESG - HDS HCP Anywhere Easy, Secure, On-Premises File SharingESG - HDS HCP Anywhere Easy, Secure, On-Premises File Sharing
ESG - HDS HCP Anywhere Easy, Secure, On-Premises File Sharing
 
Sitra rise of the pilots janne enberg
Sitra rise of the pilots janne enbergSitra rise of the pilots janne enberg
Sitra rise of the pilots janne enberg
 
AUTHORIZATION FRAMEWORK FOR MEDICAL DATA
AUTHORIZATION FRAMEWORK FOR MEDICAL DATA AUTHORIZATION FRAMEWORK FOR MEDICAL DATA
AUTHORIZATION FRAMEWORK FOR MEDICAL DATA
 
VODAN Africa IN.pptx
VODAN Africa IN.pptxVODAN Africa IN.pptx
VODAN Africa IN.pptx
 
SURVEY ON DYNAMIC DATA SHARING IN PUBLIC CLOUD USING MULTI-AUTHORITY SYSTEM
SURVEY ON DYNAMIC DATA SHARING IN PUBLIC CLOUD USING MULTI-AUTHORITY SYSTEMSURVEY ON DYNAMIC DATA SHARING IN PUBLIC CLOUD USING MULTI-AUTHORITY SYSTEM
SURVEY ON DYNAMIC DATA SHARING IN PUBLIC CLOUD USING MULTI-AUTHORITY SYSTEM
 
Shared Authority Based Privacy-preserving Authentication Protocol in Cloud Co...
Shared Authority Based Privacy-preserving Authentication Protocol in Cloud Co...Shared Authority Based Privacy-preserving Authentication Protocol in Cloud Co...
Shared Authority Based Privacy-preserving Authentication Protocol in Cloud Co...
 
iaetsd Shared authority based privacy preserving protocol
iaetsd Shared authority based privacy preserving protocoliaetsd Shared authority based privacy preserving protocol
iaetsd Shared authority based privacy preserving protocol
 
Trial io pcori doc v1
Trial io pcori doc v1Trial io pcori doc v1
Trial io pcori doc v1
 

Mais de Robert Grossman

How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...
How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...
How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...Robert Grossman
 
AnalyticOps - Chicago PAW 2016
AnalyticOps - Chicago PAW 2016AnalyticOps - Chicago PAW 2016
AnalyticOps - Chicago PAW 2016Robert Grossman
 
Keynote on 2015 Yale Day of Data
Keynote on 2015 Yale Day of Data Keynote on 2015 Yale Day of Data
Keynote on 2015 Yale Day of Data Robert Grossman
 
How to Lower the Cost of Deploying Analytics: An Introduction to the Portable...
How to Lower the Cost of Deploying Analytics: An Introduction to the Portable...How to Lower the Cost of Deploying Analytics: An Introduction to the Portable...
How to Lower the Cost of Deploying Analytics: An Introduction to the Portable...Robert Grossman
 
AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...
AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...
AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...Robert Grossman
 
Clouds and Commons for the Data Intensive Science Community (June 8, 2015)
Clouds and Commons for the Data Intensive Science Community (June 8, 2015)Clouds and Commons for the Data Intensive Science Community (June 8, 2015)
Clouds and Commons for the Data Intensive Science Community (June 8, 2015)Robert Grossman
 
Architectures for Data Commons (XLDB 15 Lightning Talk)
Architectures for Data Commons (XLDB 15 Lightning Talk)Architectures for Data Commons (XLDB 15 Lightning Talk)
Architectures for Data Commons (XLDB 15 Lightning Talk)Robert Grossman
 
Practical Methods for Identifying Anomalies That Matter in Large Datasets
Practical Methods for Identifying Anomalies That Matter in Large DatasetsPractical Methods for Identifying Anomalies That Matter in Large Datasets
Practical Methods for Identifying Anomalies That Matter in Large DatasetsRobert Grossman
 
What is a Data Commons and Why Should You Care?
What is a Data Commons and Why Should You Care? What is a Data Commons and Why Should You Care?
What is a Data Commons and Why Should You Care? Robert Grossman
 
Biomedical Clusters, Clouds and Commons - DePaul Colloquium Oct 24, 2014
Biomedical Clusters, Clouds and Commons - DePaul Colloquium Oct 24, 2014Biomedical Clusters, Clouds and Commons - DePaul Colloquium Oct 24, 2014
Biomedical Clusters, Clouds and Commons - DePaul Colloquium Oct 24, 2014Robert Grossman
 
Big Data, The Community and The Commons (May 12, 2014)
Big Data, The Community and The Commons (May 12, 2014)Big Data, The Community and The Commons (May 12, 2014)
Big Data, The Community and The Commons (May 12, 2014)Robert Grossman
 
What Are Science Clouds?
What Are Science Clouds?What Are Science Clouds?
What Are Science Clouds?Robert Grossman
 
Adversarial Analytics - 2013 Strata & Hadoop World Talk
Adversarial Analytics - 2013 Strata & Hadoop World TalkAdversarial Analytics - 2013 Strata & Hadoop World Talk
Adversarial Analytics - 2013 Strata & Hadoop World TalkRobert Grossman
 
The Matsu Project - Open Source Software for Processing Satellite Imagery Data
The Matsu Project - Open Source Software for Processing Satellite Imagery DataThe Matsu Project - Open Source Software for Processing Satellite Imagery Data
The Matsu Project - Open Source Software for Processing Satellite Imagery DataRobert Grossman
 
Using the Open Science Data Cloud for Data Science Research
Using the Open Science Data Cloud for Data Science ResearchUsing the Open Science Data Cloud for Data Science Research
Using the Open Science Data Cloud for Data Science ResearchRobert Grossman
 
The Open Science Data Cloud: Empowering the Long Tail of Science
The Open Science Data Cloud: Empowering the Long Tail of ScienceThe Open Science Data Cloud: Empowering the Long Tail of Science
The Open Science Data Cloud: Empowering the Long Tail of ScienceRobert Grossman
 
Bionimbus: Towards One Million Genomes (XLDB 2012 Lecture)
Bionimbus: Towards One Million Genomes (XLDB 2012 Lecture)Bionimbus: Towards One Million Genomes (XLDB 2012 Lecture)
Bionimbus: Towards One Million Genomes (XLDB 2012 Lecture)Robert Grossman
 
Big Data - Lab A1 (SC 11 Tutorial)
Big Data - Lab A1 (SC 11 Tutorial)Big Data - Lab A1 (SC 11 Tutorial)
Big Data - Lab A1 (SC 11 Tutorial)Robert Grossman
 
Managing Big Data (Chapter 2, SC 11 Tutorial)
Managing Big Data (Chapter 2, SC 11 Tutorial)Managing Big Data (Chapter 2, SC 11 Tutorial)
Managing Big Data (Chapter 2, SC 11 Tutorial)Robert Grossman
 
Introduction to Big Data and Science Clouds (Chapter 1, SC 11 Tutorial)
Introduction to Big Data and Science Clouds (Chapter 1, SC 11 Tutorial)Introduction to Big Data and Science Clouds (Chapter 1, SC 11 Tutorial)
Introduction to Big Data and Science Clouds (Chapter 1, SC 11 Tutorial)Robert Grossman
 

Some Proposed Principles for Interoperating Cloud Based Data Platforms

  • 1. Some Proposed Principles for Interoperating Cloud Based Data Platforms. Robert L. Grossman, Center for Translational Data Science, University of Chicago, and Open Commons Consortium. NIH Workshop on Cloud-Based Platforms Interoperability, October 3, 2019. Draft 1.5.
  • 2. Josh Denny (Vanderbilt), David Glazer (Verily Life Sciences), Robert L. Grossman (University of Chicago), Benedict Paten (University of California at Santa Cruz), Anthony Philippakis (Broad Institute)
  • 3. Data Biosphere Principles
      1. modular, composed of functional components with well-specified interfaces;
      2. community-driven, created by many groups to foster a diversity of ideas;
      3. open, developed under open-source licenses that enable extensibility and reuse, with users able to add custom, proprietary modules as needed; and
      4. standards-based.
      [Figure: Examples of Data Environments (HCA, CRDC, AoU), with components labeled Portals, Data Generators, Researchers, Ingest, Store, Explore, Analysis Engine, Methods Repo, Workspaces and Use in cloud. Courtesy of Anthony Philippakis, Broad Institute]
  • 4. The question today: how do we go from building data commons to building data ecosystems of interoperating data resources, computational resources, and applications that explore, analyze, visualize and share data and knowledge? [Figure labels: cloud-based platforms; cloud-based data ecosystems of multiple platforms]
  • 6. Some Problems Today
      • Platforms that refuse to expose any API and instead require all users to use their platform or application, usually for competitive reasons.
      • Platforms that bring data from other resources and platforms into their system, but don’t let your data out.
      • Platforms that don’t interoperate with other systems that have the same or greater security and compliance, and blame security and compliance for it.
  • 7. Incentives / Disincentives for Interoperating [Figure labels: USG / NFP / For profits; Platform Builders / Platform Operators; Researchers / Research Consortiums; Patients / Data Generators; Patient Partnered Research. Scale labels: many incentives to interoperate; some incentives to interoperate; fewer incentives to interoperate]
  • 8. Let’s Distinguish: Technical Guidelines vs Operating Principles
      • Common vision: we have a common vision of interoperating to accelerate research, improve patient outcomes and leverage resources.
      • Operating principles include questions about which platforms can interoperate, whether a platform will expose an API, whether a platform will be open and support different applications or will be closed and only support a single application, etc.
      • Technical guidelines can follow technical best practices (e.g. use a persistent digital ID not tied to a particular domain or location within a domain) or standards (e.g. GA4GH TES).
      It may be helpful to think of policies as lying on an orthogonal axis.
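The persistent-ID best practice mentioned on this slide can be illustrated with a minimal sketch. It assumes a hypothetical in-memory resolver; `IdResolver`, `register`, `move` and `resolve` are illustrative names rather than any commons or GA4GH API, and the storage URLs are made up. The only point is that the ID stays fixed while the location behind it changes.

```python
import uuid


class IdResolver:
    """Hypothetical resolver: maps a location-independent ID to current locations."""

    def __init__(self):
        self._locations = {}

    def register(self, locations):
        # The ID is a random UUID, so it encodes no domain or path.
        digital_id = str(uuid.uuid4())
        self._locations[digital_id] = list(locations)
        return digital_id

    def move(self, digital_id, new_locations):
        # Data can be relocated without changing the ID that users cite.
        self._locations[digital_id] = list(new_locations)

    def resolve(self, digital_id):
        return list(self._locations[digital_id])


resolver = IdResolver()
did = resolver.register(["s3://example-bucket-a/sample.bam"])
resolver.move(did, ["gs://example-bucket-b/sample.bam"])
print(did, resolver.resolve(did))
```

Applications that store only `did` keep working after the move, which is what ties this practice to interoperability.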
  • 9. Principles To Support a Data Ecosystem
      Please:
      • Use Digital IDs
      • Interoperate with third party authentication and authorization services
      • Expose your data through an API
      • Expose your data model through an API
      • Interoperate with other trusted data platforms with similar security & compliance
      • Process authorized queries and computations from other systems and return the results (scatter / gather)
      Please don’t:
      • Refuse to expose any API and instead require all users to use your platform or application
      • Bring data from other resources and platforms into your system, but don’t let your data out
      • Refuse to interoperate with other systems with the same or greater security and compliance
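The scatter / gather item on this slide could look roughly like the following sketch. Everything here is an assumption made for illustration: platforms are simulated as local functions, authorization is a simple set of caller names, and each platform returns only a count rather than records.

```python
from concurrent.futures import ThreadPoolExecutor


def make_platform(records, authorized_callers):
    """Build a stand-in for a remote platform's query endpoint (hypothetical)."""

    def run_query(caller, predicate):
        if caller not in authorized_callers:
            raise PermissionError(f"{caller} is not authorized")
        # Only an aggregate leaves the platform, not the records themselves.
        return sum(1 for record in records if predicate(record))

    return run_query


def scatter_gather(caller, predicate, platforms):
    """Send the same query to every platform, then collect the results."""
    with ThreadPoolExecutor() as pool:
        futures = {
            name: pool.submit(run_query, caller, predicate)
            for name, run_query in platforms.items()
        }
        results = {}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except PermissionError:
                results[name] = None  # this platform declined the query
    return results


platforms = {
    "commons_a": make_platform([{"age": 40}, {"age": 70}], {"workspace_x"}),
    "commons_b": make_platform([{"age": 65}], set()),
}
print(scatter_gather("workspace_x", lambda r: r["age"] >= 60, platforms))
```

A real deployment would replace the local functions with authenticated API calls; the structure of sending one query out and merging per-platform answers stays the same.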
  • 10. Narrow Middle Architecture. Source: Robert L. Grossman, Progress Towards Cancer Data Ecosystems, The Cancer Journal: The Journal of Principles & Practice of Oncology, May/June 2018.
  • 11. Architectures for Data Ecosystems
      • A simple data ecosystem can be built when a data commons exposes an API that can support a collection of third party applications that can access data from the commons.
      • More complex data ecosystems arise when multiple data commons and data clouds can interoperate and support a collection of third party applications by using a common set of core services (called framework services) that provide support for authentication, authorization, digital IDs, metadata, importing, exporting and harmonization of phenotype data, etc.
      [Figure labels: bioinformaticians curating and submitting data; researchers analyzing data and making discoveries; cloud-based platforms; container-based workspaces; ML/AI apps; notebooks; data commons. Framework services: authentication; authorization; digital IDs; importing, exporting & harmonization of clinical data. There can be multiple implementations that trust each other and interoperate.]
  • 12. Towards a Definition of a Trusted Platform
      • Before we discuss the operating principles, we need one definition. Let’s say that Platform A trusts Platform B (so that B is a trusted platform) if i) Platform B operates with a set of policies, procedures and controls that have been reviewed and approved by Platform A; and ii) the organizations associated with Platform A and Platform B have a formal signed agreement describing any costs, liabilities, intellectual property issues, data or data use limitations, etc. that may be associated with the interoperation of the two platforms.
      • As an example, two data commons that both operate with FISMA Moderate security and compliance (or more generally follow NIST 800-53) and are operated by two different NIH Institutes or Centers would, in general, treat each other as trusted platforms.
      • With this definition, two platforms would directly trust each other. At the end we look at more general trust relationships among members of a consortium or other larger organization.
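The two-part definition above can be written down as a small data structure. This is a sketch under stated assumptions: the `Platform` class, its field names and the `trusts` function are hypothetical, and a real review or agreement is a legal and organizational process, not a set membership test.

```python
from dataclasses import dataclass, field


@dataclass
class Platform:
    name: str
    # Names of platforms whose policies, procedures and controls
    # this platform has reviewed and approved (condition i).
    approved_controls: set = field(default_factory=set)
    # Names of platforms with which a formal signed agreement exists (condition ii).
    signed_agreements: set = field(default_factory=set)


def trusts(a: Platform, b: Platform) -> bool:
    """Platform A trusts Platform B only if both conditions hold."""
    return b.name in a.approved_controls and b.name in a.signed_agreements


commons_a = Platform("commons_a", approved_controls={"commons_b"},
                     signed_agreements={"commons_b"})
commons_b = Platform("commons_b", approved_controls={"commons_a"})

print(trusts(commons_a, commons_b), trusts(commons_b, commons_a))
```

Note that the relation is directional in this sketch: `commons_b` has approved the controls of `commons_a` but records no signed agreement, so it does not yet trust `commons_a`.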
  • 13. Proposed Operating Principles (Draft 1.5)
      1. Interoperate with other trusted platforms: if another trusted platform is part of your data ecosystem or wants to create an ecosystem with you, then interoperate with it.
      2. Follow the golden rule of data resources: if you take someone else’s data, let them have access to your data (assuming you have, or can establish, a trust relationship with them).
  • 14. Proposed Operating Principles (Draft 1.5)
      3. Support the principle of least restrictive access: provide another trusted platform access to your data in the least restrictive manner possible.
         - With rare exceptions, a data resource should provide an API so that applications in other trusted platforms can access data directly.
         - If this is not possible due to the sensitivity of your data, then support the ability for approved queries or analyses to be run over your data and the results returned. Sometimes this is called an analysis or query gateway.
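The two access levels described in principle 3 can be sketched as follows. The class, method and analysis names are hypothetical, the caller check is reduced to a set of platform names, and the catalogue of approved analyses is an assumption made for illustration.

```python
# Hypothetical catalogue of analyses the data owner has approved.
APPROVED_ANALYSES = {
    "count": lambda rows: len(rows),
    "mean_age": lambda rows: (sum(r["age"] for r in rows) / len(rows)) if rows else None,
}


class DataResource:
    def __init__(self, rows, trusted_platforms, allow_direct_access):
        self.rows = rows
        self.trusted_platforms = set(trusted_platforms)
        self.allow_direct_access = allow_direct_access

    def _check_trusted(self, caller):
        if caller not in self.trusted_platforms:
            raise PermissionError(f"{caller} is not a trusted platform")

    def get_records(self, caller):
        """Least restrictive option: direct API access to the data."""
        self._check_trusted(caller)
        if not self.allow_direct_access:
            raise PermissionError("direct access disabled; use run_analysis")
        return [dict(r) for r in self.rows]

    def run_analysis(self, caller, name):
        """Gateway option: run an approved analysis and return only its result."""
        self._check_trusted(caller)
        if name not in APPROVED_ANALYSES:
            raise ValueError(f"analysis {name!r} is not approved")
        return APPROVED_ANALYSES[name](self.rows)


sensitive = DataResource([{"age": 40}, {"age": 60}], {"platform_b"}, allow_direct_access=False)
print(sensitive.run_analysis("platform_b", "mean_age"))
```

The design choice is that a sensitive resource still answers trusted platforms through `run_analysis` instead of refusing them outright.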
  • 15. Proposed Operating Principles (Draft 1.5)
      4. Agree on standards, compete on implementations:
         - It is important to open up your ecosystem to competition, lest it stagnate.
         - What this principle means is that a platform should expose its data and resources via APIs so that other applications and systems can be part of its ecosystem.
         - It is not necessary for the sponsor of a data resource to fund other systems or applications, but it is important not to implicitly create a monopoly by requiring all users of your data to use a particular application or system.
         - Remember that not all researchers have the same requirements, or the same preferences, and in general a mix of applications, systems and platforms is better than requiring the use of a single application or system.
  • 16. Proposed Operating Principles (Draft 1.5)
      5. Support patient partnered research: support patient partnered research so that individuals can provide their data and have control over it within your system. If you cannot do this today, add this to your platform roadmap.
  • 17. Trusted Platforms
      • A trust relationship between two resources in a data ecosystem requires agreements between two organizations about a number of matters, including: security; compliance; liability; data egress charges; and infrastructure costs.
      • For this reason, a formal agreement between two different organizations or a memo between two different units within an organization or agency is usually required.
      • As an example, an Interconnection Security Agreement (ISA) between two platforms would serve this purpose.
      • A consortium of platforms can also sign formal agreements. One example is the Open Commons Consortium agreements for the BloodPAC Consortium.
      [Figure labels: bilateral trust relationships; consortium trust relationships; federated trust relationships; isolated platform]
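As a rough illustration of the bilateral and consortium cases named on this slide, the sketch below derives which pairs of platforms trust each other. It assumes, for illustration only, that a consortium agreement makes every pair of members trusted and that trust is symmetric; the function names and platform names are hypothetical, and federated trust is not modeled.

```python
from itertools import combinations


def trusted_pairs(bilateral, consortia):
    """Return the set of unordered platform pairs covered by some agreement."""
    pairs = {frozenset(pair) for pair in bilateral}
    for members in consortia:
        # Assumption: one consortium agreement covers every pair of members.
        pairs.update(frozenset(p) for p in combinations(sorted(members), 2))
    return pairs


def isolated_platforms(platforms, pairs):
    """Platforms that appear in no trust relationship at all."""
    connected = set()
    for pair in pairs:
        connected |= pair
    return set(platforms) - connected


pairs = trusted_pairs(
    bilateral=[("commons_a", "commons_b")],
    consortia=[{"commons_c", "commons_d", "commons_e"}],
)
everyone = ["commons_a", "commons_b", "commons_c", "commons_d", "commons_e", "commons_f"]
print(len(pairs), isolated_platforms(everyone, pairs))
```

One consortium of three members yields three trusted pairs from a single agreement, which is the practical appeal of consortium agreements over pairwise ones.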
  • 19. For More Information: Robert L. Grossman, Some Proposed Principles for Interoperating Data Commons, Medium, October 1, 2019, http://bit.ly/222QYY. Contact: Robert L. Grossman, robert.grossman@uchicago.edu, @BobGrossman