Mercè Crosas, Ph.D.
Chief Data Science and Technology Officer
Institute for Quantitative Social Science (IQSS)
Harvard University
@mercecrosas mercecrosas.com
Data should be Findable, Accessible,
Interoperable, Reusable (FAIR) by machines
Wilkinson et al., “The FAIR Guiding Principles for scientific data management and stewardship,” Nature Scientific Data, 2016;
NIH Data Commons Principles; Joint Declaration of Data Citation Principles (Force11)
“FAIR Principles put specific emphasis on enhancing the
ability of machines to automatically find and use the data,
in addition to supporting its reuse by individuals.”
“Good data management is not a goal in itself, but rather is
the key conduit leading to knowledge discovery and
innovation, and to subsequent data and knowledge integration
and reuse by the community after the data publication
process.”
FAIR Data Principles in Brief
• To be Findable:
๏ (meta)data are assigned a globally
unique and persistent identifier
๏ data are described with rich
metadata
๏ metadata clearly and explicitly
include the identifier of the data it
describes
๏ (meta)data are registered or
indexed in a searchable resource
• To be Accessible:
๏ (meta)data are retrievable by their
identifier using a standardized
communications protocol
๏ the protocol is open, free, and
universally implementable
๏ the protocol allows for an
authentication and authorization
procedure, where necessary
๏ metadata are accessible, even
when the data are no longer
available
• To be Interoperable:
๏ (meta)data use a formal, accessible,
shared, and broadly applicable
language for knowledge
representation.
๏ (meta)data use vocabularies that
follow FAIR principles
๏ (meta)data include qualified
references to other (meta)data
• To be Reusable:
๏ (meta)data are richly described
with a plurality of accurate and
relevant attributes
๏ (meta)data are released with a
clear and accessible data usage
license
๏ (meta)data are associated with
detailed provenance
๏ (meta)data meet domain-relevant
community standards
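
To make the "by machines" emphasis concrete, here is a minimal sketch of an automated check that a metadata record carries a few of the elements the principles ask for. The field names, the example record, and the DOI are hypothetical and chosen only for illustration; the principles themselves do not prescribe a schema.

```python
# Illustrative mapping from hypothetical metadata fields to the FAIR
# principle each one supports. These names are assumptions, not a standard.
REQUIRED_FIELDS = {
    "identifier": "Findable: globally unique and persistent identifier",
    "description": "Findable: rich descriptive metadata",
    "access_url": "Accessible: retrievable via a standardized protocol",
    "license": "Reusable: clear and accessible data usage license",
    "provenance": "Reusable: detailed provenance",
}


def missing_fair_fields(record):
    """Return the illustrative fields that are absent or empty in `record`."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Made-up record: it has everything except provenance.
record = {
    "identifier": "doi:10.1234/EXAMPLE",
    "description": "Survey responses, 2015 wave",
    "access_url": "https://repository.example.org/dataset/1",
    "license": "CC0",
}

missing = missing_fair_fields(record)
for name in missing:
    print(f"missing {name}: {REQUIRED_FIELDS[name]}")
```

A check like this covers only the presence of fields; whether the metadata are rich or accurate still needs curation.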
We built Dataverse to incentivize data sharing,
with “good data management” in mind
• An open-source platform to share and archive data

• Developed at Harvard’s Institute for Quantitative Social
Science since 2006

• Gives credit and control to data authors & producers

• Builds a community to:

• define new standards and best practices

• foster new research in data sharing and reproducibility

• Has brought data publishing into the hands of data authors
21 installations around the world
Used by researchers from > 500 institutions
60,000 datasets in Harvard Dataverse repository
http://dataverse.org
Dataverse is now a widely used repository platform
Dataverse has a growing, engaged
community of developers and users
• 38 GitHub contributors
• 332 members in the community list
• 23 community calls so far, with 239 participants from 8 countries
• Annual Community Meeting, with 200 attendees
Dataverse implements FAIR Data Principles
๏ Data Citation with global persistent IDs:
๏ generate DOI automatically
๏ attribution to data authors and repository
๏ registration to DataCite
๏ Rich Metadata:
๏ citation metadata
๏ domain-specific descriptive metadata
๏ variable and file metadata (extracted automatically)
๏ Access and usage controls:
๏ open data as default, with CC0 waiver
๏ custom terms of use and licenses, when needed
๏ data can be restricted, but citation & metadata always publicly accessible
๏ APIs and standards:
๏ SWORD, OAI-PMH, native API to search and get data and metadata
๏ Dublin Core and DDI metadata standards
๏ PROV ontology standard to capture provenance of a dataset (coming soon)
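
As a rough sketch of what "search and get data and metadata" through the native API can look like, the snippet below only builds request URLs; it makes no network calls. The base URL and the DOI are placeholders, and the endpoint paths reflect the Dataverse API guide as understood here, so they should be checked against the version an installation runs.

```python
from urllib.parse import urlencode

# Assumption: point this at whichever Dataverse installation you use.
BASE_URL = "https://demo.dataverse.org"


def search_url(query, kind="dataset"):
    """Build a native Search API URL for `query`, limited to one object type."""
    return f"{BASE_URL}/api/search?" + urlencode({"q": query, "type": kind})


def dataset_url(persistent_id):
    """Build a URL that fetches a dataset's metadata by its persistent ID."""
    params = urlencode({"persistentId": persistent_id})
    return f"{BASE_URL}/api/datasets/:persistentId/?{params}"


print(search_url("fair data"))
print(dataset_url("doi:10.5072/FK2/EXAMPLE"))  # made-up DOI
```

Fetching either URL with any HTTP client should return JSON; restricted files additionally need an API token, which is left out of this sketch.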
Dataset Landing Page
• Files: data, docs, code
• Data Citation
• Metadata
• Terms of Use & Licenses
• Versions
Standard file formats and automatic
metadata extraction allow data exploration
[Figure: variables of a data file (Var1 to Var4) opened in the exploration tools]
• TwoRavens: summary stats & analysis
• WorldMap: geospatial exploration of a geospatial variable
Led by Boston University, the MOC is a collaborative effort
among BU, Harvard, UMass Amherst, MIT, and Northeastern
University, as well as the Massachusetts Green High-
Performance Computing Center (MGHPCC) and Oak Ridge
National Laboratory (ORNL)
Cloud Dataverse:
• is a collaboration between Massachusetts Open Cloud (MOC) and Dataverse
• will allow replication of data in multiple storage locations, and access to cloud computing
Dataverse led by Harvard’s IQSS
[Diagram: Dataverse now and with Cloud Dataverse. Now: a data depositor publishes a dataset (data + metadata) to the repository, and data users download data files and metadata. With Cloud Dataverse: data are replicated to a Swift object store, where data users access the object in Swift and compute with Sahara/Hadoop (cloud access and compute).]
Dataset enabled in
Cloud Dataverse
Future Dataset Landing Page: Access to Compute
http://privacytools.seas.harvard.edu http://datatags.org
DataTags:
• is part of the Harvard University Privacy Tools Project
• will be integrated with Dataverse
http://dataverse.org
A datatag is a set of security features and
access requirements for file handling.
A datatags repository is one that stores and
shares data files in accordance with
standardized and ordered levels of security and
access requirements.
DataTags Levels
Tag Type | Description          | Security Features                           | Access Credentials
Blue     | Public               | Clear storage, Clear transmit               | Open
Green    | Controlled public    | Clear storage, Clear transmit               | Email- or OAuth-Verified Registration
Yellow   | Accountable          | Clear storage, Encrypted transmit           | Password, Registered, Approval, Click-through DUA
Orange   | More accountable     | Encrypted storage, Encrypted transmit       | Password, Registered, Approval, Signed DUA
Red      | Fully accountable    | Encrypted storage, Encrypted transmit       | Two-factor authentication, Approval, Signed DUA
Crimson  | Maximally restricted | Multi-encrypted storage, Encrypted transmit | Two-factor authentication, Approval, Signed DUA
DataTags and their respective policies
Sweeney L, Crosas M, Bar-Sinai M. Sharing Sensitive Data with Confidence: The Datatags System. Technology Science. 2015.
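
Because the levels are ordered, they map naturally onto a comparable type. The sketch below is one possible encoding of the six tags and their policies; the class and field names are assumptions made for illustration and are not the DataTags project's own representation.

```python
from dataclasses import dataclass
from enum import IntEnum


class DataTag(IntEnum):
    """The six tags, numbered from least to most restrictive."""
    BLUE = 1
    GREEN = 2
    YELLOW = 3
    ORANGE = 4
    RED = 5
    CRIMSON = 6


@dataclass(frozen=True)
class Policy:
    description: str
    storage: str      # "clear", "encrypted", or "multi-encrypted"
    transmit: str     # "clear" or "encrypted"
    credentials: str  # access credentials, as listed for each level


POLICIES = {
    DataTag.BLUE: Policy("Public", "clear", "clear", "Open"),
    DataTag.GREEN: Policy("Controlled public", "clear", "clear",
                          "Email- or OAuth-verified registration"),
    DataTag.YELLOW: Policy("Accountable", "clear", "encrypted",
                           "Password, registered, approval, click-through DUA"),
    DataTag.ORANGE: Policy("More accountable", "encrypted", "encrypted",
                           "Password, registered, approval, signed DUA"),
    DataTag.RED: Policy("Fully accountable", "encrypted", "encrypted",
                        "Two-factor authentication, approval, signed DUA"),
    DataTag.CRIMSON: Policy("Maximally restricted", "multi-encrypted", "encrypted",
                            "Two-factor authentication, approval, signed DUA"),
}


def stricter(a, b):
    """Return whichever of two tags is more restrictive."""
    return max(a, b)


print(stricter(DataTag.YELLOW, DataTag.RED).name)
print(POLICIES[DataTag.ORANGE].storage)
```

Using an integer enum keeps comparisons such as `DataTag.RED > DataTag.YELLOW` cheap and explicit, which is the property the ordered levels rely on.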
Requirements for a DataTags Repository
1. Supports more than one datatag
2. Each file in the repository must have one and only one datatag
๏ additional requirements cannot weaken the file security
๏ and cannot require the same or more security than a more
restrictive datatag
3. A recipient of a file from the repository must
๏ satisfy file’s access requirements,
๏ produce sufficient credentials as requested,
๏ and agree to any terms of use required to acquire the file.
4. Provides technological guarantees for requirements 1, 2 and 3.
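
Requirements 1 to 3 can be sketched as two small checks: one over the repository's file-to-tag assignments and one over a recipient's credentials. The credential identifiers and file names below are hypothetical names invented for this sketch, and requirement 4 (technological guarantees) is outside its scope.

```python
# Credentials each tag demands, following the access credentials listed for
# the six DataTags levels. The lowercase identifiers are hypothetical.
REQUIRED_CREDENTIALS = {
    "blue": set(),
    "green": {"verified_registration"},
    "yellow": {"password", "registered", "approval", "click_through_dua"},
    "orange": {"password", "registered", "approval", "signed_dua"},
    "red": {"two_factor", "approval", "signed_dua"},
    "crimson": {"two_factor", "approval", "signed_dua"},
}


def validate_repository(file_tags):
    """Requirements 1 and 2: several tags supported, one known tag per file.

    `file_tags` maps a file name to a single tag, so a file cannot carry two.
    """
    if len(REQUIRED_CREDENTIALS) < 2:
        raise ValueError("a DataTags repository must support more than one datatag")
    for name, tag in file_tags.items():
        if tag not in REQUIRED_CREDENTIALS:
            raise ValueError(f"{name}: unknown or missing datatag {tag!r}")


def may_receive(tag, credentials, agreed_to_terms):
    """Requirement 3: credentials cover the tag's needs and terms are accepted."""
    return agreed_to_terms and REQUIRED_CREDENTIALS[tag] <= set(credentials)


file_tags = {"survey.tab": "blue", "interviews.tab": "red"}  # made-up files
validate_repository(file_tags)
print(may_receive("red", ["two_factor", "approval", "signed_dua"], True))
```

Modeling credentials as sets makes the check a subset test, so a recipient holding extra credentials is still accepted.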
Repositories today do not have an easy,
standard way to support sensitive data
“User Uploads must be void of all identifiable information, such
that re-identification of any subjects from the amalgamation of the
information available from all of the materials (across datasets and
dataverses) uploaded under any one author and/or user should
not be possible.”
Terms of Use
We are making Dataverse a DataTags
Repository to share sensitive data
[Diagram: data file deposit, then the DataTags automated interview, then review board approval, yielding a sensitive data file. The file is offered through direct access (two-factor auth; approval; signed DUA) or privacy-preserving access (PSI differential privacy tool).]
The DataTags automated interview …
… helps you generate a machine-readable
datatag for your data
Thanks!
Learn more at http://dataverse.org
@mercecrosas mercecrosas.com

Mais conteúdo relacionado

Mais procurados

D4Science Data infrastructure: a facilitator for a FAIR data management
D4Science Data infrastructure: a facilitator for a FAIR data managementD4Science Data infrastructure: a facilitator for a FAIR data management
D4Science Data infrastructure: a facilitator for a FAIR data managementResearch Data Alliance
 
Managing, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital EnvironmentManaging, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital Environmentphilipdurbin
 
Connecting Dataverse with the Research Life Cycle
Connecting Dataverse with the Research Life CycleConnecting Dataverse with the Research Life Cycle
Connecting Dataverse with the Research Life CycleMerce Crosas
 
BioPharma and FAIR Data, a Collaborative Advantage
BioPharma and FAIR Data, a Collaborative AdvantageBioPharma and FAIR Data, a Collaborative Advantage
BioPharma and FAIR Data, a Collaborative AdvantageTom Plasterer
 
FAIR Data Knowledge Graphs–from Theory to Practice
FAIR Data Knowledge Graphs–from Theory to PracticeFAIR Data Knowledge Graphs–from Theory to Practice
FAIR Data Knowledge Graphs–from Theory to PracticeTom Plasterer
 
Data Citation Implementation Guidelines By Tim Clark
Data Citation Implementation Guidelines By Tim ClarkData Citation Implementation Guidelines By Tim Clark
Data Citation Implementation Guidelines By Tim Clarkdatascienceiqss
 
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...Merce Crosas
 
FAIR Data in Trustworthy Data Repositories Webinar - 12-13 December 2016| www...
FAIR Data in Trustworthy Data Repositories Webinar - 12-13 December 2016| www...FAIR Data in Trustworthy Data Repositories Webinar - 12-13 December 2016| www...
FAIR Data in Trustworthy Data Repositories Webinar - 12-13 December 2016| www...EUDAT
 
Altman RDAP11 Policy-based Data Management
Altman RDAP11 Policy-based Data ManagementAltman RDAP11 Policy-based Data Management
Altman RDAP11 Policy-based Data ManagementASIS&T
 
dkNET Poster ENDO 2016
dkNET Poster ENDO 2016 dkNET Poster ENDO 2016
dkNET Poster ENDO 2016 dkNET
 
Dataset Catalogs as a Foundation for FAIR* Data
Dataset Catalogs as a Foundation for FAIR* DataDataset Catalogs as a Foundation for FAIR* Data
Dataset Catalogs as a Foundation for FAIR* DataTom Plasterer
 
DataONE Education Module 02: Data Sharing
DataONE Education Module 02: Data SharingDataONE Education Module 02: Data Sharing
DataONE Education Module 02: Data SharingDataONE
 
Smith RDAP11 NSF Data Management Plan Case Studies
Smith RDAP11 NSF Data Management Plan Case StudiesSmith RDAP11 NSF Data Management Plan Case Studies
Smith RDAP11 NSF Data Management Plan Case StudiesASIS&T
 
dkNET Introductory Webinar 05/10/2017
dkNET Introductory Webinar 05/10/2017dkNET Introductory Webinar 05/10/2017
dkNET Introductory Webinar 05/10/2017dkNET
 
Global registries initiative frumkin omodei
Global registries initiative frumkin omodeiGlobal registries initiative frumkin omodei
Global registries initiative frumkin omodeiASIS&T
 
INSTRUCT - Integrated Structural Biology Infrastructure
INSTRUCT - Integrated Structural Biology InfrastructureINSTRUCT - Integrated Structural Biology Infrastructure
INSTRUCT - Integrated Structural Biology InfrastructureResearch Data Alliance
 
A Data Citation Roadmap for Scholarly Data Repositories
A Data Citation Roadmap for Scholarly Data RepositoriesA Data Citation Roadmap for Scholarly Data Repositories
A Data Citation Roadmap for Scholarly Data RepositoriesLIBER Europe
 

Mais procurados (20)

D4Science Data infrastructure: a facilitator for a FAIR data management
D4Science Data infrastructure: a facilitator for a FAIR data managementD4Science Data infrastructure: a facilitator for a FAIR data management
D4Science Data infrastructure: a facilitator for a FAIR data management
 
Managing, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital EnvironmentManaging, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital Environment
 
Connecting Dataverse with the Research Life Cycle
Connecting Dataverse with the Research Life CycleConnecting Dataverse with the Research Life Cycle
Connecting Dataverse with the Research Life Cycle
 
BioPharma and FAIR Data, a Collaborative Advantage
BioPharma and FAIR Data, a Collaborative AdvantageBioPharma and FAIR Data, a Collaborative Advantage
BioPharma and FAIR Data, a Collaborative Advantage
 
FAIR Data Knowledge Graphs–from Theory to Practice
FAIR Data Knowledge Graphs–from Theory to PracticeFAIR Data Knowledge Graphs–from Theory to Practice
FAIR Data Knowledge Graphs–from Theory to Practice
 
Data Citation Implementation Guidelines By Tim Clark
Data Citation Implementation Guidelines By Tim ClarkData Citation Implementation Guidelines By Tim Clark
Data Citation Implementation Guidelines By Tim Clark
 
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
 
FAIR data overview
FAIR data overviewFAIR data overview
FAIR data overview
 
FAIR Data in Trustworthy Data Repositories Webinar - 12-13 December 2016| www...
FAIR Data in Trustworthy Data Repositories Webinar - 12-13 December 2016| www...FAIR Data in Trustworthy Data Repositories Webinar - 12-13 December 2016| www...
FAIR Data in Trustworthy Data Repositories Webinar - 12-13 December 2016| www...
 
Altman RDAP11 Policy-based Data Management
Altman RDAP11 Policy-based Data ManagementAltman RDAP11 Policy-based Data Management
Altman RDAP11 Policy-based Data Management
 
dkNET Poster ENDO 2016
dkNET Poster ENDO 2016 dkNET Poster ENDO 2016
dkNET Poster ENDO 2016
 
Dataset Catalogs as a Foundation for FAIR* Data
Dataset Catalogs as a Foundation for FAIR* DataDataset Catalogs as a Foundation for FAIR* Data
Dataset Catalogs as a Foundation for FAIR* Data
 
McGeary Data Curation Network: Developing and Scaling
McGeary Data Curation Network: Developing and ScalingMcGeary Data Curation Network: Developing and Scaling
McGeary Data Curation Network: Developing and Scaling
 
DataONE Education Module 02: Data Sharing
DataONE Education Module 02: Data SharingDataONE Education Module 02: Data Sharing
DataONE Education Module 02: Data Sharing
 
Smith RDAP11 NSF Data Management Plan Case Studies
Smith RDAP11 NSF Data Management Plan Case StudiesSmith RDAP11 NSF Data Management Plan Case Studies
Smith RDAP11 NSF Data Management Plan Case Studies
 
dkNET Introductory Webinar 05/10/2017
dkNET Introductory Webinar 05/10/2017dkNET Introductory Webinar 05/10/2017
dkNET Introductory Webinar 05/10/2017
 
Global registries initiative frumkin omodei
Global registries initiative frumkin omodeiGlobal registries initiative frumkin omodei
Global registries initiative frumkin omodei
 
INSTRUCT - Integrated Structural Biology Infrastructure
INSTRUCT - Integrated Structural Biology InfrastructureINSTRUCT - Integrated Structural Biology Infrastructure
INSTRUCT - Integrated Structural Biology Infrastructure
 
Borgman - Privacy, Policy and Data Governance in the University
Borgman - Privacy, Policy and Data Governance in the UniversityBorgman - Privacy, Policy and Data Governance in the University
Borgman - Privacy, Policy and Data Governance in the University
 
A Data Citation Roadmap for Scholarly Data Repositories
A Data Citation Roadmap for Scholarly Data RepositoriesA Data Citation Roadmap for Scholarly Data Repositories
A Data Citation Roadmap for Scholarly Data Repositories
 

Destaque

Fintech platforms and strategy 2016
Fintech platforms and strategy 2016Fintech platforms and strategy 2016
Fintech platforms and strategy 2016Ian Beckett
 
New kids on the blockchain fintech
New kids on the blockchain   fintech New kids on the blockchain   fintech
New kids on the blockchain fintech Ian Beckett
 
Blockchain analysis of regulation and tech related to distributed ledger-fi...
Blockchain   analysis of regulation and tech related to distributed ledger-fi...Blockchain   analysis of regulation and tech related to distributed ledger-fi...
Blockchain analysis of regulation and tech related to distributed ledger-fi...Ian Beckett
 
The rise of the digital supply network - IIOT Industry40 distribution
The rise of the digital supply network - IIOT Industry40 distribution The rise of the digital supply network - IIOT Industry40 distribution
The rise of the digital supply network - IIOT Industry40 distribution Ian Beckett
 
Distributed ledger technology in payments clearing and settlement - blockchai...
Distributed ledger technology in payments clearing and settlement - blockchai...Distributed ledger technology in payments clearing and settlement - blockchai...
Distributed ledger technology in payments clearing and settlement - blockchai...Ian Beckett
 
Wide variation and excessive dosage of opioid prescriptions for common genera...
Wide variation and excessive dosage of opioid prescriptions for common genera...Wide variation and excessive dosage of opioid prescriptions for common genera...
Wide variation and excessive dosage of opioid prescriptions for common genera...Paul Coelho, MD
 
The fintech startup scene in germany
The fintech startup scene in germany  The fintech startup scene in germany
The fintech startup scene in germany Ian Beckett
 
The UK fintech industry support policies and its implications
The UK fintech industry support policies and its implicationsThe UK fintech industry support policies and its implications
The UK fintech industry support policies and its implicationsIan Beckett
 
Choosing Open (#OEGlobal) - Openness and praxis: Using OEP in HE
Choosing Open (#OEGlobal) - Openness and praxis: Using OEP in HEChoosing Open (#OEGlobal) - Openness and praxis: Using OEP in HE
Choosing Open (#OEGlobal) - Openness and praxis: Using OEP in HECatherine Cronin
 
Bain blockchain in financial markets - how to gain an edge
Bain   blockchain in financial markets - how to gain an edgeBain   blockchain in financial markets - how to gain an edge
Bain blockchain in financial markets - how to gain an edgeIan Beckett
 
2017 iosco research report on financial technologies (fintech)
2017 iosco research report on  financial technologies (fintech)2017 iosco research report on  financial technologies (fintech)
2017 iosco research report on financial technologies (fintech)Ian Beckett
 
Atomico - state of European Tech 2016
Atomico - state of European Tech 2016Atomico - state of European Tech 2016
Atomico - state of European Tech 2016Ian Beckett
 
Cuidado de la salud y nutrición de los niños 2
Cuidado  de la salud y nutrición de los niños 2Cuidado  de la salud y nutrición de los niños 2
Cuidado de la salud y nutrición de los niños 2Milena Zuñiga
 
Impact of corporate vc investment motivation on startup valuation
Impact of corporate vc investment motivation on startup valuationImpact of corporate vc investment motivation on startup valuation
Impact of corporate vc investment motivation on startup valuationIan Beckett
 
Fake news presentation
Fake news presentationFake news presentation
Fake news presentationYumonomics
 
Radio y television
Radio y televisionRadio y television
Radio y televisionUPTC LIMITED
 

Destaque (20)

V tech toy comparison
V tech toy comparisonV tech toy comparison
V tech toy comparison
 
temple y revenido
temple y revenidotemple y revenido
temple y revenido
 
Fintech platforms and strategy 2016
Fintech platforms and strategy 2016Fintech platforms and strategy 2016
Fintech platforms and strategy 2016
 
New kids on the blockchain fintech
New kids on the blockchain   fintech New kids on the blockchain   fintech
New kids on the blockchain fintech
 
Blockchain analysis of regulation and tech related to distributed ledger-fi...
Blockchain   analysis of regulation and tech related to distributed ledger-fi...Blockchain   analysis of regulation and tech related to distributed ledger-fi...
Blockchain analysis of regulation and tech related to distributed ledger-fi...
 
The rise of the digital supply network - IIOT Industry40 distribution
The rise of the digital supply network - IIOT Industry40 distribution The rise of the digital supply network - IIOT Industry40 distribution
The rise of the digital supply network - IIOT Industry40 distribution
 
Distributed ledger technology in payments clearing and settlement - blockchai...
Distributed ledger technology in payments clearing and settlement - blockchai...Distributed ledger technology in payments clearing and settlement - blockchai...
Distributed ledger technology in payments clearing and settlement - blockchai...
 
Wide variation and excessive dosage of opioid prescriptions for common genera...
Wide variation and excessive dosage of opioid prescriptions for common genera...Wide variation and excessive dosage of opioid prescriptions for common genera...
Wide variation and excessive dosage of opioid prescriptions for common genera...
 
The fintech startup scene in germany
The fintech startup scene in germany  The fintech startup scene in germany
The fintech startup scene in germany
 
The UK fintech industry support policies and its implications
The UK fintech industry support policies and its implicationsThe UK fintech industry support policies and its implications
The UK fintech industry support policies and its implications
 
Choosing Open (#OEGlobal) - Openness and praxis: Using OEP in HE
Choosing Open (#OEGlobal) - Openness and praxis: Using OEP in HEChoosing Open (#OEGlobal) - Openness and praxis: Using OEP in HE
Choosing Open (#OEGlobal) - Openness and praxis: Using OEP in HE
 
Bain blockchain in financial markets - how to gain an edge
Bain   blockchain in financial markets - how to gain an edgeBain   blockchain in financial markets - how to gain an edge
Bain blockchain in financial markets - how to gain an edge
 
2017 iosco research report on financial technologies (fintech)
2017 iosco research report on  financial technologies (fintech)2017 iosco research report on  financial technologies (fintech)
2017 iosco research report on financial technologies (fintech)
 
Grem ampt
Grem amptGrem ampt
Grem ampt
 
Atomico - state of European Tech 2016
Atomico - state of European Tech 2016Atomico - state of European Tech 2016
Atomico - state of European Tech 2016
 
20%tercer corte
20%tercer corte20%tercer corte
20%tercer corte
 
Cuidado de la salud y nutrición de los niños 2
Cuidado  de la salud y nutrición de los niños 2Cuidado  de la salud y nutrición de los niños 2
Cuidado de la salud y nutrición de los niños 2
 
Impact of corporate vc investment motivation on startup valuation
Impact of corporate vc investment motivation on startup valuationImpact of corporate vc investment motivation on startup valuation
Impact of corporate vc investment motivation on startup valuation
 
Fake news presentation
Fake news presentationFake news presentation
Fake news presentation
 
Radio y television
Radio y televisionRadio y television
Radio y television
 

Semelhante a Dataverse, Cloud Dataverse, and DataTags

Data Publishing at Harvard's Research Data Access Symposium
Data Publishing at Harvard's Research Data Access SymposiumData Publishing at Harvard's Research Data Access Symposium
Data Publishing at Harvard's Research Data Access SymposiumMerce Crosas
 
Research methods group accelarating impact by sharing data
Research methods group  accelarating impact by sharing dataResearch methods group  accelarating impact by sharing data
Research methods group accelarating impact by sharing dataWorld Agroforestry (ICRAF)
 
FAIR Data Knowledge Graphs
FAIR Data Knowledge GraphsFAIR Data Knowledge Graphs
FAIR Data Knowledge GraphsTom Plasterer
 
EMBL Australian Bioinformatics Resource AHM - Data Commons
EMBL Australian Bioinformatics Resource AHM   - Data CommonsEMBL Australian Bioinformatics Resource AHM   - Data Commons
EMBL Australian Bioinformatics Resource AHM - Data CommonsVivien Bonazzi
 
NIH Data Summit - The NIH Data Commons
NIH Data Summit - The NIH Data CommonsNIH Data Summit - The NIH Data Commons
NIH Data Summit - The NIH Data CommonsVivien Bonazzi
 
What is Data Commons and How Can Your Organization Build One?
What is Data Commons and How Can Your Organization Build One?What is Data Commons and How Can Your Organization Build One?
What is Data Commons and How Can Your Organization Build One?Robert Grossman
 
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...Carole Goble
 
Open Science Globally: Some Developments/Dr Simon Hodson
Open Science Globally: Some Developments/Dr Simon HodsonOpen Science Globally: Some Developments/Dr Simon Hodson
Open Science Globally: Some Developments/Dr Simon HodsonAfrican Open Science Platform
 
How Cyverse.org enables scalable data discoverability and re-use
How Cyverse.org enables scalable data discoverability and re-useHow Cyverse.org enables scalable data discoverability and re-use
How Cyverse.org enables scalable data discoverability and re-useMatthew Vaughn
 
Data commons bonazzi bd2 k fundamentals of science feb 2017
Data commons bonazzi   bd2 k fundamentals of science feb 2017Data commons bonazzi   bd2 k fundamentals of science feb 2017
Data commons bonazzi bd2 k fundamentals of science feb 2017Vivien Bonazzi
 
Towards metrics to assess and encourage FAIRness
Towards metrics to assess and encourage FAIRnessTowards metrics to assess and encourage FAIRness
Towards metrics to assess and encourage FAIRnessMichel Dumontier
 
DataCite and its DOI infrastructure - IASSIST 2013
DataCite and its DOI infrastructure - IASSIST 2013DataCite and its DOI infrastructure - IASSIST 2013
DataCite and its DOI infrastructure - IASSIST 2013Frauke Ziedorn
 
Data sharing in the Netherlands
Data sharing in the NetherlandsData sharing in the Netherlands
Data sharing in the NetherlandsJisc RDM
 
VODAN Africa IN.pptx
VODAN Africa IN.pptxVODAN Africa IN.pptx
VODAN Africa IN.pptxGetu Tadele
 
FAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practiceFAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practiceCarole Goble
 
FAIRness Assessment of the Library of Integrated Network-based Cellular Signa...
FAIRness Assessment of the Library of Integrated Network-based Cellular Signa...FAIRness Assessment of the Library of Integrated Network-based Cellular Signa...
FAIRness Assessment of the Library of Integrated Network-based Cellular Signa...Kathleen Jagodnik
 
Introduction to APIs and Linked Data
Introduction to APIs and Linked DataIntroduction to APIs and Linked Data
Introduction to APIs and Linked DataAdrian Stevenson
 
Lynch & Dirks - Platforms for Open Research - Charleston Conference 2011
Lynch & Dirks  - Platforms for Open Research - Charleston Conference 2011Lynch & Dirks  - Platforms for Open Research - Charleston Conference 2011
Lynch & Dirks - Platforms for Open Research - Charleston Conference 2011Lee Dirks
 

Semelhante a Dataverse, Cloud Dataverse, and DataTags (20)

Data Publishing at Harvard's Research Data Access Symposium
Data Publishing at Harvard's Research Data Access SymposiumData Publishing at Harvard's Research Data Access Symposium
Data Publishing at Harvard's Research Data Access Symposium
 
Research methods group accelarating impact by sharing data
Research methods group  accelarating impact by sharing dataResearch methods group  accelarating impact by sharing data
Research methods group accelarating impact by sharing data
 
FAIR Data Knowledge Graphs
FAIR Data Knowledge GraphsFAIR Data Knowledge Graphs
FAIR Data Knowledge Graphs
 
Jonathan David Crabtree - The Dataverse Community: Supporting Open Science an...
Jonathan David Crabtree - The Dataverse Community: Supporting Open Science an...Jonathan David Crabtree - The Dataverse Community: Supporting Open Science an...
Jonathan David Crabtree - The Dataverse Community: Supporting Open Science an...
 
EMBL Australian Bioinformatics Resource AHM - Data Commons
EMBL Australian Bioinformatics Resource AHM   - Data CommonsEMBL Australian Bioinformatics Resource AHM   - Data Commons
EMBL Australian Bioinformatics Resource AHM - Data Commons
 
NIH Data Summit - The NIH Data Commons
NIH Data Summit - The NIH Data CommonsNIH Data Summit - The NIH Data Commons
NIH Data Summit - The NIH Data Commons
 
What is Data Commons and How Can Your Organization Build One?
What is Data Commons and How Can Your Organization Build One?What is Data Commons and How Can Your Organization Build One?
What is Data Commons and How Can Your Organization Build One?
 
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
 
Open Science Globally: Some Developments/Dr Simon Hodson
Open Science Globally: Some Developments/Dr Simon HodsonOpen Science Globally: Some Developments/Dr Simon Hodson
Open Science Globally: Some Developments/Dr Simon Hodson
 
How Cyverse.org enables scalable data discoverability and re-use
How Cyverse.org enables scalable data discoverability and re-useHow Cyverse.org enables scalable data discoverability and re-use
How Cyverse.org enables scalable data discoverability and re-use
 
Data commons bonazzi bd2 k fundamentals of science feb 2017
Data commons bonazzi   bd2 k fundamentals of science feb 2017Data commons bonazzi   bd2 k fundamentals of science feb 2017
Data commons bonazzi bd2 k fundamentals of science feb 2017
 
Towards metrics to assess and encourage FAIRness
Towards metrics to assess and encourage FAIRnessTowards metrics to assess and encourage FAIRness
Towards metrics to assess and encourage FAIRness
 
DataCite and its DOI infrastructure - IASSIST 2013
DataCite and its DOI infrastructure - IASSIST 2013DataCite and its DOI infrastructure - IASSIST 2013
DataCite and its DOI infrastructure - IASSIST 2013
 
SMART Seminar Series: SMART Data Management
SMART Seminar Series: SMART Data ManagementSMART Seminar Series: SMART Data Management
SMART Seminar Series: SMART Data Management
 
Data sharing in the Netherlands
Data sharing in the NetherlandsData sharing in the Netherlands
Data sharing in the Netherlands
 
VODAN Africa IN.pptx
VODAN Africa IN.pptxVODAN Africa IN.pptx
VODAN Africa IN.pptx
 
FAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practiceFAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practice
 
FAIRness Assessment of the Library of Integrated Network-based Cellular Signa...
FAIRness Assessment of the Library of Integrated Network-based Cellular Signa...FAIRness Assessment of the Library of Integrated Network-based Cellular Signa...
FAIRness Assessment of the Library of Integrated Network-based Cellular Signa...
 
Introduction to APIs and Linked Data
Introduction to APIs and Linked DataIntroduction to APIs and Linked Data
Introduction to APIs and Linked Data
 
Lynch & Dirks - Platforms for Open Research - Charleston Conference 2011
Lynch & Dirks  - Platforms for Open Research - Charleston Conference 2011Lynch & Dirks  - Platforms for Open Research - Charleston Conference 2011
Lynch & Dirks - Platforms for Open Research - Charleston Conference 2011
 

Dataverse, Cloud Dataverse, and DataTags

  • 1. Mercè Crosas, Ph.D. Chief Data Science and Technology Officer, Institute for Quantitative Social Science (IQSS), Harvard University @mercecrosas mercecrosas.com
  • 2. Data should be Findable, Accessible, Interoperable, Reusable (FAIR) by machines. Wilkinson et al., “The FAIR Guiding Principles for scientific data management and stewardship,” Nature Scientific Data, 2016; NIH Data Commons Principles; Joint Declaration of Data Citation Principles (Force11)
  • 3. “FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals.” “Good data management is not a goal in itself, but rather is the key conduit leading to knowledge discovery and innovation, and to subsequent data and knowledge integration and reuse by the community after the data publication process.”
  • 4. FAIR Data Principles in Brief
      • To be Findable:
          ๏ (meta)data are assigned a globally unique and persistent identifier
          ๏ data are described with rich metadata
          ๏ metadata clearly and explicitly include the identifier of the data it describes
          ๏ (meta)data are registered or indexed in a searchable resource
      • To be Accessible:
          ๏ (meta)data are retrievable by their identifier using a standardized communications protocol
          ๏ the protocol is open, free, and universally implementable
          ๏ the protocol allows for an authentication and authorization procedure, where necessary
          ๏ metadata are accessible, even when the data are no longer available
      • To be Interoperable:
          ๏ (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
          ๏ (meta)data use vocabularies that follow FAIR principles
          ๏ (meta)data include qualified references to other (meta)data
      • To be Reusable:
          ๏ (meta)data are richly described with a plurality of accurate and relevant attributes
          ๏ (meta)data are released with a clear and accessible data usage license
          ๏ (meta)data are associated with detailed provenance
          ๏ (meta)data meet domain-relevant community standards
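As one illustration of "retrievable by their identifier using a standardized communications protocol", the sketch below builds an HTTP request that asks the public DOI resolver for machine-readable citation metadata through content negotiation. It is not taken from the slides: the DOI is a hypothetical placeholder, the function name is made up for this example, and the request is only constructed, not sent.

```python
import urllib.request

DOI_RESOLVER = "https://doi.org/"  # public DOI resolver


def build_metadata_request(doi: str) -> urllib.request.Request:
    """Build (but do not send) a request for machine-readable metadata.

    The Accept header asks for CSL JSON instead of the human landing page.
    """
    return urllib.request.Request(
        DOI_RESOLVER + doi,
        headers={"Accept": "application/vnd.citationstyles.csl+json"},
    )


# Hypothetical identifier, for illustration only.
request = build_metadata_request("10.5555/example-dataset")
print(request.full_url)
print(request.get_header("Accept"))
# To actually fetch, pass `request` to urllib.request.urlopen and parse the JSON body.
```

Because the identifier resolves over plain HTTPS, a machine needs no repository-specific client to reach the metadata, which is the point of the Accessible principles.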
  • 5. We built Dataverse to incentivize data sharing, with “good data management” in mind
      • An open-source platform to share and archive data
      • Developed at Harvard’s Institute for Quantitative Social Science since 2006
      • Gives credit and control to data authors & producers
      • Builds a community to:
          • define new standards and best practices
          • foster new research in data sharing and reproducibility
      • Has brought data publishing into the hands of data authors
  • 6. Dataverse is now a widely used repository platform
      • 21 installations around the world
      • Used by researchers from > 500 institutions
      • 60,000 datasets in the Harvard Dataverse repository
      • http://dataverse.org
  • 7. Dataverse has a growing, engaged community of developers and users
      • 38 GitHub contributors
      • 332 members in the community list
      • 23 community calls so far, with 239 participants from 8 countries
      • Annual Community Meeting, with 200 attendees
  • 8. Dataverse implements FAIR Data Principles
      ๏ Data Citation with global persistent IDs:
          ๏ generate DOI automatically
          ๏ attribution to data authors and repository
          ๏ registration to DataCite
      ๏ Rich Metadata:
          ๏ citation metadata
          ๏ domain-specific descriptive metadata
          ๏ variable and file metadata (extracted automatically)
      ๏ Access and usage controls:
          ๏ open data as default, with CC0 waiver
          ๏ custom terms of use and licenses, when needed
          ๏ data can be restricted, but citation & metadata always publicly accessible
      ๏ APIs and standards:
          ๏ SWORD, OAI-PMH, native API to search and get data and metadata
          ๏ Dublin Core and DDI metadata standards
          ๏ PROV ontology standard to capture provenance of a dataset (coming soon)
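The "native API to search and get data and metadata" can be sketched as follows. This is a minimal example, assuming the Dataverse Search API endpoint `/api/search` with `q`, `type`, and `per_page` query parameters, and assuming that dataset items in the JSON response carry a `global_id` field. The helper names and the sample response are hypothetical, and no network call is made.

```python
from urllib.parse import urlencode


def build_search_url(base_url: str, query: str,
                     item_type: str = "dataset", per_page: int = 10) -> str:
    """Compose a Search API URL for a Dataverse installation."""
    params = urlencode({"q": query, "type": item_type, "per_page": per_page})
    return f"{base_url.rstrip('/')}/api/search?{params}"


def dataset_ids(response: dict) -> list:
    """Collect persistent identifiers from a decoded search response.

    Assumes the shape {"data": {"items": [{"global_id": ...}, ...]}}.
    """
    items = response.get("data", {}).get("items", [])
    return [item["global_id"] for item in items if "global_id" in item]


url = build_search_url("https://dataverse.harvard.edu", "election turnout")
print(url)

# Hypothetical response body, shaped like the assumption above.
sample = {"status": "OK",
          "data": {"items": [{"name": "Example", "global_id": "doi:10.5555/EXAMPLE"}]}}
print(dataset_ids(sample))
```

Returning persistent identifiers rather than internal database keys keeps downstream tools aligned with the data-citation bullets on the same slide.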
  • 9. Dataset Landing Page: Files (data, docs, code); Data Citation
  • 11. Dataset Landing Page: Terms of Use & Licenses
  • 13. Standard file formats and automatic metadata extraction allow data exploration
      • TwoRavens: summary stats & analysis
      • WorldMap: geospatial exploration (geospatial variable)
  • 14. Cloud Dataverse:
      • is a collaboration between Massachusetts Open Cloud (MOC) and Dataverse
      • will allow replication of data in multiple storage locations, and access to cloud computing
    Led by Boston University, the MOC is a collaborative effort among BU, Harvard, UMass Amherst, MIT, and Northeastern University, as well as the Massachusetts Green High-Performance Computing Center (MGHPCC) and Oak Ridge National Laboratory (ORNL). Dataverse is led by Harvard’s IQSS.
  • 16. Future Dataset Landing Page: Access to Compute. Dataset enabled in Cloud Dataverse: Cloud Access + Compute
  • 18. DataTags:
      • is part of the Harvard University PrivacyTools Project
      • will be integrated with Dataverse
    http://privacytools.seas.harvard.edu http://datatags.org http://dataverse.org
  • 19. A datatag is a set of security features and access requirements for file handling. A datatags repository is one that stores and shares data files in accordance with standardized, ordered levels of security and access requirements.
  • 20. DataTags Levels: DataTags and their respective policies
      Tag Type | Description          | Security Features                            | Access Credentials
      Blue     | Public               | Clear storage, clear transmit                | Open
      Green    | Controlled public    | Clear storage, clear transmit                | Email- or OAuth-verified registration
      Yellow   | Accountable          | Clear storage, encrypted transmit            | Password, Registered, Approval, Click-through DUA
      Orange   | More accountable     | Encrypted storage, encrypted transmit        | Password, Registered, Approval, Signed DUA
      Red      | Fully accountable    | Encrypted storage, encrypted transmit        | Two-factor authentication, Approval, Signed DUA
      Crimson  | Maximally restricted | Multi-encrypted storage, encrypted transmit  | Two-factor authentication, Approval, Signed DUA
    Sweeney L, Crosas M, Bar-Sinai M. Sharing Sensitive Data with Confidence: The Datatags System. Technology Science. 2015.
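The ordered levels in the table lend themselves to a small machine-readable encoding. The sketch below is one possible representation, not the DataTags project's own format: the class names, field names, and the `strictest` helper are assumptions made for illustration, while the values come from the table on this slide.

```python
from dataclasses import dataclass
from enum import IntEnum


class DataTag(IntEnum):
    """Tags ordered from least to most restrictive, as on the slide."""
    BLUE = 1
    GREEN = 2
    YELLOW = 3
    ORANGE = 4
    RED = 5
    CRIMSON = 6


@dataclass(frozen=True)
class Policy:
    description: str
    storage: str       # "clear", "encrypted", or "multi-encrypted"
    transmit: str      # "clear" or "encrypted"
    credentials: tuple


POLICIES = {
    DataTag.BLUE: Policy("Public", "clear", "clear", ("open",)),
    DataTag.GREEN: Policy("Controlled public", "clear", "clear",
                          ("verified registration",)),
    DataTag.YELLOW: Policy("Accountable", "clear", "encrypted",
                           ("password", "registered", "approval", "click-through DUA")),
    DataTag.ORANGE: Policy("More accountable", "encrypted", "encrypted",
                           ("password", "registered", "approval", "signed DUA")),
    DataTag.RED: Policy("Fully accountable", "encrypted", "encrypted",
                        ("two-factor authentication", "approval", "signed DUA")),
    DataTag.CRIMSON: Policy("Maximally restricted", "multi-encrypted", "encrypted",
                            ("two-factor authentication", "approval", "signed DUA")),
}


def strictest(tags):
    """Return the most restrictive tag among several candidates."""
    return max(tags)


print(strictest([DataTag.YELLOW, DataTag.RED, DataTag.BLUE]).name)
print(POLICIES[DataTag.ORANGE].storage)
```

Using an integer-backed enum makes "more restrictive than" a plain comparison, which matches the ordering that the levels table implies.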
  • 21. Requirements for a DataTags Repository
      1. Supports more than one datatag
      2. Each file in the repository must have one and only one datatag
          ๏ additional requirements cannot weaken the file security
          ๏ and cannot require the same or more security than a more restrictive datatag
      3. A recipient of a file from the repository must
          ๏ satisfy the file’s access requirements,
          ๏ produce sufficient credentials as requested,
          ๏ and agree to any terms of use required to acquire the file.
      4. Provides technological guarantees for requirements 1, 2 and 3.
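Requirements 1 to 3 can be read as checks that a repository performs at deposit and at release time. The following is a minimal in-memory sketch under that reading, not the Dataverse implementation: the class, method names, and credential labels are hypothetical, the credential sets are simplified from the levels table, and requirement 4 (technological guarantees such as encryption) is not modelled.

```python
from enum import IntEnum


class Tag(IntEnum):
    BLUE = 1
    GREEN = 2
    YELLOW = 3
    ORANGE = 4
    RED = 5
    CRIMSON = 6


# Credentials a recipient must present per tag (simplified labels).
REQUIRED = {
    Tag.BLUE: set(),
    Tag.GREEN: {"verified_registration"},
    Tag.YELLOW: {"password", "registered", "approval", "click_through_dua"},
    Tag.ORANGE: {"password", "registered", "approval", "signed_dua"},
    Tag.RED: {"two_factor", "approval", "signed_dua"},
    Tag.CRIMSON: {"two_factor", "approval", "signed_dua"},
}


class DataTagsRepository:
    def __init__(self, supported):
        supported = frozenset(supported)
        if len(supported) < 2:  # requirement 1: more than one datatag
            raise ValueError("a datatags repository supports more than one datatag")
        self._supported = supported
        self._tags = {}  # file name -> exactly one tag

    def deposit(self, name, tag):
        if tag not in self._supported:
            raise ValueError(f"unsupported datatag: {tag.name}")
        if name in self._tags:  # requirement 2: one and only one datatag per file
            raise ValueError(f"{name} already has a datatag")
        self._tags[name] = tag

    def may_release(self, name, credentials, accepted_terms):
        """Requirement 3: credentials suffice and terms of use are accepted."""
        tag = self._tags[name]
        return bool(accepted_terms) and REQUIRED[tag] <= set(credentials)


repo = DataTagsRepository({Tag.BLUE, Tag.RED})
repo.deposit("survey.csv", Tag.RED)
print(repo.may_release("survey.csv", {"two_factor", "approval", "signed_dua"}, True))
print(repo.may_release("survey.csv", {"password"}, True))
```

Keeping the tag in a single dictionary entry per file is what enforces the "one and only one" rule here; a real system would also need to enforce the storage and transmission features of each tag.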
  • 22. Repositories today do not have an easy, standard way to support sensitive data. From the Terms of Use: “User Uploads must be void of all identifiable information, such that re-identification of any subjects from the amalgamation of the information available from all of the materials (across datasets and dataverses) uploaded under any one author and/or user should not be possible.”
  • 23. We are making Dataverse a DataTags Repository to share sensitive data
      [Diagram labels: Data File Deposit; DataTags Automated Interview; Review Board Approval; Sensitive Data File; Direct Access (two-factor auth; approval; signed DUA); Privacy Preserving Access (PSI Differential Privacy Tool)]
  • 24. The DataTags automated interview …
  • 25. … helps you generate a machine-readable datatag for your data
  • 26. Thanks! Learn more at http://dataverse.org @mercecrosas mercecrosas.com