www.d4science.org
D4SCIENCE DATA INFRASTRUCTURE
Facilitator for a FAIR data management
Pasquale Pagano
CNR – ISTI
(Pisa, Italy)
Outline
Context
Requirements
Virtual Research Environments
Dealing with complexity
FAIR principles
Conclusions
D4Science is a hybrid data infrastructure: technologies integrated to provide elastic access to, and usage of, data and data-management capabilities
• +55 VREs hosted
• +2500 scientists in 44 countries
• +50 data providers
• +25,000 derivative data/month
• over a billion quality records
• +20,000 temporal datasets
• +50,000 spatial datasets
• 99.7% service availability
Humanities and Cultural Heritage
Social Mining
Environmental Studies
Biological and Ecological Studies
Communities’ needs
Not individual researchers but groups of researchers, dynamically aggregated to address research questions/problems, that
• are multidisciplinary and involve members belonging to diverse organisations
• cannot rely on costly environments managed by dedicated organisations
• need to access data and services that are spread among many providers
• wish to effectively inject open science into daily tasks
Build and operate their own supporting environments? The cost and time required to implement this approach largely exceed the available capacities
Requirements for IT systems
Support collaborative research and experimentation
Implement Reproducibility-Repeatability-Reusability
Allow sharing data and findings
Grant open access to produced scientific knowledge and data
Simplify access to existing computing and storage resources
Ensure low operational and maintenance costs
Manage heterogeneous data access policies
Virtual Research Environment
An operational environment where a set of resources (data, services, computational and storage resources) is assigned to a group of users via interfaces for a limited timeframe
L. Candela, D. Castelli, P. Pagano (2013) Virtual Research Environments: An Overview and a Research Agenda. Data Science Journal, Vol. 12
• Created on demand
• Regulated by tailored policies
• No cost for the resource providers
• Open to host and operate custom software
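The VRE definition above can be read as a small data model: a set of resources, a group of users, a limited timeframe, and tailored policies. The following is only a minimal sketch of that reading; the class, its field names, and the example values are assumptions made for illustration and do not reflect how D4Science implements VREs.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, Set


@dataclass
class VirtualResearchEnvironment:
    """Illustrative model of the VRE definition; every name here is hypothetical."""
    name: str
    resources: Set[str]   # data, services, computational and storage resources
    members: Set[str]     # the group of users the resources are assigned to
    valid_from: date      # start of the limited timeframe
    valid_until: date     # end of the limited timeframe
    # Tailored policies: resource name -> members allowed to use it.
    policies: Dict[str, Set[str]] = field(default_factory=dict)

    def is_active(self, on: date) -> bool:
        return self.valid_from <= on <= self.valid_until

    def can_use(self, user: str, resource: str, on: date) -> bool:
        if not self.is_active(on) or user not in self.members:
            return False
        if resource not in self.resources:
            return False
        allowed = self.policies.get(resource)
        # A resource without a tailored policy is usable by every member.
        return allowed is None or user in allowed


vre = VirtualResearchEnvironment(
    name="ExampleVRE",
    resources={"interpolation-service", "shared-workspace"},
    members={"alice", "bob"},
    valid_from=date(2018, 1, 1),
    valid_until=date(2018, 12, 31),
    policies={"interpolation-service": {"alice"}},
)
print(vre.can_use("alice", "interpolation-service", date(2018, 6, 1)))
```

Keeping the timeframe and the policies on the environment, rather than on each resource, mirrors the idea that a VRE is created on demand and that providers are not asked to change anything on their side.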
D4Science Geospatial Interpolation
• In situ observations from the Copernicus Marine Environment Monitoring Service
• Interpolation service: SeaDataNet Data-Interpolating Variational Analysis service (DIVA)
• Estimates global, uniform distributions of environmental parameters from scattered observations
• Exploit the global estimate and run niche modelling to calculate a species distribution
The SeaDataNet-D4Science Connector Architecture
[Architecture diagram. Labels: User, Other user, VRE, Workspace, Input, Data preparation + Comp. parameters, WPS, REST, NetCDF file, Out. file, Provenance Metadata (Prov-O), Sharing, Publication, Geospatial data infra. (WMS, WCS, GeoTiff, NetCDF, OPeNDAP), OGC Standards, Visualisation]
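The architecture lists WPS among the interfaces to the interpolation service. As a rough illustration of what such a call looks like, the sketch below builds a WPS 1.0.0 key-value-pair Execute request URL. The endpoint, the process identifier, and the input parameter names are hypothetical placeholders, not the actual D4Science or DIVA values, and any authentication the real service requires is left out.

```python
from typing import Dict
from urllib.parse import urlencode


def build_wps_execute_url(endpoint: str, process_id: str, inputs: Dict[str, str]) -> str:
    """Build a WPS 1.0.0 key-value-pair Execute request URL."""
    # The KVP encoding passes inputs as key=value pairs separated by ';'.
    data_inputs = ";".join(f"{key}={value}" for key, value in inputs.items())
    query = urlencode({
        "service": "WPS",
        "version": "1.0.0",
        "request": "Execute",
        "identifier": process_id,
        "datainputs": data_inputs,  # percent-encoded by urlencode
    })
    return f"{endpoint}?{query}"


url = build_wps_execute_url(
    "https://wps.example.org/wps",   # hypothetical endpoint
    "example.diva.interpolation",    # hypothetical process identifier
    {"InputFile": "observations.nc", "Resolution": "0.5"},  # hypothetical inputs
)
print(url)
```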
FAIR principles
Findable
• F1. globally unique and eternally persistent identifier
• F2. rich metadata
• F3. indexed in a searchable resource
• F4. metadata specify the data identifier
Accessible
• A1. retrievable by their identifier using a standardized protocol
  • A1.1. the protocol is open, free, and universally implementable
  • A1.2. the protocol allows for an authentication and authorization procedure
• A2. metadata are accessible, even when the data are no longer available
Interoperable
• I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
• I2. (meta)data use vocabularies that follow FAIR principles
• I3. (meta)data include qualified references to other (meta)data
Re-usable
• R1. (meta)data have a plurality of accurate and relevant attributes
  • R1.1. (meta)data are released with a clear and accessible data usage license
  • R1.2. (meta)data are associated with their provenance
  • R1.3. (meta)data meet domain-relevant community standards
D4Science: Findability
Findability is enabled
• by extending the concept of resources to datasets, methods/algorithms, research objects, and services
• by assigning to each of the D4Science managed resources
  • a unique identifier
  • rich and extensible metadata (including attribution, provenance and licence information)
• by publishing resources in tailored and global catalogues that support keyword, faceted and temporal/geospatial discovery
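The findability points can be pictured with a toy in-memory catalogue: each published resource gets a unique identifier and a metadata record, and the catalogue answers keyword and faceted queries. This is a minimal sketch under stated assumptions: the classes, field names, and the `urn:uuid` identifier scheme are illustrative choices, not the D4Science catalogue API, and temporal/geospatial discovery is left out.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class CatalogueRecord:
    """Hypothetical metadata record; field names are assumptions."""
    title: str
    resource_type: str   # e.g. dataset, algorithm, research object, service
    keywords: Set[str]
    creator: str         # attribution
    licence: str
    identifier: str = field(default_factory=lambda: f"urn:uuid:{uuid.uuid4()}")


class Catalogue:
    def __init__(self) -> None:
        self._records: Dict[str, CatalogueRecord] = {}

    def publish(self, record: CatalogueRecord) -> str:
        self._records[record.identifier] = record
        return record.identifier

    def get(self, identifier: str) -> CatalogueRecord:
        return self._records[identifier]

    def search(self, keyword: Optional[str] = None,
               resource_type: Optional[str] = None) -> List[CatalogueRecord]:
        """Case-insensitive keyword match combined with one facet (resource type)."""
        hits = []
        for record in self._records.values():
            if keyword is not None and keyword.lower() not in {k.lower() for k in record.keywords}:
                continue
            if resource_type is not None and record.resource_type != resource_type:
                continue
            hits.append(record)
        return hits


catalogue = Catalogue()
dataset_id = catalogue.publish(CatalogueRecord(
    title="Interpolated sea surface temperature",
    resource_type="dataset",
    keywords={"temperature", "interpolation"},
    creator="Example Lab",
    licence="CC-BY-4.0",
))
catalogue.publish(CatalogueRecord(
    title="Interpolation algorithm",
    resource_type="algorithm",
    keywords={"interpolation"},
    creator="Example Lab",
    licence="EUPL-1.2",
))
print([r.title for r in catalogue.search(keyword="interpolation", resource_type="dataset")])
```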
D4Science: Accessibility
Accessibility is obtained
• by making shared and published resources available through multiple protocols in order to maximise the set of potential exploitation cases
• by also providing transparent authentication and authorization, whenever the published resource requires it
• by enabling policy enforcement
D4Science: Interoperability
Interoperability is facilitated
• by automatically enriching the resources with metadata in multiple formats
  • including ISO 19115, Darwin Core, Dublin Core, DCAT and application profiles
• by promoting the exploitation of ontologies and controlled vocabularies
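Publishing metadata in multiple formats amounts to keeping one internal record and mapping it onto several vocabularies. The sketch below maps a single hypothetical record onto Dublin Core elements and onto a DCAT dataset description. The internal record layout and the example values are assumptions; only the term names (`dc:title`, `dct:license`, `dcat:keyword`, and so on) come from the Dublin Core and DCAT vocabularies.

```python
from typing import Any, Dict


def to_dublin_core(record: Dict[str, Any]) -> Dict[str, Any]:
    """Map the internal record onto Dublin Core elements."""
    return {
        "dc:identifier": record["identifier"],
        "dc:title": record["title"],
        "dc:creator": record["creator"],
        "dc:subject": sorted(record["keywords"]),
        "dc:rights": record["licence"],
    }


def to_dcat(record: Dict[str, Any]) -> Dict[str, Any]:
    """Map the same record onto a DCAT dataset description."""
    return {
        "@type": "dcat:Dataset",
        "dct:identifier": record["identifier"],
        "dct:title": record["title"],
        "dct:creator": record["creator"],
        "dcat:keyword": sorted(record["keywords"]),
        "dct:license": record["licence"],
    }


# Hypothetical internal record, used only for illustration.
record = {
    "identifier": "urn:example:sst-interpolated",
    "title": "Interpolated sea surface temperature",
    "creator": "Example Lab",
    "keywords": {"temperature", "interpolation"},
    "licence": "https://creativecommons.org/licenses/by/4.0/",
}
print(to_dublin_core(record))
print(to_dcat(record))
```

Because both views are derived from the same record, the identifier and licence stay consistent across formats.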
D4Science: Reusability
Reusability is promoted
• by systematically endowing shared and published resources with
  • a clear licence governing their use/re-use
  • citation and attribution statements
• by systematically generating provenance metadata
• by allowing, by design, the execution of the experiment in the same technical and contextual environment
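Systematic provenance generation can be illustrated with the W3C PROV-O vocabulary, which the connector architecture names for provenance metadata. The sketch below records which inputs a computation used, which output it generated, and who ran it, as a JSON-LD-style dictionary. The function name, the identifiers, and the exact serialisation are assumptions for illustration; the slides do not show the records D4Science actually emits.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, List


def provenance_record(output_id: str, input_ids: List[str], activity_id: str,
                      agent_id: str, started: datetime, ended: datetime) -> Dict[str, Any]:
    """Describe one computation with PROV-O terms (JSON-LD style)."""
    return {
        "@context": {"prov": "http://www.w3.org/ns/prov#"},
        "@graph": [
            {
                "@id": activity_id,
                "@type": "prov:Activity",
                "prov:used": [{"@id": i} for i in input_ids],
                "prov:wasAssociatedWith": {"@id": agent_id},
                "prov:startedAtTime": started.isoformat(),
                "prov:endedAtTime": ended.isoformat(),
            },
            {
                "@id": output_id,
                "@type": "prov:Entity",
                "prov:wasGeneratedBy": {"@id": activity_id},
                "prov:wasAttributedTo": {"@id": agent_id},
            },
            {"@id": agent_id, "@type": "prov:Agent"},
        ],
    }


# All identifiers below are hypothetical.
prov = provenance_record(
    output_id="urn:example:interpolated-field",
    input_ids=["urn:example:observations"],
    activity_id="urn:example:run-1",
    agent_id="urn:example:alice",
    started=datetime(2018, 6, 1, 10, 0, tzinfo=timezone.utc),
    ended=datetime(2018, 6, 1, 10, 5, tzinfo=timezone.utc),
)
print(json.dumps(prov, indent=2))
```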
D4Science enacts FAIR because it …
• Embraces the as-a-Service approach
• Exploits communication standards
• Hides the complexity of computational capabilities
• Enables access via VREs governed by tailored policies
• Facilitates provenance and attribution management
• Implements economies of scale and cost reduction
• Promotes collaboration and sharing
• Enables re-usability
THANK YOU
Contact Points
pasquale.pagano@isti.cnr.it
www.d4science.org
info@d4science.org
