EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No. 654065 www.eudat.eu
Linking HPC to Data Management
Stéphane COUTIN (CINES)
Giuseppe Fiameni (CINECA)
This work is licensed under the Creative
Commons CC-BY 4.0 licence
Objectives
High level presentation of research
data management and H2020 context
Present a simple approach and draft a
DMP for a given case.
THE CHANGING DATA LANDSCAPE
Image CC-BY-SA ‘data.path Ryoji.Ikeda - 3’ by r2hox www.flickr.com/photos/rh2ox/9990016123
Data explosion
More and more data is
being created
Issue is not creating
data, but being able to
navigate and use it
Data management is
critical to make sure
data are well-organised,
understandable and
reusable
Digital data are fragile and susceptible to loss for a wide variety of reasons
Natural disaster
Facilities infrastructure failure
Storage failure
Server hardware/software failure
Application software failure
Format obsolescence
Human error
Malicious attack
Loss of staffing competencies
Loss of institutional commitment
Loss of financial stability
Changes in user expectations
Data loss
Image CC-BY ‘Hard Drive 016’ by Jon Ross www.flickr.com/photos/jon_a_ross/1482849745
Link rot – more 404 errors
generated over time
Reference rot* – link rot
plus content drift i.e.
webpages evolving and
no longer reflecting
original content cited
* Term coined by Hiberlink http://hiberlink.org
Data persistency issues
Jonathan D. Wren Bioinformatics 2008;24:1381-1385
MANAGING & SHARING DATA
Why manage research data?
To make your research easier!
To stop yourself drowning in irrelevant stuff
In case you need the data later
To avoid accusations of fraud or bad science
To share your data for others to use and learn from
To get credit for producing it
Because funders or your organisation require it
Well-managed data opens up opportunities
for re-use, integration and new science
H2020 open research data pilot
• Already expanded from a select pilot to all work
areas
• All need to consider which data can be made
open
• Mantra = “As open as possible, as closed as
necessary”
• Underlying driver is good (FAIR) data
management
Image CC-BY-SA by SangyaPundir
Key requirements of the open data pilot
Beneficiaries participating in the Pilot will:
Deposit data in a research data repository of
their choice
Take measures to make it possible for others to
access, mine, exploit, reproduce and
disseminate the data free of charge
Provide information about tools and instruments
necessary for validating the results (where
possible, provide the tools and instruments
themselves)
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
Suggested DMP creation process
Analyse your project Information System
Suggestion: Data Flow Diagram
Apply FAIR principles
Include data life cycle and time dimensions
Estimate costs
Iterate
Get funders' support
Keep the DMP up to date
Simple diagram focusing on data dynamics
You can use other diagram types
DFD: Data Flow Diagram
Symbols: data processing, data store, external interaction, data flow
You and your team are submitting a proposal for a project in the domain of smart cities.
The City has deployed a large set of sensors measuring traffic. The data are collected
in the City datacenter.
You want to develop an application able to forecast the traffic and how it will
be impacted by events such as planned roadworks. This application would run on a PRACE
site, not located in the City. On the PRACE site your storage space is limited to 10 TB.
The application uses the following inputs:
Historical sensor data over the last 12 months: the sensors produce 1 TB of data a day.
You implement a preprocessing module that translates those data into a reduced data set
(10 MB per day), based on a format you have defined to describe the traffic.
The results provided by the simulation. This enables comparison between forecasted
and actual traffic in order to ‘train’ the application.
Weather data (historical and forecast) provided by the national meteorological agency, in
the SYNOP format. The volume is negligible.
Results will be accessible to the city council employees.
Create the project data flow diagram and fill in the data summary chapter using a
table.
What would you need in order to use the weather data efficiently?
Exercise – Phase 1
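A quick sizing check for this exercise, as a minimal sketch: it compares 12 months of raw and reduced sensor data against the 10 TB PRACE storage limit. The daily volumes and the limit come from the exercise statement; the decimal units (1 TB = 10^12 bytes), the 365-day year and all variable names are assumptions made for illustration.

```python
# Figures from the exercise statement. Decimal units are an assumption.
TB = 10**12
MB = 10**6

RAW_PER_DAY = 1 * TB        # raw sensor data produced per day
REDUCED_PER_DAY = 10 * MB   # output of the preprocessing module per day
PRACE_QUOTA = 10 * TB       # storage limit on the PRACE site
HISTORY_DAYS = 365          # "last 12 months", approximated as 365 days


def total_bytes(per_day: int, days: int) -> int:
    """Volume accumulated over a number of days at a constant daily rate."""
    return per_day * days


raw_total = total_bytes(RAW_PER_DAY, HISTORY_DAYS)
reduced_total = total_bytes(REDUCED_PER_DAY, HISTORY_DAYS)

raw_fits = raw_total <= PRACE_QUOTA
reduced_fits = reduced_total <= PRACE_QUOTA
days_of_raw_that_fit = PRACE_QUOTA // RAW_PER_DAY

print(f"Raw history:     {raw_total / TB:.0f} TB (fits: {raw_fits})")
print(f"Reduced history: {reduced_total / 10**9:.2f} GB (fits: {reduced_fits})")
print(f"Days of raw data that fit in the quota: {days_of_raw_that_fit}")
```

Under these assumptions the raw history is far larger than the quota while the reduced history is a few gigabytes, which suggests running the preprocessing close to the sensors and transferring only the reduced data set.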
Data summary table
Dataset | Description | Origin? Existing? | Format | Size | Who could use it?
Proposed data flow diagram
[Diagram. Labels shown: Sensors collection area, Raw sensor data, Data preprocessing, Reduced sensor data, Weather data, PRACE HPC Site, Input files, Simulations, PRACE Storage, Output files, extractor, City council employees, Data transfer]
Data summary table
Dataset | Description | Origin? Existing? | Format | Size | Who could use it?
Raw sensor data | | Available, collected from sensors | Various | 1TB per day |
Reduced sensor data | Actual traffic, … | Extracted from raw sensor data | Binary (specific) | 10 MB a day | Our simulation
Weather data | Actual and forecast | Existing. Meteo open data platform | SYNOP | 1MB a week | Our simulation; citizens, scientists, ..
Simulation results | Forecasted traffic | Results of our simulation | Binary (specific) | 10 MB a day | City council employees, our application
CREATING DATA > PROCESSING DATA > ANALYSING DATA > PRESERVING DATA > GIVING ACCESS TO DATA > RE-USING DATA
Research data lifecycle
CREATING DATA: designing research, DMPs, planning consent, locate existing data, data collection and management, capturing and creating metadata
PROCESSING DATA: entering, transcribing, checking, validating and cleaning data, anonymising data, describing data, manage and store data
ANALYSING DATA: interpreting & deriving data, producing outputs, authoring publications, preparing for sharing
PRESERVING DATA: data storage, back-up & archiving, migrating to best format & medium, creating metadata and documentation
GIVING ACCESS TO DATA: distributing data, sharing data, controlling access, establishing copyright, promoting data
RE-USING DATA: follow-up research, new research, undertake research reviews, scrutinising findings, teaching & learning
Ref: UK Data Archive: http://www.data-archive.ac.uk/create-manage/life-cycle
Bitstream
Persistent Identifier
Metadata
Digital objects can be
aggregated to digital
collections
What is a digital object?
CDI Data Model
Digital object example
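As an illustration of the three parts of a digital object (bitstream, persistent identifier, metadata) and of aggregation into collections, here is a minimal sketch. The class names, the handle-style identifiers and the metadata keys are hypothetical and are not taken from the EUDAT CDI data model; having a collection reference its members by identifier is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DigitalObject:
    pid: str                  # persistent identifier (hypothetical value below)
    bitstream: bytes          # the data itself
    metadata: dict = field(default_factory=dict)  # descriptive information


@dataclass
class DigitalCollection:
    pid: str                  # a collection has its own identifier
    members: list = field(default_factory=list)   # PIDs of aggregated objects

    def add(self, obj: DigitalObject) -> None:
        """Aggregate an object by reference, using its persistent identifier."""
        self.members.append(obj.pid)


obj = DigitalObject(
    pid="hdl:00000/example-0001",
    bitstream=b"traffic sample",
    metadata={"title": "Reduced sensor data, one day", "format": "binary (specific)"},
)
collection = DigitalCollection(pid="hdl:00000/example-collection")
collection.add(obj)
print(collection.members)
```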
A file format is a convention on how data is
represented on a medium. It can be:
Specified: a description of the convention exists,
and is sufficiently described to allow a complete
implementation of it;
Open: the convention is available without any
restrictions of access or implementation;
Standardized: the convention has been adopted
by standardization agencies (ISO, W3C). Example:
PDF/A.
A wide utilization of a format can also enable it to be
considered as a standard, even if there’s no official standard for
it. Example: PDF.
Proprietary: those formats depend on the existence
of an owner. They can be published. Example: Word.
The level of durability of a format depends on these
criteria.
Data formats
Through a web interface, this tool verifies a file, in particular whether it is
well-formed and valid against the specification of its declared format, to determine
whether it can be archived.
You just have to upload the file you want to test. The file is then analyzed by the
tool, which automatically returns the result.
If the file is not well-formed or not valid, tutorials are available to help the user
correct it. If the problem is not resolved, the user can contact the CINES
experts by e-mail.
The list of file formats accepted in PAC (the CINES Archiving Platform) is available on
FACILE (https://facile.cines.fr/)
FACILE : a format validation tool
Complexity and diversity of file formats
A few ‘pivot’ formats
HDF
NetCDF
Many specific binary formats
Need to document the format
Store or reference documentation in the digital
object
Store or reference code
HPC data formats
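A first step when documenting or checking a file is identifying its format. The sketch below recognises a few formats from their leading signature bytes. It is an illustration only, not how FACILE or any archive works: it looks at offset 0 only (an HDF5 file may carry a user block before its signature), and the function and table names are hypothetical. Note that NetCDF-4 files use HDF5 as their container, so they are reported as HDF5 here.

```python
# Leading "magic" bytes of a few formats mentioned in these slides.
SIGNATURES = {
    "HDF5": b"\x89HDF\r\n\x1a\n",
    "NetCDF classic": b"CDF\x01",
    "NetCDF 64-bit offset": b"CDF\x02",
    "PDF": b"%PDF-",
}


def identify_format(header: bytes) -> str:
    """Return the name of the first signature matching the start of header."""
    for name, magic in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return "unknown"


def identify_file(path: str) -> str:
    """Read the first 8 bytes of a file and identify its format."""
    with open(path, "rb") as fh:
        return identify_format(fh.read(8))


print(identify_format(b"%PDF-1.7\n"))
print(identify_format(b"\x89HDF\r\n\x1a\n" + bytes(8)))
print(identify_format(b"my own binary layout"))
```

A project-specific binary format comes back as "unknown", which is why its description (and ideally the code that reads it) needs to be stored or referenced in the digital object.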
Licensing research data
• Horizon 2020 guidelines point to CC-BY or CC-0
• The EUDAT licensing wizard helps you pick a licence for data & software
(available in B2SHARE)
• DCC How-to guide helps you to license data
www.dcc.ac.uk/resources/how-guides/license-research-data
Commonly defined as ‘data about data’, metadata
helps to make data findable and understandable
Metadata can be:
Descriptive: information about the content and
context of the data
Structural: information about the structure of the
data
Administrative: information about the file type, rights
management and preservation processes
What is metadata?
Comprehensive metadata will:
Facilitate data discovery
Help users determine the applicability of the data
Enable interpretation and reuse
Allow any limitations to be understood
Clarify ownership and restrictions on reuse
Offer permanence as it transcends people and time
Provide interoperability
Why use metadata?
The good and the bad
More precise and standardised | Ambiguous
Metres / seconds | Furlongs and fortnight
2015-09-10T15:00:01+01:00 | 10th Sept. 2015 15:00:01
Longitudinal wind speed | U
PDF 1.7 | PDF
2008 US Population statistics | Population statistics
Barcelona, Venezuela | Barcelona
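The timestamp pair in this comparison can be made concrete with a short sketch using Python's standard library. The ISO 8601 value carries its UTC offset, so it converts to UTC without guesswork; the naive value stands for the best that can be recovered from "10th Sept. 2015 15:00:01", which says nothing about the time zone. The variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Precise: ISO 8601 with an explicit UTC offset.
precise = datetime.fromisoformat("2015-09-10T15:00:01+01:00")
as_utc = precise.astimezone(timezone.utc)

# Ambiguous: the same wall-clock reading with no time zone information.
naive = datetime(2015, 9, 10, 15, 0, 1)

print(precise.utcoffset() == timedelta(hours=1))
print(as_utc.isoformat())
print(naive.tzinfo)
```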
Digital preservation context
Main risks deal with:
• Comprehension
• Integrity
• Exploitation
• Valorization
Quality assurance
procedures to be set up for
• Metadata
• File formats
• Representation information
• Storage
• Access
• Technology watching
Digital preservation challenges
Set up quality assurance procedures to mitigate the
impact of the four main identified risks when they
occur
Challenge | Solutions
Loss of content knowledge | Metadata; persistent, unique identifiers.
File format obsolescence | Handling of a limited set of durable formats; file format identification, validation; logical migration (format conversion).
Storage media failure | Management of media ageing; physical migration.
Software or hardware disappearance | Technology watching, anticipation, proactivity.
More details at https://www.cines.fr/en/long-term-preservation/
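Integrity, one of the risks listed for digital preservation, is commonly monitored with checksums: a digest is recorded when a file is deposited and recomputed later to detect silent corruption. The sketch below shows this idea with SHA-256 from Python's standard library. It is an assumption about common practice, not a description of the CINES procedures, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected_hex: str) -> bool:
    """Return True if the file still matches the digest recorded at deposit."""
    return sha256_of(path) == expected_hex
```

Usage: store the value returned by `sha256_of` alongside the file's metadata at deposit time, then call `verify` periodically and after each physical migration.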
Certifications
Certification can help in selecting a repository
Certification focuses on:
Organizational infrastructure
Digital object management
Technology
Usually refers to OAIS model
OAIS (Open Archival Information System) model
Framework for an archive, now ISO 14721
Defines a functional model and an information model
Repository certification: Data Seal of Approval
16 quality guidelines for researchers and institutions that create
digital research files, organizations that archive research files, and
users of research data.
The objectives of the Data Seal of Approval are to safeguard
data, to ensure high quality and to guide reliable management
of research data for the future without requiring the
implementation of new standards, regulations or high costs.
The DSA
Gives researchers and research sponsors the assurance that their
research results will be stored in a reliable manner and can be
reused
Allows data repositories to archive and distribute research
data efficiently
Is part of a European Framework for Audit and Certification of
Trusted Repositories
Online application and self-assessment of the 16 guidelines by the
repository
Review by a member of the DSA Board
Formal certification: ISO 16363
ISO 16363 – « Audit and certification of trustworthy
digital repositories »
Evaluation criteria for an auditor to judge whether a
repository is trustworthy
Published in 2012
Strongly based on OAIS reference model
ISO 16919:2014 – « Requirements for bodies
providing audit and certification of candidate
trustworthy digital repositories »
Specifies requirements for bodies providing ISO
16363 audit and certification, and details the
competences that auditors need
www.eudat.eu
Thanks – any questions?
Acknowledgements:
Thanks to Mark van de Sanden, Marjan Grootveld , Sarah Jones
and Giuseppe Fiameni for some of the slides

Ariyo - EUDAT CDI B2 services documentationAriyo - EUDAT CDI B2 services documentation
Ariyo - EUDAT CDI B2 services documentation
 
Using B2NOTE: The U.Porto Pilot
Using B2NOTE: The U.Porto PilotUsing B2NOTE: The U.Porto Pilot
Using B2NOTE: The U.Porto Pilot
 
OpenAIRE Advance - Kick off last week
OpenAIRE Advance - Kick off last weekOpenAIRE Advance - Kick off last week
OpenAIRE Advance - Kick off last week
 
European Open Science Cloud - Skills workshop
European Open Science Cloud - Skills workshopEuropean Open Science Cloud - Skills workshop
European Open Science Cloud - Skills workshop
 
Linking service capabilities to data stweardship competences for professional...
Linking service capabilities to data stweardship competences for professional...Linking service capabilities to data stweardship competences for professional...
Linking service capabilities to data stweardship competences for professional...
 
FAIRness of training materials
FAIRness of training materialsFAIRness of training materials
FAIRness of training materials
 
Training by EOSC-hub - Integrating and Managing services for the European Ope...
Training by EOSC-hub - Integrating and Managing services for the European Ope...Training by EOSC-hub - Integrating and Managing services for the European Ope...
Training by EOSC-hub - Integrating and Managing services for the European Ope...
 
Draft Governance Framework for the EOSC
Draft Governance Framework for the EOSCDraft Governance Framework for the EOSC
Draft Governance Framework for the EOSC
 
Building Interoperable AAI for Researchers
Building Interoperable AAI for ResearchersBuilding Interoperable AAI for Researchers
Building Interoperable AAI for Researchers
 
ENVRIPLUS Data for Science Theme
ENVRIPLUS Data for Science ThemeENVRIPLUS Data for Science Theme
ENVRIPLUS Data for Science Theme
 

Linking HPC to Data Management - EUDAT Summer School (Giuseppe Fiameni, CINECA)

  • 1. EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No. 654065 www.eudat.eu Linking HPC to Data Management Stéphane COUTIN (CINES) Giuseppe Fiameni (CINECA) This work is licensed under the Creative Commons CC-BY 4.0 licence
  • 2. Objectives High level presentation of research data management and H2020 context Present a simple approach and draft a DMP for a given case.
  • 3. THE CHANGING DATA LANDSCAPE Image CC-BY-SA ‘data.path Ryoji.Ikeda - 3’ by r2hox www.flickr.com/photos/rh2ox/9990016123
  • 4. Data explosion More and more data is being created Issue is not creating data, but being able to navigate and use it Data management is critical to make sure data are well-organised, understandable and reusable
  • 5. Digital data are fragile and susceptible to loss for a wide variety of reasons Natural disaster Facilities infrastructure failure Storage failure Server hardware/software failure Application software failure Format obsolescence Human error Malicious attack Loss of staffing competencies Loss of institutional commitment Loss of financial stability Changes in user expectations Data loss Image CC-BY ‘Hard Drive 016’ by Jon Ross www.flickr.com/photos/jon_a_ross/1482849745
  • 6. Link rot – more 404 errors generated over time Reference rot* – link rot plus content drift i.e. webpages evolving and no longer reflecting original content cited * Term coined by Hiberlink http://hiberlink.org Data persistency issues Jonathan D. Wren Bioinformatics 2008;24:1381-1385
  • 8. Why manage research data? To make your research easier! To stop yourself drowning in irrelevant stuff In case you need the data later To avoid accusations of fraud or bad science To share your data for others to use and learn from To get credit for producing it Because funders or your organisation require it Well-managed data opens up opportunities for re-use, integration and new science
  • 9. H2020 open research data pilot • Already expanded from a select pilot to all work areas • All need to consider which data can be made open • Mantra = “As open as possible as closed as necessary” • Underlying driver is good (FAIR) data management Image CC-BY-SA by SangyaPundir
  • 10. Key requirements of the open data pilot Beneficiaries participating in the Pilot will: Deposit data in a research data repository of their choice Take measures to make it possible for others to access, mine, exploit, reproduce and disseminate the data free of charge Provide information about tools and instruments necessary for validating the results (where possible, provide the tools and instruments themselves) http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
  • 11.
  • 12. Suggested DMP creation process: analyse your project information system (suggestion: draw a Data Flow Diagram); apply FAIR principles; include the data life cycle and time dimensions; estimate costs; iterate; get funders' support; keep the DMP up to date
  • 13. DFD: Data Flow Diagram. A simple diagram focusing on data dynamics (you can use other diagram types). Elements: data processing, data store, external interaction, data flow
  • 14. Exercise – Phase 1 You and your team are submitting a proposal for a project in the domain of smart cities. The City has implemented a large set of sensors measuring traffic. The data are collected in the City datacenter. You want to develop an application able to forecast the traffic and how it will be affected by events such as planned roadworks. This application would run on a PRACE site, not located in the City. On the PRACE site your storage space is limited to 10 TB. The application uses the following inputs: Historical sensor data over the last 12 months: sensors produce 1 TB of data a day. You implement a preprocessing module translating those data into a reduced data set (10 MB per day). It is based on a format you have defined to describe the traffic. The results provided by the simulation. This enables comparison between forecasted and actual traffic in order to ‘train’ the application. Weather data (historical and forecast) provided by the national meteo agency. They use the SYNOP format. The volume is negligible. Results will be accessible by the city council employees. Create the project data flow diagram and fill in the data summary chapter using a table. What would help you use the weather data efficiently?
  • 15. Data summary table Dataset Description Origin? Existing? Format Size Who could use it?
  • 16. Proposed data flow diagram Sensors collection area PRACE HPC Site Simulations PRACE Storage Output files extractor Input files Raw sensor data Data Preprocessing Reduced sensor data Weather data City council employees Data transfer
  • 17. Data summary table
      Dataset | Description | Origin? Existing? | Format | Size | Who could use it?
      Raw sensor data | | Available, collected from sensors | Various | 1 TB per day |
      Reduced sensor data | Actual traffic, … | Extracted from raw sensor data | Binary (specific) | 10 MB a day | Our simulation
      Weather data | Actual and forecast | Existing. Meteo open data platform | SYNOP | 1 MB a week | Our simulation; citizens, scientists, ..
      Simulation results | Forecasted traffic | Results of our simulation | Binary (specific) | 10 MB a day | City council employees, our application
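The sizes in the data summary table can be checked against the 10 TB PRACE storage limit from the exercise with a little arithmetic. The sketch below is only a back-of-the-envelope estimate: the function names are hypothetical, decimal units (1 TB = 1,000,000 MB) are assumed, and "12 months" is taken as 365 days.

```python
# Back-of-the-envelope storage budget for the smart-city exercise.
# Assumptions (not from the slides): decimal units, 365 days, 52 weeks.

MB_PER_TB = 1_000_000
DAYS_OF_HISTORY = 365
WEEKS_OF_HISTORY = 52

RAW_TB_PER_DAY = 1          # raw sensor data
REDUCED_MB_PER_DAY = 10     # output of the preprocessing module
RESULTS_MB_PER_DAY = 10     # simulation results
WEATHER_MB_PER_WEEK = 1     # SYNOP weather data
PRACE_QUOTA_TB = 10         # storage limit on the PRACE site


def yearly_volume_tb() -> dict:
    """Return the estimated volume of each dataset over one year, in TB."""
    return {
        "raw sensor data": RAW_TB_PER_DAY * DAYS_OF_HISTORY,
        "reduced sensor data": REDUCED_MB_PER_DAY * DAYS_OF_HISTORY / MB_PER_TB,
        "simulation results": RESULTS_MB_PER_DAY * DAYS_OF_HISTORY / MB_PER_TB,
        "weather data": WEATHER_MB_PER_WEEK * WEEKS_OF_HISTORY / MB_PER_TB,
    }


def fits_in_quota(volume_tb: float, quota_tb: float = PRACE_QUOTA_TB) -> bool:
    """True if the given volume fits within the storage quota."""
    return volume_tb <= quota_tb


if __name__ == "__main__":
    for name, volume in yearly_volume_tb().items():
        status = "fits" if fits_in_quota(volume) else "exceeds quota"
        print(f"{name}: {volume:.5f} TB ({status})")
```

Under these assumptions, one year of raw sensor data is far larger than the quota while the reduced data, results and weather data together are a tiny fraction of it, which is consistent with running the preprocessing step before the data transfer, as in the proposed data flow diagram.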
  • 18. Research data lifecycle: CREATING DATA, PROCESSING DATA, ANALYSING DATA, PRESERVING DATA, GIVING ACCESS TO DATA, RE-USING DATA
      CREATING DATA: designing research, DMPs, planning consent, locate existing data, data collection and management, capturing and creating metadata
      PROCESSING DATA: entering, transcribing, checking, validating and cleaning data, anonymising data, describing data, manage and store data
      ANALYSING DATA: interpreting and deriving data, producing outputs, authoring publications, preparing for sharing
      PRESERVING DATA: data storage, back-up & archiving, migrating to best format & medium, creating metadata and documentation
      ACCESS TO DATA: distributing data, sharing data, controlling access, establishing copyright, promoting data
      RE-USING DATA: follow-up research, new research, undertake research reviews, scrutinising findings, teaching & learning
      Ref: UK Data Archive: http://www.data-archive.ac.uk/create-manage/life-cycle
  • 19. What is a digital object? Bitstream, persistent identifier, metadata. Digital objects can be aggregated into digital collections
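The digital object described on this slide can be pictured as a small data structure. The following is a minimal sketch, not an EUDAT API: the class names, field names and the example identifiers are all hypothetical and only illustrate how a bitstream, a persistent identifier and metadata travel together, and how objects aggregate into a collection.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DigitalObject:
    """A bitstream bundled with its persistent identifier and metadata."""
    pid: str                    # persistent identifier (made-up value below)
    bitstream: bytes            # the actual content
    metadata: Dict[str, str]    # descriptive information about the content


@dataclass
class DigitalCollection:
    """An aggregation of digital objects, itself identified by a PID."""
    pid: str
    members: List[DigitalObject] = field(default_factory=list)

    def add(self, obj: DigitalObject) -> None:
        self.members.append(obj)

    def find(self, pid: str) -> Optional[DigitalObject]:
        """Look up a member by its persistent identifier."""
        for obj in self.members:
            if obj.pid == pid:
                return obj
        return None


# Hypothetical identifiers, for illustration only.
traffic_day = DigitalObject(
    pid="example-pid/0001",
    bitstream=b"\x00\x01\x02",
    metadata={"title": "Reduced sensor data", "format": "Binary (specific)"},
)
collection = DigitalCollection(pid="example-pid/collection-A")
collection.add(traffic_day)
```

Keeping the identifier and metadata next to the bitstream is what lets the object stay findable and understandable when it is moved or aggregated.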
  • 22. Data formats A file format is a convention on how data are represented on a medium. It can be: Specified: a description of the convention exists and is detailed enough to allow a complete implementation of it; Open: the convention is available without any restrictions of access or implementation; Standardized: the convention has been adopted by standardization agencies (ISO, W3C). Example: PDF/A. Wide use of a format can also lead to it being considered a standard, even if there is no official standard for it. Example: PDF. Proprietary: those formats depend on the existence of an owner. They can be published. Example: Word. The level of durability of a format depends on these criteria.
  • 23. FACILE: a format validation tool Through a web interface, this tool verifies a file, in particular whether it is well-formed and valid against the specification of its declared format, to determine whether it can be archived. You just have to upload the file you want to test. The tool then analyses the file and returns the answer automatically. If the file is not well-formed or not valid, tutorials are available to help the user correct it. If the problem is not resolved, the user can contact the CINES experts by e-mail. The list of file formats accepted in PAC (the CINES Archiving Platform) is available on FACILE (https://facile.cines.fr/)
  • 24. HPC data formats Complexity and diversity of file formats A few ‘pivot’ formats: HDF, NetCDF A lot of specific binary formats Need to document the format Store or reference documentation in the digital object Store or reference code
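One way to follow the advice to document a specific binary format is to keep a machine-readable description next to the code that reads and writes it. The sketch below is an illustration only: the record layout, field names and units are invented for the example and are not the format used in the exercise.

```python
import json
import struct

# Hypothetical record layout: little-endian, no padding.
#   I = uint32 timestamp, H = uint16 sensor id, f = float32 traffic flow
RECORD_LAYOUT = "<IHf"

# Description that could be stored in, or referenced from, the digital object.
FORMAT_DESCRIPTION = {
    "struct_layout": RECORD_LAYOUT,
    "byte_order": "little-endian",
    "record_size_bytes": struct.calcsize(RECORD_LAYOUT),
    "fields": [
        {"name": "timestamp", "type": "uint32", "unit": "seconds since 1970-01-01 UTC"},
        {"name": "sensor_id", "type": "uint16", "unit": None},
        {"name": "flow", "type": "float32", "unit": "vehicles per minute"},
    ],
}


def encode_record(timestamp: int, sensor_id: int, flow: float) -> bytes:
    """Pack one record according to RECORD_LAYOUT."""
    return struct.pack(RECORD_LAYOUT, timestamp, sensor_id, flow)


def decode_record(raw: bytes) -> tuple:
    """Unpack one record; returns (timestamp, sensor_id, flow)."""
    return struct.unpack(RECORD_LAYOUT, raw)


def format_description_json() -> str:
    """Serialise the format description so it can be archived with the data."""
    return json.dumps(FORMAT_DESCRIPTION, indent=2)
```

Because the description carries the layout string, byte order and units, someone without access to the original code could still write a reader for the files, which is the point of storing or referencing the documentation in the digital object.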
  • 25. Licensing research data • Horizon 2020 guidelines point to CC-BY or CC-0 • EUDAT licensing wizard helps you pick a licence for data & software (available in B2SHARE) • DCC How-to guide helps you to license data www.dcc.ac.uk/resources/how-guides/license-research-data
  • 26. Commonly defined as ‘data about data’, metadata helps to make data findable and understandable Metadata can be: Descriptive: information about the content and context of the data Structural: information about the structure of the data Administrative: information about the file type, rights management and preservation processes What is metadata?
  • 27. Comprehensive metadata will: Facilitate data discovery Help users determine the applicability of the data Enable interpretation and reuse Allow any limitations to be understood Clarify ownership and restrictions on reuse Offer permanence as it transcends people and time Provide interoperability Why use metadata?
  • 28. The good and the bad
      More precise and standardised | Ambiguous
      Metres / seconds | Furlongs and fortnight
      2015-09-10T15:00:01+01:00 | 10th Sept. 2015 15:00:01
      Longitudinal wind speed | U
      PDF 1.7 | PDF
      2008 US Population statistics | Population statistics
      Barcelona, Venezuela | Barcelona
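The timestamp pair on this slide shows why the standardised form is preferable: a value with an explicit UTC offset identifies one instant, while the same clock reading without an offset does not. A small sketch using only the Python standard library (variable names are illustrative):

```python
from datetime import datetime, timedelta, timezone

# Standardised form from the slide: date, time and UTC offset are explicit.
precise = datetime.fromisoformat("2015-09-10T15:00:01+01:00")
precise_utc = precise.astimezone(timezone.utc)

# Ambiguous form: "10th Sept. 2015 15:00:01" carries no offset, so the best
# we can build is a naive datetime that does not pin down a single instant.
ambiguous = datetime(2015, 9, 10, 15, 0, 1)


def possible_instants(naive: datetime, offsets_hours=(-1, 0, 1)) -> list:
    """Instants in UTC that a naive reading could mean under a few offsets."""
    return [
        naive.replace(tzinfo=timezone(timedelta(hours=h))).astimezone(timezone.utc)
        for h in offsets_hours
    ]


if __name__ == "__main__":
    print("precise (UTC):", precise_utc.isoformat())
    for instant in possible_instants(ambiguous):
        print("ambiguous could mean:", instant.isoformat())
```

The three offsets tried here are arbitrary; the point is only that the ambiguous value maps to several different instants until someone supplies the missing time zone.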
  • 29. Digital preservation context Main risks deal with: • Comprehension • Integrity • Exploitation • Valorization Quality assurance procedures to be set up for • Metadata • File formats • Representation information • Storage • Access • Technology watching
  • 30. Digital preservation challenges Set up quality assurance procedures to mitigate the impact of the four main identified risks when they occur Challenge: Loss of content knowledge. Solutions: • Metadata; • Persistent, unique identifiers. Challenge: File format obsolescence. Solutions: • Handling of a limited set of durable formats; • File format identification, validation; • Logical migration (format conversion). Challenge: Storage media failure. Solutions: • Management of media ageing; • Physical migration. Challenge: Software or hardware disappearance. Solutions: • Technology watching, anticipation, proactivity. More details at https://www.cines.fr/en/long-term-preservation/
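The slides list integrity among the main risks and storage media failure among the challenges, without naming a specific technique. One widely used way to detect silent corruption is a fixity check: record a checksum when a file is deposited and compare it again later. The sketch below assumes SHA-256 and uses hypothetical function names; it is not taken from the CINES procedures.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: Path, recorded_digest: str) -> bool:
    """Compare the current digest with the one recorded at deposit time."""
    return sha256_of(path) == recorded_digest
```

In practice the recorded digest would be kept with the object's metadata and rechecked on a schedule and after every physical migration, so that a damaged copy can be replaced from another replica.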
  • 31. Certifications Certification can help in selecting a repository Certification focuses on: Organizational infrastructure Digital object management Technology Usually refers to the OAIS model
  • 32. OAIS (Open Archival Information System) model Framework for an archive, now ISO 14721 Defines a functional model and an information model
  • 33. Repository certification: Data Seal of Approval 16 quality guidelines for researchers and institutions that create digital research files, organizations that archive research files, and users of research data. The objectives of the Data Seal of Approval are to safeguard data, to ensure high quality and to guide reliable management of research data for the future without requiring the implementation of new standards, regulations or high costs. The DSA: gives researchers and research sponsors the assurance that their research results will be stored in a reliable manner and can be reused; allows data repositories to archive and distribute research data efficiently; is part of a European Framework for Audit and Certification of Trusted Repositories. Online application and self-assessment of the 16 guidelines by the repository Review by a member of the DSA Board
  • 34. Formal certification: ISO 16363 ISO 16363, « Audit and certification of trustworthy digital repositories »: evaluation criteria for an auditor to judge whether a repository is trustworthy Published in 2012 Strongly based on the OAIS reference model ISO 16919:2014, « Requirements for bodies providing audit and certification of candidate trustworthy digital repositories », specifies requirements for bodies providing ISO 16363 audit and certification and details the competences that auditors need
  • 35. www.eudat.eu Thanks – any questions? Acknowledgements: Thanks to Mark van de Sanden, Marjan Grootveld, Sarah Jones and Giuseppe Fiameni for some of the slides