ARCHIVER
Archiving and Preservation for Research Environments
CS3 Workshop
27th January 2020 – Copenhagen
João Fernandes (CERN)
Project Objective
Focus: Archiving and Data Preservation Services using commercial cloud services to be available via the European Open Science
Cloud (EOSC)
Procurement R&D budget: 3.4M euro
Starting Date: 1st of January 2019
Duration: 36 Months
Coordinator: CERN (Lead Procurer)
Consortium
Includes Buyers and Experts in the preparation, execution and promotion of the Procurement of R&D
Procurers – Public organisations committing funds to contribute to a joint R&D procurement, research data use cases and R&D testing effort
Experts – Partner organisations bringing expertise in requirement assessment and promotion activities, not part of the Buyers Group
Preferred Partners (Early Adopters)
• Confirmed subscription received from 11 organisations: High level of interest from the community
• Participants:
• Demand side public sector organisations Information Webinar (04th September) = 47 participants
• Key advantages
• Assess if resulting services address archiving and preservation needs
• Contribute to and shape the R&D carried out in the project, contribute use cases, and have the option to purchase pilot-scale services by the end of the project
Challenge
• Demonstrate services for long-term
preservation and archiving of scientific
data in the PB range
• F.A.I.R archiving services following
best practices and standards
• Expand resulting services to several
scientific domains
• Transparent business models and
make resulting services available
through the EOSC catalogue
Current Status of Scientific Data
Repositories
• Basic bit preservation and
archiving capabilities
• Data volumes and
communities growing
• Longstanding archiving and
preservation activities, but
most of the data not yet
published
• Fragmentation across
scientific disciplines with
underestimation of costs at
the planning phase
R&D Scoping Activities and Dialogue with the Private Sector
Open Market Consultation Process
• 08th February: 1st Preparation Workshop
• 20th February: 2nd Preparation Workshop
• 08th April: OMC Kick-off, CERN
• 07th May: OMC Event, Barcelona
• 23rd May: OMC Event, Stansted
• 05th June: OMC Consolidation, CERN
Feedback Integration (three rounds)
Feedback on the Draft PCP Contract Notice
Continuous updates to the FAQ
Consortium Matchmaking
Early Adopters Engagement
Geographical Distribution of Companies
Wide geographical distribution
42 companies
Majority from 12 European Countries
Cost-effective Business Model, taking into account:
- Scale
- Ingest rates
- Archive lifetime
- Number of copies
- Exit strategies
- Portability
- SLAs
Regulation & Legislation:
- Auditing
- Self-assessment
- Data Retention
- GDPR
Outcome: R&D Challenge
Layer 1 (Storage/Basic Archiving/Secure backup): data integrity/security; cloud/hybrid deployment; data volume in the PB range; high, sustained ingest data rates; ISO certification: 27000, 27040, 19086 and related standards; archives connected to the GEANT network.
Layer 2 (Preservation): OAIS conformant services: data readability formats, normalization, obsolescence monitoring, file fixity, authenticity checks, etc.; ISO 14721/16363, 26324 and related standards.
Layer 3 (Baseline user services): user services: search, discover, share, indexing, data removal, etc.; access under Federated AAI.
Layer 4 (Advanced services): high level services: visual representation of data (domain specific), reproducibility of scientific analyses using Machine Learning Algorithms, etc.
Core R&D
Bonus R&D
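The file fixity check listed under Layer 2 is commonly done by recording a cryptographic digest at ingest time and recomputing it later. The following is a minimal sketch under stated assumptions: the slides name neither a hash algorithm nor a manifest format, so SHA-256 and the `{path: digest}` mapping used here are illustrative choices, and `check_fixity` and `demo` are hypothetical helper names.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(manifest):
    """Return the paths whose current digest differs from the recorded one.

    `manifest` is a hypothetical {path: expected_hex_digest} mapping.
    """
    return [path for path, expected in manifest.items()
            if sha256_of(path) != expected]


def demo():
    """Ingest a file, verify it, then simulate silent corruption."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "dataset.bin")
        with open(path, "wb") as handle:
            handle.write(b"raw detector data")
        manifest = {path: sha256_of(path)}   # digest recorded at ingest time
        clean = check_fixity(manifest)       # nothing has changed yet
        with open(path, "ab") as handle:
            handle.write(b"!")               # simulated bit rot
        damaged = check_fixity(manifest)
        return clean, [os.path.basename(p) for p in damaged]
```

Streaming in fixed-size chunks keeps memory use flat, which matters once individual files reach the sizes implied by PB-range archives.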
Scientific use case deployments documented at: https://www.archiver-project.eu/deployment-scenarios
High Energy Physics
The BaBar Experiment
During this year, the BaBar Experiment infrastructure at SLAC will be decommissioned. 2 PB of
BaBar data can no longer be stored at the host laboratory. Currently a copy of the data is being
held by CERN IT-ST.
Goal: To ensure that a complete second copy of BaBar data is retained for possible
comparisons with data from other experiments and can be shared through the CERN Open Data
Portal.
CERN Open Data Portal
The CERN Open Data portal disseminates close to 2 PB of primary and derived datasets from
particle physics as they are released by LHC Collaborations, and is used for both education
and research purposes.
Goal: Achieve total reproducibility of research by being able to completely instantiate data,
associated software and services off-premise. Offer research reproducibility services to
individual researchers running open data analyses completely independently of the original
on-premise infrastructure.
CERN Digital Memory
Deployment consisting of a requirement to archive approximately 1.5 PB of Digital Memory,
containing analogue documents produced by the Organization in the 20th century as well as
digital production of the 21st century (web sites, social media, emails, etc.)
Goal: Produce a dark archive in the cloud following standard OAIS practices.
Life Sciences
EMBL on FIRE
EMBL-EBI provides data archiving services to the global molecular biology community. These
data archives are currently based on an internal service (FIRE: FIle REplication). FIRE currently
holds 20 PB of data and is growing at 40% per year.
Goal: Cost-effective scaling via cloud-based storage solutions. Distribute data effectively
on the cloud, covering the increasing needs for cloud-hosted analysis.
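To make the scaling pressure concrete, the FIRE figures above (20 PB, growing at 40% per year) can be projected forward. This is a rough sketch that assumes the growth rate stays constant, which the slides do not claim; the function names are illustrative.

```python
import math


def projected_volume_pb(initial_pb, annual_growth, years):
    """Compound growth: archive volume after `years` at a constant yearly rate."""
    return initial_pb * (1.0 + annual_growth) ** years


def doubling_time_years(annual_growth):
    """Years needed for the archive to double at a constant yearly rate."""
    return math.log(2.0) / math.log(1.0 + annual_growth)


if __name__ == "__main__":
    # 20 PB and 40% per year are the figures quoted for FIRE.
    for year in range(0, 6):
        print(year, round(projected_volume_pb(20.0, 0.40, year), 1), "PB")
    print("doubling time:", round(doubling_time_years(0.40), 2), "years")
```

Under that assumption the archive passes 100 PB within five years and doubles roughly every two years, which is the planning horizon any cost model for this deployment has to cover.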
EMBL Cloud Data Caching
Life sciences research communities access more and more internal data from public cloud
services for their data analysis.
Goal: To progressively cache data in the cloud, with the on-premises data being replicated
and discarded as required. Which data should be cached, how much and for how long, will be
a tradeoff between the cost of cloud storage and of having the network capacity/latency to
download the data multiple times.
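The caching tradeoff can be framed as a break-even comparison between keeping a dataset in cloud storage and fetching it again on every use. The sketch below is a deliberately simplified model: all prices are hypothetical placeholders, network capacity and latency are collapsed into a single per-TB transfer price, and the one-off upload needed to populate the cache is ignored.

```python
def cache_cost(size_tb, months, storage_price_per_tb_month):
    """Cost of keeping a dataset cached in the cloud for a period."""
    return size_tb * months * storage_price_per_tb_month


def redownload_cost(size_tb, accesses, transfer_price_per_tb):
    """Cost of transferring the dataset again for every access instead."""
    return size_tb * accesses * transfer_price_per_tb


def should_cache(size_tb, months, accesses,
                 storage_price_per_tb_month, transfer_price_per_tb):
    """Cache only when storing is cheaper than repeated transfers."""
    stored = cache_cost(size_tb, months, storage_price_per_tb_month)
    fetched = redownload_cost(size_tb, accesses, transfer_price_per_tb)
    return stored < fetched


def break_even_accesses(months, storage_price_per_tb_month,
                        transfer_price_per_tb):
    """Number of accesses at which caching and re-downloading cost the same."""
    return months * storage_price_per_tb_month / transfer_price_per_tb
```

In this model the dataset size cancels out of the decision, so the break-even point depends only on how long the data stays cached and on the ratio of the two prices.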
Astronomy
The MAGIC Cherenkov gamma-ray telescopes and the PAUcam camera for the William
Herschel Telescope are located at the Observatorio del Roque de los Muchachos, in the
Canary Islands, Spain. The first Large-Sized Telescope of the next-generation Cherenkov
Telescope Array (CTA) is also there. They produce about 0.3 PB of raw data per year,
which is automatically sent to PIC in Barcelona.
PIC Large File Storage
Goal: To replace the current in-house tape library storage. Each instance of the service to
be purchased is the 5-year safe-keeping of a yearly dataset from a single source.
PIC Mixed File Remote Storage
Goal: To archive the derived datasets from at most two sources, becoming part of the
yearly dataset. In addition, allow update/upload of derived datasets for a period of 4
years following the creation of the data.
PIC Data Distribution
Goal: To replace the Hierarchical Storage Manager, disk storage and data distribution
service. Each instance of the service to be purchased is the 5-year safe-keeping and data
distribution of a yearly dataset and its derived datasets.
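The raw-data figure for the PIC deployments (about 0.3 PB per year, each yearly dataset kept for 5 years) translates into an average ingest rate and a steady-state retained volume. This is a back-of-the-envelope sketch that assumes decimal petabytes, a perfectly even ingest over the year, and raw data only; real ingest is bursty and derived datasets add to the total.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600
BYTES_PER_PB = 10 ** 15  # decimal petabyte, an assumption


def average_ingest_mbit_s(pb_per_year):
    """Average sustained rate, in Mbit/s, needed to ingest a yearly volume."""
    bytes_per_second = pb_per_year * BYTES_PER_PB / SECONDS_PER_YEAR
    return bytes_per_second * 8 / 1e6


def steady_state_pb(pb_per_year, retention_years):
    """Volume held once the oldest yearly dataset starts to expire."""
    return pb_per_year * retention_years


if __name__ == "__main__":
    print(round(average_ingest_mbit_s(0.3), 1), "Mbit/s on average")
    print(steady_state_pb(0.3, 5), "PB of raw data retained")
```

The average works out to roughly 76 Mbit/s, so for this deployment the retained volume (about 1.5 PB of raw data) is a stronger constraint than the mean network rate.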
Photon Sciences
PETRA III is the world's most brilliant storage-ring-based X-ray source for high-energy photons, with 22 beamlines
distributed over three experimental halls. The European XFEL is the world's largest X-ray laser, generating 27 000 ultrashort
X-ray flashes per second with a brilliance a billion times higher than that of the best conventional X-ray radiation sources.
Together the two facilities produce several tens of PB of raw data per year, and this is expected to double every year.
Goal: Develop a hybrid model that combines current on-premise archiving services with the resulting services of ARCHIVER.
Move a predefined set of datasets to public clouds and make them open for public access.
Layer 1 (Storage/Basic Archiving/Secure backup): data integrity/security; cloud/hybrid deployment; data volume in the PB range; high, sustained ingest data rates; ISO certification: 27000, 27040, 19086 and related standards; archives connected to the GEANT network.
Layer 2 (Preservation): OAIS conformant services: data readability formats, normalization, obsolescence monitoring, file fixity, authenticity checks, etc.; ISO 14721/16363, 26324 and related standards.
Layer 3 (Baseline user services): user services: search, discover, share, indexing, data removal, etc.; access under Federated IAM.
Layer 4 (Advanced services): high level services: visual representation of data (domain specific), reproducibility of scientific analyses using Machine Learning Algorithms, etc.
EMBL 1 – FIRE
PIC 2 – Mixed File Remote Storage
DESY 1 – PETRA III/EuXFEL
CERN 3 – CERN Open Data
CERN 2 – CERN Digital Memory
CERN 1 – The BaBar Experiment
PIC 3 – Data Distribution
EMBL 2 – Cloud Caching
PIC 1 – Large File Storage
Definition of the R&D scope: deployments derived from 4 ESFRI landmarks (CTA, ELIXIR, EuXFEL and HL-LHC)
Project Timeline
ARCHIVER R&D Implementation Phases
Research Infrastructures Engagement
Role of the EOSC:
To ensure that 1.7 million European researchers and 70 million professionals in
science and technology reap the full benefits of data-driven science
- Federated virtual environment, free at the point of use for the end researcher
- Open services for storage, analysis and re-use of research data
- Promote an approach across national borders & scientific disciplines
- Promote choice of the deployment model: on-prem, hybrid, off-prem
EOSC Phase 1 investment of EUR 300 Million on core services
EOSC legal entity expected to be created by the end of 2020
European Open Science Cloud - The Vision
“We are creating a European Open Science Cloud now. It is a trusted space for researchers to store their data and to
access data from researchers from all other disciplines. We will create a pool of interlinked information, a ‘web of research
data’. (…) The idea is that once we have the rules of the game ready, then we will open this up to the broader public
sector and to business as well. So that companies can come in, store the data and use the data.”
Special Address by Ursula von der Leyen, President of the European Commission, 22nd January, WEF DAVOS 2020
https://www.youtube.com/watch?v=QN476nVbFVs&feature=youtu.be&t=682
EOSC should provide a level playing field –
same requirements for commercial and not-for-profit
providers
Accept commercial services in Data Mgmt. Plans
Stay mainstream and interoperable by adopting widely
used and internationally recognized standards
Promote choice, an ecosystem for innovation, fostering
data self-determination and digital sovereignty in Europe
EOSC – Engagement of commercial providers
ARCHIVER key contributions for the EOSC
Long-Term Archiving and Preservation
of Research Data at the core strategy of
the EOSC
ARCHIVER key in defining EOSC Rules of
Participation for the private sector both
as service providers and as R&D
partners for “close to market” solutions
List of 40+ EOSC projects available at: https://www.eosc-portal.eu/about/eosc-projects
ARCHIVER services in the European Open Science Cloud
ARCHIVER Services available in the EOSC
Timeline: 2019, 2020, 2021, 2022, 2023, 2024
Objective: to make resulting services available in the European Open Science Cloud catalogue
What does this really mean?
Atomic Use Case: “As a researcher, I want to have access to the full set of ARCHIVER services, so that
I’m able to evaluate their functionality for my specific research field, able to purchase them with a clear
cost model, and implement an exit strategy to be able to repatriate or move my research data
seamlessly to another location by the end of the contract and usage period.”
ARCHIVER Data Management Strategy
• DMPs both for research use cases & for the project
FAIR guiding principles
post-GDPR era: strong focus on technical and organisational measures for
Data Privacy & Protection as the path to digital sovereignty
Modern guidelines provided by Science Europe for pan-European
Research Data Management (RDM)
Establishing core requirements for Research DMPs & a set of
criteria to assess trustworthy repositories:
https://www.scienceeurope.org/our-resources/practical-guide-to-the-international-alignment-of-
research-data-management/
ARCHIVER EOSC Services - Technical Validation (I)
Resource provisioning using Terraform API
OSS container orchestration systems
Automated deployment based on Kubernetes and Docker
Results stored back at CERN S3 cloud storage service
Available on GitHub under an OSS license:
https://ocre-testsuite.readthedocs.io/en/latest/
Tests provided by the research community
https://fairsharing.github.io/FAIR-Evaluator-FrontEnd/
Started in HNSciCloud
in use already in OCRE
reused and expanded in ARCHIVER
ARCHIVER EOSC Services – Legal & Organisational (II)
Data Privacy and Protection (GDPR) – An EC pillar
Personal Data wrapped in several components (Fed AAI, Research Data itself)
Technical & Organisational measures: “Privacy by Design” approach for GDPR
conformance
Best Practices: Standards
Infrastructure: ISO 27001 series, European Cybersecurity Act
Long Term Data Preservation: OAIS and CoreTrustSeal
Self-Assessment: Definition of responsibilities across data stewards and service
providers
Exit Strategies
Stimulate the use of Open Source, Open APIs: measures for vendor lock-in prevention
Definition and field testing of viable exit plans (provider/on-prem & provider/provider)
EOSC Services – Financial & Business Model (III)
Establish a range of sustainable, cost-effective purchasing options
Must support organisations or individual researchers in storing their data after
the end of a procurement cycle or of an individual researcher's grant
Requirement to service providers to establish a “Total Cost of Services” study
From the architecture phase (Design) to Prototype and Pilot
“Total” TCS: must include all factors that a research organisation or individual
researcher will bear when running the ARCHIVER resulting services over a
defined period
Strongly connected to exit strategies
Escrow services concept: public research organisations as data stewards
Protection factor against scenarios such as vendor lock-in or supplier bankruptcy
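A “Total Cost of Services” figure that covers every factor over a defined period, including the exit, could be tallied along the following lines. This is only a sketch: the four cost factors and all prices are hypothetical placeholders chosen for illustration, and the slides do not enumerate which factors a provider's study must contain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitPrices:
    """Hypothetical unit prices in euro; none of these come from the slides."""
    ingest_per_tb: float
    storage_per_tb_year: float
    retrieval_per_tb: float
    exit_per_tb: float


def total_cost_of_services(prices, volume_tb, years, retrieved_tb_per_year):
    """Break the cost of holding `volume_tb` for `years` into its factors."""
    costs = {
        "ingest": prices.ingest_per_tb * volume_tb,
        "storage": prices.storage_per_tb_year * volume_tb * years,
        "retrieval": prices.retrieval_per_tb * retrieved_tb_per_year * years,
        # Exit is costed up front so that leaving the provider is budgeted.
        "exit": prices.exit_per_tb * volume_tb,
    }
    costs["total"] = sum(costs.values())
    return costs


if __name__ == "__main__":
    prices = UnitPrices(ingest_per_tb=1.0, storage_per_tb_year=10.0,
                        retrieval_per_tb=5.0, exit_per_tb=2.0)
    print(total_cost_of_services(prices, volume_tb=1000, years=5,
                                 retrieved_tb_per_year=100))
```

Keeping the exit as its own line item reflects the link to exit strategies: a model that omits it understates what a research organisation will bear over the period.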
Summary
ARCHIVER aims to develop a set of commercial, FAIR, archiving and preservation services for
research data
Petabyte-scale research data (tens of petabytes and beyond) in multiple scientific domains
Open, Trustworthy, aligned with best practices (ISO, OAIS, CoreTrustSeal)
Strong preference for Open Source Software and Open Standards as measures to prevent vendor lock-in
Set of “derived rules” for commercial services onboarding in the EOSC
Technical: extensive field testing, “research data ready” archiving and preservation services
Legal: GDPR as an opportunity for high quality digital services guaranteeing digital sovereignty
Financial: models adapted to research considering public procurement cycles and research grants periods,
allowing effective cost planning for LTDP
ARCHIVER R&D activity will start end of Q1 2020
Tender for R&D services to open in less than 48h! Submission of R&D bids from the 31st of January
(submission period of 2 months)
Info Session Webinars: February 7th and March 18th
Selected companies will be providing R&D services, 3 phases: Design (2020), Prototype (2020/2021) and Pilot
(2021)
Questions?
project-office@archiver-project.eu
All information at:
http://archiver-project.eu/
Additional Slides
Synergies with CS3Mesh4EOSC (CS3Mesh Kick-off)
• Test Suite
• Maybe CS3Mesh can profit from the ARCHIVER test suite available on GitHub
• https://github.com/cern-it-efp/OCRE-Testsuite
• A simple framework using Terraform for resource provisioning and Kubernetes for
resource orchestration, as a standard way to onboard scientific use case tests
• APIs
• Would it make sense to define and test together the API implementations that
will be available in the resulting ARCHIVER repository services?
• Preference is given to open, general purpose APIs
• Testing data workflows from CS3 -> ARCHIVER based on “data temperature”?
ARCHIVER Test Catalogue
Initial test catalogue includes tests on network, storage and compute:
https://github.com/cern-it-efp/OCRE-Testsuite/
More tests being included:
FAIRsFAIR project evaluation of “FAIRness”:
https://fairsharing.github.io/FAIR-Evaluator-FrontEnd/#!/#%2F!
Tests provided by the research community; using Science Europe assessment criteria
guidelines
Data ingestion: test of the ingest process, ingests with incremental changes for high
volumes of data, lifecycle from archive packages data creation, combination and/or
aggregation before final archival for long-term preservation
Open APIs: prevent vendor lock-ins and create innovative workflows (CS3MESH?)
Extensive testing of exit plans: during R&D execution, from provider2on-prem /
provider2provider
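One way an exit-plan test could confirm that a provider-to-on-prem or provider-to-provider move lost nothing is to compare checksum manifests taken before and after the transfer. The sketch below assumes each side can export a `{object_name: digest}` mapping; that format and the function names are hypothetical and not part of the ARCHIVER test suite.

```python
def compare_manifests(source, destination):
    """Classify differences between two {object_name: digest} manifests."""
    return {
        # Present at the source but never arrived.
        "missing": sorted(set(source) - set(destination)),
        # Arrived, but the content digest changed in transit.
        "mismatched": sorted(name for name in source
                             if name in destination
                             and source[name] != destination[name]),
        # Present only at the destination.
        "unexpected": sorted(set(destination) - set(source)),
    }


def migration_complete(report):
    """A migration passes only when every difference list is empty."""
    return not any(report.values())


if __name__ == "__main__":
    before = {"run1.dat": "aa11", "run2.dat": "bb22", "run3.dat": "cc33"}
    after = {"run1.dat": "aa11", "run2.dat": "ffff"}
    report = compare_manifests(before, after)
    print(report, migration_complete(report))
```

Reporting the three categories separately helps distinguish an interrupted transfer (missing objects) from corruption in transit (mismatched digests).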

Mais conteúdo relacionado

Mais procurados

3 archiver omc deployment_scenarios
3 archiver omc deployment_scenarios3 archiver omc deployment_scenarios
3 archiver omc deployment_scenariosArchiver
 
Using e-Infrastructures for Biodiversity Conservation
Using e-Infrastructures for Biodiversity ConservationUsing e-Infrastructures for Biodiversity Conservation
Using e-Infrastructures for Biodiversity ConservationBlue BRIDGE
 
Tutorial on Hybrid Data Infrastructures: D4Science as a case study
Tutorial on Hybrid Data Infrastructures: D4Science as a case studyTutorial on Hybrid Data Infrastructures: D4Science as a case study
Tutorial on Hybrid Data Infrastructures: D4Science as a case studyBlue BRIDGE
 
Gergely Sipos, Claudio Cacciari: Welcome and mapping the landscape: EOSC-hub ...
Gergely Sipos, Claudio Cacciari: Welcome and mapping the landscape: EOSC-hub ...Gergely Sipos, Claudio Cacciari: Welcome and mapping the landscape: EOSC-hub ...
Gergely Sipos, Claudio Cacciari: Welcome and mapping the landscape: EOSC-hub ...EOSC-hub project
 
Exposing EO Linked (meta-)Data from OpenSearch Catalogue
Exposing EO Linked (meta-)Data from OpenSearch CatalogueExposing EO Linked (meta-)Data from OpenSearch Catalogue
Exposing EO Linked (meta-)Data from OpenSearch CatalogueRaul Palma
 
Tdr Overview Pres Advocates
Tdr Overview Pres AdvocatesTdr Overview Pres Advocates
Tdr Overview Pres Advocatesjamestoon
 
HNSciCloud: Project Results and lessons learned
HNSciCloud: Project Results and lessons learnedHNSciCloud: Project Results and lessons learned
HNSciCloud: Project Results and lessons learnedEOSC-hub project
 
Reference Model for an Open Archival Information Systems (OAIS): Overview and...
Reference Model for an Open Archival Information Systems (OAIS): Overview and...Reference Model for an Open Archival Information Systems (OAIS): Overview and...
Reference Model for an Open Archival Information Systems (OAIS): Overview and...faflrt
 
ELIXIR UK Node presentation to the ELIXIR Board
ELIXIR UK Node presentation to the ELIXIR BoardELIXIR UK Node presentation to the ELIXIR Board
ELIXIR UK Node presentation to the ELIXIR BoardCarole Goble
 
Towards an Infrastructure for Mining Scientific Publications
Towards an Infrastructure for Mining Scientific PublicationsTowards an Infrastructure for Mining Scientific Publications
Towards an Infrastructure for Mining Scientific Publicationspetrknoth
 
EUDAT Collaborative Data Infrastructure: Data Access and Re-use Service Area
EUDAT Collaborative Data Infrastructure: Data Access and Re-use Service AreaEUDAT Collaborative Data Infrastructure: Data Access and Re-use Service Area
EUDAT Collaborative Data Infrastructure: Data Access and Re-use Service AreaEUDAT
 
Phidias: Steps forward in detection and identification of anomalous atmospher...
Phidias: Steps forward in detection and identification of anomalous atmospher...Phidias: Steps forward in detection and identification of anomalous atmospher...
Phidias: Steps forward in detection and identification of anomalous atmospher...Phidias
 
FIAT/IFTA MMC Seminar May 2015. Metadata Standards for Preservation and Quali...
FIAT/IFTA MMC Seminar May 2015. Metadata Standards for Preservation and Quali...FIAT/IFTA MMC Seminar May 2015. Metadata Standards for Preservation and Quali...
FIAT/IFTA MMC Seminar May 2015. Metadata Standards for Preservation and Quali...FIAT/IFTA
 

Mais procurados (20)

3 archiver omc deployment_scenarios
3 archiver omc deployment_scenarios3 archiver omc deployment_scenarios
3 archiver omc deployment_scenarios
 
Using e-Infrastructures for Biodiversity Conservation
Using e-Infrastructures for Biodiversity ConservationUsing e-Infrastructures for Biodiversity Conservation
Using e-Infrastructures for Biodiversity Conservation
 
Tutorial on Hybrid Data Infrastructures: D4Science as a case study
Tutorial on Hybrid Data Infrastructures: D4Science as a case studyTutorial on Hybrid Data Infrastructures: D4Science as a case study
Tutorial on Hybrid Data Infrastructures: D4Science as a case study
 
UK RepositoryNet+ Mimas Workshop
UK RepositoryNet+ Mimas WorkshopUK RepositoryNet+ Mimas Workshop
UK RepositoryNet+ Mimas Workshop
 
Gergely Sipos, Claudio Cacciari: Welcome and mapping the landscape: EOSC-hub ...
Gergely Sipos, Claudio Cacciari: Welcome and mapping the landscape: EOSC-hub ...Gergely Sipos, Claudio Cacciari: Welcome and mapping the landscape: EOSC-hub ...
Gergely Sipos, Claudio Cacciari: Welcome and mapping the landscape: EOSC-hub ...
 
Who is doing what, and how do we know? [PEPRS]
Who is doing what, and how do we know? [PEPRS]Who is doing what, and how do we know? [PEPRS]
Who is doing what, and how do we know? [PEPRS]
 
Exposing EO Linked (meta-)Data from OpenSearch Catalogue
Exposing EO Linked (meta-)Data from OpenSearch CatalogueExposing EO Linked (meta-)Data from OpenSearch Catalogue
Exposing EO Linked (meta-)Data from OpenSearch Catalogue
 
Tdr Overview Pres Advocates
Tdr Overview Pres AdvocatesTdr Overview Pres Advocates
Tdr Overview Pres Advocates
 
HNSciCloud: Project Results and lessons learned
HNSciCloud: Project Results and lessons learnedHNSciCloud: Project Results and lessons learned
HNSciCloud: Project Results and lessons learned
 
Reference Model for an Open Archival Information Systems (OAIS): Overview and...
Reference Model for an Open Archival Information Systems (OAIS): Overview and...Reference Model for an Open Archival Information Systems (OAIS): Overview and...
Reference Model for an Open Archival Information Systems (OAIS): Overview and...
 
Geoservices Activities at EDINA
Geoservices Activities at EDINAGeoservices Activities at EDINA
Geoservices Activities at EDINA
 
Open Access Repository Junction
Open Access Repository JunctionOpen Access Repository Junction
Open Access Repository Junction
 
ELIXIR UK Node presentation to the ELIXIR Board
ELIXIR UK Node presentation to the ELIXIR BoardELIXIR UK Node presentation to the ELIXIR Board
ELIXIR UK Node presentation to the ELIXIR Board
 
Towards an Infrastructure for Mining Scientific Publications
Towards an Infrastructure for Mining Scientific PublicationsTowards an Infrastructure for Mining Scientific Publications
Towards an Infrastructure for Mining Scientific Publications
 
EUDAT Collaborative Data Infrastructure: Data Access and Re-use Service Area
EUDAT Collaborative Data Infrastructure: Data Access and Re-use Service AreaEUDAT Collaborative Data Infrastructure: Data Access and Re-use Service Area
EUDAT Collaborative Data Infrastructure: Data Access and Re-use Service Area
 
Phidias: Steps forward in detection and identification of anomalous atmospher...
Phidias: Steps forward in detection and identification of anomalous atmospher...Phidias: Steps forward in detection and identification of anomalous atmospher...
Phidias: Steps forward in detection and identification of anomalous atmospher...
 
Geospatial Metadata Workshop
Geospatial Metadata WorkshopGeospatial Metadata Workshop
Geospatial Metadata Workshop
 
Geospatial Metadata Workshop
Geospatial Metadata WorkshopGeospatial Metadata Workshop
Geospatial Metadata Workshop
 
Launch Elixir BE 2017
Launch Elixir BE 2017Launch Elixir BE 2017
Launch Elixir BE 2017
 
FIAT/IFTA MMC Seminar May 2015. Metadata Standards for Preservation and Quali...
FIAT/IFTA MMC Seminar May 2015. Metadata Standards for Preservation and Quali...FIAT/IFTA MMC Seminar May 2015. Metadata Standards for Preservation and Quali...
FIAT/IFTA MMC Seminar May 2015. Metadata Standards for Preservation and Quali...
 

Semelhante a Hybrid Cloud storage deployment models: ARCHIVER presentation at the CS3 Workshop, January 2019

Progress of the Helix Nebula Science Cloud PCP Project
Progress of the Helix Nebula Science Cloud PCP ProjectProgress of the Helix Nebula Science Cloud PCP Project
Progress of the Helix Nebula Science Cloud PCP ProjectHelix Nebula The Science Cloud
 
WEBINAR: "How to manage your data to make them open and fair"
WEBINAR:  "How to manage your data to make them open and fair"  WEBINAR:  "How to manage your data to make them open and fair"
WEBINAR: "How to manage your data to make them open and fair" OpenAIRE
 
The BlueBRIDGE approach to collaborative research
The BlueBRIDGE approach to collaborative researchThe BlueBRIDGE approach to collaborative research
The BlueBRIDGE approach to collaborative researchBlue BRIDGE
 
1 archiver omc project_overview
1 archiver omc project_overview1 archiver omc project_overview
1 archiver omc project_overviewArchiver
 
BioDT for the UiO Science section meeting 2023-03-24
BioDT for the UiO Science section meeting 2023-03-24BioDT for the UiO Science section meeting 2023-03-24
BioDT for the UiO Science section meeting 2023-03-24Dag Endresen
 
Cloud Computing Needs for Earth Observation Data Analysis: EGI and EOSC-hub
Cloud Computing Needs for Earth Observation Data Analysis: EGI and EOSC-hubCloud Computing Needs for Earth Observation Data Analysis: EGI and EOSC-hub
Cloud Computing Needs for Earth Observation Data Analysis: EGI and EOSC-hubBjörn Backeberg
 
Project update - João Fernandes
Project update - João FernandesProject update - João Fernandes
Project update - João FernandesArchiver
 
Archiver pilot phase kick off Award Ceremony
Archiver pilot phase kick off Award CeremonyArchiver pilot phase kick off Award Ceremony
Archiver pilot phase kick off Award CeremonyArchiver
 
Archiver pilot phase kick off Award Ceremony
Archiver pilot phase kick off Award CeremonyArchiver pilot phase kick off Award Ceremony
Archiver pilot phase kick off Award CeremonyArchiver
 
The Helix Nebula Pre-Commercial Procurement - 1° Asterics-Obelics Workshop
The Helix Nebula Pre-Commercial Procurement - 1° Asterics-Obelics WorkshopThe Helix Nebula Pre-Commercial Procurement - 1° Asterics-Obelics Workshop
The Helix Nebula Pre-Commercial Procurement - 1° Asterics-Obelics WorkshopHelix Nebula The Science Cloud
 
Virtual research environments for implementing long tail open science
Virtual research environments for implementing long tail open scienceVirtual research environments for implementing long tail open science
Virtual research environments for implementing long tail open scienceBlue BRIDGE
 
20100401 정영임 da 전략 tft_0330
20100401 정영임 da 전략 tft_033020100401 정영임 da 전략 tft_0330
20100401 정영임 da 전략 tft_0330glorykim
 
20100401 정영임 da 전략 tft_0330
20100401 정영임 da 전략 tft_033020100401 정영임 da 전략 tft_0330
20100401 정영임 da 전략 tft_0330광영 김
 
Gergely Sipos (EGI): Exploiting scientific data in the international context ...
Gergely Sipos (EGI): Exploiting scientific data in the international context ...Gergely Sipos (EGI): Exploiting scientific data in the international context ...
Gergely Sipos (EGI): Exploiting scientific data in the international context ...Gergely Sipos
 
eROSA Stakeholder WS1: EUDAT – The pan-European data infrastructure
eROSA Stakeholder WS1: EUDAT – The pan-European data infrastructureeROSA Stakeholder WS1: EUDAT – The pan-European data infrastructure
eROSA Stakeholder WS1: EUDAT – The pan-European data infrastructuree-ROSA
 
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)Blue BRIDGE
 

Semelhante a Hybrid Cloud storage deployment models: ARCHIVER presentation at the CS3 Workshop, January 2019 (20)

Progress of the Helix Nebula Science Cloud PCP Project
Progress of the Helix Nebula Science Cloud PCP ProjectProgress of the Helix Nebula Science Cloud PCP Project
Progress of the Helix Nebula Science Cloud PCP Project
 
The Archiver project
The Archiver projectThe Archiver project
The Archiver project
 
WEBINAR: "How to manage your data to make them open and fair"
WEBINAR:  "How to manage your data to make them open and fair"  WEBINAR:  "How to manage your data to make them open and fair"
WEBINAR: "How to manage your data to make them open and fair"
 
The Science Cloud Users: Challenges and Needs
The Science Cloud Users: Challenges and NeedsThe Science Cloud Users: Challenges and Needs
The Science Cloud Users: Challenges and Needs
 
The BlueBRIDGE approach to collaborative research
The BlueBRIDGE approach to collaborative researchThe BlueBRIDGE approach to collaborative research
The BlueBRIDGE approach to collaborative research
 
1 archiver omc project_overview
1 archiver omc project_overview1 archiver omc project_overview
1 archiver omc project_overview
 
BioDT for the UiO Science section meeting 2023-03-24
BioDT for the UiO Science section meeting 2023-03-24BioDT for the UiO Science section meeting 2023-03-24
BioDT for the UiO Science section meeting 2023-03-24
 
Cloud Computing Needs for Earth Observation Data Analysis: EGI and EOSC-hub
Cloud Computing Needs for Earth Observation Data Analysis: EGI and EOSC-hubCloud Computing Needs for Earth Observation Data Analysis: EGI and EOSC-hub
Cloud Computing Needs for Earth Observation Data Analysis: EGI and EOSC-hub
 
Hybrid Cloud storage deployment models: ARCHIVER presentation at the CS3 Workshop, January 2020

  • 1. ARCHIVER Archiving and Preservation for Research Environments CS3 Workshop 27th January 2020 – Copenhagen João Fernandes (CERN)
  • 2. Project Objective. Focus: Archiving and Data Preservation Services using commercial cloud services, to be available via the European Open Science Cloud (EOSC). Procurement R&D budget: 3.4M euro. Starting date: 1st of January 2019. Duration: 36 months. Coordinator: CERN (Lead Procurer).
  • 3. Consortium. Includes Buyers and Experts in the preparation, execution and promotion of the Procurement of R&D. Procurers: public organisations committing funds to contribute to a joint R&D procurement, research data use cases and R&D testing effort. Experts: partner organisations bringing expertise in requirement assessment and promotion activities, not part of the Buyers Group.
  • 4. Preferred Partners (Early Adopters). Confirmed subscription received from 11 organisations: high level of interest from the community. Participants: demand-side public sector organisations. Information Webinar (04th September): 47 participants. Key advantages: assess whether the resulting services address archiving and preservation needs; contribute to and shape the R&D carried out in the project, contribute use cases, and have the option to purchase pilot-scale services by the end of the project.
  • 5. Challenge. Demonstrate services for long-term preservation and archiving of scientific data in the PB range. F.A.I.R. archiving services following best practices and standards. Expand the resulting services to several scientific domains. Establish transparent business models and make the resulting services available through the EOSC catalogue. Current Status of Scientific Data Repositories: basic bit preservation and archiving capabilities; data volumes and communities growing; longstanding archiving and preservation activities, but most of the data not yet published; fragmentation across scientific disciplines, with underestimation of costs at the planning phase.
  • 6. R&D Scoping Activities and Dialogue with the Private Sector. Open Market Consultation Process: 08th February, 1st Preparation Workshop; 20th February, 2nd Preparation Workshop; 08th April, OMC Kick-off (CERN); 07th May, OMC Event (Barcelona); 23rd May, OMC Event (Stansted); 05th June, OMC Consolidation (CERN). Feedback Integration (x3). Feedback on the Draft PCP Contract Notice. In parallel: continuous updates to the FAQ, consortium matchmaking, early adopters engagement.
  • 7. Geographical Distribution of Companies. Wide geographical distribution: 42 companies, the majority from 12 European countries.
  • 8. Outcome: R&D Challenge. Cost-effective business model, taking into account: scale, ingest rates, archive lifetime, number of copies, exit strategies, portability, SLAs. Regulation & Legislation: auditing, self-assessment, data retention, GDPR. Layer 1, Storage/Basic Archiving/Secure backup: data integrity/security; cloud/hybrid deployment; data volume in the PB range; high, sustained ingest data rates; ISO certification: 27000, 27040, 19086 and related; archives connected to the GEANT network. Layer 2, Preservation: OAIS conformant services: data readability formats, normalization, obsolescence monitoring, file fixity, authenticity checks, etc.; ISO 14721/16363, 26324 and related standards. Layer 3, Baseline user services: search, discover, share, indexing, data removal, etc.; access under Federated AAI. Layer 4, Advanced services: high-level services: visual representation of data (domain specific), reproducibility of scientific analyses using Machine Learning algorithms, etc. The layers are split between Core R&D and Bonus R&D. Scientific use case deployments documented at: https://www.archiver-project.eu/deployment-scenarios
  • 9. High Energy Physics. The BaBar Experiment: During this year, the BaBar Experiment infrastructure at SLAC will be decommissioned. 2 PB of BaBar data can no longer be stored at the host laboratory. Currently a copy of the data is being held by CERN IT-ST. Goal: To ensure that a complete second copy of BaBar data will be retained for possible comparisons with data from other experiments and be shared through the CERN Open Data Portal. CERN Open Data Portal: The CERN Open Data portal disseminates close to 2 PB of primary and derived datasets from particle physics as they are released by LHC Collaborations and is being used for both education and research purposes. Goal: Achieve total reproducibility of research, being able to completely instantiate data, associated software and services off-premise. Offer research reproducibility services to individual researchers running open data analyses completely independent from the original on-premise infrastructure. CERN Digital Memory: Deployment consisting of a requirement to archive approximately 1.5 PB of digital memory, containing analogue documents produced by the Organization in the 20th century as well as digital production of the 21st century (web sites, social media, emails, etc.). Goal: Produce a dark archive in the cloud following standard OAIS practices.
  • 10. Life Sciences. EMBL on FIRE: EMBL-EBI provides data archiving services to the global molecular biology community. These data archives are currently based on an internal service (FIRE: FIle REplication). FIRE currently holds 20 PB of data and is growing at 40% per year. Goal: Cost-effective scaling via cloud-based storage solutions. Distribute data effectively on the cloud, covering the increasing needs for cloud-hosted analysis. EMBL Cloud Data Caching: Life sciences research communities increasingly access internal data from public cloud services for their data analysis. Goal: To progressively cache data in the cloud, with the on-premises data being replicated and discarded as required. Which data should be cached, how much and for how long will be a trade-off between the cost of cloud storage and the cost of the network capacity/latency needed to download the data multiple times. Scientific use case deployments documented at: https://www.archiver-project.eu/deployment-scenarios
  • 11. Astronomy. The MAGIC Cherenkov gamma-ray telescopes and the PAUcam camera for the William Herschel Telescope are located in the Observatorio del Roque de los Muchachos, in the Canary Islands, Spain. The first Large Scale Telescope of the next-generation Cherenkov Telescope Array (CTA) is also there. They produce about 0.3 PB of raw data per year, which is automatically sent to PIC in Barcelona. PIC Large File Storage. Goal: To replace the current in-house tape library storage. Each instance of the service to be purchased is the 5-year safe-keeping of a yearly dataset from a single source. PIC Mixed File Remote Storage. Goal: To archive the derived datasets from at most two sources, becoming part of the yearly dataset. In addition, allow update/upload of derived datasets for a period of 4 years following the creation of the data. PIC Data Distribution. Goal: To replace the Hierarchical Storage Manager, disk storage and data distribution service. Each instance of the service to be purchased is the 5-year safe-keeping and data distribution of a yearly dataset and its derived datasets.
  • 12. Photon Sciences. PETRA III is the world's most brilliant storage-ring-based X-ray source for high-energy photons, with 22 beamlines distributed over three experimental halls. The European XFEL is the world's largest X-ray laser, generating 27 000 ultrashort X-ray flashes per second with a brilliance that is a billion times higher than that of the best conventional X-ray radiation sources. The two facilities produce several tens of PB of raw data per year, and this is expected to double in size every year. Goal: Develop a hybrid model that combines current on-premise archiving services with the resulting services of ARCHIVER. To move a predefined set of datasets into public clouds and make them open for public access. Scientific use case deployments documented at: https://www.archiver-project.eu/deployment-scenarios
  • 13. Definition of the R&D scope. Layer 1, Storage/Basic Archiving/Secure backup: data integrity/security; cloud/hybrid deployment; data volume in the PB range; high, sustained ingest data rates; ISO certification: 27000, 27040, 19086 and related standards; archives connected to the GEANT network. Layer 2, Preservation: OAIS conformant services: data readability formats, normalization, obsolescence monitoring, file fixity, authenticity checks, etc.; ISO 14721/16363, 26324 and related standards. Layer 3, Baseline user services: search, discover, share, indexing, data removal, etc.; access under Federated IAM. Layer 4, Advanced services: high-level services: visual representation of data (domain specific), reproducibility of scientific analyses using Machine Learning algorithms, etc. Deployments: EMBL1: FIRE; EMBL2: Cloud Caching; PIC1: Large File Storage; PIC2: Mixed File Remote Storage; PIC3: Data Distribution; DESY1: PETRA III/EuXFEL; CERN1: The BaBar Experiment; CERN2: CERN Digital Memory; CERN3: CERN Open Data. Deployments derived from 4 ESFRI landmarks: CTA, ELIXIR, EuXFEL and HL-LHC. Scientific use case deployments documented at: https://www.archiver-project.eu/deployment-scenarios
  • 14. Project Timeline. ARCHIVER R&D Implementation Phases.
  • 16. European Open Science Cloud - The Vision. Role of the EOSC: to ensure that 1.7 million European researchers and 70 million professionals in science and technology reap the full benefits of data-driven science. - Federated virtual environment, free at the point of use for the end researcher - Open services for storage, analysis and re-use of research data - Promote an approach across national borders & scientific disciplines - Promote choice of the deployment model: on-prem, hybrid, off-prem. EOSC Phase 1: investment of EUR 300 million in core services. The EOSC legal entity is expected to be created by the end of 2020. “We are creating a European Open Science Cloud now. It is a trusted space for researchers to store their data and to access data from researchers from all other disciplines. We will create a pool of interlinked information, a ‘web of research data’. (…) The idea is that once we have the rules of the game ready, then we will open this up to the broader public sector and to business as well. So that companies can come in, store the data and use the data.” Special Address by Ursula von der Leyen, President of the European Commission, 22nd January, WEF DAVOS 2020: https://www.youtube.com/watch?v=QN476nVbFVs&feature=youtu.be&t=682
  • 17. EOSC – Engagement of commercial providers. EOSC should provide a level playing field: same requirements for commercial and not-for-profit providers. Accept commercial services in Data Management Plans. Stay mainstream and interoperable by adopting widely used and internationally recognized standards. Promote choice, an ecosystem for innovation, fostering data self-determination and digital sovereignty in Europe. https://www.spielwarenmesse.de
  • 18. ARCHIVER key contributions for the EOSC. Long-Term Archiving and Preservation of Research Data is at the core of the EOSC strategy. ARCHIVER is key in defining EOSC Rules of Participation for the private sector, both as service providers and as R&D partners for “close to market” solutions. List of 40+ EOSC projects available at: https://www.eosc-portal.eu/about/eosc-projects
  • 19. ARCHIVER services in the European Open Science Cloud. Timeline: 2019 to 2024, ending with ARCHIVER services available in the EOSC. Objective: to make the resulting services available in the European Open Science Cloud catalogue. What does this really mean? Atomic Use Case: “As a researcher, I want to have access to the full set of ARCHIVER services, so that I am able to evaluate their functionality for my specific research field, able to purchase them with a clear cost model, and able to implement an exit strategy to repatriate or move my research data seamlessly to another location by the end of the contract and usage period.”
  • 20. ARCHIVER Data Management Strategy. DMPs both for research use cases and for the project. FAIR guiding principles. Post-GDPR era: strong focus on technical and organisational measures for Data Privacy & Protection as the path to digital sovereignty. Modern guidelines provided by Science Europe for pan-European Research Data Management (RDM), establishing core requirements for research DMPs and a set of criteria to assess trustworthy repositories: https://www.scienceeurope.org/our-resources/practical-guide-to-the-international-alignment-of-research-data-management/
  • 21. ARCHIVER EOSC Services - Technical Validation (I). Resource provisioning using the Terraform API. OSS container orchestration systems: automated deployment based on Kubernetes and Docker. Results stored back at the CERN S3 cloud storage service. Available on GitHub under an OSS license: https://ocre-testsuite.readthedocs.io/en/latest/ Tests provided by the research community: https://fairsharing.github.io/FAIR-Evaluator-FrontEnd/ Started in HNSciCloud, already in use in OCRE, reused and expanded in ARCHIVER.
  • 22. ARCHIVER EOSC Services – Legal & Organisational (II). Data Privacy and Protection (GDPR), an EC pillar: personal data is wrapped in several components (Fed AAI, the research data itself); technical & organisational measures: “Privacy by Design” approach for GDPR conformance. Best Practices and Standards: Infrastructure: ISO 27001 series, European Cybersecurity Act; Long Term Data Preservation: OAIS and CoreTrustSeal; Self-Assessment: definition of responsibilities across data stewards and service providers. Exit Strategies: stimulate the use of Open Source and Open APIs as measures for vendor lock-in prevention; definition and field testing of viable exit plans (provider/on-prem & provider/provider).
  • 23. EOSC Services – Financial & Business Model (III). Establish a range of sustainable, cost-effective purchasing options: these must support organisations or individual researchers in storing their data after the end of a procurement cycle or of an individual researcher's grant. Requirement for service providers to establish a “Total Cost of Services” (TCS) study, from the architecture phase (Design) to Prototype and Pilot. “Total” TCS: must include all factors that a research organisation or individual researcher will bear when running the ARCHIVER resulting services over a defined period. Strongly connected to exit strategies. Escrow services concept: public research organisations as data stewards; a protection factor against scenarios such as vendor lock-in or supplier bankruptcy.
  • 24. Summary. ARCHIVER aims to develop a set of commercial, FAIR archiving and preservation services for research data: petabyte-scale research data (tens of petabytes and beyond) in multiple scientific domains; open, trustworthy, aligned with best practices (ISO, OAIS, CoreTrustSeal); strong preference for Open Source Software and Open Standards as measures to prevent vendor lock-in. Set of “derived rules” for onboarding commercial services in the EOSC. Technical: extensive field testing, “research data ready” archiving and preservation services. Legal: GDPR as an opportunity for high-quality digital services guaranteeing digital sovereignty. Financial: models adapted to research, considering public procurement cycles and research grant periods, allowing effective cost planning for LTDP. ARCHIVER R&D activity will start at the end of Q1 2020. Tender for R&D services to open in less than 48h, for submission of R&D bids from the 31st of January (submission period of 2 months). Info Session Webinars: February 7th and March 18th. Selected companies will be providing R&D services in 3 phases: Design (2020), Prototype (2020/2021) and Pilot (2021).
  • 27. Synergies with CS3Mesh4EOSC (CS3Mesh Kick-off). Test Suite: maybe CS3Mesh can profit from the ARCHIVER test suite available on GitHub: https://github.com/cern-it-efp/OCRE-Testsuite It is a simple framework using Terraform for resource provisioning and Kubernetes for resource orchestration, as a standard way to onboard tests of scientific use cases. APIs: would it make sense to define and test together the API implementations that will be available in the ARCHIVER resulting repository services? Preference is given to open, general-purpose APIs. Testing data workflows from CS3 -> ARCHIVER based on “data temperature”?
  • 28. ARCHIVER Test Catalogue. The initial test catalogue includes tests on network, storage and compute: https://github.com/cern-it-efp/OCRE-Testsuite/ More tests being included: FAIRsFAIR project evaluation of “FAIRness”: https://fairsharing.github.io/FAIR-Evaluator-FrontEnd/#!/#%2F! Tests provided by the research community, using Science Europe assessment criteria guidelines. Data ingestion: test of the ingest process, ingests with incremental changes for high volumes of data, lifecycle from archive package data creation, combination and/or aggregation before final archival for long-term preservation. Open APIs: prevent vendor lock-in and create innovative workflows (CS3MESH?). Extensive testing of exit plans during R&D execution: provider2on-prem / provider2provider.
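
The file fixity check listed among the OAIS conformant services (slides 8 and 13) can be illustrated with a minimal sketch. This is not taken from the ARCHIVER test suite: the function names `sha256_of`, `build_manifest` and `verify_manifest`, and the choice of SHA-256 as the checksum, are assumptions made for illustration only. The idea is to record a checksum per file at ingest time and to recompute and compare it later.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file under root (relative POSIX path) to its checksum.

    Hypothetical helper: this would run once, at ingest time.
    """
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose checksum changed."""
    failures = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            failures.append(rel)
    return failures
```

In use, the manifest would be stored separately from the archived copy, and `verify_manifest` would be run periodically or after each transfer (for example during an exit-plan test); an empty result means every recorded file is still present and unchanged.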