EOSC-hub receives funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 777536.
eosc-hub.eu
@EOSC_eu
Baptiste Grenier / Enol Fernández
EGI Foundation
Open Data analysis with EOSC-hub services
Dissemination level: Public
Thanks to the EOSC-hub distributed team!
Onedata and DataHub: Lukasz Dutka, Lukasz Opiola, Bartosz Kryza, Michal Orzechowski
EGI FedCloud provider: Boris Parak, Miroslav Ruda, Zdenek Sustr
EGI Check-in: Nicolas Liampotis
B2HANDLE: Kyriakos Ginis
B2FIND: Tobias Weigel, Claudia Martens
What do we want to do?
• Several of the use cases in EOSC-hub will enable scientific end-users to perform data analysis experiments on large volumes of data, by exploiting a PID-enabled, server-side, and parallel approach.
• Users expect easy-to-use interfaces, such as Jupyter Notebooks, for interacting with the system.
• Results should be reusable and follow the FAIR guidelines: Findability, Accessibility, Interoperability, and Reusability.
How?
● Analysis
  ○ Notebooks / JupyterLab
  ○ FedCloud resources
● Data management
  ○ DataHub / Onedata
    ■ Space
    ■ Onezone
    ■ Oneprovider
    ■ Oneclient
● AAI (OIDC)
  ○ Check-in
● PID management
  ○ B2HANDLE
  ○ Handle.net
● Cataloguing and discovery
  ○ B2FIND
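As a rough illustration of the AAI (OIDC) component, the sketch below builds the two URLs a service typically needs for an OpenID Connect authorization code flow: the provider's discovery document and the authorization request. The issuer, authorization endpoint, client ID and redirect URI are hypothetical placeholders, not Check-in's real values; only the `/.well-known/openid-configuration` path and the query parameter names come from the OpenID Connect specifications.

```python
import secrets
from urllib.parse import urlencode

# Hypothetical values: use the issuer and client registered for your own service.
ISSUER = "https://idp.example.org/oidc"
CLIENT_ID = "my-notebooks-client"
REDIRECT_URI = "https://notebooks.example.org/hub/oauth_callback"


def discovery_url(issuer: str) -> str:
    """Return the OpenID Connect discovery document URL for an issuer."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


def authorization_url(authorization_endpoint: str, client_id: str,
                      redirect_uri: str, state: str,
                      scopes=("openid", "profile", "email")) -> str:
    """Build an authorization code flow request URL."""
    params = {
        "response_type": "code",      # ask for an authorization code
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),    # space-separated scope list
        "state": state,               # opaque value checked on the callback
    }
    return authorization_endpoint + "?" + urlencode(params)


# The authorization endpoint is assumed here; in practice it is read from
# the "authorization_endpoint" field of the discovery document.
state = secrets.token_urlsafe(16)
login_url = authorization_url(ISSUER + "/authorize", CLIENT_ID, REDIRECT_URI, state)
print(discovery_url(ISSUER))
print(login_url)
```

In a real deployment the service would fetch the discovery document first and take the endpoints from it rather than hard-coding them.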
Lessons Learned
● Integrating multiple services from the EOSC-hub catalogue to build a new solution is worth the effort
  ○ Self-service APIs let you combine services with little overhead, although some steps still cannot be automated
  ○ Support channels with providers are life savers while prototyping
● The setup still needs to be validated for production with a real research community
● Aim at a completely integrated solution that people can reuse
  ○ Provide Python modules for easy interaction with services
  ○ Expand the EGI Notebooks service
  ○ Ensure that all required operations can be done using API calls
Enabling reproducibility with Notebooks
[Workflow diagram. Components: Your laptop, GitHub (your repository), EGI Notebooks service, Zenodo, Data repository, MyBinder.org, Fellow researchers, Journal paper. Step labels: Download ipynb file; Create repository; Upload ipynb file; Add requirements.txt; Specify GitHub repo; Generate DOI; Execute; Re-execute; Obtain GitHub project reference; Provide GitHub project reference; Discover Notebook (use DOI); DOI.]
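The "Specify GitHub repo" and "Re-execute" steps of the diagram can be sketched as building a MyBinder.org launch link for the repository that holds the notebook and its requirements.txt. This is a minimal sketch, assuming a public GitHub repository; the repository and notebook names are made up, and `binder_url` is a hypothetical helper, not part of any EOSC-hub or Binder library.

```python
from urllib.parse import quote


def binder_url(owner: str, repo: str, ref: str = "HEAD", notebook: str = "") -> str:
    """Return a mybinder.org launch URL for a public GitHub repository."""
    url = f"https://mybinder.org/v2/gh/{quote(owner)}/{quote(repo)}/{quote(ref, safe='')}"
    if notebook:
        # `filepath` asks Binder to open this file once the environment is built.
        url += "?filepath=" + quote(notebook, safe="")
    return url


# Hypothetical repository and notebook names.
print(binder_url("example-org", "wind-analysis", "main", "notebooks/analysis.ipynb"))
```

Binder builds the environment from the files it finds in the repository, which is why the diagram includes the "Add requirements.txt" step.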
An Open Science story we aim for…
[Workflow diagram, extending the previous one. Components: Your laptop, GitHub (your repository), EGI Notebooks and Binder service, Zenodo, Data repository, Distributed big data (DataHub, B2DROP, etc.), Fellow researchers, Journal paper. Step labels: Download ipynb file; Create repository; Upload ipynb file; Add requirements.txt; Specify GitHub repo; Generate DOI (shown twice); Execute; Obtain GitHub project reference; Provide GitHub project reference; Discover Notebook (use DOI); DOI.]
Links
- Onedata
▪ https://onedata.org
- EGI DataHub
▪ https://datahub.egi.eu - http://egi-datahub.readthedocs.io/
- EGI Notebooks
▪ https://www.egi.eu/services/notebooks/ - https://notebooks.egi.eu/
- EGI Check-in
▪ https://www.egi.eu/services/check-in/ - https://wiki.egi.eu/wiki/AAI
- B2FIND
▪ https://eudat.eu/services/b2find - http://eudat7-ingest.dkrz.de/
- B2HANDLE
▪ https://eudat.eu/services/b2handle - https://hdl.grnet.gr:8001/api/handles
- Binder
▪ https://mybinder.org
Thank you for your attention! Questions?
Contact
Enol Fernandez - enol.fernandez@egi.eu
Baptiste Grenier - baptiste.grenier@egi.eu
This material by Parties of the EOSC-hub Consortium is licensed under a Creative Commons Attribution 4.0 International License.
Demonstration flow
1. Authenticating to DataHub using Check-in: https://datahub.egi.eu
   a. Showing content of space
2. Authenticating to Notebooks using Check-in: https://cs3.fedcloud-tf.fedcloud.eu
   a. Showing content of mounted space
   b. Running wind cast analysis notebook
   c. Running PID registration notebook to share and publish notebooks directory
3. B2FIND cataloguing (data collected on a regular basis): http://eudat7-ingest.dkrz.de/dataset?groups=egidatahub
4. OAI-PMH metadata in DataHub: http://datahub.egi.eu/oai_pmh?verb=ListRecords&metadataPrefix=oai_dc
5. PID in Handle.net registry: http://hdl.handle.net/
6. PID pointing to shared data publicly accessible in Onedata
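The OAI-PMH step of the demonstration flow can be sketched as follows: build the ListRecords request shown in the flow and pull identifiers and titles out of a Dublin Core response. This is a minimal sketch using only the Python standard library; the endpoint URL is the one from the flow, while the sample response and the helper names are made up so that the code runs offline.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_ENDPOINT = "http://datahub.egi.eu/oai_pmh"  # endpoint from the demonstration flow
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(endpoint: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH ListRecords request URL."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return endpoint + "?" + query


def parse_records(xml_text: str):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", default="", namespaces=NS)
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        records.append((identifier, title))
    return records


# Made-up sample response; a real one would be fetched from list_records_url(...).
SAMPLE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1234</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example wind dataset</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

print(list_records_url(OAI_ENDPOINT))
print(parse_records(SAMPLE))
```

A harvester such as B2FIND would additionally follow resumption tokens to page through large result sets, which this sketch leaves out.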
DataHub/Onedata: login with Check-in (OIDC)
Check-in: IdP selection and authentication
IdP: information release consent
Check-in: entitlements forwarded to the service
DataHub: displaying spaces and providers
DataHub: user space content
Notebooks: login with Check-in (OIDC)
Notebooks: JupyterHub environment
Notebooks: Onedata space mounted locally
Notebooks: wind casting using public dataset
Notebooks: publishing data with PID using APIs
Notebooks: sharing directory, minting PID
B2FIND: discovery of harvested OAI-PMH metadata
B2FIND: displaying an entry
DataHub: displaying OAI-PMH metadata
Handle.net: the PID in the registry
DataHub: the published dataset, from the PID
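The last two steps, looking up the PID in the Handle.net registry and following it to the published dataset, can be sketched with the JSON format returned by the Handle proxy's REST API. This is a sketch under stated assumptions: the handle and target URL in the sample record are made up, and `target_url` is a hypothetical helper written for illustration.

```python
import json

HANDLE_PROXY_API = "https://hdl.handle.net/api/handles/"


def handle_api_url(handle: str) -> str:
    """Return the proxy REST URL that serves a handle record as JSON."""
    return HANDLE_PROXY_API + handle


def target_url(record: dict):
    """Return the value of the first URL-typed entry of a handle record, or None."""
    if record.get("responseCode") != 1:  # 1 means the lookup succeeded
        return None
    for value in record.get("values", []):
        if value.get("type") == "URL":
            return value["data"]["value"]
    return None


# Made-up record in the shape of a proxy response, so the sketch runs offline.
SAMPLE = json.loads("""
{
  "responseCode": 1,
  "handle": "21.T00000/example-0001",
  "values": [
    {
      "index": 1,
      "type": "URL",
      "data": {"format": "string", "value": "https://datahub.example.org/share/abc123"},
      "ttl": 86400
    }
  ]
}
""")

print(handle_api_url(SAMPLE["handle"]))
print(target_url(SAMPLE))
```

Opening `http://hdl.handle.net/` followed by the handle in a browser performs the same lookup and redirects to the stored URL.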