
RELIANCE ROHub hackathon

Hackathon for RELIANCE research communities.
Note: The hackathon was conducted using the old version of ROHub (http://www.rohub.org). A new portal is to be released at the end of 2021 (http://reliance.rohub.org).


  1. ROHub Hackathon, 29th April 2021
     Research Lifecycle Management technologies for Earth Science Communities and Copernicus users in EOSC
     This project has received funding from the European research infrastructures (including e-Infrastructures) under the European Union's Horizon 2020 research and innovation programme under grant agreement No 101017501.
  2. Research Objects - overview
     Goal: Account, describe and share everything about your research, including how those things are related
     • Unique identifier (DOI)
     • Aggregation of resources
     • Hypotheses
     • Data used and results produced
     • Methods employed to produce and analyse data
     • Scientific workflows, scripts, code implementing such methods
     • (Web) services used
     • People involved in the investigation
     • …
     • Annotations about these resources
     • Descriptive metadata
     • Provenance of executions
     • Versioning information
     http://www.researchobject.org
  3. Why Research Objects (1/2)
     i. To organize and describe the resources, materials, and methods of an investigation
     ii. To keep track of and support the research lifecycle via snapshots, releases and forks, including versioning and change information
     iii. To share your research materials with other scientists at discrete milestones of your investigation, and collaborate via a single information unit uniquely identified by a URI, preferably a DOI (RO as a social object)
     iv. To enhance the findability and accessibility of your scientific outcomes through a single information unit associated with rich, machine-readable metadata
  4. Why Research Objects (2/2)
     v. To enable reproducibility and reuse of the scientific methods and results via access to resources, context and metadata
     vi. To be recognized and cited (even for constituent parts), encouraging the release and publication of research materials
     vii. To preserve results and prevent decay (curation of scientific methods, e.g., workflows)
     viii. To provide evidence and support validation of findings claimed in scholarly articles
  5. RO model customizations to Earth Science (pre-RELIANCE)
     • Geospatial information
     • Time-period coverage
     • Data access policies
     • Intellectual Property Rights
     • General ES metadata (DOI, discipline, size, format, date…)
     • RO evolution and versioning extended (fork, release and DOI variants)
     • Eight new RO types:
       * Workflow-centric
       * Data-centric
       * Research product-centric
       * Bibliographic
       * …
       and their associated checklists
     • Concepts and properties related to executable resources extended to consider other types of processes, not only workflows (roterms)
       * ProcessValue, subsequentProcess, previousProcess,…
  6. RO model customizations to Earth Science (RELIANCE)
     • Data Cube centric Research Object
     • Treats DCs as first-class resources
       * aggregated and described in detail, e.g., how a DC was generated or used
     • A DC is identified and aggregated in the RO through its URI
       * a link to the DC that opens (or will open) its “latest” scene in the ADAM GUI
     • The aggregated DC metadata may include (D4.1), e.g.,
       * identification, description, resolution
       * parameters used to generate the dataset or subset
       * links to access it
     • Such an RO may also include other related resources, like:
       * (link to) a Jupyter Notebook using this DC via the ADAM API
       * Intermediate and final results from the analysis of the data (link to the data in some repository, e.g., B2SHARE or Zenodo)
       * Related documentation
       * Related publications
       * Others
  7. Managing the research lifecycle via ROs
  8. Managing the research lifecycle via ROs
  9. Managing the research lifecycle via ROs
  10. Managing the research lifecycle via ROs
  11. Managing the research lifecycle via ROs
  12. RO generation
  13. http://www.rohub.org/
      • Holistic solution for research object management
        * natively implements the full RO model and paradigm
        * supports different stakeholders, with the primary focus on scientists, researchers, students and enthusiasts
      • Comprises
        * a backend service implementing and exposing a set of APIs
        * a reference web client application, exposing all RO functionalities to the end-users
      • Combines and leverages different technologies
        * DL, long-term preservation & semantic technologies
  14. High-level features
      ROHUB enables users:
      • to create and manage high-quality ROs that can be interpreted and reproduced in the future
      • to reference, share and preserve resources related to scientific studies, campaigns, and observations, including internal ones, links to external ones, as well as other ROs (nested ROs)
      • to collaborate with colleagues and to discover new knowledge through advanced exploratory search interfaces that exploit RO metadata (both explicitly provided and automatically extracted from its content), as well as via a standard search API (OpenSearch with Geo extensions)
      • to manage the RO evolution, including the ability to generate snapshots and releases and to allow others to fork the RO to reuse and extend it
      • to publish the associated work and assign it a DOI to allow its citation in scholarly communications
      • to monitor and follow a particular work, getting notifications about its progress or changes in quality
      • to build reputation, by enabling users to rate and favorite ROs created by others
      • to find related works or relevant researchers in a particular domain, e.g., for possible collaborations or reviews
  15. ROHub plus added value services
      • Semantic enrichment: readability, discoverability, reuse
      • Recommendation: content-based, concentric spheres
      • Research lifecycle & scholarly communication: collaboration, publication, citation, validation
      • Quality assessment: monitoring & preservation of HQ investigations
      • Social impact: sharing, quality
  16. Semantic enrichment
      • Enrich ROs with semantic metadata extracted from their aggregated human-generated content, enhancing both human and machine readability, and thus their discoverability
      • Extracted annotations are structured as semantic markup and are included as annotations following the RO model
      Figures: Ontology to represent identified annotations; RO search levels
  17. Recommender system
      • Implements a content-based recommender service
      • Takes as input user interests (as a collection of ROs) and matches them against other ROs based on their content, exploiting the metadata generated by the RO semantic enrichment process
      • User interface follows a visual metaphor based on concentric spheres
  18. Quality assessment & preservation
      • Checklists are a well-established tool for guiding practice
        * ensure safety, quality, and consistency in communities
        * allow specifying the required metadata an RO must include and have access to, and its purpose
        * defined for RO types based on researchers’
      • Contribute to automating DMPs
      • Allow calculating RO quality metrics: completeness, stability and reliability
      • Monitor the RO and assess its quality throughout its lifecycle for its preservation
        * Assess checklists systematically
        * notify users when quality drops (decay)
      • Focus on reuse
  19. Research lifecycle & scholarly communication
      Evolve, Release & Publish
  20. Social Impact
      • Researchers can build up their reputation based on ratings from other colleagues
      • Ratings may be used as additional information when deciding what to reuse
      • Favorites may be used to quickly retrieve the ROs researchers are interested in
      • Comments and replies can be used for discussions
  21. Provenance information
      Collaboration
      • Who made the (content of) research object? Who maintains it?
      • Who wrote this document? Who uploaded it?
      • Which CSV was this Excel file imported from?
      • Who wrote this description? When? How did we get it?
      • What is the state of this RO? (Live or Published?)
      • What did the research object look like before? (Revisions) – are there newer versions?
      • Which research objects are derived from this RO?
      Answer to: who, where, which, what, why
      • Attribution
      • Derivation
      • Activities
      PROV-O: http://www.w3.org/TR/prov-primer/
  22. RELIANCE services logic interconnection IBIS
  23. ROHub plans in RELIANCE
      • ROHub will be interconnected with the other RELIANCE services, namely the ADAM platform for Data Cube services and the text mining and semantic enrichment services (initial connection made in EVER-EST)
      • ROHub will be onboarded into the EOSC portal catalogue
      • ROHub will leverage and integrate with other EOSC services, including EOSC AAI and research data management services like B2DROP, B2SHARE or Zenodo
      • ROHub also plans to leverage the EOSC notebooks (Jupyter notebooks) service
      • ROHub version 2.0 is on the way
  24. ROHub use in RELIANCE
      • Two main entry points:
        * ROHub portal
        * Jupyter notebooks (via the ROHub Python library)
      • Additionally
        * ROHub is planned to be connected at a high level from the ADAM GUI
        * Load a DC RO, save changes to the RO, publish the DC RO (make snapshot/release), open the ROHub portal for further RO manipulation
        * ROHub may be used within VRC applications (via its API, library)
  25. ROHub portal (plans) - ongoing
  26. Jupyter Notebooks (plans)
      • Jupyter notebooks will be treated as the default execution environment in RELIANCE, where scientists will be able to access their data, create/manipulate data cubes, execute their methods, and manage their research objects
      • Leverage EGI notebooks
  27. EOSC AAI integration (plans) – done under testing
  28. B2SHARE & Zenodo integration (under testing)
      • Zenodo is a general-purpose open-access repository used in EOSC as a catch-all repository
      • B2SHARE is an EOSC service to share and publish your research data
      • Both services generate DOIs
      • The plan is to allow users to publish snapshots/releases of ROs to those services directly
      • Users may be able to select the community where to publish it
      • Integration is straightforward: users will need to provide their token to ROHub so that it can use those services on their behalf
  29. B2DROP integration (plans)
      • B2DROP is a Personal Cloud Storage Service
      • May be used, if the user wants it, as the storage backend for ROHub to store the “internal” RO resources of the user, instead of using the default ROHub storage system
      • Integration path under discussion (telco next week)
  30. Live Tour
  31. References
      • Architecture:
        * K. Page, R. Palma, P. Holubowicz, G. Klyne, S. Soiland-Reyes, D. Cruickshank, R. G. Cabero, E. G. Cuesta, D. D. Roure, and J. Zhao, architecture for preserving the semantics of science," Proceedings of the 2nd International Workshop on Linked Science, ISWC, Boston,
      • APIs:
        * Palma, R., Hołubowicz P., et al. A suite of API for the management of Research Objects. In Proceedings of the ISWC Developers
      • ROHUB:
        * Gómez-Pérez J.M., Palma R. Research Objects for Sharing and Exchanging Research Data and Methods in Earth Science. Poster at Assembly, April 2016
        * Palma R., Corcho O., Gómez-Pérez J.M., Mazurek, C., “ROHub – A Digital Library for Sharing and Preserving Research Objects”. Poster
        * Palma R., Corcho O., Gómez-Pérez J.M., Mazurek, C., “ROHub A Digital Library of Research Objects Supporting Scientists Towards Challenge of Proc. Extended Semantic Web Conference (ESWC), Crete, Greece, May 25-29, 2014.
        * Page K., Palma R., Hołubowicz P., Klyne G., Soiland-Reyes S., Garijo D., Belhajjame K., Mayer R., Research Objects for Audio Processing: In Proc. 53rd Audio Engineering Society International Conference on Semantic Audio, London, UK, January 27-29, 2014
        * Palma R., Corcho O., Hołubowicz P., Pérez S., Page K., Mazurek C., Digital libraries for the preservation of research methods and Workshop on the Digital Preservation of Research Methods and Artefacts (DPRMA 2013) at Joint Conference on Digital Libraries (JCDL July 2013.
      • APIs overview: https://github.com/wf4ever/apis/wiki/Wf4Ever-Services-and-APIs
      • Source code: https://github.com/rohub
      • Demo video: https://youtu.be/TxW2wvreyoQ
      • Live Instance: http://www.rohub.org/
      • New beta Portal: http://beta.rohub.org/
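
As an illustration of two ideas from the slides above (a Research Object as a uniquely identified aggregation of resources plus annotations, and checklists used to compute a completeness score), here is a minimal Python sketch. The class names, field names, resource kinds, example URIs, and the completeness formula are all assumptions made for this example; they are not the ROHub data model or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Resource:
    """One aggregated item: an internal file or a link to an external resource."""
    uri: str
    kind: str  # illustrative labels such as "data", "workflow", "document"


@dataclass
class ResearchObject:
    """Hypothetical in-memory stand-in for an RO (not the ROHub model)."""
    identifier: str  # a URI, preferably a DOI
    title: str
    resources: List[Resource] = field(default_factory=list)
    annotations: Dict[str, str] = field(default_factory=dict)

    def aggregate(self, uri: str, kind: str) -> None:
        """Add a resource to the aggregation."""
        self.resources.append(Resource(uri, kind))


def completeness(ro: ResearchObject,
                 required_kinds: List[str],
                 required_annotations: List[str]) -> float:
    """Fraction of checklist requirements the RO satisfies (0.0 to 1.0).

    Assumed formula: every requirement counts equally.
    """
    present_kinds = {r.kind for r in ro.resources}
    checks = [kind in present_kinds for kind in required_kinds]
    checks += [name in ro.annotations for name in required_annotations]
    return sum(checks) / len(checks) if checks else 1.0


# Usage: one data resource and one annotation against a four-item checklist.
ro = ResearchObject("https://example.org/ro/1", "Demo investigation")
ro.aggregate("https://example.org/ro/1/input.csv", "data")
ro.annotations["description"] = "Example description"

score = completeness(ro, ["data", "workflow"], ["description", "license"])
print(f"completeness = {score:.2f}")
```

A monitoring service along the lines of the "Quality assessment & preservation" slide could re-run such a check periodically and notify users when the score drops; that part is not shown here.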

Editor's Notes

  • Since research objects are a backbone technology behind RELIANCE, it is important to give a bit more context about them to explain the reasoning behind that choice.
    Research objects, as perhaps some of you already know, are rich information objects that aim to account, describe and share everything about your research, including how those things are related, in a way that is understandable by both users and machines.
    Research Objects have been used and demonstrated in different communities in previous projects, from bioinformatics to astronomy to earth science, and as a result they are rapidly gaining more attention as a promising research-enabling technology.
    A research object can be regarded as a logical container that has a unique identifier, e.g. a DOI, and that can encapsulate various research artefacts such as:
    Hypotheses and/or purpose of the experiment
    Data used and results produced
    Methods employed to produce and analyse data
    Scientific workflows implementing such methods
    Provenance of their executions
    Versioning information
    People involved in the investigation
    Annotations about these resources
  • The figure on the slide shows a high-level view of the RELIANCE services architecture, and their connection with EOSC and other existing services.
    * As we can see in the middle, RELIANCE services will be interconnected and complement each other, enabling scientists to use the provided functionalities and access their work from different user interfaces, using ROs as the main connecting point.

    * RELIANCE services will expose RESTful APIs and Python libraries, enabling the communication between RELIANCE and other EOSC services, as well as their use from different user interfaces.

    * DCs will be linked in the RO as first-class entities and described with a rich set of metadata (e.g., how they were generated or used), enabling efficient access to large datasets like Copernicus data while facilitating reusability and reproducibility of the mechanisms to access such data.

    * Text mining (TM) and enrichment services will automatically enrich the RO with metadata extracted from the available annotations and aggregated resources, thus increasing their findability, interoperability and reuse, and enabling the recommendation of ROs or DCs.

    * Some of these connections are already in place and will require the necessary adaptations to EOSC.
    * Also, as we can see in the diagram, RELIANCE services will be integrated into EOSC, reusing some of the core cross-cutting services as well as some advanced added-value services.
    * They may also reuse other available services like scholarly communication services, notebooks, or QCG for HPC resource allocation, and other EU-wide AAI services like eduGAIN (to which EGI Check-in is federated as a service provider).
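
The note above about Data Cubes being linked in the RO as first-class entities, identified by URI and described with metadata such as how they were generated, can be pictured with a small Python sketch. Everything here is an assumption for illustration: the manifest layout, the field names, the example URIs and the parameter values are hypothetical and do not reflect the actual ROHub or ADAM APIs.

```python
import json


def describe_data_cube(uri, description, resolution, parameters, access_links):
    """Build a metadata record for a data cube identified by its URI."""
    return {
        "uri": uri,
        "type": "DataCube",
        "description": description,
        "resolution": resolution,
        # Parameters used to generate the dataset or subset.
        "generation_parameters": dict(parameters),
        "access_links": list(access_links),
    }


def aggregate(manifest, record):
    """Return a new manifest with the record added; duplicates by URI are skipped."""
    known = {item["uri"] for item in manifest["aggregates"]}
    if record["uri"] in known:
        return manifest
    return {**manifest, "aggregates": manifest["aggregates"] + [record]}


# Hypothetical RO manifest and data cube, for illustration only.
manifest = {"id": "https://example.org/ro/demo", "aggregates": []}
cube = describe_data_cube(
    uri="https://example.org/datacubes/sst-2020",
    description="Sea surface temperature subset",
    resolution="0.05 degree",
    parameters={"bbox": [-10.0, 35.0, 5.0, 45.0],
                "start": "2020-01-01", "end": "2020-12-31"},
    access_links=["https://example.org/datacubes/sst-2020/download"],
)

manifest = aggregate(manifest, cube)
manifest = aggregate(manifest, cube)  # same URI again: nothing is added
print(json.dumps(manifest, indent=2))
```

Keeping the generation parameters next to the URI is what would let someone else re-create or re-open the same subset, which is the reuse and reproducibility point the note makes.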