Eloy Rodrigues, Petr Knoth & Kathleen Shearer showcase the conceptual model for this vision, as well as the role and functions of repositories within this model.
Workshop title: Building a global knowledge commons - ramping up repositories to support widespread change in the ecosystem
Workshop abstract:
The extensive international deployment of repository systems in higher education and research institutions, as well as scholarly communities, provides the foundation for a distributed, globally networked infrastructure for scholarly communication. This distributed network of repositories can and should be a powerful tool to promote the transformation of the scholarly communication ecosystem. However, repository platforms are still using technologies and protocols designed almost twenty years ago, before the boom of the web and the dominance of Google, social networking, semantic web and ubiquitous mobile devices. In April 2016, the Confederation of Open Access Repositories (COAR) launched a working group to help identify new functionalities and technologies for repositories and develop a road map for their adoption. For the past several months, the group has been working to define a vision for repositories and sketch out the priority user stories and scenarios that will help guide the development of new functionalities. The results of this work will be available in the summer of 2017.
This workshop will present the functionalities and technologies for the next generation of repositories and reflect on how these functionalities will be adopted into the existing software platforms. In addition, participants will discuss the important implications for the network layers, and how repositories will uniformly interact with the networks to provide value added services on top of their content.
DAY 3 - PARALLEL SESSION 6 & 7
http://www.opensciencefair.eu/workshops/parallel-day-3-1/building-a-global-knowledge-commons-ramping-up-repositories-to-support-widespread-change-in-the-ecosystem
OSFair2017 Workshop | Building a global knowledge commons - ramping up repositories to support widespread change in the ecosystem
1. Building a global knowledge commons -
ramping up repositories to support
widespread change in the ecosystem
Eloy Rodrigues, COAR and U Minho
Petr Knoth, CORE
Kathleen Shearer, COAR
With significant input from COAR Next
Generation Working Group
2. Agenda for today
1. Brief Introduction
2. NGR - User stories and functionalities
3. Draft conceptual model
4. Repository technologies
5. Next steps, implementation and adoption
6. Topics for discussion
4. Who is COAR?
• An international association founded in 2009
• Members & Partners: over 120 institutions from 35 countries in Africa, Asia, Australasia, Europe, North and South America
Objectives:
• Strategic voice for repositories
• Interoperability and alignment across regions
• Capacity building
• Support the development of value added services
5. Next generation repositories
– Working Group launched in April 2016
– The problem: Repositories have not fully realized their potential and function mainly as passive, siloed recipients of the final versions of their users’ conventionally published research outputs
– Aim: to identify functionalities and architectures for the next generation repositories within the context of scholarly communication
6. Vision
To position repositories as the foundation for a
distributed, globally networked infrastructure for scholarly
communication, on top of which layers of value added
services will be deployed, thereby transforming the
system, making it more research-centric, open to and
supportive of innovation, while also collectively managed
by the scholarly community.
7. Objectives
•To achieve a level of cross-repository interoperability by
exposing uniform behaviours across repositories that leverage
web-friendly technologies and architectures, and by integrating
with existing global scholarly infrastructures specifically those
aimed at identification of e.g. contributions, research data,
contributors, institutions, funders, projects.
•To encourage the emergence of added-value services that use
these uniform behaviours to support discovery, access,
annotating, real-time curating, sharing, quality assessment,
content transfer, analytics, provenance tracing, etc.
8. Next Generation Repositories Working Group
Eloy Rodrigues, chair (COAR, Portugal)
Andrea Bollini (4Science, Italy)
Alberto Cabezas (LA Referencia, Chile)
Donatella Castelli (OpenAIRE/CNR, Italy)
Les Carr (Southampton University, UK)
Leslie Chan (University of Toronto at Scarborough, Canada)
Chuck Humphrey (Portage, Canada)
Rick Johnson (SHARE/University of Notre Dame, US)
Petr Knoth (Open University, Jisc, UK)
Paolo Manghi (CNR, Italy)
Lazarus Matizirofa (NRF, South Africa)
Pandelis Perakakis (Open Scholar, Spain)
Jochen Schirrwagen (University of Bielefeld, Germany)
Daisy Selematsela (NRF, South Africa)
Kathleen Shearer (COAR, Canada)
Tim Smith (CERN, Switzerland)
Herbert Van de Sompel (Los Alamos National Laboratory, US)
Paul Walk (EDINA, UK)
David Wilcox (Duraspace/Fedora, Canada)
Kazu Yamaji (National Institute of Informatics, Japan)
9. Principles
• Distribution of control
• Inclusiveness
• Public good
• Intelligent openness
• Sustainability
10. Design assumptions
• Focus on the resources themselves, not just
associated metadata
• Pragmatism
• Evolution, not revolution
• Convention over configuration
• Engage with users where they are
11. Next generation repositories
Methodology
1.Identify major use cases
2.Determine functionalities/behaviours
3.Develop conceptual models
4.Define technologies and architectures
5.Publish recommendations
6.Support adoption and implementation
12. Initial Outcomes
12 User Stories made available for public comment
from February 7 – March 3, 2017
• More than 60 comments received
• Revised version produced
• Technical recommendations being developed
based on the user stories
13. Current Recommendations
12 User Stories made available for public comment
from February 7 – March 3, 2017
• More than 60 comments received
• Revised version produced
• Technical recommendations being developed
based on the user stories
15. Next generation repositories working group
The aim of this activity is to develop a global network
of repositories that allows frictionless access to
open content and encourages the creation of
cross-repository added-value services.
16. User stories
• Data mining
• Discovering metadata that describe a
scholarly resource
• Discovering the identifier of a
scholarly resource
• Discovering usage rights
• Resource syncing and notification
• Recognizing the user
• Commenting & annotating
• Providing a social notification feed
• Recommender systems for repositories
• Preservation
• Peer-review
• Comparing usage
https://www.coar-repositories.org/files/COAR-Next-Generation-Repositories-February-7-2017.pdf
17. [Diagram comparing two layered stacks]
Current repositories
• Persistence layer
• Interoperability: Metadata
• Conceptual layer
• Services we can develop with repositories today
Next generation repositories
• Persistence layer
• Interoperability: Metadata; Content; Links between resources; Usage interactions and metrics; Notifications; Global sign-on; Comments; Peer-reviews; Messages
• Conceptual layer
• Services we can develop with the next generation of repositories
18. User stories and priority areas
Priority areas:
• Discovery and exposing resources: Batch, Navigation, Notification
• Research workflows and lifecycle: Annotation, Commenting, Social interaction
• Research evaluation: Peer review, Metrics
User stories:
• Data mining
• Discovering metadata that describe
a scholarly resource
• Discovering the identifier of a
scholarly resource
• Discovering usage rights
• Resource syncing and notification
• Recognizing the user
• Commenting & annotating
• Providing a social notification feed
• Recommender systems for
repositories
• Preservation
• Peer-review
• Comparing usage
19. Beyond the metadata record
»Content! (manuscript, data)
»Links (citations, data citations, relatedness,
versions, etc.)
»User interaction data
»Comments
»Messages
»Peer-reviews
»Annotations
20. Three vertical discovery mechanisms
»Batch – Transferring bulk data
»Navigation – Helping robots to find resources in
repositories by means of navigation
»Notification – Enabling robots to subscribe to
changes in repositories
21. Global sign on
As a user, I want my repository to recognize me and
other users so that I can be connected with other users
who I know, leave comments and be informed of
content that is of interest to me
22. Transparent social network over repositories
»What are the components:
›Annotation
›Commenting/social
interaction
›Notification feed
›Recommender systems
»Novelty:
›All this in a transparent and
distributed environment
23. Research evaluation
»Comparing usage:
›addressing the fact that altmetrics do not yet work in the OA world
›Discovery is a key repository function that takes priority over metrics accuracy!
»Open citations
»Peer-reviews
24. Conclusions
»COAR NGR WG wants to see repositories succeed.
To achieve that, repository technology needs
to be competitive with commercial offerings.
26. …repositories are nodes in a larger
network, contributing their collective
contents to a global knowledge
commons on top of which value added
services can be built.
27. Conceptual Model - Core Concepts
• Sets of resources each with a URI
• Resources may or may not be connected
• Connected resources may or may not be physically stored
in the same repository
• Many different services can mirror, utilize, enhance, enrich,
extend, derive (etc.) from the resources
• Enable reimagined User Services to interface with global
set of resource(s) accessible via URI(s)
• Indexes facilitating global registry of objects may or may
not be centralized and/or be distributed systems acting
together. (TBD)
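The core concepts above can be pictured with a small data-structure sketch. This is only an illustration, not part of the COAR model itself: the `Resource` class, the relation names, the example URIs and the assumption that a URI's host name identifies the repository holding the resource are all hypothetical.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class Resource:
    """A resource identified by a URI, with typed links to other resources."""
    uri: str
    # relation name -> list of target URIs (relation names are illustrative)
    links: dict = field(default_factory=dict)


def repository_of(uri):
    """Assumption for this sketch: the host part of the URI names the repository."""
    return urlparse(uri).netloc


def cross_repository_links(resources):
    """Return (source, relation, target) triples whose ends live in different repositories."""
    found = []
    for resource in resources:
        for relation, targets in resource.links.items():
            for target in targets:
                if repository_of(target) != repository_of(resource.uri):
                    found.append((resource.uri, relation, target))
    return found


article = Resource(
    "https://repo-a.example.org/article/1",
    {
        "cites": ["https://repo-b.example.org/dataset/9"],
        "hasVersion": ["https://repo-a.example.org/article/1/v2"],
    },
)
dataset = Resource("https://repo-b.example.org/dataset/9")  # no outgoing links

crossing = cross_repository_links([article, dataset])
print(crossing)
```

The point of the sketch is that a link is just a URI, so a service built on top of the network does not need to know in advance whether the connected resource is stored locally or elsewhere.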
29. A new level of interoperability
In the past years we have focused on
interoperability at the Repository level, now
we need interoperability at the resource level
(and below)
Resources need to talk to each other to be
reusable, this in turn will make the
repositories the basis of a global scholarly
ecosystem
Andrea Bollini, 4Science / David Wilcox, Duraspace – OR 2017 – Brisbane, Australia
30. IIIF - http://iiif.io/
IIIF International Image Interoperability
Framework is a set of shared APIs to
provide access to image based resources
in a strongly interoperable way.
It is growing in adoption and scope
covering now also audio/video and 3D
objects.
31. IIIF - http://iiif.io/
What Is an Interoperable Resource
• Discoverable
• Viewable via APIs
• Interactive and Manipulable (for tools, analytics)
• Citable / Shareable
• Mash Up-able
• Annotation-ready
• With attribution, license and links (back to the
image in local context)
Credits: Tom Cramer
International Image Interoperability Framework
32. Why is it relevant in the context of the NGR
work?
It is a concrete example of technology that enables
interoperability at the resource level
You can combine resources hosted in different
repositories at any level of granularity:
- Single images in a set
- Region of a specific image
Other repositories can host additional related
resources like web annotation, comments, etc.
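As a rough illustration of addressing a region of a specific image, the sketch below builds an IIIF Image API request URL following the `{identifier}/{region}/{size}/{rotation}/{quality}.{format}` pattern. The server base URL and the image identifier are made up, and the `max` size keyword assumes a server implementing Image API 2.1 or later.

```python
from urllib.parse import quote


def pixel_region(x, y, width, height):
    """Region parameter for a rectangle given in pixels: 'x,y,w,h'."""
    return f"{x},{y},{width},{height}"


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API URL; the identifier is percent-encoded."""
    return (
        f"{base.rstrip('/')}/{quote(identifier, safe='')}"
        f"/{region}/{size}/{rotation}/{quality}.{fmt}"
    )


# Hypothetical server and identifier, for illustration only.
BASE = "https://images.example.org/iiif"

whole_image = iiif_image_url(BASE, "ms5-f12")
miniature = iiif_image_url(BASE, "ms5-f12", region=pixel_region(100, 200, 300, 400))

print(whole_image)
print(miniature)
```

Because the region is part of the URL, any other repository or viewer can reference just that part of the image without copying it.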
33. Le manuscrit 5 de la Bibliothèque
municipale de Châteauroux, c. 1460
Folio in BVMM
Miniature in the BNF
Credits: Tom Cramer
International Image Interoperability Framework
35. IIIF and Repositories
• Several projects are exploring the use of IIIF technologies
in the repositories software (DSpace, Fedora, Hydra)
https://wiki.duraspace.org/display/DSPACE/IIIF+and+DSpace
• Don’t miss my presentation:
DSpace for Cultural Heritage: adding support for
images visualization,audio/video streaming and
enhancing the data model
Session: DSpace IG 3: Integrating DSpace
Room: Ballroom C
Session time: 29/Jun/2017, 3:30pm - 5:00pm
36. Dataset – OpenDATA
• Datasets need to be usable: preview,
sampling, visualization, remote computation &
more
• OpenDATA: standard formats & APIs are
required. CKAN automatically provides a REST web
service on top of your tabular data, now also
available to DSpace users thanks to the open source
DSpace-CKAN integration by 4Science
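To give a feel for what a REST service over tabular data looks like, here is a minimal sketch that builds a request for CKAN's `datastore_search` action and reads the records out of a response. The site URL, resource id and sample response are invented for illustration, and nothing here is specific to the DSpace-CKAN integration; no network call is made.

```python
import json
from urllib.parse import urlencode


def datastore_search_url(site_url, resource_id, limit=5, filters=None):
    """Build a CKAN DataStore search URL (action API, version 3)."""
    params = {"resource_id": resource_id, "limit": limit}
    if filters:
        # CKAN expects the filters as a JSON-encoded dictionary.
        params["filters"] = json.dumps(filters)
    return f"{site_url.rstrip('/')}/api/3/action/datastore_search?{urlencode(params)}"


def records_from_response(body):
    """Extract the table rows from a CKAN action API JSON response."""
    payload = json.loads(body)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    return payload["result"]["records"]


# Hypothetical site and resource id.
url = datastore_search_url("https://ckan.example.org", "abc-123", limit=2)

# Hand-written sample in the shape CKAN returns, used instead of a live call.
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {"records": [{"_id": 1, "year": 2016}, {"_id": 2, "year": 2017}]},
})
rows = records_from_response(SAMPLE_RESPONSE)

print(url)
print(rows)
```

Previewing or sampling a dataset then amounts to requesting a small `limit` rather than downloading the whole file.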
37. Signposting - http://signposting.org/
Signposting is an approach to make the scholarly web more friendly
to machines by exposing relations as Typed Links in HTTP Link headers
The following discovery patterns are currently defined:
• Author
• Bibliographic Metadata
• Identifier
• Publication Boundary
• Resource Type
The Signposting approach is fully aligned with hypermedia (REST,
HATEOAS) lines of thinking regarding web interoperability.
(DSpace7 REST – Fedora API)
38. Signposting - http://signposting.org/
As an example, Herbert Van de Sompel and Michael
L. Nelson are the authors of the paper with
DOI https://doi.org/10.1045/november2015-vandesompel; their
respective ORCIDs are http://orcid.org/0000-0002-0715-6126
and http://orcid.org/0000-0003-3749-8116
39. Signposting - http://signposting.org/
curl -I "https://doi.org/10.1045/november2015-vandesompel"
HTTP/1.1 303 See Other
Location: http://www.dlib.org/dlib/november15/vandesompel/11vandesompel.html
Link: <http://orcid.org/0000-0002-0715-6126> ; rel="author",
<http://orcid.org/0000-0003-3749-8116> ; rel="author"
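A machine client would consume the Link header from the response above roughly as follows. This is a simplified sketch: the header value is copied from the curl output, the function names are illustrative, and the regular-expression parser handles only the simple `<URI> ; rel="..."` form shown here rather than every case the HTTP Link header syntax allows.

```python
import re

# Link header value as returned in the curl example above.
LINK_HEADER = (
    '<http://orcid.org/0000-0002-0715-6126> ; rel="author", '
    '<http://orcid.org/0000-0003-3749-8116> ; rel="author"'
)


def parse_typed_links(header_value):
    """Return (target_uri, relation_type) pairs from a Link header value."""
    links = []
    # Each link is a <URI> followed by its parameters, up to the next '<'.
    for match in re.finditer(r"<([^>]*)>([^<]*)", header_value):
        target, params = match.group(1), match.group(2)
        rel = re.search(r'rel\s*=\s*"?([^";,]+)"?', params)
        if rel:
            # A rel value may carry several space-separated relation types.
            for relation_type in rel.group(1).split():
                links.append((target, relation_type))
    return links


def links_with_rel(header_value, relation_type):
    """Targets of all links that carry the given relation type."""
    return [t for t, r in parse_typed_links(header_value) if r == relation_type]


authors = links_with_rel(LINK_HEADER, "author")
print(authors)
```

The same function would serve the other patterns by asking for a different relation type, since each pattern is expressed as a typed link.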
40. Signposting - http://signposting.org/
• The new versions of DSpace-CRIS 5.7 & 6.1
ship with support for the following patterns:
– Author
– Identifier
– Publication Boundary
• An issue has been opened to track this
requirement for DSpace 7 as well
https://jira.duraspace.org/browse/DS-3589
41. A reflection on the current
repositories data model
• A revision of the current data model is
needed
• Precise identification of persons,
organizations, projects, concepts and linked
resources (dataset, different versions etc.)
• Avoid loss of detail to allow fine-grained and
effective interoperability
42. ResourceSync -
http://www.openarchives.org/rs/1.1/resourcesync
• Successor of the OAI-PMH protocol and
much more…
• Faster, reliable and scalable
• Allows real-time notification (and recovery
of missed messages)
• Drives resource synchronization: content
and metadata are both managed
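ResourceSync documents are built on the Sitemap XML format, so a harvester can read them with ordinary XML tooling. The sketch below parses a small hand-written Resource List and picks out the resources modified after a given time. The repository URLs and timestamps are invented, and a real synchronization client would also follow Change Lists and verify content, which this sketch does not attempt.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

# Hand-written sample Resource List; URLs and dates are illustrative.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="resourcelist" at="2017-06-15T00:00:00Z"/>
  <url>
    <loc>http://repo.example.org/item/1/paper.pdf</loc>
    <lastmod>2017-05-20T10:00:00Z</lastmod>
    <rs:md type="application/pdf"/>
  </url>
  <url>
    <loc>http://repo.example.org/item/1/data.csv</loc>
    <lastmod>2017-06-10T08:30:00Z</lastmod>
    <rs:md type="text/csv"/>
  </url>
</urlset>"""


def parse_time(value):
    """Parse a UTC timestamp of the form used in the sample."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def read_resource_list(xml_text):
    """Return the declared capability and a list of resource descriptions."""
    root = ET.fromstring(xml_text)
    capability = root.find("rs:md", NS).get("capability")
    resources = []
    for url in root.findall("sm:url", NS):
        md = url.find("rs:md", NS)
        resources.append({
            "uri": url.findtext("sm:loc", namespaces=NS),
            "lastmod": parse_time(url.findtext("sm:lastmod", namespaces=NS)),
            "type": md.get("type") if md is not None else None,
        })
    return capability, resources


def changed_since(resources, cutoff):
    """URIs of resources modified after the cutoff time."""
    return [r["uri"] for r in resources if r["lastmod"] > cutoff]


capability, resources = read_resource_list(SAMPLE)
to_fetch = changed_since(resources, parse_time("2017-06-01T00:00:00Z"))
print(capability, to_fetch)
```

Note that the list points at the content files themselves (a PDF and a CSV here), which is what distinguishes this approach from harvesting metadata records alone.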
43. ResourceSync -
http://www.openarchives.org/rs/1.1/resourcesync
A first implementation of ResourceSync for
DSpace was produced in past years:
https://github.com/CottageLabs/DSpaceResourceSync
A ticket now exists to resume this
implementation and possibly include it in the
mainline code:
https://jira.duraspace.org/browse/DS-3590
44. ResourceSync -
http://www.openarchives.org/rs/1.1/resourcesync
The Hydra-in-a-box team tested ResourceSync
with the Hyku repository:
http://hydrainabox.samvera.org/2017/06/22/resourcesync.html
ResourceSync shows great promise and the team
will continue working toward an implementation
45. Next Steps
1. Refine the Conceptual Model(s) - for different
stakeholder communities
2. Publish recommended technologies (September 2017)
3. Promote adoption of new technologies into repository
platforms
4. Support upgrading and adoption of NGRs at the local
level
5. Facilitate the development of network services on top
of repositories
46. Discussion topics
1. What is the best way to engage with the broader community about our vision?
2. How can we get widespread adoption of these functionalities in repositories?
3. What are the most important value added layers to start building on top of repositories?