Next generation repositories

Paul Walk
Director, Antleaf
Managing Director, Dublin Core Metadata Initiative (DCMI)
Web: http://www.paulwalk.net
Email: paul@paulwalk.net
Twitter: @paulwalk
www.antleaf.com www.coar-repositories.org
Next Generation Institutional Repositories

3 cheers for the current generation of repositories!

cheer #1:
proven technology,
ubiquitous in our
institutions

cheer #2:
strong community
support

cheer #3:
distributed policy
control

about the COAR Next
Generation Repositories
Working Group

Next Generation Repositories Working Group
• Eloy Rodrigues, chair (COAR,
Portugal)
• Andrea Bollini (CINECA, Italy)
• Alberto Cabezas (LA Referencia,
Chile)
• Donatella Castelli (OpenAIRE/CNR,
Italy)
• Les Carr (Southampton University,
UK)
• Leslie Chan (University of Toronto
at Scarborough, Canada)
• Rick Johnson (SHARE/University of
Notre Dame, US)
• Petr Knoth (Jisc and Open
University, UK)
• Paolo Manghi (CNR, Italy)
• Lazarus Matizirofa (NRF, South
Africa)
• Pandelis Perakakis (Open Scholar,
Spain)
• Oya Rieger (Cornell University, US)
• Jochen Schirrwagen (University of
Bielefeld, Germany)
• Daisy Selematsela (NRF, South
Africa)
• Kathleen Shearer (COAR, Canada)
• Tim Smith (CERN, Switzerland)
• Herbert Van de Sompel (Los
Alamos National Laboratory, US)
• Paul Walk (Antleaf, UK)
• David Wilcox (Duraspace/Fedora,
Canada)
• Kazu Yamaji (National
Institute of Informatics, Japan)

To position repositories as the
foundation for a distributed, globally
networked infrastructure for scholarly
communication…

objectives
• cross-repository interoperability
• encourage the emergence of added-value services
• transform the scholarly communication system by emphasising:
• collective, open and distributed management of open content
• collective innovation

principles
• distribution of control of scholarly resources
• inclusiveness: different institutions and regions have particular needs (e.g.
diverse languages, policies and priorities) and these must be supported
• for the public good
• intelligent openness

Intended outputs
• direct outputs:
• the Next Generation Working Group will collectively produce:
• reports
• conceptual models
• recommendations for particular technologies
• indirect outputs:
• some individuals, independently of the Next Generation Working Group,
will:
• implement software changes to repository platforms
• build infrastructure (micro-services)

design assumptions
• focus on resources
• not just associated metadata - treat them equally
• pragmatism
• favour the simpler approach
• evolution, not revolution
• use existing software and systems where possible
• convention over configuration
• standardise only where necessary and minimise constraints
• engage with users where they are:
• integrate into environments and systems where users are already engaged
Not all users are human: some are machines!

repository ‘behaviours’
and user-stories

“behaviours”
• Supporting discovery of content
• exposing identifiers and links between resources
• supporting navigation
• supporting batch discovery
• actively sharing or exposing notifications
• Participating in the social network
• Global identification of people in the repository network
• Annotation, commenting and reviews - e.g. Open Peer Review
• Logging and exposing of user interaction data across repositories
• Preservation
• Supporting other processes
• Declaring licenses at a resource level
• Exposing standardised usage metrics
• Content transfer (e.g. for text and data mining)

user stories
as <some actor>,
I want to <do something>,
in order to gain <some benefit>
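
The template above can be treated as a small data structure, which makes it easier to collect user stories and group them by behaviour. A minimal sketch: the class name `UserStory` and its field names are hypothetical, and the example story is a paraphrase of one of the working group's stories rather than a quotation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserStory:
    """One user story; the field names follow the three slots of the template."""
    actor: str    # e.g. "a repository manager"
    action: str   # what the actor wants to do
    benefit: str  # what the actor gains

    def render(self) -> str:
        # Fill the three slots of the template in their original order.
        return (
            f"as {self.actor}, I want to {self.action}, "
            f"in order to gain {self.benefit}"
        )


# Illustrative story only (paraphrased, not an official working group text).
story = UserStory(
    actor="a repository manager",
    action="access the metadata in my repository in real time through an API",
    benefit="the ability to build views or services on any platform",
)
rendered = story.render()
```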

user stories relating to repository ‘behaviours’
Example user-stories for the behaviour “Discovery through navigation”:
• as a human or machine user, I want to easily and uniformly identify the
metadata in a repository record, so that I can ascertain the relevance
of the resource.
• as a repository manager, I want to be able to access the metadata in
my repository in real time through an API in order to build views or
services on any platform using the data.
• as a research manager (funder or institution), I want to be able to track
the research outputs related to a specific funded project to
demonstrate value and compliance with policy.

characteristics of the next
generation repository

repositories must be deeply connected
• outgoing:
• individual content resources
• directly accessible on the network
• individual metadata records
• not just in batches
• individual users
• as part of a variety of professional and social networks
• incoming:
• using all appropriate global identifier systems
• accepting automated deposit of content and data from other systems (e.g.
scientific instruments)
• allowing external services to interact with content
• content mining
• annotation services
• etc.

repositories need to be active
• the next generation repository needs to talk to the world
• publishing events to notification hubs and notifying users
• and to listen, and respond:
• respond to requests for content and metadata, equally
• continuously improve the information it has, adding value where it can by:
• responding to and supporting annotation and peer review
• not just allowing text/data-mining, but supporting it and benefitting from the
derived information
• supporting user workflows - providing and accepting data

active repositories
• repositories could become pro-active
components in an event-driven
scholarly system
• publishing ‘events’ such as the addition
of a new item to one or more
notification hubs
• third-party systems ‘subscribing’ to
these notifications - many potential
applications
• would involve very little or no effort by
repository administrators
• modest software development

being of, not just on, the Web
• obvious…but not really done yet
• the ‘splash page’ requiring human
mediation is a real problem
• “signposting the scholarly web”
• link HTTP headers
• would involve very little or no effort
by repository administrators
• a small amount of software
development in repository systems
http://signposting.org
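
To show what a machine gains from link HTTP headers, here is a minimal sketch of a client reading a `Link` header returned with a splash page. Assumptions: the header value, the example.org URLs and the DOI are invented for illustration, and the parser is deliberately simplified (it does not handle commas or semicolons inside quoted parameter values). The relation types `item`, `describedby` and `cite-as` are among those used by Signposting.

```python
import re
from typing import Dict, List

# One link-value: <target> followed by zero or more ";name=value" parameters.
_LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,<]+)*)')
# One parameter, with either a quoted or an unquoted value.
_PARAM = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^\s;,]+))')


def parse_link_header(value: str) -> List[Dict[str, str]]:
    """Split a Link header value into dicts with 'target' plus its parameters."""
    links = []
    for target, params in _LINK_VALUE.findall(value):
        link = {"target": target}
        for name, quoted, bare in _PARAM.findall(params):
            link[name.lower()] = quoted or bare
        links.append(link)
    return links


def targets_for(links: List[Dict[str, str]], rel: str) -> List[str]:
    """Return the targets whose rel parameter includes the given relation type."""
    return [link["target"] for link in links if rel in link.get("rel", "").split()]


# Hypothetical header a repository might send with a splash page.
EXAMPLE_HEADER = (
    '<https://repository.example.org/item/1/paper.pdf>; rel="item"; type="application/pdf", '
    '<https://repository.example.org/item/1/metadata.xml>; rel="describedby"; type="application/xml", '
    '<https://doi.org/10.1234/example>; rel="cite-as"'
)

links = parse_link_header(EXAMPLE_HEADER)
content_urls = targets_for(links, "item")
metadata_urls = targets_for(links, "describedby")
```

With headers like these, a machine can go from the splash page straight to the content file and the metadata record without scraping the HTML.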

content, metadata and people
Diagram by Herbert Van de Sompel

conclusion
• the goal:
• To position repositories as the foundation for a distributed, globally
networked infrastructure for scholarly communication…
• we already have much of what is needed:
• ubiquitous distribution of open repository platforms
• the desire to challenge the status quo
to work in the square (meydan), not the tower (kule)
together, we can establish a scholarly communications
infrastructure that we can be proud of, and that our
children will thank us for!

Paul Walk
Director, Antleaf
Managing Director, Dublin Core Metadata Initiative (DCMI)
Web: http://www.paulwalk.net
Email: paul@paulwalk.net
Twitter: @paulwalk
www.antleaf.com www.dublincore.org
Thank you!
More information:
http://bit.ly/coar-repo-ng
