6. about the COAR Next
Generation Repositories
Working Group
7.
8. Next Generation Repositories Working Group
• Eloy Rodrigues, chair (COAR,
Portugal)
• Andrea Bollini (CINECA, Italy)
• Alberto Cabezas (LA Referencia,
Chile)
• Donatella Castelli (OpenAIRE/CNR,
Italy)
• Les Carr (Southampton University,
UK)
• Leslie Chan (University of Toronto
at Scarborough, Canada)
• Rick Johnson (SHARE/University of
Notre Dame, US)
• Petr Knoth (Jisc and Open
University, UK)
• Paolo Manghi (CNR, Italy)
• Lazarus Matizirofa (NRF, South
Africa)
• Pandelis Perakakis (Open Scholar,
Spain)
• Oya Rieger (Cornell University, US)
• Jochen Schirrwagen (University of
Bielefeld, Germany)
• Daisy Selematsela (NRF, South
Africa)
• Kathleen Shearer (COAR, Canada)
• Tim Smith (CERN, Switzerland)
• Herbert Van de Sompel (Los
Alamos National Laboratory, US)
• Paul Walk (Antleaf, UK)
• David Wilcox (Duraspace/Fedora,
Canada)
• Kazu Yamaji (National
Institute of Informatics, Japan)
9. To position repositories as the
foundation for a distributed, globally
networked infrastructure for scholarly
communication…
10. objectives
• cross-repository interoperability
• encourage the emergence of added-value services
• transform the scholarly communication system by emphasising:
• collective, open and distributed management of open content
• collective innovation
11. principles
• distribution of control of scholarly resources
• inclusiveness: different institutions and regions have particular needs (e.g.
diverse languages, policies and priorities) and these must be supported
• for the public good
• intelligent openness
12. Intended outputs
• direct outputs:
• the Next Generation Working Group will collectively produce:
• reports
• conceptual models
• recommendations for particular technologies
• indirect outputs:
• some individuals independently of the Next Generation Working Group
will:
• implement software changes to repository platforms
• build infrastructure (micro-services)
13. design assumptions
• focus on resources
• not just associated metadata - treat them equally
• pragmatism
• favour the simpler approach
• evolution, not revolution
• use existing software and systems where possible
• convention over configuration
• standardise only where necessary and minimise constraints
• engage with users where they are:
• integrate into environments and systems where users are already engaged
Not all users are human, some are machines!
15. “behaviours”
• Supporting discovery of content
• exposing identifiers and links between resources
• supporting navigation
• supporting batch discovery
• actively sharing or exposing notifications
• Participating in the social network
• Global identification of people in the repository network
• Annotation, commenting and reviews - e.g. Open Peer Review
• Logging and exposing of user interaction data across repositories
• Preservation
• Supporting other processes
• Declaring licenses at a resource level
• Exposing standardised usage metrics
• Content transfer (e.g. for text and data mining)
16. user stories
as <some actor>,
I want to <do something>,
in order to gain <some benefit>
17. user stories relating to repository ‘behaviours’
Example user-stories for the behaviour “Discovery through navigation”:
• as a human or machine user, I want to easily and uniformly identify the
metadata in a repository record, so that I can ascertain the relevance
of the resource.
• as a repository manager, I want to be able to access the metadata in
my repository in real time through an API in order to build views or
services on any platform using the data.
• as a research manager (funder or institution), I want to be able to track
the research outputs related to a specific funded project to
demonstrate value and compliance with policy
19. repositories must be deeply connected
• outgoing:
• individual content resources
• directly accessible on the network
• individual metadata records
• not just in batches
• individual users
• as part of a variety of professional and social networks
• incoming:
• using all appropriate global identifier systems
• accepting automated deposit of content and data from other systems (e.g.
scientific instruments)
• allowing external services to interact with content
• content mining
• annotation services
• etc.
20. repositories need to be active
• the next generation repository needs to talk to the world
• publishing events to notification hubs and notifying users
• and to listen, and respond:
• respond to requests for content and metadata, equally
• continuously improve the information it has, adding value where it can by:
• responding to and supporting annotation and peer review
• not just allowing text/data-mining, but supporting it and benefitting from the
derived information
• supporting user workflows - providing and accepting data
21. active repositories
• repositories could become pro-active
components in an event-driven
scholarly system
• publishing ‘events’ such as the addition
of a new item to one or more
notification hubs
• third-party systems ‘subscribing’ to
these notifications - many potential
applications
• would involve very little or no effort by
repository administrators
• modest software development
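The publish/subscribe pattern described on this slide can be sketched in a few lines. This is only an illustrative in-memory toy, not a description of any existing hub: the `Event` fields, the `"item.added"` event name, the `NotificationHub` class and the example URL are all assumptions made for the sketch.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List


@dataclass(frozen=True)
class Event:
    """A repository event (hypothetical shape, for illustration only)."""
    event_type: str    # e.g. "item.added" (assumed name)
    repository: str    # identifier of the publishing repository
    resource_url: str  # URL of the affected resource


class NotificationHub:
    """Toy in-memory hub: repositories publish, third parties subscribe."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[Event], None]) -> None:
        """Register a third-party handler for one type of event."""
        self._subscribers[event_type].append(handler)

    def publish(self, event: Event) -> int:
        """Deliver the event to every subscriber; return how many were notified."""
        handlers = self._subscribers.get(event.event_type, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


# A third-party service subscribes; a repository publishes a new-item event.
received: List[Event] = []
hub = NotificationHub()
hub.subscribe("item.added", received.append)
delivered = hub.publish(
    Event("item.added", "repo-a", "https://repo.example.org/item/1")  # placeholder URL
)
```

In a real deployment the hub would be a network service rather than an object in the same process, but the division of labour is the same: the repository only announces what happened, and subscribers decide what to do with it.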
22. being of, not just on, the Web
• obvious…but not really done yet
• the ‘splash page’ requiring human
mediation is a real problem
• “signposting the scholarly web”
• link HTTP headers
• would involve very little or no effort
by repository administrators
• a small amount of software
development in repository systems
http://signposting.org
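As a rough sketch of what "link HTTP headers" could look like in practice, the helper below assembles a `Link` header value for a repository landing page. The relation types `cite-as`, `describedby` and `item` are ones used by Signposting; the `build_link_header` helper and every URL, including the DOI, are placeholders invented for this example.

```python
from typing import Dict, List, Tuple

# (target URL, link relation, extra parameters such as the media type)
LinkSpec = Tuple[str, str, Dict[str, str]]


def build_link_header(links: List[LinkSpec]) -> str:
    """Format link specs as a single HTTP Link header value."""
    parts = []
    for target, rel, attrs in links:
        params = [f'rel="{rel}"'] + [f'{key}="{value}"' for key, value in attrs.items()]
        parts.append(f"<{target}>; " + "; ".join(params))
    return ", ".join(parts)


# Links a landing ("splash") page might expose so that machines can skip it.
header = build_link_header([
    ("https://doi.org/10.1234/example", "cite-as", {}),  # placeholder DOI
    ("https://repo.example.org/item/1/metadata.json", "describedby",
     {"type": "application/json"}),
    ("https://repo.example.org/item/1/paper.pdf", "item",
     {"type": "application/pdf"}),
])
print("Link: " + header)
```

A machine client can read these relations from the response headers and go directly to the content file or the metadata record, without scraping the human-oriented page.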
24. conclusion
• the goal:
• To position repositories as the foundation for a distributed, globally
networked infrastructure for scholarly communication…
• we already have much of what is needed:
• ubiquitous distribution of open repository platforms
• the desire to challenge the status quo
to work in the square (meydan), not the tower (kule)
together, we can establish a scholarly communications
infrastructure that we can be proud of, and that our
children will thank us for!
25. Paul Walk
Director, Antleaf
Managing Director, Dublin Core Metadata Initiative (DCMI)
Web: http://www.paulwalk.net
Email: paul@paulwalk.net
Twitter: @paulwalk
www.antleaf.com
www.dublincore.org
Teşekkürler! (Thank you!)
More information:
http://bit.ly/coar-repo-ng
Editor's Notes
Thank you for inviting me to speak - it is an honour to be invited to Izmir.
I should tell you that I no longer work for the University of Edinburgh - I have decided to start my own consultancy company instead, working in the area of open access and research data management. Today I am representing COAR.
I’d like to start by proposing 3 cheers for the current generation of repositories - three important aspects
most of our repository systems are built from technology which has been in near-continuous development for more than a decade.
the community support for repository systems is considerable - look around you for the evidence of that! :-)
the resources within our repositories are under the control of our institutions, not under the control of a handful of publishers
monopoly avoidance strategy
the most important aspect from my point of view
Confederation of Open Access Repositories
international association with >100 members from 35 countries - 5 continents represented
libraries, universities, research institutions, government funding agencies etc.
the WG includes some luminaries from the world of repositories. And I'm in there too.
this is the goal - implicit in this is competition for the current global infrastructure which is largely owned and deployed by the commercial academic publishers
to support:
discovery, access, annotation, real-time curation, sharing, quality assessment, content transfer, analytics, provenance tracing, etc.
by intelligent openness, I mean actually supporting re-use, not just making something ‘open’
we are already seeing some of the repository platforms adopt some of the recommendations emerging from this work (for example the adoption of “signposting”)
this provisional list of behaviours allows us to group related technologies together and apply them to addressing user-stories
from agile development methodology - a useful way to simply frame users’ priorities
not just connected in a general sense, connected at every level:
repositories are nodes in the network
content items are nodes in the network
metadata records are nodes in the network
users are nodes in the network
and the network is The Web
This blog post is why I was invited to join the COAR working group
some interesting musing about peer-to-peer distributed control!
alternatives to high-latency aggregation
Herbert Van de Sompel & Michael Nelson
make the webpage itself both human and machine readable
resources are linked through a common vocabulary and URL that expresses the relationship between content and metadata.