This slide deck describes the work that was carried out to create a proof-of-concept for a persistent URI Service to be used by EU institutions and/or Member States for publishing Linked Open Government Data (LOGD). The work is supported by the Interoperability Solutions for European Public Administrations (ISA) Programme of the European Commission as part of its Action 1.1 on semantic interoperability.
From January to October 2014, the ISA Programme supported the work of an informal, inter-institutional Task Force on a proposal for a common policy for the management of persistent, HTTP-based Uniform Resource Identifiers (HTTP URIs) by EU institutions. This policy includes the following elements:
1. A common inter-institutional governance and management of URIs: an inter-institutional URI management body with roles, responsibilities, and a decision mechanism;
2. Common design rules for persistent URI sets: common rules for the design of persistent URI sets by EU institutions; and
3. A persistent URI Service for the europa.eu domain: a central Web service providing redirection and content negotiation mechanisms for persistent URI namespaces. This service would be responsible for the registration and management of persistent URI namespaces and the forwarding of HTTP requests (URI redirection) towards the local register.
The Persistent URI Service is the main topic of this presentation, which reports on a proof-of-concept that was carried out in July and August 2014.
EU Institutions Persistent URI Service PoC
1. Persistent URI Service for EU institutions: proof-of-concept
Zakaria Arrassi – PwC EU Services
Stijn Goedertier – PwC EU Services
Stéphane Roulier – PwC EU Services
Persistent URI Task Force Meeting
19 September 2014
2. HTTP URI: identifier and locator
Definition
• A compact sequence of characters that identifies and/or locates a resource and that follows the HTTP URI scheme
• Double use:
o as an identifier: to identify information resources, physical resources or abstract resources
o as a locator: to get (information about) a resource (HTTP GET), possibly via redirection
3. HTTP URI: identifier and locator
Examples
• A country, e.g. Belgium
BE
http://publications.europa.eu/resource/authority/country/BEL
• A concept scheme, e.g. Countries Named Authority List
http://publications.europa.eu/resource/authority/country
• A pesticide substance, e.g. lepidopteran pheromones
http://ec.europa.eu/semantic_webgate/html/dataset/pesticides/resource/substances-1894
• A contract notice, TED notice 229842
http://ted.europa.eu/udl?uri=TED:NOTICE:229842-2014:TEXT:EN:HTML
4. Persistent URI Service
Business case
Existing problem:
• Fragmented namespaces - lack of coordination: EU institutions mint their own URIs.
• Fragmentation of effort: EU institutions use their own sub-domains, virtual folders, etc. No common infrastructure.
• Lack of service-level guarantees: EU institutions don’t commit to service levels (major barrier to reuse of URIs).
Proposed solution: common policy consisting of
• Inter-institutional governance: roles, responsibilities, and decision-making process.
• Design principles: strict rules and guidelines for URI sets by EU institutions.
• Configurable, persistent URI service for europa.eu: service providing redirection and content negotiation mechanisms and enforcing common URI conventions (e.g. purl.europa.eu).
Benefits:
• Increased service levels: central, high-availability service
• Harmonisation and trustworthiness: good governance and change management
• Network effects: avoid duplication of resources and information… through the reliable reuse
5. Persistent URI Service
Existing persistent URI services
• OCLC: purl.org
• US Library of Congress: id.loc.gov
• US Government Printing Office: purl.gpo.gov
• W3ID: w3id.org
• UK Gov: data.gov.uk
• DBpedia: dbpedia.org
• DOI handlers: doi.org
6. PURI Service: Proof-of-concept
Objectives and approach
Objectives:
o Demonstrate technical feasibility with examples
o Identify additional requirements or concerns
Approach:
1. Requirement analysis
2. Comparison of existing open-source software
3. Deployment of PoC on http://uri.semic.eu
4. Configuration of sample persistent URI namespaces
7. Persistent URI Service
1. Requirement analysis
• One central register of URI namespaces: the Persistent URI Service manages the URI namespace.
• Many local registers of resources: the local registers contain the local identifier of resources for which information is kept in the register.
URI structure:
• URI namespace (centrally decided; central register of URI namespaces): http://{subdomain}.europa.eu/{namespace}/
• Tail (locally decided under central guidance; local register of resources): {local id}/{version}/{language}
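The split between the centrally managed namespace and the locally managed tail can be sketched in a few lines of Python. This is only an illustration: the `PersistentUri` class and `parse_persistent_uri` function are hypothetical names, and the sketch assumes that the namespace is a single path segment and that version and language are optional, positional tail segments. The slides do not confirm these details.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlsplit


@dataclass(frozen=True)
class PersistentUri:
    host: str                # e.g. "data.europa.eu"
    namespace: str           # centrally registered part
    local_id: str            # assigned by the local register
    version: Optional[str]   # optional tail segment (assumption)
    language: Optional[str]  # optional tail segment (assumption)


def parse_persistent_uri(uri: str) -> PersistentUri:
    """Split a URI into its centrally and locally decided parts."""
    parts = urlsplit(uri)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) < 2:
        raise ValueError(f"expected /namespace/local-id in {uri!r}")
    namespace, local_id, *rest = segments
    version = rest[0] if len(rest) > 0 else None
    language = rest[1] if len(rest) > 1 else None
    return PersistentUri(parts.netloc, namespace, local_id, version, language)


parsed = parse_persistent_uri("http://data.europa.eu/contract-notice/229842-2014")
```

Here `parsed.namespace` is what the central register would manage, while `parsed.local_id` and any further tail segments are left to the local register.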
8. Persistent URI Service
1. Requirement analysis
• Actors: user (HTTP client), PURI administrator (Persistent URI Service), namespace owner (local registry)
• Use cases:
a) Request / approve a persistent URI namespace
b) Configure redirection rules for a URI namespace
c) Submit HTTP requests on persistent URIs, which are redirected to the right local registry
d) Monitor HTTP requests
9. Persistent URI Service
1. Requirement analysis
• User (browser / HTTP client): “For data integration and application integration, I need HTTP URIs that are commonly used to identify and locate important EUI resources.”
• PURI Administrator (Persistent URI Service): “I want to provide a service that allows EUIs to request and manage persistent HTTP URI namespaces. I want to monitor service levels (incl. persistence).”
• Namespace owner (local registry): “I want to make resources in my registry available with persistent HTTP URIs that serve both as a common identifier and locator.”
10. Persistent URI Service
1. Requirement analysis
Use case: a) Request / approve a persistent URI namespace
Actors: User (browser / HTTP client), PURI Administrator (Persistent URI Service), Namespace owner (local registry)
11. Persistent URI Service
1. Requirement analysis: user scenario
1. Tenders Electronic Daily (TED) wants to have persistent URIs for contract notices (CN) and contract award notices (CAN). TED requests the persistent URI namespace http://data.europa.eu/contract-notice/
2. The PURI administrator verifies and approves the request, granting TED access to the persistent URI namespace.
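The request and approval steps in this scenario could be modelled, very roughly, as a small state change in the central register. The sketch below is a minimal in-memory illustration; `NamespaceRegister`, `Status`, and the method names are hypothetical and not taken from the proof-of-concept.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    REQUESTED = "requested"
    APPROVED = "approved"


@dataclass
class NamespaceRegister:
    """Hypothetical in-memory central register of URI namespaces."""
    entries: dict = field(default_factory=dict)  # namespace -> (owner, Status)

    def request(self, namespace: str, owner: str) -> None:
        # A namespace can only be requested once.
        if namespace in self.entries:
            raise ValueError(f"namespace {namespace!r} is already taken")
        self.entries[namespace] = (owner, Status.REQUESTED)

    def approve(self, namespace: str) -> None:
        # Performed by the PURI administrator after verifying the request.
        owner, _ = self.entries[namespace]
        self.entries[namespace] = (owner, Status.APPROVED)

    def may_configure(self, namespace: str, owner: str) -> bool:
        # Only the approved owner may configure redirection rules.
        return self.entries.get(namespace) == (owner, Status.APPROVED)


register = NamespaceRegister()
register.request("contract-notice", "TED")
pending = register.may_configure("contract-notice", "TED")
register.approve("contract-notice")
granted = register.may_configure("contract-notice", "TED")
```

A real service would also need authentication, persistence, and a way to reject requests; those are left out here.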
12. Persistent URI Service
Scope criteria
Scope criteria:
1. Authoritative source
2. Commitment of persistence
3. Inter-organisational
4. Machine-readable information
5. Existing register
Examples:
• Data models: INSPIRE data models?
• Reference data: EuroVoc, NALs, NUTS, GEMET?
• Registers: staff register, budget lines, TED, Trade Marks, FTS, Ship, …
• Documents: OJ, EUR-Lex, …
• High-value datasets
13. Persistent URI Service
1. Requirement analysis
Use case: b) Configure redirection rules for a URI namespace
Actors: User (browser / HTTP client), PURI Administrator (Persistent URI Service), Namespace owner (local registry)
14. Persistent URI Service
1. Requirement analysis: user scenario
3. TED configures a redirection rule on the URI namespace:
http://data.europa.eu/contract-notice/{$local_id}
redirect to
http://ted.europa.eu/udl?uri=TED:NOTICE:{$local_id}.
4. Five years later, TED is migrated to the CELLAR platform. TED re-configures the redirection rule:
http://data.europa.eu/contract-notice/{$local_id}
redirect to
http://cellar.europa.eu/TED:NOTICE:{$local_id}.
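The point of this scenario is that the persistent URI stays the same while only the rule's target changes. A minimal Python sketch of such a rule follows; the `RedirectRule` class is a hypothetical illustration, and it assumes the `{$local_id}` placeholder simply captures everything after the namespace.

```python
import re
from typing import Optional


class RedirectRule:
    """Hypothetical redirection rule for one URI namespace."""

    def __init__(self, namespace: str, target_template: str) -> None:
        # Everything after "/{namespace}/" is treated as the local id.
        self.pattern = re.compile(rf"^/{re.escape(namespace)}/(?P<local_id>.+)$")
        self.target_template = target_template

    def resolve(self, path: str) -> Optional[str]:
        """Return the redirect target for a request path, or None."""
        match = self.pattern.match(path)
        if match is None:
            return None
        return self.target_template.replace("{$local_id}", match.group("local_id"))


rule = RedirectRule("contract-notice",
                    "http://ted.europa.eu/udl?uri=TED:NOTICE:{$local_id}")
before = rule.resolve("/contract-notice/229842-2014")

# After the migration only the target template changes;
# the persistent URI that clients use stays the same.
rule.target_template = "http://cellar.europa.eu/TED:NOTICE:{$local_id}"
after = rule.resolve("/contract-notice/229842-2014")
```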
15. Persistent URI Service
1. Requirement analysis
Use case: c) Submit HTTP requests on persistent URIs, which are redirected to the right local registry
Actors: User (browser / HTTP client), PURI Administrator (Persistent URI Service), Namespace owner (local registry)
16. Persistent URI Service
1. Requirement analysis: user scenario
5. Each HTTP request on a persistent URI is redirected to the right local register (following the redirection rules).
HTTP client to Persistent URI Service (http://data.europa.eu/):
GET /contract-notice/229842-2014 HTTP/1.1
Host: data.europa.eu
Accept: application/rdf+xml
Persistent URI Service to HTTP client:
HTTP/1.1 303 See Other
Location: http://ted.europa.eu/udl?uri=TED:NOTICE:229842-2014
HTTP client to local register (http://ted.europa.eu/udl):
GET /udl?uri=TED:NOTICE:229842-2014 HTTP/1.1
Host: ted.europa.eu
Accept: application/rdf+xml
Local register to HTTP client:
HTTP/1.1 200 OK
Accept-Ranges: bytes
Content-Type: application/rdf+xml; charset=UTF-8
Content-Length: 1821
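The first half of this exchange, in which the service answers with 303 See Other and a Location header, can be sketched as a small WSGI application. This is an illustration under stated assumptions, not the proof-of-concept's implementation (which used Apache HTTPD): the `RULES` table and the prefix-matching logic are hypothetical.

```python
from typing import Callable, Iterable

# Hypothetical rule table: namespace path prefix -> target URI prefix.
RULES = {"/contract-notice/": "http://ted.europa.eu/udl?uri=TED:NOTICE:"}


def application(environ: dict, start_response: Callable) -> Iterable[bytes]:
    """Answer requests on persistent URIs with a 303 redirect."""
    path = environ.get("PATH_INFO", "")
    for prefix, target in RULES.items():
        if path.startswith(prefix) and len(path) > len(prefix):
            # Append the local id to the target configured for the namespace.
            location = target + path[len(prefix):]
            start_response("303 See Other",
                           [("Location", location), ("Content-Length", "0")])
            return [b""]
    body = b"No redirection rule for this URI\n"
    start_response("404 Not Found",
                   [("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body)))])
    return [body]
```

To try it locally, the application can be served with `wsgiref.simple_server.make_server("127.0.0.1", 8000, application).serve_forever()` from the standard library; the client then follows the Location header to the local register on its own.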
17. Persistent URI Service
1. Requirement analysis
Use case: d) Monitor HTTP requests
Actors: User (browser / HTTP client), PURI Administrator (Persistent URI Service), Namespace owner (local registry)
18. Persistent URI Service
1. Requirement analysis: user scenario
6. The PURI administrator uses the Persistent URI Application to monitor the incoming HTTP requests on the persistent URI namespace. The URI Technical Team discovers that URIs on the namespace of TED are no longer dereferenceable; an HTTP 404 error code is returned. The URI Technical Team reports this to the TED team. The TED team fixes these issues on the local registry.
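One way to detect the situation described in this scenario is to dereference a sample of persistent URIs periodically and flag those that do not end in a successful response. The sketch below assumes this active-checking approach, which the slides do not specify; `find_broken_uris` and `http_status` are hypothetical helper names.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def http_status(uri: str, timeout: float = 10.0) -> int:
    """Dereference a URI (following redirects) and return the final status."""
    request = urllib.request.Request(
        uri, headers={"Accept": "application/rdf+xml"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # e.g. 404 from the local registry


def find_broken_uris(uris: Iterable[str],
                     fetch_status: Callable[[str], int] = http_status
                     ) -> Dict[str, int]:
    """Return the URIs whose final HTTP status is not 2xx."""
    broken = {}
    for uri in uris:
        status = fetch_status(uri)
        if not 200 <= status < 300:
            broken[uri] = status
    return broken
```

Passing the status function as a parameter keeps the check testable without network access; the resulting dictionary is the kind of report that could be forwarded to the namespace owner.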
19. Persistent URI Service
2. Comparison of existing open-source software
Software compared: Apache HTTPD, PID, Purlz, NetKernel, Callimachus, URL shortener
Features compared:
1. Functionality
a) Request / approve URI namespace
b) Configure redirection rules for a URI namespace
- HTTP request parameters
- Input URI pattern
- Output URI pattern
- Internal redirection (proxy forwarding)
- External redirection
- Response status code
- HTTP response parameters
c) Redirection of HTTP requests
d) Monitor HTTP requests
2. Open-source license (OSI approved)
3. Maturity
4. Maintained
20. Persistent URI Service
3. Deployment on http://uri.semic.eu
• Using standard software (Apache HTTPD Server)
• Temporary domain (not data.europa.eu)
• Temporary server (Amazon Web Services micro-instance)
21. Persistent URI Service
4. Configuration of sample persistent URI namespaces
• DG EMPLOYMENT: ESCO taxonomy
o E.g. http://uri.semic.eu/id/esco/occupation/506
• DG SANCO: PPP products
o E.g. http://uri.semic.eu/pesticide-substance/1894
• DG COMM: RAPID
o E.g. http://uri.semic.eu/press-release/IP-14-780/FR
• Publications Office: TED
o E.g. http://uri.semic.eu/contract-notice/229842-2014
22. Persistent URI Service
Conclusion
• Demonstrated technical feasibility with standard Web software (Apache HTTPD Server)
• Benefits:
o Guarantees for persistence: thanks to the policy, the redirection rules, and the monitoring of service levels.
o Flexibility: local registries keep managing their own resources (including the local id).
o Speed and efficiency: easy to configure.
o Visibility: local register remains visible thanks to redirection.
23. ISA Programme Action 1.1 – Semantic Interoperability
Project Officers
Vassilios.Peristeras@ec.europa.eu
Suzanne.Wigard@ec.europa.eu
Athanasios.Karalopoulos@ec.europa.eu
Visit our initiatives
ADMS.SW
Core Public Service Vocabulary
Get involved
Follow @SEMICeu on Twitter
Join the SEMIC group on LinkedIn
Join the SEMIC community on Joinup
24. Disclaimer
This presentation was prepared for the Persistent URI Task Force by PwC EU Services. It
represents work that was commissioned by the ISA programme of the European Commission.
The views expressed in this report are purely those of the authors and may not, in any
circumstances, be interpreted as stating an official position of the European Commission.
The European Commission does not guarantee the accuracy of the information included in this
study, nor does it accept any responsibility for any use thereof.
Reference herein to any specific products, specifications, process, or service by trade name,
trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement,
recommendation, or favouring by the European Commission.
All care has been taken by the author to ensure that s/he has obtained, where necessary,
permission to use any parts of manuscripts including illustrations, maps, and graphs, on which
intellectual property rights already exist from the titular holder(s) of such rights or from her/his or
their legal representative.
Editor's Notes
Definition HTTP URIs: “a compact sequence of characters that identifies a resource and that follows the HTTP URI scheme”
HTTP URIs can be used both as an identifier to identify physical and abstract resources and as a link to get (information about) a resource.
Applications (machines) can now retrieve machine-readable data from TED using a stable identifier.
http://uri.semic.eu/contract-notice/190403-2009
Scalability: can be assured via standard Web solutions (e.g. load balancing).