This presentation gives details on technologies and approaches towards exploiting Linked Data by building LD applications. In particular, it gives an overview of popular existing applications and introduces the main technologies that support implementation and development. Furthermore, it illustrates how data exposed through common Web APIs can be integrated with Linked Data in order to create mashups.
2. Motivation: Music!
• Our aim: build a music-based portal using Linked Data technologies. (CH 1)
• So far, we have studied different mechanisms to consume Linked Data (CH 2, CH 3):
  • Executing SPARQL queries
  • Dereferencing URIs
  • Downloading RDF dumps
  • Extracting RDFa data
• The output of these mechanisms is displayed to the user by applying visualization techniques. (CH 4)
EUCLID – Building Linked Data applications
3. Motivation: Music!
[Figure: architecture of the music application. Application: Visualization Module, Analysis & Mining Module. Data acquisition: LD Dataset Access, SPARQL Endpoint, RDFa. Integration steps: Vocabulary Mapping, Interlinking, Cleansing, leading to the Integrated Dataset and Publishing. Source access: Physical Wrapper, LD Wrappers, R2R Transf., RDF/XML. Sources: Streaming providers, Downloads, Musical Content, Metadata, Other content.]
4. Agenda
1. Characterization of Linked Data applications
2. Linked Data application architecture
3. Linked Data application development frameworks
4. Using Web APIs
6. Linked Data Application
[Figure: an LD app consumes LD, manipulates and produces LD, and is a Web app.]
LD applications have three main parts:
• Consumes Linked Data: Systems that only consume LD are considered
mashups. Consuming LD does not necessarily mean that sources expose
RDF-based data. The app may use wrappers to transform the data into
Linked Data.
• Manipulates/Produces Linked Data: Performs updates to RDF data
and makes the data accessible on the Web of Data.
• Web App/Interface: Often operates on the Web. Makes it easy to integrate and export data.
Source: M. Hausenblas. “Linked Data Applications”
7. Categories of
Linked Data Applications
According to their usage, most current Linked Data applications fall into one of the following categories:
• Generic Linked Data browsers: Dereference URIs to retrieve the resource description. Consume and expose Linked Data. (CH 4)
Examples: Sig.ma, Sindice, Marbles, etc.
• Linked Data search engines: Allow the user to submit queries. Consume and republish the retrieved data. (CH 4)
Examples: Swoogle, Watson, etc.
• Domain-specific Linked Data applications: Built for specific purposes.
Examples: see later.
Source: M. Hausenblas. “Linked Data Applications”
Source: M. Martin and S. Auer. “Categorisation of Semantic Web Applications”
8. Categories of
Linked Data Applications (2)
Furthermore, Linked Data applications can be classified
according to the following dimensions:
Dimension | Level | Description
Semantic technology depth | Extrinsic | Use of semantics on the surface of the application.
Semantic technology depth | Intrinsic | Conventional technologies (e.g., RDBMS) are complemented or replaced with SW equivalents.
Information flow direction | Consuming | LD is retrieved from the source or via a wrapper.
Information flow direction | Producing | Publishes LD (in RDF-based formats).
Semantic richness | Shallow | Simple taxonomies, use of RDF or RDFS.
Semantic richness | Strong | High-level representation formalisms (OWL variants).
Semantic integration | Isolated | Creation of own vocabularies.
Semantic integration | Integrated | Reuse of information at schema or instance level.
Source: M. Martin and S. Auer. “Categorisation of Semantic Web Applications”
9. Example:
Data.gov.uk
• Provides a data catalog about UK’s governmental information.
Source: http://data.gov.uk
10. Example:
Data.gov.uk (3)
• A catalog of applications is available at the website
Source: http://data.gov.uk/apps
11. Example:
Data.gov
• Provides a catalog about US governmental data.
Source: http://catalog.data.gov/dataset
12. Example:
Data.gov (2)
• App developers can build applications on top of the Data.gov data sets available at: https://catalog.data.gov/dataset
• Their platform also provides a set of applications built on top
of these data sets
Mobile Apps
Web Apps
Source: http://www.data.gov/research/page/research-apps
13. Example: BBC –
Dynamic Semantic Publishing
• The BBC DSP architecture aims at
automating aggregation and
publishing of interrelated content
within the BBC portal.
• Journalists are able to semantically
annotate content with LD concepts
through the Graffiti tool.
Graffiti tool
• OWLIM triple store is used to keep
the RDF data and to perform
reasoning over the data.
Source: http://www.bbc.co.uk/blogs/bbcinternet/2012/04/sports_dynamic_semantic.html
14. Example:
ResearchSpace
• The ResearchSpace environment aims to provide a set of RDF data sets and tools to describe concepts and objects related to cultural-historical research.
• The tools are highly interactive: they allow users to access the data and contribute to the data set by creating RDF annotations.
[Figure: screenshots of the Image Annotation and Geo Mapper tools.]
Source: https://sites.google.com/a/researchspace.org/researchspace/
15. Example:
ResearchSpace (2)
The ResearchSpace infrastructure
[Figure: infrastructure diagram. Labels: RDF data is accessed via SPARQL or Sesame OpenRDF API; Implement GUI requirements; Store and serve multiresolution images and titles; User Interface.]
Source: https://confluence.ontotext.com/display/ResearchSpace/RS+Infrastructure
16. Example:
ResearchSpace CRM Search System
Search by predicates
Faceted
search
Source: Snapshot from https://www.youtube.com/watch?v=HCnwgq6ebAs
17. Example:
Open Pharmacology Space
• OPS is a platform that aims at integrating pharmacological
data available in open standards.
• The OPS platform offers an API to access its data.
• The following applications have been built on top of OPS:
• Open PHACTS Explorer: Allows browsing the OPS data.
• ChemBioNavigator: Visualizes the composition of a molecule group.
• PharmaTrek: Allows navigating the content of ChEMBL.
Source: http://www.openphacts.org/open-phacts-discovery-platform
18. Example:
Open Pharmacology Space (2)
The OPS platform architecture
Produces LD
Semantic
technology
depth: intrinsic
and extrinsic
Consumes LD
Source: Williams A., Harland L., Groth P., et al.: Open PHACTS: Semantic interoperability for drug discovery. Drug Discovery Today, June 06, 2012
19. Example:
eCloudManager
Use case: data center management
• Multitude of managed resources
• Hardware (physical storage, network, computational infrastructure)
• Virtualization capabilities (virtual clusters, live migration)
• Software applications
• Multitude of APIs and
data sources
• Tool sprawl!
Source: http://www.fluidops.com/ecloudmanager/
20. Example: eCloudManager –
Integrated View on the Data Center
• Integration of different SW
and HW components,
storage systems, compute
infrastructures,
applications, CRM systems,
ticket systems, project
catalogs.
• Automatic correlation of
data retrieved from various
systems.
• Unified view on data and
metadata across the border
of company units.
• Exploration, analysis, and
actions based on the entire
data corpus.
[Figure: integrated view showing connections between hardware layer, application layer, projects, and customers. Panels: Project Data; Applications & Landscapes; Compute Infrastructure; Storage Infrastructure.]
Source: http://www.fluidops.com/ecloudmanager/
22. Software Architecture
• Denotes the structures or components of a software system.
• It comprises:
  • Elements: Software (logic) components, databases, web servers, services, legacy systems, or other types of components required in the system.
  • Relationships between the elements: Mechanisms through which the elements of the architecture communicate.
• Software architecture also refers to a set of practices to use or
design a (software) system.
23. Multitier Architecture
• Logically separates the components/functions of the system into different tiers, making it easy to reuse or replace a particular tier.
• The most common multitier architecture is the three-tier architecture.
Tier | Role | Example
Presentation tier | Corresponds to the user interface. Translates the results into human-readable information. | Search music artist: ‘The Beatles’. Show sales per album of ‘The Beatles’.
Logic tier | Implements the business logic, analytical computation, etc.; performs detailed processing. | Retrieve album information. Aggregate information per album.
Data tier | Stores the data. This tier is independent of the business logic. | Receives a SPARQL query and returns RDF results.
24. General Architecture of
Linked Data Applications
[Figure: general architecture. Tiers: Presentation Tier, Logic Tier, Data Tier. Data Tier components: Integrated Dataset (Triple Store), Republication Component, Data Integration Component (Vocabulary Mapping, Interlinking, Cleansing), Data Access Component. Wrappers: Physical Wrapper, SPARQL Wr., R2R Transf., LD Wrapper. Sources: Web Data accessed via APIs, SPARQL Endpoints, Relational Data, Linked Data (RDF/XML).]
25. Architectural Patterns
1. The Crawling Pattern: Crawls or loads data in advance. Data is managed in one triple store, so it can be accessed efficiently. The disadvantage of this pattern is that the data might not be up to date.
2. The On-The-Fly Dereferencing Pattern: URIs are dereferenced at the moment the app requires the data. This pattern retrieves up-to-date data. Performance suffers when the app must dereference many URIs.
3. The (Federated) Query Pattern: Submits complex queries to a fixed set of data sources. Enables applications to work with current data retrieved directly from the sources. Finding optimal query execution plans over a large number of sources is a complex problem.
Source: T. Heath, C. Bizer. Linked Data: Evolving the Web into a Global Data Space
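The trade-off between the crawling and the on-the-fly dereferencing patterns can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not code from the slides: `fake_fetch` stands in for an HTTP URI lookup, the `ex:` identifiers are made up, and both class names are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]
Fetch = Callable[[str], List[Triple]]  # hypothetical stand-in for a URI lookup


class CrawlingAccess:
    """Crawling pattern: load everything in advance, answer from a local store."""

    def __init__(self, fetch: Fetch, seeds: Iterable[str]) -> None:
        self.store: Dict[str, List[Triple]] = {uri: fetch(uri) for uri in seeds}

    def describe(self, uri: str) -> List[Triple]:
        # Fast, but reflects the source as it was at crawl time.
        return self.store.get(uri, [])


class OnTheFlyAccess:
    """On-the-fly dereferencing: look the URI up each time it is needed."""

    def __init__(self, fetch: Fetch) -> None:
        self.fetch = fetch

    def describe(self, uri: str) -> List[Triple]:
        # Always current, but costs one lookup per call.
        return self.fetch(uri)


# A fake, mutable "Web" used only to make the sketch runnable.
web = {"ex:beatles": [("ex:beatles", "ex:albums", "12")]}
calls: List[str] = []


def fake_fetch(uri: str) -> List[Triple]:
    calls.append(uri)
    return list(web.get(uri, []))


crawled = CrawlingAccess(fake_fetch, ["ex:beatles"])
live = OnTheFlyAccess(fake_fetch)

web["ex:beatles"] = [("ex:beatles", "ex:albums", "13")]  # the source changes
stale = crawled.describe("ex:beatles")  # still the crawled value
fresh = live.describe("ex:beatles")     # the current value
```

The crawled copy answers without touching the source again, which is why it is fast and why it can go out of date.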
26. Data Layer
Data Access Component
• Linked Data applications may implement a Mediator-Wrapper Architecture to access heterogeneous sources:
  – Wrappers are built around each data source in order to provide a unified view of the retrieved data.
• The method used to access the data depends on the Linked Data architectural pattern.
• The factors that determine the choice of a pattern are:
  – Number of data sources to access
  – Requirement of consuming up-to-date data
  – Tolerance to high response time
  – Requirement of discovering new data sources
27. Data Layer (2)
Data Access Component (2)
• The data access component may be implemented by using
one or a combination of the following tools:
Mechanisms and example tools:
Linked Data Crawlers
  LDspider https://code.google.com/p/ldspider/
  Slug https://code.google.com/p/slug-semweb-crawler/
Linked Data Client Libraries
  Semantic Web Client Library http://wifo5-03.informatik.uni-mannheim.de/bizer/ng4j/semwebclient/
  The Tabulator http://www.w3.org/2005/ajar/tab
  Moriarty https://code.google.com/p/moriarty/
SPARQL Client Libraries
  Jena Semantic Web Framework http://jena.apache.org/
Federated SPARQL Engines
  ANAPSID https://github.com/anapsid/anapsid
  FedX http://www.fluidops.com/fedx/
  SPLENDID https://code.google.com/p/rdffederator/
Search Engine APIs
  Sindice http://sindice.com/developers/api
  Uberblic http://uberblic.com/
28. Data Layer (3)
Data Integration Component
• Consolidates the data retrieved from heterogeneous sources.
• This component may operate at:
  – Schema level: Performs vocabulary mappings in order to translate data into a single unified schema. Links correspond to RDFS properties or OWL property and class axioms. (CH 2)
  – Instance level: Performs entity resolution via owl:sameAs links. If the data sources do not provide the links, tools such as Silk or OpenRefine can be used to integrate the data. (CH 3)
[Figure: the Data Integration Component (Vocabulary Mapping, Interlinking, Cleansing) on top of the Data Access Component.]
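Instance-level integration via owl:sameAs can be illustrated with a small union-find routine that groups URIs into sets denoting the same entity. This is a minimal sketch, assuming the links are already available as pairs; the function name and the prefixed identifiers in the example are hypothetical, and real tools such as Silk do considerably more (link discovery, similarity metrics).

```python
from typing import Dict, Iterable, List, Set, Tuple


def same_as_clusters(links: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Group URIs connected by owl:sameAs links (symmetric and transitive)."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two equivalence classes

    clusters: Dict[str, Set[str]] = {}
    for uri in list(parent):
        clusters.setdefault(find(uri), set()).add(uri)
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Made-up links between three hypothetical datasets.
links = [
    ("dbpedia:The_Beatles", "mb:artist/beatles"),
    ("mb:artist/beatles", "fb:m.07c0j"),
    ("dbpedia:U2", "mb:artist/u2"),
]
clusters = same_as_clusters(links)
```

Each resulting set can then be collapsed to one canonical URI in the integrated dataset.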
29. Data Layer (4)
Integrated Dataset
• The dataset that results from integrating and consolidating the data can be cached in an RDF store.
• There are many solutions to deploy triple/RDF stores, e.g.:
  • OWLIM (http://www.ontotext.com/owlim)
  • Jena TDB (http://jena.apache.org/documentation/tdb/)
  • Cumulus RDF (https://code.google.com/p/cumulusrdf/)
  • AllegroGraph (http://www.franz.com/agraph/allegrograph/)
  • Virtuoso Universal Server (http://virtuoso.openlinksw.com/)
  • RDF-3X (https://code.google.com/p/rdf3x/)
30. Data Layer (5)
Republication Component
• Exposes portions of the integrated dataset as Linked Data.
• There are different solutions to make the data accessible:
  • Via SPARQL endpoints (e.g., Sesame OpenRDF SPARQL Endpoint, …)
  • Via APIs (e.g., Linked Data API)
  • As RDF dumps
  • With the built-in means of your framework/CMS (e.g., Drupal, Information Workbench, …)
31. Application and Presentation Layers
• The logic layer implements sophisticated processing according to the functionalities of the application. This layer may include data mining components as well as reasoners that are not integrated in the data layer.
• The presentation layer displays the information to the user in various formats, including text, diagrams, or other types of visualization. (CH 4)
33. Information Workbench
• Platform for the development of Linked Data applications
Semantics- and Linked Data-based integration of enterprise and open data sources
Intelligent data access and analytics
• Visual exploration
• Semantic search
• Dashboarding and reporting
Collaboration and knowledge management platform
• Wiki-based curation and authoring of data
• Collaborative workflows
Source: http://www.fluidops.com/information-workbench/
Semantic Web Data
34. Information Workbench (2)
Customized application
solutions
Reusable UI and data
integration components
Data storage and
management platform
External resources to reuse
data and create mashups
35. Data Storage & Access
Data management based on the Sesame framework
• Open source, written in Java
• Layered architecture for semantic data management
• Easy to plug in new data management components on demand
• Most of the existing triple stores support the Sesame API
[Figure: Sesame layered architecture. Sesame Access API: stable (yet extensible) APIs for data access, manipulation, ... SAIL API: stackable architecture of custom data management components, e.g. SAIL 1 (Query Optimization Layer) and SAIL 2 (Distributed Query Execution Layer). Stores DB1, DB2, DB3: easy integration by implementing a generic API.]
36. Back-End Configuration Options
• Back-end data store is specified via the IWB configuration properties
• Both local and remote data access are possible
See http://iwb.fluidops.com/resource/Help:RepositoryConfiguration
[Figure: back-end options. Local repository (Sesame Sail API); Remote Sesame repository (Sesame remote repository client API, Sesame HTTP Server, Sesame Sail API); Arbitrary SPARQL endpoint (Sesame SPARQL repository client API). Stores shown: Sesame native store, OWLIM, SYSTAP bigdata, AllegroGraph, SPARQL endpoint, …]
37. Data Integration:
Data Provider Concept
Data providers support the periodic extraction and integration of data from external data sources into a central repository
• Lifting from arbitrary data formats to RDF (e.g., relational, XML, CSV)
• Parametrizable (e.g., connection information, refresh interval, ...)
• Built-in UI for instantiating providers
• Intuitive interfaces and APIs for writing custom providers
Provider workflow: Connect to data source → Extract data from source → Convert data into RDF → Store RDF in repository
Examples: R2RML, SPARQL, XML2RDF, RDF, Groovy Script
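The four provider steps (connect, extract, convert, store) can be sketched as a tiny CSV-to-RDF pipeline. This is not the Information Workbench provider API; the function names, the `http://example.org/artist/` namespace, and the in-memory "repository" are assumptions made purely for illustration.

```python
import csv
import io
from typing import Iterator, List

BASE = "http://example.org/artist/"      # hypothetical namespace
NAME = "http://xmlns.com/foaf/0.1/name"  # FOAF name property


def extract(source: str) -> Iterator[dict]:
    """Steps 1-2: 'connect' to the source (here a CSV string) and read rows."""
    return csv.DictReader(io.StringIO(source))


def to_ntriples(row: dict) -> str:
    """Step 3: lift one row to an RDF triple in N-Triples syntax."""
    literal = row["name"].replace("\\", "\\\\").replace('"', '\\"')
    return f'<{BASE}{row["id"]}> <{NAME}> "{literal}" .'


def run_provider(source: str, repository: List[str]) -> None:
    """Step 4: store the generated triples in the (in-memory) repository."""
    repository.extend(to_ntriples(row) for row in extract(source))


repository: List[str] = []
run_provider("id,name\nbeatles,The Beatles\nu2,U2\n", repository)
```

A periodic provider would simply rerun `run_provider` at the configured refresh interval and replace the previously stored triples.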
38. Data Warehousing vs.
Federation
Warehousing
• Data is copied from the source into the warehouse
• Query runs in the warehouse
• Supported in IWB using data providers
Federation
• Data remains in federated DB
• Query is pushed down to federated DB
• Supported in IWB using SPARQL federation
[Figure: with warehousing, the DBs are loaded into the warehouse, which answers the query; with federation, the query is passed on to each DB.]
39. Virtualized Data Integration
with FedX
Information Workbench: Integration of Virtualized Data Sources as a Service
Transparent and on-demand integration of data sources
See http://iwb.fluidops.com/resource/Help:FedX
See http://www.fluidops.com/fedx/
[Figure: layered architecture. Application Layer: Semantic Wiki, Collaboration, Reporting & Analytics, Visual Exploration. Virtualization Layer. Data Layer: data sources behind SPARQL endpoints, plus a Metadata Registry. Data Registries: CKAN, data.gov, etc., plus enterprise data.]
40. Customizable User Interface
[Figure: annotated screenshot of the user interface. Labels: current resource, navigation shortcuts, wiki page management, view selection toolbar, main view area.]
Demo available at http://musicbrainz.fluidops.net
41. User Interface Concept:
One Page URI
Resource page
Resource page
Resource page
Resource page
Graph
42. Data Driven UI: Ontology as
“Structural Backbone”
Resource page
UI templates
Template:…
Resource page
Template:mo:MusicArtist
Ontology
(RDFS/OWL)
RDF Data
Graph
43. Different Views on
Every Resource
Wiki View
Table View
Graph View
Pivot View
44. Widget-Based User Interface
Visualization and Exploration
Analytics and Reporting
Authoring and Content Creation
Mashups with Social Media
Widgets are not static and can be integrated
into the UI using a Wiki-style syntax.
CH 4
45. Example: Add Widgets to Wiki
{{#widget: BarChart |
query ='SELECT distinct (COUNT(?Release) AS ?COUNT) ?label WHERE {
?? foaf:made ?Release .
?Release rdf:type mo:Release .
?Release dc:title ?label .
}
GROUP BY ?label
ORDER BY DESC(?COUNT)
LIMIT 10
'
| input = 'label'
| output = 'COUNT'
}}
Example: Show top 10 released records for an artist
46. Music Example
Page of a class:
• Shows an overview of MusicArtist instances
See http://musicbrainz.fluidops.net/resource/mo:MusicArtist
47. Music Example (2)
Page of a class template:
• Defines a layout for displaying each resource of the class
• Uses semantic wiki syntax
See http://musicbrainz.fluidops.net/resource/Template:mo:MusicArtist
48. Music Example (3)
Page of a class instance:
• Displays the data about the resource according to the class
template
See http://musicbrainz.fluidops.net/resource/?uri=http%3A%2F%2Fmusicbrainz.org%2Fartist%2Fb10bbbfc-cf9e-42e0-be17-e2c3e1d2600d%23_
49. Mashups with external sources
• Relevant information and UI elements from external sources
can be incorporated in the wiki view
• IWB contains multiple mashup widgets for popular social media sources
  – Twitter
  – YouTube
  – Facebook
  – New York Times news
  – LinkedIn
  – …
Template instantiation
?? = http://musicbrainz.org/artist/a3cb23fc-acd3-4ce0-8f36-1e5aa6a18432%23_
?x = "U2"
{{#widget: Youtube
| searchString = $SELECT ?x
WHERE { ?? foaf:name ?x . }$
| asynch = 'true' }}
50. Triple Editor
• Edit structured data associated with a resource
• Change, add, and remove triples
Table View
51. Ontology-Based Data Input
Triple Editor takes into account the ontology definition:
• Autosuggestion tool considers the domains and ranges of the
properties
Example: properties available for the
class mo:MusicGroup are suggested
automatically
52. Validation of User Input
Validation uses property definitions in the ontology:
• The property myIntegerProperty has an associated
rdfs:range definition.
• The range definition requires every object of this property to be of XML Schema type xsd:integer.
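Range-based validation of this kind can be sketched as a lookup of the property's declared rdfs:range followed by a check of the value's lexical form. This is an illustrative sketch, not the Information Workbench implementation; the `ex:` prefix, the `RANGES` table, and the function name are assumptions, and only xsd:integer is handled.

```python
import re
from typing import Dict

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

# Hypothetical ontology fragment: property -> declared rdfs:range
RANGES: Dict[str, str] = {"ex:myIntegerProperty": XSD_INTEGER}

# xsd:integer lexical form: optional sign followed by one or more digits
INTEGER_PATTERN = re.compile(r"[+-]?[0-9]+")


def is_valid(prop: str, lexical_value: str) -> bool:
    """Return True if the value is acceptable for the property's declared range."""
    declared = RANGES.get(prop)
    if declared is None:
        return True  # no range declared, nothing to check
    if declared == XSD_INTEGER:
        return INTEGER_PATTERN.fullmatch(lexical_value) is not None
    return True  # other datatypes are not handled in this sketch


accepted = is_valid("ex:myIntegerProperty", "42")
rejected = is_valid("ex:myIntegerProperty", "4.2")
```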
53. Further Information
• Information Workbench product page
• http://www.fluidops.com/information-workbench/
• Demo system
• http://musicbrainz.fluidops.net/
• Download a free Community Edition version
• http://www.fluidops.com/information-workbench/iwb-download/
• Online documentation
• http://help.fluidops.com/help/topic/iwb.help-2.5/help.html
55. Callimachus
• Scalable platform for creating and
running data-driven websites.
• Can be deployed on a server,
allowing users to develop their
pages and applications via a Web
browser.
• Resources can be created via the
user interface to build an
application.
Source: http://callimachusproject.org
56. Linked Media Framework (LMF)
• Offers advanced services for linked media management, built on top of:
  • Apache Marmotta (Linked Data platform)
  • Apache Stanbol (extraction and enhancement framework)
  • Apache Solr (indexing)
• Typical use cases include:
  • Building semantic search over data
  • Publishing legacy data as Linked Data
  • Using a SKOS thesaurus for information extraction
Source: https://code.google.com/p/lmf/
57. Synth
• Development environment implemented with Ruby on Rails.
• Allows for building applications following the Semantic
Hypermedia Design Method (SHDM).
• Provides a set of modules that receive models and produce the hypermedia app described in the model:
  • Domain
  • Navigation
  • Behavior
  • Interface
Source: http://www.tecweb.inf.puc-rio.br/synth
59. Underlying Technology Basics
HTTP Overview
• HTTP, the protocol by which all documents on the WWW are served, is a client-server protocol
• Every interaction is based on:
Request
Response
60. Underlying Technology Basics (2)
HTTP Request
• Method
  • GET (retrieve entity identified by URI)
  • PUT (store entity under the given URI)
  • POST (submit the information as a new subordinate of the resource URI)
  • DELETE (delete entity identified by URI)
  • Additionally HEAD, TRACE, CONNECT, OPTIONS, PATCH
• URI
• Header
• [optional] Body (with POST, PUT)
61. Underlying Technology Basics (3)
HTTP Response
• Response Code (Integer)
  • 1xx: Provisional response, contains the Status-Line and optional headers.
  • 2xx: Indicates that the application's request was successfully received, understood, and accepted.
  • 3xx: Further action needs to be taken by the user agent in order to fulfill the request.
  • 4xx: The application's request was erroneous.
  • 5xx: The server has erred or is incapable of performing the request.
Source: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
• Header
• [optional] Body
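Because the class of a response is given by the first digit of the status code, a client can branch on it with integer division. A minimal sketch follows; the function name is made up, and the class labels are taken from the slide summary.

```python
def status_class(code: int) -> str:
    """Map an HTTP status code to its response class (first digit)."""
    classes = {
        1: "provisional",
        2: "success",
        3: "further action required",
        4: "client error",
        5: "server error",
    }
    if not 100 <= code <= 599:
        raise ValueError(f"not an HTTP status code: {code}")
    return classes[code // 100]


ok = status_class(200)
see_other = status_class(303)
not_found = status_class(404)
```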
62. Underlying Technology Basics (4)
HTTP Request – Response Pattern
• The client submits a request
• The server returns a response
Client → Web Server: HTTP GET http://en.wikipedia.org/wiki/Beatles
Web Server → Client: 200 (OK) [HTML page about the Beatles]
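The parts of the request shown above (method, URI, headers) can be made concrete with Python's standard `urllib.request.Request` class. The sketch only builds the request object and does not send it; the `Accept` header is added here as an illustrative assumption.

```python
from urllib.request import Request

# Build (but do not send) the GET request from the slide.
req = Request(
    "http://en.wikipedia.org/wiki/Beatles",
    headers={"Accept": "text/html"},
    method="GET",
)

method = req.get_method()          # HTTP method
host = req.host                    # server the client would connect to
path = req.selector                # path part of the URI
accept = req.get_header("Accept")  # one of the request headers
```

Passing `req` to `urllib.request.urlopen` would perform the exchange and return the response with its status code, headers, and body.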
63. Underlying Technology Basics (5)
HTTP Conneg & Linked Data URI Lookup
• A foundational issue in Linked Data is the distinction between URIs for real-world objects and URIs for documents (e.g., RDF) that describe them.
• This can be handled with HTTP headers and content negotiation (conneg):
Client → Web Server: HTTP GET http://dbpedia.org/resource/The_Beatles (Accept: text/html)
Web Server → Client: 303 (See Other) http://dbpedia.org/page/The_Beatles
64. Underlying Technology Basics (6)
HTTP Conneg & Linked Data URI Lookup
• A client that asks for RDF instead of HTML is redirected to the data document:
Client → Web Server: HTTP GET http://dbpedia.org/resource/The_Beatles (Accept: text/turtle)
Web Server → Client: 303 (See Other) http://dbpedia.org/data/beatles.n3
65. Underlying Technology Basics (7)
HTTP Conneg & Linked Data URI Lookup
• The client then requests the data document itself:
Client → Web Server: HTTP GET http://dbpedia.org/data/beatles.n3 (Accept: text/turtle)
Web Server → Client: 200 (OK) [RDF data about the Beatles]
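The whole lookup (request the resource URI, receive a 303, follow it, receive a 200) can be traced with a simulated server. This is a toy model, not DBpedia's actual behaviour: the URIs and bodies come from the slides, while `serve`, `lookup`, and the 406 answer for unsupported media types are assumptions made for the sketch.

```python
from typing import Dict, Optional, Tuple

Response = Tuple[int, Optional[str], Optional[str]]  # (status, Location, body)

RESOURCE = "http://dbpedia.org/resource/The_Beatles"
REDIRECTS: Dict[str, str] = {
    "text/html": "http://dbpedia.org/page/The_Beatles",
    "text/turtle": "http://dbpedia.org/data/beatles.n3",
}
DOCUMENTS: Dict[str, str] = {
    "http://dbpedia.org/page/The_Beatles": "[HTML page about the Beatles]",
    "http://dbpedia.org/data/beatles.n3": "[RDF data about the Beatles]",
}


def serve(uri: str, accept: str) -> Response:
    """Simulated server: 303 for the real-world object, 200 for documents."""
    if uri == RESOURCE:
        location = REDIRECTS.get(accept)
        if location is None:
            return (406, None, None)  # assumption: unsupported media type
        return (303, location, None)
    if uri in DOCUMENTS:
        return (200, None, DOCUMENTS[uri])
    return (404, None, None)


def lookup(uri: str, accept: str, max_redirects: int = 5) -> Response:
    """Client side: follow 303 redirects until a final response arrives."""
    for _ in range(max_redirects):
        status, location, body = serve(uri, accept)
        if status != 303 or location is None:
            return (status, location, body)
        uri = location
    raise RuntimeError("too many redirects")


first_hop = serve(RESOURCE, "text/html")
final_rdf = lookup(RESOURCE, "text/turtle")
```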
66. Web APIs
Motivation
The Web has more to offer than just the retrieval of static data:
• Data is often dynamically created as a result of some
calculation carried out over input data (e.g., weather
information).
• Data can change frequently (e.g., moving objects).
• Service endpoints, forms and APIs are used to trigger
functionalities in the Web and the real world and provide
access to dynamic and static data sources.
• Web APIs provide a programming interface exposed on the
Web to allow apps to make use of these functionalities.
67. Web APIs (2)
Motivation
• The number of Web APIs is increasing significantly
• Representational State Transfer (REST) plays an important role on the Web
  • Architectural style for client–server interaction
  • Focused on the Web architecture
• ProgrammableWeb is a general directory for Web APIs:
  • Allows providers to register their API
  • Allows application developers to search for APIs
Source: http://programmableweb.com
68. Richardson Maturity Model for
REST Services
Level 3: HATEOAS
Level 2: HTTP Verbs
Level 1: Resources and URIs
Each layer builds on
the concepts and
technologies of the
layers below
Source: Richardson, L. & Ruby, S.; RESTful Web Services O'Reilly, 2007.
69. Level 1: Thinking in Resources
• Resources rather than service endpoints:
• A resource is anything with which a client is able to interact
• Real world resources (e.g., car, movie, person…) are projected onto the
Web by making the information associated with it accessible on the Web
• Identifier
  • A URI uniquely identifies a resource on the Web (many-to-one: several URIs may identify the same resource)
  • For addressing and manipulation
• Different representations for one resource are possible
[Figure: behind the service boundary, one resource is identified by both http://rest-music.org/artist/beatles and http://rest-music.org/artist/the_beatles and is available in JSON, RDF/N3, and XML representations.]
Source: Webber, J.; Parastatidis, S. & Robinson, I. S.; REST in Practice - Hypermedia and
Systems Architecture. O'Reilly, 2010.
70. Level 2: The Web as a Platform
Uniform interface for interaction
• HTTP Verbs as methods to act on resources:
HTTP Verb | Effect | Characteristic
GET | retrieve the representation of a resource identified with a URI | safe, idempotent
PUT | create or overwrite a resource identified by a client-generated URI | idempotent
POST | create a resource identified by a server-generated URI | (none)
DELETE | delete a resource (or its representation) identified with a URI | idempotent
OPTIONS | request for information about the available communication options | safe, idempotent
• Response Codes to coordinate the interactions
(e.g., 200 OK, 201 CREATED, 303 SEE OTHER, 404 NOT FOUND)
71. Level 2: The Web as a Platform (2)
Characteristics of HTTP Verbs
• Safe: Guarantees not to change the resource on the server
  • Example: Retrieving the representation of a resource does not change it
GET /artist/beatles → Name: The Beatles; Genre: Rock; Origin: Liverpool; …
• Idempotent: The effect of several identical requests with an idempotent HTTP verb is the same as for a single request
  • Example: Once a resource is deleted, deleting it again does not change anything
DELETE /artist/beatles → 200 OK (Name: The Beatles; Genre: Rock; Origin: Liverpool; …)
DELETE /artist/beatles → 404 not found
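Both properties concern the state of the server, not the response code, which a small in-memory model makes visible. This is a sketch under stated assumptions: `ResourceStore` is a hypothetical stand-in for a REST service, and its methods only mimic the status codes shown above.

```python
from typing import Dict, Optional, Tuple

Representation = Dict[str, str]


class ResourceStore:
    """Hypothetical in-memory stand-in for a REST service."""

    def __init__(self, resources: Dict[str, Representation]) -> None:
        self.resources = dict(resources)

    def get(self, path: str) -> Tuple[int, Optional[Representation]]:
        # Safe: reads the state, never modifies it.
        if path in self.resources:
            return 200, self.resources[path]
        return 404, None

    def delete(self, path: str) -> Tuple[int, Optional[Representation]]:
        # Idempotent: after the first call the resource is gone, and
        # repeating the call leaves the state exactly as it is.
        if path in self.resources:
            return 200, self.resources.pop(path)
        return 404, None


store = ResourceStore(
    {"/artist/beatles": {"Name": "The Beatles", "Genre": "Rock", "Origin": "Liverpool"}}
)

before = dict(store.resources)
store.get("/artist/beatles")
unchanged_by_get = store.resources == before

first_status = store.delete("/artist/beatles")[0]
state_after_first = dict(store.resources)
second_status = store.delete("/artist/beatles")[0]
state_after_second = dict(store.resources)
```

The two DELETE calls return different status codes, yet the server state after the second call equals the state after the first, which is what idempotence requires.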
72. Level 3:
The Hypermedia Constraint
HATEOAS = Hypermedia As The Engine Of Application State
• Include links in the resource representations to other relevant resources
HTTP POST http://service.org/music/order
dbp:Revolver a dbp:Album;
    mus:upc "094638241720".
Response
mus:001 a mus:Order.
mus:001 db:content dbp:Revolver.
mus:001 mus:price "10€".
mus:001 mus:status "awaiting payment".
mus:001 mus:pay mus:001_pay.
Source: Fielding, R. T.; Architectural styles and the design of network-based software
architectures, University of California, Irvine, 2000.
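The point of the hypermedia constraint is that the client does not construct the payment URI itself; it discovers it in the order representation and follows it. A minimal sketch with plain dictionaries follows. The order data mirrors the response above, while the `SERVICE` table, the `follow` helper, and the content of the `mus:001_pay` resource are hypothetical.

```python
from typing import Dict

# Hypothetical service state: identifier -> representation with links.
SERVICE: Dict[str, Dict[str, str]] = {
    "mus:001": {
        "type": "mus:Order",
        "price": "10€",
        "status": "awaiting payment",
        "pay": "mus:001_pay",  # link to the next possible step
    },
    "mus:001_pay": {"type": "mus:Payment", "order": "mus:001"},
}


def follow(representation: Dict[str, str], relation: str) -> Dict[str, str]:
    """Follow a link found in a representation instead of building a URI."""
    target = representation[relation]
    return SERVICE[target]


order = SERVICE["mus:001"]
payment = follow(order, "pay")
```

If the service later moves the payment resource, only the link in the order representation changes and the client code stays the same.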
73. Example: Freebase API
• Freebase offers a dedicated API for retrieving RDF
• No content negotiation
• Allows applications to retrieve a subgraph of data connected
to a specific Freebase object
• The URI is the concatenation of the RDF service URL and the Freebase
identifier
https://www.googleapis.com/freebase/v1/rdf/<id>
Source: https://developers.google.com/freebase/
74. Example: Freebase API (2)
• Example:
GET https://www.googleapis.com/freebase/v1/rdf/m/07c0j
• Every Freebase fact is mapped to a triple
• Slashes in the ID are replaced by dots
• Some facts are mapped to RDF Schema (e.g., /type/object/name → rdfs:label)
• The response contains the first 100 values for each predicate
@prefix ns: <http://rdf.freebase.com/ns/>.
ns:m.07c0j
    ns:award.award_nominee.award_nominations ns:m.0jwnvw4;
    ns:base.websites.website.website <http://www.thebeatles.com>;
    …
Source: https://developers.google.com/freebase/
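The two naming rules (request URI = service URL + Freebase identifier; RDF name = identifier with slashes replaced by dots) amount to simple string operations. The sketch below only builds strings and makes no network request; the function names are hypothetical, and the constants are taken from the slides.

```python
RDF_SERVICE = "https://www.googleapis.com/freebase/v1/rdf"
NS = "http://rdf.freebase.com/ns/"


def rdf_request_url(freebase_id: str) -> str:
    """Concatenate the RDF service URL and a Freebase identifier such as /m/07c0j."""
    if not freebase_id.startswith("/"):
        raise ValueError("Freebase identifiers start with '/'")
    return RDF_SERVICE + freebase_id


def rdf_term(freebase_id: str) -> str:
    """Name used in the RDF output: slashes in the ID are replaced by dots."""
    return NS + freebase_id.strip("/").replace("/", ".")


url = rdf_request_url("/m/07c0j")
subject = rdf_term("/m/07c0j")
predicate = rdf_term("/award/award_nominee/award_nominations")
```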
75. Well-Known Non-RDF Web APIs
• Twitter
Provides access to timelines, tweets, direct messages between
users, followers, users, places, … See http://dev.twitter.com
• Last.fm
Provides access to music-related resources: albums, artists, groups, events, venues, … See http://www.last.fm/api
• Foursquare
Lets users check in at their current location, create tips and lists, access recommendations, … See http://developer.foursquare.com
• …
76. Summary
• Linked Data applications
• LD application = Consumes LD + Manipulates/Produces LD + Web app
• Can be categorized according to the following dimensions:
  • Semantic technology depth
  • Information flow direction
  • Semantic richness
  • Semantic integration
• Architecture of Linked Data applications
• Multitier combined with a wrapper-mediator architecture
• Architectural patterns to consume LD: Crawling, on-the-fly
dereferencing, (federated) query pattern.
• Main components: Triple store, logic components, UI components,
data access & integration component, republishing component
77. Summary (2)
• Linked Data application development frameworks:
Information Workbench
• Data storage: Provides warehousing and federation capabilities
• Data integration: Performed by Data Providers, e.g., OpenRefine
• Data-driven, widget-based user interface, automatically generated by
executing SPARQL queries
• User input validation via comparison against the underlying ontology
• Web APIs
• Basic concepts: Request and Response
• Request methods: GET, PUT, POST, DELETE (+ HEAD, TRACE, CONNECT, OPTIONS)
• Response codes: 1xx provisional, 2xx success, 3xx further action required,
4xx client error, 5xx server error
• Richardson Maturity Model for REST services: Level 1 – Resources and
URIs, Level 2 – HTTP verbs, Level 3 – HATEOAS
78. For exercises, quiz and further material visit our website:
http://www.euclid-project.eu
Course, eBook
Other channels: @euclid_project, euclidproject
Editor's Notes
Maribel’s comment: The portals list “ZIP” as a format for data sets, but they do not provide further information about the format of the zipped files.
Maribel’s comment: This graphic only shows the most common data formats in both data sets, but there are more.