2. Schedule
Digital Enterprise Research Institute www.deri.ie
Linked Data Principles – 10%
Web of Data 101 (URI, HTTP, RDF) – 40%
Linking Open Data community project – 20%
Tools and Applications – 30%
3. Why?
Web of Data = linked data + vocabularies +
embedded metadata (RDFa, microformats, etc.)
When publishing linked data you provide a
standardised, uniform, and generic API for:
discovery, see also http://webofdata.wordpress.com/
integration/mashup
distributed query
uniform access to metadata and data
enable serendipity
See also [EXPL]
4. What?
In contrast to the full-fledged Semantic Web vision,
linked data is mainly about publishing structured
data in RDF using URIs rather than focusing on the
ontological level or inference. This simplification,
just as the Web simplified the established academic
approaches of Hypertext systems, lowers the entry
barrier for data providers and hence fosters
widespread adoption.
[EXPL]
5. Linked Data Principles
By Tim Berners-Lee, ca. 2006 [LD]
Use URIs to identify things (anything, not just documents)
Use HTTP URIs – globally unique names, distributed
ownership – allows people to look up things
Provide useful information in RDF – when someone looks
up a URI
Include RDF links to other URIs – to enable discovery of
related information
6. Linked Data Principles
Issues
These are principles, not implementation advice
Many things (deliberately?) kept blurry
Non-information resource vs. information resource debate
(see also [AWWSW])
Content negotiation (CN) and 303 redirects, the TAG's httpRange-14 issue [UA]
Formats: HTML + RDF/XML vs. RDFa
7. Linked Data Principles
Ongoing work
Description/discovery
– semantic sitemaps, see
http://sw.deri.org/2007/07/sitemapextension/
– voiD, see http://semanticweb.org/wiki/VoiD
Trust (SPOT09 at ESWC09 for example)
Multimedia/Fragments, see
http://www.interlinkingmultimedia.info/
Foundational issues in TAG/AWWSW
Transforming the read-only Web of Data into a read/write
Web of Data, see for example
http://esw.w3.org/topic/PushBackDataToLegacySources
8. Schedule
Linked Data Principles – 15%
Web of Data 101 (URI, HTTP, RDF) – 40%
Linking Open Data community project – 15%
Tools and Applications – 30%
9. Web of Data 101 - URI
A Uniform Resource Identifier (URI) is a compact
sequence of characters that identifies an abstract
or physical resource. [RFC3986]
Syntax
URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
Example
   foo://example.com:8042/over/there?name=ferret#nose
   \_/   \______________/\_________/ \_________/ \__/
    |           |            |            |        |
 scheme     authority       path        query   fragment
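The component breakdown above can be reproduced with Python's standard library. This is a minimal sketch, assuming `urllib.parse.urlsplit` as the parser. Note that it names the authority component `netloc`, and that the variable name `parts` is only illustrative.

```python
from urllib.parse import urlsplit

# The example URI from RFC 3986, split into its five generic components.
parts = urlsplit("foo://example.com:8042/over/there?name=ferret#nose")

print(parts.scheme)    # scheme
print(parts.netloc)    # authority (urllib calls it "netloc")
print(parts.path)      # path
print(parts.query)     # query
print(parts.fragment)  # fragment
```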
10. Web of Data 101 - URI
Don’t confuse scheme with protocol
Scheme: defines URI layout and (certain) semantics; new schemes
are registered with IANA as described in [RFC4395]
Protocol: defines communication means between
endpoints (such as HTTP, FTP, etc.)
URI resolution (as of [RFC3986])
STEP   OUTPUT BUFFER   INPUT BUFFER
1:                     /a/b/c/./../../g
2E:    /a              /b/c/./../../g
2E:    /a/b            /c/./../../g
2E:    /a/b/c          /./../../g
2B:    /a/b/c          /../../g
2C:    /a/b            /../g
2C:    /a              /g
2E:    /a/g
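The trace above follows the dot-segment removal routine of [RFC3986], section 5.2.4. The following is a sketch of that routine in Python. The function name `remove_dot_segments` is taken from the RFC's own wording, while the branch ordering and variable names are choices made for this illustration.

```python
def remove_dot_segments(path: str) -> str:
    """Remove "." and ".." segments, following the steps of RFC 3986, 5.2.4."""
    output = []  # output buffer, kept as a list of segments
    inp = path   # input buffer
    while inp:
        if inp.startswith("../"):            # step 2A
            inp = inp[3:]
        elif inp.startswith("./"):           # step 2A
            inp = inp[2:]
        elif inp.startswith("/./"):          # step 2B: "/./" becomes "/"
            inp = inp[2:]
        elif inp == "/.":                    # step 2B
            inp = "/"
        elif inp.startswith("/../"):         # step 2C: drop last output segment
            inp = inp[3:]
            if output:
                output.pop()
        elif inp == "/..":                   # step 2C
            inp = "/"
            if output:
                output.pop()
        elif inp in (".", ".."):             # step 2D
            inp = ""
        else:                                # step 2E: move first segment over
            start = 1 if inp.startswith("/") else 0
            end = inp.find("/", start)
            if end == -1:
                output.append(inp)
                inp = ""
            else:
                output.append(inp[:end])
                inp = inp[end:]
    return "".join(output)


print(remove_dot_segments("/a/b/c/./../../g"))
```

Running it on the path from the slide walks through the same 2E, 2B, and 2C steps shown in the table.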
11. Web of Data 101 - URI
URIrefs, URI references [RDF AS]
An RDF URI reference is a Unicode string that does not contain
any control characters (#x00-#x1F, #x7F-#x9F) and would
produce a valid URI character sequence representing an
absolute URI when subjected to UTF-8 encoding along
with %-escaping of non-US-ASCII octets.
QNames, Qualified Names [XML NS]
XML's way to allow namespaced elements/attributes, of the form
QName = Prefix ':' LocalPart
CURIEs, Compact URIs [CURIE]
Generic, abbreviated syntax for expressing URIs, currently
deployed in SPARQL, RDFa, and XHTML2
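To make the abbreviation idea concrete, here is a small sketch of how a compact `prefix:reference` form might be expanded to a full URI. The helper `expand_curie` and the prefix table are hypothetical names for this illustration, and the sketch ignores details of [CURIE] such as safe CURIEs in square brackets and default prefixes.

```python
def expand_curie(curie: str, prefixes: dict) -> str:
    """Expand "prefix:reference" by concatenating the prefix URI and the reference."""
    prefix, sep, reference = curie.partition(":")
    if not sep:
        raise ValueError(f"not a prefixed name: {curie!r}")
    if prefix not in prefixes:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return prefixes[prefix] + reference


# Hypothetical prefix table; the namespace URIs are the published FOAF and RDFS ones.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}

print(expand_curie("foaf:name", PREFIXES))
```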
12. Web of Data 101 - HTTP
The Hypertext Transfer Protocol (HTTP) is an
application-level protocol for distributed,
collaborative, hypermedia information systems. It is
a generic, stateless, protocol which can be used
for many tasks beyond its use for hypertext, such
as name servers and distributed object
management systems, through extension of its
request methods, error codes and headers. A
feature of HTTP is the typing and negotiation of
data representation, allowing systems to be built
independently of the data being transferred.
[RFC2616]
13. Web of Data 101 - HTTP
HTTP messages consist of requests from client to
server and responses from server to client
Set of methods is predefined (such as GET, POST,
etc.), but can be expanded
Set of status codes is defined
Informational 1xx, provisional response, (100 Continue)
Successful 2xx, request successfully received, understood, and
accepted (201 Created)
Redirection 3xx, further action needs to be taken by user agent
to fulfill the request (301 Moved Permanently)
Client Error 4xx, client erred (405 Method Not Allowed)
Server Error 5xx, server encountered an unexpected condition
(501 Not Implemented)
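The five status classes listed above are determined by the first digit of the code. As a minimal sketch, the mapping can be written with Python's standard `http.HTTPStatus` enumeration. The `status_class` helper and the `CLASSES` table are names made up for this example.

```python
from http import HTTPStatus

# First digit of the status code -> class name, as listed on the slide.
CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code: int) -> str:
    """Return the class of an HTTP status code based on its first digit."""
    return CLASSES[code // 100]


# The example codes from the slide, with their standard reason phrases.
for code in (100, 201, 301, 405, 501):
    print(code, HTTPStatus(code).phrase, "->", status_class(code))
```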
14. Web of Data 101 - HTTP
REQUEST
GET /html/rfc2616 HTTP/1.1
Host: tools.ietf.org
User-Agent: Mozilla/5.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
RESPONSE
HTTP/1.x 200 OK
Date: Thu, 05 Mar 2009 08:17:33 GMT
Server: Apache/2.2.11
Content-Location: rfc2616.html
Last-Modified: Tue, 20 Jan 2009 09:16:04 GMT
Content-Type: text/html; charset=UTF-8
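The `Accept` header in the request carries the client's media type preferences, weighted by `q` values (a missing `q` means 1.0). The sketch below shows one way a server might read such a header. The function `parse_accept` is a hypothetical helper that deliberately ignores other media type parameters and the specificity rules of [RFC2616].

```python
def parse_accept(header: str) -> list:
    """Return (media_type, q) pairs, highest q first, ties kept in header order."""
    preferences = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0]
        q = 1.0  # default quality value when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        preferences.append((media_type, q))
    # sorted() is stable, so equal q values keep their original order
    return sorted(preferences, key=lambda pref: pref[1], reverse=True)


accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
for media_type, q in parse_accept(accept):
    print(media_type, q)
```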
15. Web of Data 101 - HTTP
Content Negotiation (CN, conneg) is the process of
selecting the best representation for a given
response when there are multiple representations
available
Three types of CN: server-driven, agent-driven, and
transparent
Example
curl -I -H "Accept: application/rdf+xml" http://dbpedia.org/resource/Galway
HTTP/1.1 303 See Other
Content-Type: application/rdf+xml
Location: http://dbpedia.org/data/Galway.rdf
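The curl call above can be approximated in Python. This sketch only builds the request object and does not send it, so it runs without network access. The helper name `conneg_request` is an assumption made for the example.

```python
from urllib.request import Request


def conneg_request(uri: str, media_type: str) -> Request:
    """Build a HEAD request (like curl -I) asking for one specific media type."""
    return Request(uri, headers={"Accept": media_type}, method="HEAD")


req = conneg_request("http://dbpedia.org/resource/Galway", "application/rdf+xml")
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))

# Sending it needs network access, for example:
#   from urllib.request import urlopen
#   response = urlopen(req)
# Note that urlopen follows the 303 redirect on its own, so the 303 response
# itself is not visible this way, unlike in the curl output.
```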
16. Web of Data 101 - HTTP
Caching (see the Cache-Control header field) is
essential for scalability
HTTPbis [HTTPbis], an IETF WG chaired by Mark
Nottingham, is mainly about patches, clarifications,
deprecation of unused features, and documentation of
security properties
17. Web of Data 101 - HTTP
Representational State Transfer [REST]
Data element              Modern Web examples
resource                  the intended conceptual target of a hypertext reference
resource identifier       URL, URN
representation            HTML document, JPEG image
representation metadata   media type, last-modified time
resource metadata         source link, alternates, vary
control data              if-modified-since, cache-control
18. Web of Data 101 - RDF
According to [RDF AS], RDF is a data model: a directed, labeled
graph based on URIs
Triple: (subject predicate object)
subject … URIref or bNode
predicate … URIref
object … URIref or bNode or literal
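The constraints on the three positions of a triple can be sketched as plain Python data classes. This is an illustration of the data model only, not the API of any RDF library. The class names `URIRef`, `BNode`, `Literal`, and `Triple` are chosen for this example, and literals are reduced to a bare lexical form.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URIRef:
    value: str


@dataclass(frozen=True)
class BNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str


@dataclass(frozen=True)
class Triple:
    subject: Union[URIRef, BNode]
    predicate: URIRef
    obj: Union[URIRef, BNode, Literal]

    def __post_init__(self):
        # Enforce which kinds of terms are allowed in each position.
        if not isinstance(self.subject, (URIRef, BNode)):
            raise TypeError("subject must be a URIref or a bNode")
        if not isinstance(self.predicate, URIRef):
            raise TypeError("predicate must be a URIref")
        if not isinstance(self.obj, (URIRef, BNode, Literal)):
            raise TypeError("object must be a URIref, a bNode, or a literal")


triple = Triple(
    URIRef("http://dbpedia.org/resource/Galway"),
    URIRef("http://www.w3.org/2000/01/rdf-schema#label"),
    Literal("Galway"),
)
print(triple)
```

A literal in the subject or predicate position is rejected with a `TypeError`, which mirrors the restrictions listed above.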
20. Web of Data 101 - Overview
Web's Standard Retrieval Algorithm as of [SDD]:
1. parse URI and find HTTP protocol
2. look up DNS name to determine the
associated IP address
3. open a TCP stream to port 80 at the IP
address determined above
4. format an HTTP GET request for the resource
and send it to the server
5. read response from the server
6. from the status code (200) determine that a
representation of the resource is available
7. inspect the returned Content-Type
8. pass the entity-body to its HTML rendering
engine
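The eight steps above can be sketched with Python's standard library. The function names `plan_retrieval` and `retrieve` are made up for this illustration, the sketch handles only the `http` scheme, and `retrieve` needs network access, so it is defined here but not called.

```python
import http.client
import socket
from urllib.parse import urlsplit


def plan_retrieval(uri: str):
    """Step 1: parse the URI and work out host, port, and request target."""
    parts = urlsplit(uri)
    if parts.scheme != "http":
        raise ValueError("only the http scheme is handled in this sketch")
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # The fragment is never sent to the server.
    return parts.hostname, parts.port or 80, target


def retrieve(uri: str):
    """Steps 2 to 8; returns (content_type, body) for a 200 response."""
    host, port, target = plan_retrieval(uri)                 # step 1
    ip_address = socket.gethostbyname(host)                  # step 2: DNS lookup
    conn = http.client.HTTPConnection(ip_address, port)      # step 3: TCP target
    host_header = host if port == 80 else f"{host}:{port}"
    try:
        conn.request("GET", target, headers={"Host": host_header})  # step 4
        response = conn.getresponse()                        # step 5
        if response.status != 200:                           # step 6
            raise RuntimeError(f"no representation available: {response.status}")
        content_type = response.getheader("Content-Type")    # step 7
        body = response.read()                               # step 8: hand over the entity-body
        return content_type, body
    finally:
        conn.close()


print(plan_retrieval("http://dbpedia.org/resource/Galway"))
```

A browser would pass the returned body to its HTML rendering engine. Here it is simply returned to the caller together with the content type.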
24. Schedule
Linked Data Principles – 15%
Web of Data 101 (URI, HTTP, RDF) – 40%
Linking Open Data community project – 15%
Tools and Applications – 30%
25. Linking Open Data Project
Community project with W3C support started in
early 2007 [LOD]
Idea: take existing (open) data sets and make
them available on the Web in RDF
Interlink them with other data sets
Kudos to Tom Heath and Richard Cyganiak;
the material in this section is heavily based
on their work.
26. Linking Open Data Project
May 2007
27. Linking Open Data Project
Feb 2009
28. Linking Open Data Project
DBpedia
29. Linking Open Data Project
Geonames
30. Schedule
Linked Data Principles – 15%
Web of Data 101 (URI, HTTP, RDF) – 40%
Linking Open Data community project – 15%
Tools and Applications – 30%
31. Tools and Applications
The Linking Open Data homepage [LOD] lists
Browsing with Tabulator, VisiNav, DBpedia Mobile, etc.
Searching with Sindice, SWSE, Falcons, etc.
Mashups, e.g. Revyu, BBC Music, DERI Pipes
See further
http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData/Applications
34. Tools and Applications
Virtuoso, including an RDF triple store, SPARQL access
to data, and an open source edition
http://virtuoso.openlinksw.com/
Talis Platform, SaaS, Cloud-based storage for RDF
data and binary objects, SPARQL access, REST APIs
http://www.talis.com/platform
ARC (PHP) http://arc.semsol.org/
Jena (Java) http://jena.sourceforge.net/
For a summary, see
http://www.semanticscripting.org/SFSW2005/SFSW-Toolkits.pdf
35. Tools and Applications
Frameworks
LOOMP http://www.loomp.org/
Silk http://www4.wiwiss.fu-berlin.de/bizer/silk/
SQUIN http://squin.sourceforge.net/
Paget http://code.google.com/p/paget
Other Tools (debug, etc.)
curl
Live HTTP Headers (Firefox plug-in)
http://sparql.org/sparql.html
https://wiki.mozilla.org/Labs/Ubiquity/Commands_In_The_Wild (Web of Data, etc.)
36. Tools and Applications
Further resources regarding the publishing process
http://linkeddata.org/docs/how-to-publish
http://events.linkeddata.org/iswc2008tutorial/
http://videolectures.net/iswc08_heath_hpldw/
http://www.w3.org/TR/swbp-vocab-pub/
http://vapour.sourceforge.net/
37. References
[EXPL] … ‘Exploiting Linked Data For Building Web Applications’, Hausenblas,
2009, accepted for publication in IEEE IC
(pre-print version: http://sw-app.org/pub/exploit-lod-webapps-IEEEIC-preprint.pdf)
[LD] … http://www.w3.org/DesignIssues/LinkedData.html
[UA] … http://esw.w3.org/topic/FindingResourceDescriptions
[RFC3986] … http://www.ietf.org/rfc/rfc3986.txt
[RDF AS] … http://www.w3.org/TR/rdf-concepts/#section-Graph-URIref
[XML NS] … http://www.w3.org/TR/xml-names/#ns-qualnames
[CURIE] … http://www.w3.org/TR/curie/
[RFC4395] … http://tools.ietf.org/html/rfc4395
[RFC2616] … http://www.ietf.org/rfc/rfc2616.txt
[HTTPbis] … http://tools.ietf.org/wg/httpbis/
[REST] … ‘Principled design of the modern Web architecture’, Fielding and Taylor,
2002, http://portal.acm.org/citation.cfm?doid=514183.514185
[SDD] … http://www.w3.org/2001/tag/doc/selfDescribingDocuments
[AWWSW] … http://esw.w3.org/topic/AwwswHome
[LOD] … http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData