2. Linked Data Fragments
Triple Pattern Fragments
Federated querying
Querying Federations
of Triple Pattern Fragments
4. A whole spectrum of trade-offs
exists between the two extremes.
Interface offered by the server: data dump, Linked Data documents, SPARQL endpoint
Data dump extreme: low server cost, high client cost, high availability, high bandwidth, out-of-date data
SPARQL endpoint extreme: high server cost, low client cost, low availability, low bandwidth, live data
5. All RDF interfaces offer fragments
with the following characteristics.
data: What triples does it contain?
metadata: What do we know about it?
controls: How to access more data?
6. Each type of Linked Data Fragment
is defined by three characteristics.
data dump
data: all dataset triples
metadata: number of triples, file size
controls: (none)
7. Each type of Linked Data Fragment
is defined by three characteristics.
SPARQL query result
data: triples matching the query
metadata: (none)
controls: (none)
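The three characteristics can be written down as a small data structure. The following is a minimal sketch, not part of any Linked Data Fragments library: the `Fragment` class, its field names, and the `ex:` toy triples are assumptions made for illustration only.

```python
from dataclasses import dataclass, field

Triple = tuple[str, str, str]


@dataclass
class Fragment:
    """Hypothetical model of one Linked Data Fragment."""
    data: list[Triple]                                  # what triples does it contain?
    metadata: dict[str, object] = field(default_factory=dict)  # what do we know about it?
    controls: list[str] = field(default_factory=list)   # how to access more data?


# Toy dataset (invented for this sketch).
dataset = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:carol"),
]

# A data dump: all dataset triples, some metadata, no controls.
data_dump = Fragment(data=list(dataset), metadata={"triple_count": len(dataset)})

# A SPARQL query result: only the matching triples, no metadata, no controls.
sparql_result = Fragment(data=[t for t in dataset if t[0] == "ex:alice"])
```

Both extremes fit the same shape; they differ only in how much the server selects and how little it tells the client about the rest of the dataset.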
9. We design new mixes of trade-offs
with much lower server-side cost.
Spectrum of interfaces: data dump, Linked Data documents, SPARQL query results
Data dump extreme: low server cost, high client cost, high availability, high bandwidth, out-of-date data
SPARQL query results extreme: high server cost, low client cost, low availability, low bandwidth, live data
10. A Triple Pattern Fragments interface
is low-cost and enables clients to query.
Spectrum of interfaces: data dump, Linked Data documents, Triple Pattern Fragments, SPARQL query results
Triple Pattern Fragments: low server cost, high availability, live data
11. A Triple Pattern Fragments interface
is low-cost and enables clients to query.
data: matches of a triple pattern (paged)
metadata: total number of matches
controls: access to all other fragments
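To make the three parts of a Triple Pattern Fragment concrete, here is a minimal in-memory sketch of what a server might compute for one request. The function name `triple_pattern_fragment`, the dictionary layout, and the page size of 100 are assumptions for illustration; the controls are simplified to a next-page number rather than hypermedia links.

```python
Triple = tuple[str, str, str]


def triple_pattern_fragment(dataset, subject=None, predicate=None, obj=None,
                            page=1, page_size=100):
    """Return one page of the triples matching a triple pattern.

    A component set to None acts as a wildcard (a variable in the pattern).
    """
    matches = [
        t for t in dataset
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    ]
    start = (page - 1) * page_size
    has_next = start + page_size < len(matches)
    return {
        "data": matches[start:start + page_size],        # matches of the pattern (paged)
        "metadata": {"total_matches": len(matches)},      # total number of matches
        "controls": {"next_page": page + 1 if has_next else None},
    }


# Toy dataset with 250 triples sharing one predicate (invented for this sketch).
toy_dataset = [(f"ex:s{i}", "ex:p", "ex:o") for i in range(250)]
first_page = triple_pattern_fragment(toy_dataset, predicate="ex:p")
last_page = triple_pattern_fragment(toy_dataset, predicate="ex:p", page=3)
```

The server only filters and pages; joins, filters, and ordering are left to the client, which is what keeps the server-side cost low.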
13. How can intelligent clients
solve SPARQL queries over fragments?
Give them a SPARQL query.
Give them a URL of any dataset fragment.
They look inside the fragment
to see how to access the dataset
and use the metadata
to decide how to plan the query.
14. Let’s follow the execution
of an example SPARQL query.
Find names of artists born in Padua, Italy.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT ?artist ?name WHERE {
  ?artist a dbpedia-owl:Artist;
          rdfs:label ?name;
          dbpedia-owl:birthPlace dbpedia:Padua.
  FILTER LANGMATCHES(LANG(?name), "EN")
}
Fragment: http://fragments.dbpedia.org/2014/en
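The FILTER in this query is evaluated by the client, not by the fragments server. As a rough sketch of what that check involves, the helper below mimics the basic language-range matching that SPARQL's LANGMATCHES performs; the function name `lang_matches` is an assumption for this example and does not come from any particular client library.

```python
def lang_matches(tag: str, lang_range: str) -> bool:
    """Basic language-range filtering, as used by SPARQL's LANGMATCHES.

    The range "*" matches any non-empty language tag; otherwise the tag must
    equal the range or start with the range followed by "-", ignoring case.
    """
    if lang_range == "*":
        return tag != ""
    tag, lang_range = tag.lower(), lang_range.lower()
    return tag == lang_range or tag.startswith(lang_range + "-")


# Labels with their language tags (toy values invented for this sketch).
labels = [("Padua", "en"), ("Padova", "it"), ("Padua", "en-GB")]
english_labels = [label for label, tag in labels if lang_matches(tag, "EN")]
```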
15. The client looks inside the fragment
to see how to access the dataset.
<http://fragments.dbpedia.org/2014/en#dataset> hydra:search [
  hydra:template "http://fragments.dbpedia.org/2014/en{?subject,predicate,object}";
  hydra:mapping
    [ hydra:variable "subject"; hydra:property rdf:subject ],
    [ hydra:variable "predicate"; hydra:property rdf:predicate ],
    [ hydra:variable "object"; hydra:property rdf:object ]
].
Fragment: http://fragments.dbpedia.org/2014/en
“I can query the dataset by triple pattern.”
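Once the client has read this form, it can build the URL of any fragment by filling in the template. The sketch below handles only the `{?subject,predicate,object}` query expansion used here and is not a general URI Template implementation; the function name `fragment_url` is an assumption for this example.

```python
from urllib.parse import quote, urlencode


def fragment_url(base, subject=None, predicate=None, obj=None):
    """Expand the {?subject,predicate,object} template for one triple pattern.

    Components left as None (the variables of the pattern) are omitted,
    mirroring how undefined variables are skipped in a URI Template.
    """
    params = [
        (name, value)
        for name, value in (("subject", subject), ("predicate", predicate), ("object", obj))
        if value is not None
    ]
    if not params:
        return base
    # Percent-encode everything except unreserved characters.
    return base + "?" + urlencode(params, safe="", quote_via=quote)


url = fragment_url(
    "http://fragments.dbpedia.org/2014/en",
    predicate="http://dbpedia.org/ontology/birthPlace",
    obj="http://dbpedia.org/resource/Padua",
)
```

The resulting URL identifies the fragment for the pattern `?artist dbpedia-owl:birthPlace dbpedia:Padua`, with the subject left open.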
16. The client splits the query
into the available fragments.
SELECT ?artist ?name WHERE {
  ?artist a dbpedia-owl:Artist;
          rdfs:label ?name;
          dbpedia-owl:birthPlace dbpedia:Padua.
  FILTER LANGMATCHES(LANG(?name), "EN")
}
17. The client gets the fragments
and inspects their metadata.
?artist a dbpedia-owl:Artist. (first 100 triples; 96,000 matches in total)
?artist rdfs:label ?name. (first 100 triples; 12,000,000 matches in total)
?artist dbpedia-owl:birthPlace dbpedia:Padua. (first 100 triples; 135 matches in total)
18. The metadata enables the client
to choose the right starting point.
?artist a dbpedia-owl:Artist. (96,000 matches)
?artist rdfs:label ?name. (12,000,000 matches)
?artist dbpedia-owl:birthPlace dbpedia:Padua. (135 matches)
  dbpedia:Alberto_Benettin dbpedia-owl:birthPlace dbpedia:Padua.
    dbpedia:Alberto_Benettin a dbpedia-owl:Artist.
    dbpedia:Alberto_Benettin rdfs:label ?name.
  dbpedia:Alberto_Bigon dbpedia-owl:birthPlace dbpedia:Padua.
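The strategy on this slide (start from the pattern with the fewest matches, then substitute each result into the remaining patterns) can be sketched as a small recursive procedure. This is a simplified illustration over an in-memory list, not the actual client algorithm: `match` stands in for fetching a fragment and reading its count, all names and the `ex:` toy data are invented, and patterns that repeat a variable are not handled.

```python
def is_var(term):
    return term.startswith("?")


def match(dataset, pattern):
    """Stand-in for fetching a fragment: all triples matching the pattern."""
    return [t for t in dataset
            if all(is_var(p) or p == v for p, v in zip(pattern, t))]


def bind(pattern, bindings):
    """Replace already-bound variables in a pattern by their values."""
    return tuple(bindings.get(term, term) for term in pattern)


def solve(dataset, patterns, bindings=None):
    """Yield variable bindings that satisfy all triple patterns."""
    bindings = bindings or {}
    if not patterns:
        yield dict(bindings)
        return
    bound = [bind(p, bindings) for p in patterns]
    # The count metadata tells the client which pattern is cheapest to start with.
    counts = [len(match(dataset, p)) for p in bound]
    i = counts.index(min(counts))
    rest = patterns[:i] + patterns[i + 1:]
    for triple in match(dataset, bound[i]):
        extended = dict(bindings)
        extended.update({term: value
                         for term, value in zip(bound[i], triple) if is_var(term)})
        yield from solve(dataset, rest, extended)


# Toy data (invented): only ex:a1 is an artist born in ex:Padua.
toy_dataset = [
    ("ex:a1", "a", "ex:Artist"), ("ex:a1", "rdfs:label", "Artist One"),
    ("ex:a1", "ex:birthPlace", "ex:Padua"),
    ("ex:a2", "a", "ex:Artist"), ("ex:a2", "rdfs:label", "Artist Two"),
    ("ex:a2", "ex:birthPlace", "ex:Rome"),
    ("ex:a3", "a", "ex:Artist"), ("ex:a3", "rdfs:label", "Artist Three"),
    ("ex:p1", "rdfs:label", "Person One"), ("ex:p1", "ex:birthPlace", "ex:Padua"),
]
query = [
    ("?artist", "a", "ex:Artist"),
    ("?artist", "rdfs:label", "?name"),
    ("?artist", "ex:birthPlace", "ex:Padua"),
]
results = list(solve(toy_dataset, query))
```

In this toy run the birthplace pattern has the fewest matches, so it is evaluated first, and the two larger patterns are only ever requested with `?artist` already filled in.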
19. Clients execute the query in 3 seconds
on a highly available, low-cost server.
SELECT ?artist ?name WHERE {
  ?artist a dbpedia-owl:Artist;
          rdfs:label ?name;
          dbpedia-owl:birthPlace dbpedia:Padua.
  FILTER LANGMATCHES(LANG(?name), "EN")
}
Try it yourself:
bit.ly/artistspadua
20. The query throughput is lower,
but resilient to high client numbers.
[Fig. 3.1: Server performance (log-log plot), from “Querying Datasets on the Web with High Availability”. x-axis: clients (1 to 100); y-axis: throughput in executed SPARQL queries per hour (10 to 10,000). Series: Virtuoso 6, Virtuoso 7, Fuseki–tdb, Fuseki–hdt, triple pattern fragments.]
21. The server traffic is higher,
but requests are significantly lighter.
[Fig. 3.2: Server network traffic, from “Querying Datasets on the Web with High Availability”. x-axis: clients (1 to 100); y-axis: data sent by server in MB. Series: Virtuoso 6, Virtuoso 7, Fuseki–tdb, Fuseki–hdt, triple pattern fragments.]
22. Caching is significantly more effective,
as clients reuse fragments for queries.
[Fig. 3.2: Server network traffic. x-axis: clients (1 to 100); y-axis: data sent by server in MB.]
[Fig. 3.4: Cache network traffic. x-axis: clients (1 to 100); y-axis: data sent by cache in MB.]
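Why fragments cache well can be illustrated with a toy cache keyed on the fragment URL: different queries often need the same fragments, so later requests never reach the server. The `FragmentCache` class and the example URLs below are assumptions made for this sketch; in practice this role would be played by a regular HTTP cache.

```python
class FragmentCache:
    """Toy shared cache in front of a fragments server, keyed on fragment URL."""

    def __init__(self, fetch):
        self._fetch = fetch          # function that performs the real request
        self._store = {}
        self.server_requests = 0     # requests that reached the server
        self.hits = 0                # requests answered by the cache

    def get(self, url):
        if url in self._store:
            self.hits += 1
        else:
            self.server_requests += 1
            self._store[url] = self._fetch(url)
        return self._store[url]


def fake_fetch(url):
    # Stand-in for an HTTP GET; returns a placeholder body.
    return f"fragment for {url}"


cache = FragmentCache(fake_fetch)
# Two different queries that happen to need one fragment in common.
query_1 = ["http://example.org/f?predicate=ex%3AbirthPlace", "http://example.org/f?predicate=rdfs%3Alabel"]
query_2 = ["http://example.org/f?predicate=rdfs%3Alabel", "http://example.org/f?predicate=ex%3Agenre"]
for url in query_1 + query_2:
    cache.get(url)
```

A full SPARQL query result can only be reused by a client asking exactly the same query, whereas a fragment for a single triple pattern is shared by every query that contains that pattern.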
23. The server uses much less CPU,
allowing for higher availability.
[Fig. 3.3: Query timeouts. x-axis: clients (1 to 100); y-axis: number of timeouts.]
[Fig. 3.5: Server processor usage per core. x-axis: clients (1 to 100); y-axis: server CPU usage per core in %.]
25. Federated querying is native
to Triple Pattern Fragment clients.
Every query is decomposed locally.
Clients send simple requests to a server.
For clients, it doesn’t matter
which server they send queries to.
26. For federation, we just send queries
to multiple servers.
No prior source selection.
Each triple pattern is sent to all servers.
If a server has no results for a pattern,
just don’t send it more specific patterns.
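The rule above can be sketched as follows: every pattern goes to all candidate servers, and a server that returned nothing for a pattern is dropped before more specific versions of that pattern are sent. This is an illustrative in-memory sketch, not the actual client code; the server names, toy triples, and the `requests_sent` log are invented for the example.

```python
def is_var(term):
    return term.startswith("?")


def match(dataset, pattern):
    """Stand-in for requesting one fragment from one server."""
    return [t for t in dataset
            if all(is_var(p) or p == v for p, v in zip(pattern, t))]


requests_sent = []  # log of (server, pattern) pairs, to make the pruning visible


def federated_match(servers, pattern, candidates=None):
    """Send a pattern to all candidate servers.

    Returns the combined triples and the set of servers that had results,
    which can be passed as `candidates` for more specific patterns.
    """
    candidates = list(servers) if candidates is None else candidates
    triples, non_empty = [], set()
    for name in candidates:
        requests_sent.append((name, pattern))
        found = match(servers[name], pattern)
        if found:
            triples.extend(found)
            non_empty.add(name)
    return triples, non_empty


servers = {
    "A": [("ex:a1", "ex:birthPlace", "ex:Padua"), ("ex:a2", "ex:birthPlace", "ex:Rome")],
    "B": [("ex:a1", "rdfs:label", "Artist One")],
}
general = ("?s", "ex:birthPlace", "?o")
specific = ("ex:a1", "ex:birthPlace", "?o")

_, servers_with_results = federated_match(servers, general)
specific_triples, _ = federated_match(servers, specific, servers_with_results)
```

No source selection happens up front: server B is asked once for the general pattern, answers with an empty fragment, and is then skipped for the more specific one.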
27. Federation compares pretty well
to SPARQL endpoint federation.
[Table: recall per FedBench query (LD and LS queries) for TPF, ANAPSID, ANAPSID EG, FedX (warm), and SPLENDID.]
28. Federation compares pretty well
to SPARQL endpoint federation.
[Table: recall per FedBench query (LD, LS, and CD queries, plus complex queries C) for TPF, ANAPSID, ANAPSID EG, FedX (warm), and SPLENDID.]
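For reference, recall here means the fraction of expected results that a system actually returned. A minimal sketch of that computation, with an invented function name and toy values:

```python
def recall(returned, expected):
    """Fraction of the expected results that were actually returned."""
    expected = set(expected)
    if not expected:
        return 1.0  # nothing was expected, so nothing is missing
    return len(expected & set(returned)) / len(expected)


# Toy example: 2 of the 4 expected results were found.
example = recall(returned=["r1", "r2"], expected=["r1", "r2", "r3", "r4"])
```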
29. Federation compares pretty well,
even time-wise in some cases.
[Figure: Evaluation times of FedBench query execution on the TPF client/server setup compared to SPARQL endpoint federation systems. x-axis: FedBench queries (LD, CD, LS, C); y-axis: execution time in s (50 to 300). Series: TPF, ANAPSID, ANAPSID EG, FedX, SPLENDID.]
30. Federation compares pretty well,
even time-wise in some cases.
[Figure: Evaluation times of FedBench query execution on the TPF client/server setup compared to SPARQL endpoint federation systems, with the complex queries (C) highlighted. These measurements should be considered together with the recall for each query. The TPF-related measurements were performed in the context of this article; the numbers for the four SPARQL endpoint federation … x-axis: FedBench queries (LD, CD, LS, C); y-axis: execution time in s (0 to 300). Series: TPF, ANAPSID, ANAPSID EG, FedX, SPLENDID.]
31. Note the different setup
in the previous comparisons.
SPARQL endpoint federation
was measured with local servers.
Triple Pattern Fragments federation
was measured over the Web.
33. Triple Pattern Fragments are easy:
all software is available as open source.
Software: github.com/LinkedDataFragments
Documentation and specification: linkeddatafragments.org
34. More than 650,000 TPF interfaces
are available for federated querying.
fragments.dbpedia.org
lodlaundromat.org/wardrobe/
data.linkeddatafragments.org