ELIS – Multimedia Lab
Reducing HTTP traffic for
scalable linked data consumption
Query Execution Optimization for
Clients of Triple Pattern Fragments
Joachim Van Herwegen, Ruben Verborgh, Erik Mannens, Rik Van De Walle
2
SPARQL endpoints, data dumps, simple interfaces, …
Still looking for the ultimate linked data solution
Full SPARQL support
High scalability
Fast response time
Low server & client load
…
Not found yet, so we focused on improving the response time
for clients using simple interfaces (Triple Pattern Fragments).
Accessing linked data
3
Accessing Linked Data
Problem statement
Improved join tree
Optimizing local joins
Bringing it all together
Query Execution Optimization for
Clients of Triple Pattern Fragments
5
Linked Data access extremes
SPARQL protocol
Live data
Full SPARQL support
High server load
Data dump
Static data
Remote: 1 query
Local: full queries
High client load
6
Generic way to describe how linked data can be accessed
Data: Results when accessing a selector
Metadata: Description of the fragment
Controls: Links to other fragments
Verborgh et al. – Web-scale querying through Linked Data Fragments
Linked Data Fragments
7
Accessing data through a SPARQL endpoint
Data: Bindings matching a SPARQL query
Metadata: { } (data contains everything needed)
Controls: { } (interface can answer everything)
SPARQL endpoint
8
Accessing data through Triple Pattern Fragments
Data: Triples matching a triple pattern
Metadata: Count estimate, page size, etc.
Controls: First page, next page, root fragment
Triple Pattern Fragments
9
Triple Pattern Fragments
URI
query
results
metadata/controls
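As a rough illustration of how a client might address one fragment, the sketch below builds a request URL for a triple pattern. The parameter names `subject`, `predicate`, `object` and `page`, as well as the `example.org` URLs, are assumptions made for this example; a real client would discover the URL template from the controls in the fragment itself.

```python
from urllib.parse import urlencode


def fragment_url(base_url, subject=None, predicate=None, obj=None, page=None):
    """Build the URL of a Triple Pattern Fragment (hypothetical parameter names).

    Variables in the pattern are expressed by leaving the component out.
    """
    params = {}
    if subject is not None:
        params["subject"] = subject
    if predicate is not None:
        params["predicate"] = predicate
    if obj is not None:
        params["object"] = obj
    if page is not None:
        params["page"] = page
    return base_url + ("?" + urlencode(params) if params else "")


# Pattern "?person <birthPlace> ?city": only the predicate is fixed.
url = fragment_url("http://example.org/dataset",
                   predicate="http://example.org/birthPlace")
```

The response to such a request would carry the three parts named on the slide: matching triples, metadata such as the count estimate, and controls such as the link to the next page.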
11
SELECT ?person ?city WHERE {
  ?person a db:Architect.                            # 1,200 triples
  ?person db:birthPlace ?city.                       # 430,000 triples
  ?city dc:subject db:Category:Capitals_in_Europe.   # 60 triples
}
Start from the smallest pattern, apply bindings and do recursion
Greedy algorithm
[Figure: greedy join path over the patterns Capitals, birthPlace and architect, annotated with 1, 400 and 40,000]
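The greedy strategy on this slide (start from the smallest pattern, apply its bindings to the remaining patterns, recurse) can be sketched against a toy in-memory store that counts simulated HTTP calls. This is an illustrative sketch, not the authors' client: `CountingStore`, `greedy`, the page size of 100 and the sample triples are all assumptions, and asking for a count is simplified to cost one call.

```python
from math import ceil

PAGE_SIZE = 100  # assumed page size; not stated on the slide


class CountingStore:
    """Toy stand-in for a TPF server that counts simulated page requests."""

    def __init__(self, triples):
        self.triples = list(triples)
        self.calls = 0

    def _match(self, pattern):
        return [t for t in self.triples
                if all(p.startswith("?") or p == v for p, v in zip(pattern, t))]

    def count(self, pattern):
        self.calls += 1  # simplification: one request to read the count estimate
        return len(self._match(pattern))

    def fetch(self, pattern):
        result = self._match(pattern)
        self.calls += max(1, ceil(len(result) / PAGE_SIZE))  # one call per page
        return result


def bind(pattern, binding):
    """Replace bound variables in a pattern by their values."""
    return tuple(binding.get(term, term) for term in pattern)


def greedy(store, patterns, binding=None):
    """Pick the smallest pattern, fetch it, and recurse once per matching triple."""
    binding = binding or {}
    if not patterns:
        return [binding]
    bound = [bind(p, binding) for p in patterns]
    i = min(range(len(bound)), key=lambda k: store.count(bound[k]))
    smallest, rest = bound[i], bound[:i] + bound[i + 1:]
    solutions = []
    for triple in store.fetch(smallest):
        extended = dict(binding)
        extended.update((term, value) for term, value in zip(smallest, triple)
                        if term.startswith("?"))
        solutions.extend(greedy(store, rest, extended))
    return solutions


store = CountingStore([
    ("a1", "type", "Architect"), ("a2", "type", "Architect"),
    ("a1", "birthPlace", "Paris"), ("a2", "birthPlace", "Ghent"),
    ("p3", "birthPlace", "Paris"), ("Paris", "subject", "Capital"),
])
solutions = greedy(store, [("?person", "type", "Architect"),
                           ("?person", "birthPlace", "?city"),
                           ("?city", "subject", "Capital")])
```

Because every intermediate binding triggers its own requests, the number of calls grows with the number of bindings found so far, which is the effect the slide illustrates.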
12
SELECT ?person ?city WHERE {
  ?person a db:Architect.                            # 1,200 triples
  ?person db:birthPlace ?city.                       # 430,000 triples
  ?city dc:subject db:Category:Capitals_in_Europe.   # 60 triples
}
Find optimal solution for every pattern
Optimized algorithm
[Figure: optimized join tree over the patterns Capitals, birthPlace and architect, annotated with 1, 400, 12 and "local"]
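A back-of-the-envelope check of the two figures, under two assumptions that the slides do not state explicitly: the numbers in the figures are HTTP call counts, and the server returns 100 triples per page. Under that reading, downloading the architect pattern in full costs 12 calls, compared with the 40,000 per-binding lookups of the greedy plan.

```python
from math import ceil

PAGE_SIZE = 100  # assumed page size


def pages(triple_count, page_size=PAGE_SIZE):
    """Number of page requests needed to download a pattern completely."""
    return ceil(triple_count / page_size)


capitals_calls = pages(60)         # Capitals pattern fits on one page
architect_download = pages(1200)   # downloading all architects
birthplace_calls = 400             # taken from the figures, not derived here
architect_lookups = 40_000         # taken from the greedy figure, not derived here

greedy_total = capitals_calls + birthplace_calls + architect_lookups
optimized_total = capitals_calls + birthplace_calls + architect_download
```

With these assumptions the greedy plan needs 40,401 calls and the optimized plan 413, with the final join done locally.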
14
Goal
Minimize the number of HTTP calls required to solve a BGP query
Solution
Two possible roles for every pattern in the query:
Download the pattern completely
or
Bind a variable and download the resulting patterns
Estimate the best option for every pattern
Optimized algorithm
15
?player :team ?club 365,000 triples
?club :type :SoccerClub 16,000 triples
?club :ground ?city 15,000 triples
?city :country :Spain 7,000 triples
?player :birthPlace ?city 430,000 triples
Always download the smallest pattern
Determine the roles of the others based on shared variables and results so far
Roles can change at runtime
Extended example
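To make the starting point of the extended example concrete, the sketch below picks the smallest pattern and lists which other patterns share a variable with it, since those are the candidates it could supply bindings to. The helper names `variables` and `shared_variable_graph` are made up for this illustration; only the patterns and counts come from the slide.

```python
patterns = {
    ("?player", ":team", "?club"): 365_000,
    ("?club", ":type", ":SoccerClub"): 16_000,
    ("?club", ":ground", "?city"): 15_000,
    ("?city", ":country", ":Spain"): 7_000,
    ("?player", ":birthPlace", "?city"): 430_000,
}


def variables(pattern):
    """Set of variables (terms starting with '?') in a triple pattern."""
    return {term for term in pattern if term.startswith("?")}


def shared_variable_graph(patterns):
    """Map every pattern to the patterns it shares at least one variable with."""
    graph = {p: {} for p in patterns}
    for a in patterns:
        for b in patterns:
            if a != b:
                common = variables(a) & variables(b)
                if common:
                    graph[a][b] = common
    return graph


graph = shared_variable_graph(patterns)
smallest = min(patterns, key=patterns.get)  # always downloaded first
candidates = graph[smallest]                # patterns it can supply bindings to
```

Here the smallest pattern is `?city :country :Spain`, and it shares `?city` with the `:ground` and `:birthPlace` patterns.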
16
Extended example
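[Figure: graph of the five triple patterns (?city :country :Spain, ?club :ground ?city, ?player :birthPlace ?city, ?player :team ?club, ?club :type :SoccerClub). Edges are labeled with the shared variables ?city, ?club and ?player; the legend distinguishes "supplies ?city" from "supplied by ?city".]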
17
First iteration
[Figure: the same graph of five triple patterns and shared variables, as it stands in the first iteration.]
18
Further iterations
[Figure: the same graph of five triple patterns and shared variables, as it stands in further iterations.]
Making sure no pattern is ignored
19
Estimate which option requires least HTTP calls.
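Download:
\[ \frac{\#\text{triples}}{\text{pagesize}} \]
Bind:
\[ \underbrace{\frac{\text{avg triples/binding}}{\text{pagesize}}}_{\text{avg pages per binding}} \cdot \max_{\text{all suppliers}} \left( \underbrace{\frac{\text{bindings found}}{\text{triples downloaded}}}_{\text{avg bindings per triple}} \cdot \#\text{triples} \right) \]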
Swap when necessary, taking into account work done so far
Updating pattern roles
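The two estimates on this slide translate directly into code. The sketch below follows the slide's formulas term by term; the function names, the tuple layout for suppliers, and the sample numbers (other than the triple counts from the extended example) are assumptions made for illustration, and the slide does not say how ties or rounding are handled.

```python
def download_cost(triple_count, page_size):
    """Estimated HTTP calls to download a pattern completely."""
    return triple_count / page_size


def bind_cost(avg_triples_per_binding, page_size, suppliers):
    """Estimated HTTP calls when binding a variable from supplier patterns.

    suppliers: iterable of (bindings_found, triples_downloaded, triple_count)
    tuples, one per supplier; triples_downloaded must be greater than zero.
    """
    pages_per_binding = avg_triples_per_binding / page_size
    expected_bindings = max(found / downloaded * total
                            for found, downloaded, total in suppliers)
    return pages_per_binding * expected_bindings


def choose_role(triple_count, avg_triples_per_binding, page_size, suppliers):
    """Pick the role with the lower estimate (ties go to 'download' here)."""
    download = download_cost(triple_count, page_size)
    bound = bind_cost(avg_triples_per_binding, page_size, suppliers)
    return "download" if download <= bound else "bind"


# ?player :birthPlace ?city (430,000 triples), supplied by ?city :country :Spain
# (7,000 triples). The 100/100 progress and 50 triples per binding are made up.
role = choose_role(430_000, 50, 100, [(100, 100, 7_000)])
```

Re-running the estimate as more triples arrive is what allows roles to be swapped during execution.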
21
Most join data from previous iterations can be reused.
Challenge: reuse as much data as possible.
Local joining
[Figure: triple data at iteration i and at iteration i+1, with the new triples highlighted]
22
Join tree step
[Figure: one join tree step in iteration i and in iteration i + 1. Bindings i-1 and Triples i are joined into Bindings i; in iteration i + 1 the parts that changed are marked "New".]
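One common way to reuse earlier join results is a delta join: only combinations that involve at least one new input are computed, and the old output is kept as is. The sketch below shows that general technique for a single join step; it is not taken from the authors' implementation, and all names and sample data are hypothetical.

```python
def extend(binding, pattern, triple):
    """Extend a binding with a triple matched against a pattern, or return None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if result.setdefault(term, value) != value:
                return None  # variable already bound to a different value
        elif term != value:
            return None      # constant term does not match
    return result


def join(bindings, pattern, triples):
    """Nested-loop join of bindings with the triples of one pattern."""
    out = []
    for binding in bindings:
        for triple in triples:
            extended = extend(binding, pattern, triple)
            if extended is not None:
                out.append(extended)
    return out


def incremental_join(old_bindings, new_bindings, pattern,
                     old_triples, new_triples, old_result):
    """Reuse old_result; only join combinations that involve new input."""
    delta = join(new_bindings, pattern, old_triples + new_triples)
    delta += join(old_bindings, pattern, new_triples)
    return old_result + delta


pattern = ("?club", ":ground", "?city")
old_bindings, new_bindings = [{"?city": "Madrid"}], [{"?city": "Seville"}]
old_triples = [("Real", ":ground", "Madrid")]
new_triples = [("Betis", ":ground", "Seville"), ("Atletico", ":ground", "Madrid")]

old_result = join(old_bindings, pattern, old_triples)
updated = incremental_join(old_bindings, new_bindings, pattern,
                           old_triples, new_triples, old_result)
full = join(old_bindings + new_bindings, pattern, old_triples + new_triples)
```

The incremental result contains the same bindings as a full recomputation, but the pairs already joined in iteration i are not joined again.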
23
Start with the largest unchanged, connected set of patterns.
Estimate remainder of join order based on pattern size and
connectivity.
Minimizing joins
[Figure: join tree in which the inputs Bindings i-1 and Triples i are marked "New"; the new triples cause changes that are propagated through the tree.]
25
Prevent local optima
Join tree instead of join path
Reuse local join data
Summary
26
Single machine
Intel Core i5-3230M CPU @ 2.60GHz
8 GB RAM
Both client and server
Artificial delay of 100 ms on the server to simulate network delay
Test setup
27
WatDiv benchmark queries, 100 ms delay on server
[Figure: two charts, median number of HTTP calls and median time (s)]
Results
28
Fewer HTTP calls with more client-side processing
Ideal for slow connection situations
Still room for improvement
No parallelism
Focus on BGPs
More work per HTTP call
Not guaranteed to be better
Conclusion
29
Thank you!
Come see demo #13 on Thursday
Questions?

ESWC2015 - Query Optimization for Clients of Linked Data Fragments
