SlideShare uma empresa Scribd logo
1 de 12
Baixar para ler offline
RDF and Graph benchmarking
www.ldbc.eu
GRAPH-TA
Barcelona
Feb 19, 2013
Graph-TA, Barcelona Feb 19th, 2013
Agenda
- Why benchmarking
- Presentation of LDBC
-

Remarks
Who is who
Project overview: WP and Task Forces
TUC, Technical Users Community

- Benchmarking RDF and GDB
- Common issues
- Open questions
Why benchmarking
• Two main objectives:
– Allow final users to assess the performance of the
software they want to buy
– Push the technology to the limit to allow for
progress

• Main effort in DB benchmarking up to now
– TPC: Transaction Processing Performance Council
• Relational DBs: Transactional and DSS
LDBC
• Objectives:
– Benchmarks for the emerging field of RDF and
Graph database management systems (GDBs)
– Spur industry cooperation around benchmarks
– Create LDBC foundation during 1Q 2013.
– Become a technology push effort, making
improvements measurable
– Become the de-facto research benchmark, usable,
interesting and open to inputs.
Preliminary remarks
• Nature:

– LDBC is different from other EC projects.
– The objective for LDBC is to survive after the end.

• Opportunity:

– Have a benchmarking effort sponsored by EC.
– Focal point for the vendor community.
– Showcase to the user community.

• Collaboration:

– LDBC should lead to great achievements and world
recognition, with the help of all the community:
industry, technologists and users.
Who is involved
•
•
•
•
•
•
•
•

FORTH, research centre, Greece
TUM, research centre, Germany
UIBK, technology centre, Austria
Neo Technologies, Graph management, Sweeden
OGL, RDF management, UK
ONT, RDF management, Bulgaria
VUA, research institution, Netherlands
DAMA-UPC, research institution, Spain
Project Overview: the WPs
Matrix Organization
• EU project reporting activities (WPs)
• LDBC benchmark task force activities (TFs)
WP1

WP2

WP3

WP4

TF1
TF2
TF3
TF4
TF5

(part of) deliverable/
task force artifact
Task forces
• Focusses on specific benchmarking effort (i.e.
transactional, analytical, integration for RDF/GDB)
• Decides on the Use Case to be used
• Procedure:
– Designs and implements data generation (characteristics,
scale, etc.)
– Incorporates the generic methodology
– Designs specific workload
• Choke points specific for the effort
• Incorporating the needs from users
• Incorporating the opinions from industry
Technical User Community (TUC)
• It will be the driving force for LDBC:
– To help understand users needs and decide use cases
– To decide the type of problem/scenarios to be tackled,
i.e. task forces to be deployed
– To provide typical queries placed to RDF and GDBs

• First TUC meeting, Barcelona 19-20 Nov.
– Start with an on-line questionaire:
http://goo.gl/PwGtK
– The outcomes will determine important directions
Common issues
• Use case for RDF and GDBs:

• Social Network Analysis
• Semantic Publishing, specific for RDF (SP)

• Methodology:

• Audited benchmarks
• Specific rules, similar to TPC

•

Workload for RDF
–
–
–
–
–
–

Throughput, concurrency
Traversals
Reasoning
Data updates
Integration: LOD, geonames, etc.
SP: semantic annotation support,
relationship btwn ontologies and
instances, links to other content, text
and metadata.

•

Workload for GDBs

– Throughput, concurrency
– Traversals, shortest paths, pattern
matching, clustering algorithms
– Data updates, transactionality
– Update semantics (serializable/acid vs
delayed commit vs batch)
– Raw traversal speed, use of indexes
Open questions
• Use cases:
•
•
•
•

How realistic would you see synthetic data generation?
Use of real data like twitter, Facebook or Open Ontologies?
Any suggestions for use cases?
Any suggestion for scenarios: analytical, transactional, integration, others?

• Would it make sense to propose open benchmark scenarios?
• The tight rules of TPC:

– Are those against the realism of benchmarks?
– Can we solve this in any flexible way?

• Will RDF and GDB move towards the same technology/solutions?
• GDBs: no standard language. How to proceed?

Mais conteúdo relacionado

Destaque (6)

Ldbc spb 2.0 evolution
Ldbc spb 2.0 evolutionLdbc spb 2.0 evolution
Ldbc spb 2.0 evolution
 
LDBC SNB Benchmark Auditing
LDBC SNB Benchmark AuditingLDBC SNB Benchmark Auditing
LDBC SNB Benchmark Auditing
 
Social Network Benchmark Interactive Workload
Social Network Benchmark Interactive WorkloadSocial Network Benchmark Interactive Workload
Social Network Benchmark Interactive Workload
 
SADI: A design-pattern for “native” Linked-Data Semantic Web Services
SADI: A design-pattern for “native” Linked-Data Semantic Web ServicesSADI: A design-pattern for “native” Linked-Data Semantic Web Services
SADI: A design-pattern for “native” Linked-Data Semantic Web Services
 
Keynote IDEAS2013 - Peter Boncz
Keynote IDEAS2013 - Peter BonczKeynote IDEAS2013 - Peter Boncz
Keynote IDEAS2013 - Peter Boncz
 
E-Commerce and Graph-driven Applications: Experiences and Optimizations while...
E-Commerce and Graph-driven Applications: Experiences and Optimizations while...E-Commerce and Graph-driven Applications: Experiences and Optimizations while...
E-Commerce and Graph-driven Applications: Experiences and Optimizations while...
 

Semelhante a GRAPH-TA 2013 - RDF and Graph benchmarking - Jose Lluis Larriba Pey

Crafting bigdatabenchmarks
Crafting bigdatabenchmarksCrafting bigdatabenchmarks
Crafting bigdatabenchmarks
Tilmann Rabl
 
RM Lesson learned HP - INCOSE
RM Lesson learned HP - INCOSERM Lesson learned HP - INCOSE
RM Lesson learned HP - INCOSE
Shmuel Matis
 

Semelhante a GRAPH-TA 2013 - RDF and Graph benchmarking - Jose Lluis Larriba Pey (20)

Graph-TA 2013 - Josep Lluís Larriba Pey
Graph-TA 2013 - Josep Lluís Larriba PeyGraph-TA 2013 - Josep Lluís Larriba Pey
Graph-TA 2013 - Josep Lluís Larriba Pey
 
AGM 2013 Task Force meetings
AGM 2013 Task Force meetingsAGM 2013 Task Force meetings
AGM 2013 Task Force meetings
 
Overview of Hydrogen TCP, Task 41. Introduce discussion points from the hydro...
Overview of Hydrogen TCP, Task 41. Introduce discussion points from the hydro...Overview of Hydrogen TCP, Task 41. Introduce discussion points from the hydro...
Overview of Hydrogen TCP, Task 41. Introduce discussion points from the hydro...
 
LDBC 8th TUC Meeting: Introduction and status update
LDBC 8th TUC Meeting: Introduction and status updateLDBC 8th TUC Meeting: Introduction and status update
LDBC 8th TUC Meeting: Introduction and status update
 
Connecting DMPs & Repositories
Connecting DMPs & RepositoriesConnecting DMPs & Repositories
Connecting DMPs & Repositories
 
2017 iii 2_robert_tomas_inspire_miwp
2017 iii 2_robert_tomas_inspire_miwp2017 iii 2_robert_tomas_inspire_miwp
2017 iii 2_robert_tomas_inspire_miwp
 
Crafting bigdatabenchmarks
Crafting bigdatabenchmarksCrafting bigdatabenchmarks
Crafting bigdatabenchmarks
 
DMP EUDAT
DMP EUDATDMP EUDAT
DMP EUDAT
 
2012 06-22 - aviac meeting presentation
2012 06-22 - aviac meeting presentation2012 06-22 - aviac meeting presentation
2012 06-22 - aviac meeting presentation
 
Hobbit project overview presented at EBDVF 2017
Hobbit project overview presented at EBDVF 2017Hobbit project overview presented at EBDVF 2017
Hobbit project overview presented at EBDVF 2017
 
EdReNe 2015 - A Norwegian test pilot about tagging learning resources with LRMI
EdReNe 2015 - A Norwegian test pilot about tagging learning resources with LRMIEdReNe 2015 - A Norwegian test pilot about tagging learning resources with LRMI
EdReNe 2015 - A Norwegian test pilot about tagging learning resources with LRMI
 
Team Data Science Process Presentation (TDSP), Aug 29, 2017
Team Data Science Process Presentation (TDSP), Aug 29, 2017Team Data Science Process Presentation (TDSP), Aug 29, 2017
Team Data Science Process Presentation (TDSP), Aug 29, 2017
 
SGIP August 15, 2013 eMeeting - State of the Union
SGIP August 15, 2013 eMeeting - State of the UnionSGIP August 15, 2013 eMeeting - State of the Union
SGIP August 15, 2013 eMeeting - State of the Union
 
RM Lesson learned HP - INCOSE
RM Lesson learned HP - INCOSERM Lesson learned HP - INCOSE
RM Lesson learned HP - INCOSE
 
Benchmarking of distributed linked data streaming systems
Benchmarking of distributed linked data streaming systemsBenchmarking of distributed linked data streaming systems
Benchmarking of distributed linked data streaming systems
 
RDA and serials: Theoretical and practical applications. Preconference
RDA and serials: Theoretical and practical applications. PreconferenceRDA and serials: Theoretical and practical applications. Preconference
RDA and serials: Theoretical and practical applications. Preconference
 
W3C Digital Publishing Interest Group Update
W3C Digital Publishing Interest Group UpdateW3C Digital Publishing Interest Group Update
W3C Digital Publishing Interest Group Update
 
Summary of Day 1
Summary of Day 1Summary of Day 1
Summary of Day 1
 
project planning-estimation
project planning-estimationproject planning-estimation
project planning-estimation
 
IWMW 1998: Deploying new web technologies
IWMW 1998: Deploying new web technologiesIWMW 1998: Deploying new web technologies
IWMW 1998: Deploying new web technologies
 

Mais de Ioan Toma

The LDBC Social Network Benchmark Interactive Workload - SIGMOD 2015
The LDBC Social Network Benchmark Interactive Workload - SIGMOD 2015The LDBC Social Network Benchmark Interactive Workload - SIGMOD 2015
The LDBC Social Network Benchmark Interactive Workload - SIGMOD 2015
Ioan Toma
 
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter BonczFOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
Ioan Toma
 

Mais de Ioan Toma (10)

Parallel and incremental materialisation of RDF/DATALOG in RDFOX
Parallel and incremental materialisation of RDF/DATALOG in RDFOXParallel and incremental materialisation of RDF/DATALOG in RDFOX
Parallel and incremental materialisation of RDF/DATALOG in RDFOX
 
MODAClouds Decision Support System for Cloud Service Selection
MODAClouds Decision Support System for Cloud Service SelectionMODAClouds Decision Support System for Cloud Service Selection
MODAClouds Decision Support System for Cloud Service Selection
 
MarkLogic Overview and Use Cases
MarkLogic Overview and Use CasesMarkLogic Overview and Use Cases
MarkLogic Overview and Use Cases
 
Towards Temporal Graph Management and Analytics
Towards Temporal Graph Management and AnalyticsTowards Temporal Graph Management and Analytics
Towards Temporal Graph Management and Analytics
 
The LDBC Social Network Benchmark Interactive Workload - SIGMOD 2015
The LDBC Social Network Benchmark Interactive Workload - SIGMOD 2015The LDBC Social Network Benchmark Interactive Workload - SIGMOD 2015
The LDBC Social Network Benchmark Interactive Workload - SIGMOD 2015
 
Querying the Wikidata Knowledge Graph
Querying the Wikidata Knowledge GraphQuerying the Wikidata Knowledge Graph
Querying the Wikidata Knowledge Graph
 
Lighthouse: Large-scale graph pattern matching on Giraph
Lighthouse: Large-scale graph pattern matching on GiraphLighthouse: Large-scale graph pattern matching on Giraph
Lighthouse: Large-scale graph pattern matching on Giraph
 
HP Labs: Titan DB on LDBC SNB interactive by Tomer Sagi (HP)
HP Labs: Titan DB on LDBC SNB interactive by Tomer Sagi (HP)HP Labs: Titan DB on LDBC SNB interactive by Tomer Sagi (HP)
HP Labs: Titan DB on LDBC SNB interactive by Tomer Sagi (HP)
 
SPIMBENCH: A scalable, Schema-Aware Instance Matching Benchmark for the Seman...
SPIMBENCH: A scalable, Schema-Aware Instance Matching Benchmark for the Seman...SPIMBENCH: A scalable, Schema-Aware Instance Matching Benchmark for the Seman...
SPIMBENCH: A scalable, Schema-Aware Instance Matching Benchmark for the Seman...
 
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter BonczFOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
 

Último

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 

Último (20)

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 

GRAPH-TA 2013 - RDF and Graph benchmarking - Jose Lluis Larriba Pey

  • 1. RDF and Graph benchmarking www.ldbc.eu GRAPH-TA Barcelona Feb 19, 2013 Graph-TA, Barcelona Feb 19th, 2013
  • 2. Agenda - Why benchmarking - Presentation of LDBC - Remarks Who is who Project overview: WP and Task Forces TUC, Technical Users Community - Benchmarking RDF and GDB - Common issues - Open questions
  • 3. Why benchmarking • Two main objectives: – Allow final users to assess the performance of the software they want to buy – Push the technology to the limit to allow for progress • Main effort in DB benchmarking up to now – TPC: Transaction Processing Performance Council • Relational DBs: Transactional and DSS
  • 4. LDBC • Objectives: – Benchmarks for the emerging field of RDF and Graph database management systems (GDBs) – Spur industry cooperation around benchmarks – Create LDBC foundation during 1Q 2013. – Become a technology push effort, making improvements measurable – Become the de-facto research benchmark, usable, interesting and open to inputs.
  • 5. Preliminary remarks • Nature: – LDBC is different from other EC projects. – The objective for LDBC is to survive after the end. • Opportunity: – Have a benchmarking effort sponsored by EC. – Focal point for the vendor community. – Showcase to the user community. • Collaboration: – LDBC should lead to great achievements and world recognition, with the help of all the community: industry, technologists and users.
  • 6. Who is involved • • • • • • • • FORTH, research centre, Greece TUM, research centre, Germany UIBK, technology centre, Austria Neo Technologies, Graph management, Sweeden OGL, RDF management, UK ONT, RDF management, Bulgaria VUA, research institution, Netherlands DAMA-UPC, research institution, Spain
  • 8. Matrix Organization • EU project reporting activities (WPs) • LDBC benchmark task force activities (TFs) WP1 WP2 WP3 WP4 TF1 TF2 TF3 TF4 TF5 (part of) deliverable/ task force artifact
  • 9. Task forces • Focusses on specific benchmarking effort (i.e. transactional, analytical, integration for RDF/GDB) • Decides on the Use Case to be used • Procedure: – Designs and implements data generation (characteristics, scale, etc.) – Incorporates the generic methodology – Designs specific workload • Choke points specific for the effort • Incorporating the needs from users • Incorporating the opinions from industry
  • 10. Technical User Community (TUC) • It will be the driving force for LDBC: – To help understand users needs and decide use cases – To decide the type of problem/scenarios to be tackled, i.e. task forces to be deployed – To provide typical queries placed to RDF and GDBs • First TUC meeting, Barcelona 19-20 Nov. – Start with an on-line questionaire: http://goo.gl/PwGtK – The outcomes will determine important directions
  • 11. Common issues • Use case for RDF and GDBs: • Social Network Analysis • Semantic Publishing, specific for RDF (SP) • Methodology: • Audited benchmarks • Specific rules, similar to TPC • Workload for RDF – – – – – – Throughput, concurrency Traversals Reasoning Data updates Integration: LOD, geonames, etc. SP: semantic annotation support, relationship btwn ontologies and instances, links to other content, text and metadata. • Workload for GDBs – Throughput, concurrency – Traversals, shortest paths, pattern matching, clustering algorithms – Data updates, transactionality – Update semantics (serializable/acid vs delayed commit vs batch) – Raw traversal speed, use of indexes
  • 12. Open questions • Use cases: • • • • How realistic would you see synthetic data generation? Use of real data like twitter, Facebook or Open Ontologies? Any suggestions for use cases? Any suggestion for scenarios: analytical, transactional, integration, others? • Would it make sense to propose open benchmark scenarios? • The tight rules of TPC: – Are those against the realism of benchmarks? – Can we solve this in any flexible way? • Will RDF and GDB move towards the same technology/solutions? • GDBs: no standard language. How to proceed?