Rob Vesse 
rvesse@yarcdata.com 
@RobVesse
1. Rewind to 2012 
2. Limitations 
3. Evolving the Framework 
4. Examples 
5. Future Work
 Presentation I gave at this conference in 2012 
 Slides at http://www.slideshare.net/RobVesse/practical-sparql-benchmarking 
 Highlighted some issues with SPARQL Benchmarking: 
 Standard Benchmarks all have known deficiencies 
 Lack of standardized methodology 
 Best benchmark is the one you run with your data and workload 
 Introduced the 1.x version of our SPARQL Query 
Benchmarker tool 
 Java tool and API for benchmarking 
 Used a methodology based upon a combination of the BSBM runner and the Revelytix SP2B white paper 
 Reports various appropriate statistics 
 Various configuration options to change what exactly is benchmarked e.g. whether results are 
fully parsed and counted
 The 1.x tool was open sourced shortly after the 2012 
conference under a 3 clause BSD License 
 Available on SourceForge 
 http://sourceforge.net/projects/sparql-query-bm/files/1.0.0/ 
 Also as Maven artifacts (in Maven Central): 
 Group ID: net.sf.sparql-query-bm 
 Artifact IDs: 
 cmd 
 core 
 Latest 1.x Version: 1.1.0
 The 1.x tool can only benchmark SPARQL queries 
 SPARQL 1.1 has been standardized since the 1.x version of 
the tool was written and adds various additional SPARQL 
features that you may want to test: 
 SPARQL Updates 
 SPARQL Graph Store Protocol 
 Queries are fixed 
 No parameterization support 
 Can't pass custom endpoint parameters in 
 For example enable/disable reasoning 
 Also no way to test endpoint specific extensions 
 e.g. transactions
 Requires using HTTP endpoints to access the SPARQL 
system to be tested 
 Adds communication overheads to the results 
 Sometimes this may be desirable 
 No ability to test SPARQL operations in-memory 
 i.e. can't test lower level APIs
 Only supports a single benchmarking methodology 
 Methodology is hard coded 
 Can't do things like run a subset of the provided operations 
on each run 
 Or repeat an operation within a run 
 Or retry an operation under specific failure conditions 
 Configuration of the methodology is tightly coupled to the 
methodology 
 Many aspects are actually independent of the methodology
 Used a simplistic text based format 
 One query file per line 
 No way to specify additional parameters 
 No way to assign a friendly name to queries 
 Assigns each query the filename
 There is a progress monitoring API but it is limited 
 E.g. Gets called after a query completes but not before it 
starts 
 Makes it awkward/impossible to implement some kinds of 
monitoring 
 e.g. crash detection, memory usage
 In the interests of speed over usability we rolled our own 
command line arguments parser 
 Means argument parsing is awkward to extend
 Earlier this year we found a compelling reason to rewrite 
the tool and address the various limitations 
 First 2.x release was made 9th June 2014 
 Minor bug fix and maintenance releases since 
 Releases available at: 
 http://sourceforge.net/projects/sparql-query-bm/files/ 
 Code is now using Git 
 http://git.code.sf.net/p/sparql-query-bm/git sparql-query-bm-git 
 Mirrors available on GitHub for those who think that it is the one true source 
 https://github.com/rvesse/sparql-query-bm 
 Maven artifacts available through Maven Central as before: 
 Group ID: net.sf.sparql-query-bm 
 Artifact IDs: core, cmd and dist 
 Latest 2.x version: 2.0.1
 Concept of Queries replaced with the general concept of 
Operations 
 Also divorces the definition of an operation from how to run 
said operation 
 Makes it easier to change runtime behaviour of operations 
 20 built-in operations provided 
 API allows defining and plugging in new operations as 
desired 
 http://sparql-query-bm.sourceforge.net/javadoc/latest/core/
 Several kinds of query/update 
 Fixed 
 Parameterized 
 Dataset Size 
 Variants for both remote endpoints and in-memory 
datasets 
 Remote variants have additional NVP variants 
 Allows adding custom parameters to the remote request 
 Accounts for 13 of the built in operations
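To make the NVP (name-value pair) idea concrete, here is a minimal sketch of how custom parameters might be appended to a remote query request. It is not the tool's API: the class `NvpRequestSketch`, the endpoint URL and the `reasoning` parameter are all assumptions made for illustration. Only the `query` parameter name comes from the SPARQL Protocol.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: hypothetical class, not the sparql-query-bm API.
class NvpRequestSketch {

    /** Percent-encodes a value for use in a URL query string. */
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every JVM, so this should never happen
            throw new IllegalStateException(e);
        }
    }

    /** Builds a SPARQL Protocol GET URL and appends any custom name-value pairs. */
    static String buildQueryUrl(String endpoint, String sparql, Map<String, String> nvps) {
        StringBuilder url = new StringBuilder(endpoint);
        url.append("?query=").append(encode(sparql));
        for (Map.Entry<String, String> nvp : nvps.entrySet()) {
            url.append('&').append(encode(nvp.getKey()))
               .append('=').append(encode(nvp.getValue()));
        }
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> nvps = new LinkedHashMap<>();
        nvps.put("reasoning", "true"); // hypothetical endpoint-specific parameter
        // The endpoint URL is a placeholder, not a real service
        System.out.println(buildQueryUrl("http://localhost:3030/ds/sparql", "ASK { }", nvps));
    }
}
```

Keeping the extra parameters in a map separate from the query text is what lets the same query be run with, say, reasoning switched on and off.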
 One for each graph store protocol operation: 
 DELETE 
 GET 
 HEAD 
 POST 
 PUT 
 Accounts for a further 5 of the built-in operations
 Sleep 
 Do nothing for some period 
 Useful for simulating quiet periods as part of testing 
 Mix 
 Allow grouping a set of operations into a single operation 
 Lets you compose mixes from other mixes
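The way a mix can contain other mixes is essentially the composite pattern. The sketch below shows the idea with hypothetical types (`Op`, `NamedOp`, `SleepOp`, `MixOp`); none of these names come from the framework, and the real operations do far more than record a label.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: hypothetical types, not the sparql-query-bm API.
class MixSketch {

    /** Anything that can be run as one step of a test. */
    interface Op {
        /** Runs the operation and appends a label for each leaf step executed. */
        void run(List<String> log);
    }

    /** Stand-in for a real query or update: it only records that it ran. */
    static final class NamedOp implements Op {
        private final String name;
        NamedOp(String name) { this.name = name; }
        @Override public void run(List<String> log) { log.add(name); }
    }

    /** Does nothing for a fixed period, e.g. to simulate a quiet period. */
    static final class SleepOp implements Op {
        private final long millis;
        SleepOp(long millis) { this.millis = millis; }
        @Override public void run(List<String> log) {
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            }
            log.add("sleep " + millis + "ms");
        }
    }

    /** Groups operations into one operation; children may themselves be mixes. */
    static final class MixOp implements Op {
        private final List<Op> children;
        MixOp(Op... children) { this.children = Arrays.asList(children); }
        @Override public void run(List<String> log) {
            for (Op child : children) {
                child.run(log);
            }
        }
    }

    /** Runs a simple or composite operation and returns the leaf steps in order. */
    static List<String> runAll(Op op) {
        List<String> log = new ArrayList<>();
        op.run(log);
        return log;
    }

    public static void main(String[] args) {
        Op inner = new MixOp(new NamedOp("q1"), new SleepOp(10));
        Op outer = new MixOp(new NamedOp("warmup"), inner, new NamedOp("q2"));
        System.out.println(runAll(outer));
    }
}
```

Because a mix is itself an operation, whatever runs a single operation can run a nested mix without special handling.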
 As already noted in-memory variants of some operations 
are now available 
 These run tests against a Dataset implementation 
 Part of Apache Jena ARQ API 
 Removes SPARQL Protocol and HTTP overhead from testing 
 Of course, depending on the Dataset implementation there may still be some communication overhead 
 But this is likely using lower level back end native communications protocols instead
 Addresses the limitation of hard coded methodology 
 Separates test running into three components: 
 Overall runner 
 Mix runner 
 Operation runner 
 Each has own API and can be customized as desired 
 Various useful base/abstract implementations provided 
 Four different test runners are provided: 
 Benchmark 
 Smoke 
 Soak 
 Stress
 Smoke 
 Runs the mix once and indicates whether it passes/fails 
 Pass is defined as all operations pass 
 Soak 
 Run the mix continuously for some period of time 
 Test how a system reacts under continuous load 
 Stress 
 Run the mix with increasingly high load 
 Test how a system reacts under increasing load 
 AbstractRunner provides a basic framework and helper 
method to make it easy to add custom runners or 
customize existing runs
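The differences between the smoke, soak and stress styles can be sketched in a few lines. This is an illustration under stated assumptions, not the framework's runners: an operation is modelled as a `BooleanSupplier` that returns true on success, `RunnerSketch` is a hypothetical class, and "increasing load" is interpreted here as multiplying the number of concurrent client threads by a ramp up factor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.BooleanSupplier;

// Illustrative sketch only: hypothetical class, not the sparql-query-bm runners.
class RunnerSketch {

    /** Smoke: run the mix once; it passes only if every operation passes. */
    static boolean smoke(List<BooleanSupplier> mix) {
        boolean allPassed = true;
        for (BooleanSupplier op : mix) {
            // Keep going after a failure so that every operation is exercised
            allPassed &= op.getAsBoolean();
        }
        return allPassed;
    }

    /** Soak: repeat the mix until the time budget is used up; returns completed runs. */
    static int soak(List<BooleanSupplier> mix, long maxMillis) {
        long deadline = System.nanoTime() + maxMillis * 1_000_000L;
        int runs = 0;
        do {
            smoke(mix);
            runs++;
        } while (System.nanoTime() < deadline);
        return runs;
    }

    /** Stress: run the mix on 1, f, f*f, ... threads; maps thread count to pass/fail. */
    static Map<Integer, Boolean> stress(final List<BooleanSupplier> mix,
                                        int maxThreads, int rampUpFactor) {
        if (rampUpFactor < 2) {
            throw new IllegalArgumentException("rampUpFactor must be at least 2");
        }
        Map<Integer, Boolean> results = new LinkedHashMap<>();
        for (int threads = 1; threads <= maxThreads; threads *= rampUpFactor) {
            ExecutorService pool = Executors.newFixedThreadPool(threads);
            Callable<Boolean> oneClient = () -> smoke(mix);
            List<Future<Boolean>> clients = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                clients.add(pool.submit(oneClient));
            }
            pool.shutdown(); // already submitted clients still run to completion
            boolean allPassed = true;
            for (Future<Boolean> client : clients) {
                try {
                    allPassed &= client.get();
                } catch (InterruptedException | ExecutionException e) {
                    allPassed = false; // treat an aborted client as a failure
                }
            }
            results.put(threads, allPassed);
        }
        return results;
    }

    public static void main(String[] args) {
        BooleanSupplier alwaysPasses = () -> true;
        List<BooleanSupplier> mix = Arrays.asList(alwaysPasses, alwaysPasses);
        System.out.println("smoke passed: " + smoke(mix));
        System.out.println("soak runs: " + soak(mix, 50));
        System.out.println("stress: " + stress(mix, 8, 2));
    }
}
```

All three share the same inner step (run the mix once), which is why a common base class with helper methods goes a long way.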
 Allows customizing how mixes and individual operations 
are run 
 Some alternative implementations built in: 
 E.g. SamplingOperationMixRunner 
 Runs a sample of the operations in the mix 
 May include repeats 
 E.g. RetryingOperationRunner 
 Retries an operation if it doesn't succeed 
 Easy to implement your own
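As a rough illustration of what sampling and retrying mean here, the sketch below picks operations with replacement (so repeats are possible) and re-runs a failing operation a bounded number of times. `CustomRunnerSketch` and its methods are hypothetical; the built-in `SamplingOperationMixRunner` and `RetryingOperationRunner` may well behave differently in detail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.function.BooleanSupplier;

// Illustrative sketch only: hypothetical class, not the built-in runners.
class CustomRunnerSketch {

    /** Picks sampleSize operations at random; sampling with replacement allows repeats. */
    static <T> List<T> sample(List<T> mix, int sampleSize, Random random) {
        if (mix.isEmpty()) {
            throw new IllegalArgumentException("Cannot sample from an empty mix");
        }
        List<T> picked = new ArrayList<>(sampleSize);
        for (int i = 0; i < sampleSize; i++) {
            picked.add(mix.get(random.nextInt(mix.size())));
        }
        return picked;
    }

    /** Runs op once, then retries up to maxRetries more times until it succeeds. */
    static boolean runWithRetries(BooleanSupplier op, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (op.getAsBoolean()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> mix = Arrays.asList("q1", "q2", "q3");
        System.out.println(sample(mix, 5, new Random()));

        final int[] calls = {0};
        BooleanSupplier flaky = () -> ++calls[0] >= 3; // succeeds on the third call
        System.out.println("succeeded: " + runWithRetries(flaky, 2));
    }
}
```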
 Separates test configuration from the test runner 
 Interface with all common configuration defined 
 Endpoints 
 Timeouts 
 Progress Listeners 
 etc 
 NB - Runners are typically defined such that they restrict 
their input options to sub-interfaces that add runner 
specific configuration e.g. 
 Warm-ups for benchmarks 
 Total runtime for soak testing 
 Ramp up factor for stress testing
 Now using TSV as the file format 
 Still wanted it to be simple enough that someone with zero RDF/SPARQL knowledge can 
configure it 
 Each line is a series of parameters separated by a tab 
character 
 First parameter is an identifier for the type of the operation 
 Used to decide how to interpret the remaining parameters 
 Can define your own mix file format and register a loader 
for it 
 Possible to override the loader for a specific operation 
identifier since this has an API 
 Means you can do neat tricks like use a mix designed for remote endpoints against an in-memory 
dataset
query 806670-warmup1.rq 806670 Warmup Query 1 
query 806670-warmup2.rq 806670 Warmup Query 2 
query 806670-nofilter.rq 806670 Query with No Filter 
query 806670-filter3.rq 806670 Query with Filter (Variant 3) 
param-query 806670-filter3-params.rq instances.tsv Parameterized Query with Filter (Variant 3) 
query 806670-filter4.rq 806670 Query with Filter (Variant 4) 
query 806670-filter4a.rq 806670 Query with Filter (Variant 4a - Zero Results) 
param-query 806670-filter4-params.rq instances.tsv Parameterized Query with Filter (Variant 4) 
query 806238-warmup1.rq 806238 Warmup Query 1 
query 806238-warmup2.rq 806238 Warmup Query 2 
query 806238-comment43.rq 806238 Query (Comment 43) 
query 806238-comment43a.rq 806238 Query (Comment 43 - SELECT * sub-query) 
query 806238-comment45.rq 806238 Query (Comment 45 - Multiple sub-queries) 
query 806238-comment54.rq 806238 Query (Comment 54) 
param-update load-full1m.ru graph-names.tsv Load 1M Dataset into named graph 
param-query count-loaded.rq graph-names.tsv Count named graph 
param-update drop-loaded.ru graph-names.tsv Drop named graph 
query count.rq Count quads 
checkpoint10 Checkpoint every 10 runs 
sleep 180 3 minute sleep 
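The dispatch described for the mix file format (split each line on tabs, use the first field to pick a loader, allow the loader for an identifier to be replaced) can be sketched as follows. `MixFileSketch` and `Loader` are hypothetical names, the loaders return plain strings rather than real operations, and the column layout after the identifier (file, then friendly name) is an assumption read off the example above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: hypothetical types, not the sparql-query-bm loader API.
class MixFileSketch {

    /** Interprets the parameters that follow the operation identifier. */
    interface Loader {
        String load(String[] params);
    }

    private final Map<String, Loader> loaders = new HashMap<>();

    /** Registers a loader; registering the same identifier again overrides it. */
    void register(String identifier, Loader loader) {
        loaders.put(identifier, loader);
    }

    /** Splits a line on tabs and hands the remaining fields to the matching loader. */
    String parseLine(String line) {
        String[] fields = line.split("\t");
        Loader loader = loaders.get(fields[0]);
        if (loader == null) {
            throw new IllegalArgumentException("No loader registered for '" + fields[0] + "'");
        }
        return loader.load(Arrays.copyOfRange(fields, 1, fields.length));
    }

    public static void main(String[] args) {
        MixFileSketch mix = new MixFileSketch();
        String line = "query\tcount.rq\tCount quads";

        mix.register("query", p -> "remote query " + p[0] + " (" + p[1] + ")");
        System.out.println(mix.parseLine(line));

        // Same mix line, different behaviour: swap the loader for the identifier
        mix.register("query", p -> "in-memory query " + p[0] + " (" + p[1] + ")");
        System.out.println(mix.parseLine(line));
    }
}
```

Overriding the loader rather than editing the mix file is what makes it possible to point a mix written for remote endpoints at an in-memory dataset.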
 Now provides notifications before and after operation and 
mix runs 
 Improvements to how some of the built-in 
implementations handle multi-threaded output 
 Makes it easier to distinguish where errors occurred when running multi-threaded 
benchmarks
 Now based upon the powerful open source Airline library 
 https://github.com/airlift/airline 
 Provides a command line interface to each built-in runner 
 Also provides AbstractCommand with all standard options exposed 
 Standardized exit codes across all commands 
 Comprehensive built-in help 
 Can help you define operation mixes 
 ./operations 
 ./operation --op param-query
 These are things we've done (or are currently doing) with 
the framework that aren't in the open source releases 
 However the 2.x framework makes these (hopefully) easy 
to replicate yourself 
 Many stores often have rich REST APIs in addition to their 
SPARQL APIs 
 Can be useful to include testing of these in your mixes 
 Requires implementing two interfaces: 
 Operation 
 OperationCallable 
 Abstract implementations of both are available to give you the 
boilerplate bits 
 Internally we have 9 different custom operations defined 
which test a subset of our REST API: 
 Database Management 
 Asynchronous Queries 
 Import Management
 One thing we're particularly interested in is how operations 
affect memory usage 
 We added custom progress listeners that track and monitor memory usage 
 Reports on min, max and average memory usage 
 We also have another progress listener that tracks 
processes to identify when a test run may have been 
impacted by other activity on the system
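A memory-tracking listener of this kind depends on being notified both before and after each operation. The sketch below is one way it might look, not our internal implementation: `ProgressListener`, `MemoryTracker` and `jvmHeapProbe` are hypothetical names, and the memory reading is injected as a `LongSupplier` so that the probe could measure the benchmarking JVM, a server process or anything else.

```java
import java.util.function.LongSupplier;

// Illustrative sketch only: hypothetical types, not the sparql-query-bm listener API.
class MemoryListenerSketch {

    /** Minimal listener that is notified before and after each operation. */
    interface ProgressListener {
        void beforeOperation(String name);
        void afterOperation(String name);
    }

    /** Samples memory usage at each notification and keeps min, max and average. */
    static final class MemoryTracker implements ProgressListener {
        private final LongSupplier probe;
        private long min = Long.MAX_VALUE;
        private long max = Long.MIN_VALUE;
        private long total = 0;
        private int samples = 0;

        MemoryTracker(LongSupplier probe) { this.probe = probe; }

        private void sample() {
            long used = probe.getAsLong();
            min = Math.min(min, used);
            max = Math.max(max, used);
            total += used;
            samples++;
        }

        @Override public void beforeOperation(String name) { sample(); }
        @Override public void afterOperation(String name) { sample(); }

        long min() { return min; }
        long max() { return max; }
        double average() { return samples == 0 ? 0.0 : (double) total / samples; }
    }

    /** Probe reporting the heap currently in use by this JVM, in bytes. */
    static LongSupplier jvmHeapProbe() {
        return () -> {
            Runtime rt = Runtime.getRuntime();
            return rt.totalMemory() - rt.freeMemory();
        };
    }

    public static void main(String[] args) {
        MemoryTracker tracker = new MemoryTracker(jvmHeapProbe());
        tracker.beforeOperation("q1");
        byte[] ballast = new byte[1024 * 1024]; // stand-in for the work an operation does
        tracker.afterOperation("q1");
        System.out.println("min=" + tracker.min() + " max=" + tracker.max()
                + " avg=" + tracker.average() + " ballast=" + ballast.length);
    }
}
```

With only an "after" notification, as in the 1.x API, there would be no baseline reading to compare against.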
// Retries an operation only when it failed with an authentication error
public class RetryOnAuthFailureOperationRunner extends RetryingOperationRunner {

    // By default allow a maximum of one retry
    public RetryOnAuthFailureOperationRunner() {
        this(1);
    }

    public RetryOnAuthFailureOperationRunner(int maxRetries) {
        super(maxRetries);
    }

    @Override
    protected <T extends Options> boolean shouldRetry(Runner<T> runner, T options,
            Operation op, OperationRun run) {
        // Only retry when the failed run was categorised as an authentication error
        return run.getErrorCategory() == ErrorCategories.AUTHENTICATION;
    }
}
 Extends the built-in RetryingOperationRunner 
 Simply adds a constraint on retries by overriding the 
shouldRetry() method
 Embrace Java 7 features fully 
 Use ServiceLoader to automatically discover new operations and mix formats 
 Make it even easier to customize runners 
 i.e. provide more abstraction of the current implementations
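Since this is future work, the following is only a guess at the shape discovery might take. It uses the standard `java.util.ServiceLoader` API; the `OperationProvider` interface and `DiscoverySketch` class are hypothetical and do not exist in the framework.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

/** Hypothetical service interface that a plug-in JAR would implement. */
interface OperationProvider {
    /** The identifier used for this operation in the first column of a mix file. */
    String identifier();
}

// Illustrative sketch only: not part of sparql-query-bm.
class DiscoverySketch {

    /** Returns the identifiers of every provider found on the classpath. */
    static List<String> discoverIdentifiers() {
        List<String> identifiers = new ArrayList<>();
        // ServiceLoader reads provider class names from META-INF/services/<interface name>
        for (OperationProvider provider : ServiceLoader.load(OperationProvider.class)) {
            identifiers.add(provider.identifier());
        }
        return identifiers;
    }

    public static void main(String[] args) {
        // Prints an empty list unless a provider has been registered on the classpath
        System.out.println(discoverIdentifiers());
    }
}
```

A plug-in would register itself by shipping a `META-INF/services` file named after the interface and listing its implementation class, so no code in the framework would need to change to pick it up.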
Questions? 
rvesse@yarcdata.com 
@RobVesse

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 

Practical SPARQL Benchmarking Revisited

  • 1. Rob Vesse rvesse@yarcdata.com @RobVesse
  • 2.
      1. Rewind to 2012
      2. Limitations
      3. Evolving the Framework
      4. Examples
      5. Future Work
  • 3.
  • 4.
      • Presentation I gave at this conference in 2012
      • Slides at http://www.slideshare.net/RobVesse/practical-sparql-benchmarking
      • Highlighted some issues with SPARQL benchmarking:
      • Standard benchmarks all have known deficiencies
      • Lack of standardized methodology
      • Best benchmark is the one you run with your data and workload
      • Introduced the 1.x version of our SPARQL Query Benchmarker tool
      • Java tool and API for benchmarking
      • Used a methodology based upon a combination of the BSBM runner and the Revelytix SP2B white paper
      • Reports various appropriate statistics
      • Various configuration options to change what exactly is benchmarked, e.g. whether results are fully parsed and counted
  • 5.
      • The 1.x tool was open sourced shortly after the 2012 conference under a 3-clause BSD License
      • Available on SourceForge
      • http://sourceforge.net/projects/sparql-query-bm/files/1.0.0/
      • Also as Maven artifacts (in Maven Central):
      • Group ID: net.sf.sparql-query-bm
      • Artifact IDs: cmd and core
      • Latest 1.x version: 1.1.0
  • 6.
  • 7.
      • The 1.x tool can only benchmark SPARQL queries
      • SPARQL 1.1 has been standardized since the 1.x version of the tool was written and adds various additional SPARQL features that you may want to test:
      • SPARQL Updates
      • SPARQL Graph Store Protocol
      • Queries are fixed
      • No parameterization support
      • Can't pass custom endpoint parameters in
      • For example enable/disable reasoning
      • Also no way to test endpoint specific extensions
      • e.g. transactions
  • 8.
      • Requires using HTTP endpoints to access the SPARQL system to be tested
      • Adds communication overheads to the results
      • Sometimes this may be desirable
      • No ability to test SPARQL operations in-memory
      • i.e. can't test lower level APIs
  • 9.
      • Only supports a single benchmarking methodology
      • Methodology is hard coded
      • Can't do things like run a subset of the provided operations on each run
      • Or repeat an operation within a run
      • Or retry an operation under specific failure conditions
      • Configuration of the methodology is tightly coupled to the methodology
      • Many aspects are actually independent of the methodology
  • 10.
      • Used a simplistic text based format
      • One query file per line
      • No way to specify additional parameters
      • No way to assign a friendly name to queries
      • Assigns each query the filename
  • 11.
      • There is a progress monitoring API but it is limited
      • E.g. gets called after a query completes but not before it starts
      • Makes it awkward/impossible to implement some kinds of monitoring
      • e.g. crash detection, memory usage
  • 12.
      • In the interests of speed over usability we rolled our own command line arguments parser
      • Means argument parsing is awkward to extend
  • 13.
  • 14.
      • Earlier this year we found a compelling reason to rewrite the tool and address the various limitations
      • First 2.x release was made 9th June 2014
      • Minor bug fix and maintenance releases since
      • Releases available at:
      • http://sourceforge.net/projects/sparql-query-bm/files/
      • Code is now using Git
      • http://git.code.sf.net/p/sparql-query-bm/git sparql-query-bm-git
      • Mirrors available on GitHub for those who think that it is the one true source
      • https://github.com/rvesse/sparql-query-bm
      • Maven artifacts available through Maven Central as before:
      • Group ID: net.sf.sparql-query-bm
      • Artifact IDs: core, cmd and dist
      • Latest 2.x version: 2.0.1
  • 15.
      • Concept of Queries replaced with the general concept of Operations
      • Also divorces the definition of an operation from how to run said operation
      • Makes it easier to change runtime behaviour of operations
      • 20 built-in operations provided
      • API allows defining and plugging in new operations as desired
      • http://sparql-query-bm.sourceforge.net/javadoc/latest/core/
  • 16.
      • Several kinds of query/update
      • Fixed
      • Parameterized
      • Dataset Size
      • Variants for both remote endpoints and in-memory datasets
      • Remote variants have additional NVP variants
      • Allows adding custom parameters to the remote request
      • Accounts for 13 of the built-in operations
  • 17.
      • One for each graph store protocol operation:
      • DELETE
      • GET
      • HEAD
      • POST
      • PUT
      • Accounts for a further 5 of the built-in operations
  • 18.
      • Sleep
      • Do nothing for some period
      • Useful for simulating quiet periods as part of testing
      • Mix
      • Allows grouping a set of operations into a single operation
      • Lets you compose mixes from other mixes
  • 19.
      • As already noted, in-memory variants of some operations are now available
      • These run tests against a Dataset implementation
      • Part of Apache Jena ARQ API
      • Removes SPARQL Protocol and HTTP overhead from testing
      • Of course, depending on the Dataset implementation, there may still be some communication overhead
      • But this is likely using lower level back end native communications protocols instead
  • 20.
      • Addresses the limitation of hard coded methodology
      • Separates test running into three components:
      • Overall runner
      • Mix runner
      • Operation runner
      • Each has its own API and can be customized as desired
      • Various useful base/abstract implementations provided
      • Four different test runners are provided:
      • Benchmark
      • Smoke
      • Soak
      • Stress
  • 21.
      • Smoke
      • Runs the mix once and indicates whether it passes/fails
      • Pass is defined as all operations pass
      • Soak
      • Run the mix continuously for some period of time
      • Test how a system reacts under continuous load
      • Stress
      • Run the mix with increasingly high load
      • Test how a system reacts under increasing load
      • AbstractRunner provides a basic framework and helper method to make it easy to add custom runners or customize existing runs
  • 22.
      • Allows customizing how mixes and individual operations are run
      • Some alternative implementations built in:
      • E.g. SamplingOperationMixRunner
      • Runs a sample of the operations in the mix
      • May include repeats
      • E.g. RetryingOperationRunner
      • Retries an operation if it doesn't succeed
      • Easy to implement your own
  • 23.
      • Separates test configuration from the test runner
      • Interface with all common configuration defined
      • Endpoints
      • Timeouts
      • Progress Listeners
      • etc
      • NB - Runners are typically defined such that they restrict their input options to sub-interfaces that add runner specific configuration e.g.
      • Warm-ups for benchmarks
      • Total runtime for soak testing
      • Ramp up factor for stress testing
  • 24.
      • Now using TSV as the file format
      • Still wanted to be simple enough that someone with zero RDF/SPARQL knowledge can configure
      • Each line is a series of parameters separated by a tab character
      • First parameter is an identifier for the type of the operation
      • Used to decide how to interpret the remaining parameters
      • Can define your own mix file format and register a loader for it
      • Possible to override the loader for a specific operation identifier since this has an API
      • Means you can do neat tricks like use a mix designed for remote endpoints against an in-memory dataset
  • 25.
      query 806670-warmup1.rq 806670 Warmup Query 1
      query 806670-warmup2.rq 806670 Warmup Query 2
      query 806670-nofilter.rq 806670 Query with No Filter
      query 806670-filter3.rq 806670 Query with Filter (Variant 3)
      param-query 806670-filter3-params.rq instances.tsv Parameterized Query with Filter (Variant 3)
      query 806670-filter4.rq 806670 Query with Filter (Variant 4)
      query 806670-filter4a.rq 806670 Query with Filter (Variant 4a - Zero Results)
      param-query 806670-filter4-params.rq instances.tsv Parameterized Query with Filter (Variant 4)
      query 806238-warmup1.rq 806238 Warmup Query 1
      query 806238-warmup2.rq 806238 Warmup Query 2
      query 806238-comment43.rq 806238 Query (Comment 43)
      query 806238-comment43a.rq 806238 Query (Comment 43 - SELECT * sub-query)
      query 806238-comment45.rq 806238 Query (Comment 45 - Multiple sub-queries)
      query 806238-comment54.rq 806238 Query (Comment 54)
      param-update load-full1m.ru graph-names.tsv Load 1M Dataset into named graph
      param-query count-loaded.rq graph-names.tsv Count named graph
      param-update drop-loaded.ru graph-names.tsv Drop named graph
      query count.rq Count quads
      checkpoint10 Checkpoint every 10 runs
      sleep 180 3 minute sleep
  • 26.
      • Now provides notifications before and after operation and mix runs
      • Improvements to how some of the built-in implementations handle multi-threaded output
      • Makes it easier to distinguish where errors occurred when running multi-threaded benchmarks
  • 27.
      • Now based upon the powerful open source Airline library
      • https://github.com/airlift/airline
      • Provides a command line interface to each built-in runner
      • Also provides AbstractCommand with all standard options exposed
      • Standardized exit codes across all commands
      • Comprehensive built-in help
      • Can help you define operation mixes
      • ./operations
      • ./operation --op param-query
  • 28.
  • 29.
      • These are things we've done (or are currently doing) with the framework that aren't in the open source releases
      • However the 2.x framework makes these (hopefully) easy to replicate yourself
  • 30.
      • Many stores often have rich REST APIs in addition to their SPARQL APIs
      • Can be useful to include testing of these in your mixes
      • Requires implementing two interfaces:
      • Operation
      • OperationCallable
      • Abstract implementations of both available to give you the boilerplate bits
      • Internally we have 9 different custom operations defined which test a subset of our REST API:
      • Database Management
      • Asynchronous Queries
      • Import Management
  • 31.
      • One thing we're particularly interested in is how operations affect memory usage
      • We added custom progress listeners that track and monitor memory usage
      • Reports on min, max and average memory usage
      • We also have another progress listener that tracks processes to identify when a test run may have been impacted by other activity on the system
  • 32.
      public class RetryOnAuthFailureOperationRunner extends RetryingOperationRunner {

          // Default to a single retry
          public RetryOnAuthFailureOperationRunner() {
              this(1);
          }

          public RetryOnAuthFailureOperationRunner(int maxRetries) {
              super(maxRetries);
          }

          // Only retry when the failed run was categorized as an authentication error
          @Override
          protected <T extends Options> boolean shouldRetry(Runner<T> runner, T options, Operation op, OperationRun run) {
              return run.getErrorCategory() == ErrorCategories.AUTHENTICATION;
          }
      }
      • Extends the built-in RetryingOperationRunner
      • Simply adds a constraint on retries by overriding the shouldRetry() method
  • 33.
  • 34.
      • Embrace Java 7 features fully
      • Use ServiceLoader to automatically discover new operations and mix formats
      • Make it even easier to customize runners
      • i.e. provide more abstraction of the current implementations
  • 35. Questions? rvesse@yarcdata.com @RobVesse
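
The TSV mix format from slide 24 can be sketched roughly as follows. This is an illustration only and not the sparql-query-bm API: MixLineSketch, LoaderSketch, register and parseLine are hypothetical names, and the loaders here just return a description string rather than a real operation. It assumes each line is tab separated, that the first field identifies the operation type, and that registering a loader for an identifier a second time overrides the earlier one.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: NOT the sparql-query-bm API, only an illustration of the idea.
class MixLineSketch {

    /** Turns the parameters that follow the operation identifier into a description. */
    interface LoaderSketch {
        String load(List<String> args);
    }

    private final Map<String, LoaderSketch> loaders = new HashMap<String, LoaderSketch>();

    /** Registers a loader; registering the same identifier again overrides the earlier loader. */
    void register(String identifier, LoaderSketch loader) {
        loaders.put(identifier, loader);
    }

    /** Splits one line on tabs and dispatches on the first field. */
    String parseLine(String line) {
        String[] fields = line.split("\t", -1);
        LoaderSketch loader = loaders.get(fields[0]);
        if (loader == null) {
            throw new IllegalArgumentException("No loader registered for operation type: " + fields[0]);
        }
        List<String> args = Arrays.asList(fields).subList(1, fields.length);
        return loader.load(args);
    }

    public static void main(String[] argv) {
        MixLineSketch mix = new MixLineSketch();
        mix.register("sleep", args -> "sleep for " + args.get(0) + " seconds");
        mix.register("query", args -> "remote query from " + args.get(0));
        System.out.println(mix.parseLine("sleep\t180"));
        System.out.println(mix.parseLine("query\tcount.rq"));
        // Override the loader for "query" so the same mix line targets an in-memory dataset.
        mix.register("query", args -> "in-memory query from " + args.get(0));
        System.out.println(mix.parseLine("query\tcount.rq"));
    }
}
```

Overriding the loader for query is the kind of trick slide 24 mentions: the same mix line is reinterpreted for an in-memory dataset instead of a remote endpoint, without editing the mix file.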

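The retry behaviour from slides 22 and 32 can be approximated without the framework. This is a minimal sketch, assuming an operation is any Supplier and that failures surface as a RuntimeException; RetrySketch and its run method are hypothetical names and not part of sparql-query-bm. The predicate plays the role that shouldRetry() plays in RetryingOperationRunner, and SecurityException is only a stand-in for an authentication error.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical sketch: none of these names come from sparql-query-bm.
class RetrySketch {

    /**
     * Runs the operation and retries it while the predicate accepts the failure,
     * up to maxRetries retries (so at most maxRetries + 1 attempts in total).
     */
    static <T> T run(Supplier<T> operation, int maxRetries, Predicate<RuntimeException> shouldRetry) {
        int retries = 0;
        while (true) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                // Give up when out of retries or when this kind of failure is not retryable.
                if (retries >= maxRetries || !shouldRetry.test(e)) {
                    throw e;
                }
                retries++;
            }
        }
    }

    public static void main(String[] argv) {
        final int[] attempts = {0};
        // Stand-in for an operation whose first attempt fails with an authentication-style error.
        String result = run(() -> {
            if (attempts[0]++ == 0) {
                throw new SecurityException("simulated authentication failure");
            }
            return "ok";
        }, 1, failure -> failure instanceof SecurityException);
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```

Keeping the retry decision in a predicate mirrors the design on slide 32, where the only thing the subclass changes is which failures are worth retrying.
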
Editor's Notes

  1. Ask for a show of hands as to who has used the tool to get an idea of the audience
  2. SPARQL 1.1 standardized 21st March 2013