On the need to include functional testing in RDF stream engine benchmarks
Daniele Dell’Aglio, Marco Balduini, and Emanuele Della Valle
1st International Workshop on Benchmarking RDF Systems (BeRSys 2013), co-located with ESWC 2013
May 26th, 2013, Montpellier, France
Agenda
Background on
Data Stream Management Systems (DSMS)
RDF Stream Engines
Benchmarking RDF Stream Engines
Operational semantics of RDF Stream Engines
Testing the correctness of continuous queries' results
Conclusions
BeRSys 2013 - May 26, 2013 - Emanuele Della Valle - http://streamreasoning.org
Background
Data Stream Management Systems
What are data streams?
Formally:
Data streams are unbounded sequences of time-varying data elements
Less formally:
an (almost) “continuous” flow of information
with the recent information being more relevant as it describes the current state of a dynamic system
Background
Data Stream Management Systems
The nature of streams requires a paradigmatic
change*
from persistent data
to be stored and queried on demand
a.k.a. one time semantics
to transient data
to be consumed on the fly by continuous queries
a.k.a. continuous semantics
* This paradigmatic change first arose in the DB community
Background
Data Stream Management Systems
Continuous queries are registered over streams that, in most cases, are observed through windows
[Figure: continuous query processing pipeline]
input streams
Window: stream-to-relation operators
Registered Continuous Query: expressed using relation-to-relation operators
Streams of answers: produced by the relation-to-stream operators
Background
Data Stream Management Systems
Types of windows (a.k.a., stream to relation
operators)
physical: a given number of data elements
logical: a variable number of data elements which occur
during a given time interval (e.g., 1 hour)
Sliding: they are progressively advanced by a given STEP (e.g., 5 minutes)
Tumbling: they are advanced by exactly their time interval
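The window types above can be sketched in Python. This is a minimal illustration, not the implementation of any engine: the names `physical_window` and `logical_windows`, the half-open interval convention [open, open + width), and the start instant `t0` are assumptions made for the example.

```python
from typing import List, Tuple

Element = Tuple[str, int]  # (payload, timestamp); hypothetical element shape


def physical_window(stream: List[Element], size: int) -> List[Element]:
    """Physical (count-based) window: the last `size` elements seen so far."""
    return stream[-size:] if size > 0 else []


def logical_windows(stream: List[Element], width: int, step: int,
                    t0: int = 0) -> List[Tuple[int, int, List[Element]]]:
    """Logical (time-based) windows [open, open + width), advanced by `step`.

    step < width gives sliding windows; step == width gives tumbling windows.
    Windows are generated until one would open after the last timestamp.
    """
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")
    if not stream:
        return []
    last = max(ts for _, ts in stream)
    windows = []
    t_open = t0
    while t_open <= last:
        t_close = t_open + width
        content = [e for e in stream if t_open <= e[1] < t_close]
        windows.append((t_open, t_close, content))
        t_open += step
    return windows


stream = [("a", 1), ("b", 3), ("c", 12), ("d", 15)]
tumbling = logical_windows(stream, width=10, step=10)  # two disjoint windows
sliding = logical_windows(stream, width=10, step=5)    # overlapping windows
```

With a step smaller than the width, the same element can appear in more than one window, which is what distinguishes sliding from tumbling windows.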
Background
RDF Stream Engines
RDF Stream Engines port DSMS concepts into
the Semantic Web, extending
the RDF data model with the notion of RDF Stream
…
<s_i p_i o_i> : [τ_i]
<s_{i+1} p_{i+1} o_{i+1}> : [τ_{i+1}]
…
SPARQL to express and process continuous queries
Existing languages/engines
CQELS
SPARQLSTREAM
C-SPARQL
Timestamps are
non-decreasing to allow
for expressing
contemporaneity
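As a minimal sketch of the RDF Stream notion, a stream can be modelled as a list of timestamped triples whose timestamps never decrease. The class name `TimestampedTriple` and the helper `is_valid_rdf_stream` are hypothetical names chosen for this example, and integer timestamps are an assumption.

```python
from typing import List, NamedTuple


class TimestampedTriple(NamedTuple):
    subject: str
    predicate: str
    obj: str
    timestamp: int  # assumed integer time units


def is_valid_rdf_stream(stream: List[TimestampedTriple]) -> bool:
    """True if timestamps never decrease; equal timestamps are allowed,
    which is how contemporaneous triples are expressed."""
    return all(a.timestamp <= b.timestamp for a, b in zip(stream, stream[1:]))


ok = [
    TimestampedTriple(":m1", ":detectedAt", ":r1", 1),
    TimestampedTriple(":m2", ":detectedAt", ":r1", 3),
    TimestampedTriple(":m3", ":detectedAt", ":r2", 3),  # same instant as above
]
out_of_order = [ok[1], ok[0]]
```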
Background
E.g., where C-SPARQL extends
SPARQL
Background
An example of C-SPARQL query
Who are the opinion makers? i.e., the users who are
likely to influence the behaviour of other users who
follow them
REGISTER STREAM OpinionMakers COMPUTED EVERY 5m AS
CONSTRUCT { ?opinionMaker sd:about ?resource }
FROM STREAM <http://streamingsocialdata.org/interactions> [RANGE 30m STEP 5m]
WHERE {
  ?opinionMaker ?opinion ?resource .
  ?follower sioc:follows ?opinionMaker .
  ?follower ?opinion ?resource .
  FILTER ( cs:timestamp(?follower) > cs:timestamp(?opinionMaker)
           && ?opinion != sd:accesses )
}
Query registration (for continuous execution)
FROM STREAM clause
WINDOW
RDF Stream added as new output format
Builtin to access timestamps
Aggregates as in SPARQL 1.1
Background
Benchmarking RDF stream engines
SRBench
Dataset: LinkedSensorData (real meteorological
sensor data)
Queries: 17 continuous queries, some requiring RDFS
reasoning
KPI: feature coverage and correctness
LSBench
Dataset: synthetic social network inspired data set
Queries: 12 continuous queries involving multiple
stream and static knowledge
KPI: input throughput and correctness
SRBench correctness: not verified
LSBench correctness: verified by comparing the number of results produced by different engines
Is verifying correctness hard?
Not for SPARQL
http://www.w3.org/2009/sparql/docs/tests/
Queries + expected results
However, it is hard for continuous (SPARQL)
queries
1 query → multiple correct results
Input data and query are not enough to determine
the correct result
A simple test
Take the motivating scenario of CQELS
there are two connected rooms, r1 and r2;
each room has a sensor able to detect the individuals
inside, m1 and m2.
The stream ST contains the following triples:
S1 = <:m1 :detectedAt :r1>:[1]
S2 = <:m2 :detectedAt :r1>:[3]
S3 = <:m1 :detectedAt :r2>:[12]
S4 = <:m2 :detectedAt :r2>:[15]
The query of the simple test
We want to know when the two individuals m1 and m2 are in the
same room, using a time-based tumbling window of 10 seconds.
REGISTER QUERY SimpleTest AS
SELECT ?room
FROM STREAM <http://ex.org/ST> [RANGE 10s STEP 10s]
WHERE {
:m1 :detectedAt ?room .
:m2 :detectedAt ?room
}
All the results you can obtain
Running the test in C-SPARQL, CQELS and
SPARQLSTREAM the following results can be
obtained.
Are they all correct?
How can this be?
These engines have
different operational
semantics!
The devil is in the details!
[Figure: stream ST on a time axis t, with S1, S2, S3, S4 at t = 1, 3, 12, 15 and windows W0 to W6]
S1 = <:m1 :detectedAt :r1>:[1]
S2 = <:m2 :detectedAt :r1>:[3]
S3 = <:m1 :detectedAt :r2>:[12]
S4 = <:m2 :detectedAt :r2>:[15]
Can operational semantics of RDF
Stream Engines be modelled?
A model has been proposed to explain the
differences that appear between different DSMS:
SECRET
The results of a DSMS depend not only on the input
and the query, but also on the system
ScopE in the SECRET model
It is the time range of the active window
[t_open, t_close)
it is determined using the size ω and slide β parameters of the window as written by the query issuer
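[Figure: windows W1, W2, W3 with ω = 3 and β = 2 on a time axis t = 1 to 7; relative to the application time t_app, windows are marked Closed, Open, or Active]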
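A minimal sketch of how a scope could be computed from ω and β. It assumes that windows open at t0, t0 + β, t0 + 2β, ... and that intervals are half-open, as in [t_open, t_close). The function name `scopes_at` and the default t0 = 0 are assumptions of this example, not part of SECRET's definition, and the sketch returns every window containing the instant rather than a single active one.

```python
from typing import List, Tuple


def scopes_at(t: int, omega: int, beta: int, t0: int = 0) -> List[Tuple[int, int]]:
    """Return every window interval [t_open, t_close) that contains instant t.

    Windows open at t0, t0 + beta, t0 + 2*beta, ... and are omega wide.
    """
    if omega <= 0 or beta <= 0:
        raise ValueError("omega and beta must be positive")
    if t < t0:
        return []
    # Index of the latest window opened at or before t.
    k_max = (t - t0) // beta
    result = []
    for k in range(k_max, -1, -1):
        t_open = t0 + k * beta
        if t_open + omega <= t:
            break  # this window and all earlier ones are already closed
        result.append((t_open, t_open + omega))
    return sorted(result)


overlapping = scopes_at(4, omega=3, beta=2)   # sliding: two windows contain t = 4
tumbling = scopes_at(12, omega=10, beta=10)   # tumbling: exactly one window
```

When β < ω an instant can fall inside several windows at once, so an engine also has to decide which of them it treats as the active one.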
Content in the SECRET model
It is the subset of the stream included in the active window
It is determined using
the size ω and slide β parameters of the window and
t0, the time instant at which the first window starts
[Figure: stream ST (elements at t = 1, 3, 12, 15) with windows W0 to W3, ω = β = 10, drawn for different values of t0]
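The effect of t0 on the window content can be checked on the simple test. The sketch below is a plain Python re-creation of the "same room" query over the stream ST, not the code of any engine. The helper names `content`, `rooms_shared`, and `answers`, and the choice to evaluate only the first two windows, are assumptions of the example.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str, int]  # (subject, predicate, object, timestamp)

ST: List[Triple] = [
    (":m1", ":detectedAt", ":r1", 1),
    (":m2", ":detectedAt", ":r1", 3),
    (":m1", ":detectedAt", ":r2", 12),
    (":m2", ":detectedAt", ":r2", 15),
]


def content(stream: List[Triple], t_open: int, omega: int) -> List[Triple]:
    """Triples whose timestamp falls in [t_open, t_open + omega)."""
    return [tr for tr in stream if t_open <= tr[3] < t_open + omega]


def rooms_shared(window: List[Triple]) -> Set[str]:
    """Rooms where both :m1 and :m2 were detected within the window."""
    m1 = {o for s, p, o, _ in window if s == ":m1" and p == ":detectedAt"}
    m2 = {o for s, p, o, _ in window if s == ":m2" and p == ":detectedAt"}
    return m1 & m2


def answers(stream: List[Triple], omega: int, beta: int, t0: int,
            n_windows: int) -> List[Set[str]]:
    """Evaluate the query on the first n_windows windows starting at t0."""
    return [rooms_shared(content(stream, t0 + k * beta, omega))
            for k in range(n_windows)]


for t0 in (0, 2, 5):
    print(t0, answers(ST, omega=10, beta=10, t0=t0, n_windows=2))
```

With t0 = 0 the two windows yield :r1 and then :r2. With t0 = 2 the first window holds only S2 and yields nothing. With t0 = 5 neither window holds both individuals. The same query over the same input gives three different answer sequences.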
Report in the SECRET model
It defines the conditions under which the window
contents become visible for further query
evaluation and result reporting
It can take a logical combination of the following:
content change
window close
non-empty content
periodic
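One way to picture the report dimension is as a predicate over the state of the active window, built by combining the four conditions listed above. The sketch below is illustrative only: `WindowState`, its field names, and `all_of` are hypothetical. The per-engine entries follow the strategies stated later in this deck, with "non-empty result" approximated here by non-empty content.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class WindowState:
    content_changed: bool    # content differs from the previous evaluation
    window_closed: bool      # t_close of the active window has been reached
    content_non_empty: bool  # the window holds at least one element
    period_elapsed: bool     # a fixed reporting period has passed


Condition = Callable[[WindowState], bool]


def window_close(s: WindowState) -> bool:
    return s.window_closed


def content_change(s: WindowState) -> bool:
    return s.content_changed


def non_empty(s: WindowState) -> bool:
    return s.content_non_empty


def periodic(s: WindowState) -> bool:
    return s.period_elapsed


def all_of(*conds: Condition) -> Condition:
    """Logical AND of report conditions."""
    return lambda s: all(c(s) for c in conds)


# Assumed mapping, taken from the reporting strategies named in this deck.
REPORT = {
    "C-SPARQL": all_of(window_close, non_empty),
    "SPARQLSTREAM": window_close,
    "CQELS": all_of(content_change, non_empty),
}

# A new triple arrived in a window that has not closed yet.
mid_window = WindowState(True, False, True, False)
# A window closed without ever receiving a triple.
empty_close = WindowState(False, True, False, False)
```

Under these assumptions only the CQELS entry reports for `mid_window`, and only the SPARQLSTREAM entry reports for `empty_close`.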
Explaining the results of C-SPARQL
[Figure: stream ST on a time axis t, with S1, S2, S3, S4 at t = 1, 3, 12, 15 and windows W0 to W6, ω = β = 10]
The reporting strategy of C-SPARQL is window close and non-empty result
Explaining the results of SPARQLSTREAM
[Figure: stream ST on a time axis t, with S1, S2, S3, S4 at t = 1, 3, 12, 15 and windows W0 to W6, ω = β = 10]
The reporting strategy of SPARQLSTREAM is window close
Explaining the results of CQELS
[Figure: stream ST on a time axis t, with S1, S2, S3, S4 at t = 1, 3, 12, 15 and windows W0 to W6, ω = β = 10]
The reporting strategy of CQELS is content change and non-empty result
And so what?
Let's test correctness!
Query
SELECT ?room WHERE {
  STREAM <http://ex.org/s1> [RANGE 3s SLIDE 3s] {
    ?p1 :detectedAt ?room .
    ?p2 :detectedAt ?room }
  FILTER (?p1 != ?p2) }
Data
S1 = <:m1 :detectedAt :r1>:[0]
S2 = <:m2 :detectedAt :r2>:[5]
S3 = <:m3 :detectedAt :r1>:[10]
S4 = <:m4 :detectedAt :r2>:[15]
Timeline
[Figure: S1, S2, S3, S4 on a time axis t at 0, 5, 10, 15, with windows of ω = β = 3]
Results of the test
Why?
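As a reference point for this test: the triples arrive 5 seconds apart and the window is 3 seconds wide, so under tumbling-window semantics no window can hold two triples, and a join that requires two distinct people in the same room should have nothing to match. The sketch below checks that expectation in plain Python. The helper `shared_rooms` and the candidate start instants are assumptions of the example, and it says nothing about what a particular engine actually returns.

```python
from typing import List, Set, Tuple

Detection = Tuple[str, str, int]  # (person, room, timestamp)

DATA: List[Detection] = [
    (":m1", ":r1", 0),
    (":m2", ":r2", 5),
    (":m3", ":r1", 10),
    (":m4", ":r2", 15),
]


def shared_rooms(data: List[Detection], omega: int, beta: int,
                 t0: int, horizon: int) -> List[Set[str]]:
    """Per window, the rooms in which two distinct people were detected."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    results = []
    t_open = t0
    while t_open <= horizon:
        window = [(p, r) for p, r, ts in data if t_open <= ts < t_open + omega]
        rooms = {r1 for p1, r1 in window for p2, r2 in window
                 if r1 == r2 and p1 != p2}
        results.append(rooms)
        t_open += beta
    return results


# Try several alignments of the first window; none should produce an answer.
expected_empty = all(
    not rooms
    for t0 in (-2, -1, 0)
    for rooms in shared_rooms(DATA, omega=3, beta=3, t0=t0, horizon=15)
)
```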
Trying to make sense of it …
Query
SELECT ?room WHERE {
  STREAM <http://ex.org/s1> [RANGE 3s SLIDE 3s] {
    ?p1 :detectedAt ?room .
    ?p2 :detectedAt ?room }
  FILTER (?p1 != ?p2) }
Let's remove this filter
Data
S1 = <:m1 :detectedAt :r1>:[0]
S2 = <:m2 :detectedAt :r2>:[5]
S3 = <:m3 :detectedAt :r1>:[10]
S4 = <:m4 :detectedAt :r2>:[15]
Timeline
[Figure: S1, S2, S3, S4 on a time axis t at 0, 5, 10, 15, with windows of ω = β = 3]
Results of the new test
Is this caused by incorrect removal of the triples from the window?
Does throughput matter
without correctness?
Conclusions
The different operational semantics of existing
RDF stream engines affect the outputs and the
performance of those systems
Throughput measurements must be performed while testing correctness
Modeling RDF Stream Engines using SECRET allows for checking correctness
SRBench and LSBench should be extended with an "oracle" that checks correctness
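A rough sketch of what such an oracle might look like, using the simple test from earlier in the deck. It enumerates candidate values of t0, computes the answers each would produce, and reports which candidates explain an observed output. All names here (`evaluate`, `oracle`, `candidate_t0s`, `horizon`) are hypothetical, the query is hard-coded, and only t0 is varied. A real oracle would also have to model the report and tick dimensions of each engine.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str, int]  # (subject, predicate, object, timestamp)

ST: List[Triple] = [
    (":m1", ":detectedAt", ":r1", 1),
    (":m2", ":detectedAt", ":r1", 3),
    (":m1", ":detectedAt", ":r2", 12),
    (":m2", ":detectedAt", ":r2", 15),
]


def evaluate(stream: List[Triple], omega: int, beta: int, t0: int,
             horizon: int) -> List[FrozenSet[str]]:
    """Per-window answers of the 'same room' query for one choice of t0."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    out = []
    t_open = t0
    while t_open <= horizon:
        window = [tr for tr in stream if t_open <= tr[3] < t_open + omega]
        m1 = {o for s, _, o, _ in window if s == ":m1"}
        m2 = {o for s, _, o, _ in window if s == ":m2"}
        out.append(frozenset(m1 & m2))
        t_open += beta
    return out


def oracle(observed: List[Set[str]], stream: List[Triple], omega: int,
           beta: int, candidate_t0s: Iterable[int], horizon: int) -> List[int]:
    """Return the t0 values under which `observed` is an admissible output.

    Empty per-window answers are dropped before comparing, mimicking a
    'non-empty' report condition. An empty return value means that no
    modelled semantics explains the observed output.
    """
    def non_empty(seq):
        return [a for a in seq if a]

    target = non_empty([frozenset(a) for a in observed])
    return [t0 for t0 in candidate_t0s
            if non_empty(evaluate(stream, omega, beta, t0, horizon)) == target]


explained = oracle([{":r1"}, {":r2"}], ST, 10, 10, range(10), horizon=15)
unexplained = oracle([{":r1"}], ST, 10, 10, range(10), horizon=15)
```

An output is accepted when at least one modelled configuration reproduces it. An empty list flags a result that the model cannot justify, which is the case a benchmark would want to report alongside throughput.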
Thank you for your
attention: questions?
Daniele Dell’Aglio,
Marco Balduini, and
Emanuele Della Valle

Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 

Último (20)

MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 

On the need to include functional testing in RDF stream engine benchmarks

  • 1. On the need to include functional testing in RDF stream engine benchmarks Daniele Dell’Aglio, Marco Balduini, and Emanuele Della Valle 1st International Workshop on Benchmarking RDF Systems (BeRSys 2013) co-located with ESWC 2013 May 26th, 2013 Montpellier, France
  • 2. Agenda. Background on Data Stream Management Systems (DSMS) and RDF Stream Engines; Benchmarking RDF Stream Engines; Operational semantics of RDF Stream Engines; Testing the correctness of continuous queries' results; Conclusions. (BeRSys 2013 - May 26, 2013, Emanuele Della Valle - http://streamreasoning.org)
  • 3. Background: Data Stream Management Systems. What are data streams? Formally: data streams are unbounded sequences of time-varying data elements. Less formally: an (almost) "continuous" flow of information, with the recent information being more relevant as it describes the current state of a dynamic system.
  • 4. Background: Data Stream Management Systems. The nature of streams requires a paradigmatic change*: from persistent data, to be stored and queried on demand (a.k.a. one-time semantics), to transient data, to be consumed on the fly by continuous queries (a.k.a. continuous semantics). * This paradigmatic change first arose in the DB community.
  • 5. Background: Data Stream Management Systems. Continuous queries are registered over streams that, in most cases, are observed through windows. Window: stream-to-relation operators applied to the input streams. Registered continuous query: expressed using relation-to-relation operators. Streams of answers: produced by the relation-to-stream operators.
  • 6. Background: Data Stream Management Systems. Types of windows (a.k.a. stream-to-relation operators). Physical: a given number of data elements. Logical: a variable number of data elements which occur during a given time interval (e.g., 1 hour). Sliding: they are progressively advanced by a given STEP (e.g., 5 minutes). Tumbling: they are advanced by exactly their time interval.
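The difference between sliding and tumbling logical windows can be sketched in a few lines of Python. This is only an illustration under stated assumptions: timestamps are integers, windows are half-open intervals, and the function name `logical_windows` and its parameters (`t0`, `width`, `step`, `horizon`) are hypothetical and not taken from any engine.

```python
from typing import Iterator, Tuple


def logical_windows(t0: int, width: int, step: int, horizon: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open intervals [open, close) for windows opening before `horizon`.

    Assumption: the first window opens at t0 and each later one opens `step` later.
    """
    start = t0
    while start < horizon:
        yield (start, start + width)
        start += step


# Sliding: a 30-minute window advanced by a 5-minute STEP (values in minutes).
sliding = list(logical_windows(t0=0, width=30, step=5, horizon=15))

# Tumbling: the window is advanced by exactly its own width.
tumbling = list(logical_windows(t0=0, width=10, step=10, horizon=30))

print(sliding)
print(tumbling)
```

Consecutive sliding windows overlap, so one data element can fall into several of them; tumbling windows partition the timeline, so each element falls into exactly one.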
  • 7. Background: RDF Stream Engines. RDF Stream Engines port DSMS concepts into the Semantic Web by extending the RDF data model with the notion of RDF Stream (… <s_i p_i o_i> : [τ_i] <s_{i+1} p_{i+1} o_{i+1}> : [τ_{i+1}] …) and SPARQL to express and process continuous queries. Timestamps are non-decreasing to allow for expressing contemporaneity. Existing languages/engines: CQELS, SPARQLSTREAM, C-SPARQL.
  • 10. Background: An example of C-SPARQL query. Who are the opinion makers, i.e., the users who are likely to influence the behaviour of other users who follow them? REGISTER STREAM OpinionMakers COMPUTED EVERY 5m AS CONSTRUCT { ?opinionMaker sd:about ?resource } FROM STREAM <http://streamingsocialdata.org/interactions> [RANGE 30m STEP 5m] WHERE { ?opinionMaker ?opinion ?resource . ?follower sioc:follows ?opinionMaker . ?follower ?opinion ?resource . FILTER ( cs:timestamp(?follower) > cs:timestamp(?opinionMaker) && ?opinion != sd:accesses ) }
  • 11. Background: An example of C-SPARQL query (annotated). The same query as on slide 10, with callouts on the C-SPARQL features it uses: query registration (for continuous execution); FROM STREAM clause; WINDOW; RDF Stream added as new output format; builtin to access timestamps; aggregates as in SPARQL 1.1.
  • 12. Background: Benchmarking RDF stream engines. SRBench. Dataset: LinkedSensorData (real meteorological sensor data). Queries: 17 continuous queries, some requiring RDFS reasoning. KPI: feature coverage and correctness (not verified). LSBench. Dataset: synthetic, social-network-inspired data set. Queries: 12 continuous queries involving multiple streams and static knowledge. KPI: input throughput and correctness (verified comparing the number of results produced by different
  • 13. Is verifying correctness hard? Not for SPARQL: http://www.w3.org/2009/sparql/docs/tests/ provides queries plus expected results. However, it is hard for continuous (SPARQL) queries: 1 query → multiple correct results. Input data and query are not enough to determine the correct result.
  • 14. A simple test. Take the motivation scenario of CQELS: there are two connected rooms, r1 and r2; each room has a sensor able to detect the individuals inside, m1 and m2. The stream ST contains the following triples: S1 = <:m1 :detectedAt :r1>:[1], S2 = <:m2 :detectedAt :r1>:[3], S3 = <:m1 :detectedAt :r2>:[12], S4 = <:m2 :detectedAt :r2>:[15].
  • 15. The query of the simple test. We want to know when the two individuals m1 and m2 are in the same room, using a time-based tumbling window of 10 seconds. REGISTER QUERY SimpleTest AS SELECT ?room FROM STREAM <http://ex.org/ST> [RANGE 10s STEP 10s] WHERE { :m1 :detectedAt ?room . :m2 :detectedAt ?room }
  • 16. All the results you can obtain. Running the test in C-SPARQL, CQELS and SPARQLSTREAM, the following results can be obtained. Are they all correct? How can this be?
  • 17. These engines have different operational semantics!
  • 18. The devil is in the details! S1 = <:m1 :detectedAt :r1>:[1], S2 = <:m2 :detectedAt :r1>:[3], S3 = <:m1 :detectedAt :r2>:[12], S4 = <:m2 :detectedAt :r2>:[15]. [Figure: timeline of stream ST with S1 to S4 at t = 1, 3, 12, 15 and candidate windows W0 to W6.]
  • 19. Can the operational semantics of RDF Stream Engines be modelled? The results of a DSMS depend not only on the input and the query, but also on the system. A model has been proposed to explain the differences that appear between different DSMSs: SECRET.
  • 20. ScopE in the SECRET model. It is the time range of the active window, [t_open, t_close). It is determined using the size ω and slide β parameters of the window as written by the query issuer. [Figure: application timeline t_app from 1 to 7 with windows W1, W2, W3 labelled Closed, Open, and Active; ω = 3, β = 2.]
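As a rough illustration of how ω and β fix the scope, the sketch below lists every window interval [t_open, t_close) that contains a given application time. It is a simplification, not the SECRET formalization itself: `scopes_containing` is a hypothetical helper, timestamps are integers, and the start of the first window `t0` is passed in explicitly as an assumption.

```python
from typing import List, Tuple


def scopes_containing(t: int, t0: int, omega: int, beta: int) -> List[Tuple[int, int]]:
    """Return every half-open window [t_open, t_close) that contains time t.

    Assumption: window k opens at t0 + k * beta and has size omega.
    """
    if t < t0:
        return []  # no window has opened yet
    # Smallest k whose window has not yet closed at t, and largest k already open.
    k_min = max(0, (t - t0 - omega) // beta + 1)
    k_max = (t - t0) // beta
    return [(t0 + k * beta, t0 + k * beta + omega) for k in range(k_min, k_max + 1)]


# Parameters from the figure: size omega = 3, slide beta = 2.
print(scopes_containing(4, t0=0, omega=3, beta=2))
print(scopes_containing(3, t0=0, omega=3, beta=2))
```

With β < ω the windows overlap, so a single time instant can belong to more than one scope; which of them a system treats as the active window is one of the places where engines may differ.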
  • 21. Content in the SECRET model. It is the subset of the stream included in the active window. It is determined using the size ω and slide β parameters of the window and t0, the time instant at which the first window starts. [Figure: stream elements at t = 1, 3, 12, 15 with windows W0 to W3 for ω = β = 10 and different values of t0.]
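The effect of t0 on window content can be reproduced on the simple test stream (S1 to S4, ω = β = 10). The Python sketch below is a toy model under explicit assumptions: no window opens before t0, windows are half-open, and the helper names (`window_contents`, `same_room`, `answers`) are hypothetical. It evaluates the "m1 and m2 in the same room" pattern once per window.

```python
from typing import List, Tuple

# (individual, room, timestamp) for S1..S4 of the simple test.
STREAM = [("m1", "r1", 1), ("m2", "r1", 3), ("m1", "r2", 12), ("m2", "r2", 15)]


def window_contents(stream, t0: int, omega: int, beta: int, horizon: int):
    """Pair each window [start, start + omega) opening before `horizon` with its content."""
    contents = []
    start = t0
    while start < horizon:
        end = start + omega
        inside = [(who, room) for who, room, ts in stream if start <= ts < end]
        contents.append(((start, end), inside))
        start += beta
    return contents


def same_room(content: List[Tuple[str, str]]) -> List[str]:
    """Rooms in which both m1 and m2 were detected within one window content."""
    rooms_m1 = {room for who, room in content if who == "m1"}
    rooms_m2 = {room for who, room in content if who == "m2"}
    return sorted(rooms_m1 & rooms_m2)


def answers(t0: int) -> List[List[str]]:
    """Per-window answers for omega = beta = 10, windows opening before t = 20."""
    return [same_room(c) for _, c in window_contents(STREAM, t0, 10, 10, 20)]


for t0 in (0, 2, 5):
    print(t0, answers(t0))
```

Under these assumptions, t0 = 0 puts S1 and S2 in one window and S3 and S4 in the next, whereas t0 = 5 separates S3 from S4, so the same query over the same stream yields different answers.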
  • 22. Report in the SECRET model. It defines the conditions under which the window contents become visible for further query evaluation and result reporting. It can take a logical combination of the following: content change, window close, non-empty content, periodic.
  • 23. Explaining the results of C-SPARQL. The reporting strategy of C-SPARQL is window close and non-empty result. [Figure: timeline of ST with S1 to S4 at t = 1, 3, 12, 15 and windows W0 to W6; ω = β = 10.]
  • 24. Explaining the results of SPARQLSTREAM. The reporting strategy of SPARQLSTREAM is window close. [Figure: timeline of ST with S1 to S4 at t = 1, 3, 12, 15 and windows W0 to W6; ω = β = 10.]
  • 25. Explaining the results of CQELS. The reporting strategy of CQELS is content change and non-empty result. [Figure: timeline of ST with S1 to S4 at t = 1, 3, 12, 15 and windows W0 to W6; ω = β = 10.]
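The three reporting strategies named on slides 23 to 25 can be mimicked with a toy simulation. This is a sketch under strong assumptions, not a description of how C-SPARQL, SPARQLSTREAM, or CQELS are implemented: windows are tumbling (β = ω), the first window opens at a chosen t0, and all function names are hypothetical.

```python
from typing import List, Tuple

Triple = Tuple[str, str, int]  # (individual, room, timestamp)
STREAM: List[Triple] = [("m1", "r1", 1), ("m2", "r1", 3), ("m1", "r2", 12), ("m2", "r2", 15)]


def evaluate(content: List[Triple]) -> List[str]:
    """Rooms where both m1 and m2 appear in the given window content."""
    m1 = {room for who, room, _ in content if who == "m1"}
    m2 = {room for who, room, _ in content if who == "m2"}
    return sorted(m1 & m2)


def content_at(stream: List[Triple], start: int, omega: int, now: int) -> List[Triple]:
    """Triples of window [start, start + omega) that have arrived by time `now`."""
    return [t for t in stream if start <= t[2] < start + omega and t[2] <= now]


def report_on_window_close(stream, t0: int, omega: int, non_empty: bool, horizon: int):
    """Report once per tumbling window, at its closing time."""
    out = []
    start = t0
    while start + omega <= horizon:
        close = start + omega
        result = evaluate(content_at(stream, start, omega, close))
        if result or not non_empty:
            out.append((close, result))
        start += omega
    return out


def report_on_content_change(stream, t0: int, omega: int):
    """Re-evaluate at every arrival; report only non-empty results."""
    out = []
    for _, _, ts in stream:
        if ts < t0:
            continue  # arrived before the first window opened
        start = t0 + ((ts - t0) // omega) * omega
        result = evaluate(content_at(stream, start, omega, ts))
        if result:
            out.append((ts, result))
    return out


close_only = report_on_window_close(STREAM, t0=2, omega=10, non_empty=False, horizon=22)
close_non_empty = report_on_window_close(STREAM, t0=2, omega=10, non_empty=True, horizon=22)
change_non_empty = report_on_content_change(STREAM, t0=2, omega=10)
print(close_only, close_non_empty, change_non_empty)
```

Each pair is (reporting time, answer). Even with identical window contents, the three strategies differ in how many results they emit and when, which is why comparing only the number of results across engines is not a reliable correctness check.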
  • 26. And so what?
  • 27. Let's test correctness! Data: S1 = <:m1 :detectedAt :r1>:[0], S2 = <:m2 :detectedAt :r2>:[5], S3 = <:m3 :detectedAt :r1>:[10], S4 = <:m4 :detectedAt :r2>:[15]. Query: SELECT ?room WHERE { STREAM <http://ex.org/s1> [RANGE 3s SLIDE 3s] { ?p1 :detectedAt ?room . ?p2 :detectedAt ?room } FILTER (?p1 != ?p2) }. Timeline: S1 to S4 at t = 0, 5, 10, 15; ω = β = 3.
  • 28. Results of the test. Why?
  • 29. Trying to make sense of it … Same data, query, and timeline as on slide 27 (S1 to S4 at t = 0, 5, 10, 15; ω = β = 3). Let's remove the filter FILTER (?p1 != ?p2) from the query.
  • 30. Results of the new test. Is this caused by incorrect removal of the triples from the window?
  • 31. Does throughput matter without correctness?
  • 32. Conclusions. The different operational semantics of existing RDF stream engines affect the outputs and the performance of those systems. Throughput measurements must be performed while testing correctness. Modelling RDF stream engines using SECRET allows for checking correctness. SRBench and LSBench should be extended with an "oracle" that checks correctness.
  • 33. Thank you for your attention: questions? Daniele Dell’Aglio, Marco Balduini, and Emanuele Della Valle