Dipartimento di
Elettronica, Informazione e
Bioingegneria
An Experience on Empirical
Research about RDF Stream
Processing
Daniele Dell’Aglio – daniele.dellaglio@polimi.it
Joint work with: Jean-Paul Calbimonte, Marco Balduini, Oscar Corcho
and Emanuele Della Valle
RDF Stream Processing in a nutshell
• Continuous queries over RDF streams, i.e. infinite sequences of time-stamped RDF statements
• Brings together the DSMS/CEP and Semantic Web research fields
• Several prototypes, with similar models, are available today
• Trend towards evaluating and comparing the existing systems
26 May 2014 - EMPIRICAL@ESWC2014
Daniele Dell'Aglio - Experimental research about RSPs
The CQL model for RSPs
Input stream → S2R operator → R2R operator → R2S operator → Output stream
• S2R operator: converts the infinite stream of RDF elements into a finite set of mappings
  – Window operators: time-based, tuple-based, …
• R2R operator: transforms a set of mappings into another set of mappings
  – SPARQL 1.0/1.1 queries
• R2S operator: each set of mappings produced by the R2R operator is transformed and appended to the output stream
  – Operators: RStream, DStream, IStream
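As a rough illustration of how the three operator classes compose, the sketch below chains an S2R window, an R2R pattern match and an R2S RStream step over a list of timestamped triples. All function names are hypothetical, the window is assumed to be left-closed and right-open, and the R2R step is a plain Python filter standing in for a SPARQL query. It is not how any of the engines is implemented.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
Element = Tuple[Triple, int]          # (RDF triple, timestamp)
Mapping = Dict[str, str]              # variable name -> bound value


def s2r_time_window(stream: List[Element], start: int, end: int) -> List[Triple]:
    """S2R: keep the triples whose timestamp falls in [start, end) (assumed convention)."""
    return [triple for triple, ts in stream if start <= ts < end]


def r2r_is_in(window: List[Triple]) -> List[Mapping]:
    """R2R: stand-in for the SPARQL pattern { ?who :isIn ?room }."""
    return [{"who": s, "room": o} for s, p, o in window if p == ":isIn"]


def r2s_rstream(mappings: List[Mapping], report_time: int) -> List[Tuple[Mapping, int]]:
    """R2S (RStream): append every current mapping to the output, stamped with the report time."""
    return [(m, report_time) for m in mappings]


stream = [((":alice", ":isIn", ":hall"), 1), ((":bob", ":isIn", ":hall"), 3)]
output = r2s_rstream(r2r_is_in(s2r_time_window(stream, 0, 5)), report_time=5)
```

Here `output` holds one mapping per triple in the window, each stamped with the report time 5.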
S2R - Time-based sliding window
[Figure: stream S with elements S1 to S12 on a time axis t, covered by a sliding window W(ω,β) with width ω and slide β; the window content is passed to the R2R operator]
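A minimal sketch of how the scopes of a time-based window W(ω,β) can be enumerated is given below. The function name, the explicit start instant `t0` and the left-closed, right-open interval convention are assumptions made for illustration only.

```python
from typing import Iterator, Tuple


def window_scopes(t0: int, width: int, slide: int, until: int) -> Iterator[Tuple[int, int]]:
    """Yield the [open, close) intervals of W(width, slide) that close no later than `until`."""
    opening = t0
    while opening + width <= until:
        yield (opening, opening + width)
        opening += slide


# Sliding window: width 5, slide 2, starting at t0 = 0.
sliding = list(window_scopes(t0=0, width=5, slide=2, until=11))
# Tumbling window: width equals slide, so consecutive windows do not overlap.
tumbling = list(window_scopes(t0=0, width=5, slide=5, until=10))
```

With a slide smaller than the width, consecutive windows overlap and a stream element can appear in more than one window.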
Implementations (oversimplified!)
• C-SPARQL
  – RDF Store + Stream processor
  – Continuous query → translator → RDF Store and Stream processor → continuous results
• CQELS
  – Implemented from scratch. Focus on performance
  – Continuous query → Native RSP → continuous results
• SPARQLstream
  – Ontology-based stream query answering
  – Continuous query → rewriter (with R2RML mappings) → DSMS/CEP → continuous results
Same inputs, different outputs…
• Let’s consider the following stream S:

Element  t  Content
S1       1  :alice :isIn :hall
S2       3  :bob :isIn :hall
S3       6  :alice :isIn :kitchen
S4       9  :bob :isIn :kitchen

• And the continuous query:
  – Where are Alice and Bob, when they are together?
  – With a tumbling window W(ω=β=5)
• After 4 executions:

Execution  1st answer  2nd answer
1          :hall [6]   :kitchen [11]
2          :hall [5]   :kitchen [10]
3          :hall [6]   :kitchen [11]
4          - [7]       - [12]
The first hypothesis
• All three systems show similar behaviour
• Intuition: one or more parameters are not taken into account by the model
• As a consequence, the implementations can output different correct answers
The first hypothesis
• HP1: it is possible to have a unique correct answer if we can control the time instant at which the sliding window operator starts to work (t0)
[Figure: the stream S (S1 at t=1, S2 at t=3, S3 at t=6, S4 at t=9) with the tumbling windows obtained for t0=0, t0=1 and t0=2]
The experiment
• We work on the difference between the time instant at which the stream starts (ts) and the query registration time (tq)
  – At each execution, we check the result
  – We estimate the delay between tq and t0
• Black box approach
  – we work on inputs/outputs
  – the source code of all the systems
[Figure: timeline showing ts, tq and t0 around the RSP]
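One way such an estimate could be obtained from black-box observations is sketched below. It assumes that the engine reports each time a window closes, so that the i-th report happens at t0 + ω + i·β. The function name and this reporting assumption are illustrative and are not taken from the slides.

```python
from typing import List


def estimate_t0(report_times: List[int], width: int, slide: int) -> int:
    """Infer the window start t0 from observed report timestamps.

    Assumes window-close reporting: the i-th report occurs at t0 + width + i * slide.
    """
    if not report_times:
        raise ValueError("at least one report is needed")
    candidates = {t - width - i * slide for i, t in enumerate(report_times)}
    if len(candidates) != 1:
        raise ValueError(f"reports are not consistent with a single t0: {sorted(candidates)}")
    return candidates.pop()


# Report times taken from the executions listed earlier (tumbling window, width = slide = 5).
t0_a = estimate_t0([6, 11], width=5, slide=5)
t0_b = estimate_t0([5, 10], width=5, slide=5)
```

Given the estimate, the delay between registration and window start is simply `t0 - tq` for a known registration time `tq`.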
Observation and explanation
• As a result, for each system:
  – We identified the value of the t0 parameter
  – We are able to produce the different results for each t0 value
• Is it enough to claim that hypothesis 1 holds?

Exec  1st answer  2nd answer
1     :hall [6]   :kitchen [11]
2     :hall [5]   :kitchen [10]
3     :hall [6]   :kitchen [11]
4     - [7]       - [12]

Window  1st answer  2nd answer
t0=0    :hall [5]   :kitchen [10]
t0=1    :hall [6]   :kitchen [11]
t0=2    - [7]       - [12]
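The dependence of the answer on t0 can be reproduced with a small simulation. The sketch below assumes left-closed, right-open tumbling windows [t0 + i·ω, t0 + (i+1)·ω) and reporting at window close. These conventions are assumptions chosen because they reproduce the table above, and the helper names are hypothetical.

```python
from typing import List, Optional, Tuple

# Stream from the slides: (person, room, timestamp).
STREAM = [
    (":alice", ":hall", 1),
    (":bob", ":hall", 3),
    (":alice", ":kitchen", 6),
    (":bob", ":kitchen", 9),
]


def together(window: List[Tuple[str, str, int]]) -> Optional[str]:
    """Return a room in which both Alice and Bob appear in this window, or None."""
    rooms_alice = {room for who, room, _ in window if who == ":alice"}
    rooms_bob = {room for who, room, _ in window if who == ":bob"}
    shared = rooms_alice & rooms_bob
    return min(shared) if shared else None


def answers(t0: int, width: int = 5, executions: int = 2) -> List[Tuple[Optional[str], int]]:
    """Evaluate the query on consecutive tumbling windows [open, open + width)."""
    result = []
    for i in range(executions):
        opening = t0 + i * width
        closing = opening + width
        content = [e for e in STREAM if opening <= e[2] < closing]
        result.append((together(content), closing))  # report at window close
    return result


for t0 in (0, 1, 2):
    print(f"t0={t0}", answers(t0))
```

With t0=2 the first window contains Bob in the hall and Alice in the kitchen, and the second contains only Bob, so neither window yields an answer.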
Some considerations on the experiment
• Comparison:
  – We ran the experiment multiple times to collect instances and check them
• Reproducibility: can other researchers reproduce the experiment?
  – We released both the code and the data used for the experiment (see http://streamreasoning.org/Benchmarks/)
• Repeatability: is the result universally valid?
  – We changed the inputs (streams and queries) and the OS/JVM to verify whether the hypothesis holds
  – We repeated the experiment with different implementations (C-SPARQL, CQELS, etc.)
Something more on repeatability…
• We made some assumptions on the setting
[Figure: S2R → R2R → R2S pipeline annotated with: from single to multi window, from single to multi stream, reasoning, static knowledge, multiple queries (q2)]
A more complex problem…
• As a “side effect” of the first experiment, we discovered that the results of different systems are not the same:

C-SPARQL
Exec  1st answer  2nd answer
1     :hall [6]   :kitchen [11]
2     :hall [5]   :kitchen [10]
3     :hall [6]   :kitchen [11]
4     - [7]       - [12]

CQELS
Exec  1st answer  2nd answer
1     :hall [3]   :kitchen [9]
2     No answers
3     :hall [3]   :kitchen [9]
4     No answers

• Intuition: t0 is not the only parameter our model lacks
The SECRET framework
[Figure: stream S with elements S1 to S12 on a time axis t, a sliding window W(ω,β) with width ω and slide β, feeding the R2R operator]
• t0: When does the window start? (internal window parameter)
• TICK: When are data stream elements added to the window?
  – Triple-based vs graph-based
• REPORT: When is the window content made available to the R2R operator?
  – Non-empty content, Content-change, Window-close, Periodic
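The four REPORT strategies can be read as simple predicates over the window state. The sketch below is one simplified reading of them and not a faithful encoding of the SECRET definitions: the enum, the function name and the `period` parameter are hypothetical.

```python
from enum import Enum, auto
from typing import FrozenSet


class Report(Enum):
    NON_EMPTY_CONTENT = auto()
    CONTENT_CHANGE = auto()
    WINDOW_CLOSE = auto()
    PERIODIC = auto()


def should_report(
    strategy: Report,
    now: int,
    window_close: int,
    content: FrozenSet[str],
    previous_content: FrozenSet[str],
    period: int = 1,
) -> bool:
    """Decide whether the window content is handed to the R2R operator at time `now`."""
    if strategy is Report.NON_EMPTY_CONTENT:
        return len(content) > 0
    if strategy is Report.CONTENT_CHANGE:
        return content != previous_content
    if strategy is Report.WINDOW_CLOSE:
        return now == window_close
    if strategy is Report.PERIODIC:
        return now % period == 0
    raise ValueError(f"unknown strategy: {strategy}")


# At t=3 a second element enters a window that closes at t=5.
before = frozenset({"S1"})
after = frozenset({"S1", "S2"})
on_change = should_report(Report.CONTENT_CHANGE, 3, 5, after, before)
on_close = should_report(Report.WINDOW_CLOSE, 3, 5, after, before)
```

Under this reading, a content-change engine would already report at t=3, whereas a window-close engine would wait until t=5. This is consistent with the different timestamps in the C-SPARQL and CQELS tables.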
SECRET and RSPs
• HP2: given an input stream, a query, the value of t0 and a description of the RSP w.r.t. SECRET, we can determine the answer that the system will provide
• To investigate it, we built software that computes the expected answer in batch and matches it against the one produced by the RSP
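The matching step can be pictured with the following sketch. It is a simplification under stated assumptions and not the code of the authors' tool: the expected answers are hard-coded from the t0 table shown earlier, and an observed output is accepted if it equals the expected output for some t0. All names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Answer = Tuple[Optional[str], int]   # (room or None, report timestamp)

# Expected answers per t0, copied from the t0 table in the slides.
EXPECTED: Dict[int, List[Answer]] = {
    0: [(":hall", 5), (":kitchen", 10)],
    1: [(":hall", 6), (":kitchen", 11)],
    2: [(None, 7), (None, 12)],
}


def matching_t0(observed: List[Answer]) -> Optional[int]:
    """Return the t0 whose expected answers equal the observed ones, or None if no t0 explains them."""
    for t0, expected in EXPECTED.items():
        if observed == expected:
            return t0
    return None


execution_1 = [(":hall", 6), (":kitchen", 11)]    # explained by t0 = 1
other_engine = [(":hall", 3), (":kitchen", 9)]     # not explained by any t0 alone
```

A `None` result signals that t0 alone does not explain the output, so either another parameter (such as the report strategy) differs or the engine misbehaves.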
Observation and analysis
• We prepared a set of seven queries (to stress different parts of the sliding window)
• We ran each query multiple times
• Most of the time, we can foresee the answer that will be provided
[Figure: results for queries Q1 to Q7 on CQELS, C-SPARQL and SPARQLstream]
Observation and analysis
• We investigated the observations where there is no match and discovered that they were caused by errors in the implementations, concerning:
  – Initialization
  – Slide parameter
  – Window contents
  – Internal timestamp management
• Conclusion: HP2 seems to be valid in the considered setting
CSR-bench
• The main outcome of our experience is CSR-bench, an extension of the SRBench benchmark
  – More info at http://www.w3.org/wiki/CSRBench
• Three main components:
  – A common model for the RDF stream processor operational semantics
  – An oracle (an automatic correctness validator), available at https://github.com/dellaglio/csrbench-oracle
  – A test suite
Thank you! Questions?
An Experience on Empirical Research about
RDF Stream Processing
Daniele Dell’Aglio
(DEIB, Politecnico di Milano)
daniele.dellaglio@polimi.it

An experience on empirical research about rdf stream

  • 1. Dipartimento di Elettronica, Informazione e Bioingegneria An Experience on Empirical Research about RDF Stream Processing Daniele Dell’Aglio – daniele.dellaglio@polimi.it Joint work with: Jean-Paul Calbimonte, Marco Balduini, Oscar Corcho and Emanuele Della Valle
  • 2. Dipartimento di Elettronica, Informazione e Bioingegneria RDF Stream Processing in a nutshell  Continuous queries over RDF streams - infinite sequences of time-stamped RDF statements (RDF streams)  Bring together DSMS/CEP and Semantic Web research fields  Several prototypes – with similar models – are available today  Trend on evaluation and comparison of the existing systems 26 May 2014 - EMPIRICAL@ESWC2014 DanieleDell'Aglio-ExperimentalresearchaboutRSPs 2
  • 3. Dipartimento di Elettronica, Informazione e Bioingegneria The CQL model for RSPs  Transform a set of mappings in another set of mappings  SPARQL 1.0/1.1 queries  Each set of mapping produced by the R2R operator is transformed and appended to the output stream  Operators: RStream, DStream, IStream  Converts the infinite stream of RDF elements in a finite set of mappings  The window operators: time-based, tuple-based, … S2R operator R2R operator R2S operator Input stream Output stream DanieleDell'Aglio-ExperimentalresearchaboutRSPs 3 26 May 2014 - EMPIRICAL@ESWC2014
  • 4. Dipartimento di Elettronica, Informazione e Bioingegneria R2R operator S2R - Time-based sliding window S3 S4 S5 S6 S7 S8 S9 S10 S11 S12 S S1 S2 W(ω,β) β ω t widthslide DanieleDell'Aglio-ExperimentalresearchaboutRSPs 4 26 May 2014 - EMPIRICAL@ESWC2014
  • 5. Dipartimento di Elettronica, Informazione e Bioingegneria Implementations (oversimplified!)  C-SPARQL – RDF Store + Stream processor   RDF Store Stream processor Continuous query continuous results translator DanieleDell'Aglio-ExperimentalresearchaboutRSPs 5 26 May 2014 - EMPIRICAL@ESWC2014
  • 6. Dipartimento di Elettronica, Informazione e Bioingegneria Implementations (oversimplified!)  C-SPARQL – RDF Store + Stream processor  CQELS: – Implemented from scratch. Focus on performance  RDF Store Stream processor Continuous query continuous results Native RSP Continuous query continuous results translator DanieleDell'Aglio-ExperimentalresearchaboutRSPs 5 26 May 2014 - EMPIRICAL@ESWC2014
  • 7. Dipartimento di Elettronica, Informazione e Bioingegneria Implementations (oversimplified!)  C-SPARQL – RDF Store + Stream processor  CQELS: – Implemented from scratch. Focus on performance  SPARQLstream: – Ontology-based stream query answering RDF Store Stream processor Continuous query continuous results Native RSP Continuous query continuous results translator DSMS/CEP Continuous query continuous results rewriter R2RML mappings DanieleDell'Aglio-ExperimentalresearchaboutRSPs 5 26 May 2014 - EMPIRICAL@ESWC2014
  • 8. Dipartimento di Elettronica, Informazione e Bioingegneria Same inputs, different outputs…  And the continuous query: – Where are Alice and Bob, when they are together? – With a tumbling window W(ω=β=5) Execution 1° answer 2° answer 1 :hall [6] :kitchen [11] 2 :hall [5] :kitchen [10] 3 :hall [6] :kitchen [11] 4 - [7] - [12] S1 S2 S3 S4S t3 6 91 :alice :isIn :hall :bob :isIn :hall :alice :isIn :kitchen :bob :isIn :kitchen width slide  After 4 executions:  Let’s consider the following stream: DanieleDell'Aglio-ExperimentalresearchaboutRSPs 8 26 May 2014 - EMPIRICAL@ESWC2014
  • 9. Dipartimento di Elettronica, Informazione e Bioingegneria The first hypothesis  All the three systems show similar behaviours  Intuition: there are one or more parameters that are not taken into account by the model  As consequence, the implementations can output different correct answers DanieleDell'Aglio-ExperimentalresearchaboutRSPs 9 26 May 2014 - EMPIRICAL@ESWC2014
  • 10. Dipartimento di Elettronica, Informazione e Bioingegneria The first hypothesis  HP1: it is possible to have a unique correct answer if we can control the time instant on which the sliding window operator starts to work (t0) S1 S2 S3 S4S t3 6 91 :bob :isIn :hall :bob :isIn :kitchen t0=0 :alice :isIn :hall :alice :isIn :kitchen t0=1 t0=2 DanieleDell'Aglio-ExperimentalresearchaboutRSPs 10 26 May 2014 - EMPIRICAL@ESWC2014
  • 11. Dipartimento di Elettronica, Informazione e Bioingegneria The experiment  We work on the difference between the time instant on which the stream starts (ts) and the query registration time (tq) – At each execution, we check the result – We estimated the delay between tq and t0 tq ts  Black box approach – we work on inputs/outputs – the source code of all the systems RSP DanieleDell'Aglio-ExperimentalresearchaboutRSPs 11 26 May 2014 - EMPIRICAL@ESWC2014 t0
  • 12. Dipartimento di Elettronica, Informazione e Bioingegneria Observation and explanation  As result, for each system – We identified the value of the t0 parameter – We are able to produce the different results for each t0 value  Is it enough to claim that hypothesis 1 holds? Exec 1° answer 2° answer 1 :hall [6] :kitchen [11] 2 :hall [5] :kitchen [10] 3 :hall [6] :kitchen [11] 4 - [7] - [12] Window 1° answer 2° answer t0=0 :hall [5] :kitchen [10] t0=1 :hall [6] :kitchen [11] t0=2 - [7] - [12] DanieleDell'Aglio-ExperimentalresearchaboutRSPs 12 26 May 2014 - EMPIRICAL@ESWC2014
  • 13. Dipartimento di Elettronica, Informazione e Bioingegneria Some consideration on the experiment  Comparison: – We ran the experiment multiple times to collect instances and check them  Reproducibility: can other researchers reproduce the experiment? – We released both the code and the data used for the experiment (see http://streamreasoning.org/Benchmarks/)  Repeatability: is the result universally valid? – We changed inputs (streams and queries) and OS/JVM to verify if the hypothesis holds – We repeated the experiment with different implementations (C-SPARQL, CQELS, etc.) DanieleDell'Aglio-ExperimentalresearchaboutRSPs 13 26 May 2014 - EMPIRICAL@ESWC2014
  • 14. Dipartimento di Elettronica, Informazione e Bioingegneria Something more on repeatability…  We made some assumptions on the setting 26 May 2014 - EMPIRICAL@ESWC2014 DanieleDell'Aglio-ExperimentalresearchaboutRSPs 14 S2R R2R R2SS2R S2R From single to multi window From single to multi stream Reasoning q2 Static knowledge Multiple queries
  • 15. Dipartimento di Elettronica, Informazione e Bioingegneria  As “side effect” of the first experiment, we discovered that results of different systems are not the same:  Intuition: t0 is not the only parameter our model lacks A more complex problem… Exec 1° answer 2° answer 1 :hall [6] :kitchen [11] 2 :hall [5] :kitchen [10] 3 :hall [6] :kitchen [11] 4 - [7] - [12] Exec 1° answer 2° answer 1 :hall [3] :kitchen [9] 2 No answers 3 :hall [3] :kitchen [9] 4 No answers C-SPARQL CQELS DanieleDell'Aglio-ExperimentalresearchaboutRSPs 15 26 May 2014 - EMPIRICAL@ESWC2014
  • 16. The SECRET framework
      [Figure: time-based sliding window W(ω,β) over the stream S (elements S1 to S12) along the time axis t, feeding the R2R operator]
      - t0: When does the window start? (internal window parameter)
      - TICK: When are data stream elements added to the window? Triple-based vs graph-based
      - REPORT: When is the window content made available to the R2R operator? Non-empty content, Content-change, Window-close, Periodic
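As a rough illustration of the three SECRET dimensions on slide 16, the following Python sketch encodes an RSP description as a small data structure and decides whether the window content should be handed to the R2R operator. All class, field and function names are hypothetical. Treating the configured REPORT strategies as a conjunction, and the `period` parameter, are assumptions of this sketch rather than something the slides state.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Tick(Enum):
    TRIPLE_BASED = auto()
    GRAPH_BASED = auto()


class Report(Enum):
    NON_EMPTY_CONTENT = auto()
    CONTENT_CHANGE = auto()
    WINDOW_CLOSE = auto()
    PERIODIC = auto()


@dataclass(frozen=True)
class SecretDescription:
    t0: int            # instant at which the first window starts
    tick: Tick         # granularity at which elements enter the window
    report: frozenset  # set of Report strategies that must all hold
    period: int = 1    # hypothetical parameter, used only by PERIODIC


def should_report(desc, now, window_close, content, previous_content):
    """Return True if every configured REPORT strategy is satisfied."""
    checks = {
        Report.NON_EMPTY_CONTENT: len(content) > 0,
        Report.CONTENT_CHANGE: content != previous_content,
        Report.WINDOW_CLOSE: now == window_close,
        Report.PERIODIC: now % desc.period == 0,
    }
    return all(checks[strategy] for strategy in desc.report)


if __name__ == "__main__":
    desc = SecretDescription(
        t0=0,
        tick=Tick.TRIPLE_BASED,
        report=frozenset({Report.WINDOW_CLOSE, Report.NON_EMPTY_CONTENT}),
    )
    print(should_report(desc, now=5, window_close=5,
                        content=[":hall"], previous_content=[]))
```

The `tick` field is carried in the description but not interpreted here; it would matter in a component that decides when incoming elements are added to the window.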
  • 17. SECRET and RSPs
      - HP2: given an input stream, a query, the value of t0 and a description of the RSP w.r.t. SECRET, we can determine the answer that will be provided by the system
      - To investigate it, we built a tool that computes the expected answer in batch and compares it with the answer produced by the RSP
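The batch check described on slide 17 can be sketched in Python as follows. This is not the code of the actual tool. It is a minimal illustration assuming a time-based window with half-open intervals [open, close), a report policy of "window close with non-empty result", and a hypothetical stream of (payload, timestamp) pairs that is not taken from the slides. The names `expected_answers` and `matches` are illustrative.

```python
def expected_answers(stream, t0, width, slide, query, horizon):
    """Batch-evaluate `query` over every window that closes by `horizon`.

    Assumptions: windows are half-open intervals [open, close) and an
    answer is reported at window close only if the result is non-empty.
    """
    answers = []
    open_t = t0
    while open_t + width <= horizon:
        close_t = open_t + width
        content = [payload for payload, ts in stream if open_t <= ts < close_t]
        result = query(content)
        if result:
            answers.append((close_t, result))
        open_t += slide
    return answers


def matches(expected, observed):
    """Compare the batch-computed answers with those observed from an RSP."""
    return expected == observed


if __name__ == "__main__":
    # Hypothetical input: two elements with timestamps 3 and 8.
    stream = [(":hall", 3), (":kitchen", 8)]
    query = lambda content: sorted(set(content))  # stand-in for the R2R step
    expected = expected_answers(stream, t0=0, width=5, slide=5,
                                query=query, horizon=12)
    observed = [(5, [":hall"]), (10, [":kitchen"])]  # pretend RSP output
    print(matches(expected, observed))
```

Keeping the window semantics, the report policy and the query as explicit inputs mirrors the idea behind HP2: once these are fixed, the expected answer is fully determined and any mismatch points at the system under test.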
  • 18. Observation and analysis
      - We prepared a set of seven queries (to stress different parts of the sliding window)
      - We ran each query multiple times
      - Most of the time, we can predict the answer that will be provided
      [Table: results for queries Q1 to Q7 on CQELS, C-SPARQL and SPARQLstream; cell values not preserved in this transcript]
  • 19. Observation and analysis
      - We investigated the observations where there was no match, and we discovered that they were due to errors in the implementations, such as:
          - Initialization
          - Slide parameter
          - Window contents
          - Internal timestamp management
      - Conclusion: HP2 seems to be valid in the considered setting
  • 20. CSR-bench
      - The main outcome of our experience is CSR-bench, an extension of the SRBench benchmark
          - More info at http://www.w3.org/wiki/CSRBench
      - Main components:
          - A common model for the RDF stream processor operational semantics
          - An oracle (an automatic correctness validator), available at https://github.com/dellaglio/csrbench-oracle
          - A test suite
  • 21. References
      - Daniele Dell'Aglio, Marco Balduini, Emanuele Della Valle: On the need to include functional testing in RDF stream engine benchmarks. 1st International Workshop on Benchmarking RDF Systems (BeRSys2013)
      - Daniele Dell'Aglio, Jean-Paul Calbimonte, Marco Balduini, Óscar Corcho, Emanuele Della Valle: On Correctness in RDF Stream Processor Benchmarking. International Semantic Web Conference (2) 2013: 326-342
      - Barbieri, D.F., Braga, D., Ceri, S., Della Valle, E., Grossniklaus, M.: C-SPARQL: A continuous query language for RDF data streams. IJSC 4(1) (2010) 3-25
      - Calbimonte, J.P., Jeung, H., Corcho, O., Aberer, K.: Enabling Query Technologies for the Semantic Sensor Web. IJSWIS 8(1) (2012) 43-63
      - Le-Phuoc, D., Dao-Tran, M., Xavier Parreira, J., Hauswirth, M.: A native and adaptive approach for unified processing of linked streams and linked data. In: ISWC. (2011) 370-388
  • 22. Thank you! Questions?
      An Experience on Empirical Research about RDF Stream Processing
      Daniele Dell'Aglio (DEIB, Politecnico di Milano), daniele.dellaglio@polimi.it