Stream reasoning blends artificial intelligence and stream processing to make sense of multiple, heterogeneous data streams in real time. It supports querying and reasoning over data streams, using ontologies to represent the streaming data. Deductive stream reasoning relies on rules and ontologies, while inductive stream reasoning uses machine learning to learn continuously from streaming data and adapt to concept drift. More than 1000 scientific papers have studied stream reasoning over the last 12 years, and the approach shows promise in addressing the volume, velocity, variety and veracity challenges of big streaming data.
Stream Reasoning: An Approach to Blend AI and Stream Processing
1. STREAM REASONING
AN APPROACH TO BLEND
AI AND STREAM PROCESSING
Emanuele Della Valle
Politecnico di Milano
http://emanueledellavalle.org
@manudellavalle
Milano - 17.10.2019
2. DATA ENGINEERING AND DATA SCIENCE TECHS
CAN TAME VOLUME
Emanuele Della Valle - http://emanueledellavalle.org - @manudellavalle
3. DATA ENGINEERING AND DATA SCIENCE TECHS
CAN TAME VELOCITY
4. DATA ENGINEERING AND DATA SCIENCE TECHS
CANNOT TAME VOLUME AND VELOCITY SIMULTANEOUSLY
[Chart: data volume (KB, MB, GB, TB, PB, EB, ZB) against response time (months, days, hours, min., sec., ms.)]
5. DATA ENGINEERING AND DATA SCIENCE TECHS
CAN TAME VARIETY USING AI
6. DATA ENGINEERING AND DATA SCIENCE TECHS
VARIETY MAKES PROBLEMS HARDER
[Chart: data volume (KB, MB, GB, TB, PB, EB, ZB) against response time (months, days, hours, min., sec., ms.), with VARIETY as a third dimension]
STILL THERE ARE USERS
WHOSE DECISIONS
NEED TO TAME ALL Vs
7. STILL THERE ARE USERS WHOSE DECISIONS NEED TO TAME ALL Vs
OFF-SHORE OIL OPERATIONS
‣ When sensors on a drilling pipe in an oil rig indicate that it is about to get
stuck, how long, according to historical records, can I keep drilling?
‣ 400,000 sensors from tens of different producers
‣ 10,000 observations per second, many outside operational ranges
8. STILL THERE ARE USERS WHOSE DECISIONS NEED TO TAME ALL Vs
REQUIREMENT ANALYSIS
A system able to answer those queries must be able to
▸ handle massive datasets x
▸ process data streams on the fly x
▸ cope with heterogeneous datasets x
▸ cope with incomplete data x x
▸ cope with noisy data x
▸ provide reactive answers x
(each x marks one of the columns Volume, Velocity, Variety, Veracity)
9. STILL THERE ARE USERS WHOSE DECISIONS NEED TO TAME ALL Vs
(PARTIAL) SOLUTIONS: STREAM PROCESSING
▸ A paradigmatic change!
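[Diagram: input streams pass through a window into a Registered Continuous Query, which produces streams of answers. Labels: window, input streams, streams of answers, Registered Continuous Query, Dynamic System]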
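The shift from one-off queries to registered continuous queries can be illustrated with a minimal Python sketch. It assumes events are (timestamp, value) pairs arriving in time order and that the query is re-evaluated on every arrival over a time-based window. The names `continuous_query`, `mean` and `readings` are hypothetical and the sample data is made up.

```python
from collections import deque
from typing import Callable, Deque, Iterable, Iterator, Tuple

Event = Tuple[float, float]  # (timestamp in seconds, value); assumed event shape


def continuous_query(
    stream: Iterable[Event],
    width: float,
    query: Callable[[list], float],
) -> Iterator[Tuple[float, float]]:
    """Re-evaluate `query` over the last `width` seconds each time an event arrives."""
    window: Deque[Event] = deque()
    for ts, value in stream:
        window.append((ts, value))
        # Evict events that are no longer inside the window (ts - width, ts].
        while window and window[0][0] <= ts - width:
            window.popleft()
        # The query is registered once; answers are produced as a stream.
        yield ts, query([v for _, v in window])


def mean(values: list) -> float:
    return sum(values) / len(values)


# Made-up sensor readings: (seconds, value).
readings = [(0, 10.0), (20, 20.0), (50, 30.0), (70, 40.0), (130, 50.0)]
answers = list(continuous_query(readings, width=60, query=mean))
for ts, answer in answers:
    print(ts, answer)
```

The query is registered once and the data flows through it, which is the inverse of the database setting where the data sits still and queries come and go.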
10. STILL THERE ARE USERS WHOSE DECISIONS NEED TO TAME ALL Vs
STREAM PROCESSING VS. REQUIREMENTS
Requirement SP
massive datasets
data streams
heterogeneous dataset
incomplete data
noisy data
reactive answers
11. STILL THERE ARE USERS WHOSE DECISIONS NEED TO TAME ALL Vs
AI VS. REQUIREMENTS
Requirement SP AI
massive datasets
data streams
heterogeneous dataset
incomplete data
noisy data
reactive answers
12. Is it possible to make sense in real time
of multiple, heterogeneous, gigantic and
inevitably noisy and incomplete data streams
in order to support the decision processes of
extremely large numbers of concurrent
users?
E. Della Valle, S. Ceri, F. van Harmelen & H. Stuckenschmidt, 2010
STILL THERE ARE USERS WHOSE DECISIONS NEED TO TAME ALL Vs
STREAM REASONING RESEARCH QUESTION
13. DEDUCTIVE STREAM REASONING
STREAM PROCESSING
[Figure: coloured items on a timeline, with a 1 minute wide window]
Which are the top-4 most frequent colours in the last minute?
([colour], 13), ([colour], 12), ([colour], 8), ([colour], 8)
Is there a [colour] followed by a [colour] in the last minute?
yes, many
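The two window queries on this slide, a top-k count and a "followed by" pattern, can be sketched in plain Python. This is an illustration under assumptions, not the engine used in the talk: the events, the colour names and the helpers `in_window`, `top_k` and `followed_by` are all hypothetical.

```python
from collections import Counter

# Made-up stream of (timestamp in seconds, colour) pairs, in arrival order.
events = [
    (5, "red"), (12, "blue"), (20, "red"), (31, "green"),
    (44, "blue"), (58, "red"), (75, "yellow"),
]


def in_window(events, now, width=60):
    """Events whose timestamp lies in (now - width, now], in arrival order."""
    return [(ts, colour) for ts, colour in events if now - width < ts <= now]


def top_k(events, now, k, width=60):
    """The k most frequent colours in the window, with their counts."""
    counts = Counter(colour for _, colour in in_window(events, now, width))
    return counts.most_common(k)


def followed_by(events, now, first, second, width=60):
    """True if some `first` is later followed by a `second` inside the window."""
    seen_first = False
    for _, colour in in_window(events, now, width):
        if colour == first:
            seen_first = True
        elif colour == second and seen_first:
            return True
    return False


print(top_k(events, now=60, k=2))                      # [('red', 3), ('blue', 2)]
print(followed_by(events, 60, "red", "green"))         # True
print(followed_by(events, 60, "green", "yellow"))      # False: yellow arrives at 75
```

Both queries only see raw colour values; nothing in them knows that two colours might be related, which is the gap the ontology fills on the next slides.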
14. DEDUCTIVE STREAM REASONING
STREAM PROCESSING + SYMBOLIC AI
[Figure: coloured items on a timeline, with a 1 minute wide window, alongside an ontology of colours]
15. DEDUCTIVE STREAM REASONING
[Figure: coloured items on a timeline, with a 1 minute wide window, alongside an ontology of colours]
Which are the top-2 most frequent cool colours in the last minute?
([colour], 13), ([colour], 8), ([colour], 8)
Is there a primary cool colour followed by a secondary warm one?
yes, [colour] followed by [colour].
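To make the role of the ontology concrete, here is a small Python sketch in which a subclass hierarchy lets queries talk about "cool" or "primary" colours that never appear literally in the stream. The ontology below is a made-up toy (the talk does not list its classes), `window` stands for the contents of the last minute, and every name in the code is hypothetical.

```python
from collections import Counter

# Toy ontology: each term maps to its direct superclasses (an assumption).
SUBCLASS_OF = {
    "blue": {"PrimaryColour", "CoolColour"},
    "red": {"PrimaryColour", "WarmColour"},
    "yellow": {"PrimaryColour", "WarmColour"},
    "green": {"SecondaryColour", "CoolColour"},
    "violet": {"SecondaryColour", "CoolColour"},
    "orange": {"SecondaryColour", "WarmColour"},
    "PrimaryColour": {"Colour"},
    "SecondaryColour": {"Colour"},
    "CoolColour": {"Colour"},
    "WarmColour": {"Colour"},
}


def classes_of(term):
    """All classes entailed for `term` by following subclass links transitively."""
    found, todo = set(), [term]
    while todo:
        for parent in SUBCLASS_OF.get(todo.pop(), ()):
            if parent not in found:
                found.add(parent)
                todo.append(parent)
    return found


def is_a(colour, *classes):
    """True if `colour` belongs to every class in `classes`."""
    return set(classes) <= classes_of(colour)


def top_k_of_class(window, cls, k):
    """The k most frequent colours in the window that belong to `cls`."""
    return Counter(c for c in window if is_a(c, cls)).most_common(k)


def class_followed_by(window, first_classes, second_classes):
    """True if a colour of the first kind is later followed by one of the second."""
    seen_first = False
    for colour in window:
        if seen_first and is_a(colour, *second_classes):
            return True
        if is_a(colour, *first_classes):
            seen_first = True
    return False


# Colours observed in the last minute, in arrival order (made-up data).
window = ["blue", "red", "green", "blue", "orange", "green", "blue", "yellow"]
print(top_k_of_class(window, "CoolColour", 2))  # [('blue', 3), ('green', 2)]
print(class_followed_by(window, ("PrimaryColour", "CoolColour"),
                        ("SecondaryColour", "WarmColour")))  # True
```

The stream still carries only colour names; the deduction step (`classes_of`) is what allows the registered query to be phrased at the level of the ontology.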
16. INDUCTIVE STREAM REASONING
THE CONCEPT DRIFT PROBLEM
17. INDUCTIVE STREAM REASONING
AN EARLY ATTEMPT AT INDUCTIVE STREAM REASONING
D.F. Barbieri, D. Braga, S. Ceri, E. Della Valle, Y. Huang, V. Tresp,
A. Rettinger, H. Wermser: Deductive and Inductive Stream
Reasoning for Semantic Social Media Analytics.
IEEE Intelligent Systems 25(6): 32-41 (2010)
• How can we determine the optimal size of the window?
18. INDUCTIVE STREAM REASONING
ADAPTIVE SLIDING WINDOW (ADWIN)
Bifet, A. and Gavaldà, R., 2009, August. Adaptive learning from evolving data streams. In
International Symposium on Intelligent Data Analysis (pp. 249-260). Springer
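The idea behind an adaptive sliding window can be sketched as follows: keep a window of recent values, and whenever some split of it into an older and a newer part shows averages that differ by more than a statistical threshold, drop the oldest values. The code below is a deliberately naive illustration, not the bucket-based algorithm from the cited work. It assumes values in [0, 1]; the class name `NaiveAdwin` and the default `delta` are hypothetical, and the threshold is the Hoeffding-style bound usually quoted for ADWIN, so treat its exact constants as an assumption.

```python
import math
from collections import deque


class NaiveAdwin:
    """Simplified adaptive window over values in [0, 1]; rescans the window on every update."""

    def __init__(self, delta=0.002):
        self.delta = delta          # confidence parameter (assumed default)
        self.window = deque()

    def _cut_threshold(self, n0, n1):
        n = n0 + n1
        m = 1.0 / (1.0 / n0 + 1.0 / n1)   # combines the two sub-window sizes
        delta_prime = self.delta / n
        return math.sqrt(math.log(4.0 / delta_prime) / (2.0 * m))

    def update(self, value):
        """Add a value; return True if old data was dropped (change detected)."""
        self.window.append(value)
        shrunk = False
        changed = True
        while changed and len(self.window) > 1:
            changed = False
            values = list(self.window)
            total = sum(values)
            head_sum = 0.0
            # Try every split into an older part (n0 items) and a newer part (n1 items).
            for i in range(1, len(values)):
                head_sum += values[i - 1]
                n0, n1 = i, len(values) - i
                mean0 = head_sum / n0
                mean1 = (total - head_sum) / n1
                if abs(mean0 - mean1) > self._cut_threshold(n0, n1):
                    self.window.popleft()   # drop the oldest value, then re-check
                    shrunk = changed = True
                    break
        return shrunk


# Made-up stream whose mean jumps from 0 to 1 halfway through.
detector = NaiveAdwin()
flags = [detector.update(x) for x in [0.0] * 200 + [1.0] * 200]
print(any(flags[:200]), any(flags[200:]))
print(sum(detector.window) / len(detector.window))
```

While the data is stable the window keeps growing, which gives better estimates; after a change it shrinks on its own, so the window size no longer has to be fixed in advance.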
19. INDUCTIVE STREAM REASONING
CONCEPT DRIFT AND STREAMING MACHINE LEARNING
• Hoeffding
Adaptive Tree
• Adaptive Random
Forest
• Temporally
Augmented
Classifier
A. Bifet, R. Gavaldà, G. Holmes, B. Pfahringer: Machine Learning for Data Streams: with Practical
Examples in MOA. The MIT Press (March 2, 2018)
https://moa.cms.waikato.ac.nz/
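Hoeffding-tree learners take their name from the Hoeffding bound, which tells a streaming learner how many examples it needs before a decision made on a sample is, with high probability, the one it would make on the whole stream. The sketch below only computes that bound and the resulting split test. It is not MOA's API; the function names and the sample numbers are hypothetical.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """With probability 1 - delta, the mean of n observations of a variable
    spanning `value_range` is within this distance of its true mean."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def can_split(best_gain, second_gain, value_range, delta, n):
    """Split once the best attribute beats the runner-up by more than the bound."""
    return best_gain - second_gain > hoeffding_bound(value_range, delta, n)


# Made-up gains for the two best candidate attributes at a leaf.
print(can_split(0.30, 0.15, value_range=1.0, delta=1e-7, n=100))   # too few examples
print(can_split(0.30, 0.15, value_range=1.0, delta=1e-7, n=1000))  # enough examples
```

Because the bound shrinks as `n` grows, the learner can wait at a leaf until the evidence is sufficient and never has to store or revisit the examples it has already seen.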
20. STREAM REASONING
DEDUCTIVE + INDUCTIVE STREAM REASONING
[Figure: coloured items on a timeline, with a 1 minute wide window]
A better ontology of colours continuously learned from the data
Which are the most
frequent sentiments in
the last minute?
Is there an impulsive, irritating colour followed
by a happy one?
The better the ontology of colours we use,
the more expressive the queries we can register
21. STREAM REASONING
1000+ SCIENTIFIC PAPERS IN 12 YEARS
▸ It is possible to extend the Semantic Web stack in order
to represent heterogeneous data streams, continuous
queries, and continuous reasoning tasks
▸ It is possible to optimise continuous querying and
continuous reasoning so as to provide reactive answers
▸ Streaming Machine Learning is starting to show that it is
possible to continuously learn models
22. STREAM REASONING
STREAM REASONING VS. REQUIREMENTS
Requirement Stream Reasoning
massive datasets
data streams
heterogeneous dataset
incomplete data
noisy data
reactive answers
Legend: not specifically treated so far | treated but not resolved | universally addressed by all studies
23. STREAM REASONING
MASTER THESIS?
▸ Semantics of stream processing languages
▸ Scaling deductive Stream Reasoning
▸ Stream Reasoning on Kafka
▸ Stream Reasoning on Spark
▸ Advancing Inductive Stream Reasoning
▸ Streaming Graph Machine Learning
▸ Streaming Anomaly Detection
▸ Build the first working inductive and
deductive Stream Reasoners
▸ Applications
25. STREAM REASONING
THANK YOU!
ANY QUESTIONS?
Emanuele Della Valle
Politecnico di Milano
http://emanueledellavalle.org
@manudellavalle
Milano - 17.10.2019