Peter R. Pietzuch
prp@doc.ic.ac.uk
Integrating Scale Out and Fault Tolerance in Stream Processing using Operator State Management
with Raul Castro Fernandez*, Matteo Migliavacca+ and Peter Pietzuch*
*Imperial College London, +Kent University
Big data …
… in numbers:
– 2.5 billion gigabytes of data every day (source: IBM)
– LSST telescope, Chile 2016, 30 TB nightly
… come from everywhere:
– web feeds, social networking
– mobile devices, sensors, cameras
– scientific instruments
– online transactions (public and private sectors)
… have value:
– Global Pulse forum for detecting human crises internationally
– real-time big data analytics in the UK: £25 billion → £216 billion in 2012-17
– recommendation applications (LinkedIn, Amazon)
→ processing infrastructure for big data analysis
A black-box approach for big data analysis
• users issue analysis queries with real-time semantics
• streams of data updates, time-varying rates, generated in real-time
• streams of result data
→ processing in near real-time
[Figure: Stream Processing System, with input and result streams shown along a time axis]
Distributed Stream Processing System
• queries consist of operators (join, map, select, ..., UDOs)
• operators form graphs
• operators process streams of tuples on-the-fly
• operators span nodes
Elastic DSPSs in the Cloud
Real-time big data analysis challenges traditional DSPSs:
? what about continuous workload surges?
? what about real-time resource allocation to workload variations?
? keeping the state correct for stateful operators?
Massively scalable, cloud-based DSPSs [SIGMOD 2013]
1. gracefully handles stateful operators’ state
2. operator state management for combined scale out and fault tolerance
3. SEEP system and evaluation
4. related work
5. future research directions
Stream Processing in the Cloud
• clouds provide infinite pools of resources
? How do we build a stream processing platform in the Cloud?
• Failure resilience:
– active fault-tolerance needs 2x resources
– passive fault-tolerance leads to long recovery times
→ hybrid fault-tolerance: low resource overhead with fast recovery
• Intra-query parallelism:
– provisioning for workload peaks is unnecessarily conservative
→ dynamic scale out: increase resources when peaks appear
→ Both mechanisms must support stateful operators
Stateless vs Stateful Operators
stateless (e.g. filter > 5):
✓ failure recovery
✓ scale out
[Figure: a filter "> 5" turns the input 7 1 5 9 9 into 7 9 9; replacement or parallel filters behave the same way]
stateful (e.g. word counter):
× failure recovery
× scale out
[Figure: a counter holding (the, 10), (with, 5) receives "the with the"; a counter without that state emits (the, 2) != 12 and (with, 1) != 6]
operator state: a summary of past tuples’ processing
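
The contrast on this slide can be reproduced in a few lines. The following is only an illustrative sketch, not SEEP code: `filter_gt5` and `WordCounter` are hypothetical names, and the counts (the, 10) and (with, 5) are taken from the slide's example.

```python
from collections import Counter

def filter_gt5(tuples):
    """Stateless operator: each tuple is judged on its own."""
    return [t for t in tuples if t > 5]

class WordCounter:
    """Stateful operator: the counts summarise all past tuples."""
    def __init__(self):
        self.counts = Counter()

    def process(self, word):
        self.counts[word] += 1
        return (word, self.counts[word])

# A replacement filter gives the same answer as the original would have.
filtered = filter_gt5([7, 1, 5, 9, 9])

# The original counter has already seen "the" 10 times and "with" 5 times.
original = WordCounter()
original.counts.update({"the": 10, "with": 5})

# A naive replacement starts empty, so its results diverge.
replacement = WordCounter()
stream = ["the", "with", "the"]
expected = [original.process(w) for w in stream]
actual = [replacement.process(w) for w in stream]
```

Here `expected` ends with ("the", 12) while `actual` ends with ("the", 2), which is the mismatch the slide marks with "!=".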
State Management
processing state: summary of past tuples’ processing
routing state: routing of tuples
buffer state: tuples
→ operator state is an external entity managed by the DSPS
→ primitives for state management
→ mechanisms (scale out, failure recovery) on top of primitives
→ dynamic reconfiguration of the dataflow graph
[Figure: dataflow graph with operators A, B and C]
State Management Primitives
• checkpoint: takes snapshot of state and makes it externally available
• backup / restore: moves copy of state from one operator to another
• partition: splits state in a semantically correct fashion for parallel processing
[Figure: state moving between operators A and B]
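
As a rough sketch of how the four primitives might look when processing state is a plain dictionary (an assumption made here for illustration; the class and function signatures are hypothetical and not the SEEP API):

```python
import copy

class Operator:
    """Hypothetical stateful operator whose processing state is a dict."""
    def __init__(self, name, state=None):
        self.name = name
        self.state = dict(state or {})
        self.backups = {}  # checkpoints held on behalf of other operators

def checkpoint(op):
    """Take a snapshot of the state that is independent of later updates."""
    return copy.deepcopy(op.state)

def backup(snapshot, owner, backup_node):
    """Move a copy of the snapshot to the node acting as backup."""
    backup_node.backups[owner] = copy.deepcopy(snapshot)

def restore(backup_node, owner, new_op):
    """Install the backed-up snapshot on a fresh operator."""
    new_op.state = copy.deepcopy(backup_node.backups[owner])

def partition(snapshot, belongs_to_first):
    """Split a snapshot by key so every key lands in exactly one part."""
    first = {k: v for k, v in snapshot.items() if belongs_to_first(k)}
    second = {k: v for k, v in snapshot.items() if not belongs_to_first(k)}
    return first, second

upstream = Operator("splitter")
counter = Operator("A", {"the": 10, "with": 5})
backup(checkpoint(counter), "A", upstream)
counter.state["the"] += 1  # later updates do not touch the backup
fresh = Operator("A-new")
restore(upstream, "A", fresh)
parts = partition(fresh.state, lambda key: key < "u")
```

Splitting by key keeps each word's count in exactly one partition, which is one simple reading of "semantically correct" for a counter.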
State Management Scale Out, Stateful Ops
Primitives involved: checkpoint, backup, partition, restore
• periodically, stateful operators checkpoint and back up state to a designated upstream backup node, in memory
• the backup node already has the state of the operator to be parallelised
• upstream ops send unprocessed tuples to update the checkpointed state
[Figure: operator A is parallelised into A and A’; operator B is also shown]
→ How do we partition stateful operators?
Partitioning Stateful Operators
• 1. Processing state modeled as a (key, value) dictionary
• 2. State partitioned according to key k of tuples
• 3. Tuples are routed to the correct operator according to k
[Figure: a splitter receives t=1 "computer", t=2 "laboratory", t=3 "cambridge". Its routing state maps keys (a → k) to counter A and (l → z) to counter A’. A receives t=1, key=c, "computer" and t=3, key=c, "cambridge", giving processing state (c, computer:1, cambridge:1) at t=3. A’ receives t=2, key=l, "laboratory", giving processing state (l, laboratory:1) at t=2. Labels: routing state, buffer state, processing state.]
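
The splitter example above can be sketched as follows. This is an illustration under stated assumptions: the key is taken to be the first letter of the word (as the figure's key=c and key=l suggest), and `KeyedCounter`, `key_of` and `route` are hypothetical names.

```python
from collections import defaultdict

class KeyedCounter:
    """Counter whose processing state is a dictionary keyed by k."""
    def __init__(self):
        self.state = defaultdict(dict)

    def process(self, key, word):
        words = self.state[key]
        words[word] = words.get(word, 0) + 1

def key_of(word):
    """Assumed key function: the first letter of the word."""
    return word[0]

def route(routing_state, key):
    """Return the operator whose inclusive key range contains the key."""
    for (low, high), operator in routing_state:
        if low <= key <= high:
            return operator
    raise KeyError(key)

a, a_prime = KeyedCounter(), KeyedCounter()
# Routing state from the figure: keys a..k go to A, keys l..z go to A'.
routing_state = [(("a", "k"), a), (("l", "z"), a_prime)]

for word in ["computer", "laboratory", "cambridge"]:
    k = key_of(word)
    route(routing_state, k).process(k, word)
```

Because routing and state partitioning use the same key ranges, every tuple reaches the operator that holds the state for its key.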
Passive Fault-Tolerance Model
• recreate operator state by replaying tuples after failure:
– upstream backup: sends acks upstream for tuples processed downstream
• may result in long recovery times due to large buffers:
– system is reprocessing streams after failure → inefficient
[Figure: operators A, B, C, D; data flows downstream, ACKs flow upstream]
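
A minimal sketch of upstream backup, assuming tuples carry increasing timestamps and an ACK for timestamp ts covers everything up to ts. `UpstreamBuffer` and its methods are hypothetical names, not taken from the slides.

```python
from collections import deque

class UpstreamBuffer:
    """Output buffer of an upstream operator (hypothetical names)."""
    def __init__(self):
        self.pending = deque()  # (timestamp, tuple) pairs not yet acknowledged

    def send(self, ts, item):
        self.pending.append((ts, item))
        return ts, item

    def ack(self, ts):
        """Downstream processed everything up to ts, so drop those tuples."""
        while self.pending and self.pending[0][0] <= ts:
            self.pending.popleft()

    def replay(self):
        """After a downstream failure, resend every unacknowledged tuple."""
        return list(self.pending)

buf = UpstreamBuffer()
for ts, word in enumerate(["the", "with", "the", "a"], start=1):
    buf.send(ts, word)
buf.ack(2)
to_replay = buf.replay()
```

Without checkpoints, a stateful downstream operator can only acknowledge tuples once they no longer affect its state, so the buffer, and with it the replay after a failure, can grow large.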
Recovering using State Management (R+SM)
• Benefit from state management primitives:
– use periodically backed up state on upstream node to recover faster
– trim buffers at backup node
– same primitives as in scale out
• state is restored and unprocessed tuples are replayed from buffer
→ same primitives for parallel recovery
[Figure: a failed operator A is restored from the backup node, either as a single A or in parallel as A and A’]
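
The recovery step can be sketched for a word counter as restoring the last checkpoint and replaying only the tuples that arrived after it. The function name `recover` and the use of timestamps to mark the checkpoint are assumptions made for this illustration.

```python
from collections import Counter

def recover(checkpoint_state, checkpoint_ts, buffered):
    """Rebuild a word counter from a checkpoint plus the tuples after it."""
    state = Counter(checkpoint_state)
    for ts, word in buffered:
        if ts > checkpoint_ts:  # older tuples are already in the checkpoint
            state[word] += 1
    return state

stream = [(1, "the"), (2, "with"), (3, "the"), (4, "the")]

# State the failed operator had reached before it crashed.
lost = Counter(word for _, word in stream)

# Checkpoint taken after timestamp 2; the buffer was trimmed to match.
checkpoint_state, checkpoint_ts = {"the": 1, "with": 1}, 2
buffered = [(ts, w) for ts, w in stream if ts > checkpoint_ts]

recovered = recover(checkpoint_state, checkpoint_ts, buffered)
```

Only two tuples are replayed instead of four, and `recovered` equals the lost state; more frequent checkpoints shorten the replay further.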
State Management in Action: SEEP
(1) dynamic scale out: detect bottleneck, add new parallelised operator
(2) failure recovery: detect failure, replace with new operator
[Figure: SEEP architecture. Components: query manager (queries), deployment manager, bottleneck detector (EC2 stats), scaling policy, scale out coordinator, fault detector (faults), recovery coordinator, VM pool]
Dynamic Scale Out: Detecting bottlenecks
[Figure: operators send CPU utilisation reports (35%, 85%, 30%) to the bottleneck detector, which keeps a logical infrastructure view of these values]
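
One simple way to picture the detector is a threshold check over the latest CPU reports. The sketch below assumes a 70% threshold (the scale out policy quoted in the evaluation) and hypothetical operator names A, B and C; the actual detection logic in SEEP may differ.

```python
def find_bottlenecks(cpu_reports, threshold=0.70):
    """Return operators whose reported CPU utilisation exceeds the threshold."""
    return [op for op, load in cpu_reports.items() if load > threshold]

# Utilisation values from the slide, attached to assumed operator names.
reports = {"A": 0.35, "B": 0.85, "C": 0.30}
bottlenecks = find_bottlenecks(reports)
```

With these reports only the operator at 85% is flagged and would be handed to the scale out coordinator.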
The VM Pool: Adding operators
• problem: allocating new VMs takes minutes...
[Figure: the bottleneck detector and the fault detector draw VMs from a virtual machine pool (VM1, VM2) based on monitoring information; the pool provisions VMs from the cloud provider (order of minutes) and adds each new VM to the pool (VM2, VM3); the pool size is dynamic]
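
The idea of hiding provisioning latency behind a pool can be sketched as below. `VMPool`, `acquire` and `refill` are hypothetical names, and the `provision` callable stands in for a slow request to the cloud provider.

```python
from collections import deque

class VMPool:
    """Pool of pre-provisioned VMs (hypothetical sketch)."""
    def __init__(self, provision, target_size):
        self.provision = provision  # slow call to the cloud provider
        self.target_size = target_size
        self.idle = deque(provision() for _ in range(target_size))

    def acquire(self):
        """Hand out an idle VM immediately; fall back to slow provisioning."""
        return self.idle.popleft() if self.idle else self.provision()

    def refill(self):
        """Top the pool back up; a real system would do this in the background."""
        while len(self.idle) < self.target_size:
            self.idle.append(self.provision())

ids = iter(range(1, 100))
pool = VMPool(lambda: f"VM{next(ids)}", target_size=2)
first = pool.acquire()
pool.refill()
```

After VM1 is handed out and the pool is refilled, the pool holds VM2 and VM3, matching the figure. Making `target_size` adjustable at run time would correspond to the dynamic pool size.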
Experimental Evaluation
• Goals:
– investigate effectiveness of scale out mechanism
– recovery time after failure using R+SM
– overhead of state management
• Scalable and Elastic Event Processing (SEEP):
– implemented in Java; Storm-like data flow model
• Sample queries + workload
– Linear Road Benchmark (LRB) to evaluate scale out [VLDB’04]
  • provides an increasing stream workload over time
  • query with 8 operators, 3 are stateful; SLA: results < 5 secs
– Windowed word count query (2 ops) to evaluate fault tolerance
  • induce failure to observe performance impact
• Deployment on Amazon AWS EC2
– sources and sinks on high-memory double extra large instances
– operators on small instances
Scale Out: LRB Workload
• scales to load factor L=350 with 50 VMs on Amazon EC2 (automated query parallelisation, scale out policy at 70%)
• L=512 highest result [VLDB’12] (hand-crafted query on cluster)
• scale out leads to latency peaks, but latency remains within the LRB SLA
→ SEEP scales out to increasing workload in the Linear Road Benchmark
Conclusions
• Stream processing will grow in importance:
– handling the data deluge
– enables real-time response and decision making
• Integrated approach for scale out and failure recovery:
– operator state as an independent entity
– primitives and mechanisms
• Efficient approach extensible for additional operators:
– effectively applied to Amazon EC2 running LRB
– parallel recovery
