Riding the stream
processing wave
Samarth Shetty
Sept 13 2019
Agenda
Overview
Hard Problems
Future Work
Q&A
Stream Processing
• Continuous Processing
• Unbounded datasets
• Low Latency Applications
Recommended reading:
Tyler Akidau: https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101
Example Application
Count number of "Page-Views" for each member in a 5 minute window
[Diagram: Messaging Queue (Page View) -> Stream Processing Job -> Messaging Queue (Page View/Member)]
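The page-view count above can be sketched in a few lines of Python. This is a minimal illustration rather than Samza code: the event shape `(timestamp_seconds, member_id)` and the function name `count_page_views` are assumptions, and it processes a finite list instead of an unbounded stream.

```python
from collections import defaultdict

WINDOW_SECONDS = 5 * 60  # the 5-minute window from the slide


def count_page_views(events):
    """Count page views per (window start, member).

    `events` is an iterable of (timestamp_seconds, member_id) pairs;
    this event shape is an assumption made for illustration.
    """
    counts = defaultdict(int)
    for timestamp, member_id in events:
        # Align each event to the start of its fixed 5-minute window.
        window_start = timestamp - (timestamp % WINDOW_SECONDS)
        counts[(window_start, member_id)] += 1
    return dict(counts)


events = [(0, "alice"), (120, "alice"), (290, "bob"), (310, "alice")]
result = count_page_views(events)
```

A real streaming job would emit each window's counts once the window closes instead of returning everything at the end.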
Apache Samza
• Stream Processing Platform
• Top Level Apache project (2014)
• Created by LinkedIn; in use at LinkedIn, Slack, Intuit, Redfin, etc.
Apache Samza
Scale @ LinkedIn
• ~4k jobs
• 20k+ containers
• ~2 Trillion messages processed per
day
Stream Processing at LinkedIn
Security
Bot Detection, Access
Monitoring
Notifications
Email and Push Notifications
Classification
Topic tagging, Image
classifications
Stream Processing at LinkedIn
Site Speed
Site Speed and Health
Monitoring
Index Updates
Updates to Search Index
Business Metrics
Pre-aggregated real-time
counts by dimensions
Hard Problems
Common challenges we
face in stream processing.
In today's session we will talk about:
• Scale (Stateful applications)
• Operability
• Scenarios spanning Offline and
Online environments
• Data access and Data movement
Real time targeting platform
1. Celia creates a LinkedIn post.
2. The targeting platform scores each edge based on features such as connection strength and content affinity.
3. It prunes low-quality edges based on scores (FPR).
4. Remaining edges trigger the notification platform, where they are scored again (SPR) and optimized (aggregation, capping, etc.).
[Diagram: Targeting Platform -> Notification Platform]
Real time targeting
platform
• Low Latency
• High QPS
• Large State
○ Features generated offline used
for scoring in nearline
Real time targeting
platform
• High QPS
○ Parallelism and Async Processing
• Large State
○ Optimized state lookup...
Samza
Local State
• Used for lookups, buffering data, and computed results
• Local state can be in-memory or on disk
• State is computed or ingested from a remote source
[Diagram: Input Stream(s) -> Samza (Local Store) -> Output Stream(s); a Change Capture or HDFS->Kafka push bootstraps local state]
Samza
Local State
• How does it compare to remote state?
  ○ 100x faster
  ○ 30x throughput gains
Shadi A. Noghabi et al. Samza: stateful scalable stream processing at LinkedIn. Proc. VLDB Endow. 10, 12 (August 2017), 1634-1645.
Samza
Local State
• How do we provide durability?
  ○ Backed up in a log-compacted topic
  ○ Incremental checkpointing
[Diagram: Local Store -> State Backup (log-compacted Kafka topic); Input Stream(s) -> Samza -> Output Stream(s); a Change Capture or HDFS->Kafka push bootstraps local state]
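The durability idea above (mirror every local write to a log-compacted topic, then replay it to rebuild state) can be sketched as follows. This is an illustrative toy, not Samza's implementation: `ChangelogBackedStore` and `compact` are hypothetical names, and a plain Python list stands in for the Kafka topic.

```python
class ChangelogBackedStore:
    """In-memory key-value store that mirrors every write to a changelog."""

    def __init__(self, changelog):
        # A list of (key, value) entries stands in for the Kafka topic.
        self.changelog = changelog
        self.data = {}

    def put(self, key, value):
        self.data[key] = value
        self.changelog.append((key, value))

    def restore(self):
        """Rebuild local state by replaying the changelog from the start."""
        self.data = {}
        for key, value in self.changelog:
            self.data[key] = value


def compact(changelog):
    """Keep only the latest value per key, as log compaction would."""
    latest = {}
    for key, value in changelog:
        latest[key] = value
    return list(latest.items())


changelog = []
store = ChangelogBackedStore(changelog)
store.put("member:1", 3)
store.put("member:2", 1)
store.put("member:1", 4)

# A replacement container restores from the compacted backup.
replacement = ChangelogBackedStore(compact(changelog))
replacement.restore()
```

Compaction bounds the backup by the number of distinct keys rather than the number of writes, so restore cost tracks the size of the state.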
Local State
● How do we handle application failures?
1. The Samza master stops receiving heartbeats from the failed container
2. The Samza master starts a new container
3. The new container restores local state from the state backup (log-compacted Kafka topic)
4. It catches up with the bootstrap (change capture) stream
5. It resumes reading the input stream(s) from the last checkpoint
Restoring Local State
Challenges
• For large state, recovery can take up to an hour
• Impacted by Kafka quotas, SSD bottlenecks, etc.
Local State
● 50%: per-container state < 0.5 GB
● 95%: per-container state < 36 GB
● Max container state is ~150 GB and growing
Restoring Local
State
• Can we reduce the frequency of
state restore?
• Can we reduce the time for state
restore?
• Can we have a bounded time for
state restore?
Host Affinity
Reducing downtime during recovery
• Durable containerID-to-host mapping
• Restart containers on the same host
• Re-use the on-disk state snapshot (host affinity)
• Catch up on only the delta from the Kafka change-log
• Limitations
  ○ Host affinity is not guaranteed
  ○ Host failures are a reality :)
  ○ Bugs and host contention may cause a full state restore
[Diagram: Container-1 (Task-1) and Container-2 (Task-2) send heartbeats to the Samza master]
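The benefit of re-using an on-disk snapshot can be illustrated with a small sketch: with a snapshot, only the changelog entries written after it need to be replayed. The function `restore_store`, the `(state, next_offset)` snapshot shape, and the list-based changelog are assumptions for illustration, not Samza APIs.

```python
def restore_store(changelog, snapshot=None):
    """Rebuild state from a changelog, optionally from a local snapshot.

    `changelog` is a list of (key, value) writes. `snapshot` is a
    (state_dict, next_offset) pair left on disk by a previous run.
    Returns (state, number_of_entries_replayed).
    """
    if snapshot is None:
        # No usable local state: full restore from offset 0.
        state, start = {}, 0
    else:
        saved_state, start = snapshot
        state = dict(saved_state)
    replayed = 0
    for key, value in changelog[start:]:
        state[key] = value
        replayed += 1
    return state, replayed


changelog = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
snapshot = ({"a": 3, "b": 2, "c": 4}, 4)  # state after the first 4 writes

full_state, full_replayed = restore_store(changelog)
warm_state, warm_replayed = restore_store(changelog, snapshot)
```

Both paths reach the same state, but the snapshot path replays one entry instead of five. When the container lands on a different host, the snapshot is unavailable and the full path applies.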
Standby Containers
Bounded time for state restore
• Jobs have active and standby containers
• A standby container keeps a copy of the application state
• Only active containers process messages
[Diagram: Input Stream -> Active Container -> Change Log -> Standby Container; both containers send heartbeats to the Samza master]
Standby Containers
Bounded time for state restore
• The active container's host fails
• Heartbeats to the host and container are lost
• The Samza master selects a standby for promotion
Standby Containers
Bounded time for state restore
• The Samza master promotes the standby to active
• The newly activated container processes from the checkpoint
• The Samza master creates a new standby
• Replication factor is configurable
• Bounded restore time: 5 mins
• ~20x faster for large state stores (200 GB+)
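The failover sequence above (lost heartbeat, standby selection, promotion) can be sketched as a small state machine. The class `SamzaMasterSketch`, its method names, and the timeout value are hypothetical and chosen for illustration; the real master also starts a replacement standby, which this sketch omits.

```python
class SamzaMasterSketch:
    """Toy master that tracks heartbeats and promotes standbys."""

    def __init__(self, heartbeat_timeout_s):
        self.heartbeat_timeout_s = heartbeat_timeout_s
        self.roles = {}           # container id -> "active" or "standby"
        self.last_heartbeat = {}  # container id -> time of last heartbeat

    def register(self, container_id, role, now):
        self.roles[container_id] = role
        self.last_heartbeat[container_id] = now

    def heartbeat(self, container_id, now):
        self.last_heartbeat[container_id] = now

    def _is_alive(self, container_id, now):
        age = now - self.last_heartbeat[container_id]
        return age <= self.heartbeat_timeout_s

    def check(self, now):
        """Fail over every active container whose heartbeat timed out."""
        promoted = []
        for container_id in list(self.roles):
            if self.roles[container_id] != "active":
                continue
            if self._is_alive(container_id, now):
                continue
            # The active container is considered lost: drop it.
            del self.roles[container_id]
            del self.last_heartbeat[container_id]
            # Select a standby that is still heartbeating and promote it.
            standby = next(
                (c for c, r in self.roles.items()
                 if r == "standby" and self._is_alive(c, now)),
                None,
            )
            if standby is not None:
                self.roles[standby] = "active"
                promoted.append((container_id, standby))
        return promoted


master = SamzaMasterSketch(heartbeat_timeout_s=10)
master.register("active-0", "active", now=0)
master.register("standby-0", "standby", now=0)
master.heartbeat("standby-0", now=12)  # only the standby keeps reporting
promoted = master.check(now=15)
```

Because the standby already holds a copy of the state, promotion time does not grow with state size, which is what makes the restore time bounded.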
Scenarios spanning
Offline and Online
environments
• ML Model Training, Feature
Engineering (Generation and
Access)
• Lambda Architecture
• Experimentation
Feature
Management
Frame: Virtual Feature Store
https://www.slideshare.net/DavidStein1/frame-feature-management-for-productive-machine-learning
• Goal: Simplify feature discovery
and access
• Applications get features by
“name” in a global namespace
• Abstraction layer for feature
access
• Unified across environments and
data sources
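The idea of fetching features by name through one abstraction layer can be sketched as below. This is not Frame's API: `FeatureRegistry`, the feature names, and the dict-backed sources are all hypothetical stand-ins for an offline dataset and an online key-value store.

```python
class FeatureRegistry:
    """Maps global feature names to whatever source serves them."""

    def __init__(self):
        self._sources = {}

    def register(self, name, fetch):
        # `fetch` is any callable taking an entity id and returning a value.
        self._sources[name] = fetch

    def get(self, name, entity_id):
        if name not in self._sources:
            raise KeyError(f"unknown feature: {name}")
        return self._sources[name](entity_id)


# Two hypothetical backing stores with different access patterns.
offline_table = {"member:1": 0.8}   # stands in for a dataset on HDFS
online_kv = {"member:1": 42}        # stands in for a key-value store

registry = FeatureRegistry()
registry.register("connection_strength", offline_table.get)
registry.register("recent_page_views", online_kv.get)

strength = registry.get("connection_strength", "member:1")
views = registry.get("recent_page_views", "member:1")
```

Callers only know the feature name, so the backing store can change per environment without touching application code.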
Frame
• Simplifying Feature Access (Datastore: HDFS)
• Simplifying Feature Access (Datastore: KV, REST, etc.)
• Nearline Applications
Simplifying Lambda
Unified Metrics
• Metrics pipeline built on Pig, Hive
• Need real time insights
• Solution:
○ Convert Pig and Hive to Samza
pipelines for nearline processing
○ Apache Pinot for serving results
Khai Tran: https://engineering.linkedin.com/blog/2019/01/bridging-offline-and-nearline-computations-with-apache-calcite
Simplifying Lambda
Unified Metrics
Lambda architecture with a single codebase
[Diagram labels: Metrics definition (code, config), Raptor, UMP offline platform, UMP neartime platform, Batch jobs, Samza jobs, HDFS, Pinot]
[Diagram: Pig to Calcite (convert) -> Calcite relational algebra as an IR (optimize) -> Calcite to Beam (generate) -> Beam physical plan, streaming config, Beam Java API code; user-code stages shown include metric union and dimension decoration]
In-progress
Explorations
Convergence API
• Apache Beam
○ Samza supports a Beam runner
○ Exploring Spark-Beam runner
• SQL
○ Samza SQL and Spark SQL
Future Work
• Auto Sizing of Jobs
• Multi-language support (e.g., Python)
• Frame for feature generation
• State store on Azure Managed
Disks
Thank you

Riding the Stream Processing Wave (Strange Loop 2019)

  • 1. Riding the stream processing wave Samarth Shetty Sept 13 2019
  • 2. Riding the stream processing wave Samarth Shetty Sept 13 2019
  • 4. Stream Processing • Continuous Processing • Unbounded datasets • Low Latency Applications Recommended reading: Tyler Akidau: https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101
  • 5. Example Application Count number of "Page-Views" for each member in a 5 minute window Page View Page View/Member Stream Processing JobMessaging Queue Messaging Queue
  • 6. Apache Samza • Stream Processing Platform • Top Level Apache project (2014) • Created by LinkedIn. In use at LinkedIn, Slack, Intuit, Redfin etc
  • 7. Apache Samza Scale @ LinkedIn • ~4k jobs • 20k+ containers • ~2 Trillion messages processed per day
  • 8. Stream Processing at LinkedIn Security Bot Detection, Access Monitoring Notifications Email and Push Notifications Classification Topic tagging, Image classifications
  • 9. Stream Processing at LinkedIn Site Speed Site Speed and Health Monitoring Index Updates Updates to Search Index Business Metrics Pre-aggregated real-time counts by dimensions
  • 11. Hard Problems Common challenges we face in stream processing. • Scale • Operability • Scenarios spanning Offline and Online environments • Data access and Data movement
  • 12. Hard Problems Common challenges we face in stream processing. Today’s session we will talk about • Scale (Stateful applications) • Operability • Scenarios spanning Offline and Online environments • Data access and Data movement
  • 13. Real time targeting platform 1. Celia creates a LinkedIn post 2. Targeting platform scores each edge based on features such as connection strength, content affinity 3. Prunes low-quality edges based on scores (FPR). 4.. Remaining edges trigger the notification platform, where they are scored again (SPR) and optimized (aggregation, capping, etc.) Targeting Platform 1 1 Notification Platform
  • 14. Real time targeting platform • Low Latency • High QPS • Large State ○ Features generated offline used for scoring in nearline
  • 15. Real time targeting platform • High QPS ○ Parallelism and Async Processing • Large State ○ Optimized state lookup...
  • 16. Samza Local State • Used for Lookups, Buffering data, computed results • Local state can be in-memory or on disk. • State computed or ingested from a remote source Change Capture or HDFS->Kafka Push Input Stream(s) Output Stream) Local Store Bootstrap local state
  • 17. Samza Local State • How does it compare to remote state? Change Capture or HDFS->Kafka Push Input Stream(s) Output Stream) Local Store Bootstrap local state
  • 18. Samza Local State • How does it compare to remote state? • 100 X • Faster Change Capture or HDFS->Kafka Push Input Stream(s) Output Stream) Local Store Bootstrap local state
  • 19. Samza Local State • How does it compare to remote state? • 100 X • Faster Change Capture or HDFS->Kafka Push Input Stream(s) Output Stream) Local Store Bootstrap local state 30 X Throughput Gains Shadi A. Noghabi et al. Samza: stateful scalable stream processing at LinkedIn. Proc. VLDB Endow. 10, 12 (August 2017), 1634-1645.
  • 20. Samza Local State • How do we provide durability? ○ Backed up in log compacted topic ○ Incremental checkpointing Change Capture or HDFS->Kafka Push Input Stream(s) Log Compacted Kafka topic Output Stream) State Backup Local Store Bootstrap local state
  • 21. Local State ● How do we handle application failures? Samza Change Capture Stream Input Stream(s) Log Compacted Kafka topic Output Stream) State Backup Local Store Bootstrap local state
  • 22. Local State ● How do we handle application failures? Samza Change Capture Stream Input Stream(s) Log Compacted Kafka topic Output Stream) State Backup Local Store Bootstrap local state
  • 23. Local State ● How do we handle application failures? Samza Change Capture Stream Input Stream(s) Log Compacted Kafka topic Output Stream) State Backup Local Store Bootstrap local state Samza Master Heartbeats X
  • 24. Local State ● How do we handle application failures? Samza Change Capture Stream Input Stream(s) Log Compacted Kafka topic Output Stream) State Backup Local Store Bootstrap local state Samza Master Heartbeats X Samza New Container
  • 25. Local State ● How do we handle application failures? Samza Change Capture Stream Input Stream(s) Log Compacted Kafka topic Output Stream) State Backup Local Store Bootstrap local state Samza Master Heartbeats X Samza New Container Restore Local state from State Backup
  • 26. Local State ● How do we handle application failures? Change Capture Stream Input Stream(s) Log Compacted Kafka topic Catch Up with Bootstrap stream Samza State Backup Read from last checkpoint Output Stream
  • 27. Restoring Local State Challenges • For large state, recovery can take up to an hour • Impacted by Kafka quotas, SSD bottlenecks etc
  • 28. Local State ● 50%: Per container state < 0.5GB
  • 29. Local State ● 50%: Per container state < 0.5GB ● 95%: Per container state < 36 GB
  • 30. Local State ● 50%: Per container state < 0.5GB ● 95%: Per container state < 36 GB ● Max container state is~150GB and growing
  • 31. Restoring Local State • Can we reduce the frequency of state restore? • Can we reduce the time for state restore? • Can we have a bounded time for state restore?
  • 34. Host Affinity Reducing downtime during recovery • Restart containers on same host • Re-use on-disk state snapshot (host affinity) • Catch-up on only delta from the Kafka change-log [Diagram labels: Task-1; Task-2; Container-1; Container-2; Heartbeat; Samza master; Durable containerID-to-host mapping; Downtime]
  • 35. Host Affinity Reducing downtime during recovery • Limitations ○ Host affinity is not guaranteed ○ Host failures are a reality :) ○ Bugs and host contention may cause a full state restore [Diagram labels: Task-1; Task-2; Container-1; Container-2; Heartbeat; Samza master; Durable containerID-to-host mapping]
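The host-affinity idea and its fallback can be sketched as follows. This is a simplified illustration, not Samza's scheduler: `choose_host`, `catch_up`, the host names, and the `"delta"` / `"full"` labels are all hypothetical. It assumes the master keeps a durable containerID-to-host mapping and that a container restarted on its previous host finds a usable on-disk snapshot there.

```python
from typing import Dict, List, Set, Tuple


def choose_host(container_id: str, last_host: Dict[str, str],
                free_hosts: Set[str]) -> Tuple[str, str]:
    """Prefer the container's previous host; otherwise fall back to any free host.

    Returns (host, restore_mode): "delta" when the on-disk snapshot can be
    reused, "full" when state must be rebuilt from the whole changelog.
    """
    if not free_hosts:
        raise RuntimeError("no host available")
    preferred = last_host.get(container_id)
    if preferred in free_hosts:
        return preferred, "delta"
    return min(free_hosts), "full"   # affinity is best effort, not guaranteed


def catch_up(snapshot: Dict[str, int], snapshot_offset: int,
             changelog: List[Tuple[str, int]]) -> Dict[str, int]:
    """Apply only the changelog entries written after the snapshot was taken."""
    store = dict(snapshot)
    for key, value in changelog[snapshot_offset:]:
        store[key] = value
    return store


mapping = {"container-1": "host-a"}
same_host = choose_host("container-1", mapping, {"host-a", "host-b"})
other_host = choose_host("container-1", mapping, {"host-b"})
```

In the "delta" case only the tail of the changelog is replayed, so recovery cost depends on how much was written since the snapshot rather than on total state size.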
  • 36. Standby Containers Bounded time for state restore • Jobs have active and standby containers • Standby container keeps a copy of application state • Only active containers process messages [Diagram labels: Active Container; Standby Container; Input Stream; Heartbeat; Samza master; Change Log]
  • 37. Standby Containers Bounded time for state restore • Active container’s host fails [Diagram labels: Active Container; Standby Container; Input Stream; Heartbeat; Samza master; Change Log; X (failure)]
  • 38. Standby Containers Bounded time for state restore • Active container’s host fails • Heartbeat to host and container lost [Diagram labels: Active Container; Standby Container; Input Stream; Heartbeat; Samza master; Change Log; X (failure)]
  • 39. Standby Containers Bounded time for state restore • Active container’s host fails • Heartbeat to host and container lost • Samza master selects a standby for promotion [Diagram labels: Active Container; Standby Container; Input Stream; Heartbeat; Samza master; Change Log; X (failure)]
  • 40. Standby Containers Bounded time for state restore • Samza master promotes standby to active [Diagram labels: Active Container; Samza master]
  • 41. Standby Containers Bounded time for state restore • Samza master promotes standby to active • Newly activated container processes from checkpoint [Diagram labels: Active Container; Samza master; Change Log; Input Stream]
  • 42. Standby Containers Bounded time for state restore • Samza master promotes standby to active • Newly activated container processes from checkpoint • Samza master creates a new standby [Diagram labels: Active Container; Input Stream; Samza master; Change Log; Standby Container]
  • 43. Standby Containers Bounded time for state restore • Samza master promotes standby to active • Newly activated container processes from checkpoint • Samza master creates a new standby • Replication factor is configurable [Diagram labels: Active Container; Input Stream; Samza master; Change Log; Standby Container]
  • 44. Standby Containers Bounded time for state restore • Samza master promotes standby to active • Newly activated container processes from checkpoint • Samza master creates a new standby • Replication factor is configurable • Bounded Restore Time: 5 mins • ~20x faster for large state stores (200GB+) [Diagram labels: Active Container; Input Stream; Samza master; Change Log; Standby Container]
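The failover sequence on slides 37 to 42 (missed heartbeat, promotion of a standby, creation of a replacement standby) might look something like this in miniature. The `Master` class, its field names, the timeout value, and the generated standby names are hypothetical and only model the control flow; they are not Samza's implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Master:
    """Toy model of a master tracking one active container and its standbys."""
    timeout_s: float
    active: str
    standbys: List[str]
    last_seen: Dict[str, float] = field(default_factory=dict)
    created: int = 0

    def heartbeat(self, container: str, now: float) -> None:
        self.last_seen[container] = now

    def check(self, now: float) -> Optional[str]:
        """Promote a standby if the active container's heartbeat has expired.

        Returns the name of the failed container, or None if all is well.
        """
        seen = self.last_seen.get(self.active)
        if seen is not None and now - seen <= self.timeout_s:
            return None
        if not self.standbys:
            raise RuntimeError("no standby available; a full restore is needed")
        failed = self.active
        self.active = self.standbys.pop(0)   # standby already holds the state
        self.last_seen[self.active] = now
        self.created += 1
        self.standbys.append(f"standby-new-{self.created}")  # restore redundancy
        return failed


master = Master(timeout_s=10.0, active="active-1", standbys=["standby-1"])
master.heartbeat("active-1", now=0.0)
healthy = master.check(now=5.0)    # heartbeat is recent: nothing happens
failed = master.check(now=20.0)    # heartbeat expired: standby is promoted
```

The promoted container skips the bulk restore because it has been following the change log all along, which is what makes the restore time bounded.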
  • 45. Hard Problems Common challenges we face in stream processing. In today’s session we will talk about • Scale (Stateful applications) • Operability • Scenarios spanning Offline and Online environments • Data access and Data movement
  • 46. Scenarios spanning Offline and Online environments • ML Model Training, Feature Engineering (Generation and Access) • Lambda Architecture • Experimentation
  • 47. Feature Management Frame: Virtual Feature Store https://www.slideshare.net/DavidStein1/frame-feature-management-for-productive-machine-learning • Goal: Simplify feature discovery and access • Applications get features by “name” in a global namespace • Abstraction layer for feature access • Unified across environments and data sources
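As a rough illustration of "features by name" behind an abstraction layer, consider the sketch below. It is not Frame's API: `FeatureRegistry`, its methods, the environment labels, and the feature name `member.connection_count` are all invented for this example. The point is only that the caller asks for a name and the registry decides which backing source to consult in a given environment.

```python
from typing import Callable, Dict, Tuple

# A loader maps an entity ID to a feature value in one environment.
Loader = Callable[[str], float]


class FeatureRegistry:
    """Hypothetical name-based lookup layer over per-environment data sources."""

    def __init__(self) -> None:
        self._loaders: Dict[Tuple[str, str], Loader] = {}

    def register(self, name: str, environment: str, loader: Loader) -> None:
        self._loaders[(name, environment)] = loader

    def get(self, name: str, environment: str, entity_id: str) -> float:
        try:
            loader = self._loaders[(name, environment)]
        except KeyError:
            raise LookupError(f"feature {name!r} not available in {environment!r}") from None
        return loader(entity_id)


# In-memory dicts stand in for an offline dataset and an online key-value store.
offline_table = {"member-1": 120.0}
online_store = {"member-1": 123.0}

registry = FeatureRegistry()
registry.register("member.connection_count", "offline", offline_table.__getitem__)
registry.register("member.connection_count", "online", online_store.__getitem__)

offline_value = registry.get("member.connection_count", "offline", "member-1")
online_value = registry.get("member.connection_count", "online", "member-1")
```

Application code stays the same across environments; only the registration of loaders differs.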
  • 48. Frame Simplifying Feature Access (Datastore: HDFS) https://www.slideshare.net/DavidStein1/frame-feature-management-for-productive-machine-learning
  • 49. Frame Simplifying Feature Access (Datastore: KV, REST, etc.) https://www.slideshare.net/DavidStein1/frame-feature-management-for-productive-machine-learning
  • 51. Simplifying Lambda Unified Metrics • Metrics pipeline built on Pig, Hive • Need real time insights • Solution: ○ Convert Pig and Hive to Samza pipelines for nearline processing ○ Apache Pinot for serving results Khai Tran: https://engineering.linkedin.com/blog/2019/01/bridging-offline-and-nearline-computations-with-apache-calcite
  • 52. Simplifying Lambda Unified Metrics Lambda architecture with a single codebase [Diagram labels: Samza jobs; Batch jobs; UMP neartime platform; UMP offline platform; Raptor code; config; Metrics definition; HDFS; Pinot] Khai Tran: https://engineering.linkedin.com/blog/2019/01/bridging-offline-and-nearline-computations-with-apache-calcite
  • 53. Simplifying Lambda Unified Metrics [Diagram labels: Metric union; User code; Dimension decoration; Calcite relational algebra as an IR; convert; optimize; generate; Beam physical plan; Pig to Calcite; Calcite to Beam; Streaming config; Beam Java API code] Khai Tran: https://engineering.linkedin.com/blog/2019/01/bridging-offline-and-nearline-computations-with-apache-calcite
  • 54. In-progress Explorations Convergence API • Apache Beam ○ Samza supports a Beam runner ○ Exploring Spark-Beam runner • SQL ○ Samza SQL and Spark SQL
  • 56. Future Work • Auto Sizing of Jobs • Multi-Language Support (e.g., Python) • Frame for feature generation • State store on Azure Managed Disks