CONTACT ME @edvorkin
Real-time medical news from a curated Twitter feed
On average, around 6,000 tweets are posted on Twitter every second, which corresponds to over 350,000 tweets per minute and about 500 million tweets per day.
1% of 350,000 = 3,500 tweets per minute
•How to scale 
•How to deal with failures 
•What to do with failed messages 
•A lot of infrastructure concerns 
•Complexity 
•Tedious coding 
*Image credit: nathanmarz, SlideShare: Storm
Inherently BATCH-Oriented System
•Exponential rise in real-time data 
•New business opportunity 
•Economics of OSS and commodity hardware 
Stream processing has emerged as a key use case* 
*Source: Discover HDP2.1: Apache Storm for Stream Data Processing. Hortonworks. 2014
•Detecting fraud while someone is swiping a credit card 
•Placing an ad on a website while someone is reading a specific article 
•Alerts on application and machine failures 
•Using stream processing in a batch-oriented fashion
Created by Nathan Marz 
Acquired by Twitter 
Open sourced 
Apache Incubator Project 
Part of Hortonworks HDP2 platform 
Top Level Apache Project
Most mature, widely adopted framework 
Source: http://storm.incubator.apache.org/
Processes an endless stream of data. 
1M+ messages/sec on a 10-15 node cluster 
Guaranteed message processing 
Tuples, Streams, Spouts, Bolts and Topologies 
TUPLE 
Storm data type: an immutable list of key/value pairs, where a value can be of any data type 
word: “Hello” 
Count: 25 
Frequency: 0.25
Unbounded Sequence of Tuples between nodes 
STREAM
SPOUT 
The Source of the Stream
Reads from a source of data: queues, web logs, API calls, databases 
Spout responsibilities
BOLT 
•Process tuples and perform actions: calculations, API calls, DB calls 
•Produce new output stream based on computations 
Bolt: F(x)
•A topology is a network of spouts and bolts 
•Defines data flow 
•May have multiple spouts 
•Each spout and bolt may have many instances that perform all the processing in parallel 
How tuples are sent between instances of spouts and bolts 
Shuffle grouping: random distribution. 
Fields grouping: routes tuples to a bolt task based on the value of a field. The same values always route to the same task. 
All grouping: replicates the tuple stream across all the bolt tasks. Each task receives a copy of each tuple. 
Global grouping: routes all tuples in the stream to a single task. Should be used with caution. 
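The routing rules above can be sketched in plain Java. This is only an illustration of how each grouping could choose target tasks, not Storm's implementation: the class name GroupingSketch, its method names, and the hash-modulo rule are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

/** Plain-Java illustration of stream groupings; hypothetical names, no Storm dependency. */
public class GroupingSketch {

    /** Random distribution: any task may receive the tuple. */
    static int shuffle(int numTasks) {
        return ThreadLocalRandom.current().nextInt(numTasks);
    }

    /** Routing by field value: equal values always map to the same task. */
    static int fields(Object fieldValue, int numTasks) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(fieldValue.hashCode(), numTasks);
    }

    /** Replication: every task receives a copy of the tuple. */
    static List<Integer> all(int numTasks) {
        List<Integer> targets = new ArrayList<>();
        for (int task = 0; task < numTasks; task++) {
            targets.add(task);
        }
        return targets;
    }

    /** Single task: the whole stream goes to one task (task 0 in this sketch). */
    static int global() {
        return 0;
    }

    public static void main(String[] args) {
        int numTasks = 4;
        for (String word : new String[] {"two", "households", "two"}) {
            System.out.println(word + " -> task " + fields(word, numTasks));
        }
        System.out.println("copy to tasks " + all(numTasks));
    }
}
```

Routing by field value is what makes a word count correct when the counting bolt runs in parallel: every occurrence of a word reaches the task that holds its counter.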
compile 'org.apache.storm:storm-core:0.9.2'
<dependency>
    <groupId>org.apache.storm</groupId>
    <artifactId>storm-core</artifactId>
    <version>0.9.2</version>
</dependency>
[Figure: word count data flow]
Sentence: "Two households, both alike in dignity"
Split into words: Two, Households, Both, alike, in, dignity
Each word is emitted with a count of 1: Two 1, Households 1, Both 1, Alike 1, In 1, Dignity 1
Final count: Two 20, Households 24, Both 22, Alike 1, In 1, Dignity 10
Data Source
SplitSentenceBolt 
Resource initialization
WordCountBolt
PrinterBolt
Linking it all together
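The code from these slides is not part of this transcript. As a stand-in, here is a minimal plain-Java sketch of the same data flow (sentence, split, count, print). It does not use the Storm API: the class WordCountSketch and its methods are hypothetical, and each method only mirrors the role of the component named in its comment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java stand-in for the word count data flow; hypothetical names, no Storm dependency. */
public class WordCountSketch {

    /** Role of SplitSentenceBolt: one sentence in, one lower-cased word out per token. */
    static List<String> splitSentence(String sentence) {
        List<String> words = new ArrayList<>();
        for (String token : sentence.toLowerCase().split("[^a-z]+")) {
            if (!token.isEmpty()) {
                words.add(token);
            }
        }
        return words;
    }

    /** Role of WordCountBolt: keep a running count per word. */
    static void count(Map<String, Integer> counts, String word) {
        counts.merge(word, 1, Integer::sum);
    }

    /** Role of the topology wiring: data source -> split -> count. */
    static Map<String, Integer> run(List<String> sentences) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String sentence : sentences) {
            for (String word : splitSentence(sentence)) {
                count(counts, word);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = run(Arrays.asList(
                "Two households, both alike in dignity",
                "Two households"));
        // Role of PrinterBolt: print each word with its final count.
        counts.forEach((word, n) -> System.out.println(word + " " + n));
    }
}
```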
How to scale stream processing 
Storm main components 
Nodes: machines in a Storm cluster. 
Workers: JVM processes running on a node. One or more per node. 
Executors: Java threads running within a worker JVM process. 
Tasks: instances of spouts and bolts.
Tuple tree 
Reliable vs unreliable topologies
Methods from ISpout interface
Reliability in Bolts 
Anchoring 
Ack 
Fail
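A simplified sketch of how a tuple tree can be tracked: every tuple id is XORed into a running value once when the tuple is created and once when it is acked, so the value returns to zero only when the whole tree has been processed. Storm's acker is documented to use this XOR idea, but the class AckerSketch, its method names, and the small ids below are assumptions for illustration (Storm uses random 64-bit ids).

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified tuple-tree tracking with XOR; hypothetical names, no Storm dependency. */
public class AckerSketch {

    /** Running XOR of tuple ids, keyed by the spout's message id. */
    private final Map<String, Long> pending = new HashMap<>();

    /** The spout emits a root tuple and provides a message id for it. */
    void emitRoot(String messageId, long rootTupleId) {
        pending.put(messageId, rootTupleId);
    }

    /** Anchoring: a bolt emits a child tuple that belongs to the same tree. */
    void anchor(String messageId, long childTupleId) {
        pending.computeIfPresent(messageId, (id, value) -> value ^ childTupleId);
    }

    /** Ack: returns true once every tuple in the tree has been acked. */
    boolean ack(String messageId, long tupleId) {
        Long value = pending.get(messageId);
        if (value == null) {
            return false; // unknown, already completed, or already failed
        }
        long updated = value ^ tupleId;
        if (updated == 0L) {
            pending.remove(messageId);
            return true;
        }
        pending.put(messageId, updated);
        return false;
    }

    /** Fail (or timeout): forget the tree so the spout can replay the message. */
    boolean fail(String messageId) {
        return pending.remove(messageId) != null;
    }

    public static void main(String[] args) {
        AckerSketch acker = new AckerSketch();
        acker.emitRoot("msg-1", 1L);   // sentence tuple
        acker.anchor("msg-1", 2L);     // first word tuple, anchored to the sentence
        acker.anchor("msg-1", 4L);     // second word tuple
        System.out.println(acker.ack("msg-1", 1L));
        System.out.println(acker.ack("msg-1", 2L));
        System.out.println(acker.ack("msg-1", 4L));
    }
}
```

In a reliable topology the spout keeps the message until the tree completes; an unreliable topology skips the message id and the anchoring, so nothing is tracked or replayed.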
Unit testing Storm components 
BDD style of testing
Extending OutputCollector
Physical View 
Deploying a topology to a cluster 
storm jar wordcount-1.0.jar com.demo.storm.WordCountTopology word-count-topology
Monitoring and performance tuning
Run under supervision: Monit, supervisord
Nimbus moves work to another node
Supervisor will restart the worker
Micro-Batch Stream Processing 
Functions, Filters, aggregations, joins, grouping 
Ordered batches of tuples. Batches can be partitioned. 
Similar to Pig or Cascading 
Transactional spouts 
Trident has first-class abstractions for reading from and writing to stateful sources 
Stream processed in small batches 
•Each batch has a unique ID, which is always the same on each replay 
•If one tuple fails, the whole batch is reprocessed 
•Higher throughput than core Storm, but higher latency as well
How does Trident provide exactly-once semantics?
Store the count along with the batch ID 
COUNT   BATCHID
100     1
110     2       (10 more tuples with batchId 2)
Failure: batch 2 is replayed with the same batchId (2). The stored batchId is already 2, so the count stays at 110. 
•The spout should replay a batch exactly as it was played before 
•The Trident API hides the complexity of dealing with batch IDs
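The batch ID bookkeeping that Trident hides can be sketched in a few lines of plain Java. This is not Trident's state API: the class BatchCountState and its methods are hypothetical, and the sketch assumes batches are applied in order and that a replayed batch carries the same tuples and the same ID.

```java
/** Count stored together with the id of the last applied batch; hypothetical names. */
public class BatchCountState {

    private long count;
    private long lastBatchId;

    BatchCountState(long count, long lastBatchId) {
        this.count = count;
        this.lastBatchId = lastBatchId;
    }

    /**
     * Adds a batch's partial count unless that batch was already applied.
     * Returns true if the state changed, false if the batch was a replay.
     */
    boolean apply(long batchId, long partialCount) {
        if (batchId == lastBatchId) {
            return false; // replay of the batch already stored: skip the update
        }
        count += partialCount;
        lastBatchId = batchId;
        return true;
    }

    long count() {
        return count;
    }

    long lastBatchId() {
        return lastBatchId;
    }

    public static void main(String[] args) {
        BatchCountState state = new BatchCountState(100, 1); // COUNT 100, BATCHID 1
        state.apply(2, 10);                                  // 10 more tuples with batchId 2
        state.apply(2, 10);                                  // failure: batch 2 replayed
        System.out.println(state.count() + " (batch " + state.lastBatchId() + ")");
    }
}
```

Storing the count and the batch ID together is the key design choice: the replay check and the update refer to the same stored record, so a replayed batch cannot be counted twice.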
Word count with Trident
Style of computation 
Enhancing Twitter feed with lead Image and Title 
•Readability enhancements 
•Image Scaling 
•Remove duplicates 
•Custom Business Logic
Writing a Twitter spout
Status
Use the Twitter4J Java library
Use an existing spout from the Storm contrib project on GitHub 
Spouts exist for: Twitter, Kafka, JMS, RabbitMQ, Amazon SQS, Kinesis, MongoDB…
•Storm takes care of scalability and fault-tolerance 
•What happens if there is a burst in traffic?
Introducing Queuing Layer with Kafka 
Solr Indexing
Processing Groovy rules (DSL) at scale in real time
Statsd and Storm Metrics API 
http://www.michael-noll.com/blog/2013/11/06/sending-metrics-from-storm-to-graphite/
•Use a cache if you can, for example Google Guava caching utilities 
•In-memory DB 
•Tick tuples (for batch updates)
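The tick-tuple idea can be sketched without Storm: buffer incoming records and write them out as one batch whenever a periodic tick arrives. In Storm the tick interval is requested through the topology.tick.tuple.freq.secs setting. The class TickBatcher below and its in-memory list of flushed batches are assumptions that stand in for a bolt and a database.

```java
import java.util.ArrayList;
import java.util.List;

/** Buffers records and flushes them as one batch per tick; hypothetical names. */
public class TickBatcher {

    private final List<String> buffer = new ArrayList<>();
    private final List<List<String>> flushed = new ArrayList<>();

    /** Regular tuple: only buffer it, no external call yet. */
    void onTuple(String record) {
        buffer.add(record);
    }

    /** Tick: write everything buffered so far as a single batch. Returns the batch size. */
    int onTick() {
        if (buffer.isEmpty()) {
            return 0;
        }
        // A real bolt would issue one bulk write here; this sketch only remembers the batch.
        flushed.add(new ArrayList<>(buffer));
        int size = buffer.size();
        buffer.clear();
        return size;
    }

    List<List<String>> flushedBatches() {
        return flushed;
    }

    public static void main(String[] args) {
        TickBatcher batcher = new TickBatcher();
        batcher.onTuple("tweet-1");
        batcher.onTuple("tweet-2");
        System.out.println("flushed " + batcher.onTick() + " records");
    }
}
```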
Trident-ML
•Linear classification (Perceptron, Passive-Aggressive, Winnow, AROW) 
•Linear regression (Perceptron, Passive-Aggressive) 
•Clustering (KMeans) 
•Feature scaling (standardization, normalization) 
•Text feature extraction 
•Stream statistics (mean, variance) 
•Pre-trained Twitter sentiment classifier 
http://www.michael-noll.com 
http://www.bigdata-cookbook.com/post/72320512609/storm-metrics-how-to 
http://svendvanderveken.wordpress.com/
edvorkin/Storm_Demo_Spring2GX
Go ahead. Ask away.

Learning Stream Processing with Apache Storm

Editor's Notes

  2. Recently we at WebMD had to create an application that processes data from Twitter.
  3. infrastructure
  4. Infrastructure investment, administration cost, steep learning curve, huge ecosystem: Pig, Hive, Ambari, Cascading, Flume…
  5. Social media sentiment, machine sensors, Internet of Things, interconnected devices, logs, clickstream. CEP or stream processing solutions existed before but were very costly.
  6. Pause
  7. Ready for the enterprise, not only for Twitter or LinkedIn
  8. Pause. Meaning: fault tolerant
  9. Workers, spout, slow down on basics
  10. A bolt processes any number of input streams and produces any number of new output streams. Most of the logic of a computation goes into bolts, such as functions, filters, streaming joins, streaming aggregations, talking to databases, and so on.
  11. A bolt processes any number of input streams and produces any number of new output streams. Most of the logic of a computation goes into bolts, such as functions, filters, streaming joins, streaming aggregations, talking to databases, and so on.
  12. Pause. A topology is a network of spouts and bolts, with each edge in the network representing a bolt subscribing to the output stream of some other spout or bolt. DAG
  13. Pause. A topology is a network of spouts and bolts, with each edge in the network representing a bolt subscribing to the output stream of some other spout or bolt. DAG
  14. Pause. A topology is a network of spouts and bolts, with each edge in the network representing a bolt subscribing to the output stream of some other spout or bolt. DAG
  15. pause
  16. Like driver in Hadoop
  17. pause
  18. pause
  19. Storm considers a tuple coming off a spout fully processed when every message in the tree has been processed. A tuple is considered failed when its tree of messages fails to be fully processed within a configurable timeout. The default is 30 seconds.
  20. Storm considers a tuple coming off a spout fully processed when every message in the tree has been processed. A tuple is considered failed when its tree of messages fails to be fully processed within a configurable timeout. The default is 30 seconds. When emitting a tuple, the spout provides a "message id" that will be used to identify the tuple later.
  21. Storm considers a tuple coming off a spout fully processed when every message in the tree has been processed. A tuple is considered failed when its tree of messages fails to be fully processed within a configurable timeout. The default is 30 seconds. Anchoring is the link between an incoming tuple and a derived tuple.
  22. Master and worker nodes. Nimbus is similar to the JobTracker in Hadoop. Nimbus is responsible for distributing code around the cluster, assigning tasks to machines, and monitoring for failures. Each worker node runs a daemon called the "Supervisor". The supervisor listens for work assigned to its machine and starts and stops worker processes as necessary based on what Nimbus has assigned to it. All coordination between Nimbus and the Supervisors is done through a ZooKeeper cluster.
  23. Master and worker nodes. Nimbus is responsible for distributing code around the cluster, assigning tasks to machines, and monitoring for failures. Each worker node runs a daemon called the "Supervisor". The supervisor listens for work assigned to its machine and starts and stops worker processes as necessary based on what Nimbus has assigned to it. All coordination between Nimbus and the Supervisors is done through a ZooKeeper cluster.
  24. Capacity: percentage of time the bolt was busy executing a particular task
  25. Processing will continue, but topology lifecycle operations and the reassignment facility are lost. Run under system supervision.
  26. Trident topologies are converted into Storm topologies with spouts and bolts
  27. Higher throughput than core Storm, but higher latency as well
  28. The spout should replay a batch exactly as it was played before (Kafka spout). The Trident API hides the complexity of dealing with batch IDs.
  29. Java fluent API. Write functions or filters instead of bolts.
  30. Fire and forget
  31. A single Kafka broker can handle hundreds of megabytes of reads and writes per second from thousands of clients
  32. Same code, just different topologies and original sources. Lambda architecture.
  33. Groovy Script engine