SlideShare uma empresa Scribd logo
Storm
Distributed and fault-tolerant realtime computation




                                          Nathan Marz
                                            Twitter
Basic info
• Open sourced September 19th
• Implementation is 12,000 lines of code
• Used by over 25 companies
• >2280 watchers on Github (most watched
  JVM project)
• Very active mailing list
   • >1700 messages
   • >520 members
Hadoop


              Batch computation




Distributed                       Fault-tolerant
Storm


         Realtime computation




Distributed                 Fault-tolerant
Hadoop


• Large, finite jobs
• Process a lot of data at once
• High latency
Storm


• Infinite computations called topologies
• Process infinite streams of data
• Tuple-at-a-time computational model
• Low latency
Before Storm



Queues        Workers
Example




 (simplified)
Example




Workers schemify tweets
 and append to Hadoop
Example




Workers update statistics on URLs by
incrementing counters in Cassandra
Problems

• Scaling is painful
• Poor fault-tolerance
• Coding is tedious
What we want
• Guaranteed data processing
• Horizontal scalability
• Fault-tolerance
• No intermediate message brokers!
• Higher level abstraction than message passing
• “Just works”
Storm
Guaranteed data processing
Horizontal scalability
Fault-tolerance
No intermediate message brokers!
Higher level abstraction than message passing
“Just works”
Use cases



  Stream      Distributed   Continuous
processing       RPC        computation
Storm Cluster
Storm Cluster




Master node (similar to Hadoop JobTracker)
Storm Cluster




Used for cluster coordination
Storm Cluster




 Run worker processes
Starting a topology
Killing a topology
Concepts

• Streams
• Spouts
• Bolts
• Topologies
Streams


Tuple   Tuple   Tuple   Tuple   Tuple   Tuple   Tuple




          Unbounded sequence of tuples
Spouts




Source of streams
Spout examples


• Read from Kestrel queue
• Read from Twitter streaming API
Bolts




Processes input streams and produces new streams
Bolts
• Functions
• Filters
• Aggregation
• Joins
• Talk to databases
Topology




Network of spouts and bolts
Tasks




Spouts and bolts execute as
many tasks across the cluster
Task execution




Tasks are spread across the cluster
Task execution




Tasks are spread across the cluster
Stream grouping




When a tuple is emitted, which task does it go to?
Stream grouping

• Shuffle grouping: pick a random task
• Fields grouping: mod hashing on a
  subset of tuple fields
• All grouping: send to all tasks
• Global grouping: pick task with lowest id
Topology
shuffle      [“id1”, “id2”]




           shuffle
[“url”]


  shuffle

              all
Streaming word count




TopologyBuilder is used to construct topologies in Java
Streaming word count




Define a spout in the topology with parallelism of 5 tasks
Streaming word count




Split sentences into words with parallelism of 8 tasks
Streaming word count



Consumer decides what data it receives and how it gets grouped




Split sentences into words with parallelism of 8 tasks
Streaming word count




   Create a word count stream
Streaming word count




      splitsentence.py
Streaming word count
Streaming word count




  Submitting topology to a cluster
Streaming word count




  Running topology in local mode
Demo
Distributed RPC




Data flow for Distributed RPC
DRPC Example


Computing “reach” of a URL on the fly
Reach


Reach is the number of unique people
    exposed to a URL on Twitter
Computing reach
                Follower
                           Distinct
      Tweeter   Follower   follower

                Follower
                           Distinct
URL   Tweeter              follower   Count   Reach
                Follower

                Follower   Distinct
      Tweeter              follower
                Follower
Reach topology
Reach topology
Reach topology
Reach topology


          Keep set of followers for
         each request id in memory
Reach topology




          Update followers set when
           receive a new follower
Reach topology


            Emit partial count after
          receiving all followers for a
                   request id
Demo
Guaranteeing message
     processing




       “Tuple tree”
Guaranteeing message
     processing

• A spout tuple is not fully processed until all
  tuples in the tree have been completed
Guaranteeing message
     processing

• If the tuple tree is not completed within a
  specified timeout, the spout tuple is replayed
Guaranteeing message
     processing




      Reliability API
Guaranteeing message
     processing




“Anchoring” creates a new edge in the tuple tree
Guaranteeing message
     processing




 Marks a single node in the tree as complete
Guaranteeing message
     processing

• Storm tracks tuple trees for you in an
  extremely efficient way
Transactional topologies


How do you do idempotent counting with an
     at least once delivery guarantee?
Transactional topologies


      Won’t you overcount?
Transactional topologies


 Transactional topologies solve this problem
Transactional topologies


Built completely on top of Storm’s primitives
        of streams, spouts, and bolts
Transactional topologies


Batch 1   Batch 2   Batch 3




          Process small batches of tuples
Transactional topologies


Batch 1     Batch 2   Batch 3




          If a batch fails, replay the whole batch
Transactional topologies


Batch 1   Batch 2   Batch 3




   Once a batch is completed, commit the batch
Transactional topologies


Batch 1   Batch 2   Batch 3




Bolts can optionally implement “commit” method
Transactional topologies

Commit 1   Commit 1   Commit 2   Commit 3   Commit 4   Commit 4




   Commits are ordered. If there’s a failure during
    commit, the whole batch + commit is retried
Example
Example

       New instance of this object
      for every transaction attempt
Example



          Aggregate the count for
                this batch
Example


      Only update database if
       transaction ids differ
Example


     This enables idempotency since
          commits are ordered
Example


      (Credit goes to Kafka guys
       for figuring out this trick)
Transactional topologies


Multiple batches can be processed in parallel,
but commits are guaranteed to be ordered
Transactional topologies

• Will be available in next version of Storm
  (0.7.0)
• Requires a source queue that can replay
  identical batches of messages
• Aiming for first TransactionalSpout
  implementation to use Kafka
Storm UI
Storm on EC2


https://github.com/nathanmarz/storm-deploy




          One-click deploy tool
Starter code


https://github.com/nathanmarz/storm-starter




            Example topologies
Documentation
Ecosystem

• Scala, JRuby, and Clojure DSL’s
• Kestrel, AMQP, JMS, and other spout adapters
• Serializers
• Multilang adapters
• Cassandra, MongoDB integration
Questions?


http://github.com/nathanmarz/storm

Mais conteúdo relacionado

Mais procurados

Introduction to Apache Storm
Introduction to Apache StormIntroduction to Apache Storm
Introduction to Apache Storm
Tiziano De Matteis
 
Storm presentation
Storm presentationStorm presentation
Storm presentation
Shyam Raj
 
Introduction to Apache Storm - Concept & Example
Introduction to Apache Storm - Concept & ExampleIntroduction to Apache Storm - Concept & Example
Introduction to Apache Storm - Concept & Example
Dung Ngua
 
Scaling Apache Storm (Hadoop Summit 2015)
Scaling Apache Storm (Hadoop Summit 2015)Scaling Apache Storm (Hadoop Summit 2015)
Scaling Apache Storm (Hadoop Summit 2015)
Robert Evans
 
Resource Aware Scheduling in Apache Storm
Resource Aware Scheduling in Apache StormResource Aware Scheduling in Apache Storm
Resource Aware Scheduling in Apache Storm
DataWorks Summit/Hadoop Summit
 
Real-time streams and logs with Storm and Kafka
Real-time streams and logs with Storm and KafkaReal-time streams and logs with Storm and Kafka
Real-time streams and logs with Storm and Kafka
Andrew Montalenti
 
Apache Storm
Apache StormApache Storm
Apache Storm
masifqadri
 
Storm Anatomy
Storm AnatomyStorm Anatomy
Storm Anatomy
Eiichiro Uchiumi
 
Cassandra and Storm at Health Market Sceince
Cassandra and Storm at Health Market SceinceCassandra and Storm at Health Market Sceince
Cassandra and Storm at Health Market Sceince
P. Taylor Goetz
 
Apache Storm
Apache StormApache Storm
Apache Storm
Nguyen Quang
 
Multi-Tenant Storm Service on Hadoop Grid
Multi-Tenant Storm Service on Hadoop GridMulti-Tenant Storm Service on Hadoop Grid
Multi-Tenant Storm Service on Hadoop Grid
DataWorks Summit
 
Storm and Cassandra
Storm and Cassandra Storm and Cassandra
Storm and Cassandra
T Jake Luciani
 
Apache Storm and twitter Streaming API integration
Apache Storm and twitter Streaming API integrationApache Storm and twitter Streaming API integration
Apache Storm and twitter Streaming API integration
Uday Vakalapudi
 
Apache Storm Concepts
Apache Storm ConceptsApache Storm Concepts
Apache Storm Concepts
André Dias
 
Real-time Big Data Processing with Storm
Real-time Big Data Processing with StormReal-time Big Data Processing with Storm
Real-time Big Data Processing with Storm
viirya
 
Using Simplicity to Make Hard Big Data Problems Easy
Using Simplicity to Make Hard Big Data Problems EasyUsing Simplicity to Make Hard Big Data Problems Easy
Using Simplicity to Make Hard Big Data Problems Easy
nathanmarz
 
Storm: Distributed and fault tolerant realtime computation
Storm: Distributed and fault tolerant realtime computationStorm: Distributed and fault tolerant realtime computation
Storm: Distributed and fault tolerant realtime computation
Ferran Galí Reniu
 
Improved Reliable Streaming Processing: Apache Storm as example
Improved Reliable Streaming Processing: Apache Storm as exampleImproved Reliable Streaming Processing: Apache Storm as example
Improved Reliable Streaming Processing: Apache Storm as example
DataWorks Summit/Hadoop Summit
 
Multi-tenant Apache Storm as a service
Multi-tenant Apache Storm as a serviceMulti-tenant Apache Storm as a service
Multi-tenant Apache Storm as a service
Robert Evans
 
Storm - As deep into real-time data processing as you can get in 30 minutes.
Storm - As deep into real-time data processing as you can get in 30 minutes.Storm - As deep into real-time data processing as you can get in 30 minutes.
Storm - As deep into real-time data processing as you can get in 30 minutes.
Dan Lynn
 

Mais procurados (20)

Introduction to Apache Storm
Introduction to Apache StormIntroduction to Apache Storm
Introduction to Apache Storm
 
Storm presentation
Storm presentationStorm presentation
Storm presentation
 
Introduction to Apache Storm - Concept & Example
Introduction to Apache Storm - Concept & ExampleIntroduction to Apache Storm - Concept & Example
Introduction to Apache Storm - Concept & Example
 
Scaling Apache Storm (Hadoop Summit 2015)
Scaling Apache Storm (Hadoop Summit 2015)Scaling Apache Storm (Hadoop Summit 2015)
Scaling Apache Storm (Hadoop Summit 2015)
 
Resource Aware Scheduling in Apache Storm
Resource Aware Scheduling in Apache StormResource Aware Scheduling in Apache Storm
Resource Aware Scheduling in Apache Storm
 
Real-time streams and logs with Storm and Kafka
Real-time streams and logs with Storm and KafkaReal-time streams and logs with Storm and Kafka
Real-time streams and logs with Storm and Kafka
 
Apache Storm
Apache StormApache Storm
Apache Storm
 
Storm Anatomy
Storm AnatomyStorm Anatomy
Storm Anatomy
 
Cassandra and Storm at Health Market Sceince
Cassandra and Storm at Health Market SceinceCassandra and Storm at Health Market Sceince
Cassandra and Storm at Health Market Sceince
 
Apache Storm
Apache StormApache Storm
Apache Storm
 
Multi-Tenant Storm Service on Hadoop Grid
Multi-Tenant Storm Service on Hadoop GridMulti-Tenant Storm Service on Hadoop Grid
Multi-Tenant Storm Service on Hadoop Grid
 
Storm and Cassandra
Storm and Cassandra Storm and Cassandra
Storm and Cassandra
 
Apache Storm and twitter Streaming API integration
Apache Storm and twitter Streaming API integrationApache Storm and twitter Streaming API integration
Apache Storm and twitter Streaming API integration
 
Apache Storm Concepts
Apache Storm ConceptsApache Storm Concepts
Apache Storm Concepts
 
Real-time Big Data Processing with Storm
Real-time Big Data Processing with StormReal-time Big Data Processing with Storm
Real-time Big Data Processing with Storm
 
Using Simplicity to Make Hard Big Data Problems Easy
Using Simplicity to Make Hard Big Data Problems EasyUsing Simplicity to Make Hard Big Data Problems Easy
Using Simplicity to Make Hard Big Data Problems Easy
 
Storm: Distributed and fault tolerant realtime computation
Storm: Distributed and fault tolerant realtime computationStorm: Distributed and fault tolerant realtime computation
Storm: Distributed and fault tolerant realtime computation
 
Improved Reliable Streaming Processing: Apache Storm as example
Improved Reliable Streaming Processing: Apache Storm as exampleImproved Reliable Streaming Processing: Apache Storm as example
Improved Reliable Streaming Processing: Apache Storm as example
 
Multi-tenant Apache Storm as a service
Multi-tenant Apache Storm as a serviceMulti-tenant Apache Storm as a service
Multi-tenant Apache Storm as a service
 
Storm - As deep into real-time data processing as you can get in 30 minutes.
Storm - As deep into real-time data processing as you can get in 30 minutes.Storm - As deep into real-time data processing as you can get in 30 minutes.
Storm - As deep into real-time data processing as you can get in 30 minutes.
 

Destaque

Storms!
Storms!Storms!
Storm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computationStorm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computation
nathanmarz
 
Storms and cyclones
Storms and cyclonesStorms and cyclones
Storms and cyclones
Vaibhav Anand
 
Demystifying Data Engineering
Demystifying Data EngineeringDemystifying Data Engineering
Demystifying Data Engineering
nathanmarz
 
Wind, storms and cyclones ppt
Wind, storms and cyclones pptWind, storms and cyclones ppt
Wind, storms and cyclones ppt
3962
 
Apache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - VerisignApache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - Verisign
Michael Noll
 
Docker architecture-05
Docker architecture-05Docker architecture-05
Docker architecture-05
Mohammadreza Amini
 
Storm overview & integration
Storm overview & integrationStorm overview & integration
Storm overview & integration
Vanja Radovanović
 
Future of Apache Storm
Future of Apache StormFuture of Apache Storm
Future of Apache Storm
DataWorks Summit/Hadoop Summit
 
Realtime processing with storm presentation
Realtime processing with storm presentationRealtime processing with storm presentation
Realtime processing with storm presentation
Gabriel Eisbruch
 
Weather
WeatherWeather
Weather
Mr. M
 
Tempo & music
Tempo & musicTempo & music
Tempo & music
cristisfer
 
Pantig sa grade 3
Pantig sa grade 3Pantig sa grade 3
Pantig sa grade 3
Razel Rebamba
 
Tipos de tormenta
Tipos de tormentaTipos de tormenta
Tipos de tormenta
Mario Cruz Barrera
 
Presentation On Winds,Storms, & Cyclones....
Presentation On Winds,Storms, & Cyclones....Presentation On Winds,Storms, & Cyclones....
Presentation On Winds,Storms, & Cyclones....
Sourav Shaw
 
Air masses and fronts
Air masses and frontsAir masses and fronts
Air masses and fronts
eudalddiaz
 
Kinds of weather
Kinds of weatherKinds of weather
Kinds of weather
Charisse Marie Verallo
 
Wind,storms and cyclones
Wind,storms and cyclonesWind,storms and cyclones
Wind,storms and cyclones
Mayank Raj
 
Real time and reliable processing with Apache Storm
Real time and reliable processing with Apache StormReal time and reliable processing with Apache Storm
Real time and reliable processing with Apache Storm
Andrea Iacono
 
Weather elements unit 15
Weather elements unit 15Weather elements unit 15
Weather elements unit 15
MissST
 

Destaque (20)

Storms!
Storms!Storms!
Storms!
 
Storm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computationStorm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computation
 
Storms and cyclones
Storms and cyclonesStorms and cyclones
Storms and cyclones
 
Demystifying Data Engineering
Demystifying Data EngineeringDemystifying Data Engineering
Demystifying Data Engineering
 
Wind, storms and cyclones ppt
Wind, storms and cyclones pptWind, storms and cyclones ppt
Wind, storms and cyclones ppt
 
Apache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - VerisignApache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - Verisign
 
Docker architecture-05
Docker architecture-05Docker architecture-05
Docker architecture-05
 
Storm overview & integration
Storm overview & integrationStorm overview & integration
Storm overview & integration
 
Future of Apache Storm
Future of Apache StormFuture of Apache Storm
Future of Apache Storm
 
Realtime processing with storm presentation
Realtime processing with storm presentationRealtime processing with storm presentation
Realtime processing with storm presentation
 
Weather
WeatherWeather
Weather
 
Tempo & music
Tempo & musicTempo & music
Tempo & music
 
Pantig sa grade 3
Pantig sa grade 3Pantig sa grade 3
Pantig sa grade 3
 
Tipos de tormenta
Tipos de tormentaTipos de tormenta
Tipos de tormenta
 
Presentation On Winds,Storms, & Cyclones....
Presentation On Winds,Storms, & Cyclones....Presentation On Winds,Storms, & Cyclones....
Presentation On Winds,Storms, & Cyclones....
 
Air masses and fronts
Air masses and frontsAir masses and fronts
Air masses and fronts
 
Kinds of weather
Kinds of weatherKinds of weather
Kinds of weather
 
Wind,storms and cyclones
Wind,storms and cyclonesWind,storms and cyclones
Wind,storms and cyclones
 
Real time and reliable processing with Apache Storm
Real time and reliable processing with Apache StormReal time and reliable processing with Apache Storm
Real time and reliable processing with Apache Storm
 
Weather elements unit 15
Weather elements unit 15Weather elements unit 15
Weather elements unit 15
 

Semelhante a Storm

Jan 2012 HUG: Storm
Jan 2012 HUG: StormJan 2012 HUG: Storm
Jan 2012 HUG: Storm
Yahoo Developer Network
 
Cleveland HUG - Storm
Cleveland HUG - StormCleveland HUG - Storm
Cleveland HUG - Storm
justinjleet
 
Low Latency Streaming Data Processing in Hadoop
Low Latency Streaming Data Processing in HadoopLow Latency Streaming Data Processing in Hadoop
Low Latency Streaming Data Processing in Hadoop
InSemble
 
Building Big Data Streaming Architectures
Building Big Data Streaming ArchitecturesBuilding Big Data Streaming Architectures
Building Big Data Streaming Architectures
David Martínez Rego
 
Apache Storm
Apache StormApache Storm
Apache Storm
Rajind Ruparathna
 
Hadoop Ecosystem and Low Latency Streaming Architecture
Hadoop Ecosystem and Low Latency Streaming ArchitectureHadoop Ecosystem and Low Latency Streaming Architecture
Hadoop Ecosystem and Low Latency Streaming Architecture
InSemble
 
Data Stream Algorithms in Storm and R
Data Stream Algorithms in Storm and RData Stream Algorithms in Storm and R
Data Stream Algorithms in Storm and R
Radek Maciaszek
 
From Device to Data Center to Insights
From Device to Data Center to InsightsFrom Device to Data Center to Insights
From Device to Data Center to Insights
DataWorks Summit/Hadoop Summit
 
Real-Time Streaming with Apache Spark Streaming and Apache Storm
Real-Time Streaming with Apache Spark Streaming and Apache StormReal-Time Streaming with Apache Spark Streaming and Apache Storm
Real-Time Streaming with Apache Spark Streaming and Apache Storm
Davorin Vukelic
 
High Performance Systems in Go - GopherCon 2014
High Performance Systems in Go - GopherCon 2014High Performance Systems in Go - GopherCon 2014
High Performance Systems in Go - GopherCon 2014
Derek Collison
 
Alvaro Videla, Building a Distributed Data Ingestion System with RabbitMQ
Alvaro Videla, Building a Distributed Data Ingestion System with RabbitMQAlvaro Videla, Building a Distributed Data Ingestion System with RabbitMQ
Alvaro Videla, Building a Distributed Data Ingestion System with RabbitMQ
Tanya Denisyuk
 
Trend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache BigtopTrend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache Bigtop
Evans Ye
 
Apache storm vs. Spark Streaming
Apache storm vs. Spark StreamingApache storm vs. Spark Streaming
Apache storm vs. Spark Streaming
P. Taylor Goetz
 
From Device to Data Center to Insights: Architectural Considerations for the ...
From Device to Data Center to Insights: Architectural Considerations for the ...From Device to Data Center to Insights: Architectural Considerations for the ...
From Device to Data Center to Insights: Architectural Considerations for the ...
P. Taylor Goetz
 
Mhug apache storm
Mhug apache stormMhug apache storm
Mhug apache storm
Joseph Niemiec
 
introduction to data processing using Hadoop and Pig
introduction to data processing using Hadoop and Pigintroduction to data processing using Hadoop and Pig
introduction to data processing using Hadoop and Pig
Ricardo Varela
 
Hortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
Hortonworks Technical Workshop: Real Time Monitoring with Apache HadoopHortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
Hortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
Hortonworks
 
The Cassandra Distributed Database
The Cassandra Distributed DatabaseThe Cassandra Distributed Database
The Cassandra Distributed Database
Eric Evans
 
InfluxEnterprise Architecture Patterns by Tim Hall & Sam Dillard
InfluxEnterprise Architecture Patterns by Tim Hall & Sam DillardInfluxEnterprise Architecture Patterns by Tim Hall & Sam Dillard
InfluxEnterprise Architecture Patterns by Tim Hall & Sam Dillard
InfluxData
 
Follow the White Rabbit - Message Queues with PHP
Follow the White Rabbit - Message Queues with PHPFollow the White Rabbit - Message Queues with PHP
Follow the White Rabbit - Message Queues with PHP
Eric Rodriguez (Hiring in Lex)
 

Semelhante a Storm (20)

Jan 2012 HUG: Storm
Jan 2012 HUG: StormJan 2012 HUG: Storm
Jan 2012 HUG: Storm
 
Cleveland HUG - Storm
Cleveland HUG - StormCleveland HUG - Storm
Cleveland HUG - Storm
 
Low Latency Streaming Data Processing in Hadoop
Low Latency Streaming Data Processing in HadoopLow Latency Streaming Data Processing in Hadoop
Low Latency Streaming Data Processing in Hadoop
 
Building Big Data Streaming Architectures
Building Big Data Streaming ArchitecturesBuilding Big Data Streaming Architectures
Building Big Data Streaming Architectures
 
Apache Storm
Apache StormApache Storm
Apache Storm
 
Hadoop Ecosystem and Low Latency Streaming Architecture
Hadoop Ecosystem and Low Latency Streaming ArchitectureHadoop Ecosystem and Low Latency Streaming Architecture
Hadoop Ecosystem and Low Latency Streaming Architecture
 
Data Stream Algorithms in Storm and R
Data Stream Algorithms in Storm and RData Stream Algorithms in Storm and R
Data Stream Algorithms in Storm and R
 
From Device to Data Center to Insights
From Device to Data Center to InsightsFrom Device to Data Center to Insights
From Device to Data Center to Insights
 
Real-Time Streaming with Apache Spark Streaming and Apache Storm
Real-Time Streaming with Apache Spark Streaming and Apache StormReal-Time Streaming with Apache Spark Streaming and Apache Storm
Real-Time Streaming with Apache Spark Streaming and Apache Storm
 
High Performance Systems in Go - GopherCon 2014
High Performance Systems in Go - GopherCon 2014High Performance Systems in Go - GopherCon 2014
High Performance Systems in Go - GopherCon 2014
 
Alvaro Videla, Building a Distributed Data Ingestion System with RabbitMQ
Alvaro Videla, Building a Distributed Data Ingestion System with RabbitMQAlvaro Videla, Building a Distributed Data Ingestion System with RabbitMQ
Alvaro Videla, Building a Distributed Data Ingestion System with RabbitMQ
 
Trend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache BigtopTrend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache Bigtop
 
Apache storm vs. Spark Streaming
Apache storm vs. Spark StreamingApache storm vs. Spark Streaming
Apache storm vs. Spark Streaming
 
From Device to Data Center to Insights: Architectural Considerations for the ...
From Device to Data Center to Insights: Architectural Considerations for the ...From Device to Data Center to Insights: Architectural Considerations for the ...
From Device to Data Center to Insights: Architectural Considerations for the ...
 
Mhug apache storm
Mhug apache stormMhug apache storm
Mhug apache storm
 
introduction to data processing using Hadoop and Pig
introduction to data processing using Hadoop and Pigintroduction to data processing using Hadoop and Pig
introduction to data processing using Hadoop and Pig
 
Hortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
Hortonworks Technical Workshop: Real Time Monitoring with Apache HadoopHortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
Hortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
 
The Cassandra Distributed Database
The Cassandra Distributed DatabaseThe Cassandra Distributed Database
The Cassandra Distributed Database
 
InfluxEnterprise Architecture Patterns by Tim Hall & Sam Dillard
InfluxEnterprise Architecture Patterns by Tim Hall & Sam DillardInfluxEnterprise Architecture Patterns by Tim Hall & Sam Dillard
InfluxEnterprise Architecture Patterns by Tim Hall & Sam Dillard
 
Follow the White Rabbit - Message Queues with PHP
Follow the White Rabbit - Message Queues with PHPFollow the White Rabbit - Message Queues with PHP
Follow the White Rabbit - Message Queues with PHP
 

Mais de nathanmarz

The inherent complexity of stream processing
The inherent complexity of stream processingThe inherent complexity of stream processing
The inherent complexity of stream processing
nathanmarz
 
The Epistemology of Software Engineering
The Epistemology of Software EngineeringThe Epistemology of Software Engineering
The Epistemology of Software Engineering
nathanmarz
 
Your Code is Wrong
Your Code is WrongYour Code is Wrong
Your Code is Wrong
nathanmarz
 
Runaway complexity in Big Data... and a plan to stop it
Runaway complexity in Big Data... and a plan to stop itRunaway complexity in Big Data... and a plan to stop it
Runaway complexity in Big Data... and a plan to stop it
nathanmarz
 
ElephantDB
ElephantDBElephantDB
ElephantDB
nathanmarz
 
Become Efficient or Die: The Story of BackType
Become Efficient or Die: The Story of BackTypeBecome Efficient or Die: The Story of BackType
Become Efficient or Die: The Story of BackType
nathanmarz
 
The Secrets of Building Realtime Big Data Systems
The Secrets of Building Realtime Big Data SystemsThe Secrets of Building Realtime Big Data Systems
The Secrets of Building Realtime Big Data Systems
nathanmarz
 
Clojure at BackType
Clojure at BackTypeClojure at BackType
Clojure at BackType
nathanmarz
 
Cascalog workshop
Cascalog workshopCascalog workshop
Cascalog workshop
nathanmarz
 
Cascalog at Strange Loop
Cascalog at Strange LoopCascalog at Strange Loop
Cascalog at Strange Loop
nathanmarz
 
Cascalog at Hadoop Day
Cascalog at Hadoop DayCascalog at Hadoop Day
Cascalog at Hadoop Day
nathanmarz
 
Cascalog at May Bay Area Hadoop User Group
Cascalog at May Bay Area Hadoop User GroupCascalog at May Bay Area Hadoop User Group
Cascalog at May Bay Area Hadoop User Group
nathanmarz
 
Cascalog
CascalogCascalog
Cascalog
nathanmarz
 
Cascading
CascadingCascading
Cascading
nathanmarz
 

Mais de nathanmarz (14)

The inherent complexity of stream processing
The inherent complexity of stream processingThe inherent complexity of stream processing
The inherent complexity of stream processing
 
The Epistemology of Software Engineering
The Epistemology of Software EngineeringThe Epistemology of Software Engineering
The Epistemology of Software Engineering
 
Your Code is Wrong
Your Code is WrongYour Code is Wrong
Your Code is Wrong
 
Runaway complexity in Big Data... and a plan to stop it
Runaway complexity in Big Data... and a plan to stop itRunaway complexity in Big Data... and a plan to stop it
Runaway complexity in Big Data... and a plan to stop it
 
ElephantDB
ElephantDBElephantDB
ElephantDB
 
Become Efficient or Die: The Story of BackType
Become Efficient or Die: The Story of BackTypeBecome Efficient or Die: The Story of BackType
Become Efficient or Die: The Story of BackType
 
The Secrets of Building Realtime Big Data Systems
The Secrets of Building Realtime Big Data SystemsThe Secrets of Building Realtime Big Data Systems
The Secrets of Building Realtime Big Data Systems
 
Clojure at BackType
Clojure at BackTypeClojure at BackType
Clojure at BackType
 
Cascalog workshop
Cascalog workshopCascalog workshop
Cascalog workshop
 
Cascalog at Strange Loop
Cascalog at Strange LoopCascalog at Strange Loop
Cascalog at Strange Loop
 
Cascalog at Hadoop Day
Cascalog at Hadoop DayCascalog at Hadoop Day
Cascalog at Hadoop Day
 
Cascalog at May Bay Area Hadoop User Group
Cascalog at May Bay Area Hadoop User GroupCascalog at May Bay Area Hadoop User Group
Cascalog at May Bay Area Hadoop User Group
 
Cascalog
CascalogCascalog
Cascalog
 
Cascading
CascadingCascading
Cascading
 

Último

Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
MichaelKnudsen27
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
Wouter Lemaire
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
Chart Kalyan
 
Operating System Used by Users in day-to-day life.pptx
Operating System Used by Users in day-to-day life.pptxOperating System Used by Users in day-to-day life.pptx
Operating System Used by Users in day-to-day life.pptx
Pravash Chandra Das
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
Hiroshi SHIBATA
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
Zilliz
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
tolgahangng
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
DanBrown980551
 
A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024
Intelisync
 
Nunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdf
Nunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdfNunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdf
Nunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdf
flufftailshop
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
Jason Packer
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
saastr
 
Finale of the Year: Apply for Next One!
Finale of the Year: Apply for Next One!Finale of the Year: Apply for Next One!
Finale of the Year: Apply for Next One!
GDSC PJATK
 
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - HiikeSystem Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
Hiike
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
Brandon Minnick, MBA
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
Zilliz
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 
AWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptxAWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptx
HarisZaheer8
 

Último (20)

Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
 
Operating System Used by Users in day-to-day life.pptx
Operating System Used by Users in day-to-day life.pptxOperating System Used by Users in day-to-day life.pptx
Operating System Used by Users in day-to-day life.pptx
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
 
A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024
 
Nunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdf
Nunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdfNunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdf
Nunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdf
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
 
Finale of the Year: Apply for Next One!
Finale of the Year: Apply for Next One!Finale of the Year: Apply for Next One!
Finale of the Year: Apply for Next One!
 
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - HiikeSystem Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
AWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptxAWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptx
 

Storm