2. INTRODUCTION
• Like Hadoop, but for real-time processing instead of batch processing
• Open Source
• Developed by BackType, which was later acquired by Twitter
• Developed for analyzing Twitter data
• Similar to S4
8. BOLTS
• A component that takes tuples as input and produces tuples as output
• Can do filtering, joins, functions, aggregations, etc.
• Does not have to process a tuple immediately; it may hold onto tuples and process them later
• Comparison with Hadoop: a bolt can act as a mapper, a reducer, or anything else
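
The bolt roles listed above can be sketched in plain Python. This is only an illustration of the idea, not Storm's actual API: the class names (SplitBolt, CountBolt, BatchBolt), the execute method, and the run helper are all hypothetical. SplitBolt behaves like a mapper, CountBolt like a reducer that keeps state, and BatchBolt holds tuples and emits them later.

```python
from collections import Counter


class SplitBolt:
    """Mapper-like bolt (hypothetical): one sentence tuple in, one tuple per word out."""

    def execute(self, tup):
        (sentence,) = tup
        return [(word.lower(),) for word in sentence.split()]


class CountBolt:
    """Reducer-like bolt (hypothetical): keeps a running count per word."""

    def __init__(self):
        self.counts = Counter()

    def execute(self, tup):
        (word,) = tup
        self.counts[word] += 1
        return [(word, self.counts[word])]


class BatchBolt:
    """Bolt that holds tuples and emits nothing until `size` of them have arrived."""

    def __init__(self, size):
        self.size = size
        self.held = []

    def execute(self, tup):
        self.held.append(tup)
        if len(self.held) < self.size:
            return []  # nothing emitted yet, tuples are kept for later
        batch, self.held = self.held, []
        return [tuple(t[0] for t in batch)]


def run(sentences):
    """Feed each sentence through SplitBolt, then CountBolt, and collect the output."""
    split, count = SplitBolt(), CountBolt()
    out = []
    for sentence in sentences:
        for word_tuple in split.execute((sentence,)):
            out.extend(count.execute(word_tuple))
    return out
```

Calling run(["a b", "b"]) yields one (word, running count) tuple per input word, while a BatchBolt(2) returns an empty list for its first tuple and a combined tuple for its second.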
10. STORM TOPOLOGY
• Spouts, bolts and streams
• Distributed
• Runs indefinitely until it is stopped
• Arbitrary complexity
• Streams requiring multiple steps also require multiple bolts
• No intermediate queues for streams
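
A minimal sketch of the topology idea, assuming plain Python generators in place of Storm's real spout and bolt classes: each bolt consumes the previous stream directly, so no intermediate queue sits between steps. All function names here (number_spout, even_filter_bolt, square_bolt, build_topology) are hypothetical, and the limit argument exists only to keep the demo finite, since a real topology runs until it is stopped.

```python
def number_spout(limit):
    """Spout: the source of the stream. `limit` only keeps this demo finite."""
    for n in range(limit):
        yield (n,)


def even_filter_bolt(stream):
    """Filtering bolt: passes through only tuples whose value is even."""
    for (n,) in stream:
        if n % 2 == 0:
            yield (n,)


def square_bolt(stream):
    """Function bolt: emits the value together with its square."""
    for (n,) in stream:
        yield (n, n * n)


def build_topology(spout, *bolts):
    """Chain bolts onto the spout; each step reads the previous stream directly."""
    stream = spout
    for bolt in bolts:
        stream = bolt(stream)
    return stream


result = list(build_topology(number_spout(6), even_filter_bolt, square_bolt))
```

A two-step computation needs two bolts here, which mirrors the point that multi-step streams require multiple bolts.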
11. FAULT-TOLERANCE
• The Nimbus daemon and the Supervisor daemons are fail-fast and stateless
• Each worker sends heartbeats to Nimbus
• Transactional topologies → Guaranteed processing
[Figure: cluster diagram showing the Nimbus, ZooKeeper, and Supervisor nodes]
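
Heartbeat-based failure detection can be sketched as follows. This is an assumption-laden illustration, not Storm's implementation: the HeartbeatMonitor class, its method names, and the timeout value are hypothetical, and timestamps are passed in explicitly so the behaviour is deterministic.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat time per worker (hypothetical, not Storm code)."""

    def __init__(self, timeout):
        self.timeout = timeout  # seconds of silence before a worker is presumed dead
        self.last_seen = {}

    def beat(self, worker_id, now):
        """Record a heartbeat from `worker_id` at time `now`."""
        self.last_seen[worker_id] = now

    def dead_workers(self, now):
        """Return the workers whose last heartbeat is older than the timeout."""
        return sorted(
            worker
            for worker, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


monitor = HeartbeatMonitor(timeout=30)
monitor.beat("worker-1", now=0)
monitor.beat("worker-2", now=20)
stale = monitor.dead_workers(now=40)
```

With these values only worker-1 has been silent for longer than the timeout, so it is the one a monitor of this kind would flag for restart or reassignment.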