This document discusses stream processing versus batch processing. It notes that stream processing involves data that is in motion and processed in real-time using streaming platforms and directed acyclic graphs. Batch processing involves data at rest that is processed through queries on the full dataset. The document also discusses challenges of stream processing like out-of-order and late data, and how windowing can provide a finite view of infinite data streams.
48. @gamussa @confluentinc @thephillyjug
Time model
Different use cases require different time semantics
The majority of use cases require event-time semantics
Other use cases may require processing-time semantics or special variants such as ingestion-time
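To make the three time semantics concrete, here is a minimal sketch in plain Python. The `Event` class, its field names, and the `timestamp_for` helper are hypothetical and are not taken from the slides or from any streaming library. The sketch only shows that one record can carry several timestamps and that the chosen semantics decides which one drives time-based operations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical record carrying all three timestamps (epoch seconds)."""
    payload: str
    event_time: float       # when the event happened at the source
    ingestion_time: float   # when the streaming platform stored it
    processing_time: float  # when the processor handled it


def timestamp_for(event: Event, semantics: str) -> float:
    """Return the timestamp that time-based operations would use."""
    if semantics == "event-time":
        return event.event_time
    if semantics == "ingestion-time":
        return event.ingestion_time
    if semantics == "processing-time":
        return event.processing_time
    raise ValueError(f"unknown time semantics: {semantics}")


# An email written at t=1000 s but stored and processed ten hours later.
email = Event("hello", event_time=1_000.0,
              ingestion_time=37_000.0, processing_time=37_002.0)
```

Under event-time semantics the email belongs to the moment it was written. Under the other two it belongs to the moment it reached the system.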
Windowing
Windowing is an operation that groups events
Most commonly needed: time windows, session windows
Examples:
• Real-time monitoring: 5-minute averages
• Reader behavior on a website: user browsing sessions
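The two examples can be sketched in plain Python. This is an illustration under assumptions, not a streaming-platform API: the function names, the 300-second window size, and the 30-minute inactivity gap are chosen only for the example. Time windows are shown as fixed, non-overlapping (tumbling) windows, and session windows as groups of events separated by inactivity.

```python
from collections import defaultdict


def tumbling_window_averages(events, size_s=300):
    """events: iterable of (event_time_s, value) pairs.

    Returns {window_start_s: average value} for fixed, non-overlapping windows.
    """
    buckets = defaultdict(list)
    for ts, value in events:
        window_start = (ts // size_s) * size_s  # align to the window boundary
        buckets[window_start].append(value)
    return {start: sum(vals) / len(vals)
            for start, vals in sorted(buckets.items())}


def session_windows(timestamps, gap_s=1800):
    """Group timestamps into sessions; a pause longer than gap_s starts a new one."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap_s:
            sessions[-1].append(ts)   # still the same browsing session
        else:
            sessions.append([ts])     # inactivity gap exceeded: new session
    return sessions


# Real-time monitoring: 5-minute averages of a metric.
readings = [(0, 10.0), (120, 20.0), (299, 30.0), (300, 40.0), (610, 50.0)]
averages = tumbling_window_averages(readings)

# Reader behavior: page views from one user, split into browsing sessions.
clicks = [0, 600, 1500, 7200, 7300]
sessions = session_windows(clicks)
```

Time windows have a fixed size chosen in advance, while session windows grow with user activity and close only after a period of silence.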
Out-of-order and late data
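Users with mobile phones board an airplane and lose Internet connectivity
Emails are written during the 10-hour flight
Once Internet connectivity is restored, the phones send the queued emails
The emails arrive hours after they were written, so they are late and out of order relative to events from connected users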
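The airplane scenario can be sketched as a list of events in arrival order, each tagged with its event time. The data and the `find_out_of_order` helper below are made up for illustration. An event counts as out of order when its event time is earlier than one the processor has already seen.

```python
def find_out_of_order(arrivals):
    """arrivals: list of (event_time_h, label) pairs in arrival order.

    Returns the labels of events whose event time is earlier than the
    highest event time seen before them.
    """
    max_seen = float("-inf")
    out_of_order = []
    for event_time, label in arrivals:
        if event_time < max_seen:
            out_of_order.append(label)
        else:
            max_seen = event_time
    return out_of_order


# Hours since takeoff. Emails written at hours 2 and 7 are only sent after
# landing, so they arrive behind an event that happened at hour 9.
arrivals = [
    (1, "ground-user-a"),
    (9, "ground-user-b"),
    (2, "flight-email-1"),
    (7, "flight-email-2"),
    (10, "ground-user-c"),
]
late_labels = find_out_of_order(arrivals)
```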
Stream Processing: results
• Yes, it’s possible to get computation results in real time
• Windows – finite view of infinite data
• Based on temporal characteristics of the event
• Late event processing
• You choose how long to wait
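"How long to wait" can be sketched as a grace period added to each window. The code below is a simplified, hypothetical model, not the behavior of any particular engine. It tracks stream time as the highest event time observed so far and drops an event once stream time has passed the end of its window plus the grace period. The 60-second window and 30-second grace values are arbitrary.

```python
def windowed_counts(arrivals, size_s=60, grace_s=30):
    """arrivals: event times (seconds) in arrival order.

    Returns (counts per window start, list of dropped event times).
    """
    counts = {}
    dropped = []
    stream_time = float("-inf")  # highest event time observed so far
    for ts in arrivals:
        stream_time = max(stream_time, ts)
        window_start = (ts // size_s) * size_s
        window_end = window_start + size_s
        if stream_time >= window_end + grace_s:
            dropped.append(ts)  # window closed: the event came too late
        else:
            counts[window_start] = counts.get(window_start, 0) + 1
    return counts, dropped


arrivals = [5, 50, 70, 40, 100, 20]
counts, dropped = windowed_counts(arrivals)                     # waits 30 s
strict_counts, strict_dropped = windowed_counts(arrivals, grace_s=0)  # no waiting
```

A longer grace period accepts more late events at the cost of keeping windows open, and their results provisional, for longer.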