This document compares several streaming analytics technologies: Apache Storm, Trident (the high-level abstraction built on top of Storm), Apache Flink, and Spark Streaming. It discusses key features that streaming applications need, such as fault tolerance, message processing guarantees, back pressure, and resource utilization. It then gives an overview of each technology, describing its architecture, programming model, support for features like state management, and ability to run on shared clusters. The document concludes with suggestions on how to benchmark Spark Streaming applications.
Flink effectively uses distributed blocking queues with bounded capacity.
A simple watermark mechanism keeps the output side from putting too much data on the wire. Once enough data is in flight, the sender waits until the in-flight amount drops below a threshold before copying more data to the wire. This guarantees that there is never too much data in flight. If new data is not consumed on the receiving side (because no buffer is available), the sender slows down.
http://data-artisans.com/how-flink-handles-backpressure/
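The bounded blocking queue idea described above can be illustrated with a small single-process sketch. This is only an analogy using Python's standard `queue.Queue`, not Flink's actual network stack; the function name `run_pipeline` and its parameters `num_records`, `capacity`, and `consume_delay` are hypothetical names chosen for this illustration. The sender blocks on `put` whenever the buffer is full, so a slow receiver automatically slows the sender down.

```python
import queue
import threading
import time


def run_pipeline(num_records, capacity, consume_delay):
    """Send records through a bounded buffer to a deliberately slow receiver.

    Returns (received records, largest buffer size the sender observed,
    seconds the sender needed to hand over all records).
    """
    buffer = queue.Queue(maxsize=capacity)  # bounded capacity, like a fixed buffer pool
    received = []
    stats = {"max_in_flight": 0, "sender_seconds": 0.0}

    def sender():
        start = time.monotonic()
        for record in range(num_records):
            # put() blocks while the buffer is full: this is the back pressure.
            buffer.put(record)
            stats["max_in_flight"] = max(stats["max_in_flight"], buffer.qsize())
        buffer.put(None)  # sentinel (assumption of this sketch): end of stream
        stats["sender_seconds"] = time.monotonic() - start

    def receiver():
        while True:
            record = buffer.get()
            if record is None:
                break
            time.sleep(consume_delay)  # simulate slow processing
            received.append(record)

    threads = [threading.Thread(target=sender), threading.Thread(target=receiver)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received, stats["max_in_flight"], stats["sender_seconds"]


if __name__ == "__main__":
    records, max_in_flight, sender_seconds = run_pipeline(
        num_records=20, capacity=4, consume_delay=0.005
    )
    print("records received:", len(records))
    print("max records in flight:", max_in_flight)
    print("sender was busy for (s):", round(sender_seconds, 3))
```

In this sketch the number of buffered records can never exceed `capacity`, and the sender's elapsed time grows with `consume_delay`, which mirrors how a slow consumer throttles the producer in the description above.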