Akka Streams (0.7) talk for the Tokyo Scala User Group, hosted by Dwango.
Akka Streams is a Reactive Streams implementation that allows asynchronous, back-pressured processing of data in complex pipelines. This talk highlights the details of how Reactive Streams works, as well as some of the ideas behind Akka Streams.
11. Streams
“You cannot enter the same river twice”
~ Heraclitus
http://en.wikiquote.org/wiki/Heraclitus
12. Streams
Real Time Stream Processing
When you attach “late” to a Publisher, you may miss the initial elements: it’s a river of data.
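The “late attach” effect can be sketched in a few lines of plain Scala. HotSource is a hypothetical stand-in invented for this illustration (it is not a Reactive Streams or Akka Streams type): it pushes each element to whoever is attached at that moment and keeps no history.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical minimal "hot" source: no replay, no buffering.
final class HotSource[A] {
  private val subscribers = ListBuffer.empty[A => Unit]

  // Attach a callback; it only sees elements emitted from now on.
  def attach(onNext: A => Unit): Unit = subscribers += onNext

  // Push one element to everyone attached right now.
  def emit(element: A): Unit = subscribers.foreach(s => s(element))
}

object LateSubscriberDemo {
  // Returns what the early and the late subscriber each received.
  def run(): (List[Int], List[Int]) = {
    val river = new HotSource[Int]
    val early = ListBuffer.empty[Int]
    val late  = ListBuffer.empty[Int]

    river.attach(x => early += x)
    river.emit(1)
    river.emit(2)
    river.attach(x => late += x) // attached "late": 1 and 2 have already flowed past
    river.emit(3)

    (early.toList, late.toList)
  }
}
```

Here the early subscriber receives 1, 2 and 3, while the late one only receives 3.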
21. Reactive Streams - Inter-op
We want different implementations to co-operate with each other.
http://reactive-streams.org
22. Reactive Streams - Inter-op
The different implementations “talk to each other” using the Reactive Streams protocol.
23. Reactive Streams - Inter-op
The Reactive Streams SPI is NOT meant to be a user API.
You should use one of the implementing libraries.
32. Back-pressure? Push + NACK model
What if the buffer overflows?
34. Back-pressure? Push + NACK model (a)
Use a bounded buffer, drop the messages that overflow it, and require re-sending.
The kernel does this! Routers do this! (TCP)
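A minimal sketch of option (a), assuming the sender tags each message with a sequence number. BoundedInbox, Ack and Nack are illustrative names made up for this example, not part of any library: when the buffer is full the message is dropped and a NACK tells the sender what to re-send.

```scala
import scala.collection.mutable

// Hypothetical replies sent back to the producer.
sealed trait Reply
case object Ack extends Reply
final case class Nack(seqNr: Long) extends Reply

// Hypothetical receiver-side buffer with a fixed capacity.
final class BoundedInbox[A](capacity: Int) {
  private val buffer = mutable.Queue.empty[(Long, A)]

  // Accept the element if there is room; otherwise drop it and ask for a re-send.
  def offer(seqNr: Long, element: A): Reply =
    if (buffer.size < capacity) {
      buffer += ((seqNr, element))
      Ack
    } else Nack(seqNr)

  // The consumer takes the oldest buffered element, freeing one slot.
  def poll(): Option[(Long, A)] =
    if (buffer.isEmpty) None else Some(buffer.dequeue())
}
```

With a capacity of 2, a third offer is answered with Nack until the consumer polls an element; only then does the re-sent message fit. The cost of this model is the wasted work of sending messages that get dropped.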
35. Back-pressure? Push + NACK model (b)
Increase the buffer size… well, as long as you have memory available!
45. Back-pressure? RS: Dynamic Push/Pull
Just push: not safe when the Subscriber is slow.
Just pull: too slow when the Subscriber is fast.
Solution: dynamic adjustment (Reactive Streams).
46. Back-pressure? RS: Dynamic Push/Pull
The slow Subscriber sees that its buffer can take 3 elements. The Publisher will never send enough data to overflow that buffer.
47. Back-pressure? RS: Dynamic Push/Pull
The fast Publisher will send at most 3 elements. This is pull-based back-pressure.
48. Back-pressure? RS: Dynamic Push/Pull
A fast Subscriber can issue more Request(n) signals before more data arrives!
49. Back-pressure? RS: Dynamic Push/Pull
A fast Subscriber can issue more Request(n) signals before more data arrives. The Publisher can accumulate demand.
50. Back-pressure? RS: Accumulate demand
The Publisher accumulates the total demand per Subscriber.
51. Back-pressure? RS: Accumulate demand
It is safe to publish as many elements as the total demand.
The Subscriber’s buffer will not overflow.
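The demand bookkeeping described on these slides can be sketched as follows. This is a simplified, single-threaded model with one subscriber; DemandTrackingPublisher and its methods are hypothetical names for illustration and not the Reactive Streams interfaces themselves.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical publisher model: it only sends while demand is outstanding.
final class DemandTrackingPublisher[A](elements: Iterator[A], onNext: A => Unit) {
  private var demand: Long = 0L

  // Request(n) from the subscriber is added to the outstanding demand.
  def request(n: Long): Unit = {
    require(n > 0, "demand must be positive")
    demand += n
  }

  def outstandingDemand: Long = demand

  // Push elements until the demand is used up or the data runs out.
  // Returns how many elements were sent.
  def publishAvailable(): Int = {
    var sent = 0
    while (demand > 0 && elements.hasNext) {
      onNext(elements.next())
      demand -= 1
      sent += 1
    }
    sent
  }
}

object DemandDemo {
  // Returns the received elements and the demand that had accumulated
  // just before the second round of publishing.
  def run(): (List[Int], Long) = {
    val received  = ListBuffer.empty[Int]
    val publisher = new DemandTrackingPublisher[Int](Iterator.range(1, 11), x => received += x)

    publisher.publishAvailable() // no demand yet: nothing is sent
    publisher.request(3)         // the subscriber's buffer can take 3
    publisher.publishAvailable() // sends 1, 2, 3
    publisher.request(2)
    publisher.request(2)         // two requests accumulate before any data moves
    val pending = publisher.outstandingDemand
    publisher.publishAvailable() // sends 4, 5, 6, 7
    (received.toList, pending)
  }
}
```

The publisher holds ten elements but only ever sends what was requested, so the subscriber decides how fast data flows.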
53. Back-pressure? RS: Requesting “a lot”
A fast Subscriber can request “a lot” from the Publisher.
This is effectively “publisher push”, and it is really fast.
The buffer size is known, so this is safe.
58. Akka = アッカ
Akka is a high-performance concurrency library for Scala and Java.
At its core it focuses on the Actor Model. An Actor can:
• Send / receive messages
• Create Actors
• Change its behaviour
65. Akka Streams – Linear Flow
FlowFrom[Double].map(_.toInt). [...]
No Source attached yet.
“Pipe ready to work with Doubles”.
66. Akka Streams – Linear Flow
implicit val sys = ActorSystem("tokyo-sys")
The ActorSystem is the world in which Actors live.
Akka Streams uses Actors, so it needs an ActorSystem.
67. Akka Streams – Linear Flow
implicit val sys = ActorSystem("tokyo-sys")
implicit val mat = FlowMaterializer()
The materializer contains the logic on HOW to materialise the stream.
68. Akka Streams – Linear Flow
implicit val sys = ActorSystem("tokyo-sys")
implicit val mat = FlowMaterializer()
A materialiser can choose HOW to materialise: simply with Actors, or it could even use Apache Spark (?!) if someone implemented that… :-)
69. Akka Streams – Linear Flow
implicit val sys = ActorSystem("tokyo-sys")
implicit val mat = FlowMaterializer()
You can configure its buffer sizes etc.
70. Akka Streams – Linear Flow
implicit val sys = ActorSystem("tokyo-sys")
implicit val mat = FlowMaterializer()
val foreachSink = ForeachSink[Int](println)
val mf = FlowFrom(1 to 3).withSink(foreachSink).run()
run() uses the implicit FlowMaterializer.
71. Akka Streams – Linear Flow
implicit val sys = ActorSystem("tokyo-sys")
implicit val mat = FlowMaterializer()
val foreachSink = ForeachSink[Int](println)
// the materializer can also be passed explicitly
val mf = FlowFrom(1 to 3).withSink(foreachSink).run()(mat)
72. Akka Streams – Linear Flow
val mf = FlowFrom[Int].
  map(_ * 2).
  withSink(ForeachSink(println)) // needs a Source, can NOT run yet
A Source is required in order to run!
73. Akka Streams – Linear Flow
val f = FlowFrom[Int].
  map(_ * 2).
  withSink(ForeachSink(i => println(s"i = $i")))
// needs a Source to run
Input is needed: a Source is required in order to run!
76. Akka Streams – Linear Flow
val f = FlowFrom[Int].
  map(_ * 2).
  withSink(ForeachSink(i => println(s"i = $i")))
// needs a Source to run

f.withSource(IterableSource(1 to 10)).run()
Ready to run!
78. Akka Streams – Flows are reusable
f.withSource(IterableSource(1 to 10)).run()
f.withSource(IterableSource(1 to 100)).run()
f.withSource(IterableSource(1 to 1000)).run()
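Why a flow can be run several times is easiest to see with a toy model. The sketch below does not use Akka Streams at all. Blueprint is a hypothetical stand-in showing only the idea that an immutable description holds no per-run state, so each run() starts fresh with its own source and sink.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical stand-in for a flow description: just a transformation.
final case class Blueprint[A, B](transform: A => B) {
  // Adding a step returns a new description; the old one is unchanged.
  def map[C](f: B => C): Blueprint[A, C] = Blueprint[A, C](transform.andThen(f))

  // "Running" attaches a source and a sink and pushes every element through.
  def run(source: Iterable[A])(sink: B => Unit): Unit =
    source.foreach(a => sink(transform(a)))
}

object Blueprint {
  // Start from an identity description, similar in spirit to FlowFrom[Int].
  def from[A]: Blueprint[A, A] = Blueprint[A, A](a => a)
}

object ReuseDemo {
  // Runs the same description against two different sources.
  def run(): (List[Int], List[Int]) = {
    val doubled = Blueprint.from[Int].map(_ * 2)
    val first   = ListBuffer.empty[Int]
    val second  = ListBuffer.empty[Int]

    doubled.run(1 to 3)(x => first += x)
    doubled.run(1 to 5)(x => second += x)

    (first.toList, second.toList)
  }
}
```

The two runs do not affect each other because all state (the buffers here) belongs to the run, not to the description.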
80. Akka Streams <-> Actors – Advanced
val subscriber = system.actorOf(Props[SubStreamParent], "parent")

FlowFrom(1 to 100).
  map(_.toString).
  filter(_.length == 2).
  drop(2).
  groupBy(_.last).
  publishTo(ActorSubscriber(subscriber))
Each “group” is a stream too! It’s a “Stream of Streams”.
81. Akka Streams <-> Actors – Advanced
groupBy(_.last).
groupBy puts “11” into group “1”, “12” into group “2”, etc.
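The grouping rule itself can be tried out on ordinary Scala collections. This is only an eager analogue of the streaming groupBy (each group is a finished Seq here, not a sub-stream), and GroupByDemo is a name made up for the example.

```scala
object GroupByDemo {
  // Same steps as the slide's pipeline, applied to an in-memory range.
  val groups: Map[Char, Seq[String]] =
    (1 to 100)
      .map(_.toString)       // "1", "2", ..., "100"
      .filter(_.length == 2) // keeps "10" .. "99"
      .drop(2)               // drops "10" and "11"
      .groupBy(_.last)       // key = last character of each string
}
```

Note that the key is a Char, because `last` on a String returns its final character. In this particular pipeline drop(2) removes "10" and "11", so the first element that reaches the grouping step is "12", which lands in group '2'.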
82. Akka Streams <-> Actors – Advanced
groupBy(_.last).
It offers (groupKey, subStreamFlow) to the Subscriber.
83. Akka Streams <-> Actors – Advanced
groupBy(_.last).
The Subscriber can then start children to handle the sub-flows!
84. Akka Streams <-> Actors – Advanced
groupBy(_.last).
For example, one child for each group.
85. Akka Streams <-> Actors – Advanced
val subscriber = system.actorOf(Props[SubStreamParent], "parent")

FlowFrom(1 to 100).
  map(_.toString).
  filter(_.length == 2).
  drop(2).
  groupBy(_.last).
  publishTo(ActorSubscriber(subscriber))
A regular Akka Actor will consume the SubStream offers.
86. Akka Streams <-> Actors – Advanced
class SubStreamParent extends ActorSubscriber
  with ImplicitFlowMaterializer
  with ActorLogging {

  // ask the upstream for one element at a time
  override def requestStrategy = OneByOneRequestStrategy

  override def receive = {
    // groupBy(_.last) on Strings yields Char keys, so the key is matched as Char
    case OnNext((groupId: Char, subStream: FlowWithSource[_, _])) =>
      // start one child per group and let it consume that group's sub-stream
      val subSub = context.actorOf(Props[SubStreamSubscriber],
        s"sub-$groupId")
      subStream.publishTo(ActorSubscriber(subSub))
  }
}
93. Akka Streams – GraphFlow
Linear Flows or non-Akka pipelines.
Could be another RS implementation!
95. Akka Streams – GraphFlow
Fan-out elements and Fan-in elements.
Now you need a FlowGraph.
96. Akka Streams – GraphFlow
// first define some pipeline pieces
val f1 = FlowFrom[Input].map(_.toIntermediate)
val f2 = FlowFrom[Intermediate].map(_.enrich)
val f3 = FlowFrom[Enriched].filter(_.isImportant)
val f4 = FlowFrom[Intermediate].mapFuture(_.enrichAsync)

// then add input and output placeholders
val in = SubscriberSource[Input]
val out = PublisherSink[Enriched]
98. Akka Streams – GraphFlow
val b3 = Broadcast[Int]("b3")
val b7 = Broadcast[Int]("b7")
val b11 = Broadcast[Int]("b11")
val m8 = Merge[Int]("m8")
val m9 = Merge[Int]("m9")
val m10 = Merge[Int]("m10")
val m11 = Merge[Int]("m11")
val in3 = IterableSource(List(3))
val in5 = IterableSource(List(5))
val in7 = IterableSource(List(7))
105. Akka Streams – GraphFlow
Sinks and Sources are “keys” which can be addressed within the graph.
val resultFuture2 = FutureSink[Seq[Int]]
val resultFuture9 = FutureSink[Seq[Int]]
val resultFuture10 = FutureSink[Seq[Int]]

val g = FlowGraph { implicit b =>
  // ...
  m10 ~> FlowFrom[Int].grouped(1000) ~> resultFuture10
  // ...
}.run()

Await.result(g.getSinkFor(resultFuture2), 3.seconds).sorted should be(List(5, 7))
107. Akka Streams – GraphFlow
val g = FlowGraph {}
A FlowGraph is immutable, safe to share, and can be re-used as often as you like!
Think of it as “the description” which then gets “run”.