How to Make Norikra Perfect

Stream Processing Casual Talks #1 #streamctjp
Jul 22, 2016
Satoshi Tagomori (@tagomoris)

Satoshi "Moris" Tagomori (@tagomoris)
Fluentd, MessagePack-Ruby, Norikra, ...
Treasure Data, Inc.

1. How Norikra is perfect
2. How to make Norikra more perfect

Norikra: Schema-less Stream Processing using SQL
• Server software, written in JRuby, runs on JVM
• Open source software (GPLv2)
• http://norikra.github.io/
• https://github.com/norikra/norikra

Query:
SELECT user.age, COUNT(*) AS cnt
FROM events.win:time_batch(5 mins)
WHERE current="San Diego"
  AND attend.$0 AND attend.$1
GROUP BY user.age
Input event:
{"name":"tagomoris",
 "user":{"age":35, "corp":"LINE",
         "address":"Tokyo"},
 "current":"San Diego",
 "speaker":true,
 "attend":[true,true,false, ...]
}
Output events:
{"user.age":35,"cnt":5},
{"user.age":36,"cnt":8}, ...
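The query above addresses nested values with dotted names (`user.age`) and array positions (`attend.$0`). As a rough illustration of that naming scheme only, and not of Norikra's actual internals, the following sketch flattens a nested event into such field names. The function name `flatten` is hypothetical, and the sample `attend` array is shortened to three entries.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts/lists into one level of field names.

    Dict keys are joined with "."; list positions become "$<index>",
    mirroring the `user.age` / `attend.$0` names used in the query.
    """
    flat = {}
    if isinstance(value, dict):
        items = [(str(k), v) for k, v in value.items()]
    elif isinstance(value, list):
        items = [(f"${i}", v) for i, v in enumerate(value)]
    else:
        # Leaf value: store it under the name built so far.
        flat[prefix] = value
        return flat
    for key, child in items:
        name = f"{prefix}.{key}" if prefix else key
        flat.update(flatten(child, name))
    return flat


# Sample event from the slide; "attend" is cut to three entries here.
event = {
    "name": "tagomoris",
    "user": {"age": 35, "corp": "LINE", "address": "Tokyo"},
    "current": "San Diego",
    "speaker": True,
    "attend": [True, True, False],
}

fields = flatten(event)

# The WHERE clause of the query, evaluated against the flattened names.
matches = bool(
    fields["current"] == "San Diego"
    and fields["attend.$0"]
    and fields["attend.$1"]
)
```

With flattened names, the WHERE clause and the GROUP BY key become plain dictionary lookups.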

How Norikra is Perfect
• Ultra fast bootstrap
• Schema on read
• Handling complex (nested) events
• Dynamic query registration/unregistration
• Simple Web UI
• Data connector: Fluentd
• Extensible: UDF/Listener plugins
• Performance: good enough for small and mid-size sites

Schema on Read
• Query first, Data next
• Query must know what it requires
• field names, types of fields, ...
• The platform can ingest any data into the processor.
Each query fetches only the events that match its required
schema.
[Diagram: events from the billing service and events from the API endpoint flow into one schema-less (mixed) data stream; query A and query B each read their own subset of fields.]
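A minimal sketch of the "query first, data next" idea, assuming each query declares the field names and types it requires. The class `SchemaOnReadRouter`, its methods, and the sample events are hypothetical and are not Norikra's API.

```python
from collections import defaultdict


class SchemaOnReadRouter:
    """Deliver each event only to queries whose required fields it satisfies."""

    def __init__(self):
        self.required = {}                  # query name -> {field: type}
        self.delivered = defaultdict(list)  # query name -> field subsets

    def register(self, name, required_fields):
        # The query is registered first and states what it requires.
        self.required[name] = dict(required_fields)

    def unregister(self, name):
        self.required.pop(name, None)

    def ingest(self, event):
        # Any event is accepted; no schema is declared for the stream.
        for name, fields in self.required.items():
            if all(isinstance(event.get(f), t) for f, t in fields.items()):
                # Pass on only the subset of fields the query asked for.
                self.delivered[name].append({f: event[f] for f in fields})


router = SchemaOnReadRouter()
router.register("query_a", {"user_id": int, "amount": int})
router.register("query_b", {"path": str, "status": int})

# Mixed stream: one billing-like event, one API-like event.
router.ingest({"service": "billing", "user_id": 1, "amount": 980})
router.ingest({"service": "api", "path": "/v1/items", "status": 200})
```

Events that lack a required field, or carry it with another type, are simply not seen by that query.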

Architecture
[Diagram: Norikra Server (on JVM): Norikra Engine, Esper Instance (Query Engine), Type Definition Manager, Output Event Pool; RPC Server: mizuno (Jetty + Rack), Rack RPC Handler. Norikra Client: connects via msgpack-rpc-over-http.]

For details :)
• Norikra: Stream Processing with SQL 
http://www.slideshare.net/tagomoris/norikra-stream-processing-with-sql
• Norikra: SQL Stream Processing in Ruby 
http://www.slideshare.net/tagomoris/norikra-sql-stream-processing-in-ruby
• Norikra in Action 
http://www.slideshare.net/tagomoris/norikra-in-action-ver-2014-spring
• Landscape of Norikra Features 
http://www.slideshare.net/tagomoris/norikra-meetup-features
• Norikra Recent Updates 
http://www.slideshare.net/tagomoris/norikra-recent-updates

Recent Updates
• v1.4.0: Jul 19, 2016
• Add support for "-D" and "-agentlib" of JVM
• Update msgpack version
• Previous release v1.3.1: May 7, 2015
• Explained in "Norikra Recent Updates" slide

Good & Bad
• Good for startups:
fast bootstrap, SQL, Web UI, Fluentd plugins,
handling complex events, ...
• Good for mid-size sites:
dynamic query registration, dynamic UDF loading,
performance good enough for mid-size sites (10k events/sec),
schema on read, ...
• Bad for big players:
no distribution, no high availability,
uncontrollable JVM/Esper behavior (CPU & memory)

Tentative name:
Perfect Norikra

Perfect Norikra
• All features of Norikra
• Including "Ultra fast bootstrap"
• Compatible RPC API w/ original Norikra
• Distributed execution on any scheduler
• YARN? Mesos? or ...?
• Automatic failover & retry for failures (HA)
• Automated optimization for load balancing
• Dynamic scaling out 
from 1 to 100 nodes - without any restarts/retries

Rough Sketch
[Diagram: a master node and processor nodes, with these components: RPC Server, RPC Handler, Type Definition Manager, Query Compiler, DAG Optimizer / Deoptimizer, DAG Executor, Event Router, Event Buffer. Inputs: Queries, Events.]

Rough Sketch
• Brand new query executor
• SQL Parser
• Query compiler into DAG
• SQL operators as sub-DAGs (inspired by TimeStream)
• DAG executor
• Brand new dataflow manager / nodes
• Sync/Async data replication
• Barriers for event stream (inspired by Flink)
• Versioned routing/distribution
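The talk does not say how barriers would be used in this design. As a generic illustration of stream barriers in the Flink style, the sketch below aligns a barrier across two input channels: events arriving behind a barrier are held back until every channel has delivered that barrier. `BarrierAligner` and `BARRIER` are hypothetical names.

```python
BARRIER = object()  # marker injected into every input channel


class BarrierAligner:
    """Align one barrier across several input channels of an operator."""

    def __init__(self, channels):
        self.channels = set(channels)
        self.arrived = set()  # channels whose barrier has arrived
        self.held = []        # events held back behind the barrier
        self.output = []      # items processed by the operator, in order

    def receive(self, channel, item):
        if item is BARRIER:
            self.arrived.add(channel)
            if self.arrived == self.channels:
                # All inputs reached the barrier: a consistent cut.
                # A snapshot or a routing change could happen here.
                self.output.append(BARRIER)
                self.output.extend(self.held)
                self.held.clear()
                self.arrived.clear()
        elif channel in self.arrived:
            # This channel is already past the barrier: wait for the others.
            self.held.append(item)
        else:
            self.output.append(item)


aligner = BarrierAligner(["a", "b"])
for channel, item in [
    ("a", "a1"), ("a", BARRIER), ("a", "a2"),
    ("b", "b1"), ("b", BARRIER), ("b", "b2"),
]:
    aligner.receive(channel, item)
```

In this run, `a2` is held until channel `b` delivers its barrier, so everything before the barrier is processed before anything after it.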

Dynamic Scaling Out
• Processing nodes are stateful
• state: limited by available memory size
• growing stream size -> memory overflow :-(
• Scaling strategy must be dynamic
• restarting queries (of static scaling) increases latency

Query: COUNT(DISTINCT uid) per 1 day
[Chart: memory usage per node, 7/1 to 7/4, 3 nodes for each daily window]

[Chart: memory usage per node, 7/1 to 7/4, 3 nodes for each daily window. Annotations: "Burst Traffic - failure", "memory overflow - CRASH!"]

Crash & Recovery Strategy (1)
[Chart: 7/1 to 7/4; node counts 3, 3, 6, 6; marked "Crash" and "Recovery"]
• After crash, restart the query w/ increased # of nodes
• After restart, the query re-reads all data of that window
• After recovery, all nodes are back to realtime calculation

Crash & Recovery Strategy (2)
[Chart: 7/1 to 7/4; node counts 3, 3, 6, 6; marked "Crash" and "Recovery"]
• Pros: Very easy to implement
• Cons: Requires all data stored (distributed filesystem?)
• Cons: Hard to know # of nodes for increasing traffic
• Cons: Recovery state requires more nodes than normal state

Dynamic Scaling Out Strategy (1)
[Chart: 7/1 to 7/4; node-count labels 3, 5, 5, 6 and 3; each group of nodes yields an "intermediate result"; "merge results for final result"]
• Before crash, increase # of processing nodes
• Queries always produce intermediate results, with the # of distribution (node count)
• Query results should be produced by merging intermediate results
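A minimal sketch of merging intermediate results for the running example, COUNT(DISTINCT uid), when the node count changes mid-window. It assumes each node keeps the exact set of uids it has seen; the `uid % num_nodes` routing, the function names, and the sample data are illustrative assumptions.

```python
def partial_distinct(events):
    """Intermediate result of one node: the set of uids it has seen."""
    return {e["uid"] for e in events}


def merge_distinct(partials):
    """Final result: size of the union of all intermediate sets."""
    merged = set()
    for p in partials:
        merged |= p
    return len(merged)


def run_phase(events, num_nodes):
    """Distribute events over num_nodes and return one partial per node."""
    buckets = [[] for _ in range(num_nodes)]
    for e in events:
        buckets[e["uid"] % num_nodes].append(e)
    return [partial_distinct(b) for b in buckets]


# Same window, two phases: 3 nodes first, then scaled out to 5 nodes.
first_half = [{"uid": u} for u in (1, 2, 3, 4, 1, 2)]
second_half = [{"uid": u} for u in (3, 4, 5, 6, 7, 5)]

partials = run_phase(first_half, 3) + run_phase(second_half, 5)

total = merge_distinct(partials)        # union of all partial sets
naive = sum(len(p) for p in partials)   # adding partial counts instead
```

Here `total` is 7 while `naive` is 9: uids 3 and 4 appear in both phases, so partial counts cannot simply be added. The intermediate result has to be a mergeable state (a set here), which is the requirement every operator would have to meet.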

Dynamic Scaling Out Strategy (2)
[Chart: 7/1 to 7/4; node-count labels 3, 5, 5, 6 and 3; each group of nodes yields an "intermediate result"; "merge results for final result"]
• Pros: Less latency, less computing power
• Cons: All operators must support such calculation
- SQL!

For Dynamic Scaling Out
• De-optimization of operators
• Virtual nodes for routing
• ... and many others
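"Virtual nodes for routing" is not detailed in the talk. One common reading, sketched below under that assumption, is to hash keys onto a fixed number of virtual nodes and map virtual nodes to physical nodes, so that scaling out only reassigns some virtual nodes. `NUM_VNODES`, the function names, and the donor-selection rule are hypothetical.

```python
import zlib
from collections import Counter

NUM_VNODES = 12  # hypothetical; fixed while the query is running


def vnode_of(key):
    """Stable key -> virtual node mapping; independent of cluster size."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VNODES


def initial_table(num_nodes):
    """Routing table: index = virtual node, value = physical node."""
    return [v % num_nodes for v in range(NUM_VNODES)]


def scale_out(table, num_nodes):
    """Return a new table for num_nodes nodes, moving only the vnodes
    needed to give each new node its share."""
    new_table = list(table)
    old_nodes = max(table) + 1
    share = NUM_VNODES // num_nodes
    for new_node in range(old_nodes, num_nodes):
        for _ in range(share):
            counts = Counter(new_table)
            # Take one vnode from the currently most loaded old node.
            donor = max(range(old_nodes), key=lambda n: counts[n])
            new_table[new_table.index(donor)] = new_node
    return new_table


def route(key, table):
    return table[vnode_of(key)]


table_v1 = initial_table(3)        # 3 processing nodes
table_v2 = scale_out(table_v1, 6)  # scaled out to 6 nodes

moved = sum(1 for a, b in zip(table_v1, table_v2) if a != b)
```

In this example 6 of the 12 virtual nodes change owner and every node ends up with 2. Keys on the other virtual nodes keep their node, so their state does not have to move.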

Hard things
• Resource monitoring & limitation
• Multi-tenancy
• UDF and sandbox
• Queries without aggregations

Why not on Spark or Flink?
• Because of schema-less event processing
- it requires dataflow controlled by query manager
• Because of dynamic scaling
- it requires brand new dataflow layer

No Bytes Implemented :P
Stay Tuned!
We are hiring! by Treasure Data
