2. Who are you?
Ritta Narita
(github:@naritta)
The University of Tokyo, Engineering M2
Researched physics simulation
I’ve worked at several companies.
5. What’s Random Forest?
Build many decision trees and take a majority vote.
To get the result of a decision tree, the comparisons bound to its features have to be evaluated.
[Decision-tree diagram (play golf or not): weather = sunny?, humidity > 30%?, wind speed > 10 m/s?, with leaves “play golf” and “don’t play”]
6. Generate JS code from the tree model
→ execute it using eval
At present, to calculate a decision tree:
if (x[0] == 0) {
  if (x[1] > 30) {
    return 1;
  }
  ・・・
} else {
  return 1;
}
x = [weather, humidity, wind]
0 = play golf, 1 = don’t play
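The generation step above can be sketched like this. This is a minimal illustration of my own: the nested-hash tree layout and the exact shape of the emitted JS are assumptions, not Hivemall's actual generator.

```ruby
# Walk a decision tree and emit the kind of JS source shown on the slide,
# which would then be handed to eval.
# (Tree representation and output format are illustrative assumptions.)
def tree_to_js(node)
  return "return #{node[:label]};" unless node[:feature]  # leaf → class label
  <<~JS
    if (x[#{node[:feature]}] #{node[:op]} #{node[:value]}) {
    #{tree_to_js(node[:yes])}
    } else {
    #{tree_to_js(node[:no])}
    }
  JS
end

# x = [weather, humidity, wind]; 0 = play golf, 1 = don't play
tree = { feature: 0, op: '==', value: 0,            # weather == sunny?
         yes: { feature: 1, op: '>', value: 30,     # humidity > 30%?
                yes: { label: 1 }, no: { label: 0 } },
         no:  { label: 1 } }
puts tree_to_js(tree)
```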
7. Problem with JS
Because eval is used, any code can be executed.
For example, hostile JS code such as an infinite loop
→ a burden for TD.
It’s difficult to restrict JS code
→ a restricted environment is needed to calculate decision trees.
8. Then
Generate original op codes from the tree model
→ execute them on an original VM.
PUT x[1]
PUT 0
IFEQ 10
・
・
・
instead of the JS code:
if (x[0] == 0) {
  if (x[1] > 30) {
    return 1;
  }
  ・・・
} else {
  return 1;
}
x = [weather, humidity, wind]
0 = play golf, 1 = don’t play
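The compilation step can be sketched as a tree walk that emits a flat op-code list like the PUT / IFEQ fragment above. Everything here beyond the PUT/IFEQ names (the :ifgt and :ret codes, the encoding as arrays) is my own guess for illustration, not Hivemall's real format.

```ruby
# Emit op codes for a decision tree: push the feature and the constant,
# then a conditional jump whose target is patched in once the "yes"
# branch's address is known. (Instruction set is an assumption.)
def compile_node(node, code)
  if node[:label]                       # leaf → return the class label
    code << [:ret, node[:label]]
    return
  end
  code << [:put_var, node[:feature]]    # PUT x[i]
  code << [:put, node[:value]]          # PUT constant
  jump = code.length
  code << [node[:op], nil]              # IFEQ/IFGT addr, patched below
  compile_node(node[:no], code)         # fall-through: "no" branch
  code[jump][1] = code.length           # patch jump target: "yes" branch
  compile_node(node[:yes], code)
end

# x = [weather, humidity, wind]; 0 = play golf, 1 = don't play
tree = { feature: 0, op: :ifeq, value: 0,
         yes: { feature: 1, op: :ifgt, value: 30,
                yes: { label: 1 }, no: { label: 0 } },
         no:  { label: 1 } }
code = []
compile_node(tree, code)
code.each_with_index { |(op, arg), i| puts "#{i}: #{op.to_s.upcase} #{arg}" }
```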
9. What’s the merit?
・illegal code such as infinite loops can be detected easily
・the op codes cover only comparators, so the environment is very restricted
・fewer op codes, so it’s very fast
10. My work
Op codes tailored to comparators:
only PUSH, POP, GOTO, and IF~ instructions.
Infinite loops can be detected:
this code is assumed to contain no loops,
so the same instruction must never execute twice.
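The loop-detection idea above can be sketched as follows. The opcode names beyond PUT/IFEQ and their encoding are my guesses, not Hivemall's real instruction set; the point is the rule that each instruction address may execute at most once, so revisiting an address means an infinite loop.

```ruby
# Minimal stack VM for decision-tree op codes with infinite-loop detection.
def run_vm(code, x)
  stack, pc, seen = [], 0, {}
  while pc < code.length
    raise "infinite loop detected at address #{pc}" if seen[pc]
    seen[pc] = true                 # valid tree code never revisits an address
    op, arg = code[pc]
    case op
    when :put_var then stack.push(x[arg])   # PUT x[arg]
    when :put     then stack.push(arg)      # PUT constant
    when :ifeq                              # jump if the two top values are equal
      pc = arg - 1 if stack.pop == stack.pop
    when :ifgt                              # jump if second-from-top > top
      b, a = stack.pop, stack.pop
      pc = arg - 1 if a > b
    when :goto    then pc = arg - 1
    when :ret     then return arg           # class label
    end
    pc += 1
  end
end

# x = [weather, humidity, wind]; 0 = play golf, 1 = don't play
code = [
  [:put_var, 0], [:put, 0],  [:ifeq, 4],   # weather == sunny? → 4
  [:ret, 1],                                # not sunny → don't play
  [:put_var, 1], [:put, 30], [:ifgt, 8],   # humidity > 30%? → 8
  [:ret, 0],                                # play golf
  [:ret, 1],                                # don't play
]
puts run_vm(code, [0, 40, 5])   # → 1 (sunny and humid: don't play)
puts run_vm(code, [0, 20, 5])   # → 0 (sunny and dry: play golf)
```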
11. Benchmark (Random Forest)
Hadoop 2.6, Hive 1.2.0 (Tez 0.6.1)
Hadoop cluster: c3.2xlarge × 8 nodes
Number of test examples in test_rf: 18083
Number of trees: 500
Compile count: 500
Eval count: 500 × 18083
JavaScript (Nashorn): 1062.04 s
VM: 106.84 s
→ compared with JS, 10 times faster
13. Because of the amount of class loading
For example, if every client makes 500 models
↓
too many classes get loaded.
Using one class with 500 methods is just the same.
14. Summary
・very restricted environment; illegal code can be detected
・10 times faster than JS
・future prospects: could be made even faster with binary code
・merged into the development branch and will be released in v0.4
17. Multiprocess at present
Uses the in_multiprocess plugin:
multiple sockets have to be used, and the user has to assign a port to each worker.
[Diagram: supervisor with three workers on ports 24224, 24225, and 24226]
21. Multicore power can be used fully without the user being aware of it,
and the setting file gets very simple.

With the in_multiprocess plugin:
<source>
  type multiprocess
  <process>
    cmdline -c /etc/td-agent/td-agent-child1.conf
  </process>
  <process>
    cmdline -c /etc/td-agent/td-agent-child2.conf
  </process>
</source>

#/etc/td-agent/td-agent-child1.conf
<source>
  type forward
  port 24224
</source>

#/etc/td-agent/td-agent-child2.conf
<source>
  type forward
  port 24225
</source>

With SocketManager:
<source>
  type forward
  port 24224
</source>

(settings when using 2 cores)
22. To implement the Socket Manager, I used ServerEngine.
ServerEngine is a framework for implementing robust multiprocess servers, like Unicorn.
[Diagram: the supervisor uses ServerEngine to manage workers: live restart, heartbeat via pipe, auto restart]
25. Main differences
Unix: very simple. Windows: a little complex.
1. Sockets can’t be shared by FD:
on Windows a socket descriptor is not a file descriptor,
so sharing an FD doesn’t make sense
(the Winsock2 API has to be used to share sockets).
2. accept has to be locked:
on Unix the thundering herd doesn’t have to be considered,
but on Windows it does.
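The "very simple" Unix side can be sketched like this: the supervisor opens the listening socket and passes it to a worker process over a Unix socketpair. This is my own minimal illustration using Ruby's send_io/recv_io, not the ServerEngine code.

```ruby
require 'socket'

# Share a listening socket by FD across fork, Unix-style:
# the parent owns the listener and hands it to the child over a socketpair.
parent_ch, child_ch = UNIXSocket.pair

listener = TCPServer.new('127.0.0.1', 0)   # "supervisor" opens the listener
port = listener.addr[1]

pid = fork do
  parent_ch.close
  shared = child_ch.recv_io(TCPServer)     # "worker" receives the same socket
  conn = shared.accept                     # and accepts on it
  conn.write("hello from worker")
  conn.close
  exit!
end

child_ch.close
parent_ch.send_io(listener)                # pass the listener's FD to the worker
msg = TCPSocket.open('127.0.0.1', port) { |s| s.read }
puts msg                                   # → hello from worker
Process.wait(pid)
```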
26. Implementation (Windows)
The SocketManager server (in the supervisor, via ServerEngine) talks to a SocketManager client in each worker over DRb.
Server side:
create a socket from the port and bind it (WSASocket)
↓
duplicate an exclusive socket for the worker’s pid (WSADuplicateSocket)
↓
get the socket protocol info (WSAProtocolInfo)
Worker side:
from the WSAProtocolInfo, create a WSASocket
↓
convert the handle into an FD
↓
IO.for_fd(FD)
↓
hand this IO to Cool.io
[Diagram: supervisor/ServerEngine hosting the SocketManager server, three workers each with a SocketManager client]
27. accept mutex
Each worker repeats the cycle on the shared server socket:
get the mutex
↓
attach the listening socket to the cool.io loop
↓
accept
↓
detach the listening socket
↓
release the mutex
↓
read the data and send it to the buffer/output.
Post-processing is dealt with in the same process as it is,
so another process can listen
while this process is dealing with the data.
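The cycle above can be sketched as follows. This is a simplified illustration using threads and a Ruby Mutex instead of cross-process locking and the cool.io loop: the lock is held only around accept, so another worker can listen while this one handles the data.

```ruby
require 'socket'

listener = TCPServer.new('127.0.0.1', 0)
port = listener.addr[1]
accept_mutex = Mutex.new
results = Queue.new

workers = 2.times.map do |i|
  Thread.new do
    # "get mutex → accept → release mutex"
    conn = accept_mutex.synchronize { listener.accept }
    data = conn.read                 # post-processing happens outside the lock
    results << "worker#{i} got #{data}"
    conn.close
  end
end

2.times { |n| TCPSocket.open('127.0.0.1', port) { |s| s.write("msg#{n}") } }
workers.each(&:join)
got = 2.times.map { results.pop }.sort
puts got
```

Which worker handles which message is nondeterministic; the point is that both connections are served while only one worker at a time holds the accept lock.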
29. As a result of testing,
the thundering herd does not occur on Windows.
For now I implemented it roughly with a mutex,
but in the future I want to use IOCP, like libuv.
Patches from Windows specialists are welcome!
30. Benchmark result (Unix)
AWS Ubuntu 14.04, m4.xlarge, in_http → out_forward

                         RPS             IO
conventional model       6798.69/sec     1361.07 kb/s
new model (4 workers)    13743.02/sec    2751.29 kb/s
31. Benchmark result (Windows)
AWS Microsoft Windows Server 2012 R2, m4.xlarge, in_http → out_forward

                         RPS             IO
conventional model       1834.01/sec     385.07 kb/s
new model (4 workers)    3513.31/sec     737.66 kb/s
32. Future work
・buffering in multiprocess mode
・an IOCP-based accept mutex, etc.
Summary
・Implemented the fluentd Socket Manager with ServerEngine;
fluentd gets faster without the user being aware of it.
・Details are in the ServerEngine issue;
you can test my forked branches (fluentd and ServerEngine),
and I’ll send a PR after this report.
36. Because of a memory problem
When the Random Forest model is big and many customers use it,
memory consumption becomes too high.
37. To implement the Socket Manager, I used ServerEngine.
ServerEngine is a framework for implementing robust multiprocess servers, like Unicorn.
38. How to use the Socket Manager on the fluentd side

# get a socket manager
socket_manager = ServerEngine::SocketManager.new_socket_manager

# get an FD from the socket manager
fd = socket_manager.get_tcp(bind, port)

# create a listening socket from the FD
lsock = TCPServer.for_fd(fd.to_i)

The fluentd side doesn’t have to consider socket sharing;
ServerEngine deals with it internally.
40. Benchmark result
I’ll add the multiprocess buffering function,
and after that I’ll do the benchmark formally.
For now, here is the rough result.