2. Introducing myself
@higepon
Mona OS
http://www.monaos.org
Mosh
A fast Scheme interpreter
Outputz
http://outputz.com
Feb 26 2010 Mio - a Skip Graph based ordered KVS 2
3. Summary
Mio is...
a distributed ordered KVS
memcached + range search
Skip Graph based
Written in Erlang
http://github.com/higepon/mio
Alpha quality
5. RDBMS vs KVS
[Figure: scalability vs. functionality]
KVS: scalability; set/get, volatile
RDBMS: high functionality; transactions, SQL
6. RDBMS vs KVS
[Figure: scalability vs. functionality]
KVS: scalability; set/get, volatile
RDBMS: high functionality; transactions, SQL
They complement each other
7. Mio
[Figure: scalability vs. functionality]
KVS: scalability
RDBMS: high functionality
8. Mio
[Figure: scalability vs. functionality]
KVS: scalability
Mio: KVS + range search
RDBMS: high functionality
9. Mio
[Figure: scalability vs. functionality]
KVS: scalability
Mio: KVS + range search
RDBMS: high functionality
Mio makes the RDBMS workload lighter
10. Range search?
Queries
last 7 days
prev/next
Top 10 ranking
SQL
SELECT * FROM photos WHERE date BETWEEN xxx AND xx ORDER BY date LIMIT 10
RDBMS handles these queries
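The SQL query above is an ordered range scan with a limit. A minimal Python sketch of the same kind of query over keys kept in sorted order follows. The sample data and the `range_search` name are hypothetical and are not Mio's API.

```python
import bisect

# Hypothetical sample data: (date, photo_id) pairs kept sorted by date.
photos = sorted([
    ("2010-02-18", "p1"),
    ("2010-02-20", "p2"),
    ("2010-02-23", "p3"),
    ("2010-02-25", "p4"),
    ("2010-02-26", "p5"),
])
dates = [date for date, _ in photos]

def range_search(lo, hi, limit):
    """Return up to `limit` entries with lo <= date <= hi, in date order."""
    start = bisect.bisect_left(dates, lo)   # first entry with date >= lo
    end = bisect.bisect_right(dates, hi)    # one past the last entry with date <= hi
    return photos[start:end][:limit]

# "Last 7 days" relative to 2010-02-26, at most 10 results.
last_7_days = range_search("2010-02-20", "2010-02-26", 10)
```

A plain set/get store cannot answer this without scanning every key, which is why an ordered structure is needed.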
12. The Challenges and Design Decisions
Range search
Ordered structure
Skip Graphs algorithm
Scale-Out
distributed using Erlang functions
memcached compatible I/F
Volatile
keep it simple
13. Skip Graphs
James Aspnes (2003)
14. Supported operations
search by key
insert (join)
remove
range search by key1 and key2
15. Set of sorted doubly linked lists
Shibuya <-> Shinjuku <-> Tamachi <-> Ueno <-> Yoyogi
Same as railway stations
All keys (stations) form a doubly linked list
Each station knows only its left and right neighbors
Kept sorted by key
Searching for Shibuya starting from Ueno
Go left. O(n)
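The station analogy can be sketched as a sorted doubly linked list. The `Station`, `link`, and `search` names below are hypothetical, chosen only for illustration.

```python
class Station:
    def __init__(self, key):
        self.key = key
        self.left = None    # neighbor with the next smaller key
        self.right = None   # neighbor with the next larger key

def link(keys):
    """Build a sorted doubly linked list; return a dict of key -> Station."""
    stations = [Station(k) for k in sorted(keys)]
    for a, b in zip(stations, stations[1:]):
        a.right, b.left = b, a
    return {s.key: s for s in stations}

def search(start, key):
    """Walk one neighbor at a time toward `key`.

    Returns (station, hops); station is None when the key is absent.
    """
    node, hops = start, 0
    while node.key != key:
        going_left = key < node.key
        nxt = node.left if going_left else node.right
        if nxt is None:
            return None, hops
        # If the next station is already past the key, the key is not in the list.
        if (going_left and nxt.key < key) or (not going_left and nxt.key > key):
            return None, hops
        node, hops = nxt, hops + 1
    return node, hops

line = link(["Shibuya", "Shinjuku", "Tamachi", "Ueno", "Yoyogi"])
found, hops = search(line["Ueno"], "Shibuya")   # three hops to the left
```

Every hop visits exactly one neighbor, so the cost grows linearly with the number of stations.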
16. Make an express lane
Express: Shinjuku, Ueno
Local: Shibuya, Shinjuku, Tamachi, Ueno, Yoyogi
The express lane skips some stations
Ueno -> Shinjuku -> Shibuya
Tamachi is placed on another express lane
17. Multiple lanes
[Figure: stations Shibuya, Shinjuku, Tamachi, Ueno, Yoyogi linked at Levels 0, 1, and 2]
Level 0 lane
all keys are in the list
Level n (n > 0) lane
express lane
The level n + 1 lane skips more stations than the level n lane.
18. Search
[Figure: stations Shibuya, Shinjuku, Tamachi, Ueno, Yoyogi linked at Levels 0, 1, and 2]
Start at the highest level and work down to the lower levels
A search can start from any station
O(log n)
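The multi-level search can be sketched as follows. The lane layout in `LANES` is a hypothetical assignment, chosen so that Shinjuku and Ueno share an express lane and Tamachi sits on another one. The `build` and `search` names are also assumptions for this sketch.

```python
class Station:
    def __init__(self, key, levels):
        self.key = key
        self.left = [None] * levels    # left neighbor per level
        self.right = [None] * levels   # right neighbor per level

# Hypothetical lane assignment: LANES[n] lists the lanes that exist at level n.
LANES = [
    [["Shibuya", "Shinjuku", "Tamachi", "Ueno", "Yoyogi"]],    # level 0: everyone
    [["Shinjuku", "Ueno"], ["Shibuya", "Tamachi", "Yoyogi"]],  # level 1
    [["Shinjuku"], ["Ueno"], ["Tamachi"], ["Shibuya", "Yoyogi"]],  # level 2
]

def build(lanes_per_level):
    """Link every lane as a sorted doubly linked list at its level."""
    levels = len(lanes_per_level)
    stations = {k: Station(k, levels) for lane in lanes_per_level[0] for k in lane}
    for level, lanes in enumerate(lanes_per_level):
        for lane in lanes:
            ordered = sorted(lane)
            for a, b in zip(ordered, ordered[1:]):
                stations[a].right[level] = stations[b]
                stations[b].left[level] = stations[a]
    return stations

def search(start, key):
    """Search from the highest level down; return (station or None, path)."""
    node, path = start, [start.key]
    for level in range(len(start.left) - 1, -1, -1):
        while node.key != key:
            going_left = key < node.key
            nxt = node.left[level] if going_left else node.right[level]
            # Drop to the next lower level at the end of the lane
            # or when the next station would overshoot the key.
            if nxt is None:
                break
            if (going_left and nxt.key < key) or (not going_left and nxt.key > key):
                break
            node = nxt
            path.append(node.key)
    return (node if node.key == key else None), path

stations = build(LANES)
found, path = search(stations["Ueno"], "Shibuya")
```

With this layout the search from Ueno takes the express hop to Shinjuku and then one local hop to Shibuya, instead of three local hops.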
19. Range Search
[Figure: stations Shibuya, Shinjuku, Tamachi, Ueno, Yoyogi linked at Levels 0, 1, and 2]
Search for key1
Collect matching keys on Level 0
e.g. key1 = Ueno, key2 = Shibuya
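A minimal sketch of the two steps, assuming key1 is present in the store. Only Level 0 is modeled here, and a dict lookup stands in for the O(log n) search for key1. The `build_level0` and `range_search` names and the sample values are hypothetical.

```python
class Station:
    def __init__(self, key, value):
        self.key, self.value = key, value
        self.left = None    # Level 0 neighbors only
        self.right = None

def build_level0(items):
    """Link all (key, value) pairs into one sorted Level 0 list."""
    stations = [Station(k, v) for k, v in sorted(items.items())]
    for a, b in zip(stations, stations[1:]):
        a.right, b.left = b, a
    return {s.key: s for s in stations}

def range_search(index, key1, key2, limit=None):
    """Collect (key, value) pairs from key1 toward key2 along Level 0."""
    node = index.get(key1)      # stand-in for the multi-level search step
    descending = key2 < key1    # walk left when key2 is the smaller key
    out = []
    while node is not None and (limit is None or len(out) < limit):
        if (descending and node.key < key2) or (not descending and node.key > key2):
            break               # walked past key2
        out.append((node.key, node.value))
        node = node.left if descending else node.right
    return out

index = build_level0({"Shibuya": 1, "Shinjuku": 2, "Tamachi": 3, "Ueno": 4, "Yoyogi": 5})
result = range_search(index, "Ueno", "Shibuya")
```

Because Level 0 holds every key in sorted order, the collection step only follows neighbor links and never revisits the express lanes.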
20. Remove
[Figure: A <-> B <-> C becomes A <-> C after B is removed]
Remove on each Level
Update neighbors' links
From the highest level down to the lowest
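A single-threaded sketch of the unlinking step, ignoring concurrency entirely. The two-level layout of A, B, and C and the `link` and `remove` names are hypothetical.

```python
class Station:
    def __init__(self, key, levels):
        self.key = key
        self.left = [None] * levels
        self.right = [None] * levels

def link(a, b, level):
    """Make a and b neighbors at `level`; either side may be None."""
    if a is not None:
        a.right[level] = b
    if b is not None:
        b.left[level] = a

def remove(node):
    """Unlink `node` from its list at every level, highest level first."""
    for level in range(len(node.left) - 1, -1, -1):
        left, right = node.left[level], node.right[level]
        link(left, right, level)                      # neighbors now point at each other
        node.left[level] = node.right[level] = None   # node forgets this level

# Hypothetical layout: A, B, and C share a list at both levels.
a, b, c = Station("A", 2), Station("B", 2), Station("C", 2)
for level in (0, 1):
    link(a, b, level)
    link(b, c, level)

remove(b)
```

Working from the top down means the station stays linked on Level 0 until the last step.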
21. Insert
[Figure: A <-> C becomes A <-> B <-> C after B is inserted]
Insert on each Level
Update neighbors' links
From the lowest level up to the highest (the reverse of remove)
Which express lane does a new station go into?
Chosen randomly
Uniform distribution
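A single-threaded sketch of insertion, ignoring concurrency. It assumes the random lane choice is made with one uniform random bit per level (the membership vector in the Skip Graphs paper), so that stations whose first n bits match share a lane at level n. Keys are assumed unique, and the Level 0 position is found with a linear walk for brevity. All names are hypothetical.

```python
import random

LEVELS = 3  # Levels 0..2, as in the figures

class Station:
    def __init__(self, key, rng):
        self.key = key
        # One uniform random bit per express level decides the lane.
        self.bits = [rng.randint(0, 1) for _ in range(LEVELS - 1)]
        self.left = [None] * LEVELS
        self.right = [None] * LEVELS

def same_lane(a, b, level):
    """Stations share a lane at `level` when their first `level` bits match."""
    return a.bits[:level] == b.bits[:level]

def splice(left, new, right, level):
    """Put `new` between `left` and `right` at `level`."""
    new.left[level], new.right[level] = left, right
    if left is not None:
        left.right[level] = new
    if right is not None:
        right.left[level] = new

def nearest(new, level, side):
    """Walk the lane one level down until a station in the same lane appears."""
    node = getattr(new, side)[level - 1]
    while node is not None and not same_lane(new, node, level):
        node = getattr(node, side)[level - 1]
    return node

def insert(new, start):
    """Link `new` into every level, lowest level first."""
    node = start
    while node.key < new.key and node.right[0] is not None:
        node = node.right[0]
    while node.key > new.key and node.left[0] is not None:
        node = node.left[0]
    if node.key < new.key:
        splice(node, new, node.right[0], 0)
    else:
        splice(node.left[0], new, node, 0)
    for level in range(1, LEVELS):
        splice(nearest(new, level, "left"), new, nearest(new, level, "right"), level)

rng = random.Random(42)  # fixed seed only to make the sketch repeatable
keys = ["Ueno", "Shibuya", "Yoyogi", "Tamachi", "Shinjuku"]
stations = [Station(k, rng) for k in keys]
for s in stations[1:]:
    insert(s, stations[0])
```

Going from the lowest level up lets each level reuse the neighbors found one level below.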
22. Easy to implement?
No
Really simple, but ...
We need to support concurrent insert/remove
What if a neighbor is removed during an insert?
What if someone inserts another node next to the neighbor?
Could a search crash?
Fragile linked list
We couldn't find any perfect concurrent join algorithm.
23. Our concurrent algorithm
Lock some nodes
Please read the source code :)
Defined three invariants
[Figure: example lists containing nodes A, B, and C]
25. Written in Erlang
A station (key, value) is a process
gen_server process
Hold left/right on each level
Follow left/right = gen_server:call/2
No distinction between local and remote process
Erlang is great!
Distributed with the -name option
erl -name name@FQDN
26. Performance
5000 qps on single node
really slow on multiple nodes
need less communication between nodes
need better algorithm
28. Tips for practical Erlang
Max process option +P
Set proper value. Don’t use MAX.
garbage_collect()
Fast enough, reduce memory usage.
hibernate is slow...
refactorerl
fprof on gen_server shows nothing
Use dynomite profile
29. Tips for practical Erlang
Common Test
Coverage
load test
gen_server:call is slow
Use mnesia for property access.
Easy replication
Easy to run
Should users run erl with many options?
Shell script borrowed from RabbitMQ
30. Summary, Once more
Mio is...
a distributed ordered KVS
memcached + range search
Skip Graph based
Written in Erlang
http://github.com/higepon/mio
Alpha quality