Distributed Key-Value Store Arakoon in OCaml
1. Arakoon
A distributed consistent key-value store
Romain Slootmaekers Nicolas Trangez
Incubaid BVBA
{romain,nicolas}@incubaid.com
Twitter: @incubaid
Team Blog: blog.incubaid.com
September 14, 2012
Romain Slootmaekers, Nicolas Trangez Arakoon
2. Introduction
Researchers at Incubaid
Incubaid is a technology incubator, active in datacenter &
cloud computing
Prior exits through Terremark, Telenet (Belgian telco),
Veritas/Symantec, Sun Microsystems, ...
Talk about general use of FP in our companies tomorrow at
CUFP
3. Arakoon
Distributed, consistent, persistent key-value store
OCaml using Lwt
Multi-Paxos consensus protocol implementation
Guaranteed consistency across cluster nodes
Available as long as a majority (N/2 + 1, integer division) of the
members is reachable
Handles message loss or duplication, split-brain
networking, ...
TokyoCabinet backend
Open Source (AGPL-3), see http://arakoon.org
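The majority rule on this slide can be sketched in a few lines of OCaml. This is only an illustration of the arithmetic, assuming integer division; the function names `quorum` and `tolerated_failures` are made up for this example and are not taken from the Arakoon sources.

```ocaml
(* Smallest number of nodes that forms a majority in a cluster of n nodes.
   Hypothetical helper, not part of Arakoon. *)
let quorum n =
  if n <= 0 then invalid_arg "quorum: cluster size must be positive";
  n / 2 + 1

(* Number of unreachable nodes a cluster of n nodes can tolerate
   while a majority is still reachable. *)
let tolerated_failures n = n - quorum n

let () =
  List.iter
    (fun n ->
      Printf.printf "nodes=%d quorum=%d tolerated_failures=%d\n"
        n (quorum n) (tolerated_failures n))
    [1; 3; 4; 5]
```

Note that going from 3 to 4 nodes raises the quorum without raising the number of tolerated failures, which is why odd cluster sizes are the usual choice.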
4. Arakoon Features
Arakoon feature-set goes beyond basic key-value CRUD
interface:
Range & prefix lookups on keys (incl. paging)
Transactional sequences
Test-and-set / CAS
Server-side extensions, “user functions”
Simple binary protocol with clients in OCaml, C, Python,
PHP
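To make the test-and-set bullet concrete, here is a minimal sketch of one common formulation of the operation on a plain in-memory `Hashtbl`. It is an assumption-laden illustration, not Arakoon's client API: the function name, the labelled arguments and the choice to return the previously stored value are all hypothetical here.

```ocaml
(* Hypothetical in-memory model of test-and-set; not Arakoon's client API.
   If the current value of [key] equals [expected], store [wanted]
   (or delete the key when [wanted] is None). The value found before the
   operation is returned either way, so the caller can tell whether the
   swap happened by comparing it with [expected]. *)
let test_and_set (store : (string, string) Hashtbl.t) key ~expected ~wanted =
  let current = Hashtbl.find_opt store key in
  if current = expected then begin
    match wanted with
    | Some v -> Hashtbl.replace store key v
    | None -> Hashtbl.remove store key
  end;
  current

let () =
  let store = Hashtbl.create 16 in
  (* The key is absent, so the first claim succeeds. *)
  let _ = test_and_set store "leader" ~expected:None ~wanted:(Some "node_a") in
  (* The second claim fails: the key is no longer absent. *)
  let seen = test_and_set store "leader" ~expected:None ~wanted:(Some "node_b") in
  assert (seen = Some "node_a");
  assert (Hashtbl.find_opt store "leader" = Some "node_a")
```

In a distributed store the comparison and the update have to be applied as one step on every node, which is where the consensus protocol comes in; the sketch above only models the single-node semantics.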
5. Arakoon Deployments
X000 deployments in several products by different
companies
Created primarily to store metadata of large-scale storage
Also used as “NoSQL” store for IAAS platforms
6. Baardskeerder
Append-only B-tree(ish) database
OCaml
Will replace TokyoCabinet during the Arakoon 2.x cycle
“SSD-friendly”
LGPL-3, http://incubaid.github.com/baardskeerder
7. Why OCaml?
Short prototype-to-production cycle
FP suits problem domain
Availability of cooperative threads (Lwt)
“Async” was released after project incubation
Fast compiler
Native binaries, good performance
Prior experience in Amplidata storage product (CUFP talk
tomorrow)
8. Experiences
Fairly easy to get contributors up to speed
... but does require some mental effort
Hard (if possible at all) to hire people with prior OCaml
knowledge
... but not strictly necessary
Got to stable version within a couple of months (used in
production deployments)
Fixing bugs or adding features didn’t introduce more bugs
... yet this is mostly thanks to an extensive “system”
test suite
Lots of “bugs” caused by deployment issues or sysadmin
interventions
9. Lessons learned
Don’t let “developer familiarity” influence app design: using
OCaml OO features doesn’t make it easier to contribute
(seems to add confusion)
Providing a single script to bootstrap an OCaml
environment + dependencies is a big plus
Since “monadic threading” is new to most contributors,
using the Lwt syntax extension might be a bad idea (as
experienced prior to Arakoon incubation)
Don’t let “RealWorld# IO” creep into the parts which
could/should be kept pure
Most likely the #1 mistake in Arakoon 1.x
Fixed in 2.x: Paxos state-machine is pure
Helps in testing & manual correctness validation
AGPL-3 might not be an ideal license, yet driven by
business-needs
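The "keep the state machine pure" lesson can be illustrated with a small sketch. The code below is a deliberately simplified single-decree Paxos acceptor, not Arakoon's Multi-Paxos implementation; all type and function names (`state`, `message`, `action`, `step`, `run`) are invented for this example. The point is that the transition function performs no IO: it returns the new state plus a list of actions that an impure driver would execute.

```ocaml
(* Hypothetical, simplified acceptor; not Arakoon's actual state machine. *)
type state = {
  promised : int;                    (* highest ballot promised so far *)
  accepted : (int * string) option;  (* last accepted (ballot, value) *)
}

type message =
  | Prepare of int
  | Accept of int * string

(* Descriptions of side effects; the pure code never performs them. *)
type action =
  | Send_promise of int * (int * string) option
  | Send_accepted of int
  | Send_nack of int

(* Pure transition: same state and message always give the same result. *)
let step state = function
  | Prepare n when n > state.promised ->
    ({ state with promised = n }, [Send_promise (n, state.accepted)])
  | Accept (n, v) when n >= state.promised ->
    ({ promised = n; accepted = Some (n, v) }, [Send_accepted n])
  | Prepare n | Accept (n, _) ->
    (state, [Send_nack n])

(* Feed a whole message trace through the machine, collecting actions. *)
let run state messages =
  List.fold_left
    (fun (s, acts) m ->
      let s', new_acts = step s m in
      (s', acts @ new_acts))
    (state, []) messages

let () =
  let init = { promised = 0; accepted = None } in
  let final, actions = run init [Prepare 2; Accept (1, "stale"); Accept (2, "v")] in
  assert (final.accepted = Some (2, "v"));
  assert (actions = [Send_promise (2, None); Send_nack 1; Send_accepted 2])
```

Because `step` is an ordinary function over values, message loss, duplication and reordering can be tested by simply feeding it different traces, with no sockets, threads or timers involved.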
10. Experiences: OCaml
Language not too hard to grok
Stable & fast compiler
Non-trivial but not impossible to debug binaries using GDB
& read/interpret assembly, if required
http://blog.incubaid.com/2011/12/04/on-segmentation-faults-stack-overflows-gdb-and-ocaml
Stable runtime, except e.g. select fdset bug causing runtime
memory corruption
(http://caml.inria.fr/mantis/print_bug_page.php?bug_id=5563)
Memory (leak) issues hard to pinpoint and debug
Limited standard library, many “basic” functions need ad-hoc
implementation
Standard library and Lwt provide lots of bindings to low-level
procedures for system-level programming
Not “open”ing some module at compile time can result in
segfaults at runtime?!?
(http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=602170)
11. Experiences: Infrastructure
ocamlbuild works OK for the basic cases, but once you
need myocamlbuild.ml, you’re on your own.
Ever tried including C++ code in an ocamlbuild setup?
Not convinced oasis improves things (“Oasis is the new
Maven”)
Preliminary experiments with OPAM on Monday were
promising/encouraging!
12. Experiences: Lwt - The upside
OK to work with, API-wise
Lots of built-in functions
Active maintainers on mailing list, bug reports are
handled quickly
Documentation OK’ish - unless it’s completely missing
(e.g. Lwt_pqueue)
13. Experiences: Lwt - The downside
Irregular and unpredictable release schedule
Regressions!
Native binary stack-overflow at runtime: not reproducible,
only corrupt core dumps available
Took > 2 man-weeks to pinpoint
Reduced to 2 very small test-cases both exposing the bug
in different ways
Fixed using work-around in Arakoon, reported to Lwt list
including tests, quickly fixed in Lwt-darcs and next release
2 releases afterwards, the exact same issue was
re-introduced, both test-cases failing again
14. Experiences: Lwt - The downside
Significant refactoring/re-implementation changes
in-between releases
Hard to test
Hard to validate correctness
(Performance impact: Baardskeerder's IO is abstracted and can use
“Unix” or “Lwt_unix” (or others). The Lwt-backed benchmark is 20x
slower than the synchronous version)
src/unix/lwt_libev_stubs.c:
/* Extract the event loop now.

   It seems to crash if we don't do that (??). */
struct ev_loop *loop = Ev_loop_val(val_loop);
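The remark that Baardskeerder's IO is abstracted over “Unix”, “Lwt_unix” or other backends suggests a functor over a monadic IO signature. The sketch below shows that general technique under several assumptions: the `IO` signature, the `Sync` backend and the `Log` functor are hypothetical names, Baardskeerder's real interface may look different, and an in-memory `Buffer` stands in for a file so the example stays self-contained.

```ocaml
(* Hypothetical IO signature; Baardskeerder's real interface may differ. *)
module type IO = sig
  type 'a t
  val return : 'a -> 'a t
  val bind : 'a t -> ('a -> 'b t) -> 'b t
  (* Append a string to the store; a Buffer stands in for a file here. *)
  val write : Buffer.t -> string -> unit t
end

(* Synchronous backend: every computation runs immediately. *)
module Sync : IO with type 'a t = 'a = struct
  type 'a t = 'a
  let return x = x
  let bind x f = f x
  let write buf s = Buffer.add_string buf s
end

(* Storage logic written once, against the signature only. *)
module Log (M : IO) = struct
  let ( >>= ) = M.bind

  (* Append "key=value\n" and return the new size of the log. *)
  let append buf key value =
    M.write buf key >>= fun () ->
    M.write buf "=" >>= fun () ->
    M.write buf value >>= fun () ->
    M.write buf "\n" >>= fun () ->
    M.return (Buffer.length buf)
end

module Sync_log = Log (Sync)

let () =
  let buf = Buffer.create 64 in
  let size = Sync_log.append buf "k" "v" in
  assert (size = 4);
  assert (Buffer.contents buf = "k=v\n")
```

An Lwt backend would presumably set `type 'a t = 'a Lwt.t` and implement `return` and `bind` with `Lwt.return` and `Lwt.bind`. Every `bind` then goes through the Lwt scheduler, which is one plausible contributor to the kind of slowdown reported on the slide, although the slide itself does not say where the time goes.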
15. Conclusion
Overall: positive experience
Build infrastructure could use some love
Lwt is a great project, but releng/testing/(docs) could
improve
Convincing others a non-standard language like OCaml
is/was a good choice for Arakoon, especially to
non-coders, can be hard, but it’s worth the effort
Still unclear how to get more contributors & users on
board. Ideas welcome!
Questions?