Arakoon
      A distributed consistent key-value store


    Romain Slootmaekers                    Nicolas Trangez

                              Incubaid BVBA
                   {romain,nicolas}@incubaid.com
                           Twitter: @incubaid
                    Team Blog: blog.incubaid.com


                     September 14, 2012




Romain Slootmaekers, Nicolas Trangez   Arakoon
Introduction




      Researchers at Incubaid
      Incubaid is a technology incubator, active in datacenter &
      cloud computing
      Prior exits through Terremark, Telenet (Belgian telco),
      Veritas/Symantec, Sun Microsystems, ...
      Talk about general use of FP in our companies tomorrow at
      CUFP




Arakoon



    Distributed, consistent, persistent key-value store
    OCaml using Lwt
    Multi-Paxos consensus protocol implementation
           Guaranteed consistency across cluster nodes
           Available as long as a majority (N/2 + 1, using integer
           division) of the N members is reachable
           Handles message loss or duplication, split-brain
           networking, ...
    TokyoCabinet backend
    Open Source (AGPL-3), see http://arakoon.org
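
The majority rule above can be made concrete with a small sketch. The names `quorum` and `tolerated_failures` are illustrative and not taken from Arakoon; the only assumption is that N/2 is integer division, which yields a strict majority for both odd and even cluster sizes.

```ocaml
(* Smallest number of nodes that forms a strict majority of an n-node cluster. *)
let quorum n =
  if n <= 0 then invalid_arg "quorum: cluster size must be positive";
  n / 2 + 1

(* Number of unreachable nodes the cluster can tolerate while a majority remains. *)
let tolerated_failures n = n - quorum n

let () =
  List.iter
    (fun n ->
      Printf.printf "nodes=%d quorum=%d tolerated_failures=%d\n"
        n (quorum n) (tolerated_failures n))
    [ 1; 2; 3; 4; 5 ]
```

Under this rule, going from 3 to 4 nodes does not increase the number of tolerated failures, which is one reason odd cluster sizes are a common choice.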




Arakoon Features




     Arakoon feature-set goes beyond basic key-value CRUD
     interface:
         Range & prefix lookups on keys (incl. paging)
         Transactional sequences
         Test-and-set / CAS
         Server-side extensions, “user functions”
     Simple binary protocol with clients in OCaml, C, Python,
     PHP
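
To illustrate what test-and-set and prefix lookups mean semantically, here is a minimal in-memory model built on the standard library `Map`. It is not the Arakoon client API: the function names, the convention that `test_and_set` returns the previous value, and the use of `None` to mean "absent" or "delete" are all assumptions made for this sketch.

```ocaml
module Store = Map.Make (String)

(* If the current value of [key] equals [expected], bind [key] to [wanted]
   ([None] removes the key). Returns the previous value together with the
   possibly updated store. *)
let test_and_set key ~expected ~wanted store =
  let current = Store.find_opt key store in
  let store' =
    if current = expected then
      match wanted with
      | Some v -> Store.add key v store
      | None -> Store.remove key store
    else store
  in
  (current, store')

(* All keys starting with [prefix], in ascending order. *)
let prefix_keys prefix store =
  let len = String.length prefix in
  Store.fold
    (fun k _ acc ->
      if String.length k >= len && String.sub k 0 len = prefix then k :: acc
      else acc)
    store []
  |> List.rev

let () =
  let s0 =
    Store.empty
    |> Store.add "user/1" "alice"
    |> Store.add "user/2" "bob"
    |> Store.add "vol/1" "x"
  in
  (* Succeeds: the stored value matches the expected one. *)
  let _, s1 =
    test_and_set "user/1" ~expected:(Some "alice") ~wanted:(Some "carol") s0
  in
  (* Fails: the key exists, but we expected it to be absent. *)
  let _, s2 = test_and_set "user/2" ~expected:None ~wanted:(Some "zed") s1 in
  List.iter
    (fun k -> Printf.printf "%s -> %s\n" k (Store.find k s2))
    (prefix_keys "user/" s2)
```

A real server would apply such an update atomically through the consensus log; the sketch only captures the compare-then-update contract seen by a client.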




Arakoon Deployments




     Thousands of deployments ("X000") in several products by
     different companies
     Created primarily to store metadata of large-scale storage
     Also used as “NoSQL” store for IAAS platforms




Baardskeerder




     Append-only B-tree(ish) database
     OCaml
     Intended to replace TokyoCabinet during the Arakoon 2.x cycle
     “SSD-friendly”
     LGPL-3, http://incubaid.github.com/baardskeerder
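
The append-only idea can be sketched with a much simpler structure than Baardskeerder itself. The example below uses a binary search tree rather than a B-tree, keeps the "log" in memory, and invents all names (`log`, `append`, `insert`, `find`, `nil`); it is an illustration of path copying in an append-only log under those assumptions, not Baardskeerder's on-disk format.

```ocaml
(* A node refers to its children by their position in the log. *)
type node = { left : int; key : string; value : string; right : int }

(* Position used for "no child" and for the empty tree. *)
let nil = -1

(* The log only ever grows: existing positions are never rewritten. *)
type log = { slots : (int, node) Hashtbl.t; mutable next : int }

let create () = { slots = Hashtbl.create 64; next = 0 }

let append log node =
  let pos = log.next in
  Hashtbl.add log.slots pos node;
  log.next <- pos + 1;
  pos

let read log pos = Hashtbl.find log.slots pos

(* Insert by appending fresh copies of the nodes on the search path.
   Returns the position of the new root; older roots stay readable. *)
let rec insert log root key value =
  if root = nil then append log { left = nil; key; value; right = nil }
  else
    let n = read log root in
    let c = compare key n.key in
    if c = 0 then append log { n with value }
    else if c < 0 then
      let left = insert log n.left key value in
      append log { n with left }
    else
      let right = insert log n.right key value in
      append log { n with right }

let rec find log root key =
  if root = nil then None
  else
    let n = read log root in
    let c = compare key n.key in
    if c = 0 then Some n.value
    else if c < 0 then find log n.left key
    else find log n.right key

let () =
  let log = create () in
  let r1 = insert log nil "b" "1" in
  let r2 = insert log r1 "a" "2" in
  let r3 = insert log r2 "b" "3" in
  (* Each root is a consistent snapshot of the tree at that point. *)
  List.iter
    (fun r ->
      match find log r "b" with
      | Some v -> Printf.printf "root %d: b=%s\n" r v
      | None -> Printf.printf "root %d: b is absent\n" r)
    [ r1; r2; r3 ]
```

Because writes only append, they are sequential, which is the usual argument for calling such a layout friendly to SSDs; the cost is that superseded nodes accumulate until some form of compaction reclaims them.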




Why OCaml?



    Short prototype-to-production cycle
    FP suits problem domain
    Availability of cooperative threads (Lwt)
        “Async” was released after project incubation
    Fast compiler
    Native binaries, good performance
    Prior experience in Amplidata storage product (CUFP talk
    tomorrow)




Experiences


     Fairly easy to get contributors up to speed
     . . . but does require some mental effort
     Hard (if possible at all) to hire people with prior OCaml
     knowledge
     . . . but not strictly necessary
     Got to stable version within a couple of months (used in
     production deployments)
     Fixing bugs or adding features didn’t introduce more bugs
     . . . yet this is mostly thanks to an extensive “system”
     testsuite
     Lots of “bugs” caused by deployment issues or sysadmin
     interventions


Lessons learned

     Don’t let “developer familiarity” influence app design: using
     OCaml OO features doesn’t make it easier to contribute
     (seems to add confusion)
     Providing a single script to bootstrap an OCaml
     environment + dependencies is a big plus
     Since “monadic threading” is new to most contributors,
     using the Lwt syntax extension might be a bad idea (as
     experienced prior to Arakoon incubation)
     Don’t let “RealWorld# IO” creep into the parts which
     could/should be kept pure
         Most likely the #1 mistake in Arakoon 1.x
         Fixed in 2.x: Paxos state-machine is pure
         Helps in testing & manual correctness validation
     AGPL-3 might not be an ideal license, yet driven by
     business-needs
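
What "the Paxos state-machine is pure" can look like is sketched below: a transition function that takes a state and a message and returns the new state plus a list of actions to perform, without doing any IO itself. This is a simplified single-decree acceptor written for illustration; the type and constructor names are assumptions, and it is not Arakoon's Multi-Paxos implementation.

```ocaml
type ballot = int

(* Acceptor state: highest ballot promised, and the last accepted proposal. *)
type state = { promised : ballot; accepted : (ballot * string) option }

type message =
  | Prepare of ballot
  | Accept of ballot * string

(* Effects are described as data; a separate impure layer would execute them. *)
type action =
  | Send_promise of ballot * (ballot * string) option
  | Send_accepted of ballot * string
  | Send_nack of ballot

let initial = { promised = -1; accepted = None }

(* Pure transition: no sockets, no disk, no clock. *)
let step state = function
  | Prepare b when b > state.promised ->
    ({ state with promised = b }, [ Send_promise (b, state.accepted) ])
  | Accept (b, v) when b >= state.promised ->
    ({ promised = b; accepted = Some (b, v) }, [ Send_accepted (b, v) ])
  | Prepare _ | Accept _ -> (state, [ Send_nack state.promised ])

(* Replaying a message trace is a plain fold, which makes tests deterministic. *)
let run state messages =
  List.fold_left
    (fun (s, actions) m ->
      let s', new_actions = step s m in
      (s', actions @ new_actions))
    (state, []) messages

let () =
  let final, actions =
    run initial [ Prepare 1; Accept (1, "x"); Prepare 0; Prepare 2 ]
  in
  Printf.printf "promised=%d actions=%d\n" final.promised (List.length actions)
```

Since `step` has no side effects, message loss, duplication and reordering can be tested by feeding it different traces, with no network or timing involved.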

Experiences: OCaml
     Language not too hard to grok
     Stable & fast compiler
     Non-trivial but not impossible to debug binaries using GDB
     & read/interpret assembly, if required
     http://blog.incubaid.com/2011/12/04/on-segmentation-faults-stack-overflows-gdb-and-ocaml
     Stable runtime, except e.g. select fdset bug causing runtime
     memory corruption
     (http://caml.inria.fr/mantis/print_bug_page.php?bug_id=5563)
     Memory (leak) issues hard to pinpoint and debug
     Limited standard library, many “basic” functions need ad-hoc
     implementation
     Standard library and Lwt provide lots of bindings to low-level
     procedures for system-level programming
     Not “open”ing some module at compile time can result in
     segfaults at runtime?!?
     (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=602170)
Experiences: Infrastructure




     ocamlbuild works OK for the basic cases, but once you
     need myocamlbuild.ml, you’re on your own.
     Ever tried including C++ code in an ocamlbuild setup?
     Not convinced oasis improves things (“Oasis is the new
     Maven”)
     Preliminary experiments with OPAM on Monday were
     promising/encouraging!




Experiences: Lwt - The upside




     OK to work with, API-wise
     Lots of built-in functions
     Active maintainers on mailing list, bug reports are
     handled quickly
     Documentation OK’ish - unless it’s completely missing
     (e.g. Lwt_pqueue)




Experiences: Lwt - The downside



     Irregular and unpredictable release schedule
     Regressions!
         Native binary stack-overflow at runtime: not reproducible,
         only corrupt core dumps available
         Took > 2 man-weeks to pinpoint
         Reduced to 2 very small test-cases both exposing the bug
         in different ways
         Fixed using work-around in Arakoon, reported to Lwt list
         including tests, quickly fixed in Lwt-darcs and next release
         2 releases afterwards, the exact same issue was
         re-introduced, both test-cases failing again




Experiences: Lwt - The downside (continued)

     Significant refactoring/re-implementation changes
     in-between releases
          Hard to test
          Hard to validate correctness
     (Performance impact: Baardskeerder IO abstracted, can use
     “Unix” or “Lwt_unix” (or others). Lwt-backed benchmark is 20x
     slower than sync version)
     src/unix/lwt_libev_stubs.c:

     /* Extract the event loop now.

        It seems to crash if we don't do that (??). */
     struct ev_loop *loop = Ev_loop_val(val_loop);
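
The IO abstraction mentioned in the performance note above can be sketched as a functor over a small monadic signature, so the same storage code runs against different backends. Everything here is an assumption made for illustration: the `IO` signature, the `Sync` and `Deferred` modules, and `Writer` are not Baardskeerder's actual interface, and `Deferred` is only a stand-in for a cooperative-threads backend such as one built on `Lwt_unix`.

```ocaml
(* Minimal monadic interface the storage code is written against. *)
module type IO = sig
  type 'a t
  val return : 'a -> 'a t
  val bind : 'a t -> ('a -> 'b t) -> 'b t
  val write : Buffer.t -> string -> unit t
  val run : 'a t -> 'a
end

(* Synchronous backend: every operation happens immediately. *)
module Sync : IO = struct
  type 'a t = 'a
  let return x = x
  let bind x f = f x
  let write buf s = Buffer.add_string buf s
  let run x = x
end

(* Deferred backend: operations are thunks that only run when forced. *)
module Deferred : IO = struct
  type 'a t = unit -> 'a
  let return x () = x
  let bind m f () = f (m ()) ()
  let write buf s () = Buffer.add_string buf s
  let run m = m ()
end

(* Backend-independent code: written once, instantiated per backend. *)
module Writer (Io : IO) = struct
  let ( >>= ) = Io.bind

  let rec write_all buf = function
    | [] -> Io.return ()
    | s :: rest -> Io.write buf s >>= fun () -> write_all buf rest
end

module Sync_writer = Writer (Sync)
module Deferred_writer = Writer (Deferred)

let () =
  let b1 = Buffer.create 16 and b2 = Buffer.create 16 in
  Sync.run (Sync_writer.write_all b1 [ "a"; "b" ]);
  (* Building the deferred computation performs no writes yet. *)
  let pending = Deferred_writer.write_all b2 [ "a"; "b" ] in
  Deferred.run pending;
  Printf.printf "sync=%s deferred=%s\n" (Buffer.contents b1) (Buffer.contents b2)
```

With this structure a benchmark can swap backends without touching the storage logic, which is how a comparison like the one quoted above becomes possible. The extra closure per operation in the deferred variant also hints at where overhead can come from, although the sketch makes no claim about the causes of the measured 20x difference.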




Conclusion


     Overall: positive experience
     Build infrastructure could use some love
     Lwt is a great project, but releng/testing/(docs) could
     improve
     Convincing others, especially non-coders, that a
     non-mainstream language like OCaml is/was a good choice
     for Arakoon can be hard, but it’s worth the effort
     Still unclear how to get more contributors & users on
     board. Ideas welcome!
     Questions?




