This document provides an overview and technical discussion of Membase. It begins with introducing Membase and how it allows both applications and databases to scale horizontally. The rest of the document discusses Membase architecture, deployment options, use cases, and a demo. It also briefly explores developing with Membase and the future direction of NodeCode, which will allow extending Membase through custom modules.
4. Before: Application scales linearly, data hits wall Application Scales Out Just add more commodity web servers Database Scales Up Get a bigger, more complex server 4
5. Membase is a distributed database 5 Application user Web application server Membase Servers In the data center On the administrator console
6. Built-in Memcached Caching Layer 6 Memcached Memcached Membase Database Membase Database Memcached Mode Membase Mode Membase development team has contributed over half of the source code to the Memcached project.
8. Secure multitenant support 8 Application user Web application server Bucket 1 Bucket 2 Aggregate Cluster Memory and Disk Capacity Membase data servers In the data center On the administrator console
9. Five minutes or less to a working cluster Downloads for Linux and Windows Start with a single node One button press joins nodes to a cluster Easy to develop against Just SET and GET – no schema required Drop it in. 10,000+ existing applications already “speak membase” (via memcached) Practically every language and application framework is supported, out of the box Easy to manage One-click failover and cluster rebalancing Graphical and programmatic interfaces Configurable alerting Membase is Simple, Fast, Elastic 9
10. Membase is Simple, Fast, Elastic 10 Predictable “Never keep an application waiting” Quasi-deterministic latency and throughput Low latency Built-in Memcached technology Auto-migration of hot data to lowest latency storage technology (RAM, SSD, Disk) Selectable write behavior – asynchronous, synchronous (on replication, persistence) High throughput Multi-threaded Low lock contention Asynchronous wherever possible Automatic write de-duplication
11. Membase is Simple, Fast, Elastic 11 Zero-downtime elasticity Spread I/O and data across commodity servers (or VMs) Consistent performance with linear cost Dynamic rebalancing of a live cluster All nodes are created equal No special case nodes Clone to grow Extensible Filtered TAP interface provides hook points for external systems (e.g. full-text search, backup, warehouse) Data bucket – engine API for specialized container types Membase NodeCode[FUTURE]
12. Proven at small, and extra large scale Leading cloud service (PAAS) provider Over 65,000 hosted applications Membase Server supporting over 3,000 Heroku customers Social game leader – FarmVille, Mafia Wars, Café World Over 230 million monthly users Zynga is a core contributor to and large scale user of Membase Server 12
13. After: Data layer scales like application logic layerData layer now scales with linear cost and constant performance. Application Scales Out Just add more commodity web servers Membase Servers Database Scales Out Just add more commodity data servers Scaling out flattens the cost and performance curves. 13
14. Membase - A practical path to “NoSQL” adoption 14
16. 16 Where does Membase Fit? Applications with many, many users Online, dynamic applications Applications with growing datasets which need quick access, growing audience
17. Use case – Ad targeting 17 40 milliseconds to come up with an answer. profiles, real time campaign statistics 3 2 1 profiles, campaigns events
20. 20 Clustering Underlying cluster functionality based on erlang OTP Have a custom, vector clock based way of storing and propagating... Cluster topology vBucket mapping Collect statistics from many nodes of the cluster Identify hot keys, resource utilization 20
22. 22 TAP A generic, scalable method of streaming mutations from a given server As data operations arrive, they can be sent to arbitrary TAP receivers Leverages the existing memcached engine interface, and the non-blocking IO interfaces to send data Three modes of operation
23. 11211 11210 memcapable 1.0 memcapable 2.0 moxi memcached protocol listener/sender REST management API/Web UI vBucket state and replication manager Node health monitor Rebalance orchestrator Heartbeat Process monitor Global singleton supervisor Configuration manager engine interface membase storage engine http on each node one per cluster Erlang/OTP HTTP distributed erlang erlang port mapper 21100 – 21199 4369 8080 Membase Architecture
25. 25 Data buckets are secure membase “slices” Application user Web application server Bucket 1 Bucket 2 Aggregate Cluster Memory and Disk Capacity Membase data servers In the data center On the administrator console
26. 11211 11210 memcapable 1.0 memcapable 2.0 moxi memcached protocol listener/sender REST management API/Web UI vBucket state and replication manager Rebalance orchestrator Node health monitor Data Manager Cluster Manager Global singleton supervisor Configuration manager Heartbeat Process monitor engine interface membase storage engine http on each node one per cluster Erlang/OTP HTTP distributed erlang erlang port mapper 21100 – 21199 4369 8080 Membase Architecture
28. 28 Membase “write” data flow – application view User action results in the need to change the VALUE of KEY 1 Application updates key’s VALUE, performs SET operation 2 4 Membase (memcached) client hashes KEY, identifies KEY’s master server 3 SET request sent over network to master server 5 Membase replicates KEY-VALUE pair, caches it in memory and stores it to disk
29. 29 Disk > Memory Dataset may have many items infrequently accessed. However, memcached has different behavior (LRU) than wanted with membase. Still, traditional (most) RDBMS implementations are not 100% correct for us either. The speed of a miss is very, very important.
33. 33 Key-Value (with a replica) Items have: Key Value Expiration Flags CAS (more on this later) Operations include: Get/Set Increment/Decrement Append/Prepend
40. 36 Common Use: Sessions Web user sessions Highly read, less writes in many case Protocol advantage of memcached Options already for PHP, Ruby and Java Application state Not necessarily “entity” style things May be appropriate for a “cache” pool
41. 37 Common Use (cache): Rate Limiting Want to provide API calls into the system Twitter search Google search services Use the atomic increment Set an item with a unique ID Upon API request, increment and check HTTP 420: go away and come back later Your Users ¡Ouch! Your App