Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
drop_acid_pycon_2009
1. Drop ACID and think about data
Bob Ippolito
Mochi Media, Inc.
PyCon 2009 - Chicago (actually Rosemont)
March 28, 2009
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 1 / 35
2. Author: Bob Ippolito
Date: March 2009
Venue: PyCon 2009
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 2 / 35
3. Bob’s Perspective
Startup with lots of data:
Cofounded Mochi Media in 2005
MochiBot analytics platform (for Flash)
MochiAds ad serving platform (for Flash games)
Other cool services for game developers
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 2 / 35
4. Mochi Ad Sales
Hard Sell:
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 3 / 35
5. What’s ACID?
A promise ring your DBMS wears:
Atomicity: all or nothing
Consistency: no explosions
Isolation: no fights
Durability: no lying
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 4 / 35
6. ACID Trips
Scalability and reliability:
Downtime is unacceptable
Reliable is >= 2 nodes
Scalable is ... more
Networks make it hard
Networks make it hard
Networks make it hard
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 5 / 35
7. What can I have?
CAP theorem says pick two:
Consistency
Availability
Partition tolerance
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 6 / 35
8. Turn up the BASE
Write smarter applications:
Basically Available
Soft state
Eventually consistent
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 7 / 35
9. BASE jumping
Everyone else is doing it:
Google
Amazon
eBay
Yahoo!
Facebook
...
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 8 / 35
10. BigTable
Google:
Paxos (Chubby)
Single-master
Distributed tablets via GFS
Row/Column db hybrid
Compression (BMDiff, Zippy)
Versioned (Row, Column, Timestamp)
Bloom filters
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 9 / 35
11. BigTable Pros
Pros:
Compression = Awesome
Clients are probably simple
Integrates with map/reduce
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 10 / 35
12. BigTable Cons
Cons:
Proprietary to Google
Single-master
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 11 / 35
13. BigTable Diagram
Single-master:
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 12 / 35
14. Dynamo
Amazon:
Key/Value store
Consistent hashing
Vector clocks
Read repair
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 13 / 35
15. Dynamo Pros
Pros:
No master
Highly available for write
Knobs to make it fast to read
“Simple” (lots of half-baked clones!)
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 14 / 35
16. Dynamo Cons
Cons:
Proprietary to Amazon
Clients need to be smart
No compression
Not suitable for column-like workloads
Just a Key/Value store
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 15 / 35
17. Dynamo Diagram
Smart client:
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 16 / 35
18. Cassandra
Facebook -> Apache:
Open source!
No master like Dynamo
Storage model more like BigTable
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 17 / 35
19. Cassandra Pros
Pros:
OPEN SOURCE
Incrementally scalable
Minimal administration
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 18 / 35
20. Cassandra Cons
Cons:
Not polished
No compression yet
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 19 / 35
21. Cassandra Diagram
Soul Calibur:
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 20 / 35
22. Distributed Musings
New Hotness:
Distributed databases are the new web framework
... except none of them are awesome yet
I don’t think we need another half-baked Dynamo
clone
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 21 / 35
23. Key-Value Stores
Simple and Fast:
Similar to a Python dict
Keys usually bytes, probably limited
Values usually bytes, often have fewer limits
Extremely fast, simple
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 22 / 35
24. Memcached
Key/Value store as cache:
No persistence
RAM only
Throws data away (on purpose)
Lightning fast
“Everyone” uses it
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 23 / 35
25. Caching Immutable Data
If only data never changed:
Immutable is easy, do that
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 24 / 35
26. Caching Mutable Data
Invalidation sucks:
Mutable is hard
Failed transactions?
Concurrent writers?
Dependent cache keys?
You will get it wrong and it will be hard to debug
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 25 / 35
27. Tokyo Cabinet/Tyrant
Not your mom’s BerkeleyDB:
Disk persistent
Very performant
Actively developed
Similar replication strategy to MySQL
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 26 / 35
28. Redis
Still very new:
Not just a Key/Value store
Matching on key spaces
Values can be bytes, lists or sets
Requires full store in RAM
Might be a nice cache server?
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 27 / 35
29. Document Databases
Schema-free:
Very easy to use
Document Versioning
Great for storing documents
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 28 / 35
30. CouchDB
Document DB Poster Child:
Apache project
Asynchronous replication
JSON based
Views materialized on demand (not indexes)
Neat admin UI
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 29 / 35
31. MongoDB
C++’s revenge:
Fast
JSON and BSON (binary JSON-ish)
Asynchronous replication with auto-sharding “soon”
Index support
Nested documents
Advanced queries
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 30 / 35
32. Column Databases
Data Warehousing:
Sequential reads are awesome
Columns compress better than rows
Doesn’t waste I/O on uninteresting columns
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 31 / 35
33. MonetDB
Research project:
Tried really hard to get it to work
Crashes a lot and corrupts your data
Do not waste your time
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 32 / 35
34. LucidDB
Sounds interesting:
Java/C++ open source data warehouse
No clustering
No experience yet
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 33 / 35
35. Vertica
We paid for it:
Commercial (based on C-Store)
Clustered
Would still prefer open source
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 34 / 35
36. Bitmap Indexes
Sequential Scans can be fast:
1-N bits per row of data
Can apply logical operations across indexes
Can be compressed (BBC, WAH)
FastBit is a good implementation
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 35 / 35
37. Bitmap Index Uses
Big Queries:
PostgreSQL 8.1+ in-memory for some queries
Almost a requirement for column stores
FastBit is a great implementation (WAH)
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 36 / 35
38. Bloom Filters are Neat
But our Princess is in another castle:
Probabilistic data structure
False positives at a known error
Constant space
I won’t bore you with the math
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 37 / 35
39. Bloom Filter Diagram
Actually Relevant:
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 38 / 35
40. Bloom Filter Uses
Find stuff, maybe:
Approximate counting of a large set (e.g. unique IPs
from logs)
Knowing that data is definitely NOT stored
somewhere, e.g. remote cache
Several variants (Counting Bloom Filter, Scalable
Bloom Filter, ...)
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 39 / 35
41. Questions?
Open Space:
Open Space TODAY @ 5pm, Lambert. See
Jonathan Ellis
Bob Ippolito (PyCon 2009) Drop ACID and think about data March 28, 2009 40 / 35