2. 2Confidential and Proprietary |
Outline
● About Zoosk/me
● Why Zoosk chose Cassandra
● Two use cases
● Production setup
● Things to watch out for
● Future plans
Zoosk
● Founded in 2007
● Zoosk is a leading online dating company
● Over 33 million searchable members
● Zoosk is the #1 grossing online dating app in the Apple App Store and a
market leader in mobile dating.
● Available in over 80 countries and translated into 25 languages
Wei Zhu
● Platform Engineer
● Developed Zoosk Internal API (ZIA), a Java and PHP RESTful service
architecture
● Implemented a few services using Cassandra as storage
● Other stuff
Why do we want to move away from
MySQL?
● MySQL's traditional master-slave architecture supports only one write
master (with n-1 slaves). We use MHA, which requires a master-slave
setup.
● Manual sharding is really painful when data grows rapidly.
● Management overhead is high.
Why Cassandra?
● Had a bad experience with Mongo
– Memory consumption
– Stability
● Riak
– read-before-write is a no-no.
– Riak favors reads more than writes
– Riak with Bitcask has more demand for memory
Highlights of Cassandra
● Minimal Administration.
● No Single Point of Failure.
● Handles failure gracefully; Cassandra is crash-only.
● Scales Horizontally.
● Writes are durable.
● Consistency is tunable as needed on reads and writes.
● Schema is flexible, can be updated live.
● Replication is easy, Rack and Datacenter aware.
Benchmark
● Friends table, 2.7B friend relations in MySQL db.
● Created data for 6 Million users, based on the published Facebook
friend distribution.
– Number of friends ranges from 6 to 5000.
– Average 490 friends.
– Total of 2.94 B relations.
– ~700 G of data
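As a quick sanity check on the figures above, the total follows directly from users times average friends. The per-relation footprint below is only derived from the slide's rounded numbers and assumes "~700 G" means roughly 700e9 bytes; it is an estimate, not a measured value.

```python
# Sanity check of the benchmark data set, using only the numbers on the slide.
users = 6_000_000      # generated users
avg_friends = 490      # average friends per user

relations = users * avg_friends
print(f"{relations / 1e9:.2f} B relations")

# Rough footprint per relation. Assumption: "~700 G" is about 700e9 bytes;
# the slide does not say whether that is before or after replication.
approx_bytes = 700e9
bytes_per_relation = approx_bytes / relations
print(f"~{bytes_per_relation:.0f} bytes per relation")
```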
Benchmark numbers (out-of-the-box settings)
● We only ran it for a couple of hours, since we didn't know what
compaction/repair can do to you at that time.
● Three-node Dell C1100 cluster, RF = 3
– Dual L5640 CPUs (6-Core 2.13 GHz), 72GB Memory (18 x 4GB), 4 x
100GB SLC SSDs (or MET-MLC)
Unit: ms, RL: read latency, WL: write latency
[Architecture diagram. Components shown: Apache, LB, ZIA Service Layer on
Tomcat (Jersey for ZIA business logic, Hector or CQL Java Driver),
Cassandra nodes, Memcache. Requests are HTTP POST with JSON.]
Friends in MySQL
Friends Table:
id  user_ID              friend_user_id
1   1231069955177344716  1231070367578097419
2   1231070367578097419  1231069955177344716
3   1231069955177344716  1231070505050586151
4   1231070505050586151  1231069955177344716
Users
id                   first_name  last_name
1231069955177344716  Mary        Smith
1231070367578097419  James       Brown
1231070505050586151  Robert      Wilson
Cassandra Schema
● // column name is a composite column with fname + lname + user_id
create column family friends
with comparator = 'CompositeType(UTF8Type, UTF8Type, LongType)'
and key_validation_class = 'LongType'
and compaction_strategy = 'LeveledCompactionStrategy';
● Data is denormalized, which makes updates a bit complicated. (What if
a user decides to change their name?)
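To make the denormalization trade-off concrete, here is a minimal in-memory sketch of the friends row layout. It is an illustration in plain Python, not Cassandra or Hector code; the helper names and the new last name "Jones" are hypothetical. The user ids and names come from the MySQL tables shown earlier.

```python
import bisect
from collections import defaultdict

# Row key: user id. Column name: (first_name, last_name, friend_user_id),
# kept sorted the way a CompositeType comparator would order the columns.
rows = defaultdict(list)

def add_friend(user_id, fname, lname, friend_id):
    """Store the friend's name inside the column name itself (denormalized)."""
    bisect.insort(rows[user_id], (fname, lname, friend_id))

def rename_user(user_id, old, new, friend_ids):
    """A rename touches one column in every friend's row: delete old, insert new."""
    for fid in friend_ids:
        rows[fid].remove((old[0], old[1], user_id))
        bisect.insort(rows[fid], (new[0], new[1], user_id))

MARY = 1231069955177344716
JAMES = 1231070367578097419
ROBERT = 1231070505050586151

add_friend(MARY, "James", "Brown", JAMES)
add_friend(MARY, "Robert", "Wilson", ROBERT)
add_friend(JAMES, "Mary", "Smith", MARY)
add_friend(ROBERT, "Mary", "Smith", MARY)

# Mary's friends come back sorted by name from a single row, with no join.
mary_friends = [col[0] for col in rows[MARY]]
print(mary_friends)

# Mary changes her last name (hypothetical): every friend's row is rewritten.
rename_user(MARY, ("Mary", "Smith"), ("Mary", "Jones"), [JAMES, ROBERT])
print(rows[JAMES])
```

Reads stay cheap because one row holds the whole sorted friend list, while a rename costs one delete plus one insert per friend.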
Cassandra schema for notification
create column family notifications
with column_type = 'Standard'
and comparator = 'CompositeType(TimeUUIDType, UTF8Type)'
and default_validation_class = 'UTF8Type'
and key_validation_class = 'LongType'
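The composite comparator above keeps each user's columns ordered by time, so the newest notifications can be read as a slice from one end of the row. The sketch below models that in plain Python. It is an assumption-laden illustration: the slide does not say what the UTF8 component holds, so the field names "type" and "body" and both helper functions are hypothetical.

```python
import uuid
from itertools import groupby

# One user's row: {(timeuuid, field_name): value}
row = {}

def add_notification(fields):
    """Write one notification as several columns sharing a time-based UUID."""
    tid = uuid.uuid1()  # version-1 UUID, embeds a timestamp
    for name, value in fields.items():
        row[(tid, name)] = value
    return tid

def latest(n):
    """Return the n newest notifications, newest first."""
    # TimeUUIDType orders by the embedded timestamp; UUID.time exposes it.
    # The raw bytes are a tie-breaker so columns of one notification stay adjacent.
    cols = sorted(row, key=lambda c: (c[0].time, c[0].bytes, c[1]), reverse=True)
    out = []
    for tid, group in groupby(cols, key=lambda c: c[0]):
        out.append((tid, {name: row[(t, name)] for t, name in group}))
        if len(out) == n:
            break
    return out

add_notification({"type": "message", "body": "hello"})
add_notification({"type": "wink", "body": ""})
add_notification({"type": "view", "body": "someone viewed you"})

newest_two = latest(2)
for tid, fields in newest_two:
    print(tid.time, fields)
```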
Production Setup
– Persistent Notifications: 5 Nodes Single DC, RF = 3
• 1.1.6
• SSD
• Powerful machines (used to be MySQL servers): 74 GB RAM, 24 cores
• Cassandra is running on 8G Heap
• 30 GB data per node
• 250 Writes per second
• 70 Reads per second
• Write Latency: <0.02ms
• Read Latency: < 2ms
Production Setup
– All the rest: 14 Nodes, 2 DCs, {DC1:3, DC2:3}
• Active-backup
• 2.0.8
• Less powerful machines: 32 GB RAM, 2 cores
• Very little usage for now
• Cassandra is running on 8G Heap
• Consistency level is set to LOCAL_QUORUM
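For the {DC1:3, DC2:3} layout above, the arithmetic behind LOCAL_QUORUM can be sketched as follows. A quorum is floor(RF / 2) + 1 replicas; the function name is hypothetical and the snippet only does the calculation, it does not talk to a cluster.

```python
def quorum(replication_factor):
    """Replicas that must acknowledge a quorum request: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1

replication = {"DC1": 3, "DC2": 3}  # per-DC replication factors from the slide

# LOCAL_QUORUM only counts replicas in the coordinator's own data center.
local_quorum = {dc: quorum(rf) for dc, rf in replication.items()}
tolerated_down = {dc: rf - quorum(rf) for dc, rf in replication.items()}

# Plain QUORUM would count replicas across both data centers.
global_quorum = quorum(sum(replication.values()))

print(local_quorum)    # {'DC1': 2, 'DC2': 2}
print(tolerated_down)  # {'DC1': 1, 'DC2': 1}
print(global_quorum)   # 4
```

With LOCAL_QUORUM a request waits for 2 of the 3 local replicas and never for the remote data center, which suits an active-backup setup.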
Compaction Strategy
● We chose Leveled Compaction because:
– It requires less disk space (theoretically)
– It requires more I/O, but we have SSD
– We have TTL, so compaction is important
● Things to watch out for
– SSTable size defaulted to 5MB in versions prior to 1.2.9, which is way too
small.
– Defaults to 160MB from 1.2.9 onward,
https://issues.apache.org/jira/browse/CASSANDRA-5727
– To set the SSTable size on C* 2.x:
ALTER TABLE test
WITH compaction = {'class': 'LeveledCompactionStrategy',
'sstable_size_in_mb': 256};
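A rough calculation shows why the 5MB default is too small. The sketch below estimates how many SSTables a node ends up with at each target size, using the 30 GB per node figure from the production setup as an illustrative input. It ignores compression and partially filled SSTables, and the function name is hypothetical.

```python
import math

def sstable_count(data_gb, sstable_size_mb):
    """Approximate number of SSTables needed to hold data_gb of data."""
    return math.ceil(data_gb * 1024 / sstable_size_mb)

counts = {size: sstable_count(30, size) for size in (5, 160, 256)}
for size_mb, count in counts.items():
    print(f"{size_mb:>3} MB target -> ~{count} SSTables for 30 GB")
```

At 5MB a 30 GB node holds thousands of files, versus a couple of hundred at 160MB.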
Repair
● The hard requirement: routine repair must run on every node at least
once every gc_grace_seconds (10 days by default).
● Things to watch out for
– Use -pr
– Schedule repair wisely
– Watch your disk (even with LCS, disk usage can double during repair)
– Watch your performance metrics
– nodetool setcompactionthroughput
– nodetool setstreamthroughput
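One way to "schedule repair wisely" is to stagger the nodes evenly across a cycle that is shorter than gc_grace_seconds, so every node finishes a -pr repair inside the window and no two nodes start at once. The sketch below only computes start offsets. The 7-day cycle, the node names, and the function name are assumptions for illustration, not settings from the talk.

```python
from datetime import timedelta

GC_GRACE = timedelta(seconds=864_000)  # Cassandra default gc_grace_seconds (10 days)

def stagger(nodes, cycle):
    """Spread repair start times evenly over one cycle, one node at a time."""
    if cycle >= GC_GRACE:
        raise ValueError("repair cycle must be shorter than gc_grace_seconds")
    step = cycle / len(nodes)
    return {node: step * i for i, node in enumerate(nodes)}

nodes = [f"node{i}" for i in range(1, 6)]  # hypothetical names, 5-node cluster
plan = stagger(nodes, timedelta(days=7))   # assumed weekly cycle

for node, offset in plan.items():
    print(f"{node}: run 'nodetool repair -pr' at cycle start + {offset}")
```

The gap between the cycle length and gc_grace_seconds is the slack available when a repair runs long or has to be retried.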
Cluster setup choice
● Big cluster with less powerful machines
– It's easier to scale with vnodes
– Less administrative overhead
– More nodes mean more frequent node failures, but C* is very
resilient to node failure
● Small clusters with more powerful machines
– Can be tuned specifically for each use case
– Self-contained per service, so an outage has less impact
● We are moving to a single big cluster with less powerful machines
● Bring more services to Cassandra