Confidential and Proprietary
Cassandra at Zoosk
Wei Zhu
Principal Platform Engineer
Feb 25, 2015
Outline
● About Zoosk/me
● Why Zoosk chose Cassandra
● Two use cases
● Production setup
● Things to watch out for
● Future plans
Zoosk
● Founded in 2007
● Zoosk is a leading online dating company
● Over 33 million searchable members
● The #1 grossing online dating app in the Apple App Store and a market leader in mobile dating
● Available in over 80 countries and translated into 25 languages
Wei Zhu
● Platform Engineer
● Developed the Zoosk Internal API (ZIA), a Java and PHP RESTful service architecture
● Implemented a few services using Cassandra as storage
● Other stuff
Why do we want to move away from MySQL?
● MySQL's traditional master-slave architecture (one write master with n-1 slaves) supports only a single write master. We use MHA, which requires a master-slave topology.
● Manual sharding is really painful when data grows rapidly.
● Management overhead is high.
Why Cassandra?
● Had a bad experience with MongoDB
– Memory consumption
– Stability
● Riak
– Read-before-write is a no-no.
– Riak favors reads over writes.
– Riak with Bitcask demands more memory.
Highlights of Cassandra
● Minimal Administration.
● No Single Point of Failure.
● Handles failure gracefully; Cassandra is crash-only.
● Scales Horizontally.
● Writes are durable.
● Consistency is tunable as needed on reads and writes.
● Schema is flexible and can be updated live.
● Replication is easy and is rack- and datacenter-aware.
Benchmark
● Friends table: 2.7B friend relations in the MySQL db.
● Created data for 6 million users, based on the published Facebook friend distribution.
– Number of friends per user: 6 to 5,000.
– Average: 490 friends.
– Total: 2.94B relations.
– ~700 GB of data
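The totals above can be cross-checked with a few lines of arithmetic. This is only a sketch: it assumes "~700 G" means decimal gigabytes, and the slide does not say whether that figure is per node or includes replication, so the bytes-per-relation number is a rough indication at best.

```python
# Back-of-envelope check of the synthetic benchmark data set.
USERS = 6_000_000          # users generated for the benchmark (from the slide)
AVG_FRIENDS = 490          # average friends per user (from the slide)
DATA_BYTES = 700 * 10**9   # "~700 G", ASSUMED to mean decimal gigabytes

# Users times average friend count gives the total number of relations.
total_relations = USERS * AVG_FRIENDS

# Rough storage cost per relation under the assumption above.
bytes_per_relation = DATA_BYTES / total_relations

print(f"relations: {total_relations / 1e9:.2f} B")
print(f"approx. bytes per relation: {bytes_per_relation:.0f}")
```

The product matches the 2.94B relations quoted on the slide.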
Benchmark numbers (out-of-the-box settings)
● We only ran it for a couple of hours, since at the time we didn't know what compaction and repair can do to you.
● Three-node Dell C1100 cluster, RF = 3
– Dual L5640 CPUs (6-core, 2.13 GHz), 72 GB memory (18 x 4 GB), 4 x 100 GB SLC SSDs (or MET-MLC)
Unit: ms, RL: read latency, WL: write latency
[Architecture diagram. Labels: Apache; LB; ZIA Service Layer (Tomcat) nodes; Jersey (ZIA business logic); Hector or CQL Java Driver; Memcache; Cassandra nodes; HTTP POST with JSON.]
Friends in MySQL
Friends table:
id  user_id              friend_user_id
1   1231069955177344716  1231070367578097419
2   1231070367578097419  1231069955177344716
3   1231069955177344716  1231070505050586151
4   1231070505050586151  1231069955177344716
Users
id first_name last_name
1231069955177344716 Mary Smith
1231070367578097419 James Brown
1231070505050586151 Robert Wilson
Cassandra Schema
● The column name is a composite of fname + lname + user_id:
create column family friends
with comparator = 'CompositeType(UTF8Type, UTF8Type, LongType)'
and key_validation_class = 'LongType'
and compaction_strategy = 'LeveledCompactionStrategy';
● Data is denormalized, which complicates updates. (What if a user decides to change their name?)
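To make the update problem concrete, here is a minimal in-memory sketch of the wide-row layout. It is not Cassandra code: the dictionaries, `add_friend`, `list_friends`, and `rename_user` are hypothetical stand-ins, and the name "Jones" is made up for illustration. It shows that renaming one user means rewriting a column in every friend's row.

```python
# Row key (user_id) -> {(first_name, last_name, friend_id): value}.
# The tuple plays the role of the composite column name.
friends = {}

def add_friend(user_id, first, last, friend_id, value):
    """Store one friend as a column in the user's wide row."""
    friends.setdefault(user_id, {})[(first, last, friend_id)] = value

def list_friends(user_id):
    """Return column names in comparator order (name first, then id)."""
    return sorted(friends.get(user_id, {}))

def rename_user(user_id, old, new):
    """Rewrite the denormalized name in each friend's row; return rows touched."""
    touched = 0
    for (_, _, friend_id) in list(friends.get(user_id, {})):
        row = friends.get(friend_id, {})
        value = row.pop((old[0], old[1], user_id), None)
        if value is not None:
            # The name is part of the column name, so the old column is
            # deleted and a new one is written.
            row[(new[0], new[1], user_id)] = value
            touched += 1
    return touched

MARY, JAMES, ROBERT = 1231069955177344716, 1231070367578097419, 1231070505050586151
add_friend(MARY, "James", "Brown", JAMES, {})
add_friend(MARY, "Robert", "Wilson", ROBERT, {})
add_friend(JAMES, "Mary", "Smith", MARY, {})
add_friend(ROBERT, "Mary", "Smith", MARY, {})

rows_rewritten = rename_user(MARY, ("Mary", "Smith"), ("Mary", "Jones"))
print(rows_rewritten, list_friends(JAMES))
```

With the average of 490 friends quoted in the benchmark, a single rename would fan out to roughly that many delete-and-insert pairs.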
Data in Cassandra
(Layout: row key, then column name fname:lname:user_id, then column value)
Row key 1231069955177344716
    James:Brown:1231070367578097419
        {"s":5,"r":0,"c":3,"l":1343346668000,"m":0,"ct":1343346668000,"i":106}
    Robert:Wilson:1231070505050586151
        {"s":7,"r":0,"c":2,"l":1343346410000,"m":0,"ct":1343346410000,"i":10}
Row key 1231070367578097419
    Mary:Smith:1231069955177344716
        {"s":5,"r":0,"c":1,"l":1343346668000,"m":0,"ct":1343346668000,"i":106}
Row key 1231070505050586151
    Mary:Smith:1231069955177344716
        {"s":7,"r":0,"c":1,"l":1343346410000,"m":0,"ct":1343346410000,"i":10}
Persistent Notification Services
Cassandra schema for notification
create column family notifications
with column_type = 'Standard'
and comparator = 'CompositeType(TimeUUIDType, UTF8Type)'
and default_validation_class = 'UTF8Type'
and key_validation_class = 'LongType'
Data in Cassandra for notifications
[default@zoosk] get notifications[2752669903264728509];
=> (column=8c6e8800-f687-172c-aa11-008cfa0410fc:is_viewed, value=1, timestamp=1423879093429000, ttl=1814400)
=> (column=8c6e8800-f687-172c-aa11-008cfa0410fc:items,
value={"app_id":1,"type":502,"time":1422992424,"author_zid":"02752669903264728509","payload":"{"t":"3","an":"XXXXX","agd":"f","apr":
{"t":1,"s":1,"d":0,"o":0}}"}, timestamp=1422992424656001, ttl=1814400)
=> (column=60f8ff80-6938-1744-9ebc-008cfa0ea5e8:is_viewed, value=1, timestamp=1423879093429001, ttl=1814400)
=> (column=60f8ff80-6938-1744-9ebc-008cfa0ea5e8:items,
value={"app_id":1,"type":511,"time":1423652427,"author_zid":"02752669903264728509","payload":"{"t":"3","an":"YYYYY","agd":"f","apr":
{"t":1,"s":1,"d":0,"o":0}}"}, timestamp=1423652427483001, ttl=1814400)
=> (column=f1161080-fd82-174c-9ebc-008cfa0ea5e8:items,
value={"app_id":1,"type":511,"time":1423893912,"author_zid":"02752669903264728509","payload":"{"t":"3","an":"ZZZZZZZ","apr":
{"t":1,"s":1,"d":0,"o":0}}"}, timestamp=1423893912261001, ttl=1814400)
=> (column=6adb4f80-e667-1751-9ebc-008cfa0ea5e8:items,
value={"app_id":1,"type":511,"time":1424032109,"author_zid":"02752669903264728509","payload":"{"t":"3","an":"AAAAAAA","agd":"f","apr":
{"t":1,"s":1,"d":0,"o":0}}"}, timestamp=1424032109051001, ttl=1814400)
=> (column=e9801700-f4b8-1754-9ebc-008cfa0ea5e8:items,
value={"app_id":1,"type":511,"time":1424118125,"author_zid":"02752669903264728509","payload":"{"t":"3","an":"BBBBBBB","agd":"f","apr":
{"t":1,"s":1,"d":0,"o":0}}"}, timestamp=1424118125862001, ttl=1814400)
=> (column=18093480-9978-175a-920e-008cfa0e9778:items,
value={"app_id":1,"type":511,"time":1424276977,"author_zid":"02752669903264728509","payload":"{"t":"3","an":"CCCCCCC","agd":"f","apr":
{"t":1,"s":1,"d":0,"o":0}}"}, timestamp=1424276977453001, ttl=1814400)
=> (column=30d1b180-5d73-175b-9ebc-008cfa0ea5e8:items,
value={"app_id":1,"type":509,"time":1424298525,"author_zid":"02752669903264728509","payload":"{"t":"4","an":"DDDDDDDD","agd":"f","apr":
{"t":1,"s":1,"d":0,"o":0},"exp":1424381325}"}, timestamp=1424298525775001, ttl=82800)
=> (column=e8385d00-3f6e-175c-920e-008cfa0e9778:items,
value={"app_id":1,"type":511,"time":1424323372,"author_zid":"02752669903264728509","payload":"{"t":"3","an":"EEEEEE","agd":"f",:
{"t":1,"s":1,"d":0,"o":0}}"}, timestamp=1424323372898001, ttl=1814400)
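The `ttl` values in the dump are in seconds. The short sketch below converts the two values that appear and compares the shorter one with the `time` and `exp` fields of the type-509 item. The match suggests, but does not prove, that the short TTL is derived from the payload's expiry; the variable names are illustrative only.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR

DEFAULT_TTL = 1_814_400   # ttl on most columns in the dump
SHORT_TTL = 82_800        # ttl on the type-509 notification

default_ttl_days = DEFAULT_TTL / SECONDS_PER_DAY
short_ttl_hours = SHORT_TTL / SECONDS_PER_HOUR

# "time" and "exp" copied from the type-509 item in the dump (epoch seconds).
item_time = 1424298525
item_exp = 1424381325
lifetime_from_payload = item_exp - item_time

print(default_ttl_days, short_ttl_hours, lifetime_from_payload == SHORT_TTL)
```

So most notifications live for three weeks, while the expiring type lives for 23 hours.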
Persistent Notification Services
Data in Cassandra for notifications
● => (column=f1161080-fd82-174c-9ebc-008cfa0ea5e8:is_viewed, value=1, timestamp=1424378452228000, ttl=1814400)
● => (column=f1161080-fd82-174c-9ebc-008cfa0ea5e8:items,
value={"app_id":1,"type":511,"time":1423893912,"author_zid":"02752669903264728509","payload":"{"t":"3","an":"ZZZZZZZ","apr":
{"t":1,"s":1,"d":0,"o":0}}"}, timestamp=1423893912261001, ttl=1814400)
● => (column=6adb4f80-e667-1751-9ebc-008cfa0ea5e8:is_viewed, value=1, timestamp=1424378449318000, ttl=1814400)
● => (column=6adb4f80-e667-1751-9ebc-008cfa0ea5e8:items,
value={"app_id":1,"type":511,"time":1424032109,"author_zid":"02752669903264728509","payload":"{"t":"3","an":"AAAAAAA","agd":"f","apr":
{"t":1,"s":1,"d":0,"o":0}}"}, timestamp=1424032109051001, ttl=1814400)
● => (column=e9801700-f4b8-1754-9ebc-008cfa0ea5e8:is_viewed, value=1, timestamp=1424378447794000, ttl=1814400)
● => (column=e9801700-f4b8-1754-9ebc-008cfa0ea5e8:items,
value={"app_id":1,"type":511,"time":1424118125,"author_zid":"02752669903264728509","payload":"{"t":"3","an":"BBBBBBB","agd":"f","apr":
{"t":1,"s":1,"d":0,"o":0}}"}, timestamp=1424118125862001, ttl=1814400)
● => (column=18093480-9978-175a-920e-008cfa0e9778:is_viewed, value=1, timestamp=1424378445009000, ttl=1814400)
● => (column=18093480-9978-175a-920e-008cfa0e9778:items,
value={"app_id":1,"type":511,"time":1424276977,"author_zid":"02752669903264728509","payload":"{"t":"3","an":"CCCCCCC","agd":"f","apr":
{"t":1,"s":1,"d":0,"o":0}}"}, timestamp=1424276977453001, ttl=1814400)
● => (column=30d1b180-5d73-175b-9ebc-008cfa0ea5e8:is_viewed, value=1, timestamp=1424378443164000, ttl=82800)
● => (column=30d1b180-5d73-175b-9ebc-008cfa0ea5e8:items,
value={"app_id":1,"type":509,"time":1424298525,"author_zid":"02752669903264728509","payload":"{"t":"4","an":"DDDDDDDD","agd":"f",
"apr":{"t":1,"s":1,"d":0,"o":0},"exp":1424381325}"}, timestamp=1424298525775001, ttl=82800)
● => (column=e8385d00-3f6e-175c-920e-008cfa0e9778:is_viewed, value=1, timestamp=1424378409297000, ttl=1814400)
● => (column=e8385d00-3f6e-175c-920e-008cfa0e9778:items,
value={"app_id":1,"type":511,"time":1424323372,"author_zid":"02752669903264728509","payload":"{"t":"3","an":"EEEEEE","agd":"f",:
{"t":1,"s":1,"d":0,"o":0}}"}, timestamp=1424323372898001, ttl=1814400)
Production Setup
– Persistent Notifications: 5 nodes, single DC, RF = 3
• Cassandra 1.1.6
• SSD
• Powerful machines (formerly MySQL servers): 74 GB RAM, 24 cores
• Cassandra runs with an 8 GB heap
• 30 GB of data per node
• 250 writes per second
• 70 reads per second
• Write latency: < 0.02 ms
• Read latency: < 2 ms
Production Setup
– All the rest: 14 nodes, 2 DCs, replication {DC1:3, DC2:3}
• Active-backup
• Cassandra 2.0.8
• Less powerful machines: 32 GB RAM, 2 cores
• Very little usage for now
• Cassandra runs with an 8 GB heap
• Consistency level is set to LOCAL_QUORUM
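As a rough illustration of what LOCAL_QUORUM means with three replicas per datacenter, the sketch below computes the quorum sizes and checks the usual overlap rule (reads plus writes must exceed the replica count). The function and variable names are hypothetical; only the replication map comes from the slide.

```python
def quorum(replicas: int) -> int:
    """Smallest majority of the given number of replicas."""
    return replicas // 2 + 1

# Replication settings from the slide: three replicas in each datacenter.
REPLICATION = {"DC1": 3, "DC2": 3}

# LOCAL_QUORUM counts only replicas in the coordinator's datacenter.
local_quorum = quorum(REPLICATION["DC1"])

# Plain QUORUM would count replicas across both datacenters.
cluster_quorum = quorum(sum(REPLICATION.values()))

def read_sees_latest_write(read_acks: int, write_acks: int, replicas: int) -> bool:
    """True when every read set must intersect every write set."""
    return read_acks + write_acks > replicas

print(local_quorum, cluster_quorum)
print(read_sees_latest_write(local_quorum, local_quorum, REPLICATION["DC1"]))
```

With LOCAL_QUORUM on both reads and writes, each operation waits for two local replicas, so one replica per datacenter can be down without failing requests, and no request has to wait on the remote datacenter.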
Compaction Strategy
● We chose Leveled Compaction because:
– It requires less disk space (theoretically)
– It requires more I/O, but we have SSDs
– We use TTLs, so compaction is important
● Things to watch out for
– The default SSTable size was 5 MB in versions prior to 1.2.9, which is far too small.
– Later versions default to 160 MB; see
https://issues.apache.org/jira/browse/CASSANDRA-5727
– To set the SSTable size on C* 2.x:
ALTER TABLE test
WITH compaction = {'class': 'LeveledCompactionStrategy',
'sstable_size_in_mb': 256};
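To see why 5 MB is too small, the following sketch estimates how many SSTables and levels a node would carry at each size. It is a simplification under stated assumptions: it uses the 30 GB per node quoted for the notifications cluster, treats 1 GB as 1024 MB, ignores compression and L0, and assumes each level holds ten times as many SSTables as the one before it, starting at ten.

```python
import math

DATA_PER_NODE_MB = 30 * 1024   # 30 GB per node, ASSUMING 1 GB = 1024 MB
FANOUT = 10                    # assumed growth factor between levels

def sstable_count(data_mb: int, sstable_size_mb: int) -> int:
    """Number of fixed-size SSTables needed to hold data_mb."""
    return math.ceil(data_mb / sstable_size_mb)

def levels_needed(sstables: int) -> int:
    """Levels required when level n holds FANOUT**n SSTables (L0 ignored)."""
    level, capacity = 0, 0
    while capacity < sstables:
        level += 1
        capacity += FANOUT ** level
    return level

summary = {}
for size_mb in (5, 160, 256):
    count = sstable_count(DATA_PER_NODE_MB, size_mb)
    summary[size_mb] = (count, levels_needed(count))
    print(f"{size_mb:>3} MB -> {count} SSTables, {levels_needed(count)} levels")
```

Under these assumptions the 5 MB setting produces thousands of small files per node, while 160 MB or 256 MB keeps the count in the low hundreds.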
Repair
● Every node must be repaired at least once within gc_grace_seconds (10 days by default); this is a hard requirement, otherwise deleted data can reappear.
● Things to watch out for
– Use -pr (nodetool repair -pr) so each node repairs only its primary range
– Schedule repairs wisely
– Watch your disk (even with LCS, disk usage can double during repair)
– Watch your performance metrics
– Throttle with nodetool setcompactionthroughput
– Throttle with nodetool setstreamthroughput
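One way to schedule repair "wisely" is to stagger one primary-range repair per node so that the whole ring is covered inside the gc_grace_seconds window. The sketch below is only an illustration: the node names, the 14-node count borrowed from the second cluster, and the 0.8 safety factor are assumptions, not the scheme Zoosk used.

```python
from datetime import timedelta

GC_GRACE = timedelta(days=10)   # default gc_grace_seconds from the slide

def repair_slots(nodes, window=GC_GRACE, safety=0.8):
    """Evenly space one repair start per node inside a fraction of the window.

    `safety` (an assumed value) leaves headroom for repairs that run long.
    Returns a list of (node, offset_from_cycle_start).
    """
    usable = window * safety
    step = usable / len(nodes)
    return [(node, step * i) for i, node in enumerate(nodes)]

# Hypothetical node names for a 14-node cluster.
nodes = [f"node{i:02d}" for i in range(1, 15)]
schedule = repair_slots(nodes)

for node, offset in schedule:
    print(f"{node}: start repair -pr at +{offset}")
```

Each offset would then drive a `nodetool repair -pr` run on that node, with the compaction and stream throughput throttles adjusted if the performance metrics degrade.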
Repair impacts performance
Cluster setup choice
● Big cluster with less powerful machines
– Easier to scale with vnodes
– Less administrative overhead
– More nodes mean more frequent node failures, but C* is very resilient to node failure
● Small clusters with more powerful machines
– Can be tuned specifically for each use case
– Self-contained per service, so an outage has less impact
● We are moving to a single big cluster with less powerful machines
● Bring more services to Cassandra
