Cassandra in the Netflix
       Architecture
CassandraEU London March 28th, 2012
           Denis Sheahan
Agenda
•   Netflix and the Cloud
•   Why Cassandra
•   Cassandra Deployment @ Netflix
•   Cassandra Operations
•   Cassandra lessons learned
•   Scalability
•   Open Source
Netflix and the Cloud


  With more than 23 million streaming members in the
     United States, Canada, Latin America, the United
      Kingdom and Ireland, Netflix, Inc. is the world's
     leading internet subscription service for enjoying
                   movies and TV series.
  Netflix.com is almost 100% deployed on the Amazon
                          Cloud


Source: http://ir.netflix.com
Out-Growing Data Center
             http://techblog.netflix.com/2011/02/redesigning-netflix-api.html

[Chart labels: 37x Growth Jan 2010-Jan 2011; Datacenter Capacity]
Netflix Deployed on AWS

Content         Logs                    Play           WWW              API                 CS
Video Masters   S3                      DRM            Sign-Up          Metadata            International CS lookup
EC2             EMR Hadoop              CDN routing    Search           Device Config       Diagnostics & Actions
S3              Hive                    Bookmarks      Movie Choosing   TV Movie Choosing   Customer Call Log
CDNs            Business Intelligence   Logging        Ratings          Social Facebook     CS Analytics
Netflix AWS Cost
Why Cassandra
Distributed Key-Value Store vs Central
            SQL Database
• Datacenter had a central database       DBA
• Schema changes require downtime
• In contrast, the cloud has many key-value data
  stores
  – Joins take place in Java code
  – No schema to change, no scheduled downtime
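
With no central SQL database, a join becomes a pair of key lookups stitched together in application code. A minimal sketch, using plain dictionaries as stand-ins for two key-value stores; the store names, keys and fields are hypothetical and not taken from the talk:

```python
# Two in-memory dicts stand in for separate key-value stores (hypothetical data).
ratings_store = {"user1": {"movieA": 5, "movieB": 3}}
titles_store = {"movieA": "Title A", "movieB": "Title B"}


def ratings_with_titles(user_id):
    """Application-side join: fetch a user's ratings, then look up each title."""
    ratings = ratings_store.get(user_id, {})
    return [
        {"movie_id": movie_id, "title": titles_store.get(movie_id), "stars": stars}
        for movie_id, stars in sorted(ratings.items())
    ]
```

Each lookup touches a single key, so neither store needs to know about the other's layout.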
Goals for moving from Netflix DC to
             the Cloud
• Faster
  – Lower latency than the equivalent datacenter web
• Scalable
  – Avoid needing any more datacenter capacity as
    subscriber count increases
• Available
  – Substantially higher robustness and availability than
    datacenter services
• Productive
  – Optimize agility of a large development team with
    automation and tools
Cassandra
• Faster
   – Low latency, low latency variance
• Scalable
   – Supports running on Amazon EC2
   – High and scalable read and write throughput
   – Support for Multi-region clusters
• Available
   – We value Availability over Consistency – Cassandra is Eventually
     Consistent
   – Supports Amazon Availability Zones
   – Data integrity checks and repairs
   – Online Snapshot Backup, Restore/Rollback
• Productive
   – We want FOSS + Support
Cassandra Deployment
Netflix Cassandra Use Cases
• Many different profiles
  – Read heavy environments with a strict SLA
  – Batch Write environments (70 rows per batch)
    also serving low latency Reads
  – Read Modify Write environments with large rows
  – Write only environments with rapidly increasing
    data sets
  – And many more…
How much we use Cassandra
30        Number of production clusters
12        Number of multi-region clusters
3         Max regions, one cluster
65        Total TB of data across all clusters
472       Number of Cassandra nodes
72/28     Largest Cassandra cluster (nodes/data in TB)
6k/250k   Max read/writes per second
Deployment
Architecture API
[Diagram labels, inside AWS EC2: Front End Load Balancer, Discovery Service, API Proxy, Load Balancer, Component Services, API, memcached, Cassandra (EC2 Internal Disks), Backup, S3]
High Availability Deployment
• Fact of life – EC2 instances die
• We store 3 local replicas in 3 different Cassandra nodes
    – One replica per EC2 Availability Zone (AZ)
• Minimum Cluster configuration is 6 nodes, 2 per AZ
    – Single instance failure still leaves at least one node in each AZ
•   Use Local Quorum for writes
•   Use Quorum One for reads
•   Entire cluster replicated in Multi-region deployments
•   AWS Availability Zones
    – Separate buildings
    – Separate power etc.
    – Fairly close together
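
The replica layout above can be checked with a little arithmetic. A minimal sketch, assuming the usual majority rule for a quorum (replication factor divided by two, rounded down, plus one); the zone names are illustrative:

```python
def quorum(replication_factor):
    """Quorum size: a majority of the replicas."""
    return replication_factor // 2 + 1


def surviving_replicas(replicas_per_az, failed_azs):
    """Replicas still reachable after losing whole Availability Zones.

    replicas_per_az maps an AZ name to the number of replicas of a row there.
    """
    return sum(n for az, n in replicas_per_az.items() if az not in failed_azs)


# One replica per AZ, as described above (AZ names are illustrative).
layout = {"zone-a": 1, "zone-b": 1, "zone-c": 1}
rf = sum(layout.values())      # 3 local replicas
needed = quorum(rf)            # 2 acks for a Local Quorum write
can_write = surviving_replicas(layout, {"zone-a"}) >= needed
```

With three replicas, losing one zone still leaves the two replicas a Local Quorum write needs; losing two zones does not.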
Astyanax - Cassandra Write Data Flows
           Single Region, Multiple Availability Zone, Token Aware

•   Client Writes to nodes and Zones
•   Nodes return ack to client
•   Data written to internal commit log disks (no more than
    10 seconds later)

If a node goes offline, hinted handoff completes the write
when the node comes back up.

Requests can choose to wait for one node, a quorum, or all nodes to
ack the write.

SSTable disk writes and compactions occur asynchronously.

[Diagram: Token Aware Clients and six Cassandra nodes, each with disks, two per zone in Zones A, B and C]
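
Token awareness means the client works out which nodes own a row and sends the request straight to one of them. The sketch below is a simplified illustration, not Astyanax code: it assumes an MD5-derived token in a 2^127 token space (as with RandomPartitioner) and a naive "next nodes clockwise" replica placement; the node names and the sample key are hypothetical.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # token space assumed for this sketch


def token_for(key):
    """Map a row key to a token (MD5-based, simplified)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16) % RING_SIZE


def replicas_for(key, ring, replication_factor=3):
    """Return the nodes that own `key`: the first node whose token is >= the
    key's token, plus the next nodes clockwise around the ring.

    `ring` is a list of (token, node_name) tuples sorted by token.
    """
    tokens = [token for token, _ in ring]
    start = bisect.bisect_left(tokens, token_for(key)) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Six evenly spaced nodes, alternating zones so that any three consecutive
# nodes cover all three zones (node names are hypothetical).
names = ["a1", "b1", "c1", "a2", "b2", "c2"]
ring = [(i * RING_SIZE // len(names), name) for i, name in enumerate(names)]
owners = replicas_for("customer:12345", ring)
```

A client that knows `owners` can contact a replica directly instead of going through an arbitrary coordinator, which is the network hop that token awareness saves.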
Extending to Multi-Region
                    In production for UK/Ireland support

•   Create cluster in EU
•   Backup US cluster to S3
•   Restore backup in EU
•   Local repair EU cluster
•   Global repair/join

[Diagram: US Clients and EU Clients, each with six Cassandra nodes with disks, two per zone in Zones A, B and C; S3]
Data Flows for Multi-Region Writes
           Token Aware, Consistency Level = Local Quorum

•   Client writes to local replicas
•   Local write acks returned to Client which continues when
    2 of 3 local nodes are committed
•   Local coordinator writes to remote coordinator
•   When data arrives, remote coordinator node acks
•   Remote coordinator sends data to other remote zones
•   Remote nodes ack to local coordinator
•   Data flushed to internal commit log disks (no more
    than 10 seconds later)

If a node or region goes offline, hinted handoff completes the write when the node comes back up.
Nightly global compare and repair jobs ensure everything stays consistent.

[Diagram: Local (US Clients) and Remote (EU Clients) regions, each with six Cassandra nodes with disks, two per zone in Zones A, B and C]
Priam – Cassandra Automation
          Available at http://github.com/netflix

• Open Source Tomcat Code running as a sidecar on
  each Cassandra node. Deployed as a separate rpm
• Zero touch auto-configuration
• Token allocation and assignment including multi-region
• Broken node replacement and ring expansion
• Full and incremental backup/restore to/from S3
• Metrics collection and forwarding via JMX
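
To make the token allocation bullet concrete, here is a minimal sketch of evenly spaced token assignment. It is an illustration under stated assumptions, not Priam's actual algorithm: the 2^127 token space matches RandomPartitioner, and the `region_offset` parameter is a hypothetical way to keep two regions of one cluster from choosing the same token.

```python
RING_SIZE = 2 ** 127  # token space assumed for this sketch


def initial_tokens(node_count, region_offset=0):
    """Evenly spaced tokens for `node_count` nodes.

    `region_offset` shifts the whole ring so that two regions sharing one
    cluster never pick the same token (the offset scheme is an assumption
    made for this sketch).
    """
    step = RING_SIZE // node_count
    return [(i * step + region_offset) % RING_SIZE for i in range(node_count)]


us_tokens = initial_tokens(6)
eu_tokens = initial_tokens(6, region_offset=1)
```

Even spacing gives each node an equal slice of the ring, which is what lets a replacement node take over a dead node's token without rebalancing the rest.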
Cassandra Backup/Restore
• Full Backup
  – Time based snapshot
  – SSTable compress -> S3
• Incremental
  – SSTable write triggers
    compressed copy to S3
• Archive
  – Copy cross region
• Restore
  – Full restore or create
    new Ring from Backup

[Diagram: Cassandra nodes and an S3 Backup store]
Cassandra Operations
Consoles, Monitors and Explorers
• Netflix Application Console (NAC)
  – Primary AWS provisioning/config interface
• EPIC Counters
  – Primary method for issue resolution
• Dashboards
• Cassandra Explorer
  – Browse clusters, keyspaces, column families
• AWS Usage Analyzer
  – Breaks down costs by application and resource
Cloud Deployment Model

[Diagram labels: Auto Scaling Group, Elastic Load Balancer, Instances, Security Group, Launch Configuration, Amazon Machine Image]
NAC
• Netflix Application Console (NAC) is Netflix’s primary
  tool in both Prod and Test to:
  • Create and Destroy Applications
  • Create and Destroy Auto Scaling Groups (ASGs)
  • Scale Instances up and down within an ASG and manage auto-scaling
  • Manage launch configs and AMIs
• http://www.slideshare.net/joesondow
NAC
Cassandra Explorer




• Kiosk mode – no alerting
• High level cluster status (thrift, gossip)
• Warns on a small set of metrics
Epic




• Netflix-wide monitoring and alerting tool based on RRD
• Priam sends all JMX data to Epic
• Very useful for finding specific issues
Dashboards




• Next level cluster details
   • Throughput
   • Latency, Gossip status, Maintenance operations
   • Trouble indicators
• Useful for finding anomalies
• Most investigations start here
Things we monitor
• Cassandra
  –   Throughput, Latency, Compactions, Repairs
  –   Pending threads, Dropped operations
  –   Backup failures
  –   Recent restarts
  –   Schema changes
• System
  – Disk space, Disk throughput, Load average
• Errors and exceptions in Cassandra, System and
  Tomcat log files

Cassandra AWS Pain Points
• Compactions cause spikes, esp. on read-heavy systems
   – Affects clients (hector, astyanax)
   – Throttling in newer Cassandra versions helps
• Repairs are toxic to performance
• Disk performance on Cloud instances and its impact on
  SSTable count
• Memory requirements due to filesystem cache
• Compression unusable in our environment
• Multi-tenancy performance unpredictable
• Java Heap size and OOM issues
Lessons learned
• In EC2 it is best to choose instances that are not multi-tenant
• Better to compact on our terms and not Cassandra’s.
  Take nodes out of service for major compactions
• Size memtable flushes for optimizing compactions
   – Helps when writes are uniformly distributed, easier to
     determine flush patterns
   – Best to optimize flushes based on memtable size, not time
   – Makes minor compactions smoother



Lessons Learned (cont)
• Key and row caches
  – Left unbounded can chew up JVM memory needed for
    normal work
  – Latencies will spike as the JVM needs to fight for
    memory
  – Off-heap row cache still maintains data structures
    on-heap
• mmap() as in-memory cache
  – When the process is terminated, mmap pages are added to
    the free list
Lessons Learned (cont)
• Sharding
  – If a single row has many gets/mutates, the nodes
    holding it will become hot spots
  – If a row grows too large, it won’t fit into memory
     • Problem for reads, compactions, and repairs
     • Some of our indices ran afoul of this
• For more info see Jason Brown’s slides
  Cassandra from the trenches
  slideshare.net/netflix
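
One common way to avoid a single hot or oversized row is to split the logical row across several physical rows. The sketch below is a generic illustration of that idea, not the scheme Netflix used; the key format, the CRC32 hash and the shard count are all assumptions.

```python
import zlib


def sharded_row_key(base_key, column_name, shard_count=16):
    """Spread one logical row over `shard_count` physical rows.

    The shard is derived from the column name, so a reader that knows the
    column can compute which physical row to fetch.
    """
    shard = zlib.crc32(column_name.encode("utf-8")) % shard_count
    return f"{base_key}:{shard}"


def all_shard_keys(base_key, shard_count=16):
    """Row keys to query when the whole logical row is needed."""
    return [f"{base_key}:{shard}" for shard in range(shard_count)]
```

Writes and point reads then spread over up to `shard_count` rows (and so over more nodes), at the cost of a multi-row read whenever the full logical row is required.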
Scalability
Scalability Testing
• Cloud Based Testing – frictionless, elastic
   – Create/destroy any sized cluster in minutes
   – Many test scenarios run in parallel

• Test Scenarios
   – Internal app specific tests
   – Simple “stress” tool provided with Cassandra

• Scale test, keep making the cluster bigger
   – Check that tooling and automation works…
   – How many ten column row writes/sec can we do?
Scale-Up Linearity
[Chart: Client Writes/s by node count – Replication Factor = 3. Y axis: Transactions (0 to 1,200,000); X axis: EC2 Instances (0 to 350). Data labels: 174373, 366828, 537172, 1099837]
Measured at the Cassandra Server
                              Throughput 3.3 Million writes/sec
[Chart: Cassandra Writes per second vs. Elapsed time (seconds)]
Response time 0.014ms
[Chart: Cassandra Response time vs. Elapsed time (seconds)]
Per Node Activity
      Per Node           48 Nodes      96 Nodes        144 Nodes           288 Nodes
Per Server Writes/s      10,900 w/s     11,460 w/s       11,900 w/s         11,456 w/s
Mean Server Latency       0.0117 ms     0.0134 ms         0.0148 ms         0.0139 ms
Mean CPU %Busy               74.4 %         75.4 %           72.5 %             81.5 %
Disk Read                5,600 KB/s     4,590 KB/s       4,060 KB/s         4,280 KB/s
Disk Write              12,800 KB/s    11,590 KB/s      10,380 KB/s        10,080 KB/s
Network Read            22,460 KB/s    23,610 KB/s      21,390 KB/s        23,640 KB/s
Network Write           18,600 KB/s    19,600 KB/s      17,810 KB/s        19,770 KB/s


       Node specification – Xen Virtual Images, AWS US East, three zones
       • Cassandra 0.8.6, CentOS, SunJDK6
       • AWS EC2 m1 Extra Large – Standard price $ 0.68/Hour
       • 15 GB RAM, 4 Cores, 1Gbit network
       • 4 internal disks (total 1.6TB, striped together, md, XFS)
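
One way to relate the client-side and server-side numbers is through the replication factor: every client write is applied on three replicas. The sketch below works through that arithmetic using the node counts and client write rates reported in this deck. It is an interpretation of how the figures fit together, not something the slides state, and it reproduces the 288-node per-server figure more closely than some of the other columns.

```python
REPLICATION_FACTOR = 3

# Client writes/s by node count, as reported in the deck.
client_writes = {48: 174373, 96: 366828, 144: 537172, 288: 1099837}


def server_writes_per_node(nodes, client_rate, rf=REPLICATION_FACTOR):
    """Each client write lands on `rf` replicas, spread across `nodes`."""
    return client_rate * rf / nodes


total_server_writes = client_writes[288] * REPLICATION_FACTOR   # about 3.3 million/s
per_node_288 = server_writes_per_node(288, client_writes[288])  # about 11,456/s
```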
Time is Money
                         48 nodes        96 nodes        144 nodes       288 nodes
Writes Capacity          174,373 w/s     366,828 w/s     537,172 w/s     1,099,837 w/s
Storage Capacity         12.8 TB         25.6 TB         38.4 TB         76.8 TB
Nodes Cost/hr            $32.64          $65.28          $97.92          $195.84
Test Driver Instances    10              20              30              60
Test Driver Cost/hr      $20.00          $40.00          $60.00          $120.00
Cross AZ Traffic         5 TB/hr         10 TB/hr        15 TB/hr        30 TB/hr [1]
Traffic Cost/10min       $8.33           $16.66          $25.00          $50.00
Setup Duration           15 minutes      22 minutes      31 minutes      66 minutes [2]
AWS Billed Duration      1 hr            1 hr            1 hr            2 hr
Total Test Cost          $60.97          $121.94         $182.92         $561.68
                                      [1] Estimate two thirds of total network traffic
                                      [2] Workaround for a tooling bug slowed setup
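
The totals in the table can be reproduced from the hourly prices. A minimal sketch: the $0.68/hour node price comes from the node specification, the $2.00/hour driver price is implied by $20.00/hr for 10 test drivers, and the function and parameter names are made up for illustration. Note that the 288-node total only comes out to $561.68 if the nodes are billed for two hours and the test drivers for one, which is an inference from the numbers rather than something the slide says.

```python
from decimal import Decimal

NODE_PRICE = Decimal("0.68")    # m1 Extra Large, $/hour (from the node specification)
DRIVER_PRICE = Decimal("2.00")  # implied by $20.00/hr for 10 test drivers


def run_cost(nodes, drivers, traffic_cost, node_hours=1, driver_hours=1):
    """Total cost of one test run in dollars."""
    return (nodes * NODE_PRICE * node_hours
            + drivers * DRIVER_PRICE * driver_hours
            + Decimal(traffic_cost))


cost_48 = run_cost(48, 10, "8.33")
# Assumption: nodes billed for 2 hours, test drivers for 1 hour.
cost_288 = run_cost(288, 60, "50.00", node_hours=2)
```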
Open Source
Open Source @ Netflix




• Source at http://netflix.github.com
• Binaries at Maven https://issues.sonatype.org/browse/OSSRH-2116
Cassandra JMeter Plugin
• Netflix uses JMeter across the fleet for
  load testing
• JMeter plugin provides a wide range of
  samplers for Get, Put, Delete and
  Schema Creation
• Used extensively to load data, Cassandra
  stress tests, feature testing etc.
• Described at
  https://github.com/Netflix/CassJMeter/wiki
Astyanax
                Available at http://github.com/netflix

• Cassandra Java client
• API abstraction on top of Thrift protocol
• “Fixed” Connection Pool abstraction (vs. Hector)
   –   Round robin with Failover
   –   Retry-able operations not tied to a connection
   –   Netflix PaaS Discovery service integration
   –   Host reconnect (fixed interval or exponential backoff)
   –   Token aware to save a network hop – lower latency
   –   Latency aware to avoid compacting/repairing nodes – lower
       variance
• Simplified use of serializers via method overloading (vs.
  Hector)
• ConnectionPoolMonitor interface
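
The connection pool ideas listed above (round robin with failover, retry-able operations not tied to a connection, exponential backoff on reconnect) can be sketched generically. This is an illustration of the pattern only, not Astyanax's API: every function name, parameter and default below is hypothetical, and for brevity the rotation restarts at the first host on each call.

```python
import itertools


def backoff_delays(base=0.1, factor=2.0, max_delay=5.0):
    """Yield exponentially growing reconnect delays, capped at max_delay."""
    delay = base
    while True:
        yield min(delay, max_delay)
        delay *= factor


def execute_with_failover(operation, hosts, max_attempts=3):
    """Try `operation(host)` on successive hosts, round robin.

    The operation is not tied to one connection: on ConnectionError the next
    host in the rotation is tried, up to `max_attempts` times.
    """
    rotation = itertools.cycle(hosts)
    last_error = None
    for _ in range(max_attempts):
        host = next(rotation)
        try:
            return operation(host)
        except ConnectionError as error:
            last_error = error
    raise last_error or ConnectionError("no hosts tried")
```

A caller passes the operation as a function of the host, so a failed attempt can be replayed on another host without the caller managing connections itself.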

Svc 202-netflix-open-source
 
SV Forum Platform Architecture SIG - Netflix Open Source Platform
SV Forum Platform Architecture SIG - Netflix Open Source PlatformSV Forum Platform Architecture SIG - Netflix Open Source Platform
SV Forum Platform Architecture SIG - Netflix Open Source Platform
 
The Netflix Open Source Platform
The Netflix Open Source PlatformThe Netflix Open Source Platform
The Netflix Open Source Platform
 
How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Inven...
How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Inven...How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Inven...
How Netflix’s Tools Can Help Accelerate Your Start-up (SVC202) | AWS re:Inven...
 
Migrating Your Databases to AWS Deep Dive on Amazon RDS and AWS
Migrating Your Databases to AWS Deep Dive on Amazon RDS and AWSMigrating Your Databases to AWS Deep Dive on Amazon RDS and AWS
Migrating Your Databases to AWS Deep Dive on Amazon RDS and AWS
 
CloudStack technical overview
CloudStack technical overviewCloudStack technical overview
CloudStack technical overview
 
Amazon Cassandra Basics & Guidelines for AWS/EC2/VPC/EBS
Amazon Cassandra Basics & Guidelines for AWS/EC2/VPC/EBSAmazon Cassandra Basics & Guidelines for AWS/EC2/VPC/EBS
Amazon Cassandra Basics & Guidelines for AWS/EC2/VPC/EBS
 
Cassandra Summit 2014: Cassandra Compute Cloud: An elastic Cassandra Infrastr...
Cassandra Summit 2014: Cassandra Compute Cloud: An elastic Cassandra Infrastr...Cassandra Summit 2014: Cassandra Compute Cloud: An elastic Cassandra Infrastr...
Cassandra Summit 2014: Cassandra Compute Cloud: An elastic Cassandra Infrastr...
 
cassandra@Netflix
cassandra@Netflixcassandra@Netflix
cassandra@Netflix
 
CloudStack Hyderabad Meetup: Using CloudStack to build IaaS clouds
CloudStack Hyderabad Meetup: Using CloudStack to build IaaS cloudsCloudStack Hyderabad Meetup: Using CloudStack to build IaaS clouds
CloudStack Hyderabad Meetup: Using CloudStack to build IaaS clouds
 
Cassandra - A Basic Introduction Guide
Cassandra - A Basic Introduction GuideCassandra - A Basic Introduction Guide
Cassandra - A Basic Introduction Guide
 
Hacking apache cloud stack
Hacking apache cloud stackHacking apache cloud stack
Hacking apache cloud stack
 

Mais de Acunu

Acunu and Hailo: a realtime analytics case study on Cassandra
Acunu and Hailo: a realtime analytics case study on CassandraAcunu and Hailo: a realtime analytics case study on Cassandra
Acunu and Hailo: a realtime analytics case study on CassandraAcunu
 
Virtual nodes: Operational Aspirin
Virtual nodes: Operational AspirinVirtual nodes: Operational Aspirin
Virtual nodes: Operational AspirinAcunu
 
Acunu Analytics and Cassandra at Hailo All Your Base 2013
Acunu Analytics and Cassandra at Hailo All Your Base 2013 Acunu Analytics and Cassandra at Hailo All Your Base 2013
Acunu Analytics and Cassandra at Hailo All Your Base 2013 Acunu
 
Understanding Cassandra internals to solve real-world problems
Understanding Cassandra internals to solve real-world problemsUnderstanding Cassandra internals to solve real-world problems
Understanding Cassandra internals to solve real-world problemsAcunu
 
Acunu Analytics: Simpler Real-Time Cassandra Apps
Acunu Analytics: Simpler Real-Time Cassandra AppsAcunu Analytics: Simpler Real-Time Cassandra Apps
Acunu Analytics: Simpler Real-Time Cassandra AppsAcunu
 
All Your Base
All Your BaseAll Your Base
All Your BaseAcunu
 
Realtime Analytics with Apache Cassandra
Realtime Analytics with Apache CassandraRealtime Analytics with Apache Cassandra
Realtime Analytics with Apache CassandraAcunu
 
Realtime Analytics with Apache Cassandra - JAX London
Realtime Analytics with Apache Cassandra - JAX LondonRealtime Analytics with Apache Cassandra - JAX London
Realtime Analytics with Apache Cassandra - JAX LondonAcunu
 
Real-time Cassandra
Real-time CassandraReal-time Cassandra
Real-time CassandraAcunu
 
Realtime Analytics on the Twitter Firehose with Apache Cassandra - Denormaliz...
Realtime Analytics on the Twitter Firehose with Apache Cassandra - Denormaliz...Realtime Analytics on the Twitter Firehose with Apache Cassandra - Denormaliz...
Realtime Analytics on the Twitter Firehose with Apache Cassandra - Denormaliz...Acunu
 
Realtime Analytics with Cassandra
Realtime Analytics with CassandraRealtime Analytics with Cassandra
Realtime Analytics with CassandraAcunu
 
Acunu Analytics @ Cassandra London
Acunu Analytics @ Cassandra LondonAcunu Analytics @ Cassandra London
Acunu Analytics @ Cassandra LondonAcunu
 
Exploring Big Data value for your business
Exploring Big Data value for your businessExploring Big Data value for your business
Exploring Big Data value for your businessAcunu
 
Realtime Analytics on the Twitter Firehose with Cassandra
Realtime Analytics on the Twitter Firehose with CassandraRealtime Analytics on the Twitter Firehose with Cassandra
Realtime Analytics on the Twitter Firehose with CassandraAcunu
 
Progressive NOSQL: Cassandra
Progressive NOSQL: CassandraProgressive NOSQL: Cassandra
Progressive NOSQL: CassandraAcunu
 
Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 R...
Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 R...Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 R...
Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 R...Acunu
 
Cassandra EU 2012 - Putting the X Factor into Cassandra
Cassandra EU 2012 - Putting the X Factor into CassandraCassandra EU 2012 - Putting the X Factor into Cassandra
Cassandra EU 2012 - Putting the X Factor into CassandraAcunu
 
Next Generation Cassandra
Next Generation CassandraNext Generation Cassandra
Next Generation CassandraAcunu
 
Cassandra EU 2012 - CQL: Then, Now and When by Eric Evans
Cassandra EU 2012 - CQL: Then, Now and When by Eric Evans Cassandra EU 2012 - CQL: Then, Now and When by Eric Evans
Cassandra EU 2012 - CQL: Then, Now and When by Eric Evans Acunu
 
Cassandra EU 2012 - Storage Internals by Nicolas Favre-Felix
Cassandra EU 2012 - Storage Internals by Nicolas Favre-FelixCassandra EU 2012 - Storage Internals by Nicolas Favre-Felix
Cassandra EU 2012 - Storage Internals by Nicolas Favre-FelixAcunu
 

Mais de Acunu (20)

Acunu and Hailo: a realtime analytics case study on Cassandra
Acunu and Hailo: a realtime analytics case study on CassandraAcunu and Hailo: a realtime analytics case study on Cassandra
Acunu and Hailo: a realtime analytics case study on Cassandra
 
Virtual nodes: Operational Aspirin
Virtual nodes: Operational AspirinVirtual nodes: Operational Aspirin
Virtual nodes: Operational Aspirin
 
Acunu Analytics and Cassandra at Hailo All Your Base 2013
Acunu Analytics and Cassandra at Hailo All Your Base 2013 Acunu Analytics and Cassandra at Hailo All Your Base 2013
Acunu Analytics and Cassandra at Hailo All Your Base 2013
 
Understanding Cassandra internals to solve real-world problems
Understanding Cassandra internals to solve real-world problemsUnderstanding Cassandra internals to solve real-world problems
Understanding Cassandra internals to solve real-world problems
 
Acunu Analytics: Simpler Real-Time Cassandra Apps
Acunu Analytics: Simpler Real-Time Cassandra AppsAcunu Analytics: Simpler Real-Time Cassandra Apps
Acunu Analytics: Simpler Real-Time Cassandra Apps
 
All Your Base
All Your BaseAll Your Base
All Your Base
 
Realtime Analytics with Apache Cassandra
Realtime Analytics with Apache CassandraRealtime Analytics with Apache Cassandra
Realtime Analytics with Apache Cassandra
 
Realtime Analytics with Apache Cassandra - JAX London
Realtime Analytics with Apache Cassandra - JAX LondonRealtime Analytics with Apache Cassandra - JAX London
Realtime Analytics with Apache Cassandra - JAX London
 
Real-time Cassandra
Real-time CassandraReal-time Cassandra
Real-time Cassandra
 
Realtime Analytics on the Twitter Firehose with Apache Cassandra - Denormaliz...
Realtime Analytics on the Twitter Firehose with Apache Cassandra - Denormaliz...Realtime Analytics on the Twitter Firehose with Apache Cassandra - Denormaliz...
Realtime Analytics on the Twitter Firehose with Apache Cassandra - Denormaliz...
 
Realtime Analytics with Cassandra
Realtime Analytics with CassandraRealtime Analytics with Cassandra
Realtime Analytics with Cassandra
 
Acunu Analytics @ Cassandra London
Acunu Analytics @ Cassandra LondonAcunu Analytics @ Cassandra London
Acunu Analytics @ Cassandra London
 
Exploring Big Data value for your business
Exploring Big Data value for your businessExploring Big Data value for your business
Exploring Big Data value for your business
 
Realtime Analytics on the Twitter Firehose with Cassandra
Realtime Analytics on the Twitter Firehose with CassandraRealtime Analytics on the Twitter Firehose with Cassandra
Realtime Analytics on the Twitter Firehose with Cassandra
 
Progressive NOSQL: Cassandra
Progressive NOSQL: CassandraProgressive NOSQL: Cassandra
Progressive NOSQL: Cassandra
 
Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 R...
Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 R...Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 R...
Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 R...
 
Cassandra EU 2012 - Putting the X Factor into Cassandra
Cassandra EU 2012 - Putting the X Factor into CassandraCassandra EU 2012 - Putting the X Factor into Cassandra
Cassandra EU 2012 - Putting the X Factor into Cassandra
 
Next Generation Cassandra
Next Generation CassandraNext Generation Cassandra
Next Generation Cassandra
 
Cassandra EU 2012 - CQL: Then, Now and When by Eric Evans
Cassandra EU 2012 - CQL: Then, Now and When by Eric Evans Cassandra EU 2012 - CQL: Then, Now and When by Eric Evans
Cassandra EU 2012 - CQL: Then, Now and When by Eric Evans
 
Cassandra EU 2012 - Storage Internals by Nicolas Favre-Felix
Cassandra EU 2012 - Storage Internals by Nicolas Favre-FelixCassandra EU 2012 - Storage Internals by Nicolas Favre-Felix
Cassandra EU 2012 - Storage Internals by Nicolas Favre-Felix
 

Último

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 

Último (20)

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 

Netflix Cassandra Architecture at Scale

  • 1. Cassandra in the Netflix Architecture CassandraEU London March 28th, 2012 Denis Sheahan
  • 2. Agenda • Netflix and the Cloud • Why Cassandra • Cassandra Deployment @ Netflix • Cassandra Operations • Cassandra lessons learned • Scalability • Open Source
  • 3. Netflix and the Cloud With more than 23 million streaming members in the United States, Canada, Latin America, the United Kingdom and Ireland, Netflix, Inc. is the world's leading internet subscription service for enjoying movies and TV series. Netflix.com is almost 100% deployed on the Amazon Cloud. Source: http://ir.netflix.com
  • 4. Out-Growing Data Center [Chart: 37x Growth Jan 2010-Jan 2011, plotted against Datacenter Capacity] http://techblog.netflix.com/2011/02/redesigning-netflix-api.html
  • 5. Netflix Deployed on AWS
      Content         Logs                    Play          WWW              API                 CS
      Video Masters   S3                      DRM           Sign-Up          Metadata            International CS lookup
      EC2             EMR Hadoop              CDN routing   Search           Device Config       Diagnostics & Actions
      S3              Hive                    Bookmarks     Movie Choosing   TV Movie Choosing   Customer Call Log
      CDNs            Business Intelligence   Logging       Ratings          Social Facebook     CS Analytics
  • 8. Distributed Key-Value Store vs Central SQL Database • Datacenter had a central database DBA • Schema changes require downtime • Cloud, in contrast, has many key-value data stores – Joins take place in Java code – No schema to change, no scheduled downtime
  • 9. Goals for moving from Netflix DC to the Cloud • Faster – Lower latency than the equivalent datacenter web • Scalable – Avoid needing any more datacenter capacity as subscriber count increases • Available – Substantially higher robustness and availability than datacenter services • Productive – Optimize agility of a large development team with automation and tools
  • 10. Cassandra • Faster – Low latency, low latency variance • Scalable – Supports running on Amazon EC2 – High and scalable read and write throughput – Support for Multi-region clusters • Available – We value Availability over Consistency – Cassandra Eventually Consistent – Supports Amazon Availability Zones – Data integrity checks and repairs – Online Snapshot Backup, Restore/Rollback • Productive – We want FOSS + Support
  • 12. Netflix Cassandra Use Cases • Many different profiles – Read heavy environments with a strict SLA – Batch Write environments (70 rows per batch) also serving low latency Reads – Read Modify Write environments with large rows – Write only environments with rapidly increasing data sets – And many more…
  • 13. How much we use Cassandra
      30        Number of production clusters
      12        Number of multi-region clusters
      3         Max regions, one cluster
      65        Total TB of data across all clusters
      472       Number of Cassandra nodes
      72/28     Largest Cassandra cluster (nodes/data in TB)
      6k/250k   Max read/writes per second
  • 14. Deployment Architecture [Diagram labels: API; AWS EC2; Front End Load Balancer; Discovery Service; API Proxy Load Balancer; Component API Services; memcached; Cassandra; EC2 Internal Disks; Backup; S3]
  • 15. High Availability Deployment • Fact of life – EC2 instances die • We store 3 local replicas in 3 different Cassandra nodes – One replica per EC2 Availability Zone (AZ) • Minimum Cluster configuration is 6 nodes, 2 per AZ – Single instance failure still leaves at least one node in each AZ • Use Local Quorum for writes • Use Quorum One for reads • Entire cluster replicated in Multi-region deployments • AWS Availability Zones – Separate buildings – Separate power etc. – Fairly close together
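The replica arithmetic on this slide can be sketched in a few lines of Python. This is an illustration only, not Netflix's code: the function names and the zone labels are hypothetical, and the quorum size is assumed to follow the usual Cassandra definition of floor(RF / 2) + 1.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must acknowledge a quorum-level request."""
    return replication_factor // 2 + 1


def tolerated_replica_failures(replication_factor: int) -> int:
    """Replicas that can be down while a quorum request still succeeds."""
    return replication_factor - quorum(replication_factor)


def nodes_left_per_az(nodes_per_az: dict, failed_az: str) -> dict:
    """Remaining node count per AZ after one instance in failed_az dies."""
    remaining = dict(nodes_per_az)
    remaining[failed_az] -= 1
    return remaining


if __name__ == "__main__":
    rf = 3  # three local replicas, one per Availability Zone
    print("quorum size:", quorum(rf))
    print("replica failures tolerated:", tolerated_replica_failures(rf))

    # Minimum cluster from the slide: 6 nodes, 2 per AZ (labels are made up).
    minimum_cluster = {"zone-a": 2, "zone-b": 2, "zone-c": 2}
    print(nodes_left_per_az(minimum_cluster, "zone-b"))
```

With three replicas, two acknowledgements form a quorum, so one replica (or one whole zone) can be unavailable without failing quorum writes, and a single instance failure in the 6-node layout still leaves a node in every zone.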
  • 16. Astyanax - Cassandra Write Data Flows: Single Region, Multiple Availability Zone, Token Aware
      • Client writes to nodes and Zones
      • Nodes return ack to client
      • Data written to internal commit log disks (no more than 10 seconds later)
      If a node goes offline, hinted handoff completes the write when the node comes back up.
      Requests can choose to wait for one node, a quorum, or all nodes to ack the write.
      SSTable disk writes and compactions occur asynchronously.
      [Diagram: Token Aware Clients writing to six Cassandra nodes, each with its own disks, two in each of Zones A, B and C]
  • 17. Extending to Multi-Region: In production for UK/Ireland support
      • Create cluster in EU
      • Backup US cluster to S3
      • Restore backup in EU
      • Local repair EU cluster
      • Global repair/join
      [Diagram: US and EU clusters, each with its own clients and six Cassandra nodes with disks across Zones A, B and C; S3 shown alongside]
  • 18. Data Flows for Multi-Region Writes: Token Aware, Consistency Level = Local Quorum
      • Client writes to local replicas
      • Local write acks returned to Client, which continues when 2 of 3 local nodes are committed
      • Local coordinator writes to remote coordinator
      • When data arrives, remote coordinator node acks
      • Remote coordinator sends data to other remote zones
      • Remote nodes ack to local coordinator
      • Data flushed to internal commit log disks (no more than 10 seconds later)
      If a node or region goes offline, hinted handoff completes the write when the node comes back up.
      Nightly global compare and repair jobs ensure everything stays consistent.
      [Diagram: local (US) and remote (EU) clusters, each with its own clients and six Cassandra nodes with disks across Zones A, B and C]
  • 19. Priam – Cassandra Automation: Available at http://github.com/netflix • Open Source Tomcat Code running as a sidecar on each Cassandra node. Deployed as a separate rpm • Zero touch auto-configuration • Token allocation and assignment including multi-region • Broken node replacement and ring expansion • Full and incremental backup/restore to/from S3 • Metrics collection and forwarding via JMX
  • 20. Cassandra Backup/Restore
      • Full Backup
          – Time based snapshot
          – SSTable compress -> S3
      • Incremental
          – SSTable write triggers compressed copy to S3
      • Archive
          – Copy cross region
      • Restore
          – Full restore or create new Ring from Backup
      [Diagram: ring of Cassandra nodes backing up to S3 Backup]
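A rough sketch of what "token aware" means on this slide: the client hashes the row key onto the ring and works out which nodes own it, so it can write to a replica directly. The Python below is an illustration under stated assumptions, not Astyanax's implementation: the MD5-based token, the 2**127 token space, the evenly spaced tokens, the node names and the one-replica-per-zone walk are all choices made for this example.

```python
import bisect
import hashlib

TOKEN_SPACE = 2 ** 127  # assumed token range for this sketch


def token_for(key: bytes) -> int:
    """Hash a row key onto the ring (MD5-based, for illustration)."""
    return int.from_bytes(hashlib.md5(key).digest(), "big") % TOKEN_SPACE


def build_ring(zones, nodes_per_zone):
    """Evenly spaced tokens, with zones interleaved around the ring."""
    total = len(zones) * nodes_per_zone
    ring = []
    for i in range(total):
        zone = zones[i % len(zones)]
        ring.append((i * TOKEN_SPACE // total, f"node-{i}", zone))
    return ring


def replicas_for(key: bytes, ring, replication_factor=3):
    """Walk clockwise from the key's token, taking one node per zone."""
    tokens = [token for token, _, _ in ring]
    # First node whose token is >= the key's token, wrapping past the end.
    start = bisect.bisect_left(tokens, token_for(key)) % len(ring)
    chosen, seen_zones = [], set()
    for step in range(len(ring)):
        _, node, zone = ring[(start + step) % len(ring)]
        if zone not in seen_zones:
            chosen.append((node, zone))
            seen_zones.add(zone)
        if len(chosen) == replication_factor:
            break
    return chosen


if __name__ == "__main__":
    ring = build_ring(["zone-a", "zone-b", "zone-c"], nodes_per_zone=2)
    for node, zone in replicas_for(b"customer:12345", ring):
        print(node, zone)
```

Because the zones are interleaved, any three consecutive ring positions fall in three different zones, which matches the one-replica-per-zone layout described in the deployment slides.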
  • 22. Consoles, Monitors and Explorers • Netflix Application Console (NAC) – Primary AWS provisioning/config interface • EPIC Counters – Primary method for issue resolution • Dashboards • Cassandra Explorer – Browse clusters, keyspaces, column families • AWS Usage Analyzer – Breaks down costs by application and resource
  • 23. Cloud Deployment Model [Diagram: Elastic Load Balancer, Auto Scaling Group, Instances, Security Group, Launch Configuration, Amazon Machine Image]
  • 24. NAC • Netflix Application Console (NAC) is Netflix’s primary tool in both Prod and Test to: • Create and Destroy Applications • Create and Destroy Auto Scaling Groups (ASGs) • Scale Instances up and down within an ASG and manage auto-scaling • Manage launch configs and AMIs • http://www.slideshare.net/joesondow
  • 25. NAC
  • 27. Cassandra Explorer • Kiosk mode – no alerting • High level cluster status (thrift, gossip) • Warns on a small set of metrics
  • 28. Epic • Netflix-wide monitoring and alerting tool based on RRD • Priam sends all JMX data to Epic • Very useful for finding specific issues
  • 29. Dashboards • Next level cluster details • Throughput • Latency, Gossip status, Maintenance operations • Trouble indicators • Useful for finding anomalies • Most investigations start here
  • 30. Things we monitor • Cassandra – Throughput, Latency, Compactions, Repairs – Pending threads, Dropped operations – Backup failures – Recent restarts – Schema changes • System – Disk space, Disk throughput, Load average • Errors and exceptions in Cassandra, System and Tomcat log files
  • 31. Cassandra AWS Pain Points • Compactions cause spikes, esp. on read-heavy systems – Affects clients (hector, astyanax) – Throttling in newer Cassandra versions helps • Repairs are toxic to performance • Disk performance on Cloud instances and its impact on SSTable count • Memory requirements due to filesystem cache • Compression unusable in our environment • Multi-tenancy performance unpredictable • Java Heap size and OOM issues
  • 32. Lessons learned • In EC2 best to choose instances that are not multi-tenant • Better to compact on our terms and not Cassandra’s. Take nodes out of service for major compactions • Size memtable flushes for optimizing compactions – Helps when writes are uniformly distributed, easier to determine flush patterns – Best to optimize flushes based on memtable size, not time – Makes minor compactions smoother
  • 33. Lessons Learned (cont) • Key and row caches – Left unbounded can chew up JVM memory needed for normal work – Latencies will spike as the JVM needs to fight for memory – Off-heap row cache still maintains data structures on-heap • mmap() as in-memory cache – When process terminated, mmap pages are added to the free list
  • 34. Lessons Learned (cont) • Sharding – If a single row has many gets/mutates, the nodes holding it will become hot spots – If a row grows too large, it won’t fit into memory • Problem for reads, compactions, and repairs • Some of our indices ran afoul of this • For more info see Jason Brown’s slides Cassandra from the trenches slideshare.net/netflix
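One common way to avoid the hot or oversized rows described on this slide is to split a logical row across several physical rows. The Python below is a minimal sketch of that idea, not the scheme Netflix used: the key format, the CRC32 bucketing and the function names are assumptions for illustration.

```python
import zlib


def shard_key(row_key: str, column_name: str, shard_count: int) -> str:
    """Physical row key for one column of a logically wide row."""
    # CRC32 is stable across processes, unlike Python's built-in hash().
    bucket = zlib.crc32(column_name.encode("utf-8")) % shard_count
    return f"{row_key}:{bucket}"


def all_shard_keys(row_key: str, shard_count: int) -> list:
    """Every physical row a reader must fetch to rebuild the logical row."""
    return [f"{row_key}:{bucket}" for bucket in range(shard_count)]


if __name__ == "__main__":
    # Hypothetical index row split into 8 buckets.
    print(shard_key("index:titles", "movie-70143836", 8))
    print(all_shard_keys("index:titles", 8))
```

The trade-off is that writes spread across more nodes and each physical row stays smaller, while a full read of the logical row now has to fan out to every bucket.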
  • 36. Scalability Testing • Cloud Based Testing – frictionless, elastic – Create/destroy any sized cluster in minutes – Many test scenarios run in parallel • Test Scenarios – Internal app specific tests – Simple “stress” tool provided with Cassandra • Scale test, keep making the cluster bigger – Check that tooling and automation works… – How many ten column row writes/sec can we do?
  • 37. Scale-Up Linearity • Client Writes/s by node count – Replication Factor = 3 • [Chart: transactions per second vs. number of EC2 instances; plotted values 174,373 / 366,828 / 537,172 / 1,099,837]
  • 38. Measured at the Cassandra Server • Throughput 3.3 million writes/sec • [Chart: Cassandra writes per second vs. elapsed time in seconds]
  • 39. Response time 0.014 ms • [Chart: Cassandra response time vs. elapsed time in seconds]
  • 40. Per Node Activity
      Per Node               48 Nodes      96 Nodes      144 Nodes     288 Nodes
      Per Server Writes/s    10,900 w/s    11,460 w/s    11,900 w/s    11,456 w/s
      Mean Server Latency    0.0117 ms     0.0134 ms     0.0148 ms     0.0139 ms
      Mean CPU %Busy         74.4 %        75.4 %        72.5 %        81.5 %
      Disk Read              5,600 KB/s    4,590 KB/s    4,060 KB/s    4,280 KB/s
      Disk Write             12,800 KB/s   11,590 KB/s   10,380 KB/s   10,080 KB/s
      Network Read           22,460 KB/s   23,610 KB/s   21,390 KB/s   23,640 KB/s
      Network Write          18,600 KB/s   19,600 KB/s   17,810 KB/s   19,770 KB/s
    Node specification – Xen Virtual Images, AWS US East, three zones • Cassandra 0.8.6, CentOS, SunJDK6 • AWS EC2 m1 Extra Large – Standard price $ 0.68/Hour • 15 GB RAM, 4 Cores, 1Gbit network • 4 internal disks (total 1.6TB, striped together, md, XFS)
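The client-side and server-side figures can be related with a little arithmetic. The sketch below assumes that every client write becomes one write per replica (Replication Factor = 3) and that the load is spread evenly over the nodes; both are simplifications, and the variable names are illustrative. The client rates are the plotted values from the Scale-Up Linearity slide, paired with the node counts from the tables.

```python
REPLICATION_FACTOR = 3

# Client writes/s by node count (Scale-Up Linearity slide).
client_writes = {48: 174_373, 96: 366_828, 144: 537_172, 288: 1_099_837}


def server_writes_per_node(nodes, client_rate, rf=REPLICATION_FACTOR):
    """Estimated writes/s handled by each server, counting all replicas."""
    return client_rate * rf / nodes


per_node = {n: server_writes_per_node(n, w) for n, w in client_writes.items()}

# Estimated cluster-wide server writes/s at 288 nodes.
cluster_server_writes = client_writes[288] * REPLICATION_FACTOR

for nodes, rate in per_node.items():
    print(f"{nodes:>3} nodes: ~{rate:,.0f} server writes/s per node")
print(f"288-node cluster: ~{cluster_server_writes:,} server writes/s")
```

Under these assumptions the 288-node estimate is about 3.3 million server writes/s, in line with the throughput slide, and the per-node estimates for 48, 96 and 288 nodes land within a few writes/s of the Per Server Writes/s row. The 144-node estimate comes out roughly 6% below the tabulated 11,900 w/s; the slides do not explain that difference.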
  • 41. Time is Money
                               48 nodes      96 nodes      144 nodes     288 nodes
      Writes Capacity          174373 w/s    366828 w/s    537172 w/s    1,099,837 w/s
      Storage Capacity         12.8 TB       25.6 TB       38.4 TB       76.8 TB
      Nodes Cost/hr            $32.64        $65.28        $97.92        $195.84
      Test Driver Instances    10            20            30            60
      Test Driver Cost/hr      $20.00        $40.00        $60.00        $120.00
      Cross AZ Traffic         5 TB/hr       10 TB/hr      15 TB/hr      30 TB/hr [1]
      Traffic Cost/10min       $8.33         $16.66        $25.00        $50.00
      Setup Duration           15 minutes    22 minutes    31 minutes    66 minutes [2]
      AWS Billed Duration      1 hr          1 hr          1 hr          2 hr
      Total Test Cost          $60.97        $121.94       $182.92       $561.68
    [1] Estimate two thirds of total network traffic
    [2] Workaround for a tooling bug slowed setup
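The Total Test Cost row can be reproduced from the other rows. The sketch below works in cents to avoid floating-point rounding. Two things in it are inferred rather than stated on the slide: the $2.00/hour test driver price (from $20.00/hr for 10 instances), and that in the 288-node run only the Cassandra nodes were billed for two hours, which is the combination that matches the $561.68 total. The function and constant names are illustrative.

```python
NODE_CENTS_PER_HOUR = 68      # m1 Extra Large at $0.68/hour (node specification)
DRIVER_CENTS_PER_HOUR = 200   # inferred: $20.00/hr for 10 test driver instances


def total_test_cost_cents(nodes, drivers, traffic_cents, node_hours=1, driver_hours=1):
    """Sum node, test driver and cross-AZ traffic cost for one test run, in cents."""
    node_cost = nodes * NODE_CENTS_PER_HOUR * node_hours
    driver_cost = drivers * DRIVER_CENTS_PER_HOUR * driver_hours
    return node_cost + driver_cost + traffic_cents


# (nodes, test drivers, traffic cost per 10-minute run in cents, billed node hours)
runs = [
    (48, 10, 833, 1),
    (96, 20, 1666, 1),
    (144, 30, 2500, 1),
    (288, 60, 5000, 2),   # inferred: nodes billed for 2 hours, drivers for 1
]

for nodes, drivers, traffic, hours in runs:
    cents = total_test_cost_cents(nodes, drivers, traffic, node_hours=hours)
    print(f"{nodes:>3} nodes: ${cents / 100:.2f}")
```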
  • 43. Open Source @ Netflix • Source at http://netflix.github.com • Binaries at Maven https://issues.sonatype.org/browse/OSSRH-2116
  • 44. Cassandra JMeter Plugin • Netflix uses JMeter across the fleet for load testing • JMeter plugin provides a wide range of samplers for Get, Put, Delete and Schema Creation • Used extensively to load data, Cassandra stress tests, feature testing etc. • Described at https://github.com/Netflix/CassJMeter/wiki
  • 45. Astyanax • Available at http://github.com/netflix • Cassandra Java client • API abstraction on top of Thrift protocol • “Fixed” Connection Pool abstraction (vs. Hector) – Round robin with Failover – Retry-able operations not tied to a connection – Netflix PaaS Discovery service integration – Host reconnect (fixed interval or exponential backoff) – Token aware to save a network hop – lower latency – Latency aware to avoid compacting/repairing nodes – lower variance • Simplified use of serializers via method overloading (vs. Hector) • ConnectionPoolMonitor interface

Editor's Notes

  1. Find numbers some day
  2. Send to John C once finished
  3. Remove cluster names, hopefully find one without purple. Might want another slide with the cluster details, if ready by 3/27
  4. Compaction is happening continually after only a few seconds. Logs show it flushes the Memtable every 10 - 15 seconds. Logs show a minor compaction every 30 seconds or so. You can see this also in the iostat data: there are both reads and writes going to disk, and the majority are writes.
     Stress command line:
       java -jar stress.jar -d "144 node ids" -e ONE -n 27000000 -l 3 -i 1 -t 200 -p 7102 -o INSERT -c 10 -r
     So it's writing 10 columns per row, key id randomly chosen from 27 million ids. Thirty clients talk to the first 144 nodes and 30 talk to the second 144. For the insert we write three replicas, which is specified in the keyspace:
       Keyspace: Keyspace1:
         Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
         Durable Writes: true
           Options: [us-east:3]
         Column Families:
           ColumnFamily: Standard1
             Key Validation Class: org.apache.cassandra.db.marshal.BytesType
             Default column value validator: org.apache.cassandra.db.marshal.BytesType
             Columns sorted by: org.apache.cassandra.db.marshal.BytesType
             Row cache size / save period in seconds: 0.0/0
             Key cache size / save period in seconds: 200000.0/14400
             Memtable thresholds: 1.7671875/1440/128 (millions of ops/minutes/MB)
             GC grace seconds: 864000
             Compaction min/max thresholds: 4/32
             Read repair chance: 0.0
             Replicate on write: true
  5. Cross AZ traffic calculation – per node average 23640+19770 = 43410 KB/s
     288 nodes times 3600 = 45007 GB/hour
     2/3s = 30000 GB/hour, $0.01/GB = $300
     10 minute test run = 300/6 = $50
     Slides error: test driver was m2.4xl, not m1.xl
     Test driver TX 250 Mbit/s = 31 MBytes/s, RX 35 Mbit/s = 4.3 MBytes/s
     60 x 35 MB/s * 3600 = 7.5 TB/hr
     Each write is about 400 bytes on disk
     Creating the tests is heavily dependent on AWS and the fact that we can only launch 96 at a time.
     Looking at the AWS, Linux and Cassandra logs for 288 way:
     I kicked off the first ASG from 0->96 at 00:08:02
     27 seconds later the first Linux box was booted at Sat Oct 22 00:08:29 UTC 2011
     Last Linux (number 288) booted at Sat Oct 22 01:09:51 UTC 2011
     Last Cassandra came online at 01:12:41, about 3 minutes later
     So just over an hour to get this bad boy up. Most of the time was waiting for the nodes to join the cluster. I waited for all 96 to join before starting the next AZ.
     AWS claims it took 4 minutes and 40 seconds to launch the 96 instances. This is pretty consistent across the 3 AZs. So about 15 - 16 minutes of the 1 hour was AWS.
     It seems to take about 1 minute 30 seconds to boot the Linux instances.
     Note in launching the 96 there were failures / retries in all 3 AZs:
     us-east-1a had 9 failures
     us-east-1c had 1 failure
     us-east-1d also had 9 failures
     Run times varied a bit, mostly based on how long I could sustain the load with the number of clients writing 27 million records. In Cassandra stress you cannot specify an elapsed time, just a total number of transactions. It also decays irregularly as threads terminate.
     48 way sustained load for 570 seconds
     96 way sustained load for 550 seconds
     144 way sustained for 660 seconds
     288 way sustained for 780 seconds
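The cross-AZ traffic estimate in this note can be replayed step by step. The sketch below uses only figures from the note and the Per Node Activity slide (288-node column) and assumes decimal units (1 GB = 1,000,000 KB); the variable names are illustrative. The note rounds the intermediate results to 30000 GB/hour and $300 before dividing by six.

```python
NET_READ_KB_S = 23_640    # per-node network read, 288-node run
NET_WRITE_KB_S = 19_770   # per-node network write, 288-node run
NODES = 288
PRICE_PER_GB = 0.01       # cross-AZ transfer price used in the note, in dollars

per_node_kb_s = NET_READ_KB_S + NET_WRITE_KB_S               # 43,410 KB/s
total_gb_per_hour = per_node_kb_s * NODES * 3600 / 1_000_000

# The note estimates that two thirds of all traffic crosses an AZ boundary.
cross_az_gb_per_hour = total_gb_per_hour * 2 / 3
cost_per_hour = cross_az_gb_per_hour * PRICE_PER_GB
cost_per_10_min = cost_per_hour / 6

print(f"total traffic:    {total_gb_per_hour:,.0f} GB/hour")
print(f"cross-AZ traffic: {cross_az_gb_per_hour:,.0f} GB/hour")
print(f"cost per 10 min:  ${cost_per_10_min:.2f}")
```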
  6. Complete connection pool abstraction
     Queries and mutations are wrapped in objects created by the Keyspace implementation, making it possible to retry failed operations. This varies from other connection pool implementations, in which the operation is created on a specific connection and must be completely redone if it fails.
     Simplified serialization via method overloading. The low level thrift library only understands data that is serialized to a byte array. Hector requires serializers to be specified for nearly every call. Astyanax minimizes the places where serializers are specified by using predefined ColumnFamily and ColumnPath definitions which specify the serializers. The API also overloads set and get operations for common data types.
     The internal library does not log anything. All internal events are instead ... calls to a ConnectionPoolMonitor interface. This allows customization of log levels and filtering of repeating events outside of the scope of the connection pool.
     Super columns will soon be replaced by Composite column names. As such it is recommended to not use super columns at all and to use Composite column names instead. There is some support for super columns in Astyanax but those methods have been deprecated and will eventually be removed.
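The idea of an operation that is not tied to a connection can be illustrated with a small sketch. This is not the Astyanax API: `RoundRobinPool`, `HostDownError` and `execute_with_failover` are hypothetical names, and the sketch is in Python rather than Java. It also applies exponential backoff between retry attempts for brevity, whereas the slide lists backoff under host reconnect.

```python
import itertools
import time


class HostDownError(Exception):
    """Raised by an operation when the chosen host cannot serve it."""


class RoundRobinPool:
    """Hypothetical pool that hands out hosts in rotation."""

    def __init__(self, hosts):
        self._cycle = itertools.cycle(hosts)

    def next_host(self):
        return next(self._cycle)


def execute_with_failover(pool, operation, max_attempts=3, base_delay=0.0):
    """Run operation(host) on successive hosts until one succeeds.

    The operation is a plain callable, so a failed attempt can be replayed
    on the next host without rebuilding it.
    """
    last_error = None
    for attempt in range(max_attempts):
        host = pool.next_host()
        try:
            return operation(host)
        except HostDownError as err:
            last_error = err
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            time.sleep(base_delay * (2 ** attempt))
    raise last_error


def put(host):
    """Toy mutation: node-a is treated as down, every other host accepts the write."""
    if host == "node-a":
        raise HostDownError(host)
    return f"written via {host}"


pool = RoundRobinPool(["node-a", "node-b", "node-c"])
result = execute_with_failover(pool, put)
```

Here the first attempt fails on `node-a` and the same `put` callable is replayed on `node-b`. A token-aware or latency-aware pool would differ only in how `next_host` chooses the host.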