As more and more businesses move from enterprise IT solutions to web-scale cloud solutions to meet growing customer demand, they need to be innovative and find ways for their applications and infrastructure to scale rapidly while staying highly available.
High availability is a critical requirement for any online business; architecting around failures, expecting infrastructure to fail, and remaining available even when it does is the key to success. One such effort here at Netflix was the Active-Active implementation, which provided region resiliency. This presentation gives a brief overview of the Active-Active implementation and how it leveraged Cassandra’s architecture in the backend to achieve that goal. It covers our journey through Active-Active from Cassandra’s perspective, the data validation we did to prove the backend would work without impacting the customer experience, the various problems we ran into such as long repair times and gc_grace settings, the lessons we learned, and what we would do differently next time.
6. WHAT IS ACTIVE-ACTIVE
Also called dual active, it is a phrase used to describe a network of independent processing nodes where each node has access to a replicated database, giving every node access to and use of a single application. In an active-active system all requests are load balanced across all available processing capacity; where a failure occurs on a node, another node in the network takes its place.
7. DOES AN INSTANCE FAIL?
• It can; plan for it
• Bad code / configuration pushes
• Latent issues
• Hardware failure
• Test with Chaos Monkey
8. DOES A ZONE FAIL?
• Rarely, but it has happened before
• Routing issues
• DC-specific issues
• App-specific issues within a zone
• Test with Chaos Gorilla
9. DOES A REGION FAIL?
• Full region – unlikely, very rare
• Individual Services can fail region-wide
• Most likely a region-wide configuration issue
• Test with Chaos Kong
10. EVERYTHING FAILS… EVENTUALLY
• Keep your services running by embracing isolation and
redundancy
• Construct a highly agile and highly available service from ephemeral, assumed-broken components
11. ISOLATION
• Changes in one region should not affect others
• Regional outage should not affect others
• Network partitioning between regions should not affect
functionality / operations
12. REDUNDANCY
• Make more than one (of pretty much everything)
• Specifically, distribute services across Availability
Zones and regions
13. HISTORY: X-MAS EVE 2012
• Netflix multi-hour outage
• US-East1 regional Elastic Load Balancing issue
• “...data was deleted by a maintenance process
that was inadvertently run against the
production ELB state data”
23. UPDATE KEYSPACE
Update keyspace <keyspace> with placement_strategy =
'NetworkTopologyStrategy'
and strategy_options = {us-east : 3, us-west-2 : 3};
(us-east : 3 is the existing region and replication factor; us-west-2 : 3 is the new region and replication factor)
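For reference, a sketch of the same change in CQL, followed by the nodetool step typically run on each node in the new region to stream in the existing data (the keyspace name and source data center are placeholders, not from the deck):

ALTER KEYSPACE my_keyspace
  WITH replication = {'class': 'NetworkTopologyStrategy',
                      'us-east': 3,      -- existing region keeps RF 3
                      'us-west-2': 3};   -- new region added at RF 3

# on each node in the new region, stream existing data from the original data center
nodetool rebuild us-east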
27. BENCHMARKING GLOBAL CASSANDRA
WRITE-INTENSIVE TEST OF CROSS-REGION REPLICATION CAPACITY
16 X HI1.4XLARGE SSD NODES PER ZONE = 96 TOTAL
192 TB OF SSD IN SIX LOCATIONS, UP AND RUNNING CASSANDRA IN 20 MINUTES
[Diagram: Cassandra replicas across Zones A, B, and C in both the US-West-2 (Oregon) and US-East-1 (Virginia) regions, with test load applied in each region, a validation load, and interzone plus interregional replication traffic]
• 1 million writes at CL.ONE (wait for one replica to ack)
• 1 million reads after 500 ms at CL.ONE with no data loss
• Interregional traffic up to 9 Gbit/s at 83 ms
• 18 TB of backups from S3
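The deck does not show the test harness itself; as a rough sketch of the same idea using the stock cassandra-stress tool (hostnames are placeholders, and Netflix used its own benchmarking setup):

# drive writes at CL.ONE against the us-west-2 cluster
cassandra-stress write n=1000000 cl=ONE -node uswest2-cass-node1
# shortly afterwards, read the same keys back from us-east-1 to confirm replication
cassandra-stress read n=1000000 cl=ONE -node useast1-cass-node1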
39. TIME TO REPAIR DEPENDS ON
• Number of regions
• Number of replicas
• Data size
• Amount of entropy
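Repair time depends on all of these; the repair itself is run node by node. A minimal sketch (keyspace name is a placeholder), using the primary-range option so each token range is repaired only once across the cluster:

# run on every node in turn
nodetool repair -pr my_keyspace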
40. ADJUST GC_GRACE AFTER EXTENSION
• Column Family Setting
• Defined in seconds
• Default 10 days
• Tweak gc_grace settings to accommodate the time taken to repair (see the sketch below)
• BEWARE of deleted columns: if repair does not finish within gc_grace, tombstones can be purged before they propagate and deleted data can reappear
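A minimal sketch of raising the setting, assuming a CQL table (the names and the 20-day value are illustrative, not from the deck):

-- give repair more headroom than the default 10 days (864000 s)
ALTER TABLE my_keyspace.my_table WITH gc_grace_seconds = 1728000;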
43. CONSISTENCY LEVEL
• Check the client for the consistency level setting
• In a multi-region cluster, QUORUM is not the same as LOCAL_QUORUM
• Recommended consistency levels: LOCAL_ONE (CASSANDRA-6202) for reads and LOCAL_QUORUM for writes (illustrated below)
• For region resiliency, avoid ALL or QUORUM calls
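Driver APIs vary and the consistency level is normally set per request in the client; as a minimal illustration, the session default can be set in cqlsh before issuing statements:

CONSISTENCY LOCAL_QUORUM;
-- or, for read paths that can tolerate a single local replica:
CONSISTENCY LOCAL_ONE;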