Some vignettes and advice based on prior experience with Cassandra clusters in live environments. Includes some material from other operational slides.
4. Hardware : Memory
More is better up to a point.
From 1.2, off-heap bloom filters and compression metadata
allow greater data density per node; less heap is used, but
more system memory is needed
Respect Java heap limits: treat 8GB as an upper bound. Java 7 has
improved garbage collection and may let you push it higher (a
heap-sizing sketch follows these notes).
Be aware of perm gen and native memory as well.
Keep an eye on system memory. If memory is nearly exhausted,
system performance degrades across the board
Anecdotal: 20MB RAM per TB
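A minimal sketch of the heap-sizing rule the stock cassandra-env.sh applies (roughly: half of system RAM capped at 1GB versus a quarter capped at 8GB, whichever is larger); treat the output as a starting point, not a prescription.

    # Rough sketch of the default heap-sizing rule from cassandra-env.sh
    # (half of RAM capped at 1GB vs. a quarter capped at 8GB, whichever is larger).
    def default_max_heap_mb(system_memory_mb):
        half = min(system_memory_mb // 2, 1024)
        quarter = min(system_memory_mb // 4, 8192)
        return max(half, quarter)

    print(default_max_heap_mb(64 * 1024))   # 64GB box -> 8192MB, the 8GB ceiling above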
5. Hardware : CPU
8 - 12 cores are recommended
Given efficient I/O, Cassandra is more likely to be CPU-bound
under most circumstances
For hosted environments, CPU bursting is useful.
The Big Picture: consider total core availability across
the cluster when sizing
Include cost per core in sizing estimates, as its cost
impact is higher than that of disk or memory
6. Hardware : Disk
DataStax recommendation is to use SSD
Commodity SSDs are cheaper, but know the limitations; it is not an
apples-to-apples comparison. Manufacturers are now producing
“light-enterprise” products, while enterprise SSDs are designed for
mixed workloads.
Think one disk for the commit log, many for data. Anything else
and you risk an I/O bottleneck
With JBOD support, 1.2+ distributes data over all available
disks and has sensible failure management. There is no longer
a need for RAID 10 or similar
SSDs require an update to device queue settings (a sketch follows)
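A minimal sketch, assuming a Linux host, of what "device queue settings" typically means: switching the I/O scheduler and marking the device non-rotational. The device name and values are placeholders; check your distro and SSD vendor guidance before applying.

    # Sketch: apply commonly recommended SSD block-device settings via sysfs.
    # Run as root; "sda" is a placeholder device name.
    DEVICE = "sda"

    def tune_ssd(device):
        base = "/sys/block/{}/queue".format(device)
        with open(base + "/scheduler", "w") as f:
            f.write("noop")        # SSDs gain little from re-ordering ("none" on newer kernels)
        with open(base + "/rotational", "w") as f:
            f.write("0")           # tell the kernel this is not spinning media

    tune_ssd(DEVICE)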
7. Topology
Split local DC nodes across multiple racks for rack redundancy
With multi-DC deployments, use LOCAL_QUORUM or a lower consistency
level for most operations (a driver sketch follows these notes)
If inter-DC network latency is high, you will need to tweak
phi_convict_threshold (in cassandra.yaml) from its default. The same applies on EC2
Be aware of single-node bandwidth limitations. Buying amazing hardware
means you need the available bandwidth to exercise it
Know what your latencies are at each network stage. Be aware of
inter-DC firewalls and traffic contention with other resources. (Are
there devices that impact your cluster but whose settings you have
no permission to change?)
Plan as best you can for all eventualities
The rapid read protection feature (2.0) will help cluster performance
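A minimal sketch of pinning LOCAL_QUORUM from a client with the DataStax Python driver; the contact point, keyspace, table, and query are placeholders. phi_convict_threshold itself is set in cassandra.yaml on each node.

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    cluster = Cluster(["10.0.0.1"])            # placeholder contact point in the local DC
    session = cluster.connect("my_keyspace")   # placeholder keyspace

    # Read at LOCAL_QUORUM so the request is satisfied inside the local DC
    stmt = SimpleStatement(
        "SELECT * FROM users WHERE id = %s",   # placeholder table and query
        consistency_level=ConsistencyLevel.LOCAL_QUORUM,
    )
    rows = session.execute(stmt, ["some-user-id"])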
8. Surety vs. Cost
Have a 24x7 ethos. Don’t let customers suffer because of
caution or tight pockets.
Know your growth rate.
Add nodes pre-emptively even if you are not at your outer
limit.
More nodes spread the load and increase the number of
concurrent requests the cluster can serve. Think in terms
of reducing the current load per node as well as adding
storage
EC2 has more flexibility but the same rules apply.
9. Sizing
A single node can comfortably store 3 - 5TB.
SATA has higher density (3TB per disk is available) but
limits are still hit due to the per-node ceiling above.
With either, calculate the theoretical max IOPS (a sizing sketch follows these notes)
Allow extra for indexes
Run a simulation if possible with dummy data
Allow 10% free space for LCS and worst case 50% for
STCS
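A back-of-envelope sketch of the calculations above; every input is an assumed example figure to be replaced with your own hardware specs.

    # Back-of-envelope sizing; all inputs are assumed example figures.
    disks_per_node = 4
    iops_per_disk = 5000            # from the drive's spec sheet
    raw_tb_per_node = 4.0           # inside the 3-5TB per-node comfort zone

    theoretical_max_iops = disks_per_node * iops_per_disk

    # Compaction headroom: ~10% free space for LCS, worst case ~50% for STCS
    usable_tb_lcs = raw_tb_per_node * (1 - 0.10)
    usable_tb_stcs = raw_tb_per_node * (1 - 0.50)

    print(theoretical_max_iops, usable_tb_lcs, usable_tb_stcs)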
10. Sizing
Cassandra has its own compression, so there is no need to do it
yourself; LZ4 performs better than Snappy (a table-option sketch follows these notes)
Don’t serialize objects you may want to run analytics
on in the future
Mixed workloads make projections more difficult due
to compaction and SSTable immutability
Review column data for constants and strings that
don’t need to be stored in long form (a common
oversight)
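A minimal sketch of switching a table to Cassandra's built-in LZ4 compression, issued here through the Python driver; the table name is a placeholder, and 'sstable_compression' is the 1.2/2.x option key (later releases renamed it).

    from cassandra.cluster import Cluster

    session = Cluster(["10.0.0.1"]).connect("my_keyspace")   # placeholders

    # Table-level compression is a schema property; 'sstable_compression' is the
    # 1.2/2.x option name (newer releases use 'class' instead).
    session.execute(
        "ALTER TABLE events "
        "WITH compression = {'sstable_compression': 'LZ4Compressor'}"
    )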
11. When do we Add Nodes?
Think GROWTH RATE; it’s OK to be a pessimist, people
will thank you for it later.
Be as accurate as possible in your projections
Use multi-variate projections, e.g. storage, request
threads, memory, network I/O, CPU usage (a projection sketch follows these notes)
Update projections on a regular basis, they help
convince stakeholders to spend money
Don’t wait until the last minute; provisioning new
hardware nearly always takes longer than planned.
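A minimal sketch of a multi-variate projection: for each tracked metric, estimate how many months of the current growth rate remain before a chosen ceiling. All of the figures are invented examples.

    # Naive linear projection: months until each resource hits its ceiling.
    # Current values, growth rates, and ceilings are example figures only.
    metrics = {
        # name: (current, monthly_growth, ceiling)
        "storage_tb_per_node": (2.5, 0.2, 4.0),
        "peak_requests_per_s": (30_000, 2_500, 60_000),
        "cpu_utilisation_pct": (45, 3, 70),
    }

    for name, (current, growth, ceiling) in metrics.items():
        months_left = (ceiling - current) / growth if growth > 0 else float("inf")
        print("{}: ~{:.1f} months of headroom".format(name, months_left))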
13. Compaction
Compaction can be a serious thorn in your side if not understood
(it can be I/O intensive).
Every write is immutable; highly volatile columns will be
distributed across many SSTables, which affects read
performance
Compaction leads to fewer places to look for data
Compaction also removes deleted or TTL’d columns
Don’t forget gc_grace_seconds when using TTL (a sketch follows these notes).
New Hybrid compaction algorithm uses STCS in LCS level 0.
Worth checking out.
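A minimal sketch of TTL writes and the gc_grace_seconds table property, issued through the Python driver; contact point and table are placeholders.

    from cassandra.cluster import Cluster

    session = Cluster(["10.0.0.1"]).connect("my_keyspace")   # placeholders

    # Write a column that expires after one day
    session.execute(
        "INSERT INTO events (id, payload) VALUES (%s, %s) USING TTL 86400",
        ["evt-1", "hello"],
    )

    # gc_grace_seconds is a table property: how long tombstones are kept so that
    # repairs can propagate deletes before compaction purges them.
    session.execute("ALTER TABLE events WITH gc_grace_seconds = 864000")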
14. Virtual Nodes
Since Cassandra 1.2, virtual nodes are the default
Split token ranges from one per node to many (num_tokens, default 256)
Ranges are “shuffled” to be random and non-contiguous
Hot token ranges are unlikely to affect performance as
much as previously
Cluster expansion is more manageable, spreads load
evenly
No more node obesity…
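A minimal sketch, assuming PyYAML and the default config path, that checks a node is actually configured for vnodes before it joins the ring.

    import yaml   # PyYAML

    # Check that vnodes are enabled on this node; the path and expected count
    # are common defaults, not universal.
    with open("/etc/cassandra/cassandra.yaml") as f:
        conf = yaml.safe_load(f)

    num_tokens = conf.get("num_tokens") or 1
    print("num_tokens =", num_tokens)
    if num_tokens < 2:
        print("warning: vnodes are not enabled on this node")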
17. OpsCenter
Produced by DataStax
Community & Enterprise Flavours
Runs agents on each node
Enterprise version allows addition & removal of nodes
Complete Integration with DataStax Community &
Enterprise
18. Chef
Very popular in industry
Open-source recipes for different flavours of
Cassandra: vanilla Apache, DCE.
Chef Server provides visual management
Knife provides comprehensive CLI
Enterprise features like access control and plugin
architecture
Lots of other recipes!
19. Fabric
Python Library for executing scripts across many
machines at once.
Code as if you were local to the machine; Fabric takes
care of remote execution (a minimal sketch follows these notes)
Endless possibilities for templating and management
tasks
Requires scripting skills but can be used for highly
customised deployments without relying on
chef/machine images.
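A minimal sketch using the Fabric 2.x API (Fabric 1.x used fabric.api.run and env.hosts instead); the host names and commands are placeholders.

    from fabric import SerialGroup

    # Placeholder host names; run routine health checks across the ring one node
    # at a time so we never hit every node simultaneously.
    nodes = SerialGroup("cass-node1", "cass-node2", "cass-node3")

    nodes.run("nodetool status")    # ring membership and load
    nodes.run("nodetool tpstats")   # thread-pool backlogs / dropped messages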
22. What to Monitor?
All the usual stuff (CPU, Memory, I/O, Disk Space)
Cassandra specific:
R / W latencies
Pending compactions
nodetool tpstats
Cassandra log for exceptions / warnings
Use netstat to monitor concurrent connections (a counting sketch follows this list):
Client
Inter-node
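A minimal sketch of counting established client and inter-node connections from netstat output; 9042 (native protocol) and 7000 (storage) are the defaults, so adjust for your own ports (e.g. 9160 if clients still use Thrift).

    import subprocess

    # Count established connections on the default client and inter-node ports.
    out = subprocess.run(["netstat", "-tn"], capture_output=True, text=True).stdout
    lines = [l for l in out.splitlines() if "ESTABLISHED" in l]

    client = sum(1 for l in lines if ":9042" in l)      # native protocol clients
    internode = sum(1 for l in lines if ":7000" in l)   # gossip / storage traffic
    print("client:", client, "inter-node:", internode)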
24. Troubleshooting
Garbage Collection / Compaction
Can cause system pauses and affect node
performance
Look in the logs for heap-nearly-full warnings (a log-scan sketch follows these notes)
Enable GC logging in cassandra-env.sh
Heap dumps
OpsCenter GC graphs
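A minimal sketch that scans the Cassandra system log for GCInspector pause lines and heap-pressure warnings; the log path is the common default.

    # Scan system.log for GC pauses and heap-pressure warnings.
    # /var/log/cassandra/system.log is the usual default location.
    with open("/var/log/cassandra/system.log") as f:
        for line in f:
            if "GCInspector" in line or "heap is" in line.lower():
                print(line.rstrip())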
25. Troubleshooting
Query Performance
Day 0 performance is one thing, day N is another
Again, try and run simulations
Can degrade over time if schema design is not optimal
Use query tracing periodically to monitor the
performance of your most common queries (a tracing sketch follows these notes)
Can be automated as part of monitoring & SLAs
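A minimal sketch of pulling a query trace through the Python driver; contact point, keyspace, and query are placeholders.

    from cassandra.cluster import Cluster

    session = Cluster(["10.0.0.1"]).connect("my_keyspace")   # placeholders

    # Ask the coordinator to trace this query, then inspect the per-step timings
    result = session.execute("SELECT * FROM users WHERE id = 42", trace=True)
    trace = result.get_query_trace()
    for event in trace.events:
        print(event.source_elapsed, event.source, event.description)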
26. Conclusion
Automate deployment as much as possible
Don’t mix high-throughput workloads; keep few data
classes per cluster
Invest time in developing an effective monitoring
infrastructure
Update projections frequently
Run simulations (including failure scenarios)
Curiosity is good, experiment with different settings
28. Training!
Book a Developer or Administrator Cassandra training
course for 17th or 19th February and receive a 33%
discount.
Offer extended to all Meetup Group members on
individual rates only.
Limited places remaining!
Promo Code: MEETUPDUB
Editor’s Notes
Further Reading : http://www.slideshare.net/planetcassandra/c-summit-2013-practice-makes-perfect-extreme-cassandra-optimization-by-albert-tobey