Cassandra Troubleshooting for 2.1 and later


Troubleshooting Cassandra 2.1: A Guided Tour of nodetool and system.log. From Cassandra Summit 2015. Download and check out the presenter notes for tips!

I’ll give a general lay of the land for troubleshooting Cassandra. Then I’ll take you on a deep dive through nodetool and system.log and give you a guided tour of the useful information they provide for troubleshooting. I’ll devote special attention to monitoring the various processes that Cassandra uses to do its work and how to effectively search for information about specific error messages online.


  1. Troubleshooting Cassandra: A Guided Tour of nodetool & system.log. J.B. Langston, Lead Support Engineer
  2. Troubleshooting Process (Company Confidential © 2014 DataStax, All Rights Reserved.)
      1. Ask what changed
      2. Examine bottlenecks
      3. Determine which nodes have problems
      4. Find and understand errors
      5. Determine root cause
      6. Take corrective action
  3. What changed?
      • Did it work before?
      • Does it work in another environment?
      • What’s different?
          • Settings
          • Application Code
          • Read/Write Load
          • Data Volume
          • Hardware
          • Network
      • Did you upgrade?
          • Cassandra
          • Kernel
          • JVM
          • Driver
      • What metrics changed?
          • OpsCenter
          • Graphite, etc.
      • Change one thing at a time!
  4. System Resources
      • CPU
          • Single Core utilization
          • Multi Core utilization
      • Memory
          • Heap space
          • Off-heap space
          • Linux page cache
      • Disk
          • Disk space
          • I/O bandwidth
      • Network
      • OS Resources/Limits
          • File Handles
          • Processes
          • Mapped Memory
  5. Linux monitoring commands
      Command       What it tells you
      top           CPU utilization and memory use per process
      top -H        CPU utilization per thread; memory use is still per process
      df            Free disk space
      iostat -x     I/O bandwidth utilization
      free -m       Memory and cache usage
      netstat -an   Network connections established
      iftop         Network bandwidth utilization
      sar           All (most) of the above, with history!
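
The disk-space check that `df` answers can also be scripted. The following is a minimal Python sketch, not part of the original deck: the function name `disk_used_percent` and the path `/` are placeholders chosen for illustration, and the figure may differ slightly from what `df` prints because `df` accounts for reserved blocks.

```python
import shutil


def disk_used_percent(path="/"):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    # "/" is a placeholder; in practice you would point this at the
    # filesystem that holds the Cassandra data directory.
    print(f"{disk_used_percent('/'):.1f}% used")
```
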
  6. Java monitoring commands
      Command           What it tells you
      jstack -l         Status and stack trace of each thread
      jmap -histo       Types of objects on the heap (optionally only live objects)
      jmap -heap        Size and usage of each Java heap generation
      jstat -gccause    Causes of GC activity
      jmap -dump        Take a heap dump for further analysis
      MemoryAnalyzer    Post-mortem heap-dump analysis
  7. Cassandra Architecture
      • Request coordination
      • Read & write path
      • Data structures
          • Memtables
          • SSTables
          • Tombstones
          • Caches
          • Bloom filters
          • Index summaries
      • Background Processes
          • Flushes
          • Compactions
          • Garbage collections
          • Gossip
          • Hinted Handoff
          • Read Repair
          • Repair
  8. nodetool commands
      Command             What it tells you
      status / ring       Overall cluster status
      info                Status, memory usage, and caches for a single node
      tpstats             Statistics about each thread pool on a single node
      cfstats             Summary statistics for all tables and keyspaces on a single node
      cfhistograms        Detailed statistics for a specific table on the local node
      proxyhistograms     Latency statistics for requests coordinated by the local node
      netstats            Network activity: streams, read repair, and in-flight commands
      compactionstats     Compactions pending and in progress
      compactionhistory   Historical compaction information
      describecluster     Basic cluster information and schema versions
  9. nodetool status
      Note: Ownership information does not include topology; for complete information, specify a keyspace
      Datacenter: us-east
      ===================
      Status=Up/Down
      |/ State=Normal/Leaving/Joining/Moving
      --  Address         Load      Tokens  Owns   Host ID                               Rack
      UN  54.173.171.164  22.78 MB  1       16.7%  67b3823f-6663-47d0-a04f-5914081e275c  1b
      DN  54.174.19.98    22.82 MB  1       16.7%  48d6f717-017b-4868-a525-b396d3f899aa  1b
      UN  54.174.245.247  22.68 MB  1       16.7%  6817e9ca-e79d-4fed-946e-7318bcfd5343  1b
      Datacenter: us-west
      ===================
      Status=Up/Down
      |/ State=Normal/Leaving/Joining/Moving
      --  Address         Load      Tokens  Owns   Host ID                               Rack
      UN  54.153.107.100  22.72 MB  1       16.7%  4abf0a7a-00ef-441a-9f70-046cd9fe1c0c  1a
      UN  54.153.108.157  22.79 MB  1       16.7%  303f08dd-2a19-4175-98e7-97920232855b  1a
      UN  54.153.39.203   22.67 MB  1       16.7%  d1a57a91-7aef-4878-a056-88949920724c  1a
  10. nodetool status - data centers (same output as slide 9)
  11. nodetool status - up or down? (same output as slide 9)
  12. nodetool status - streaming state (same output as slide 9)
  13. nodetool status - ip address (same output as slide 9)
  14. nodetool status - disk usage (same output as slide 9)
  15. nodetool status - token count (same output as slide 9)
  16. nodetool status - data ownership (same output as slide 9)
  17. nodetool status - host id (same output as slide 9)
  18. nodetool status - racks (same output as slide 9)
  19. nodetool status - down node (same output as slide 9)
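
As an illustration of reading the two-letter status column, here is a minimal Python sketch that scans `nodetool status` text and reports nodes whose first letter is `D` (down). It is not from the deck: the function name `down_nodes` is made up, and the parsing assumes the column layout shown on the slides, which may differ in other Cassandra versions.

```python
def down_nodes(status_output):
    """Return (datacenter, address) pairs for nodes whose status letter is D."""
    down = []
    datacenter = None
    for line in status_output.splitlines():
        fields = line.split()
        if line.startswith("Datacenter:"):
            datacenter = fields[1]
        elif (fields and len(fields[0]) == 2
              and fields[0][0] in "UD" and fields[0][1] in "NLJM"):
            # First letter: Up/Down. Second letter: Normal/Leaving/Joining/Moving.
            if fields[0][0] == "D":
                down.append((datacenter, fields[1]))
    return down


# Sample text copied from the us-east part of the slide output.
SAMPLE = """\
Datacenter: us-east
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load      Tokens  Owns   Host ID                               Rack
UN  54.173.171.164  22.78 MB  1       16.7%  67b3823f-6663-47d0-a04f-5914081e275c  1b
DN  54.174.19.98    22.82 MB  1       16.7%  48d6f717-017b-4868-a525-b396d3f899aa  1b
UN  54.174.245.247  22.68 MB  1       16.7%  6817e9ca-e79d-4fed-946e-7318bcfd5343  1b
"""

print(down_nodes(SAMPLE))
```
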
  20. nodetool ring - every token
      Note: Ownership information does not include topology; for complete information, specify a keyspace
      Datacenter: us-east
      ==========
      Address         Rack  Status  State   Load      Owns    Token
                                                              -3074457345618258603
      54.174.19.98    1b    Down    Normal  22.82 MB  16.67%  -9223372036854775808
      54.174.245.247  1b    Up      Normal  22.68 MB  16.67%  -6148914691236517206
      54.173.171.164  1b    Up      Normal  22.78 MB  16.67%  -3074457345618258603
      Datacenter: us-west
      ==========
      Address         Rack  Status  State   Load      Owns    Token
                                                              6148914691236517205
      54.153.39.203   1a    Up      Normal  22.67 MB  16.67%  0
      54.153.107.100  1a    Up      Normal  22.72 MB  16.67%  3074457345618258602
      54.153.108.157  1a    Up      Normal  22.79 MB  16.67%  6148914691236517205
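
The six tokens in the ring output look evenly spaced over the signed 64-bit range, which would explain why every node owns 16.67%. The sketch below reproduces them with the usual even-spacing arithmetic. It assumes a partitioner whose token range is -2^63 to 2^63-1 (the slide does not name the partitioner), and `evenly_spaced_tokens` is a name invented for this example.

```python
def evenly_spaced_tokens(node_count):
    """Split the signed 64-bit token range into node_count equal slices."""
    return [i * 2**64 // node_count - 2**63 for i in range(node_count)]


for token in evenly_spaced_tokens(6):
    print(token)
```
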
  21. nodetool info
      ID                     : 6817e9ca-e79d-4fed-946e-7318bcfd5343
      Gossip active          : true
      Thrift active          : true
      Native Transport active: true
      Load                   : 22.68 MB
      Generation No          : 1426523950
      Uptime (seconds)       : 1557
      Heap Memory (MB)       : 270.85 / 1842.00
      Off Heap Memory (MB)   : 0.11
      Data Center            : us-east
      Rack                   : 1b
      Exceptions             : 0
      Key Cache              : entries 156962, size 12.83 MB, capacity 100 MB, 649 hits, 713 requests, 0.910 recent hit rate, 14400 save period in seconds
      Row Cache              : entries 0, size 0 bytes, capacity 0 bytes, 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds
      Counter Cache          : entries 0, size 0 bytes, capacity 50 MB, 0 hits, 0 requests, NaN recent hit rate, 7200 save period in seconds
      Token                  : 80372383360720788
  22. nodetool info - status info (same output as slide 21)
  23. nodetool info - memory usage (same output as slide 21)
  24. nodetool info - exception count (same output as slide 21)
  25. nodetool info - key cache (same output as slide 21)
  26. nodetool info - row cache (same output as slide 21)
  27. nodetool info - counter cache (same output as slide 21)
  28. nodetool info - cache size (same output as slide 21)
  29. nodetool info - cache hit rate (same output as slide 21)
  30. nodetool info - cache save period (same output as slide 21)
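
To make the memory and cache figures concrete, here is a small Python sketch that pulls the heap usage and key cache counters out of `nodetool info` text and turns them into ratios. The function names are invented for this example and the regular expressions assume the line format shown on the slides. Note that the "recent hit rate" printed by nodetool may be computed over a different window than the cumulative hits and requests, even though the two agree in this sample.

```python
import re


def heap_used_fraction(info_output):
    """Return used / max heap from the 'Heap Memory (MB)' line, or None."""
    match = re.search(
        r"(?m)^Heap Memory \(MB\)\s*:\s*([\d.]+)\s*/\s*([\d.]+)", info_output
    )
    if match is None:
        return None
    return float(match.group(1)) / float(match.group(2))


def key_cache_hit_rate(info_output):
    """Return hits / requests from the 'Key Cache' line, or None if idle."""
    match = re.search(r"Key Cache\s*:.*?(\d+) hits, (\d+) requests", info_output)
    if match is None:
        return None
    hits, requests = int(match.group(1)), int(match.group(2))
    return hits / requests if requests else None


# Sample lines copied from the slide output.
SAMPLE = """\
Heap Memory (MB)       : 270.85 / 1842.00
Off Heap Memory (MB)   : 0.11
Key Cache              : entries 156962, size 12.83 MB, capacity 100 MB, 649 hits, 713 requests, 0.910 recent hit rate, 14400 save period in seconds
"""

print(f"heap used: {heap_used_fraction(SAMPLE):.1%}")
print(f"key cache hit rate: {key_cache_hit_rate(SAMPLE):.3f}")
```
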
  31. 31. Company Confidential© 2014 DataStax, All Rights Reserved. 34 nodetool tpstats Pool Name Active Pending Completed Blocked All time blocked MutationStage 0 0 10637419 0 0 ReadStage 0 0 146995 0 0 RequestResponseStage 0 0 7645246 0 0 ReadRepairStage 0 0 1494 0 0 CounterMutationStage 0 0 0 0 0 MiscStage 0 0 0 0 0 AntiEntropySessions 1 1 166 0 0 HintedHandoff 0 1 131 0 0 GossipStage 0 0 198095 0 0 CacheCleanupExecutor 0 0 0 0 0 InternalResponseStage 0 0 16 0 0 CommitLogArchiver 0 0 0 0 0 CompactionExecutor 0 0 63008726 0 0 ValidationExecutor 0 0 5186 0 0 MigrationStage 0 0 16 0 0 AntiEntropyStage 0 0 13711 0 0 PendingRangeCalculator 0 0 12 0 0 Sampler 0 0 0 0 0 MemtableFlushWriter 0 0 1755 0 100 MemtablePostFlush 0 0 7660 0 0 MemtableReclaimMemory 0 0 1755 0 0 Message type Dropped READ 0 RANGE_SLICE 0 _TRACE 0 MUTATION 18542 COUNTER_MUTATION 0 BINARY 0 REQUEST_RESPONSE 0 PAGED_RANGE 0 READ_REPAIR 0
  32. 32. Company Confidential© 2014 DataStax, All Rights Reserved. 35 nodetool tpstats - thread pools Pool Name Active Pending Completed Blocked All time blocked MutationStage 0 0 10637419 0 0 ReadStage 0 0 146995 0 0 RequestResponseStage 0 0 7645246 0 0 ReadRepairStage 0 0 1494 0 0 CounterMutationStage 0 0 0 0 0 MiscStage 0 0 0 0 0 AntiEntropySessions 1 1 166 0 0 HintedHandoff 0 1 131 0 0 GossipStage 0 0 198095 0 0 CacheCleanupExecutor 0 0 0 0 0 InternalResponseStage 0 0 16 0 0 CommitLogArchiver 0 0 0 0 0 CompactionExecutor 0 0 63008726 0 0 ValidationExecutor 0 0 5186 0 0 MigrationStage 0 0 16 0 0 AntiEntropyStage 0 0 13711 0 0 PendingRangeCalculator 0 0 12 0 0 Sampler 0 0 0 0 0 MemtableFlushWriter 0 0 1755 0 100 MemtablePostFlush 0 0 7660 0 0 MemtableReclaimMemory 0 0 1755 0 0 Message type Dropped READ 0 RANGE_SLICE 0 _TRACE 0 MUTATION 18542 COUNTER_MUTATION 0 BINARY 0 REQUEST_RESPONSE 0 PAGED_RANGE 0 READ_REPAIR 0
  33. 33. Company Confidential© 2014 DataStax, All Rights Reserved. 36 nodetool tpstats - active threads Pool Name Active Pending Completed Blocked All time blocked MutationStage 0 0 10637419 0 0 ReadStage 0 0 146995 0 0 RequestResponseStage 0 0 7645246 0 0 ReadRepairStage 0 0 1494 0 0 CounterMutationStage 0 0 0 0 0 MiscStage 0 0 0 0 0 AntiEntropySessions 1 1 166 0 0 HintedHandoff 0 1 131 0 0 GossipStage 0 0 198095 0 0 CacheCleanupExecutor 0 0 0 0 0 InternalResponseStage 0 0 16 0 0 CommitLogArchiver 0 0 0 0 0 CompactionExecutor 0 0 63008726 0 0 ValidationExecutor 0 0 5186 0 0 MigrationStage 0 0 16 0 0 AntiEntropyStage 0 0 13711 0 0 PendingRangeCalculator 0 0 12 0 0 Sampler 0 0 0 0 0 MemtableFlushWriter 0 0 1755 0 100 MemtablePostFlush 0 0 7660 0 0 MemtableReclaimMemory 0 0 1755 0 0 Message type Dropped READ 0 RANGE_SLICE 0 _TRACE 0 MUTATION 18542 COUNTER_MUTATION 0 BINARY 0 REQUEST_RESPONSE 0 PAGED_RANGE 0 READ_REPAIR 0
  34. 34. Company Confidential© 2014 DataStax, All Rights Reserved. 37 nodetool tpstats - pending tasks Pool Name Active Pending Completed Blocked All time blocked MutationStage 0 0 10637419 0 0 ReadStage 0 0 146995 0 0 RequestResponseStage 0 0 7645246 0 0 ReadRepairStage 0 0 1494 0 0 CounterMutationStage 0 0 0 0 0 MiscStage 0 0 0 0 0 AntiEntropySessions 1 1 166 0 0 HintedHandoff 0 1 131 0 0 GossipStage 0 0 198095 0 0 CacheCleanupExecutor 0 0 0 0 0 InternalResponseStage 0 0 16 0 0 CommitLogArchiver 0 0 0 0 0 CompactionExecutor 0 0 63008726 0 0 ValidationExecutor 0 0 5186 0 0 MigrationStage 0 0 16 0 0 AntiEntropyStage 0 0 13711 0 0 PendingRangeCalculator 0 0 12 0 0 Sampler 0 0 0 0 0 MemtableFlushWriter 0 0 1755 0 100 MemtablePostFlush 0 0 7660 0 0 MemtableReclaimMemory 0 0 1755 0 0 Message type Dropped READ 0 RANGE_SLICE 0 _TRACE 0 MUTATION 18542 COUNTER_MUTATION 0 BINARY 0 REQUEST_RESPONSE 0 PAGED_RANGE 0 READ_REPAIR 0
  35. 35. Company Confidential© 2014 DataStax, All Rights Reserved. 38 nodetool tpstats - completed tasks Pool Name Active Pending Completed Blocked All time blocked MutationStage 0 0 10637419 0 0 ReadStage 0 0 146995 0 0 RequestResponseStage 0 0 7645246 0 0 ReadRepairStage 0 0 1494 0 0 CounterMutationStage 0 0 0 0 0 MiscStage 0 0 0 0 0 AntiEntropySessions 1 1 166 0 0 HintedHandoff 0 1 131 0 0 GossipStage 0 0 198095 0 0 CacheCleanupExecutor 0 0 0 0 0 InternalResponseStage 0 0 16 0 0 CommitLogArchiver 0 0 0 0 0 CompactionExecutor 0 0 63008726 0 0 ValidationExecutor 0 0 5186 0 0 MigrationStage 0 0 16 0 0 AntiEntropyStage 0 0 13711 0 0 PendingRangeCalculator 0 0 12 0 0 Sampler 0 0 0 0 0 MemtableFlushWriter 0 0 1755 0 100 MemtablePostFlush 0 0 7660 0 0 MemtableReclaimMemory 0 0 1755 0 0 Message type Dropped READ 0 RANGE_SLICE 0 _TRACE 0 MUTATION 18542 COUNTER_MUTATION 0 BINARY 0 REQUEST_RESPONSE 0 PAGED_RANGE 0 READ_REPAIR 0
  36. 36. Company Confidential© 2014 DataStax, All Rights Reserved. 39 nodetool tpstats - currently blocked Pool Name Active Pending Completed Blocked All time blocked MutationStage 0 0 10637419 0 0 ReadStage 0 0 146995 0 0 RequestResponseStage 0 0 7645246 0 0 ReadRepairStage 0 0 1494 0 0 CounterMutationStage 0 0 0 0 0 MiscStage 0 0 0 0 0 AntiEntropySessions 1 1 166 0 0 HintedHandoff 0 1 131 0 0 GossipStage 0 0 198095 0 0 CacheCleanupExecutor 0 0 0 0 0 InternalResponseStage 0 0 16 0 0 CommitLogArchiver 0 0 0 0 0 CompactionExecutor 0 0 63008726 0 0 ValidationExecutor 0 0 5186 0 0 MigrationStage 0 0 16 0 0 AntiEntropyStage 0 0 13711 0 0 PendingRangeCalculator 0 0 12 0 0 Sampler 0 0 0 0 0 MemtableFlushWriter 0 0 1755 0 100 MemtablePostFlush 0 0 7660 0 0 MemtableReclaimMemory 0 0 1755 0 0 Message type Dropped READ 0 RANGE_SLICE 0 _TRACE 0 MUTATION 18542 COUNTER_MUTATION 0 BINARY 0 REQUEST_RESPONSE 0 PAGED_RANGE 0 READ_REPAIR 0
  37. 37. Company Confidential© 2014 DataStax, All Rights Reserved. 40 nodetool tpstats - blocked since restart Pool Name Active Pending Completed Blocked All time blocked MutationStage 0 0 10637419 0 0 ReadStage 0 0 146995 0 0 RequestResponseStage 0 0 7645246 0 0 ReadRepairStage 0 0 1494 0 0 CounterMutationStage 0 0 0 0 0 MiscStage 0 0 0 0 0 AntiEntropySessions 1 1 166 0 0 HintedHandoff 0 1 131 0 0 GossipStage 0 0 198095 0 0 CacheCleanupExecutor 0 0 0 0 0 InternalResponseStage 0 0 16 0 0 CommitLogArchiver 0 0 0 0 0 CompactionExecutor 0 0 63008726 0 0 ValidationExecutor 0 0 5186 0 0 MigrationStage 0 0 16 0 0 AntiEntropyStage 0 0 13711 0 0 PendingRangeCalculator 0 0 12 0 0 Sampler 0 0 0 0 0 MemtableFlushWriter 0 0 1755 0 100 MemtablePostFlush 0 0 7660 0 0 MemtableReclaimMemory 0 0 1755 0 0 Message type Dropped READ 0 RANGE_SLICE 0 _TRACE 0 MUTATION 18542 COUNTER_MUTATION 0 BINARY 0 REQUEST_RESPONSE 0 PAGED_RANGE 0 READ_REPAIR 0
  38. 38. Company Confidential© 2014 DataStax, All Rights Reserved. 41 nodetool tpstats - dropped messages Pool Name Active Pending Completed Blocked All time blocked MutationStage 0 0 10637419 0 0 ReadStage 0 0 146995 0 0 RequestResponseStage 0 0 7645246 0 0 ReadRepairStage 0 0 1494 0 0 CounterMutationStage 0 0 0 0 0 MiscStage 0 0 0 0 0 AntiEntropySessions 1 1 166 0 0 HintedHandoff 0 1 131 0 0 GossipStage 0 0 198095 0 0 CacheCleanupExecutor 0 0 0 0 0 InternalResponseStage 0 0 16 0 0 CommitLogArchiver 0 0 0 0 0 CompactionExecutor 0 0 63008726 0 0 ValidationExecutor 0 0 5186 0 0 MigrationStage 0 0 16 0 0 AntiEntropyStage 0 0 13711 0 0 PendingRangeCalculator 0 0 12 0 0 Sampler 0 0 0 0 0 MemtableFlushWriter 0 0 1755 0 100 MemtablePostFlush 0 0 7660 0 0 MemtableReclaimMemory 0 0 1755 0 0 Message type Dropped READ 0 RANGE_SLICE 0 _TRACE 0 MUTATION 18542 COUNTER_MUTATION 0 BINARY 0 REQUEST_RESPONSE 0 PAGED_RANGE 0 READ_REPAIR 0
  39. nodetool tpstats - reads
  40. nodetool tpstats - writes
  41. nodetool tpstats - responses to coordinator
  42. nodetool tpstats - flushes
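The tpstats slides above point at the same few signals each time: pending tasks, the "All time blocked" column, and the dropped-message counts. A minimal Python sketch of checking them automatically follows. It assumes the output has been captured as text in the column layout shown on the slides; the function names and the "anything nonzero is worth a look" rule are illustrative assumptions, not part of nodetool or of the talk.

```python
def parse_tpstats(text):
    """Parse captured `nodetool tpstats` text into (pools, dropped) dictionaries."""
    pools, dropped = {}, {}
    section = None
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "Pool":        # "Pool Name  Active  Pending ..." header row
            section = "pools"
        elif parts[0] == "Message":   # "Message type  Dropped" header row
            section = "dropped"
        elif section == "pools" and len(parts) == 6:
            active, pending, completed, blocked, all_time_blocked = map(int, parts[1:])
            pools[parts[0]] = {
                "active": active,
                "pending": pending,
                "completed": completed,
                "blocked": blocked,
                "all_time_blocked": all_time_blocked,
            }
        elif section == "dropped" and len(parts) == 2:
            dropped[parts[0]] = int(parts[1])
    return pools, dropped


def find_warnings(pools, dropped):
    """Flag nonzero pending/blocked pools and dropped messages (assumed rule of thumb)."""
    warnings = []
    for name, p in pools.items():
        if p["pending"] > 0 or p["all_time_blocked"] > 0:
            warnings.append(
                f"{name}: pending={p['pending']}, all time blocked={p['all_time_blocked']}"
            )
    for message_type, count in dropped.items():
        if count > 0:
            warnings.append(f"{message_type}: {count} dropped")
    return warnings


# A few rows copied from the slide output.
SAMPLE = """\
Pool Name               Active  Pending  Completed  Blocked  All time blocked
MutationStage                0        0   10637419        0                 0
ReadStage                    0        0     146995        0                 0
HintedHandoff                0        1        131        0                 0
MemtableFlushWriter          0        0       1755        0               100

Message type        Dropped
READ                      0
MUTATION              18542
"""

pools, dropped = parse_tpstats(SAMPLE)
warnings = find_warnings(pools, dropped)
for warning in warnings:
    print(warning)
```

A single pending task is often transient, so in practice a check like this is more useful when run repeatedly and compared over time than as a one-off.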
  43. nodetool cfstats
      Keyspace: foo
          Read Count: 9413
          Read Latency: 1.2054603208328907 ms.
          Write Count: 6287
          Write Latency: 0.18585780181326547 ms.
          Pending Flushes: 0
              Table: bar
              SSTable count: 2
              Space used (live): 15646502
              Space used (total): 15646502
              Space used by snapshots (total): 105396159
              Off heap memory used (total): 23435
              SSTable Compression Ratio: 0.39842540307866203
              Number of keys (estimate): 10588
              Memtable cell count: 3707
              Memtable data size: 292754
              Memtable off heap memory used: 0
              Memtable switch count: 74
              Local read count: 3145
              Local read latency: 0.606 ms
              Local write count: 1253
              Local write latency: 0.289 ms
              Pending flushes: 0
              Bloom filter false positives: 0
              Bloom filter false ratio: 0.00000
              Bloom filter space used: 13192
              Bloom filter off heap memory used: 13176
              Index summary off heap memory used: 4403
              Compression metadata off heap memory used: 5856
              Compacted partition minimum bytes: 125
              Compacted partition maximum bytes: 43388628
              Compacted partition mean bytes: 4774
              Average live cells per slice (last five minutes): 6.699523052464229
              Maximum live cells per slice (last five minutes): 464.0
              Average tombstones per slice (last five minutes): 2.5837837837837836
              Maximum tombstones per slice (last five minutes): 180.0
  44. nodetool cfstats - keyspace/table name
  45. nodetool cfstats - read/write counts
  46. nodetool cfstats - read/write latency
  47. nodetool cfstats - space used
  48. nodetool cfstats - off heap memory
  49. nodetool cfstats - sstable count
  50. nodetool cfstats - partition sizes
  51. nodetool cfstats - tombstones
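The cfstats slides above call out space used, partition sizes, and tombstones. As a rough illustration, the Python sketch below reads the "name: value" lines of captured cfstats text and derives three numbers from them. It assumes the text covers a single table (a later table would overwrite earlier keys in this simple dictionary); the helper names and the derived ratios are illustrative choices, not something nodetool reports.

```python
def parse_cfstats(text):
    """Collect `name: value` lines from captured `nodetool cfstats` text (one table)."""
    stats = {}
    for line in text.splitlines():
        key, separator, value = line.strip().partition(": ")
        if separator:
            stats[key] = value
    return stats


def tombstone_fraction(stats):
    """Share of cells scanned per slice that were tombstones, using the averages."""
    live = float(stats["Average live cells per slice (last five minutes)"])
    dead = float(stats["Average tombstones per slice (last five minutes)"])
    total = live + dead
    return dead / total if total else 0.0


# Lines copied from the slide output for table foo.bar.
SAMPLE = """\
Table: bar
SSTable count: 2
Space used (live): 15646502
Space used by snapshots (total): 105396159
Compacted partition maximum bytes: 43388628
Compacted partition mean bytes: 4774
Average live cells per slice (last five minutes): 6.699523052464229
Average tombstones per slice (last five minutes): 2.5837837837837836
"""

stats = parse_cfstats(SAMPLE)
fraction = tombstone_fraction(stats)
max_partition_mib = int(stats["Compacted partition maximum bytes"]) / 2**20
snapshot_ratio = (
    int(stats["Space used by snapshots (total)"]) / int(stats["Space used (live)"])
)

print(f"tombstone share of scanned cells: {fraction:.1%}")
print(f"largest compacted partition: {max_partition_mib:.1f} MiB")
print(f"snapshot space relative to live data: {snapshot_ratio:.1f}x")
```

With the slide's numbers, the largest partition is several thousand times the mean, and snapshots occupy several times the live data, which are the kinds of outliers these fields help surface.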
  52. nodetool cfhistograms
      foo/bar histograms
      Percentile  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
                                 (micros)      (micros)         (bytes)
      50%             3.00         124.00        924.00           29521         149
      75%             3.00         215.00       1331.00           61214         310
      95%             3.00         642.00       2299.00          219342         924
      98%             3.00        1109.00       3311.00          379022        1331
      99%             3.00        1331.00       3973.00          454826        1916
      Min             0.00          43.00         51.00            1332          11
      Max             3.00        2759.00      42510.00         2346799        6866
  53. nodetool cfhistograms - keyspace/table
  54. nodetool cfhistograms - percentiles
  55. nodetool cfhistograms - sstables per read
  56. nodetool cfhistograms - read/write latency
  57. nodetool cfhistograms - partition size
  58. nodetool cfhistograms - cell count
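The cfhistograms output above reports latencies in microseconds and sizes in bytes, one row per percentile. The following Python sketch, which assumes the five-column layout shown on these slides, turns the rows into a dictionary and prints read latency in milliseconds. The column keys (`sstables`, `write_us`, and so on) are names invented here for illustration.

```python
# Keys for the five numeric columns, in the order they appear on the slide.
COLUMNS = ("sstables", "write_us", "read_us", "partition_bytes", "cell_count")


def parse_cfhistograms(text):
    """Map each percentile label (50%, ..., Min, Max) to its column values."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        is_data_row = len(parts) == 6 and (
            parts[0].endswith("%") or parts[0] in ("Min", "Max")
        )
        if is_data_row:
            rows[parts[0]] = dict(zip(COLUMNS, map(float, parts[1:])))
    return rows


# Output copied from the slide.
SAMPLE = """\
foo/bar histograms
Percentile  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
                           (micros)      (micros)         (bytes)
50%             3.00         124.00        924.00           29521         149
75%             3.00         215.00       1331.00           61214         310
95%             3.00         642.00       2299.00          219342         924
98%             3.00        1109.00       3311.00          379022        1331
99%             3.00        1331.00       3973.00          454826        1916
Min             0.00          43.00         51.00            1332          11
Max             3.00        2759.00      42510.00         2346799        6866
"""

rows = parse_cfhistograms(SAMPLE)
for label, values in rows.items():
    print(f"{label:>4}  read {values['read_us'] / 1000:.3f} ms")

# How far the slowest read sits above the median read.
max_to_median = rows["Max"]["read_us"] / rows["50%"]["read_us"]
print(f"max read latency is {max_to_median:.0f}x the median")
```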
  59. nodetool proxyhistograms
      proxy histograms
      Percentile  Read Latency  Write Latency  Range Latency
                      (micros)       (micros)       (micros)
      50%              3311.00         179.00        1331.00
      75%              3311.00         258.00        2759.00
      95%              3311.00        3973.00       29521.00
      98%              3311.00        3973.00      219342.00
      99%              3311.00        3973.00      219342.00
      Min              2760.00          87.00         536.00
      Max              3311.00        3973.00      219342.00
  60. nodetool proxyhistograms - percentiles
  61. nodetool proxyhistograms - read latency
  62. nodetool proxyhistograms - write latency
  63. nodetool proxyhistograms - range latency
  64. nodetool netstats
      Mode: NORMAL
      Repair 028763b0-cc1e-11e4-a20c-a1d01a3fbf30
          /54.174.19.98
              Receiving 6 files, 117949006 bytes total
                  /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-162-Data.db 851792/17950738 bytes(4%) received from /54.174.19.98
              Sending 2 files, 47709526 bytes total
                  /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-157-Data.db 3786324/46561942 bytes(8%) sent to /54.174.19.98
      Repair 020ed850-cc1e-11e4-a20c-a1d01a3fbf30
          /54.174.245.247
              Receiving 4 files, 93304584 bytes total
                  /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-161-Data.db 6094594/46561942 bytes(13%) received from /54.174.245.247
              Sending 2 files, 47709526 bytes total
                  /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-157-Data.db 34195028/46561942 bytes(73%) sent to /54.174.245.247
      Repair 018c88f0-cc1e-11e4-a20c-a1d01a3fbf30
          /54.153.39.203 (using /172.31.10.65)
              Receiving 3 files, 49959102 bytes total
                  /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-160-Data.db 9371380/46561942 bytes(20%) received from /54.153.39.203
                  /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-159-Data.db 2533414/2533414 bytes(100%) received from /54.153.39.203
              Sending 2 files, 47709526 bytes total
                  /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-158-Data.db 1147584/1147584 bytes(100%) sent to /54.153.39.203
                  /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-157-Data.db 46561942/46561942 bytes(100%) sent to /54.153.39.203
      Read Repair Statistics:
      Attempted: 39576
      Mismatch (Blocking): 0
      Mismatch (Background): 746
      Pool Name   Active  Pending  Completed
      Commands       n/a       58    2545817
      Responses      n/a        0    2833081
  65. 65. Company Confidential© 2014 DataStax, All Rights Reserved. 79 nodetool netstats - streaming mode Mode: NORMAL Repair 028763b0-cc1e-11e4-a20c-a1d01a3fbf30 /54.174.19.98 Receiving 6 files, 117949006 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-162-Data.db 851792/17950738 bytes(4%) received from /54.174.19.98 Sending 2 files, 47709526 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-157-Data.db 3786324/46561942 bytes(8%) sent to /54.174.19.98 Repair 020ed850-cc1e-11e4-a20c-a1d01a3fbf30 /54.174.245.247 Receiving 4 files, 93304584 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-161-Data.db 6094594/46561942 bytes(13%) received from /54.174.245.247 Sending 2 files, 47709526 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-157-Data.db 34195028/46561942 bytes(73%) sent to /54.174.245.247 Repair 018c88f0-cc1e-11e4-a20c-a1d01a3fbf30 /54.153.39.203 (using /172.31.10.65) Receiving 3 files, 49959102 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-160-Data.db 9371380/46561942 bytes(20%) received from /54.153.39.203 /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-159-Data.db 2533414/2533414 bytes(100%) received from /54.153.39.203 Sending 2 files, 47709526 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-158-Data.db 1147584/1147584 bytes(100%) sent to /54.153.39.203 /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-157-Data.db 46561942/46561942 bytes(100%) sent to /54.153.39.203 Read Repair Statistics: Attempted: 39576 Mismatch (Blocking): 0 Mismatch (Background): 746 Pool Name Active Pending Completed Commands n/a 58 2545817 Responses n/a 0 2833081
  66. 66. Company Confidential© 2014 DataStax, All Rights Reserved. 80 nodetool netstats - active streams Mode: NORMAL Repair 028763b0-cc1e-11e4-a20c-a1d01a3fbf30 /54.174.19.98 Receiving 6 files, 117949006 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-162-Data.db 851792/17950738 bytes(4%) received from /54.174.19.98 Sending 2 files, 47709526 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-157-Data.db 3786324/46561942 bytes(8%) sent to /54.174.19.98 Repair 020ed850-cc1e-11e4-a20c-a1d01a3fbf30 /54.174.245.247 Receiving 4 files, 93304584 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-161-Data.db 6094594/46561942 bytes(13%) received from /54.174.245.247 Sending 2 files, 47709526 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-157-Data.db 34195028/46561942 bytes(73%) sent to /54.174.245.247 Repair 018c88f0-cc1e-11e4-a20c-a1d01a3fbf30 /54.153.39.203 (using /172.31.10.65) Receiving 3 files, 49959102 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-160-Data.db 9371380/46561942 bytes(20%) received from /54.153.39.203 /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-159-Data.db 2533414/2533414 bytes(100%) received from /54.153.39.203 Sending 2 files, 47709526 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-158-Data.db 1147584/1147584 bytes(100%) sent to /54.153.39.203 /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-157-Data.db 46561942/46561942 bytes(100%) sent to /54.153.39.203 Read Repair Statistics: Attempted: 39576 Mismatch (Blocking): 0 Mismatch (Background): 746 Pool Name Active Pending Completed Commands n/a 58 2545817 Responses n/a 0 2833081
  67. 67. Company Confidential© 2014 DataStax, All Rights Reserved. 81 nodetool netstats - read repairs Mode: NORMAL Repair 028763b0-cc1e-11e4-a20c-a1d01a3fbf30 /54.174.19.98 Receiving 6 files, 117949006 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-162-Data.db 851792/17950738 bytes(4%) received from /54.174.19.98 Sending 2 files, 47709526 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-157-Data.db 3786324/46561942 bytes(8%) sent to /54.174.19.98 Repair 020ed850-cc1e-11e4-a20c-a1d01a3fbf30 /54.174.245.247 Receiving 4 files, 93304584 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-161-Data.db 6094594/46561942 bytes(13%) received from /54.174.245.247 Sending 2 files, 47709526 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-157-Data.db 34195028/46561942 bytes(73%) sent to /54.174.245.247 Repair 018c88f0-cc1e-11e4-a20c-a1d01a3fbf30 /54.153.39.203 (using /172.31.10.65) Receiving 3 files, 49959102 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-160-Data.db 9371380/46561942 bytes(20%) received from /54.153.39.203 /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-159-Data.db 2533414/2533414 bytes(100%) received from /54.153.39.203 Sending 2 files, 47709526 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-158-Data.db 1147584/1147584 bytes(100%) sent to /54.153.39.203 /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-157-Data.db 46561942/46561942 bytes(100%) sent to /54.153.39.203 Read Repair Statistics: Attempted: 39576 Mismatch (Blocking): 0 Mismatch (Background): 746 Pool Name Active Pending Completed Commands n/a 58 2545817 Responses n/a 0 2833081
  68. 68. Company Confidential© 2014 DataStax, All Rights Reserved. 82 nodetool netstats - coordinator stats Mode: NORMAL Repair 028763b0-cc1e-11e4-a20c-a1d01a3fbf30 /54.174.19.98 Receiving 6 files, 117949006 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-162-Data.db 851792/17950738 bytes(4%) received from /54.174.19.98 Sending 2 files, 47709526 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-157-Data.db 3786324/46561942 bytes(8%) sent to /54.174.19.98 Repair 020ed850-cc1e-11e4-a20c-a1d01a3fbf30 /54.174.245.247 Receiving 4 files, 93304584 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-161-Data.db 6094594/46561942 bytes(13%) received from /54.174.245.247 Sending 2 files, 47709526 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-157-Data.db 34195028/46561942 bytes(73%) sent to /54.174.245.247 Repair 018c88f0-cc1e-11e4-a20c-a1d01a3fbf30 /54.153.39.203 (using /172.31.10.65) Receiving 3 files, 49959102 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-160-Data.db 9371380/46561942 bytes(20%) received from /54.153.39.203 /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-159-Data.db 2533414/2533414 bytes(100%) received from /54.153.39.203 Sending 2 files, 47709526 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-158-Data.db 1147584/1147584 bytes(100%) sent to /54.153.39.203 /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-157-Data.db 46561942/46561942 bytes(100%) sent to /54.153.39.203 Read Repair Statistics: Attempted: 39576 Mismatch (Blocking): 0 Mismatch (Background): 746 Pool Name Active Pending Completed Commands n/a 58 2545817 Responses n/a 0 2833081
  69. 69. Company Confidential© 2014 DataStax, All Rights Reserved. 83 nodetool netstats - repair id Mode: NORMAL Repair 028763b0-cc1e-11e4-a20c-a1d01a3fbf30 /54.174.19.98 Receiving 6 files, 117949006 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-162-Data.db 851792/17950738 bytes(4%) received from /54.174.19.98 Sending 2 files, 47709526 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-157-Data.db 3786324/46561942 bytes(8%) sent to /54.174.19.98 Repair 020ed850-cc1e-11e4-a20c-a1d01a3fbf30 /54.174.245.247 Receiving 4 files, 93304584 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-161-Data.db 6094594/46561942 bytes(13%) received from /54.174.245.247 Sending 2 files, 47709526 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-157-Data.db 34195028/46561942 bytes(73%) sent to /54.174.245.247 Repair 018c88f0-cc1e-11e4-a20c-a1d01a3fbf30 /54.153.39.203 (using /172.31.10.65) Receiving 3 files, 49959102 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-160-Data.db 9371380/46561942 bytes(20%) received from /54.153.39.203 /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-159-Data.db 2533414/2533414 bytes(100%) received from /54.153.39.203 Sending 2 files, 47709526 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-158-Data.db 1147584/1147584 bytes(100%) sent to /54.153.39.203 /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-157-Data.db 46561942/46561942 bytes(100%) sent to /54.153.39.203 Read Repair Statistics: Attempted: 39576 Mismatch (Blocking): 0 Mismatch (Background): 746 Pool Name Active Pending Completed Commands n/a 58 2545817 Responses n/a 0 2833081
  70. 70. Company Confidential© 2014 DataStax, All Rights Reserved. 84 nodetool netstats - node IP Mode: NORMAL Repair 028763b0-cc1e-11e4-a20c-a1d01a3fbf30 /54.174.19.98 Receiving 6 files, 117949006 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-162-Data.db 851792/17950738 bytes(4%) received from /54.174.19.98 Sending 2 files, 47709526 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-157-Data.db 3786324/46561942 bytes(8%) sent to /54.174.19.98 Repair 020ed850-cc1e-11e4-a20c-a1d01a3fbf30 /54.174.245.247 Receiving 4 files, 93304584 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-161-Data.db 6094594/46561942 bytes(13%) received from /54.174.245.247 Sending 2 files, 47709526 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-157-Data.db 34195028/46561942 bytes(73%) sent to /54.174.245.247 Repair 018c88f0-cc1e-11e4-a20c-a1d01a3fbf30 /54.153.39.203 (using /172.31.10.65) Receiving 3 files, 49959102 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-160-Data.db 9371380/46561942 bytes(20%) received from /54.153.39.203 /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-159-Data.db 2533414/2533414 bytes(100%) received from /54.153.39.203 Sending 2 files, 47709526 bytes total /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-158-Data.db 1147584/1147584 bytes(100%) sent to /54.153.39.203 /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-157-Data.db 46561942/46561942 bytes(100%) sent to /54.153.39.203 Read Repair Statistics: Attempted: 39576 Mismatch (Blocking): 0 Mismatch (Background): 746 Pool Name Active Pending Completed Commands n/a 58 2545817 Responses n/a 0 2833081
  71. nodetool netstats - local ip (same output as slide 70)
  72. nodetool netstats - files/bytes being received (same output as slide 70)
  73. nodetool netstats - files/bytes being sent (same output as slide 70)
  74. nodetool netstats - sstable names (same output as slide 70)
  75. nodetool netstats - streaming progress (same output as slide 70)
  76. nodetool netstats - read repairs attempted (same output as slide 70)
  77. nodetool netstats - mismatches (same output as slide 70)
  78. nodetool netstats - commands sent (same output as slide 70)
  79. nodetool netstats - responses received (same output as slide 70)
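
The per-file streaming progress lines in the netstats output can be pulled out with a short script. This is a minimal sketch, assuming the line layout matches the sample on these slides; the format may differ in other Cassandra versions, and the names `parse_stream_progress` and `unfinished` are hypothetical helpers, not part of nodetool.

```python
import re

# Matches per-file progress lines as they appear in the sample output, e.g.
# ".../Keyspace1-Standard1-jb-157-Data.db 3786324/46561942 bytes(8%) sent to /54.174.19.98"
# Assumption: the layout is taken only from the slide; other versions may differ.
PROGRESS_RE = re.compile(
    r"(?P<path>\S+-Data\.db)\s+"
    r"(?P<done>\d+)/(?P<total>\d+) bytes\(\d+%\)\s+"
    r"(?P<direction>received from|sent to)\s+/(?P<peer>[\d.]+)"
)

SAMPLE = """\
Receiving 6 files, 117949006 bytes total
    /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-tmp-jb-162-Data.db 851792/17950738 bytes(4%) received from /54.174.19.98
Sending 2 files, 47709526 bytes total
    /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-157-Data.db 3786324/46561942 bytes(8%) sent to /54.174.19.98
    /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-jb-158-Data.db 1147584/1147584 bytes(100%) sent to /54.153.39.203
"""


def parse_stream_progress(netstats_text):
    """Return one dict per file transfer found in `nodetool netstats` text."""
    transfers = []
    for m in PROGRESS_RE.finditer(netstats_text):
        transfers.append({
            "sstable": m.group("path").rsplit("/", 1)[-1],
            "peer": m.group("peer"),
            "direction": "in" if m.group("direction") == "received from" else "out",
            "done": int(m.group("done")),
            "total": int(m.group("total")),
        })
    return transfers


def unfinished(transfers):
    """Keep only the transfers that have not yet reached their total size."""
    return [t for t in transfers if t["done"] < t["total"]]


if __name__ == "__main__":
    for t in unfinished(parse_stream_progress(SAMPLE)):
        print(t["direction"], t["peer"], t["sstable"], t["done"], "/", t["total"])
```

Running this against two snapshots taken a few minutes apart is one way to see whether a stream is still making progress or has stalled.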
  80. nodetool compactionstats
      pending tasks: 7
      compaction type   keyspace    table       completed   total       unit    progress
      Compaction        Keyspace1   Standard1   65967769    154639724   bytes   42.66%
      Compaction        Keyspace1   Standard1   53880334    227493794   bytes   23.68%
      Active compaction remaining time : 0h00m15s

      pending tasks: 1
      compaction type   keyspace    table       completed   total       unit    progress
      Validation        Keyspace1   Standard1   74684443    95178582    bytes   78.47%
      Active compaction remaining time : n/a
  81. nodetool compactionstats - pending tasks (same output as slide 80)
  82. nodetool compactionstats - active compactions (same output as slide 80)
  83. nodetool compactionstats - compaction type (same output as slide 80)
  84. nodetool compactionstats - keyspace/table (same output as slide 80)
  85. nodetool compactionstats - completed/total bytes (same output as slide 80)
  86. nodetool compactionstats - progress (same output as slide 80)
  87. nodetool compactionstats - time left (same output as slide 80)
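
The "time left" figure can be cross-checked from the completed/total columns. The sketch below assumes the estimate is simply the remaining bytes divided by the compaction throughput limit, and assumes a limit of 16 MB/s; both are assumptions rather than something stated on the slide, and `parse_compactionstats` and `remaining_seconds` are hypothetical names.

```python
SAMPLE = """\
pending tasks: 7
compaction type keyspace table completed total unit progress
Compaction Keyspace1 Standard1 65967769 154639724 bytes 42.66%
Compaction Keyspace1 Standard1 53880334 227493794 bytes 23.68%
Active compaction remaining time : 0h00m15s
"""


def parse_compactionstats(text):
    """Return the active-compaction rows of `nodetool compactionstats` text."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have 7 fields: type keyspace table completed total unit progress.
        # The header has 8 tokens and the summary lines have fewer, so both are skipped.
        if len(parts) == 7 and parts[3].isdigit() and parts[4].isdigit():
            rows.append({
                "type": parts[0],
                "keyspace": parts[1],
                "table": parts[2],
                "completed": int(parts[3]),
                "total": int(parts[4]),
                "unit": parts[5],
            })
    return rows


def remaining_seconds(rows, throughput_mb_per_sec=16):
    """Estimate time left, assuming work proceeds at the throughput limit."""
    remaining = sum(r["total"] - r["completed"] for r in rows if r["unit"] == "bytes")
    return remaining // (throughput_mb_per_sec * 1024 * 1024)


if __name__ == "__main__":
    rows = parse_compactionstats(SAMPLE)
    print(len(rows), "active compactions,", remaining_seconds(rows), "seconds left (estimate)")
```

Under those assumptions the estimate is only a lower bound: it ignores the pending tasks that have not started yet.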
  88. nodetool compactionhistory
      Compaction History:
      id                                     keyspace_name   columnfamily_name   compacted_at    bytes_in    bytes_out
      5d4adcc0-cbf9-11e4-9f32-098a653a7013   system          local               1426523260044   1101        543
      654f31c0-cc01-11e4-a84d-098a653a7013   Keyspace1       Standard1           1426526709468   53518892    53518892
      59a7b070-cc1c-11e4-a84d-098a653a7013   system          hints               1426538286327   106483      0
      4b8e2b10-cbf2-11e4-bb39-098a653a7013   system          schema_keyspaces    1426520223809   909         264
      68429440-cc19-11e4-a84d-098a653a7013   Keyspace1       Standard1           1426537022340   59878420    59109472
      f06de4a0-cc19-11e4-a84d-098a653a7013   Keyspace1       Standard1           1426537250794   227493794   133487816
      3bec3ee0-cbf2-11e4-bb39-098a653a7013   system          local               1426520197582   720         550
      e263a080-cbfa-11e4-87be-098a653a7013   system          schema_columns      1426523912832   27224       11517
      54741560-cc19-11e4-a84d-098a653a7013   Keyspace1       Standard1           1426536989110   70198058    61572552
      c67f8c90-cc08-11e4-a84d-098a653a7013   system          peers               1426529879001   2219        778
      1edb6bf0-cc1a-11e4-a84d-098a653a7013   Keyspace1       Standard1           1426537328687   19723822    16552676
      rows_merged
      {1:247562, 2:67934, 3:2945, 4:2042, 5:824, 6:232, 7:2, 8:1}
  89. nodetool compactionhistory - unique id (same output as slide 88)
  90. nodetool compactionhistory - keyspace/table (same output as slide 88)
  91. nodetool compactionhistory - timestamp (same output as slide 88)
  92. nodetool compactionhistory - input bytes (same output as slide 88)
  93. nodetool compactionhistory - output bytes (same output as slide 88)
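
The timestamp and byte columns of compactionhistory are easier to read once converted. The sketch below assumes `compacted_at` is milliseconds since the Unix epoch, which is consistent with the sample values but is not stated on the slide; `parse_history_row` is a hypothetical helper name.

```python
from datetime import datetime, timezone

# One row copied from the sample output: id, keyspace, table, compacted_at, bytes_in, bytes_out
SAMPLE_ROW = (
    "f06de4a0-cc19-11e4-a84d-098a653a7013 Keyspace1 Standard1 "
    "1426537250794 227493794 133487816"
)


def parse_history_row(line):
    """Parse one `nodetool compactionhistory` row into readable values."""
    comp_id, keyspace, table, compacted_at, bytes_in, bytes_out = line.split()[:6]
    bytes_in, bytes_out = int(bytes_in), int(bytes_out)
    return {
        "id": comp_id,
        "keyspace": keyspace,
        "table": table,
        # Assumption: compacted_at is epoch milliseconds.
        "compacted_at": datetime.fromtimestamp(int(compacted_at) / 1000, tz=timezone.utc),
        "bytes_in": bytes_in,
        "bytes_out": bytes_out,
        # Fraction of the input that survived the compaction (None if there was no input).
        "ratio": bytes_out / bytes_in if bytes_in else None,
    }


if __name__ == "__main__":
    row = parse_history_row(SAMPLE_ROW)
    print(row["compacted_at"].isoformat(), row["keyspace"], row["table"], round(row["ratio"], 2))
```

A ratio well below 1 suggests the compaction merged or purged a lot of data; a ratio of exactly 1, as in the second sample row, suggests nothing was merged away.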
  94. nodetool describecluster
      Cluster Information:
          Name: test_cluster
          Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
          Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
          Schema versions:
              620ccc95-23ac-328c-8c94-b9554f19af4c: [54.173.171.164, 54.153.107.100, 54.174.19.98, 54.153.108.157, 54.153.39.203, 54.174.245.247]

      Cluster Information:
          Name: test_cluster
          Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
          Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
          Schema versions:
              a2cee91b-67c2-3656-baa8-2521904305da: [54.174.19.98]
              620ccc95-23ac-328c-8c94-b9554f19af4c: [54.173.171.164, 54.153.107.100, 54.153.108.157, 54.153.39.203, 54.174.245.247]
  95. nodetool describecluster - schema versions (same output as slide 94)
  96. nodetool describecluster - schema agreement (first output on slide 94: all nodes under one schema version)
  97. nodetool describecluster - schema disagreement (second output on slide 94: 54.174.19.98 on its own schema version)
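
Schema agreement can be checked mechanically: the cluster agrees when describecluster lists exactly one schema version. This is a minimal sketch that assumes the `uuid: [ip, ip, ...]` layout shown in the sample; `schema_versions` and `nodes_in_minority` are hypothetical helper names.

```python
import re

# A schema version line looks like "<uuid>: [ip, ip, ...]" in the sample output.
VERSION_RE = re.compile(
    r"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}):\s*\[([^\]]*)\]"
)

SAMPLE = """\
Cluster Information:
    Name: test_cluster
    Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    Schema versions:
        a2cee91b-67c2-3656-baa8-2521904305da: [54.174.19.98]
        620ccc95-23ac-328c-8c94-b9554f19af4c: [54.173.171.164, 54.153.107.100, 54.153.108.157, 54.153.39.203, 54.174.245.247]
"""


def schema_versions(text):
    """Map each schema version UUID to the list of nodes reporting it."""
    return {
        version: [ip.strip() for ip in ips.split(",") if ip.strip()]
        for version, ips in VERSION_RE.findall(text)
    }


def nodes_in_minority(versions):
    """Return nodes that are not on the most widely held schema version."""
    if len(versions) <= 1:
        return []
    majority = max(versions, key=lambda v: len(versions[v]))
    return [ip for v, ips in versions.items() if v != majority for ip in ips]


if __name__ == "__main__":
    versions = schema_versions(SAMPLE)
    print("agreement" if len(versions) == 1 else "disagreement", nodes_in_minority(versions))
```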
  98. system.log (2.1 and later)
      Log Settings
          Location         <file>/var/log/cassandra/system.log</file>
          Logging Level    <root level="INFO">
          Class Override   <logger name="org.apache.cassandra.package.Class" level="DEBUG"/>
      Basic Format
          Level   Thread Type & ID           Date & Time               Source File              Line No.
          INFO    [CompactionExecutor:155]   2015-02-13 02:18:40,986   CompactionTask.java      :287
          WARN    [GossipTasks:1]            2015-02-17 19:47:37,331   Gossiper.java            :648
          ERROR   [AntiEntropySessions:1]    2015-02-17 20:32:11,959   CassandraDaemon.java     :199
          DEBUG   [OptionalTasks:1]          2015-02-20 11:29:14,056   ColumnFamilyStore.java   :298
      Default Location     /var/log/cassandra/system.log
      Configuration File   /etc/dse/cassandra/logback.xml
  99. system.log (2.0 and earlier)
      Log Settings
          Location         log4j.appender.R.File=/var/log/cassandra/system.log
          Logging Level    log4j.rootLogger=INFO,stdout,R
          Class Override   log4j.logger.org.apache.cassandra.package.Class=DEBUG
      Basic Format
          Level   Thread Type & ID           Date & Time               Source File              Line No.
          INFO    [CompactionExecutor:155]   2015-02-13 02:18:40,986   CompactionTask.java      (line 287)
          WARN    [GossipTasks:1]            2015-02-17 19:47:37,331   Gossiper.java            (line 648)
          ERROR   [AntiEntropySessions:1]    2015-02-17 20:32:11,959   CassandraDaemon.java     (line 199)
          DEBUG   [OptionalTasks:1]          2015-02-20 11:29:14,056   ColumnFamilyStore.java   (line 298)
      Default Location     /var/log/cassandra/system.log
      Configuration File   /etc/dse/cassandra/log4j-server.properties
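
The two log layouts differ only in how the source line number is written (`File.java:287` in 2.1 and later, `File.java (line 287)` in 2.0 and earlier), so one pattern can cover both. This is a rough sketch based only on the sample lines in these slides; `parse_log_line` is a hypothetical helper, and the sample lines carry no message text because the slides show none.

```python
import re
from datetime import datetime

# Level, [thread], timestamp, source file, then either ":287" or "(line 287)".
LOG_RE = re.compile(
    r"^\s*(?P<level>[A-Z]+)\s+"
    r"\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<source>\S+\.java)\s*"
    r"(?::(?P<line_new>\d+)|\(line (?P<line_old>\d+)\))"
)


def parse_log_line(line):
    """Split a system.log line prefix into its fields, or return None if it does not match."""
    m = LOG_RE.match(line)
    if m is None:
        return None  # e.g. a stack-trace continuation line
    thread_type, _, thread_id = m.group("thread").partition(":")
    return {
        "level": m.group("level"),
        "thread_type": thread_type,
        "thread_id": int(thread_id) if thread_id.isdigit() else None,
        "time": datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S,%f"),
        "source": m.group("source"),
        "line": int(m.group("line_new") or m.group("line_old")),
    }


if __name__ == "__main__":
    samples = [
        "INFO [CompactionExecutor:155] 2015-02-13 02:18:40,986 CompactionTask.java:287",
        "WARN [GossipTasks:1] 2015-02-17 19:47:37,331 Gossiper.java (line 648)",
    ]
    for s in samples:
        print(parse_log_line(s))
```

Grouping parsed lines by `thread_type` is one way to follow a single process, such as compaction or gossip, through the log.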
  100. Exceptions
      java.io.EOFException
          at java.io.DataInputStream.readFully(DataInputStream.java:197)
          at java.io.DataInputStream.readFully(DataInputStream.java:169)
          at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:395)
          at org.apache.cassandra.service.CacheService$KeyCacheSerializer.deserialize(CacheService.java:356)
          at org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:119)
          at org.apache.cassandra.db.ColumnFamilyStore.<init>(ColumnFamilyStore.java:261)
          at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:415)
          at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:386)
          at org.apache.cassandra.db.Keyspace.initCf(Keyspace.java:309)
          at org.apache.cassandra.db.Keyspace.<init>(Keyspace.java:266)
          at org.apache.cassandra.db.Keyspace.open(Keyspace.java:110)
          at org.apache.cassandra.db.Keyspace.open(Keyspace.java:88)
          at org.apache.cassandra.db.SystemKeyspace.checkHealth(SystemKeyspace.java:536)
          at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:246)
          at com.datastax.bdp.server.DseDaemon.setup(DseDaemon.java:376)
          at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:480)
  101. Exceptions - exception type (same stack trace as slide 100; the first line, java.io.EOFException)
  102. Exceptions - stack trace (same stack trace as slide 100; the "at ..." lines)

Editor's Notes

  • I’m a lead support engineer at DataStax
    I’ve been at DataStax just over 3 years and this is my fourth summit
  • Overall troubleshooting process from a support engineer’s perspective
    I’ll focus on the tools Cassandra gives you to do steps 1 to 4
    Steps 5 and 6 are the hard parts, but luckily that’s what DataStax support, the mailing list, or Stack Overflow can help you with
    Doing the legwork on the first part will make the second part happen much faster
  • OpsCenter metrics: http://docs.datastax.com/en/opscenter/5.2/opsc/online_help/opscPerformanceMetrics_c.html
    Pluggable metrics reporting: http://www.datastax.com/dev/blog/pluggable-metrics-reporting-in-cassandra-2-0-2
  • It’s also helpful to keep in mind the various system resources that Cassandra consumes
    CPU
    Consider both single core and multi-core utilization
    Some processes are single-threaded and bottlenecked by a single core
    Memory
    Heap space, typically limited to a subset of your total physical memory
    Cassandra stores many objects off heap to avoid Garbage Collection issues
    The OS will use whatever memory is left over for page cache; make sure you leave enough free memory for this
    Disk
    Available disk space
    I/O bandwidth utilization
    Network
    Primarily concerned with bandwidth
    Keep in mind firewalls; the path needs to be open
    OS Resources/Limits
    File handles, processes, etc.
    Make sure you set high enough limits in limits.conf or with ulimit
  • Linux Monitoring Commands: http://www.tecmint.com/command-line-tools-to-monitor-linux-performance/
  • Java Performance Monitoring: http://www.ibm.com/developerworks/library/j-5things8/
  • Cassandra Core Concepts: https://academy.datastax.com/fr/courses/ds201-cassandra-core-concepts
    Cassandra Architecture: http://docs.datastax.com/en/cassandra/2.1/cassandra/architecture/architectureTOC.html
    Database Internals: http://docs.datastax.com/en/cassandra/2.1/cassandra/dml/dmlDatabaseInternalsTOC.html
  • Overview of some of the most useful nodetool commands and what they do
    Nodetool documentation: http://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsNodetool_r.html
  • Nodetool status gives an overview of the entire cluster’s status
  • The cluster is divided into datacenters
  • The first column shows whether a node is up or down
  • The second column shows the node’s state, which is one of:
    normal
    leaving the cluster
    joining the cluster
    moving its token
  • The IP address of each node.
    This will be the broadcast_address if specified; otherwise, the listen_address
  • The amount of data on each node
    It is normal for this to differ slightly
    Large discrepancies (in percentage terms) could indicate a problem, such as:
    Wide rows/partitions
    Uneven racks
    Compaction issues
  • Number of tokens
    Normally 256 for vnodes, 1 for non-vnodes
  • Percentage of ring each node owns
    Note message at the top
    Without a keyspace, assumes SimpleStrategy with RF=1
    Can cause strange readout when using multiple DCs with a small offset between tokens (nothing to worry about)
    With keyspace, shows ownership according to RF in that keyspace
    With keyspace, should add up to RF times 100%
    May differ slightly from node to node with vnodes due to random token distribution
  • UUID of the node. Used to uniquely identify the node when running removenode command.
  • The rack the node is on
    Used to avoid single point of failure
    Avoid if not using vnodes
    If using vnodes, ensure the same number of nodes are in each rack
  • In this example, we have one node that is down, so that is where we’d focus our investigation
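    The status columns described above can be pulled out of nodetool status output with a few lines of Python. This is a minimal sketch, not an official tool: the sample text is made up for illustration (addresses, loads, and host IDs are placeholders), and the parser only relies on the two-letter status/state code at the start of each node line.

```python
import re

# Illustrative sample only; addresses, loads, and host IDs are made up.
SAMPLE = """\
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns   Host ID                               Rack
UN  10.0.0.1   1.21 GB    256     33.1%  11111111-1111-1111-1111-111111111111  RAC1
DN  10.0.0.2   1.18 GB    256     33.6%  22222222-2222-2222-2222-222222222222  RAC1
UJ  10.0.0.3   310.5 MB   256     33.3%  33333333-3333-3333-3333-333333333333  RAC2
"""

STATUS = {"U": "up", "D": "down"}
STATE = {"N": "normal", "L": "leaving", "J": "joining", "M": "moving"}
# First letter is the status, second letter is the state, then the address.
NODE_LINE = re.compile(r"^([UD])([NLJM])\s+(\S+)")


def parse_status(text):
    """Return one dict per node line, tagged with its datacenter."""
    nodes = []
    datacenter = None
    for line in text.splitlines():
        if line.startswith("Datacenter:"):
            datacenter = line.split(":", 1)[1].strip()
            continue
        match = NODE_LINE.match(line)
        if match:
            nodes.append({
                "dc": datacenter,
                "address": match.group(3),
                "status": STATUS[match.group(1)],
                "state": STATE[match.group(2)],
            })
    return nodes


nodes = parse_status(SAMPLE)
down = [n["address"] for n in nodes if n["status"] == "down"]
print("Nodes to investigate first:", down)
```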
  • Nodetool ring is an old way of checking status
    Shows much the same information as nodetool status, plus each node’s token
    When using vnodes, it will show every token on each node and becomes difficult to read
  • Nodetool info shows information specific to the node where it is run
    Some of the information is also shown by nodetool ring/status
  • Same information shown by nodetool status and ring
    Not going to rehash this
  • Whether gossip, thrift, and native transport are enabled
    Can be individually enabled/disabled with nodetool commands
  • Gossip generation; increases each time the node is restarted
  • How many seconds the node has been running
  • Heap and off heap memory usage
    Heap memory shows amount currently in use and maximum
    Amount currently in use may include garbage
    The amount of garbage included varies depending on when GC was last run
    Off heap memory is stored outside the heap but adds to the process’s overall resident size
    If a process gets killed by the kernel OOM killer, make sure it isn’t using too much off heap memory
  • Number of exceptions that occurred since last restart
    Not every error involves an exception, but it’s still a good indicator
  • Key cache caches the location of partition keys in memory
    Avoids reading the partition index to locate a partition within an sstable
    Capacity defaults to 100MB or 5% of the heap, whichever is less
  • Row cache stores entire rows in memory (entire partitions prior to 2.1)
    Disabled by default
    Only enable in these circumstances:
    Small, very hot data set that will fit in memory
    Data should be read much more than written because writes invalidate row in cache
    Prior to 2.1, only useful on small partitions (no clustering keys!)
    OK to use with clustering keys in 2.1 because individual rows are cached
  • Holds hot counter values in memory
    2.1 introduced more accurate counters
    Performance cost due to read-before-write
    Counter cache mitigates performance cost
    Capacity defaults to 50MB or 2.5% of heap, whichever is less
  • Entries, size and capacity
    Entries is number of keys/rows/counters in the cache
    Size is amount actually in use (bytes)
    Capacity is amount specified by key_cache_size_in_mb in cassandra.yaml
    If size is consistently much less than capacity, you have this set too high
  • Cache hits, total requests, and hit rate
    If hit rate is low, try increasing cache capacity slowly until you see diminishing returns
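    The two cache checks above (hit rate, and size versus capacity) are simple ratios. A small sketch with made-up numbers; the function name and the sample values are hypothetical and would come from the cache lines of nodetool info in practice.

```python
def cache_report(hits, requests, size_bytes, capacity_bytes):
    """Return (hit_rate, fill_ratio) for one cache, both between 0 and 1."""
    hit_rate = hits / requests if requests else 0.0
    fill_ratio = size_bytes / capacity_bytes if capacity_bytes else 0.0
    return hit_rate, fill_ratio


# Made-up numbers: 9,000 hits out of 12,000 requests, 20 MB used of 100 MB.
hit_rate, fill_ratio = cache_report(
    hits=9000,
    requests=12000,
    size_bytes=20 * 1024 ** 2,
    capacity_bytes=100 * 1024 ** 2,
)
print("hit rate: %.0f%%, capacity in use: %.0f%%" % (hit_rate * 100, fill_ratio * 100))
```

    A fill ratio that stays low suggests the capacity is set too high; a low hit rate with a full cache suggests trying a larger capacity.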
  • Caches are periodically saved to disk and reloaded when a node is restarted
    This is to avoid a cold cache after restarts
  • tpstats shows thread pool statistics
    Cassandra has various thread pools that handle important foreground and background tasks
    This information is also logged to system.log by StatusLogger.java periodically or when a message is dropped
  • Name of the thread pool
  • Number of tasks being actively serviced by a thread
    For several stages, this can be configured in cassandra.yaml
    For others, it is equal to the number of CPU cores
    For a few others, it is a hardcoded limit, usually 1
  • Number of tasks waiting to be serviced
    Unless limited via yaml setting, limit is ~2 billion
    You’ll run out of memory long before you ever hit this limit
    High number of pending tasks indicates the stage is overloaded
  • Number of tasks completed since last restart
  • Number of tasks currently blocked for I/O
    Should almost always be 0
  • Total number of tasks blocked since last restart.
    Usually zero except for FlushWriter
  • At the bottom is a list of message types and the number of messages dropped
    Load shedding drops requests that have been pending beyond timeout specified in cassandra.yaml
    Dropped messages usually indicate overloaded cluster; check for other causes, then add more nodes
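    The columns walked through above can be checked mechanically. The sketch below parses tpstats-style text and lists pools with pending or all-time-blocked tasks, plus any dropped message types. The sample output is illustrative only (the numbers are made up), and the column layout is an assumption that may differ between versions.

```python
# Illustrative sample only; the numbers are made up.
SAMPLE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                         0         0         103412         0                 0
MutationStage                    32       870        5512345         0                 0
MemtableFlushWriter               1         4           2310         0                17
CompactionExecutor                2        12          40211         0                 0

Message type           Dropped
READ                         0
MUTATION                   112
"""


def parse_tpstats(text):
    """Split the output into per-pool counters and dropped-message counts."""
    pools, dropped = {}, {}
    section = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if line.startswith("Pool Name"):
            section = "pools"
        elif line.startswith("Message type"):
            section = "dropped"
        elif section == "pools" and len(fields) == 6:
            active, pending, completed, blocked, all_blocked = map(int, fields[1:])
            pools[fields[0]] = {
                "active": active,
                "pending": pending,
                "completed": completed,
                "blocked": blocked,
                "all_time_blocked": all_blocked,
            }
        elif section == "dropped" and len(fields) == 2:
            dropped[fields[0]] = int(fields[1])
    return pools, dropped


pools, dropped = parse_tpstats(SAMPLE)
busy = sorted(name for name, p in pools.items()
              if p["pending"] > 0 or p["all_time_blocked"] > 0)
dropped_types = {name: count for name, count in dropped.items() if count > 0}
print("Pools to look at:", busy)
print("Dropped messages:", dropped_types)
```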
  • Handles local reads for which this node is a replica
    Number of threads is controlled by concurrent_reads in cassandra.yaml
    Various reads, all handled by ReadStage
    READ - normal read on a single partition
    RANGE_SLICE - A sequential or secondary index scan over multiple partitions
    PAGED_RANGE - Used for automatic paging when result size exceeds row limit
  • Handles local writes for which this node is a replica
    Number of threads is controlled by concurrent_writes in cassandra.yaml
    Various writes, handled by MutationStage
    MUTATION - normal write
    COUNTER_MUTATION - incrementing a counter
  • Coordinator uses this to process responses from other nodes
    Roughly indicates how often this node has been a coordinator
    Only roughly, because unless using CL ONE, the coordinator handles responses from multiple nodes per request
    Request completed but timed out before coordinator could respond to it
    Timeout controlled by request_timeout_in_ms in cassandra.yaml
    May indicate that coordinator is overloaded
    Ensure that client load balancing is set up correctly
    Make sure use of batches is appropriate
    Logged batches only when atomicity is required
    Unlogged batches only when updating multiple rows with the same partition key
    Otherwise use asynchronous execution to pipeline requests without overloading a single coordinator
  • Writes memtables to disk
    Number of threads controlled by memtable_flush_writers, should equal number of data drives
    Maximum pending tasks controlled by memtable_flush_queue_size in cassandra.yaml
    Once queue is full writes are blocked until another flush writer is available
    Large number of all-time-blocked tasks indicates a disk bottleneck; add more/faster disks or more nodes
  • Handles compactions
    Number of threads controlled by concurrent_compactors in cassandra.yaml
    Constantly pending compactions means compactions can’t keep up with writes; mitigation strategies:
    Switch from LeveledCompactionStrategy to SizeTieredCompactionStrategy for write-heavy tables
    Get faster disks (SSD) if needed
    Increase compaction throughput
    Get faster CPU cores
  • Asynchronous read repairs
    Occur for a certain percentage of reads, configurable per table
    Pending tasks indicate that you may have read repair chance set too high
    READ_REPAIR - dropped write due to a read repair
    Can sometimes show up as a TimedOutException on the read that triggered it
  • Handles hint delivery from the coordinator to a node that’s recently come back up
    Large number of hints usually means an unhealthy cluster
  • Nodes gossip with each other once a second
    Completed GossipStage tasks is an easy way to tell approximately how long a node has been up
    Prior to 2.0, when using vnodes with a large cluster, gossip could become CPU-bound and get behind
  • Migrates schema changes to other nodes
    If you see pending tasks here, you’re making too many schema changes too quickly
  • Repair related stages
    AntiEntropyStage coordinates repairs
    AntiEntropySessions are active repairs in progress
    ValidationExecutor builds merkle trees for repair
  • cfstats shows statistics for individual tables
    Tables are grouped by keyspace
    Values apply only to the node where cfstats is run, not cluster-wide
  • Keyspace and table names
    Multiple tables grouped under each keyspace
  • Read and write counts
    Per table and at keyspace level
  • Read and write latency
    Per table and averaged at keyspace level
  • Number of flushes pending against a table
    Per table and summed at keyspace level
  • Space used by the table on this node
    Must sum across different nodes to get total space
    May include deleted or updated data that hasn’t been compacted yet
    Live and total values are usually equal
    If some sstables have been discarded but not deleted yet, total may be larger
    Space used by snapshots for the table; if you’re running out of space, look here
  • Space consumed by off-heap data structures
    Total off heap memory
    Broken down by data structure: bloom filter, index summary, compression metadata
  • Number of sstables comprising a table
    Broken down by level when using LeveledCompactionStrategy
  • Bloom filter statistics
    False positives
    If too high, read performance will suffer
    Reduce false positive chance for table
    Space used
    Lower false positive chance requires more space
    Bloom filter grows linearly with number of partitions
    Increase false positive chance if bloom filter is too large
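    The space tradeoff above follows from the textbook bloom filter sizing formula, where an optimally sized filter needs about -ln(p) / (ln 2)^2 bits per element for a false positive chance p. The sketch below uses that textbook formula only; it is not Cassandra's actual implementation, so treat the byte counts as rough estimates, and the function names are hypothetical.

```python
import math


def bits_per_partition(fp_chance):
    """Textbook bits per element for an optimally sized bloom filter."""
    return -math.log(fp_chance) / (math.log(2) ** 2)


def estimated_filter_bytes(partitions, fp_chance):
    """Rough filter size: grows linearly with the number of partitions."""
    return math.ceil(partitions * bits_per_partition(fp_chance) / 8)


# Compare a few false positive chances for one million partitions.
for fp_chance in (0.1, 0.01, 0.001):
    print(
        fp_chance,
        round(bits_per_partition(fp_chance), 1),
        estimated_filter_bytes(1_000_000, fp_chance),
    )
```

    Each tenfold reduction in the false positive chance adds a fixed number of bits per partition, which is why raising the chance slightly can free a noticeable amount of memory on tables with many partitions.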
  • Percent of original size once data is compressed (lower is better)
    Compression trades higher CPU usage for lower I/O usage
    Usually this is a good tradeoff
    But if compression ratio is high, you may want to turn it off for this table
  • Number of partition keys in the table on this node
    Estimated to the nearest index_interval (128 by default)
    Rows spread across multiple sstables will inflate this number
  • Memtable information
    number of entries in the memtable
    bytes of data in the memtable
    bytes of data in the memtable stored off heap
    the number of times the memtable has been switched (flushed to disk)
  • Statistics on partition size, calculated during compaction
    Maximum, minimum, and average size
    Helps identify tables containing large partitions
  • Tombstone statistics
    Number of live cells versus tombstones encountered when scanning a partition
    Rolling average for the last 5 minutes
  • cfhistograms provides deeper insight into a specific table
    Must specify keyspace and table name when calling nodetool cfhistograms
    Information shown by cfhistograms is local to the node where it was run
  • Keyspace and table for which histograms were requested
  • Percentiles (percent of X less than this value)
    Minimum and maximum values
    50% aka median
  • The number of sstables that had to be scanned for each read query
    Scanning more sstables is more expensive and will lead to higher read latencies
    Reports counts for reads within the last 5 minutes (approximately)
  • Write and Read latency in microseconds
    Remember to divide by 1000 to get ms
    Reports latencies for the last 5 minutes (approximately)
    This is the latency for local reads so it doesn’t include round trip time for the coordinator
  • The size of each partition on the node in bytes
    In this example:
    the largest partition is 2.23MB
    the smallest is 1.29KB
    the median partition size is 28.8KB
    99% of partitions are 444KB or smaller
  • Number of cells in a partition
    Partition key is stored separately (not counted here)
    Cells are name/value pairs used to store column data
    There will be one cell per non-key column in each row, plus one sentinel cell per row
  • Shows network activity
  • Mode, same as on status: NORMAL, JOINING, LEAVING, MOVING
  • Active streams
  • Read repair statistics
  • Commands sent and responses received while acting as coordinator
  • Repair session IDs
  • Nodes exchanging data
  • Data is streaming over listen address instead of broadcast address
    Useful on EC2 because Amazon doesn’t charge for internal traffic
  • Total number of files and bytes of data to be received from specified node
  • Total number of files and bytes of data to be sent to specified node
  • Specific files currently being transferred
  • Progress for each file
  • Number of read repairs attempted
  • Number of mismatches resolved
    foreground
    background
  • Number of pending and completed commands sent to other nodes
  • Number of pending and completed responses received from other nodes
  • Status of current and pending compactions
  • Number of pending compactions
  • Compactions in progress
  • Compaction Type
    Compaction - normal compaction
    Validation - building merkle trees for repair
  • Keyspace and table
  • Bytes complete and total bytes for each compaction
  • Percent done
  • Estimated time remaining for the active tasks
    Not useful, because:
    estimates can be wrong
    doesn’t account for pending tasks
  • History of recent compactions
    Shows how much space a compaction reclaimed
    Same information available in system.log
  • Unique ID
  • Keyspace and column family
  • Unix timestamp when compaction occurred
    Seconds since Jan 1, 1970
  • Total bytes before compaction
  • Total bytes after compaction
  • Row merge counts
    Actually the last column on each row
    Moved to make output fit on slide
  • Count of sstables
  • Number of rows spread across that many sstables prior to compaction
  • Used to check for schema disagreements
  • This is not the information we’re interested in
  • We’re looking to see how many schema versions there are in the cluster.
  • No disagreement; only one version of the schema shared by all nodes
  • Schema disagreement; one node has a different version from all the others
    Schema disagreements must be manually resolved
    If only one node disagrees, run nodetool resetlocalschema on that node
    If multiple nodes disagree
    shut down nodes in the minority and delete system/schema_* sstables
    start nodes back up one by one
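    Counting schema versions can be scripted. This sketch assumes output shaped like the "Schema versions:" section of nodetool describecluster, with one "version UUID: [addresses]" line per version; the sample text, UUIDs, and addresses are made up.

```python
import re

# Illustrative sample only; UUIDs and addresses are made up.
SAMPLE = """\
Cluster Information:
    Name: Test Cluster
    Schema versions:
        11111111-2222-3333-4444-555555555555: [10.0.0.1, 10.0.0.2]
        aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee: [10.0.0.3]
"""

VERSION_LINE = re.compile(r"^\s*([0-9a-f-]{36}):\s*\[(.*)\]\s*$")


def schema_versions(text):
    """Map each schema version UUID to the list of nodes reporting it."""
    versions = {}
    for line in text.splitlines():
        match = VERSION_LINE.match(line)
        if match:
            addresses = [a.strip() for a in match.group(2).split(",") if a.strip()]
            versions[match.group(1)] = addresses
    return versions


versions = schema_versions(SAMPLE)
# The version held by the most nodes is treated as the majority.
majority = max(versions, key=lambda v: len(versions[v]))
minority_nodes = [a for v, addrs in versions.items() if v != majority for a in addrs]
print("schema versions:", len(versions))
print("nodes in the minority:", minority_nodes)
```

    More than one version means a disagreement; the minority list shows which nodes to look at.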
  • system.log is the most important tool for troubleshooting Cassandra
    Basic format
    Level: INFO, WARN, ERROR by default; DEBUG only if configured
    Thread: use the ID to correlate messages from the same thread
    Date & Time: use to correlate messages across multiple nodes, time duration of events
    Source File/Line No: code that logged the message, not necessarily where an error occurred; stack traces are covered later
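    The fields above can be split out with one regular expression. This is a sketch that assumes the layout "LEVEL [thread] date time File.java:line - message"; the sample line is made up, and a customized logging configuration would need a different pattern.

```python
import re
from datetime import datetime

LOG_LINE = re.compile(
    r"^(?P<level>[A-Z]+)\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<source>\S+):(?P<line>\d+)\s+-\s+(?P<message>.*)$"
)


def parse_log_line(line):
    """Return the fields of one log line, or None for continuation lines."""
    match = LOG_LINE.match(line)
    if not match:
        return None  # e.g. a stack trace line belonging to the previous entry
    entry = match.groupdict()
    entry["timestamp"] = datetime.strptime(entry["timestamp"], "%Y-%m-%d %H:%M:%S,%f")
    entry["line"] = int(entry["line"])
    return entry


# Made-up sample line in the assumed format.
sample = ("INFO  [CompactionExecutor:4] 2015-09-22 10:15:30,123 "
          "CompactionTask.java:141 - Compacting 4 sstables")
entry = parse_log_line(sample)
print(entry["level"], entry["thread"], entry["timestamp"], entry["source"], entry["line"])
```

    Grouping parsed entries by thread links messages from the same task; sorting by timestamp lines up events across nodes.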
  • Exception - What kind of error occurred?
  • Stack trace – where did the error happen?
    Most local at top to most global at bottom
    Wall of text; we’ll dissect it in the next slides
  • Organization names – whose fault is it?
  • Sub-packages usually group major application subsystems
  • Class Name – specific object
    Will usually, but not always match the filename
  • Nested Classes - $ separates outer class from inner class(es)
    Method belongs to the inner class
  • Method name – what was the class doing?
    <init> indicates that the error occurred in a static initialization block or constructor
  • File name – where the source code is, should you want to look at it
    Also pay attention to the package name so you can find the file within the nested directory structure
    When source-diving, start with the most local method and work your way out
  • Line number – where to look in the code (available on github.com/apache/cassandra)
    Be careful! Line numbers change between versions, so make sure you select the right version on GitHub
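    The parts of a stack frame described above (package, class, nested classes, method, file, line) can be separated programmatically. A minimal sketch using frames from the example trace; the regular expression and the parse_frame helper are illustrative, not part of any Cassandra tooling.

```python
import re

FRAME = re.compile(
    r"at\s+(?P<qualified>[\w.$<>]+)\.(?P<method>[\w$<>]+)\s*"
    r"\((?P<file>[\w$]+\.java):(?P<line>\d+)\)"
)


def parse_frame(text):
    """Split one 'at package.Class$Inner.method(File.java:N)' frame into parts."""
    match = FRAME.search(text)
    if not match:
        return None
    *package_parts, class_name = match.group("qualified").split(".")
    outer, *inner = class_name.split("$")  # $ separates outer and inner classes
    return {
        "package": ".".join(package_parts),
        "class": outer,
        "inner_classes": inner,
        "method": match.group("method"),
        "file": match.group("file"),
        "line": int(match.group("line")),
    }


nested = parse_frame(
    "at org.apache.cassandra.service.CacheService$KeyCacheSerializer.deserialize"
    " (CacheService.java:356)"
)
constructor = parse_frame("at org.apache.cassandra.db.Keyspace.<init>(Keyspace.java:266)")
print(nested)
print(constructor)
```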
  • Pay attention to nested exceptions
    Each exception has its own stack trace which may be completely different
    The outer exception may be too general because it’s been rethrown from unrelated code
    The innermost exception will be the actual root cause of the error
    Best to use a combination of outer and inner exception as search terms
  • Exception will usually have an error message
    Provides additional information about the circumstances of the exception
    Usually good to search for the exception and message together
    Look out for embedded numbers or strings; these may change from one message to the next, and including them will undesirably narrow your search
  • Some additional examples of organizations and subsystems
  • Use exception and several package+class+method names
    Exception alone often isn’t sufficient because the same error can occur many different places
    You’ll find the same exception in unrelated software if it’s a standard Java exception
    Add several package/class/method combinations to narrow down the exception
    Use at least the topmost method and the first org.apache.cassandra method
    Use quotes around individual elements (especially if they contain spaces)
    Line numbers shouldn’t be part of your search criteria because you may not find the same error in a different version
    Likewise, exclude specific numbers and strings like names and counts from your search
    Use Google’s site: feature to narrow search results to Apache JIRA, the Cassandra mailing list, or Stack Overflow
    Add or remove additional methods as needed to narrow or broaden search
  • These might be a good set of search terms for this exception
    Include both exceptions and methods from both stack traces
    Include exception and error message grouped together inside quotes
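    The search advice above can be turned into a small helper. This sketch builds a query from the exception type, the topmost method, and the first org.apache.cassandra method, quoting each term and leaving out line numbers. As a simplification it drops the error message entirely; the helper name and the choice of site are assumptions for illustration.

```python
import re

FRAME = re.compile(r"at\s+([\w.$<>]+)\s*\(")

# First lines of the example trace from the slides.
TRACE = [
    "java.io.EOFException",
    "at java.io.DataInputStream.readFully(DataInputStream.java:197)",
    "at java.io.DataInputStream.readFully(DataInputStream.java:169)",
    "at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:395)",
    "at org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:119)",
]


def search_terms(trace_lines, site=None):
    """Build a quoted search query without line numbers."""
    exception = trace_lines[0].split(":", 1)[0].strip()
    methods = [m.group(1) for m in map(FRAME.search, trace_lines[1:]) if m]
    picked = methods[:1]  # topmost (most local) method
    first_cassandra = next(
        (m for m in methods if m.startswith("org.apache.cassandra.")), None
    )
    if first_cassandra and first_cassandra not in picked:
        picked.append(first_cassandra)
    terms = ['"%s"' % term for term in [exception] + picked]
    if site:
        terms.append("site:" + site)
    return " ".join(terms)


query = search_terms(TRACE, site="issues.apache.org")
print(query)
```

    Adding more methods from the trace narrows the search; removing them broadens it.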
  • Know how to recognize a restart
    Check versions of major components and JVM
    Confirm settings are what you think they are
    Make sure you have JNA installed
    Know when node is ready to serve requests
  • Cassandra writes updates to the commit log on disk and the memtables in memory
    The memtables are flushed to sstables on disk as required
  • First the flush is enqueued
    There are a limited number of FlushWriter threads
    Flushes wait in the queue until a FlushWriter is available
  • When a FlushWriter becomes available, the flush begins
  • Eventually the flush completes.
    This is not an instantaneous process because disks are slow.
  • This is the name of the table
    You can use this to link the enqueuing and writing of the flush
  • Make note of the MemtableFlushWriter thread doing the flush
    You can use this to link the beginning and end of the flush
  • The thread that enqueues the memtable provides clues about why it was flushed:
    When the memory fills up, the flush is enqueued by SlabPoolCleaner (for on-heap) or NativePoolCleaner (for off-heap)
    If the commit log space fills up, the flush is enqueued by OptionalTasks
    Other threads may enqueue flushes for other reasons
    If you have lots of tiny sstables, it may be from flushing too frequently so try to figure out why
  • This shows the number of bytes stored in the memtable both on-heap and off-heap
    Also shows the percentage of total space devoted to memtables that this one consumed
  • Serialized bytes; larger than the in-memory bytes due to serialization overhead
  • This shows the name of the sstable that the memtable was written to on disk
  • Note the times on the messages
    Time between first and second messages is how long the flush waited in the queue
    Time between second and third messages is how long the flush took to complete
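    The two intervals above are plain timestamp subtraction. A sketch assuming the system.log timestamp layout "YYYY-MM-DD HH:MM:SS,mmm"; the three timestamps are made up and stand in for the enqueue, write-start, and completion messages of one flush.

```python
from datetime import datetime

LOG_TIME = "%Y-%m-%d %H:%M:%S,%f"


def flush_timings(enqueued, writing, completed):
    """Return (seconds waiting in the queue, seconds spent writing)."""
    t_enqueued, t_writing, t_completed = (
        datetime.strptime(stamp, LOG_TIME) for stamp in (enqueued, writing, completed)
    )
    queue_wait = (t_writing - t_enqueued).total_seconds()
    write_time = (t_completed - t_writing).total_seconds()
    return queue_wait, write_time


# Made-up timestamps for the three messages of one flush.
queue_wait, write_time = flush_timings(
    "2015-09-22 10:15:30,100",
    "2015-09-22 10:15:30,350",
    "2015-09-22 10:15:32,850",
)
print("waited %.2fs in queue, took %.2fs to write" % (queue_wait, write_time))
```

    A long queue wait points at too few flush writers or slow disks; a long write time points at the disk itself.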
  • Sstables are immutable; updates go into new sstables
    Reads scan over multiple sstables to stitch together a row
    SSTables must eventually be compacted together to keep reads fast
    Size-tiered compactions occur when a sufficient number of similarly sized sstables exist
    In leveled compaction, sstables are written to Level 0 and moved to higher levels as they are compacted
    Leveled compactions occur continuously as long as sstables exist in Level 0
  • Note the CompactionExecutor thread doing the compaction
    The thread ID can be used to link together the messages
  • The compaction is beginning
  • The compaction is complete
  • The sstables that are going to be compacted
  • How many sstables were compacted
  • The name of the new sstable created by the compaction
  • The number of bytes in the original files
    The number of bytes in the new file
    The percentage of the original size after:
    Updates were merged
    Tombstones and expired TTLs were removed
  • The time the compaction took and the rate in MB/sec
  • The sum of the number of rows in each sstable
    The number of unique rows across all compacted sstables
  • X:Y where Y rows were split across X sstables
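    The X:Y merge counts relate directly to the two row totals mentioned earlier: summing Y gives the number of unique rows, and summing X times Y gives the total rows across the input sstables. A sketch assuming the counts appear as a "{X:Y, X:Y, }" string; the sample values are made up.

```python
import re


def parse_merge_counts(text):
    """Map 'number of sstables a row was spread across' to 'number of such rows'."""
    return {int(x): int(y) for x, y in re.findall(r"(\d+):(\d+)", text)}


# Made-up sample in the assumed "{X:Y, X:Y, }" form.
counts = parse_merge_counts("{1:1200, 2:300, 4:25, }")
unique_rows = sum(counts.values())
rows_before = sum(x * y for x, y in counts.items())
print(counts, unique_rows, rows_before)
```

    A large share of rows under the higher X values means many rows were fragmented across sstables before the compaction.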
  • During compaction, you may see one or more messages about partitions being compacted incrementally
    Logical CQL 3 rows sharing the same partition key form a physical row when stored in the cluster
    Newer versions of Cassandra say partition instead of row
    Large partitions can cause a number of problems for Cassandra
    Uneven distribution of data between nodes
    Large memory usage when large partitions are read all at once
    Generating lots of garbage compacting a wide row
    Slower compactions because they’re done incrementally on disk
    These messages can help identify large rows
  • The keyspace, table, and partition key
  • The size of the partition
  • Garbage collections are a necessary evil
    Some garbage collections run concurrently with Cassandra, but others stop the world
    Cassandra logs any stop-the-world collections that last longer than 200ms
    Stop the world collections cause nodes to stop responding to gossip and client requests
  • This shows the number of ms elapsed for the collection
    Long collections increase latency of read and write requests
    Very long collections will prevent the node from gossiping and other nodes will think it’s down
  • Note the time on each GCInspector message.
    Even if the individual collections are fast, too many collections within a short timespan can hurt throughput
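    To see whether many short collections add up, the pause times can be summed per minute. This sketch assumes GCInspector lines of the form "<collector> GC in <N>ms" after the usual log prefix; the exact wording differs between Cassandra versions, so the pattern and the made-up sample lines are assumptions to adapt.

```python
import re
from collections import defaultdict

# Assumed message format; adjust the pattern to match your version's logs.
GC_LINE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2},\d{3} "
    r"GCInspector\.java:\d+ - (\w+) GC in (\d+)ms"
)

# Made-up sample lines.
LINES = [
    "INFO  [Service Thread] 2015-09-22 10:15:05,120 GCInspector.java:252 - ParNew GC in 310ms.",
    "INFO  [Service Thread] 2015-09-22 10:15:21,480 GCInspector.java:252 - ParNew GC in 250ms.",
    "WARN  [Service Thread] 2015-09-22 10:15:47,900 GCInspector.java:252 - ConcurrentMarkSweep GC in 1200ms.",
    "INFO  [Service Thread] 2015-09-22 10:16:02,010 GCInspector.java:252 - ParNew GC in 220ms.",
]


def gc_pause_per_minute(lines):
    """Sum logged GC pause milliseconds for each minute."""
    totals = defaultdict(int)
    for line in lines:
        match = GC_LINE.search(line)
        if match:
            totals[match.group(1)] += int(match.group(3))
    return dict(totals)


pauses = gc_pause_per_minute(LINES)
for minute, total_ms in sorted(pauses.items()):
    print(minute, total_ms, "ms paused")
```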
  • This shows the amount of data in each generation before and after the collection
    Java divides the heap into different generations and uses different GC approaches on each
    This is based on the observation that objects either tend to be short-lived or long-lived and different approaches work better in each scenario
    The size of different spaces can be tweaked as well as the rules for moving objects between them
  • Java allocates new objects into the eden space
  • Eden objects that survive a single collection are promoted to the survivor space
  • Survivor objects that survive a specified number of collections get further promoted to the tenured generation
  • This shows the type of collection that occurred
    ParNew collections occur when the young gen is collected. These are stop-the-world.
    The young gen is usually small so ParNew is usually fast
    If there’s not enough contiguous space in the old gen to promote an object, the old gen must be compacted, which takes a long time
  • ConcurrentMarkSweep normally runs concurrently with the application and does not stop the world
    If the concurrent collection can’t keep up with the rate at which garbage is generated, a stop-the-world collection occurs
    Stop-the-world CMS can take a very long time because the old gen is usually big
  • G1 is a newer garbage collector that can optionally be used with Cassandra
    G1 divides the heap into multiple young and tenured regions and collects different regions independently
    Recent tests have shown that G1 provides better latency and throughput with less tuning than the older GC options
    G1 will be the default garbage collector starting in Cassandra 3.0.
  • Flapping is often caused by garbage collections
  • The nodes go up-down-up-down, repeatedly
    That’s why it’s called flapping
  • Notice the nodes that are flapping
    If a single node is flapping, check the logs on that node during the same timeframe and see if GC is occurring
    If multiple nodes are reported up and down, the local node may be the problem
    If a node is doing GC it won’t be able to receive gossip messages from other nodes and may think they’re down
    Check for GC messages in the local log around the time that flapping occurs
  • Note the time that flapping occurs
    If it happens infrequently, it may not be a problem
    If it happens multiple times a minute, it is a problem
    Check the other nodes’ logs for GC events that occurred at the same time
    If you don’t see anything in the other nodes’ logs, it may be a network issue
    You can reduce the failure detector’s sensitivity by increasing phi_convict_threshold in cassandra.yaml
    Default value is 8; maximum recommended value is 12 (useful on high latency networks such as AWS)
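    Counting up/down transitions per node makes flapping easy to spot. This sketch assumes gossip messages of the form "InetAddress /<ip> is now UP" or "is now DOWN"; the sample lines are made up, and the threshold of four transitions is a hypothetical cutoff, not a recommended value.

```python
import re
from collections import Counter

TRANSITION = re.compile(r"InetAddress /(\S+) is now (UP|DOWN)")

# Made-up sample lines.
LINES = [
    "INFO  [GossipTasks:1] 2015-09-22 10:15:01,000 Gossiper.java:968 - InetAddress /10.0.0.2 is now DOWN",
    "INFO  [GossipStage:1] 2015-09-22 10:15:09,000 Gossiper.java:954 - InetAddress /10.0.0.2 is now UP",
    "INFO  [GossipTasks:1] 2015-09-22 10:15:31,000 Gossiper.java:968 - InetAddress /10.0.0.2 is now DOWN",
    "INFO  [GossipStage:1] 2015-09-22 10:15:40,000 Gossiper.java:954 - InetAddress /10.0.0.2 is now UP",
    "INFO  [GossipTasks:1] 2015-09-22 10:15:44,000 Gossiper.java:968 - InetAddress /10.0.0.3 is now DOWN",
]


def count_transitions(lines):
    """Count UP/DOWN transitions reported for each remote node."""
    counts = Counter()
    for line in lines:
        match = TRANSITION.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


FLAP_THRESHOLD = 4  # hypothetical cutoff for this sample window
transitions = count_transitions(LINES)
flapping = sorted(ip for ip, n in transitions.items() if n >= FLAP_THRESHOLD)
print("transitions:", dict(transitions))
print("flapping:", flapping)
```

    If many different addresses show up as flapping in one node's log, suspect the local node rather than the remote ones.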
  • If a write comes in for a down node, the coordinator will store hints for it
    When the node comes back up, the nodes that have hints for it will send them
    Hints are no longer stored once the node has been down longer than the period specified by max_hint_window_in_ms in cassandra.yaml
    This is to prevent a node that comes back up from being inundated with more hints than it can handle
    Any node that has been down longer than this period of time needs to run nodetool repair
    Flapping can cause excessive hint buildup, which adds extra burden for both the coordinator and the node that is flapping
    This can lead to cascading failures
  • Repairs are initiated by running nodetool repair
    They do a full comparison of all the data for a particular token range with the other replicas for that range, then exchange any data that is out of sync
    system.log shows the process from start to finish
  • Note the UUID for the repair session. This is your key to correlating the various messages.
  • When a repair session begins, you will see a “new session” message
  • It will report the nodes it’s going to sync with
  • The token ranges it’s going to sync
  • And the keyspace and column families it’s going to sync
  • The first step is for the repair leader to request merkle trees from all the other replicas
  • A message is logged reporting the receipt of each requested merkle tree
    Make sure you see a message that the merkle tree was received from each node that it was requested from
  • After comparing the merkle trees, if the nodes are in sync, you’ll see a message like this
  • If not, you’ll see a message like this, reporting how many ranges are out of sync
  • The node will then begin a streaming repair with the out-of-sync replica
  • Another message reports when the streaming task has succeeded
  • A message will report when each table has been fully synced. This means either it was in sync to begin with, or all the streaming tasks necessary to sync it completed.
  • Once all tables are synced, a message will report that the overall repair session completed successfully.
    If you see a “new session” message for a particular ID but not a “session completed successfully”, the repair is still running.
    If a repair doesn’t complete successfully after some time, you should look more closely at the other messages for that session to see where it might be stuck.
    Sometimes network issues can disrupt the streaming of data or a merkle tree, causing repair to hang
    Other times, there is simply a lot of data, and building merkle trees can take a long time, as can streaming data
    Increasing compaction throughput and streaming throughput will help speed the process, at the cost of using extra I/O and network bandwidth.
    Check the other nodes involved in the repair for messages using the same session ID
    Check for any errors that would have disrupted the repair
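    Matching "new session" messages against "session completed successfully" messages by session ID can be scripted. This sketch assumes each repair message carries a "[repair #<uuid>]" prefix; the sample lines and UUIDs are made up.

```python
import re

SESSION = re.compile(
    r"\[repair #([0-9a-f-]{36})\] (new session|session completed successfully)"
)

# Made-up sample messages.
LINES = [
    "[repair #11111111-2222-3333-4444-555555555555] new session: will sync /10.0.0.1, /10.0.0.2",
    "[repair #aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee] new session: will sync /10.0.0.1, /10.0.0.3",
    "[repair #11111111-2222-3333-4444-555555555555] session completed successfully",
]


def unfinished_sessions(lines):
    """Return IDs of sessions that started but have no completion message yet."""
    started, finished = [], set()
    for line in lines:
        match = SESSION.search(line)
        if not match:
            continue
        if match.group(2) == "new session":
            started.append(match.group(1))
        else:
            finished.add(match.group(1))
    return [session for session in started if session not in finished]


pending = unfinished_sessions(LINES)
print("sessions still running or stuck:", pending)
```

    Searching the other nodes' logs for each remaining ID shows where that session last made progress.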
  • Before we end, I just want to go back to the troubleshooting process I discussed at the beginning
    Next time you have a problem, think about the tools at your disposal and how they can help you with these steps
