Real World DTCS For Operators
An Introduction to CrowdStrike
We Are a Cybersecurity Technology Company
We Detect, Prevent And Respond To All Attack Types In Real Time, Protecting Organizations From Catastrophic Breaches
We Provide Next Generation Endpoint Protection, Threat Intelligence & Pre & Post IR Services
NEXT-GEN ENDPOINT
INCIDENT RESPONSE
THREAT INTEL
What Is Compaction?
• Cassandra write path:
– First the Commitlog
– Then the Memtable
– Eventually flushed to an SSTable
• Each SSTable is written exactly once
• Over time, Cassandra combines files
– Duplicate cells are merged
– Obsolete data is purged
• The algorithm Cassandra uses to determine when and how to combine files is pluggable, and choosing the right strategy may be important at scale
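A minimal sketch of what "duplicate cells are merged, obsolete data is purged" means, under loud assumptions: SSTables are modeled as plain dicts, the `Cell` type and `compact` function are hypothetical names, and the real rules around tombstones (gc_grace_seconds, overlap with other SSTables) are ignored.

```python
from typing import Dict, List, NamedTuple, Optional


class Cell(NamedTuple):
    value: Optional[str]               # None marks a deletion in this toy model
    write_ts: int                      # write timestamp; the newest write wins
    expires_at: Optional[int] = None   # absolute expiry time for TTL'd cells


def compact(sstables: List[Dict[str, Cell]], now: int) -> Dict[str, Cell]:
    """Combine several immutable files into one new file."""
    merged: Dict[str, Cell] = {}
    for table in sstables:
        for key, cell in table.items():
            current = merged.get(key)
            # Duplicate cells are merged: keep only the most recent write.
            if current is None or cell.write_ts > current.write_ts:
                merged[key] = cell
    # Obsolete data is purged: drop deletions and cells whose TTL has passed.
    return {
        key: cell
        for key, cell in merged.items()
        if cell.value is not None
        and (cell.expires_at is None or cell.expires_at > now)
    }
```

The inputs are never modified, which mirrors the "written exactly once" point: compaction only ever produces a new file.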
© 2015. All Rights Reserved.
What Is Compaction?
• SizeTieredCompactionStrategy
– Each time min_threshold (4) files of the same size appear, combine them into a new file
– Over time, you’ll naturally end up with a distribution of old data in large files, new data in small files
– Deleted data in large files stays on disk longer than desired because those files are very rarely compacted
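The size-tiered behavior described above can be sketched as a toy simulation. Assumptions: every flush produces a file of size 1, "same size" means exactly equal (the real strategy groups files of similar size), and the function names are hypothetical.

```python
from collections import Counter
from typing import List

MIN_THRESHOLD = 4  # the min_threshold value quoted on the slide


def flush_and_compact(sizes: List[int], new_size: int = 1) -> List[int]:
    """Add one flushed file, then merge while any size has MIN_THRESHOLD files."""
    sizes = sizes + [new_size]
    while True:
        counts = Counter(sizes)
        full = [size for size, count in counts.items() if count >= MIN_THRESHOLD]
        if not full:
            return sorted(sizes, reverse=True)
        size = min(full)
        for _ in range(MIN_THRESHOLD):
            sizes.remove(size)
        # The merged file lands in the next tier up.
        sizes.append(size * MIN_THRESHOLD)


def simulate(flushes: int) -> List[int]:
    """Return the file sizes on disk after the given number of flushes."""
    sizes: List[int] = []
    for _ in range(flushes):
        sizes = flush_and_compact(sizes)
    return sizes
```

After 21 flushes this leaves files of size 16, 4 and 1: the oldest data sits in the largest file, which is touched again only once three more files of size 16 exist.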
SizeTieredCompactionStrategy
If each of the smallest blocks represents 1 day of data, and each write had a 90 day TTL, when do you actually delete files and reclaim disk space?
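One rough way to reason about the question, as a sketch rather than an exact model: a file can only be dropped whole once its newest cell has expired, so a file spanning N days of writes is droppable about N + TTL days after its oldest write. The helper names are hypothetical, and the sketch ignores the fact that a later compaction may purge expired cells when it rewrites a file.

```python
TTL_DAYS = 90          # TTL from the slide
MIN_THRESHOLD = 4      # files merged per size tier


def tier_spans(tiers: int) -> list:
    """Days of data covered by a file in each size tier (1, 4, 16, ...)."""
    return [MIN_THRESHOLD ** tier for tier in range(tiers)]


def fully_expired_after(span_days: int, ttl_days: int = TTL_DAYS) -> int:
    """Days after the file's oldest write until every cell in it has expired."""
    return span_days + ttl_days


for span in tier_spans(4):
    print(f"file spanning {span:>2} days: oldest cell expires at day {TTL_DAYS}, "
          f"whole file droppable at day {fully_expired_after(span)}")
```

Under these assumptions, data in a 64-day file can sit on disk for about 64 days past its own TTL.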
Why Compaction Strategy Matters
• We keep some data from sensors for a fixed time period
• Processes
• DNS queries
• Files created
• It’s a LOT of data
• Talk tomorrow morning: One million writes per second with 60 nodes
• We’re WELL past 60 nodes
• If we can’t delete it efficiently, costs go way, way up
DateTieredCompactionStrategy
• Early tickets suggested creating a way to stop compacting cold data
– CASSANDRA-5515 – track sstable coldness, stop compacting cold sstables (measured by READ counts)
• CASSANDRA-6602 – optimize for time series specifically
– Solution provided by Björn Hegerfors from Spotify
– Use sstable’s min timestamp to find a target window
– Compact sstables within the same target
– Stop compacting sstables if max timestamp is older than a specified cutoff
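The three steps above can be sketched roughly as follows. This is a simplified illustration of the tiered-window idea, not the code from CASSANDRA-6602: the function names, the dict-based sstable model, and the exact window arithmetic are assumptions, and timestamps newer than `now` are simply treated as belonging to the newest window.

```python
from typing import Dict, List, Tuple


def window_for(min_ts: int, now: int, base: int, threshold: int) -> Tuple[int, int]:
    """Return (window_size, window_index) of the tiered window holding min_ts."""
    size, position = base, now // base
    while min_ts // size < position:        # timestamp is older than this window
        if position % threshold > 0:
            position -= 1                   # step back one window of the same size
        else:
            size *= threshold               # tier up: the window gets larger
            position = position // threshold - 1
    return size, position


def bucket_sstables(
    sstables: Dict[str, Tuple[int, int]],   # name -> (min_ts, max_ts)
    now: int,
    base: int,
    threshold: int,
    max_age: int,
) -> Dict[Tuple[int, int], List[str]]:
    """Group compaction candidates by the window of their min timestamp."""
    buckets: Dict[Tuple[int, int], List[str]] = {}
    for name, (min_ts, max_ts) in sstables.items():
        if now - max_ts > max_age:
            continue                        # max timestamp past the cutoff: left alone
        window = window_for(min_ts, now, base, threshold)
        buckets.setdefault(window, []).append(name)
    return buckets
```

With `base=1`, `threshold=4` and `now=101`, timestamps 99 and 98 share a window of size 4, while timestamp 10 falls into a window of size 64: older data lands in ever larger windows.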
DTCS In Pictures
DTCS Parameters
• max_sstable_age_days
• base_time_seconds
• timestamp_resolution
• min_threshold
– Common to all compaction strategies
• max_threshold
– Common to all compaction strategies
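For orientation, a small sketch that renders a CQL `ALTER TABLE` statement from these option names. The helper name, the table name and the values shown are illustrative assumptions, not recommendations; check the option names and accepted values against the documentation for your Cassandra version before running anything like this.

```python
def dtcs_alter_statement(table: str, options: dict) -> str:
    """Render a CQL statement that switches a table to DTCS with the given options."""
    settings = {"class": "DateTieredCompactionStrategy", **options}
    body = ", ".join(f"'{key}': '{value}'" for key, value in settings.items())
    return f"ALTER TABLE {table} WITH compaction = {{{body}}};"


# Hypothetical keyspace/table and purely illustrative values.
print(dtcs_alter_statement(
    "sensors.events",
    {
        "max_sstable_age_days": 30,
        "base_time_seconds": 3600,
        "timestamp_resolution": "MICROSECONDS",
        "min_threshold": 4,
        "max_threshold": 32,
    },
))
```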
DTCS Benefits
In Theory…
• You can stop data compacting at a point you choose!
– max_sstable_age_days
• You can adjust the window size so that you can quickly expire data when it’s approximately the size you want
– It’s not immediately intuitive, but you CAN calculate it (min_threshold and base_time_seconds)
• We know cold data won’t be recompacted, so we can potentially enable cold storage directories with cheaper disk
– CASSANDRA-8460 – patch available, I need to rebase
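A sketch of the window-size calculation, assuming each tier is min_threshold times wider than the one before it, starting at base_time_seconds. The values 3600 and 4 used below are assumed settings for illustration, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def window_sizes_seconds(base_time_seconds: int, min_threshold: int,
                         horizon_seconds: int) -> list:
    """Window widths, smallest first, that fit within the given horizon."""
    sizes = []
    size = base_time_seconds
    while size <= horizon_seconds:
        sizes.append(size)
        size *= min_threshold   # each tier is min_threshold times wider
    return sizes


# Assumed settings: 1 hour base window, min_threshold of 4, 1 year of data.
for seconds in window_sizes_seconds(3600, 4, 365 * SECONDS_PER_DAY):
    print(f"{seconds:>9} s = {seconds / SECONDS_PER_DAY:7.2f} days")
```

With those assumed settings the widest window that fits in a year is 14,745,600 seconds, about 171 days, which is the same order as the roughly 180-day window mentioned later in the deck.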
Do people consider DTCS Production Ready?
• It was added to 2.0 after 2.1 was out. Usually this means:
– Trivial and low risk, or
– Experimental and meant for advanced users only
– I challenge you to find documentation on which is true for DTCS
• Spotify’s intro blog notes that they use it in production
• I’ve been told by a project committer that they feel DTCS is for advanced users only, but I’ve never seen any public facing messaging that normal users should avoid it
• It seems so easy, what could possibly go wrong…
DTCS Caveats
• The initial blogs give us some insight about what type of things may not behave as intended
– “But something that works against the efforts of the strategy is writes with highly out-of-order timestamps”
• How much is “highly out of order”?
– “Consider turning off read repairs. Anti-entropy repairs and hinted handoff don’t incur as much additional work for DTCS and may be used like usual.”
Out of order timestamps
• When an sstable gets flushed with an old timestamp in a new table:
– The max timestamp is used to determine when to stop compacting, but
– The min timestamp is used to determine which other files will be compacted with this sstable
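To make the min/max mismatch concrete, here is a small sketch. The `SSTableStats` type and helper names are hypothetical; the point is only that one old cell drags the min timestamp back while the max timestamp stays fresh.

```python
from typing import List, NamedTuple

DAY = 86_400  # seconds


class SSTableStats(NamedTuple):
    min_ts: int
    max_ts: int


def flush(cell_timestamps: List[int]) -> SSTableStats:
    """A flushed file records the oldest and newest cell timestamps it holds."""
    return SSTableStats(min(cell_timestamps), max(cell_timestamps))


def still_compacting(stats: SSTableStats, now: int, max_age: int) -> bool:
    """The max timestamp decides whether the file is past the cutoff."""
    return now - stats.max_ts <= max_age


def grouping_timestamp(stats: SSTableStats) -> int:
    """The min timestamp decides which window the file is grouped into."""
    return stats.min_ts


# Fresh writes from days 399 and 400, plus one read-repaired cell from day 30.
now = 400 * DAY
stats = flush([399 * DAY, 400 * DAY, 30 * DAY])
print(still_compacting(stats, now, max_age=10 * DAY))   # file is still a candidate
print(grouping_timestamp(stats) // DAY)                 # but it is grouped with day 30
```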
Out of order timestamps
• Windows are tiered, and they get bigger and bigger
• With default settings and 1 year of data, the largest window covers 180 days
– This means even if most of the file is past max_sstable_age_days, you can still end up compacting with a brand new sstable with read repaired data
• “DTCS never stops compacting”
– Read repairs pull old data into new windows triggering recompaction
– Does that mean we better run repair?
Small SSTables from Repairs
(and other streaming operations)
• “If an SSTable contains timestamps that don’t match the time when it was actually written to disk, it violates the size-to-age correspondence that DTCS tries to maintain.”
• The suggestions on Spotify and DataStax blogs say run repair more often than max_sstable_age_days, but that isn’t the only cause of small sstables
– Bootstrap
– Decommission
– Bulk Loader
Real Pain:
If you can’t expand your cluster, what’s the point?
[Chart: SSTable Count Per Node]
Damn you, vnodes!
Well…
Small SSTables Shouldn’t Be Ignored
• If the small sstables are beyond max_sstable_age_days, they won’t be compacted
– After all, that’s the point of max_sstable_age_days, right?
• If you raise max_sstable_age_days, the ever-growing DTCS tiered windows will cause existing sstables to merge and get much larger, negating one of the benefits of DTCS
• If you don’t raise max_sstable_age_days, you have to deal with performance implications of ten thousand sstables
– Reduced somewhat by CASSANDRA-9882
– Before #9882, too many sstables could block flushing for a long time
Really Embarrassing Admission
• Our early bulk loading plan and bootstrapping procedure acknowledged and accepted that sstables will be abandoned beyond max_sstable_age_days
• We have Python scripts that check the timestamps, and manually submit compactions through JMX forceUserDefinedCompaction()
• Yes, really.
• Does it actually scale?
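The scripts themselves are not shown in the deck. The following is only a guess at the selection step such a script might perform, with hypothetical names throughout: it assumes the min and max timestamps of each sstable have already been collected somehow, and it stops short of the JMX call, since the argument form forceUserDefinedCompaction() expects depends on the Cassandra version.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple

MICROS_PER_DAY = 86_400 * 1_000_000


class SSTableInfo(NamedTuple):
    path: str
    min_ts: int   # microseconds since epoch
    max_ts: int


def abandoned_groups(sstables: Iterable[SSTableInfo], now_us: int,
                     max_age_days: int) -> Dict[int, List[str]]:
    """Group sstables already past max_sstable_age_days by the day of their max timestamp."""
    cutoff = now_us - max_age_days * MICROS_PER_DAY
    groups: Dict[int, List[str]] = defaultdict(list)
    for sstable in sstables:
        if sstable.max_ts < cutoff:
            groups[sstable.max_ts // MICROS_PER_DAY].append(sstable.path)
    # A single file in a day has nothing to merge with, so skip it.
    return {day: sorted(paths) for day, paths in groups.items() if len(paths) > 1}


def compaction_arguments(groups: Dict[int, List[str]]) -> List[str]:
    """One comma-separated file list per group, oldest day first."""
    return [",".join(paths) for _, paths in sorted(groups.items())]
```

Each returned string is one candidate list of files for a user-defined compaction; how it is submitted over JMX is left out on purpose.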
When should you use DTCS?
• You TTL ALL of your data and writes come in order
• Fixed sized cluster and no plans for bulk loading, or rarely changing cluster size and not using vnodes
– If you plan on growing, you better have a plan for small sstables
– If you do need to add/remove nodes, vnodes will cause far more small sstables than single-token-per-node
• Extra space available for compaction
– You can’t rely on theoretical table sizes calculated with max_sstable_age_days, because read repair, hints, etc, can force those files to span much larger time ranges than you expect
Being Honest
What if?
• Do we really need max_sstable_age_days?
– The conventional logic is to use it to denote cold data, but we use it to force window sizes
– If we give up tiering, and stick with fixed sized windows, do we need max_sstable_age_days?
• Without tiering, can we swap base_time_seconds for a more intuitive configuration option?
TimeWindowCompactionStrategy
• Designed to be simple and efficient
– Group sstables into logical buckets
– STCS within each time window
– No more rolling re-compaction
– No more streaming leftovers
– No more confusing options, just Window Size + Window Unit
• “12 Hours”, “3 Days”, “6 Minutes”
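A minimal sketch of fixed-window bucketing with a Window Size and Window Unit. Assumptions: timestamps are in seconds, each sstable is bucketed by its max timestamp, and the function and unit names are hypothetical rather than the strategy's actual option values.

```python
from collections import defaultdict
from typing import Dict, List

UNIT_SECONDS = {"MINUTES": 60, "HOURS": 3_600, "DAYS": 86_400}


def window_start(ts_seconds: int, window_size: int, window_unit: str) -> int:
    """Round a timestamp down to the start of its fixed-size window."""
    width = window_size * UNIT_SECONDS[window_unit]
    return ts_seconds - ts_seconds % width


def bucket_by_window(max_timestamps: Dict[str, int], window_size: int,
                     window_unit: str) -> Dict[int, List[str]]:
    """Group sstables (name -> max timestamp) into fixed, non-tiered windows."""
    buckets: Dict[int, List[str]] = defaultdict(list)
    for name, max_ts in max_timestamps.items():
        buckets[window_start(max_ts, window_size, window_unit)].append(name)
    return dict(buckets)
```

Because every window has the same width, a file's bucket never changes as time passes, which is what removes the rolling re-compaction of tiered windows; within a bucket, size-tiered merging would then apply.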
TimeWindowCompactionStrategy
• Submitted to Apache Cassandra as CASSANDRA-9666
• For now, we use it at CrowdStrike to clean up after streaming:
– echo "set -b org.apache.cassandra.db:columnfamily=table,keyspace=keyspace,type=ColumnFamilies CompactionStrategyClass org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy" | java -jar jmxterm.jar -l $IP:$PORT
– It’s not an accident that the TWCS defaults use 1 day windows with microsecond timestamp resolution: that matches our sstable needs, but we also think it’s a good default
• Patches (and Tests) Available for 2.1, 2.2, 3.0
TimeWindowCompactionStrategy
• No more continuous compaction
• No more tiny streaming leftovers
• No more confusing options
– Just Window Size, Window Unit
– “12 Hours”, “3 Days”, “6 Minutes”
• Work is ongoing for both DTCS and TWCS
– CASSANDRA-9645 to make DTCS easier to use
– CASSANDRA-10276 to make DTCS do STCS within each window (patch available)
– CASSANDRA-10280 to make DTCS work well with old data
TimeWindowCompactionStrategy
• There’s no guarantee that TWCS will make it into the project
– TWCS is certainly easier to reason about, but DTCS was there first and is already deployed by real users
– Anecdotal evidence and preliminary benchmarks suggest TWCS comes out ahead based on current state of both strategies (at the time of these slides)
– Formal benchmarking is needed
– DTCS probably wins for reads/SELECTs in SOME data models
• Even if TWCS doesn’t make it in, the source is available now (see: CASSANDRA-9666)
– It’s likely we’ll continue to maintain it, even if it’s not accepted upstream, so pull requests are welcome
Q&A
• Talk to me about Cassandra or DTCS on Twitter: @jjirsa
• Try to stop me from talking about DTCS on IRC: #cassandra
• CrowdStrike is awesome and hiring
– www.crowdstrike.com/careers/
• Jim Plush and Dennis Opacki, tomorrow morning
– “1 Million Writes Per Second on 60 Nodes with Cassandra and EBS”
Thank you

Mais conteúdo relacionado

Mais procurados

PlayStation and Cassandra Streams (Alexander Filipchik & Dustin Pham, Sony) |...
PlayStation and Cassandra Streams (Alexander Filipchik & Dustin Pham, Sony) |...PlayStation and Cassandra Streams (Alexander Filipchik & Dustin Pham, Sony) |...
PlayStation and Cassandra Streams (Alexander Filipchik & Dustin Pham, Sony) |...DataStax
 
Managing Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al TobeyManaging Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al TobeyDataStax Academy
 
How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)DataStax Academy
 
Cassandra summit 2013 how not to use cassandra
Cassandra summit 2013  how not to use cassandraCassandra summit 2013  how not to use cassandra
Cassandra summit 2013 how not to use cassandraAxel Liljencrantz
 
Cassandra Day Atlanta 2015: Diagnosing Problems in Production
Cassandra Day Atlanta 2015: Diagnosing Problems in ProductionCassandra Day Atlanta 2015: Diagnosing Problems in Production
Cassandra Day Atlanta 2015: Diagnosing Problems in ProductionDataStax Academy
 
Instaclustr introduction to managing cassandra
Instaclustr introduction to managing cassandraInstaclustr introduction to managing cassandra
Instaclustr introduction to managing cassandraInstaclustr
 
Webinar: Getting Started with Apache Cassandra
Webinar: Getting Started with Apache CassandraWebinar: Getting Started with Apache Cassandra
Webinar: Getting Started with Apache CassandraDataStax
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonDataStax Academy
 
Cassandra Summit 2014: Diagnosing Problems in Production
Cassandra Summit 2014: Diagnosing Problems in ProductionCassandra Summit 2014: Diagnosing Problems in Production
Cassandra Summit 2014: Diagnosing Problems in ProductionDataStax Academy
 
How can you successfully migrate to hosted private cloud 2020
How can you successfully migrate to hosted private cloud 2020How can you successfully migrate to hosted private cloud 2020
How can you successfully migrate to hosted private cloud 2020OVHcloud
 
Instaclustr Apache Cassandra Best Practices & Toubleshooting
Instaclustr Apache Cassandra Best Practices & ToubleshootingInstaclustr Apache Cassandra Best Practices & Toubleshooting
Instaclustr Apache Cassandra Best Practices & ToubleshootingInstaclustr
 
Beginning Operations: 7 Deadly Sins for Apache Cassandra Ops
Beginning Operations: 7 Deadly Sins for Apache Cassandra OpsBeginning Operations: 7 Deadly Sins for Apache Cassandra Ops
Beginning Operations: 7 Deadly Sins for Apache Cassandra OpsDataStax Academy
 
M6d cassandrapresentation
M6d cassandrapresentationM6d cassandrapresentation
M6d cassandrapresentationEdward Capriolo
 
Low latency for high throughput
Low latency for high throughputLow latency for high throughput
Low latency for high throughputPeter Lawrey
 
Responding rapidly when you have 100+ GB data sets in Java
Responding rapidly when you have 100+ GB data sets in JavaResponding rapidly when you have 100+ GB data sets in Java
Responding rapidly when you have 100+ GB data sets in JavaPeter Lawrey
 
Everyday I’m scaling... Cassandra
Everyday I’m scaling... CassandraEveryday I’m scaling... Cassandra
Everyday I’m scaling... CassandraInstaclustr
 
Load testing Cassandra applications
Load testing Cassandra applicationsLoad testing Cassandra applications
Load testing Cassandra applicationsBen Slater
 

Mais procurados (18)

PlayStation and Cassandra Streams (Alexander Filipchik & Dustin Pham, Sony) |...
PlayStation and Cassandra Streams (Alexander Filipchik & Dustin Pham, Sony) |...PlayStation and Cassandra Streams (Alexander Filipchik & Dustin Pham, Sony) |...
PlayStation and Cassandra Streams (Alexander Filipchik & Dustin Pham, Sony) |...
 
Managing Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al TobeyManaging Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al Tobey
 
How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)
 
Cassandra summit 2013 how not to use cassandra
Cassandra summit 2013  how not to use cassandraCassandra summit 2013  how not to use cassandra
Cassandra summit 2013 how not to use cassandra
 
Cassandra Day Atlanta 2015: Diagnosing Problems in Production
Cassandra Day Atlanta 2015: Diagnosing Problems in ProductionCassandra Day Atlanta 2015: Diagnosing Problems in Production
Cassandra Day Atlanta 2015: Diagnosing Problems in Production
 
Instaclustr introduction to managing cassandra
Instaclustr introduction to managing cassandraInstaclustr introduction to managing cassandra
Instaclustr introduction to managing cassandra
 
Webinar: Getting Started with Apache Cassandra
Webinar: Getting Started with Apache CassandraWebinar: Getting Started with Apache Cassandra
Webinar: Getting Started with Apache Cassandra
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
 
Cassandra Summit 2014: Diagnosing Problems in Production
Cassandra Summit 2014: Diagnosing Problems in ProductionCassandra Summit 2014: Diagnosing Problems in Production
Cassandra Summit 2014: Diagnosing Problems in Production
 
How can you successfully migrate to hosted private cloud 2020
How can you successfully migrate to hosted private cloud 2020How can you successfully migrate to hosted private cloud 2020
How can you successfully migrate to hosted private cloud 2020
 
Instaclustr Apache Cassandra Best Practices & Toubleshooting
Instaclustr Apache Cassandra Best Practices & ToubleshootingInstaclustr Apache Cassandra Best Practices & Toubleshooting
Instaclustr Apache Cassandra Best Practices & Toubleshooting
 
Advanced Operations
Advanced OperationsAdvanced Operations
Advanced Operations
 
Beginning Operations: 7 Deadly Sins for Apache Cassandra Ops
Beginning Operations: 7 Deadly Sins for Apache Cassandra OpsBeginning Operations: 7 Deadly Sins for Apache Cassandra Ops
Beginning Operations: 7 Deadly Sins for Apache Cassandra Ops
 
M6d cassandrapresentation
M6d cassandrapresentationM6d cassandrapresentation
M6d cassandrapresentation
 
Low latency for high throughput
Low latency for high throughputLow latency for high throughput
Low latency for high throughput
 
Responding rapidly when you have 100+ GB data sets in Java
Responding rapidly when you have 100+ GB data sets in JavaResponding rapidly when you have 100+ GB data sets in Java
Responding rapidly when you have 100+ GB data sets in Java
 
Everyday I’m scaling... Cassandra
Everyday I’m scaling... CassandraEveryday I’m scaling... Cassandra
Everyday I’m scaling... Cassandra
 
Load testing Cassandra applications
Load testing Cassandra applicationsLoad testing Cassandra applications
Load testing Cassandra applications
 

Destaque

iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...DataStax Academy
 
Carlos Santa María - Hiperconvergencia, el futuro del Data Center - semanainf...
Carlos Santa María - Hiperconvergencia, el futuro del Data Center - semanainf...Carlos Santa María - Hiperconvergencia, el futuro del Data Center - semanainf...
Carlos Santa María - Hiperconvergencia, el futuro del Data Center - semanainf...COIICV
 
Cassandra Summit 2014: Novel Multi-Region Clusters — Cassandra Deployments Sp...
Cassandra Summit 2014: Novel Multi-Region Clusters — Cassandra Deployments Sp...Cassandra Summit 2014: Novel Multi-Region Clusters — Cassandra Deployments Sp...
Cassandra Summit 2014: Novel Multi-Region Clusters — Cassandra Deployments Sp...DataStax Academy
 
NGCC 2016 - Support large partitions
NGCC 2016 - Support large partitionsNGCC 2016 - Support large partitions
NGCC 2016 - Support large partitionsRobert Stupp
 
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...DataStax
 
3800 die-bonder overview
3800 die-bonder overview3800 die-bonder overview
3800 die-bonder overviewfastbr
 
Highly available, scalable and secure data with Cassandra and DataStax Enterp...
Highly available, scalable and secure data with Cassandra and DataStax Enterp...Highly available, scalable and secure data with Cassandra and DataStax Enterp...
Highly available, scalable and secure data with Cassandra and DataStax Enterp...Johnny Miller
 
Securing Cassandra
Securing CassandraSecuring Cassandra
Securing CassandraInstaclustr
 
Multi-Region Cassandra Clusters
Multi-Region Cassandra ClustersMulti-Region Cassandra Clusters
Multi-Region Cassandra ClustersInstaclustr
 
Cassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentialsCassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentialsJulien Anguenot
 
Lessons Learned from Real-World Deployments of Java EE 7 at JavaOne 2014
Lessons Learned from Real-World Deployments of Java EE 7 at JavaOne 2014Lessons Learned from Real-World Deployments of Java EE 7 at JavaOne 2014
Lessons Learned from Real-World Deployments of Java EE 7 at JavaOne 2014Arun Gupta
 
DataStax - Analytics on Apache Cassandra - Paris Tech Talks meetup
DataStax - Analytics on Apache Cassandra - Paris Tech Talks meetupDataStax - Analytics on Apache Cassandra - Paris Tech Talks meetup
DataStax - Analytics on Apache Cassandra - Paris Tech Talks meetupVictor Coustenoble
 
GumGum: Multi-Region Cassandra in AWS
GumGum: Multi-Region Cassandra in AWSGumGum: Multi-Region Cassandra in AWS
GumGum: Multi-Region Cassandra in AWSDataStax Academy
 
Cassandra Operations at Netflix
Cassandra Operations at NetflixCassandra Operations at Netflix
Cassandra Operations at Netflixgreggulrich
 
An Introduction to Priam
An Introduction to PriamAn Introduction to Priam
An Introduction to PriamJason Brown
 
Multi Data Center Strategies
Multi Data Center StrategiesMulti Data Center Strategies
Multi Data Center StrategiesSteven Francia
 
Ficstar Software: Cassandra Installation to Optimization
Ficstar Software: Cassandra Installation to OptimizationFicstar Software: Cassandra Installation to Optimization
Ficstar Software: Cassandra Installation to OptimizationDataStax Academy
 
Target: Performance Tuning Cassandra at Target
Target: Performance Tuning Cassandra at TargetTarget: Performance Tuning Cassandra at Target
Target: Performance Tuning Cassandra at TargetDataStax Academy
 
DataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax Academy
 
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...DataStax Academy
 

Destaque (20)

iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
 
Carlos Santa María - Hiperconvergencia, el futuro del Data Center - semanainf...
Carlos Santa María - Hiperconvergencia, el futuro del Data Center - semanainf...Carlos Santa María - Hiperconvergencia, el futuro del Data Center - semanainf...
Carlos Santa María - Hiperconvergencia, el futuro del Data Center - semanainf...
 
Cassandra Summit 2014: Novel Multi-Region Clusters — Cassandra Deployments Sp...
Cassandra Summit 2014: Novel Multi-Region Clusters — Cassandra Deployments Sp...Cassandra Summit 2014: Novel Multi-Region Clusters — Cassandra Deployments Sp...
Cassandra Summit 2014: Novel Multi-Region Clusters — Cassandra Deployments Sp...
 
NGCC 2016 - Support large partitions
NGCC 2016 - Support large partitionsNGCC 2016 - Support large partitions
NGCC 2016 - Support large partitions
 
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
 
3800 die-bonder overview
3800 die-bonder overview3800 die-bonder overview
3800 die-bonder overview
 
Highly available, scalable and secure data with Cassandra and DataStax Enterp...
Highly available, scalable and secure data with Cassandra and DataStax Enterp...Highly available, scalable and secure data with Cassandra and DataStax Enterp...
Highly available, scalable and secure data with Cassandra and DataStax Enterp...
 
Securing Cassandra
Securing CassandraSecuring Cassandra
Securing Cassandra
 
Multi-Region Cassandra Clusters
Multi-Region Cassandra ClustersMulti-Region Cassandra Clusters
Multi-Region Cassandra Clusters
 
Cassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentialsCassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentials
 
Lessons Learned from Real-World Deployments of Java EE 7 at JavaOne 2014
Lessons Learned from Real-World Deployments of Java EE 7 at JavaOne 2014Lessons Learned from Real-World Deployments of Java EE 7 at JavaOne 2014
Lessons Learned from Real-World Deployments of Java EE 7 at JavaOne 2014
 
DataStax - Analytics on Apache Cassandra - Paris Tech Talks meetup
DataStax - Analytics on Apache Cassandra - Paris Tech Talks meetupDataStax - Analytics on Apache Cassandra - Paris Tech Talks meetup
DataStax - Analytics on Apache Cassandra - Paris Tech Talks meetup
 
GumGum: Multi-Region Cassandra in AWS
GumGum: Multi-Region Cassandra in AWSGumGum: Multi-Region Cassandra in AWS
GumGum: Multi-Region Cassandra in AWS
 
Cassandra Operations at Netflix
Cassandra Operations at NetflixCassandra Operations at Netflix
Cassandra Operations at Netflix
 
An Introduction to Priam
An Introduction to PriamAn Introduction to Priam
An Introduction to Priam
 
Multi Data Center Strategies
Multi Data Center StrategiesMulti Data Center Strategies
Multi Data Center Strategies
 
Ficstar Software: Cassandra Installation to Optimization
Ficstar Software: Cassandra Installation to OptimizationFicstar Software: Cassandra Installation to Optimization
Ficstar Software: Cassandra Installation to Optimization
 
Target: Performance Tuning Cassandra at Target
Target: Performance Tuning Cassandra at TargetTarget: Performance Tuning Cassandra at Target
Target: Performance Tuning Cassandra at Target
 
DataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The Sequel
 
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
 

Semelhante a CrowdStrike: Real World DTCS For Operators

Manage your compactions before they manage you!
Manage your compactions before they manage you!Manage your compactions before they manage you!
CrowdStrike: Real World DTCS For Operators

  • 1. Real World DTCS For Operators
  • 2. An Introduction to CrowdStrike
    • We Are a Cybersecurity Technology Company
    • We Detect, Prevent And Respond To All Attack Types In Real Time, Protecting Organizations From Catastrophic Breaches
    • We Provide Next Generation Endpoint Protection, Threat Intelligence & Pre & Post IR Services
    • NEXT-GEN ENDPOINT, INCIDENT RESPONSE, THREAT INTEL
  • 3. What Is Compaction?
    • Cassandra write path:
      – First the Commitlog
      – Then the Memtable
      – Eventually flushed to an SSTable
    • Each SSTable is written exactly once
    • Over time, Cassandra combines files
      – Duplicate cells are merged
      – Obsolete data is purged
    • The algorithm Cassandra uses to determine when and how to combine files is pluggable, and choosing the right strategy may be important at scale
  • 4. What Is Compaction?
    • SizeTieredCompactionStrategy
      – Each time min_threshold (4) files of the same size appear, combine them into a new file
      – Over time, you’ll naturally end up with a distribution of old data in large files, new data in small files
      – Deleted data in large files stays on disk longer than desired because those files are very rarely compacted
  • 5. SizeTieredCompactionStrategy
  • 6. SizeTieredCompactionStrategy
    • If each of the smallest blocks represents 1 day of data, and each write had a 90 day TTL, when do you actually delete files and reclaim disk space?
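The question on slide 6 can be explored with a toy simulation. This is only a sketch under loud assumptions that are not in the slides: one equal-sized flush per day, a file is dropped only once every cell in it has passed its TTL, and gc_grace and single-SSTable tombstone compactions are ignored. The function name `simulate_stcs` is hypothetical.

```python
def simulate_stcs(days, ttl_days=90, min_threshold=4):
    """Toy size-tiered model: return the worst number of days that expired
    data stayed on disk past its TTL before its file could be dropped."""
    tiers = {}      # tier number -> list of (first_day, last_day) per file
    worst_lag = 0
    for today in range(days):
        # Drop files whose newest write has expired (assumption: whole-file drops only).
        for tier in list(tiers):
            kept = []
            for first_day, last_day in tiers[tier]:
                if today >= last_day + ttl_days:
                    # The oldest write expired at first_day + ttl_days.
                    worst_lag = max(worst_lag, today - (first_day + ttl_days))
                else:
                    kept.append((first_day, last_day))
            tiers[tier] = kept
        # Flush one new file holding today's writes.
        tiers.setdefault(0, []).append((today, today))
        # Merge whenever min_threshold files of the same tier exist.
        tier = 0
        while len(tiers.get(tier, [])) >= min_threshold:
            group, tiers[tier] = tiers[tier][:min_threshold], tiers[tier][min_threshold:]
            tiers.setdefault(tier + 1, []).append((group[0][0], group[-1][1]))
            tier += 1
    return worst_lag


if __name__ == "__main__":
    print(simulate_stcs(365))
```

In this model the files grow to cover 1, 4, 16 and then 64 days, so the oldest day in a 64-day file cannot be reclaimed until the newest day in that file has also expired.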
  • 7. Why Compaction Strategy Matters
    • We keep some data from sensors for a fixed time period
      – Processes
      – DNS queries
      – Files created
    • It’s a LOT of data
    • Talk tomorrow morning: One million writes per second with 60 nodes
    • We’re WELL past 60 nodes
    • If we can’t delete it efficiently, costs go way, way up
  • 8. DateTieredCompactionStrategy
    • Early tickets suggested creating a way to stop compacting cold data
      – CASSANDRA-5515: track sstable coldness, stop compacting cold sstables (measured by READ counts)
    • CASSANDRA-6602: optimize for time series specifically
      – Solution provided by Björn Hegerfors from Spotify
      – Use sstable’s min timestamp to find a target window
      – Compact sstables within the same target
      – Stop compacting sstables if max timestamp is older than a specified cutoff
  • 9. DTCS In Pictures
  • 10. DTCS Parameters
    • max_sstable_age_days
    • base_time_seconds
    • timestamp_resolution
    • min_threshold
      – Common to all compaction strategies
    • max_threshold
      – Common to all compaction strategies
  • 11. DTCS In Pictures
  • 12. DTCS Benefits: In Theory…
    • You can stop data compacting at a point you choose!
      – max_sstable_age_days
    • You can adjust the window size so that you can quickly expire data when it’s approximately the size you want
      – It’s not immediately intuitive, but you CAN calculate it (min_threshold and base_time_seconds)
    • We know cold data won’t be recompacted, so we can potentially enable cold storage directories with cheaper disk
      – CASSANDRA-8460: patch available, I need to rebase
  • 13. Do people consider DTCS Production Ready?
    • It was added to 2.0 after 2.1 was out. Usually this means:
      – Trivial and low risk, or
      – Experimental and meant for advanced users only
  • 14. Do people consider DTCS Production Ready?
    • It was added to 2.0 after 2.1 was out. Usually this means:
      – Trivial and low risk, or
      – Experimental and meant for advanced users only
      – I challenge you to find documentation on which is true for DTCS
  • 15. Do people consider DTCS Production Ready?
    • It was added to 2.0 after 2.1 was out. Usually this means:
      – Trivial and low risk, or
      – Experimental and meant for advanced users only
      – I challenge you to find documentation on which is true for DTCS
    • Spotify’s intro blog notes that they use it in production
    • I’ve been told by a project committer that they feel DTCS is for advanced users only, but I’ve never seen any public facing messaging that normal users should avoid it
    • It seems so easy, what could possibly go wrong…
  • 16. DTCS Caveats
    • The initial blogs give us some insight about what type of things may not behave as intended
      – “But something that works against the efforts of the strategy is writes with highly out-of-order timestamps”
    • How much is “highly out of order”?
      – “Consider turning off read repairs. Anti-entropy repairs and hinted handoff don’t incur as much additional work for DTCS and may be used like usual.”
  • 17. Out of order timestamps
    • When an sstable gets flushed with an old timestamp in a new table:
      – The max timestamp is used to determine when to stop compacting, but
      – The min timestamp is used to determine which other files will be compacted with this sstable
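The min/max split on slide 17 can be sketched in a few lines. This is a simplified model of tiered windows, not Cassandra's code: the names `Target`, `window_for` and `is_past_cutoff` are hypothetical, timestamps are plain integers in whatever unit `base_time` uses, and the tiering rule (windows grow by a factor of `base` whenever a window index is divisible by `base`) is an assumption about how the tiers line up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    size: int          # window length, in timestamp units
    div_position: int  # window index: the window covers timestamps with ts // size == div_position

    def contains(self, ts):
        return ts // self.size == self.div_position

    def next_target(self, base):
        # Step back to the previous window; widen it by `base` at tier boundaries.
        if self.div_position % base > 0:
            return Target(self.size, self.div_position - 1)
        return Target(self.size * base, self.div_position // base - 1)


def window_for(min_ts, now, base_time, base=4):
    """Find the window an sstable lands in, keyed by its MIN timestamp."""
    if min_ts > now:
        raise ValueError("timestamp is in the future")
    target = Target(base_time, now // base_time)
    while not target.contains(min_ts):
        target = target.next_target(base)
    return target


def is_past_cutoff(max_ts, now, max_age):
    """An sstable stops compacting only when its MAX timestamp is old enough."""
    return max_ts < now - max_age


if __name__ == "__main__":
    # A fresh sstable that also carries one old (e.g. read-repaired) cell.
    print(window_for(min_ts=10, now=100, base_time=1))
    print(is_past_cutoff(max_ts=100, now=100, max_age=30))
```

With these inputs the new file is bucketed with the oldest, widest window because of its min timestamp, yet it is not past the cutoff because of its max timestamp, which is the combination the slide warns about.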
  • 18. Out of order timestamps
  • 19. Out of order timestamps
  • 20. Out of order timestamps
    • Windows are tiered, and they get bigger and bigger
    • With default settings and 1 year of data, the largest window covers 180 days
      – This means even if most of the file is past max_sstable_age_days, you can still end up compacting with a brand new sstable with read repaired data
    • “DTCS never stops compacting”
      – Read repairs pull old data into new windows triggering recompaction
  • 21. Out of order timestamps
    • Windows are tiered, and they get bigger and bigger
    • With default settings and 1 year of data, the largest window covers 180 days
      – This means even if most of the file is past max_sstable_age_days, you can still end up compacting with a brand new sstable with read repaired data
    • “DTCS never stops compacting”
      – Read repairs pull old data into new windows triggering recompaction
      – Does that mean we better run repair?
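The growth of the tiered windows can be checked with simple arithmetic. The sketch below assumes the defaults are base_time_seconds = 3600 and min_threshold = 4 (the slides only say "default settings"), and that each tier is min_threshold times wider than the previous one. The function name `window_sizes` is hypothetical.

```python
def window_sizes(base_time_seconds=3600, min_threshold=4, horizon_days=365):
    """List the tiered window lengths (in seconds) that fit inside the horizon."""
    sizes = []
    size = base_time_seconds
    while size <= horizon_days * 86400:
        sizes.append(size)
        size *= min_threshold  # each tier is min_threshold times wider
    return sizes


if __name__ == "__main__":
    for seconds in window_sizes():
        print(f"{seconds / 86400:.2f} days")
```

Under these assumptions the widest window that fits in a year is 3600 × 4^6 seconds, about 171 days, which is in line with the roughly 180 days quoted on the slide.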
  • 22. Small SSTables from Repairs (and other streaming operations)
    • “If an SSTable contains timestamps that don’t match the time when it was actually written to disk, it violates the size-to-age correspondence that DTCS tries to maintain.”
    • The suggestions on the Spotify and DataStax blogs say to run repair more often than max_sstable_age_days, but that isn’t the only cause of small sstables
      – Bootstrap
      – Decommission
      – Bulk Loader
  • 23. Real Pain: If you can’t expand your cluster, what’s the point?
    • [Chart: SSTable Count Per Node]
  • 24. Real Pain: If you can’t expand your cluster, what’s the point?
    • Damn you, vnodes!
  • 25. Well…
  • 26. Small SSTables Shouldn’t Be Ignored
    • If the small sstables are beyond max_sstable_age_days, they won’t be compacted
      – After all, that’s the point of max_sstable_age_days, right?
    • If you raise max_sstable_age_days, the ever-growing DTCS tiered windows will cause existing sstables to merge and get much larger, negating one of the benefits of DTCS
    • If you don’t raise max_sstable_age_days, you have to deal with performance implications of ten thousand sstables
      – Reduced somewhat by CASSANDRA-9882
      – Before #9882, too many sstables could block flushing for a long time
  • 27. Embarrassing Admission
    • Our early bulk loading plan and bootstrapping procedure acknowledged that sstables will be abandoned beyond max_sstable_age_days
    • We have python scripts that check the timestamps, and manually submit compactions through JMX forceUserDefinedCompaction()
  • 28. Really Embarrassing Admission
    • Our early bulk loading plan and bootstrapping procedure acknowledged that sstables will be abandoned beyond max_sstable_age_days
    • We have python scripts that check the timestamps, and manually submit compactions through JMX forceUserDefinedCompaction()
    • Yes, really.
  • 29. Really Embarrassing Admission
    • Our early bulk loading plan and bootstrapping procedure acknowledged and accepted that sstables will be abandoned beyond max_sstable_age_days
    • We have python scripts that check the timestamps, and manually submit compactions through JMX forceUserDefinedCompaction()
    • Yes, really.
    • Does it actually scale?
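The slides do not show those scripts, so the following is only a guess at one plausible shape for the selection step: pick sstables whose max timestamp is already beyond the cutoff, group them by time window, and build one comma-separated file list per group to hand to forceUserDefinedCompaction(). The function name, the `(data_file, min_ts, max_ts)` input tuples, the choice to group by max timestamp, and the file names are all hypothetical; how the timestamps are read from disk and how the JMX call is made are left out.

```python
from collections import defaultdict


def plan_user_defined_compactions(sstables, now, max_age_seconds,
                                  window_seconds, min_files=2):
    """Group abandoned sstables by window and return one comma-separated
    file list per window that holds at least `min_files` files.

    sstables: iterable of (data_file, min_ts, max_ts), timestamps in seconds.
    """
    cutoff = now - max_age_seconds
    by_window = defaultdict(list)
    for data_file, _min_ts, max_ts in sstables:
        if max_ts < cutoff:  # DTCS would no longer touch this file
            by_window[max_ts // window_seconds].append(data_file)
    return [
        ",".join(sorted(files))
        for _window, files in sorted(by_window.items())
        if len(files) >= min_files
    ]


if __name__ == "__main__":
    example = [
        ("a-1-Data.db", 0, 100),
        ("a-2-Data.db", 50, 150),
        ("a-3-Data.db", 86400, 86500),
        ("a-4-Data.db", 900000, 1000000),
    ]
    for file_list in plan_user_defined_compactions(
            example, now=1000000, max_age_seconds=500000, window_seconds=86400):
        print(file_list)
```

Windows holding a single file are skipped because compacting one file with nothing else would not reduce the sstable count.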
  • 30. When should you use DTCS?
    • You TTL ALL of your data and writes come in order
    • Fixed-size cluster and no plans for bulk loading, or rarely changing cluster size and not using vnodes
      – If you plan on growing, you’d better have a plan for small sstables
      – If you do need to add/remove nodes, vnodes will cause far more small sstables than single-token-per-node
    • Extra space available for compaction
      – You can’t rely on theoretical table sizes calculated with max_sstable_age_days, because read repair, hints, etc., can force those files to span much larger time ranges than you expect
  • 31. Being Honest
  • 32. What if?
    • Do we really need max_sstable_age_days?
      – The conventional logic is to use it to denote cold data, but we use it to force window sizes
      – If we give up tiering, and stick with fixed-size windows, do we need max_sstable_age_days?
    • Without tiering, can we swap base_time_seconds for a more intuitive configuration option?
  • 33. TimeWindowCompactionStrategy
    • Designed to be simple and efficient
      – Group sstables into logical buckets
      – STCS within each time window
      – No more rolling re-compaction
      – No more streaming leftovers
      – No more confusing options, just Window Size + Window Unit
        • “12 Hours”, “3 Days”, “6 Minutes”
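The bucketing idea on this slide can be sketched roughly as follows. This is an illustration under stated assumptions, not the TWCS source: it assumes each sstable is assigned to a window by its maximum write timestamp in microseconds, that windows are aligned to the Unix epoch, and the names `window_start` and `bucket_sstables` are hypothetical.

```python
from collections import defaultdict

# Seconds per supported window unit (illustrative subset).
UNIT_SECONDS = {"MINUTES": 60, "HOURS": 3600, "DAYS": 86400}


def window_start(max_timestamp_us, window_size, window_unit):
    """Round a microsecond timestamp down to the start of its window."""
    window_us = window_size * UNIT_SECONDS[window_unit] * 1_000_000
    return max_timestamp_us - (max_timestamp_us % window_us)


def bucket_sstables(sstables, window_size=1, window_unit="DAYS"):
    """Group (name, max_timestamp_us) pairs into fixed-size time windows.

    Returns {window_start_us: [sstable names]}. Within each bucket a
    size-tiered pass would then pick compaction candidates (not shown).
    """
    buckets = defaultdict(list)
    for name, max_ts in sstables:
        buckets[window_start(max_ts, window_size, window_unit)].append(name)
    return dict(buckets)
```

Because the window boundaries are fixed, an sstable never moves to a different bucket as it ages, which is what removes the rolling re-compaction seen with tiered windows.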
  • 34. TimeWindowCompactionStrategy
    • Submitted to Apache Cassandra as CASSANDRA-9666
    • For now, we use it at CrowdStrike to clean up after streaming:
      – echo "set -b org.apache.cassandra.db:columnfamily=table,keyspace=keyspace,type=ColumnFamilies CompactionStrategyClass org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy" | java -jar jmxterm.jar -l $IP:$PORT
      – It’s no accident that the TWCS defaults use 1 day windows with microsecond timestamp resolution: that matches our sstable needs, but we also think it’s a good default
    • Patches (and Tests) Available for 2.1, 2.2, 3.0
  • 35. TimeWindowCompactionStrategy
    • No more continuous compaction
    • No more tiny streaming leftovers
    • No more confusing options
      – Just Window Size, Window Unit
      – “12 Hours”, “3 Days”, “6 Minutes”
    • Work is ongoing for both DTCS and TWCS
      – CASSANDRA-9645 to make DTCS easier to use
      – CASSANDRA-10276 to make DTCS do STCS within each window (patch available)
      – CASSANDRA-10280 to make DTCS work well with old data
  • 36. TimeWindowCompactionStrategy
    • There’s no guarantee that TWCS will make it into the project
      – TWCS is certainly easier to reason about, but DTCS was there first and is already deployed by real users
      – Anecdotal evidence and preliminary benchmarks suggest TWCS comes out ahead based on the current state of both strategies (at the time of these slides)
      – Formal benchmarking is needed
      – DTCS probably wins for reads/SELECTs in SOME data models
    • Even if TWCS doesn’t make it in, the source is available now (see: CASSANDRA-9666)
      – It’s likely we’ll continue to maintain it, even if it’s not accepted upstream, so pull requests are welcome
  • 37. Q&A
    • Talk to me about Cassandra or DTCS on Twitter: @jjirsa
    • Try to stop me from talking about DTCS on IRC: #cassandra
    • CrowdStrike is awesome and hiring
      – www.crowdstrike.com/careers/
    • Jim Plush and Dennis Opacki, tomorrow morning
      – “1 Million Writes Per Second on 60 Nodes with Cassandra and EBS”