#Cassandra13
Real World, Real-Time
Data Modeling
(for analytics apps)
Tim Moreton
Founder and CTO, Acunu
Virtual nodes CQL Support
WE C*
What folk use C* for
Real-time analytics
• e.g. click stream, telemetry, logs
• 100x more writes than reads
• Almost all reads are to results
• Almost no writes are ‘updates’
• Really not going to fit in RAM
02:44:02 241.24.41 0.0.1 GET /index.html
Session storage
• e.g. user profiles
• Create, Read, Update, Delete
• Probably mostly reads
• Probably wants atomicity
• Probably fits in RAM
[Figure: Real-time analytics and Session storage, each labelled S, WP, HA, ACID]
Example use case
{
time: 13:50:11,
latitude: 12.5,
longitude: -43.4,
duration: 24,
device_type: ..
}
Call detail records
tens of thousands/sec
Real-time dashboards
C* Data Modeling 101
• Denormalise: writes (and disk) are cheap, reads are expensive, so insert data in every arrangement in which you need to read it
• Items you’ll access together, and want sorted: put in the same row
• Sets of items you’re likely to access separately: keep in separate rows
• Atomic counters are the building block of Cassandra real-time analytics apps
[Figure: rows row1, row2, row3, with the labels "One event update" and "One query read"]
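The rules above can be sketched in a few lines. This is a minimal illustration that uses plain Python dicts as a stand-in for counter rows, not the Cassandra API; the row-key names (`by_hour:`, `by_device:`, `all_devices`) and both function names are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for counter rows: rows[row_key][column] -> count.
rows = defaultdict(lambda: defaultdict(int))

def record_call(event: dict) -> None:
    """Write path: one event bumps a counter in every arrangement we plan to read."""
    hour, minute, _second = event["time"].split(":")
    rows[f"by_hour:{hour}"][minute] += 1                  # calls per minute within an hour
    rows[f"by_device:{event['device_type']}"][hour] += 1  # calls per hour for one device type
    rows["all_devices"][event["device_type"]] += 1        # calls per device type

def calls_per_minute(hour: str) -> dict:
    """Read path: one query reads exactly one row, with columns in sorted order."""
    return dict(sorted(rows[f"by_hour:{hour}"].items()))

record_call({"time": "13:50:11", "device_type": "xx", "duration": 24})
record_call({"time": "13:50:40", "device_type": "yy", "duration": 7})
record_call({"time": "13:51:02", "device_type": "xx", "duration": 3})
print(calls_per_minute("13"))  # {'50': 2, '51': 1}
```

Each event costs three counter increments, while each dashboard read touches a single row.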
#1: Hierarchies
Counting occurrences by day, hour, min, sec
<day> ... :12→2930 :13→3520 :14→3034
13:00 ... :01→45 :02→62 :03→87
14:00
13:01 ... :10→3 :11→4 :12→2
13:02
......
One row for each value at each level in the hierarchy
Columns encode sub-components for each level
#1: Hierarchies
Counting occurrences by day, hour, min, sec
{
time: 13:02:11,
....
}
11:59 -> 13:02
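One way to picture the hierarchy is the sketch below, again with plain Python dicts standing in for counter rows. The row-key formats, the date, and the helper names are assumptions for illustration, and the range is treated as half-open, so 11:59 -> 13:02 covers 11:59 up to but not including 13:02.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical stand-in for counter rows: rows[row_key][column] -> count.
rows = defaultdict(lambda: defaultdict(int))
DAY, HOUR, MINUTE = "%Y-%m-%d", "%Y-%m-%d %H:00", "%Y-%m-%d %H:%M"

def record(ts: datetime) -> None:
    """One event increments one column in the row for each level."""
    rows[ts.strftime(DAY)][ts.strftime(":%H")] += 1     # day row, one column per hour
    rows[ts.strftime(HOUR)][ts.strftime(":%M")] += 1    # hour row, one column per minute
    rows[ts.strftime(MINUTE)][ts.strftime(":%S")] += 1  # minute row, one column per second

def read(row_key: str, column: str) -> int:
    return rows.get(row_key, {}).get(column, 0)

def count_range(start: datetime, end: datetime) -> int:
    """Count events in [start, end), both aligned to whole minutes.

    Whole hours are answered from the day row; the ragged edges are
    answered minute by minute from the hour rows.
    """
    total, t = 0, start
    while t < end:
        if t.minute == 0 and t + timedelta(hours=1) <= end:
            total += read(t.strftime(DAY), t.strftime(":%H"))
            t += timedelta(hours=1)
        else:
            total += read(t.strftime(HOUR), t.strftime(":%M"))
            t += timedelta(minutes=1)
    return total

day = datetime(2013, 6, 11)  # arbitrary date chosen for the example
for hms in ["11:58:30", "11:59:10", "12:30:00", "13:01:11", "13:02:11"]:
    h, m, s = map(int, hms.split(":"))
    record(day.replace(hour=h, minute=m, second=s))

total = count_range(day.replace(hour=11, minute=59), day.replace(hour=13, minute=2))
print(total)  # 3
```

The query reads four columns (11:59, the whole of hour 12, 13:00 and 13:01) rather than one counter per second in the range.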
#2: Filtering
Adding ‘WHERE’s
{
time: 13:50:11,
device_type : xx,
}
To filter on a field, make sure it is in the partition key
[Figure: the time-bucket rows (<day>, 13:00, 13:01, 13:02, 14:00, ...) with xx and yy entries beneath them; sample row: 13:00 ... :01→45 :02→62 :03→87]
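A minimal sketch of the filtering rule, assuming a row key made of the hour bucket plus the device_type value. The dicts stand in for counter rows, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical stand-in: rows[(time_bucket, device_type)][minute_column] -> count.
rows = defaultdict(lambda: defaultdict(int))

def record(event: dict) -> None:
    hour, minute, _second = event["time"].split(":")
    # The filtered field is part of the row (partition) key.
    rows[(f"{hour}:00", event["device_type"])][f":{minute}"] += 1

def count_where(hour: str, device_type: str) -> int:
    """Count for one hour WHERE device_type = ...: a single-row read."""
    return sum(rows.get((f"{hour}:00", device_type), {}).values())

record({"time": "13:50:11", "device_type": "xx"})
record({"time": "13:51:40", "device_type": "xx"})
record({"time": "13:50:12", "device_type": "yy"})
print(count_where("13", "xx"))  # 2
```

Because the filter value is in the key, the read never scans rows belonging to other device types.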
#3: Grouping
{
time: 13:50:11,
device_type : xx,
}
Adding
‘GROUP BY’
13:00 ... :01, xx→45 :01, yy→3 :02, xx→7
<day> ... :12, xx→1012 :12, yy→542 :13, xx→228
14:00
......
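The grouping layout can be sketched as follows, assuming a composite column key of (minute, device_type) inside each hour row. The dicts stand in for counter rows, and the names are hypothetical.

```python
from collections import defaultdict

# Hypothetical stand-in: rows[hour_row][(minute_column, device_type)] -> count.
rows = defaultdict(lambda: defaultdict(int))

def record(event: dict) -> None:
    hour, minute, _second = event["time"].split(":")
    # The grouped field goes into the column key, so all groups share one row.
    rows[f"{hour}:00"][(f":{minute}", event["device_type"])] += 1

def count_group_by_device(hour: str) -> dict:
    """Count for one hour GROUP BY device_type: one row read, summed per group."""
    totals = defaultdict(int)
    for (_minute, device), n in sorted(rows.get(f"{hour}:00", {}).items()):
        totals[device] += n
    return dict(totals)

record({"time": "13:01:05", "device_type": "xx"})
record({"time": "13:01:30", "device_type": "yy"})
record({"time": "13:02:11", "device_type": "xx"})
print(count_group_by_device("13"))  # {'xx': 2, 'yy': 1}
```

Unlike the filtering layout, every group for a time bucket comes back from one row, which suits a query that wants all groups at once.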
#4: Drilldown
13:00 ... :01, e3→- :01, e4→- :02, e5→-
<day> ... :12, e1→- :12, e2→- :13, e3→-
14:00
......
Going from
counts to the
constituent events
{
_id: e3,
time: 13:01:11,
device_type : xx,
}
e3 time → 13:01:11 device_type → xx ...
Use an identifier in the column key and store
the event in a different ColumnFamily
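A rough sketch of drilldown, assuming an index row whose column keys are (minute, event id) with empty values, plus a separate store for the full events. Both dicts are stand-ins for ColumnFamilies, and the function names are hypothetical.

```python
from collections import defaultdict

index = defaultdict(dict)  # index[hour_row][(minute_column, event_id)] -> None (empty value)
events = {}                # separate "ColumnFamily": event_id -> full event

def record(event: dict) -> None:
    hour, minute, _second = event["time"].split(":")
    events[event["_id"]] = event                             # store the event itself
    index[f"{hour}:00"][(f":{minute}", event["_id"])] = None  # index only the identifier

def drilldown(hour: str, minute: str) -> list:
    """Find the event ids behind one bucket, then fetch each event by id."""
    columns = index.get(f"{hour}:00", {})
    ids = sorted(eid for (m, eid) in columns if m == f":{minute}")
    return [events[eid] for eid in ids]

record({"_id": "e3", "time": "13:01:11", "device_type": "xx"})
record({"_id": "e4", "time": "13:01:45", "device_type": "yy"})
record({"_id": "e5", "time": "13:02:03", "device_type": "xx"})
print([e["_id"] for e in drilldown("13", "01")])  # ['e3', 'e4']
```

The index row stays small because it holds identifiers only; the event bodies are read on demand.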
Put it together...
Source: http://paintcutpaste.com/pollock-splatter-painting/
Schema agility
Source: http://thoughtstream-distantechoes.blogspot.com/2011/06/13062011_13.html
[Architecture diagram labels: API; event stream; event store; roll-up cubes; Ingest; Processing; dashboard; queries; programmatic interface]
Cassandra stores raw events, aggregates, data model definition
Acunu Analytics maps events and SQL-like queries into C* ops
PROCESSING AT INGEST
JSON, CSV, log ingest via RESTful HTTP API, Flume, Storm, AMQP
Storm, MQ HTTP
Acunu Dashboards provides rich, real-time, embeddable visualizations
SELECT AVG(r)
FROM metrics
GROUP BY host;
AQL Alerting
Cubes
MILLISECOND QUERIES
API for rich queries, threshold alerting
Backfill historic results for new cubes to enable agile schema changes
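To make the roll-up and backfill idea concrete, here is a minimal sketch of how a query like SELECT AVG(r) FROM metrics GROUP BY host could be answered from aggregates maintained at ingest. It is not Acunu's implementation: the in-memory `event_store` and `cube`, and every function name, are assumptions for illustration. Cassandra counter columns hold integers, so a real cube would need an integer-valued sum.

```python
from collections import defaultdict

def new_cube():
    # host -> running aggregates; AVG is derived as sum / count at query time.
    return defaultdict(lambda: {"sum": 0.0, "count": 0})

event_store = []   # raw events, kept so that new cubes can be backfilled
cube = new_cube()  # roll-up cube maintained at ingest

def update_cube(target, event: dict) -> None:
    cell = target[event["host"]]
    cell["sum"] += event["r"]
    cell["count"] += 1

def ingest(event: dict) -> None:
    """Processing at ingest: store the raw event and update the roll-up."""
    event_store.append(event)
    update_cube(cube, event)

def avg_by_host(target) -> dict:
    """AVG(r) GROUP BY host, read from the pre-computed aggregates."""
    return {host: c["sum"] / c["count"] for host, c in target.items()}

def backfill():
    """Build a cube defined after the fact by replaying the stored events."""
    fresh = new_cube()
    for event in event_store:
        update_cube(fresh, event)
    return fresh

ingest({"host": "a", "r": 2.0})
ingest({"host": "a", "r": 4.0})
ingest({"host": "b", "r": 10.0})
print(avg_by_host(cube))  # {'a': 3.0, 'b': 10.0}
```

Keeping the raw events is what makes the schema change cheap: a cube added later is filled by replay and then kept current by the same ingest path.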
Apache, Apache Cassandra, Cassandra, Flume, and the eye logos are trademarks of the Apache Software Foundation.
@timmoreton
@acunu
Thanks!

C* Summit 2013: Real World, Real Time Data Modeling by Tim Moreton
