SlideShare uma empresa Scribd logo
1 de 16
#Cassandra13
Real World, Real-Time
Data Modeling
(for analytics apps)
Tim Moreton
Founder and CTO, Acunu
#Cassandra13
Virtual nodes CQL Support
WE C*
#Cassandra13
•e.g Click stream, telemetry, logs
•100x more writes than reads
•Almost all reads are to results
•Almost no writes are ‘updates’
•Really not going to fit in RAM
Real-time analytics
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
•e.g User profiles
•Create, Read, Update, Delete
•Probably mostly reads
•Probably wants atomicity
•Probably fits in RAM
Session storage
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
What folk use C* for
#Cassandra13
Real-time analytics
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
Session storage
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
02:44:02 241.24.41 0.0.1 GET /index.html
What folk use C* for
S WP HA ACIDS WP HA ACID
#Cassandra13
Real-time analytics
What folk use C* for
Session storage
#Cassandra13
Example use case
{
time: 13:50:11,
latitude: 12.5,
longitude: -43.4,
duration: 24,
device_type: ..
}
Call detail records
tens thousands/sec
Real-time dashboards
#Cassandra13
C* Data Modeling 101
• Denormalise: Writes (and disk) are
cheap, reads are expensive:
insert data in every arrangement that
you need to read it
• Items you’ll access together, and want
sorted: put in the same row
• Sets of items you’re likely to access
separately: keep in separate rows
• Atomic counters are the building block
of Cassandra real-time analytics apps
row2
row3
row1
One event
update
One query read
#Cassandra13
#1: Hierarchies
13:00 ... :01→45 :02→62 :03→87
<day> ... :12→2930 :13→3520 :14→3034
13:01 ... :10→3 :11→4 :12→2
14:00
13:02
......
Counting
occurrences
by day, hour,
min, sec
One row for each
value at each level in
the hierarchy
Columns encode sub-components for each level
#Cassandra13
#1: Hierarchies
{
time: 13:02:11,
....
}
13:00 ... :01→45 :02→62 :03→87
<day> ... :12→2930 :13→3520 :14→3034
13:01 ... :10→3 :11→4 :12→2
14:00
13:02
......
11:59
-> 13:02
Counting
occurrences
by day, hour,
min, sec
#Cassandra13
#2: Filtering
{
time: 13:50:11,
device_type : xx,
}
13:00 ... :01→45 :02→62 :03→87
xx
yy
<day>
xx
yy
13:01
xx
yy
14:00
13:02
xx
yy
......
Adding
‘WHERE’s
To filter on a field,
make sure it is in the
partition key
#Cassandra13
#3: Grouping
{
time: 13:50:11,
device_type : xx,
}
Adding
‘GROUP BY’
13:00 ... :01, xx→45 :01, yy→3 :02, xx→7
<day> ... :12, xx→1012 :12,yy→542 :13,xx→228
14:00
......
#Cassandra13
#4: Drilldown
13:00 ... :01, e3→- :01, e4→- :02, e5→-
<day> ... :12, e1→- :12,e2→- :13,e3→-
14:00
......
Going from
counts to the
constituent events
{
_id: e3,
time: 13:01:11,
device_type : xx,
}
e3 time → 13:01:11 device_type → xx ...
Use an identifier in the column key and store
the event in a different ColumnFamily
#Cassandra13
Put it together...
Source: http://paintcutpaste.com/pollock-splatter-painting/
#Cassandra13
Schema agility
Source: http://thoughtstream-distantechoes.blogspot.com/2011/06/13062011_13.html
#Cassandra13
API
event
stream
event
store
roll-up
cubes
Ingest
Processing
dashboard queries programatic interface
API
event
stream
event
store
roll-up
cubes
Ingest
Processing
dashboard queries programatic interface
Cassandra stores raw events, aggregates, data model definition
Acunu Analytics maps events and SQL-like queries into C* ops
API
event
stream
event
store
roll-up
cubes
Ingest
Processing
dashboard queries programatic interfacePROCESSING AT INGEST
JSON, CSV, log ingest
via RESTful HTTP API,
Flume, Storm, AMQP
Storm, MQ HTTP
Acunu Dashboards provides rich, real-time,
embeddable visualizations
SELECT AVG(r)
FROM metrics
GROUP BY host;
AQL Alerting
!
Cubes
MILLISECOND QUERIES
API
event
stream
event
store
roll-up
cubes
Ingest
Processing
dashboard queries programatic interface
API for rich queries,
threshold alerting
Backfill historic results
for new cubes to enable
agile schema changes
#Cassandra13 Apache,Apache Cassandra, Cassandra, Flume, and the eye logos
are trademarks of the Apache Software Foundation.
@timmoreton
@acunu
Thanks!

Mais conteúdo relacionado

Mais de DataStax Academy

Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache CassandraIntroduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache CassandraDataStax Academy
 
Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsDataStax Academy
 
Cassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingCassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingDataStax Academy
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackDataStax Academy
 
Data Modeling for Apache Cassandra
Data Modeling for Apache CassandraData Modeling for Apache Cassandra
Data Modeling for Apache CassandraDataStax Academy
 
Production Ready Cassandra
Production Ready CassandraProduction Ready Cassandra
Production Ready CassandraDataStax Academy
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonDataStax Academy
 
Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1DataStax Academy
 
Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2DataStax Academy
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First ClusterDataStax Academy
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with DseDataStax Academy
 
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraDataStax Academy
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseDataStax Academy
 
Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraDataStax Academy
 
Apache Cassandra and Drivers
Apache Cassandra and DriversApache Cassandra and Drivers
Apache Cassandra and DriversDataStax Academy
 
Getting Started with Graph Databases
Getting Started with Graph DatabasesGetting Started with Graph Databases
Getting Started with Graph DatabasesDataStax Academy
 

Mais de DataStax Academy (20)

Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache CassandraIntroduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
 
Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart Labs
 
Cassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingCassandra 3.0 Data Modeling
Cassandra 3.0 Data Modeling
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stack
 
Data Modeling for Apache Cassandra
Data Modeling for Apache CassandraData Modeling for Apache Cassandra
Data Modeling for Apache Cassandra
 
Coursera Cassandra Driver
Coursera Cassandra DriverCoursera Cassandra Driver
Coursera Cassandra Driver
 
Production Ready Cassandra
Production Ready CassandraProduction Ready Cassandra
Production Ready Cassandra
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
 
Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1
 
Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First Cluster
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with Dse
 
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache Cassandra
 
Cassandra Core Concepts
Cassandra Core ConceptsCassandra Core Concepts
Cassandra Core Concepts
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax Enterprise
 
Bad Habits Die Hard
Bad Habits Die Hard Bad Habits Die Hard
Bad Habits Die Hard
 
Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache Cassandra
 
Advanced Cassandra
Advanced CassandraAdvanced Cassandra
Advanced Cassandra
 
Apache Cassandra and Drivers
Apache Cassandra and DriversApache Cassandra and Drivers
Apache Cassandra and Drivers
 
Getting Started with Graph Databases
Getting Started with Graph DatabasesGetting Started with Graph Databases
Getting Started with Graph Databases
 

Último

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKJago de Vreede
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 

Último (20)

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 

C* Summit 2013: Real World, Real Time Data Modeling by Tim Moreton

  • 1. #Cassandra13 Real World, Real-Time Data Modeling (for analytics apps) Tim Moreton Founder and CTO, Acunu
  • 3. #Cassandra13 •e.g Click stream, telemetry, logs •100x more writes than reads •Almost all reads are to results •Almost no writes are ‘updates’ •Really not going to fit in RAM Real-time analytics 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html •e.g User profiles •Create, Read, Update, Delete •Probably mostly reads •Probably wants atomicity •Probably fits in RAM Session storage 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html What folk use C* for
  • 4. #Cassandra13 Real-time analytics 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html Session storage 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html 02:44:02 241.24.41 0.0.1 GET /index.html What folk use C* for S WP HA ACIDS WP HA ACID
  • 5. #Cassandra13 Real-time analytics What folk use C* for Session storage
  • 6. #Cassandra13 Example use case { time: 13:50:11, latitude: 12.5, longitude: -43.4, duration: 24, device_type: .. } Call detail records tens thousands/sec Real-time dashboards
  • 7. #Cassandra13 C* Data Modeling 101 • Denormalise: Writes (and disk) are cheap, reads are expensive: insert data in every arrangement that you need to read it • Items you’ll access together, and want sorted: put in the same row • Sets of items you’re likely to access separately: keep in separate rows • Atomic counters are the building block of Cassandra real-time analytics apps row2 row3 row1 One event update One query read
  • 8. #Cassandra13 #1: Hierarchies 13:00 ... :01→45 :02→62 :03→87 <day> ... :12→2930 :13→3520 :14→3034 13:01 ... :10→3 :11→4 :12→2 14:00 13:02 ...... Counting occurrences by day, hour, min, sec One row for each value at each level in the hierarchy Columns encode sub-components for each level
  • 9. #Cassandra13 #1: Hierarchies { time: 13:02:11, .... } 13:00 ... :01→45 :02→62 :03→87 <day> ... :12→2930 :13→3520 :14→3034 13:01 ... :10→3 :11→4 :12→2 14:00 13:02 ...... 11:59 -> 13:02 Counting occurrences by day, hour, min, sec
  • 10. #Cassandra13 #2: Filtering { time: 13:50:11, device_type : xx, } 13:00 ... :01→45 :02→62 :03→87 xx yy <day> xx yy 13:01 xx yy 14:00 13:02 xx yy ...... Adding ‘WHERE’s To filter on a field, make sure it is in the partition key
  • 11. #Cassandra13 #3: Grouping { time: 13:50:11, device_type : xx, } Adding ‘GROUP BY’ 13:00 ... :01, xx→45 :01, yy→3 :02, xx→7 <day> ... :12, xx→1012 :12,yy→542 :13,xx→228 14:00 ......
  • 12. #Cassandra13 #4: Drilldown 13:00 ... :01, e3→- :01, e4→- :02, e5→- <day> ... :12, e1→- :12,e2→- :13,e3→- 14:00 ...... Going from counts to the constituent events { _id: e3, time: 13:01:11, device_type : xx, } e3 time → 13:01:11 device_type → xx ... Use an identifier in the column key and store the event in a different ColumnFamily
  • 13. #Cassandra13 Put it together... Source: http://paintcutpaste.com/pollock-splatter-painting/
  • 15. #Cassandra13 API event stream event store roll-up cubes Ingest Processing dashboard queries programatic interface API event stream event store roll-up cubes Ingest Processing dashboard queries programatic interface Cassandra stores raw events, aggregates, data model definition Acunu Analytics maps events and SQL-like queries into C* ops API event stream event store roll-up cubes Ingest Processing dashboard queries programatic interfacePROCESSING AT INGEST JSON, CSV, log ingest via RESTful HTTP API, Flume, Storm, AMQP Storm, MQ HTTP Acunu Dashboards provides rich, real-time, embeddable visualizations SELECT AVG(r) FROM metrics GROUP BY host; AQL Alerting ! Cubes MILLISECOND QUERIES API event stream event store roll-up cubes Ingest Processing dashboard queries programatic interface API for rich queries, threshold alerting Backfill historic results for new cubes to enable agile schema changes
  • 16. #Cassandra13 Apache,Apache Cassandra, Cassandra, Flume, and the eye logos are trademarks of the Apache Software Foundation. @timmoreton @acunu Thanks!