SlideShare a Scribd company logo
1 of 17
Download to read offline
Time Series with Apache Cassandra
Patrick McFadin

Chief Evangelist
@PatrickMcFadin
©2013 DataStax Confidential. Do not distribute without consent.

1
Quick intro to Cassandra
• Shared nothing
• Masterless peer-to-peer
• Based on Dynamo
Scaling
• Add nodes to scale
• Millions Ops/s

THROUGHPUT OPS/SEC)

Cassandra

HBase

Redis

MySQL
Uptime
• Built to replicate
• Resilient to failure
• Always on

NONE
Easy to use
• CQL is a familiar syntax
• Friendly to programmers
• Paxos for locking

CREATE TABLE users (!
username varchar,!
firstname varchar,!
lastname varchar,!
email list<varchar>,!
password varchar,!
created_date timestamp,!
PRIMARY KEY (username)!
);

INSERT INTO users (username, firstname, lastname, !
email, password, created_date)!
VALUES ('pmcfadin','Patrick','McFadin',!
['patrick@datastax.com'],'ba27e03fd95e507daf2937c937d499ab',!
'2011-06-20 13:50:00');!

INSERT INTO users (username, firstname, !
lastname, email, password, created_date)!
VALUES ('pmcfadin','Patrick','McFadin',!
['patrick@datastax.com'],!
'ba27e03fd95e507daf2937c937d499ab',!
'2011-06-20 13:50:00')!
IF NOT EXISTS;
Time series in production
• It’s all about “What’s happening”
• Data is the new currency

“Sirca, a non-profit university consortium based in Sydney, is the world’s biggest broker of
financial data, ingesting into its database 2million pieces of information a second from every
major trading exchange.”*
* http://www.theage.com.au/it-pro/business-it/help-poverty-theres-an-app-for-that-20140120-hv948.html
Why Cassandra for Time Series
Scales
Resilient
Good data model
Efficient Storage Model

What about that?
Data Model
CREATE TABLE temperature (
weatherstation_id text,
event_time timestamp,
temperature text,
PRIMARY KEY (weatherstation_id,event_time)
);

• Weather Station Id and Time
are unique
• Store as many as needed

INSERT INTO temperature(weatherstation_id,event_time,temperature)
VALUES ('1234ABCD','2013-04-03 07:01:00','72F');
!

INSERT INTO temperature(weatherstation_id,event_time,temperature)
VALUES ('1234ABCD','2013-04-03 07:02:00','73F');
!

INSERT INTO temperature(weatherstation_id,event_time,temperature)
VALUES ('1234ABCD','2013-04-03 07:03:00','73F');
!

INSERT INTO temperature(weatherstation_id,event_time,temperature)
VALUES ('1234ABCD','2013-04-03 07:04:00','74F');
Storage Model - Logical View
SELECT weatherstation_id,event_time,temperature
FROM temperature
WHERE weatherstation_id='1234ABCD';

weatherstation_id

event_time

temperature

2013-04-03 07:01:00

1234ABCD

72F
2013-04-03 07:02:00

1234ABCD

73F
2013-04-03 07:03:00

1234ABCD

73F
2013-04-03 07:04:00

1234ABCD

74F
Storage Model - Disk Layout
SELECT weatherstation_id,event_time,temperature
FROM temperature
WHERE weatherstation_id='1234ABCD';

2013-04-03 07:01:00

1234ABCD

72F

2013-04-03 07:02:00

73F

2013-04-03 07:03:00

2013-04-03 07:04:00

73F

Merged, Sorted and Stored Sequentially

74F

2013-04-03 07:05:00
!

2013-04-03 07:06:00
!

74F

75F

!

!
Query patterns
SELECT temperature
FROM event_time,temperature
WHERE weatherstation_id='1234ABCD'
AND event_time > '2013-04-03 07:01:00'
AND event_time < '2013-04-03 07:04:00';

• Range queries
• “Slice” operation on disk

Single seek on disk
2013-04-03 07:01:00

1234ABCD

72F

2013-04-03 07:02:00

73F

2013-04-03 07:03:00

73F

2013-04-03 07:04:00

74F

2013-04-03 07:05:00
!

2013-04-03 07:06:00
!

74F

75F

!

!
Query patterns
SELECT temperature
FROM event_time,temperature
WHERE weatherstation_id='1234ABCD'
AND event_time > '2013-04-03 07:01:00'
AND event_time < '2013-04-03 07:04:00';
weatherstation_id

event_time

• Range queries
• “Slice” operation on disk

temperature

2013-04-03 07:01:00

1234ABCD

72F

Sorted by event_time

2013-04-03 07:02:00

1234ABCD

73F
2013-04-03 07:03:00

1234ABCD

73F
2013-04-03 07:04:00

1234ABCD

74F

Programmers like this
Ingestion models
• Apache Kafka
• Apache Flume
• Storm
• Custom Applications

Apache Kafka

Your totally!
killer!
application
Dealing with data at speed
• 1 million writes per second?
• 1 insert every microsecond
• Collisions?

Your totally!
killer!
application

weatherstation_id='5678EFGH'

• Primary Key determines node
placement
• Random partitioning
• Special data type - TimeUUID

weatherstation_id='1234ABCD'
TimeUUID
Timestamp to Microsecond

+

UUID

=

TimeUUID

• Also known as a Version 1 UUID
• Sortable
• Reversible

04d580b0-9412-11e3-baa8-0800200c9a66

=

Wednesday, February 12, 2014 6:18:06 PM GMT

http://www.famkruithof.net/uuid/uuidgen
Way more information
www.planetcassandra.org
!

• 5 minute interviews
• Use cases
• Free training!
Thank You!

Follow me for more updates all the time: @PatrickMcFadin

More Related Content

What's hot

Cassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingCassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingVassilis Bekiaris
 
Cassandra EU - Data model on fire
Cassandra EU - Data model on fireCassandra EU - Data model on fire
Cassandra EU - Data model on firePatrick McFadin
 
Introduction to data modeling with apache cassandra
Introduction to data modeling with apache cassandraIntroduction to data modeling with apache cassandra
Introduction to data modeling with apache cassandraPatrick McFadin
 
Real data models of silicon valley
Real data models of silicon valleyReal data models of silicon valley
Real data models of silicon valleyPatrick McFadin
 
Introduction to cassandra 2014
Introduction to cassandra 2014Introduction to cassandra 2014
Introduction to cassandra 2014Patrick McFadin
 
Successful Architectures for Fast Data
Successful Architectures for Fast DataSuccessful Architectures for Fast Data
Successful Architectures for Fast DataPatrick McFadin
 
Spark Streaming with Cassandra
Spark Streaming with CassandraSpark Streaming with Cassandra
Spark Streaming with CassandraJacek Lewandowski
 
Escape from Hadoop: Ultra Fast Data Analysis with Spark & Cassandra
Escape from Hadoop: Ultra Fast Data Analysis with Spark & CassandraEscape from Hadoop: Ultra Fast Data Analysis with Spark & Cassandra
Escape from Hadoop: Ultra Fast Data Analysis with Spark & CassandraPiotr Kolaczkowski
 
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax EnterpriseA Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax EnterprisePatrick McFadin
 
How We Used Cassandra/Solr to Build Real-Time Analytics Platform
How We Used Cassandra/Solr to Build Real-Time Analytics PlatformHow We Used Cassandra/Solr to Build Real-Time Analytics Platform
How We Used Cassandra/Solr to Build Real-Time Analytics PlatformDataStax Academy
 
Cassandra 3.0 advanced preview
Cassandra 3.0 advanced previewCassandra 3.0 advanced preview
Cassandra 3.0 advanced previewPatrick McFadin
 
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...DataStax
 
Cassandra Materialized Views
Cassandra Materialized ViewsCassandra Materialized Views
Cassandra Materialized ViewsCarl Yeksigian
 
Using Spark to Load Oracle Data into Cassandra
Using Spark to Load Oracle Data into CassandraUsing Spark to Load Oracle Data into Cassandra
Using Spark to Load Oracle Data into CassandraJim Hatcher
 
Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)
Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)
Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)Spark Summit
 
Cassandra Community Webinar: Apache Cassandra Internals
Cassandra Community Webinar: Apache Cassandra InternalsCassandra Community Webinar: Apache Cassandra Internals
Cassandra Community Webinar: Apache Cassandra InternalsDataStax
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseDataStax Academy
 
Cassandra Fundamentals - C* 2.0
Cassandra Fundamentals - C* 2.0Cassandra Fundamentals - C* 2.0
Cassandra Fundamentals - C* 2.0Russell Spitzer
 
Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...
Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...
Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...StampedeCon
 
Spark and Cassandra 2 Fast 2 Furious
Spark and Cassandra 2 Fast 2 FuriousSpark and Cassandra 2 Fast 2 Furious
Spark and Cassandra 2 Fast 2 FuriousRussell Spitzer
 

What's hot (20)

Cassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingCassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series Modeling
 
Cassandra EU - Data model on fire
Cassandra EU - Data model on fireCassandra EU - Data model on fire
Cassandra EU - Data model on fire
 
Introduction to data modeling with apache cassandra
Introduction to data modeling with apache cassandraIntroduction to data modeling with apache cassandra
Introduction to data modeling with apache cassandra
 
Real data models of silicon valley
Real data models of silicon valleyReal data models of silicon valley
Real data models of silicon valley
 
Introduction to cassandra 2014
Introduction to cassandra 2014Introduction to cassandra 2014
Introduction to cassandra 2014
 
Successful Architectures for Fast Data
Successful Architectures for Fast DataSuccessful Architectures for Fast Data
Successful Architectures for Fast Data
 
Spark Streaming with Cassandra
Spark Streaming with CassandraSpark Streaming with Cassandra
Spark Streaming with Cassandra
 
Escape from Hadoop: Ultra Fast Data Analysis with Spark & Cassandra
Escape from Hadoop: Ultra Fast Data Analysis with Spark & CassandraEscape from Hadoop: Ultra Fast Data Analysis with Spark & Cassandra
Escape from Hadoop: Ultra Fast Data Analysis with Spark & Cassandra
 
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax EnterpriseA Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
 
How We Used Cassandra/Solr to Build Real-Time Analytics Platform
How We Used Cassandra/Solr to Build Real-Time Analytics PlatformHow We Used Cassandra/Solr to Build Real-Time Analytics Platform
How We Used Cassandra/Solr to Build Real-Time Analytics Platform
 
Cassandra 3.0 advanced preview
Cassandra 3.0 advanced previewCassandra 3.0 advanced preview
Cassandra 3.0 advanced preview
 
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...
 
Cassandra Materialized Views
Cassandra Materialized ViewsCassandra Materialized Views
Cassandra Materialized Views
 
Using Spark to Load Oracle Data into Cassandra
Using Spark to Load Oracle Data into CassandraUsing Spark to Load Oracle Data into Cassandra
Using Spark to Load Oracle Data into Cassandra
 
Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)
Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)
Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)
 
Cassandra Community Webinar: Apache Cassandra Internals
Cassandra Community Webinar: Apache Cassandra InternalsCassandra Community Webinar: Apache Cassandra Internals
Cassandra Community Webinar: Apache Cassandra Internals
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax Enterprise
 
Cassandra Fundamentals - C* 2.0
Cassandra Fundamentals - C* 2.0Cassandra Fundamentals - C* 2.0
Cassandra Fundamentals - C* 2.0
 
Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...
Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...
Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...
 
Spark and Cassandra 2 Fast 2 Furious
Spark and Cassandra 2 Fast 2 FuriousSpark and Cassandra 2 Fast 2 Furious
Spark and Cassandra 2 Fast 2 Furious
 

Similar to Time series with apache cassandra strata

Cassandra Community Webinar | Getting Started with Apache Cassandra with Patr...
Cassandra Community Webinar | Getting Started with Apache Cassandra with Patr...Cassandra Community Webinar | Getting Started with Apache Cassandra with Patr...
Cassandra Community Webinar | Getting Started with Apache Cassandra with Patr...DataStax Academy
 
Cassandra Day SV 2014: Beyond Read-Modify-Write with Apache Cassandra
Cassandra Day SV 2014: Beyond Read-Modify-Write with Apache CassandraCassandra Day SV 2014: Beyond Read-Modify-Write with Apache Cassandra
Cassandra Day SV 2014: Beyond Read-Modify-Write with Apache CassandraDataStax Academy
 
Apache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series dataApache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series dataPatrick McFadin
 
Cassandra Summit 2013 Keynote
Cassandra Summit 2013 KeynoteCassandra Summit 2013 Keynote
Cassandra Summit 2013 Keynotejbellis
 
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinC* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinDataStax Academy
 
Managing Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al TobeyManaging Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al TobeyDataStax Academy
 
Tokyo Cassandra Summit 2014: Tunable Consistency by Al Tobey
Tokyo Cassandra Summit 2014: Tunable Consistency by Al TobeyTokyo Cassandra Summit 2014: Tunable Consistency by Al Tobey
Tokyo Cassandra Summit 2014: Tunable Consistency by Al TobeyDataStax Academy
 
Mixing Batch and Real-time: Cassandra with Shark (Cassandra Europe 2013)
Mixing Batch and Real-time: Cassandra with Shark (Cassandra Europe 2013)Mixing Batch and Real-time: Cassandra with Shark (Cassandra Europe 2013)
Mixing Batch and Real-time: Cassandra with Shark (Cassandra Europe 2013)Richard Low
 
C* Summit EU 2013: Mixing Batch and Real-Time: Cassandra with Shark
C* Summit EU 2013: Mixing Batch and Real-Time: Cassandra with Shark C* Summit EU 2013: Mixing Batch and Real-Time: Cassandra with Shark
C* Summit EU 2013: Mixing Batch and Real-Time: Cassandra with Shark DataStax Academy
 
Oracle to Cassandra Core Concepts Guide Pt. 2
Oracle to Cassandra Core Concepts Guide Pt. 2Oracle to Cassandra Core Concepts Guide Pt. 2
Oracle to Cassandra Core Concepts Guide Pt. 2DataStax Academy
 
C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1
C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1
C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1DataStax Academy
 
Cassandra Summit EU 2013
Cassandra Summit EU 2013Cassandra Summit EU 2013
Cassandra Summit EU 2013jbellis
 
Apache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinApache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinChristian Johannsen
 
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetchDataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetchDataStax Academy
 
The world's next top data model
The world's next top data modelThe world's next top data model
The world's next top data modelPatrick McFadin
 
Collaborate 2009 - Migrating a Data Warehouse from Microsoft SQL Server to Or...
Collaborate 2009 - Migrating a Data Warehouse from Microsoft SQL Server to Or...Collaborate 2009 - Migrating a Data Warehouse from Microsoft SQL Server to Or...
Collaborate 2009 - Migrating a Data Warehouse from Microsoft SQL Server to Or...djkucera
 

Similar to Time series with apache cassandra strata (20)

Cassandra Community Webinar | Getting Started with Apache Cassandra with Patr...
Cassandra Community Webinar | Getting Started with Apache Cassandra with Patr...Cassandra Community Webinar | Getting Started with Apache Cassandra with Patr...
Cassandra Community Webinar | Getting Started with Apache Cassandra with Patr...
 
Cassandra Day SV 2014: Beyond Read-Modify-Write with Apache Cassandra
Cassandra Day SV 2014: Beyond Read-Modify-Write with Apache CassandraCassandra Day SV 2014: Beyond Read-Modify-Write with Apache Cassandra
Cassandra Day SV 2014: Beyond Read-Modify-Write with Apache Cassandra
 
Apache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series dataApache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series data
 
Cassandra Summit 2013 Keynote
Cassandra Summit 2013 KeynoteCassandra Summit 2013 Keynote
Cassandra Summit 2013 Keynote
 
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinC* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
 
Managing Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al TobeyManaging Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al Tobey
 
Apache Cassandra and Go
Apache Cassandra and GoApache Cassandra and Go
Apache Cassandra and Go
 
Advanced Cassandra
Advanced CassandraAdvanced Cassandra
Advanced Cassandra
 
Tokyo Cassandra Summit 2014: Tunable Consistency by Al Tobey
Tokyo Cassandra Summit 2014: Tunable Consistency by Al TobeyTokyo Cassandra Summit 2014: Tunable Consistency by Al Tobey
Tokyo Cassandra Summit 2014: Tunable Consistency by Al Tobey
 
Mixing Batch and Real-time: Cassandra with Shark (Cassandra Europe 2013)
Mixing Batch and Real-time: Cassandra with Shark (Cassandra Europe 2013)Mixing Batch and Real-time: Cassandra with Shark (Cassandra Europe 2013)
Mixing Batch and Real-time: Cassandra with Shark (Cassandra Europe 2013)
 
C* Summit EU 2013: Mixing Batch and Real-Time: Cassandra with Shark
C* Summit EU 2013: Mixing Batch and Real-Time: Cassandra with Shark C* Summit EU 2013: Mixing Batch and Real-Time: Cassandra with Shark
C* Summit EU 2013: Mixing Batch and Real-Time: Cassandra with Shark
 
Oracle to Cassandra Core Concepts Guide Pt. 2
Oracle to Cassandra Core Concepts Guide Pt. 2Oracle to Cassandra Core Concepts Guide Pt. 2
Oracle to Cassandra Core Concepts Guide Pt. 2
 
C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1
C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1
C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1
 
Cassandra Summit EU 2013
Cassandra Summit EU 2013Cassandra Summit EU 2013
Cassandra Summit EU 2013
 
Apache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinApache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek Berlin
 
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetchDataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
 
The world's next top data model
The world's next top data modelThe world's next top data model
The world's next top data model
 
Asterisk_MySQL_Cluster_Presentation.pdf
Asterisk_MySQL_Cluster_Presentation.pdfAsterisk_MySQL_Cluster_Presentation.pdf
Asterisk_MySQL_Cluster_Presentation.pdf
 
Collaborate 2009 - Migrating a Data Warehouse from Microsoft SQL Server to Or...
Collaborate 2009 - Migrating a Data Warehouse from Microsoft SQL Server to Or...Collaborate 2009 - Migrating a Data Warehouse from Microsoft SQL Server to Or...
Collaborate 2009 - Migrating a Data Warehouse from Microsoft SQL Server to Or...
 
CouchDB
CouchDBCouchDB
CouchDB
 

More from Patrick McFadin

Open source or proprietary, choose wisely!
Open source or proprietary,  choose wisely!Open source or proprietary,  choose wisely!
Open source or proprietary, choose wisely!Patrick McFadin
 
Help! I want to contribute to an Open Source project but my boss says no.
Help! I want to contribute to an Open Source project but my boss says no.Help! I want to contribute to an Open Source project but my boss says no.
Help! I want to contribute to an Open Source project but my boss says no.Patrick McFadin
 
Analyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and CassandraAnalyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and CassandraPatrick McFadin
 
Advanced data modeling with apache cassandra
Advanced data modeling with apache cassandraAdvanced data modeling with apache cassandra
Advanced data modeling with apache cassandraPatrick McFadin
 
Making money with open source and not losing your soul: A practical guide
Making money with open source and not losing your soul: A practical guideMaking money with open source and not losing your soul: A practical guide
Making money with open source and not losing your soul: A practical guidePatrick McFadin
 
Cassandra 2.0 better, faster, stronger
Cassandra 2.0   better, faster, strongerCassandra 2.0   better, faster, stronger
Cassandra 2.0 better, faster, strongerPatrick McFadin
 
Building Antifragile Applications with Apache Cassandra
Building Antifragile Applications with Apache CassandraBuilding Antifragile Applications with Apache Cassandra
Building Antifragile Applications with Apache CassandraPatrick McFadin
 
The data model is dead, long live the data model
The data model is dead, long live the data modelThe data model is dead, long live the data model
The data model is dead, long live the data modelPatrick McFadin
 
Cassandra Virtual Node talk
Cassandra Virtual Node talkCassandra Virtual Node talk
Cassandra Virtual Node talkPatrick McFadin
 
Toronto jaspersoft meetup
Toronto jaspersoft meetupToronto jaspersoft meetup
Toronto jaspersoft meetupPatrick McFadin
 
Cassandra data modeling talk
Cassandra data modeling talkCassandra data modeling talk
Cassandra data modeling talkPatrick McFadin
 

More from Patrick McFadin (13)

Open source or proprietary, choose wisely!
Open source or proprietary,  choose wisely!Open source or proprietary,  choose wisely!
Open source or proprietary, choose wisely!
 
Help! I want to contribute to an Open Source project but my boss says no.
Help! I want to contribute to an Open Source project but my boss says no.Help! I want to contribute to an Open Source project but my boss says no.
Help! I want to contribute to an Open Source project but my boss says no.
 
Analyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and CassandraAnalyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and Cassandra
 
Advanced data modeling with apache cassandra
Advanced data modeling with apache cassandraAdvanced data modeling with apache cassandra
Advanced data modeling with apache cassandra
 
Making money with open source and not losing your soul: A practical guide
Making money with open source and not losing your soul: A practical guideMaking money with open source and not losing your soul: A practical guide
Making money with open source and not losing your soul: A practical guide
 
Cassandra 2.0 better, faster, stronger
Cassandra 2.0   better, faster, strongerCassandra 2.0   better, faster, stronger
Cassandra 2.0 better, faster, stronger
 
Building Antifragile Applications with Apache Cassandra
Building Antifragile Applications with Apache CassandraBuilding Antifragile Applications with Apache Cassandra
Building Antifragile Applications with Apache Cassandra
 
Cassandra at scale
Cassandra at scaleCassandra at scale
Cassandra at scale
 
Become a super modeler
Become a super modelerBecome a super modeler
Become a super modeler
 
The data model is dead, long live the data model
The data model is dead, long live the data modelThe data model is dead, long live the data model
The data model is dead, long live the data model
 
Cassandra Virtual Node talk
Cassandra Virtual Node talkCassandra Virtual Node talk
Cassandra Virtual Node talk
 
Toronto jaspersoft meetup
Toronto jaspersoft meetupToronto jaspersoft meetup
Toronto jaspersoft meetup
 
Cassandra data modeling talk
Cassandra data modeling talkCassandra data modeling talk
Cassandra data modeling talk
 

Recently uploaded

What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 

Recently uploaded (20)

What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 

Time series with apache cassandra strata

  • 1. Time Series with Apache Cassandra Patrick McFadin
 Chief Evangelist @PatrickMcFadin ©2013 DataStax Confidential. Do not distribute without consent. 1
  • 2. Quick intro to Cassandra • Shared nothing • Masterless peer-to-peer • Based on Dynamo
  • 3. Scaling • Add nodes to scale • Millions Ops/s THROUGHPUT OPS/SEC) Cassandra HBase Redis MySQL
  • 4. Uptime • Built to replicate • Resilient to failure • Always on NONE
  • 5. Easy to use • CQL is a familiar syntax • Friendly to programmers • Paxos for locking CREATE TABLE users (! username varchar,! firstname varchar,! lastname varchar,! email list<varchar>,! password varchar,! created_date timestamp,! PRIMARY KEY (username)! ); INSERT INTO users (username, firstname, lastname, ! email, password, created_date)! VALUES ('pmcfadin','Patrick','McFadin',! ['patrick@datastax.com'],'ba27e03fd95e507daf2937c937d499ab',! '2011-06-20 13:50:00');! INSERT INTO users (username, firstname, ! lastname, email, password, created_date)! VALUES ('pmcfadin','Patrick','McFadin',! ['patrick@datastax.com'],! 'ba27e03fd95e507daf2937c937d499ab',! '2011-06-20 13:50:00')! IF NOT EXISTS;
  • 6. Time series in production • It’s all about “What’s happening” • Data is the new currency “Sirca, a non-profit university consortium based in Sydney, is the world’s biggest broker of financial data, ingesting into its database 2million pieces of information a second from every major trading exchange.”* * http://www.theage.com.au/it-pro/business-it/help-poverty-theres-an-app-for-that-20140120-hv948.html
  • 7. Why Cassandra for Time Series Scales Resilient Good data model Efficient Storage Model What about that?
  • 8. Data Model CREATE TABLE temperature ( weatherstation_id text, event_time timestamp, temperature text, PRIMARY KEY (weatherstation_id,event_time) ); • Weather Station Id and Time are unique • Store as many as needed INSERT INTO temperature(weatherstation_id,event_time,temperature) VALUES ('1234ABCD','2013-04-03 07:01:00','72F'); ! INSERT INTO temperature(weatherstation_id,event_time,temperature) VALUES ('1234ABCD','2013-04-03 07:02:00','73F'); ! INSERT INTO temperature(weatherstation_id,event_time,temperature) VALUES ('1234ABCD','2013-04-03 07:03:00','73F'); ! INSERT INTO temperature(weatherstation_id,event_time,temperature) VALUES ('1234ABCD','2013-04-03 07:04:00','74F');
  • 9. Storage Model - Logical View SELECT weatherstation_id,event_time,temperature FROM temperature WHERE weatherstation_id='1234ABCD'; weatherstation_id event_time temperature 2013-04-03 07:01:00 1234ABCD 72F 2013-04-03 07:02:00 1234ABCD 73F 2013-04-03 07:03:00 1234ABCD 73F 2013-04-03 07:04:00 1234ABCD 74F
  • 10. Storage Model - Disk Layout SELECT weatherstation_id,event_time,temperature FROM temperature WHERE weatherstation_id='1234ABCD'; 2013-04-03 07:01:00 1234ABCD 72F 2013-04-03 07:02:00 73F 2013-04-03 07:03:00 2013-04-03 07:04:00 73F Merged, Sorted and Stored Sequentially 74F 2013-04-03 07:05:00 ! 2013-04-03 07:06:00 ! 74F 75F ! !
  • 11. Query patterns SELECT temperature FROM event_time,temperature WHERE weatherstation_id='1234ABCD' AND event_time > '2013-04-03 07:01:00' AND event_time < '2013-04-03 07:04:00'; • Range queries • “Slice” operation on disk Single seek on disk 2013-04-03 07:01:00 1234ABCD 72F 2013-04-03 07:02:00 73F 2013-04-03 07:03:00 73F 2013-04-03 07:04:00 74F 2013-04-03 07:05:00 ! 2013-04-03 07:06:00 ! 74F 75F ! !
  • 12. Query patterns SELECT temperature FROM event_time,temperature WHERE weatherstation_id='1234ABCD' AND event_time > '2013-04-03 07:01:00' AND event_time < '2013-04-03 07:04:00'; weatherstation_id event_time • Range queries • “Slice” operation on disk temperature 2013-04-03 07:01:00 1234ABCD 72F Sorted by event_time 2013-04-03 07:02:00 1234ABCD 73F 2013-04-03 07:03:00 1234ABCD 73F 2013-04-03 07:04:00 1234ABCD 74F Programmers like this
  • 13. Ingestion models • Apache Kafka • Apache Flume • Storm • Custom Applications Apache Kafka Your totally! killer! application
  • 14. Dealing with data at speed • 1 million writes per second? • 1 insert every microsecond • Collisions? Your totally! killer! application weatherstation_id='5678EFGH' • Primary Key determines node placement • Random partitioning • Special data type - TimeUUID weatherstation_id='1234ABCD'
  • 15. TimeUUID Timestamp to Microsecond + UUID = TimeUUID • Also known as a Version 1 UUID • Sortable • Reversible 04d580b0-9412-11e3-baa8-0800200c9a66 = Wednesday, February 12, 2014 6:18:06 PM GMT http://www.famkruithof.net/uuid/uuidgen
  • 16. Way more information www.planetcassandra.org ! • 5 minute interviews • Use cases • Free training!
  • 17. Thank You! Follow me for more updates all the time: @PatrickMcFadin