SlideShare uma empresa Scribd logo
1 de 44
Baixar para ler offline
Cassandra 3.0 Advanced Preview
Patrick McFadin @PatrickMcFadin
Chief Evangelist for Apache Cassandra
Intro
1.Developer friendly
2.Pay down a lot of technical debt
3.Push performance even higher
2
Developer Productivity
3
User Defined Functions
• Counter table
• User clicks on a number of stars
• rating_counter = How many clicks
• rating_total = Cumulative amount of stars
4
CREATE TABLE video_rating (

videoid uuid,

rating_counter counter,

rating_total counter,

PRIMARY KEY (videoid)

);
User Defined Functions
5
CREATE TABLE video_rating (

videoid uuid,

rating_counter counter,

rating_total counter,

PRIMARY KEY (videoid)

);
public long getRatingForVideo(UUID videoId) {

BoundStatement bs =
getRatingByVideoPreparedStatement.bind(videoId);



ResultSet rs = session.execute(bs);



Row row = rs.one();



// Get the count and total rating for the video

long total = row.getLong("rating_total");

long count = row.getLong("rating_counter");



// Divide the total by the count and return an average

return (total / count);

}

User Defined Functions
6
CREATE TABLE video_rating (

videoid uuid,

rating_counter counter,

rating_total counter,

PRIMARY KEY (videoid)

);
public long getRatingForVideo(UUID videoId) {

BoundStatement bs =
getRatingByVideoPreparedStatement.bind(videoId);



ResultSet rs = session.execute(bs);



Row row = rs.one();



// Get the count and total rating for the video

long total = row.getLong("rating_total");

long count = row.getLong("rating_counter");



// Divide the total by the count and return an average

return (total / count);

}

Application code?
User Defined Functions
7
CREATE OR REPLACE FUNCTION averageRating ( rating_counter counter, rating_total counter )

RETURNS Float

LANGUAGE java

AS '

return Float.valueOf(rating_total.floatValue() / rating_counter.floatValue());

';
Function Name CQL TypeObject return type
Java Code
User Defined Functions
• Add to your CQL statement!
8
> SELECT averageRating(rating_counter, rating_total) AS avg

FROM video_rating

WHERE videoid = 99051fe9-6a9c-46c2-b949-38ef78858dd0;
videoid | rating_counter | rating_total

--------------------------------------+----------------+--------------

99051fe9-6a9c-46c2-b949-38ef78858dd0 | 3 | 12
avg

-----

4
User Defined Functions - Fine print
• “Pure” functions
• Nothing outside of input parameters
• Return types are only objects. No primitives
• Method signatures on parameter type
9
User Defined Function Language Support
• Java
• JavaScript
10
• Scala
• Groovy
• Jython
• JRuby
Primary Languages
Optional Languages
JSON Support
• Table to store a video
• TYPE to store metadata
11
CREATE TYPE video_metadata (

height int,

width int,

video_bit_rate set<text>,

encoding text

);
CREATE TABLE videos (

videoid uuid,

userid uuid,

name varchar,

description varchar,

location text,

location_type int,

preview_thumbnails map<text,text>,

tags set<varchar>,

metadata set <frozen<video_metadata>>,

added_date timestamp,

PRIMARY KEY (videoid)

);
JSON Support
12
INSERT INTO videos (videoid, name, userid, description, location, location_type,
preview_thumbnails, tags, added_date, metadata)

VALUES (49f64d40-7d89-4890-b910-dbf923563a33,'The World''s Next Top Data Model',
9761d3d7-7fbd-4269-9988-6cfd4e188678, 

'Third in a three part series for Cassandra Data Modeling','http://www.youtube.com/watch?
v=HdJlsOZVGwM',1,

{'YouTube':'http://www.youtube.com/watch?v=HdJlsOZVGwM'},{'cassandra','data
model','examples','instruction'},'2013-06-11 11:00:00',

{{ height: 480, width: 640, encoding: 'MP4', video_bit_rate: {'1000kbs', '400kbs'}}});
Decompose into standard insert
OR!
JSON Support
13
INSERT INTO videos JSON

'{

"videoid":"99051fe9-6a9c-46c2-b949-38ef78858dd0",

"added_date":"2012-06-01 08:00:00.000",

"description":"My cat likes to play the piano! So funny.",

"location":"/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401",

"location_type":1,

"metadata":[

{

"height":480,

"width":640,

"video_bit_rate":[

"1000kbs",

"400kbs"

],

"encoding":"MP4"

}

],

"name":"My funny cat",

"preview_thumbnails":{

"10":"/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401"

},

"tags":[

"cats",

"lol",

"piano"

],

"userid":"d0f60aa8-54a9-4840-b70c-fe562b68842b"

}';
One block of JSON
OR!
JSON Support
14
INSERT INTO videos (videoid, name, userid, description, location, location_type, preview_thumbnails, tags,
added_date, metadata)

VALUES (99051fe9-6a9c-46c2-b949-38ef78858dd0,'My funny cat',d0f60aa8-54a9-4840-b70c-fe562b68842b, 

'My cat likes to play the piano! So funny.','/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401',1,

{'10':'/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401'},{'cats','piano','lol'},'2012-06-01 08:00:00',

fromJson('

[{

"height":480,

"width":640,

"video_bit_rate":[

"1000kbs",

"400kbs"

],

"encoding":"MP4"

}]

')

);
Just a block at a time
Get JSON data
15
[json]

------------------------------------------------------------------

'{

"videoid":"99051fe9-6a9c-46c2-b949-38ef78858dd0",

"added_date":"2012-06-01 08:00:00.000",

"description":"My cat likes to play the piano! So funny.",

"location":"/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401",

"location_type":1,

"metadata":[

{

"height":480,

"width":640,

"video_bit_rate":[

"1000kbs",

"400kbs"

],

"encoding":"MP4"

}

],

"name":"My funny cat",

"preview_thumbnails":{

"10":"/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401"

},

"tags":[

"cats",

"lol",

"piano"

],

"userid":"d0f60aa8-54a9-4840-b70c-fe562b68842b"

}';
SELECT JSON * FROM videos;
16
Indexing
Global Indexes
17
CREATE TABLE videos (

videoid uuid,

userid uuid,

name varchar,

description varchar,

location text,

location_type int,

preview_thumbnails map<text,text>,

tags set<varchar>,

metadata set <frozen<video_metadata>>,

added_date timestamp,

PRIMARY KEY (videoid)

);
CREATE TABLE videos_by_tag (

tag text,

videoid uuid,

added_date timestamp,

name text,

preview_image_location text,

tagged_date timestamp,

PRIMARY KEY (tag, videoid)

);
Application maintained
consistency
Global Indexes
18
CREATE TABLE videos (

videoid uuid,

userid uuid,

name varchar,

description varchar,

location text,

location_type int,

preview_thumbnails map<text,text>,

tags set<varchar>,

metadata set <frozen<video_metadata>>,

added_date timestamp,

PRIMARY KEY (videoid)

);
CREATE GLOBAL INDEX tags_index 

ON videos (tag, videoid) 

INCLUDE (name, added_date, preview_thumbnails)
CREATE TABLE videos_by_tag (

tag text,

videoid uuid,

added_date timestamp,

name text,

preview_image_location text,

tagged_date timestamp,

PRIMARY KEY (tag, videoid)

);
Global Indexes
• Separate Cassandra managed table
• Inserts
• Updates
19
More Indexes!
• Partial Indexes - Postponed until 3.1
• Functional Indexes - using a UDF in an index
20
CREATE INDEX ON user_rating averageRating(rating_counter, rating_total);
21
File System
Hints to Raw Files
• Pre 3.0 hints stored in table
• Create load on entire write path
• …and read path
• …and compaction
22
CREATE TABLE system.hints (

target_id uuid,

hint_id timeuuid,

message_version int,

mutation blob,

PRIMARY KEY (target_id, hint_id, message_version)

) WITH COMPACT STORAGE

AND CLUSTERING ORDER BY (hint_id ASC, message_version ASC);
Hints to Raw Files
• Hints now written to a local file
• Replays direct from disk
• Bulk streamed to endpoints
23
CREATE TABLE system.hints (

target_id uuid,

hint_id timeuuid,

message_version int,

mutation blob,

PRIMARY KEY (target_id, hint_id, message_version)

) WITH COMPACT STORAGE

AND CLUSTERING ORDER BY (hint_id ASC, message_version ASC);
Windows Compatibility - The Problem
• Java file management on Windows is… different
• File delete’s are not possible
• Hard links - Broke
• Snapshots - Broke
• Memory Mapped I/O - Broke
24
Windows Compatibility - 3.0
• Re-tooling of critical file functions
• Extensive use of FILE_SHARE_DELETE from JDK7
• Launch now in PowerShell
• CCM now supports windows
25
Storage Engine Changes
• Now infamous CASSANDRA-8099
• Technical debt from Thrift
• Move from Thrift centric to CQL centric storage
26
Pre 3.0 Storage Engine Format
27
2005:12:1:102005:12:1:92005:12:1:82005:12:1:7
5F22A0BC
Partition Key Clustering Columns
F2B3652CFFB3652D7AB3652C
PRIMARY KEY (userId,added_date,videoId)
A12378E55F5A32
3.0 Format
• Partition header stores column names
• Row stores clustering values
• No duplicated values
28
Partition Key
Column Names
Clustering Values
Column Values
Clustering Values
Column Values
Partition Header Row Row
Clustering Values
Column Values
Row
29
Performance
Commit Log Compression
• Pre 3.0 commit log writes
30
Commit Log Compression
• Pre 3.0 commit log writes
31
Commit Log Compression
• Pre 3.0 commit log writes
32
Commit Log Compression
• Pre 3.0 commit log writes
33
Commit Log Compression
• Pre 3.0 commit log writes
34
Commit Log Compression
• Pre 3.0 commit log writes
35
Commit Log Compression
• Pre 3.0 commit log writes
36
Commit Log Compression
• Segments are compressed by time interval
• Higher throughput under high writes
37
Commit Log Compression
• Segments are compressed by time interval
• Higher throughput under high writes
38
Commit Log Compression
• Segments are compressed by time interval
• Higher throughput under high writes
39
Smaller but significant changes
• Direct buffer decompression of reads
• Avoiding memory allocation on Index Summary search
• Repair concurrency improvements
• Optimal CRC32 implementation at runtime
40
41
Security
Role Based Access Control
• Expands on User based auth in 1.2
• Requires the internal auth to be enabled
42
CREATE ROLE supervisor;



GRANT MODIFY ON user_credentials TO supervisor;
When will it ship?
43
Maybe June
When 8099 is finished, it ships
Thank you!
Questions
Follow me on Twitter for more
@PatrickMcFadin

Mais conteúdo relacionado

Mais procurados

Storing time series data with Apache Cassandra
Storing time series data with Apache CassandraStoring time series data with Apache Cassandra
Storing time series data with Apache CassandraPatrick McFadin
 
Time series with apache cassandra strata
Time series with apache cassandra   strataTime series with apache cassandra   strata
Time series with apache cassandra strataPatrick McFadin
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseDataStax Academy
 
Introduction to data modeling with apache cassandra
Introduction to data modeling with apache cassandraIntroduction to data modeling with apache cassandra
Introduction to data modeling with apache cassandraPatrick McFadin
 
Spark Cassandra Connector: Past, Present and Furure
Spark Cassandra Connector: Past, Present and FurureSpark Cassandra Connector: Past, Present and Furure
Spark Cassandra Connector: Past, Present and FurureDataStax Academy
 
Cassandra 2.0 and timeseries
Cassandra 2.0 and timeseriesCassandra 2.0 and timeseries
Cassandra 2.0 and timeseriesPatrick McFadin
 
Cassandra 3.0 Awesomeness
Cassandra 3.0 AwesomenessCassandra 3.0 Awesomeness
Cassandra 3.0 AwesomenessJon Haddad
 
Cassandra Materialized Views
Cassandra Materialized ViewsCassandra Materialized Views
Cassandra Materialized ViewsCarl Yeksigian
 
Apache Cassandra and Drivers
Apache Cassandra and DriversApache Cassandra and Drivers
Apache Cassandra and DriversDataStax Academy
 
Cassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingCassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingVassilis Bekiaris
 
Cassandra Day Atlanta 2015: Building Your First Application with Apache Cassa...
Cassandra Day Atlanta 2015: Building Your First Application with Apache Cassa...Cassandra Day Atlanta 2015: Building Your First Application with Apache Cassa...
Cassandra Day Atlanta 2015: Building Your First Application with Apache Cassa...DataStax Academy
 
Cassandra 3.0 - JSON at scale - StampedeCon 2015
Cassandra 3.0 - JSON at scale - StampedeCon 2015Cassandra 3.0 - JSON at scale - StampedeCon 2015
Cassandra 3.0 - JSON at scale - StampedeCon 2015StampedeCon
 
Cassandra Summit 2013 Keynote
Cassandra Summit 2013 KeynoteCassandra Summit 2013 Keynote
Cassandra Summit 2013 Keynotejbellis
 
Cassandra nice use cases and worst anti patterns
Cassandra nice use cases and worst anti patternsCassandra nice use cases and worst anti patterns
Cassandra nice use cases and worst anti patternsDuyhai Doan
 
Time series with Apache Cassandra - Long version
Time series with Apache Cassandra - Long versionTime series with Apache Cassandra - Long version
Time series with Apache Cassandra - Long versionPatrick McFadin
 
The world's next top data model
The world's next top data modelThe world's next top data model
The world's next top data modelPatrick McFadin
 

Mais procurados (20)

Storing time series data with Apache Cassandra
Storing time series data with Apache CassandraStoring time series data with Apache Cassandra
Storing time series data with Apache Cassandra
 
Advanced Cassandra
Advanced CassandraAdvanced Cassandra
Advanced Cassandra
 
Cassandra 3.0
Cassandra 3.0Cassandra 3.0
Cassandra 3.0
 
Time series with apache cassandra strata
Time series with apache cassandra   strataTime series with apache cassandra   strata
Time series with apache cassandra strata
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax Enterprise
 
Introduction to data modeling with apache cassandra
Introduction to data modeling with apache cassandraIntroduction to data modeling with apache cassandra
Introduction to data modeling with apache cassandra
 
Spark Cassandra Connector: Past, Present and Furure
Spark Cassandra Connector: Past, Present and FurureSpark Cassandra Connector: Past, Present and Furure
Spark Cassandra Connector: Past, Present and Furure
 
Cassandra 2.0 and timeseries
Cassandra 2.0 and timeseriesCassandra 2.0 and timeseries
Cassandra 2.0 and timeseries
 
Cassandra 3.0 Awesomeness
Cassandra 3.0 AwesomenessCassandra 3.0 Awesomeness
Cassandra 3.0 Awesomeness
 
Cassandra Materialized Views
Cassandra Materialized ViewsCassandra Materialized Views
Cassandra Materialized Views
 
Apache Cassandra and Drivers
Apache Cassandra and DriversApache Cassandra and Drivers
Apache Cassandra and Drivers
 
Cassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingCassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series Modeling
 
CQL3 in depth
CQL3 in depthCQL3 in depth
CQL3 in depth
 
Cassandra Day Atlanta 2015: Building Your First Application with Apache Cassa...
Cassandra Day Atlanta 2015: Building Your First Application with Apache Cassa...Cassandra Day Atlanta 2015: Building Your First Application with Apache Cassa...
Cassandra Day Atlanta 2015: Building Your First Application with Apache Cassa...
 
Cassandra 3.0 - JSON at scale - StampedeCon 2015
Cassandra 3.0 - JSON at scale - StampedeCon 2015Cassandra 3.0 - JSON at scale - StampedeCon 2015
Cassandra 3.0 - JSON at scale - StampedeCon 2015
 
Apache Cassandra & Data Modeling
Apache Cassandra & Data ModelingApache Cassandra & Data Modeling
Apache Cassandra & Data Modeling
 
Cassandra Summit 2013 Keynote
Cassandra Summit 2013 KeynoteCassandra Summit 2013 Keynote
Cassandra Summit 2013 Keynote
 
Cassandra nice use cases and worst anti patterns
Cassandra nice use cases and worst anti patternsCassandra nice use cases and worst anti patterns
Cassandra nice use cases and worst anti patterns
 
Time series with Apache Cassandra - Long version
Time series with Apache Cassandra - Long versionTime series with Apache Cassandra - Long version
Time series with Apache Cassandra - Long version
 
The world's next top data model
The world's next top data modelThe world's next top data model
The world's next top data model
 

Semelhante a Cassandra 3.0 advanced preview

Hey Relational Developer, Let's Go Crazy (Patrick McFadin, DataStax) | Cassan...
Hey Relational Developer, Let's Go Crazy (Patrick McFadin, DataStax) | Cassan...Hey Relational Developer, Let's Go Crazy (Patrick McFadin, DataStax) | Cassan...
Hey Relational Developer, Let's Go Crazy (Patrick McFadin, DataStax) | Cassan...DataStax
 
Building your First Application with Cassandra
Building your First Application with CassandraBuilding your First Application with Cassandra
Building your First Application with CassandraLuke Tillman
 
Cassandra Summit 2014: Highly Scalable Web Application in the Cloud with Cass...
Cassandra Summit 2014: Highly Scalable Web Application in the Cloud with Cass...Cassandra Summit 2014: Highly Scalable Web Application in the Cloud with Cass...
Cassandra Summit 2014: Highly Scalable Web Application in the Cloud with Cass...DataStax Academy
 
Cassandra Day Chicago 2015: Building Your First Application with Apache Cassa...
Cassandra Day Chicago 2015: Building Your First Application with Apache Cassa...Cassandra Day Chicago 2015: Building Your First Application with Apache Cassa...
Cassandra Day Chicago 2015: Building Your First Application with Apache Cassa...DataStax Academy
 
Cassandra Day London 2015: Building Your First Application in Apache Cassandra
Cassandra Day London 2015: Building Your First Application in Apache CassandraCassandra Day London 2015: Building Your First Application in Apache Cassandra
Cassandra Day London 2015: Building Your First Application in Apache CassandraDataStax Academy
 
Sprint 177
Sprint 177Sprint 177
Sprint 177ManageIQ
 
Creating a Python Microservice Tier in Four Sprints with Cassandra, Kafka, an...
Creating a Python Microservice Tier in Four Sprints with Cassandra, Kafka, an...Creating a Python Microservice Tier in Four Sprints with Cassandra, Kafka, an...
Creating a Python Microservice Tier in Four Sprints with Cassandra, Kafka, an...Jeffrey Carpenter
 
Microservices with Node.js and Apache Cassandra
Microservices with Node.js and Apache CassandraMicroservices with Node.js and Apache Cassandra
Microservices with Node.js and Apache CassandraJorge Bay Gondra
 
Sprint 180
Sprint 180Sprint 180
Sprint 180ManageIQ
 
Super-NetOps Source of Truth
Super-NetOps Source of TruthSuper-NetOps Source of Truth
Super-NetOps Source of TruthJoel W. King
 
Sprint 130
Sprint 130Sprint 130
Sprint 130ManageIQ
 
Oracle to Cassandra Core Concepts Guide Pt. 2
Oracle to Cassandra Core Concepts Guide Pt. 2Oracle to Cassandra Core Concepts Guide Pt. 2
Oracle to Cassandra Core Concepts Guide Pt. 2DataStax Academy
 
Pivotal Platform - December Release A First Look
Pivotal Platform - December Release A First LookPivotal Platform - December Release A First Look
Pivotal Platform - December Release A First LookVMware Tanzu
 
Sprint 171
Sprint 171Sprint 171
Sprint 171ManageIQ
 
Sprint 170
Sprint 170Sprint 170
Sprint 170ManageIQ
 
How to help your editor love your vue component library
How to help your editor love your vue component libraryHow to help your editor love your vue component library
How to help your editor love your vue component libraryPiotr Tomiak
 

Semelhante a Cassandra 3.0 advanced preview (20)

Hey Relational Developer, Let's Go Crazy (Patrick McFadin, DataStax) | Cassan...
Hey Relational Developer, Let's Go Crazy (Patrick McFadin, DataStax) | Cassan...Hey Relational Developer, Let's Go Crazy (Patrick McFadin, DataStax) | Cassan...
Hey Relational Developer, Let's Go Crazy (Patrick McFadin, DataStax) | Cassan...
 
Building your First Application with Cassandra
Building your First Application with CassandraBuilding your First Application with Cassandra
Building your First Application with Cassandra
 
Cassandra Summit 2014: Highly Scalable Web Application in the Cloud with Cass...
Cassandra Summit 2014: Highly Scalable Web Application in the Cloud with Cass...Cassandra Summit 2014: Highly Scalable Web Application in the Cloud with Cass...
Cassandra Summit 2014: Highly Scalable Web Application in the Cloud with Cass...
 
Cassandra Day Chicago 2015: Building Your First Application with Apache Cassa...
Cassandra Day Chicago 2015: Building Your First Application with Apache Cassa...Cassandra Day Chicago 2015: Building Your First Application with Apache Cassa...
Cassandra Day Chicago 2015: Building Your First Application with Apache Cassa...
 
Cassandra Day London 2015: Building Your First Application in Apache Cassandra
Cassandra Day London 2015: Building Your First Application in Apache CassandraCassandra Day London 2015: Building Your First Application in Apache Cassandra
Cassandra Day London 2015: Building Your First Application in Apache Cassandra
 
Sprint 177
Sprint 177Sprint 177
Sprint 177
 
Creating a Python Microservice Tier in Four Sprints with Cassandra, Kafka, an...
Creating a Python Microservice Tier in Four Sprints with Cassandra, Kafka, an...Creating a Python Microservice Tier in Four Sprints with Cassandra, Kafka, an...
Creating a Python Microservice Tier in Four Sprints with Cassandra, Kafka, an...
 
Microservices with Node.js and Apache Cassandra
Microservices with Node.js and Apache CassandraMicroservices with Node.js and Apache Cassandra
Microservices with Node.js and Apache Cassandra
 
Sprint 180
Sprint 180Sprint 180
Sprint 180
 
Sprint 180
Sprint 180Sprint 180
Sprint 180
 
Apache DeltaSpike: The CDI Toolbox
Apache DeltaSpike: The CDI ToolboxApache DeltaSpike: The CDI Toolbox
Apache DeltaSpike: The CDI Toolbox
 
Apache DeltaSpike the CDI toolbox
Apache DeltaSpike the CDI toolboxApache DeltaSpike the CDI toolbox
Apache DeltaSpike the CDI toolbox
 
Super-NetOps Source of Truth
Super-NetOps Source of TruthSuper-NetOps Source of Truth
Super-NetOps Source of Truth
 
Droidcon Paris 2015
Droidcon Paris 2015Droidcon Paris 2015
Droidcon Paris 2015
 
Sprint 130
Sprint 130Sprint 130
Sprint 130
 
Oracle to Cassandra Core Concepts Guide Pt. 2
Oracle to Cassandra Core Concepts Guide Pt. 2Oracle to Cassandra Core Concepts Guide Pt. 2
Oracle to Cassandra Core Concepts Guide Pt. 2
 
Pivotal Platform - December Release A First Look
Pivotal Platform - December Release A First LookPivotal Platform - December Release A First Look
Pivotal Platform - December Release A First Look
 
Sprint 171
Sprint 171Sprint 171
Sprint 171
 
Sprint 170
Sprint 170Sprint 170
Sprint 170
 
How to help your editor love your vue component library
How to help your editor love your vue component libraryHow to help your editor love your vue component library
How to help your editor love your vue component library
 

Mais de Patrick McFadin

Successful Architectures for Fast Data
Successful Architectures for Fast DataSuccessful Architectures for Fast Data
Successful Architectures for Fast DataPatrick McFadin
 
Open source or proprietary, choose wisely!
Open source or proprietary,  choose wisely!Open source or proprietary,  choose wisely!
Open source or proprietary, choose wisely!Patrick McFadin
 
An Introduction to time series with Team Apache
An Introduction to time series with Team ApacheAn Introduction to time series with Team Apache
An Introduction to time series with Team ApachePatrick McFadin
 
Laying down the smack on your data pipelines
Laying down the smack on your data pipelinesLaying down the smack on your data pipelines
Laying down the smack on your data pipelinesPatrick McFadin
 
Help! I want to contribute to an Open Source project but my boss says no.
Help! I want to contribute to an Open Source project but my boss says no.Help! I want to contribute to an Open Source project but my boss says no.
Help! I want to contribute to an Open Source project but my boss says no.Patrick McFadin
 
Analyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and CassandraAnalyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and CassandraPatrick McFadin
 
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax EnterpriseA Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax EnterprisePatrick McFadin
 
Apache cassandra and spark. you got the the lighter, let's start the fire
Apache cassandra and spark. you got the the lighter, let's start the fireApache cassandra and spark. you got the the lighter, let's start the fire
Apache cassandra and spark. you got the the lighter, let's start the firePatrick McFadin
 
Owning time series with team apache Strata San Jose 2015
Owning time series with team apache   Strata San Jose 2015Owning time series with team apache   Strata San Jose 2015
Owning time series with team apache Strata San Jose 2015Patrick McFadin
 
Nike Tech Talk: Double Down on Apache Cassandra and Spark
Nike Tech Talk:  Double Down on Apache Cassandra and SparkNike Tech Talk:  Double Down on Apache Cassandra and Spark
Nike Tech Talk: Double Down on Apache Cassandra and SparkPatrick McFadin
 
Apache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series dataApache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series dataPatrick McFadin
 
Introduction to cassandra 2014
Introduction to cassandra 2014Introduction to cassandra 2014
Introduction to cassandra 2014Patrick McFadin
 
Making money with open source and not losing your soul: A practical guide
Making money with open source and not losing your soul: A practical guideMaking money with open source and not losing your soul: A practical guide
Making money with open source and not losing your soul: A practical guidePatrick McFadin
 
Building Antifragile Applications with Apache Cassandra
Building Antifragile Applications with Apache CassandraBuilding Antifragile Applications with Apache Cassandra
Building Antifragile Applications with Apache CassandraPatrick McFadin
 
The data model is dead, long live the data model
The data model is dead, long live the data modelThe data model is dead, long live the data model
The data model is dead, long live the data modelPatrick McFadin
 
Cassandra Virtual Node talk
Cassandra Virtual Node talkCassandra Virtual Node talk
Cassandra Virtual Node talkPatrick McFadin
 
Toronto jaspersoft meetup
Toronto jaspersoft meetupToronto jaspersoft meetup
Toronto jaspersoft meetupPatrick McFadin
 
Cassandra data modeling talk
Cassandra data modeling talkCassandra data modeling talk
Cassandra data modeling talkPatrick McFadin
 

Mais de Patrick McFadin (20)

Successful Architectures for Fast Data
Successful Architectures for Fast DataSuccessful Architectures for Fast Data
Successful Architectures for Fast Data
 
Open source or proprietary, choose wisely!
Open source or proprietary,  choose wisely!Open source or proprietary,  choose wisely!
Open source or proprietary, choose wisely!
 
An Introduction to time series with Team Apache
An Introduction to time series with Team ApacheAn Introduction to time series with Team Apache
An Introduction to time series with Team Apache
 
Laying down the smack on your data pipelines
Laying down the smack on your data pipelinesLaying down the smack on your data pipelines
Laying down the smack on your data pipelines
 
Help! I want to contribute to an Open Source project but my boss says no.
Help! I want to contribute to an Open Source project but my boss says no.Help! I want to contribute to an Open Source project but my boss says no.
Help! I want to contribute to an Open Source project but my boss says no.
 
Analyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and CassandraAnalyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and Cassandra
 
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax EnterpriseA Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
 
Apache cassandra and spark. you got the the lighter, let's start the fire
Apache cassandra and spark. you got the the lighter, let's start the fireApache cassandra and spark. you got the the lighter, let's start the fire
Apache cassandra and spark. you got the the lighter, let's start the fire
 
Owning time series with team apache Strata San Jose 2015
Owning time series with team apache   Strata San Jose 2015Owning time series with team apache   Strata San Jose 2015
Owning time series with team apache Strata San Jose 2015
 
Nike Tech Talk: Double Down on Apache Cassandra and Spark
Nike Tech Talk:  Double Down on Apache Cassandra and SparkNike Tech Talk:  Double Down on Apache Cassandra and Spark
Nike Tech Talk: Double Down on Apache Cassandra and Spark
 
Apache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series dataApache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series data
 
Introduction to cassandra 2014
Introduction to cassandra 2014Introduction to cassandra 2014
Introduction to cassandra 2014
 
Making money with open source and not losing your soul: A practical guide
Making money with open source and not losing your soul: A practical guideMaking money with open source and not losing your soul: A practical guide
Making money with open source and not losing your soul: A practical guide
 
Building Antifragile Applications with Apache Cassandra
Building Antifragile Applications with Apache CassandraBuilding Antifragile Applications with Apache Cassandra
Building Antifragile Applications with Apache Cassandra
 
Cassandra at scale
Cassandra at scaleCassandra at scale
Cassandra at scale
 
Become a super modeler
Become a super modelerBecome a super modeler
Become a super modeler
 
The data model is dead, long live the data model
The data model is dead, long live the data modelThe data model is dead, long live the data model
The data model is dead, long live the data model
 
Cassandra Virtual Node talk
Cassandra Virtual Node talkCassandra Virtual Node talk
Cassandra Virtual Node talk
 
Toronto jaspersoft meetup
Toronto jaspersoft meetupToronto jaspersoft meetup
Toronto jaspersoft meetup
 
Cassandra data modeling talk
Cassandra data modeling talkCassandra data modeling talk
Cassandra data modeling talk
 

Último

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 

Último (20)

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 

Cassandra 3.0 advanced preview

  • 1. Cassandra 3.0 Advanced Preview Patrick McFadin @PatrickMcFadin Chief Evangelist for Apache Cassandra
  • 2. Intro 1.Developer friendly 2.Pay down a lot of technical debt 3.Push performance even higher 2
  • 4. User Defined Functions • Counter table • User clicks on a number of stars • rating_counter = How many clicks • rating_total = Cumulative amount of stars 4 CREATE TABLE video_rating (
 videoid uuid,
 rating_counter counter,
 rating_total counter,
 PRIMARY KEY (videoid)
 );
  • 5. User Defined Functions 5 CREATE TABLE video_rating (
 videoid uuid,
 rating_counter counter,
 rating_total counter,
 PRIMARY KEY (videoid)
 ); public long getRatingForVideo(UUID videoId) {
 BoundStatement bs = getRatingByVideoPreparedStatement.bind(videoId);
 
 ResultSet rs = session.execute(bs);
 
 Row row = rs.one();
 
 // Get the count and total rating for the video
 long total = row.getLong("rating_total");
 long count = row.getLong("rating_counter");
 
 // Divide the total by the count and return an average
 return (total / count);
 }

  • 6. User Defined Functions 6 CREATE TABLE video_rating (
 videoid uuid,
 rating_counter counter,
 rating_total counter,
 PRIMARY KEY (videoid)
 ); public long getRatingForVideo(UUID videoId) {
 BoundStatement bs = getRatingByVideoPreparedStatement.bind(videoId);
 
 ResultSet rs = session.execute(bs);
 
 Row row = rs.one();
 
 // Get the count and total rating for the video
 long total = row.getLong("rating_total");
 long count = row.getLong("rating_counter");
 
 // Divide the total by the count and return an average
 return (total / count);
 }
 Application code?
  • 7. User Defined Functions 7 CREATE OR REPLACE FUNCTION averageRating ( rating_counter counter, rating_total counter )
 RETURNS Float
 LANGUAGE java
 AS '
 return Float.valueOf(rating_total.floatValue() / rating_counter.floatValue());
 '; Function Name CQL TypeObject return type Java Code
  • 8. User Defined Functions • Add to your CQL statement! 8 > SELECT averageRating(rating_counter, rating_total) AS avg
 FROM video_rating
 WHERE videoid = 99051fe9-6a9c-46c2-b949-38ef78858dd0; videoid | rating_counter | rating_total
 --------------------------------------+----------------+--------------
 99051fe9-6a9c-46c2-b949-38ef78858dd0 | 3 | 12 avg
 -----
 4
  • 9. User Defined Functions - Fine print • “Pure” functions • Nothing outside of input parameters • Return types are only objects. No primitives • Method signatures on parameter type 9
  • 10. User Defined Function Language Support • Java • JavaScript 10 • Scala • Groovy • Jython • JRuby Primary Languages Optional Languages
  • 11. JSON Support • Table to store a video • TYPE to store metadata 11 CREATE TYPE video_metadata (
 height int,
 width int,
 video_bit_rate set<text>,
 encoding text
 ); CREATE TABLE videos (
 videoid uuid,
 userid uuid,
 name varchar,
 description varchar,
 location text,
 location_type int,
 preview_thumbnails map<text,text>,
 tags set<varchar>,
 metadata set <frozen<video_metadata>>,
 added_date timestamp,
 PRIMARY KEY (videoid)
 );
  • 12. JSON Support 12 INSERT INTO videos (videoid, name, userid, description, location, location_type, preview_thumbnails, tags, added_date, metadata)
 VALUES (49f64d40-7d89-4890-b910-dbf923563a33,'The World''s Next Top Data Model', 9761d3d7-7fbd-4269-9988-6cfd4e188678, 
 'Third in a three part series for Cassandra Data Modeling','http://www.youtube.com/watch? v=HdJlsOZVGwM',1,
 {'YouTube':'http://www.youtube.com/watch?v=HdJlsOZVGwM'},{'cassandra','data model','examples','instruction'},'2013-06-11 11:00:00',
 {{ height: 480, width: 640, encoding: 'MP4', video_bit_rate: {'1000kbs', '400kbs'}}}); Decompose into standard insert OR!
  • 13. JSON Support 13 INSERT INTO videos JSON
 '{
 "videoid":"99051fe9-6a9c-46c2-b949-38ef78858dd0",
 "added_date":"2012-06-01 08:00:00.000",
 "description":"My cat likes to play the piano! So funny.",
 "location":"/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401",
 "location_type":1,
 "metadata":[
 {
 "height":480,
 "width":640,
 "video_bit_rate":[
 "1000kbs",
 "400kbs"
 ],
 "encoding":"MP4"
 }
 ],
 "name":"My funny cat",
 "preview_thumbnails":{
 "10":"/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401"
 },
 "tags":[
 "cats",
 "lol",
 "piano"
 ],
 "userid":"d0f60aa8-54a9-4840-b70c-fe562b68842b"
 }'; One block of JSON OR!
  • 14. JSON Support 14 INSERT INTO videos (videoid, name, userid, description, location, location_type, preview_thumbnails, tags, added_date, metadata)
 VALUES (99051fe9-6a9c-46c2-b949-38ef78858dd0,'My funny cat',d0f60aa8-54a9-4840-b70c-fe562b68842b, 
 'My cat likes to play the piano! So funny.','/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401',1,
 {'10':'/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401'},{'cats','piano','lol'},'2012-06-01 08:00:00',
 fromJson('
 [{
 "height":480,
 "width":640,
 "video_bit_rate":[
 "1000kbs",
 "400kbs"
 ],
 "encoding":"MP4"
 }]
 ')
 ); Just a block at a time
  • 15. Get JSON data 15 [json]
 ------------------------------------------------------------------
 '{
 "videoid":"99051fe9-6a9c-46c2-b949-38ef78858dd0",
 "added_date":"2012-06-01 08:00:00.000",
 "description":"My cat likes to play the piano! So funny.",
 "location":"/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401",
 "location_type":1,
 "metadata":[
 {
 "height":480,
 "width":640,
 "video_bit_rate":[
 "1000kbs",
 "400kbs"
 ],
 "encoding":"MP4"
 }
 ],
 "name":"My funny cat",
 "preview_thumbnails":{
 "10":"/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401"
 },
 "tags":[
 "cats",
 "lol",
 "piano"
 ],
 "userid":"d0f60aa8-54a9-4840-b70c-fe562b68842b"
 }'; SELECT JSON * FROM videos;
  • 17. Global Indexes 17 CREATE TABLE videos (
 videoid uuid,
 userid uuid,
 name varchar,
 description varchar,
 location text,
 location_type int,
 preview_thumbnails map<text,text>,
 tags set<varchar>,
 metadata set <frozen<video_metadata>>,
 added_date timestamp,
 PRIMARY KEY (videoid)
 ); CREATE TABLE videos_by_tag (
 tag text,
 videoid uuid,
 added_date timestamp,
 name text,
 preview_image_location text,
 tagged_date timestamp,
 PRIMARY KEY (tag, videoid)
 ); Application maintained consistency
  • 18. Global Indexes 18 CREATE TABLE videos (
 videoid uuid,
 userid uuid,
 name varchar,
 description varchar,
 location text,
 location_type int,
 preview_thumbnails map<text,text>,
 tags set<varchar>,
 metadata set <frozen<video_metadata>>,
 added_date timestamp,
 PRIMARY KEY (videoid)
 ); CREATE GLOBAL INDEX tags_index 
 ON videos (tag, videoid) 
 INCLUDE (name, added_date, preview_thumbnails) CREATE TABLE videos_by_tag (
 tag text,
 videoid uuid,
 added_date timestamp,
 name text,
 preview_image_location text,
 tagged_date timestamp,
 PRIMARY KEY (tag, videoid)
 );
  • 19. Global Indexes • Separate Cassandra managed table • Inserts • Updates 19
  • 20. More Indexes! • Partial Indexes - Postponed until 3.1 • Functional Indexes - using a UDF in an index 20 CREATE INDEX ON user_rating averageRating(rating_counter, rating_total);
  • 22. Hints to Raw Files • Pre 3.0 hints stored in table • Create load on entire write path • …and read path • …and compaction 22 CREATE TABLE system.hints (
 target_id uuid,
 hint_id timeuuid,
 message_version int,
 mutation blob,
 PRIMARY KEY (target_id, hint_id, message_version)
 ) WITH COMPACT STORAGE
 AND CLUSTERING ORDER BY (hint_id ASC, message_version ASC);
  • 23. Hints to Raw Files • Hints now written to a local file • Replays direct from disk • Bulk streamed to endpoints 23 CREATE TABLE system.hints (
 target_id uuid,
 hint_id timeuuid,
 message_version int,
 mutation blob,
 PRIMARY KEY (target_id, hint_id, message_version)
 ) WITH COMPACT STORAGE
 AND CLUSTERING ORDER BY (hint_id ASC, message_version ASC);
  • 24. Windows Compatibility - The Problem • Java file management on Windows is… different • File delete’s are not possible • Hard links - Broke • Snapshots - Broke • Memory Mapped I/O - Broke 24
  • 25. Windows Compatibility - 3.0 • Re-tooling of critical file functions • Extensive use of FILE_SHARE_DELETE from JDK7 • Launch now in PowerShell • CCM now supports windows 25
  • 26. Storage Engine Changes • Now infamous CASSANDRA-8099 • Technical debt from Thrift • Move from Thrift centric to CQL centric storage 26
  • 27. Pre 3.0 Storage Engine Format 27 2005:12:1:102005:12:1:92005:12:1:82005:12:1:7 5F22A0BC Partition Key Clustering Columns F2B3652CFFB3652D7AB3652C PRIMARY KEY (userId,added_date,videoId) A12378E55F5A32
  • 28. 3.0 Format • Partition header stores column names • Row stores clustering values • No duplicated values 28 Partition Key Column Names Clustering Values Column Values Clustering Values Column Values Partition Header Row Row Clustering Values Column Values Row
  • 30. Commit Log Compression • Pre 3.0 commit log writes 30
  • 31. Commit Log Compression • Pre 3.0 commit log writes 31
  • 32. Commit Log Compression • Pre 3.0 commit log writes 32
  • 33. Commit Log Compression • Pre 3.0 commit log writes 33
  • 34. Commit Log Compression • Pre 3.0 commit log writes 34
  • 35. Commit Log Compression • Pre 3.0 commit log writes 35
  • 36. Commit Log Compression • Pre 3.0 commit log writes 36
  • 37. Commit Log Compression • Segments are compressed by time interval • Higher throughput under high writes 37
  • 38. Commit Log Compression • Segments are compressed by time interval • Higher throughput under high writes 38
  • 39. Commit Log Compression • Segments are compressed by time interval • Higher throughput under high writes 39
  • 40. Smaller but significant changes • Direct buffer decompression of reads • Avoiding memory allocation on Index Summary search • Repair concurrency improvements • Optimal CRC32 implementation at runtime 40
  • 42. Role Based Access Control • Expands on User based auth in 1.2 • Requires the internal auth to be enabled 42 CREATE ROLE supervisor;
 
 GRANT MODIFY ON user_credentials TO supervisor;
  • 43. When will it ship? 43 Maybe June When 8099 is finished, it ships
  • 44. Thank you! Questions Follow me on Twitter for more @PatrickMcFadin