SlideShare uma empresa Scribd logo
1 de 22
Baixar para ler offline
On Cassandra's evolution
Berlin Buzzwords(June4th 2013)
Sylvain Lebresne
Apache Cassandra
#bbuzz
Fully Distributed Database
Massively Scalable
High performance
Highly reliable/available
·
·
·
·
3/22
Cassandra: the past
#bbuzz
Cassandra 0.7 (Jan 2011):
Cassandra 0.8 (Jun 2011):
·
Dynamic schema creation
Expiring columns (TTL)
Secondary indexes
-
-
-
·
Counters
First version of CQL
Automatic memtable tuning
-
-
-
Cassandra 1.0 (Oct 2011):
Cassandra 1.1 (Apr 2012):
·
Compression
Leveled compaction
-
-
·
Row level isolation
Concurrent schema changes
Support for mixed SDD+HDD
nodes
Self-tuning key/row caches
-
-
-
-
4/22
Cassandra: the present
Cassandra 1.2 (Jan 2013):
#bbuzz
Virtual nodes
CQL3
Native protocol
Tracing
...
·
·
·
·
·
5/22
Data distribution without virtual nodes
#bbuzz 6/22
Virtual nodes
#bbuzz 7/22
Repairing without virtual nodes
#bbuzz 8/22
Virtual nodes: repairing
#bbuzz 9/22
Virtual nodes
#bbuzz
Not really "virtual nodes", more "multiple tokens per nodes" (but we still call them
vnodes).
Faster rebuilds.
Allows heterogeneous nodes.
Simpler load balancing when adding nodes.
·
·
·
·
10/22
The Cassandra Query language
#bbuzz
Initial version introduced in Cassandra 0.8. Version 3 (described here) is a major, more
ambitious, revision.
Goal: provide a much simpler, more abstracted user interface than the legacy thrift one.
Kind of a "denormalized SQL".
Strictly real-time oriented:
·
·
·
·
No joins
No sub-queries
No aggregation/GROUP BY
Limited ORDER BY
-
-
-
-
11/22
Storing songs
id title artist album tags track
a3e64f8f... La Grange ZZTop Tres Hombres { blues, 1973 } 3
8a172618... Moving in Stereo Fu Manchu We Must Obey { covers, 2003 } 9
2b09185b... Outside Woman Blues Back Door Slame Roll Away 3
#bbuzz
CREATETABLEsongs(
iduuidPRIMARYKEY,
titletext,
artisttext,
albumtext,
trackint,
tagsset<text>
);
--Atomicandisolated
INSERTINTOsongs(id,title,artist,album,tags,track)
VALUES(a3e64f8f...,'LaGrange','ZTop''TresHombres',{'blues','1973'},3);
UPDATEsongsSETartist='ZZTop'WHEREid=a3e64f8f...;
CQL
12/22
Playlists
user_id playlist_name title artist album song_id
pcmanus My list La Grange ZZTop Tres Hombres a3e64f8f...
pcmanus My list Moving in Stereo Fu Manchu We Must Obey 8a172618...
pcmanus Other list La Grange ZZTop Tres Hombres a3e64f8f...
pcmanus Other list Outside Woman Blues Back Door Slame Roll Away 2b09185b...
Partition key: determines the physical node holding the row
Clustering key: induces physical ordering of the rows on disk
#bbuzz
CREATETABLEplaylists(
user_idtext,
playlist_nametext,
titletext,
artisttext,
albumtext,
song_iduuid,
PRIMARYKEY((user_id,playlist_name), title,album,artist)
);
CQL
13/22
Querying a Playlist
#bbuzz
--Songsin'Mylist'withatitlestartingby'b'or'c'
SELECT*FROMplaylists
WHEREuser_id='pcmanus'
ANDplaylist_name='Mylist'
ANDtitle>='b'ANDtitle<'d';
--50lastsongsin'Mylist'
SELECT*FROMplaylists
WHEREuser_id='pcmanus'
ANDplaylist_name='Mylist'
ORDERBYtitleDESC
LIMIT50;
CQL
14/22
Native protocol
Binary transport protocol for CQL3 (replace Thrift transport):
See the Datastax Java Driver for a mature driver using this new protocol
(https://github.com/datastax/java-driver).
#bbuzz
Asynchronous (less connections)
Server notifications for new nodes, schema changes, etc..
Optimized for CQL3
·
·
·
15/22
Request tracing
#bbuzz
cqlsh:foo>TRACINGON;
cqlsh:foo>INSERTINTObar(i,j)VALUES(6,2);
CQL
activity |timestamp |source |elapsed
-------------------------------------+--------------+-----------+---------
Determiningreplicasformutation |00:02:37,015|127.0.0.1| 540
Sendingmessageto/127.0.0.2 |00:02:37,015|127.0.0.1| 779
Messagereceivedfrom/127.0.0.1 |00:02:37,016|127.0.0.2| 63
Applyingmutation |00:02:37,016|127.0.0.2| 220
AcquiringswitchLock |00:02:37,016|127.0.0.2| 250
Appendingtocommitlog |00:02:37,016|127.0.0.2| 277
Addingtomemtable |00:02:37,016|127.0.0.2| 378
Enqueuingresponseto/127.0.0.1 |00:02:37,016|127.0.0.2| 710
Sendingmessageto/127.0.0.1 |00:02:37,016|127.0.0.2| 888
Messagereceivedfrom/127.0.0.2 |00:02:37,017|127.0.0.1| 2334
Processingresponsefrom/127.0.0.2 |00:02:37,017|127.0.0.1| 2550
16/22
Tracing an anti-pattern
id created_at value
my_queue 1399121331 0x9b0450d30de9
my_queue 1439051021 0xfc7aee5f6a66
my_queue 1440134565 0x668fdb3a2196
my_queue 1445219887 0xdaf420a01c09
my_queue 1479138491 0x3241ad893ff0
#bbuzz
CREATETABLEqueues(
idtext,
created_attimestamp,
valueblob,
PRIMARYKEY(id,created_at)
);
CQL
17/22
Tracing an anti-pattern
#bbuzz
cqlsh:foo>TRACINGON;
cqlsh:foo>SELECTFROMqueuesWHEREid='myqueue'ORDERBYcreated_atLIMIT1;
CQL
activity |timestamp |source |elapsed
------------------------------------------+--------------+-----------+---------
execute_cql3_query |19:31:05,650|127.0.0.1| 0
Sendingmessageto/127.0.0.3 |19:31:05,651|127.0.0.1| 541
Messagereceivedfrom/127.0.0.1 |19:31:05,651|127.0.0.3| 39
Executingsingle-partitionquery |19:31:05,652|127.0.0.3| 943
Acquiringsstablereferences |19:31:05,652|127.0.0.3| 973
Mergingmemtablecontents |19:31:05,652|127.0.0.3| 1020
Mergingdatafrommemtablesandsstables |19:31:05,652|127.0.0.3| 1081
Read1livecellsand100000tombstoned |19:31:05,686|127.0.0.3| 35072
Enqueuingresponseto/127.0.0.1 |19:31:05,687|127.0.0.3| 35220
Sendingmessageto/127.0.0.1 |19:31:05,687|127.0.0.3| 35314
Messagereceivedfrom/127.0.0.3 |19:31:05,687|127.0.0.1| 36908
Processingresponsefrom/127.0.0.3 |19:31:05,688|127.0.0.1| 37650
Requestcomplete |19:31:05,688|127.0.0.1| 38047
18/22
But also ...
#bbuzz
Concurrent schema creation
Improved JBOD support
Off-heap bloom filters and compression metadata
Faster (murmur3 based) partitioner
...
·
·
·
·
·
19/22
What's next?
Cassandra 2.0 is scheduled for July:
#bbuzz
Improvements to CQL3 and the native protocol (automatic query paging)
Compare-and-swap
Triggers (experimental)
Eager retries
Performance improvements (single-pass compaction, more efficient tombstone removal,
...)
...
·
·
UPDATEusersSETlogin='pcmanus',name='SylvainLebresne'IFNOTEXISTS;
UPDATEusersSETemail='sylvain@datastax.com'IFemail='slebresne@datastax.com';
CQL
·
·
·
·
20/22
<Thank You!>
Questions?
www http://cassandra.apache.org/
twitter @pcmanus
github github.com/pcmanus
Cassandra's evolution and future directions

Mais conteúdo relacionado

Mais procurados

By Vikrant Desai
By Vikrant DesaiBy Vikrant Desai
By Vikrant Desaidvikranttt
 
Creación de máquinas virtuales basada en kernel usando qemu y virsh
Creación de máquinas virtuales basada en kernel usando qemu y virshCreación de máquinas virtuales basada en kernel usando qemu y virsh
Creación de máquinas virtuales basada en kernel usando qemu y virshJonathan Franchesco Torres Baca
 
Qemu - Raspberry | while42 Singapore #2
Qemu - Raspberry | while42 Singapore #2Qemu - Raspberry | while42 Singapore #2
Qemu - Raspberry | while42 Singapore #2While42
 
Simple Sheet for resource calculation of openstack compute node
Simple Sheet for resource calculation of openstack compute nodeSimple Sheet for resource calculation of openstack compute node
Simple Sheet for resource calculation of openstack compute nodeAsmaa Ibrahim
 
Linux fundamental - Chap 04 archive
Linux fundamental - Chap 04 archiveLinux fundamental - Chap 04 archive
Linux fundamental - Chap 04 archiveKenny (netman)
 
Building packages through emulation by Sean Bruno
Building packages through emulation by Sean BrunoBuilding packages through emulation by Sean Bruno
Building packages through emulation by Sean Brunoeurobsdcon
 
My sql fabric ha and sharding solutions
My sql fabric ha and sharding solutionsMy sql fabric ha and sharding solutions
My sql fabric ha and sharding solutionsLouis liu
 
Some analysis of BlueStore and RocksDB
Some analysis of BlueStore and RocksDBSome analysis of BlueStore and RocksDB
Some analysis of BlueStore and RocksDBXiao Yan Li
 
Fixed in drizzle
Fixed in drizzleFixed in drizzle
Fixed in drizzleHenrik Ingo
 
Proxy server ubuntu 12.04
Proxy server ubuntu 12.04Proxy server ubuntu 12.04
Proxy server ubuntu 12.04Tio Aldiansyah
 
Linux Introduction
Linux IntroductionLinux Introduction
Linux IntroductionDuy Do Phan
 
MyAWR another mysql awr
MyAWR another mysql awrMyAWR another mysql awr
MyAWR another mysql awrLouis liu
 

Mais procurados (20)

By Vikrant Desai
By Vikrant DesaiBy Vikrant Desai
By Vikrant Desai
 
NetApp mailbox disk
NetApp mailbox diskNetApp mailbox disk
NetApp mailbox disk
 
My First BCC
My First BCCMy First BCC
My First BCC
 
Creación de máquinas virtuales basada en kernel usando qemu y virsh
Creación de máquinas virtuales basada en kernel usando qemu y virshCreación de máquinas virtuales basada en kernel usando qemu y virsh
Creación de máquinas virtuales basada en kernel usando qemu y virsh
 
Gitkata fish shell
Gitkata fish shellGitkata fish shell
Gitkata fish shell
 
Qemu - Raspberry | while42 Singapore #2
Qemu - Raspberry | while42 Singapore #2Qemu - Raspberry | while42 Singapore #2
Qemu - Raspberry | while42 Singapore #2
 
Backups
BackupsBackups
Backups
 
Simple Sheet for resource calculation of openstack compute node
Simple Sheet for resource calculation of openstack compute nodeSimple Sheet for resource calculation of openstack compute node
Simple Sheet for resource calculation of openstack compute node
 
Cli1 Bibalex
Cli1 BibalexCli1 Bibalex
Cli1 Bibalex
 
Linux fundamental - Chap 04 archive
Linux fundamental - Chap 04 archiveLinux fundamental - Chap 04 archive
Linux fundamental - Chap 04 archive
 
Building packages through emulation by Sean Bruno
Building packages through emulation by Sean BrunoBuilding packages through emulation by Sean Bruno
Building packages through emulation by Sean Bruno
 
Linux Containers (LXC)
Linux Containers (LXC)Linux Containers (LXC)
Linux Containers (LXC)
 
My sql fabric ha and sharding solutions
My sql fabric ha and sharding solutionsMy sql fabric ha and sharding solutions
My sql fabric ha and sharding solutions
 
Some analysis of BlueStore and RocksDB
Some analysis of BlueStore and RocksDBSome analysis of BlueStore and RocksDB
Some analysis of BlueStore and RocksDB
 
ubunturef
ubunturefubunturef
ubunturef
 
Fixed in drizzle
Fixed in drizzleFixed in drizzle
Fixed in drizzle
 
Proxy server ubuntu 12.04
Proxy server ubuntu 12.04Proxy server ubuntu 12.04
Proxy server ubuntu 12.04
 
Linux Introduction
Linux IntroductionLinux Introduction
Linux Introduction
 
MyAWR another mysql awr
MyAWR another mysql awrMyAWR another mysql awr
MyAWR another mysql awr
 
Unixtoolbox
UnixtoolboxUnixtoolbox
Unixtoolbox
 

Mais de pcmanus

Webinar Big Data Paris
Webinar Big Data ParisWebinar Big Data Paris
Webinar Big Data Parispcmanus
 
Cassandra EU - State of CQL
Cassandra EU - State of CQLCassandra EU - State of CQL
Cassandra EU - State of CQLpcmanus
 
Fosdem 2012
Fosdem 2012Fosdem 2012
Fosdem 2012pcmanus
 
33rd degree conference
33rd degree conference33rd degree conference
33rd degree conferencepcmanus
 
On Cassandra Development: Past, Present and Future
On Cassandra Development: Past, Present and FutureOn Cassandra Development: Past, Present and Future
On Cassandra Development: Past, Present and Futurepcmanus
 
Cassandra
CassandraCassandra
Cassandrapcmanus
 

Mais de pcmanus (6)

Webinar Big Data Paris
Webinar Big Data ParisWebinar Big Data Paris
Webinar Big Data Paris
 
Cassandra EU - State of CQL
Cassandra EU - State of CQLCassandra EU - State of CQL
Cassandra EU - State of CQL
 
Fosdem 2012
Fosdem 2012Fosdem 2012
Fosdem 2012
 
33rd degree conference
33rd degree conference33rd degree conference
33rd degree conference
 
On Cassandra Development: Past, Present and Future
On Cassandra Development: Past, Present and FutureOn Cassandra Development: Past, Present and Future
On Cassandra Development: Past, Present and Future
 
Cassandra
CassandraCassandra
Cassandra
 

Último

WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 

Último (20)

WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 

Cassandra's evolution and future directions

  • 1.
  • 2. On Cassandra's evolution Berlin Buzzwords(June4th 2013) Sylvain Lebresne
  • 3. Apache Cassandra #bbuzz Fully Distributed Database Massively Scalable High performance Highly reliable/available · · · · 3/22
  • 4. Cassandra: the past #bbuzz Cassandra 0.7 (Jan 2011): Cassandra 0.8 (Jun 2011): · Dynamic schema creation Expiring columns (TTL) Secondary indexes - - - · Counters First version of CQL Automatic memtable tuning - - - Cassandra 1.0 (Oct 2011): Cassandra 1.1 (Apr 2012): · Compression Leveled compaction - - · Row level isolation Concurrent schema changes Support for mixed SDD+HDD nodes Self-tuning key/row caches - - - - 4/22
  • 5. Cassandra: the present Cassandra 1.2 (Jan 2013): #bbuzz Virtual nodes CQL3 Native protocol Tracing ... · · · · · 5/22
  • 6. Data distribution without virtual nodes #bbuzz 6/22
  • 8. Repairing without virtual nodes #bbuzz 8/22
  • 10. Virtual nodes #bbuzz Not really "virtual nodes", more "multiple tokens per nodes" (but we still call them vnodes). Faster rebuilds. Allows heterogeneous nodes. Simpler load balancing when adding nodes. · · · · 10/22
  • 11. The Cassandra Query language #bbuzz Initial version introduced in Cassandra 0.8. Version 3 (described here) is a major, more ambitious, revision. Goal: provide a much simpler, more abstracted user interface than the legacy thrift one. Kind of a "denormalized SQL". Strictly real-time oriented: · · · · No joins No sub-queries No aggregation/GROUP BY Limited ORDER BY - - - - 11/22
  • 12. Storing songs id title artist album tags track a3e64f8f... La Grange ZZTop Tres Hombres { blues, 1973 } 3 8a172618... Moving in Stereo Fu Manchu We Must Obey { covers, 2003 } 9 2b09185b... Outside Woman Blues Back Door Slame Roll Away 3 #bbuzz CREATETABLEsongs( iduuidPRIMARYKEY, titletext, artisttext, albumtext, trackint, tagsset<text> ); --Atomicandisolated INSERTINTOsongs(id,title,artist,album,tags,track) VALUES(a3e64f8f...,'LaGrange','ZTop''TresHombres',{'blues','1973'},3); UPDATEsongsSETartist='ZZTop'WHEREid=a3e64f8f...; CQL 12/22
  • 13. Playlists user_id playlist_name title artist album song_id pcmanus My list La Grange ZZTop Tres Hombres a3e64f8f... pcmanus My list Moving in Stereo Fu Manchu We Must Obey 8a172618... pcmanus Other list La Grange ZZTop Tres Hombres a3e64f8f... pcmanus Other list Outside Woman Blues Back Door Slame Roll Away 2b09185b... Partition key: determines the physical node holding the row Clustering key: induces physical ordering of the rows on disk #bbuzz CREATETABLEplaylists( user_idtext, playlist_nametext, titletext, artisttext, albumtext, song_iduuid, PRIMARYKEY((user_id,playlist_name), title,album,artist) ); CQL 13/22
  • 15. Native protocol Binary transport protocol for CQL3 (replace Thrift transport): See the Datastax Java Driver for a mature driver using this new protocol (https://github.com/datastax/java-driver). #bbuzz Asynchronous (less connections) Server notifications for new nodes, schema changes, etc.. Optimized for CQL3 · · · 15/22
  • 16. Request tracing #bbuzz cqlsh:foo>TRACINGON; cqlsh:foo>INSERTINTObar(i,j)VALUES(6,2); CQL activity |timestamp |source |elapsed -------------------------------------+--------------+-----------+--------- Determiningreplicasformutation |00:02:37,015|127.0.0.1| 540 Sendingmessageto/127.0.0.2 |00:02:37,015|127.0.0.1| 779 Messagereceivedfrom/127.0.0.1 |00:02:37,016|127.0.0.2| 63 Applyingmutation |00:02:37,016|127.0.0.2| 220 AcquiringswitchLock |00:02:37,016|127.0.0.2| 250 Appendingtocommitlog |00:02:37,016|127.0.0.2| 277 Addingtomemtable |00:02:37,016|127.0.0.2| 378 Enqueuingresponseto/127.0.0.1 |00:02:37,016|127.0.0.2| 710 Sendingmessageto/127.0.0.1 |00:02:37,016|127.0.0.2| 888 Messagereceivedfrom/127.0.0.2 |00:02:37,017|127.0.0.1| 2334 Processingresponsefrom/127.0.0.2 |00:02:37,017|127.0.0.1| 2550 16/22
  • 17. Tracing an anti-pattern id created_at value my_queue 1399121331 0x9b0450d30de9 my_queue 1439051021 0xfc7aee5f6a66 my_queue 1440134565 0x668fdb3a2196 my_queue 1445219887 0xdaf420a01c09 my_queue 1479138491 0x3241ad893ff0 #bbuzz CREATETABLEqueues( idtext, created_attimestamp, valueblob, PRIMARYKEY(id,created_at) ); CQL 17/22
  • 18. Tracing an anti-pattern #bbuzz cqlsh:foo>TRACINGON; cqlsh:foo>SELECTFROMqueuesWHEREid='myqueue'ORDERBYcreated_atLIMIT1; CQL activity |timestamp |source |elapsed ------------------------------------------+--------------+-----------+--------- execute_cql3_query |19:31:05,650|127.0.0.1| 0 Sendingmessageto/127.0.0.3 |19:31:05,651|127.0.0.1| 541 Messagereceivedfrom/127.0.0.1 |19:31:05,651|127.0.0.3| 39 Executingsingle-partitionquery |19:31:05,652|127.0.0.3| 943 Acquiringsstablereferences |19:31:05,652|127.0.0.3| 973 Mergingmemtablecontents |19:31:05,652|127.0.0.3| 1020 Mergingdatafrommemtablesandsstables |19:31:05,652|127.0.0.3| 1081 Read1livecellsand100000tombstoned |19:31:05,686|127.0.0.3| 35072 Enqueuingresponseto/127.0.0.1 |19:31:05,687|127.0.0.3| 35220 Sendingmessageto/127.0.0.1 |19:31:05,687|127.0.0.3| 35314 Messagereceivedfrom/127.0.0.3 |19:31:05,687|127.0.0.1| 36908 Processingresponsefrom/127.0.0.3 |19:31:05,688|127.0.0.1| 37650 Requestcomplete |19:31:05,688|127.0.0.1| 38047 18/22
  • 19. But also ... #bbuzz Concurrent schema creation Improved JBOD support Off-heap bloom filters and compression metadata Faster (murmur3 based) partitioner ... · · · · · 19/22
  • 20. What's next? Cassandra 2.0 is scheduled for July: #bbuzz Improvements to CQL3 and the native protocol (automatic query paging) Compare-and-swap Triggers (experimental) Eager retries Performance improvements (single-pass compaction, more efficient tombstone removal, ...) ... · · UPDATEusersSETlogin='pcmanus',name='SylvainLebresne'IFNOTEXISTS; UPDATEusersSETemail='sylvain@datastax.com'IFemail='slebresne@datastax.com'; CQL · · · · 20/22