Cassandra
part 2
diegopacheco
@diego_pacheco
About me...
Diego Pacheco
@diego_pacheco
❏ Cat's Father
❏ Principal Software Architect
❏ Agile Coach
❏ SOA Expert
❏ DevOps Practitioner
❏ Speaker
❏ Author
diegopacheco
http://diego-pacheco.blogspot.com.br/
https://goo.gl/eEqvzl
Agenda
❏ RE-CAP
❏ Cassandra Write Path
❏ Tombstones
❏ Compaction Strategies
❏ Row Cache
❏ Bloom Filter
❏ SASI Index
❏ Materialized Views
❏ Counter Families
❏ Anti-Patterns
❏ Use case: Cassandra running at UBER on MESOS
❏ Q&A
http://cassandra.apache.org/
RE-CAP: Partition Strategy
Cassandra Write Path
❏ SSTable => Sorted String Table.
❏ Write to disk: data is merged and pre-sorted before it is written.
❏ SSTables are IMMUTABLE.
❏ Compaction happens:
  ❏ From time to time
  ❏ Prunes deleted data
  ❏ Has trade-offs
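The write path can be sketched with a toy model. This is a minimal illustration, not Cassandra's implementation: the class `TinyStore`, its `flush_threshold` parameter and the integer `clock` used as a write timestamp are all hypothetical names chosen for the example.

```python
class TinyStore:
    """Toy model of the write path: memtable -> immutable, sorted SSTables."""

    def __init__(self, flush_threshold=2):
        self.memtable = {}            # in-memory and mutable
        self.sstables = []            # oldest first; each one is an immutable sorted tuple
        self.flush_threshold = flush_threshold  # hypothetical knob for this sketch
        self.clock = 0                # stand-in for the write timestamp

    def write(self, key, value):
        self.clock += 1
        self.memtable[key] = (self.clock, value)
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if self.memtable:
            # Sort once at flush time; the resulting SSTable is never modified.
            self.sstables.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}

    def compact(self):
        # Merge every SSTable into one, keeping the newest version of each key.
        merged = {}
        for table in self.sstables:
            for key, (ts, value) in table:
                if key not in merged or ts > merged[key][0]:
                    merged[key] = (ts, value)
        self.sstables = [tuple(sorted(merged.items()))] if merged else []

    def read(self, key):
        # A read has to consult the memtable and every SSTable, newest write wins.
        candidates = []
        if key in self.memtable:
            candidates.append(self.memtable[key])
        for table in self.sstables:
            candidates.extend(cell for k, cell in table if k == key)
        return max(candidates)[1] if candidates else None
```

Because SSTables are never updated in place, an overwritten key lives in several files until compaction merges them, which is why reads get cheaper after `compact()`.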
Tombstones
❏ Deleted data is only MARKED as removed; the marker is a tombstone.
❏ The data is physically removed during compaction.
❏ That compaction may not happen for a few days, depending on the configuration (for example gc_grace_seconds, which defaults to 10 days).
❏ Queries on a partition with lots of tombstones require lots of filtering, which can slow down Cassandra's read performance.
❏ Collection operations can create tombstones, depending on what you do (for example, overwriting a whole collection).
❏ There are compaction trade-offs.
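A rough sketch of the tombstone life cycle described above. The cell layout `(key, timestamp, value)`, the `TOMBSTONE` marker and the `gc_grace` argument are assumptions made for this example; they only mimic the idea and are not Cassandra's storage format.

```python
TOMBSTONE = object()  # marker meaning "this cell was deleted"


def delete(cells, key, now):
    """A delete is just another write: it appends a tombstone with a timestamp."""
    return cells + [(key, now, TOMBSTONE)]


def _newest_per_key(cells):
    newest = {}
    for key, ts, value in cells:
        if key not in newest or ts > newest[key][0]:
            newest[key] = (ts, value)
    return newest


def read_live(cells):
    """Return the live key -> value map and the number of tombstones filtered out."""
    newest = _newest_per_key(cells)
    tombstones = sum(1 for _, v in newest.values() if v is TOMBSTONE)
    live = {k: v for k, (_, v) in newest.items() if v is not TOMBSTONE}
    return live, tombstones


def compact(cells, now, gc_grace):
    """Keep the newest cell per key and drop tombstones older than gc_grace."""
    newest = _newest_per_key(cells)
    return [(k, ts, v) for k, (ts, v) in sorted(newest.items())
            if not (v is TOMBSTONE and now - ts >= gc_grace)]
```

Until the grace period has passed, the tombstone survives compaction and every read of the partition still has to filter it out.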
Compaction Strategies
❏ STCS (SizeTieredCompactionStrategy)
  ❏ Default
  ❏ Insert-heavy
  ❏ General workloads
❏ LCS (LeveledCompactionStrategy)
  ❏ Read-heavy
  ❏ More updates than inserts
❏ DTCS (DateTieredCompactionStrategy)
  ❏ Time series
  ❏ Inserts out of order
  ❏ Updates for old data
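To make the size-tiered idea concrete, here is a simplified sketch of how STCS groups SSTables of similar size into buckets and compacts a bucket once it holds enough files. The function names are hypothetical; the defaults mirror the STCS options `bucket_low`, `bucket_high` and `min_threshold`, but the real strategy has more rules than this.

```python
def stcs_buckets(sizes, bucket_low=0.5, bucket_high=1.5):
    """Group SSTable sizes so that each size is close to its bucket's average."""
    buckets = []
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if avg * bucket_low <= size <= avg * bucket_high:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough in size: start a new one.
            buckets.append([size])
    return buckets


def compaction_candidates(sizes, min_threshold=4):
    """Only buckets with at least min_threshold SSTables are worth compacting."""
    return [b for b in stcs_buckets(sizes) if len(b) >= min_threshold]
```

For sizes `[10, 11, 9, 12, 100, 400]` the four small SSTables land in one bucket and become a compaction candidate, while the two large ones wait for peers of similar size.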
Cassandra ROW CACHE
❏ Keeps the FULL merged row in memory
❏ Can greatly increase read throughput
❏ Row Cache works together with the Key Cache
❏ Key Cache = where the partition is on DISK.
-- Cache all partition keys and the first 10 rows of each partition.
CREATE TABLE status (
  user text,
  status_id timeuuid,
  status text,
  PRIMARY KEY (user, status_id))
WITH CLUSTERING ORDER BY (status_id DESC)
AND caching = {'keys': 'ALL', 'rows_per_partition': '10'};
Cassandra Bloom Filter
❏ Bloom Filter: a technique created in the 1970s to test set membership.
❏ Space efficient
❏ Probabilistic data structure: false positives are possible, false negatives are not
❏ There is one Bloom filter per SSTable
❏ Used for index scans, not for range scans
❏ Stored OFF HEAP
❏ Tunable per TABLE (bloom_filter_fp_chance)
❏ Cassandra uses Bloom filters to check whether an SSTable may contain the requested partition.
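A minimal Bloom filter sketch, to show why the answer is "maybe present" or "definitely absent". The class name, the sizes and the use of SHA-256 with a seed prefix are assumptions for illustration; Cassandra uses its own hashing and sizes the filter from the configured false-positive chance.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes bit positions by hashing the key with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely not here"; True means "possibly here".
        return all(self.bits[pos] for pos in self._positions(key))
```

On a read, an SSTable whose filter answers `False` for the partition key can be skipped without touching disk; a `True` still has to be confirmed by reading the SSTable.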
Cassandra READ Path
SASI
❏ Secondary index (SSTable Attached Secondary Index): indexes a column that is not the primary key.
❏ Lookup tables: bySomething
❏ Distributed index
❏ LIKE-style search capabilities: %diego%
❏ Great when:
  ❏ Multi-field search
  ❏ You know the partition key
  ❏ Indexing static columns
❏ Issues:
  ❏ More than 1000 rows returned
  ❏ Searching in large partitions
  ❏ Aggressive read SLOs
  ❏ Search for analytics (use Spark/Flink)
  ❏ Ordering of the search results is important
SASI Samples
❏ SELECT * FROM users WHERE firstname LIKE 'Die%';
❏ SELECT * FROM users WHERE lastname LIKE '%ie%';
❏ SELECT * FROM users WHERE created_date > '2015-01-02' AND created_date < '2017-01-02';
Materialized Views
❏ Automated denormalization: the table is managed for you
❏ Copies of the data in different partitions / replicas
❏ Some write penalty, but acceptable performance
❏ Stores results in a table, which can be indexed
❏ Updated ASYNC
❏ Great for:
  ❏ Caching
  ❏ Result sets
  ❏ Dashboards
SAMPLE
CREATE MATERIALIZED VIEW all_time_high AS
  SELECT user FROM scores
  WHERE game IS NOT NULL AND score IS NOT NULL
  PRIMARY KEY (game, score)
  WITH CLUSTERING ORDER BY (score DESC);
Cassandra Counter Family
❏ Static vs. dynamic column families
❏ Dynamic column families, a.k.a. wide rows
❏ Wide rows are good for ordering, grouping and filtering.
❏ Wide rows are not split across NODES.
❏ Counters internally:
  ❏ The value is calculated as the sum of the shards from all replicas
  ❏ Split into fragments called SHARDs.
  ❏ Logical clock, monotonically increasing
  ❏ 3-tuple = { NODE_COUNTER_ID, SHARD_LOGICAL_CLOCK, SHARD_VALUE }
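The shard 3-tuple can be modelled roughly as follows. This is a simplified sketch under stated assumptions: shards are kept in a dict from node id to `(logical_clock, value)`, and the helper names `increment`, `merge` and `total` are hypothetical, not Cassandra APIs.

```python
def increment(shards, node_id, delta):
    """A node bumps only its own shard: clock goes up by one, value by delta."""
    clock, value = shards.get(node_id, (0, 0))
    updated = dict(shards)
    updated[node_id] = (clock + 1, value + delta)
    return updated


def merge(a, b):
    """Reconcile two views: for each node, the shard with the higher clock wins."""
    merged = dict(a)
    for node_id, shard in b.items():
        if node_id not in merged or shard[0] > merged[node_id][0]:
            merged[node_id] = shard
    return merged


def total(shards):
    """The counter value is the sum of all shard values."""
    return sum(value for _, value in shards.values())
```

Because each node only ever advances its own shard and the logical clock only grows, merging a stale copy of a shard never overwrites a newer one.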
Anti-Patterns
❏ Using Cassandra as a queue or queue-like table
  ❏ Tombstones
  ❏ Lots of deleted columns (expiry) and slice queries don't play well together
  ❏ http://www.datastax.com/dev/blog/cassandra-anti-patterns-queues-and-queue-like-datasets
❏ CQL nulls
  ❏ Reading tombstones
  ❏ Writing NULL creates tombstones
❏ Intensive updates on the SAME column
  ❏ Sensor table (ID, VALUE)
  ❏ Physical limits
  ❏ Solution: timestamp as clustering key.
Cassandra at UBER using MESOS (2016 data)
