An introduction to Cassandra

Pedro Gomes
pedrogomes@lsd.di.uminho.pt
Braga Geek Nights - April 2010
Context
•   NoSQL movement - "Not only SQL"
    • unstructured data
    • web-oriented interfaces
    • scale problems
•   20+ emerging non-relational databases
    • Document stores
    • Graph databases
    • Key-Value and Wide Column Stores (e.g. Voldemort)
Cassandra - introduction
• Named after the Greek prophetess Cassandra.
• Based on Amazon Dynamo and Google BigTable
• Built at Facebook, open sourced in 2008
• Scalable, decentralized and structured data store
Why Cassandra?
•   Highly available
•   Eventually consistent
•   Decentralized
•   Elastic
•   Fault tolerant
•   Flexible Schema
A little internals...
• Built for scale - consistent hashing

[Figure: hash ring with positions labelled A, F, N, M, I and B, showing a new node joining the ring]
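The ring idea on this slide can be sketched in a few lines. This is a minimal illustration, not Cassandra's implementation: the `Ring` class, the `token` helper and the use of MD5 to place names on the ring are all assumptions made for the example. It shows the property that makes consistent hashing attractive for scaling: when a new node joins, only the keys that fall into its slice of the ring change owner.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Place a string on the ring using MD5 (an illustrative choice)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy hash ring: a key belongs to the first node clockwise from its token."""

    def __init__(self, nodes=()):
        self._tokens = []   # sorted node tokens
        self._owners = {}   # node token -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        t = token(node)
        bisect.insort(self._tokens, t)
        self._owners[t] = node

    def node_for(self, key: str) -> str:
        # First node token >= key token, wrapping around at the end of the ring.
        i = bisect.bisect_left(self._tokens, token(key))
        return self._owners[self._tokens[i % len(self._tokens)]]


ring = Ring(["A", "F", "M"])
keys = [f"key-{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}

ring.add_node("N")  # the "new node" joining the ring
after = {k: ring.node_for(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
# Every key that changed owner moved to the new node; the rest stayed put.
assert all(after[k] == "N" for k in moved)
print(f"{len(moved)} of {len(keys)} keys moved to the new node")
```

With a naive `hash(key) % number_of_nodes` scheme, by contrast, adding a node would reassign most keys.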
Partitioners
• Order preserving
• Random
• Custom...
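The difference between the first two partitioners can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical, and MD5 stands in for whatever hash a random partitioner uses. An order-preserving partitioner keeps keys in their natural order on the ring, which makes range scans possible but can concentrate load; a random partitioner hashes keys first, which spreads load but discards ordering.

```python
import hashlib


def order_preserving_token(key: bytes) -> bytes:
    """Order-preserving style: the key itself is its position on the ring."""
    return key


def random_token(key: bytes) -> int:
    """Random style: the position is a hash of the key (MD5 assumed here)."""
    return int.from_bytes(hashlib.md5(key).digest(), "big")


keys = [b"carol", b"alice", b"bob"]

# Ring order under each scheme.
by_order = sorted(keys, key=order_preserving_token)
by_hash = sorted(keys, key=random_token)

print(by_order)  # lexicographic order, so adjacent keys are ring neighbours
print(by_hash)   # order depends only on the hash values
```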
Consistency
• CAP theorem: Consistency, Availability, Partition tolerance
 • Trade consistency for availability
    •   Eventual consistency
    •   Read Repair, Hinted Handoff, Proactive Repair
  • A choice, not an obligation
Consistency - N,W,R
• Define your consistency:
 • Define the replication factor N
 • For reads and writes, choose the number of nodes R or W that must respond
   • ALL, ONE, QUORUM, ZERO.
   • W + R > N gives consistent reads
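The W + R > N rule is simple arithmetic, sketched below. The helper names are hypothetical, QUORUM is taken to mean a strict majority of the N replicas, and ZERO is modelled as waiting for no acknowledgements; both readings are assumptions of this example. When the inequality holds, every read set overlaps every write set in at least one replica, so a read sees the latest acknowledged write.

```python
def quorum(n: int) -> int:
    """Strict majority of n replicas."""
    return n // 2 + 1


def nodes_for(level: str, n: int) -> int:
    """Number of replicas that must respond for a given consistency level."""
    return {"ZERO": 0, "ONE": 1, "QUORUM": quorum(n), "ALL": n}[level]


def read_sees_latest_write(write_level: str, read_level: str, n: int) -> bool:
    """True when W + R > N, i.e. read and write replica sets must overlap."""
    return nodes_for(write_level, n) + nodes_for(read_level, n) > n


N = 3
for w, r in [("QUORUM", "QUORUM"), ("ALL", "ONE"), ("ONE", "ONE"), ("ZERO", "ALL")]:
    print(f"N={N} W={w} R={r}: overlap guaranteed = {read_sees_latest_write(w, r, N)}")
```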
Data model
• Keyspaces - collection of your unique keys
• Column Families - groups of columns
• Columns - a tuple of column name, value, and timestamp
• Super columns - a column whose value is a set of columns
• I will show pictures next, don’t worry.
Data model - Column Families
• Using the blog example:
 • Posts

   Key          Columns
   Geek Nights  Title: Geek Nights | Author: Pedro   | Body: The...
   Cassandra    Title: Cassandra   | Author: Pedro   | Body: This...    | Tags: Data, ...
   Stuff        Title: Stuff       | Author: Someone | Body: Something
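One way to picture a column family is as a map from row key to a map of column name to value. The sketch below is only a mental model using plain Python dictionaries, with the data taken from the blog example; it leaves out the timestamp that each real column carries.

```python
# Column family "Posts": row key -> {column name: value}
posts = {
    "Geek Nights": {"Title": "Geek Nights", "Author": "Pedro", "Body": "The..."},
    "Cassandra": {
        "Title": "Cassandra",
        "Author": "Pedro",
        "Body": "This...",
        "Tags": "Data, ...",
    },
    "Stuff": {"Title": "Stuff", "Author": "Someone", "Body": "Something"},
}

# Rows in the same column family need not have the same columns.
for key, columns in posts.items():
    print(key, "->", sorted(columns))
```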
Data model - Super Columns
 • Comments

   Key          Super column      Author    Comment      email
   Geek Nights  4/5/2010 20:00    Ricardo   I think...   email@
                4/5/2010 19:00    Jack      IMO ...      email@
   Cassandra    1/4/2010 14:00    Filipe    My POV..     email@
                1/4/2010 14:00    Jon       ...          email@
   Stuff        1/4/2010 14:00    Filipe    Great...     email@
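A super column family adds one more level of nesting: row key, then super column name, then the columns inside it. The sketch below models that with nested dictionaries and uses the comment time as the super column name, read as day/month/year (an assumption; the `ts` helper is hypothetical). Two comments posted at the same clock time, as in the "Cassandra" row, would collide in this toy model, so that row is left out; unique time-based names would avoid the collision.

```python
from datetime import datetime


def ts(text: str) -> datetime:
    """Parse a slide-style timestamp, assumed to be day/month/year."""
    return datetime.strptime(text, "%d/%m/%Y %H:%M")


# Super column family "Comments": row key -> {super column name: {column: value}}
comments = {
    "Geek Nights": {
        ts("4/5/2010 20:00"): {"Author": "Ricardo", "Comment": "I think...", "email": "email@"},
        ts("4/5/2010 19:00"): {"Author": "Jack", "Comment": "IMO ...", "email": "email@"},
    },
    "Stuff": {
        ts("1/4/2010 14:00"): {"Author": "Filipe", "Comment": "Great...", "email": "email@"},
    },
}


def authors_in_time_order(post_key: str) -> list:
    """Walk a row's super columns sorted by name, i.e. by comment time."""
    row = comments[post_key]
    return [row[name]["Author"] for name in sorted(row)]


print(authors_in_time_order("Geek Nights"))
```

Sorting by super column name is what lets a post's comments come back in chronological order.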
Data model
<Keyspace Name="BloggyAppy">

   <!-- CF definitions -->
   <ColumnFamily CompareWith="BytesType" Name="BlogEntries"/>
   <ColumnFamily CompareWith="TimeUUIDType" Name="Comments"
       CompareSubcolumnsWith="BytesType" ColumnType="Super"/>

</Keyspace>




• Think about your schema
API

• Thrift RPC
 • Java, PHP, C++....
API
•   insert(KeySpace, Key, Column_path, Value, Timestamp, Consistency_level)

•   get(KeySpace, Key, Column_path, Consistency_level)

•   batch_mutate

•   multi_get

•   range

•   ...
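To show how the `insert` and `get` parameters fit together, here is a toy in-memory stand-in. It is not the Thrift client and makes no network calls: the `ToyStore` class is hypothetical, the column path is modelled as a `(column family, column name)` tuple, and the consistency level is accepted but ignored because there is only one copy of the data. Writes are resolved by timestamp, so an older write never overwrites a newer one.

```python
from collections import defaultdict


class ToyStore:
    """In-memory sketch mirroring the argument order of insert/get on the slide."""

    def __init__(self):
        # keyspace -> row key -> column path -> (value, timestamp)
        self._data = defaultdict(lambda: defaultdict(dict))

    def insert(self, keyspace, key, column_path, value, timestamp,
               consistency_level="ONE"):
        row = self._data[keyspace][key]
        current = row.get(column_path)
        # Keep the write with the highest timestamp (last write wins).
        if current is None or timestamp >= current[1]:
            row[column_path] = (value, timestamp)

    def get(self, keyspace, key, column_path, consistency_level="ONE"):
        # Raises KeyError if the column does not exist.
        value, _timestamp = self._data[keyspace][key][column_path]
        return value


store = ToyStore()
path = ("BlogEntries", "Title")
store.insert("BloggyAppy", "Geek Nights", path, "Geek Nights", timestamp=2)
store.insert("BloggyAppy", "Geek Nights", path, "Stale title", timestamp=1)
print(store.get("BloggyAppy", "Geek Nights", path))
```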
Have fun

• Clients for many languages
• Lucandra
• Hadoop support
• ...
End


• Questions?
