NoSQL Essentials
Cassandra & Dynamo-like Databases
Buenos Aires, Argentina, Nov 2012
Fernando Rodriguez Olivera
@frodriguez
nosqlessentials.com
Hash Partitioning

Client          Node A = 0
                Node B = 1
                Node C = 2
                Node D = 3
N = 4

hash("hello") mod 4 = 2   (hello goes to node C)
hash("world") mod 4 = 0   (world goes to node A)
hash("bye")   mod 4 = 3   (bye goes to node D)

Difficult to add/remove nodes
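
Why adding or removing nodes is difficult with hash-mod-N partitioning can be sketched in a few lines. This is only an illustration: the MD5-based `stable_hash` stands in for the slide's unspecified hash(), and the key names are made up.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic hash; an assumed stand-in for the slide's hash()."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


def node_for(key: str, n: int) -> int:
    """Index of the node that owns `key` when the cluster has `n` nodes."""
    return stable_hash(key) % n


def moved_fraction(keys, old_n: int, new_n: int) -> float:
    """Fraction of keys whose owning node changes when N changes."""
    moved = sum(1 for k in keys if node_for(k, old_n) != node_for(k, new_n))
    return moved / len(keys)


# Hypothetical keys, only used to measure how many of them change owner.
keys = [f"key-{i}" for i in range(10_000)]
fraction = moved_fraction(keys, 4, 5)
print(f"{fraction:.0%} of keys move when growing from 4 to 5 nodes")
```

A key keeps its owner only when its hash gives the same remainder mod 4 and mod 5, which holds for about one hash in five, so roughly 80% of the data has to move after adding a single node.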
Consistent Hashing / Random Tokens

E.g.: Address Space 0..Max = 0..65535
hash function with range 0..Max

[Ring diagram: 0 / 65535 at the top, then 16384, 32768 and 49152 clockwise; nodes A, B, C and D placed at their tokens]

token A = 33015
token B = 8915
token C = 31541
token D = 40927

hash("hello") = 13209
hash("world") = 36551
hash("bye")   = 60912
Consistent Hashing / Virtual Nodes

4 Virtual Nodes
Random Tokens

token A.1 = ...
token A.2 = ...
token A.3 = ...
token A.4 = ...
token B.1 = ...

[Ring diagram: address space 0..65535 with the virtual nodes of A, B, C and D interleaved around the ring]
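
A small sketch of virtual nodes: each physical node gets several random tokens instead of one. The 0..65535 address space and the 4 virtual nodes per node come from the slides; the seed and the token values it produces are arbitrary and not the elided values from the slide.

```python
import random
from bisect import bisect_left
from collections import Counter

MAX_TOKEN = 65535
VNODES_PER_NODE = 4


def build_vnode_ring(nodes, vnodes_per_node, rng):
    """Assign `vnodes_per_node` distinct random tokens to every node."""
    tokens = rng.sample(range(MAX_TOKEN + 1), len(nodes) * vnodes_per_node)
    ring = []
    for i, node in enumerate(nodes):
        for j in range(vnodes_per_node):
            # Label virtual nodes like the slide does: A.1, A.2, ...
            ring.append((tokens[i * vnodes_per_node + j], f"{node}.{j + 1}"))
    return sorted(ring)


def owner(ring, key_hash: int) -> str:
    """Virtual node that owns key_hash (first token >= hash, wrapping)."""
    ring_tokens = [token for token, _ in ring]
    index = bisect_left(ring_tokens, key_hash)
    return ring[index % len(ring)][1]


rng = random.Random(2012)  # arbitrary seed, only for repeatable output
ring = build_vnode_ring(["A", "B", "C", "D"], VNODES_PER_NODE, rng)
per_node = Counter(label.split(".")[0] for _, label in ring)
print(ring)
print("hello is owned by virtual node", owner(ring, 13209))
```

Because every physical node owns several small ranges scattered around the ring, a node that joins or leaves exchanges data with many peers instead of a single neighbor.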
Consistent Hashing / Manual Placement

Uniform Distribution
Calculated Tokens

Adding/Removing Node Requires Rebalancing

[Ring diagram: nodes 0 to 8 evenly spaced around the ring]
Token Generation

  #> cassandra/tools/bin/token-generator

  Token Generator Interactive Mode
  --------------------------------

    How many datacenters will participate in this Cassandra cluster? 1
    How many nodes are in datacenter #1? 5

  DC #1:
    Node #1:                                        0
    Node #2:   34028236692093846346337460743176821145
    Node #3:   68056473384187692692674921486353642290
    Node #4:  102084710076281539039012382229530463435
    Node #5:  136112946768375385385349842972707284580
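
The five tokens listed above are consistent with splitting a token space of 2**127 into equal steps. The formula below is inferred from that output, not taken from the tool's source, and the function name is made up for this sketch.

```python
# Assumed size of the RandomPartitioner token space; it matches the output above.
TOKEN_SPACE = 2 ** 127


def evenly_spaced_tokens(node_count: int) -> list:
    """Tokens i * (TOKEN_SPACE // node_count) for i = 0 .. node_count - 1."""
    step = TOKEN_SPACE // node_count
    return [i * step for i in range(node_count)]


for number, token in enumerate(evenly_spaced_tokens(5), start=1):
    print(f"Node #{number}: {token:>40}")
```

Changing the node count changes every token except the first, which is why manually placed tokens require rebalancing when nodes are added or removed.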
Partitioning Strategy

RandomPartitioner (consistent hashing)
ByteOrderedPartitioner

Cassandra Documentation from DataStax:
“Unless absolutely required by your application, DataStax strongly
recommends against using the ordered partitioner”
Partitioning and Replication

KeySpace with RF=3
Partition Strategy
Replication Strategy

[Ring diagram: nodes 0 to 8. The client sends key A to a coordinator node, which forwards A to the three replicas: node 2 (R1), node 3 (R2) and node 4 (R3).]
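
A rough sketch of how a partition strategy and a replication strategy combine, assuming SimpleStrategy-like placement: the first replica is the node that owns the key's hash, and the remaining RF - 1 replicas are the next nodes clockwise. The evenly spaced tokens and the hash value chosen for key A are hypothetical; they are picked so that the result matches the diagram (nodes 2, 3 and 4).

```python
from bisect import bisect_left

ADDRESS_SPACE = 65536          # same 0..65535 space as the earlier slides
NODE_IDS = list(range(9))      # nodes 0..8, as in the ring diagram
# Hypothetical evenly spaced tokens: 0, 7281, 14562, ...
RING_TOKENS = [i * (ADDRESS_SPACE // len(NODE_IDS)) for i in NODE_IDS]


def replicas(key_hash: int, rf: int) -> list:
    """Owner of key_hash plus the next rf - 1 nodes clockwise."""
    start = bisect_left(RING_TOKENS, key_hash)
    count = min(rf, len(NODE_IDS))
    return [NODE_IDS[(start + i) % len(NODE_IDS)] for i in range(count)]


hash_of_a = 10000  # hypothetical hash for key A, between node 1 and node 2
print(replicas(hash_of_a, rf=3))
```

Any node can act as coordinator: it runs the same calculation and forwards the request to the replicas it finds.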
Cassandra CLI / Keyspaces

  CREATE KEYSPACE demo
      WITH placement_strategy = 'SimpleStrategy'
      AND strategy_options:replication_factor = 3;

  CREATE KEYSPACE cache
      WITH placement_strategy = 'SimpleStrategy'
      AND strategy_options:replication_factor = 1
      AND durable_writes = 'false';
Cassandra CLI / Static Columns

  CREATE COLUMN FAMILY users
      WITH comparator = UTF8Type
      AND key_validation_class = UTF8Type
      AND column_metadata = [
        { column_name: name,     validation_class: UTF8Type },
        { column_name: password, validation_class: UTF8Type },
        { column_name: country,  validation_class: UTF8Type },
        { column_name: state,    validation_class: UTF8Type }
      ];

(static column family)

  SET users['alankay']['name']    = 'Alan Kay';
  SET users['alankay']['state']   = 'CA';
  SET users['alankay']['country'] = 'US';
Cassandra CLI / Dynamic Columns

  CREATE COLUMN FAMILY posts
      WITH comparator = TimeUUIDType
      AND key_validation_class = UTF8Type
      AND default_validation_class = UTF8Type;

(dynamic column family)

  SET posts['alankay'][timeuuid()] = 'Hello world...';
Cassandra CLI / Counters

  CREATE COLUMN FAMILY page_views
      WITH comparator = UTF8Type
      AND key_validation_class = UTF8Type
      AND default_validation_class = CounterType;

(counter column family)

  INCR page_views['www.google.com']['about.html'] BY 1;
  INCR page_views['www.google.com']['help.html']  BY 1;
CQL
(Cassandra Query Language)

SQL-like language. No joins, aggregation, ...
CQL (Cassandra Query Language)

  CREATE KEYSPACE demo
      WITH strategy_class = 'SimpleStrategy'
      AND strategy_options:replication_factor = 3;

  CREATE TABLE users (
      login    varchar PRIMARY KEY,
      name     varchar,
      password varchar,
      country  varchar,
      state    varchar
  );

  CREATE INDEX users_country ON users(country);
  CREATE INDEX users_state   ON users(state);
CQL (Cassandra Query Language)

  INSERT INTO users (login, name, country, state)
  VALUES ('alankay', 'Alan Kay', 'US', 'CA');

  SELECT *
  FROM users
  WHERE login = 'alankay';

  SELECT *
  FROM users
  WHERE country = 'US' AND state = 'CA';
CQL Counters

  CREATE TABLE login_stats (
      login   varchar,
      success counter,
      failed  counter,
      PRIMARY KEY(login)
  );

  UPDATE login_stats
  SET success = success + 1
  WHERE login = 'alankay';
CQL (Cassandra Query Language)

Internal Type       CQL Type        Notes
BytesType           blob
AsciiType           ascii
UTF8Type            text, varchar
IntegerType         varint          arbitrary-precision
Int32Type           int             4-byte integer
LongType            bigint          8-byte integer
UUIDType            uuid
TimeUUIDType        timeuuid
DateType            timestamp       8 bytes
BooleanType         boolean
FloatType           float
DoubleType          double          8 bytes
DecimalType         decimal         variable-precision
CounterColumnType   counter         distributed counter
Tunable Consistency

Any (Only for Write)
One, Two, Three
Quorum, Local Quorum
ALL, Each Quorum

  SELECT * FROM users USING CONSISTENCY QUORUM WHERE ...

  INSERT INTO users (id, name, ..) VALUES (...)
  USING CONSISTENCY QUORUM
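
To make the levels concrete, the sketch below computes how many replicas must respond before the coordinator answers the client, for the levels that depend only on the replication factor. ANY, Local Quorum and Each Quorum are left out because they depend on hints or on the datacenter layout. The function name is made up for this example.

```python
def required_replicas(level: str, rf: int) -> int:
    """Replica responses needed for a consistency level at replication factor rf."""
    level = level.upper()
    fixed = {"ONE": 1, "TWO": 2, "THREE": 3}
    if level in fixed:
        return fixed[level]
    if level == "QUORUM":
        return rf // 2 + 1  # a majority of the replicas
    if level == "ALL":
        return rf
    raise ValueError(f"level not covered by this sketch: {level}")


for level in ("ONE", "QUORUM", "ALL"):
    print(level, "with RF=3 waits for", required_replicas(level, 3), "replica(s)")
```

With RF=3, QUORUM waits for 2 replicas, so a read or write still succeeds while one replica is down.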
Consistency Level

USING CONSISTENCY ONE

[Ring diagram: nodes 0 to 8. The client sends A to a coordinator node, which forwards A to the replicas: node 2 (R1), node 3 (R2) and node 4 (R3). Once a replica returns an Ack to the coordinator, the coordinator returns an Ack to the client.]
Gossip-based Protocol

[Animated ring diagram: nodes 0 to 8; the slides carry no further text]
Hinted Handoff Writes
                                      0
                                          1



                              8
                                                     2

           Client

                         7

                                              Down       3

     Hints stored for
      down replicas           6

                                              4
If consistency level = ANY,       5
      always writable
Hinted Handoff Writes
                                      0
                                          1



                              8
                     A                               2

           Client

                         7

                                              Down       3

     Hints stored for
      down replicas           6

                                              4
If consistency level = ANY,       5
      always writable
Hinted Handoff Writes
                                           0
                                               1
                         Coordinator


                              8
                     A                                    2

           Client

                         7

                                                   Down       3

     Hints stored for
      down replicas           6

                                                   4
If consistency level = ANY,            5
      always writable
Hinted Handoff Writes
                                           0
                                               1
                         Coordinator


                              8
                     A                                      2   R1

           Client

                         7

                                                   Down         3    R2

     Hints stored for
      down replicas           6

                                                   4   R3
If consistency level = ANY,            5
      always writable
Hinted Handoff Writes

[Diagram: ring with positions 0 to 8. The client sends write A to a coordinator, which forwards it to replicas R1 (position 2), R2 (position 3) and R3 (position 4). R2 is down, so hints 3:A and, in the last frame, 3:B are stored for it.]

Hints stored for down replicas
If consistency level = ANY, always writable
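The hint flow in the diagram can be sketched roughly as below. This is a minimal in-memory model, not Cassandra's implementation: the `Node` and `Coordinator` classes and their methods are hypothetical names chosen for illustration.

```python
class Node:
    """A replica that can be up or down and stores key/value pairs."""
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}


class Coordinator:
    """Forwards writes to replicas and keeps hints for the ones that are down."""
    def __init__(self, replicas):
        self.replicas = replicas
        self.hints = {}  # replica name -> [(key, value), ...]  (hypothetical layout)

    def write(self, key, value):
        """Return the number of replicas that acknowledged the write."""
        acks = 0
        for node in self.replicas:
            if node.up:
                node.data[key] = value
                acks += 1
            else:
                # Replica is down: remember the write so it can be delivered later.
                self.hints.setdefault(node.name, []).append((key, value))
        return acks

    def replay_hints(self):
        """Deliver stored hints to replicas that are back up."""
        for node in self.replicas:
            if node.up and node.name in self.hints:
                for key, value in self.hints.pop(node.name):
                    node.data[key] = value
```

With R2 down, a write reaches R1 and R3 and leaves one hint for R2. Once R2 is marked up again, `replay_hints()` delivers the missed write and clears the hint.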
Anti-Entropy / Read Repair

[Diagram: ring with positions 0 to 8. The client's read goes to a coordinator, which sends one Query and two DigestQuery messages to replicas R1 (position 2), R2 (position 3) and R3 (position 4).]

KeySpace with RF=3
read_repair_chance by column family
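The Query / DigestQuery exchange can be sketched as below. This is a simplified model under stated assumptions: every replica is a plain dict holding `(value, timestamp)` for the key, the newest timestamp wins, and the repair always runs (the sketch does not model `read_repair_chance`). The function names are hypothetical and MD5 is used only for illustration.

```python
import hashlib


def digest(value, timestamp):
    """Hash of a (value, timestamp) pair, standing in for a digest response."""
    return hashlib.md5(f"{value}:{timestamp}".encode()).hexdigest()


def read_with_repair(replicas, key):
    """replicas: list of dicts mapping key -> (value, timestamp)."""
    data = replicas[0][key]                             # full Query to one replica
    digests = [digest(*r[key]) for r in replicas[1:]]   # DigestQuery to the others
    if all(d == digest(*data) for d in digests):
        return data[0]                                  # replicas agree
    # Mismatch: compare full data, keep the newest timestamp, repair stale replicas.
    newest = max((r[key] for r in replicas), key=lambda vt: vt[1])
    for r in replicas:
        if r[key] != newest:
            r[key] = newest
    return newest[0]
```

Digests keep the common case cheap: only one replica returns the full value, and the others are compared by hash.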
Anti-Entropy / Node Repair

[Diagram: ring with positions 0 to 8. Nodes on the ring exchange TreeRequest / TreeResponse messages.]

node repair
disk expensive / network efficient
Merkle Trees

[Diagram: two hash trees shown side by side, each with the structure below.]

Top Hash
  Hash 1-2
    Hash 1
    Hash 2
  Hash 3-4
    Hash 3
    Hash 4
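A small sketch of how two such trees can be compared. It assumes the number of data blocks is a power of two and uses SHA-256 purely for illustration; the function names are hypothetical.

```python
import hashlib


def h(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_tree(blocks):
    """Return a list of levels: leaf hashes first, top hash last.

    Assumes len(blocks) is a power of two.
    """
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        level = [h((level[i] + level[i + 1]).encode())
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_leaves(a, b):
    """Walk both trees from the top hash down; return indexes of leaves that differ."""
    top = len(a) - 1
    if a[top][0] == b[top][0]:
        return []                      # top hashes match: nothing to exchange
    suspects = [0]
    for depth in range(top - 1, -1, -1):
        # Only descend into children whose hashes differ.
        suspects = [c for i in suspects for c in (2 * i, 2 * i + 1)
                    if a[depth][c] != b[depth][c]]
    return suspects
```

Only the subtrees whose hashes differ are descended into, which is why exchanging trees is light on the network even though building them means reading the data from disk.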
Multi Datacenter Partitioning

[Diagram: two rings, DataCenter 1 and DataCenter 2. The client sends write A to a node in DataCenter 1; A is replicated to several nodes in DataCenter 1 and then to nodes in DataCenter 2.]
Replica Placement

                            SimpleStrategy
                             (adjacent nodes)

  CREATE KEYSPACE demo
      WITH strategy_class = 'SimpleStrategy'
      AND strategy_options:replication_factor = 3;
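The "adjacent nodes" rule can be sketched as follows, assuming a ring identified only by node tokens. The function name and the integer tokens are illustrative, not part of Cassandra's API.

```python
from bisect import bisect_left


def simple_strategy(ring_tokens, key_token, replication_factor):
    """Return the tokens of the nodes holding a key.

    The first replica is the first node whose token is >= key_token
    (wrapping around the ring); the rest are the next nodes clockwise.
    """
    tokens = sorted(ring_tokens)
    start = bisect_left(tokens, key_token) % len(tokens)
    count = min(replication_factor, len(tokens))
    return [tokens[(start + i) % len(tokens)] for i in range(count)]
```

On a ring with tokens 0 to 8 and a replication factor of 3, a key with token 2 lands on nodes 2, 3 and 4, matching R1, R2 and R3 in the earlier diagrams.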
Replica Placement

              NetworkTopologyStrategy
                (replication by datacenter)

  CREATE KEYSPACE demo
      WITH strategy_class = 'NetworkTopologyStrategy'
      AND strategy_options:DC1 = 3
      AND strategy_options:DC2 = 2;
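A rough sketch of per-datacenter placement: walk the ring clockwise from the key and take nodes until each datacenter has its configured number of replicas. It deliberately ignores racks, which the real strategy also takes into account; the function name and data layout are assumptions.

```python
def network_topology_strategy(ring, key_token, replicas_per_dc):
    """ring: list of (token, datacenter) pairs.

    Returns the chosen (token, datacenter) pairs in ring order from the key.
    """
    nodes = sorted(ring)
    # First node whose token is >= key_token, wrapping to index 0.
    start = next((i for i, (t, _) in enumerate(nodes) if t >= key_token), 0)
    remaining = dict(replicas_per_dc)
    chosen = []
    for i in range(len(nodes)):
        token, dc = nodes[(start + i) % len(nodes)]
        if remaining.get(dc, 0) > 0:
            chosen.append((token, dc))
            remaining[dc] -= 1
    return chosen
```

Called with `{"DC1": 3, "DC2": 2}`, the settings from the keyspace definition above, it returns three nodes from DC1 and two from DC2 when the ring has enough nodes in each.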
Topology Discovery

           SimpleSnitch
         (single datacenter)


             Ec2Snitch
 (region as datacenter, availability zone as rack)


       PropertyFileSnitch
  (cassandra-topology.properties)


     RackInferringSnitch
    (10.DataCenter.Rack.Node)
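The `10.DataCenter.Rack.Node` convention can be illustrated with a few lines of Python. This is a sketch of the idea only; `infer_location` is a hypothetical helper, not Cassandra code.

```python
def infer_location(ip):
    """Split a dotted IPv4 address into (datacenter, rack, node).

    The 2nd octet is read as the datacenter, the 3rd as the rack
    and the 4th as the node.
    """
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError(f"not an IPv4 address: {ip!r}")
    _, datacenter, rack, node = octets
    return datacenter, rack, node
```

The convention needs no configuration file, but it only works when the IP addressing plan follows the datacenter and rack layout.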
Property File Snitch

  cassandra-topology.properties

  66.160.141.216 = DC1:RAC1
  66.160.141.217 = DC1:RAC1
  66.160.141.218 = DC1:RAC1

  174.129.20.82 = DC2:RAC1
  174.129.20.83 = DC2:RAC1
  174.129.30.60 = DC2:RAC2
  174.129.30.61 = DC2:RAC2
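To make the file format concrete, here is a minimal parser for `ip = DC:RACK` lines. It is an illustration of the format shown above, not the snitch's own loader; `parse_topology` is a hypothetical name.

```python
def parse_topology(text):
    """Parse 'ip = DC:RACK' lines into {ip: (datacenter, rack)}.

    Blank lines and lines starting with '#' are skipped.
    """
    topology = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        ip, location = (part.strip() for part in line.split("=", 1))
        datacenter, rack = location.split(":", 1)
        topology[ip] = (datacenter, rack)
    return topology
```

Each node can then look up the datacenter and rack of any peer by IP address.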
Wide Rows
                    (Composite Primary Key)

  CREATE TABLE page_views (
      domain varchar,
      page varchar,
      hits counter,
      PRIMARY KEY (domain, page)
  );

  UPDATE page_views
  SET hits = hits + 1
  WHERE domain = 'www.google.com' AND page = '/faq.html';
Wide Rows
                     (Composite Primary Key)

  CREATE TABLE metrics (
      name text,
      day int,
      value counter,
      PRIMARY KEY (name, day)
  );

  UPDATE metrics
  SET value = value + 1
  WHERE name = 'google.com' AND day = 20121201;

  SELECT *
  FROM metrics
  WHERE day > 20121201 AND day < 20121205
        AND name = 'google.com';
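One way to picture the wide row behind `PRIMARY KEY (name, day)`: the first key component selects the row, and the second orders the columns inside it, so a range over `day` is a slice of a sorted list. The class below is a toy in-memory model under that assumption; its names are hypothetical.

```python
from bisect import bisect_left, bisect_right, insort


class WideRowTable:
    """Toy model: partition key -> counters ordered by clustering key."""

    def __init__(self):
        self.rows = {}    # name -> {day: counter value}
        self.order = {}   # name -> sorted list of days

    def increment(self, name, day, delta=1):
        row = self.rows.setdefault(name, {})
        if day not in row:
            insort(self.order.setdefault(name, []), day)
            row[day] = 0
        row[day] += delta

    def slice(self, name, after, before):
        """Columns with after < day < before, in clustering order."""
        days = self.order.get(name, [])
        lo = bisect_right(days, after)
        hi = bisect_left(days, before)
        return [(day, self.rows[name][day]) for day in days[lo:hi]]
```

In this model the SELECT above corresponds to `slice('google.com', 20121201, 20121205)`: it touches a single row and reads a contiguous run of columns.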
Wide Rows
                    (Composite Primary Key)

  CREATE TABLE tweets (
      tweet_id uuid PRIMARY KEY,
      author varchar,
      body varchar
  );

  CREATE TABLE timeline (
      user_id varchar,
      tweet_id uuid,   // uuid with time as prefix (timeuuid)
      author varchar,
      body varchar,
      PRIMARY KEY (user_id, tweet_id)
  );
Atomic Batches (1.2+)

  BEGIN BATCH USING CONSISTENCY QUORUM

      INSERT INTO tweets (user_id, tweet_id, author, body)
      VALUES ('alankay', ..., 'alan kay', '...')

      INSERT INTO timeline (user_id, tweet_id, author, body)
      VALUES ('other', '...', 'alankay', '...')

  APPLY BATCH;


  CREATE TABLE batchlog (
      id uuid PRIMARY KEY,
      written_at timestamp,
      data blob
  );
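The role of the batchlog can be sketched roughly as: log the whole batch first, apply the mutations, then drop the log entry; anything still in the log after a failure gets replayed. The model below is single-process and heavily simplified (it has no `written_at` handling and no replication), and `BatchlogStore` with its methods is a hypothetical name.

```python
import uuid


class BatchlogStore:
    """Toy model of a batchlog-backed atomic batch."""

    def __init__(self):
        self.batchlog = {}   # batch id -> list of mutations
        self.tables = {}     # table name -> {primary key: row}

    def apply_batch(self, mutations, fail_after=None):
        """mutations: list of (table, key, row).

        fail_after simulates a crash after that many mutations were applied.
        """
        batch_id = uuid.uuid4()
        self.batchlog[batch_id] = list(mutations)   # 1. log the whole batch
        for i, mutation in enumerate(mutations):    # 2. apply each mutation
            if fail_after is not None and i >= fail_after:
                return batch_id                     # crash: entry stays logged
            self._apply(mutation)
        del self.batchlog[batch_id]                 # 3. done: drop the log entry
        return batch_id

    def replay(self):
        """Re-apply every batch still present in the batchlog."""
        for batch_id in list(self.batchlog):
            for mutation in self.batchlog.pop(batch_id):
                self._apply(mutation)

    def _apply(self, mutation):
        table, key, row = mutation
        self.tables.setdefault(table, {})[key] = row
```

Because writes here are plain overwrites, replaying a partially applied batch is safe: applying the same mutation twice leaves the same row.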
Collections / Sets (1.2+)

  CREATE TABLE users (
      login text PRIMARY KEY,
      name text,
      emails set<text>
  );

  INSERT INTO users (login, name, emails)
  VALUES ('alankay', 'Alan Kay', { 'alan@kay.com' });

  UPDATE users
  SET emails = emails + { 'a@b.com' }
  WHERE login = 'alankay';
Collections / Maps (1.2+)

  CREATE TABLE users (
      login text PRIMARY KEY,
      name text,
      social_ids map<text, text>
  );

  INSERT INTO users (login, name, social_ids)
  VALUES ('alankay', 'Alan Kay', { 'twitter' : 'alankay' });

  UPDATE users
  SET social_ids['google'] = '+alankay'
  WHERE login = 'alankay';
Collections / Lists (1.2+)

  CREATE TABLE users (
      login text PRIMARY KEY,
      name text,
      creditcards list<text>
  );

  INSERT INTO users (login, name, creditcards)
  VALUES ('alankay', 'Alan Kay', [ '1234-' ]);

  UPDATE users
  SET creditcards = creditcards + [ '2345-' ]
  WHERE login = 'alankay';
Cassandra Clients

Shells
  Cassandra-CLI
  CQLSH

Drivers
  Java: CQL / JDBC

Mappings
  Java: Apache Gora
  Java: Kundera (JPA)

High Level APIs
  Java: Hector Client API
  Java: Astyanax (Netflix)
  Scala: Cassie (Twitter)
  Python: PyCassa Client API
  PHP: PhpCassa Client API

Low Level
  Thrift (multi language)
Thanks,
  Fernando Rodriguez Olivera

 twitter: @frodriguez
    mail: frodriguez <at> gmail.com
 website: nosqlessentials.com


   Next course (Spanish only):
Hadoop/HBase/Cassandra/MongoDB
  Buenos Aires, 18/19 Dec 2012:
 Registration: nosqlessentials.com

NoSQL Essentials: Cassandra

  • 1. NoSQL Essentials Cassandra & Dynamo-like Databases Buenos Aires, Argentina, Nov 2012 Fernando Rodriguez Olivera @frodriguez nosqlessentials.com
  • 2-9. Hash Partitioning. Four nodes (N = 4) are numbered A = 0, B = 1, C = 2, D = 3, and the client routes each key to node hash(key) mod N: hash("hello") mod 4 = 2 goes to C, hash("world") mod 4 = 0 goes to A, hash("bye") mod 4 = 3 goes to D. Drawback: it is difficult to add or remove nodes, because changing N changes the result of the mod for most keys.
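The cost of resizing under hash(key) mod N can be estimated with a short sketch. This is an illustration only: the MD5-based `stable_hash` helper and the `key-…` names are assumptions (the slides do not say which hash function is used, so the slide's values for "hello", "world" and "bye" will not be reproduced).

```python
import hashlib

def stable_hash(key: str) -> int:
    """Deterministic hash (MD5 read as an integer); Python's built-in hash() is salted per process."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")

def node_for(key: str, n: int) -> int:
    """Hash partitioning: node index = hash(key) mod N."""
    return stable_hash(key) % n

def moved_fraction(keys, old_n: int, new_n: int) -> float:
    """Fraction of keys whose node index changes when the cluster is resized."""
    moved = sum(1 for k in keys if node_for(k, old_n) != node_for(k, new_n))
    return moved / len(keys)

# Hypothetical key set, only used to measure how much data would move.
keys = [f"key-{i}" for i in range(10_000)]
print(f"moved going from 4 to 5 nodes: {moved_fraction(keys, 4, 5):.0%}")
```

A key keeps its node only when hash mod 4 equals hash mod 5, which holds for 4 of every 20 hash values, so about 80% of the keys are expected to move when a fifth node is added.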
  • 10-20. Consistent Hashing / Random Tokens. The address space 0..Max (e.g. 0..65535) is drawn as a ring, with 0/65535 at the top and 16384, 32768, 49152 clockwise; the hash function has range 0..Max. Each node takes a random token on the ring: token A = 33015, token B = 8915, token C = 31541, token D = 40927. Keys hash onto the same ring: hash("hello") = 13209, hash("world") = 36551, hash("bye") = 60912. Each key is stored on the next node clockwise from its hash: hello on C, world on D, and bye wraps past Max to B.
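The ring lookup can be sketched with a sorted token list and a binary search. The tokens and key hashes below are copied from the slides; the function names are illustrative, and the "first token at or after the key hash, clockwise" rule is assumed here as the ownership convention.

```python
from bisect import bisect_left

# Tokens and key hashes copied from the slides (address space 0..65535).
TOKENS = {"A": 33015, "B": 8915, "C": 31541, "D": 40927}
KEY_HASHES = {"hello": 13209, "world": 36551, "bye": 60912}

def build_ring(tokens):
    """Return (token, node) pairs sorted by token, i.e. in clockwise order."""
    return sorted((token, node) for node, token in tokens.items())

def owner(ring, key_hash):
    """First node clockwise whose token is >= key_hash, wrapping past Max to the lowest token."""
    positions = [token for token, _ in ring]
    index = bisect_left(positions, key_hash) % len(ring)
    return ring[index][1]

ring = build_ring(TOKENS)
for key, key_hash in KEY_HASHES.items():
    print(key, "->", owner(ring, key_hash))
```

Adding a node only inserts one more token into the sorted list, so only the keys between the new token and its predecessor change owner.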
  • 21. Consistent Hashing / Virtual Nodes. Ring 0..65535 with 4 virtual nodes per physical node (A, B, C, D), each virtual node holding its own random token: token A.1 = ..., token A.2 = ..., token A.3 = ..., token A.4 = ..., token B.1 = ..., so every physical node appears at several points around the ring.
  • 22. Consistent Hashing / Manual Placement. Ring of nodes 0 to 8 with calculated tokens: uniform distribution, but adding or removing a node requires rebalancing.
  • 23. Token Generation
      #> cassandra/tools/bin/token-generator
      Token Generator Interactive Mode
      --------------------------------
      How many datacenters will participate in this Cassandra cluster? 1
      How many nodes are in datacenter #1? 5
      DC #1:
        Node #1:                                        0
        Node #2:   34028236692093846346337460743176821145
        Node #3:   68056473384187692692674921486353642290
        Node #4:  102084710076281539039012382229530463435
        Node #5:  136112946768375385385349842972707284580
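The five tokens printed above are evenly spaced over a ring of size 2**127. A minimal sketch that reproduces them follows; `initial_tokens` is a hypothetical helper rather than part of the token-generator tool, and it covers only the single-datacenter case shown on the slide.

```python
def initial_tokens(node_count, ring_size=2**127):
    """Evenly spaced tokens: node i (counting from 0) gets i * (ring_size // node_count)."""
    step = ring_size // node_count
    return [i * step for i in range(node_count)]

for number, token in enumerate(initial_tokens(5), start=1):
    print(f"Node #{number}: {token:>40}")
```

Note that the step is rounded down once and then multiplied, which is what makes the result match the slide digit for digit.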
  • 24. Partitioning Strategy. Options: RandomPartitioner (consistent hashing) and ByteOrderedPartitioner. Cassandra documentation from DataStax: "Unless absolutely required by your application, DataStax strongly recommends against using the ordered partitioner."
  • 25-29. Partitioning and Replication. Ring of nodes 0 to 8; keyspace with RF=3. The client sends key A to a coordinator (node 1 in the diagram). The partition strategy and the replication strategy together select the replicas, shown as R1 on node 2, R2 on node 3 and R3 on node 4, and the coordinator forwards A to all three.
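Replica selection on adjacent nodes (the SimpleStrategy behaviour described later in the deck) can be sketched as a clockwise walk from the key's owner. The node names and tokens below are invented for illustration; they only mimic the nine-node diagram.

```python
from bisect import bisect_left

def simple_replicas(tokens, key_token, rf):
    """Owner of key_token plus the next rf - 1 nodes clockwise on the ring."""
    ring = sorted((token, node) for node, token in tokens.items())
    positions = [token for token, _ in ring]
    start = bisect_left(positions, key_token) % len(ring)
    count = min(rf, len(ring))  # cannot place more replicas than there are nodes
    return [ring[(start + i) % len(ring)][1] for i in range(count)]

# Hypothetical ring: node i holds token i * 100.
tokens = {f"node{i}": i * 100 for i in range(9)}
print(simple_replicas(tokens, key_token=150, rf=3))
```

With these made-up tokens, a key hashing to 150 lands on node2, node3 and node4, the same R1/R2/R3 layout as the diagram.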
  • 30. Cassandra CLI / Keyspaces
      CREATE KEYSPACE demo
          WITH placement_strategy = 'SimpleStrategy'
          AND strategy_options:replication_factor = 3;
      CREATE KEYSPACE cache
          WITH placement_strategy = 'SimpleStrategy'
          AND strategy_options:replication_factor = 1
          AND durable_writes = 'false';
  • 31. Cassandra CLI / Static Columns (static column family)
      CREATE COLUMN FAMILY users
          WITH comparator = UTF8Type
          AND key_validation_class = UTF8Type
          AND column_metadata = [
              { column_name: name,     validation_class: UTF8Type },
              { column_name: password, validation_class: UTF8Type },
              { column_name: country,  validation_class: UTF8Type },
              { column_name: state,    validation_class: UTF8Type }
          ];
      SET users['alankay']['name']    = 'Alan Kay';
      SET users['alankay']['state']   = 'CA';
      SET users['alankay']['country'] = 'US';
  • 32. Cassandra CLI / Dynamic Columns (dynamic column family)
      CREATE COLUMN FAMILY posts
          WITH comparator = TimeUUIDType
          AND key_validation_class = UTF8Type
          AND default_validation_class = UTF8Type;
      SET posts['alankay'][timeuuid()] = 'Hello world...';
  • 33. Cassandra CLI / Counters (counter column family)
      CREATE COLUMN FAMILY page_views
          WITH comparator = UTF8Type
          AND key_validation_class = UTF8Type
          AND default_validation_class = CounterType;
      INCR page_views['www.google.com']['about.html'] BY 1;
      INCR page_views['www.google.com']['help.html'] BY 1;
  • 34. CQL (Cassandra Query Language) SQL-like language. No joins, aggregation, ...
  • 35. CQL (Cassandra Query Language)
      CREATE KEYSPACE demo
          WITH strategy_class = 'SimpleStrategy'
          AND strategy_options:replication_factor = 3;
      CREATE TABLE users (
          login    varchar PRIMARY KEY,
          name     varchar,
          password varchar,
          country  varchar,
          state    varchar
      );
      CREATE INDEX users_country ON users(country);
      CREATE INDEX users_state   ON users(state);
  • 36. CQL (Cassandra Query Language)
      INSERT INTO users (login, name, country, state)
          VALUES ('alankay', 'Alan Kay', 'US', 'CA');
      SELECT * FROM users WHERE login = 'alankay';
      SELECT * FROM users WHERE country = 'US' AND state = 'CA';
  • 37. CQL Counters
      CREATE TABLE login_stats (
          login   varchar,
          success counter,
          failed  counter,
          PRIMARY KEY (login)
      );
      UPDATE login_stats SET success = success + 1 WHERE login = 'alankay';
  • 38. CQL (Cassandra Query Language)
      Type               CQL            Notes
      BytesType          blob
      AsciiType          ascii
      UTF8Type           text, varchar
      IntegerType        varint         arbitrary-precision
      Int32Type          int            4-byte integer
      LongType           bigint         8-byte integer
      UUIDType           uuid
      TimeUUIDType       timeuuid
      DateType           timestamp      8 bytes
      BooleanType        boolean
      FloatType          float
      DoubleType         double         8 bytes
      DecimalType        decimal        variable-precision
      CounterColumnType  counter        distributed counter
  • 39. Tunable Consistency. Levels: Any (only for writes), One, Two, Three, Quorum, Local Quorum, Each Quorum, All.
      SELECT * FROM users USING CONSISTENCY QUORUM WHERE ...
      INSERT INTO users (id, name, ..) VALUES (...) USING CONSISTENCY QUORUM
  • 40-45. Consistency Level. Ring of nodes 0 to 8; the client writes A USING CONSISTENCY ONE. The coordinator (node 1) sends A to replicas R1 (node 2), R2 (node 3) and R3 (node 4), and returns an Ack to the client as soon as one replica has acknowledged the write.
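How many acknowledgements a level waits for can be expressed as simple arithmetic. The sketch below assumes the usual majority definition of QUORUM and the R + W > RF overlap rule; the function names are illustrative, not Cassandra APIs.

```python
def quorum(rf: int) -> int:
    """Majority of rf replicas."""
    return rf // 2 + 1

def overlaps(read_acks: int, write_acks: int, rf: int) -> bool:
    """True when every read set must intersect every write set (R + W > RF)."""
    return read_acks + write_acks > rf

rf = 3
print("QUORUM needs", quorum(rf), "of", rf, "replicas")
print("ONE/ONE overlap:", overlaps(1, 1, rf))
print("QUORUM/QUORUM overlap:", overlaps(quorum(rf), quorum(rf), rf))
```

With RF=3, reading and writing at ONE does not guarantee that a read sees the latest write, while QUORUM on both sides does, at the cost of waiting for two replicas.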
  • 46-53. Gossip-based Protocol (animated diagram of a ring of nodes 0 to 8; the slides carry no other text)
  • 54-60. Hinted Handoff Writes. Ring of nodes 0 to 8 with node 3 (replica R2) down. The coordinator (node 1) writes A to the live replicas R1 (node 2) and R3 (node 4). Hints are stored for down replicas: the diagram shows "Hint 3:A" and later "Hint 3:B" being held for node 3. If consistency level = ANY, the cluster is always writable.
  • 61-66. Anti-Entropy / Read Repair. Ring of nodes 0 to 8; keyspace with RF=3. On a read, the coordinator (node 1) contacts the replicas R1 (node 2), R2 (node 3) and R3 (node 4), sending a Query to one and a DigestQuery to the other two. read_repair_chance is set by column family.
  • 67-68. Anti-Entropy / Node Repair. Ring of nodes 0 to 8; nodes exchange TreeRequest/TreeResponse messages. Node repair is disk expensive but network efficient.
  • 69. Merkle Trees. Two identical trees are shown side by side: a Top Hash over Hash 1-2 and Hash 3-4, which in turn cover the leaf hashes Hash 1, Hash 2, Hash 3 and Hash 4.
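Comparing two such trees from the top hash downward finds the differing leaves without shipping the data itself. The sketch below is a generic Merkle comparison under stated assumptions (SHA-256, a power-of-two number of blocks, made-up row contents); it is not Cassandra's actual repair code.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(blocks):
    """Return levels from leaves up to the top hash; assumes len(blocks) is a power of two."""
    level = [h(block) for block in blocks]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def differing_leaves(a, b):
    """Walk down from the top hash, descending only into subtrees whose hashes differ."""
    top = len(a) - 1
    if a[top][0] == b[top][0]:
        return []
    suspects = [0]
    for depth in range(top - 1, -1, -1):
        suspects = [child for parent in suspects
                    for child in (2 * parent, 2 * parent + 1)
                    if a[depth][child] != b[depth][child]]
    return suspects

# Hypothetical replica contents: the third block is stale on replica 2.
replica_1 = build_tree([b"row1", b"row2", b"row3", b"row4"])
replica_2 = build_tree([b"row1", b"row2", b"row3-stale", b"row4"])
print(differing_leaves(replica_1, replica_2))
```

Only hashes cross the network until a mismatch is located, which fits the "disk expensive / network efficient" trade-off: every block must be read and hashed locally, but matching subtrees are never transferred.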
  • 70-78. Multi Datacenter Partitioning (animated diagram: two rings labelled DataCenter 1 and DataCenter 2; a client writes A, and copies of A appear on more nodes in both rings as the animation advances)
  • 79. Replica Placement: SimpleStrategy (adjacent nodes)
      CREATE KEYSPACE demo
          WITH strategy_class = 'SimpleStrategy'
          AND strategy_options:replication_factor = 3;
  • 80. Replica Placement: NetworkTopologyStrategy (replication by datacenter)
      CREATE KEYSPACE demo
          WITH strategy_class = 'NetworkTopologyStrategy'
          AND strategy_options:DC1 = 3
          AND strategy_options:DC2 = 2;
  • 81. Topology Discovery
      SimpleSnitch (single datacenter)
      EC2Snitch (region as datacenter, availability zone as rack)
      PropertyFileSnitch (cassandra-topology.properties)
      RackInferringSnitch (10.DataCenter.Rack.Node)
  • 82. Property File Snitch: cassandra-topology.properties
      66.160.141.216 = DC1:RAC1
      66.160.141.217 = DC1:RAC1
      66.160.141.218 = DC1:RAC1
      174.129.20.82 = DC2:RAC1
      174.129.20.83 = DC2:RAC1
      174.129.30.60 = DC2:RAC2
      174.129.30.61 = DC2:RAC2
  • 83. Wide Rows (Composite Primary Key)
      CREATE TABLE page_views (
          domain varchar,
          page   varchar,
          hits   counter,
          PRIMARY KEY (domain, page)
      );
      UPDATE page_views SET hits = hits + 1
          WHERE domain = 'www.google.com' AND page = '/faq.html';
  • 84. Wide Rows (Composite Primary Key)
      CREATE TABLE metrics (
          name  text,
          day   int,
          value counter,
          PRIMARY KEY (name, day)
      );
      UPDATE metrics SET value = value + 1
          WHERE name = 'google.com' AND day = 20121201;
      SELECT * FROM metrics
          WHERE day > 20121201 AND day < 20121205 AND name = 'google.com';
  • 85. Wide Rows (Composite Primary Key)
      CREATE TABLE tweets (
          tweet_id uuid PRIMARY KEY,
          author   varchar,
          body     varchar
      );
      CREATE TABLE timeline (
          user_id  varchar,
          tweet_id uuid,      // uuid with time as prefix (timeuuid)
          author   varchar,
          body     varchar,
          PRIMARY KEY (user_id, tweet_id)
      );
  • 86. Atomic Batches (1.2+)
      BEGIN BATCH USING CONSISTENCY QUORUM
          INSERT INTO tweets (tweet_id, author, body)
              VALUES (..., 'alan kay', '...')
          INSERT INTO timeline (user_id, tweet_id, author, body)
              VALUES ('other', '...', 'alankay', '...')
      APPLY BATCH
      CREATE TABLE batchlog (
          id         uuid PRIMARY KEY,
          written_at timestamp,
          data       blob
      )
  • 87. Collections / Sets (1.2+)
      CREATE TABLE users (
          login  text PRIMARY KEY,
          name   text,
          emails set<text>
      );
      INSERT INTO users (login, name, emails)
          VALUES ('alankay', 'Alan Kay', { 'alan@kay.com' });
      UPDATE users SET emails = emails + { 'a@b.com' } WHERE login = 'alankay';
  • 88. Collections / Maps (1.2+)
      CREATE TABLE users (
          login      text PRIMARY KEY,
          name       text,
          social_ids map<text, text>
      );
      INSERT INTO users (login, name, social_ids)
          VALUES ('alankay', 'Alan Kay', { 'twitter' : 'alankay' });
      UPDATE users SET social_ids['google'] = '+alankay' WHERE login = 'alankay';
  • 89. Collections / Lists (1.2+)
      CREATE TABLE users (
          login       text PRIMARY KEY,
          name        text,
          creditcards list<text>
      );
      INSERT INTO users (login, name, creditcards)
          VALUES ('alankay', 'Alan Kay', [ '1234-' ]);
      UPDATE users SET creditcards = creditcards + [ '2345-' ] WHERE login = 'alankay';
  • 90. Cassandra Clients
      Shells: Cassandra-CLI, CQLSH
      Drivers: Java: CQL / JDBC
      Mappings: Java: Apache Gora; Java: Kundera (JPA)
      High Level APIs: Java: Hector Client API; Java: Astyanax (Netflix); Scala: Cassie (Twitter); Python: PyCassa Client API; PHP: PhpCassa Client API
      Low Level: Thrift (multi language)
  • 91. Thanks, Fernando Rodriguez Olivera twitter:  @frodriguez      mail:  frodriguez  <at>  gmail.com website:  nosqlessentials.com Next course (Spanish only): Hadoop/HBase/Cassandra/MongoDB Buenos Aires, 18/19 Dec 2012: Registration: nosqlessentials.com