Cassandra data modelling best practices:
1. Composite Type use through an API client is not recommended.
2. Super column family use is not recommended, as it deserializes all the columns on usage, as against deserialization of a single column.
3. We can create wide rows (many columns, few rows) and skinny rows (few columns, many rows).
4. Valueless column: if Row id = {City+uid} and we want to write/read only City, then uid can be an empty or valueless column.
5. A column can be expired based on a TTL set in seconds.
6. Counter columns store a number that incrementally counts the occurrences of a particular event or process. For example, you might use a counter column to count the number of times a page is viewed.
7. Keyspace: a cluster has one keyspace per application. It is the top-level container for Column Families.
Column Family: A container for Row Keys and Columns
Row Key: The unique identifier for data stored within a Column Family
Column: Name-Value pair with an additional field: timestamp
Super Column: A dictionary of Columns stored under a Row Key.
8. RandomPartitioner (RP) is the recommended partitioning scheme. It has the following advantages over ordered partitioning as in the Byte Ordered Partitioner (BOP).
RandomPartitioner: It uses a hash of the Row Key to determine which node in the cluster will be responsible for the data. The hash value is generated by computing MD5 on the Row Key. Each node in the cluster in a data center is assigned a section of the hash range (a token range) and is responsible for storing the data whose Row Key's hash value falls within this range.
Token Range = (2^127) ÷ (# of nodes in the cluster)
If the cluster spans multiple data centers, the tokens are calculated for each data center individually, which is better.
Byte Ordered Partitioner (BOP): It allows you to calculate your own tokens and assign them to nodes yourself, as opposed to RandomPartitioner automatically doing this for you.
9. Partitioning => picking out one node to store the first copy of the data on.
Replication => picking out additional nodes to store more copies of the data.
Storage => writes are recorded in the commit log (durability) and in memtables (in-memory structures); memtables are flushed to SSTables, and compaction merges SSTables to remove stale data and tombstones (indicators that data was deleted).
10. The binary protocol is faster than Thrift.
11. Why RP?
1. RP ensures that the data is evenly distributed across all nodes in the cluster and does not create data hotspots as BOP can.
2. When a new node is added to the cluster, RP can quickly assign it a new token range and move the minimum amount of data from other nodes to the new node, which is now responsible for it. With BOP, this has to be done manually.
3. Multiple Column Families issue: BOP can cause uneven distribution of data if you have multiple column families.
4. The only benefit that BOP has over RP is that it allows you to do row slices. You can obtain a cursor, as in an RDBMS, and move over your rows.
12. Think of a column family as a map of a map.
SortedMap<RowKey, SortedMap<ColumnKey, ColumnValue>>
A map gives efficient key lookup, and the sorted nature gives efficient scans. In Cassandra, we can use row keys and column keys to do efficient lookups and range scans.
13. The number of column keys is unbounded. In other words, you can have wide rows.
A column key can itself hold a value. In other words, you can have a valueless column.
14. You need to pass the timestamp with each column value, for Cassandra to use internally for conflict resolution. However, the timestamp can be safely ignored during modeling.
15. Start with query patterns and create an ER model. Then start denormalizing and duplicating. This helps to identify the most frequent query patterns and isolate the less frequent.
Query pattern:
Get user by user id
Get item by item id
Get all the items that a particular user likes
Get all the users who like a particular item
Option 1: Exact replica of relational model.
Option 2: Normalized entities with custom indexes
Option 3: Normalized entities with de-normalization into custom indexes
Option 4: Partially de-normalized entities
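The map-of-maps view from item 12 and the four query patterns can be sketched in memory. This is only an illustration, roughly in the spirit of Option 3: plain Python dicts stand in for column families, and every name (users, items, user_likes, item_likes, like) is hypothetical rather than taken from the slides.

```python
from collections import defaultdict

# Each "column family" is a map of maps: row key -> {column key -> column value}.
# All names here are hypothetical and exist only for this sketch.
users = {}                      # row key: user id
items = {}                      # row key: item id
user_likes = defaultdict(dict)  # row key: user id, column key: item id
item_likes = defaultdict(dict)  # row key: item id, column key: user id


def like(user_id, item_id):
    """Record a like by writing to both index column families (duplication on write)."""
    # The column value carries a denormalized copy of the title or name,
    # so each read pattern needs only a single row lookup.
    user_likes[user_id][item_id] = items[item_id]["title"]
    item_likes[item_id][user_id] = users[user_id]["name"]


users["u1"] = {"name": "Alice"}
users["u2"] = {"name": "Bob"}
items["i1"] = {"title": "Cassandra book"}
items["i2"] = {"title": "Kettle"}

like("u1", "i1")
like("u1", "i2")
like("u2", "i1")

# Cassandra keeps column keys sorted within a row; sorted() imitates that here.
items_liked_by_u1 = sorted(user_likes["u1"].items())
users_who_like_i1 = sorted(item_likes["i1"].items())
```

Each of the four queries becomes a single row lookup, at the cost of writing every like twice.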
Keyspaces: a keyspace is a container for column families; a cluster has one keyspace per application.
-- CQL 3 syntax; older CQL 2 releases wrote this with strategy_class and strategy_options.
CREATE KEYSPACE keyspace_name WITH
replication = {'class': 'SimpleStrategy', 'replication_factor': 2};
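The keyspace above uses SimpleStrategy with a replication factor of 2. A rough sketch of how that combines with the RandomPartitioner token ranges from item 8 follows. It assumes evenly spaced tokens, folds the MD5 digest into the token space with a simple modulo (a simplification, not Cassandra's exact arithmetic), and places replicas on the next nodes clockwise on the ring. The function names are hypothetical.

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 2 ** 127  # token space assumed for RandomPartitioner


def node_tokens(node_count):
    """Evenly spaced tokens: node i gets i * (2^127 / node_count)."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


def key_token(row_key):
    """Hash the row key with MD5 and fold it into the token space (simplified)."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def replicas(row_key, tokens, replication_factor):
    """Return node indexes for the key: the owner first, then the next nodes clockwise."""
    # The owner is the first node whose token is >= the key's token, wrapping to node 0.
    first = bisect_left(tokens, key_token(row_key)) % len(tokens)
    return [(first + i) % len(tokens) for i in range(replication_factor)]


tokens = node_tokens(4)
owners = replicas("1234ABCD", tokens, replication_factor=2)
```

With four nodes each token range is 2^125 wide, and every key lands on two adjacent nodes of the ring.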
Single device per row - Time Series Pattern 1
Partitioning to limit row size - Time Series Pattern 2
When a single device produces too many columns for one row, the solution is a pattern called row partitioning: add data to the row key to limit the number of columns you get per device.
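A minimal sketch of row partitioning, assuming the extra data added to the row key is a calendar-day bucket. The slides do not say which bucket to use, so the day granularity, the key format and the function name are all assumptions.

```python
from datetime import datetime


def bucketed_row_key(device_id, event_time):
    """Build a row key '<device_id>:<YYYYMMDD>' so each row holds one day of readings."""
    return f"{device_id}:{event_time.strftime('%Y%m%d')}"


key_morning = bucketed_row_key("1234ABCD", datetime(2013, 4, 3, 7, 3, 0))
key_evening = bucketed_row_key("1234ABCD", datetime(2013, 4, 3, 21, 30, 0))
key_next_day = bucketed_row_key("1234ABCD", datetime(2013, 4, 4, 0, 1, 0))
```

Readings from the same day share a row, while the next day starts a new one, which caps the number of columns per row.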
Reverse order time series with expiring columns - Time Series Pattern 3
Consider a dashboard application where we only want to show the last 10 temperature readings. This is possible by storing readings in reverse time order and setting a TTL (time to live) on each data value.
CREATE TABLE latest_temperatures (
  weatherstation_id text,
  event_time timestamp,
  temperature text,
  PRIMARY KEY (weatherstation_id, event_time)
) WITH CLUSTERING ORDER BY (event_time DESC);
-- The inserted row expires 20 seconds after it is written.
INSERT INTO latest_temperatures (weatherstation_id, event_time, temperature)
VALUES ('1234ABCD', '2013-04-03 07:03:00', '72F') USING TTL 20;
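The behaviour of the statements above can be imitated in memory to see how descending order and TTL interact. This is a toy model, not how Cassandra stores data: the ExpiringRow class and its fake clock are hypothetical and exist only to make the example deterministic.

```python
import time


class ExpiringRow:
    """In-memory stand-in for one partition: readings newest first, each with a TTL."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._columns = {}  # event_time -> (value, expires_at)

    def insert(self, event_time, value, ttl_seconds):
        self._columns[event_time] = (value, self._clock() + ttl_seconds)

    def latest(self, limit):
        """Return up to `limit` unexpired (event_time, value) pairs, newest first."""
        now = self._clock()
        live = [(t, v) for t, (v, exp) in self._columns.items() if exp > now]
        return sorted(live, reverse=True)[:limit]


# A fake clock (a one-element list read by a lambda) keeps the example repeatable.
now = [0.0]
row = ExpiringRow(clock=lambda: now[0])
row.insert("2013-04-03 07:01:00", "70F", ttl_seconds=20)
now[0] = 10.0
row.insert("2013-04-03 07:03:00", "72F", ttl_seconds=20)

before_expiry = row.latest(10)
now[0] = 25.0  # the first reading expired at t=20.0
after_expiry = row.latest(10)
```

The dashboard query then amounts to reading the first 10 entries of the row, with old readings disappearing on their own.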
-- Relational (MySQL) version: surrogate key plus foreign keys.
create table Inbound (
  InboundID int not null primary key auto_increment,
  ParticipantID int not null,
  FromParticipantID int not null,
  Occurred date not null,
  Subject varchar(50) not null,
  Story text not null,
  foreign key (ParticipantID) references Participant(ParticipantID),
  foreign key (FromParticipantID) references Participant(ParticipantID));
-- Cassandra (CQL) version: ParticipantID is the partition key and
-- Occurred (a timeuuid) is the clustering column.
create table Inbound (
  ParticipantID int,
  Occurred timeuuid,
  FromParticipantID int,
  Subject text,
  Story text,
  primary key (ParticipantID, Occurred));
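In the Cassandra version of Inbound, the timeuuid column replaces both the auto-increment id and the date. A small sketch of the idea uses Python's uuid.uuid1(), which generates version-1 (time-based) UUIDs, the UUID version that timeuuid holds. The inbox dict and the deliver function are hypothetical stand-ins for a partition and a write.

```python
import uuid

# Hypothetical in-memory partition: ParticipantID -> list of (Occurred, Subject).
inbox = {}


def deliver(participant_id, subject):
    """Append a message keyed by a freshly generated time-based UUID."""
    occurred = uuid.uuid1()
    inbox.setdefault(participant_id, []).append((occurred, subject))
    return occurred


first = deliver(42, "Hello")
second = deliver(42, "Follow-up")

# Order messages by the timestamp embedded in each version-1 UUID.
messages_in_time_order = sorted(inbox[42], key=lambda m: m[0].time)
```

Because each UUID is unique and carries its creation time, two messages arriving in the same instant still get distinct clustering keys.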
1. Define the User Scenarios: This ensures User participation and commitment.
2. Define the Steps in each Scenario: Clarify the User Interaction.
3. Derive the Data Model: Use a Modelling Tool, such as Data Architect or ERWin, to generate SQL.
4. Relate Data Entities to each Step: Create a Cross-reference matrix to check results.
5. Identify Transactions for each Entity: Confirm that each Entity has Transactions to load and read Data.
6. Prepare sample Data: In collaboration with the Users.
7. Prepare Test Scripts: Agree sign-off with the Users.
8. Define a Load Sequence: Reference Data, basics such as Products, any existing Users or Customers, etc.
9. Run the Test Scripts: Get User Sign-off to record progress.
