1. Key characteristics of NoSQL 1
Key Characteristics of NoSQL
Kirti Jayadevan
Introduction to Big Data Concepts, Technologies and deployment
Alakh Verma
3-7-2016
Abstract: [In today’s world, no one size fits all. Earlier, most companies used an
RDBMS as their database; many have since adopted NoSQL technology to
match their needs. NoSQL systems are easy to use and offer better availability
and scalability than an RDBMS. This paper provides an overview of the key characteristics of
NoSQL and compares the different types of NoSQL data stores.]
NoSQL databases are popular and are used in many companies. They have a distributed
data structure, so the probability of a single point of failure is very low. Along
with availability, NoSQL also provides high performance thanks to the same distributed
architecture. Performance increases as machines are added, which makes the
architecture scalable. The applications that benefit most from NoSQL systems are Web 2.0
applications such as networking sites, blogs, mashups and video sharing websites (Cattell 2011).
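The idea that adding machines spreads the load can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the node names are hypothetical, and simple modulo hashing is used for brevity, whereas production systems often rely on other placement schemes such as consistent hashing.

```python
import hashlib


def node_for_key(key: str, nodes: list[str]) -> str:
    """Pick the node responsible for a key by hashing the key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


def distribute(keys: list[str], nodes: list[str]) -> dict[str, list[str]]:
    """Group keys by the node that would hold them."""
    placement = {node: [] for node in nodes}
    for key in keys:
        placement[node_for_key(key, nodes)].append(key)
    return placement


# Hypothetical workload: 1000 user records spread over 3, then 6 machines.
keys = [f"user:{i}" for i in range(1000)]
three_nodes = distribute(keys, ["node-a", "node-b", "node-c"])
six_nodes = distribute(
    keys, ["node-a", "node-b", "node-c", "node-d", "node-e", "node-f"]
)

for name, placement in (("3 nodes", three_nodes), ("6 nodes", six_nodes)):
    busiest = max(len(held) for held in placement.values())
    print(name, "-> busiest node holds", busiest, "keys")
```

With more nodes, each machine holds a smaller share of the keys, which is the intuition behind scaling out by adding machines.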
The data stores in NoSQL are categorized into key-value stores, document stores,
extensible record stores and graph stores. A key-value store holds pairs of keys and values,
and a value is retrieved when its key is known. Users store data in a schema-less
way, which makes these systems easy to use. These systems also replicate data to support
data recovery. Redis, Memcached, Riak, Scalaris and Voldemort are a few databases that use
the key-value model (Cattell 2011).
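A minimal sketch of the key-value idea, assuming a toy in-memory store: values of any shape are stored under a key, and every write is copied to replicas so the data can be recovered. The class and method names here are hypothetical and do not correspond to the API of Redis, Riak or any other product.

```python
class KeyValueStore:
    """Toy in-memory key-value store that copies each write to replicas."""

    def __init__(self, replica_count=2):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]

    def put(self, key, value):
        # No schema: the value can be a number, a string, a dict, anything.
        self.primary[key] = value
        for replica in self.replicas:
            replica[key] = value

    def get(self, key):
        # Retrieval works only when the key is known.
        return self.primary[key]

    def recover_primary(self):
        """Rebuild the primary copy from the first replica."""
        self.primary = dict(self.replicas[0])


store = KeyValueStore()
store.put("session:42", {"user": "alice", "cart": ["book", "pen"]})
store.put("counter", 7)

store.primary.clear()      # simulate losing the primary copy
store.recover_primary()    # replication makes recovery possible
print(store.get("session:42"), store.get("counter"))
```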
Document stores provide a mechanism where documents contain complex data
and a unique key is assigned to each document, which helps to search and retrieve data (Planet
Cassandra). This model also follows a schema-less structure, like key-value stores. What
makes it unique, however, is its use of internal notations such as JSON that applications can
process directly. In key-value stores and RDBMS, client-side processing is required to store
JSON documents. MongoDB, CouchDB, Couchbase and Amazon DynamoDB use the document
store model. Both key-value and document stores partition the data over many machines
(Cattell 2011).
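The difference from a plain key-value store can be illustrated with a small sketch: because the store understands the JSON inside each document, it can search by field rather than only by key. The `DocumentStore` class below is a hypothetical toy, not the API of MongoDB, CouchDB or any other product.

```python
import json
import uuid


class DocumentStore:
    """Toy document store: JSON documents, each under a unique key."""

    def __init__(self):
        self._docs = {}  # document id -> JSON text

    def insert(self, document):
        doc_id = str(uuid.uuid4())          # unique key for the document
        self._docs[doc_id] = json.dumps(document)
        return doc_id

    def get(self, doc_id):
        return json.loads(self._docs[doc_id])

    def find(self, field, value):
        """Return documents whose top-level field equals value."""
        matches = []
        for raw in self._docs.values():
            doc = json.loads(raw)
            if doc.get(field) == value:
                matches.append(doc)
        return matches


docs = DocumentStore()
# Documents need not share the same fields (schema-less).
first_id = docs.insert({"name": "Asha", "city": "Pune", "tags": ["admin"]})
docs.insert({"name": "Ravi", "city": "Delhi"})
docs.insert({"title": "Quarterly report", "city": "Pune", "pages": 12})

in_pune = docs.find("city", "Pune")
print(len(in_pune), "documents found for city = Pune")
```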
Extensible record stores provide data partitioning with a dynamic number of attributes.
They store data in records with a large number of columns and are schema-free (Cattell 2011).
HBase, Cassandra and Google’s BigTable use extensible record stores. Extensible record
stores are also termed wide column stores.
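The notion of records with a dynamic number of attributes can be sketched as a mapping from row key to a per-row set of columns. This is a simplified illustration with hypothetical names; it ignores column families, timestamps and the other features that real wide column stores provide.

```python
from collections import defaultdict


class WideColumnTable:
    """Toy wide column table: each row carries its own set of columns."""

    def __init__(self):
        self._rows = defaultdict(dict)  # row key -> {column name: value}

    def put(self, row_key, column, value):
        # Columns are created on first write; no schema change is needed.
        self._rows[row_key][column] = value

    def get_row(self, row_key):
        return dict(self._rows.get(row_key, {}))

    def columns(self, row_key):
        return sorted(self._rows.get(row_key, {}))


table = WideColumnTable()
table.put("user:1", "name", "Asha")
table.put("user:1", "email", "asha@example.com")
table.put("user:2", "name", "Ravi")
table.put("user:2", "last_login", "2016-03-07")
table.put("user:2", "device", "mobile")

print(table.columns("user:1"))
print(table.columns("user:2"))
```

The two rows live in the same table but hold different columns, which is what "schema-free" means in this setting.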
Graph databases store data whose elements are interconnected and are represented
as a graph. In an RDBMS, we use referential integrity to define relationships between records
and use JOINs to retrieve results, which makes queries time-consuming and expensive. In
graph data stores, by contrast, each node stores a list of relationship records that represent
its relationships with other nodes (Abadi et al., 2008). The database thus has direct access
to connected nodes, making it less expensive to search and match. Neo4j and Titan use graph
data stores (Cattell 2011).
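The following sketch illustrates why direct access to connected nodes avoids JOIN-style lookups: each node keeps its own list of relationship records, and a traversal simply follows them. The `Graph` class and the relationship names are hypothetical and are not the API of Neo4j or Titan.

```python
from collections import deque


class Graph:
    """Toy graph store: every node keeps a list of (relation, target) records."""

    def __init__(self):
        self._edges = {}  # node -> list of (relation, target) tuples

    def add_relationship(self, source, relation, target):
        self._edges.setdefault(source, []).append((relation, target))
        self._edges.setdefault(target, [])

    def neighbours(self, node, relation):
        # Direct lookup on the node itself; no table scan or JOIN involved.
        return [t for r, t in self._edges.get(node, []) if r == relation]

    def reachable(self, start, relation):
        """Breadth-first traversal that follows one relationship type."""
        seen = {start}
        queue = deque([start])
        order = []
        while queue:
            node = queue.popleft()
            for nxt in self.neighbours(node, relation):
                if nxt not in seen:
                    seen.add(nxt)
                    order.append(nxt)
                    queue.append(nxt)
        return order


graph = Graph()
graph.add_relationship("alice", "FOLLOWS", "bob")
graph.add_relationship("bob", "FOLLOWS", "carol")
graph.add_relationship("alice", "LIKES", "post-1")

print(graph.reachable("alice", "FOLLOWS"))  # prints ['bob', 'carol']
```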
We can choose the right NoSQL data store by analysing the advantages and
challenges of each NoSQL data store and understanding the business goal. As data scientists,
we select the most suitable NoSQL data store by identifying:
• whether the use case needs to perform transactions or provide analytics
• whether the use case can tolerate downtime, or whether even a delay of nanoseconds is costly
• whether the use case needs continuous availability of data
The right NoSQL platform for a business use case can be selected by considering
scalability, performance, availability, cost and manageability (Planet Cassandra). The table
below compares the data models described above in terms of performance, scalability,
flexibility, complexity and functionality.
Data model               Performance  Scalability      Flexibility  Complexity  Functionality
Key-value store          High         High             High         None        Variable (None)
Document store           High         Variable (High)  High         Low         Variable (Low)
Extensible record store  High         High             Moderate     Low         Minimal
Graph store              Variable     Variable         High         High        Graph theory
(Planet Cassandra)
References:
1. Abadi, Daniel, Samuel R. Madden, and Nabil Hachem. Column-Stores vs. Row-Stores:
How Different Are They Really? SIGMOD ’08, 2008, Vancouver, BC, Canada. Available at
http://db.csail.mit.edu/projects/cstore/abadi-sigmod08.pdf Accessed 3/7/2016
2. Cattell, Rick. Relational Databases, Object Databases, Key-Value Stores, Document
Stores and Extensible Record Stores: A Comparison. December 2010. Available at
http://www.odbms.org/wpcontent/uploads/2010/01/Cattell.Dec10.pdf Accessed
3/5/2016
3. The Shift to the Digital Economy Is Driving NoSQL Adoption, Couchbase. Retrieved
March 7, 2016, from http://www.couchbase.com/nosql-resources/what-is-no-sql
4. NoSQL Database Defined and Explained, Planet Cassandra. Retrieved March 7, 2016,
from http://www.planetcassandra.org/what-is-nosql/#nosql-database-types