1. http://dotnetdlr.com
NoSql (It’s “Not only SQL” not “No to Sql”)
This is my first post on NoSql database technologies. Database technologies have changed drastically
over the last few years. Growing numbers of user requests, high-availability requirements and the demand
for real-time performance have forced us to think about different database technologies. Traditional
RDBMS, in-memory and NoSql databases are all available on the market to meet particular business needs.
Here I’ll illustrate some key aspects of NoSql databases: what NoSql is, why we need it, and its
advantages and disadvantages.
What is the NoSql Movement?
It’s a different way of thinking about database technologies. It is unlike a relational database
management system, where we have tables, procedures, functions and normalization concepts. NoSql
databases are not built primarily on tables and generally don’t use SQL for manipulating or querying data.
NoSql databases are built to achieve a specific purpose, which means a NoSql database might not support
all the features of a relational database.
NoSql databases are designed around the CAP theorem, which states that a distributed system can
guarantee at most two of the following three properties at the same time.
Consistency: Most applications or services attempt to provide strongly consistent data, so that every
read returns the result of the most recent write. Interactions with applications/services are expected
to behave in a transactional manner, i.e. an operation should be atomic (succeed or fail entirely),
uncommitted transactions should be isolated from each other, and a transaction, once committed, should
be permanent.
Availability: Load on services/applications keeps increasing, so services should be highly available
to users. Every request should receive a response.
Partition tolerance: Your services should provide some amount of fault tolerance in case of a crash,
failure or heavy load. It is important that in these circumstances, including when nodes cannot
communicate with each other, your services still perform as expected. Partition tolerance is one of
the desirable properties of a service. Services can serve requests from multiple nodes.
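The trade-off between these properties can be made concrete with a toy model. The sketch below is only an illustration and does not describe how any particular database behaves; `Replica`, `write_cp` and `write_ap` are hypothetical names. Two replicas are cut off from each other: a consistency-first write refuses to proceed, while an availability-first write succeeds locally and lets the replicas diverge.

```python
class Replica:
    """A toy replica holding a single key-value map (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


def write_cp(replicas, partitioned, key, value):
    """Consistency-first write: refuse unless every replica can be reached."""
    if partitioned:
        return False  # unavailable, but no replica ever holds stale data
    for r in replicas:
        r.data[key] = value
    return True


def write_ap(local, replicas, partitioned, key, value):
    """Availability-first write: always accept, at least on the local replica."""
    targets = [local] if partitioned else replicas
    for r in targets:
        r.data[key] = value
    return True  # available, but replicas may disagree until they re-sync


a, b = Replica("a"), Replica("b")
write_cp([a, b], partitioned=False, key="stock", value=10)

# The network splits: a and b can no longer talk to each other.
cp_ok = write_cp([a, b], partitioned=True, key="stock", value=9)
ap_ok = write_ap(a, [a, b], partitioned=True, key="stock", value=9)

print(cp_ok)                             # False
print(ap_ok)                             # True
print(a.data["stock"], b.data["stock"])  # 9 10
```

The last line shows the price of staying available during the split: the two replicas now return different values for the same key.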
Why NoSql?
NoSql databases are used for specific purposes, normally for huge volumes of data where
performance matters. Relational database systems are hard to scale out for write operations.
We can load balance database servers by replicating data on multiple servers; in this case
read operations can be load balanced, but write operations need consistency across multiple
servers. Writes can be scaled only by partitioning the data. This affects reads, as distributed
joins are usually slow and hard to implement. We can support an increase in the number of users
or requests by scaling up relational databases, which means more hardware, more licensing and
higher costs.
Relational databases are not a good option under heavy load with simultaneous read and write
operations, as seen at Facebook, Google, Amazon, Twitter etc.
A NoSQL implementation, on the other hand, can scale out, i.e. distribute the database load
across more servers.
Source: Couchbase.com
Common characteristics of NoSql databases
Aggregating (supported by column databases): Aggregation is used to calculate aggregated
values like Count, Max, Avg, Min etc. Some NoSql databases provide an aggregation framework
with built-in aggregation of values. The approach in column databases is to store values in columns
instead of rows (de-normalized data). This kind of data is mainly used in data analytics and business
intelligence. Google’s BigTable and Apache Cassandra support some features of column
databases.
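To see why a column layout suits aggregation, here is a minimal in-memory sketch. It is not the storage format of BigTable or Cassandra; the order data and field names are made up for illustration. The same three records are held row by row and column by column, and the aggregate over the column layout reads only the one list it needs.

```python
# Row-oriented layout: the fields of each record are stored together.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 200.0},
]

# Column-oriented layout: the values of each column are stored together.
columns = {
    "order_id": [1, 2, 3],
    "region": ["EU", "US", "EU"],
    "amount": [120.0, 80.0, 200.0],
}

# In the row layout an aggregate has to walk record by record.
row_total = sum(r["amount"] for r in rows)

# In the column layout it reads a single contiguous list of values.
amounts = columns["amount"]
stats = {
    "count": len(amounts),
    "min": min(amounts),
    "max": max(amounts),
    "avg": sum(amounts) / len(amounts),
}

print(row_total)  # 400.0
print(stats)
```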
Relationships (supported by graph databases): A graph database uses graph structures with
nodes, edges and properties. Every element contains a direct pointer to its adjacent elements, so
it doesn’t need to look up indexes or scan the whole data set. Graph databases are mostly used for
relational or social data where elements are connected. E.g. Neo4j, BigData, OrientDB.
Source: Wikipedia
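The idea that every element points directly at its neighbours can be sketched in a few lines. This is a simplified illustration, not the internals of Neo4j or any other product: the Python dict stands in for the direct pointers a graph database would keep, and the names and the `friends_within` helper are invented for the example.

```python
from collections import deque

# Each node keeps direct references to its neighbours.
# Names and relationships are made up for illustration.
graph = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": [],
}


def friends_within(graph, start, max_hops):
    """Breadth-first walk that follows edges up to max_hops away from start."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand nodes that are already at the hop limit
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                found.append(neighbour)
                queue.append((neighbour, hops + 1))
    return found


print(friends_within(graph, "alice", 1))  # ['bob', 'carol']
print(friends_within(graph, "alice", 2))  # ['bob', 'carol', 'dave']
```

The walk only ever touches nodes reachable from the starting point, which is why "friends of friends" style queries stay cheap even when the whole data set is large.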
Document based: Document databases are considered by many as the next logical step from
simple key-value stores to slightly more complex and meaningful data structures, as they at least
allow encapsulating key-value pairs in documents. E.g. CouchDB, MongoDB.
Mapping of document-based databases to relational databases
Document-based databases    Relational databases
Collection                  Table
Document                    Row
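The mapping above can be illustrated with plain Python data. This is a sketch under assumptions, not the API of CouchDB or MongoDB: the table, collection and field names are invented. The relational version needs two tables linked by a key, while the document version keeps the addresses inside the user document.

```python
import json

# Relational style: rows in two tables, linked by a foreign key.
users_table = [{"id": 1, "name": "Asha"}]
addresses_table = [
    {"user_id": 1, "city": "Pune"},
    {"user_id": 1, "city": "Delhi"},
]

# Document style: one self-contained document inside a collection.
users_collection = [
    {"_id": 1, "name": "Asha", "addresses": [{"city": "Pune"}, {"city": "Delhi"}]}
]


def cities_relational(user_id):
    """Needs a join-like lookup across the second table."""
    return [a["city"] for a in addresses_table if a["user_id"] == user_id]


def cities_document(user_id):
    """Reads everything from the single matching document."""
    doc = next(d for d in users_collection if d["_id"] == user_id)
    return [a["city"] for a in doc["addresses"]]


print(cities_relational(1))  # ['Pune', 'Delhi']
print(cities_document(1))    # ['Pune', 'Delhi']
print(json.dumps(users_collection[0]))
```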
Key-Value Store: Values are stored simply as key-value pairs. A value is stored like a blob object,
and the database doesn’t care about its content. E.g. DynamoDB, LevelDB, RaptorDB.
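A minimal sketch of the "opaque blob" idea follows. The `KeyValueStore` class and its `put`/`get` methods are hypothetical and do not mirror the API of DynamoDB, LevelDB or RaptorDB. The store accepts only bytes and never inspects them; encoding and decoding is left to the application.

```python
import json


class KeyValueStore:
    """Toy in-memory key-value store; values are opaque bytes (hypothetical API)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        if not isinstance(value, bytes):
            raise TypeError("value must be bytes; the store does not interpret it")
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)


store = KeyValueStore()

# The application decides how to encode and decode the value.
store.put("session:42", json.dumps({"user": "asha", "cart": [3, 7]}).encode("utf-8"))

blob = store.get("session:42")
session = json.loads(blob.decode("utf-8"))
print(session["cart"])  # [3, 7]
```

Because the store cannot look inside the value, lookups are only possible by key; querying "all sessions with item 3 in the cart" would have to be done in application code.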
Database scale-out: When the load on databases increased, database administrators used to
scale up traditional databases by adding hardware and buying bigger servers, instead of scaling out,
i.e. distributing databases across multiple nodes/servers to balance the load. With rising
transaction rates and availability requirements, and with databases now available in the cloud or on
virtual machines, scaling out is no longer economically painful.
On the other hand, NoSql databases can scale out by distributing data across multiple servers. NoSQL
databases typically use clusters of cheap commodity servers to manage exploding data and
transaction volumes. The result is that the cost per gigabyte or transaction/second for NoSQL can be
many times less than the cost for RDBMS, allowing you to store and process more data at a much
lower price.
Now the question is why scaling out an RDBMS is hard to implement. Traditional databases support
ACID properties, which guarantee that database transactions are processed reliably. A transaction can
include write operations on multiple records, so keeping consistency across multiple nodes is a slow
and complex process: multiple servers need to communicate back and forth to keep data integrity and
synchronize transactions while preventing deadlock. NoSql databases, on the other hand, typically
support single-record transactions, and data is partitioned across multiple nodes so that
transactions are processed fast.
Auto Sharding (Elasticity): NoSql databases support automatic data sharding (horizontal
partitioning of data), where the database is broken down into smaller chunks (called shards) that
can be spread across distributed servers or a cluster. This feature provides faster responses to
transactions and data requests.
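One common way to decide which shard a record belongs to is to hash its key. The sketch below assumes simple hash-modulo routing over in-memory dicts; `shard_for` and `ShardedStore` are hypothetical names, and real products may use range-based or consistent-hashing schemes instead.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


class ShardedStore:
    """Toy store that spreads records over several in-memory shards."""

    def __init__(self, shard_count):
        self.shards = [{} for _ in range(shard_count)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        # Only the one shard that owns the key is consulted.
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(shard_count=3)
for i in range(9):
    store.put(f"user:{i}", {"id": i})

print(store.get("user:4"))
print([len(s) for s in store.shards])  # how the 9 records were spread
```

A drawback of plain modulo routing is that changing the number of shards moves most keys to a different shard, which is one reason production systems tend to use more elaborate schemes.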
Data Replication: Most NoSql databases, like relational databases, support data replication to
keep the same data available across distributed servers.
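As a rough illustration of replication, the sketch below copies every write from a primary to all replicas and spreads reads over the replicas. It assumes synchronous copying and a single primary for simplicity; `ReplicatedStore` is a hypothetical class, and real databases offer several replication modes that this does not model.

```python
class ReplicatedStore:
    """Toy primary/replica store: writes go to the primary and are copied to replicas."""

    def __init__(self, replica_count):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._next = 0

    def write(self, key, value):
        self.primary[key] = value
        for replica in self.replicas:  # synchronous copy, for simplicity
            replica[key] = value

    def read(self, key):
        # Spread reads over the replicas in round-robin order.
        replica = self.replicas[self._next % len(self.replicas)]
        self._next += 1
        return replica.get(key)


store = ReplicatedStore(replica_count=2)
store.write("greeting", "hello")
print(store.read("greeting"), store.read("greeting"))  # hello hello
```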
No schema required (Flexible data model): Data can be inserted in a NoSQL DB without first
defining a rigid database schema. The format of the data being inserted can be changed at any time,
without application disruption. This provides greater application flexibility, which ultimately delivers
significant business flexibility.
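A small sketch of what a flexible data model means in practice, using a plain Python list as a stand-in for a collection (the field names are made up): a later version of the application starts storing extra fields without any migration step, and readers simply tolerate documents that lack them.

```python
collection = []

# An early version of the application stores only a name.
collection.append({"_id": 1, "name": "Asha"})

# A later version adds fields; no schema change is needed first.
collection.append({"_id": 2, "name": "Ravi", "email": "ravi@example.com", "tags": ["beta"]})

# Readers must tolerate documents that lack the newer fields.
emails = [doc.get("email", "unknown") for doc in collection]
print(emails)  # ['unknown', 'ravi@example.com']
```

The flexibility moves responsibility into the application: since the database no longer enforces a schema, the code that reads the data has to handle every shape of document that was ever written.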
Caching: Most NoSql databases support integrated caching to provide low latency and high
throughput. This is in contrast with traditional database management systems, where caching needs
separate configuration or development.
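To show what an integrated cache does, here is a minimal read-through cache with least-recently-used eviction. It is a sketch under assumptions, not the caching layer of any specific NoSql product: `ReadThroughCache` is a hypothetical class and a dict plays the role of slow storage.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Toy read-through cache with least-recently-used eviction."""

    def __init__(self, backing_store, capacity):
        self.backing_store = backing_store
        self.capacity = capacity
        self.cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.cache:
            self.hits += 1
            self.cache.move_to_end(key)  # mark as most recently used
            return self.cache[key]
        self.misses += 1
        value = self.backing_store[key]  # slow path: go to storage
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used entry
        return value


disk = {"a": 1, "b": 2, "c": 3}
cache = ReadThroughCache(disk, capacity=2)
for key in ["a", "b", "a", "c", "b"]:
    cache.get(key)

print(cache.hits, cache.misses)  # 1 4
```

With room for only two entries, reading "c" evicts "b", so the final read of "b" has to go back to storage.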
Challenges of NoSql
So far we have seen significant advantages of NoSql over RDBMS; however, there are many challenges
in implementing NoSql.
Maturity: Most NoSql databases are open source or still in a pre-production stage, so adopting them
at enterprise level might be a risk. For a small business or use case they might be a better fit.
RDBMS databases, on the other hand, are mature, provide many features and have good documentation
and resources.
Support: Most RDBMS products are not open source, which means they come with vendor commitment and
assurance in case of failure. They are reliable, properly tested products. Most NoSql databases are
open source and not yet widely adopted by organizations. It is very hard to get effective support for
open source databases. Some NoSql databases were created by small startups for specific needs, not
for global reach.
Tools: RDBMS databases have a lot of tools for monitoring databases, query analysis, optimization,
performance profiling, analytics and Business Intelligence. An objective of NoSql databases is to
minimize the use of admin tools, which has not been fully achieved yet; there are still certain things
that need skills and tools, such as monitoring database activity.
When to consider NoSql
Following are some indicators you can consider when choosing a NoSql database for your application:
Your application needs high-performance databases.
You need little or no database administration.
You want a flexible data model. Minor or major changes should not impact the whole system.
Your application needs less complex transactions.
You need high availability.
You have little or no need for Business Intelligence and analytics.
References:
http://nosql-database.org/
http://www.couchbase.com
www.mongodb.org
http://en.wikipedia.org/wiki/Nosql