2. Could be “Not SQL”
Could be “Not Only SQL”
Could be “Not SQL Yet”
Essentially any database system that doesn’t require storage in tables/rows/columns
Development as early as the 1960s
NoSQL DBs are often tailored to a specific need
3. TRADITIONAL RDBMS
Record persistence
Well-defined schemas
SQL querying tools
ACID support: Atomic, Consistent, Isolated, Durable
Doesn’t scale horizontally (easily)
NOSQL
Multiple formats
No Schema
No single querying language
BASE support: Basically Available, Soft state, Eventual consistency
Scales horizontally
4. Traditional RDBMSs weren’t designed for high rates of growth (emails, tweets, message boards, etc.)
Not all data is relational
SQL can be too cumbersome (joins, subqueries, etc.)
Defining a schema up front limits growth
So … NoSQL offers a solution to these issues
5. KEY-VALUE
Dynamo, Cassandra, SimpleDB, etc.
Essentially store a key and a corresponding value
Simple to program
Easy to distribute across clusters
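A key-value store is easy to distribute because the key alone can decide which node holds the record. The sketch below is a minimal, hypothetical in-memory illustration of that idea (the class `PartitionedKeyValueStore` and its methods are made up here and are not the API of Dynamo, Cassandra, or SimpleDB). It uses plain modulo hashing; production systems generally use more elaborate schemes such as consistent hashing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: each HashMap stands in for one node of a cluster.
class PartitionedKeyValueStore {
    private final List<Map<String, String>> nodes = new ArrayList<>();

    PartitionedKeyValueStore(int nodeCount) {
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<>());
        }
    }

    // The key alone picks the node, so nodes never need to coordinate on a lookup.
    int nodeFor(String key) {
        return Math.floorMod(key.hashCode(), nodes.size());
    }

    void put(String key, String value) {
        nodes.get(nodeFor(key)).put(key, value);
    }

    // Returns null when the key has never been stored.
    String get(String key) {
        return nodes.get(nodeFor(key)).get(key);
    }

    public static void main(String[] args) {
        PartitionedKeyValueStore store = new PartitionedKeyValueStore(3);
        store.put("Smith/Bob/phonenumber", "408 555 5555");
        System.out.println(store.get("Smith/Bob/phonenumber"));
    }
}
```

The whole interface is `put` and `get`, which is why this style is simple to program against.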
DOCUMENT STORES
MongoDB, CouchDB
Similar to key-value stores, but map keys to documents in a format such as JSON or XML
Often no need for joins because a single document can contain all the related information
A little more technical than key-value, but still easier than relational
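To illustrate why joins are often unnecessary, here is a small sketch in which nested maps stand in for a JSON document. Everything in it is hypothetical (the class `EmbeddedDocumentSketch`, the key `emp-1`, and the field names); it shows the shape of the data, not the API of MongoDB or CouchDB.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: nested maps stand in for a JSON document.
class EmbeddedDocumentSketch {
    // Key -> document, like a collection in a document store.
    static final Map<String, Map<String, Object>> employees = new LinkedHashMap<>();

    static Map<String, Object> buildEmployee() {
        Map<String, Object> department = new LinkedHashMap<>();
        department.put("name", "Engineering");
        department.put("location", "Building 2");

        List<String> phones = new ArrayList<>();
        phones.add("408 555 5555");

        Map<String, Object> employee = new LinkedHashMap<>();
        employee.put("name", "Employee 1");
        employee.put("email", "email1@email.com");
        employee.put("department", department); // embedded, not a foreign key
        employee.put("phones", phones);
        return employee;
    }

    // One lookup by key returns everything; no second table is consulted.
    @SuppressWarnings("unchecked")
    static String departmentOf(String key) {
        Map<String, Object> employee = employees.get(key);
        Map<String, Object> department = (Map<String, Object>) employee.get("department");
        return (String) department.get("name");
    }

    public static void main(String[] args) {
        employees.put("emp-1", buildEmployee());
        System.out.println(departmentOf("emp-1"));
    }
}
```

In a relational design the department would usually live in its own table and be joined in; here it travels inside the employee document, at the cost of duplicating it across employees.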
6. ORACLE NOSQL (KEY-VALUE) USING JAVA
import java.util.ArrayList;
import java.util.List;
import oracle.kv.Key;
import oracle.kv.Value;

// Define the major and minor path components for the key
List<String> majorComponents = new ArrayList<>();
List<String> minorComponents = new ArrayList<>();
majorComponents.add("Smith");
majorComponents.add("Bob");
minorComponents.add("phonenumber");

// Create the key
Key myKey = Key.createKey(majorComponents, minorComponents);

String data = "408 555 5555";

// Create the value. Notice that we serialize the contents of the
// String object when we create the value.
Value myValue = Value.createValue(data.getBytes());

// Now put the record. Note that we do not show the creation of the
// kvstore handle here.
kvstore.put(myKey, myValue);
MONGODB USING C#
// Uses the legacy C# driver API (MongoCollection, Insert)
MongoCollection<BsonDocument> employees = database.GetCollection("employee");

// Insert five sample employee documents
for (int i = 1; i <= 5; i++)
{
    BsonDocument employee = new BsonDocument {
        { "name", "Employee " + i },
        { "email", String.Format("email{0}@email.com", i) },
        { "createddate", DateTime.Now }
    };
    employees.Insert(employee);
}
7. PROS
No schema leads to faster changes in the application
Various options available; open source
Scales well
API-driven interaction doesn’t require SQL queries
CONS
No schema can lead to unmanageable code
Vendors may not be around in the future
Truly large databases still need planning
API interaction is a little more complex than SQL
Very few tools to support reporting/analytics
8. Replicas: ensure availability even if a replica is lost
With read access, if one replica doesn’t respond, go to the second replica
With updates, send data to all replicas
Two implementations:
Eventual Consistency
Majority Write/Majority Read
Resync replicas when they become available again
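The majority write/majority read option can be sketched as follows. This is a hypothetical single-process model, not any vendor's protocol: `MajorityQuorumSketch`, `Replica`, and the version counter are assumptions made for illustration. A write is accepted only if a majority of replicas is reachable, a read asks the reachable replicas (at least a majority) and keeps the highest version, and `resync` copies the newest state onto a replica that comes back.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory sketch of majority write / majority read.
class MajorityQuorumSketch {
    // One replica holding a single versioned value.
    static class Replica {
        boolean available = true;
        long version = 0;
        String value = null;
    }

    private final List<Replica> replicas = new ArrayList<>();
    private long nextVersion = 1;

    MajorityQuorumSketch(int replicaCount) {
        for (int i = 0; i < replicaCount; i++) {
            replicas.add(new Replica());
        }
    }

    Replica replica(int index) {
        return replicas.get(index);
    }

    private int majority() {
        return replicas.size() / 2 + 1;
    }

    private List<Replica> reachable() {
        List<Replica> up = new ArrayList<>();
        for (Replica r : replicas) {
            if (r.available) up.add(r);
        }
        return up;
    }

    private static Replica newest(List<Replica> candidates) {
        Replica best = candidates.get(0);
        for (Replica r : candidates) {
            if (r.version > best.version) best = r;
        }
        return best;
    }

    // Apply the update to every reachable replica, but only if they form a majority.
    boolean write(String value) {
        List<Replica> up = reachable();
        if (up.size() < majority()) return false;
        long version = nextVersion++;
        for (Replica r : up) {
            r.version = version;
            r.value = value;
        }
        return true;
    }

    // Any majority overlaps the last write's majority, so the newest version is seen.
    String read() {
        List<Replica> up = reachable();
        if (up.size() < majority()) {
            throw new IllegalStateException("majority of replicas unreachable");
        }
        return newest(up).value;
    }

    // Bring a lost replica back and copy the newest reachable state onto it.
    void resync(int index) {
        Replica returning = replicas.get(index);
        List<Replica> up = reachable();
        returning.available = true;
        if (!up.isEmpty() && newest(up).version > returning.version) {
            Replica source = newest(up);
            returning.version = source.version;
            returning.value = source.value;
        }
    }

    public static void main(String[] args) {
        MajorityQuorumSketch store = new MajorityQuorumSketch(3);
        store.write("v1");
        store.replica(2).available = false; // lose one replica
        store.write("v2");                  // still accepted: 2 of 3 reachable
        System.out.println(store.read());
        store.resync(2);
        System.out.println(store.replica(2).value);
    }
}
```

With eventual consistency, by contrast, a read may be served by a single replica that has not yet received the latest update; the quorum rule trades some availability for always seeing the newest write.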
9. PostgreSQL is a hybrid
Oracle NoSQL
Microsoft Azure NoSQL offerings: DocumentDB, Tables, HBase
But …
Splice Machine offers RDBMS support for Hadoop
FoundationDB offers a SQL engine on top of its key-value store
What if RDBMS vendors support JSON and KV?
If they add KV and Document search capabilities?
Game over for NoSQL databases?
A bit more complex than that (underlying architecture is the problem)
10. Eliminates the “wasted” processing of a traditional RDBMS
No Disk
▪ Break the DB into RAM-sized chunks and distribute them across the cluster
No Locking
No Concurrency Control
▪ One transaction at a time per partition
▪ Easier to do when the DB is broken into multiple partitions
No Disk Logging
▪ Recover from existing replicas
▪ Limited, efficient logging to disk
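"One transaction at a time per partition" can be sketched with one single-threaded executor per in-memory partition. This is a hypothetical illustration (the class `SerialPartitionSketch` and its methods are made up, and replication and logging are left out): because only one thread ever touches a partition's data, the plain `HashMap` needs no locks even when many callers submit work concurrently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.IntStream;

// Hypothetical sketch: each partition is owned by exactly one worker thread.
class SerialPartitionSketch {
    private final List<Map<String, Integer>> data = new ArrayList<>();
    private final List<ExecutorService> workers = new ArrayList<>();

    SerialPartitionSketch(int partitions) {
        for (int i = 0; i < partitions; i++) {
            data.add(new HashMap<>());                        // plain, unsynchronized map
            workers.add(Executors.newSingleThreadExecutor()); // runs one transaction at a time
        }
    }

    private int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), data.size());
    }

    // Queue an increment "transaction" on the partition that owns the key.
    Future<Integer> increment(String key, int delta) {
        int p = partitionFor(key);
        Map<String, Integer> partition = data.get(p);
        Callable<Integer> transaction = () -> partition.merge(key, delta, Integer::sum);
        return workers.get(p).submit(transaction);
    }

    // Blocking convenience wrapper that returns the value after the increment.
    int incrementAndGet(String key, int delta) {
        try {
            return increment(key, delta).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    void shutdown() {
        for (ExecutorService w : workers) {
            w.shutdown();
        }
    }

    public static void main(String[] args) {
        SerialPartitionSketch db = new SerialPartitionSketch(4);
        // Many caller threads, yet no locks: the owning worker serializes the updates.
        IntStream.range(0, 1000).parallel().forEach(i -> db.incrementAndGet("counter", 1));
        System.out.println(db.incrementAndGet("counter", 0));
        db.shutdown();
    }
}
```

Transactions on different partitions still run in parallel, which is why breaking the database into more partitions makes this approach easier to scale.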