ISO/IEC JTC1/SC32/WG2 N1537




        A Comparison of SQL
        and NoSQL Databases
                       Keith W. Hare
                   JCC Consulting, Inc.
            Convenor, ISO/IEC JTC1 SC32 WG3


December 29, 2012      Metadata Open Forum                             1
Abstract
NoSQL databases (either no-SQL or Not Only
SQL) are currently a hot topic in some parts of
computing. In fact, one website lists over a
hundred different NoSQL databases.
This presentation reviews the features common to
the NoSQL databases and compares those features
to the features and capabilities of SQL databases.



Who Am I?
   Muskingum College, 1980, BS in Biology and
    Computer Science
   Senior Consultant with JCC Consulting, Inc.
    since 1985 – high performance database systems
   Ohio State – Masters in Computer &
    Information Science, 1985
   SQL Standards committees since 1988
   Vice Chair, INCITS H2 since 2003
   Convenor, ISO/IEC JTC1 SC32 WG3 since 2005

Topics
   SQL Databases
      SQL Standard
      SQL Characteristics
      SQL Database Examples
   NoSQL Databases
      NoSQL Definition
      General Characteristics
      NoSQL Database Types
      NoSQL Database Examples


Standard SQL
The following is a short, incomplete history of the SQL
  Standards – ISO/IEC 9075
   1987 – Initial ISO/IEC Standard
   1989 – Referential Integrity
   1992 – SQL2
      1995 – SQL/CLI (ODBC)
      1996 – SQL/PSM – Procedural Language extensions
   1999 – User Defined Types
   2003 – SQL/XML
   2008 – Expansions and corrections
   2011 (or 2012) – System Versioned and Application Time
    Period Tables

SQL Characteristics
   Data stored in columns and tables
   Relationships represented by data
   Data Manipulation Language
   Data Definition Language
   Transactions
   Abstraction from physical layer




SQL Physical Layer Abstraction
   Applications specify what, not how
   Query optimization engine
   Physical layer can change without modifying
    applications
      Create indexes to support queries
      In Memory databases
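
A minimal sketch of the point above, assuming Python's built-in sqlite3 module as a stand-in for any SQL database. The table name, index name, and data are hypothetical. The query text is the same before and after the index is created; only the physical access path is allowed to change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 10.0), ("bob", 25.0), ("alice", 5.5)],
)

# The application states what it wants, not how to find it.
QUERY = "SELECT amount FROM orders WHERE customer = ? ORDER BY amount"

# Run the query before any secondary index exists.
before = conn.execute(QUERY, ("alice",)).fetchall()

# Change the physical layer by adding an index. The query is untouched.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
after = conn.execute(QUERY, ("alice",)).fetchall()

# The optimizer reports the plan it chose; the wording varies by SQLite version.
plan = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("alice",)).fetchall()
```

The answers in before and after are identical, which is the abstraction the slide describes.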




Data Manipulation Language (DML)
   Data manipulated with Select, Insert, Update, &
    Delete statements
        Select T1.Column1, T2.Column2 …
         From Table1 T1, Table2 T2 …
         Where T1.Column1 = T2.Column1 …
   Data Aggregation
   Compound statements
   Functions and Procedures
   Explicit transaction control
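
The Select statement and the aggregation bullet above can be sketched as follows, assuming Python's built-in sqlite3 module. The Name column and the sample rows are invented for illustration and are not part of the original slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Column1 INTEGER, Name TEXT);
    CREATE TABLE Table2 (Column1 INTEGER, Column2 INTEGER);
    INSERT INTO Table1 VALUES (1, 'a'), (2, 'b');
    INSERT INTO Table2 VALUES (1, 10), (1, 20), (2, 5);
""")

# Join the two tables on Column1, following the shape of the slide's Select.
joined = conn.execute("""
    SELECT T1.Column1, T2.Column2
      FROM Table1 T1, Table2 T2
     WHERE T1.Column1 = T2.Column1
     ORDER BY T1.Column1, T2.Column2
""").fetchall()

# Data aggregation: total of Column2 for each Table1 row.
totals = conn.execute("""
    SELECT T1.Name, SUM(T2.Column2)
      FROM Table1 T1, Table2 T2
     WHERE T1.Column1 = T2.Column1
     GROUP BY T1.Name
     ORDER BY T1.Name
""").fetchall()
```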

Data Definition Language
   Schema defined at the start
   Create Table Table1 (Column1 Datatype1, Column2 Datatype2, …)
   Constraints to define and enforce relationships
        Primary Key
        Foreign Key
        Etc.
   Triggers to respond to Insert, Update, & Delete
   Stored Modules
   Alter …
   Drop …
   Security and Access Control
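
A small sketch of a schema defined up front with primary and foreign key constraints, again assuming Python's built-in sqlite3 module. The author and abstract tables are hypothetical. SQLite enforces foreign keys only when the pragma below is enabled; other SQL databases enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required in SQLite for FK enforcement
conn.executescript("""
    CREATE TABLE author (
        author_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE abstract (
        abstract_id INTEGER PRIMARY KEY,
        author_id   INTEGER NOT NULL REFERENCES author (author_id),
        title       TEXT NOT NULL
    );
""")

conn.execute("INSERT INTO author VALUES (1, 'Keith W. Hare')")
conn.execute("INSERT INTO abstract VALUES (1, 1, 'SQL Standard and NoSQL Databases')")

# A row that points at a missing author violates the foreign key constraint,
# so the database rejects it instead of relying on application code.
try:
    conn.execute("INSERT INTO abstract VALUES (2, 99, 'Orphan')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

abstract_count = conn.execute("SELECT COUNT(*) FROM abstract").fetchone()[0]
```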

Transactions – ACID Properties
   Atomic – All of the work in a transaction completes
    (commit) or none of it completes
   Consistent – A transaction transforms the database
    from one consistent state to another consistent
    state. Consistency is defined in terms of constraints.
   Isolated – The results of any changes made during a
    transaction are not visible until the transaction has
    committed.
   Durable – The results of a committed transaction
    survive failures
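
The Atomic and Consistent properties can be illustrated with a short sketch, assuming Python's built-in sqlite3 module. The account table, the CHECK constraint, and the transfer function are hypothetical examples, not part of the original slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()

def transfer(amount):
    """Move amount from account a to account b as one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = 'b'", (amount,))
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = 'a'", (amount,))
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(60)       # both updates commit together
failed = transfer(500)  # the debit breaks the CHECK constraint, so the credit is undone too
balances = dict(conn.execute("SELECT name, balance FROM account"))
```

After the failed transfer the balances are unchanged from the last committed state: either all of the work completes or none of it does.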
SQL Database Examples
   Commercial
      IBM DB2
      Oracle RDBMS
      Microsoft SQL Server
      Sybase SQL Anywhere

   Open Source (with commercial options)
      MySQL
      Ingres

   Significant portions of the world’s economy use SQL databases!

NoSQL Definition
From www.nosql-database.org:
     Next Generation Databases mostly addressing some of
     the points: being non-relational, distributed, open-source
     and horizontally scalable. The original intention
     has been modern web-scale databases. The
     movement began early 2009 and is growing rapidly.
     Often more characteristics apply as: schema-free,
     easy replication support, simple API, eventually
     consistent / BASE (not ACID), a huge data
     amount, and more.


NoSQL Products/Projects
http://www.nosql-database.org/ lists 122 NoSQL
Databases
   Cassandra
   CouchDB
   Hadoop & HBase
   MongoDB
   StupidDB
   Etc.


NoSQL Distinguishing Characteristics
   Large data volumes
        Google’s “big data”
   Scalable replication and distribution
        Potentially thousands of machines
        Potentially distributed around the world
   Queries need to return answers quickly
   Mostly query, few updates
   Asynchronous Inserts & Updates
   Schema-less
   ACID transaction properties are not needed – BASE
   CAP Theorem
   Open source development

BASE Transactions
   Acronym contrived to be the opposite of ACID
        Basically Available,
        Soft state,
        Eventually Consistent
   Characteristics
        Weak consistency – stale data OK
        Availability first
        Best effort
        Approximate answers OK
        Aggressive (optimistic)
        Simpler and faster
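
A toy model of "stale data OK" and "eventually consistent", written in plain Python. The class and method names are invented for this sketch and do not correspond to any particular NoSQL product.

```python
from collections import deque

class Replica:
    def __init__(self):
        self.data = {}

class EventuallyConsistentStore:
    """Toy model: writes land on one node and reach the others later."""

    def __init__(self, replica_count=2):
        self.primary = Replica()
        self.replicas = [Replica() for _ in range(replica_count)]
        self.pending = deque()  # replication log not yet applied to replicas

    def write(self, key, value):
        # Acknowledge immediately; replication happens asynchronously.
        self.primary.data[key] = value
        self.pending.append((key, value))

    def read(self, key, replica=0):
        # Reads are served by a replica and may return stale data.
        return self.replicas[replica].data.get(key)

    def replicate(self):
        # Drain the log; afterwards every replica agrees with the primary.
        while self.pending:
            key, value = self.pending.popleft()
            for r in self.replicas:
                r.data[key] = value

store = EventuallyConsistentStore()
store.write("title", "v1")
stale = store.read("title")   # the replica has not seen the write yet
store.replicate()
fresh = store.read("title")   # the replica has caught up
```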

Brewer’s CAP Theorem
A distributed system can support only two of the
following characteristics:
   Consistency
   Availability
   Partition tolerance

The slides from Brewer’s July 2000 talk do not
define these characteristics.



Consistency
   all nodes see the same data at the same time –
    Wikipedia
   client perceives that a set of operations has
    occurred all at once – Pritchett
   More like Atomic in ACID transaction
    properties




Availability
   node failures do not prevent survivors from
    continuing to operate – Wikipedia
   Every operation must terminate in an intended
    response – Pritchett




Partition Tolerance
   the system continues to operate despite arbitrary
    message loss – Wikipedia
   Operations will complete, even if individual
    components are unavailable – Pritchett
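
The trade-off among the three characteristics can be sketched with a toy two-node system in plain Python. The "CP" and "AP" mode labels and all class names are assumptions of this sketch. During a partition, one mode refuses the write (consistent but not available) and the other accepts it (available but the nodes disagree).

```python
class Node:
    def __init__(self):
        self.value = None

class TwoNodeSystem:
    def __init__(self, mode):
        self.mode = mode  # "CP" or "AP" (labels used only in this sketch)
        self.a, self.b = Node(), Node()
        self.partitioned = False

    def write(self, node, value):
        other = self.b if node is self.a else self.a
        if not self.partitioned:
            node.value = other.value = value  # both nodes updated together
            return True
        if self.mode == "CP":
            return False      # give up availability to stay consistent
        node.value = value    # AP: accept the write, the nodes now disagree
        return True

cp = TwoNodeSystem("CP")
cp.write(cp.a, 1)
cp.partitioned = True
cp_accepted = cp.write(cp.a, 2)

ap = TwoNodeSystem("AP")
ap.write(ap.a, 1)
ap.partitioned = True
ap_accepted = ap.write(ap.a, 2)
```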




NoSQL Database Types
Discussing NoSQL databases is complicated
because there are a variety of types:
Column Store – Each storage block contains data
from only one column
Document Store – stores documents made up of
tagged elements
Key-Value Store – Hash table of keys




Other Non-SQL Databases
   XML Databases
   Graph Databases
   Codasyl Databases
   Object Oriented Databases
   Etc…
   Will not address these today




NoSQL Example: Column Store
   Each storage block contains data from only one
    column
   Example: Hadoop/HBase
      http://hadoop.apache.org/
      Yahoo, Facebook

   Example: Ingres VectorWise
      Column Store integrated with an SQL database
      http://www.ingres.com/products/vectorwise




Column Store Comments
   More efficient than row (or document) store if:
      Multiple row/record/documents are inserted at the
       same time so updates of column blocks can be
       aggregated
      Retrievals access only some of the columns in a
       row/record/document
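
A rough illustration of the two layouts in plain Python, using lists and dicts as stand-ins for storage blocks. The sample records and function names are hypothetical.

```python
rows = [
    {"id": 1, "author": "Hare", "year": 2011},
    {"id": 2, "author": "Brewer", "year": 2000},
    {"id": 3, "author": "Dean", "year": 2004},
]

# Row store: each block (here, a dict) holds every column of one record.
row_store = rows

# Column store: each block (here, a list) holds one column for all records.
column_store = {name: [r[name] for r in rows] for name in rows[0]}

def years_from_rows(store):
    # Touches every record, including columns the query does not need.
    return [record["year"] for record in store]

def years_from_columns(store):
    # Touches only the single column block that the query needs.
    return store["year"]

def reassemble(store, index):
    # Rebuilding one full record needs a lookup in every column block.
    return {name: values[index] for name, values in store.items()}
```

Reading one column is cheap in the column layout, while rebuilding a whole record costs one lookup per column, which matches the conditions listed above.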




NoSQL Example: Document Store
   Example: CouchDB
      http://couchdb.apache.org/
      BBC

   Example: MongoDB
      http://www.mongodb.org/
      Foursquare, Shutterfly

   JSON – JavaScript Object Notation



CouchDB JSON Example
{
    "_id": "guid goes here",
    "_rev": "314159",

    "type": "abstract",

    "author": "Keith W. Hare",

    "title": "SQL Standard and NoSQL Databases",

    "body": "NoSQL databases (either no-SQL or Not Only SQL) are currently a hot topic in some parts of computing.",
    "creation_timestamp": "2011/05/10 13:30:00 +0004"
}




CouchDB JSON Tags
   "_id"
        GUID – Globally Unique Identifier
        Passed in or generated by CouchDB
   "_rev"
        Revision number
        Versioning mechanism
   "type", "author", "title", etc.
        Arbitrary tags
        Schema-less
        Could be validated after the fact by user-written routine
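
A sketch of what a user-written, after-the-fact validation routine might look like, in plain Python with the standard json module. This is not CouchDB's own validation mechanism; the REQUIRED_TAGS rules and the validate function are hypothetical.

```python
import json

# Hypothetical rules: tags every document is expected to carry, with types.
REQUIRED_TAGS = {"type": str, "author": str, "title": str}

def validate(document):
    """Return a list of problems found in one parsed JSON document."""
    problems = []
    for tag, expected_type in REQUIRED_TAGS.items():
        if tag not in document:
            problems.append(f"missing tag: {tag}")
        elif not isinstance(document[tag], expected_type):
            problems.append(f"wrong type for tag: {tag}")
    return problems

good = json.loads(
    '{"_id": "1", "type": "abstract", "author": "Keith W. Hare", '
    '"title": "SQL Standard and NoSQL Databases"}'
)
bad = json.loads('{"_id": "2", "type": "abstract", "title": 42}')

good_problems = validate(good)
bad_problems = validate(bad)
```

Because the store itself is schema-less, both documents can be saved; only the routine notices that the second one is incomplete.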
NoSQL Examples: Key-Value Store
   Hash tables of Keys
   Values stored with Keys
   Fast access to small data values
   Example – Project Voldemort
      http://www.project-voldemort.com/
      LinkedIn

   Example – MemcacheDB
      http://memcachedb.org/
      Backend storage is Berkeley-DB
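
The idea of a hash table of keys with values stored alongside can be sketched in a few lines of plain Python. The KeyValueStore class is a hypothetical in-memory illustration, not the API of Project Voldemort or any other product.

```python
class KeyValueStore:
    """In-memory key-value store backed by a Python dict (a hash table)."""

    def __init__(self):
        self._table = {}

    def put(self, key, value):
        # Store or overwrite the value for this key.
        self._table[key] = value

    def get(self, key, default=None):
        # Look up by exact key only; there is no query on the value.
        return self._table.get(key, default)

    def delete(self, key):
        # Report whether the key existed before removal.
        if key in self._table:
            del self._table[key]
            return True
        return False

kv = KeyValueStore()
kv.put("user:42:name", "Keith")
kv.put("user:42:visits", 7)
```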

Map Reduce
   Technique for indexing and searching large data
    volumes
   Two Phases, Map and Reduce
        Map
            Extract sets of Key-Value pairs from underlying data
            Potentially in Parallel on multiple machines

        Reduce
            Merge and sort sets of Key-Value pairs
            Results may be useful for other searches
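
The two phases can be sketched with a single-process word count in plain Python. The function names and sample documents are invented for illustration; a real system would spread the map calls across machines.

```python
from collections import defaultdict

def map_phase(document):
    # Emit a (key, value) pair for every word in one document.
    return [(word.lower(), 1) for word in document.split()]

def reduce_phase(pairs):
    # Group the pairs by key, then merge the values of each key in sorted order.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: sum(values) for key, values in sorted(grouped.items())}

documents = ["SQL and NoSQL", "NoSQL databases", "SQL databases and SQL standards"]

# Each map call is independent, so it could run on a separate machine.
intermediate = [pair for doc in documents for pair in map_phase(doc)]
word_counts = reduce_phase(intermediate)
```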




Map Reduce
   Map Reduce techniques differ across products
   Implemented by application developers, not by
    underlying software




Map Reduce Patent
Google was granted US Patent 7,650,331 in January 2010:
     System and method for efficient large-scale data processing
     A large-scale data processing system and method includes one
     or more application-independent map modules configured to
     read input data and to apply at least one application-specific
     map operation to the input data to produce intermediate data
     values, wherein the map operation is automatically parallelized
     across multiple processors in the parallel processing
     environment. A plurality of intermediate data structures are
     used to store the intermediate data values. One or more
     application-independent reduce modules are configured to
     retrieve the intermediate data values and to apply at least one
     application-specific reduce operation to the intermediate
     data values to provide output data.


Storing and Modifying Data
   Syntax varies
      HTML
      JavaScript

      Etc.

   Asynchronous – Inserts and updates do not wait
    for confirmation
   Versioned
   Optimistic Concurrency
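
Versioning and optimistic concurrency can be sketched together in plain Python: every update must name the revision it was based on, and a stale revision is rejected. The VersionedStore class and ConflictError are hypothetical and only loosely modeled on the "_rev" idea shown earlier.

```python
class ConflictError(Exception):
    pass

class VersionedStore:
    """Toy document store where each update must name the revision it read."""

    def __init__(self):
        self._docs = {}  # doc_id -> (revision number, body)

    def get(self, doc_id):
        return self._docs[doc_id]

    def put(self, doc_id, body, expected_rev=0):
        current_rev = self._docs.get(doc_id, (0, None))[0]
        if expected_rev != current_rev:
            # Someone else updated the document since it was read.
            raise ConflictError(f"expected rev {expected_rev}, found {current_rev}")
        self._docs[doc_id] = (current_rev + 1, body)
        return current_rev + 1

store = VersionedStore()
rev1 = store.put("abstract", {"title": "draft"})
rev2 = store.put("abstract", {"title": "final"}, expected_rev=rev1)

# A writer still holding revision 1 loses the race and must re-read and retry.
try:
    store.put("abstract", {"title": "stale edit"}, expected_rev=rev1)
    conflict = False
except ConflictError:
    conflict = True
```

No locks are held between the read and the write; the conflict is detected only when the late writer tries to save.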


Retrieving Data
   Syntax Varies
      No set-based query language
      Procedural program languages such as Java, C, etc.

   Application specifies retrieval path
   No query optimizer
   Quick answer is important
   May not be a single “right” answer
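
A sketch of what "application specifies retrieval path" means in practice, in plain Python. The documents and the function are hypothetical. Every step that a query optimizer would choose in an SQL database is written out by the programmer.

```python
documents = {
    "doc1": {"type": "abstract", "author": "Keith W. Hare", "year": 2011},
    "doc2": {"type": "paper", "author": "Eric A. Brewer", "year": 2000},
    "doc3": {"type": "abstract", "author": "Jeffrey Dean", "year": 2004},
}

def authors_of_abstracts(store):
    """The application chooses the access path: scan, filter, project, sort."""
    result = []
    for doc_id in store:                   # 1. walk every document
        doc = store[doc_id]
        if doc.get("type") == "abstract":  # 2. filter in application code
            result.append(doc["author"])   # 3. pick the field of interest
    return sorted(result)                  # 4. order the answer

# The equivalent declarative request, shown only for comparison:
#   SELECT author FROM documents WHERE type = 'abstract' ORDER BY author
abstract_authors = authors_of_abstracts(documents)
```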



Open Source
   Small upfront software costs
   Suitable for large scale distribution on
    commodity hardware




NoSQL Summary
   NoSQL databases reject:
      Overhead of ACID transactions
      “Complexity” of SQL

      Burden of up-front schema design

      Declarative query expression

      Yesterday’s technology

   Programmer responsible for
      Step-by-step procedural language
      Navigating access path


Summary
   SQL Databases
      Predefined Schema
      Standard definition and interface language
      Tight consistency
      Well defined semantics

   NoSQL Databases
      No predefined Schema
      Per-product definition and interface language
      Getting an answer quickly is more important than
       getting a correct answer

Questions?




Web References
   “NoSQL -- Your Ultimate Guide to the Non-Relational Universe!”
    http://nosql-database.org/links.html
   “NoSQL (RDBMS)”
    http://en.wikipedia.org/wiki/NoSQL
   PODC Keynote, July 19, 2000. “Towards Robust Distributed Systems.”
    Dr. Eric A. Brewer, Professor, UC Berkeley; Co-Founder & Chief
    Scientist, Inktomi.
    www.eecs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
   “Brewer's CAP Theorem” posted by Julian Browne, January 11,
    2009.
    http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
   “How to write a CV” Geek & Poke Cartoon
    http://geekandpoke.typepad.com/geekandpoke/2011/01/nosql.html

Web References
   “Exploring CouchDB: A document-oriented database for Web
    applications”, Joe Lennon, Software developer, Core
    International.
    http://www.ibm.com/developerworks/opensource/library/os-couchdb/index.html
   “Graph Databases, NOSQL and Neo4j” Posted by Peter
    Neubauer on May 12, 2010  at:
    http://www.infoq.com/articles/graph-nosql-neo4j
   “Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs
    HBase comparison”, Kristóf Kovács.
    http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
   “Distinguishing Two Major Types of Column-Stores” Posted by
    Daniel Abadi on March 29, 2010
    http://dbmsmusings.blogspot.com/2010/03/distinguishing-two-major-types-of_29.html

Web References
   “MapReduce: Simplified Data Processing on Large Clusters”,
    Jeffrey Dean and Sanjay Ghemawat, December 2004.
    http://labs.google.com/papers/mapreduce.html
   “Scalable SQL”, ACM Queue, Michael Rys, April 19, 2011
    http://queue.acm.org/detail.cfm?id=1971597
   “a practical guide to noSQL”, Posted by Denise Miura on March
    17, 2011 at
    http://blogs.marklogic.com/2011/03/17/a-practical-guide-to-nosql/




Books
   “CouchDB The Definitive Guide”, J. Chris Anderson, Jan Lehnardt
    and Noah Slater. O’Reilly Media Inc., Sebastopol, CA, USA.
    2010
   “Hadoop The Definitive Guide”, Tom White. O’Reilly Media Inc.,
    Sebastopol, CA, USA. 2011
   “MongoDB The Definitive Guide”, Kristina Chodorow and
    Michael Dirolf. O’Reilly Media Inc., Sebastopol, CA, USA.
    2010





"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 

Último (20)

TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 

RDBMS vs NoSQL

  • 1. ISO/IEC JTC1/SC32/WG2 N1537. A Comparison of SQL and NoSQL Databases. Keith W. Hare, JCC Consulting, Inc., Convenor, ISO/IEC JTC1 SC32 WG3. December 29, 2012, Metadata Open Forum
  • 2. Abstract: NoSQL databases (either no-SQL or Not Only SQL) are currently a hot topic in some parts of computing. In fact, one website lists over a hundred different NoSQL databases. This presentation reviews the features common to the NoSQL databases and compares those features to the features and capabilities of SQL databases.
  • 3. Who Am I?
      - Muskingum College, 1980, BS in Biology and Computer Science
      - Senior Consultant with JCC Consulting, Inc. since 1985 – high performance database systems
      - Ohio State – Masters in Computer & Information Science, 1985
      - SQL Standards committees since 1988
      - Vice Chair, INCITS H2 since 2003
      - Convenor, ISO/IEC JTC1 SC32 WG3 since 2005
  • 4. Topics
      - SQL Databases
          - SQL Standard
          - SQL Characteristics
          - SQL Database Examples
      - NoSQL Databases
          - NoSQL Definition
          - General Characteristics
          - NoSQL Database Types
          - NoSQL Database Examples
  • 5. Standard SQL
    The following is a short, incomplete history of the SQL Standards – ISO/IEC 9075
      - 1987 – Initial ISO/IEC Standard
      - 1989 – Referential Integrity
      - 1992 – SQL2
          - 1995 – SQL/CLI (ODBC)
          - 1996 – SQL/PSM – Procedural Language extensions
      - 1999 – User Defined Types
      - 2003 – SQL/XML
      - 2008 – Expansions and corrections
      - 2011 (or 2012) – System Versioned and Application Time Period Tables
  • 6. SQL Characteristics
      - Data stored in columns and tables
      - Relationships represented by data
      - Data Manipulation Language
      - Data Definition Language
      - Transactions
      - Abstraction from physical layer
  • 7. SQL Physical Layer Abstraction
      - Applications specify what, not how
      - Query optimization engine
      - Physical layer can change without modifying applications
          - Create indexes to support queries
          - In Memory databases
  • 8. Data Manipulation Language (DML)
      - Data manipulated with Select, Insert, Update, & Delete statements
          - Select T1.Column1, T2.Column2 … From Table1, Table2 … Where T1.Column1 = T2.Column1 …
      - Data Aggregation
      - Compound statements
      - Functions and Procedures
      - Explicit transaction control
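The Select pattern on this slide can be tried out with a minimal sketch using Python's built-in sqlite3 module. The table contents, the extra Name column, and the explicit table aliases are assumptions made for illustration; the slide only gives the statement shape.

```python
import sqlite3

# In-memory database; the table layout and sample rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Column1 INTEGER, Name TEXT);
    CREATE TABLE Table2 (Column1 INTEGER, Column2 TEXT);
    INSERT INTO Table1 VALUES (1, 'a'), (2, 'b');
    INSERT INTO Table2 VALUES (1, 'x'), (3, 'y');
""")

# The join from the slide: the application states what it wants,
# and the engine decides how to find the matching rows.
rows = conn.execute("""
    SELECT T1.Column1, T2.Column2
    FROM Table1 AS T1, Table2 AS T2
    WHERE T1.Column1 = T2.Column1
""").fetchall()

# A simple aggregation over one table.
count = conn.execute("SELECT COUNT(*) FROM Table1").fetchone()[0]
```

Only the row with Column1 = 1 appears in both tables, so the join returns a single row.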
  • 9. Data Definition Language
      - Schema defined at the start
      - Create Table (Column1 Datatype1, Column2 Datatype2, …)
      - Constraints to define and enforce relationships
          - Primary Key
          - Foreign Key
          - Etc.
      - Triggers to respond to Insert, Update, & Delete
      - Stored Modules
      - Alter …
      - Drop …
      - Security and Access Control
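As a rough illustration of constraints enforcing relationships, the sketch below declares a primary key and a foreign key in SQLite through Python's sqlite3 module. The author and abstract tables are hypothetical names chosen for this example, and SQLite only enforces foreign keys once the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless it is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema, defined up front as the slide describes.
conn.execute(
    "CREATE TABLE author (author_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("""
    CREATE TABLE abstract (
        abstract_id INTEGER PRIMARY KEY,
        author_id   INTEGER NOT NULL REFERENCES author(author_id),
        title       TEXT NOT NULL
    )
""")

conn.execute("INSERT INTO author VALUES (1, 'Keith W. Hare')")
conn.execute("INSERT INTO abstract VALUES (1, 1, 'SQL Standard and NoSQL Databases')")

# A row that points at a non-existent author violates the foreign key.
try:
    conn.execute("INSERT INTO abstract VALUES (2, 99, 'Orphan row')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The database, not the application, refuses the orphan row.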
  • 10. Transactions – ACID Properties
      - Atomic – All of the work in a transaction completes (commit) or none of it completes
      - Consistent – A transaction transforms the database from one consistent state to another consistent state. Consistency is defined in terms of constraints.
      - Isolated – The results of any changes made during a transaction are not visible until the transaction has committed.
      - Durable – The results of a committed transaction survive failures
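Atomicity and constraint-based consistency can be sketched with sqlite3 as follows. The account table, the CHECK constraint, and the transfer helper are assumptions invented for this example, not part of the presentation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the CHECK constraint defines what "consistent" means here.
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)       # succeeds
failed = transfer(conn, 1, 2, 500)  # debit would go negative, so the credit is undone too
balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
```

In the second call the credit to account 2 is applied first and then rolled back when the debit violates the constraint, so no partial transfer is ever visible.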
  • 11. SQL Database Examples
      - Commercial
          - IBM DB2
          - Oracle RDBMS
          - Microsoft SQL Server
          - Sybase SQL Anywhere
      - Open Source (with commercial options)
          - MySQL
          - Ingres
    Significant portions of the world’s economy use SQL databases!
  • 12. NoSQL Definition
    From www.nosql-database.org: Next Generation Databases mostly addressing some of the points: being non-relational, distributed, open-source and horizontally scalable. The original intention has been modern web-scale databases. The movement began early 2009 and is growing rapidly. Often more characteristics apply as: schema-free, easy replication support, simple API, eventually consistent / BASE (not ACID), a huge data amount, and more.
  • 13. NoSQL Products/Projects
    http://www.nosql-database.org/ lists 122 NoSQL Databases
      - Cassandra
      - CouchDB
      - Hadoop & HBase
      - MongoDB
      - StupidDB
      - Etc.
  • 14. NoSQL Distinguishing Characteristics
      - Large data volumes
          - Google’s “big data”
      - Scalable replication and distribution
          - Potentially thousands of machines
          - Potentially distributed around the world
      - Queries need to return answers quickly
      - Mostly query, few updates
      - Asynchronous Inserts & Updates
      - Schema-less
      - ACID transaction properties are not needed – BASE
      - CAP Theorem
      - Open source development
  • 15. BASE Transactions
      - Acronym contrived to be the opposite of ACID
          - Basically Available,
          - Soft state,
          - Eventually Consistent
      - Characteristics
          - Weak consistency – stale data OK
          - Availability first
          - Best effort
          - Approximate answers OK
          - Aggressive (optimistic)
          - Simpler and faster
  • 16. Brewer’s CAP Theorem
    A distributed system can support only two of the following characteristics:
      - Consistency
      - Availability
      - Partition tolerance
    The slides from Brewer’s July 2000 talk do not define these characteristics.
  • 17. Consistency
      - All nodes see the same data at the same time – Wikipedia
      - Client perceives that a set of operations has occurred all at once – Pritchett
      - More like Atomic in ACID transaction properties
  • 18. Availability
      - Node failures do not prevent survivors from continuing to operate – Wikipedia
      - Every operation must terminate in an intended response – Pritchett
  • 19. Partition Tolerance
      - The system continues to operate despite arbitrary message loss – Wikipedia
      - Operations will complete, even if individual components are unavailable – Pritchett
  • 20. NoSQL Database Types
    Discussing NoSQL databases is complicated because there are a variety of types:
      - Column Store – Each storage block contains data from only one column
      - Document Store – stores documents made up of tagged elements
      - Key-Value Store – Hash table of keys
  • 21. Other Non-SQL Databases
      - XML Databases
      - Graph Databases
      - Codasyl Databases
      - Object Oriented Databases
      - Etc…
      - Will not address these today
  • 22. NoSQL Example: Column Store
      - Each storage block contains data from only one column
      - Example: Hadoop/HBase
          - http://hadoop.apache.org/
          - Yahoo, Facebook
      - Example: Ingres VectorWise
          - Column Store integrated with an SQL database
          - http://www.ingres.com/products/vectorwise
  • 23. Column Store Comments
      - More efficient than row (or document) store if:
          - Multiple row/record/documents are inserted at the same time so updates of column blocks can be aggregated
          - Retrievals access only some of the columns in a row/record/document
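The difference between the two layouts can be pictured with plain Python structures. This is only a conceptual sketch: the sample records and the to_columns helper are invented for illustration and do not reflect how any particular product stores its blocks.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "author": "A", "year": 2010},
    {"id": 2, "author": "B", "year": 2011},
    {"id": 3, "author": "C", "year": 2011},
]


def to_columns(records):
    """Pivot records into one list per column (assumes every record has the same fields)."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# A retrieval that needs only one column touches only that column's values,
# instead of reading every field of every record.
years = columns["year"]
```

Inserting the three records as one batch also means each column list is extended once per batch rather than once per record, which is the aggregation the slide refers to.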
  • 24. NoSQL Example: Document Store
      - Example: CouchDB
          - http://couchdb.apache.org/
          - BBC
      - Example: MongoDB
          - http://www.mongodb.org/
          - Foursquare, Shutterfly
      - JSON – JavaScript Object Notation
  • 25. CouchDB JSON Example
      {
        "_id": "guid goes here",
        "_rev": "314159",
        "type": "abstract",
        "author": "Keith W. Hare",
        "title": "SQL Standard and NoSQL Databases",
        "body": "NoSQL databases (either no-SQL or Not Only SQL) are currently a hot topic in some parts of computing.",
        "creation_timestamp": "2011/05/10 13:30:00 +0004"
      }
  • 26. CouchDB JSON Tags
      - "_id"
          - GUID – Global Unique Identifier
          - Passed in or generated by CouchDB
      - "_rev"
          - Revision number
          - Versioning mechanism
      - "type", "author", "title", etc.
          - Arbitrary tags
          - Schema-less
          - Could be validated after the fact by user-written routine
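A user-written, after-the-fact validation routine might look roughly like the sketch below. It uses only Python's json module rather than any CouchDB API, and the REQUIRED_TAGS set and missing_tags function are hypothetical names chosen for this example.

```python
import json

# Hypothetical rule set: the tags this application expects on every document.
REQUIRED_TAGS = {"_id", "type", "author", "title"}


def missing_tags(doc_text):
    """Parse a JSON document and return the required tags it lacks, sorted."""
    doc = json.loads(doc_text)
    return sorted(REQUIRED_TAGS - set(doc))


complete = '{"_id": "guid goes here", "type": "abstract", "author": "Keith W. Hare", "title": "SQL Standard and NoSQL Databases"}'
incomplete = '{"_id": "guid goes here", "type": "abstract", "author": "Keith W. Hare"}'

complete_result = missing_tags(complete)
incomplete_result = missing_tags(incomplete)
```

Because the store itself is schema-less, nothing stops the incomplete document from being saved; a check like this runs in application code.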
  • 27. NoSQL Examples: Key-Value Store
      - Hash tables of Keys
      - Values stored with Keys
      - Fast access to small data values
      - Example – Project-Voldemort
          - http://www.project-voldemort.com/
          - LinkedIn
      - Example – MemCacheDB
          - http://memcachedb.org/
          - Backend storage is Berkeley-DB
  • 28. Map Reduce
      - Technique for indexing and searching large data volumes
      - Two Phases, Map and Reduce
          - Map
              - Extract sets of Key-Value pairs from underlying data
              - Potentially in Parallel on multiple machines
          - Reduce
              - Merge and sort sets of Key-Value pairs
              - Results may be useful for other searches
  • 29. Map Reduce
      - Map Reduce techniques differ across products
      - Implemented by application developers, not by underlying software
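The two phases described in the Map Reduce slides can be sketched in a single process with a word count, the customary toy example. The map_phase and reduce_phase functions and the sample documents are assumptions for illustration; a real deployment would run the map step in parallel across machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map: extract (key, value) pairs from one piece of the underlying data."""
    return [(word.lower(), 1) for word in document.split()]


def reduce_phase(pairs):
    """Reduce: sort the pairs and merge the values that share a key."""
    totals = defaultdict(int)
    for key, value in sorted(pairs):
        totals[key] += value
    return dict(totals)


documents = ["SQL and NoSQL", "NoSQL databases", "SQL databases"]

# Each document could be mapped on a different machine; here it is a plain loop.
pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = reduce_phase(pairs)
```

Both functions are written by the application developer, which matches the slide's point that the technique is not supplied by the underlying software.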
  • 30. Map Reduce Patent
    Google granted US Patent 7,650,331, January 2010: System and method for efficient large-scale data processing
    A large-scale data processing system and method includes one or more application-independent map modules configured to read input data and to apply at least one application-specific map operation to the input data to produce intermediate data values, wherein the map operation is automatically parallelized across multiple processors in the parallel processing environment. A plurality of intermediate data structures are used to store the intermediate data values. One or more application-independent reduce modules are configured to retrieve the intermediate data values and to apply at least one application-specific reduce operation to the intermediate data values to provide output data.
  • 31. Storing and Modifying Data
      - Syntax varies
          - HTML
          - JavaScript
          - Etc.
      - Asynchronous – Inserts and updates do not wait for confirmation
      - Versioned
      - Optimistic Concurrency
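Versioning with optimistic concurrency can be sketched as an in-memory store that accepts a write only when the writer names the revision it last read. The VersionedStore and ConflictError names are hypothetical, and the integer revisions are a simplification; this is not the API of CouchDB or any other product.

```python
class ConflictError(Exception):
    """Raised when a write is based on a stale revision."""


class VersionedStore:
    def __init__(self):
        self._docs = {}  # key -> (revision, value)

    def get(self, key):
        """Return the current (revision, value) pair for key."""
        return self._docs[key]

    def put(self, key, value, expected_rev=None):
        """Store value if expected_rev matches the current revision; return the new revision."""
        current = self._docs.get(key)
        current_rev = current[0] if current else None
        if current_rev != expected_rev:
            raise ConflictError(f"expected revision {expected_rev}, found {current_rev}")
        new_rev = (current_rev or 0) + 1
        self._docs[key] = (new_rev, value)
        return new_rev


store = VersionedStore()
rev1 = store.put("abstract", {"title": "draft"})

# Two writers both read revision 1; no locks are taken.
rev2 = store.put("abstract", {"title": "writer A"}, expected_rev=rev1)
try:
    store.put("abstract", {"title": "writer B"}, expected_rev=rev1)
    conflict = False
except ConflictError:
    conflict = True
```

The second writer has to re-read the document and retry, which is the cost of avoiding locks.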
  • 32. Retrieving Data
      - Syntax Varies
          - No set-based query language
          - Procedural program languages such as Java, C, etc.
      - Application specifies retrieval path
      - No query optimizer
      - Quick answer is important
      - May not be a single “right” answer
  • 33. Open Source
      - Small upfront software costs
      - Suitable for large scale distribution on commodity hardware
  • 34. NoSQL Summary
      - NoSQL databases reject:
          - Overhead of ACID transactions
          - “Complexity” of SQL
          - Burden of up-front schema design
          - Declarative query expression
          - Yesterday’s technology
      - Programmer responsible for
          - Step-by-step procedural language
          - Navigating access path
  • 35. Summary
      - SQL Databases
          - Predefined Schema
          - Standard definition and interface language
          - Tight consistency
          - Well defined semantics
      - NoSQL Database
          - No predefined Schema
          - Per-product definition and interface language
          - Getting an answer quickly is more important than getting a correct answer
  • 37. Questions?
  • 38. Web References
      - “NoSQL -- Your Ultimate Guide to the Non-Relational Universe!” http://nosql-database.org/links.html
      - “NoSQL (RDBMS)” http://en.wikipedia.org/wiki/NoSQL
      - “Towards Robust Distributed Systems”, PODC Keynote, July 19, 2000. Dr. Eric A. Brewer, Professor, UC Berkeley, Co-Founder & Chief Scientist, Inktomi. www.eecs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
      - “Brewer's CAP Theorem”, posted by Julian Browne, January 11, 2009. http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
      - “How to write a CV”, Geek & Poke Cartoon. http://geekandpoke.typepad.com/geekandpoke/2011/01/nosql.html
  • 39. Web References
      - “Exploring CouchDB: A document-oriented database for Web applications”, Joe Lennon, Software developer, Core International. http://www.ibm.com/developerworks/opensource/library/os-couchdb/index.html
      - “Graph Databases, NOSQL and Neo4j”, posted by Peter Neubauer on May 12, 2010. http://www.infoq.com/articles/graph-nosql-neo4j
      - “Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase comparison”, Kristóf Kovács. http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
      - “Distinguishing Two Major Types of Column-Stores”, posted by Daniel Abadi on March 29, 2010. http://dbmsmusings.blogspot.com/2010/03/distinguishing-two-major-types-of_29.html
  • 40. Web References
      - “MapReduce: Simplified Data Processing on Large Clusters”, Jeffrey Dean and Sanjay Ghemawat, December 2004. http://labs.google.com/papers/mapreduce.html
      - “Scalable SQL”, ACM Queue, Michael Rys, April 19, 2011. http://queue.acm.org/detail.cfm?id=1971597
      - “a practical guide to noSQL”, posted by Denise Miura on March 17, 2011. http://blogs.marklogic.com/2011/03/17/a-practical-guide-to-nosql/
  • 41. Books
      - “CouchDB The Definitive Guide”, J. Chris Anderson, Jan Lehnardt and Noah Slater. O’Reilly Media Inc., Sebastopol, CA, USA. 2010
      - “Hadoop The Definitive Guide”, Tom White. O’Reilly Media Inc., Sebastopol, CA, USA. 2011
      - “MongoDB The Definitive Guide”, Kristina Chodorow and Michael Dirolf. O’Reilly Media Inc., Sebastopol, CA, USA. 2010