Overview of
NoSQL
...motivation, technologies, should you
care?
Overview
● Evolution of/motivation for NoSQL
  databases
● Characterization of NoSQL databases
● Classification of NoSQL databases
● Popularity/usage of NoSQL systems
A brief history of NoSQL
● Originally coined in 1998 by Carlo Strozzi for his
  lightweight relational database that had no SQL interface
   ○ easy to use, free, text based data storage, easy
     manipulation of contents of db
● Reintroduced by Eric Evans (Rackspace) in 2009
  for an event on open source distributed
  databases
   ○ in response to increased interest in non-RDBMS
     solutions
      ■ bringing together Cassandra, MongoDB, CouchDB, etc.
● Has grown as a movement over the last 3 years
Current status
● Significant buzz within the community in 2010
  ○ initial development of the technology
  ○ pioneer deployments
  ○ lots of meetups/conferences/birds-of-a-feather sessions
● Many key technologies evolved in late 2010 and
  2011
  ○ more large deployments for some technologies
  ○ small companies with no legacy systems basing their
    operations on NoSQL
Current Status
● 2012
  ○   buzz/hype is fading
  ○   technology continues to mature
  ○   increased number of deployments
  ○   skills sought in job market
NoSQL - a negative
definition
● NoSQL is defined simply by being non-relational
  ○ a diverse set of technologies falls into the NoSQL camp
● Motivations are mixed
  ○   open source
  ○   scale - TB, PB - particularly for read/write latency
  ○   increased flexibility over RDBMSs
  ○   ability to work with raw data
  ○   ACID is not always the most appropriate design choice
       ■ analytics data is an excellent example
● Results in many different NoSQL
  technologies
Typical characteristics
● Don't use SQL!
● Open Source
● Intended to deliver performance
  ○ in some dimension
● JOINs typically not supported
  ○ they carry a performance hit
● Consistency often relaxed
  ○ eventual consistency
● More flexibility in schema
  ○ if schema used at all!
Diversity of NoSQL
databases
● 122 separate technologies listed on
  http://nosql-database.org/
  ○ mix of commercial, open source and some
    in between
● Vary in many dimensions:
  ○ architecture
  ○ interfaces
     ■ api/languages
  ○ internal data storage
  ○ distribution mechanisms
     ■ redundancy, reliability
  ○ usage - deployments & support community
  ○ maturity
Classification of NoSQL
systems
●   Column based solutions
●   Document store solutions
●   Key/Value solutions
●   Graph based solutions
●   Less significant:
    ○ XML databases
    ○ Object databases
    ○ Multivalue databases
Column based solutions
● Structured data
  ○ similar to classical tables
● Generally much more flexible
  ○ no rigorous schema necessary
  ○ can typically add columns in ad hoc fashion
    ■ often without explicitly declaring column
● However, can result in very different usage
  ○ e.g. can have millions of columns associated with
    a given row
● Examples: Hadoop/HBase, Cassandra,
  Hypertable, SimpleDB
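
The ad hoc column model above can be illustrated with a minimal in-memory sketch. This is a toy, not how HBase or Cassandra store data; the `ColumnFamily` class and its method names are hypothetical and exist only for illustration.

```python
from collections import defaultdict


class ColumnFamily:
    """Toy column family (hypothetical): row key -> {column name -> value}."""

    def __init__(self):
        self._rows = defaultdict(dict)

    def put(self, row_key, column, value):
        # No schema: a column comes into existence the first time it is written.
        self._rows[row_key][column] = value

    def get(self, row_key, column, default=None):
        # .get() on the outer mapping avoids creating empty rows on reads.
        return self._rows.get(row_key, {}).get(column, default)

    def columns(self, row_key):
        return sorted(self._rows.get(row_key, {}))


users = ColumnFamily()
users.put("alice", "email", "alice@example.org")
users.put("alice", "city", "Dublin")
users.put("bob", "email", "bob@example.org")

# A "wide row": one column per login event, added without declaring anything.
for ts in range(1000):
    users.put("bob", f"login:{ts:04d}", "ok")
```

Rows in the same column family end up with entirely different column sets, which is the "very different usage" the slide refers to.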
Document based solutions
● Less structured data
  ○ DB composed of 'documents' containing arbitrary
    data
    ■ usually containing longer form content, e.g. a CMS
● Documents contain some structure to
  support query/search/filter, etc
● Somewhat less emphasis on a key
  ○ can be autogenerated
● Quite unlike classical databases
● Examples: MongoDB, CouchDB
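
A minimal sketch of the document model, assuming documents are plain Python dicts. The `DocumentStore` class is hypothetical and does not mirror the MongoDB or CouchDB APIs; it only shows autogenerated keys and filtering on whatever structure a document happens to have.

```python
import uuid


class DocumentStore:
    """Toy document store (hypothetical): documents are arbitrary dicts."""

    def __init__(self):
        self._docs = {}

    def insert(self, doc):
        # The key is autogenerated; callers rarely need to choose one.
        doc_id = uuid.uuid4().hex
        self._docs[doc_id] = dict(doc)
        return doc_id

    def get(self, doc_id):
        return self._docs.get(doc_id)

    def find(self, **criteria):
        # Match documents whose fields equal every given criterion.
        # Documents lacking a field simply do not match.
        return [
            doc for doc in self._docs.values()
            if all(doc.get(field) == value for field, value in criteria.items())
        ]


cms = DocumentStore()
article_id = cms.insert({"type": "article", "title": "NoSQL overview", "tags": ["db"]})
cms.insert({"type": "article", "title": "Cassandra intro", "author": "sean"})
cms.insert({"type": "comment", "body": "Nice talk", "article": article_id})

articles = cms.find(type="article")
```

Note that the three documents share no fixed set of fields, yet they can still be queried by the fields they do have.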
Key/value stores
● DBs inspired by memcached
   ○ simple, fast key/value stores
● Attempt to retain most of DB in memory
   ○ fast response times
● Different designs for scalability
   ○ single node/multi node
● Much emphasis on the keys in this type of
  DB
● Write usually overwrites entire previous entry
● Examples: Redis, Couchbase/Membase,
  DynamoDB, Riak
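
The "write overwrites the entire previous entry" behaviour can be sketched as follows. The `KeyValueStore` class is a hypothetical single-node toy, not the API of Redis or any other product listed above.

```python
class KeyValueStore:
    """Toy in-memory key/value store (hypothetical): values are opaque."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # The store does not look inside the value: a put replaces it wholesale.
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)


store = KeyValueStore()
store.put("session:42", {"user": "alice", "cart": ["book"]})

# A second put with the same key replaces the whole entry; "cart" is gone.
store.put("session:42", {"user": "alice"})
after_overwrite = store.get("session:42")

# To change one field, the client reads, modifies and writes back.
session = dict(store.get("session:42"))
session["cart"] = ["pen"]
store.put("session:42", session)
```

This is why key design matters so much in this category: the key is the only handle the store gives the application.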
Graph based solutions
● Obviously different from previous categories
  ○ Focus specifically on graphs
● Queries supported are graph-specific
  ○ e.g. get nodes related to a specified node
● Typically support for solving standard graph
  problems
  ○ e.g. shortest path, general graph traversal
● Can deliver very significant performance gains
  over non-graph-specific solutions
  ○ for graph problems!
● Examples: Neo4j
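
As an illustration of the kind of query a graph database answers natively, here is a minimal breadth-first shortest-path sketch over a plain adjacency dict. The `follows` data and the function name are made up for the example; a real graph database would expose this through its own query language.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Return the shortest path (fewest edges) from start to goal, or None."""
    if start == goal:
        return [start]
    previous = {start: None}  # node -> node we reached it from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour in previous:
                continue
            previous[neighbour] = node
            if neighbour == goal:
                # Walk the back-pointers to rebuild the path.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


# Hypothetical directed "follows" graph.
follows = {
    "ann": ["bob", "cat"],
    "bob": ["dan"],
    "cat": ["dan", "eve"],
    "dan": ["eve"],
    "eve": [],
}

related_to_ann = follows["ann"]              # nodes related to a given node
path = shortest_path(follows, "ann", "eve")  # a standard graph problem
```

In a relational system the same question typically needs one self-join per hop, which is where the performance difference for graph problems comes from.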
It's a noisy space...
● Very many candidate technologies
● Relatively small number of real-world
  solutions
● Difference between the classifications above is
  one of emphasis...
   ○ column based and document based arrive at the
     semi-structured sweet spot from opposite ends of
     the spectrum
● ...although this results in different preferred
  use cases...
   ○ document based solutions are better for document
     problems, e.g. a CMS
Common techniques used
● Hashing techniques used to map data to
  nodes in the cluster
● Internode communication via gossip protocols
● Common replication techniques
● Thrift is used in a few cases
● MapReduce often used to search over the
  distributed system
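
One common way to map data to nodes by hashing is consistent hashing. The sketch below assumes that technique; the deck does not say which scheme any given system uses. The `HashRing` class, the node names and the choice of MD5 and 64 virtual nodes are all illustrative assumptions.

```python
import bisect
import hashlib


def _token(value):
    # MD5 is used only to spread values evenly, not for security.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring (hypothetical) with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (token, node name)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each physical node owns several points on the ring.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_token(f"{node}#{i}"), node))

    def node_for(self, key):
        # First ring entry at or after the key's token, wrapping around.
        idx = bisect.bisect_left(self._ring, (_token(key),)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(200)]
before = {key: ring.node_for(key) for key in keys}

ring.add_node("node-d")
after = {key: ring.node_for(key) for key in keys}
moved = [key for key in keys if before[key] != after[key]]
```

The point of the design is that adding a node only moves keys onto the new node; keys never shuffle between the existing nodes, unlike with a plain `hash(key) % node_count` scheme.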
Comparison (oldish)...
Horses for courses...
● SQL is a perfectly good solution for many
  problems
  ○ tried and tested
● Some problems require an alternative solution
  ○ typically driven by scale and/or flexibility
● NoSQL offers (many) alternatives
  ○ although it is relatively easy to identify the realistic options
● Column based approaches good for mostly
  structured data with enhanced flexibility
● Document based approaches good for
  document oriented problems
...so let's dive into one
NoSQL database...
● Cassandra...
