An Introduction to NoSQL
Brad Anderson - DevNexus
March 21, 2011
Me
‘boorad’ most places (twitter, github, etc.)
Erlang Programmer
  Cloudant BigCouch, Ericsson Monaco, Verdeeco
  Java, Python, D, Javascript, Common Lisp
NoSQL East - October 2009
Data Warehousing / Big Data
pre-lunch talks... always.
Agenda


NoSQL is BULLSHIT

You Don’t Need It

You Can’t Query It
The Name

Play on MySQL (Eric Evans, Rackspace)

Not Only SQL (Emil Eifrem)

Broad Umbrella

Shitty Marketing Term and we’re stuck with it
Why do you need NoSQL?
 YOU DON’T!
Seriously, you don’t...

Vastly different performance characteristics

Immature APIs and tools / ecosystems

Bugs: most systems are still under active development

Your situation doesn’t warrant it
Why do they exist?
Every one of these new data storage systems
came from a particular pain someone was
having.
Each system was created to specifically solve
the pain point the authors were experiencing.
This pain usually involves a metric shit-tonne of
data, so distributed processing is required.
Schema-free
Prediction: Pain
Examples
Google - index the Internet (MapReduce/Bigtable)
Yahoo - keep up with Google (Hadoop)
Amazon - shopping cart (Dynamo)
Facebook - inbox search (Cassandra)
Lotus - Notes legacy restrictions (CouchDB)
Cloudant - physics research (BigCouch)
Basho - CRM product (Riak)
Neo - graph traversal (Neo4J)
Pain of Scaling

Scale Reads with master-slave replication

Scale Writes with master-master replication

Partitioning Vertically (by functional groups)

Partitioning Horizontally (by key, e.g. ‘date’)

Caching works, kinda
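
To make horizontal partitioning by key concrete, here is a minimal sketch. It assumes four in-memory dicts standing in for four database servers; `shard_for`, `put`, and `get` are hypothetical names for illustration, not any product's API.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash.

    Python's built-in hash() is salted per process, so md5 is used
    here purely as a stable, well-spread hash (not for security).
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Assumption: four dicts stand in for four separate database servers.
shards = [dict() for _ in range(4)]

def put(key: str, value) -> None:
    """Write the value to the one shard that owns this key."""
    shards[shard_for(key, len(shards))][key] = value

def get(key: str):
    """Read from the one shard that owns this key."""
    return shards[shard_for(key, len(shards))].get(key)

put("2011-03-21", "DevNexus")
put("2009-10", "NoSQL East")
```

Note that with plain modulo placement, changing the number of shards moves most keys to a different shard. That is part of the pain, and it is the problem consistent hashing is meant to reduce.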
What to do?

Distribute both data and processing

    horizontal scaling

Organize data differently

Use appropriate on-disk storage
Sorting Hat Says...


Distribution Model

Data Model

Disk Data Structure
Distribution Model

Embedded (no distribution)

Replication / Sharding

Chord - peer to peer

Dynamo

  consistent hashing, vnodes, vector clocks
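
Vector clocks are the least obvious item in the Dynamo list above, so here is a rough sketch of the idea. It assumes a clock is a plain dict from node name to counter; the function names are made up for illustration and do not come from any of the systems mentioned.

```python
def increment(clock: dict, node: str) -> dict:
    """Return a copy of the clock with this node's counter bumped by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new

def descends(a: dict, b: dict) -> bool:
    """True if clock a has seen every event that clock b has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())

def compare(a: dict, b: dict) -> str:
    """Classify two versions: one supersedes the other, or they conflict."""
    a_ge_b, b_ge_a = descends(a, b), descends(b, a)
    if a_ge_b and b_ge_a:
        return "equal"
    if a_ge_b:
        return "a is newer"
    if b_ge_a:
        return "b is newer"
    return "conflict"  # concurrent writes: something must resolve the siblings

def merge(a: dict, b: dict) -> dict:
    """Clock for a value that reconciles both versions (per-node maximum)."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in set(a) | set(b)}

# Two replicas accept writes to the same starting version independently.
base = increment({}, "node1")
left = increment(base, "node2")
right = increment(base, "node3")
```

Here `compare(left, base)` reports that `left` is newer, while `compare(left, right)` reports a conflict, because neither write saw the other.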
No Distribution


BDB

Neo4J
Replication / Sharding Distribution

MongoDB

CouchDB

Redis
Dynamo Distribution
BigCouch
Riak
Voldemort
Cassandra
    no vnodes
    no vector clocks
Hibari ?
Dynamo - how does it work?
PUT http://boorad.cloudant.com/dbname/blah?w=2
N=3, W=2
[Diagram: a ring of nodes (Node 1, Node 2, Node 3 and Node 4 shown, wrapping around to Node 26). Each node is labelled with the partitions it holds: Node 1 (A B C D), Node 2 (B C D E), Node 3 (C D E F), Node 4 (D E F G), Node 26 (Z A B C). hash(blah) marks the point on the ring where the key lands.]
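
The write path in this diagram can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not BigCouch's or Dynamo's actual code: one ring position per node (no vnodes), md5 as the ring hash, in-memory dicts as node storage, and hypothetical names `Ring`, `preference_list`, and `put`.

```python
import bisect
import hashlib

def ring_hash(value: str) -> int:
    """Place a node name or a key at a stable position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

class Ring:
    def __init__(self, nodes, n=3):
        self.n = n  # N: how many replicas each key gets
        self.points = sorted((ring_hash(node), node) for node in nodes)
        self.hashes = [h for h, _ in self.points]

    def preference_list(self, key: str):
        """The N nodes found by walking clockwise from hash(key)."""
        start = bisect.bisect_right(self.hashes, ring_hash(key))
        count = min(self.n, len(self.points))
        return [self.points[(start + i) % len(self.points)][1]
                for i in range(count)]

def put(ring, stores, key, value, w=2, down=()):
    """Send the write to all N replicas; succeed once W have acknowledged."""
    acks = 0
    for node in ring.preference_list(key):
        if node in down:      # simulate an unreachable replica
            continue
        stores[node][key] = value
        acks += 1
    return acks >= w

nodes = ["node1", "node2", "node3", "node4"]
ring = Ring(nodes, n=3)
stores = {node: {} for node in nodes}
ok = put(ring, stores, "blah", "some value", w=2)
```

With N=3 and W=2 the write still succeeds when one of the three replicas is down, and fails when two are. The sketch simply skips unreachable replicas, which is a deliberate simplification of what a real system would do.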
CAP Theorem
Pick Two (at any given time)

  Consistency

  Availability

  Partition Tolerance

CP refuses requests during a partition; AP keeps answering but is only eventually consistent

Must Read: http://codahale.com/you-cant-sacrifice-partition-tolerance/
Data Model

Key/Value

Document

Column

Graph
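
One way to see the difference between the four data models is to write the same fact in each shape. The sketch below uses plain Python literals; the record ("user:42 follows user:7") and every field name are invented for illustration and do not reflect any particular product's storage format.

```python
import json

# Key/value: the store sees one opaque value behind one key.
key_value = {"user:42": '{"name": "boorad", "follows": ["user:7"]}'}
# The application has to parse the value itself.
name_from_kv = json.loads(key_value["user:42"])["name"]

# Document: the store understands the nested structure and can index into it.
document = {"_id": "user:42", "name": "boorad", "follows": ["user:7"]}

# Column: row key -> column family -> column name -> value.
column = {
    "user:42": {
        "profile": {"name": "boorad"},
        "follows": {"user:7": ""},  # the column name itself carries the data
    }
}

# Graph: nodes plus explicit, typed edges between them.
graph = {
    "nodes": {"user:42": {"name": "boorad"}, "user:7": {}},
    "edges": [("user:42", "FOLLOWS", "user:7")],
}
```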
Key / Value
BDB

Riak

Voldemort

Redis

Hibari
Document

CouchDB

MongoDB

SimpleDB
Column Stores

HBase

Cassandra

Hypertable
Graph Databases

Neo4J

AllegroGraph

FlockDB
Disk Data Structure

B-tree - many different kinds

mmap - compact BSON

memtable/sstable or log structured merge tree

log-structured linear hashing

adjacency lists / adjacency matrices
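
As an illustration of the memtable/sstable item, here is a toy log-structured merge sketch. It assumes everything lives in memory and leaves out the write-ahead log, compaction, and deletes; `TinyLSM` is a hypothetical name, not any system's implementation.

```python
import bisect

class TinyLSM:
    def __init__(self, memtable_limit=2):
        self.memtable = {}    # recent writes, unsorted, mutable
        self.sstables = []    # immutable sorted runs, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        """Writes only touch the memtable; full memtables are flushed."""
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Freeze the memtable into a sorted run (a stand-in for an sstable)."""
        if self.memtable:
            items = sorted(self.memtable.items())
            keys = [k for k, _ in items]
            values = [v for _, v in items]
            self.sstables.append((keys, values))
            self.memtable = {}

    def get(self, key):
        """Check the memtable first, then sorted runs from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for keys, values in reversed(self.sstables):
            i = bisect.bisect_left(keys, key)  # binary search in a sorted run
            if i < len(keys) and keys[i] == key:
                return values[i]
        return None

db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)   # memtable is full, so it is flushed to a sorted run
db.put("a", 3)   # newer value for "a" shadows the flushed one
```

The design choice this shows: writes never modify existing sorted data, so reads may have to consult several runs, and the newest run wins.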
Querying NoSQL
Key Lookups
 fast, easy, limiting
Secondary Indexes
 Immature part of most systems
 Roll your own
 MapReduce
Mongo query language
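
For "Roll your own" and "MapReduce", a minimal sketch may help. It assumes a plain dict as the key/value store and invented documents with a `city` field; `put`, `find_by_city`, and `map_reduce` are hypothetical helpers, not the API of any system listed here.

```python
from collections import defaultdict

docs = {}                    # primary store: key -> document
by_city = defaultdict(set)   # hand-rolled secondary index: city -> keys

def put(key, doc):
    """Write the document and keep the secondary index in step with it."""
    old = docs.get(key)
    if old is not None:
        by_city[old["city"]].discard(key)  # drop the stale index entry
    docs[key] = doc
    by_city[doc["city"]].add(key)

def find_by_city(city):
    """Query by a non-key field through the index instead of scanning."""
    return [docs[k] for k in sorted(by_city.get(city, ()))]

def map_reduce(documents, map_fn, reduce_fn):
    """Group the pairs emitted by map_fn by key, then reduce each group."""
    groups = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}

put("u1", {"name": "ann", "city": "Atlanta"})
put("u2", {"name": "bob", "city": "Raleigh"})
put("u3", {"name": "cy", "city": "Atlanta"})

# Count documents per city, in the style of a map/reduce view.
counts = map_reduce(docs.values(),
                    lambda doc: [(doc["city"], 1)],
                    lambda key, values: sum(values))
```

The index has to be updated on every write, in application code, which is exactly the burden the slide is warning about.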
Polyglot Persistence

                                  RDBMS



                batch processes




                                          Cache
Raw
       Hadoop                     NoSQL           Apps
Data


                                  NoSQL
Drivers
Spring
  commons, hadoop, kv, document, graph
  membase, hbase, cassandra coming
Serialization
  Thrift, Protocol Buffers, Avro
Native
  Cassandra, Hadoop, Voldemort
  JInterface to Erlang?
Good Luck! You’ll Need It.
Questions?
