NoSQL: What Is It and Why Would I Care?
     Eberhard Wolff




21.09.11
Alternative Databases: NoSQL
►    NoSQL: Not only SQL


►    A good example of a catchy but bad name
►    Not a positive definition, rather “not something else”
►    Now: Even less clear
Why NoSQL?
►    Exponential data growth


►    More and more connected data
     >  Hypertext, Blogs, User generated content


►    Semi structured
     >  User generated content
     >  Full text search / indices instead of Query-by-Example


►    Integration via a shared database is less common


►    Cloud prefers scale out over scale up
     >  Cloud supports scale up: Reboot into larger machine
     >  …but eventually you will need to scale out i.e. add more machines
NoSQL Flavors
►    Key / value store
►    Document
►    Wide Column: Lots of Columns


►    Graph Database: Graphs with nodes, relationships and properties
►    Object databases: Store objects – not rows


►    Note: NoSQL is actually vaguely defined
Key-Value Stores
►    Maps keys to values
►    Just a large globally available Map
     >  E.g. key 42 → value "Some data"
►    I.e. not a very powerful data model
►    Advantages
     >  Easy to understand
     >  Easier to build scale out solutions
        (no joins, easy sharding etc)
►    Disadvantages
     >  Simplistic data model
     >  Not a good fit for complex data
     >  Might add complexity to the application code
•    Focus on scalability
•    Redis: Think cache + Persistence
•    Riak
Key Value Store: Hybrid Approach
►    Might just be used to store specific data


►    E.g. scores of players in an online game
     >  No complex structure
     >  Need to scale
     >  Lots of reads and writes


►    Player name, age, address would still be in an RDBMS


►    Hybrid approach
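
A minimal sketch of the hybrid approach, assuming an in-memory SQLite database as the RDBMS and a plain dict as a stand-in for a key-value store such as Redis or Riak. The table layout, the "score:" key prefix and the function names are illustrative assumptions, not part of the original slides.

```python
import sqlite3

# Relational side: player master data (hypothetical schema).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE player (name TEXT PRIMARY KEY, age INTEGER, address TEXT)")
db.execute("INSERT INTO player VALUES (?, ?, ?)", ("someuser", 30, "Some Street 1"))

# Key-value side: a dict stands in for the scalable store that takes
# the frequent score reads and writes.
scores = {}


def add_score(player, points):
    """Add points to the player's score under a key derived from the name."""
    key = f"score:{player}"
    scores[key] = scores.get(key, 0) + points
    return scores[key]


def player_summary(player):
    """Combine the RDBMS row with the score from the key-value store."""
    row = db.execute(
        "SELECT name, age, address FROM player WHERE name = ?", (player,)
    ).fetchone()
    if row is None:
        return None
    return {
        "name": row[0],
        "age": row[1],
        "address": row[2],
        "score": scores.get(f"score:{player}", 0),
    }


add_score("someuser", 10)
add_score("someuser", 5)
print(player_summary("someuser"))
```

The application code joins the two stores itself, which is the extra complexity the key-value slide warns about.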
Key-Value Stores: Store All Data
►    Storing data as serialized blobs
     >  "user:someuser" → "someuser|someuser@example.com|more|data|here"
►    Storing data as multiple keys
     >  "user:username:someuser" → "someuser"
     >  "user:email:someuser" → "someuser@example.com"
     >  Requires multi get/set to be efficient
     >  Allows some querying if the database supports wildcards,
        like "user:email:someuser*"
►    Storing links
     >  Blob: "basket:someuser" → "...|item|1|product|product:123|..."
     >  Separate keys: "basket:someuser:item:1:product" → "product:123"
        –  Multi-get: "basket:someuser:*" loads the shopping basket and all items
►    Easy to understand, hard to implement
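
The storage variants above can be sketched roughly as follows. A dict stands in for the key-value store, and multi_get emulates a wildcard query by prefix matching; the helper names and the product values are assumptions made for illustration.

```python
store = {}  # stand-in for a key-value store

# Variant 1: one serialized blob per user.
store["user:someuser"] = "|".join(["someuser", "someuser@example.com"])


def load_user_blob(username):
    """Split the blob back into fields; the field order is fixed by convention."""
    name, email = store[f"user:{username}"].split("|")
    return {"username": name, "email": email}


# Variant 2: one key per attribute.
store["user:username:someuser"] = "someuser"
store["user:email:someuser"] = "someuser@example.com"


def multi_get(prefix):
    """Emulate a wildcard multi-get such as "basket:someuser:*"."""
    return {key: value for key, value in store.items() if key.startswith(prefix)}


# Links: the basket keys hold the keys of the products they refer to.
store["basket:someuser:item:1:product"] = "product:123"
store["basket:someuser:item:2:product"] = "product:456"
store["product:123"] = "Some product"
store["product:456"] = "Another product"


def load_basket(username):
    """Load all basket entries with one multi-get, then follow each link."""
    links = multi_get(f"basket:{username}:")
    return [store[links[key]] for key in sorted(links)]


print(load_user_blob("someuser"))
print(load_basket("someuser"))
```

Following the links is the application's job, which illustrates why this style is easy to understand but harder to implement.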
Document Stores
►    Aggregates are typically stored as "documents" (key-value collection)
►    JSON, BSON (binary JSON) and XML are common
►    Still no schema, so add any data at runtime
►    The semi-structure of the document allows the database to build indexes, allowing
     queries that address properties of the document
     >  E.g. "find all baskets that contain the product 123"
►    Relations might be modeled as links
►    Advantages
     >  Good fit for semi structured data
     >  In particular a good fit for JSON, XML, HTML…
     >  Probably the easiest transition from RDBMS
►    Disadvantages
     >  Does not scale as well as key/value stores
►    Focus on semi structured data e.g. JSON
►    MongoDB, CouchDB
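
A rough sketch of how the semi-structure of a document lets a store index one of its properties. This is a toy in-memory model, not the API of MongoDB or CouchDB; the document layout ("items" holding "product" entries) and the function names are assumptions.

```python
from collections import defaultdict

documents = {}                    # document id -> document (a dict, like JSON)
product_index = defaultdict(set)  # product id -> ids of baskets containing it


def save_basket(doc_id, doc):
    """Store a schemaless document and keep the index on items[].product current."""
    old = documents.get(doc_id)
    if old is not None:
        # Remove index entries of the previous version of the document.
        for item in old.get("items", []):
            product_index[item["product"]].discard(doc_id)
    documents[doc_id] = doc
    for item in doc.get("items", []):
        product_index[item["product"]].add(doc_id)


def baskets_containing(product_id):
    """Answer "find all baskets that contain the product" from the index."""
    return sorted(product_index.get(product_id, set()))


save_basket("basket:someuser", {
    "owner": "someuser",
    "items": [{"product": "product:123", "quantity": 1}],
})
# No schema: this document carries an extra "coupon" field.
save_basket("basket:otheruser", {
    "owner": "otheruser",
    "coupon": "WELCOME",
    "items": [{"product": "product:123", "quantity": 2},
              {"product": "product:456", "quantity": 1}],
})

print(baskets_containing("product:123"))
```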
Wide Column
►    Add any "column" you like to a row
►    Not a key-value store, but a "key-(column-value)" store
►    Column families are like tables
►    E.g. in the "Users" column family
     >  "someuser" → ("username" → "someuser"),
                     ("email" → "someuser@example.com")
►    Since columns are named, some databases provide indexing
     >  E.g. Google AppEngine allows you to define columns that can be queried
►    Advantages
     >  Easy to store complex and heterogeneous data

§    Apache Cassandra
§    Amazon SimpleDB
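
The "key-(column-value)" idea can be sketched with nested dicts. This is a toy model and not the Cassandra or SimpleDB API; the function names and the "twitter" column are assumptions for illustration.

```python
from collections import defaultdict

# column family -> row key -> {column name -> value}
column_families = defaultdict(lambda: defaultdict(dict))


def put(family, row_key, column, value):
    """Add any column to a row; rows in one family may have different columns."""
    column_families[family][row_key][column] = value


def get_row(family, row_key):
    """Return a copy of all columns of a row (empty if the row does not exist)."""
    return dict(column_families.get(family, {}).get(row_key, {}))


def rows_where(family, column, value):
    """Naive scan over named columns; a real store would use an index here."""
    rows = column_families.get(family, {})
    return sorted(key for key, cols in rows.items() if cols.get(column) == value)


put("Users", "someuser", "username", "someuser")
put("Users", "someuser", "email", "someuser@example.com")
put("Users", "otheruser", "username", "otheruser")
put("Users", "otheruser", "twitter", "@otheruser")  # column only this row has

print(get_row("Users", "someuser"))
print(rows_where("Users", "twitter", "@otheruser"))
```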
Graph
►    Nodes with Properties
►    Typed relationships with properties


►    Ideal e.g. to model relations in a social network


►    Easy to find number of followers, degree of relation etc.


►    Neo4j
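
A small sketch of the graph model: nodes with properties, typed relationships with properties, and the two queries mentioned above. It is a plain in-memory model and not the Neo4j API; the FOLLOWS relationship type, the property names and the sample users are assumptions.

```python
from collections import deque

nodes = {}          # node id -> properties
relationships = []  # (start node, type, end node, properties)


def add_node(node_id, **props):
    nodes[node_id] = props


def relate(start, rel_type, end, **props):
    relationships.append((start, rel_type, end, props))


def follower_count(node_id):
    """Count incoming FOLLOWS relationships."""
    return sum(1 for _, rel_type, end, _ in relationships
               if rel_type == "FOLLOWS" and end == node_id)


def degree_of_relation(start, goal):
    """Breadth-first search over FOLLOWS, ignoring direction.

    Returns the number of hops, or None if the nodes are not connected.
    """
    neighbours = {}
    for s, rel_type, e, _ in relationships:
        if rel_type == "FOLLOWS":
            neighbours.setdefault(s, set()).add(e)
            neighbours.setdefault(e, set()).add(s)
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, distance = queue.popleft()
        if node == goal:
            return distance
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, distance + 1))
    return None


for user in ("someuser", "ewolff", "spring_rod", "spring_juergen"):
    add_node(user, kind="user")
relate("someuser", "FOLLOWS", "ewolff", since=2011)
relate("ewolff", "FOLLOWS", "spring_rod")
relate("ewolff", "FOLLOWS", "spring_juergen")

print(follower_count("ewolff"))
print(degree_of_relation("someuser", "spring_rod"))
```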
What happened to Queries?
►    Data is easily and quickly read/stored using primary key
►    Denormalize data for commonly used queries
     >  Store Twitter inbox in key/value as
        –  "inbox:someuser" → ("posts:123", "posts:234", ...)
     >  instead of doing the query (RDBMS)
        –  select p.* from POSTS p, POSTLINKS pl where p.id = pl.postId and
           pl.userid=42
►    Store reverse lookup
     >  "ewolff|following" → ("spring_rod", "spring_juergen")
     >  "post:435|RT" → ("post:42", "post:21")
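
One way to picture the denormalization: the inbox is written when a post is published, so reading it is a single lookup by key instead of a join. A dict stands in for the key-value store; the "|followers" key, the function names and the sample posts are assumptions for illustration.

```python
store = {}  # stand-in for a key-value store


def follow(follower, followee):
    """Store the relation in both directions so each lookup is a single get."""
    store.setdefault(f"{follower}|following", []).append(followee)
    store.setdefault(f"{followee}|followers", []).append(follower)


def publish(author, post_id, text):
    """Write the post once, then fan its key out to every follower's inbox."""
    key = f"posts:{post_id}"
    store[key] = {"author": author, "text": text}
    for follower in store.get(f"{author}|followers", []):
        store.setdefault(f"inbox:{follower}", []).append(key)


def read_inbox(user):
    """Read the precomputed inbox by primary key; no join is needed."""
    return [store[key]["text"] for key in store.get(f"inbox:{user}", [])]


follow("ewolff", "spring_rod")
follow("ewolff", "spring_juergen")
publish("spring_rod", 123, "Hello")
publish("spring_juergen", 234, "World")

print(store["inbox:ewolff"])
print(read_inbox("ewolff"))
```

The price is extra work and storage on every write, which is the usual trade-off when denormalizing for commonly used queries.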
What It Means for Developers
§  More technologies to have fun with
§  Broader choice of persistence stores
§  Probably Cross Store Persistence
    •  Store name, firstname etc in RDBMS
    •  Store followers in Graph database

    •  Store Content in RDBMS
    •  Store User Generated Content in Document database


§  Spring Data
    •  Similar APIs for JPA and NoSQL
    •  Support for cross store persistence
    •  Sophisticated support for generic DAOs
    •  E.g. just add findByName() method, implementation is provided
§  QueryDSL
    •  JPA Criteria API done right
