NoSQL
Now we know what it’s not... what is it?
What are we running
from?
• Relational databases are the de facto
standard for storing data in a web
application.
• A lot of times, that data isn’t really
relational at all.
• RDBMSs have lots of rules that can impact
performance.
Rules? What Rules?
• Classic relational databases follow the
ACID rules:
• Atomicity
• Consistency
• Isolation
• Durability
Atomicity
• If any part of the update fails, it all fails.
• Databases have to be able to lock tables
and rows for operations, which can block
or delay other incoming requests.
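The all-or-nothing rule can be sketched in a few lines of plain Ruby. This is only an illustration, not how any real database does it: `TinyStore` and `atomically` are hypothetical names, the "database" is an in-memory Hash, and there is no locking at all. The update runs against a private copy, and the copy is published only if every step succeeds.

```ruby
# Hypothetical in-memory "database": a Hash of account balances.
class TinyStore
  attr_reader :data

  def initialize(data)
    @data = data
  end

  # Run the block against a deep copy of the data. The copy replaces the
  # original only if the block finishes without raising; on any error the
  # original data is left untouched and false is returned.
  def atomically
    working_copy = Marshal.load(Marshal.dump(@data))
    yield working_copy
    @data = working_copy
    true
  rescue StandardError
    false
  end
end

store = TinyStore.new("alice" => 100, "bob" => 50)

# The second step fails, so the first step must not stick either.
ok = store.atomically do |accounts|
  accounts["alice"] -= 30
  raise "no such account" unless accounts.key?("carol")
  accounts["carol"] += 30
end

puts ok                   # false
puts store.data.inspect   # balances are unchanged
```

A real RDBMS gets the same guarantee with locks or versioned rows, which is where the blocking and delays come from.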
Consistency
• After a transaction, the data must be in a
valid state, with every rule and constraint
satisfied. With replicas, all copies of the data
must also be consistent with each other (my
interpretation).
• Replication across lots of shards is
expensive especially if there’s locking
involved.
Isolation
• Data involved in a transaction must be
inaccessible to other operations.
• Remember the thing about locked rows
and tables?
• It’s a bummer.
Durability
• Once a user is notified that a transaction
has completed, the data must be accessible,
must survive a crash or restart, and must
satisfy all integrity constraints.
I come not to bury
MySQL...
• Relational databases are great for a lot of
uses.
• If you have data that’s actually relational and
you need transactions, joins and have a
limited number of data types, then an
RDBMS will work for you.
But...
• RDBMSs have been
treated like hammers
and used for things
they’re not good at and
weren’t designed for.
• Like the web...
Thus were born...
• Key-Value Stores
• Wide-Column Stores
• Document Stores/Databases
• Graph Databases
All thrown together &
clumsily dubbed...
NoSQL
Which, despite its
negative sound,
supposedly means:
“Not Only SQL”
Yeah, I don’t believe it
either...
Key-Value
Just what it sounds like. You set a Key to a Value and
can then retrieve it.
Key-Value Benets
• Simple
• High performance (usually) because there
are no transactions or relations so it’s a
simple bucket and lookup.
• Extremely flexible
• Commonly used as caches in front of
slower resources (like MySQL - bazinga!)
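The "cache in front of a slower resource" pattern can be sketched in plain Ruby. Everything here is made up for illustration: `SlowSource` stands in for something like a SQL database, `ReadThroughCache` is a hypothetical name, and the key and value are sample data. A real setup would talk to a cache server instead of a local Hash.

```ruby
# Hypothetical stand-in for a slow backing store such as a SQL database.
class SlowSource
  attr_reader :reads

  def initialize(rows)
    @rows = rows
    @reads = 0
  end

  def find(key)
    @reads += 1 # count how often the slow path is taken
    @rows[key]
  end
end

# A key-value cache: look the key up locally, and only on a miss
# read through to the slow source and remember the answer.
class ReadThroughCache
  def initialize(source)
    @source = source
    @store = {}
  end

  def get(key)
    return @store[key] if @store.key?(key)

    @store[key] = @source.find(key)
  end
end

source = SlowSource.new("song:237" => "Blue in Green")
cache = ReadThroughCache.new(source)

first = cache.get("song:237")   # miss: goes to the slow source
second = cache.get("song:237")  # hit: served from the cache

puts first
puts source.reads  # 1
```

The sketch leaves out expiry and invalidation, which is where most of the real work in caching ends up.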
Popular Players
• memcached - in memory only, extremely
efficient hashing algorithm allows you to
scale easily to hundreds of nodes.
• Redis - persistent, slightly more complex
than memcached (has support for lists and
other data structures) but still highly performant.
• Riak - The Rails Machine guys love it. Jesse?
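The idea of hashing keys across many cache nodes can be sketched as below. This is an assumption-heavy illustration, not memcached's actual algorithm: the node names are hypothetical, `node_for` is a made-up helper, and it uses simple modulo hashing, whereas many real clients use consistent hashing so that adding a node moves fewer keys.

```ruby
require "digest"

# Pick a node for a key by hashing the key and taking the result
# modulo the number of nodes. The same key always lands on the same node.
def node_for(key, nodes)
  digest = Digest::MD5.hexdigest(key)
  nodes[digest[0, 8].to_i(16) % nodes.size]
end

# Hypothetical node addresses, for illustration only.
nodes = ["cache-a:11211", "cache-b:11211", "cache-c:11211"]

puts node_for("user:42", nodes)
puts node_for("user:42", nodes) == node_for("user:42", nodes)  # true
```

Because the client decides where each key lives, the nodes never need to talk to each other, which is a large part of why this kind of setup scales out so easily.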
My Uses
• memcached: Read-through cache for
Rails with cache-money.
• Redis: persistent cache for results from
our algorithm, partitioned by version and
instance.
Wide Column
• Family of databases modeled on either
Google’s BigTable or Amazon’s Dynamo.
• Pick two out of three from the CAP
theorem in order to get horizontal
scalability.
• Data stored by column instead of by row.
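The difference between storing by row and storing by column can be sketched with plain Ruby data structures. The log records are invented sample data, and real wide-column stores add row keys, column families and distribution on top; this only shows the general shape.

```ruby
# Row layout: one Hash per record, the way a relational table reads.
rows = [
  { id: 1, path: "/home",  status: 200, ms: 12 },
  { id: 2, path: "/songs", status: 404, ms: 3 },
  { id: 3, path: "/home",  status: 200, ms: 15 }
]

# Column layout: one Array per field, holding that field for every record.
columns = rows.first.keys.each_with_object({}) do |field, out|
  out[field] = rows.map { |row| row[field] }
end

# A query that needs only one field touches only that column.
total_ms = columns[:ms].sum

puts columns[:path].inspect
puts total_ms  # 30
```

This is why knowing the query ahead of time matters: the layout is chosen to make a particular scan cheap.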
CAP?
• Consistency: All clients always have the
same view of the data.
• Availability: Each client can always read
and write.
• Partition Tolerance: The system works
well despite physical network partitions.
Use cases
• Making sense out of large amounts of data
where you know your query scenario
ahead of time.
• Large = 100s of millions of records.
• Data-mining log files and other sources of
similar data.
Big Players
• HBase
• Cassandra
• Hypertable
• Amazon’s SimpleDB
• Google’s BigTable (the granddaddy of all of
them)
Graph Databases
• Store nodes, edges and properties
• Think of them as Things, Connections and
Properties
• Good for storing properties and
relationships.
• Honestly, I don’t fully understand them...
anyone?
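One way to picture Things, Connections and Properties is a toy graph in plain Ruby. `TinyGraph`, `Node` and `Edge` are hypothetical names and the people and dates are invented sample data; a real graph database adds storage, indexing and a query language on top of this shape.

```ruby
# Things (nodes) and Connections (edges), each carrying Properties.
Node = Struct.new(:id, :properties)
Edge = Struct.new(:from, :to, :type, :properties)

class TinyGraph
  def initialize
    @nodes = {}
    @edges = []
  end

  def add_node(id, properties = {})
    @nodes[id] = Node.new(id, properties)
  end

  def add_edge(from, to, type, properties = {})
    @edges << Edge.new(from, to, type, properties)
  end

  # Follow the outgoing edges of one type from a node.
  def neighbors(id, type)
    @edges.select { |e| e.from == id && e.type == type }
          .map { |e| @nodes[e.to] }
  end
end

graph = TinyGraph.new
graph.add_node(:alice, name: "Alice")
graph.add_node(:bob, name: "Bob")
graph.add_node(:mongo, name: "MongoDB")
graph.add_edge(:alice, :bob, :knows, since: 2009)
graph.add_edge(:alice, :mongo, :likes)

friends = graph.neighbors(:alice, :knows).map { |n| n.properties[:name] }
puts friends.inspect  # ["Bob"]
```

The relationship itself is stored data here, so "who does Alice know?" is a walk along edges rather than a join.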
The Players
• Neo4j
• FlockDB
• HyperGraphDB
Document Stores
• Short on relationships, tall on rich data
types.
• Big on eventual consistency and flexible
schemas.
• Hybrid of traditional RDBMS and Key-Value
stores.
Use Cases
• Content Management Systems
• Applications with rapid partial updates
• Anything you would normally use an
RDBMS for that doesn’t need joins or
transactions.
The Players
• CouchDB
• MongoDB
• Terrastore
MongoDB
• Support for rich data types: arrays, hashes,
embedded documents, etc.
• Support for adding and removing things
from arrays and embedded documents
(addToSet, for example).
• Map/Reduce support and strong indexes
• Regular expression support in queries
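What "adding and removing things from arrays" means can be sketched locally in plain Ruby. A Hash stands in for a document, and `add_to_set` and `pull` here are hypothetical local helpers that mimic the semantics only; in MongoDB the server applies the equivalent update to the stored document, so no read-modify-write round trip is needed.

```ruby
# A document is nested data; a plain Hash stands in for one here.
post = { "title" => "NoSQL", "tags" => ["databases"] }

# Set-style add: append only when the value is not already present.
def add_to_set(doc, field, value)
  doc[field] ||= []
  doc[field] << value unless doc[field].include?(value)
  doc
end

# Pull: remove every occurrence of the value from the array field.
def pull(doc, field, value)
  (doc[field] || []).delete(value)
  doc
end

add_to_set(post, "tags", "mongodb")
add_to_set(post, "tags", "mongodb")  # second call is a no-op
puts post["tags"].inspect            # ["databases", "mongodb"]

pull(post, "tags", "databases")
puts post["tags"].inspect            # ["mongodb"]
```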
Design Considerations
• Embedded Documents - Use only if
the embedded document will always be
selected with the parent.
• Indexes - MongoDB punishes you much
earlier for missing indexes than MySQL.
• Document size - Currently, documents
are limited to 4MB, which should be large
enough, but if it’s not...
Real-World MongoDB
• We use MongoDB heavily at MIS.
• Statistics application and reporting
• Top-secret new application
• Web crawler and indexer
• CMS
Real-World Example
Let’s do tags. Everything is taggable now, right?
The MySQL Way
Schema
And to get a “thing’s”
tags?
SELECT `tags`.* FROM `tags`
INNER JOIN `taggings` ON `tags`.id = `taggings`.tag_id
WHERE ((`taggings`.taggable_id = 237)
AND (`taggings`.taggable_type = 'Song'))
Yuck!
That’s a lot of pain for something so simple.
And I didn’t even show you finding things with tag “x”.
Or how to set and unset tags on a “thing”.
Ouch.
The MongoDB Way
Using MongoMapper and Rails 3
class Post
  include MongoMapper::Document

  key :title, String
  key :body, String
  key :tags, Array

  # Index the tags array so lookups by tag stay fast.
  ensure_index :tags
end
Let’s Make This Easy...
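# These methods belong inside the Post class.

# Add a tag locally and, for saved records, push it to MongoDB
# without creating duplicates.
def add_tag(tag)
  tag = Post.clean_tag(tag)
  self.tags << tag unless self.tags.include?(tag)
  self.add_to_set(:tags => tag) unless self.new_record?
end

# Remove a tag locally and, for saved records, pull it from MongoDB.
def remove_tag(tag)
  tag = Post.clean_tag(tag)
  self.tags.delete(tag)
  self.pull(:tags => tag) unless self.new_record?
end

# Normalize one tag: trim, lowercase, hyphenate spaces,
# then drop anything that is not a-z, 0-9 or a hyphen.
def self.clean_tag(str)
  str.strip.downcase.gsub(" ", "-").gsub(/[^a-z0-9-]/, "")
end

# Split a comma-separated string into cleaned tags.
def self.clean_tags(str)
  str.split(",").map { |t| clean_tag(t) }
end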
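For anyone reading this later without the console demo, the tag-cleaning helpers above can be tried on their own. The sketch below copies the same string logic into a standalone module so it runs without MongoMapper or a database; `TagCleaner` is a hypothetical name and the inputs are sample data.

```ruby
# Standalone copy of the tag-cleaning logic, no database required.
module TagCleaner
  # Trim, lowercase, turn spaces into hyphens, then drop anything
  # that is not a-z, 0-9 or a hyphen.
  def self.clean_tag(str)
    str.strip.downcase.gsub(" ", "-").gsub(/[^a-z0-9-]/, "")
  end

  # Split a comma-separated string and clean each piece.
  def self.clean_tags(str)
    str.split(",").map { |t| clean_tag(t) }
  end
end

puts TagCleaner.clean_tag("  Hip Hop! ")                  # hip-hop
puts TagCleaner.clean_tags("Rock, Hip Hop!, R&B").inspect # ["rock", "hip-hop", "rb"]
```

Note that punctuation is dropped rather than replaced, so "R&B" becomes "rb".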
Demo Time
Sorry if you’re looking at this later, but it’s console time!
Why I Love MongoDB
• Document model fits how I build web apps.
• For most apps, I don’t need transactions.
• Eventual consistency is actually OK.
• Partial updates and arrays make things that
are a pain in SQL-land absolutely painless.
• It’s just smart enough without getting in the
way.
What’s NoSQL, really?
• The right tool for the job.
• We’ve got lots of options for storing
application data.
• The key is picking the one that solves our
real problem.
• And if an RDBMS is the right tool, that’s OK
too.
Questions?
Further Reading
• Visual NoSQL: http://blog.nahurst.com/visual-guide-to-nosql-systems
• MongoDB: http://mongodb.org
• MongoMapper: http://mongomapper.com/
Thanks!
• Kevin Lawver
• @kplawver
• kevin@lawver.net
• http://kevinlawver.com
