Running MongoDB in the Cloud

          Tony Tam
          @fehguy
What this Talk is About

Wordnik left the cloud and came back
 •   What?!?
 •   Why we left
 •   Decisions
 •   Why we came back (and what we did differently)
Who is Wordnik?

• World’s fastest updating English dictionary
 •   Based on input of text at ~8k words/second
 •   Word Graph as basis to our analysis
     •   Synchronous & asynchronous processing
• Tens of billions of documents in non-relational (NR) storage
• Concept & Meaning Discovery Engine
• > 20M daily REST API calls, billions
  served
So Why the Detour?

• Architectural Choices
• Business Choices
• Feedback, tooling, infrastructure
• Learning
• Changes in use case
• Progress!
Architecture History

• EC2-based LAMP Stack
 •   POC (and seed funding)
 •   A manageable corpus < 1M records
• REST API
 •   Web + public
 •   MySQL in master/slave
 •   ~1B documents
 •   Operational nightmare
Architecture History
• MongoDB
 •   First-order MySQL issues solved
 •   But it got slow…
• Real Servers to the rescue!
 •   Faster, bigger disks
• MongoDB for Corpus, Structured Data
 •   Faster Reads + Writes!
 •   More metal (72GB RAM)
 •   More cores
 •   “cold” query from 400ms to < 100ms
Why Change?

Easy!
• Can’t beat metal…except
 •   Quick expansion
 •   Batch jobs/experiments
 •   Add a datacenter
 •   Full cluster migration
 •   The bill for unused capacity
Architectural Mindshift

1. Anything can die, anytime
2. Centralized, redundant state (see point 1)
3. Server performance is *different*
  •   CPU, I/O, Memory—choose one
  •   Smart design makes it work!
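
One way to read point 1 in code: callers never assume a single host is alive. A minimal sketch, assuming a hypothetical list of interchangeable hosts (`app2` is invented for illustration) and a simulated query function; it is not taken from Wordnik's code.

```javascript
// Try an operation against each candidate host in turn, moving on when one fails.
function withFailover(hosts, operation) {
  const errors = [];
  for (const host of hosts) {
    try {
      return operation(host);
    } catch (err) {
      errors.push(host + ": " + err.message);
    }
  }
  throw new Error("all hosts failed: " + errors.join("; "));
}

// Simulated call (an assumption for the demo): the first host is dead, the second answers.
function fakeQuery(host) {
  if (host === "app1") throw new Error("connection refused");
  return "ok from " + host;
}

const result = withFailover(["app1", "app2"], fakeQuery);
console.log(result); // prints "ok from app2"
```

A real client would also add timeouts and backoff; the point here is only that failure of any one server is treated as a normal case.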
Architectural Mindshift

• Your software will need to change!
 •   So will the components you rely on
Your Infrastructure Cloud Hero

• Deploying Servers
 •   Going to need a lot!
• Configuration
• Updates to your software
• What about Data?
Let’s make this Work!

• MySQL Master Slave
  •   Take a snapshot (yes, this will block)
  •   Keep your binlogs!

change master to MASTER_HOST='app1',
MASTER_USER='XXXX', MASTER_PASSWORD='XXXX',
MASTER_LOG_FILE='app1-relay.0038774',
MASTER_LOG_POS=6754205951;
Let’s make this Work!

But…
• Your master is down!
 •   Quick, promote a slave!
 •   Point the other slaves to the new master
• As for the clients…
 •   “Well, we never really tried that…”
Better with Mongo

• Easy up, easy down!
 •   Startup: Sync your data, and announce to clients
     when ready for business
 •   Shutdown: Announce your departure and leave
• Replica sets
 rs.add("db4.wordnik.com:27017");
 rs.remove("db1.wordnik.com:27017");
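
On the client side, members can come and go more easily if the application connects with a seed list instead of a single host. A minimal sketch, assuming the standard `mongodb://` connection string format; the set name `rs0` and database name `wordnik` are hypothetical, and the hostnames are reused from the slide only as placeholders.

```javascript
// Build a MongoDB connection string from a seed list of replica set members.
// The set name and database name passed in below are assumptions for illustration.
function buildReplicaSetUri(hosts, replicaSet, database) {
  if (hosts.length === 0) {
    throw new Error("at least one seed host is required");
  }
  return "mongodb://" + hosts.join(",") + "/" + database +
    "?replicaSet=" + encodeURIComponent(replicaSet);
}

const uri = buildReplicaSetUri(
  ["db1.wordnik.com:27017", "db4.wordnik.com:27017"], "rs0", "wordnik");
console.log(uri);
// prints "mongodb://db1.wordnik.com:27017,db4.wordnik.com:27017/wordnik?replicaSet=rs0"
```

Drivers use the seeds to discover the current members, so the list does not have to be complete.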
But what about Performance?

• Software Design
    •   It’s slow! (What is *it*?)
    •   Profile everything
import com.wordnik.util.perf._
...
def findUser(id:Long): User = {
    Profile("UserDao::findUserById", dao.findUserById(id))
}

         http://github.com/wordnik/wordnik-oss
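
The same idea in JavaScript, as a rough sketch: wrap a call, time it, and accumulate per-name statistics. This `profile` function is a hypothetical stand-in written for illustration, not the wordnik-oss implementation, and it assumes a Node.js runtime for `process.hrtime.bigint()`.

```javascript
// Hypothetical stand-in for the Profile helper shown above; not the wordnik-oss code.
const stats = new Map(); // name -> { count, totalMs, maxMs }

function profile(name, fn) {
  const start = process.hrtime.bigint();
  try {
    return fn();
  } finally {
    // Record the timing even if fn throws.
    const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
    const s = stats.get(name) || { count: 0, totalMs: 0, maxMs: 0 };
    s.count += 1;
    s.totalMs += elapsedMs;
    s.maxMs = Math.max(s.maxMs, elapsedMs);
    stats.set(name, s);
  }
}

// Hypothetical DAO call used only to exercise the wrapper.
function findUserById(id) {
  return { id: id, name: "user" + id };
}

const user = profile("UserDao::findUserById", () => findUserById(42));
console.log(user, stats.get("UserDao::findUserById"));
```

With counters like these in place, "it's slow" can be narrowed down to a named call.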
But what about Performance?

• “It’s the database!”
  •   What is it?
• Mapping layer
  •   MySQL (12+ joins) => 50 records/sec
  •   Mongo JSON -> POJO => 1000 records/sec
  •   Mongo DBO -> POJO => 35,000 records/sec
• How do you know?
  •   Profile it!
It’s Still Slow!

• It’s the index!
  •   How do you know?
  •   AHHHHH
It’s Still Slow!

• Balance your B-Tree
 •   Can't always keep the index in RAM. MMF (memory-mapped files) "does its
     thing"
 •   Right-balanced b-tree keeps necessary index hot
 •   If you hit indexes on disk, mute your pager
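
A common way to get a right-balanced index is to use keys that only grow, so new entries land on the right-hand edge of the B-tree and the recent part stays hot. A sketch under that assumption; the key layout (zero-padded millisecond timestamp plus a sequence number) is hypothetical, and the slides do not say how Wordnik built its keys.

```javascript
// Generate keys that sort in insertion order, so new entries land at the
// right-hand edge of a B-tree index. The key layout is an assumption.
function makeKeyGenerator() {
  let lastMs = 0;
  let seq = 0;
  return function nextKey(nowMs) {
    if (nowMs <= lastMs) {
      seq += 1; // same (or earlier) millisecond: bump the sequence instead
    } else {
      lastMs = nowMs;
      seq = 0;
    }
    // Fixed-width fields keep string order equal to numeric order.
    return String(lastMs).padStart(15, "0") + "_" + String(seq).padStart(6, "0");
  };
}

const nextKey = makeKeyGenerator();
const keys = [nextKey(1000), nextKey(1000), nextKey(1001), nextKey(999)];
const sorted = keys.slice().sort();
console.log(keys);
```

Even when the clock repeats or steps backwards, each new key sorts after the previous one, so `keys` and `sorted` come out in the same order.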
But it’s Still Slow!

• Look at your Schema design
  •   Design to limit index size/number
  •   _id is your friend—make it meaningful
  •   Record size consistency
      •   Hierarchical Data beware!
      •   Split documents even in same collection!
db.posts.find({_id:/^tony_posts_/})
{_id:"tony_posts_1", posts:[...]}
{_id:"tony_posts_2", posts:[...]}
{_id:"tony_posts_3", posts:[...]}
• YOUR app knows best
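
The splitting step can be sketched in plain JavaScript. The bucket size and the `bucketPosts` helper are assumptions for illustration; only the `_id` naming pattern comes from the slide.

```javascript
// Split one user's posts into several bucket documents that share an _id prefix.
// bucketSize and the helper name are assumptions for illustration.
function bucketPosts(user, posts, bucketSize) {
  const docs = [];
  for (let i = 0; i < posts.length; i += bucketSize) {
    docs.push({
      _id: user + "_posts_" + (docs.length + 1),
      posts: posts.slice(i, i + bucketSize),
    });
  }
  return docs;
}

const docs = bucketPosts("tony", ["a", "b", "c", "d", "e"], 2);

// Same anchored prefix pattern as the find() above, applied in memory here.
const prefix = /^tony_posts_/;
const matching = docs.filter((d) => prefix.test(d._id));
console.log(matching.map((d) => d._id));
// prints [ 'tony_posts_1', 'tony_posts_2', 'tony_posts_3' ]
```

Bounded buckets keep record sizes consistent, and the anchored prefix lets one indexed `_id` range cover all of a user's buckets.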
Really, it’s STILL slow!

• Your monolithic app/DB won’t scale same
  on VMs
• Specialize!
 •   Wordnik uses SOA
 •   Swagger-powered API (swagger.wordnik.com): a contract for your clients
 •   Data tiers follow service types
 •   Smaller *everything*
Be the Boss of your Data

• Your app *should* be smarter than your
  DB
 •   Lots of users?
 •   Lots of blog posts?
 •   Lots of images?
 •   Shard? On what?
• Data dimensionality
 •   Keep active data hot
 •   Don’t try to boil the ocean
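
"Shard? On what?" is a decision the app can make itself. A minimal sketch of app-level routing, assuming a hypothetical list of data tiers and a simple string hash; it is not MongoDB's built-in sharding and not Wordnik's scheme.

```javascript
// Route a record to one of several data tiers by hashing an app-chosen key.
// Tier names and the hash function are assumptions for illustration.
function pickTier(key, tiers) {
  let hash = 0;
  for (const ch of key) {
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0; // keep it an unsigned 32-bit int
  }
  return tiers[hash % tiers.length];
}

const tiers = ["users-a", "users-b", "users-c"];
const tier = pickTier("tony", tiers);
console.log("tony ->", tier);
```

The same key always maps to the same tier, so the app alone decides which dimension (user, post, image) the data is split on.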
Cloud Computing + Mongo

• It can work extremely well
  •   No “Save as Cloud!” menu item
• Shifting constraints
  •   Optimize for RAM on VM
  •   Virtual disk => virtual performance
• Be “Deployable”
  •   Mongo Replica Sets are made for this
Cloud Computing + Mongo

• System Durability
 •   Design your software for abuse
 •   Your old design doesn’t apply
 •   Add APM hooks, now!
• Dissect your app
 •   Build to micro services with dedicated MongoDB
     clusters
• Deployment Infrastructure
 •   Don’t wait until it’s too late
See More
• See more about Wordnik APIs: http://developer.wordnik.com
• Migrating from MySQL to MongoDB: http://www.slideshare.net/fehguy/migrating-from-mysql-to-mongodb-at-wordnik
• Maintaining your MongoDB Installation: http://www.slideshare.net/fehguy/mongo-sv-tony-tam
• Swagger API Framework: http://swagger.wordnik.com
• Mapping Benchmark: https://github.com/fehguy/mongodb-benchmark-tools
• Wordnik OSS Tools: https://github.com/wordnik/wordnik-oss
Questions?
