Are you in the process of evaluating or migrating to MongoDB? We will cover key aspects of migrating to MongoDB from an RDBMS, including schema design, indexing strategies, data migration approaches as your implementation reaches various SDLC stages, and achieving operational agility through MongoDB Management Service (MMS).
6. Organizing for Success – Stakeholders
• Key to success: Involve all key stakeholders for the application
o Line of business
o Developers
o Data Architects
o DBAs
o Systems Administrators
o Security
7. Organizing for Success – Project Charter
• Develop project charter
o Define business and technical objectives
o Define timelines and responsibilities
o Monitor progress and address any issues
8. Organizing for Success – Help needed?
• Partner services and resources available from MongoDB
o Community support
o Build skills and proficiency through web based training
o Support and consulting services
14. Schema Design – For more details
• Afternoon session: Data modeling deep dive
• Google: "6 rules of thumb for MongoDB schema design"
• Google: "MongoDB compound index optimization"
16. Drivers & Ecosystem
MongoDB API is implemented as methods
Morphia
Java Ruby Python Perl
MEAN Stack
17. Developer Efficiency
RDBMS
Rigid schema
Object-Relational impedance?
Alter 2TB table to modify a column?
MongoDB
Dynamic schema
MongoDB APIs are classes and packages
Modify code to use MongoDB APIs
20. Data Migration – Can you have downtime?
[Figure: migration timeline marked T1, T2, T3. Application view: Available, Degraded, Down, Available. Database states: Master, Exporting, Importing, Master.]
27. Case Study
Shutterfly uses MongoDB to safeguard over 6 billion images served to millions of customers
Problem
• 6B images, 20TB of data
• Brittle code base on top of Oracle database – hard to scale, add features
• High SW and HW costs
Why MongoDB
• JSON-based data model
• Agile, high performance, scalable
• Alignment with Shutterfly’s services-based architecture
Results
• 80% cost reduction
• 900% performance improvement
• Faster time-to-market
• Dev. cycles in weeks vs. tens of months
28. Shutterfly – Original Data store
Oracle
• Meta data stored in XML Blobs
• App responsible for content of blob
Photo ID XML Blob
1 <xml><meta-data>…</xml>
2 <xml><meta-data>…</xml>
3 <xml><meta-data>…</xml>
37. Help available from MongoDB
MongoDB Enterprise Advanced
The best way to run MongoDB in your data center
MongoDB Management Service (MMS)
The easiest way to run MongoDB in the cloud
Production Support
In production and under control
Development Support
Let’s get you running
Consulting
We solve problems
Training
Get your teams up to speed.
Editor's Notes
Rich documents
Unrelenting growth in new data sources
Growing user loads
Promote agility
Improve developer efficiency
Improve time to market of developed features
Achieve higher scalability
Horizontal scaling
Commodity hardware
Cloud friendly
Lower budget strain
Much lower TCO
Do more with fewer resources
Edmunds – Billing, online advertising, user data (Oracle)
Metlife – Single view of 100M+ customers and 70 systems in 90 days
Cisco - Analytics, Social Networking (Various)
Salesforce – Real time analytics (Various)
Expedia – Special travel offers in real time
Adobe – Digital experience management platform
Shutterfly – Developed nearly a dozen projects on MongoDB storing more than 20TB data (Oracle)
Craigslist – Archive data migration (MySQL)
MTV - Centralized Content Management (Various)
Agility and flexibility
Data model supports business change
Rapidly iterate to meet new requirements
Intuitive, natural data representation
Eliminates ORM layer
Developers are more productive
Reduces the need for joins, disk seeks
Programming is simpler
Performance delivered at scale
Let's continue the comparison between the relational and document models with the example of a blogging platform.
In the relational design there are five tables (category, article, user, comments, and tags), and the application relies on the RDBMS to join all five in order to build a blog entry.
In the case of MongoDB, all of the blog data is aggregated within a single document, linked with a single reference to a user document containing authors of both the blog and comments
From a performance and scalability perspective, the aggregated document can be accessed in a single call to the database, rather than having to JOIN multiple tables to respond to a query
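To make the blog example concrete, here is a minimal sketch in plain Python of how rows from the five tables might be folded into one aggregated document. The table contents, field names, and the `build_blog_document` helper are hypothetical and only illustrate the shape of the result; they are not taken from the slides.

```python
# Hypothetical rows, as they might come back from the five relational tables.
users = {1: {"name": "alice"}, 2: {"name": "bob"}}
categories = {10: "databases"}
article = {"id": 100, "title": "Moving to documents",
           "author_id": 1, "category_id": 10}
tags = [{"article_id": 100, "tag": "mongodb"},
        {"article_id": 100, "tag": "migration"}]
comments = [{"article_id": 100, "user_id": 2, "text": "Nice write-up"}]


def build_blog_document(article, categories, tags, comments):
    """Aggregate an article with its category, tags and comments.

    Users are not embedded: they stay in their own collection and are
    linked by id, for both the article author and the comment authors.
    """
    return {
        "_id": article["id"],
        "title": article["title"],
        "author_id": article["author_id"],
        "category": categories[article["category_id"]],
        "tags": [t["tag"] for t in tags if t["article_id"] == article["id"]],
        "comments": [
            {"user_id": c["user_id"], "text": c["text"]}
            for c in comments
            if c["article_id"] == article["id"]
        ],
    }


blog_doc = build_blog_document(article, categories, tags, comments)
```

Reading the blog entry then means fetching `blog_doc` once, with a lookup into the user collection only when author details are needed.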
During schema design, think about how the data will be queried. Some NoSQL databases are little more than key/value stores: you may be able to ingest data quickly, but you cannot do anything other than primary key lookups, which is a huge backward step coming from the relational world.
MongoDB on the other hand has a rich query model enabled by extensive indexing. Indexes can be defined for any key or array within the document, as secondary indexes
MongoDB indexing will be familiar to DBAs
- B-Tree Indexes, Secondary Indexes
As with a relational DB, indexes are the single biggest tunable performance factor
- Define indexes by identifying common queries
- Use MongoDB explain to ensure index coverage
- Use the MongoDB profiler to log all slow queries
The index types listed on the slide include text search and geospatial
Array indexes allow you to index each element of an embedded array. For example, in a document describing a product, each of the categories that the product can be classified under can be included in an array and indexed, which gives a major performance boost when users search by those classifications.
This sort of flexibility gives MongoDB the ability to run complex queries quickly
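The idea behind indexing each element of an array can be illustrated with a toy model in plain Python. This is only a conceptual sketch, not how MongoDB implements its indexes; the product documents and the helper names `build_array_index` and `find_by_category` are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical product documents, each with an array of categories.
products = [
    {"_id": 1, "name": "Trail shoe", "categories": ["footwear", "outdoor"]},
    {"_id": 2, "name": "Tent", "categories": ["outdoor", "camping"]},
    {"_id": 3, "name": "Sandal", "categories": ["footwear"]},
]


def build_array_index(docs, field):
    """Map every element of the array field to the ids of matching docs."""
    index = defaultdict(set)
    for doc in docs:
        for value in doc.get(field, []):
            index[value].add(doc["_id"])
    return index


category_index = build_array_index(products, "categories")


def find_by_category(category):
    """Look up ids through the index instead of scanning every document."""
    return sorted(category_index.get(category, set()))
```

A search such as `find_by_category("outdoor")` touches one index entry rather than every product, which is the effect the notes describe for searches by classification.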
MongoDB has idiomatic drivers for the most popular languages with over a dozen developed and supported by MongoDB and 30+ community-supported drivers.
The MongoDB API is implemented as methods within the API of a specific programming language, as opposed to a completely separate language like SQL. Coupled with MongoDB’s document model and its affinity to the data structures used in object-oriented programming, this makes integration with applications very simple.
Easy to use tool. CSV, TSV, JSON formats
Useful if source data is in the same format as target
May not be suitable for large data sets
Does not do transformation of data
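Because the import tool does not transform data, any reshaping has to happen before the file is handed to it. The following is a minimal sketch, assuming the tool accepts one JSON document per line; the CSV columns, the nested `address` field, and the `rows_to_json_lines` helper are hypothetical.

```python
import csv
import io
import json

# Hypothetical export from a relational "customers" table.
SOURCE_CSV = """id,name,city,zip
1,Ada,London,N1
2,Grace,New York,10001
"""


def rows_to_json_lines(csv_text):
    """Reshape flat CSV rows into nested documents, one JSON string each."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        doc = {
            "_id": int(row["id"]),
            "name": row["name"],
            # Flat columns are grouped into an embedded sub-document.
            "address": {"city": row["city"], "zip": row["zip"]},
        }
        lines.append(json.dumps(doc))
    return lines


json_lines = rows_to_json_lines(SOURCE_CSV)
```

Writing `json_lines` to a file, one entry per line, gives input that is already in the target shape, so the import step itself stays a plain load.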
Pentaho & Informatica have partnerships with MongoDB
GUI based tools
Mapping, workflow that transform, change schema along the way
Can handle different sources
Stable, robust, scalable migrations for large, complex data sets.
Limitations around nesting
Hadoop as an ETL system
MR or Oozie to transform data
Combine, merge, build data set
MR can write directly to MongoDB using the MongoDB Connector for Hadoop
Possible to do updates to augment an initial bulk load
Programmer friendly so almost no limitation as to target transformations
App talks to both source and target
Rather than one big bulk transfer, trickle changes
Business logic is inside the code, so modifications may be validated using rules before writing to MongoDB
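The application-driven approach can be sketched as a dual write with a validation step. This is an illustration under stated assumptions: `InMemoryStore` is a stand-in for real database clients (not a driver API), and the `is_valid` rule, the `to_document` mapping, and the order fields are all hypothetical.

```python
class InMemoryStore:
    """Stand-in for a database client; not a real driver API."""

    def __init__(self):
        self.records = {}

    def save(self, key, record):
        self.records[key] = record


def is_valid(order):
    """Hypothetical business rule applied before writing to the target."""
    return order.get("total", -1) >= 0 and bool(order.get("customer"))


def to_document(order):
    """Map the source record to the target document shape."""
    return {
        "_id": order["id"],
        "customer": order["customer"],
        "total": order["total"],
        "items": list(order.get("items", [])),
    }


def save_order(order, source, target, rejected):
    """Write to the source as before, then trickle the change to the target."""
    source.save(order["id"], order)  # source stays the system of record
    if is_valid(order):
        target.save(order["id"], to_document(order))
    else:
        rejected.append(order["id"])  # keep for later review


source, target, rejected = InMemoryStore(), InMemoryStore(), []
save_order({"id": 1, "customer": "ada", "total": 30, "items": ["book"]},
           source, target, rejected)
save_order({"id": 2, "customer": "", "total": 5}, source, target, rejected)
```

Every change reaches the source, while only records that pass the rules are written to the target, so data moves over gradually instead of in one bulk transfer.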
A lot of times we see a combination of these three options used by customers
Key challenges: Time to market, Cost, Performance, Scalability
Solution: Simple API, OSS software & simple hardware, Reduce complexity & partition data, Clustered system
What kinds of tasks?
Provisioning. Any topology, at scale, with the click of a button.
Upgrades. In minutes, with no downtime.
Scale. Add capacity without taking your application offline.
Continuous Backup. Customize to meet your recovery goals.
Point-in-time Recovery. Restore to any point in time, because disasters aren’t scheduled.
Performance Alerts. Monitor 100+ system metrics and get custom alerts before your system degrades.
What We Sell
We are the MongoDB experts. Over 1,000 organizations rely on our commercial offerings, including leading startups and 30 of the Fortune 100. We offer software and services to make your life easier:
MongoDB Enterprise Advanced is the best way to run MongoDB in your data center. It’s a finely-tuned package of advanced software, support, certifications, and other services designed for the way you do business.
MongoDB Management Service (MMS) is the easiest way to run MongoDB in the cloud. It makes MongoDB the system you worry about the least and like managing the most.
Production Support helps keep your system up and running and gives you peace of mind. MongoDB engineers help you with production issues and any aspect of your project.
Development Support helps you get up and running quickly. It gives you a complete package of software and services for the early stages of your project.
MongoDB Consulting packages get you to production faster, help you tune performance in production, help you scale, and free you up to focus on your next release.
MongoDB Training helps you become a MongoDB expert, from design to operating mission-critical systems at scale. Whether you’re a developer, DBA, or architect, we can make you better at MongoDB.