This document discusses schema design considerations for MongoDB databases. It recommends letting the application direct the schema, judiciously denormalizing data, designing schemas for indexing, using application-level joins when needed, avoiding treating collections as unstructured heaps, and not frequently resizing documents. The document provides examples of embedding related data and storing event data in separate documents to avoid resizing.
MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)
1. Schema Design — MongoBerlin
Richard M Kreuter
10gen Inc.
richard@10gen.com
March 25, 2011
Schema Design — MongoBerlin
2. Observations about Relational Database Schemas
Relational schema design is often presented and thought of as
an exercise in normalization. While academics debate how
many normal forms can fit on the head of a pin, practitioners
tend to employ just one or two.
However, all nontrivial real-world applications employ a variety
of strategic denormalizations: materialized views in the
RDBMS, caching layers outside the RDBMS. These
denormalizations tend to be vital to real-world performance.
Finally, application programmers seldom code in relations, but
rather in object graphs; the RDBMS’s model, the set of
tuples, isn’t a great fit for modern programming languages or
developers’ minds.
3. MongoDB Documents, Queries, Features
MongoDB documents are deeply nestable sequences of key-value
pairs, thus permitting “rich” structure.
The MongoDB query language is relatively SQL-like in its
capacity to find documents satisfying complicated, dynamic
criteria.
MongoDB documents can be updated atomically, with
special efficiency at updates that don’t alter a document’s size
or shape.
4. MongoDB Schema Design Generalities
When designing for MongoDB, do...
... let the application direct the schema.
... denormalize judiciously.
... design your schema for indexing.
... resort to application-level JOINs when needed
And don’t ...
... treat collections as heaps.
... frequently resize documents.
5. Letting the application direct the schema
Most applications view their data in a small number of
distinguished “shapes”, generally congruent to the graphs of
has-a relationships among instances of the classes in the
application’s model. MongoDB lets you store your data more or
less directly according to the shape of your model.
6. Letting the application direct the schema, continued
db.blog_posts.findOne()
{ _id : ObjectId(...),
text : "A blazingly clever blog post.",
by : "A. U. Thor",
date : "Mon Mar 21 2011 03:54:51 GMT-0400 (EDT)",
tags : [ "funny", "ironic" ]
}
7. Denormalizing Judiciously
Most application entities turn out to have some fields that are very
frequently altered, and other fields that are exceedingly seldom
altered. Embedding copies of the infrequently altered attributes in
the documents that use them is a reasonable strategy to improve
performance.
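A minimal sketch of this idea, using plain JavaScript objects in place of a live database. The `user` document, its field names, and the `embedAuthorName` helper are hypothetical, invented for illustration: the display name changes rarely, so a copy is embedded in each post, while the frequently changing login timestamp stays only on the user document.

```javascript
// Hypothetical user document: display_name rarely changes,
// last_login changes on every visit.
const user = {
  _id: "u1",
  display_name: "A. U. Thor",
  last_login: "2011-03-21T07:54:51Z"
};

// Copy only the seldom-altered attribute into the post, so that
// rendering a post needs no second lookup for the author's name.
function embedAuthorName(post, author) {
  return Object.assign({}, post, {
    by_id: author._id,       // reference, for the rare case the full user is needed
    by: author.display_name  // denormalized copy
  });
}

const post = embedAuthorName(
  { text: "A blazingly clever blog post.", tags: ["funny", "ironic"] },
  user
);
console.log(post.by);
```

The trade-off is that a rename has to touch every post by that user, which is acceptable only because renames are assumed to be rare.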
9. Design your schema for indexing
There’s a subtle relationship between schemas and indexes.
Consider this query:
db.boxes.find({$where : "this.height > this.width"})
This query can’t take advantage of MongoDB indexes, both
because it runs JavaScript and because the predicate compares two
fields of the same document, which isn’t something MongoDB knows
how to index. If this sort of query is important, maintain a separate
boolean attribute in the document that records the result of the
comparison; that value can be indexed.
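A sketch of how such a derived attribute might be maintained, using plain JavaScript objects rather than a live collection. The field name `taller_than_wide` and the helper `withShapeFlag` are assumptions made for this example, not part of the original schema.

```javascript
// Recompute the flag whenever height or width is written, so the stored
// boolean always agrees with the comparison the $where query performed.
function withShapeFlag(box) {
  return Object.assign({}, box, {
    taller_than_wide: box.height > box.width
  });
}

const boxes = [
  { _id: 1, height: 10, width: 4 },
  { _id: 2, height: 3, width: 8 },
  { _id: 3, height: 5, width: 5 }
].map(withShapeFlag);

// In-memory stand-in for a query on the precomputed field. Against a real
// collection this would be a plain equality match, e.g.
//   db.boxes.find({ taller_than_wide : true })
// which an index on taller_than_wide can serve.
const tall = boxes.filter(b => b.taller_than_wide === true);
console.log(tall.map(b => b._id));
```

The cost is that every write path that changes `height` or `width` has to go through the helper, otherwise the flag drifts out of date.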
10. Application-level JOINs
Because most MongoDB documents are “richer” than RDBMS
rows, they tend to represent “pre-JOINed” data; and so
application-level JOIN operations should be few. However,
sometimes you do need relational-style normalization and
application-level JOINs. This comes up in some many-to-many
relationships, and may not cost much in practice.
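One way an application-level JOIN for a many-to-many relationship could look, sketched with in-memory arrays standing in for two collections. The `users` and `groups` data and all three helper functions are hypothetical.

```javascript
// Hypothetical collections, modeled as arrays of documents.
const groups = [
  { _id: "g1", name: "admins" },
  { _id: "g2", name: "editors" },
  { _id: "g3", name: "readers" }
];
const users = [
  { _id: "u1", name: "kreuter", group_ids: ["g1", "g3"] }
];

// Step 1: fetch the parent document (stand-in for a findOne by _id).
function findUser(id) {
  return users.find(u => u._id === id);
}

// Step 2: fetch all referenced documents in one batch. Against a real
// collection this would be a single query using the $in operator,
// rather than one query per id.
function findGroupsByIds(ids) {
  const wanted = new Set(ids);
  return groups.filter(g => wanted.has(g._id));
}

// Step 3: stitch the results together in application code.
function userWithGroups(id) {
  const user = findUser(id);
  if (!user) return null;
  return Object.assign({}, user, { groups: findGroupsByIds(user.group_ids) });
}

const joined = userWithGroups("u1");
console.log(joined.groups.map(g => g.name));
```

Because the second step is one batched lookup, the whole JOIN costs two round trips regardless of how many groups the user belongs to.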
11. Don’t treat collections as heaps
Although MongoDB permits quite a bit of freedom in document
structure, documents in a collection ought to share a common
subset of attributes, for programmatic processing, effective
indexing, and developer comprehension. If you have documents
with very different sets of attributes, consider storing them in
separate collections.
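A rough way to check a sample of documents for a shared core of attributes, sketched in plain JavaScript. The `commonKeys` helper and both sample arrays are hypothetical; a small or empty result suggests the documents may belong in separate collections.

```javascript
// Return the attribute names present in every document of a sample.
function commonKeys(docs) {
  if (docs.length === 0) return [];
  return Object.keys(docs[0]).filter(key =>
    docs.every(doc => Object.prototype.hasOwnProperty.call(doc, key))
  );
}

// Two unrelated kinds of document thrown into one collection.
const mixed = [
  { _id: 1, text: "A post", by: "A. U. Thor" },
  { _id: 2, url: "http://10gen.com", hits: 12345 }
];

// Documents that vary a little but share a common core.
const uniform = [
  { _id: 1, text: "A post", by: "A. U. Thor" },
  { _id: 2, text: "Another post", by: "A. U. Thor", tags: ["funny"] }
];

console.log(commonKeys(mixed));   // only _id is shared
console.log(commonKeys(uniform)); // _id, text, and by are shared
```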
12. Don’t frequently resize documents
Resizing a document (e.g. by adding/removing attributes or
adding/removing elements of lists) is generally costly. (In-place
updates are quite efficient, however.) In general, a schema whose
documents’ sizes are highly volatile should be considered suspect;
such data might best be stored as separate documents.
13. Don’t frequently resize documents, continued
So, instead of this
db.urlhits.findOne()
{ _id : ..., url : "http://10gen.com",
// this is counting with granularity of 1 day
counts : { "2011-03-01" :
{ firefox : 12345, chrome : 23456 },
"2011-03-02" :
{ firefox : 15678, chrome : 24567 }
... } }
consider this:
db.urlhits2.findOne()
{ _id : ..., url : "http://10gen.com",
date : "2011-03-01",
counts : { firefox : 12345, chrome : 23456 } }
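The relationship between the two shapes can be sketched as a small transformation in plain JavaScript. The `splitByDay` helper is hypothetical; it turns one document whose `counts` gains a key per day into one fixed-shape document per URL and date.

```javascript
// Split one document whose counts grow by a key per day into one
// fixed-shape document per (url, date) pair.
function splitByDay(doc) {
  return Object.keys(doc.counts).map(date => ({
    url: doc.url,
    date: date,
    counts: doc.counts[date]
  }));
}

const nested = {
  url: "http://10gen.com",
  counts: {
    "2011-03-01": { firefox: 12345, chrome: 23456 },
    "2011-03-02": { firefox: 15678, chrome: 24567 }
  }
};

const perDay = splitByDay(nested);
console.log(perDay.length);
```

With the per-day shape, recording a hit for a browser that already has a counter changes a number in place instead of growing the document.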
14. Don’t frequently resize documents, continued
So, instead of this
db.user_events.findOne()
{ _id : ..., user : "kreuter",
clicks : [ { url : <url1>, time : <time1> },
{ url : <url2>, time : <time2> },
... ] }
consider this:
db.user_events.findOne()
{ _id : ..., user : "kreuter", url: <url1>, time: <time1> }
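The same restructuring, sketched in plain JavaScript. The `explodeClicks` helper is hypothetical, and the URLs and timestamps are placeholder values standing in for the elided ones in the listings.

```javascript
// Turn one per-user document with an ever-growing clicks array into
// one small, fixed-size document per click event.
function explodeClicks(doc) {
  return doc.clicks.map(click => ({
    user: doc.user,
    url: click.url,
    time: click.time
  }));
}

const withArray = {
  user: "kreuter",
  clicks: [
    { url: "http://example.com/a", time: "2011-03-25T10:00:00Z" }, // placeholder values
    { url: "http://example.com/b", time: "2011-03-25T10:05:00Z" }
  ]
};

const events = explodeClicks(withArray);
console.log(events.length);
```

In this shape a new click is an insert of a new document, so no existing document ever has to grow.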
15. Going forward
www.mongodb.org — downloads, docs, community
mongodb-user@googlegroups.com — mailing list
#mongodb on irc.freenode.net
try.mongodb.org — web-based shell
10gen is hiring. Email jobs@10gen.com.
10gen offers support, training, and advising services for
MongoDB.