The document discusses lessons learned from integrating MongoDB into eCommerce websites. Some key points:
- The EAV data model used by Magento is slow and performs poorly at scale, motivating a transition to MongoDB.
- Early approaches stored all product data in MongoDB but this broke features relying on SQL. A hybrid model using MongoDB for most attributes and MySQL for key fields worked better.
- The learning curve is high but storing data to match queries, managing transactions carefully, and using search engines are important. Near real-time processing can improve performance significantly.
- Backup and replication require special attention in distributed architectures. The open source MongoGento module developed by Smile improves Magento performance.
3. SMILE IN A NUTSHELL
- SMILE IS THE BIGGEST PLAYER IN OPEN SOURCE IN EUROPE
  - 700+ employees, 17 offices, 45+ M€ turnover in 2012, 30% growth per year
  - Office in Brussels since 2012, with a local team of experts for your projects
- MULTI-TECHNOLOGIES, A UNIQUE EXPERTISE
  - CMS, E-Commerce, Portal, ECM/DMS, ERP, BI, System/Infrastructure, Custom dev
…
6. WHY NOSQL ?
- Open source software does not meet our clients' performance needs out of the box, especially when dealing with huge product catalogs (millions of products)
- The main bottleneck encountered is the database:
  - Avoid specialization
  - The least scalable component of the LAMP architecture
  - Requires a complex data model when dealing with heterogeneous products
MongoDB into eCommerce websites
Lessons Learned
7. THE ROAD TO NOSQL
- 2008: First Magento release: poor performance
- 2009: Smile provides the first integration of SolR into Magento
  - Magento does it later in its Enterprise Edition (v. 1.8)
- 2012: First prototype of Magento / MongoDB integration
- 2013: MongoGento is now in production and will be opened to the community
  - Several improvements to come by the end of the year
8. WHY MONGODB AMONG OTHER DATABASES ?
- MongoDB is a general-purpose document database:
  - More versatile document selection API
  - The update API allows partial document updates and advanced operations (inc, push, …)
- MongoDB is popular:
  - More developers with MongoDB skills
  - Ecosystem: hosting, SaaS, …
- Well documented
9. HOW IS MAGENTO STORING PRODUCTS ?
- Magento uses the EAV (entity-attribute-value) model to fit products into an RDBMS:
Product main table:
id   sku
1    345678909876
2    786576809080
3    978786798979
…    …

Attribute values table:
id   attribute_id   product_id   store_id   value
1    price          1            0          10.00
2    name           1            0          Product name
3    name           1            1          Nom du produit
…    …              …            …          …
40   price          2            0          20.00
41   name           2            0          Other product
42   name           2            1          Autre produit
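To make the cost of reading from this layout concrete, here is a minimal JavaScript sketch that rebuilds one product from the two tables. The rows are the ones shown on the slide; the `loadProduct` helper and the rule that a store-specific value overrides the store 0 value are illustrative assumptions, not Magento's actual code.

```javascript
// Rows copied from the slide's two tables (in-memory stand-ins for SQL tables).
const productEntity = [
  { id: 1, sku: "345678909876" },
  { id: 2, sku: "786576809080" },
  { id: 3, sku: "978786798979" },
];
const attributeValues = [
  { id: 1, attribute_id: "price", product_id: 1, store_id: 0, value: 10.0 },
  { id: 2, attribute_id: "name", product_id: 1, store_id: 0, value: "Product name" },
  { id: 3, attribute_id: "name", product_id: 1, store_id: 1, value: "Nom du produit" },
  { id: 40, attribute_id: "price", product_id: 2, store_id: 0, value: 20.0 },
  { id: 41, attribute_id: "name", product_id: 2, store_id: 0, value: "Other product" },
  { id: 42, attribute_id: "name", product_id: 2, store_id: 1, value: "Autre produit" },
];

// Hypothetical helper: the in-memory equivalent of joining the value rows
// onto the main table, once per scope.
function loadProduct(productId, storeId) {
  const entity = productEntity.find((row) => row.id === productId);
  if (!entity) return null;
  const product = { id: entity.id, sku: entity.sku };
  // Assumption: store 0 holds default values, a specific store overrides them.
  for (const scope of [0, storeId]) {
    for (const row of attributeValues) {
      if (row.product_id === productId && row.store_id === scope) {
        product[row.attribute_id] = row.value;
      }
    }
  }
  return product;
}
```

Every attribute is a separate row, so loading a single product touches as many rows as it has attribute values, in every scope that applies.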
10. THE EAV MODEL
PROS AND CONS
- Pros:
  - You can add or remove attributes or locales without altering tables, which avoids the downtime such operations require when dealing with a lot of products.
- Cons:
  - Reads are very slow: many joins are required to get a single product or a product list (worse, most of them are left joins).
  - Writes are slow: one product means many inserts, each with many checks done by the RDBMS (foreign keys, indexes, transactional logic).
  - The attribute values tables tend to grow much faster than the number of products (twenty times faster on average).
12. FIRST VERSION :
STORE EVERYTHING INTO MONGODB
Product document example:
  {
    _id : 1,
    attr_0 : {
      name : "Product name",
      price : 10.00
    },
    attr_1 : {
      name : "Nom du produit"
    }
  }
- Pros:
  - 1 product = 1 document (reads and writes are very fast)
  - Very flexible model
- Cons:
  - All foreign keys on the product have to be removed
  - Some attributes are used to compute indexes (sale price, …): a large part of Magento has to be rewritten or will be broken
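The "1 product = 1 document" idea can be sketched by folding the EAV rows of one product into the document shape of the example. This is only an illustration: the `toDocument` helper is hypothetical, and the `attr_<store_id>` naming is inferred from the `attr_0` / `attr_1` keys in the example document.

```javascript
// EAV rows for product 1, as in the attribute values table.
const rows = [
  { attribute_id: "price", product_id: 1, store_id: 0, value: 10.0 },
  { attribute_id: "name", product_id: 1, store_id: 0, value: "Product name" },
  { attribute_id: "name", product_id: 1, store_id: 1, value: "Nom du produit" },
];

// Hypothetical helper: fold all rows of one product into a single document,
// with one "attr_<store_id>" sub-document per store.
function toDocument(productId, eavRows) {
  const doc = { _id: productId };
  for (const row of eavRows) {
    if (row.product_id !== productId) continue;
    const key = `attr_${row.store_id}`;
    doc[key] = doc[key] || {};
    doc[key][row.attribute_id] = row.value;
  }
  return doc;
}

const productDoc = toDocument(1, rows);
```

Once the product is stored this way, reading it is a single lookup by `_id` instead of one join per attribute table.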
13. THE SOLUTION :
HYBRID MODEL MONGODB / MYSQL ?
- Keep RDBMS storage for:
  - The entity main table
  - Attributes related to indexes (price, name, …)
- Put everything else into MongoDB (90% of the attributes, such as description, color, …)
- On product loading:
  - Load from the RDBMS, then enrich from MongoDB
- On product list loading:
  - Load the filtered product list from MongoDB
  - Load the filtered product list from MySQL
  - Merge both product lists (intersection)
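The list-merging step can be sketched as a plain intersection of two id lists. The id values and the `intersectProductIds` helper are made up for illustration, and keeping the MySQL order is an assumption (it supposes the sort is done on a SQL-side attribute such as price).

```javascript
// Hypothetical query results: each store returns the ids matching its own filters.
const idsFromMySql = [12, 7, 31, 4, 18]; // e.g. filtered and sorted on price
const idsFromMongo = [4, 9, 12, 18, 40]; // e.g. filtered on color

// Keep the MySQL order and drop every id that MongoDB did not match.
function intersectProductIds(sqlIds, mongoIds) {
  const allowed = new Set(mongoIds);
  return sqlIds.filter((id) => allowed.has(id));
}

const mergedIds = intersectProductIds(idsFromMySql, idsFromMongo);
```

Using a `Set` keeps the merge linear in the size of both lists, which matters when each side returns thousands of ids.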
15. LESSON N°1
LEARNING CURVE
- It is very pleasant to work with MongoDB
- The learning curve is very high. The document model is quite a natural way for developers to work
- Most of the work will be "unlearning" the way you build your models with an RDBMS, in order to take full advantage of the document model (nested document vs. reference to a document).
- There are often several ways to do something. Some are better than others, and only experience will tell you which one is right.
16. LESSON N°2
STORE DATA THE WAY YOU WILL QUERY IT
- Be pragmatic about data normalisation
- The read / write ratio should influence the way you store data
- Example: store the comments of a product
  {
    _id : 174474747,
    content_text : "My Super Comment",
    user_id : 346568794,
    product_id : 87687
  }
- To be efficient, this solution needs indexes on both product_id and user_id:
  - Indexes cost time on every write
  - All indexes have to fit into RAM, so limit their number
17. LESSON N°2
STORE DATA THE WAY YOU WILL QUERY IT
- You can avoid indexes by adding a new collection:
  {
    _id : 'user_346568794',
    comment_ids : ["174474747"]
  }
  {
    _id : 'product_87687',
    comment_ids : ["174474747"]
  }
- Pros:
  - You don't need secondary indexes anymore (the default _id index is enough)
  - Very fast to read the comments for a product or by a user
- Cons:
  - You have to write three documents for one comment
  - Data duplication
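The trade-off can be sketched with two in-memory maps standing in for the comment collection and the lookup collection. All function names here are hypothetical; only the `user_…` / `product_…` key format and the `comment_ids` field come from the slide.

```javascript
// In-memory stand-ins for two collections, keyed by _id.
const comments = new Map();
const commentLookups = new Map();

// Append a comment id to a lookup document, creating it on first use.
function addToLookup(lookupId, commentId) {
  const doc = commentLookups.get(lookupId) || { _id: lookupId, comment_ids: [] };
  doc.comment_ids.push(commentId);
  commentLookups.set(lookupId, doc);
}

// One comment costs three writes: the comment plus two lookup documents.
function addComment(comment) {
  comments.set(comment._id, comment);
  addToLookup(`user_${comment.user_id}`, comment._id);
  addToLookup(`product_${comment.product_id}`, comment._id);
}

// Reading only uses lookups by _id, so no secondary index is involved.
function commentsForProduct(productId) {
  const lookup = commentLookups.get(`product_${productId}`);
  return lookup ? lookup.comment_ids.map((id) => comments.get(id)) : [];
}

addComment({
  _id: "174474747",
  content_text: "My Super Comment",
  user_id: 346568794,
  product_id: 87687,
});
```

The write path is where the cost moves: the three writes are not atomic as a group, so a failure between them can leave a lookup document out of sync with the comment.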
18. LESSON N°3
MANAGE TRANSACTIONS
- MongoDB does not support multi-document transactions …
- … but single-document modifications are atomic
- The question is: how can I modify my document model to avoid transactions?
- If transactions are really needed, there are alternatives you can implement on your own:
  - Two-phase commits: http://docs.mongodb.org/manual/tutorial/perform-two-phase-commits/
  - Optimistic locking (can be implemented on the client side)
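Client-side optimistic locking can be sketched as a conditional update on a version number. The `version` field and the `updateIfUnchanged` helper are assumptions made for this illustration, and the collection is simulated with an in-memory map; with a real collection, the version check would go into the update's query condition so that the check and the write happen as one atomic single-document operation.

```javascript
// In-memory stand-in for a collection, keyed by _id.
const products = new Map([[1, { _id: 1, price: 10.0, version: 1 }]]);

// Hypothetical conditional update: applies the change only if the stored
// version still matches the one the caller read earlier.
function updateIfUnchanged(id, expectedVersion, changes) {
  const current = products.get(id);
  if (!current || current.version !== expectedVersion) {
    return false; // someone else wrote first: the caller should reload and retry
  }
  products.set(id, { ...current, ...changes, version: expectedVersion + 1 });
  return true;
}

// Two clients read the same version, then both try to write.
const readByA = { ...products.get(1) };
const readByB = { ...products.get(1) };
const aSucceeded = updateIfUnchanged(1, readByA.version, { price: 12.0 });
const bSucceeded = updateIfUnchanged(1, readByB.version, { price: 8.0 });
```

Client B's write is rejected instead of silently overwriting client A's change; B has to re-read the document and decide whether its update still makes sense.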
19. LESSON N°3
MANAGE TRANSACTIONS
- You can avoid transactions by avoiding "get for set" operations. Use MongoDB update operators instead.
- Example: append a category to a product and update its price
BAD:
  // Read the whole document, modify it in PHP, then write it all back.
  $product = $collection->findOne(array(
      '_id' => $productId
  ));
  $product['price'] = 10.00;
  $product['category_ids'][] = 3;
  $collection->save($product);
GOOD:
  // Send only the changes; MongoDB applies them atomically to the document.
  $updateCond = array('_id' => $productId);
  $updateData = array(
      '$set' => array('price' => 10.00),
      '$push' => array('category_ids' => 3),
  );
  $collection->update($updateCond, $updateData);
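Why the whole-document save is risky can be shown with a small in-memory simulation of two clients changing the same product. Nothing here talks to MongoDB: `saveWholeDocument` and `applyUpdate` are hypothetical stand-ins, and `applyUpdate` only imitates `$set` and `$push` on top-level fields.

```javascript
// In-memory stand-in for the product collection.
const store = new Map();

function readCopy(id) {
  return JSON.parse(JSON.stringify(store.get(id)));
}

// "Get for set": the whole document is written back.
function saveWholeDocument(doc) {
  store.set(doc._id, JSON.parse(JSON.stringify(doc)));
}

// Minimal imitation of two update operators, applied to the stored document.
function applyUpdate(id, update) {
  const doc = store.get(id);
  for (const [field, value] of Object.entries(update.$set || {})) {
    doc[field] = value;
  }
  for (const [field, value] of Object.entries(update.$push || {})) {
    doc[field] = [...(doc[field] || []), value];
  }
}

// Whole-document save: both clients read the same state, then each saves its copy.
store.set(1, { _id: 1, price: 5.0, category_ids: [1] });
const copyA = readCopy(1);
const copyB = readCopy(1);
copyA.price = 10.0;
copyB.category_ids.push(3);
saveWholeDocument(copyA);
saveWholeDocument(copyB); // overwrites the price written by client A
const afterSave = readCopy(1);

// Update operators: each client sends only its own change.
store.set(1, { _id: 1, price: 5.0, category_ids: [1] });
applyUpdate(1, { $set: { price: 10.0 } });
applyUpdate(1, { $push: { category_ids: 3 } });
const afterUpdate = readCopy(1);
```

In the first scenario client A's price change is lost, because client B saved a copy that still carried the old price. In the second, both changes survive since neither write touches the other's field.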
20. LESSON N°4
NOSQL = NOT ONLY SQL
- Keep in mind that NoSQL is about database specialization:
  - Most of the time you will have several databases in your project
  - Use the best one available for the task you have to perform
- Avoid spreading related data across several databases:
  - Hard to back up (synchronization)
  - Not very performant
- Stay pragmatic about what you put into MongoDB:
  - Target the performance killers in existing projects
  - If you are trying to reproduce complex transactions, you are probably doing it wrong
  - SQL does not have to be thrown away
21. LESSON N°5
USE A SEARCH ENGINE
- Index your documents into a search engine (SolR, ElasticSearch, …)
- Search engines are more efficient when it comes to filtering collections of documents
- Most of the time you will do it anyway, because you will need full-text search, faceting, …
- Use:
  - Data import handlers for SolR
  - The MongoDB River plugin for ElasticSearch
22. LESSON N°6
NEAR REAL-TIME PROCESSING IS A BIG WIN
- Do I really need:
  - Data to be computed in real time?
  - Data to be always fresh? What is an acceptable delay?
- Example: top ten product sales by category.
  - Calculated at page load?
  - Calculated every time someone buys something?
  - Calculated every 5 minutes by a batch process?
- The tools:
  - Use MapReduce (incremental) to batch analysis operations involving large datasets
  - Use tailable cursors for background stream processing
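The "every 5 minutes by a batch process" option can be sketched as a function that recomputes the ranking and a cached value that pages read. The order lines, the `computeTopSales` helper and the cache variable are all made up for illustration; this is plain in-memory JavaScript, not MapReduce.

```javascript
// Hypothetical order lines; in a real shop these would come from the order store.
const orderLines = [
  { category: "shoes", product_id: 1, qty: 2 },
  { category: "shoes", product_id: 2, qty: 5 },
  { category: "bags", product_id: 3, qty: 1 },
  { category: "shoes", product_id: 1, qty: 4 },
];

// Batch step: sum quantities per product in each category, keep the best sellers.
function computeTopSales(lines, limit = 10) {
  const totals = new Map(); // category -> Map(product_id -> quantity sold)
  for (const { category, product_id, qty } of lines) {
    if (!totals.has(category)) totals.set(category, new Map());
    const perProduct = totals.get(category);
    perProduct.set(product_id, (perProduct.get(product_id) || 0) + qty);
  }
  const top = {};
  for (const [category, perProduct] of totals) {
    top[category] = [...perProduct.entries()]
      .sort((a, b) => b[1] - a[1])
      .slice(0, limit)
      .map(([product_id, qty]) => ({ product_id, qty }));
  }
  return top;
}

// Pages read this precomputed value. A scheduler would refresh it, for example:
// setInterval(() => { cachedTopSales = computeTopSales(orderLines); }, 5 * 60 * 1000);
let cachedTopSales = computeTopSales(orderLines);
```

The page load then costs one read of a small precomputed structure, at the price of a ranking that can be up to five minutes old.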
23. LESSON N°7
SCHEMALESS DATABASES HAVE FASTEST GROWTH RATE
- MongoDB consumes a lot more storage than an RDBMS:
  - The document structure is stored with each document (in an RDBMS it is stored once for the table)
  - Documents are padded (a small amount of free space is added at the end) => better performance during updates, but more space used on disk
- To achieve the best performance, the whole dataset + indexes have to fit into RAM
- Don't hesitate to experiment with sharding, but do it carefully: the shard key cannot be updated. Pay attention to what you choose.
24. LESSON N°8
DON’T USE REPLICATION AS A BACKUP
- Replication is about hardware failures
- Backup is about human failures:
  db.catalog_product_entity.drop();
- Backup: full backup + oplog (point-in-time recovery)
- It is difficult to back up a sharded cluster or a hybrid MySQL / MongoDB application (synchronization)
- You can avoid having to recover from backups:
  - Never use delete operations: mark data as deleted and filter it out instead
  - Use a versioning system instead of updating data in place
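Both ideas, the deleted flag and versioning, can be sketched together with an append-only list standing in for a collection. The field names (`version`, `deleted`) and all helpers are hypothetical choices for this illustration.

```javascript
// In-memory stand-in for a collection: every change appends a new version.
const history = [];

function nextVersion(id) {
  return history.filter((doc) => doc.id === id).length + 1;
}

// Hypothetical write path: never overwrite, append a new version instead.
function saveVersion(id, data) {
  history.push({ id, version: nextVersion(id), deleted: false, data });
}

// "Delete" only appends a version flagged as deleted.
function softDelete(id) {
  history.push({ id, version: nextVersion(id), deleted: true, data: null });
}

// Readers see the latest version and filter out deleted entries.
function findCurrent(id) {
  const versions = history.filter((doc) => doc.id === id);
  const latest = versions[versions.length - 1];
  return latest && !latest.deleted ? latest.data : null;
}

// Undoing a mistake means re-appending an earlier version, not restoring a backup.
function restore(id, version) {
  const old = history.find((doc) => doc.id === id && doc.version === version);
  if (old) saveVersion(id, old.data);
}

saveVersion(87687, { name: "Product name", price: 10.0 });
saveVersion(87687, { name: "Product name", price: 12.0 });
softDelete(87687);
const whileDeleted = findCurrent(87687);
restore(87687, 2);
const afterRestore = findCurrent(87687);
```

This protects against application-level mistakes only. A dropped collection, as in the `drop()` example, still requires a real backup.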
26. MONGOGENTO
WHAT DOES IT DO ?
- Manages product attributes and media galleries
  - Product import performance: x5
  - Frontend / Admin performance: x2
  - Benchmarks (French): http://www.ecommerce-performances.com/mongogento.php
- Few Magento features are broken, and the broken ones were not usable with huge catalogs anyway
- June 2013: goes live in production.
- Jan. 2014: first open source release:
  - You can fork it on GitHub: https://github.com/Smile-SA/mongogento
27. MONGOGENTO
THE ROADMAP
- More features to be added:
  - Fix broken Magento features: catalog rules support, sitemap
  - Cart storage
  - Media assets storage (GridFS)
- Search engine integration (ElasticSearch) with unique features:
  - Behavioral data processing
  - Mahout integration
- Magento Community Edition support