2. Introduction
In real life, almost any project deals with tree structures. Taxonomies of all kinds, site structures and the like require modeling of hierarchical relations.
Typical approaches:
● Model Tree Structures with Child References
● Model Tree Structures with Parent References
● Model Tree Structures with an Array of Ancestors
● Model Tree Structures with Materialized Paths
● Model Tree Structures with Nested Sets
4. Challenges to address
In a typical site scenario, we should be able to
● Operate on the tree (insert a new node under a specific parent, update/remove an existing node, move a node across the tree)
● Get the path to a node (for example, in order to build the breadcrumb section)
● Get all descendants of a node (for example, in order to select goods from a more general category, like 'Cell Phones and Accessories', which should include goods from all subcategories)
5. Scope of the demo
In each of the examples below we:
● Add a new node called 'LG' under Electronics
● Move the 'LG' node under the Cell_Phones_and_Smartphones node
● Remove the 'LG' node from the tree
● Get the child nodes of the Electronics node
● Get the path to the 'Nokia' node
● Get all descendants of the 'Cell_Phones_and_Accessories' node
7. Tree structure with parent reference
This is the most commonly used approach. For each node we store (ID, ParentReference, Order).
8. Operating with tree
Operating on the tree is pretty simple, but changing the position of a node among its siblings requires additional calculation.
You might want to use widely spaced order values, such as item position * 10^6, so that a node placed between two siblings can take the order trunc((lower sibling order + higher sibling order) / 2). This leaves room for many reorderings before you need to traverse the whole tree and reset the order values to widely spaced numbers again.
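The gap-based ordering idea can be sketched in plain JavaScript, without a database. This is only an illustration: the names `ORDER_GAP` and `orderBetween` are assumptions made for this sketch and do not come from the demo files.

```javascript
// Assumed initial spacing between sibling order values (item position * 10^6).
const ORDER_GAP = 1000000;

// Hypothetical helper: pick an order value between two neighbouring siblings.
// Returns null when no free integer is left, i.e. the siblings must be renumbered.
function orderBetween(lowerOrder, higherOrder) {
  const middle = Math.trunc((lowerOrder + higherOrder) / 2);
  if (middle === lowerOrder || middle === higherOrder) {
    return null;
  }
  return middle;
}

// Two siblings created with the default spacing.
const firstOrder = 1 * ORDER_GAP;
const secondOrder = 2 * ORDER_GAP;

// A node dropped between them takes the midpoint.
const insertedOrder = orderBetween(firstOrder, secondOrder);

// Adjacent order values leave no room, which signals that renumbering is due.
const exhaustedOrder = orderBetween(10, 11);
```

With a gap of 10^6, roughly twenty consecutive insertions into the same gap are possible before `orderBetween` returns null, because every insertion halves the remaining gap.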
9. Adding new node
Good points: requires only one insert operation to introduce the node (plus a count of the existing siblings to compute its order).
var existingelemscount = db.categoriesPCO.find({parent:'Electronics'}).count();
var neworder = (existingelemscount+1)*10;
db.categoriesPCO.insert({_id:'LG', parent:'Electronics', someadditionalattr:'test', order:neworder})
//{ "_id" : "LG", "parent" : "Electronics", "someadditionalattr" : "test", "order" : 40 }
10. Updating / moving the node
Good points: as with insert, requires only one update operation to amend the node.
existingelemscount = db.categoriesPCO.find({parent:'Cell_Phones_and_Smartphones'}).count();
neworder = (existingelemscount+1)*10;
db.categoriesPCO.update({_id:'LG'},{$set:{parent:'Cell_Phones_and_Smartphones', order:neworder}});
//{ "_id" : "LG", "order" : 60, "parent" : "Cell_Phones_and_Smartphones", "someadditionalattr" : "test" }
11. Node removal
Good points: requires a single operation to remove the node from the tree.
db.categoriesPCO.remove({_id:'LG'});
12. Getting node children, ordered
Good points: all children can be retrieved from the database, already ordered, with a single call.
db.categoriesPCO.find({parent:'Electronics'}).sort({order:1})
//{ "_id" : "Cameras_and_Photography", "parent" : "Electronics", "order" : 10 }
//{ "_id" : "Shop_Top_Products", "parent" : "Electronics", "order" : 20 }
//{ "_id" : "Cell_Phones_and_Accessories", "parent" : "Electronics", "order" : 30 }
13. Getting all node descendants
Bad points: unfortunately, requires repeated calls to the database, one per visited node.
var descendants=[]
var stack=[];
var item = db.categoriesPCO.findOne({_id:"Cell_Phones_and_Accessories"});
stack.push(item);
while (stack.length>0){
    var currentnode = stack.pop();
    //one query per node: fetch its direct children
    var children = db.categoriesPCO.find({parent:currentnode._id});
    while(true === children.hasNext()) {
        var child = children.next();
        descendants.push(child._id);
        stack.push(child);
    }
}
descendants.join(",")
//Cell_Phones_and_Smartphones,Headsets,Batteries,Cables_And_Adapters,Nokia,Samsung,Apple,HTC,Vyacheslav
14. Getting path to node
Bad points: unfortunately, also requires repeated calls to the database, one per level, to get the path.
var path=[]
var item = db.categoriesPCO.findOne({_id:"Nokia"})
//the loose check stops both on parent:null and on a missing parent field
while (item.parent != null) {
    item=db.categoriesPCO.findOne({_id:item.parent});
    path.push(item._id);
}
path.reverse().join(' / ');
//Electronics / Cell_Phones_and_Accessories / Cell_Phones_and_Smartphones
16. Tree structure with child references
For each node we store (ID, ChildReferences).
17. Note
Please note that in this case we do not need an order field, because the childs array already provides this information.
Most languages preserve array order. If yours does not, you might consider additional code to preserve the order, but this will make things more complicated.
18. Adding new node
Note: requires one insert operation and one update operation to insert the node.
db.categoriesCRO.insert({_id:'LG', childs:[]});
db.categoriesCRO.update({_id:'Electronics'},{$addToSet:{childs:'LG'}});
//{ "_id" : "Electronics", "childs" : [ "Cameras_and_Photography", "Shop_Top_Products", "Cell_Phones_and_Accessories", "LG" ] }
19. Updating/moving the node
Changing the node order within the same parent requires a single update operation; moving the node under another parent requires two update operations.
Rearranging order under the same parent
db.categoriesCRO.update({_id:'Electronics'},{$set:{"childs.1":'LG',"childs.3":'Shop_Top_Products'}});
//{ "_id" : "Electronics", "childs" : [ "Cameras_and_Photography", "LG", "Cell_Phones_and_Accessories", "Shop_Top_Products" ] }
Moving the node
db.categoriesCRO.update({_id:'Cell_Phones_and_Smartphones'},{$addToSet:{childs:'LG'}});
db.categoriesCRO.update({_id:'Electronics'},{$pull:{childs:'LG'}});
//{ "_id" : "Cell_Phones_and_Smartphones", "childs" : [ "Nokia", "Samsung", "Apple", "HTC", "Vyacheslav", "LG" ] }
20. Node removal
Node removal also requires two operations: one update and one remove.
db.categoriesCRO.update({_id:'Cell_Phones_and_Smartphones'},{$pull:{childs:'LG'}})
db.categoriesCRO.remove({_id:'LG'});
21. Getting node children, ordered
Bad points: requires additional client-side sorting by the order of the parent's childs array. Depending on the size of the result set, this may affect the speed of your code.
var parent = db.categoriesCRO.findOne({_id:'Electronics'})
db.categoriesCRO.find({_id:{$in:parent.childs}})
22. Getting node children, ordered
Result set
{ "_id" : "Cameras_and_Photography", "childs" : [ "Digital_Cameras",
"Camcorders", "Lenses_and_Filters", "Tripods_and_supports",
"Lighting_and_studio" ] }
{ "_id" : "Cell_Phones_and_Accessories", "childs" : [
"Cell_Phones_and_Smartphones", "Headsets", "Batteries",
"Cables_And_Adapters" ] }
{ "_id" : "Shop_Top_Products", "childs" : [ "IPad", "IPhone", "IPod",
"Blackberry" ] }
//parent:
{
"_id" : "Electronics",
"childs" : [
"Cameras_and_Photography",
"Cell_Phones_and_Accessories",
"Shop_Top_Products"
]
}
As you can see, the parent holds the ordered childs array, which can be used to sort the result set on the client.
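The client-side sorting step might look like the sketch below. It runs on plain objects shaped like the documents above, so no database is needed; the helper name `sortByParentOrder` is an assumption made for this illustration.

```javascript
// Sample data shaped like the parent document and an unordered result set.
const parentDoc = {
  _id: 'Electronics',
  childs: ['Cameras_and_Photography', 'Cell_Phones_and_Accessories', 'Shop_Top_Products']
};
const unorderedChildren = [
  { _id: 'Shop_Top_Products' },
  { _id: 'Cameras_and_Photography' },
  { _id: 'Cell_Phones_and_Accessories' }
];

// Hypothetical helper: order fetched children by their position in parent.childs.
function sortByParentOrder(parent, children) {
  // Build an id -> position lookup once, so the comparator stays cheap.
  const position = new Map();
  parent.childs.forEach((id, index) => position.set(id, index));
  // slice() keeps the original result array untouched.
  return children.slice().sort((a, b) => position.get(a._id) - position.get(b._id));
}

const orderedIds = sortByParentOrder(parentDoc, unorderedChildren).map(doc => doc._id);
```

The lookup map avoids calling indexOf on the childs array for every comparison, which matters only for parents with many children.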
23. Getting all node descendants
Note: this also takes repeated calls to the database, but fewer selects than the previous approach, because nodes without children are never queried.
var descendants=[]
var stack=[];
var item = db.categoriesCRO.findOne({_id:"Cell_Phones_and_Accessories"});
stack.push(item);
while (stack.length>0){
    var currentnode = stack.pop();
    var children = db.categoriesCRO.find({_id:{$in:currentnode.childs}});
    while(true === children.hasNext()) {
        var child = children.next();
        descendants.push(child._id);
        //leaf nodes are skipped, which saves one query per leaf
        if(child.childs.length>0){
            stack.push(child);
        }
    }
}
descendants.join(",")
//Batteries,Cables_And_Adapters,Cell_Phones_and_Smartphones,Headsets,Apple,HTC,Nokia,Samsung
24. Getting path to node
The path is built level by level, so we need to issue a number of sequential calls to the database.
var path=[]
var item = db.categoriesCRO.findOne({_id:"Nokia"})
//findOne returns null at the root, which ends the loop
while ((item=db.categoriesCRO.findOne({childs:item._id}))) {
    path.push(item._id);
}
path.reverse().join(' / ');
//Electronics / Cell_Phones_and_Accessories / Cell_Phones_and_Smartphones
26. Tree structure using an
Array of Ancestors
For each node we store (ID, ParentReference,
AncestorReferences)
27. Adding new node
You need one insert operation to introduce a new node, but you also need a select to prepare the data for the insert.
var ancestorpath = db.categoriesAAO.findOne({_id:'Electronics'}).ancestors;
ancestorpath.push('Electronics')
db.categoriesAAO.insert({_id:'LG', parent:'Electronics', ancestors:ancestorpath});
//{ "_id" : "LG", "parent" : "Electronics", "ancestors" : [ "Electronics" ] }
28. Updating/moving the node
Moving the node requires one select and one update operation. If the moved node had descendants, their ancestors arrays would have to be updated as well.
ancestorpath = db.categoriesAAO.findOne({_id:'Cell_Phones_and_Smartphones'}).ancestors;
ancestorpath.push('Cell_Phones_and_Smartphones')
db.categoriesAAO.update({_id:'LG'},{$set:{parent:'Cell_Phones_and_Smartphones', ancestors:ancestorpath}});
//{ "_id" : "LG", "ancestors" : [ "Electronics", "Cell_Phones_and_Accessories", "Cell_Phones_and_Smartphones" ], "parent" : "Cell_Phones_and_Smartphones" }
30. Getting node children, unordered
Note: unless you introduce an order field, it is impossible to get an ordered list of node children. You should consider another approach if you need ordering.
db.categoriesAAO.find({parent:'Electronics'})
31. Getting all node descendants
There are two options to get all node descendants. The first is a plain query on the ancestors array; no recursion is needed, because every descendant lists the node among its ancestors:
var descendants=[]
var cursor = db.categoriesAAO.find({ancestors:"Cell_Phones_and_Accessories"},{_id:1});
while(true === cursor.hasNext()) {
    var elem = cursor.next();
    descendants.push(elem._id);
}
descendants.join(",")
//Cell_Phones_and_Smartphones,Headsets,Batteries,Cables_And_Adapters,Nokia,Samsung,Apple,HTC,Vyacheslav
32. Getting all node descendants
The second uses the aggregation framework introduced in MongoDB 2.2:
var aggrancestors = db.categoriesAAO.aggregate([
    {$match:{ancestors:"Cell_Phones_and_Accessories"}},
    {$project:{_id:1}},
    {$group:{_id:{},ancestors:{$addToSet:"$_id"}}}
])
//the 2.2-2.4 shell returns a document with a result array; in later versions
//aggregate() returns a cursor, so use aggrancestors.toArray()[0].ancestors there
descendants = aggrancestors.result[0].ancestors
descendants.join(",")
//Vyacheslav,HTC,Samsung,Cables_And_Adapters,Batteries,Headsets,Apple,Nokia,Cell_Phones_and_Smartphones
33. Getting path to node
This operation is done with a single call to the database, which is the advantage of this approach.
var item = db.categoriesAAO.findOne({_id:"Nokia"})
var path = item.ancestors;
path.join(' / ');
//Electronics / Cell_Phones_and_Accessories / Cell_Phones_and_Smartphones
36. Intro
The Materialized Paths approach looks similar to storing an array of ancestors, but we store the path as a string instead.
In the example above I intentionally use a comma (,) as the divider between path elements, in order to keep the regular expressions simpler.
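To make the string format concrete, here is a small sketch of how a comma-terminated path can be built and split. It assumes that a root node stores an empty path, which matches the sample output in the slides that follow; the helper names `childPath` and `breadcrumb` are hypothetical.

```javascript
// Hypothetical helper: the path stored on a child is the parent's path
// followed by the parent's id and a trailing comma.
function childPath(parentDoc) {
  return parentDoc.path + parentDoc._id + ',';
}

// Hypothetical helper: turn a stored path back into a list of ancestor ids.
function breadcrumb(doc) {
  // The trailing comma produces an empty last element, so it is filtered out.
  return doc.path.split(',').filter(part => part.length > 0);
}

// Assumption: the root node carries an empty path.
const rootNode = { _id: 'Electronics', path: '' };
const lgPath = childPath(rootNode);

const nokiaNode = {
  _id: 'Nokia',
  path: 'Electronics,Cell_Phones_and_Accessories,Cell_Phones_and_Smartphones,'
};
const crumbs = breadcrumb(nokiaNode);
```

The trailing comma is what lets a prefix match on the path distinguish 'Electronics,' from an id that merely starts with the same letters.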
37. Adding new node
New node insertion is done with one select and one insert operation.
var ancestorpath = db.categoriesMP.findOne({_id:'Electronics'}).path;
ancestorpath += 'Electronics,'
db.categoriesMP.insert({_id:'LG', path:ancestorpath});
//{ "_id" : "LG", "path" : "Electronics," }
38. Updating/moving the node
A node can be moved using one select and one update operation.
ancestorpath = db.categoriesMP.findOne({_id:'Cell_Phones_and_Smartphones'}).path;
ancestorpath += 'Cell_Phones_and_Smartphones,'
db.categoriesMP.update({_id:'LG'},{$set:{path:ancestorpath}});
//{ "_id" : "LG", "path" : "Electronics,Cell_Phones_and_Accessories,Cell_Phones_and_Smartphones," }
39. Node removal
Node can be removed using single database
query
db.categoriesMP.remove({_id:'LG'});
40. Getting node children, unordered
Note: unless you introduce an order field, it is impossible to get an ordered list of node children. You should consider another approach if you need ordering.
db.categoriesMP.find({path:'Electronics,'})
//{ "_id" : "Cameras_and_Photography", "path" : "Electronics," }
//{ "_id" : "Shop_Top_Products", "path" : "Electronics," }
//{ "_id" : "Cell_Phones_and_Accessories", "path" : "Electronics," }
41. Getting all node descendants
Single select. A case-sensitive regular expression anchored with ^ lets MongoDB use an index on path for matching; the case-insensitive 'i' option would defeat that, so it is not used here.
var descendants=[]
var item = db.categoriesMP.findOne({_id:"Cell_Phones_and_Accessories"});
//note: ids containing regex special characters would need escaping first
var criteria = '^'+item.path+item._id+',';
var children = db.categoriesMP.find({path: { $regex: criteria }});
while(true === children.hasNext()) {
    var child = children.next();
    descendants.push(child._id);
}
descendants.join(",")
//Cell_Phones_and_Smartphones,Headsets,Batteries,Cables_And_Adapters,Nokia,Samsung,Apple,HTC,Vyacheslav
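The prefix match can be tried out on plain objects before pointing it at a collection. The sketch below is an illustration under assumptions: the sample documents are a reduced version of the demo tree, the root is assumed to store an empty path, and the helpers `escapeRegex` and `descendantsPattern` are hypothetical names.

```javascript
// Reduced sample of the tree, shaped like the categoriesMP documents.
const categories = [
  { _id: 'Electronics', path: '' },
  { _id: 'Cameras_and_Photography', path: 'Electronics,' },
  { _id: 'Cell_Phones_and_Accessories', path: 'Electronics,' },
  { _id: 'Cell_Phones_and_Smartphones', path: 'Electronics,Cell_Phones_and_Accessories,' },
  { _id: 'Nokia', path: 'Electronics,Cell_Phones_and_Accessories,Cell_Phones_and_Smartphones,' }
];

// Escape regex metacharacters so that an id such as 'C++' cannot alter the pattern.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: anchored, case-sensitive prefix pattern for a node's subtree.
function descendantsPattern(node) {
  return new RegExp('^' + escapeRegex(node.path + node._id + ','));
}

const startNode = categories.find(doc => doc._id === 'Cell_Phones_and_Accessories');
const pattern = descendantsPattern(startNode);

// Same condition the database would evaluate: path starts with the node's full path.
const descendantIds = categories
  .filter(doc => pattern.test(doc.path))
  .map(doc => doc._id);
```

A RegExp object built this way can also be passed as the path condition of a find() call in the mongo shell.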
42. Getting path to node
We can obtain the path directly from the node, without issuing additional selects.
var item = db.categoriesMP.findOne({_id:"Nokia"})
print (item.path)
//Electronics,Cell_Phones_and_Accessories,Cell_Phones_and_Smartphones,
46. Adding new node
Please refer to the image above.
Assume we want to insert the LG node after Shop_Top_Products (14,23).
The new node gets a left value of 24, which shifts all following left values according to the traversal rules, and a right value of 25, which shifts all following right values, including the root's.
47. Adding new node
Take the next node in the traversal order (the following sibling).
The new node takes the following sibling's left value as its own left value, and that value plus one as its right value.
Now we have to make room for the new node. The update affects the right values of all ancestor nodes, and the left and right values of all nodes that come later in the traversal.
Only after making room can the new node be inserted.
48. Adding new node
var followingsibling = db.categoriesNSO.findOne({_id:"Cell_Phones_and_Accessories"});
var newnode = {_id:'LG', left:followingsibling.left, right:followingsibling.left+1}
//make room: shift by 2 every right and left value at or after the new position
//(this also covers nodes that follow the sibling's subtree)
//3rd and 4th parameters: upsert=false, multi=true
db.categoriesNSO.update({right:{$gte:newnode.left}},{$inc:{right:2}}, false, true)
db.categoriesNSO.update({left:{$gte:newnode.left}},{$inc:{left:2}}, false, true)
db.categoriesNSO.insert(newnode)
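The effect of the shift is easier to follow on a tiny in-memory tree. The sketch below mimics the same steps on an array of plain objects; the left/right values are simplified sample numbers rather than the ones from the slides, and the function name `insertBefore` is an assumption made for this illustration.

```javascript
// Simplified sample tree: a root with three leaf children.
const tree = [
  { _id: 'Electronics', left: 1, right: 8 },
  { _id: 'Cameras_and_Photography', left: 2, right: 3 },
  { _id: 'Shop_Top_Products', left: 4, right: 5 },
  { _id: 'Cell_Phones_and_Accessories', left: 6, right: 7 }
];

// Hypothetical helper: insert a leaf node right before the given sibling.
function insertBefore(collection, newId, followingSiblingId) {
  const sibling = collection.find(doc => doc._id === followingSiblingId);
  const newLeft = sibling.left;
  // Open a gap of two: every boundary at or after the new position moves by 2.
  collection.forEach(doc => {
    if (doc.right >= newLeft) doc.right += 2;
    if (doc.left >= newLeft) doc.left += 2;
  });
  // Only now is there room for the new leaf.
  collection.push({ _id: newId, left: newLeft, right: newLeft + 1 });
}

insertBefore(tree, 'LG', 'Cell_Phones_and_Accessories');
const summary = tree.map(doc => doc._id + '(' + doc.left + ',' + doc.right + ')');
```

After the call, LG occupies (6,7), the following sibling has moved to (8,9), and the root's right value has grown from 8 to 10.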
50. Node removal
While rearranging nodes within the same parent can come down to exchanging their left and right values, the formal way of moving a node is to first remove it from the tree and then insert it at the new location.
Note: removing a node without removing its children is out of scope for this article.
For now, we assume that the node to remove has no children, i.e. right-left=1.
The steps mirror those for adding a node: we close the gap by decreasing the affected left/right values and remove the original node.
51. Node removal
var nodetoremove = db.categoriesNSO.findOne({_id:"LG"});
if (nodetoremove.right-nodetoremove.left !== 1) {
    print("Only a node without children can be removed")
} else {
    //close the gap: shift by -2 every value after the removed node
    db.categoriesNSO.update({right:{$gt:nodetoremove.right}},{$inc:{right:-2}}, false, true)
    db.categoriesNSO.update({left:{$gt:nodetoremove.right}},{$inc:{left:-2}}, false, true)
    db.categoriesNSO.remove({_id:"LG"});
}
52. Updating/moving the single node
A node can be moved within the same parent or to another parent. If the parent stays the same and the nodes have no children, then you only need to exchange the nodes' (left,right) pairs.
The formal way is to remove the node and insert it at the new destination, so the same restriction applies: only a node without children can be moved. If you need to move a subtree, consider creating a mirror of the existing parent at the new location and moving the nodes under the new parent one by one. Once all nodes are moved, remove the obsolete old parent.
As an example, let's move the LG node from the insertion example under the Cell_Phones_and_Smartphones node, as the last sibling (i.e. there is no following sibling node as in the insertion example).
53. Updating/moving the single node
Steps
1. Remove the LG node from the tree using the node removal procedure described above.
2. Take the right value of the new parent. The new node gets the parent's right value as its left value, and that value plus one as its right value. Now we have to make room for the new node: the update affects the left and right values of all nodes further along the traversal path.
var newparent = db.categoriesNSO.findOne({_id:"Cell_Phones_and_Smartphones"});
var nodetomove = {_id:'LG', left:newparent.right, right:newparent.right+1}
//3rd and 4th parameters: false stands for upsert=false and true stands for multi=true
db.categoriesNSO.update({right:{$gte:newparent.right}},{$inc:{right:2}}, false, true)
db.categoriesNSO.update({left:{$gte:newparent.right}},{$inc:{left:2}}, false, true)
db.categoriesNSO.insert(nodetomove)
55. Getting all node descendants
This is the core strength of this approach: all descendants are retrieved with one select to the DB. Moreover, sorting by the left value makes the data set ready for traversal in the correct order.
var descendants=[]
var item = db.categoriesNSO.findOne({_id:"Cell_Phones_and_Accessories"});
print ('('+item.left+','+item.right+')')
var children = db.categoriesNSO.find({left:{$gt:item.left}, right:{$lt:item.right}}).sort({left:1});
while(true === children.hasNext()) {
    var child = children.next();
    descendants.push(child._id);
}
descendants.join(",")
//Cell_Phones_and_Smartphones,Headsets,Batteries,Cables_And_Adapters,Nokia,Samsung,Apple,HTC,Vyacheslav
56. Getting path to node
Retrieving the path to a node is also elegant and can be done with a single query to the database:
var path=[]
var item = db.categoriesNSO.findOne({_id:"Nokia"})
var ancestors = db.categoriesNSO.find({left:{$lt:item.left}, right:{$gt:item.right}}).sort({left:1})
while(true === ancestors.hasNext()) {
    var child = ancestors.next();
    path.push(child._id);
}
path.join('/')
// Electronics/Cell_Phones_and_Accessories/Cell_Phones_and_Smartphones
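Both nested set queries rest on simple interval containment, which the following sketch demonstrates on an in-memory array. The left/right values are simplified sample numbers, and the helper names `descendantsOf`, `ancestorsOf` and `nodeById` are assumptions made for this illustration.

```javascript
// Simplified sample tree with nested (left, right) intervals.
const nestedSet = [
  { _id: 'Electronics', left: 1, right: 10 },
  { _id: 'Cell_Phones_and_Accessories', left: 2, right: 9 },
  { _id: 'Cell_Phones_and_Smartphones', left: 3, right: 6 },
  { _id: 'Nokia', left: 4, right: 5 },
  { _id: 'Headsets', left: 7, right: 8 }
];

const byLeft = (a, b) => a.left - b.left;
const nodeById = id => nestedSet.find(doc => doc._id === id);

// Descendants lie strictly inside the node's interval.
function descendantsOf(collection, node) {
  return collection
    .filter(doc => doc.left > node.left && doc.right < node.right)
    .sort(byLeft);
}

// Ancestors are the nodes whose interval strictly contains the node's interval.
function ancestorsOf(collection, node) {
  return collection
    .filter(doc => doc.left < node.left && doc.right > node.right)
    .sort(byLeft);
}

const subtreeIds = descendantsOf(nestedSet, nodeById('Cell_Phones_and_Accessories')).map(doc => doc._id);
const pathIds = ancestorsOf(nestedSet, nodeById('Nokia')).map(doc => doc._id);
```

Sorting by left returns descendants in depth-first (pre-order) sequence and ancestors from the root downwards, which is why the path needs no reverse() call here.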
57. Indexes
The recommended index covers the left and right values:
db.categoriesNSO.ensureIndex( { left: 1, right:1 } )
58. Combination of Nested Sets and classic Parent reference with order approach
For each node we store (ID, Parent, Order, left, right).
59. Intro
The left field can also serve as an order field, so we could omit the order field.
On the other hand, we can keep it, so that the Parent Reference with order data can be used to reconstruct the left/right values in case of accidental corruption or, for example, during the initial import.
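Reconstructing left/right from parent and order data amounts to a depth-first traversal with a running counter. The sketch below shows the idea on in-memory documents; it is an illustration under assumptions, not code from the demo files, and the function name `rebuildNestedSet` and the sample order values are hypothetical.

```javascript
// Sample documents that carry only parent and order information.
const docs = [
  { _id: 'Electronics', parent: null, order: 10 },
  { _id: 'Shop_Top_Products', parent: 'Electronics', order: 20 },
  { _id: 'Cameras_and_Photography', parent: 'Electronics', order: 10 },
  { _id: 'Cell_Phones_and_Accessories', parent: 'Electronics', order: 30 },
  { _id: 'Headsets', parent: 'Cell_Phones_and_Accessories', order: 10 }
];

// Hypothetical helper: assign left/right values by walking the tree depth-first.
function rebuildNestedSet(collection, rootId) {
  let counter = 1;
  function visit(id) {
    const node = collection.find(doc => doc._id === id);
    node.left = counter++;            // entering the node
    collection
      .filter(doc => doc.parent === id)
      .sort((a, b) => a.order - b.order)
      .forEach(child => visit(child._id));
    node.right = counter++;           // leaving the node
  }
  visit(rootId);
}

rebuildNestedSet(docs, 'Electronics');
const rebuilt = docs.map(doc => doc._id + '(' + doc.left + ',' + doc.right + ')');
```

Against a real collection, the same walk would issue one children query per node, so it suits occasional repair or import jobs rather than regular requests.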
60. Adding new node
Adding a new node can be adapted from Nested Sets in this manner:
var followingsibling = db.categoriesNSO.findOne({_id:"Cell_Phones_and_Accessories"});
var previoussibling = db.categoriesNSO.findOne({_id:"Shop_Top_Products"});
var neworder = Math.floor((followingsibling.order + previoussibling.order)/2);
var newnode = {_id:'LG', left:followingsibling.left, right:followingsibling.left+1,
    parent:followingsibling.parent, order:neworder};
//make room: shift by 2 every right and left value at or after the new position
db.categoriesNSO.update({right:{$gte:newnode.left}},{$inc:{right:2}}, false, true)
db.categoriesNSO.update({left:{$gte:newnode.left}},{$inc:{left:2}}, false, true)
db.categoriesNSO.insert(newnode)
68. Notes on using code
All files are packaged according to the following naming convention:
● MODELReference.js - initialization file with tree data for the MODEL approach
● MODELReference_operating.js - add/update/move/remove/get children examples
● MODELReference_pathtonode.js - code illustrating how to obtain the path to a node
● MODELReference_nodedescendants.js - code illustrating how to retrieve all descendants of a node
All files are ready to use in the mongo shell. You can run the examples by invoking mongo < file_to_execute or, if you prefer, interactively in the shell or with the RockMongo web shell.