2. Basic Aggregate functions available
Count, Distinct, Group
MongoDB doesn’t support SQL syntax
Aggregation requires building of “pipeline”
Essentially, one step/stage at a time, e.g.:
Step 1: Filter
Step 2: Projection
Step 3: Group
5. > db.restaurants.group( {
... key: { borough: 1 },
... cond: { cuisine: "Bakery"},
... reduce: function(cur, result) { result.count += 1 },
... initial: { count: 0 }
... } );
[
{
"borough" : "Bronx",
"count" : 71
},
{
"borough" : "Manhattan",
"count" : 221
},
{
"borough" : "Brooklyn",
"count" : 173
},
{
"borough" : "Queens",
"count" : 204
},
{
"borough" : "Staten Island",
"count" : 20
},
{
"borough" : "Missing",
"count" : 2
}
]
>
key is equivalent to the group by clause
cond is equivalent to the where clause
reduce function is called for each document in the
collection that passes the condition
reduce function has two parameters: cur and result. cur
stores the current document and result stores the result so
far for that group
In this case result.count simply adds 1 for each document
initial sets the initial value for each group result
7. db.restaurants.aggregate(
[ // bracket indicates an array
{ // first "step" or stage
$group:{ // aggregate operator
_id:'$cuisine', // group by cuisine property
total: {$sum:1} // sum or count each “row”
}
}
]
);
11. db.restaurants.aggregate(
[ // bracket indicates an array
{ // first "step" or stage
$match : { // match operator
borough: "Bronx" // where borough = "Bronx"
}
},
{ // second "step" or stage
$group:{ // aggregate operator
_id:'$cuisine', // group by cuisine property
total: {$sum:1} // sum or count each “row”
}
},
{ // third "step" or stage
$sort: {
total:-1 // sort on total; -1 indicates DESC
}
}
]
);
18. Controls which values are output
> db.restaurants.aggregate(
... [
... {$limit:1},
... {$project: {_id:0, // hide the _id value
… restaurant_id:1, // show restaurant_id
… "restaurant_name":"$name", // rename/alias name to restaurant_name
… "grades.grade":1}} // show grades.grade
... ]).pretty();
{
"grades" : [
{
"grade" : "A" // part of output
},
{
"grade" : "B" // part of output
},
{
"grade" : "A" // part of output
},
{
"grade" : "A" // part of output
}
],
"restaurant_name" : "Wendy'S", // part of output; renamed
"restaurant_id" : "30112340" // part of output
}
>
19. Saves the output of a pipeline to a collection
> db.restaurants.aggregate(
... [
... {$match : {borough: "Bronx"}},
... {$group:{_id:'$cuisine', total: {$sum:1}}},
... {$sort: {total:-1}},
... {$limit: 5 },
... {$out: "top5"} // output data to a collection called top5
... ]
... );
> db.top5.find({}); // retrieve all data from top5
{ "_id" : "American ", "total" : 411 }
{ "_id" : "Chinese", "total" : 323 }
{ "_id" : "Pizza", "total" : 197 }
{ "_id" : "Latin (Cuban, Dominican, Puerto Rican, South & Central
American)", "total" : 187 }
{ "_id" : "Spanish", "total" : 127 }
>
20. Motivation: How many A grades
did a restaurant get?
> db.restaurants.find({_id: ObjectId("5602b9200a67e499361c05ad")}).pretty();
{
"_id" : ObjectId("5602b9200a67e499361c05ad"),
"address" : {
"street" : "Flatbush Avenue",
"zipcode" : "11225",
"building" : "469",
"coord" : [
-73.961704,
40.662942
]
},
"borough" : "Brooklyn",
"cuisine" : "Hamburgers",
"grades" : [ // this is an array of objects
{
"date" : ISODate("2014-12-30T00:00:00Z"),
"grade" : "A", // A grade
"score" : 8
},
{
"grade" : "B", // B grade
"score" : 23,
"date" : ISODate("2014-07-01T00:00:00Z")
},
{
"score" : 12,
"date" : ISODate("2013-04-30T00:00:00Z"),
"grade" : "A"
},
{
"date" : ISODate("2012-05-08T00:00:00Z"),
"grade" : "A",
"score" : 12
}
],
"name" : "Wendy'S",
"restaurant_id" : "30112340"
}
>
Basic pipeline
Stage 1: unwind grades
Stage 2: match grade of
“A”
Stage 3: group by / sum
Stage 4: project (alias)
21. There is only one document for that restaurant_id, but since there were 4 elements in
grades, the unwind operator created 4 documents, one for each grade
Notice the result of the following is four documents with the same restaurant_id
> db.restaurants.aggregate(
... [
... {$unwind: "$grades"}, // unwind the grades array
... {$limit:4}, // limit the output to 4 documents
... {$project: {_id:0, restaurant_id:1, "grades.date":1, "grades.grade":1, "grades.score":1}}
... ]).pretty();
{
"grades" : {
"date" : ISODate("2014-12-30T00:00:00Z"),
"grade" : "A",
"score" : 8
},
"restaurant_id": "30112340"
}
{
"grades" : {
"grade" : "B",
"score" : 23,
"date" : ISODate("2014-07-01T00:00:00Z")
},
"restaurant_id": "30112340"
}
{
"grades" : {
"score" : 12,
"date" : ISODate("2013-04-30T00:00:00Z"),
"grade" : "A"
},
"restaurant_id": "30112340"
}
{
"grades" : {
"date" : ISODate("2012-05-08T00:00:00Z"),
"grade" : "A",
"score" : 12
},
"restaurant_id": "30112340"
}