WHAT'S NEW IN AGGREGATION
  Jeremy Mikola
     @jmikola
AGENDA



State of Aggregation
Pipeline
Expressions
Usage and Limitations
Sharding
Looking Ahead
STATE OF AGGREGATION
We're storing our data in MongoDB.

  We need to do ad-hoc reporting,
grouping, common aggregations, etc.

    What are we using for this?
DATA WAREHOUSING
SQL for reporting and analytics
Infrastructure complications
   Additional maintenance
   Data duplication
   ETL processes
   Real time?
MAPREDUCE
Extremely versatile, powerful
Intended for complex data analysis
Overkill for simple aggregation tasks
   Averages
   Summation
   Grouping
MAPREDUCE IN MONGODB
Implemented with JavaScript
  Single-threaded
  Difficult to debug
Concurrency
  Appearance of parallelism
  Write locks (without inline or jsMode)
And now for something completely different…
AGGREGATION FRAMEWORK
Declared with BSON, executes in C++
Flexible, functional, and simple
   Operation pipeline
   Computational expressions
Plays nice with sharding
PIPELINE
 Process a stream of documents
   Original input is a collection
   Final output is a result document
 Series of operators
   Filter or transform data
   Input/output chain

ps ax | grep mongod | head -n 1
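The input/output chain above can be modelled in a few lines of plain JavaScript. This is only a conceptual sketch: `match`, `project`, `limit`, and `runPipeline` are hypothetical helpers written for this illustration, not MongoDB functions, and the real framework executes its operators in C++.

```javascript
// Conceptual model only: each stage is a function from documents to documents.
// (Hypothetical helper names; this is not MongoDB's implementation.)
const match = (predicate) => (docs) => docs.filter(predicate);
const project = (shape) => (docs) => docs.map(shape);
const limit = (n) => (docs) => docs.slice(0, n);

// Feed the output of each stage into the next, like `a | b | c` in a shell.
const runPipeline = (docs, stages) =>
  stages.reduce((current, stage) => stage(current), docs);

const books = [
  { title: "The Great Gatsby", pages: 218, language: "English" },
  { title: "War and Peace", pages: 1440, language: "Russian" },
  { title: "Atlas Shrugged", pages: 1088, language: "English" },
];

const longTitles = runPipeline(books, [
  match((doc) => doc.pages > 1000),         // filter
  project((doc) => ({ title: doc.title })), // transform
  limit(1),                                 // truncate
]);

console.log(longTitles); // [ { title: 'War and Peace' } ]
```

Each stage sees only what the previous stage emitted, which is why the order of operators matters.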
PIPELINE OPERATORS
      $match
      $project
      $group
      $unwind
      $sort
      $limit
      $skip
OUR EXAMPLE DATA
             Library Books
{_id: 375,
 title: "The Great Gatsby",
 ISBN: "9781857150193",
 available: true,
 pages: 218,
 chapters: 9,
 subjects: [
  "Long Island",
  "New York",
  "1920s"
 ],
 language: "English",
 publisher: {
  city: "London",
  name: "Random House"
 }
}
$MATCH
Filter documents
Uses existing query syntax
No geospatial operations or $where
   $within support coming in 2.4
$MATCH
            Matching field values
{title: "The Great Gatsby",
 pages: 218,
 language: "English"
}

{title: "War and Peace",
 pages: 1440,
 language: "Russian"
}

{title: "Atlas Shrugged",
 pages: 1088,
 language: "English"
}

    ►   {$match: {
         language: "Russian"
        }}

    ▼

{title: "War and Peace",
 pages: 1440,
 language: "Russian"
}
$MATCH
         Matching with query operators
{title: "The Great Gatsby",
 pages: 218,
 language: "English"
}

{title: "War and Peace",
 pages: 1440,
 language: "Russian"
}

{title: "Atlas Shrugged",
 pages: 1088,
 language: "English"
}

    ►   {$match: {
         pages: {$gt: 1000}
        }}

    ▼

{title: "War and Peace",
 pages: 1440,
 language: "Russian"
}

{title: "Atlas Shrugged",
 pages: 1088,
 language: "English"
}
$PROJECT
Reshape documents
Include, exclude or rename fields
Inject computed fields
Manipulate sub-document fields
$PROJECT
         Including and excluding fields
{_id: 375,
 title: "The Great Gatsby",
 ISBN: "9781857150193",
 available: true,
 pages: 218,
 chapters: 9,
 subjects: [
  "Long Island",
  "New York",
  "1920s"
 ],
 language: "English"
}

    ►   {$project: {
         _id: 0,
         title: 1,
         language: 1
        }}

    ▼

{title: "The Great Gatsby",
 language: "English"
}
$PROJECT
         Renaming and computing fields
{_id: 375,
 title: "The Great Gatsby",
 ISBN: "9781857150193",
 available: true,
 pages: 218,
 chapters: 9,
 subjects: [
  "Long Island",
  "New York",
  "1920s"
 ],
 language: "English"
}

    ►   {$project: {
         avgPagesPerChapter: {
          $divide: [
           "$pages",
           "$chapters"
          ]
         },
         lang: "$language"
        }}

    ▼

{_id: 375,
 avgPagesPerChapter: 24.22222222222222,
 lang: "English"
}
$PROJECT
    Creating and extracting sub-document fields
{_id: 375,
 title: "The Great Gatsby",
 ISBN: "9781857150193",
 available: true,
 pages: 218,
 chapters: 9,
 subjects: [
  "Long Island",
  "New York",
  "1920s"
 ],
 publisher: {
  city: "London",
  name: "Random House"
 }
}

    ►   {$project: {
         title: 1,
         stats: {
          pages: "$pages",
          chapters: "$chapters"
         },
         pub_city: "$publisher.city"
        }}

    ▼

{_id: 375,
 title: "The Great Gatsby",
 stats: {
  pages: 218,
  chapters: 9
 },
 pub_city: "London"
}
$GROUP
Group documents by an ID
  Field reference, object, constant
Other output fields are computed
  $max, $min, $avg, $sum
  $addToSet, $push
  $first, $last
Processes all data in memory
$GROUP
            Calculating an average
{title: "The Great Gatsby",
 pages: 218,
 language: "English"
}

{title: "War and Peace",
 pages: 1440,
 language: "Russian"
}

{title: "Atlas Shrugged",
 pages: 1088,
 language: "English"
}

    ►   {$group: {
         _id: "$language",
         avgPages: {$avg: "$pages"}
        }}

    ▼

{_id: "Russian",
 avgPages: 1440
}

{_id: "English",
 avgPages: 653
}
$GROUP
         Summing fields and counting
{title: "The Great Gatsby",
 pages: 218,
 language: "English"
}

{title: "War and Peace",
 pages: 1440,
 language: "Russian"
}

{title: "Atlas Shrugged",
 pages: 1088,
 language: "English"
}

    ►   {$group: {
         _id: "$language",
         numTitles: {$sum: 1},
         sumPages: {$sum: "$pages"}
        }}

    ▼

{_id: "Russian",
 numTitles: 1,
 sumPages: 1440
}

{_id: "English",
 numTitles: 2,
 sumPages: 1306
}
$GROUP
           Collecting distinct values
{title: "The Great Gatsby",
 pages: 218,
 language: "English"
}

{title: "War and Peace",
 pages: 1440,
 language: "Russian"
}

{title: "Atlas Shrugged",
 pages: 1088,
 language: "English"
}

    ►   {$group: {
         _id: "$language",
         titles: {$addToSet: "$title"}
        }}

    ▼

{_id: "Russian",
 titles: ["War and Peace"]
}

{_id: "English",
 titles: [
  "Atlas Shrugged",
  "The Great Gatsby"
 ]
}
$UNWIND
Operate on an array field
Yield new documents for each array element
   Array replaced by element value
   Missing/empty fields → no output
   Non-array fields → error
Pipe to $group to aggregate array values
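The three rules above can be sketched in plain JavaScript. `unwind` here is a hypothetical helper that models the behaviour described on this slide, not the server's code; treating `null` the same as a missing field is an assumption, and documents 376 and 377 are made-up sample data.

```javascript
// Plain-JavaScript model of the $unwind rules listed above (hypothetical helper).
function unwind(docs, field) {
  const out = [];
  for (const doc of docs) {
    const value = doc[field];
    // Missing field: no output. (Treating null as missing is an assumption.)
    if (value === undefined || value === null) continue;
    // Non-array field: error.
    if (!Array.isArray(value)) {
      throw new TypeError(`$unwind: "${field}" is not an array`);
    }
    // One output document per element; the array is replaced by the element.
    // An empty array never enters the loop body, so it yields no output.
    for (const element of value) {
      out.push({ ...doc, [field]: element });
    }
  }
  return out;
}

const books = [
  { _id: 375, title: "The Great Gatsby", subjects: ["Long Island", "New York", "1920s"] },
  { _id: 376, title: "Made-up book with no subjects", subjects: [] },
  { _id: 377, title: "Made-up book without the field" },
];

const unwound = unwind(books, "subjects");
console.log(unwound.map((doc) => doc.subjects)); // [ 'Long Island', 'New York', '1920s' ]

// The equivalent pipeline for counting books per subject:
const booksPerSubject = [
  { $unwind: "$subjects" },
  { $group: { _id: "$subjects", n: { $sum: 1 } } },
];
```

In the mongo shell, assuming a `books` collection, the last value would be passed as `db.books.aggregate(booksPerSubject)`.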
$UNWIND
      Yielding multiple documents from one
{_id: 375,
 title: "The Great Gatsby",
 subjects: [
  "Long Island",
  "New York",
  "1920s"
 ]
}

    ►   {$unwind: "$subjects"}

    ▼

{_id: 375,
 title: "The Great Gatsby",
 subjects: "Long Island"
}

{_id: 375,
 title: "The Great Gatsby",
 subjects: "New York"
}

{_id: 375,
 title: "The Great Gatsby",
 subjects: "1920s"
}
$SORT, $LIMIT, $SKIP
Sort documents by one or more fields
  Same order syntax as cursors
  Waits for earlier pipeline operator to return
  In-memory unless early and indexed
Limit and skip follow cursor behavior
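Because $skip and $limit behave like their cursor counterparts, paging is a matter of ordering the three operators. A minimal sketch, assuming a `books` collection with a `title` field; `pageStages` is a hypothetical helper, not part of any driver:

```javascript
// Hypothetical helper: build the stages for one page of title-sorted results.
function pageStages(pageNumber, pageSize) {
  return [
    { $sort: { title: 1 } },                 // ascending, same syntax as cursor.sort()
    { $skip: (pageNumber - 1) * pageSize },  // drop the earlier pages
    { $limit: pageSize },                    // keep a single page
  ];
}

const secondPage = pageStages(2, 3);
console.log(JSON.stringify(secondPage));
// [{"$sort":{"title":1}},{"$skip":3},{"$limit":3}]

// In the mongo shell: db.books.aggregate(secondPage)
```

$skip must come before $limit here; reversing them would cut the stream to one page first and then skip past all of it.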
$SORT
        Sort all documents in the pipeline
{title: "The Great Gatsby"}
{title: "Brave New World"}
{title: "The Grapes of Wrath"}
{title: "Animal Farm"}
{title: "Lord of the Flies"}
{title: "Fathers and Sons"}
{title: "Invisible Man"}
{title: "Fahrenheit 451"}

    ►   {$sort: {title: 1}}

    ▼

{title: "Animal Farm"}
{title: "Brave New World"}
{title: "Fahrenheit 451"}
{title: "Fathers and Sons"}
{title: "Invisible Man"}
{title: "Lord of the Flies"}
{title: "The Grapes of Wrath"}
{title: "The Great Gatsby"}
$LIMIT
      Limit documents through the pipeline
{title: "Animal Farm"}
{title: "Brave New World"}
{title: "Fahrenheit 451"}
{title: "Fathers and Sons"}
{title: "Invisible Man"}
{title: "Lord of the Flies"}
{title: "The Grapes of Wrath"}
{title: "The Great Gatsby"}

    ►   {$limit: 5}

    ▼

{title: "Animal Farm"}
{title: "Brave New World"}
{title: "Fahrenheit 451"}
{title: "Fathers and Sons"}
{title: "Invisible Man"}
$SKIP
       Skip over documents in the pipeline
{title: "Animal Farm"}
{title: "Brave New World"}
{title: "Fahrenheit 451"}
{title: "Fathers and Sons"}
{title: "Invisible Man"}

    ►   {$skip: 2}

    ▼

{title: "Fahrenheit 451"}
{title: "Fathers and Sons"}
{title: "Invisible Man"}
EXPRESSIONS
Return computed values
Used with $project and $group
Reference fields using $ (e.g. "$x")
Expressions may be nested
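A short sketch of nesting, using fields from the library books example. The output field names `size` and `avgPagesPerChapter` and the 1000-page threshold are chosen for illustration only.

```javascript
// A $project stage whose computed fields are built from nested expressions.
const labelBySize = {
  $project: {
    title: 1,
    // $cond takes [condition, then, else]; the condition is itself a $gt expression.
    size: { $cond: [{ $gt: ["$pages", 1000] }, "long", "short"] },
    // Field references ("$pages", "$chapters") are the innermost expressions.
    avgPagesPerChapter: { $divide: ["$pages", "$chapters"] },
  },
};

console.log(JSON.stringify(labelBySize.$project.size));
// {"$cond":[{"$gt":["$pages",1000]},"long","short"]}

// In the mongo shell, assuming a `books` collection: db.books.aggregate([labelBySize])
```

Strings that start with `$` are read as field references; "long" and "short" are plain literal values.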
EXPRESSIONS
Logic                    Comparison
  $and, $or, $not…         $cmp, $eq, $gt…

Arithmetic               String
  $add, $divide…           $strcasecmp, $substr…

Date                     Conditional
  $year, $dayOfMonth…      $cond, $ifNull…
USAGE
aggregate database command
collection.aggregate() method
  Mongo shell
  Most drivers
COLLECTION METHOD
db.books.aggregate([
 {$sort: {created: 1}},
 {$unwind: "$subjects"},
 {$group: {_id: "$subjects", n: {$sum: 1},
           fc: {$first: "$created"}}},
 {$project: {_id: 1, n: 1, fc: {$year: "$fc"}}}
]);


                 ▼
{
 result: [
  {"_id": "Fantasy", "n": 6, "fc": 2008},
  {"_id": "Historical", "n": 7, "fc": 2012},
  {"_id": "World Literature", "n": 2, "fc": 2009},
  // Other results follow…
 ],
 ok: 1
}
DATABASE COMMAND
db.runCommand({aggregate: "books", pipeline: [
 {$sort: {created: 1}},
 {$unwind: "$subjects"},
 {$group: {_id: "$subjects", n: {$sum: 1},
           fc: {$first: "$created"}}},
 {$project: {_id: 1, n: 1, fc: {$year: "$fc"}}}
]});


                 ▼
{
 result: [
  {"_id": "Fantasy", "n": 6, "fc": 2008},
  {"_id": "Historical", "n": 7, "fc": 2012},
  {"_id": "World Literature", "n": 2, "fc": 2009},
  // Other results follow…
 ],
 ok: 1
}
LIMITATIONS
Result limited by BSON document size
   Final command result
   Intermediate shard results
Pipeline operator memory limits
SHARDING
Split the pipeline at first $group or $sort
  Shards execute pipeline up to that point
  mongos merges results and continues
Early $match may excuse shards
CPU and memory implications for mongos
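Why a split $group needs a second pass on mongos can be shown with an average: each shard can only produce partial sums and counts, and the merge step has to combine them before dividing. This is a conceptual sketch in plain JavaScript, not MongoDB internals; `groupOnShard` and `mergeOnMongos` are hypothetical names, and the placement of documents on shards is made up.

```javascript
// Phase 1, on each shard: reduce local documents to a partial sum and count per key.
function groupOnShard(docs) {
  const partial = new Map();
  for (const { language, pages } of docs) {
    const acc = partial.get(language) || { sum: 0, count: 0 };
    acc.sum += pages;
    acc.count += 1;
    partial.set(language, acc);
  }
  return partial;
}

// Phase 2, on mongos: merge the partial results, then finish the average.
function mergeOnMongos(partials) {
  const merged = new Map();
  for (const partial of partials) {
    for (const [key, { sum, count }] of partial) {
      const acc = merged.get(key) || { sum: 0, count: 0 };
      acc.sum += sum;
      acc.count += count;
      merged.set(key, acc);
    }
  }
  return [...merged].map(([key, { sum, count }]) => ({
    _id: key,
    avgPages: sum / count,
  }));
}

// Made-up distribution of the example books across two shards.
const shard1 = [
  { language: "English", pages: 218 },
  { language: "Russian", pages: 1440 },
];
const shard2 = [{ language: "English", pages: 1088 }];

const result = mergeOnMongos([groupOnShard(shard1), groupOnShard(shard2)]);
console.log(result);
// [ { _id: 'English', avgPages: 653 }, { _id: 'Russian', avgPages: 1440 } ]
```

Averaging the per-shard averages instead would give the wrong answer whenever shards hold different numbers of documents for a key, which is one reason the merge step carries real CPU and memory cost.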
SHARDING
[
 {$match:   {/* filter by shard key */}},
 {$group:   {/* group by some field */}},
 {$sort:    {/* sort by some field  */}},
 {$project: {/* reshape result      */}}
]
SHARDING
shard1        shard2        shard3
$match        $match
$group1       $group1

         ↘       ↓
              mongos
              $group2
              $sort
              $project

                 ↓
              Result
LOOKING AHEAD
FRAMEWORK USE CASES
  Basic aggregation queries
  Ad-hoc reporting
  Real-time analytics
  Visualizing time series data
EXTENDING THE FRAMEWORK
Adding new pipeline operators, expressions
  $within expression for $match
  $geoNear pipeline operator
  $out operator for output control
FUTURE ENHANCEMENTS
Improved handling of null values
Optimizing $match position
Pipeline explain facility
Support BSON binary, code, etc.
Memory usage improvements
  Grouping input sorted by _id
  Sorting with limited output (top k)
ENABLING DEVELOPERS
Doing more within MongoDB, faster
Refactoring MapReduce and groupings
  Replace pages of JavaScript
  Longer aggregation pipelines
Quick aggregations from the shell
THANKS!
 QUESTIONS?
PHOTO CREDITS
http://dilbert.com/strips/comic/2012-09-05
http://img.timeinc.net/time/photoessays/2009/monty_python/monty_python_02.jpg

Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 

Webinar: What's New in Aggregation

  • 1. WHAT'S NEW IN AGGREGATION. Jeremy Mikola (@jmikola)
  • 2. AGENDA: State of Aggregation; Pipeline; Expressions; Usage and Limitations; Sharding; Looking Ahead
  • 3. STATE OF AGGREGATION: We're storing our data in MongoDB. We need to do ad-hoc reporting, grouping, common aggregations, etc. What are we using for this?
  • 4. DATA WAREHOUSING: SQL for reporting and analytics. Infrastructure complications: additional maintenance, data duplication, ETL processes, real time?
  • 5. MAPREDUCE: Extremely versatile, powerful. Intended for complex data analysis. Overkill for simple aggregation tasks: averages, summation, grouping.
  • 6. MAPREDUCE IN MONGODB: Implemented with JavaScript (single-threaded, difficult to debug). Concurrency: appearance of parallelism, write locks (without inline or jsMode).
  • 7. And now for something completely different…
  • 8. AGGREGATION FRAMEWORK: Declared with BSON, executes in C++. Flexible, functional, and simple (operation pipeline, computational expressions). Plays nice with sharding.
  • 9. PIPELINE (section divider). Agenda: State of Aggregation; Pipeline; Expressions; Usage and Limitations; Sharding; Looking Ahead
  • 10. PIPELINE: Process a stream of documents (original input is a collection, final output is a result document). Series of operators (filter or transform data, input/output chain). Shell analogy: ps ax | grep mongod | head -n 1
  • 11. PIPELINE OPERATORS: $match, $project, $group, $unwind, $sort, $limit, $skip
  • 12. OUR EXAMPLE DATA: Library Books. {_id: 375, title: "The Great Gatsby", ISBN: "9781857150193", available: true, pages: 218, chapters: 9, subjects: ["Long Island", "New York", "1920s"], language: "English", publisher: {city: "London", name: "Random House"}}
  • 13. $MATCH: Filter documents. Uses existing query syntax. No geospatial operations or $where ($within support coming in 2.4).
  • 14. $MATCH: Matching field values. Input: {title: "The Great Gatsby", pages: 218, language: "English"}, {title: "War and Peace", pages: 1440, language: "Russian"}, {title: "Atlas Shrugged", pages: 1088, language: "English"}. Stage: {$match: {language: "Russian"}}. Output: {title: "War and Peace", pages: 1440, language: "Russian"}
  • 15. $MATCH: Matching with query operators. Input: {title: "The Great Gatsby", pages: 218, language: "English"}, {title: "War and Peace", pages: 1440, language: "Russian"}, {title: "Atlas Shrugged", pages: 1088, language: "English"}. Stage: {$match: {pages: {$gt: 1000}}}. Output: {title: "War and Peace", pages: 1440, language: "Russian"}, {title: "Atlas Shrugged", pages: 1088, language: "English"}
  • 16. $PROJECT: Reshape documents. Include, exclude or rename fields. Inject computed fields. Manipulate sub-document fields.
  • 17. $PROJECT: Including and excluding fields. Input: {_id: 375, title: "The Great Gatsby", ISBN: "9781857150193", available: true, pages: 218, chapters: 9, subjects: ["Long Island", "New York", "1920s"], language: "English"}. Stage: {$project: {_id: 0, title: 1, language: 1}}. Output: {title: "The Great Gatsby", language: "English"}
  • 18. $PROJECT: Renaming and computing fields. Input: {_id: 375, title: "The Great Gatsby", ISBN: "9781857150193", available: true, pages: 218, chapters: 9, subjects: ["Long Island", "New York", "1920s"], language: "English"}. Stage: {$project: {avgPagesPerChapter: {$divide: ["$pages", "$chapters"]}, lang: "$language"}}. Output: {_id: 375, avgPagesPerChapter: 24.22222222222222, lang: "English"}
  • 19. $PROJECT: Creating and extracting sub-document fields. Input: {_id: 375, title: "The Great Gatsby", ISBN: "9781857150193", available: true, pages: 218, chapters: 9, subjects: ["Long Island", "New York", "1920s"], publisher: {city: "London", name: "Random House"}}. Stage: {$project: {title: 1, stats: {pages: "$pages", chapters: "$chapters"}, pub_city: "$publisher.city"}}. Output: {_id: 375, title: "The Great Gatsby", stats: {pages: 218, chapters: 9}, pub_city: "London"}
  • 20. $GROUP: Group documents by an ID (field reference, object, constant). Other output fields are computed: $max, $min, $avg, $sum; $addToSet, $push; $first, $last. Processes all data in memory.
  • 21. $GROUP: Calculating an average. Input: {title: "The Great Gatsby", pages: 218, language: "English"}, {title: "War and Peace", pages: 1440, language: "Russian"}, {title: "Atlas Shrugged", pages: 1088, language: "English"}. Stage: {$group: {_id: "$language", avgPages: {$avg: "$pages"}}}. Output: {_id: "Russian", avgPages: 1440}, {_id: "English", avgPages: 653}
  • 22. $GROUP: Summing fields and counting. Input: {title: "The Great Gatsby", pages: 218, language: "English"}, {title: "War and Peace", pages: 1440, language: "Russian"}, {title: "Atlas Shrugged", pages: 1088, language: "English"}. Stage: {$group: {_id: "$language", numTitles: {$sum: 1}, sumPages: {$sum: "$pages"}}}. Output: {_id: "Russian", numTitles: 1, sumPages: 1440}, {_id: "English", numTitles: 2, sumPages: 1306}
  • 23. $GROUP: Collecting distinct values. Input: {title: "The Great Gatsby", pages: 218, language: "English"}, {title: "War and Peace", pages: 1440, language: "Russian"}, {title: "Atlas Shrugged", pages: 1088, language: "English"}. Stage: {$group: {_id: "$language", titles: {$addToSet: "$title"}}}. Output: {_id: "Russian", titles: ["War and Peace"]}, {_id: "English", titles: ["Atlas Shrugged", "The Great Gatsby"]}
  • 24. $UNWIND: Operate on an array field. Yield new documents for each array element. Array replaced by element value. Missing/empty fields → no output. Non-array fields → error. Pipe to $group to aggregate array values.
  • 25. $UNWIND: Yielding multiple documents from one. Input: {_id: 375, title: "The Great Gatsby", subjects: ["Long Island", "New York", "1920s"]}. Stage: {$unwind: "$subjects"}. Output: {_id: 375, title: "The Great Gatsby", subjects: "Long Island"}, {_id: 375, title: "The Great Gatsby", subjects: "New York"}, {_id: 375, title: "The Great Gatsby", subjects: "1920s"}
  • 26. $SORT, $LIMIT, $SKIP: Sort documents by one or more fields. Same order syntax as cursors. Waits for earlier pipeline operator to return. In-memory unless early and indexed. Limit and skip follow cursor behavior.
  • 27. $SORT: Sort all documents in the pipeline. Input: {title: "The Great Gatsby"}, {title: "Brave New World"}, {title: "The Grapes of Wrath"}, {title: "Animal Farm"}, {title: "Lord of the Flies"}, {title: "Fathers and Sons"}, {title: "Invisible Man"}, {title: "Fahrenheit 451"}. Stage: {$sort: {title: 1}}. Output: {title: "Animal Farm"}, {title: "Brave New World"}, {title: "Fahrenheit 451"}, {title: "Fathers and Sons"}, {title: "Invisible Man"}, {title: "Lord of the Flies"}, {title: "The Grapes of Wrath"}, {title: "The Great Gatsby"}
  • 28. $LIMIT: Limit documents through the pipeline. Input: {title: "Animal Farm"}, {title: "Brave New World"}, {title: "Fahrenheit 451"}, {title: "Fathers and Sons"}, {title: "Invisible Man"}, {title: "Lord of the Flies"}, {title: "The Grapes of Wrath"}, {title: "The Great Gatsby"}. Stage: {$limit: 5}. Output: {title: "Animal Farm"}, {title: "Brave New World"}, {title: "Fahrenheit 451"}, {title: "Fathers and Sons"}, {title: "Invisible Man"}
  • 29. $SKIP: Skip over documents in the pipeline. Input: {title: "Animal Farm"}, {title: "Brave New World"}, {title: "Fahrenheit 451"}, {title: "Fathers and Sons"}, {title: "Invisible Man"}. Stage: {$skip: 2}. Output: {title: "Fahrenheit 451"}, {title: "Fathers and Sons"}, {title: "Invisible Man"}
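
The stage-by-stage slides above can be mimicked in plain JavaScript to make the input/output chain concrete. This is only an in-memory sketch that runs in Node without a server: the helper names (`match`, `unwind`, `groupSum`, `sortBy`, `limit`, `skip`, `runPipeline`) are hypothetical stand-ins, not MongoDB APIs, and `match` handles simple equality conditions only.

```javascript
// Sample documents taken from the slides.
const books = [
  { title: "The Great Gatsby", pages: 218, language: "English" },
  { title: "War and Peace", pages: 1440, language: "Russian" },
  { title: "Atlas Shrugged", pages: 1088, language: "English" },
];

// Stand-in for $match: keep documents whose fields equal the given values.
const match = (cond) => (docs) =>
  docs.filter((d) => Object.entries(cond).every(([k, v]) => d[k] === v));

// Stand-in for $unwind: one output document per array element.
const unwind = (field) => (docs) =>
  docs.flatMap((d) => {
    const value = d[field];
    if (value === undefined || value === null) return []; // missing field: no output
    if (!Array.isArray(value)) throw new Error(`${field} is not an array`);
    return value.map((element) => ({ ...d, [field]: element })); // empty array: no output
  });

// Stand-in for $group with {$sum: 1} and {$sum: "$<sumField>"} accumulators.
const groupSum = (keyField, sumField) => (docs) => {
  const groups = new Map();
  for (const d of docs) {
    const g = groups.get(d[keyField]) || { _id: d[keyField], numTitles: 0, sumPages: 0 };
    g.numTitles += 1;
    g.sumPages += d[sumField];
    groups.set(d[keyField], g);
  }
  return [...groups.values()];
};

// Stand-ins for $sort (ascending on one field), $limit and $skip.
const sortBy = (field) => (docs) =>
  [...docs].sort((a, b) => (a[field] < b[field] ? -1 : a[field] > b[field] ? 1 : 0));
const limit = (n) => (docs) => docs.slice(0, n);
const skip = (n) => (docs) => docs.slice(n);

// Each stage consumes the previous stage's output.
const runPipeline = (docs, stages) => stages.reduce((acc, stage) => stage(acc), docs);

console.log(runPipeline(books, [groupSum("language", "pages"), sortBy("_id")]));
```

Modelling the pipeline as a left-to-right reduce over stage functions mirrors the slides' point that each operator only sees what the previous one emitted.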
  • 30. EXPRESSIONS (section divider). Agenda: State of Aggregation; Pipeline; Expressions; Usage and Limitations; Sharding; Looking Ahead
  • 31. EXPRESSIONS: Return computed values. Used with $project and $group. Reference fields using $ (e.g. "$x"). Expressions may be nested.
  • 32. EXPRESSIONS: Logic: $and, $or, $not… Comparison: $cmp, $eq, $gt… Arithmetic: $add, $divide… String: $strcasecmp, $substr… Date: $year, $dayOfMonth… Conditional: $cond, $ifNull…
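
To illustrate how "$field" references and nested expressions compose, here is a small evaluator written in plain JavaScript. It is a hypothetical sketch covering only a handful of the operators named on the slide; the function name `evaluate` and its eager evaluation of every `$cond` branch are assumptions of this example and say nothing about how the server implements expressions.

```javascript
// Evaluate a tiny subset of aggregation expressions against one document.
function evaluate(expr, doc) {
  if (typeof expr === "string" && expr.startsWith("$")) {
    // Field reference; dotted paths such as "$publisher.city" walk sub-documents.
    return expr
      .slice(1)
      .split(".")
      .reduce((value, key) => (value == null ? undefined : value[key]), doc);
  }
  if (expr === null || typeof expr !== "object" || Array.isArray(expr)) {
    return expr; // anything else is treated as a literal
  }
  // Operator expression: {$op: [arg, ...]} with arguments evaluated recursively.
  const [op, args] = Object.entries(expr)[0];
  const vals = (Array.isArray(args) ? args : [args]).map((a) => evaluate(a, doc));
  switch (op) {
    case "$add": return vals.reduce((sum, v) => sum + v, 0);
    case "$divide": return vals[0] / vals[1];
    case "$gt": return vals[0] > vals[1];
    case "$ifNull": return vals[0] == null ? vals[1] : vals[0];
    case "$cond": return vals[0] ? vals[1] : vals[2];
    default: throw new Error(`unsupported operator: ${op}`);
  }
}

const book = {
  title: "The Great Gatsby",
  pages: 218,
  chapters: 9,
  publisher: { city: "London", name: "Random House" },
};

console.log(evaluate({ $divide: ["$pages", "$chapters"] }, book));
console.log(evaluate({ $cond: [{ $gt: ["$pages", 1000] }, "long", "short"] }, book));
console.log(evaluate("$publisher.city", book));
```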
  • 33. USAGE (section divider). Agenda: State of Aggregation; Pipeline; Expressions; Usage and Limitations; Sharding; Looking Ahead
  • 34. USAGE: aggregate database command. collection.aggregate() method. Mongo shell. Most drivers.
  • 35. COLLECTION METHOD: db.books.aggregate([ {$sort: {created: 1}}, {$unwind: "$subjects"}, {$group: {_id: "$subjects", n: {$sum: 1}, fc: {$first: "$created"}}}, {$project: {_id: 1, n: 1, fc: {$year: "$fc"}}} ]); Output: {result: [ {"_id": "Fantasy", "n": 6, "fc": 2008}, {"_id": "Historical", "n": 7, "fc": 2012}, {"_id": "World Literature", "n": 2, "fc": 2009}, … (other results follow) ], ok: 1}
  • 36. DATABASE COMMAND: db.runCommand({aggregate: "books", pipeline: [ {$sort: {created: 1}}, {$unwind: "$subjects"}, {$group: {_id: "$subjects", n: {$sum: 1}, fc: {$first: "$created"}}}, {$project: {_id: 1, n: 1, fc: {$year: "$fc"}}} ]}); Output: {result: [ {"_id": "Fantasy", "n": 6, "fc": 2008}, {"_id": "Historical", "n": 7, "fc": 2012}, {"_id": "World Literature", "n": 2, "fc": 2009}, … (other results follow) ], ok: 1}
  • 37. LIMITATIONS: Result limited by BSON document size (final command result, intermediate shard results). Pipeline operator memory limits.
  • 38. SHARDING (section divider). Agenda: State of Aggregation; Pipeline; Expressions; Usage and Limitations; Sharding; Looking Ahead
  • 39. SHARDING: Split the pipeline at first $group or $sort. Shards execute pipeline up to that point. mongos merges results and continues. Early $match may excuse shards. CPU and memory implications for mongos.
  • 41. SHARDING (diagram): shard1, shard2, shard3. $match and $group1 run on the shards (shown on two of them); their results flow to mongos, which runs $group2, $sort, $project and returns the Result.
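
The split described above (a first $group on each shard, a second on mongos) can be sketched in plain JavaScript for an average. The slides do not say how accumulators are divided between the two phases; carrying a partial sum and count per group is this example's own assumption, and `partialGroup` and `mergeGroups` are hypothetical names.

```javascript
// Documents as they might be spread across two shards (sample data from the slides).
const shards = [
  [
    { language: "English", pages: 218 },
    { language: "Russian", pages: 1440 },
  ],
  [{ language: "English", pages: 1088 }],
];

// Shard side ("$group1"): keep sum and count so averages can be merged later.
function partialGroup(docs) {
  const acc = {};
  for (const { language, pages } of docs) {
    acc[language] = acc[language] || { sum: 0, count: 0 };
    acc[language].sum += pages;
    acc[language].count += 1;
  }
  return acc;
}

// mongos side ("$group2" then a sort): combine the partials and finish the average.
function mergeGroups(partials) {
  const merged = {};
  for (const partial of partials) {
    for (const [key, { sum, count }] of Object.entries(partial)) {
      merged[key] = merged[key] || { sum: 0, count: 0 };
      merged[key].sum += sum;
      merged[key].count += count;
    }
  }
  return Object.entries(merged)
    .map(([key, { sum, count }]) => ({ _id: key, avgPages: sum / count }))
    .sort((a, b) => (a._id < b._id ? -1 : a._id > b._id ? 1 : 0));
}

const result = mergeGroups(shards.map(partialGroup));
console.log(result);
```

Averaging the per-shard averages directly would weight shards rather than documents, which is why this sketch ships sums and counts to the merge step instead.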
  • 42. LOOKING AHEAD (section divider). Agenda: State of Aggregation; Pipeline; Expressions; Usage and Limitations; Sharding; Looking Ahead
  • 43. FRAMEWORK USE CASES: Basic aggregation queries. Ad-hoc reporting. Real-time analytics. Visualizing time series data.
  • 44. EXTENDING THE FRAMEWORK: Adding new pipeline operators, expressions. $within expression for $match. $geoNear pipeline operator. $out operator for output control.
  • 45. FUTURE ENHANCEMENTS: Improved handling of null values. Optimizing $match position. Pipeline explain facility. Support BSON binary, code, etc. Memory usage improvements. Grouping input sorted by _id. Sorting with limited output (top k).
  • 46. ENABLING DEVELOPERS: Doing more within MongoDB, faster. Refactoring MapReduce and groupings. Replace pages of JavaScript. Longer aggregation pipelines. Quick aggregations from the shell.