SlideShare uma empresa Scribd logo
1 de 62
Baixar para ler offline
Data Processing and Aggregation
Achille Brighton
Consulting Engineer, MongoDB
Big Data
Exponential Data Growth
Billions of URLs indexed by Google
1200
1000
800
600
400
200
0
2000

2002

2004

2006

2008
For over a decade

Big Data == Custom Software
In the past few years
Open source software has
emerged enabling the rest of
us to handle Big Data
How MongoDB Meets Our Requirements
•  MongoDB is an operational database
•  MongoDB provides high performance for storage and

retrieval at large scale
•  MongoDB has a robust query interface permitting

intelligent operations
•  MongoDB is not a data processing engine, but provides

processing functionality
MongoDB data processing options
Getting Example Data
The “hello world” of
MapReduce is counting words
in a paragraph of text.
Let’s try something a little more
interesting…
What is the most popular pub name?
Open Street Map Data
#!/usr/bin/env python
# Data Source
# http://www.overpass-api.de/api/xapi?*[amenity=pub][bbox=-10.5,49.78,1.78,59]
import re
import sys
from imposm.parser import OSMParser
import pymongo
class Handler(object):
def nodes(self, nodes):
if not nodes:
return
docs = []
for node in nodes:
osm_id, doc, (lon, lat) = node
if "name" not in doc:
node_points[osm_id] = (lon, lat)
continue
doc["name"] = doc["name"].title().lstrip("The ").replace("And", "&")
doc["_id"] = osm_id
doc["location"] = {"type": "Point", "coordinates": [lon, lat]}
docs.append(doc)
collection.insert(docs)
Example Pub Data
{
"_id" : 451152,
"amenity" : "pub",
"name" : "The Dignity",
"addr:housenumber" : "363",
"addr:street" : "Regents Park Road",
"addr:city" : "London",
"addr:postcode" : "N3 1DH",
"toilets" : "yes",
"toilets:access" : "customers",
"location" : {
"type" : "Point",
"coordinates" : [-0.1945732, 51.6008172]
}
}
MongoDB MapReduce
• 

map
MongoDB

reduce
finalize
MongoDB MapReduce
• 
map

Map Function
MongoDB

> var map = function() {
emit(this.name, 1);

reduce

finalize
map

Reduce Function
MongoDB

> var reduce = function (key, values) {
var sum = 0;
values.forEach( function (val) {sum += val;} );
return sum;
}

reduce

finalize
Results
> db.pub_names.find().sort({value: -1}).limit(10)
{ "_id" : "The Red Lion", "value" : 407 }
{ "_id" : "The Royal Oak", "value" : 328 }
{ "_id" : "The Crown", "value" : 242 }
{ "_id" : "The White Hart", "value" : 214 }
{ "_id" : "The White Horse", "value" : 200 }
{ "_id" : "The New Inn", "value" : 187 }
{ "_id" : "The Plough", "value" : 185 }
{ "_id" : "The Rose & Crown", "value" : 164 }
{ "_id" : "The Wheatsheaf", "value" : 147 }
{ "_id" : "The Swan", "value" : 140 }
Pub Names in the Center of London
> db.pubs.mapReduce(map, reduce, { out: "pub_names",
query: {
location: {
$within: { $centerSphere: [[-0.12, 51.516], 2 / 3959] }
}}
})
{
"result" : "pub_names",
"timeMillis" : 116,
"counts" : {
"input" : 643,
"emit" : 643,
"reduce" : 54,
"output" : 537
},
"ok" : 1,
}
Results
> db.pub_names.find().sort({value: -1}).limit(10)
{
{
{
{
{
{
{
{
{
{

"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"

:
:
:
:
:
:
:
:
:
:

"All Bar One", "value" : 11 }
"The Slug & Lettuce", "value" : 7 }
"The Coach & Horses", "value" : 6 }
"The Green Man", "value" : 5 }
"The Kings Arms", "value" : 5 }
"The Red Lion", "value" : 5 }
"Corney & Barrow", "value" : 4 }
"O'Neills", "value" : 4 }
"Pitcher & Piano", "value" : 4 }
"The Crown", "value" : 4 }
Double Checking
MongoDB MapReduce
•  Real-time
•  Output directly to document or collection
•  Runs inside MongoDB on local data

− Adds load to your DB
− In Javascript – debugging can be a challenge
− Translating in and out of C++
Aggregation Framework
•  Declared in JSON, executes in C++

Aggregation Framework
Data Processing in MongoDB
•  Declared in JSON, executes in C++
•  Flexible, functional, and simple

Aggregation Framework
Data Processing in MongoDB
•  Declared in JSON, executes in C++
•  Flexible, functional, and simple
•  Plays nice with sharding

Aggregation Framework
Data Processing in MongoDB
Pipeline
Piping command line operations

ps ax | grep mongod | head 1

Data Processing in MongoDB
Pipeline
Piping aggregation operations

$match | $group | $sort
Stream of documents

Result document

Data Processing in MongoDB
Pipeline Operators
•  $match

•  $sort

•  $project

•  $limit

•  $group

•  $skip

•  $unwind

•  $geoNear

Data Processing in MongoDB
$match
•  Filter documents
•  Uses existing query syntax
•  If using $geoNear it has to be first in pipeline
•  $where is not supported
Matching Field Values
{
"_id" : 271421,
"amenity" : "pub",
"name" : "Sir Walter Tyrrell",
"location" : {
"type" : "Point",
"coordinates" : [
-1.6192422,
50.9131996
]
}
}

{ "$match": {
"name": "The Red Lion"
}}

{
"_id" : 271466,
"amenity" : "pub",
"name" : "The Red Lion",
"location" : {
"type" : "Point",
"coordinates" : [
-1.5494749,
50.7837119
]}

{
"_id" : 271466,
"amenity" : "pub",
"name" : "The Red Lion",
"location" : {
"type" : "Point",
"coordinates" : [
-1.5494749,
50.7837119
]
}

}
$project
•  Reshape documents
•  Include, exclude or rename fields
•  Inject computed fields
•  Create sub-document fields
Including and Excluding Fields
{
"_id" : 271466,
"amenity" : "pub",
"name" : "The Red Lion",
"location" : {
"type" : "Point",

{ “$project”: {
“_id”: 0,
“amenity”: 1,
“name”: 1,
}}

"coordinates" : [
-1.5494749,
50.7837119
]
}
}

{
“amenity” : “pub”,
“name” : “The Red Lion”
}
Reformatting Documents
{
"_id" : 271466,
"amenity" : "pub",
"name" : "The Red Lion",
"location" : {
"type" : "Point",

{ “$project”: {
“_id”: 0,
“name”: 1,
“meta”: {
“type”: “$amenity”}
}}

"coordinates" : [
-1.5494749,
50.7837119
]
}
}

{
“name” : “The Red Lion”
“meta” : {
“type” : “pub”
}}
$group
•  Group documents by an ID
•  Field reference, object, constant
•  Other output fields are computed

$max, $min, $avg, $sum
$addToSet, $push $first, $last
•  Processes all data in memory
Summating fields

}

{ $group: {
_id: "$language",
numTitles: { $sum: 1 },
sumPages: { $sum: "$pages" }
}}

{

{

{
title: "The Great Gatsby",
pages: 218,
language: "English"

title: "War and Peace",
pages: 1440,
language: "Russian”
}

}

{

_id: "Russian",
numTitles: 1,
sumPages: 1440

{
title: "Atlas Shrugged",
pages: 1088,
language: "English"

}

}

_id: "English",
numTitles: 2,
sumPages: 1306
Add To Set
{
title: "The Great Gatsby",
pages: 218,
language: "English"

{ $group: {
_id: "$language",
titles: { $addToSet: "$title" }
}}

}

{
{
title: "War and Peace",
pages: 1440,
language: "Russian"

}
{

}
{
title: "Atlas Shrugged",
pages: 1088,
language: "English"
}

}

_id: "Russian",
titles: [ "War and Peace" ]

_id: "English",
titles: [
"Atlas Shrugged",
"The Great Gatsby"
]
Expanding Arrays
{ $unwind: "$subjects" }

{
title: "The Great Gatsby",
ISBN: "9781857150193",
subjects: [
"Long Island",
"New York",
"1920s"
]

{

}
{

}

}
{

}

title: "The Great Gatsby",
ISBN: "9781857150193",
subjects: "Long Island"
title: "The Great Gatsby",
ISBN: "9781857150193",
subjects: "New York"
title: "The Great Gatsby",
ISBN: "9781857150193",
subjects: "1920s"
Back to the pub!

• 

http://www.offwestend.com/index.php/theatres/pastshows/71
Popular Pub Names
>var popular_pub_names = [
{ $match : location:
{ $within: { $centerSphere:
[[-0.12, 51.516], 2 / 3959]}}}
},
{ $group :
{ _id: “$name”
value: {$sum: 1} }
},
{ $sort : {value: -1} },
{ $limit : 10 }
Results
> db.pubs.aggregate(popular_pub_names)
{
"result" : [
{ "_id" : "All Bar One", "value" : 11 }
{ "_id" : "The Slug & Lettuce", "value" : 7 }
{ "_id" : "The Coach & Horses", "value" : 6 }
{ "_id" : "The Green Man", "value" : 5 }
{ "_id" : "The Kings Arms", "value" : 5 }
{ "_id" : "The Red Lion", "value" : 5 }
{ "_id" : "Corney & Barrow", "value" : 4 }
{ "_id" : "O'Neills", "value" : 4 }
{ "_id" : "Pitcher & Piano", "value" : 4 }
{ "_id" : "The Crown", "value" : 4 }
],
"ok" : 1
}
Aggregation Framework Benefits
•  Real-time
•  Simple yet powerful interface
•  Declared in JSON, executes in C++
•  Runs inside MongoDB on local data

− Adds load to your DB
− Limited Operators
− Data output is limited
Analyzing MongoDB Data in
External Systems
MongoDB with Hadoop
• 

MongoDB
Hadoop MongoDB Connector
•  MongoDB or BSON files as input/output
•  Source data can be filtered with queries
•  Hadoop Streaming support
–  For jobs written in Python, Ruby, Node.js

•  Supports Hadoop tools such as Pig and Hive
Map Pub Names in Python
#!/usr/bin/env python
from pymongo_hadoop import BSONMapper
def mapper(documents):
bounds = get_bounds() # ~2 mile polygon
for doc in documents:
geo = get_geo(doc["location"]) # Convert the geo type
if not geo:
continue
if bounds.intersects(geo):
yield {'_id': doc['name'], 'count': 1}
BSONMapper(mapper)
print >> sys.stderr, "Done Mapping."
Reduce Pub Names in Python
#!/usr/bin/env python
from pymongo_hadoop import BSONReducer
def reducer(key, values):
_count = 0
for v in values:
_count += v['count']
return {'_id': key, 'value': _count}
BSONReducer(reducer)
Execute MapReduce
hadoop jar target/mongo-hadoop-streaming-assembly-1.1.0-rc0.jar 
-mapper examples/pub/map.py 
-reducer examples/pub/reduce.py 
-mongo mongodb://127.0.0.1/demo.pubs 
-outputURI mongodb://127.0.0.1/demo.pub_names
Popular Pub Names Nearby
> db.pub_names.find().sort({value: -1}).limit(10)
{
{
{
{
{
{
{
{
{
{

"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"

:
:
:
:
:
:
:
:
:
:

"All Bar One", "value" : 11 }
"The Slug & Lettuce", "value" : 7 }
"The Coach & Horses", "value" : 6 }
"The Kings Arms", "value" : 5 }
"Corney & Barrow", "value" : 4 }
"O'Neills", "value" : 4 }
"Pitcher & Piano", "value" : 4 }
"The Crown", "value" : 4 }
"The George", "value" : 4 }
"The Green Man", "value" : 4 }
MongoDB with Hadoop
• 

MongoDB

warehouse
MongoDB with Hadoop
• 

ETL

MongoDB
Limitations
•  Batch processing
•  Requires synchronization between data store and

processor
•  Adds complexity to infrastructure
Advantages
•  Processing decoupled from data store
•  Parallel processing
•  Leverage existing infrastructure
•  Java has rich set of data processing libraries
–  And other languages if using Hadoop Streaming
Storm
Storm
Storm MongoDB connector
•  Spout for MongoDB oplog or capped collections
–  Filtering capabilities
–  Threaded and non-blocking

•  Output to new or existing documents
–  Insert/update bolt
Aggregating MongoDB’s
Data Processing Options
Data Processing with MongoDB
•  Process in MongoDB using Map/Reduce
•  Process in MongoDB using Aggregation Framework
•  Also: Storing pre-aggregated data
–  An exercise in schema design
•  Process outside MongoDB using Hadoop and other

external tools
External Tools
Questions?
References
•  Map Reduce docs
–  http://docs.mongodb.org/manual/core/map-reduce/
•  Aggregation Framework
–  Examples
http://docs.mongodb.org/manual/applications/aggregation
–  SQL Comparison
http://docs.mongodb.org/manual/reference/sql-aggregation-comparison/
•  Multi Threaded Map Reduce:

http://edgystuff.tumblr.com/post/54709368492/how-to-speedup-mongodb-map-reduce-by-20x
Thanks!
Achille Brighton
Consulting Engineer, MongoDB

Mais conteúdo relacionado

Mais procurados

Agg framework selectgroup feb2015 v2
Agg framework selectgroup feb2015 v2Agg framework selectgroup feb2015 v2
Agg framework selectgroup feb2015 v2MongoDB
 
Beyond the Basics 2: Aggregation Framework
Beyond the Basics 2: Aggregation Framework Beyond the Basics 2: Aggregation Framework
Beyond the Basics 2: Aggregation Framework MongoDB
 
Webinar: Working with Graph Data in MongoDB
Webinar: Working with Graph Data in MongoDBWebinar: Working with Graph Data in MongoDB
Webinar: Working with Graph Data in MongoDBMongoDB
 
MongoDB World 2016 : Advanced Aggregation
MongoDB World 2016 : Advanced AggregationMongoDB World 2016 : Advanced Aggregation
MongoDB World 2016 : Advanced AggregationJoe Drumgoole
 
Conexión de MongoDB con Hadoop - Luis Alberto Giménez - CAPSiDE #DevOSSAzureDays
Conexión de MongoDB con Hadoop - Luis Alberto Giménez - CAPSiDE #DevOSSAzureDaysConexión de MongoDB con Hadoop - Luis Alberto Giménez - CAPSiDE #DevOSSAzureDays
Conexión de MongoDB con Hadoop - Luis Alberto Giménez - CAPSiDE #DevOSSAzureDaysCAPSiDE
 
Barcelona MUG MongoDB + Hadoop Presentation
Barcelona MUG MongoDB + Hadoop PresentationBarcelona MUG MongoDB + Hadoop Presentation
Barcelona MUG MongoDB + Hadoop PresentationNorberto Leite
 
Webinar: Exploring the Aggregation Framework
Webinar: Exploring the Aggregation FrameworkWebinar: Exploring the Aggregation Framework
Webinar: Exploring the Aggregation FrameworkMongoDB
 
MongoDB Europe 2016 - Graph Operations with MongoDB
MongoDB Europe 2016 - Graph Operations with MongoDBMongoDB Europe 2016 - Graph Operations with MongoDB
MongoDB Europe 2016 - Graph Operations with MongoDBMongoDB
 
Analytics with MongoDB Aggregation Framework and Hadoop Connector
Analytics with MongoDB Aggregation Framework and Hadoop ConnectorAnalytics with MongoDB Aggregation Framework and Hadoop Connector
Analytics with MongoDB Aggregation Framework and Hadoop ConnectorHenrik Ingo
 
Introduction to MongoDB and Hadoop
Introduction to MongoDB and HadoopIntroduction to MongoDB and Hadoop
Introduction to MongoDB and HadoopSteven Francia
 
Conceptos básicos. Seminario web 5: Introducción a Aggregation Framework
Conceptos básicos. Seminario web 5: Introducción a Aggregation FrameworkConceptos básicos. Seminario web 5: Introducción a Aggregation Framework
Conceptos básicos. Seminario web 5: Introducción a Aggregation FrameworkMongoDB
 
MongoDB - Aggregation Pipeline
MongoDB - Aggregation PipelineMongoDB - Aggregation Pipeline
MongoDB - Aggregation PipelineJason Terpko
 
Hadoop - MongoDB Webinar June 2014
Hadoop - MongoDB Webinar June 2014Hadoop - MongoDB Webinar June 2014
Hadoop - MongoDB Webinar June 2014MongoDB
 
MongoDB Aggregation
MongoDB Aggregation MongoDB Aggregation
MongoDB Aggregation Amit Ghosh
 
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...MongoDB
 
Back to Basics, webinar 2: La tua prima applicazione MongoDB
Back to Basics, webinar 2: La tua prima applicazione MongoDBBack to Basics, webinar 2: La tua prima applicazione MongoDB
Back to Basics, webinar 2: La tua prima applicazione MongoDBMongoDB
 
2014 bigdatacamp asya_kamsky
2014 bigdatacamp asya_kamsky2014 bigdatacamp asya_kamsky
2014 bigdatacamp asya_kamskyData Con LA
 
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...MongoDB
 

Mais procurados (20)

Agg framework selectgroup feb2015 v2
Agg framework selectgroup feb2015 v2Agg framework selectgroup feb2015 v2
Agg framework selectgroup feb2015 v2
 
Beyond the Basics 2: Aggregation Framework
Beyond the Basics 2: Aggregation Framework Beyond the Basics 2: Aggregation Framework
Beyond the Basics 2: Aggregation Framework
 
Webinar: Working with Graph Data in MongoDB
Webinar: Working with Graph Data in MongoDBWebinar: Working with Graph Data in MongoDB
Webinar: Working with Graph Data in MongoDB
 
MongoDB World 2016 : Advanced Aggregation
MongoDB World 2016 : Advanced AggregationMongoDB World 2016 : Advanced Aggregation
MongoDB World 2016 : Advanced Aggregation
 
Conexión de MongoDB con Hadoop - Luis Alberto Giménez - CAPSiDE #DevOSSAzureDays
Conexión de MongoDB con Hadoop - Luis Alberto Giménez - CAPSiDE #DevOSSAzureDaysConexión de MongoDB con Hadoop - Luis Alberto Giménez - CAPSiDE #DevOSSAzureDays
Conexión de MongoDB con Hadoop - Luis Alberto Giménez - CAPSiDE #DevOSSAzureDays
 
Barcelona MUG MongoDB + Hadoop Presentation
Barcelona MUG MongoDB + Hadoop PresentationBarcelona MUG MongoDB + Hadoop Presentation
Barcelona MUG MongoDB + Hadoop Presentation
 
Webinar: Exploring the Aggregation Framework
Webinar: Exploring the Aggregation FrameworkWebinar: Exploring the Aggregation Framework
Webinar: Exploring the Aggregation Framework
 
MongoDB Europe 2016 - Graph Operations with MongoDB
MongoDB Europe 2016 - Graph Operations with MongoDBMongoDB Europe 2016 - Graph Operations with MongoDB
MongoDB Europe 2016 - Graph Operations with MongoDB
 
Analytics with MongoDB Aggregation Framework and Hadoop Connector
Analytics with MongoDB Aggregation Framework and Hadoop ConnectorAnalytics with MongoDB Aggregation Framework and Hadoop Connector
Analytics with MongoDB Aggregation Framework and Hadoop Connector
 
Introduction to MongoDB and Hadoop
Introduction to MongoDB and HadoopIntroduction to MongoDB and Hadoop
Introduction to MongoDB and Hadoop
 
Conceptos básicos. Seminario web 5: Introducción a Aggregation Framework
Conceptos básicos. Seminario web 5: Introducción a Aggregation FrameworkConceptos básicos. Seminario web 5: Introducción a Aggregation Framework
Conceptos básicos. Seminario web 5: Introducción a Aggregation Framework
 
MongoDB - Aggregation Pipeline
MongoDB - Aggregation PipelineMongoDB - Aggregation Pipeline
MongoDB - Aggregation Pipeline
 
Hadoop - MongoDB Webinar June 2014
Hadoop - MongoDB Webinar June 2014Hadoop - MongoDB Webinar June 2014
Hadoop - MongoDB Webinar June 2014
 
MongoDB Aggregation
MongoDB Aggregation MongoDB Aggregation
MongoDB Aggregation
 
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
 
Back to Basics, webinar 2: La tua prima applicazione MongoDB
Back to Basics, webinar 2: La tua prima applicazione MongoDBBack to Basics, webinar 2: La tua prima applicazione MongoDB
Back to Basics, webinar 2: La tua prima applicazione MongoDB
 
2014 bigdatacamp asya_kamsky
2014 bigdatacamp asya_kamsky2014 bigdatacamp asya_kamsky
2014 bigdatacamp asya_kamsky
 
MongoDB and Python
MongoDB and PythonMongoDB and Python
MongoDB and Python
 
Python and MongoDB
Python and MongoDB Python and MongoDB
Python and MongoDB
 
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
 

Destaque

Introduction to MongoDB and Workshop
Introduction to MongoDB and WorkshopIntroduction to MongoDB and Workshop
Introduction to MongoDB and WorkshopAhmedabadJavaMeetup
 
Introduction to MongoDB with PHP
Introduction to MongoDB with PHPIntroduction to MongoDB with PHP
Introduction to MongoDB with PHPfwso
 
How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...
How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...
How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...Gianfranco Palumbo
 
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...NoSQLmatters
 
Big Data Analytics 3: Machine Learning to Engage the Customer, with Apache Sp...
Big Data Analytics 3: Machine Learning to Engage the Customer, with Apache Sp...Big Data Analytics 3: Machine Learning to Engage the Customer, with Apache Sp...
Big Data Analytics 3: Machine Learning to Engage the Customer, with Apache Sp...MongoDB
 
An Introduction to Map/Reduce with MongoDB
An Introduction to Map/Reduce with MongoDBAn Introduction to Map/Reduce with MongoDB
An Introduction to Map/Reduce with MongoDBRainforest QA
 
MongoDB & Machine Learning
MongoDB & Machine LearningMongoDB & Machine Learning
MongoDB & Machine LearningTom Maiaroto
 

Destaque (9)

Introduction to MongoDB and Workshop
Introduction to MongoDB and WorkshopIntroduction to MongoDB and Workshop
Introduction to MongoDB and Workshop
 
MongoDB - Ekino PHP
MongoDB - Ekino PHPMongoDB - Ekino PHP
MongoDB - Ekino PHP
 
MongoDB
MongoDBMongoDB
MongoDB
 
Introduction to MongoDB with PHP
Introduction to MongoDB with PHPIntroduction to MongoDB with PHP
Introduction to MongoDB with PHP
 
How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...
How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...
How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...
 
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
 
Big Data Analytics 3: Machine Learning to Engage the Customer, with Apache Sp...
Big Data Analytics 3: Machine Learning to Engage the Customer, with Apache Sp...Big Data Analytics 3: Machine Learning to Engage the Customer, with Apache Sp...
Big Data Analytics 3: Machine Learning to Engage the Customer, with Apache Sp...
 
An Introduction to Map/Reduce with MongoDB
An Introduction to Map/Reduce with MongoDBAn Introduction to Map/Reduce with MongoDB
An Introduction to Map/Reduce with MongoDB
 
MongoDB & Machine Learning
MongoDB & Machine LearningMongoDB & Machine Learning
MongoDB & Machine Learning
 

Semelhante a Webinar: Data Processing and Aggregation Options

Webinar: General Technical Overview of MongoDB for Dev Teams
Webinar: General Technical Overview of MongoDB for Dev TeamsWebinar: General Technical Overview of MongoDB for Dev Teams
Webinar: General Technical Overview of MongoDB for Dev TeamsMongoDB
 
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...MongoDB
 
Introduction to MongoDB for C# developers
Introduction to MongoDB for C# developersIntroduction to MongoDB for C# developers
Introduction to MongoDB for C# developersTaras Romanyk
 
Mongodb intro
Mongodb introMongodb intro
Mongodb introchristkv
 
DrupalCon Chicago Practical MongoDB and Drupal
DrupalCon Chicago Practical MongoDB and DrupalDrupalCon Chicago Practical MongoDB and Drupal
DrupalCon Chicago Practical MongoDB and DrupalDoug Green
 
MongoDB.local DC 2018: Tutorial - Data Analytics with MongoDB
MongoDB.local DC 2018: Tutorial - Data Analytics with MongoDBMongoDB.local DC 2018: Tutorial - Data Analytics with MongoDB
MongoDB.local DC 2018: Tutorial - Data Analytics with MongoDBMongoDB
 
Managing Social Content with MongoDB
Managing Social Content with MongoDBManaging Social Content with MongoDB
Managing Social Content with MongoDBMongoDB
 
Joins and Other MongoDB 3.2 Aggregation Enhancements
Joins and Other MongoDB 3.2 Aggregation EnhancementsJoins and Other MongoDB 3.2 Aggregation Enhancements
Joins and Other MongoDB 3.2 Aggregation EnhancementsAndrew Morgan
 
Joins and Other Aggregation Enhancements Coming in MongoDB 3.2
Joins and Other Aggregation Enhancements Coming in MongoDB 3.2Joins and Other Aggregation Enhancements Coming in MongoDB 3.2
Joins and Other Aggregation Enhancements Coming in MongoDB 3.2MongoDB
 
Webinar: Applikationsentwicklung mit MongoDB : Teil 5: Reporting & Aggregation
Webinar: Applikationsentwicklung mit MongoDB: Teil 5: Reporting & AggregationWebinar: Applikationsentwicklung mit MongoDB: Teil 5: Reporting & Aggregation
Webinar: Applikationsentwicklung mit MongoDB : Teil 5: Reporting & AggregationMongoDB
 
1403 app dev series - session 5 - analytics
1403   app dev series - session 5 - analytics1403   app dev series - session 5 - analytics
1403 app dev series - session 5 - analyticsMongoDB
 
MongoDB World 2019: Exploring your MongoDB Data with Pirates (R) and Snakes (...
MongoDB World 2019: Exploring your MongoDB Data with Pirates (R) and Snakes (...MongoDB World 2019: Exploring your MongoDB Data with Pirates (R) and Snakes (...
MongoDB World 2019: Exploring your MongoDB Data with Pirates (R) and Snakes (...MongoDB
 
MongoDB Evenings Dallas: What's the Scoop on MongoDB & Hadoop
MongoDB Evenings Dallas: What's the Scoop on MongoDB & HadoopMongoDB Evenings Dallas: What's the Scoop on MongoDB & Hadoop
MongoDB Evenings Dallas: What's the Scoop on MongoDB & HadoopMongoDB
 
9b. Document-Oriented Databases lab
9b. Document-Oriented Databases lab9b. Document-Oriented Databases lab
9b. Document-Oriented Databases labFabio Fumarola
 
MongoDB Evenings Houston: What's the Scoop on MongoDB and Hadoop? by Jake Ang...
MongoDB Evenings Houston: What's the Scoop on MongoDB and Hadoop? by Jake Ang...MongoDB Evenings Houston: What's the Scoop on MongoDB and Hadoop? by Jake Ang...
MongoDB Evenings Houston: What's the Scoop on MongoDB and Hadoop? by Jake Ang...MongoDB
 
Data Analytics with MongoDB - Jane Fine
Data Analytics with MongoDB - Jane FineData Analytics with MongoDB - Jane Fine
Data Analytics with MongoDB - Jane FineMongoDB
 
Building web applications with mongo db presentation
Building web applications with mongo db presentationBuilding web applications with mongo db presentation
Building web applications with mongo db presentationMurat Çakal
 
Crafting Evolvable Api Responses
Crafting Evolvable Api ResponsesCrafting Evolvable Api Responses
Crafting Evolvable Api Responsesdarrelmiller71
 

Semelhante a Webinar: Data Processing and Aggregation Options (20)

Webinar: General Technical Overview of MongoDB for Dev Teams
Webinar: General Technical Overview of MongoDB for Dev TeamsWebinar: General Technical Overview of MongoDB for Dev Teams
Webinar: General Technical Overview of MongoDB for Dev Teams
 
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...
 
Introduction to MongoDB for C# developers
Introduction to MongoDB for C# developersIntroduction to MongoDB for C# developers
Introduction to MongoDB for C# developers
 
Mongodb intro
Mongodb introMongodb intro
Mongodb intro
 
DrupalCon Chicago Practical MongoDB and Drupal
DrupalCon Chicago Practical MongoDB and DrupalDrupalCon Chicago Practical MongoDB and Drupal
DrupalCon Chicago Practical MongoDB and Drupal
 
MongoDB.local DC 2018: Tutorial - Data Analytics with MongoDB
MongoDB.local DC 2018: Tutorial - Data Analytics with MongoDBMongoDB.local DC 2018: Tutorial - Data Analytics with MongoDB
MongoDB.local DC 2018: Tutorial - Data Analytics with MongoDB
 
Managing Social Content with MongoDB
Managing Social Content with MongoDBManaging Social Content with MongoDB
Managing Social Content with MongoDB
 
Joins and Other MongoDB 3.2 Aggregation Enhancements
Joins and Other MongoDB 3.2 Aggregation EnhancementsJoins and Other MongoDB 3.2 Aggregation Enhancements
Joins and Other MongoDB 3.2 Aggregation Enhancements
 
Joins and Other Aggregation Enhancements Coming in MongoDB 3.2
Joins and Other Aggregation Enhancements Coming in MongoDB 3.2Joins and Other Aggregation Enhancements Coming in MongoDB 3.2
Joins and Other Aggregation Enhancements Coming in MongoDB 3.2
 
Webinar: Applikationsentwicklung mit MongoDB : Teil 5: Reporting & Aggregation
Webinar: Applikationsentwicklung mit MongoDB: Teil 5: Reporting & AggregationWebinar: Applikationsentwicklung mit MongoDB: Teil 5: Reporting & Aggregation
Webinar: Applikationsentwicklung mit MongoDB : Teil 5: Reporting & Aggregation
 
1403 app dev series - session 5 - analytics
1403   app dev series - session 5 - analytics1403   app dev series - session 5 - analytics
1403 app dev series - session 5 - analytics
 
MongoDB World 2019: Exploring your MongoDB Data with Pirates (R) and Snakes (...
MongoDB World 2019: Exploring your MongoDB Data with Pirates (R) and Snakes (...MongoDB World 2019: Exploring your MongoDB Data with Pirates (R) and Snakes (...
MongoDB World 2019: Exploring your MongoDB Data with Pirates (R) and Snakes (...
 
MongoDB Evenings Dallas: What's the Scoop on MongoDB & Hadoop
MongoDB Evenings Dallas: What's the Scoop on MongoDB & HadoopMongoDB Evenings Dallas: What's the Scoop on MongoDB & Hadoop
MongoDB Evenings Dallas: What's the Scoop on MongoDB & Hadoop
 
MongoDB 3.0
MongoDB 3.0 MongoDB 3.0
MongoDB 3.0
 
MongoDB Meetup
MongoDB MeetupMongoDB Meetup
MongoDB Meetup
 
9b. Document-Oriented Databases lab
9b. Document-Oriented Databases lab9b. Document-Oriented Databases lab
9b. Document-Oriented Databases lab
 
MongoDB Evenings Houston: What's the Scoop on MongoDB and Hadoop? by Jake Ang...
MongoDB Evenings Houston: What's the Scoop on MongoDB and Hadoop? by Jake Ang...MongoDB Evenings Houston: What's the Scoop on MongoDB and Hadoop? by Jake Ang...
MongoDB Evenings Houston: What's the Scoop on MongoDB and Hadoop? by Jake Ang...
 
Data Analytics with MongoDB - Jane Fine
Data Analytics with MongoDB - Jane FineData Analytics with MongoDB - Jane Fine
Data Analytics with MongoDB - Jane Fine
 
Building web applications with mongo db presentation
Building web applications with mongo db presentationBuilding web applications with mongo db presentation
Building web applications with mongo db presentation
 
Crafting Evolvable Api Responses
Crafting Evolvable Api ResponsesCrafting Evolvable Api Responses
Crafting Evolvable Api Responses
 

Mais de MongoDB

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump StartMongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
 

Mais de MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Último

Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Principled Technologies
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 

Último (20)

Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 

Webinar: Data Processing and Aggregation Options

  • 1. Data Processing and Aggregation Achille Brighton Consulting Engineer, MongoDB
  • 3. Exponential Data Growth Billions of URLs indexed by Google 1200 1000 800 600 400 200 0 2000 2002 2004 2006 2008
  • 4. For over a decade Big Data == Custom Software
  • 5. In the past few years Open source software has emerged enabling the rest of us to handle Big Data
  • 6. How MongoDB Meets Our Requirements •  MongoDB is an operational database •  MongoDB provides high performance for storage and retrieval at large scale •  MongoDB has a robust query interface permitting intelligent operations •  MongoDB is not a data processing engine, but provides processing functionality
  • 9. The “hello world” of MapReduce is counting words in a paragraph of text. Let’s try something a little more interesting…
  • 10. What is the most popular pub name?
  • 11. Open Street Map Data #!/usr/bin/env python # Data Source # http://www.overpass-api.de/api/xapi?*[amenity=pub][bbox=-10.5,49.78,1.78,59] import re import sys from imposm.parser import OSMParser import pymongo class Handler(object): def nodes(self, nodes): if not nodes: return docs = [] for node in nodes: osm_id, doc, (lon, lat) = node if "name" not in doc: node_points[osm_id] = (lon, lat) continue doc["name"] = doc["name"].title().lstrip("The ").replace("And", "&") doc["_id"] = osm_id doc["location"] = {"type": "Point", "coordinates": [lon, lat]} docs.append(doc) collection.insert(docs)
  • 12. Example Pub Data { "_id" : 451152, "amenity" : "pub", "name" : "The Dignity", "addr:housenumber" : "363", "addr:street" : "Regents Park Road", "addr:city" : "London", "addr:postcode" : "N3 1DH", "toilets" : "yes", "toilets:access" : "customers", "location" : { "type" : "Point", "coordinates" : [-0.1945732, 51.6008172] } }
  • 15. map Map Function MongoDB > var map = function() { emit(this.name, 1); reduce finalize
  • 16. map Reduce Function MongoDB > var reduce = function (key, values) { var sum = 0; values.forEach( function (val) {sum += val;} ); return sum; } reduce finalize
  • 17. Results > db.pub_names.find().sort({value: -1}).limit(10) { "_id" : "The Red Lion", "value" : 407 } { "_id" : "The Royal Oak", "value" : 328 } { "_id" : "The Crown", "value" : 242 } { "_id" : "The White Hart", "value" : 214 } { "_id" : "The White Horse", "value" : 200 } { "_id" : "The New Inn", "value" : 187 } { "_id" : "The Plough", "value" : 185 } { "_id" : "The Rose & Crown", "value" : 164 } { "_id" : "The Wheatsheaf", "value" : 147 } { "_id" : "The Swan", "value" : 140 }
  • 18.
  • 19. Pub Names in the Center of London > db.pubs.mapReduce(map, reduce, { out: "pub_names", query: { location: { $within: { $centerSphere: [[-0.12, 51.516], 2 / 3959] } }} }) { "result" : "pub_names", "timeMillis" : 116, "counts" : { "input" : 643, "emit" : 643, "reduce" : 54, "output" : 537 }, "ok" : 1, }
  • 20. Results > db.pub_names.find().sort({value: -1}).limit(10) { { { { { { { { { { "_id" "_id" "_id" "_id" "_id" "_id" "_id" "_id" "_id" "_id" : : : : : : : : : : "All Bar One", "value" : 11 } "The Slug & Lettuce", "value" : 7 } "The Coach & Horses", "value" : 6 } "The Green Man", "value" : 5 } "The Kings Arms", "value" : 5 } "The Red Lion", "value" : 5 } "Corney & Barrow", "value" : 4 } "O'Neills", "value" : 4 } "Pitcher & Piano", "value" : 4 } "The Crown", "value" : 4 }
  • 22. MongoDB MapReduce •  Real-time •  Output directly to document or collection •  Runs inside MongoDB on local data − Adds load to your DB − In Javascript – debugging can be a challenge − Translating in and out of C++
  • 24. •  Declared in JSON, executes in C++ Aggregation Framework Data Processing in MongoDB
  • 25. •  Declared in JSON, executes in C++ •  Flexible, functional, and simple Aggregation Framework Data Processing in MongoDB
  • 26. •  Declared in JSON, executes in C++ •  Flexible, functional, and simple •  Plays nice with sharding Aggregation Framework Data Processing in MongoDB
  • 27. Pipeline Piping command line operations ps ax | grep mongod | head 1 Data Processing in MongoDB
  • 28. Pipeline Piping aggregation operations $match | $group | $sort Stream of documents Result document Data Processing in MongoDB
  • 29. Pipeline Operators •  $match •  $sort •  $project •  $limit •  $group •  $skip •  $unwind •  $geoNear Data Processing in MongoDB
  • 30. $match •  Filter documents •  Uses existing query syntax •  If using $geoNear it has to be first in pipeline •  $where is not supported
  • 31. Matching Field Values { "_id" : 271421, "amenity" : "pub", "name" : "Sir Walter Tyrrell", "location" : { "type" : "Point", "coordinates" : [ -1.6192422, 50.9131996 ] } } { "$match": { "name": "The Red Lion" }} { "_id" : 271466, "amenity" : "pub", "name" : "The Red Lion", "location" : { "type" : "Point", "coordinates" : [ -1.5494749, 50.7837119 ]} { "_id" : 271466, "amenity" : "pub", "name" : "The Red Lion", "location" : { "type" : "Point", "coordinates" : [ -1.5494749, 50.7837119 ] } }
  • 32. $project •  Reshape documents •  Include, exclude or rename fields •  Inject computed fields •  Create sub-document fields
  • 33. Including and Excluding Fields { "_id" : 271466, "amenity" : "pub", "name" : "The Red Lion", "location" : { "type" : "Point", { “$project”: { “_id”: 0, “amenity”: 1, “name”: 1, }} "coordinates" : [ -1.5494749, 50.7837119 ] } } { “amenity” : “pub”, “name” : “The Red Lion” }
  • 34. Reformatting Documents { "_id" : 271466, "amenity" : "pub", "name" : "The Red Lion", "location" : { "type" : "Point", { “$project”: { “_id”: 0, “name”: 1, “meta”: { “type”: “$amenity”} }} "coordinates" : [ -1.5494749, 50.7837119 ] } } { “name” : “The Red Lion” “meta” : { “type” : “pub” }}
  • 35. $group •  Group documents by an ID •  Field reference, object, constant •  Other output fields are computed $max, $min, $avg, $sum $addToSet, $push $first, $last •  Processes all data in memory
  • 36. Summating fields } { $group: { _id: "$language", numTitles: { $sum: 1 }, sumPages: { $sum: "$pages" } }} { { { title: "The Great Gatsby", pages: 218, language: "English" title: "War and Peace", pages: 1440, language: "Russian” } } { _id: "Russian", numTitles: 1, sumPages: 1440 { title: "Atlas Shrugged", pages: 1088, language: "English" } } _id: "English", numTitles: 2, sumPages: 1306
  • 37. Add To Set { title: "The Great Gatsby", pages: 218, language: "English" { $group: { _id: "$language", titles: { $addToSet: "$title" } }} } { { title: "War and Peace", pages: 1440, language: "Russian" } { } { title: "Atlas Shrugged", pages: 1088, language: "English" } } _id: "Russian", titles: [ "War and Peace" ] _id: "English", titles: [ "Atlas Shrugged", "The Great Gatsby" ]
  • 38. Expanding Arrays { $unwind: "$subjects" } { title: "The Great Gatsby", ISBN: "9781857150193", subjects: [ "Long Island", "New York", "1920s" ] { } { } } { } title: "The Great Gatsby", ISBN: "9781857150193", subjects: "Long Island" title: "The Great Gatsby", ISBN: "9781857150193", subjects: "New York" title: "The Great Gatsby", ISBN: "9781857150193", subjects: "1920s"
  • 39. Back to the pub! •  http://www.offwestend.com/index.php/theatres/pastshows/71
  • 40. Popular Pub Names >var popular_pub_names = [ { $match : location: { $within: { $centerSphere: [[-0.12, 51.516], 2 / 3959]}}} }, { $group : { _id: “$name” value: {$sum: 1} } }, { $sort : {value: -1} }, { $limit : 10 }
  • 41. Results > db.pubs.aggregate(popular_pub_names) { "result" : [ { "_id" : "All Bar One", "value" : 11 } { "_id" : "The Slug & Lettuce", "value" : 7 } { "_id" : "The Coach & Horses", "value" : 6 } { "_id" : "The Green Man", "value" : 5 } { "_id" : "The Kings Arms", "value" : 5 } { "_id" : "The Red Lion", "value" : 5 } { "_id" : "Corney & Barrow", "value" : 4 } { "_id" : "O'Neills", "value" : 4 } { "_id" : "Pitcher & Piano", "value" : 4 } { "_id" : "The Crown", "value" : 4 } ], "ok" : 1 }
  • 42. Aggregation Framework Benefits •  Real-time •  Simple yet powerful interface •  Declared in JSON, executes in C++ •  Runs inside MongoDB on local data − Adds load to your DB − Limited Operators − Data output is limited
  • 43. Analyzing MongoDB Data in External Systems
  • 45. Hadoop MongoDB Connector •  MongoDB or BSON files as input/output •  Source data can be filtered with queries •  Hadoop Streaming support –  For jobs written in Python, Ruby, Node.js •  Supports Hadoop tools such as Pig and Hive
  • 46. Map Pub Names in Python #!/usr/bin/env python from pymongo_hadoop import BSONMapper def mapper(documents): bounds = get_bounds() # ~2 mile polygon for doc in documents: geo = get_geo(doc["location"]) # Convert the geo type if not geo: continue if bounds.intersects(geo): yield {'_id': doc['name'], 'count': 1} BSONMapper(mapper) print >> sys.stderr, "Done Mapping."
  • 47. Reduce Pub Names in Python #!/usr/bin/env python from pymongo_hadoop import BSONReducer def reducer(key, values): _count = 0 for v in values: _count += v['count'] return {'_id': key, 'value': _count} BSONReducer(reducer)
  • 48. Execute MapReduce hadoop jar target/mongo-hadoop-streaming-assembly-1.1.0-rc0.jar -mapper examples/pub/map.py -reducer examples/pub/reduce.py -mongo mongodb://127.0.0.1/demo.pubs -outputURI mongodb://127.0.0.1/demo.pub_names
  • 49. Popular Pub Names Nearby > db.pub_names.find().sort({value: -1}).limit(10) { { { { { { { { { { "_id" "_id" "_id" "_id" "_id" "_id" "_id" "_id" "_id" "_id" : : : : : : : : : : "All Bar One", "value" : 11 } "The Slug & Lettuce", "value" : 7 } "The Coach & Horses", "value" : 6 } "The Kings Arms", "value" : 5 } "Corney & Barrow", "value" : 4 } "O'Neills", "value" : 4 } "Pitcher & Piano", "value" : 4 } "The Crown", "value" : 4 } "The George", "value" : 4 } "The Green Man", "value" : 4 }
  • 52. Limitations •  Batch processing •  Requires synchronization between data store and processor •  Adds complexity to infrastructure
  • 53. Advantages •  Processing decoupled from data store •  Parallel processing •  Leverage existing infrastructure •  Java has rich set of data processing libraries –  And other languages if using Hadoop Streaming
  • 54. Storm
  • 55. Storm
  • 56. Storm MongoDB connector •  Spout for MongoDB oplog or capped collections –  Filtering capabilities –  Threaded and non-blocking •  Output to new or existing documents –  Insert/update bolt
  • 58. Data Processing with MongoDB •  Process in MongoDB using Map/Reduce •  Process in MongoDB using Aggregation Framework •  Also: Storing pre-aggregated data –  An exercise in schema design •  Process outside MongoDB using Hadoop and other external tools
  • 61. References •  Map Reduce docs –  http://docs.mongodb.org/manual/core/map-reduce/ •  Aggregation Framework –  Examples http://docs.mongodb.org/manual/applications/aggregation –  SQL Comparison http://docs.mongodb.org/manual/reference/sql-aggregation-comparison/ •  Multi Threaded Map Reduce: http://edgystuff.tumblr.com/post/54709368492/how-to-speedup-mongodb-map-reduce-by-20x