SlideShare a Scribd company logo
1 of 47
Consulting Engineer, MongoDB
Bryan Reinero
#ConferenceHashTag
Time Series Data- Part 2
Aggregations in Action
Real Time Traffic Data Project
Our network of 16,000 speed sensors report
data every minute.
What we want from our data
Charting and Trending
What we want from our data
Historical & Predictive Analysis
What we want from our data
Real Time Traffic Dashboard
Document Structure
{ _id: ObjectId("5382ccdd58db8b81730344e2"),
linkId: 900006,
date: ISODate("2014-03-12T17:00:00Z"),
data: [
{ speed: NaN, time: NaN },
{ speed: NaN, time: NaN },
{ speed: NaN, time: NaN },
...
],
conditions: {
status: "Snow / Ice Conditions",
pavement: "Icy Spots",
weather: "Light Snow"
}
}
Sample Document Structure
Compound, unique
Index identifies the
Individual document
{ _id: ObjectId("5382ccdd58db8b81730344e2"),
linkId: 900006,
date: ISODate("2014-03-12T17:00:00Z"),
data: [
{ speed: NaN, time: NaN },
{ speed: NaN, time: NaN },
{ speed: NaN, time: NaN },
...
],
conditions: {
status: "Snow / Ice Conditions",
pavement: "Icy Spots",
weather: "Light Snow"
}
}
Sample Document Structure
Saves an extra index
{ _id: “900006:14031217”,
data: [
{ speed: NaN, time: NaN },
{ speed: NaN, time: NaN },
{ speed: NaN, time: NaN },
...
],
conditions: {
status: "Snow / Ice Conditions",
pavement: "Icy Spots",
weather: "Light Snow"
}
}
{ _id: “900006:14031217”,
data: [
{ speed: NaN, time: NaN },
{ speed: NaN, time: NaN },
{ speed: NaN, time: NaN },
...
],
conditions: {
status: "Snow / Ice Conditions",
pavement: "Icy Spots",
weather: "Light Snow"
}
}
Sample Document Structure
Range queries:
/^900006:1403/
Regex must be
left-anchored &
case-sensitive
{ _id: “900006:140312”,
data: [
{ speed: NaN, time: NaN },
{ speed: NaN, time: NaN },
{ speed: NaN, time: NaN },
...
],
conditions: {
status: "Snow / Ice Conditions",
pavement: "Icy Spots",
weather: "Light Snow"
}
}
Sample Document Structure
Pre-allocated,
60 element array of
per-minute data
Charts
0
10
20
30
40
50
60
70
MonMar10201404:57:00…
MonMar10201405:31:00…
MonMar10201406:05:00…
MonMar10201406:39:00…
MonMar10201407:13:00…
MonMar10201407:47:00…
MonMar10201408:21:00…
MonMar10201408:55:00…
MonMar10201409:29:00…
MonMar10201410:04:00…
MonMar10201410:38:00…
MonMar10201411:55:00…
TueMar11201402:41:00…
TueMar11201403:15:00…
TueMar11201403:49:00…
TueMar11201404:39:00…
TueMar11201405:13:00…
TueMar11201405:47:00…
TueMar11201406:21:00…
TueMar11201406:55:00…
TueMar11201407:29:00…
TueMar11201408:03:00…
TueMar11201408:37:00…
TueMar11201409:18:00…
TueMar11201410:44:00…
TueMar11201411:18:00…
TueMar11201411:53:00…
TueMar11201412:27:00…
TueMar11201413:04:00…
TueMar11201413:38:00…
TueMar11201414:15:00…
TueMar11201416:56:00…
WedMar12201401:45:00…
WedMar12201402:19:00…
WedMar12201402:53:00…
WedMar12201403:27:00…
WedMar12201406:46:00…
WedMar12201408:26:00…
WedMar12201409:00:00…
WedMar12201410:12:00…
WedMar12201410:46:00…
db.linkData.find( { _id : /^20484097:2014031/ } )
Rollups
{ _id: "20484097:20140204",
hours: [
{ speed: { sum: 1889, count: 60 }
time: { sum: 20562, count: 60 },
conditions: {
status: "Snow / Ice Conditions",
pavement: "Icy Spots",
weather: "Light Snow"
}
},
{ speed: {m: 1892, count: 60 },
time: {sum: 20442, count: 60 },
conditions: {
status: "Snow / Ice Conditions",
pavement: "Slush",
weather: "Light Snow"
}
}
]}
Document retention
Doc per hour
Doc per day
2 days
2 months
1year
Doc per Month
Analysis with The Aggregation
Framework
Pipelining operations
grep | sort | uniq
Piping command line operations
Pipelining operations
$match $group | $sort|
Piping aggregation operations
Stream of documents Result documents
What is the average speed for a
given road segment?
> db.linkData.aggregate(
{ $match: { ”_id" : /^20484097:/ } },
{ $project: { "data.speed": 1 } } ,
{ $unwind: "$data"},
{ $group: { _id: “”, ave: { $avg: "$data.speed"} } }
);
{ "_id" : 20484097, "ave" : 47.067650676506766 }
What is the average speed for a
given road segment?
Select documents on the target segment
> db.linkData.aggregate(
{ $match: { ”_id" : /^20484097:/ } },
{ $project: { "data.speed": 1, linkId: 1 } } ,
{ $unwind: "$data"},
{ $group: { _id: "$linkId", ave: { $avg: "$data.speed"} } }
);
{ "_id" : 20484097, "ave" : 47.067650676506766 }
What is the average speed for a
given road segment?
Keep only the fields we really need
> db.linkData.aggregate(
{ $match: { ”_id" : /^20484097:/ } },
{ $project: { "data.speed": 1, linkId: 1 } } ,
{ $unwind: "$data"},
{ $group: { _id: "$linkId", ave: { $avg: "$data.speed"} } }
);
{ "_id" : 20484097, "ave" : 47.067650676506766 }
What is the average speed for a
given road segment?
Loop over the array of data points
> db.linkData.aggregate(
{ $match: { ”_id" : /^20484097:/ } },
{ $project: { "data.speed": 1, linkId: 1 } } ,
{ $unwind: "$data"},
{ $group: { _id: "$linkId", ave: { $avg: "$data.speed"} } }
);
{ "_id" : 20484097, "ave" : 47.067650676506766 }
What is the average speed for a
given road segment?
Use the handy $avg operator
> db.linkData.aggregate(
{ $match: { ”_id" : /^20484097:/ } },
{ $project: { "data.speed": 1, linkId: 1 } } ,
{ $unwind: "$data"},
{ $group: { _id: "$linkId", ave: { $avg: "$data.speed"} } }
);
{ "_id" : 20484097, "ave" : 47.067650676506766 }
More Sophisticated Pipelines:
average speed with variance
{ "$project" : {
mean: "$meanSpd",
spdDiffSqrd : {
"$map" : {
"input": {
"$map" : {
"input" : "$speeds",
"as" : "samp",
"in" : { "$subtract" : [ "$$samp", "$meanSpd" ] }
}
},
as: "df", in: { $multiply: [ "$$df", "$$df" ] }
} } } },
{ $unwind: "$spdDiffSqrd" },
{ $group: { _id: mean: "$mean", variance: { $avg: "$spdDiffSqrd" } } }
Historic Analysis
How does weather and road conditions affect
traffic?
The Ask: what are the average speeds per
weather, status and pavement
MapReduce
function map() {
for( var i = 0; i < this.data.length; i++ ) {
emit (
this.conditions.weather,
{ speed : this.data[i].speed }
);
emit (
this.conditions.status,
{ speed : this.data[i].speed }
);
emit (
this.conditions.pavement,
{ speed : this.data[i].speed }
);
} }
MapReduce
function map() {
for( var i = 0; i < this.data.length; i++ ) {
emit (
this.conditions.weather,
{ speed : this.data[i].speed }
);
emit (
this.conditions.status,
{ speed : this.data[i].speed }
);
emit (
this.conditions.pavement,
{ speed : this.data[i].speed }
);
} }
“Snow”,
34
MapReduce
function map() {
for( var i = 0; i < this.data.length; i++ ) {
emit (
this.conditions.weather,
{ speed : this.data[i].speed }
);
emit (
this.conditions.status,
{ speed : this.data[i].speed }
);
emit (
this.conditions.pavement,
{ speed : this.data[i].speed }
);
} }
“Icy spots”, 34
MapReduce
function map() {
for( var i = 0; i < this.data.length; i++ ) {
emit (
this.conditions.weather,
{ speed : this.data[i].speed }
);
emit (
this.conditions.status,
{ speed : this.data[i].speed }
);
emit (
this.conditions.pavement,
{ speed : this.data[i].speed }
);
} }
“Delays”, 34
MapReduce
MapReduce
Weather: “Rain”, speed: 44
MapReduce
Weather: “Rain”, speed: 39
MapReduce
Weather: “Rain”, speed: 46
MapReduce
function reduce ( key, values ) {
var result = { count : 1, speedSum : 0 };
values.forEach( function( v ){
result.speedSum += v.speed;
result.count++;
});
return result;
}
MapReduce
function reduce ( key, values ) {
var result = { count : 1, speedSum : 0 };
values.forEach( function( v ){
result.speedSum += v.speed;
result.count++;
});
return result;
}
Results
results: [
{
"_id" : "Generally Clear and Dry Conditions",
"value" : {
"count" : 902,
"speedSum" : 45100
}
},
{
"_id" : "Icy Spots",
"value" : {
"count" : 242,
"speedSum" : 9438
}
},
{
"_id" : "Light Snow",
"value" : {
"count" : 122,
"speedSum" : 7686
}
},
{
"_id" : "No Report",
"value" : {
"count" : 782,
"speedSum" : NaN
}
}
Processing Large Data Sets
• Need to break data into smaller pieces
• Process data across multiple nodes
Hadoop Hadoop Hadoop Hadoop
Hadoop Hadoop Hadoop HadoopHadoop
Hadoop
Benefits of the Hadoop Connector
• Increased parallelism
• Access to analytics libraries
• Separation of concerns
• Integrates with existing tool chains
• Drivers will be accessing the data via web, mobile
devices, and navigation systems
• We need to provide current average speed, travel time
and weather per road segment
Real-time Dashboard
Current Real-Time Conditions
Last ten minutes of speeds and
times
{ _id : “I-87:10656”,
description : "NYS Thruway Harriman Section Exits 14A - 16",
update : ISODate(“2013-10-10T23:06:37.000Z”),
speeds : [ 52, 49, 45, 51, ... ],
times : [ 237, 224, 246, 233,... ],
pavement: "Wet Spots",
status: "Wet Conditions",
weather: "Light Rain”,
averageSpeed: 50.23,
averageTime: 234,
maxSafeSpeed: 53.1,
location" : {
"type" : "LineString",
"coordinates" : [
[ -74.056, 41.098 ],
[ -74.077, 41.104 ] }
}
{ _id : “I-87:10656”,
description : "NYS Thruway Harriman Section Exits 14A - 16",
update : ISODate(“2013-10-10T23:06:37.000Z”),
speeds : [ 52, 49, 45, 51, ... ],
times : [ 237, 224, 246, 233,... ],
pavement: "Wet Spots",
status: "Wet Conditions",
weather: "Light Rain”,
averageSpeed: 50.23,
averageTime: 234,
maxSafeSpeed: 53.1,
location" : {
"type" : "LineString",
"coordinates" : [
[ -74.056, 41.098 ],
[ -74.077, 41.104 ] }
}
Current Real-Time Conditions
Pre-aggregated
metrics
{ _id : “I-87:10656”,
description : "NYS Thruway Harriman Section Exits 14A - 16",
update : ISODate(“2013-10-10T23:06:37.000Z”),
speeds : [ 52, 49, 45, 51, ... ],
times : [ 237, 224, 246, 233,... ],
pavement: "Wet Spots",
status: "Wet Conditions",
weather: "Light Rain”,
averageSpeed: 50.23,
averageTime: 234,
maxSafeSpeed: 53.1,
location" : {
"type" : "LineString",
"coordinates" : [
[ -74.056, 41.098 ],
[ -74.077, 41.104 ] }
}
Current Real-Time Conditions
Geo-spatially indexed
road segment
db.linksAvg.update(
{"_id" : linkId},
{ "$set" : {"update " : date},
"$push" : {
"times" : { "$each" : [ time ], "$slice" : -10 },
"speeds" : {"$each" : [ speed ], "$slice" : -10}
}
})
Maintaining the current conditions
Each update pops the last element off the
array and pushes the new value
Putting it all together
Patterns common to time series
data:
• You need to store and manage an incoming
stream of data samples
• You need to compute derivative data sets based
on these samples
• You need low latency access to up-to-date data
Patterns common to time series
data:
• You need to store and manage an incoming
stream of data samples
• You need to compute derivative data sets based
on these samples
• You need low latency access to up-to-date data
Introducing The High Volume Data
Feed
HVDF: Reference Implementation
Screech -- High Volume Data Feed engine
REST
Service API
Processor
Plugins
Inline
Batch
Stream
Channel Data Storage
Raw
Channel
Data
Aggregated
Rollup T1
Aggregated
Rollup T2
Query Processor Streaming spout
Custom Stream
Processing Logic
Incoming Sample Stream
POST /feed/channel/data
GET
/feed/channeldata?time=XX
X&range=YYY
Real-time Queries
HVDF:
https://github.com/10gen-labs/hvdf
Hadoop Connector:
https://github.com/mongodb/mongo-hadoop
Consulting Engineer, MongoDB Inc.
Bryan Reinero
#MongoDBWorld
Thank You

More Related Content

What's hot

Kevin Kempter PostgreSQL Backup and Recovery Methods @ Postgres Open
Kevin Kempter PostgreSQL Backup and Recovery Methods @ Postgres OpenKevin Kempter PostgreSQL Backup and Recovery Methods @ Postgres Open
Kevin Kempter PostgreSQL Backup and Recovery Methods @ Postgres Open
PostgresOpen
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
Databricks
 
Automating Data Quality Processes at Reckitt
Automating Data Quality Processes at ReckittAutomating Data Quality Processes at Reckitt
Automating Data Quality Processes at Reckitt
Databricks
 

What's hot (20)

Neanex - Semantic Construction with Graphs
Neanex - Semantic Construction with GraphsNeanex - Semantic Construction with Graphs
Neanex - Semantic Construction with Graphs
 
Kevin Kempter PostgreSQL Backup and Recovery Methods @ Postgres Open
Kevin Kempter PostgreSQL Backup and Recovery Methods @ Postgres OpenKevin Kempter PostgreSQL Backup and Recovery Methods @ Postgres Open
Kevin Kempter PostgreSQL Backup and Recovery Methods @ Postgres Open
 
Big Query - Utilizing Google Data Warehouse for Media Analytics
Big Query - Utilizing Google Data Warehouse for Media AnalyticsBig Query - Utilizing Google Data Warehouse for Media Analytics
Big Query - Utilizing Google Data Warehouse for Media Analytics
 
Migration From Oracle to PostgreSQL
Migration From Oracle to PostgreSQLMigration From Oracle to PostgreSQL
Migration From Oracle to PostgreSQL
 
Caching for Microservices Architectures: Session I
Caching for Microservices Architectures: Session ICaching for Microservices Architectures: Session I
Caching for Microservices Architectures: Session I
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its Benefits
 
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
 
Self Service Reporting & Analytics For an Enterprise
Self Service Reporting & Analytics For an EnterpriseSelf Service Reporting & Analytics For an Enterprise
Self Service Reporting & Analytics For an Enterprise
 
Migrating Oracle database to PostgreSQL
Migrating Oracle database to PostgreSQLMigrating Oracle database to PostgreSQL
Migrating Oracle database to PostgreSQL
 
Oracle to Postgres Migration - part 1
Oracle to Postgres Migration - part 1Oracle to Postgres Migration - part 1
Oracle to Postgres Migration - part 1
 
Dmm302 - Sap Hana Data Warehousing: Models for Sap Bw and SQL DW on SAP HANA
Dmm302 - Sap Hana Data Warehousing: Models for Sap Bw and SQL DW on SAP HANA Dmm302 - Sap Hana Data Warehousing: Models for Sap Bw and SQL DW on SAP HANA
Dmm302 - Sap Hana Data Warehousing: Models for Sap Bw and SQL DW on SAP HANA
 
Best Practices of Data Modeling with InfoSphere Data Architect
Best Practices of Data Modeling with InfoSphere Data ArchitectBest Practices of Data Modeling with InfoSphere Data Architect
Best Practices of Data Modeling with InfoSphere Data Architect
 
Augury: Real-Time Insights for the Industrial IoT
Augury: Real-Time Insights for the Industrial IoTAugury: Real-Time Insights for the Industrial IoT
Augury: Real-Time Insights for the Industrial IoT
 
Comparison of MPP Data Warehouse Platforms
Comparison of MPP Data Warehouse PlatformsComparison of MPP Data Warehouse Platforms
Comparison of MPP Data Warehouse Platforms
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Creating an Enterprise AI Strategy
Creating an Enterprise AI StrategyCreating an Enterprise AI Strategy
Creating an Enterprise AI Strategy
 
Best Practices for Running SQL Server on Amazon RDS (DAT323) - AWS re:Invent ...
Best Practices for Running SQL Server on Amazon RDS (DAT323) - AWS re:Invent ...Best Practices for Running SQL Server on Amazon RDS (DAT323) - AWS re:Invent ...
Best Practices for Running SQL Server on Amazon RDS (DAT323) - AWS re:Invent ...
 
Automating Data Quality Processes at Reckitt
Automating Data Quality Processes at ReckittAutomating Data Quality Processes at Reckitt
Automating Data Quality Processes at Reckitt
 
K-IMS
K-IMSK-IMS
K-IMS
 
GT.M: A Tried and Tested Open-Source NoSQL Database
GT.M: A Tried and Tested Open-Source NoSQL DatabaseGT.M: A Tried and Tested Open-Source NoSQL Database
GT.M: A Tried and Tested Open-Source NoSQL Database
 

Viewers also liked

MongoDB for Time Series Data Part 1: Setting the Stage for Sensor Management
MongoDB for Time Series Data Part 1: Setting the Stage for Sensor ManagementMongoDB for Time Series Data Part 1: Setting the Stage for Sensor Management
MongoDB for Time Series Data Part 1: Setting the Stage for Sensor Management
MongoDB
 
The Aggregation Framework
The Aggregation FrameworkThe Aggregation Framework
The Aggregation Framework
MongoDB
 
MongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: ShardingMongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: Sharding
MongoDB
 
MongoDB and Hadoop: Driving Business Insights
MongoDB and Hadoop: Driving Business InsightsMongoDB and Hadoop: Driving Business Insights
MongoDB and Hadoop: Driving Business Insights
MongoDB
 
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
MongoDB
 

Viewers also liked (18)

MongoDB for Time Series Data Part 1: Setting the Stage for Sensor Management
MongoDB for Time Series Data Part 1: Setting the Stage for Sensor ManagementMongoDB for Time Series Data Part 1: Setting the Stage for Sensor Management
MongoDB for Time Series Data Part 1: Setting the Stage for Sensor Management
 
MongoDB for Time Series Data
MongoDB for Time Series DataMongoDB for Time Series Data
MongoDB for Time Series Data
 
The Aggregation Framework
The Aggregation FrameworkThe Aggregation Framework
The Aggregation Framework
 
MongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: ShardingMongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: Sharding
 
MS SQL SERVER: Time series algorithm
MS SQL SERVER: Time series algorithmMS SQL SERVER: Time series algorithm
MS SQL SERVER: Time series algorithm
 
MongoDB and Hadoop: Driving Business Insights
MongoDB and Hadoop: Driving Business InsightsMongoDB and Hadoop: Driving Business Insights
MongoDB and Hadoop: Driving Business Insights
 
Using MongoDB + Hadoop Together
Using MongoDB + Hadoop TogetherUsing MongoDB + Hadoop Together
Using MongoDB + Hadoop Together
 
Agg framework selectgroup feb2015 v2
Agg framework selectgroup feb2015 v2Agg framework selectgroup feb2015 v2
Agg framework selectgroup feb2015 v2
 
Creating a Modern Data Architecture for Digital Transformation
Creating a Modern Data Architecture for Digital TransformationCreating a Modern Data Architecture for Digital Transformation
Creating a Modern Data Architecture for Digital Transformation
 
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
 
Back to Basics Webinar 3: Introduction to Replica Sets
Back to Basics Webinar 3: Introduction to Replica SetsBack to Basics Webinar 3: Introduction to Replica Sets
Back to Basics Webinar 3: Introduction to Replica Sets
 
Webinar: 10-Step Guide to Creating a Single View of your Business
Webinar: 10-Step Guide to Creating a Single View of your BusinessWebinar: 10-Step Guide to Creating a Single View of your Business
Webinar: 10-Step Guide to Creating a Single View of your Business
 
Seattle Scalability Meetup - Ted Dunning - MapR
Seattle Scalability Meetup - Ted Dunning - MapRSeattle Scalability Meetup - Ted Dunning - MapR
Seattle Scalability Meetup - Ted Dunning - MapR
 
Design, Scale and Performance of MapR's Distribution for Hadoop
Design, Scale and Performance of MapR's Distribution for HadoopDesign, Scale and Performance of MapR's Distribution for Hadoop
Design, Scale and Performance of MapR's Distribution for Hadoop
 
Back to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQLBack to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQL
 
Webinar: Working with Graph Data in MongoDB
Webinar: Working with Graph Data in MongoDBWebinar: Working with Graph Data in MongoDB
Webinar: Working with Graph Data in MongoDB
 
Back to Basics: My First MongoDB Application
Back to Basics: My First MongoDB ApplicationBack to Basics: My First MongoDB Application
Back to Basics: My First MongoDB Application
 
Advanced Schema Design Patterns
Advanced Schema Design PatternsAdvanced Schema Design Patterns
Advanced Schema Design Patterns
 

Similar to MongoDB for Time Series Data Part 2: Analyzing Time Series Data Using the Aggregation Framework and Hadoop

MongoDB for Time Series Data: Analyzing Time Series Data Using the Aggregatio...
MongoDB for Time Series Data: Analyzing Time Series Data Using the Aggregatio...MongoDB for Time Series Data: Analyzing Time Series Data Using the Aggregatio...
MongoDB for Time Series Data: Analyzing Time Series Data Using the Aggregatio...
MongoDB
 
MongoDB for Time Series Data: Setting the Stage for Sensor Management
MongoDB for Time Series Data: Setting the Stage for Sensor ManagementMongoDB for Time Series Data: Setting the Stage for Sensor Management
MongoDB for Time Series Data: Setting the Stage for Sensor Management
MongoDB
 
Operational Intelligence with MongoDB Webinar
Operational Intelligence with MongoDB WebinarOperational Intelligence with MongoDB Webinar
Operational Intelligence with MongoDB Webinar
MongoDB
 

Similar to MongoDB for Time Series Data Part 2: Analyzing Time Series Data Using the Aggregation Framework and Hadoop (20)

MongoDB for Time Series Data: Analyzing Time Series Data Using the Aggregatio...
MongoDB for Time Series Data: Analyzing Time Series Data Using the Aggregatio...MongoDB for Time Series Data: Analyzing Time Series Data Using the Aggregatio...
MongoDB for Time Series Data: Analyzing Time Series Data Using the Aggregatio...
 
MongoDB for Time Series Data: Setting the Stage for Sensor Management
MongoDB for Time Series Data: Setting the Stage for Sensor ManagementMongoDB for Time Series Data: Setting the Stage for Sensor Management
MongoDB for Time Series Data: Setting the Stage for Sensor Management
 
MongoDB World 2016: The Best IoT Analytics with MongoDB
MongoDB World 2016: The Best IoT Analytics with MongoDBMongoDB World 2016: The Best IoT Analytics with MongoDB
MongoDB World 2016: The Best IoT Analytics with MongoDB
 
Operational Intelligence with MongoDB Webinar
Operational Intelligence with MongoDB WebinarOperational Intelligence with MongoDB Webinar
Operational Intelligence with MongoDB Webinar
 
MongoDB 3.2 - Analytics
MongoDB 3.2  - AnalyticsMongoDB 3.2  - Analytics
MongoDB 3.2 - Analytics
 
Weather of the Century: Design and Performance
Weather of the Century: Design and PerformanceWeather of the Century: Design and Performance
Weather of the Century: Design and Performance
 
2014 bigdatacamp asya_kamsky
2014 bigdatacamp asya_kamsky2014 bigdatacamp asya_kamsky
2014 bigdatacamp asya_kamsky
 
IT Days - Parse huge JSON files in a streaming way.pptx
IT Days - Parse huge JSON files in a streaming way.pptxIT Days - Parse huge JSON files in a streaming way.pptx
IT Days - Parse huge JSON files in a streaming way.pptx
 
Samantha Wang [InfluxData] | Best Practices on How to Transform Your Data Usi...
Samantha Wang [InfluxData] | Best Practices on How to Transform Your Data Usi...Samantha Wang [InfluxData] | Best Practices on How to Transform Your Data Usi...
Samantha Wang [InfluxData] | Best Practices on How to Transform Your Data Usi...
 
Monitoring Your ISP Using InfluxDB Cloud and Raspberry Pi
Monitoring Your ISP Using InfluxDB Cloud and Raspberry PiMonitoring Your ISP Using InfluxDB Cloud and Raspberry Pi
Monitoring Your ISP Using InfluxDB Cloud and Raspberry Pi
 
NoSQL meets Microservices
NoSQL meets MicroservicesNoSQL meets Microservices
NoSQL meets Microservices
 
Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015
Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015
Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015
 
Data Time Travel by Delta Time Machine
Data Time Travel by Delta Time MachineData Time Travel by Delta Time Machine
Data Time Travel by Delta Time Machine
 
Codepot - Pig i Hive: szybkie wprowadzenie / Pig and Hive crash course
Codepot - Pig i Hive: szybkie wprowadzenie / Pig and Hive crash courseCodepot - Pig i Hive: szybkie wprowadzenie / Pig and Hive crash course
Codepot - Pig i Hive: szybkie wprowadzenie / Pig and Hive crash course
 
NoSQL meets Microservices - Michael Hackstein
NoSQL meets Microservices - Michael HacksteinNoSQL meets Microservices - Michael Hackstein
NoSQL meets Microservices - Michael Hackstein
 
Analytics with Spark
Analytics with SparkAnalytics with Spark
Analytics with Spark
 
Handling Real-time Geostreams
Handling Real-time GeostreamsHandling Real-time Geostreams
Handling Real-time Geostreams
 
Handling Real-time Geostreams
Handling Real-time GeostreamsHandling Real-time Geostreams
Handling Real-time Geostreams
 
node.js and the AR.Drone: building a real-time dashboard using socket.io
node.js and the AR.Drone: building a real-time dashboard using socket.ionode.js and the AR.Drone: building a real-time dashboard using socket.io
node.js and the AR.Drone: building a real-time dashboard using socket.io
 
Evolution is Continuous, and so are Big Data and Streaming Pipelines
Evolution is Continuous, and so are Big Data and Streaming PipelinesEvolution is Continuous, and so are Big Data and Streaming Pipelines
Evolution is Continuous, and so are Big Data and Streaming Pipelines
 

More from MongoDB

More from MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Recently uploaded

Recently uploaded (20)

[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 

MongoDB for Time Series Data Part 2: Analyzing Time Series Data Using the Aggregation Framework and Hadoop

Editor's Notes

  1. Reports (group, summing, averaging) Analytics(incremental reporting, rollups) Analysis (trends, segmentation, anomalies) Analytics (regression, forecasting, filtering) Warehousing (long term storage and simplified querying)
  2. Compound unique index on linkId & Interval update field used to identify new documents for aggregation
  3. Compound unique index on linkId & Interval update field used to identify new documents for aggregation
  4. Compound unique index on linkId & Interval update field used to identify new documents for aggregation
  5. Compound unique index on linkId & Interval update field used to identify new documents for aggregation
  6. Compound unique index on linkId & Interval update field used to identify new documents for aggregation
  7. Priority Floating point number between 0..1000 Highest member that is up to date wins Up to date == within 10 seconds of primary If a higher priority member catches up, it will force election and win Slave Delay Lags behind master by configurable time delay Automatically hidden from clients Protects against operator errors Fat fingering Application corrupts data
  8. Compound unique index on linkId & Interval update field used to identify new documents for aggregation
  9. Priority Floating point number between 0..1000 Highest member that is up to date wins Up to date == within 10 seconds of primary If a higher priority member catches up, it will force election and win Slave Delay Lags behind master by configurable time delay Automatically hidden from clients Protects against operator errors Fat fingering Application corrupts data
  10. Priority Floating point number between 0..1000 Highest member that is up to date wins Up to date == within 10 seconds of primary If a higher priority member catches up, it will force election and win Slave Delay Lags behind master by configurable time delay Automatically hidden from clients Protects against operator errors Fat fingering Application corrupts data
  11. Priority Floating point number between 0..1000 Highest member that is up to date wins Up to date == within 10 seconds of primary If a higher priority member catches up, it will force election and win Slave Delay Lags behind master by configurable time delay Automatically hidden from clients Protects against operator errors Fat fingering Application corrupts data
  12. Priority Floating point number between 0..1000 Highest member that is up to date wins Up to date == within 10 seconds of primary If a higher priority member catches up, it will force election and win Slave Delay Lags behind master by configurable time delay Automatically hidden from clients Protects against operator errors Fat fingering Application corrupts data
  13. Priority Floating point number between 0..1000 Highest member that is up to date wins Up to date == within 10 seconds of primary If a higher priority member catches up, it will force election and win Slave Delay Lags behind master by configurable time delay Automatically hidden from clients Protects against operator errors Fat fingering Application corrupts data
  14. Priority Floating point number between 0..1000 Highest member that is up to date wins Up to date == within 10 seconds of primary If a higher priority member catches up, it will force election and win Slave Delay Lags behind master by configurable time delay Automatically hidden from clients Protects against operator errors Fat fingering Application corrupts data
  15. Priority Floating point number between 0..1000 Highest member that is up to date wins Up to date == within 10 seconds of primary If a higher priority member catches up, it will force election and win Slave Delay Lags behind master by configurable time delay Automatically hidden from clients Protects against operator errors Fat fingering Application corrupts data
  16. Priority Floating point number between 0..1000 Highest member that is up to date wins Up to date == within 10 seconds of primary If a higher priority member catches up, it will force election and win Slave Delay Lags behind master by configurable time delay Automatically hidden from clients Protects against operator errors Fat fingering Application corrupts data
  17. Priority Floating point number between 0..1000 Highest member that is up to date wins Up to date == within 10 seconds of primary If a higher priority member catches up, it will force election and win Slave Delay Lags behind master by configurable time delay Automatically hidden from clients Protects against operator errors Fat fingering Application corrupts data
  18. Priority Floating point number between 0..1000 Highest member that is up to date wins Up to date == within 10 seconds of primary If a higher priority member catches up, it will force election and win Slave Delay Lags behind master by configurable time delay Automatically hidden from clients Protects against operator errors Fat fingering Application corrupts data
  19. Reports (group, summing, averaging) Analytics(incremental reporting, rollups) Analysis (trends, segmentation, anomalies) Analytics (regression, forecasting, filtering) Warehousing (long term storage and simplified querying)
  20. Reports (group, summing, averaging) Analytics(incremental reporting, rollups) Analysis (trends, segmentation, anomalies) Analytics (regression, forecasting, filtering) Warehousing (long term storage and simplified querying)
  21. Reports (group, summing, averaging) Analytics(incremental reporting, rollups) Analysis (trends, segmentation, anomalies) Analytics (regression, forecasting, filtering) Warehousing (long term storage and simplified querying)