MongoDB and the Internet of Things
1. MongoDB and the Internet of Things
Brandon Newell
Sr. Solutions Architect, MongoDB
Brandon.Newell@MongoDB.com
@virtual_newell
2. Agenda
•IoT Overview & Use Cases
•Architecture & Challenges
•Agility & Scalability with MongoDB
3. What is IoT
“The Internet of Things (IoT) is a computing concept whereby
everyday physical objects are connected to a network and able
to identify themselves to other devices.”
9. Where does MongoDB fit into
IoT?
MongoDB enables you to
Collect, Analyze and Act on
every piece of data that your IoT
environment can throw at it.
10. CONNECTED COW
by VITAL HERD
E-pill ingested into stomach
Transmits heart rate, temp,
chemical composition
Notifies farmer when
abnormality is detected
Health management
94 Million Cows in US, Billions
of savings
11. Capabilities
Solutions
Bosch SI IoT Suite
M2M | BPM | BRM | Big Data
Suite for IoT: Key Capabilities
A. Scale
B. Flexibility
C. Analytics
D. Unified View
https://www.mongodb.com/customers/bosch
12. Devices
Sensors, Controllers, etc.
(W)LAN / Drivers
Assets
Machines, Vehicles, Power Plants, etc.
WAN / Mobile Carrier Network
Systems of Systems
Assembly Line, Power Grid, etc.
M2M Management Backend
Asset Database, Analytics, Event Management,
Rules, Business Processes, Management Console
Suite for IoT
A. Scale
B. Flexibility
C. Analytics
D. Unified View
https://www.mongodb.com/customers/bosch
13. Technology Stack
Hardware Platform: Arduino, Raspberry Pi, Intel Edison, bespoke sensors
Wireless Transport: Zigbee, Z-Wave, Wi-Fi, GPRS, Bluetooth LE
Communication Protocol: MQTT, CoAP, XMPP, AMQP, RESTful
Middleware and Storage: Application servers, Database servers
Value Delivery: Business Analytics, User Access & Control
14. IoT Required Capabilities
•Support a variety of hardware and software devices for data
ingestion
•Manage time series data at scale
•Easily support new device versions and data types
•Support real-time and historical analytics
•Minimize need for special purpose databases (timeseries,
reporting)
•Security
•High Availability
16. IoT Reference Architecture
[Architecture diagram. Labeled components:]
Data sources: Sensors, Mobile Data
Ingestion: IoT Gateway, Message Queue
Raw Data: unfiltered raw data for analytics; is purged frequently
Data Enriching Application(s)
Processed Data: enriched data with filtered IoT data, historical data and data from other sources required for IoT
Big Data / Machine Learning Processing Framework
Other data: Customer, Product, Transaction, Historical
Consumers: IoT Apps, Mobile Apps, Live BI Dashboards, Other Applications
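As a rough illustration of the "Data Enriching Application(s)" component, the sketch below filters raw sensor readings and merges each surviving reading with reference data from another source. The field names, the `ASSETS` lookup table, and the validity range are all assumptions made for this example; the slide does not specify them.

```python
# Hypothetical reference data from another system (e.g. customer/product records).
ASSETS = {
    "pump-1": {"customer": "ACME", "product": "P-100"},
    "pump-2": {"customer": "Globex", "product": "P-200"},
}


def enrich(raw_readings, assets=ASSETS, valid_range=(-40.0, 150.0)):
    """Drop out-of-range readings and attach asset metadata to the rest."""
    low, high = valid_range
    processed = []
    for reading in raw_readings:
        if not low <= reading["temp"] <= high:
            continue  # filtered out; it would remain only in the raw store
        meta = assets.get(reading["asset_id"], {})
        processed.append({**reading, **meta})
    return processed


# Made-up raw readings; the second one simulates a sensor glitch.
raw = [
    {"asset_id": "pump-1", "ts": 1437297148810, "temp": 71.5},
    {"asset_id": "pump-2", "ts": 1437297149213, "temp": 9999.0},
]
processed = enrich(raw)
print(processed)
```

The raw list plays the role of the frequently purged raw store, while `processed` is what an application or dashboard would read.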
18. R Driver for MongoDB 3.6+
Recommended MongoDB R driver for data scientists,
developers & statisticians
• Idiomatic, native language access to the database
• MongoDB read & write concerns to control data
consistency & durability
• Data security with enterprise authentication
mechanisms
• Advanced BSON data types, e.g., Decimal128 for
high-precision scientific & financial analysis
22. Relational Sample Design #1
EVENT_ID  PLANE_ID  TIMESTAMP      LAT      LONG       ENGINE TEMP  FUEL LEVEL  …  SPEED
100001    3902      1437297148810  38.2031  -124.4904
100002    3902      1437297149213                                                  750
Modeling all metrics as columns in one relational table
Huge table, lots of wasted space caused by
empty values
Frequent schema change and data migrations
when adding new metrics
23. Relational Sample Design #2
Store variable metrics in an EAV (entity-attribute-value) table

EVENT_ID  PLANE_ID  TIMESTAMP
100001    3902      1437297148810

EVENT_ID  METRIC_NAME  METRIC_VALUE
100001    LAT          38.2031
100001    LONG         -124.4904
100002    SPEED        750

METRIC_VALUE needs to be defined as a TEXT field
Index implications for the METRIC_VALUE field
Multiple self joins necessary
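The drawbacks listed for the EAV design can be reproduced with a minimal sketch using Python's built-in `sqlite3` module. The table and column names follow the slide; choosing SQLite and the lowercase table names `events` and `metrics` are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (event_id INTEGER PRIMARY KEY, plane_id INTEGER, ts INTEGER);
    CREATE TABLE metrics (event_id INTEGER, metric_name TEXT, metric_value TEXT);
    INSERT INTO events VALUES
        (100001, 3902, 1437297148810),
        (100002, 3902, 1437297149213);
    INSERT INTO metrics VALUES
        (100001, 'LAT', '38.2031'),
        (100001, 'LONG', '-124.4904'),
        (100002, 'SPEED', '750');
""")

# Every metric we want back as a column costs one more join on the metrics table.
rows = conn.execute("""
    SELECT e.event_id, lat.metric_value, lon.metric_value
    FROM events e
    JOIN metrics lat ON lat.event_id = e.event_id AND lat.metric_name = 'LAT'
    JOIN metrics lon ON lon.event_id = e.event_id AND lon.metric_name = 'LONG'
""").fetchall()
print(rows)

# Because values are stored as TEXT, comparisons are lexicographic, not numeric.
text_compare = conn.execute("SELECT '750' > '1000'").fetchone()[0]
print(text_compare)
```

The position comes back as strings rather than numbers, and the last query shows why range predicates and indexes on a TEXT value column behave unexpectedly for numeric metrics.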
24. Enormous Data Volume
A single flight, per minute interval:
3 * 60 * 100 = 18,000 data points/flight
100,000 flights per day:
1.8 Billion, 1.8TB per day
21,000 QPS
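The figures above can be re-derived with a few lines of arithmetic. The slide does not label its factors, so reading "3 * 60 * 100" as 3 hours of flight, 60 one-minute samples per hour, and 100 metrics per sample is an assumption, as is the roughly 1 KB per data point implied by 1.8 TB for 1.8 billion points.

```python
HOURS_PER_FLIGHT = 3        # assumption: the "3" is flight duration in hours
SAMPLES_PER_HOUR = 60       # one sample per minute
METRICS_PER_SAMPLE = 100    # assumption: the "100" is metrics per sample
FLIGHTS_PER_DAY = 100_000
BYTES_PER_POINT = 1_000     # assumption implied by 1.8 TB / 1.8 billion points

points_per_flight = HOURS_PER_FLIGHT * SAMPLES_PER_HOUR * METRICS_PER_SAMPLE
points_per_day = points_per_flight * FLIGHTS_PER_DAY
terabytes_per_day = points_per_day * BYTES_PER_POINT / 1e12
writes_per_second = points_per_day / (24 * 60 * 60)

print(points_per_flight)         # data points per flight
print(points_per_day)            # data points per day
print(terabytes_per_day)         # TB per day
print(round(writes_per_second))  # average writes per second
```

The average write rate works out to just under 21,000 per second, which the slide rounds to 21,000 QPS.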
26. The MongoDB Difference: Nexus Architecture
RDBMS Core Strengths
Expressive Query Language & Secondary Indexes
Strong Data Consistency
Enterprise Management Tools, Security & Integrations
NoSQL Core Strengths
Highly Scalable & Available
Built for Global Cloud Deployments
Flexible Data Model
28. AGILITY
Start coding now, without a month-long ER (Entity Relationship) design phase.
Change the schema as you go, without penalty.
A polymorphic schema models variable structure with ease.
29. DATA MODEL
location: (-84.2391, 34.1039)
speed: 750
engine:
  fuel_level: 100,
  temperature: 88.48
1 Rich data structure
2 Sparse Indexes
3 Dynamic Schema
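To make the three points concrete, here is a small plain-Python analogy. Each event carries only the fields it actually has, and the helper indexes only the documents that contain the indexed field, which is the idea behind a sparse index. The `plane_id` and `ts` values are illustrative, and `build_sparse_index` is a hypothetical helper, not how MongoDB implements its indexes.

```python
# Documents with variable structure: no placeholder columns for missing metrics.
events = [
    {"plane_id": 3902, "ts": 1437297148810, "location": (-84.2391, 34.1039)},
    {"plane_id": 3902, "ts": 1437297149213, "speed": 750},
    {"plane_id": 3902, "ts": 1437297149617,
     "engine": {"fuel_level": 100, "temperature": 88.48}},  # nested structure
]


def build_sparse_index(docs, field):
    """Map field value -> document positions, skipping docs that lack the field."""
    index = {}
    for pos, doc in enumerate(docs):
        if field in doc:
            index.setdefault(doc[field], []).append(pos)
    return index


speed_index = build_sparse_index(events, "speed")
print(speed_index)
```

Only the one event that reports `speed` appears in the index; the other two cost nothing, in contrast to the empty columns of the single wide relational table.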
31. Rich Functionality for IoT/Time Series
Expressive Queries & Secondary Indexes
• Find all devices of type Thermostat in Newark
• Find the time samples for yesterday from 12:00-12:05pm for all Thermostats in sector 8
Geospatial
• Find temperatures for all devices within 100 ft of [40.7, -74.2]
• Find the location of all devices within the polygon representing zip code 07114
Text Search
• Find all log entries that mention “hot day”
Aggregation
• Find the devices that recorded an average temperature of over 65 degrees from 12:00-12:05pm
Native Binary JSON support
• Add an additional temp entry without sending the whole document
• Select just the time entry for 1:05pm
• Find all device entries and sort all of their date types properly
Left outer join ($lookup)
• Find the manufacture date (in another collection) for all devices with error statuses, sorted by date
Graph queries ($graphLookup)
• Find all devices and subcomponents for DeviceId 100
Sample Document
Device {
  DeviceId: 100,
  Name: "Temp8",
  Type: "Thermostat",
  Location: "EWR",
  Coordinates: [40.702675, -74.179471],
  PlantName: "EWR 10",
  Sector: 8,
  ForemanLogEntry: "Very hot day caused…",
  Samples: {
    StartTime: ISODate("2016-09-05"),
    EndTime: ISODate("2016-09-06"),
    Entries: [
      {t: ISODate("…12:00"), temp: 65, …},
      {t: ISODate("…12:01"), temp: 66, …},
      … ]
  },
  SubcomponentIDs: [105, 207, 308, …]
}
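A few of the queries on this slide might be expressed as follows, written here as plain Python dictionaries in the shape a driver such as PyMongo accepts for `find`, `update_one`, and `aggregate`. This is a sketch only: the collection names `devices` and `manufacturing` and the fields `Status` and `ManufactureDate` do not appear in the sample document and are assumptions.

```python
import json

# Find all devices of type Thermostat in Newark (Location "EWR" in the sample).
find_thermostats = {"Type": "Thermostat", "Location": "EWR"}

# Add an additional temp entry without sending the whole document.
# The timestamp string and the value 67 are placeholders.
append_entry = {
    "filter": {"DeviceId": 100},
    "update": {"$push": {"Samples.Entries": {"t": "<timestamp>", "temp": 67}}},
}

# Manufacture date (in another collection) for devices with error statuses.
# "manufacturing", "Status" and "ManufactureDate" are hypothetical names.
lookup_pipeline = [
    {"$match": {"Status": "error"}},
    {"$lookup": {
        "from": "manufacturing",
        "localField": "DeviceId",
        "foreignField": "DeviceId",
        "as": "manufacturing",
    }},
    {"$unwind": "$manufacturing"},
    {"$sort": {"manufacturing.ManufactureDate": 1}},
]

# All devices and subcomponents reachable from DeviceId 100.
# Assumes the documents live in a collection named "devices".
graph_pipeline = [
    {"$match": {"DeviceId": 100}},
    {"$graphLookup": {
        "from": "devices",
        "startWith": "$SubcomponentIDs",
        "connectFromField": "SubcomponentIDs",
        "connectToField": "DeviceId",
        "as": "subcomponents",
    }},
]

print(json.dumps(graph_pipeline, indent=2))
```

The `$push` update sends only the new entry over the wire, which is the point of the "without sending the whole document" bullet.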
32. OPTIMIZE
with document models
“A time series is a sequence of data points, typically consisting of successive
measurements made over a time interval. Examples of time series are ocean tides,
counts of sunspots, and the daily closing value of the Dow Jones Industrial Average.”
-- Wikipedia
Optimize Time Series
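One common way to fit time series into a document model is to group readings into buckets, as the earlier sample document does with one day of entries per device. The sketch below shows that grouping in plain Python; the function name `bucket_by_day`, the one-day bucket size, and the third reading are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def bucket_by_day(readings):
    """Group (device_id, timestamp, temp) readings into one document per device per day."""
    buckets = defaultdict(list)
    for device_id, ts, temp in readings:
        day = datetime(ts.year, ts.month, ts.day)  # midnight at the start of the day
        buckets[(device_id, day)].append({"t": ts, "temp": temp})
    return [
        {
            "DeviceId": device_id,
            "Samples": {
                "StartTime": day,
                "EndTime": day + timedelta(days=1),
                "Entries": sorted(entries, key=lambda e: e["t"]),
            },
        }
        for (device_id, day), entries in sorted(buckets.items(), key=lambda kv: kv[0])
    ]


readings = [
    (100, datetime(2016, 9, 5, 12, 0), 65),
    (100, datetime(2016, 9, 5, 12, 1), 66),
    (100, datetime(2016, 9, 6, 0, 0), 64),  # made-up reading on the next day
]
docs = bucket_by_day(readings)
print(len(docs))
```

Three readings become two documents, so a query for one device and one day reads a single document instead of one row per measurement.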
36. CHOOSING A SHARD KEY FOR SENSORS
Cardinality - LARGE
Write distribution - EVEN
Query isolation - ISOLATED
37. CHOOSING A SHARD KEY
Shard key     Cardinality  Write distribution  Query isolation  Reliability           Index locality
_id           Doc level    One shard           Scatter/gather   All users affected    Good
hash(_id)     Hash level   All shards          Scatter/gather   All users affected    Poor
asset_id      Many docs    All shards          Targeted         Some assets affected  Good
asset_id, ts  Doc level    All shards          Targeted         Some assets affected  Good
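The difference between the `_id` and `asset_id, ts` shard keys can be illustrated with a toy model of range-based routing. This is a simplified simulation, not MongoDB's actual balancer: it assumes four shards with one chunk each, fixed split points, and an `_id` that increases monotonically.

```python
from bisect import bisect_right
from collections import Counter


def shard_for(key, split_points):
    """Range routing: keys below split_points[0] go to shard 0, and so on."""
    return bisect_right(split_points, key)


# Shard key _id: new values are always larger than every existing split point,
# so all new writes land on the last shard.
id_splits = [1000, 2000, 3000]
id_writes = Counter(shard_for(i, id_splits) for i in range(5000, 5100))

# Shard key (asset_id, ts): 100 assets each report at 100 timestamps.
asset_splits = [(25, 0), (50, 0), (75, 0)]
writes = [(asset, ts) for ts in range(100) for asset in range(100)]
asset_writes = Counter(shard_for(k, asset_splits) for k in writes)

# A query for a single asset touches only the shard that owns its key range.
shards_for_asset_42 = {shard_for((42, ts), asset_splits) for ts in range(100)}

print(dict(id_writes))
print(dict(asset_writes))
print(shards_for_asset_42)
```

In this model the monotonically increasing key sends every insert to one shard, while the compound key spreads writes evenly and still lets a per-asset query target a single shard.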
39. How MongoDB Can Help
MongoDB Enterprise Advanced
The best way to run MongoDB in your data center
MongoDB Ops Manager
The easiest way to run MongoDB in your data center
Production Support
In production and under control
Development Support
Let’s get you running
Consulting
We solve problems
Training
Get your teams up to speed.