Without the right data management strategy, investments in Internet of Things (IoT) can yield limited results. Apache Hadoop has emerged as a key architectural component that can help make sense of IoT data, enabling never before seen data products and solutions.
IDC predicts the worldwide Internet of Things (IoT) market will grow from $655.8 billion in 2014 to $1.7
trillion in 2020 with a compound annual growth rate (CAGR) of 16.9%. The installed base of IoT
endpoints will grow from 10.3 billion in 2014 to more than 29.5 billion in 2020 with a CAGR of 19.2%.
Data is the key to IoT – the value lies in the ability to gain insights from all of this data.
IDC’s Digital Universe study predicts the world's data will amount to 44 zettabytes by 2020, with 10% of it coming from the Internet of Things.
The amount of data on the planet is set to grow 10-fold in the next six years to 2020 from around 4.4 zettabytes to 44ZB. That’s according to IDC’s annual Digital Universe study, which also predicted that, by 2020, the amount of information produced by machines, the so-called internet of things, will account for about 10% of data on earth.
SLIDE 3: Takeaway — It’s all meaningless unless you can make sense of all that data
POV: Hadoop is a central component for success with IoT
Action (General action step): Re-architect now to prepare for IoT deluge
Benefit: Get ahead of the IoT wave – stop talking, start doing
Talking points: While the Internet of Things holds a lot of promise, it’s all for naught unless you can actually do something with the data that is the byproduct of IoT itself. The data deluge that comes with it requires a modernized approach to data management infrastructure, one that accounts for the new requirements to store, process, and analyze enormous volumes of data. Organizations that start from the architecture out are generally the most successful. Re-architecting now, at the onset of your IoT journey, will put you in a position to capitalize on all that data and deliver meaningful results. As we’ll discuss throughout this presentation, Hadoop is a central pillar of that modernized information architecture.
According to a recent McKinsey analysis, less than 1 percent of the data being generated by the 30,000 sensors on an offshore oil rig is currently used to make decisions. And of the data that are actually used—for example, in manufacturing automation systems on factory floors—most are used only for real-time control or anomaly detection.
< 1% of data is currently utilized, mostly for anomaly detection or real-time control; more can be used for optimization and prediction algorithms. 99% of the data is not being utilized, analyzed or leveraged for business decision making. ~40% of all data is never stored; remainder is stored locally in silos for a short period but not utilized for business analytics.
A critical challenge is how to use the flood of data generated by IoT devices for prediction and optimization – knowing what to do with the data, such as predicting a machine failure before it happens.
Where IoT data are being used, they are often used only for anomaly detection or real-time control, rather than for optimization or prediction, which is where much additional value can be derived. For example, in manufacturing, an increasing number of machines are “wired,” but this instrumentation is used primarily to control the tools or to send alarms when it detects something out of tolerance. The data from these tools are often not analyzed (or even collected in a place where they could be analyzed), even though the data could be used to optimize processes and head off disruptions.
Interoperability is critical to generating maximum value from IoT applications. McKinsey estimates that situations in which two or more IoT systems must work together can account for about 40 percent of the total value that can be unlocked by the Internet of Things. For example, interoperability would significantly improve performance by combining sensor data from different machines and systems to provide decision makers with an integrated view of performance across an entire factory or oil rig.
There is a lot of focus on consumer IoT – Nest thermostats, Fitbit wearables, and watches – but there is a huge IoT ecosystem outside the consumer frame.
In fact, over 70% of the economic value in IoT will be generated in industrial or B2B IoT settings.
For example, manufacturing is a huge area, where things like predictive maintenance and operations optimization can revolutionize the industry. Using real-time data to predict and prevent breakdowns can reduce downtime by 50 percent, and supply chain optimization with a real-time view of the supply chain can drive down costs by 50 percent.
Energy & Utilities:
Smart buildings – in industrial zones, office parks, shopping malls, airports, or seaports – where IoT can help reduce the cost of energy and building maintenance by up to 30 percent.
Healthcare: With connected devices and real-time monitoring, IoT can cut the costs of chronic disease treatment by as much as 50 percent.
Connected cars can reduce low-speed crashes by 80 percent.
McKinsey estimates that B2B uses can generate nearly 70 percent of potential value enabled by IoT –
Healthcare: IoT consists of technology mash-ups: device integration, software, networks, and analytics. Two very
different examples underline the spectrum of change within healthcare IoT: from niche to ubiquitous
and from expected to unexpected. The deployment in a UK hospital's pediatric unit of McLaren
Applied Technologies analytics technology, normally used for racing cars, for early warning of patient
deterioration, is an example of how the industry can borrow innovation from other sectors. Google's
ramp up of healthcare activity comes as no surprise; its nanoparticle project, designed to detect early
signs of disease, exemplifies the growing raft of both tech-driven innovation and the special attention
devoted to healthcare.
IoT will be a key component of new mega-vendor healthcare strategies and partnerships. GE
Healthcare's $1bn R&D budget for its campaign against cancer, announced in 2011, marries
advanced cancer diagnostics, molecular imaging, and advanced tech for biopharmaceuticals and
cancer research. Philips' rapprochement with Amazon Web Services (AWS), centered on the use of
AWS's IoT platform to expand the capabilities of its HealthSuite platform, comes hot on the heels of
IBM's launch of Watson Health and its string of healthcare analytics acquisitions.
Caterpillar
VIMS – Vehicle Information Management System
This is an IoT application. It collects sensor data from a subset of the Caterpillar fleet and is used for performance analysis, defect detection, and similar tasks, mainly feeding downstream BI and analysis applications.
This use case was originally developed by Cloudera PS and has been extended and supported by Caterpillar
Lockheed Martin
One of the major projects is the Orion Multi-Purpose Crew Vehicle, which is designed for long-duration, human-rated deep space exploration. Orion will transport humans to interplanetary destinations beyond low Earth orbit, such as asteroids, the moon, and eventually Mars, and return them safely back to Earth.
With human lives on the line, along with millions of dollars of highly technical equipment, testing of all the systems and subsystems for the Orion space systems is long and exhaustive. Simulating the space environment on the ground and testing the system is a lengthy and expensive effort – requiring perfect results for each system and its subsystems. These tests generate hundreds of megabytes per second of telemetry data that in very short time becomes petabytes of data that needs to be managed, analyzed, and leveraged to validate healthy functioning of all the systems.
For Orion, the telemetry is produced in a variety of simulation and test environments, which include at least seven different labs across the US.
In simulating the real mission environment, test telemetry data is streamed to the testing system. This telemetry data contains mission-critical health information about equipment and the test’s status. Knowing whether or not the test is progressing correctly helps the test conductors decide whether to continue or modify a test scenario – a test that may take weeks to accomplish.
Our work with the Orion Space capsule takes this streaming test data and saves it to a Hadoop-based cluster supporting high data rate ingest. Advanced analytics can be run on the streaming data to check for expected or indeterminate patterns. This method of data analytics for system testing in an online environment opens up new opportunities for the test conductors to significantly reduce the risk of missing critical test parameters. It also creates a highly cost-effective and productive test environment.
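The basic form of this pattern checking can be sketched as follows. This is an illustrative sketch only – the channel names and expected ranges are invented for the example, not actual Orion telemetry parameters:

```python
# Hypothetical sketch: validating streaming test telemetry against expected
# per-channel ranges. Channel names and limits are illustrative, not real
# Orion parameters.

EXPECTED_RANGES = {
    "cabin_pressure_kpa": (95.0, 105.0),
    "battery_temp_c": (-10.0, 45.0),
}

def classify_reading(channel, value):
    """Return 'ok', 'out_of_range', or 'unknown_channel' for one sample."""
    limits = EXPECTED_RANGES.get(channel)
    if limits is None:
        return "unknown_channel"
    low, high = limits
    return "ok" if low <= value <= high else "out_of_range"

def scan_stream(samples):
    """Classify a stream of (channel, value) samples and collect anomalies."""
    return [(c, v) for c, v in samples if classify_reading(c, v) != "ok"]
```

In practice these checks would run continuously against the high-rate ingest on the Hadoop cluster, but the logic of flagging expected versus indeterminate values is the same.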
Orion’s first test: Orbited the Earth twice, traveling approximately 3,600 miles above the Earth’s surface
15 times farther than the International Space Station.
Generated more than 80% of the return velocity experienced during a reentry from the moon, which allows engineers to model expected reentries from future missions in deep space.
Orion’s next mission (EM-1) in 2018
2 weeks instead of 4 hours
4 times as many computers
Twice as many instruments
Subsystems that support Human Flight!
One of the fastest growing cities in the UK, Milton Keynes has to support that expansion within local infrastructure constraints, while meeting stretching expenditure and carbon reduction targets. Joining forces with The Open University, BT and other partners, Milton Keynes Council formed a Smart City collaboration to rise to those challenges.
There are around 25,000 parking spaces in Milton Keynes and forecasts suggest that perhaps as many as 12,000 more may be needed by 2020. Brian Matthews, head of transport at Milton Keynes Council, says: “If we don’t act soon, parking in Milton Keynes will become a big problem. But we know that around 7,000 existing spaces are empty at any one time and, in some cases, this is because people don’t know where to find them.” Better utilisation of existing parking spaces will save the Council a substantial sum. “It costs around £15,000 to create a new parking bay,” says Brian Matthews. “If we built new ones when there are 7,000 unused we could be wasting truly significant amounts of money.”
A pilot was launched to manage the use of short-term parking spaces at Milton Keynes railway station. Designed by specialist technology provider Deteq working with BT, it involved installing sensors in each of the parking bays. Bonded to the tarmac, they’re powered by lithium-ion batteries with a lifespan of over four years. Detecting the arrival and departure of a vehicle, the sensors send information wirelessly to lamppost-mounted, solar-powered repeaters. These aggregate the data and transmit it over the internet to the MK Data Hub, which is currently hosted by BT. There it’s processed and the resulting analysis made available on the Milton Keynes Council public information dashboard, as well as via a browser that displays bay status as red (occupied) or green (free) via an overlay on Google Maps.
The prize from full deployment will be a capital saving of at least £105 million, with reduced fuel use and vehicle emissions.
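The aggregation step behind the red/green dashboard can be illustrated with a toy sketch. The bay IDs and event format here are assumptions for illustration, not Deteq/BT's actual implementation:

```python
# Illustrative sketch: folding arrival/departure events from bay sensors
# into the red/green occupancy view shown on the Council dashboard.

def bay_status(events, all_bays):
    """events: iterable of (bay_id, 'arrival' | 'departure').
    Returns {bay_id: 'red' (occupied) | 'green' (free)}."""
    status = {bay: "green" for bay in all_bays}  # all bays start free
    for bay, event in events:
        status[bay] = "red" if event == "arrival" else "green"
    return status

def free_count(status):
    """Number of free bays – the figure that helps avoid building new ones."""
    return sum(1 for s in status.values() if s == "green")
```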
30-70% Drop in the price of MEMS sensors in past five years – McKinsey Research
Diverse data types – from intermittent sensor readings of temperature and pressure to real-time location data or streaming live videos for video analytics
Given the flexible, scalable nature of cloud-based infrastructure and the fact that machine data often originates off premises, we expect a lot of IoT data to be stored and processed in the cloud. The ideal IoT data platform can be deployed either on premises or in a public, hybrid, or private cloud environment. It should be possible to administer the platform via both a web-based interface and API calls.
A gateway collects, aggregates, and optionally processes the data generated by the devices. It can also accept and route commands sent from the backend to the respective device. The gateway is responsible for authenticating and authorizing the devices that participate in the workflow, and it ensures secure communication between the devices and the centralized command center. The gateway is capable of dealing with multiple protocols and data formats.
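Those gateway responsibilities might be sketched as follows. The token store, device IDs, and payload formats here are illustrative assumptions, not any vendor's actual API:

```python
# Minimal gateway sketch: authenticate a device, normalize payloads arriving
# in different formats, and buffer readings before forwarding upstream.
import json

AUTHORIZED_TOKENS = {"sensor-01": "t0k3n-a", "sensor-02": "t0k3n-b"}

def authenticate(device_id, token):
    return AUTHORIZED_TOKENS.get(device_id) == token

def normalize(payload, fmt):
    """Accept JSON or simple 'key=value' CSV payloads; return a dict."""
    if fmt == "json":
        return json.loads(payload)
    if fmt == "kv":
        return dict(pair.split("=") for pair in payload.split(","))
    raise ValueError("unsupported format: %s" % fmt)

def ingest(device_id, token, payload, fmt, buffer):
    """Authenticate, normalize, and buffer one message.
    Returns True if the message was accepted."""
    if not authenticate(device_id, token):
        return False
    buffer.append({"device": device_id, "data": normalize(payload, fmt)})
    return True
```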
Some Pointers:
One of the scarcest resources in many IoT environments is likely to be network bandwidth, either because it is simply not available or because it is expensive. The volume, complexity, and growing geographical dispersal of IoT data calls for a database that is optimized to handle the type of data that IoT devices will generate.
Support for real-time analytics and events
The ability to analyze sensor data in real time is key to putting the information to work. The technology should also support declarative event triggers, to avoid the need to manually code threshold checks or event monitoring.
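Declarative event triggers of the kind described might look like this sketch; the rule set and metric names are illustrative, not a specific product's API:

```python
# Sketch of declarative event triggers over a sensor stream: rules are
# declared as data rather than hand-coded threshold checks scattered
# through application logic.

RULES = [
    # (metric, predicate, event name to fire)
    ("temperature", lambda v: v > 80.0, "overheat_alert"),
    ("vibration", lambda v: v > 5.0, "vibration_alert"),
]

def evaluate(reading):
    """reading: dict of metric -> value. Returns the list of fired events."""
    fired = []
    for metric, predicate, event in RULES:
        value = reading.get(metric)
        if value is not None and predicate(value):
            fired.append(event)
    return fired
```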
Fast, Easy, Secure
An enterprise data hub can store unlimited data, cost-effectively and reliably, for as long as you need, and lets users access that data in a variety of ways. Data can be collected, stored, processed, explored, modeled, and served in one unified platform. It’s connected to the systems you already rely on.
Cloudera’s enterprise data hub, powered by Apache Hadoop, the popular open source distributed data platform, is differentiated in several crucial areas. We provide:
Leading query performance.
The enterprise management and governance that you require of all of your mission-critical infrastructure.
Comprehensive, transparent, compliance-ready security at the core.
An open source platform that is also built of open standards – projects that are supported by multiple vendors to ensure sustainability, portability, and compatibility.
Our platform runs in your choice of environment, whether on-premises or in the cloud.
===
Cheat Sheet version: Our enterprise data hub is:
One place for unlimited data
Accessible to anyone
Connected to the systems you already depend on
Secure, governed, managed & compliant
Built on open source and open standards
Deployed however you want
Coupled with the support and enablement you need to succeed.
Important Note: Our EDH emphasizes “unified analytics” over “unified data”: It’s not practical or probable that customers will actually unify all their data. Much of it lives in the cloud or on storage (e.g. Isilon), in remote datacenters, is of uncertain value vs. cost of moving it to a hub, or security mandates preclude collocation. We enable customers to gather unlimited data, while bringing diverse processing and analytics to that data.
How Cloudera’s EDH fits into the IoT Ecosystem
Can ingest data from multiple sources including real-time streaming sensor data
You can combine the sensor data with data from other internal and external sources to drive business insights
You can deploy EDH on premises (in your data center) or in hybrid cloud environments and still manage it centrally
And you can serve and analyze the data in a number of different ways - integrate it with existing BI solutions, do search or machine learning or integrate it with real time applications
Streaming Ingest – Kafka & Flume plus Data Pipeline Visualization with our partner Streamsets
Kudu – Real-time updates and appends to data – ideal for querying streaming data as it lands
Streaming Data Processing - Spark – Cloudera leadership in Spark
Batch data processing – HDFS/ Hbase Capabilities
Centralized Cluster Mgmt – Unified Monitoring & Troubleshooting with Cloudera Manager
Deployment Flexibility – Easily deploy to the cloud (Cloudera Director) – High availability
Hybrid Portable Deployment – Deploy a cluster in AWS & Google Cloud and effectively manage the Clusters with the same interface
Security Features – Adding security features for Kafka plus RecordService – Unified access policies irrespective of access framework
Build integrations between Kafka and the Gateway to push data back to the sensors
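A toy end-to-end sketch of this pipeline follows, with an in-memory queue standing in for a Kafka topic and a micro-batch aggregation standing in for a Spark Streaming job. All names are illustrative; a real deployment would use the actual Kafka/Spark/Kudu APIs:

```python
# Toy ingest -> process pipeline: produce() plays the role of a Kafka
# producer, process_batch() the role of a Spark Streaming micro-batch
# that computes per-sensor averages before serving them downstream.
from collections import defaultdict, deque

topic = deque()  # stands in for a Kafka topic partition

def produce(sensor_id, value):
    topic.append((sensor_id, value))

def process_batch():
    """Drain the topic and compute per-sensor averages for this micro-batch."""
    sums, counts = defaultdict(float), defaultdict(int)
    while topic:
        sensor_id, value = topic.popleft()
        sums[sensor_id] += value
        counts[sensor_id] += 1
    return {s: sums[s] / counts[s] for s in sums}
```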
Tesla
Onboard sensors capture 30,000+ signal types
• Component data – how fast, voltages, temperatures, work performed
• Operational metrics – how many times charging port has been opened / closed, air conditioner operation metrics …
What components and software were installed?
Opower is a Utility Analytics company that provides 360-degree views into energy usage patterns and similar household comparisons to help consumers save energy.
Challenge: With the advent of smart meters and ever-growing utility data streams, Opower recognized the need to capture, store and analyze this data in order to help consumers save energy.
Solution: Opower built an analytical application on Cloudera Enterprise, leveraging Apache HBase, to bring together utility consumption data along with weather data, consumer behavior data, and other disparate sources of information.
Benefit: By pulling together, processing, analyzing, and then presenting information to homeowners, Opower is helping more than four million homes save hundreds of millions of dollars on their energy bills.
Assicurazioni Generali
https://www.michaeljfox.org/foundation/publication-detail.html?id=555&category=7
The Michael J. Fox Foundation and Intel Join Forces to Improve Parkinson's Disease Monitoring and Treatment through Advanced Technologies
August 13, 2014
The Michael J. Fox Foundation for Parkinson’s Research (MJFF) and Intel Corporation announced today a collaboration aimed at improving research and treatment for Parkinson’s disease — a neurodegenerative brain disease second only to Alzheimer’s in worldwide prevalence. The collaboration includes a multiphase research study using a new big data analytics platform that detects patterns in participant data collected from wearable technologies used to monitor symptoms. This effort is an important step in enabling researchers and physicians to measure progression of the disease and to speed progress toward breakthroughs in drug development.
“Nearly 200 years after Parkinson’s disease was first described by Dr. James Parkinson in 1817, we are still subjectively measuring Parkinson’s disease largely the same way doctors did then,” said Todd Sherer, PhD, CEO of The Michael J. Fox Foundation. “Data science and wearable computing hold the potential to transform our ability to capture and objectively measure patients’ actual experience of disease, with unprecedented implications for Parkinson’s drug development, diagnosis and treatment.”
“The variability in Parkinson’s symptoms creates unique challenges in monitoring progression of the disease,” said Diane Bryant, senior vice president and general manager of Intel’s Data Center Group. “Emerging technologies can not only create a new paradigm for measurement of Parkinson’s, but as more data is made available to the medical community, it may also point to currently unidentified features of the disease that could lead to new areas of research.”
Tracking an Invisible Enemy
For nearly two decades, researchers have been refining advanced genomics and proteomics techniques to create increasingly sophisticated cellular profiles of Parkinson’s disease pathology. Advances in data collection and analysis now provide the opportunity to expand the value of this wealth of molecular data by correlating it with objective clinical characterization of the disease for use in drug development.
The potential to collect and analyze data from thousands of individuals on measurable features of Parkinson’s, such as slowness of movement, tremor and sleep quality, could enable researchers to assemble a better picture of the clinical progression of Parkinson’s and track its relationship to molecular changes. Wearables can unobtrusively gather and transmit objective, experiential data in real time, 24 hours a day, seven days a week. With this approach, researchers could go from looking at a very small number of data points and burdensome pencil-and-paper patient diaries collected sporadically to analyzing hundreds of readings per second from thousands of patients and attaining a critical mass of data to detect patterns and make new discoveries.
MJFF and Intel initiated a study earlier this year to evaluate the usability and accuracy of wearable devices for tracking agreed physiological features from participants and to develop a big data analytics platform to collect and analyze the data. The participants (16 Parkinson’s patients and nine control volunteers) wore the devices during two clinic visits and at home continuously over four days.
Bret Parker, 46, of New York, is living with Parkinson’s and participated in the study. “I know that many doctors tell their patients to keep a log to track their Parkinson’s,” said Parker. “I am not a compliant patient on that front. I pay attention to my Parkinson’s, but it’s not everything I am all the time. The wearables did that monitoring for me in a way I didn’t even notice, and the study allowed me to take an active role in the process for developing a cure.”
Intel data scientists are now correlating the collected data to clinical observations and patient diaries to gauge the devices’ accuracy, and are developing algorithms to measure symptoms and disease progression.
Later this year, Intel and MJFF plan to launch a new mobile application that enables patients to report their medication intake as well as how they are feeling. The effort is part of the next phase of the study to enable medical researchers to study the effects of medication on motor symptoms via changes detected in sensor data from wearable devices.
Collecting, Storing and Analyzing the Data
To analyze the volume of data — more than 300 observations per second from each patient — Intel developed a big data analytics platform that integrates a number of software components including Cloudera® CDH* — an open-source software platform that collects, stores, and manages data. The data platform is deployed on a cloud infrastructure optimized on Intel® architecture, allowing scientists to focus on research rather than the underlying computing technologies. The platform supports an analytics application developed by Intel to process and detect changes in the data in real time. By detecting anomalies and changes in sensor and other data, the platform can provide researchers with a way to measure the progression of the disease objectively.
In the near future, the platform could store other types of data such as patient, genome and clinical trial data. In addition, the platform could enable other advanced techniques such as machine learning and graph analytics to deliver more accurate predictive models that researchers could use to detect change in disease symptoms. These advances could provide unprecedented insights into the nature of Parkinson’s disease, helping scientists measure the efficacy of new drugs and assisting physicians with prognostic decisions.
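The kind of change detection described – flagging samples that deviate sharply from a recent baseline – can be sketched as a rolling z-score. The window size and threshold here are assumptions for illustration, not Intel's actual algorithm:

```python
# Illustrative change detection over one wearable sensor channel: flag
# samples more than `threshold` standard deviations from the rolling mean.
from collections import deque
import math

def detect_changes(samples, window=5, threshold=3.0):
    """Return [(index, value), ...] for samples far outside the baseline."""
    recent = deque(maxlen=window)
    flagged = []
    for i, value in enumerate(samples):
        if len(recent) == recent.maxlen:
            mean = sum(recent) / len(recent)
            var = sum((x - mean) ** 2 for x in recent) / len(recent)
            std = math.sqrt(var)
            if std > 0 and abs(value - mean) / std > threshold:
                flagged.append((i, value))
        recent.append(value)
    return flagged
```

At 300+ observations per second per patient, the production version of such logic runs distributed across the cluster, but the statistical idea is the same.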
Shared Commitment to Open-Access Data
MJFF and Intel share a commitment to increasing the rate of progress made possible by open access to data. The organizations aim to share data with the greater Parkinson’s community of physicians and researchers as well as invite them to submit their own de-identified patient and subject data for analysis. Teams may also choose to contribute de-identified patient data for inclusion in broader, population-scale studies.
The Foundation has previously made de-identified data and bio-samples from its sponsored studies available to qualified researchers, including from individuals with a Parkinson’s-implicated mutation in their LRRK2 gene. MJFF has also opened access to resources from its landmark biomarker study the Parkinson’s Progression Markers Initiative (PPMI) since it launched in 2010. Parkinson’s scientists around the world have downloaded PPMI data more than 235,000 times to date.
Every Hadoop platform gives you scalability and flexibility. Cloudera makes Hadoop fast, easy, and secure.
Trap Questions:
Spark: What matters to you in supporting Spark and Hadoop?
Impala: How many BI users will you have? What additional budget have you allocated for Hive?
Kudu: How do you plan to address operational data warehouse / time series use cases?
Cloudera Navigator Optimizer: How do you know what data should be in Hadoop vs existing systems?
Trap Questions:
Cloudera Manager: How much downtime are you willing to accept during an upgrade? What if your operations tools fail during an outage? What does your team need to debug critical and latent issues?
Cloudera Director: Where is your data being created? How do you plan to manage across environments? Are you prepared to train staff on both Amazon and on-premises Hadoop platforms?
Expert Support: How can a core R&D group simultaneously respond to frequent customer issues and also build a culture of innovation? [only Cloudera has a back-line support team to address issues without bringing in R&D]
Trap Questions:
Navigator Encrypt/KeyTrustee: What is the impact of an information leak from intermediate MR results, log files, etc?
Sentry/RecordService: How are you planning to secure access to sensitive data across Hive and Spark?
Navigator: Do your governance needs extend beyond Hive?
Manager: How will you keep end users from damaging your production environments?