2. Agenda
IoT and IIoT
Data collection, storage, processing and visualization
Data Analytics
Cloud infrastructure and platform services for Analytics
Architecture - Example
3. IoT & IIoT
Billions of devices connect to servers over a network to deliver connected
industry solutions. Connectivity is only an enabler; the real value of
IoT lies in the data (business insight, data-driven economy).
IIoT is the use of smart sensors and actuators to enhance manufacturing and
industrial processes.
Industry 4.0 focuses on the interconnectedness of machines and systems
to improve operational efficiency and productivity.
4. IoT - Key Technology Enablers
(1) Cloud Computing
(2) Big Data and Analytics
(3) Web 2.0 and 3.0
(4) Evolution of high speed communication technologies
5. Technology Landscape
Layers:
• Industry verticals - Dashboard
• Platform and services
• Protocols and Communication
• Sensors, Devices and Gateway
Sensors: Light, Voltage, Temp/humidity, Vibration, Ultrasonic, Gas, BLE, GPS
Gateways: Pi 3 gateway, Gateway, PLC
Security:
• Username/Password
• API Security
• Data in Transit
• Data at Rest
• Firewall
• DoS prevention
• Certificates/Encryption
• Policies
• SSO/MFA
6. IoT - Sensors to Application
Flow: Edge Devices/Sensors (beacons, industrial plants, sensors; sensor/machine parameters) -> Device Gateway (edge analytics, platform agent, sensor data agent) -> IoT & Cloud platform (analyze, store) -> Application (alert, data visualization)
Edge devices/sensors are the source of the real-time data.
The device gateway collects data from multiple edge devices, then filters, aggregates and ingests it into the cloud
platform for further processing and analysis.
The IoT platform enables device onboarding, data ingestion, and device-to-cloud and cloud-to-device communication.
The cloud platform receives, stores and processes data and generates insights.
The application visualizes the dashboard and helps monitor and control the devices.
Diagram labels: Edge Devices/Sensors; Device Provision & Onboard; Data Ingestion; Rules; Device Management; Compute; Hosting; Security; Delivery; Monitor and Control
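The gateway role described on this slide (collect, filter, aggregate, ingest) can be sketched as follows. This is a minimal illustration only: the record layout, the device names, the plausible-value range and the `send` callable standing in for the cloud upload are all assumptions, not part of any specific IoT platform.

```python
from statistics import mean

def filter_readings(readings, low, high):
    """Drop readings whose value falls outside the plausible range [low, high]."""
    return [r for r in readings if low <= r["value"] <= high]

def aggregate_by_device(readings):
    """Collapse many raw readings into one summary record per device."""
    grouped = {}
    for r in readings:
        grouped.setdefault(r["device_id"], []).append(r["value"])
    return [
        {"device_id": dev, "count": len(vals), "mean": mean(vals),
         "min": min(vals), "max": max(vals)}
        for dev, vals in sorted(grouped.items())
    ]

def ingest(batch, send):
    """Pass each summary to `send`, a hypothetical stand-in for the cloud upload."""
    for record in batch:
        send(record)
    return len(batch)

# Hypothetical raw readings collected from two edge devices.
raw = [
    {"device_id": "temp-1", "value": 21.5},
    {"device_id": "temp-1", "value": 22.5},
    {"device_id": "temp-1", "value": -999.0},  # sensor glitch, filtered out
    {"device_id": "temp-2", "value": 30.0},
]
uploaded = []
summaries = aggregate_by_device(filter_readings(raw, low=-40.0, high=125.0))
sent = ingest(summaries, uploaded.append)
```

Aggregating at the gateway means the cloud receives one summary per device instead of every raw sample, which is the usual reason for placing filtering at the edge.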
7. IoT Sensors & Gateway
Phases: Acquire and Transmit -> Aggregate -> Analyze and Act
Elements: Beacon, Sensor, Device, Thing -> Gateway
Stages: Monitor, Transmit, Aggregate, Analyze, Send to Cloud
Gateway provides:
• Authentication
• Data Filtering
• Edge Analytics
• Control and management
Communication between sensors and gateway, and between gateway and cloud platform, uses:
Zigbee, BLE, Wi-Fi, RF, LoRa, MQTT, AMQP, CoAP, HTTP/HTTPS, NFC, TCP/UDP, UART, SPI
Different field sensors/devices
Sensors: temperature, pressure, accelerometer, vibration, RPM, beacons, etc.
Devices: camera, activity tracker, smart glasses, etc.
8. Data from Sensors/Devices
• Structured, semi-structured, or unstructured, or any combination of these varieties.
• Velocity, Variety and Volume
• What can we do with this vast amount of data?
  • Real-time streaming analysis and insights
  • Derive meaningful KPIs to help the business
  • Detect anomalies in operation and device behavior, and raise alerts
  • Predictive analysis
  • Visualization, control/operation and reporting
  • Quality inspection
  • Process automation
  • Improve quality of life
  • Research and improvements
• How?
  • Analyze the data
• How do we analyze terabytes of streaming data?
  • Large repositories
  • Complex data analysis techniques
  • Distributed/parallel processing
  • Data lake/warehousing/business intelligence
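The distributed/parallel processing idea can be illustrated on a single machine: split the data into chunks, summarize each chunk independently (map), then merge the partial results (reduce). The sketch below uses a thread pool only to show the structure; the chunk size, worker count and sample data are assumptions, and a real deployment would run the same pattern on a cluster framework.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def chunked(values, size):
    """Yield consecutive slices of `values` with at most `size` items each."""
    for start in range(0, len(values), size):
        yield values[start:start + size]

def summarize(chunk):
    """Map step: partial (count, total, maximum) for one chunk."""
    return (len(chunk), sum(chunk), max(chunk))

def merge(a, b):
    """Reduce step: combine two partial summaries into one."""
    return (a[0] + b[0], a[1] + b[1], max(a[2], b[2]))

def parallel_summary(values, chunk_size=1000, workers=4):
    """Summarize a non-empty list by processing its chunks concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(summarize, chunked(values, chunk_size)))
    return reduce(merge, partials)

# Hypothetical telemetry values standing in for a much larger data set.
readings = list(range(1, 10001))
count, total, maximum = parallel_summary(readings)
average = total / count
```

Because each partial summary is small and `merge` is associative, the chunks can be processed in any order and on any number of workers.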
9. Data Collection, Storage and processing
Pipeline: Data Collection -> Event Streaming/Data Storage -> Data Processing -> Descriptive Tasks, Predictive Tasks, Real-time Streaming Analytics
Data volumes are growing while data storage and computational processing are becoming cheaper.
Analyzing bigger and more complex data helps deliver faster and more accurate results.
Real-time Streaming Analytics
Process real-time data such as video, audio, application logs, website clickstreams, and IoT telemetry data.
Descriptive Analytics
What happened and why?
Find human-interpretable patterns that describe the data: correlations, trends, clusters, trajectories, and anomalies.
Summarize the underlying relationships in the data.
Predictive Analytics
Use some variables to predict unknown or future values of other variables.
10. Step 1: Data Collection
How do we collect continuous data from multiple sources/industry machines?
Choose the important parameters and monitor them
Select the frequency at which the data needs to be collected
Choose an appropriate cloud gateway/broker and streaming platform
Choose a suitable data lake to store the different types of data
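The first two collection choices (which parameters to monitor, and how often) can be sketched as a simple polling loop. Everything concrete here is hypothetical: the machine names, the reader functions, the parameter names and the interval are illustrative assumptions, and a real collector would publish to a gateway or broker rather than return a list.

```python
import time

def collect(sources, parameters, interval_s, samples, sleep=time.sleep, clock=time.time):
    """Poll the chosen parameters from every source at a fixed interval."""
    records = []
    for i in range(samples):
        timestamp = clock()
        for name, read in sources.items():
            reading = read()
            records.append({
                "source": name,
                "timestamp": timestamp,
                # Keep only the parameters selected for monitoring.
                **{p: reading[p] for p in parameters},
            })
        if i < samples - 1:
            sleep(interval_s)  # the collection frequency
    return records

# Hypothetical machines; each reader returns everything the machine exposes.
def press_1():
    return {"temperature": 71.2, "vibration": 0.03, "firmware": "1.4"}

def press_2():
    return {"temperature": 68.9, "vibration": 0.05, "firmware": "1.4"}

records = collect(
    sources={"press-1": press_1, "press-2": press_2},
    parameters=["temperature", "vibration"],
    interval_s=0.01,
    samples=3,
)
```

Passing `sleep` and `clock` as parameters keeps the loop easy to test without waiting in real time.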
11. Step 2: Data Storage
Requirements:
Relational/non-relational
Store large amounts of data
Real-time streaming data
Scalable, reliable and highly available
Multi-tenancy
Cost
Relational (SQL): structured, with defined attributes
MS SQL Server, PostgreSQL, Oracle Database
Can be queried using SQL
Non-relational (NoSQL): flexible schema
Uses a variety of data models, including document, graph, key-value and columnar
Each store has its own way of querying the data
MongoDB (document), Redis (key-value), Amazon Redshift (columnar data warehouse, queried with SQL), Cassandra
(wide-column), HBase (wide-column), DynamoDB (key-value and document; stores JSON), GraphDB (graph)
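The difference between the two storage styles can be shown with the Python standard library alone. The sketch below uses an in-memory SQLite database for the relational case and plain JSON documents for the document case; the table name, field names and sample values are assumptions chosen for illustration, and the JSON list merely imitates what a document store would hold.

```python
import json
import sqlite3

# Relational: a fixed schema with defined attributes, queried using SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (device_id TEXT, ts INTEGER, temperature REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [("pump-1", 1, 40.0), ("pump-1", 2, 44.0), ("pump-2", 1, 35.0)],
)
avg_by_device = dict(conn.execute(
    "SELECT device_id, AVG(temperature) FROM readings "
    "GROUP BY device_id ORDER BY device_id"
).fetchall())
conn.close()

# Document style: each record is a self-describing JSON document,
# so two devices can carry completely different fields.
documents = [
    json.dumps({"device_id": "pump-1", "ts": 1, "temperature": 40.0}),
    json.dumps({"device_id": "cam-7", "ts": 1, "frame": {"width": 640, "height": 480}}),
]
parsed = [json.loads(doc) for doc in documents]
with_frames = [d["device_id"] for d in parsed if "frame" in d]
```

The relational table rejects a camera frame unless the schema is changed first, while the document list accepts it as-is; the price is that queries must cope with missing fields.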
13. Processing - Tools/Framework
Processing Big Data
Apache Hadoop
Spark - distributed batch and stream processing
Storm - distributed stream processing
MongoDB
Cassandra
Talend - ETL
Kafka - Event streaming
Splunk - Log analysis platform
Hive - Data warehouse
HBase - NoSQL
Pig - Scripting
ZooKeeper - Centralized config & coordination
Streams and Complex Event Processing
Kafka, AWS Kinesis, JMS, Azure Event Hubs, Google Pub/Sub
Where do we install Big Data tools? Who provides the processing/memory/storage/network capabilities?
Language/Tools
Python
R (statistical computing)
MATLAB
Java
TensorFlow
Amazon Machine Learning
Spark MLlib
H2O
Azure ML Studio
In-memory
Distributed/parallel processing
Automated infrastructure/configuration
14. Streaming/Descriptive Analytics
Streaming Analytics
Analyzes and visualizes data in real time
Example: a production floor manager wants real-time insights and patterns from the
sensor data and acts on them
Equipment performance
Location-based services (LBS), in-context
Descriptive Analytics
Use historical data
Clustering
Association rule analysis
Anomaly detection
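Streaming analytics and anomaly detection can be combined in a small sketch: keep a sliding window of recent readings and flag any new value that lies far from the window's mean. The window size, the three-sigma threshold and the vibration values are assumptions for illustration; this is one simple technique, not the only way to detect anomalies.

```python
from collections import deque
from statistics import mean, stdev

def detect_anomalies(stream, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the recent rolling mean."""
    recent = deque(maxlen=window)
    for index, value in enumerate(stream):
        if len(recent) == window:
            mu = mean(recent)
            sigma = stdev(recent)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield index, value
                continue  # keep the outlier out of the baseline window
        recent.append(value)

# Hypothetical vibration readings with one spike.
vibration = [1.0, 1.1, 0.9, 1.0, 1.2, 1.1, 5.0, 1.0, 0.9, 1.1]
alerts = list(detect_anomalies(vibration))
```

Because the function is a generator, it processes one reading at a time and can be fed from a live stream instead of a list.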
15. Clustering
Finding groups of objects/points such that the objects/points in a group
are similar (or related) to one another and different from (or unrelated
to) the objects in other groups.
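As one concrete instance of clustering, the sketch below implements k-means (Lloyd's algorithm) in plain Python: assign each point to its nearest centroid, move each centroid to the mean of its points, and repeat until nothing changes. The sample points and the starting centroids are assumptions; fixed starting centroids are used only so the result is repeatable.

```python
from math import dist

def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm with explicit starting centroids (for repeatability)."""
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids

# Hypothetical 2-D points forming two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

Points in the same group end up with the same label, which matches the definition above: similar within a group, different across groups.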
16. Predictive Analytics
• What is most likely to happen next?
• Data/text mining, forecasting and statistical analysis
• Intelligent/scientific estimates of future values (e.g. customer demand, interest rates, stock
market movements).
• Deploy the model to support business decisions
• Example: predictive shipping
Workflow: Input Param 1, Input Param 2, Input Param 3 -> Build predictive model -> Model validation -> Model tuning/optimization -> Model deployment
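The build, validate, tune and deploy steps can be walked through with a one-variable linear regression. In this sketch the data rows, the parameter names and the train/validation split are all made up for illustration, and "tuning" is reduced to choosing which of the three input parameters predicts the target best; real tuning usually adjusts model hyperparameters as well.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x

def mean_abs_error(model, xs, ys):
    """Average absolute difference between predictions and actual values."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical history: three candidate inputs and the value to predict.
rows = [
    {"param1": 1, "param2": 5, "param3": 2, "target": 3},
    {"param1": 2, "param2": 3, "param3": 9, "target": 5},
    {"param1": 3, "param2": 8, "param3": 4, "target": 7},
    {"param1": 4, "param2": 1, "param3": 7, "target": 9},
    {"param1": 5, "param2": 6, "param3": 1, "target": 11},
    {"param1": 6, "param2": 2, "param3": 8, "target": 13},
    {"param1": 7, "param2": 7, "param3": 3, "target": 15},
    {"param1": 8, "param2": 4, "param3": 6, "target": 17},
]
train, valid = rows[:6], rows[6:]

candidates = {}
for name in ("param1", "param2", "param3"):
    # Build: fit a model on the training rows.
    model = fit_line([r[name] for r in train], [r["target"] for r in train])
    # Validate: measure the error on rows the model has not seen.
    error = mean_abs_error(
        model, [r[name] for r in valid], [r["target"] for r in valid]
    )
    candidates[name] = (error, model)

# Tune: keep the input whose model has the lowest validation error.
best_input = min(candidates, key=lambda name: candidates[name][0])
slope, intercept = candidates[best_input][1]

def predict(value):
    """Deploy: the selected model, exposed as a function for new data."""
    return slope * value + intercept
```

Validating on held-out rows rather than on the training rows is what keeps the selection honest: a model can fit its own training data well and still predict new data poorly.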
19. Classification and Regression
A predictive model is the output of an algorithm after it has been trained on a historical dataset;
it is applied to new data to estimate the likelihood of a particular outcome.
Classification is the task of assigning objects to one of several predefined
categories (classes), using algorithms and methods that predict categorical values.
Regression predicts continuous numeric values.
20. Classification and Regression
Learn Model (training set):
Tid  Attrib1  Attrib2  Attrib3  Class
1    Yes      Large    125K     No
2    No       Medium   100K     No
3    No       Small    70K      No
4    Yes      Medium   120K     No
5    No       Large    95K      Yes
6    No       Medium   60K      No
7    Yes      Large    220K     No
8    No       Small    85K      Yes
9    No       Medium   75K      No
10   No       Small    90K      Yes
Apply Model (test set):
Tid  Attrib1  Attrib2  Attrib3  Class
11   No       Small    55K      ?
12   Yes      Medium   80K      ?
13   Yes      Large    110K     ?
14   No       Small    95K      ?
15   No       Large    67K      ?
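The learn/apply split on this slide can be sketched with a k-nearest-neighbours classifier: the ten labelled records are the learned "model", and each unlabelled record takes the majority class of its closest neighbours. The slide does not say which algorithm it has in mind, so k-NN is an assumption here, as are the distance function and the way the attributes are encoded and scaled.

```python
from collections import Counter

# Training records from the slide: (Attrib1, Attrib2, Attrib3 in thousands, Class).
TRAINING = [
    ("Yes", "Large", 125, "No"),
    ("No", "Medium", 100, "No"),
    ("No", "Small", 70, "No"),
    ("Yes", "Medium", 120, "No"),
    ("No", "Large", 95, "Yes"),
    ("No", "Medium", 60, "No"),
    ("Yes", "Large", 220, "No"),
    ("No", "Small", 85, "Yes"),
    ("No", "Medium", 75, "No"),
    ("No", "Small", 90, "Yes"),
]
# Unlabelled records from the slide (Tid 11 to 15).
TEST = [
    ("No", "Small", 55),
    ("Yes", "Medium", 80),
    ("Yes", "Large", 110),
    ("No", "Small", 95),
    ("No", "Large", 67),
]
# Assumed ordinal encoding for Attrib2.
SIZE = {"Small": 0.0, "Medium": 0.5, "Large": 1.0}

def distance(a, b):
    """Mixed-attribute distance; the encoding and scaling are assumptions."""
    d1 = 0.0 if a[0] == b[0] else 1.0
    d2 = abs(SIZE[a[1]] - SIZE[b[1]])
    d3 = abs(a[2] - b[2]) / 100.0
    return d1 + d2 + d3

def predict(record, k=3):
    """Apply the model: majority class among the k nearest training records."""
    neighbours = sorted(TRAINING, key=lambda row: distance(record, row))[:k]
    votes = Counter(row[3] for row in neighbours)
    return votes.most_common(1)[0][0]

predictions = [predict(record) for record in TEST]
```

The predicted classes depend on the assumed distance function and on `k`, so they should be read as an illustration of the workflow rather than as the answer to the slide's question marks.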
21. Models/Algorithms
• Algorithm selection depends on
  • Accuracy
  • Training time
  • Number of parameters
  • Feature count
  • Memory footprint
  • Linear/non-linear data
• Algorithms
  • Linear Regression
  • Logistic Regression
  • Naive Bayes
  • K-Means
  • Random Forest
  • Support Vector Machine
  • Neural Network
22. Data Visualization
To capture and communicate insights from Big Data analytics, move from standard
reporting to more sophisticated visualization.
Visualization means presenting information in a way that people can consume
effectively.
The most impactful visualizations are often the most interactive:
users explore and have a conversation with the data.
It capitalizes on our visual ability to recognize and understand patterns, represents
a large amount of data in one place, and gives users access to actionable insights.
Examples: heat map, tag cloud, history flow
23. How to Visualize Raw/Processed/Analytics output?
Applications/Visualization Tools
Web Applications:
• Application that is accessed via a web browser over a
network
• JavaScript, CSS, and HTML5
• Web apps became really popular when HTML5 came
around and people realized that they can obtain
native-like functionality in the browser.
Native Applications:
• Native apps are written in languages that the platform
accepts
• Swift or Objective-C for iOS
• Java for Android
• C# for Windows Mobile
Hybrid Applications:
• Combination of native and web components
• Xamarin - Slack, Pinterest
• React Native - Facebook, Walmart, Tesla, and Airbnb
• Titanium - eBay, ZipCar, PayPal
• AngularJS - PubNub Chat, YouTube on PS3
Advanced BI tools: Power BI, QlikView, Tableau
24. IoT Platforms
AWS IoT
Azure IoT
GE Predix
IBM Watson IoT
Thingworx
Google Cloud IoT
Bosch IoT
Mindsphere
Alibaba IoT
C3 IoT
Jasper
SAP Leonardo
Telit IoT
• Ability to centrally manage multiple devices at scale, with remote
configuration, monitoring and decommissioning.
• Seamless connection from device to platform and from platform to device,
and direct connectivity between sensors and platform.
• Infrastructure and tools to manage, store and process streaming data
and analyze it in real time.
• Hosting in public, private or hybrid environments.
• Developer-friendly environment, programming and framework options to
develop, integrate, connect, host and run applications.
• Fine-grained security and data privacy.
25. Cloud Platforms
AWS
Azure
GCP
Alibaba
PaaS and IaaS
Ramp resources up or down as needed
Compute/memory/storage/GPU-optimized instances
Route the load to different instances
Virtual network environment
Environment to host and run applications
Auto scaling and load balancing
Disaster management
Automatic deployment with zero downtime
Scaling and elasticity
Failure fallback
Managed machine learning and deep learning
Notification/alert engine
API management
Identity and access management
Compute, storage, network
Serverless, microservices
OpenShift
Rackspace
Heroku
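The scaling and load-balancing capabilities listed for cloud platforms can be sketched in a few lines: decide how many instances the current load needs, then spread requests across them. The capacity figure, the instance limits, the instance names and the round-robin policy are all assumptions for illustration; managed cloud services apply their own, more elaborate policies.

```python
from itertools import cycle
from math import ceil

def desired_instances(total_load, capacity_per_instance, minimum=1, maximum=10):
    """Scale out or in so each instance stays within its capacity."""
    needed = ceil(total_load / capacity_per_instance)
    return max(minimum, min(maximum, needed))

def route(requests, instances):
    """Round-robin load balancing: hand requests to instances in turn."""
    assignment = {name: [] for name in instances}
    for request, name in zip(requests, cycle(instances)):
        assignment[name].append(request)
    return assignment

# Hypothetical load of 450 requests/s against instances that handle 100 each.
count = desired_instances(total_load=450, capacity_per_instance=100)
instances = [f"vm-{i}" for i in range(count)]
assignment = route(list(range(12)), instances)
```

The `minimum` and `maximum` bounds reflect the usual practice of keeping at least one instance running and capping cost when load spikes.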