A Short Course on the Internet of Things
Prasant Misra, Ph.D.
W: https://sites.google.com/site/prasantmisra
Course Content (240 mins)
[Diagram: networked field devices connect over the "last-mile" (field area network) to a gateway, and over a back-haul network to fog and cloud infrastructure]
 IoT Primer (40 mins.)
 History of Computing and Trends
 Industrial IoT and Industry 4.0
 IoT Architecture Primer (20 mins.)
 Functional Architecture
 IoT “Last-mile” Considerations (60 mins.)
 Field Devices & Platforms
 Field Device Stack
(PHY, MAC, NTWK, ROUTING, TRANSPORT, APP)
 IoT “Last-mile” Communication Nuances
(30 mins.)
 IoT “Last-mile” Existing and Upcoming
Standards (30 mins.)
 Derivatives for Intelligence (60 mins.)
 Nature of Data Analysis
 Intelligence with Machine Learning
2
Internet of Things : Primer
Setting the context …
4
[Chart: computing device size vs. year, across 1960-70, 1980-90, and 2000-10 and beyond]
History of Computing
Accessibility to cyber end points has increased drastically …
5
Trend-I: Device/Data Proliferation (by Moore’s Law)
Wireless Sensor Networks (WSN), Medical Devices, Industrial Systems, Portable Smart Devices, RFID
6
Trend-I: Device Proliferation
http://www.onethatmatters.com/wp-content/uploads/2015/12/Internet-of-Things-why.png
7
Trend-I: DATA Proliferation
Web & Social Media | Enterprises | Government
8
Trend-II: Integration at Scale (Isolation has cost !!!)
(World Wide) Sensor Web
(Feng Zhao)
Future Combat Systems
Ubiquitous embedded devices
• Large scale network embedded systems
• Seamless integration with the physical environment
Complex system with global integration
9
Trend-III: Evolution: Man vs. Machine
The exponential proliferation of embedded devices (courtesy of Moore’s Law) is NOT
matched by a corresponding increase in human ability to consume information !
Increase in Machine Autonomy !!!
10
Confluence of Trends
Trend-1 (Data & Device Proliferation) + Trend-2 (Integration at Scale) + Trend-3 (Autonomy)
→ Distributed, Information Distillation and Control Systems of Embedded Devices
11
Confluence of Technologies
CPS = Sensing & Actuation + Communication & Networking + Computation & Control
A cyber-physical system (CPS) refers to a tightly integrated system that is engineered with a
collection of technologies, and is designed to drive an application in a principled manner.
12
Functional Blocks of CPS
Enormous SCALE : both in space and time
13
Casting CPS Technology into Application Requirement
Use Case: Adaptive Lighting in Road Tunnels
Problem: Control the tunnel lighting levels in a manner that ensures continuity of light conditions
from the outside to the inside (or vice-versa) such that drivers do not perceive the tunnel as too
bright or dark.
Solution: Design a system that is able to account for the change in light intensity (i.e., detect physical
conditions and interpret), and adjust the illumination levels of the tunnel lamps (i.e., respond) till a
point along the length of the tunnel where this change is indiscernible to the drivers (i.e., reason and
control in an optimal manner).
15
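A minimal Python sketch of the sense-interpret-respond loop above; the function names, light levels, and controller gain are illustrative assumptions, not part of the course material:

```python
# Minimal sketch of the sense-interpret-respond loop for adaptive tunnel
# lighting. All names and constants are illustrative, not from the course.

def target_level(distance_m: float, outside_lux: float, depth_m: float = 200.0) -> float:
    """Desired lamp level fades linearly from the outside light level
    at the entrance to a fixed interior level deep inside the tunnel."""
    interior_lux = 80.0
    frac = min(distance_m / depth_m, 1.0)        # 0 at entrance, 1 deep inside
    return (1.0 - frac) * outside_lux + frac * interior_lux

def control_step(current_lux: float, desired_lux: float, gain: float = 0.3) -> float:
    """Proportional controller: nudge the lamp toward the setpoint."""
    return current_lux + gain * (desired_lux - current_lux)

# One control iteration for a lamp 50 m into the tunnel on a bright day
lamp = control_step(current_lux=400.0, desired_lux=target_level(50.0, 10000.0))
print(f"new lamp setting: {lamp:.0f} lux")
```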
Casting CPS Technology into Application Requirement
Use Case: Smart Buildings/Homes
Problem: How to make buildings/homes (both new and existing) ‘smarter’ ?
• Energy efficient
• Damage prevention
• Increased comfort
16
Beaming from CPS to IoT : The SCALE is even BIGGER !!!
[Diagram: CPS units pair cyber components (C1 … Cn) with physical components (P1 … Pn); interconnecting many such units through the Internet yields a Network of Things (NoT) spanning the cyber and physical worlds]
IoT = CPS + People ‘in-the-loop’ (that act as sensors, actuators, controllers)
IoT = CPS + Hybrid (tight and loose) sense of control
17
CPS & IoT
 Gives us the ability to look more broadly (SCALE), deeply (PRECISION) and
over extended periods of time at the physical world
 As a result, our interactions with the physical world have increased !!!
Example of a Killer APP: Navigation System
18
Navigation System - I
Context Service Example
Current
Location
Local business
19
Context Service Example
Current
Location
Local business and
directions
+
Time
Tracks businesses in driving direction
Navigation System - II
20
Context Service Example
Current
Location
Local business and
directions
+
Time
Tracks businesses in driving direction
+
History
Personalized
directions
 Take 520 East
Navigation System - III
21
Context Service Example
Current
Location
Local business and
directions
+
Time
Tracks businesses in driving direction
+
History
Personalized
directions
+
Community
Tourist
recommendation
35% people pick
the scenic route
Navigation System - IV
22
Alert: Bad
Traffic
Consider
Alternate
route
Context Service Example
Current
Location
Local business and
directions
+
Time
Tracks businesses in driving direction
+
History
Personalized
directions
+
Community
Tourist
recommendation
+
Push
alerts, triggers,
reminders
Navigation System - V
23
Some formalism and SYSTEMS feel …
24
IoT: Vision and Value Proposition
Vision:
Build a ubiquitous society where everyone (“people”) and everything (“systems,
machines, equipment and devices") is immersively connected.
Value Proposition:
 Connected “Things” will provide utility to “People”
 Digital shadow of “People” will provide value to the “Enterprise”
25
How BIG is IoT ?
26
IoT Applications
27
The FORTUNE TELLER or NOT …
IIoT and Industry 4.0 is ALL about re-imagination !!!
 Improve flexibility, reliability and time to market/scale
 Improve customer intimacy and profitability
 Improve revenue and market position
28
Is the Internet of Things disruptive?
OR
Is it just repackaging known technologies
and making them a little better?
What is your take ?
29
Internet of Things :
Architectural Design Primer
High-level Functional Architecture
DATA @ REST (VOLUME)
Archival/Static data (TBs) in Data
stores
DATA @ MOTION (VELOCITY)
Streaming data
DATA @ MANY FORMS (VARIETY)
Structured/Unstructured, Text,
Multimedia, Audio, Video
DATA @ DOUBT (VERACITY)
Data with uncertainty that may be
due to incompleteness, missing
points, etc.,
PRESCRIPTIVE
What are the best outcomes ?
PREDICTIVE
What could happen ?
DESCRIPTIVE
What has happened ?
DISCOVERY
What do we have ?
NATURE of INGESTED DATA
NATURE of ANALYSIS
DATA
KNOWLEDGE
31
Detailed Functional Architecture
[Diagram: three parallel "last-mile" segments (networked field devices on field area networks, each behind a gateway) connect over back-haul networks to shared fog and cloud infrastructure]
32
Functional Architecture Layers and their Key Physical Attributes
| Physical Attribute | Functionality |
| Field Devices (with Sensing, Compute and Actuation HW) | Sense, Actuate, Control |
| Last-mile connectivity via Gateway (PAN, HAN, FAN, NAN, CAN, WAN, etc.) | Connection Management, Routing |
| Data Storage | Ingestion, Semantics, Transformation |
| Common Service Functions | Interoperability, Security, Access Control |
| Business Logic & Related Functions | Business Logic, Orchestration |
| Users | Input, Output, Transform |
33
Recap : Functional Architecture
Service Oriented
Approach
Application & Business Architecture
Describing the service strategy, the organizational,
functional process information and geographic aspects of
the environment based on the strategic goals and
strategic drivers
Information Systems Architecture
Describing information/data structure and semantics,
types and sources of data necessary to support various
smart applications
Data Access Architecture
Describing technical components (software, hardware),
access technology and data aggregation policies
Information Security
characterized by:
• Availability
• Integrity
• Confidentiality
Interoperability
characterized by:
• Syntactic
• Semantic
34
Internet of Things :
“Last-mile” Considerations
What ROUTE are we going to take ?
36
Popular Communication, Networking, and Control Standards
for Industrial Systems
37
“GO Wireless” !!!
38
“GO Wireless” and “GO w or w/o IP” !!!
39
“Last-mile” Consideration
w.r.t.
Low-power, Wireless,
Constrained Field Devices &
Networks
40
 Consists of many embedded units called sensor nodes, motes, etc.
 Sensors (and actuators)
 Small microcontroller
 Limited memory
 Radio for wireless communication
 Power source (often battery)
 Communication centric systems
 Motes form networks, and transport sensor data to a base station in a one-hop or
multi-hop fashion
Background: Wireless Sensor Networks (WSN)
41
• Processing speed ?
• Memory ?
• Storage ?
• Power consumption ?
Example platforms: BTNode, MicaZ, dotMote, Fleck, Tmote Sky
Typical building blocks: Microcontroller, Radio, Sensors/Actuators, Storage, Power Source
Architecture: WSN platforms
42
WSN Node: Core Features
Limited Energy Reserves – PREMIUM resource
Under MAC Control
[Table: per-platform hardware specs (MCU word size, RAM/flash in KBytes, peripheral counts) not recoverable from the extraction]
43
Sensor Web: Field Device Stack
L1: PHY
L2: MAC
L3: NETWORK
L4: ROUTING
L5: TRANSPORT
L6: APP
Do we need a LAYERED approach @
the Field Device level ?
44
PHYsical Layer Technologies
45
(Popular) Short and Medium Range Low Power Wireless Technology
| Technology | Standard Body | Frequency Band | Max Range | Max Data Rate | Max Power | Network Type |
| Bluetooth | Bluetooth SIG | 2.4 GHz ISM | 100 m | 1-3 Mbps | 1 W | WPAN |
| Bluetooth Smart | IoT Interconnect | 2.4 GHz ISM | 35 m | 1 Mbps | 10 mW | WPAN |
| ZigBee | IEEE 802.15.4, ZigBee Alliance | 2.4 GHz ISM | 160 m | 250 Kbps | 100 mW | Star, Mesh |
| Wi-Fi | IEEE 802.11 g/n/ac/ad | 2.4/5/60 GHz | 100 m | 6-780 Mbps, 6 Gbps @ 60 GHz | 1 W | Star, Mesh |
| Z-Wave | Z-Wave | 908 MHz | 30 m | 100 Kbps | 1 mW | Star, Mesh |
| ANT+ | ANT Alliance | 2.4 GHz | 100 m | 1 Mbps | 1 mW | Star, Mesh |
| RuBee | IEEE 1902.1, IEEE 1902.2 | 131 kHz | 5 m | 1.2 Kbps | 40-50 nW | P2P |
46
Low Power Wide Area Networking Technology
| Technology | Standards/Governing Body | Frequency Band | Max Range | Max Data Rate | Topology | Devices/Access Point |
| Weightless | Weightless SIG | SubGHz ISM, TV Whitespaces | 2-5 km (urban) | 200 bps – 100 Kbps; W: 1 Kbps – 10 Mbps | Star | Unlimited |
| LoRaWAN | LoRa Alliance | 433/780/868/915 MHz ISM | 2.5-15 km | 0.3-50 Kbps | Star | 1 million |
| SigFox | SigFox | Ultra narrow band | 30-50 km (rural), 3-10 km (urban) | 100 bps | Star | 1 million |
| WiFi LowPower | IEEE P802.11ah | SubGHz | 1 km (outdoor) | 150-340 Kbps | Star, Tree | - |
| Dash7 | Dash7 Alliance | 433/868/915 MHz | 2 km | 9.6/56/167 Kbps | Star, Tree | - |
| LTE Cat-0 | 3GPP R-13 | Cellular | 2.5-5 km | 200 Kbps | Star | > 20,000 |
| UMTS (3G), HSDPA/HSUPA | 3GPP | Cellular | 27 km, 10 km | 0.73-56 Mbps | Star | Hundreds per cell |
47
Taxonomy of Key IoT Wireless Technologies
48
Low Power Communication Technologies: Frequency
49
Low Power Communication Technologies: Data Rate
50
Low Power Communication Technologies: Range
51
Low Power Communication Technologies: Energy
52
Internet of Things : “Last-mile” Considerations
Case study with IEEE 802.15.4
54
Existing Stack using IEEE 802.15.4 as the PHY Layer
L6: APP
L5: TRANSPORT
L4: ROUTING
L3: NETWORK
L2: MAC
L1: PHY ← IEEE 802.15.4 PHY
55
IEEE 802.15.4: Quick Facts
IEEE 802.15.4
 Offers physical and media access control layers for low-speed, low-power wireless personal
area networks (WPANs)
 16 non-overlapping channels, spaced 5 MHz apart, occupying frequencies 2405-2480 MHz
 Provides a physical layer bandwidth of 250 kbps
 Shares the same frequency band as IEEE 802.11 and Bluetooth
56
IEEE 802.15.4: Radio Characteristics
57
IEEE 802.15.4: Device Classes
Full Function Device (FFD)
 Any topology
 PAN coordinator capable
 Talks to any other device
 Implements complete protocol set
Reduced Function Device (RFD)
 Reduced protocol set
 Very simple implementation
 Cannot become a PAN coordinator
 Limited to leaf roles in more complex topologies
58
IEEE 802.15.4: Topology Types
Star Topology
 All nodes communicate via the central PAN coordinator
 Leafs may be any combination of FFD and RFD devices
 PAN coordinator usually has a reliable power source
Peer-to-Peer Topology
 Nodes can communicate via the central PAN coordinator
and via additional point-to-point links
 Extension of the pure star topology
Cluster Tree Topology
 Leaf nodes connect to a network of coordinators (FFDs)
 One of the coordinators serves as the PAN coordinator
 Clustered star topologies are an important case
(e.g., each hotel room forms a star in a HVAC system)
59
IEEE 802.15.4: Frame Formats
 Max. frame size: 127 octets
 Max. frame header: 25 octets
60
IEEE 802.15.4: Frame Formats
 Beacon Frames
Broadcasted by the coordinator to organize the network
 Command Frames
Used for association, disassociation, data and beacon requests, conflict notification, . . .
 Data Frames
Carrying user data
 Acknowledgement Frames
Acknowledges successful data transmission (if requested)
61
Link Layer Protocols (L2: MAC, atop the IEEE 802.15.4 PHY)
62
 Why do we need MAC ?
 Wireless channel is a shared medium
 Radios, within the communication range of each other and operating in the same
frequency band, interfere with each other's transmissions
 Interference -> Collision -> Packet Loss -> Retransmission -> Increase in net energy
 The role of MAC
 Co-ordinate access to and transmission over the common, shared (wireless) medium
 Can traditional MAC methods be directly applied to WSN ?
 Control -> often decentralized
 Data -> low load but convergecast communication pattern
 Links -> highly volatile/dynamic
 Nodes/Hops -> Scale is much larger
 Energy is the BIGGEST concern
 Network longevity, reliability, fairness, scalability and latency
are more important than throughput
MAC is Crucial !!!
63
MAC Family
Reservation
(Scheduled, Synchronous)
Contention
(Unscheduled, Asynchronous)
 Reservation-based
 Nodes access the channel based on a schedule
 Examples: TDMA
 Limits collisions, idle listening, overhearing
 Bounded latency, fairness, good throughput (in loaded traffic conditions)
 Saves node power by putting nodes to sleep until needed
 Low idle listening
 Dependencies: time synchronization and knowledge of network topology
 Not flexible under conditions of node mobility, node redeployment and node death:
complicates schedule maintenance
 Contention-based
 Nodes compete (in probabilistic coordination) to access the channel
 Examples: ALOHA (pure & slotted), CSMA
 Time synchronization “NOT” required
 Robust to network changes
 High idle listening and overhearing overheads
Taxonomy
64
MAC: Reservation vs. Contention
65
 Collisions
 Node(s) is/are within the range of nodes that are transmitting at the same time -> retransmissions
 Overhearing
 A node receives a packet for which it is not the intended receiver
 Overhead
 Arising from control packets such as RTS/CTS
 E.g.: exchange of RTS/CTS induces high overheads in the range of 40-75% of the channel capacity
 Idle Listening
 Listening to possible traffic that is not sent
 Most significant source of energy consumption
| Function | Protocols |
| Reduce Collisions | CSMA/CA, MACA, Sift |
| Reduce Overheads | CSMA/ARC |
| Reduce Overhearing | PAMAS |
| Reduce Idle Listening | PSM |
Causes of Energy Consumption
66
Low-power, Constrained Field Devices MAC Family
Scheduled
(periodic, high-load traffic)
Common Active Periods
(medium-load traffic)
Preamble Sampling
(rare reporting events)
67
 Build a schedule for all nodes
 Time schedule
 no collisions
 no overhearing
 minimized idle listening
 bounded latency, fairness, good throughput (in loaded traffic conditions)
 BUT: how to setup and maintain the schedule ?
| Function | Protocols |
| Canonical Solution | TSMP, IEEE 802.15.4 |
| Centralized Scheduling | Arisha, PEDAMACS, BitMAC, G-MAC |
| Distributed Scheduling | SMACS |
| Localization-based Scheduling | TRAMA, FLAMA, uMAC, EMACs, PMAC |
| Rotating Node Roles | PACT, BMA |
| Handling Node Mobility | MMAC, FlexiMAC |
| Adapting to Traffic Changes | PMAC |
| Receiver Oriented Slot Assignment | O-MAC |
| Using different frequencies | PicoRadio, Wavenis, f-MAC, Multichannel LMAC, MMSN, Y-MAC, Practical Multichannel MAC |
| Other functionalities | LMAC, AI-LMAC, SS-TDMA, RMAC |
Scheduled MAC Protocols
68
Time Synchronized Mesh Protocol (TSMP): Overview
 Goal: High end-to-end reliability
 Major Components
 time synchronized communication (medium access)
 TDMA-based: uses timeslots and time frames
 Synchronization is achieved by exchanging offset information (and not by
beaconing strategies)
 frequency hopping (medium access)
 automatic node joining and network formation (network)
 redundant mesh routing (network)
 secure message transfer (network)
 Limitations
 Complexity in infrastructure-less networks
 Scaling is a challenge
 Finding a collision-free schedule is a two-hop coloring problem
 Reduced flexibility to adapt to dynamic topologies
69
 Nodes define common active/sleep periods
 active period -> communication, where nodes contend for the channel
 sleep period -> saving energy
 need to maintain a common time reference across all nodes
| Function | Protocols |
| Canonical Solution | SMAC |
| Increasing Flexibility | TMAC, E2MAC, SWMAC |
| Minimizing Sleep Delay | Adaptive listening, nanoMAC, DSMAC, FPA, DMAC, Q-MAC |
| Handling Mobility | MSMAC |
| Minimizing Schedules | GSA |
| Statistical Approaches | RL-MAC, U-MAC |
| Using Wake-up Radio | RMAC, E2RMAC |
Common Active Period MAC Protocols
70
 Goal: reduce energy consumption, while supporting good scalability and collision
avoidance
 Major Components
 periodic listen and sleep
 Copes with idle listening: uses a scheme of active (listen) and sleep periods
 Active periods are fixed; sleep periods depend on a predefined duty-cycle parameter
 Synchronization is used to form virtual clusters of nodes on the same sleep schedule
 Schedules coordinate nodes to minimize additional latency
 collision and overhearing avoidance
 Adopts a contention-based scheme
 In-channel signaling is used to put each node to sleep when its neighbor is transmitting to
another node; thus, avoids the overhearing problem but does not require an additional
channel
 message passing
 Small packets transmitted in bursts
 RTS/CTS reserves the channel for the whole burst duration rather than for each packet;
hence unfair at the per-hop MAC level
Sensor MAC (S-MAC): Overview
71
 Periodic Listen and Sleep
 Each node goes to sleep for some time, and then wakes up and listens to see if any other
node wants to talk to it. During sleep, the node turns off its radio, and sets a timer to awake
itself later.
 Maintain Schedules
 Maintain Synchronization
S-MAC - I
72
 Collision and Overhearing Avoidance
 Adopts a contention based scheme
 Collision Avoidance
 Overhearing Avoidance
 Basic Idea
 A node can go to sleep whenever its neighbor is talking with another node
 Who should sleep?
 The immediate neighbors of sender and receiver
 How do they know when to sleep?
 By overhearing RTS or CTS
 How long should they sleep?
 Network Allocation Vector (NAV)
 Message Passing
 How to transmit a long message?
 Transmit it as a single long packet
 Easily corrupted
 Transmit as many independent packets
 Higher control overhead & longer delay
 Divide into fragments, but transmit all in burst
S-MAC - II
73
 Adaptive duty cycle: duration of the active period is no longer fixed but varies according
to traffic
 Prematurely ends an active period if no traffic occurs for a duration of TA
Timeout MAC (TMAC): Overview
74
 Goal: minimize idle listening -> minimize energy consumption
 Operation
 Node periodically wakes up, turns radio on and checks the channel
 Wakeup time is fixed (time spent sampling RSSI)
 "Check interval" is variable
 If energy is detected, node powers up in order to receive the packet
 Node goes back to sleep:
 if a packet is received, or
 after a timeout
 Preamble length matches the channel "check interval"
 No explicit synchronization required
 Noise floor estimation used to detect channel activity during LPL
Preamble Sampling MAC Protocols
75
| Function | Protocols |
| Canonical Solution | Preamble-Sampling ALOHA, Preamble-Sampling CSMA, Cycled Receiver, LPL, Channel polling |
| Improving CCA | BMAC |
| Adaptive Duty Cycle | EA-ALPL |
| Reducing Preamble Length by Packetization | X-MAC, CSMA-MPS, TICER, WOR, MH-MAC, DPS-MAC, CMAC, GeRAF, 1-hopMAC, RICER, SpeckMAC-D, MX-MAC |
| Reducing Preamble Length by Piggybacking Synchronization Information | WiseMAC, RATE EST, SP, SyncWUF |
| Use Separate Channels | STEM |
| Avoiding Unnecessary Reception | MFP, 1-hopMAC |
Drawbacks:
 Costly collisions
 Longer preamble leads to higher probability of collision in applications with considerable traffic
 Limited duty cycle
 “Check interval” period cannot be arbitrarily increased -> longer preamble length
 Overhearing problem
 The target receiver has to wait for the full preamble before receiving the data packet: the per-
hop latency is lower bounded by the preamble length. Over a multi-hop path, this latency can
accumulate to become quite substantial.
Preamble Sampling MAC Protocols
76
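A back-of-the-envelope sketch of the check-interval trade-off just described: rarer channel polls are cheaper, but force longer (costlier) preambles per packet. All currents and durations below are assumed values, chosen only to show the shape of the trade-off:

```python
# Longer check interval -> fewer (cheap) channel polls but a longer (costly)
# preamble per transmitted packet. All numbers are assumed, for shape only.

def avg_current_mA(check_interval_s, pkts_per_hour):
    poll_mA, poll_s = 15.0, 0.003          # assumed cost of one channel sample
    tx_mA = 20.0                            # assumed radio TX current
    sleep_mA = 0.005
    preamble_s = check_interval_s           # preamble must span one full interval
    polling = poll_mA * poll_s / check_interval_s
    sending = tx_mA * preamble_s * pkts_per_hour / 3600.0
    return polling + sending + sleep_mA

for ci in (0.05, 0.1, 0.2, 0.5, 1.0):
    print(f"check interval {ci:4.2f} s -> {avg_current_mA(ci, pkts_per_hour=200):.3f} mA")
# note the minimum at a mid-sized interval: neither extreme is energy-optimal
```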
Goals:
 Simple and predictable; Effective collision avoidance by improving CCA
 Tolerable to changing RF/networking conditions
 Low power operation; Scalable to large numbers of nodes; Small code size and RAM usage
CCA
 MAC must accurately determine if channel is clear
 Need to tell what is noise and what is a signal
 Ambient noise is prone to environmental changes
 BMAC solution: ‘software automatic gain control’
 Signal strength samples taken when channel is assumed to be free – When?
 immediately after transmitting a packet
 when the data path of the radio stack is not receiving valid data
 Samples go in a FIFO queue (sliding window)
 Median added to an EWMA (exponentially weighted moving average with decay α) filter
 Once the noise floor is established (what is a good estimate?), a TX request starts monitoring
RSSI from the radio
CCA: Thresholding vs. Outlier Detection
 Common approach: take single sample, compare to noise floor
 Large number of false negatives
 BMAC: search for outliers in RSSI
 If a sample has significantly lower energy than the noise floor during the sampling period, then
channel is clear
Berkeley MAC (BMAC): Overview
77
 0=busy, 1=clear
 Packet arrives between 22 and 54 ms
 Single-sample thresholding produces several false ‘busy’ signals
BMAC
78
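A rough sketch of the two BMAC ideas above: a median-filtered EWMA noise floor, and outlier-based clear-channel assessment. The window size, decay, and margin are assumed values, not BMAC's actual constants:

```python
# (1) track the noise floor with a median-filtered EWMA over samples taken
# when the channel is assumed free; (2) declare the channel clear if a
# recent sample is a significant outlier *below* the floor. Assumed values.
from statistics import median

class BmacCCA:
    def __init__(self, alpha=0.06, margin_db=3.0):
        self.alpha, self.margin_db = alpha, margin_db
        self.noise_floor = None
        self.window = []

    def update_noise_floor(self, rssi_dbm):
        """Feed RSSI sampled when the channel is assumed free."""
        self.window = (self.window + [rssi_dbm])[-5:]   # sliding window
        med = median(self.window)
        if self.noise_floor is None:
            self.noise_floor = med
        else:
            self.noise_floor += self.alpha * (med - self.noise_floor)

    def channel_clear(self, recent_rssi):
        """Outlier detection: clear if any sample is well below the floor."""
        return any(r < self.noise_floor - self.margin_db for r in recent_rssi)

cca = BmacCCA()
for r in (-95, -94, -96, -95, -94):
    cca.update_noise_floor(r)
print(cca.channel_clear([-94, -99]))   # True: one sample well below the floor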
 Series of short preamble packets each containing target address information
 Minimize overhearing problem
 Reduce latency and reduce energy consumption
 Strobed preamble: pauses in the series of short preamble packets
 Target receiver can shorten the strobed preamble via an early ACK
 Small pauses between preamble packets permit the target receiver to send an early ACK
 Reduces latency for the case where destination is awake before preamble completes
 Non-target receivers that
overhear the strobed preamble
can go back to sleep immediately
 Preamble period must
be greater than sleep period
 Reduces per-hop latency and energy
XMAC: Overview
79
Wireless Sensor (Wise) MAC: Overview
WiseMAC uses a scheme that learns the sampling schedule of direct neighbors and exploits
this knowledge to minimize the wake-up preamble length
 ACK packets, in addition to carrying the acknowledgement for a received data packet, also have
information about the next sampling time of that node
 Node keeps a table of the sampling time offsets of all its usual destinations up-to-date
 Node transmits a packet just at the right time, with a wake-up preamble of minimized size
80
Wireless Sensor (Wise) MAC: I
How does the system cope with Clock drifts ?
 Clock drifts may make the transmitter lose accuracy about the receiver’s wakeup time.
 Transmitter uses a preamble that is just long enough to make up for the estimated maximum clock
drift.
 The length of the preamble used in this case depends on clock drifts: the smaller the clock drift, the
shorter the preamble the transmitter has to use.
What if the node has no information about the wakeup time of a neighbor node ?
 Node uses a full-length preamble
81
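A worked version of the drift argument, using the commonly cited WiseMAC preamble bound min(4·θ·L, T_W), where θ is the clock drift, L the time since the neighbor's schedule was last learned, and T_W the sampling period; the numbers are illustrative:

```python
# WiseMAC-style preamble sizing: cover the estimated maximum clock drift,
# but never exceed a full-length preamble (one sampling period).

def preamble_s(theta_ppm, since_update_s, sampling_period_s):
    drift_window = 4 * (theta_ppm * 1e-6) * since_update_s
    return min(drift_window, sampling_period_s)

print(preamble_s(30, since_update_s=60, sampling_period_s=0.5))      # ~0.0072 s
print(preamble_s(30, since_update_s=10_000, sampling_period_s=0.5))  # capped at 0.5 s
```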
| Function | Protocols |
| Flexible MAC Structure | IEEE 802.15.4 |
| CSMA inside TDMA Slots | ZMAC |
| Minimizing Convergecast Effect | Funneling MAC, MH-MAC |
| Slotted and Sampling | SCP |
| Receiver based Scheduling | Crankshaft |
Hybrid Protocols
82
Funneling MAC: Overview
Convergecast Comms
High traffic intensity:
80% of packet loss happens in the
2-hop region from the SINK
83
IEEE 802.15.4 MAC: Overview
 Two different channel access methods
 Beacon-Enabled duty-cycled mode (typically, used in FFD networks)
 Non-Beacon Enabled mode (aka Beacon Disabled mode)
84
IEEE 802.15.4 Beacon Enabled Mode
CAP: Contention Access Period | CFP: Collision Free Period | GTS: Guaranteed Time Slot
 Nodes listen to the Beacon and check IF a GTS is reserved
 If YES: remain powered off until GTS is scheduled
 If NO: Performs CSMA/CA during CAP
 Synchronization
 Sync with Tracking Mode
 Sync with Non Tracking Mode
85
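During the CAP, nodes contend with slotted CSMA/CA; the sketch below shows the simpler unslotted variant used in non-beacon-enabled mode. Constants follow the standard's defaults; cca() and send() are stand-ins for radio primitives:

```python
# Unslotted CSMA/CA as used in IEEE 802.15.4 non-beacon-enabled mode.
import random

UNIT_BACKOFF_S = 20 * 16e-6          # 20 symbols @ 16 us/symbol (2.4 GHz PHY)
macMinBE, macMaxBE, macMaxCSMABackoffs = 3, 5, 4

def csma_ca_transmit(cca, send):
    nb, be = 0, macMinBE
    while True:
        wait = random.randrange(2 ** be) * UNIT_BACKOFF_S   # random backoff
        # (a real MAC would sleep for `wait` seconds here)
        if cca():                      # channel clear -> transmit
            send()
            return True
        nb, be = nb + 1, min(be + 1, macMaxBE)
        if nb > macMaxCSMABackoffs:    # give up: channel access failure
            return False

ok = csma_ca_transmit(cca=lambda: random.random() < 0.6, send=lambda: None)
print("transmitted" if ok else "channel access failure")
```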
A Tribute to Fieldbus Technology …
86
Milestones of Fieldbus Evolution and Related Fields
87
MAC Strategies in Fieldbus systems
88
Traffic Classes
89
IP over IEEE 802.15.4
(L3: NETWORK in the field device stack: IPv6 over IEEE 802.15.4)
90
Field Devices: Network Topology Planning
 STAR topologies are the easiest to set up and manage
 STAR will simplify the network design; if there is just 1-hop communication between the field
devices and the gateway, then the need for the "routing layer" on the stack of the field devices
may not arise ... thereby making it more energy efficient and lightweight.
 TREE and MESH are also interesting concepts, but they are very tedious to manage.
91
IPv6 over IEEE 802.15.4 (6LoWPAN)
Benefits of IP over 802.15.4 (RFC 4919)
 The pervasive nature of IP networks allows use of existing infrastructure
 IP-based technologies already exist, are well-known, and proven to be working
 Open and freely available specifications vs. closed proprietary solutions
 Tools for diagnostics, management, and commissioning of IP networks already exist
 IP-based devices can be connected readily to other IP-based networks, without the need
for intermediate entities like translation gateways or proxies
92
6LoWPAN Challenge
Header Size Calculation
 IPv6 header is 40 octets, UDP header is 8 octets
 802.15.4 MAC header can be up to:
 25 octets (null security)
 25+21=46 octets (AES-CCM-128)
 With the 802.15.4 frame size of 127 octets, the following space is left for application data:
 127-25-40-8 = 54 octets (null security)
 127-46-40-8 = 33 octets (AES-CCM-128)
IPv6 MTU Requirements
 IPv6 requires that links support an MTU of 1280 octets
 Link-layer fragmentation / reassembly is needed
93
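The header arithmetic above, spelled out as a tiny script:

```python
# 6LoWPAN payload budget (all sizes in octets).
FRAME = 127
IPV6_HDR, UDP_HDR = 40, 8

for label, mac_hdr in (("null security", 25), ("AES-CCM-128", 25 + 21)):
    payload = FRAME - mac_hdr - IPV6_HDR - UDP_HDR
    print(f"{label:13s}: {payload} octets for application data")
# null security: 54 octets for application data
# AES-CCM-128  : 33 octets for application data
```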
6LoWPAN Overview (RFC 4944)
Overview
 An adaptation layer allowing transport of IPv6 packets over 802.15.4 links
 Uses 802.15.4 in unslotted CSMA/CA
 Based on IEEE standard 802.15.4-2003
 Fragmentation / reassembly of IPv6 packets
 Compression of IPv6 and UDP/ICMP headers
 Mesh routing support (mesh under)
 Low processing / storage costs
94
6LoWPAN Dispatch Codes
 All 6LoWPAN encapsulated datagrams are prefixed by an encapsulation header stack
 Each header in the stack starts with a header type field followed by zero or more
header fields
95
6LoWPAN Frame Formats
Uncompressed IPv6/UDP (worst case scenario)
 Dispatch code 0b01000001 indicates no compression
 Up to 54 / 33 octets left for payload with a max. size MAC header with null / AES-CCM-128
security
 The ratio of header information to application payload is clearly very poor
96
6LoWPAN Frame Formats
Compressed Link-local IPv6/UDP (best case scenario)
 Dispatch code 0b01000010 indicates HC1 compression
 HC1 compression may indicate HC2 compression follows
 This shows the maximum compression achievable for link-local addresses (does not work
for global addresses)
 Any non-compressible header fields are carried after the HC1 or HC1/HC2 tags (partial
compression)
97
Header Compression, Fragmentation & Reassembly
Compression Principles (RFC 4944)
 Omit any header fields that can be calculated from the context, send the remaining
fields unmodified
 Nodes do not have to maintain compression state (stateless compression)
 Support (almost) arbitrary combinations of compressed / uncompressed header fields
Fragmentation Principles (RFC 4944)
 IPv6 packets too large to fit into a single 802.15.4 frame are fragmented
 A first fragment carries a header that includes the datagram size (11 bits) and a
datagram tag (16 bits)
 Subsequent fragments carry a header that includes the datagram size, the datagram
tag, and the offset (8 bits)
 Time limit for reassembly is 60 seconds
98
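A sketch of packing the RFC 4944 fragmentation headers described above (FRAG1 is 4 octets, FRAGN is 5; the offset travels in units of 8 octets). The field layout follows the RFC; the example values are arbitrary:

```python
# FRAG1: dispatch 11000 + 11-bit datagram size + 16-bit datagram tag.
# FRAGN: dispatch 11100 + size + tag + 8-bit offset (in 8-octet units).
import struct

def frag1(size, tag):
    return struct.pack("!HH", (0b11000 << 11) | (size & 0x7FF), tag)

def fragn(size, tag, offset_octets):
    assert offset_octets % 8 == 0, "offset is carried in 8-octet units"
    return struct.pack("!HHB", (0b11100 << 11) | (size & 0x7FF), tag,
                       offset_octets // 8)

print(frag1(1280, tag=0x00AA).hex())       # c50000aa
print(fragn(1280, 0x00AA, 96).hex())       # e50000aa0c
```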
Routing Layer Protocol
L2: MAC
L4: ROUTING
L5: TRANSPORT
L6: APP
L3: NETWORK
L1: PHY
99
How “Lossy” is Lossy ?
 LLN Link Characteristics:
 High BER
 Frequent packet drops
 High instability
 LLN failures are frequent and
usually transient
100
Routing Protocol for Low-power Lossy Links (RPL): Key Highlights
RPL :
 Highly modular
 (Core + Additional) modules
 Designed specifically for “lossy” networks
 Under-reacts to LLN link changes
 Agnostic to underlying link layer technology
 Is a proactive IPv6 distance vector protocol
 Builds a Destination Oriented Directed Acyclic Graph (DODAG) based on an objective
 Supports many-to-one, one-to-many, point-to-point communication
 Supports different LLN application requirements
 Urban (RFC 5548)
 Industrial (RFC 5673)
 Home (RFC 5826)
 Building (RFC 5867)
101
 RPL builds DODAGs
 DODAG: set of vertices connected by directed edges with no directed cycles
 In contrast to trees, DODAGs offer redundant paths
 RPL supports multiple DODAGs and RPL Instances
 Concept similar to multi-topology routing (MTR) as done in OSPF
 Allows a node to join multiple DODAGs according to different Objective Functions (OF)
 There can be multiple DODAGs within an RPL instance
 A node can, therefore, belong to multiple RPL instances
 Identifications:
 DODAG -> {RPLInstanceID}
 Unique identity of DODAG: {RPLInstanceID, DODAGID}
RPL: DODAG and Instances
102
RPL: DODAG and Instances
Traffic moves either up towards the DODAG root or down towards the DODAG leaves
DODAG Properties
 Many-to-one communication: upwards
 One-to-many communication: downwards
 Point-to-point communication: upwards-downwards
RPL Instance Properties
 RPL Instance has an optimization objective
 Multiple RPL Instances with different optimization objectives can coexist
A typical example would be an energy-efficient topology for background traffic along with
a low-latency topology for delay-sensitive alarms.
103
RPL: Terminology
A node’s Rank defines the node’s individual position
relative to other nodes with respect to a DODAG root.
The scope of Rank is a DODAG Version.
Route Construction
 Up routes towards nodes of decreasing rank (parents)
 Down routes towards nodes of increasing rank
 Nodes inform parents of their presence and reachability to
descendants
 Source route for nodes that cannot maintain down routes
Forwarding Rules
 All routes go upwards and/or downwards along a
DODAG
 When going up, always forward to lower rank when
possible, may forward to sibling if no lower rank
exists
 When going down, forward based on down routes
Once a non-root node selects its parent set, it can
use the following table to convert the path cost of a
node/link metric into its Rank:
| Node/link Metric | Rank |
| Hop-Count | Cost |
| Latency | Cost/65536 |
| ETX | Cost |
104
RPL: Control Messages
DODAG Information Object (DIO)
A DIO carries information that allows a node to discover an RPL Instance, learn its
configuration parameters and select DODAG parents
DODAG Information Solicitation (DIS)
A DIS solicits a DODAG Information Object from an RPL node
Destination Advertisement Object (DAO)
A DAO propagates destination information upwards along the DODAG
105
RPL: DODAG Construction
Construction
 Nodes periodically send link-local multicast DIO messages
 Stability or detection of routing inconsistencies influence the rate of DIO messages
 Nodes listen for DIOs and use their information to join a new DODAG, or to maintain an
existing DODAG
 Nodes may use a DIS message to solicit a DIO
 Based on information in the DIOs the node chooses parents that minimize path cost to the
DODAG root
Essentially a distance vector routing protocol with ranks to prevent count-to-infinity problems
106
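DIO transmission rates are governed by the Trickle algorithm (RFC 6206), which is what lets stability slow the DIO rate down and inconsistency speed it up. A minimal sketch, with illustrative interval parameters:

```python
# Trickle: interval doubles while the network is consistent, resets to Imin
# on inconsistency; transmissions are suppressed if >= k consistent
# messages were already heard this interval. Units are illustrative.
import random

class Trickle:
    def __init__(self, imin=0.1, imax_doublings=16, k=1):
        self.imin, self.imax = imin, imin * 2 ** imax_doublings
        self.k = k
        self.reset()

    def reset(self):                       # inconsistency heard -> back to Imin
        self.interval = self.imin
        self.new_interval()

    def new_interval(self):
        self.counter = 0
        self.t = random.uniform(self.interval / 2, self.interval)

    def hear_consistent(self):             # e.g., a DIO with consistent state
        self.counter += 1

    def fire(self):
        """At time t within the interval: transmit only if suppression allows."""
        send = self.counter < self.k
        self.interval = min(self.interval * 2, self.imax)   # double up to Imax
        self.new_interval()
        return send

tr = Trickle()
print([tr.fire() for _ in range(3)])   # quiet network: keeps sending, less often
```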
Application Layer Protocols
(L6: APP in the field device stack: CoAP, atop IPv6 over IEEE 802.15.4)
107
Constrained Application Protocol CoAP: Key Features
CoAP (RFC 7252):
 Web transfer protocol (coap://) for use with constrained nodes and networks
 Based on RESTful protocol design minimizing the complexity of mapping with HTTP
 Asynchronous transaction model
 Default bound to UDP, and optionally to DTLS
 Low header overhead and parsing complexity
 URI and content-type support
 Subset of MIME types and HTTP response codes
 Has GET, POST, PUT, DELETE methods
108
CoAP: Transaction Model
[Stack: Request/Response sub-layer (RESTful interaction) over Message sub-layer (reliability) over UDP / DTLS / …]
 Transport
 UDP ( + DTLS)
 Base Messaging
 Simple message exchange between endpoints
 Confirmable or Non-Confirmable message answered by Acknowledgment or Reset
message
 REST Semantics
 REST Request/Response piggybacked on CoAP messages
 Method, Response code and Options (URI, content-type, etc.,)
109
CoAP: Message Format
 Header (4 Bytes)
 Ver - Version (1)
 T – Message type (Confirmable, Non-Confirmable, Acknowledgment, Reset)
 TKL – Token length, if any, number of token bytes after the header
 Code – Request method (1-10), Response code (40-255)
 Message ID – Identifier for matching response
 Token (0-8 Bytes)
110
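A small sketch that packs the 4-octet header just described into bytes; the constants and the Confirmable GET example are illustrative:

```python
# CoAP header (RFC 7252): Ver(2) | T(2) | TKL(4), Code(8), Message ID(16),
# followed by 0-8 token bytes. Builds a Confirmable GET.
import struct

COAP_VER, CON, GET = 1, 0, 1          # type CON = 0; method code GET = 1

def coap_header(msg_type, code, message_id, token=b""):
    assert len(token) <= 8
    first = (COAP_VER << 6) | (msg_type << 4) | len(token)
    return struct.pack("!BBH", first, code, message_id) + token

print(coap_header(CON, GET, 0x1234, token=b"\xab").hex())  # 41011234ab
```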
CoAP: Request
111
CoAP: Dealing with Packet Loss
112
Other Popular App Layer Protocols
113
Putting it all together …
114
A “High-Level” Technology Suite
115
Internet of Things :
“Last-mile” Communication Nuances
Lessons learnt from WSN deployments “at-scale” …
117
SEEDLING @ UNSW, Sydney
URL : http://cgi.cse.unsw.edu.au/~sensar/seedling/Seedling.html
Objective:
1. Showcase a basic prototype of a WSN system in precision agriculture
2. Understand sensornet deployment challenges
3. Increase the interest of high-school students in ICT
118
 Choosing a radio transceiver that gave low-power, long-range links
 A robust MAC protocol
 Simple network topology and planning
 Easy network reconfiguration
 Simple uniform data representation
 Early adoption of solar power for sensor networks
Factors CRITICAL to the SUCCESS of Deployments
Limited Energy Reserves – PREMIUM Resource (under MAC control)
119
These lessons are also RELEVANT today …
120
LESSON – 1 …
121
Low POWER ≠ Low ENERGY
Wireless Communication Links: Power is NOT Energy
[Plot: power vs. time for two transmissions with energies E1 and E2; energy depends on both power level and time on air]
 Message Passing / Time to Transmit
ALSO governs Energy
 Transmit it as a single long packet
 Easy to be corrupted
 Transmit as many independent packets
 Higher control overhead & longer delay
 Divide into fragments, but transmit all in burst
122
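A toy calculation of the E = P × t point above; both radio profiles are assumed numbers:

```python
# A "low power" link that must stay on the air longer can cost more
# energy per message. All numbers are assumed, for illustration only.
configs = {
    "high power, fast": {"tx_mW": 63.0, "time_on_air_ms": 5.0},
    "low power, slow":  {"tx_mW": 25.0, "time_on_air_ms": 40.0},
}
for name, c in configs.items():
    energy_uJ = c["tx_mW"] * c["time_on_air_ms"]   # mW * ms = uJ
    print(f"{name}: {energy_uJ:.0f} uJ per message")
```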
LESSON – 2 …
123
Wireless Communication Links: “Longer the Better”
Reduced hops help to obtain better PRR with fewer field devices
Configuration - 1
Configuration - 2
124
LESSON – 3 …
125
A Routing Layer can be AVOIDED with Smart Network Planning
With routing: APP | TRANSPORT | IP | IP Adaptation | ROUTING | MAC | PHY
Without routing: APP | TRANSPORT | IP | IP Adaptation | MAC | PHY
If a single hop (with a long link) serves the purpose, then a routing layer may not be required … save ENERGY
126
LESSON – 4 …
127
Low Power, Long Links are "GREY"
 Approximately 70% of low power, long range links are GREY
(i.e., neither good nor bad)
 Very difficult to predict link behavior
128
Characterizing Low Power Links – Tx Variation
Tx power variation can happen … 7dB is a large variation
129
Characterizing Low Power Links – Rx Variation
Rx sensitivity variation …
130
Characterizing Low Power Links – Tx/Rx Dual Mode vs. Rx Only Mode
Power variation in Tx/Rx dual mode vs. Rx only mode
131
LESSON – 5 …
132
 Why do we need MAC ?
 Wireless channel is a shared medium
 Radios, within the communication range of each other and operating in the same
frequency band, interfere with each other's transmissions
 Interference -> Collision -> Packet Loss -> Retransmission -> Increase in net energy
 The role of MAC
 Co-ordinate access to and transmission over the common, shared (wireless) medium
 Can traditional MAC methods be directly applied to WSN ?
 Control -> often decentralized
 Data -> low load but convergecast communication pattern
 Links -> highly volatile/dynamic
 Nodes/Hops -> Scale is much larger
 Energy is the BIGGEST concern
 Network longevity, reliability, fairness, scalability and latency
are more important than throughput
MAC is Crucial … Design/Choose it Carefully !!!
133
MAC Family
Reservation
(Scheduled, Synchronous)
Contention
(Unscheduled, Asynchronous)
 Reservation-based
 Nodes access the channel based on a schedule
 Examples: TDMA
 Limits collisions, idle listening, overhearing
 Bounded latency, fairness, good throughput (in loaded traffic conditions)
 Saves node power by putting nodes to sleep until needed
 Low idle listening
 Dependencies: time synchronization and knowledge of network topology
 Not flexible under conditions of node mobility, node redeployment and node death:
complicates schedule maintenance
 Contention-based
 Nodes compete (in probabilistic coordination) to access the channel
 Examples: ALOHA (pure & slotted), CSMA
 Time synchronization “NOT” required
 Robust to network changes
 High idle listening and overhearing overheads
MAC Taxonomy
134
MAC: Reservation vs. Contention
135
LESSON – 6 …
136
Understand the Application’s Traffic Pattern
137
Some Concluding Remarks …
138
The FORTUNE TELLER or NOT …
 Low power, long range communication is a very different ball game
compared to standard communication technologies.
 Many attributes that are known to work well in regular communications
will “shock you” in low-power communications.
 Take inspiration from the tons of WSN deployments that have studied these
artifacts rather than hypothesizing “again”.
139
Internet of Things :
“Last-mile” Existing and Upcoming Standards
141
Existing Stack using IEEE 802.15.4 as the PHY Layer
142
Popular IETF Stack for Field Devices: RFC Portfolio
143
Popular IETF Stack for Field Devices: Other RFC Portfolio
144
Thread Stack for Field Devices
“New” IETF Stack for Field Devices: +6TiSCH
145
IETF Deterministic Networking (DetNet)
146
DetNet Open Challenges
147
Other Stack: OneM2M
148
Interoperability via Data Semantics: IEEE 1451 + IEEE 2700 ?
 The IEEE 1451 (TEDS) is a well established
standard in industrial automation to achieve
plug-n-play capability with the help of
electronic datasheets.
 TEDS is the electronic version of the data sheet
that is used to configure a sensor.
 TEDS brings forward the concept that if the data
sheet is electronic and can be readily accessed
upon sensor discovery, it would be possible to
configure the sensor automatically.
 This is analogous to the operation of plugging a
mouse, keyboard, or monitor in the computer
and using them without any kind of manual
configuration.
 TEDS enables self-configuration of the system
by self-identification and self-description of
sensors and actuators (i.e., plug-and-play).
 IEEE 2700 is a sensor calibration standard.
149
Internet of Things :
Derivatives for Intelligence
The Data to Knowledge Pipeline
[Pipeline: data sources (cyber & physical space entities) → data ingestion → data analysis → applications; spanning "little" data infrastructure at the edge and "big" data infrastructure in the global infra; ending in decision making with knowledge]
DATA @ REST (VOLUME)
Archival/Static data (TBs) in Data stores
DATA @ MOTION (VELOCITY)
Streaming data
DATA @ MANY FORMS (VARIETY)
Structured/Unstructured, Text, Multimedia, Audio, Video
DATA @ DOUBT (VERACITY)
Data with uncertainty that may be due to
incompleteness, missing points, etc.,
NATURE of INGESTED DATA
PRESCRIPTIVE
What are the best outcomes ?
PREDICTIVE
What could happen ?
DIAGNOSTIC
Why did this happen ?
DESCRIPTIVE
What has happened ?
NATURE of DATA ANALYSIS
151
Nature of Data Analysis
Value increases with analytic skill, from hindsight and insight (into the PAST) to foresight (into the FUTURE):
 Descriptive – "WHAT has happened ?" → DASHBOARD
 Diagnostic – "WHY did this happen ?"
 Predictive – "WHAT could happen ?" → FORECAST
 Prescriptive – "WHAT should we do ?" → ACTIONS, RULES, RECOMMs
Information Optimization
152
Example: Energy Analysis for a PV Microgrid
Descriptive: What is the total energy, instantaneous energy and power, etc., …?
Diagnostic: Why is the panel temperature decreasing, when the solar irradiance is high and the wind
speed is very low ?
Predictive: Can I forecast the plant output for tomorrow, or can I generate 4kWh net energy ?
Prescriptive: What actions should be undertaken for the plant to reach 4kW energy generation capacity
from its current 2 kW ?
153
Example: Self Health Monitoring of Multi-rotor MAV
Descriptive: What is the total input power (voltage and current), thrust, vibration and ego-noise profiles,
and motor/propeller unit RPM ?
Diagnostic: Why is the THRUST not increasing with increasing RPM ?
Predictive: What is the success probability of the upcoming mission, given the flight and structural
health history ?
Prescriptive: What actions should be taken for increasing the success probability of the upcoming mission
from 75% to 90% ?
154
Machine/System Intelligence …
Depending on the type and quality of analytics, machines/systems could manifest themselves into:
 Informed Systems — Systems That Know/Aware
 Adaptive Systems — Systems That Learn
 Cognitive Systems — Systems That Reason and Plan
155
Deriving Machine Intelligence
 Reason and Plan (with Uncertain Knowledge)
 Probabilistic Reasoning:
 Bayesian Networks
 Conditional Distributions
 Probabilistic Reasoning over Time:
 Hidden Markov Models
 Kalman Filters
 Dynamic Bayesian Networks
 Simple Decisions:
 Utility Theory
 Decision Networks
 Expert Systems
 Complex decisions:
 Partially observable Markov Decision Process (POMDP)
 Game Theoretic Models
 Learning
 Supervised | Semi-supervised | Unsupervised | Reinforcement
 Classification
 Regression
 Clustering
156
Machine Learning …
157
ML computational methods / algorithms :
 LEARN information directly from data, “without” relying on predetermined models
 FIND natural patterns in data, which help to generate insights for better decisions and
predictions
ML teaches Machines to do what “naturally” comes to
Humans and Animals
“LEARN from EXPERIENCE”
158
ML Techniques
SUPERVISED
Develop a predictive model,
based on evidence (both
input and output data)
UNSUPERVISED
Group and interpret data,
based only on input data
(without labels)
CLASSIFICATION
Predicts discrete responses
(e.g., email: genuine vs. spam;
tumor: cancerous vs. benign)
REGRESSION
Predicts continuous responses
(e.g., changes in temperature;
fluctuations in power demand)
CLUSTERING
Finds hidden patterns or
groupings
(e.g., object recognition)
When to use ?
 When you want to train a model to make
a prediction.
 When you have existing <input, output>
data for response that you are trying to
predict.
When to use ?
 When you want to train a model to find a good
internal representation.
 When you want to explore your data; but
don’t yet have a specific goal, or are not sure
what information the data contains.
 When you want to reduce the dimensions of
your data.
When to use ?
 When you are working with
data that can be tagged or
categorized.
When to use ?
 When you are working with
data ranges, and want to
predict trends.
159
Selecting the Right Algorithm
ML TECHNIQUES
SUPERVISED UNSUPERVISED
CLASSIFICATION REGRESSION CLUSTERING
Support Vector
Machines
Discriminant
Analysis
Naive Bayes
Nearest Neighbor
Linear Regression
Ensemble Methods
Decision Trees
Neural Networks
K-Means, K-Medoids,
Fuzzy C-Means
Hierarchical
Gaussian Mixture
 Is it TRIAL and ERROR ?
 Is it Trade-off between:
 Speed of training
 Memory usage
 Predictive accuracy on new data
 Transparency / Interpretability
(how easily can you understand the reasons for an algorithm to make that prediction)
 Using larger training datasets often yields models that generalize well for new data
160
ML Workflow
Input: Acquired Data (Sensor/Image/Video/Transactional)
1. Data Representation
2. Preprocessing: identify good and bad data portions; identify missing samples/values; detect outliers
3. Feature Extraction
4. Build / Train Model: prepare cross validation
5. Improve Model
161
ML Workflow: Feature Derivation
 The number of features that could be derived is limited only by our imagination !!!
Sensor data
 Extract signal properties from raw sensor data
 Peak analysis (frequency, power, etc.,)
 Pulse and transition analysis (rise time, fall time, settling time, etc.,)
 Spectral analysis (power, bandwidth, frequency & its span, etc.,)
Image/Video data
 Extract features such as edge locations, resolution, color …
 Bag of visual words (create a histogram of local image features : edges, corners, blobs, etc.,)
 Histogram of oriented gradients
 Minimum eigenvalue (detect corner locations in images)
 Edge detection (identify points where the degree of brightness changes sharply)
Transactional data
 Calculate derived features that enhance the information in the data
 Time decomposition (break timestamps down into components such as day and month)
 Aggregate value calculation (create higher-level features such as total number of times a
particular event occurred)
162
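A sketch of the sensor-data feature ideas above, computing a few signal properties (RMS, peak-to-peak, dominant frequency) from a raw window with numpy; the test signal is synthetic:

```python
import numpy as np

def basic_features(x, fs):
    """x: 1-D signal window, fs: sampling rate in Hz."""
    spectrum = np.abs(np.fft.rfft(x - x.mean()))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return {
        "rms": float(np.sqrt(np.mean(x ** 2))),
        "peak_to_peak": float(x.max() - x.min()),
        "dominant_hz": float(freqs[spectrum.argmax()]),
    }

fs = 1000.0
t = np.arange(0, 1, 1 / fs)
sig = 2.0 * np.sin(2 * np.pi * 50 * t) + 0.3 * np.random.randn(t.size)
print(basic_features(sig, fs))   # dominant_hz ~ 50.0
```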
Common Classification Algorithms …
163
How it Works ?
 Categorizes data points based on the classes of their nearest neighbors in the dataset (“guilty by
association”).
 Motivating insight: data points near to each other, tend to be similar.
 Non-parametric: does not make any assumptions regarding the distribution of data.
 Metric for near neighbor : Distance, either Euclidean (most popular), City block, Chebychev,
Correlation, Cosine, etc.
 Choose K to be ODD for clear majority
Best Used :
 When you want to use a method that does not have training phase (often called a lazy learner).
 When response time, memory and space are of lesser concern (need to store not just the
algorithm, but also the training data).
 When you can accept a less smart algorithm that can be fooled by irrelevant inputs (i.e., less
robust to noise).
 When you need a simple algorithm to establish benchmark learning rules.
k-Nearest Neighbor (kNN)
[Figure: decision boundaries for K = 1 vs. K = 15; larger K yields smoother, more defined boundaries]
164
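A minimal scikit-learn illustration of kNN (odd K for a clear majority, as noted above); the dataset and K are illustrative:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

knn = KNeighborsClassifier(n_neighbors=15, metric="euclidean")
knn.fit(X_tr, y_tr)                       # "training" just stores the data
print(f"accuracy: {knn.score(X_te, y_te):.2f}")
```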
Logistic Regression
How it Works ?
 Fits a model that can predict the probability of a binary response belonging to one class or the other.
Best Used :
 When the dependent variable is BINARY.
 When data can be clearly separated by a single, linear boundary.
 When a baseline is needed for evaluating more complex classification methods.
$y = \dfrac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}$
165
Support Vector Machines
How it Works ?
 Classifies data by finding the linear decision boundary (hyperplane), which separates all data points
of one class from those of the other class.
 When the data is linearly separable:
 the best hyperplane is the one with the:
largest margin between the two classes.
 When the data is not linearly separable:
 use a kernel transform to transform
nonlinearly separable data into higher dimensions,
where a linear decision boundary can be found.
 use a loss function to penalize points on the
wrong side of the hyperplane.
Best Used :
 When data has exactly two classes.
 multiclass classification can be performed with a divide-and-conquer approach
 When data is complex, has high-dimensionality, and is nonlinearly separable.
 When data is limited.
 When you need a classifier that’s simple, easy to interpret, and accurate.
 When fast response is needed.
[Figure: maximum-margin hyperplane with support vectors]
166
Neural Networks
How it Works ?
 Consists of highly connected networks of neurons, which relate (map) the inputs to the desired
outputs.
 The network is trained by iteratively modifying the strengths (i.e., weights) of the connections so
that given inputs map to the correct response.
Best Used :
 When modeling highly nonlinear systems.
 When computation cost is of lesser concern.
 When model interpretability is not a key concern* (… however, there is work that can go to the
details of interpreting each layer and also suggesting how many neurons are needed; therefore, is
interpretable | It can also now handle time information ….)
 When there could be unexpected changes in your input data* (… for which the network has to be
deep with large number of neurons …)
167
Naïve Bayes
How it Works ?
 Based on Bayes Probability Theorem, it assumes that the presence of a particular feature in a class
is unrelated to the presence of any other feature.
 Classifies new data based on the highest probability of its belonging to a particular class.
c = HYPOTHESIS (class)
x = EVIDENCE (predictor variable / new data point)
P(c) = probability of the hypothesis before getting the evidence
P(c|x) = probability of the hypothesis after getting the evidence
Best Used :
 When assumption of feature independence holds TRUE; it can easily outperform other well known
techniques with lesser training data.
 When the model is expected to encounter scenarios that weren’t in the training data.
 When CPU and memory resources are a limiting factor* (… although for likelihood estimation, a
dataset is needed …).
 When you want a method that doesn’t overfit.
 When you want a method that can update itself with continuous new data.
 When you need a classifier that’s easy to interpret.
$P(c \mid x) = \dfrac{P(x \mid c)\,P(c)}{P(x)}$
Posterior = (Likelihood × Prior) / Evidence
168
Discriminant Analysis
How it Works ?
 Classifies data by finding linear combinations.
 Assumes that different classes generate data based on Gaussian distributions.
 Training a discriminant analysis model involves finding the parameters for a Gaussian distribution for
each class.
 The distribution parameters are used to calculate boundaries, which can be linear or quadratic
functions; and these boundaries are used to determine the class of new data.
Best Used …
 When memory usage during training is a concern.
 When you need a model that is fast to predict.
 When you need a simple model that is easy to interpret.
169
Decision Trees
How it Works ?
 represents a procedure for classifying categorical data based on their attributes.
 decide which attribute to test at a node by determining the “best” way to separate (splitting‐point)
 pick the attribute that has the highest Information gain.
A decision tree for the concept buys_computer, indicating whether a customer at AllElectronics is likely to purchase a computer. Each internal (nonleaf)
node represents a test on an attribute. Each leaf node represents a class (either buys_computer = yes or buy_computers = no)
Best Used :
 When handling large datasets.
 When there is a need to ignore redundant variables, and handle missing data elegantly* (… missing
data should be small …).
 When memory usage needs to be minimized.
 When decision traceability is needed.
170
Bagged and Boosted Decision Trees
 How do Bagging and Boosting get N learners ?
Trees are simple, but often produce noisy (bushy) or weak (stunted) classifiers.
In these ensemble methods, several “weaker” decision trees are combined into a “stronger” ensemble.
 Why are the data elements weighted ?
171
Bagged and Boosted Decision Trees
 How does the classification stage work ?
172
Bagged and Boosted Decision Trees
| Similarities | Differences |
| Both are ensemble methods to get N learners from 1 learner | Bagging: builds N learners independently. Boosting: tries to add new models that do well where previous models fail |
| Both generate several training data sets by random sampling | Bagging: no weighting strategy. Boosting: determines weights for the data to tip the scales in favor of the most difficult cases |
| Both make the final decision by averaging the N learners (or taking the majority of them) | Bagging: an equally weighted average. Boosting: a weighted average (i.e., more weight to those with better performance on training data) |
| Both are good at reducing variance and provide higher stability | Bagging: may solve the over-fitting problem, may not reduce bias. Boosting: may increase the over-fitting problem, tries to reduce bias |
Best Used :
 When there is a need to minimize prediction variance
Boosting > Random Forests > Bagging > Single Tree
173
Common Regression Algorithms
174
Linear/Non-linear/Gaussian Process Regression
How it Works ?
 Describes a continuous response variable as a linear/non-linear/Gaussian process function.
Linear regression : Best Used …
 When you need an algorithm that is easy to interpret and fast to fit.
 When you need a baseline for evaluating other, more complex regression models.
Non-linear regression : Best Used …
 When data has strong nonlinear trends, and cannot be easily transformed into a linear space.
 When you need to fit custom models to the data.
Gaussian Process regression (Kriging) : Best Used …
 When interpolation needs to be performed in the presence of uncertainty.
[Figures: Linear Regression | Non-linear Regression | Kriging]
175
SVM Regression / Regression Tree
SVM Regression: How it Works ?
 Works the same as SVM classification algorithms, but is modified to be able to predict a
continuous response.
 Instead of finding a hyperplane that separates data, it finds a model that deviates from the
measured data by a value no greater than a small amount, with parameter values that are as
small as possible (to minimize sensitivity to error).
SVM Regression: Best Used :
 For high-dimensional data , where there will be a large number of predictor variables.
 When data is limited, and number of predictor variables are large
Regression Tree: How it Works ?
 Works the same as decision trees, but is modified to be able to predict a continuous response.
Regression Tree : Best Used :
 When predictors are categorical (discrete) or behave nonlinearly.
176
Common Clustering Algorithms
(Unsupervised Learning)
177
Clustering Analysis
 Data is partitioned into groups (or clusters) based on some measure of similarity or shared characteristic.
 Clusters are formed so that objects in the same cluster are very similar and objects in different clusters are very distinct.
Hard Clustering: each data point belongs to only ONE cluster
Soft Clustering: each data point can belong to MORE than ONE cluster
When the data grouping is UNKNOWN: search for possible clusters, and use cluster evaluation to look for the "best" number of groups for a given clustering algorithm
Example methods: Self Organizing Maps (SOM), Hierarchical Clustering
178
Common Hard Clustering Algos: k-Means / k-Medoids
k-Means: How it Works ?
 Partitions data into k number of mutually exclusive clusters.
 The fitment of a point into a cluster is determined by the distance from that point to
the cluster’s center.
k-Medoids: How it Works ?
 Similar to k-means, but with the requirement that the cluster centers coincide with
points in the data.
Best Used :
 When the number of clusters is known.
 For fast clustering of categorical data
 To scale to large data sets
179
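A minimal scikit-learn illustration of k-means for the case where the number of clusters is known; the data is synthetic:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(km.cluster_centers_)        # one center per mutually exclusive cluster
print(km.predict(X[:5]))          # cluster assignment by nearest center
```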
Hierarchical Clustering & SOM
Hierarchical: How it Works ?
 Produces nested sets of clusters by analyzing similarities between pairs of points and
grouping objects into a binary, hierarchical tree.
Hierarchical : Best Used :
 When advance knowledge of data clusters is missing
 When you want visualization to guide your selection
SOM: How it Works ?
 Neural-network based clustering that transforms a dataset into a topology-preserving
2D map
SOM: Best Used :
 To visualize high-dimensional data in 2D or 3D
180
Possible Modes with Unsupervised Learning
End goal is unsupervised learning: Large Data → Unsupervised Learning → Data Clusters (results)
Preprocessing step for supervised learning: Large Data → Unsupervised Learning → Lower Dimensional Data / Feature Selection → Supervised Learning Model
181
Common Practices in ML
182
Improving Models
 Model improvement in learning means:
 increasing its accuracy
 increasing predictive power
 preventing over-fitting (ambiguity between data and noise)
 increasing model parsimony
 Essentially, reduces errors in learning due to noise, bias and variance
Feature Selection
 Identifying the most relevant features, which provide the best predictive power.
 Could be done by: adding or removing features, which do not improve model performance.
Feature Transformation
 Recasting existing features into new features using techniques such as: principal component
analysis, nonnegative matrix factorization, and factor analysis.
Hyperparameter Tuning
 It is the process of identifying the set of parameters that provide the best model.
 It controls how an ML algorithm fits the model to the data.
A model is only as good as the features selected to train on !!!
183
Feature Selection
 Especially useful:
 when dealing with high-dimensional data
 when the dataset contains a large number of features and a limited number of
observations
 Reducing the feature space saves storage and computation time
 Makes the result easier to understand
Stepwise Regression
 Sequentially adding or removing features until there is no improvement in prediction accuracy.
Sequential Feature Selection
 Iteratively adding or removing predictor variables and evaluating the effect of each change on the
performance of the model.
Regularization
 Using shrinkage estimators to remove redundant features by reducing their weights (coefficients)
to zero.
Neighborhood Component Analysis (NCA)
 Finding the weight each feature has in predicting the output, so that the features with lower
weights can be discarded.
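As an illustration of the regularization route, a hedged sketch using scikit-learn's Lasso (an assumed implementation): the L1 penalty shrinks the weights of redundant features to zero, and the surviving indices are the selected features. The synthetic data and alpha value are illustrative:

```python
# Regularization-based feature selection sketch (assumed scikit-learn API):
# the L1 penalty shrinks redundant coefficients to exactly zero.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# 20 candidate features, only 5 informative (illustrative synthetic data)
X, y = make_regression(n_samples=100, n_features=20, n_informative=5,
                       noise=5.0, random_state=3)

lasso = Lasso(alpha=1.0).fit(X, y)
kept = np.flatnonzero(lasso.coef_)   # features whose weight survived shrinkage
print("selected feature indices:", kept)
```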
184
Feature Transformation
 Feature transformation is a form of dimensionality reduction
Principal Component Analysis (PCA)
 Performs a linear transformation on the data, so that most of the variance or information in your
high-dimensional dataset is captured by the first few principal components.
 The first principal component will capture the most variance, followed by the second principal
component, and so on.
Nonnegative Matrix Factorization
 Used when model terms must represent nonnegative quantities, such as physical quantities.
Factor Analysis
 Identifies underlying correlations between variables in the dataset to provide a representation in
terms of a smaller number of unobserved latent factors, or common factors.
 Shows the relationship between variables, so that variables (or features) that are not highly
correlated can be removed.
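A minimal PCA sketch, assuming scikit-learn: standardize the data, fit PCA, and inspect how much variance the first few components capture. The data and component count are illustrative:

```python
# A minimal PCA sketch (assumed scikit-learn API): standardize, fit, and
# check how much variance the first few components capture.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 10))
X[:, 1] = 0.95 * X[:, 0] + 0.05 * X[:, 1]   # make two features highly correlated

Xs = StandardScaler().fit_transform(X)
pca = PCA(n_components=3).fit(Xs)
print("explained variance ratio:", pca.explained_variance_ratio_)
X_low = pca.transform(Xs)                   # 200 x 3 lower-dimensional data
```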
185
Hyper-parameter Tuning
 Begin by setting hyperparameters based on a “best guess” of the outcome.
 The goal is to find the “best possible” values - those that yield the best model.
 As the parameters are adjusted and model performance begins to improve, note which parameters are effective and which still require tuning.
 Three common parameter tuning methods are (the second is sketched below):
 Bayesian optimization
 Grid search
 Gradient-based optimization
 Hyperparameter tuning is an iterative process.
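Of the three methods, grid search is the simplest to sketch. The estimator, grid values, and scikit-learn API below are illustrative assumptions, not part of the course material:

```python
# A grid-search sketch: exhaustively evaluate every hyperparameter
# combination with cross-validation (assumed scikit-learn API).
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=10, random_state=5)

grid = {"C": [0.1, 1, 10], "gamma": ["scale", 0.01, 0.1]}
search = GridSearchCV(SVC(), grid, cv=5).fit(X, y)   # 3 x 3 = 9 candidates
print("best hyperparameters:", search.best_params_)
print("best CV accuracy:", search.best_score_)
```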
186
Choosing the Right Model ?
 Why is it so hard to get right?
 Each model has its own strengths and weaknesses in a given scenario.
 No established set of rules/guidelines.
 Closely tied to business case, and understanding of what needs to be accomplished.
 What can you do to choose the right model?
 How much data do you have and is it continuous?
 What type of data is it?
 What are you trying to accomplish?
 How important is it to visualize the process?
 How much detail do you need?
 Is storage a limiting factor?
 Is response time a limiting factor ?
 Is computation cost a limiting factor ?
187
Model Over-fitting
 Overfitting means that the model is so closely aligned to training data sets that it does
not know how to respond to new situations.
 Why is overfitting difficult to avoid?
 often the result of insufficient/inaccurate
information about the scenario.
 How do you avoid overfitting?
 using appropriate training data.
 training data needs to accurately reflect the complexity
and diversity of the data the model will be expected to work with.
 use regularization
 penalizes large parameters to keep the model from relying too heavily on individual data points and becoming overly complex
 controls the smoothness of the fit
 Has the form Error + λ·f(θ), where f(θ) grows larger as the components of θ grow larger, and λ represents the strength of the regularization
 λ decides how much you want to protect against overfitting
 if λ = 0, you aren't correcting for overfitting at all
 perform model cross-validation
 partitions a dataset and uses a subset to train the algorithm and the remaining data for testing
 common techniques: k-fold | holdout (a k-fold sketch follows) 188
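The k-fold technique can be sketched as follows, assuming scikit-learn; the model, penalty strength, and fold count are illustrative:

```python
# A k-fold cross-validation sketch (assumed scikit-learn API): each fold is
# held out once for testing while the remaining folds train the model.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=8, random_state=6)

# L2-regularized model: C is the inverse of the regularization strength λ
model = LogisticRegression(C=1.0, max_iter=1000)
scores = cross_val_score(model, X, y, cv=5)   # 5-fold
print("fold accuracies:", scores, "| mean:", scores.mean())
```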
Some Concluding Remarks …
189
The FORTUNE TELLER or NOT …
A general rule-of-thumb:
 Training - to generate the MODEL - is an expensive operation
 Estimation - using the derived MODEL - is lightweight
Intelligence (derived through LEARNING) on Embedded systems:
 On-device training MAY NOT be a good strategy
 It may be better to offload it to a resourceful device
 On-device estimations using the derived model MAY be a good strategy
There are EXCEPTIONS to this rule !!!
 Online (sequential) versions of many commonly used learning algorithms (k-means, etc.) have been developed, and are part of stream-processing suites (see the sketch below).
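For illustration, a hedged sketch of such an online algorithm, with scikit-learn's MiniBatchKMeans standing in for a stream-processing implementation (an assumption, not the course's prescription): the model is updated incrementally as small batches arrive, while per-sample estimation with the derived model stays lightweight:

```python
# Hedged sketch of the "stream processing" idea: an online (mini-batch)
# k-means updated incrementally as data arrives; scikit-learn's
# MiniBatchKMeans is an assumed stand-in for a streaming suite.
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(7)
model = MiniBatchKMeans(n_clusters=2, random_state=7, n_init=3)

for _ in range(50):                       # pretend each batch arrives over time
    centre = rng.choice([0.0, 4.0])
    batch = rng.normal(loc=centre, scale=0.5, size=(20, 2))
    model.partial_fit(batch)              # cheap incremental update

print(model.cluster_centers_)
# lightweight on-device estimation with the derived model:
print(model.predict(np.array([[0.1, 0.2], [3.9, 4.1]])))
```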
190
Acknowledgment and References
This short course on IoT has been compiled from various online resources,
text books, and research papers on this topic.
While Prasant may not be able to correctly recollect every source, he nevertheless requests all viewers to drop a note if they come across any discrepancies in this regard.
1. MAC Essentials:
A. Bachir, M. Dohler, T. Watteyne and K. K. Leung, "MAC Essentials for
Wireless Sensor Networks," in IEEE Communications Surveys & Tutorials, vol.
12, no. 2, pp. 222-248, Second Quarter 2010.
2. Machine Learning:
https://in.mathworks.com/campaigns/products/offer/machine-learning-with-matlab.html
W: https://sites.google.com/site/prasantmisra
W: https://in.linkedin.com/in/prasantmisra

A Short Course on the Internet of Things

  • 1. A Short Course on the Internet of Things Prasant Misra, Ph.D. W: https://sites.google.com/site/prasantmisra
  • 2. Course Content (240 mins) Gateway Field Area Network Back-haul Network CloudInfraFog Networked FieldDevice “Last-mile”  IoT Primer (40 mins.)  History of Computing and Trends  Industrial IoT and Industry 4.0  IoT Architecture Primer (20 mins.)  Functional Architecture  IoT “Last-mile” Considerations (60 mins.)  Field Devices & Platforms  Field Device Stack (PHY, MAC, NTWK, ROUTING, TRANSPORT, APP)  IoT “Last-mile” Communication Nuances (30 mins.)  IoT “Last-mile” Existing and Upcoming Standards (30 mins.)  Derivatives for Intelligence (60 mins.)  Nature of Data Analysis  Intelligence with Machine Learning 2
  • 5. 1960 - 70 1980 - 90 2000 -10 and beyond Year Size History of Computing Accessibility to cyber end points have increased drastically … 5
  • 6. Trend-I: Device/Data Proliferation (by Moore’s Law) Wireless Sensor Networks (WSN) Medical Devices Industrial Systems Portable Smart DevicesRFID 6
  • 8. Trend-I: DATA Proliferation Web & Social Media Enter- prises Gov. 8
  • 9. Trend-II: Integration at Scale (Isolation has cost !!!) (World Wide) Sensor Web (Feng Zhao) Future Combat Systems Ubiquitous embedded devices • Large scale network embedded systems • Seamless integration with the physical environment Complex system with global integration 9
  • 10. Trend-III: Evolution: Man vs. Machine The exponential proliferation of embedded devices (courtesy of Moore’s Law) is NOT matched by a corresponding increase in human ability to consume information ! Increase in Machine Autonomy !!! 10
  • 11. Confluence of Trends Distributed, Information Distillation and Control Systems of Embedded Devices Trend-1: Data & Device Proliferation Trend-3: Autonomy Trend-2: Integration at Scale 11
  • 12. Confluence of Technologies CPS Trend-1: Sensing & Actuation Trend-3: Computation & Control Trend-2: Communication & Networking A cyber-physical system (CPS) refers to a tightly integrated system that is engineered with a collection of technologies, and is designed to drive an application in a principled manner. 12
  • 13. Functional Blocks of CPS Enormous SCALE : both in space and time 13
  • 14. Enormous SCALE : both in space and time Functional Blocks of CPS 14
  • 15. Casting CPS Technology into Application Requirement Use Case: Adaptive Lighting in Road Tunnels Problem: Control the tunnel lighting levels in a manner that ensures continuity of light conditions from the outside to the inside (or vice-versa) such that drivers do not perceive the tunnel as too bright or dark. Solution: Design a system that is able to account for the change in light intensity (i.e., detect physical conditions and interpret), and adjust the illumination levels of the tunnel lamps (i.e., respond) till a point along the length of the tunnel where this change is indiscernible to the drivers (i.e., reason and control in an optimal manner). 15
  • 16. Casting CPS Technology into Application Requirement Use Case: Smart Buildings/Homes Problem: How to make buildings/homes (both new and existing) ‘smarter’ ? • Energy efficient • Damage prevention • Increased comfort 16
  • 17. Beaming from CPS to IoT : The SCALE is even BIGGER !!! C1 C2 Cn P1 P2 Pn CPS Internet CyberworldPhysicalworld NoT IoT = CPS + People ‘in-the-loop’ (that act as sensors, actuators, controllers) IoT = CPS + Hybrid (tight and loose) sense of control 17
  • 18. CPS & IoT  Gives us the ability to look more broadly (SCALE), deeply (PRECISION) and over extended periods of time at the physical world  As a result, our interactions with the physical world has increased !!! Example of a Killer APP: Navigation System 18
  • 19. Navigation System - I Context Service Example Current Location Local business 19
  • 20. Context Service Example Current Location Local business and directions + Time Tracks Businesses in driving direction Navigation System - II 20
  • 21. Context Service Example Current Location Local business and directions + Time Tracks Businesses in driving direction + History Personalized directions  Take 520 East Navigation System - III 21
  • 22. Context Service Example Current Location Local business and directions + Time Tracks Businesses in driving direction + History Personalized directions + Community Tourist recommendation 35% people pick the scenic route Navigation System - IV 22
  • 23. Alert: Bad Traffic Consider Alternate route Context Service Example Current Location Local business and directions Tracks Businesses in driving direction + History Personalized directions + Community Tourist recommendation + Push alerts, triggers, reminders Navigation System - V 23
  • 24. Some formalism and SYSTEMS feel … 24
  • 25. IoT: Vision and Value Proposition Vision: Build a ubiquitous society where everyone (“people”) and everything (“systems, machines, equipment and devices") is immersively connected. Value Proposition:  Connected “Things” will provide utility to “People”  Digital shadow of “People” will provide value to the “Enterprise” 25
  • 26. How BIG is IoT ? 26
  • 28. The FORTUNE TELLER or NOT … IIoT and Industry 4.O is ALL about re-imagination !!!  Improve flexibility, reliability and time to market/scale  Improve customer intimacy and profitability  Improve revenue and market position 28
  • 29. Is the Internet of Things disruptive? OR Are they repackaging known technologies and making them a little better? What is your take ? 29
  • 30. Internet of Things : Architectural Design Primer
  • 31. High-level Functional Architecture DATA @ REST (VOLUME) Archival/Static data (TBs) in Data stores DATA @ MOTION (VELOCITY) Streaming data DATA @ MANY FORMS (VARIETY) Structured/Unstructured, Text, Multimedia, Audio, Video DATA @ DOUBT (VERACITY) Data with uncertainty that may be due to incompleteness, missing points, etc., PRESCRIPTIVE What are the best outcomes ? PREDICTIVE What could happen ? DESCRIPTIVE What has happened ? DISCOVERY What do we have ? NATURE of INGESTED DATA NATURE of ANALYSIS DATA KNOWLEDGE 31
  • 32. Detailed Functional Architecture CloudInfra Fog Networked FieldDevice “Last-mile” Gateway Field Area Network Back-haul Network Gateway Field Area Network Back-haul Network Gateway Field Area Network Back-haul Network 32
  • 33. Functional Architecture Layers and their Key Physical Attributes Gateway Physical Attribute Field Devices (with Sensing, Compute and Actuation HW) Functionality Sense Actuate Control Physical Attribute Last-mile connectivity PAN, HAN, FAN, NAN, CAN, WAN, etc., Functionality Connection Management Routing Physical Attribute Data Storage Functionality Ingestion Semantics Transformation Functionality Interoperability Security Access Control Functionality Business Logic Orchestration Functionality Input Output Transform Physical Attribute Common Service Functions Physical Attribute Business Logic & Related Functions Physical Attribute Users 33
  • 34. Recap : Functional Architecture Service Oriented Approach Application & Business Architecture Describing the service strategy, the organizational, functional process information and geographic aspects of the environment based on the strategic goals and strategic drivers Information Systems Architecture Describing information/data structure and semantics, types and sources of data necessary to support various smart application Data Access Architecture Describing technical components (software, hardware), access technology and data aggregation policies Information Security characterized by: • Availability • Integrity • Confidentiality Interoperability characterized by: • Syntactic • Semantics 34
  • 35. Internet of Things : “Last-mile” Considerations
  • 36. What ROUTE are we going to take ? 36
  • 37. Popular Communication, Networking, and Control Standards for Industrial Systems 37
  • 39. “GO Wireless” and “GO w or w/o IP” !!! 39
  • 40. “Last-mile” Consideration w.r.t. Low-power, Wireless, Constrained Field Devices & Networks Gateway Field Area Network Back-haul Network CloudInfraFog Networked FieldDevice “Last-mile” 40
  • 41.  Consist of many embedded units called sensor nodes, motes etc. ,.  Sensors (and actuators)  Small microcontroller  Limited memory  Radio for wireless communication  Power source (often battery)  Communication centric systems  Motes form networks, and in a one hop or multi-hop fashion transport sensor data to base station Background: Wireless Sensor Networks (WSN) 41
  • 42. • Processing speed ? • Memory ? • Storage ? • Power consumption ? BTNodeMicaZ dotMote Fleck Tmote Sky Radio Sensors/ Actuators Microcontroller Storage Power Source Architecture: WSN platforms 42
  • 43. WSN Node: Core Features Limited Energy Reserves – PREMIUM resource Under MAC Control (bit) RISC KBytes KBytes (bit) KBytes # # 43
  • 44. Sensor Web: Field Device Stack L2: MAC L4: ROUTING L5: TRANSPORT L6: APP L3: NETWORK L1: PHY Do we need a LAYERED approach @ the Field Device level ? 44
  • 46. (Popular) Short and Medium Range Low Power Wireless Technology Technology Standard Body Frequency Band Max Range Max Data Rate Max Power Network Type Bluetooth Bluetooth SIG 2.4 GHz ISM 100 m 1-3 Mbps 1 W WPAN Bluetooth Smart IoT Interconnect 2.4 GHz ISM 35 m 1 Mbps 10 mW WPAN ZigBee IEEE 802.15.4, Zigbee Alliance 2.4 GHz ISM 160 m 250 Kbps 100 mW Star, Mesh Wi-Fi IEEE 802.11 g/n/ac/ad 2.4/5/60 GHz 100 m 6-780 Mbps, 6 Gbps @ 60 GHz 1 W Star, Mesh Zwave Zwave 908 MHz 30 m 100 Kbps 1 mW Star, Mesh ANT+ ANT Alliance 2.4 GHz 100 m 1 Mbps 1 mW Star, Mesh Rubee IEEE 1902.1, IEEE 1902.2 131 kHz 5 m 1.2 Kbps 40-50 nW P2P 46
  • 47. Low Power Wide Area Networking Technology Technology Standards/ Governing Body Frequency Band Max Range Max Data Rate Topology Devices / Access Point Weightless - SubGHz ISM, TV Whitespaces 2-5 k (urban) 200 bps – 100 Kbps, W: 1 Kbps – 10 Mbps Star Unlimited LoraWAN LoRa Alliance 433/780/868/9 15 MHz ISM 2.5 -15 km 0.3 – 50 Kbps Star 1 million SigFox SigFox Ultra narrow Band 30-50 km (rural), 3-10 km (urban) 100 bps Star 1 million WiFi LowPower IEEE P802.11ah SubGHz 1 km (outdoor) 150 - 340 kbps Start, Tree - Dash7 Dash7 Alliance 433/868/915 MHz 2 km 9.6/56/167 Kbps Star, Tree - LTE-Cat 0 3GPP R-13 Cellular 2.5 -5 km 200 kbps Start > 20,000 UMTS (3G), HSDPA / HSUPA 3GPP Cellular 27 km, 10 km 0.73 - 56 Mbps Star Hundreds per cell 47
  • 48. Taxonomy of Key IoT Wireless Technologies 48
  • 49. Low Power Communication Technologies: Frequency 49
  • 50. Low Power Communication Technologies: Data Rate 50
  • 51. Low Power Communication Technologies: Range 51
  • 52. Low Power Communication Technologies: Energy 52
  • 53. Internet of Things : “Last-mile” Considerations Case study with IEEE 802.15.4
  • 54. 54 Existing Stack using IEEE 802.15.4 as the PHY Layer
  • 55. IEEE 802.15.4 IEEE 802.15.4 IEEE 802.15.4 PHY L2: MAC L4: ROUTING L5: TRANSPORT L6: APP L3: NETWORK L1: PHY 55
  • 56. IEEE 802.15.4: Quick Facts IEEE 802.15.4  Offers physical and media access control layers for low-speed, low-power wireless personal area networks (WPANs)  16 non-overlapping channels, spaced 5 MHz apart; and occupy frequencies 2405-2480 MHz  Provides a physical layer bandwidth of 250kbps  Shares the same frequency band as IEEE 802.11 and Bluetooth 56
  • 57. IEEE 802.15.4: Radio Characteristics 57
  • 58. IEEE 802.15.4: Device Classes Full Function Device (FFD)  Any topology  PAN coordinator capable  Talks to any other device  Implements complete protocol set Reduced Function Device (RFD)  Reduced protocol set  Very simple implementation  Cannot become a PAN coordinator  Limited to leafs in more complex topologies 58
  • 59. IEEE 802.15.4: Topology Types Star Topology  All nodes communicate via the central PAN coordinator  Leafs may be any combination of FFD and RFD devices  PAN coordinator is usually having a reliable power source Peer-to-Peer Topology  Nodes can communicate via the central PAN coordinator and via additional point-to-point links  Extension of the pure star topology Cluster Tree Topology  Leafs connect to a network of coordinators (FFDs)  One of the coordinators serves as the PAN coordinator  Clustered star topologies are an important case (e.g., each hotel room forms a star in a HVAC system) 59
  • 60. IEEE 802.15.4: Frame Formats  Max. frame size: 127 octets  Max. frame header: 25 octets 60
  • 61. IEEE 802.15.4: Frame Formats  Beacon Frames Broadcasted by the coordinator to organize the network  Command Frames Used for association, disassociation, data and beacon requests, conflict notification, . . .  Data Frames Carrying user data  Acknowledgement Frames Acknowledges successful data transmission (if requested) 61
  • 62. Link Layer Protocols L2: MAC L4: ROUTING L5: TRANSPORT L6: APP L3: NETWORK L1: PHY IEEE 802.15.4 IEEE 802.15.4 62
  • 63.  Why do we need MAC ?  Wireless channel is a shared medium  Radios, within the communication range of each other and operating in the same frequency band, interfere with each others transmission  Interference -> Collision -> Packet Loss -> Retransmission -> Increase in net energy  The role of MAC  Co-ordinate access to and transmission over the common, shared (wireless) medium  Can traditional MAC methods be directly applied to WSN ?  Control -> often decentralized  Data -> low load but convergecast communication pattern  Links -> highly volatile/dynamic  Nodes/Hops -> Scale is much larger  Energy is the BIGGEST concern  Network longetivity, reliability, fairness, scalability and latency are more important than throughput MAC is Crucial !!! 63
  • 64. MAC Family Reservation (Scheduled, Synchronous) Contention (Unscheduled, Asynchronous)  Reservation-based  Nodes access the channel based on a schedule  Examples: TDMA  Limits collisions, idle listening, overhearing  Bounded latency, fairness, good throughput (in loaded traffic conditions)  Saves node power by pointing them to sleep until needed  Low idle listening  Dependencies: time synchronization and knowledge of network topology  Not flexible under conditions of node mobility, node redeployment and node death: complicates schedule maintenance  Contention-based  Nodes compete (in probabilistic coordination) to access the channel  Examples: ALOHA (pure & slotted), CSMA  Time synchronization “NOT” required  Robust to network changes  High idle listening and overhearing overheads Taxonomy 64
  • 65. MAC: Reservation vs. Contention 65
  • 66.  Collisions  Node(s) is/are within the range of nodes that are transmitting at the same time -> retransmissions  Overhearing  The receiver of a packet is not the intended receiver of that packet  Overhead  Arising from control packets such as RTS/CTS  E.g.: exchange of RTS/CTS induces high overheads in the range of 40-75% of the channel capacity  Idle Listening  Listening to possible traffic that is not sent  Most significant source of energy consumption Function Protocols Reduce Collisions CSMA/CA, MACA, Sift Reduce Overheads CSMA/ARC Reduce Overhearing PAMAS Reduce Idle Listening PSM Causes of Energy Consumption 66
  • 67. Low-power, Constrained Field Devices MAC Family Scheduled (periodic, high-load traffic) Common Active Periods (medium-load traffic) Preamble Sampling (rare reporting events) 67
  • 68.  Build a schedule for all nodes  Time schedule  no collisions  no overhearing  minimized idle listening  bounded latency, fairness, good throughput (in loaded traffic conditions)  BUT: how to setup and maintain the schedule ? Function Protocols Canonical Solution TSMP, IEEE 802.15.4 Centralized Scheduling Arisha, PEDAMACS, BitMAC, G-MAC Distributed Scheduling SMACS Localization-based Scheduling TRAMA, FLAMA, uMAC, EMACs, PMAC Rotating Node Roles PACT, BMA Handling Node Mobility MMAC, FlexiMAC Adapting to Traffic Changes PMAC Receiver Oriented Slot Assignment O-MAC Using different frequencies PicoRadio, Wavenis, f-MAC, Multichannel LMAC, MMSN, Y-MAC, Practical Multichannel MAC Other functionalities LMAC, AI-LMAC, SS-TDMA, RMAC Scheduled MAC Protocols 68
  • 69. Time Synchronized Mesh Protocol (TSMP): Overview  Goal: High end-to-end reliability  Major Components  time synchronized communication (medium access)  TDMA-based: uses timeslots and time frames  Synchronization is achieved by exchanging offset information (and not by beaconing strategies)  frequency hopping (medium access)  automatic node joining and network formation (network)  redundant mesh routing (network)  secure message transfer (network)  Limitations  Complexity in infrastructure-less networks  Scaling is a challenge  Finding a collision free schedule is a two-hop coloring problem  Reduced flexibility to adapt to dynamic topologies 69
  • 70.  Nodes define common active/sleep periods  active period -> communication, where nodes contend for the channel  sleep period -> saving energy  need to maintain a common time reference across all nodes Function Protocols Canonical Solution SMAC Increasing Flexibility TMAC, E2MAC, SWMAC Minimizing Sleep Delay Adaptive listening, nanoMAC, DSMAC, FPA, DMAC, Q-MAC Handling Mobility MSMAC Minimizing Schedules GSA Statistical Approaches RL-MAC, U-MAC Using Wake-up Radio RMAC, E2RMAC Common Active Period MAC Protocols 70
  • 71.  Goal: reduce energy consumption, while supporting good scalability and collision avoidance  Major Components  periodic listen and sleep  Copes with idle listening: uses a scheme of active (listen) and sleep periods  Active periods are fixed; Sleep periods depend on a predefined duty-cycle param  Synchronization is used to form virtual clusters of nodes on the same sleep schedule  Schedules coordinate nodes to minimize additional latency  collision and overhearing avoidance  Adopts a contention-based scheme  In-channel signaling is used to put each node to sleep when its neighbor is transmitting to another node; thus, avoids the overhearing problem but does not require an additional channel  message passing  Small packets transmitted in bursts  RTS/CTS reserves the channel for the whole burst duration rather than for each packet; hence unfair from a per-hop MAC level Sensor MAC (S-MAC): Overview 71
  • 72.  Periodic Listen and Sleep  Each node goes to sleep for some time, and then wakes up and listens to see if any other node wants to talk to it. During sleep, the node turns off its radio, and sets a timer to awake itself later.  Maintain Schedules  Maintain Synchronization S-MAC - I 72
  • 73.  Collision and Overhearing Avoidance  Adopts a contention based scheme  Collision Avoidance  Overhearing Avoidance  Basic Idea  A node can go to sleep whenever its neighbor is talking with another node  Who should sleep?  The immediate neighbors of sender and receiver  How to they know when to sleep?  By overhearing RTS or CTS  Hog long should they sleep?  Network Address Vector (NAV)  Message Passing  How to transmit a long message?  Transmit it as a single long packet  Easy to be corrupted  Transmit as many independent packets  Higher control overhead & longer delay  Divide into fragments, but transmit all in burst S-MAC - II 73
  • 74.  Adaptive duty cycle: duration of the active period is no longer fixed but varies according to traffic  Prematurely ends an active period if no traffic occurs for a duration of TA Timeout MAC (TMAC): Overview 74
  • 75.  Goal: minimize idle listening -> minimize energy consumption  Operation  Node periodically wakes up, turns radio on and checks channel  Wakeup time fixed (time spend sampling RSSI?)  “Check interval” variable  If energy is detected, node powers up in order to receive the packet  Node goes back to sleep  If a packet is received  After a timeout  Preamble length matches channel “checking interval”  No explicit synchronization required  Noise floor estimation used to detect channel activity during LPL Preamble Sampling MAC Protocols 75
  • 76. Function Protocols Canonical Solution Preamble-Sampling ALOHA, Preamble-Sampling CSMA, Cycled Receiver, LPL, Channel polling Improving CCA BMAC Adaptive Duty Cycle EA-ALPL Reducing Preamble Length by Packetization X-MAC, CSMA-MPS, TICER, WOR, MH-MAC, DPS-MAC, CMAC, GeRAF, 1-hopMAC, RICER, SpeckMAC-D, MX-MAC Reducing Preamble Length by Piggybacking Synchronization Information WiseMAC, RATE EST, SP, SyncWUF Use Separate Channels STEM Avoiding Unnecessary reception MFP, 1-hopMAC Drawbacks:  Costly collisions  Longer preamble leads to higher probability of collision in applications with considerate traffic  Limited duty cycle  “Check interval” period cannot be arbitrarily increased -> longer preamble length  Overhearing problem  The target receiver has to wait for the full preamble before receiving the data packet: the per- hop latency is lower bounded by the preamble length. Over a multi-hop path, this latency can accumulate to become quite substantial. Preamble Sampling MAC Protocols 76
  • 77. Goals:  Simple and predictable; Effective collision avoidance by improving CCA  Tolerable to changing RF/networking conditions  Low power operation; Scalable to large numbers of nodes; Small code size and RAM usage CCA  MAC must accurately determine if channel is clear  Need to tell what is noise and what is a signal  Ambient noise is prone to environmental changes  BMAC solution: ‘software automatic gain control’  Signal strength samples taken when channel is assumed to be free – When?  immediately after transmitting a packet  when the data path of the radio stack is not receiving valid data  Samples go in a FIFO queue (sliding window)  Median added to an EWMA (exponentially weighted moving average with decay α) filter  Once noise floor is established (What is a good estimate?), a TX requests starts monitoring RSSI from the radio CCA: Thresholding vs. Outlier Detection  Common approach: take single sample, compare to noise floor  Large number of false negatives  BMAC: search for outliers in RSSI  If a sample has significantly lower energy than the noise floor during the sampling period, then channel is clear Berkeley MAC (BMAC): Overview 77
  • 78.  0=busy, 1=clear  Packet arrives between 22 and 54 ms  Single-sample thresholding produces several false ‘busy’ signals BMAC 78
  • 79.  Series of short preamble packets each containing target address information  Minimize overhearing problem  Reduce latency and reduce energy consumption  Strobed preamble: pauses in the series of short preamble packets  Target receiver can shorten the strobed preamble via an early ACK  Small pauses between preamble packets permit the target receiver to send an early ACK  Reduces latency for the case where destination is awake before preamble completes  Non-target receivers that overhear the strobed preamble can go back to sleep immediately  Preamble period must be greater than sleep period  Reduces per-hop latency and energy XMAC: Overview 79
  • 80. Wireless Sensor (Wise) MAC: Overview WiseMAC uses a scheme that learns the sampling schedule of direct neighbors and exploits this knowledge to minimize the wake-up preamble length  ACK packets, in addition to a carrying the acknowledgement for a received data packet, also have information about the next sampling time of that node  Node keeps a table of the sampling time offsets of all its usual destinations up-to-date  Node transmits a packet just at the right time, with a wake-up preamble of minimized size 80
  • 81. Wireless Sensor (Wise) MAC: I How does the system cope with Clock drifts ?  Clock drifts may make the transmitter lose accuracy about the receiver’s wakeup time.  Transmitter uses a preamble that is just long enough to make up for the estimated maximum clock drift.  The length of the preamble used in this case depends on clock drifts: the smaller the clock drift, the shorter the preamble the transmitter has to use. What if the node has no information about the wakeup time of a neighbor node ?  Node uses a full-length preamble 81
  • 82. Function Protocols Flexible MAC Structure IEEE 802.15.4 CSMA inside TDMA Slots ZMAC Minimizing Convergecast Effect Funneling MAC, MH-MAC Slotted and Sampling SCP Receiver based Scheduling Crankshaft Hybrid Protocols 82
  • 83. Funneling MAC: Overview ConvergcastComms High traffic intensity: 80% of packet loss happens in the 2-hop region from the SINK 83
  • 84. IEEE 802.15.4 MAC: Overview  Two different channel access methods  Beacon-Enabled duty-cycled mode (typically, used in FFD networks)  Non-Beacon Enabled mode (aka Beacon Disabled mode) 84
  • 85. IEEE 802.15.4 Beacon Enabled Mode CAP: Contention Access Period | CFP: Collision Free Period | GTS: Guaranteed Time Slot  Node listen to Beacon and check IF GTS is reserved  If YES: remain powered off until GTS is scheduled  If NO: Performs CSMA/CA during CAP  Synchronization  Sync with Tracking Mode  Sync with Non Tracking Mode 85
  • 86. A Tribute to Fieldbus Technology … 86
  • 87. Milestones of Fieldbus Evolution and Related Fields 87
  • 88. MAC Strategies in Fieldbus systems 88
  • 90. IP over IEEE 802.15.4 L2: MAC L4: ROUTING L5: TRANSPORT L6: APP L3: NETWORK L1: PHY IPv6 over IEEE 802.15.4 IPv6 over IEEE 802.15.4 90
  • 91. Field Devices: Network Topology Planning  STAR topologies are the easiest to setup and manage  STAR will simply the network design, and if there is just 1-hop communication between the field devices and gateway, then the need for the "routing layer" on the stack of the field devices many not arise ... thereby making it more energy efficient and lightweight.  TREE and MESH are also interesting concepts, but they are very tedious to manage. 91
  • 92. IPv6 over IEEE 802.15.4 (6LoWPAN) Benefits of IP over 802.15.4 (RFC 4919)  The pervasive nature of IP networks allows use of existing infrastructure  IP-based technologies already exist, are well-known, and proven to be working  Open and freely available specifications vs. closed proprietary solutions  Tools for diagnostics, management, and commissioning of IP networks already exist  IP-based devices can be connected readily to other IP-based networks, without the need for intermediate entities like translation gateways or proxies 92
  • 93. 6LoWPAN Challenge Header Size Calculation  IPv6 header is 40 octets, UDP header is 8 octets  802.15.4 MAC header can be up to:  25 octets (null security)  25+21=46 octets (AES-CCM-128)  With the 802.15.4 frame size of 127 octets, the following space left for application data:  127-25-40-8 = 54 octets (null security)  127-46-40-8 = 33 octets (AES-CCM-128) IPv6 MTU Requirements  IPv6 requires that links support an MTU of 1280 octets  Link-layer fragmentation / reassembly is needed 93
  • 94. 6LoWPAN Overview (RFC 4944) Overview  An adaptation layer allowing transport of IPv6 packets over 802.15.4 links  Uses 802.15.4 in unslotted CSMA/CA  Based on IEEE standard 802.15.4-2003  Fragmentation / reassembly of IPv6 packets  Compression of IPv6 and UDP/ICMP headers  Mesh routing support (mesh under)  Low processing / storage costs 94
  • 95. 6LoWPAN Dispatch Codes  All 6LoWPAN encapsulated datagrams are prefixed by an encapsulation header stack  Each header in the stack starts with a header type field followed by zero or more header fields 95
  • 96. 6LoWPAN Frame Formats Uncompressed IPv6/UDP (worst case scenario)  Dispatch code (010000012) indicates no compression  Up to 54 / 33 octets left for payload with a max. size MAC header with null / AES-CCM-128 security  The relationship of header information to application payload is obviously really bad 96
  • 97. 6LoWPAN Frame Formats Compressed Link-local IPv6/UDP (best case scenario)  Dispatch code (010000102) indicates HC1 compression  HC1 compression may indicate HC2 compression follows  This shows the maximum compression achievable for link-local addresses (does not work for global addresses)  Any non-compressible header fields are carried after the HC1 or HC1/HC2 tags (partial compression) 97
  • 98. Header Compression, Fragmentation & Reassembly Compression Principles (RFC 4944)  Omit any header fields that can be calculated from the context, send the remaining fields unmodified  Nodes do not have to maintain compression state (stateless compression)  Support (almost) arbitrary combinations of compressed / uncompressed header fields Fragmentation Principles (RFC 4944)  IPv6 packets to large to fit into a single 802.15.4 frame are fragmented  A first fragment carries a header that includes the datagram size (11 bits) and a datagram tag (16 bits)  Subsequent fragments carry a header that includes the datagram size, the datagram tag, and the offset (8 bits)  Time limit for reassembly is 60 seconds 98
  • 99. Routing Layer Protocol L2: MAC L4: ROUTING L5: TRANSPORT L6: APP L3: NETWORK L1: PHY 99
  • 100. How “Lossy” is Lossy ?  LLN Link Characteristics:  High BER  Frequency packet drops  High instability  LLN failures are frequent and usually transient 100
  • 101. Routing Protocol for Low-power Lossy Links (RPL): Key Highlights RPL :  Highly modular  (Core + Additional) modules  Designed specifically for “lossy” networks  Under-reacts to LLN link changes  Agnostic to underlying link layer technology  Is a proactive IPv6 distance vector protocol  Builds a Destination Oriented Directed Acyclic Graph (DODAG) based on an objective  Supports many-to-one, one-to-many, point-to-point communication  Supports different LLN application requirements  Urban (RFC 5548)  Industrial (RFC 5673)  Home (RFC 5826)  Building (RFC 5867) 101
  • 102.  RPL builds DODAGs  DODAG: set of vertices connected by directed edges with no directed cycles  In contrast to trees, DODAGs offer redundant paths  RPL supports DODAGs instance  Concept similar to multi-topology routing (MTR) as done in OSPF  Allows a node to join multiple DODAGs according to different Objective Functions (OF)  There can be multiple DODAGs within a RPL instance  A node can, therefore, belong to multiple RPL instances  Identifications:  DODAG -> {RPLInstanceID}  Unique identity of DODAG: {RPLInstanceID, DODAGID} RPL: DODAG and Instances 102
  • 103. RPL: DODAG and Instances Traffic moves either up towards the DODAG root or down towards the DODAG leafs DODAG Properties  Many-to-one communication: upwards  One-to-many communication: downwards  Point-to-point communication: upwards-downwards RPL Instance Properties  RPL Instance has an optimization objective  Multiple RPL Instances with different optimization objectives can coexist A typical example would be an energy-efficient topology for background traffic along with a low-latency topology for delay-sensitive alarms. 103
  • 104. RPL: TerminologyRPL: Terminology A node’s Rank defines the node’s individual position relative to other nodes with respect to a DODAG root. The scope of Rank is a DODAG Version. Route Construction  Up routes towards nodes of decreasing rank (parents)  Down routes towards nodes of increasing rank  Nodes inform parents of their presence and reachability to descendants  Source route for nodes that cannot maintain down routes Forwarding Rules  All routes go upwards and/or downwards along a DODAG  When going up, always forward to lower rank when possible, may forward to sibling if no lower rank exists  When going down, forward based on down routes Once a non-root node selects its parent set, it can use the following table to covert the path cost of a | Node/link Metric | Rank | | Hop-Count | Cost | | Latency | Cost/65536 | | ETX | Cost | 104
  • 105. RPL: Control Messages DAG Information Object (DIO) A DIO carries information that allows a node to discover an RPL Instance, learn its configuration parameters and select DODAG parents DAG Information Solicitation (DIS) A DIS solicits a DODAG Information Object from an RPL node Destination Advertisement Object (DAO) A DAO propagates destination information upwards along the DODAG 105
  • 106. RPL: DODAG Construction Construction  Nodes periodically send link-local multicast DIO messages  Stability or detection of routing inconsistencies influence the rate of DIO messages  Nodes listen for DIOs and use their information to join a new DODAG, or to maintain an existing DODAG  Nodes may use a DIS message to solicit a DIO  Based on information in the DIOs the node chooses parents that minimize path cost to the DODAG root Essentially a distance vector routing protocol with ranks to prevent count-to-infinity problems 106
  • 107. Application Layer Protocols L2: MAC L4: ROUTING L5: TRANSPORT L6: APP L3: NETWORK L1: PHY IPv6 over IEEE 802.15.4 IPv6 over IEEE 802.15.4 CoAP 107
  • 108. Constrained Application Protocol CoAP: Key Features CoAP (RFC 7252):  Web transfer protocol (coap://) for use with constrained nodes and networks  Based on RESTful protocol design minimizing the complexity of mapping with HTTP  Asynchronous transaction model  Default bound to UDP, and optionally to DTLS  Low header overhead and parsing complexity  URI and content-type support  Subset of MIME types and HTTP response codes  Has GET, POST, PUT, DELETE methods 108
  • 109. CoAP: Transaction Model UDP DTLS … CoAP Message Sub-layer Reliability Request/Response Sub-layer RESTful interaction  Transport  UDP ( + DTLS)  Base Messaging  Simple message exchange between endpoints  Confirmable or Non-Confirmable message answered by Acknowledgment or Reset message  REST Semantics  REST Request/Response piggybacked on CoAP messages  Method, Response code and Options (URI, content-type, etc.,) 109
  • 110. CoAP: Message Format  Header (4 Bytes)  Ver - Version (1)  T – Message type (Confirmable, Non-Confirmable, Acknowledgment, Reset)  TKL – Token length, if any, number of token bytes after the header  Code – Request method (1-10), Response code (40-255)  Message ID – Identifier for matching response  Token (0-8 Bytes) 110
  • 112. CoAP: Dealing with Packet Loss 112
  • 113. Other Popular App Layer Protocols 113
  • 114. Putting it all together … 114
  • 116. Internet of Things : “Last-mile” Communication Nuances
  • 117. Lessons learnt from WSN deployments “at-scale” … 117
  • 118. SEEDLING @ UNSW, Sydney URL : http://cgi.cse.unsw.edu.au/~sensar/seedling/Seedling.html Objective: 1. Show-case a basic prototype of a WSN System in precision agriculture 2. Understand sensornet deployment challenges 3. Increase the interest of high-school students in ICT 118
  • 119.  Choosing a radio transceiver that gave low-power, long-range links  A robust MAC protocol  Simple network topology and planning  Easy network reconfiguration  Simple uniform data representation  Early adoption of solar power for sensor networks Factors CRITICAL to the SUCCESS of Deployments Limited Energy Reserves – PREMIUM Resource Under MAC Control 119
  • 120. These lessons are also RELEVANT today … 120
  • 121. LESSON – 1 … 121
  • 122. Low POWER Low ENERGY Wireless Communication Links: Power is NOT Energy POWER TIME E1 E2  Message Passing / Time to Transmit ALSO governs Energy  Transmit it as a single long packet  Easy to be corrupted  Transmit as many independent packets  Higher control overhead & longer delay  Divide into fragments, but transmit all in burst 122
  • 123. LESSON – 2 … 123
  • 124. Wireless Communication Links: “Longer the Better” Reduced hops help to obtain better PRR with lesser field devices Configuration - 1 Configuration - 2 124
  • 125. LESSON – 3 … 125
  • 126. IP Adaptation MAC PHY Routing Transport App IP A Routing Layer can be AVOIDED with Smart Network Planning If a single hop (with long link) suffices the purpose, then a routing layer may not be required … save ENERGY IP Adaptation MAC PHY Transport App IP 126
  • 127. LESSON – 4 … 127
  • 128. Long Power, Long Links are “GREY”  Approximately 70% of low power, long range links are GREY (i.e., neither good or bad)  Very difficult to predict link behavior 128
  • 129. Characterizing Low Power Links – Tx Variation Tx power variation can happen … 7dB is a large variation 129
  • 130. Characterizing Low Power Links – Rx Variation Rx sensitivity variation … 130
  • 131. Characterizing Low Power Links – Tx/Rx Dual Mode vs. Rx Only Mode Power variation in Tx/Rx dual mode vs. Rx only mode 131
  • 132. LESSON – 5 … 132
  • 133.  Why do we need MAC ?  Wireless channel is a shared medium  Radios, within the communication range of each other and operating in the same frequency band, interfere with each others transmission  Interference -> Collision -> Packet Loss -> Retransmission -> Increase in net energy  The role of MAC  Co-ordinate access to and transmission over the common, shared (wireless) medium  Can traditional MAC methods be directly applied to WSN ?  Control -> often decentralized  Data -> low load but convergecast communication pattern  Links -> highly volatile/dynamic  Nodes/Hops -> Scale is much larger  Energy is the BIGGEST concern  Network longetivity, reliability, fairness, scalability and latency are more important than throughput MAC is Crucial … Design/Choose it Carefully !!! 133
  • 134. MAC Family Reservation (Scheduled, Synchronous) Contention (Unscheduled, Asynchronous)  Reservation-based  Nodes access the channel based on a schedule  Examples: TDMA  Limits collisions, idle listening, overhearing  Bounded latency, fairness, good throughput (in loaded traffic conditions)  Saves node power by pointing them to sleep until needed  Low idle listening  Dependencies: time synchronization and knowledge of network topology  Not flexible under conditions of node mobility, node redeployment and node death: complicates schedule maintenance  Contention-based  Nodes compete (in probabilistic coordination) to access the channel  Examples: ALOHA (pure & slotted), CSMA  Time synchronization “NOT” required  Robust to network changes  High idle listening and overhearing overheads MAC Taxonomy 134
  • 135. MAC: Reservation vs. Contention 135
  • 136. LESSON – 6 … 136
  • 137. Understand the Application’s Traffic Pattern 137
  • 139. The FORTUNE TELLER or NOT …  Low power, long range communication is a very different ball game compared to standard communication technologies.  Many attributes that are known to work in regular communications will “shock you” in low-power communications.  Take inspiration from the many WSN deployments that have already studied these artifacts, rather than hypothesizing “again”. 139
  • 140. Internet of Things : “Last-mile” Existing and Upcoming Standards
  • 141. Existing Stack using IEEE 802.15.4 as the PHY Layer 141
  • 142. Popular IETF Stack for Field Devices: RFC Portfolio 142
  • 143. Popular IETF Stack for Field Devices: Other RFC Portfolio 143
  • 144. Thread Stack for Field Devices 144
  • 145. “New” IETF Stack for Field Devices: +6TiSCH 145
  • 149. Interoperability via Data Semantics: IEEE 1451 + IEEE 2700 ?  IEEE 1451 (TEDS) is a well-established standard in industrial automation for achieving plug-n-play capability with the help of electronic datasheets.  TEDS is the electronic version of the data sheet that is used to configure a sensor.  TEDS brings forward the concept that if the data sheet is electronic and can be readily accessed upon sensor discovery, it becomes possible to configure the sensor automatically.  This is analogous to plugging a mouse, keyboard, or monitor into a computer and using it without any manual configuration.  TEDS enables self-configuration of the system through self-identification and self-description of sensors and actuators (i.e., plug-and-play).  IEEE 2700 is a standard for sensor performance parameter definitions (including calibration-related specifications). 149
  • 150. Internet of Things : Derivatives for Intelligence
  • 151. The Data to Knowledge Pipeline. [Diagram: data sources (cyber & physical space entities) -> edge (“little” data infra) -> global infra (“big” data infra), spanning data ingestion, data analysis and applications, and ending in decision making with knowledge.] NATURE of INGESTED DATA: DATA @ REST (VOLUME) archival/static data (TBs) in data stores | DATA @ MOTION (VELOCITY) streaming data | DATA @ MANY FORMS (VARIETY) structured/unstructured, text, multimedia, audio, video | DATA @ DOUBT (VERACITY) data with uncertainty that may be due to incompleteness, missing points, etc. NATURE of DATA ANALYSIS: DESCRIPTIVE What has happened ? | DIAGNOSTIC Why did this happen ? | PREDICTIVE What could happen ? | PRESCRIPTIVE What are the best outcomes ? 151
  • 152. Nature of Data Analysis. [Chart: value vs. skill, progressing from information to optimization.] Descriptive (“WHAT has happened ?”, delivered as DASHBOARDS) and Diagnostic (“WHY did this happen ?”) provide hindsight and insight into the PAST; Predictive (“WHAT could happen ?”, delivered as FORECASTS) and Prescriptive (“WHAT should we do ?”, delivered as ACTIONS, RULES, RECOMMENDATIONS) provide foresight and insight into the FUTURE. Both the value and the skill required increase along this progression. 152
  • 153. Example: Energy Analysis for a PV Microgrid Descriptive: What are the total energy, the instantaneous energy and power, etc.? Diagnostic: Why is the panel temperature decreasing when the solar irradiance is high and the wind speed is very low ? Predictive: Can I forecast the plant output for tomorrow, or can I generate 4 kWh of net energy ? Prescriptive: What actions should be undertaken for the plant to reach a 4 kW generation capacity from its current 2 kW ? 153
  • 154. Example: Self Health Monitoring of a Multi-rotor MAV Descriptive: What are the total input power (voltage and current), thrust, vibration and ego-noise profiles, and motor/propeller unit RPM ? Diagnostic: Why is the THRUST not increasing with increasing RPM ? Predictive: What is the success probability of the upcoming mission, given the flight and structural health history ? Prescriptive: What actions should be taken to increase the success probability of the upcoming mission from 75% to 90% ? 154
  • 155. Machine/System Intelligence … Depending on the type and quality of analytics, machines/systems could manifest themselves into:  Informed Systems — Systems That Know/Aware  Adaptive Systems — Systems That Learn  Cognitive Systems — Systems That Reason and Plan 155
  • 156. Deriving Machine Intelligence  Reason and Plan (with Uncertain Knowledge)  Probabilistic Reasoning:  Bayesian Networks  Conditional Distributions  Probabilistic Reasoning over Time:  Hidden Markov Models  Kalman Filters  Dynamic Bayesian Networks  Simple Decisions:  Utility Theory  Decision Networks  Expert Systems  Complex decisions:  Partially Observable Markov Decision Processes (POMDP)  Game Theoretic Models  Learning  Supervised | Semi-supervised | Unsupervised | Reinforcement  Classification  Regression  Clustering 156
  • 158. ML computational methods / algorithms :  LEARN information directly from data, “without” relying on predetermined models  FIND natural patterns in data, which help to generate insights for better decisions and predictions ML teaches Machines to do what “naturally” comes to Humans and Animals “LEARN from EXPERIENCE” 158
  • 159. ML Techniques SUPERVISED Develop a predictive model, based on evidence (both input and output data) UNSUPERVISED Group and interpret data, based only on input data (without labels) CLASSIFICATION Predicts discrete responses (e.g., email: genuine vs. spam; tumor: cancerous vs. benign) REGRESSION Predicts continuous responses (e.g., changes in temperature; fluctuations in power demand) CLUSTERING Finds hidden patterns or groupings (e.g., object recognition) When to use ?  When you want to train a model to make a prediction.  When you have existing <input, output> data for the response that you are trying to predict. When to use ?  When you want to train a model to find a good internal representation.  When you want to explore your data, but don’t yet have a specific goal, or are not sure what information the data contains.  When you want to reduce the dimensions of your data. When to use ?  When you are working with data that can be tagged or categorized. When to use ?  When you are working with data ranges, and want to predict trends. 159
  • 160. Selecting the Right Algorithm [Diagram: ML techniques split into SUPERVISED, with CLASSIFICATION (Support Vector Machines, Discriminant Analysis, Naive Bayes, Nearest Neighbor) and REGRESSION (Linear Regression, Ensemble Methods, Decision Trees, Neural Networks), and UNSUPERVISED, with CLUSTERING (K-Means, K-Medoids, Fuzzy C-Means, Hierarchical, Gaussian Mixture).]  Is it TRIAL and ERROR ?  Is it a trade-off between:  Speed of training  Memory usage  Predictive accuracy on new data  Transparency / interpretability (how easily can you understand the reasons for an algorithm's predictions) ?  Using larger training datasets often yields models that generalize well to new data 160
  • 161. ML Workflow. Input: acquired data (Sensor/Image/Video/Transactional) -> Data Representation -> Preprocessing (identify good and bad data portions; identify missing samples/values; detect outliers) -> Feature Extraction -> Build / Train Model (prepare cross-validation) -> Improve Model 161
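A minimal Python sketch of the preprocessing sub-goals (pandas/scikit-learn; the toy table, forward-fill policy and z-score threshold are illustrative assumptions):

    import numpy as np
    import pandas as pd
    from sklearn.model_selection import train_test_split

    # Hypothetical raw sensor table with a gap and an outlier.
    df = pd.DataFrame({
        "temp":  [21.1, 21.3, np.nan, 21.2, 85.0, 21.4],
        "label": [0, 0, 1, 0, 1, 1],
    })

    # Identify missing samples and fill them (forward fill is one simple choice).
    df["temp"] = df["temp"].ffill()

    # Flag outliers with a z-score test and drop them.
    z = (df["temp"] - df["temp"].mean()) / df["temp"].std()
    df = df[z.abs() < 2.0]

    # Hold data out now, so cross-validation of the model is prepared up front.
    X_train, X_val, y_train, y_val = train_test_split(
        df[["temp"]], df["label"], test_size=0.3, random_state=0)
    print(len(X_train), "training rows,", len(X_val), "validation rows")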
  • 162. ML Workflow: Feature Derivation  The number of features that could be derived is limited only by our imagination !!! Sensor data  Extract signal properties from raw sensor data  Peak analysis (frequency, power, etc.)  Pulse and transition analysis (rise time, fall time, settling time, etc.)  Spectral analysis (power, bandwidth, frequency & its span, etc.) Image/Video data  Extract features such as edge locations, resolution, color …  Bag of visual words (create a histogram of local image features: edges, corners, blobs, etc.)  Histogram of oriented gradients  Minimum eigenvalue (detect corner locations in images)  Edge detection (identify points where the degree of brightness changes sharply) Transactional data  Calculate derived features that enhance the information in the data  Time decomposition (break timestamps down into components such as day and month)  Aggregate value calculation (create higher-level features such as the total number of times a particular event occurred) 162
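A short Python sketch of two of these derivations (the signal, sampling rate and timestamps are made-up assumptions): spectral features from a sensor trace, and time decomposition plus an aggregate count from transactional data.

    import numpy as np
    import pandas as pd

    # --- Sensor data: spectral features from a synthetic vibration signal ---
    rng = np.random.default_rng(0)
    fs = 1000.0                                   # assumed sample rate (Hz)
    t = np.arange(0, 1, 1 / fs)
    signal = np.sin(2 * np.pi * 50 * t) + 0.3 * rng.standard_normal(t.size)

    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(signal.size, 1 / fs)
    features = {
        "peak_freq_hz": freqs[np.argmax(spectrum)],   # dominant frequency
        "total_power": float(np.sum(spectrum ** 2)),  # crude power estimate
    }

    # --- Transactional data: time decomposition + aggregate count ---
    tx = pd.DataFrame({"ts": pd.to_datetime(
        ["2017-03-01 09:15", "2017-03-01 18:40", "2017-03-02 09:05"])})
    tx["hour"] = tx["ts"].dt.hour                 # time decomposition
    tx["day"] = tx["ts"].dt.day
    features["events_per_day"] = tx.groupby("day").size().mean()

    print(features)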
  • 164. k-Nearest Neighbor (kNN) How it Works ?  Categorizes data points based on the classes of their nearest neighbors in the dataset (“guilty by association”).  Motivating insight: data points near to each other tend to be similar.  Non-parametric: does not make any assumptions about the distribution of the data.  Metric for nearest neighbor: distance, either Euclidean (most popular), city block, Chebychev, correlation, cosine, etc.  Choose K to be ODD for a clear majority (in binary problems). [Figure: decision boundaries for K = 1 vs. K = 15; larger K gives smoother, more defined boundaries.] Best Used :  When you want a method that has no training phase (often called a lazy learner).  When response time, memory and space are of lesser concern (you need to store not just the algorithm, but also the training data).  Bear in mind that kNN can be fooled by irrelevant inputs (i.e., it is less robust to noise).  When you need a simple algorithm to establish benchmark learning rules. 164
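A minimal kNN sketch with scikit-learn (synthetic data; k = 15 and the Euclidean metric mirror the slide, everything else is an arbitrary assumption):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    X, y = make_classification(n_samples=200, n_features=4, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # k is odd so a binary vote cannot tie; Euclidean distance is the default.
    knn = KNeighborsClassifier(n_neighbors=15, metric="euclidean")
    knn.fit(X_tr, y_tr)                     # "training" just stores the data
    print("k=15 accuracy:", knn.score(X_te, y_te))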
  • 165. Logistic Regression How it Works ?  Fits a model that can predict the probability of a binary response belonging to one class or the other: y = 1 / (1 + e^-(β0 + β1·x)) [Plots: data along x with a binary response y, and the fitted sigmoid curve.] Best Used :  When the dependent variable is BINARY.  When data can be clearly separated by a single, linear boundary.  When a baseline is needed for evaluating more complex classification methods. 165
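A minimal sketch fitting the sigmoid above with scikit-learn (the 1-D toy data is an assumption):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # 1-D toy data: the response flips from class 0 to class 1 around x = 0.
    rng = np.random.default_rng(0)
    x = rng.uniform(-3, 3, size=(200, 1))
    y = (x[:, 0] + 0.5 * rng.standard_normal(200) > 0).astype(int)

    clf = LogisticRegression().fit(x, y)
    b0, b1 = clf.intercept_[0], clf.coef_[0, 0]
    print(f"fitted sigmoid: y = 1 / (1 + exp(-({b0:.2f} + {b1:.2f} x)))")
    print("P(class 1 | x = 1.0):", clf.predict_proba([[1.0]])[0, 1])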
  • 166. Support Vector Machines How it Works ?  Classifies data by finding the linear decision boundary (hyperplane) that separates all data points of one class from those of the other class.  When the data is linearly separable:  the best hyperplane is the one with the largest margin between the two classes.  When the data is not linearly separable:  use a kernel transform to map the nonlinearly separable data into higher dimensions, where a linear decision boundary can be found.  use a loss function to penalize points on the wrong side of the hyperplane. Best Used :  When data has exactly two classes.  multiclass classification can be performed with a divide-and-conquer approach  When data is complex, has high dimensionality, and is nonlinearly separable.  When data is limited.  When you need a classifier that’s simple, easy to interpret, and accurate.  When fast response is needed. [Figure: maximum-margin hyperplane; the data points lying on the margin are the support vectors.] 166
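A small sketch of the kernel idea with scikit-learn (concentric circles are a standard toy assumption): a linear kernel fails on nonlinearly separable data, while an RBF kernel implicitly lifts it into a space where a linear boundary exists.

    from sklearn.datasets import make_circles
    from sklearn.svm import SVC

    # Concentric circles are not linearly separable in 2-D.
    X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

    linear = SVC(kernel="linear").fit(X, y)   # struggles: no separating line
    rbf = SVC(kernel="rbf").fit(X, y)         # kernel trick lifts the data
    print("linear kernel accuracy:", linear.score(X, y))
    print("rbf kernel accuracy:   ", rbf.score(X, y))
    print("support vectors used:  ", rbf.support_vectors_.shape[0])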
  • 167. Neural Networks How it Works ?  Consists of highly connected networks of neurons, which relate (map) the inputs to the desired outputs.  The network is trained by iteratively modifying the strengths (i.e., weights) of the connections, so that given inputs map to the correct response. Best Used :  When modeling highly nonlinear systems.  When computation cost is of lesser concern.  When model interpretability is not a key concern* (… although there is work on interpreting individual layers and on suggesting how many neurons are needed, so interpretability is improving; recurrent variants can also handle temporal information …)  When there could be unexpected changes in your input data* (… for which the network has to be deep, with a large number of neurons …) 167
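A minimal sketch with scikit-learn's MLPClassifier (the two-moons data and one hidden layer of 16 neurons are arbitrary assumptions):

    from sklearn.datasets import make_moons
    from sklearn.neural_network import MLPClassifier

    # Two interleaving half-moons: a small nonlinear problem.
    X, y = make_moons(n_samples=400, noise=0.2, random_state=0)

    # One hidden layer of 16 neurons; connection weights are adjusted
    # iteratively (backpropagation) so inputs map to the correct response.
    net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
    net.fit(X, y)
    print("training accuracy:", net.score(X, y))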
  • 168. Naïve Bayes How it Works ?  Based on Bayes' probability theorem, it assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature.  Classifies new data based on the highest probability of its belonging to a particular class. P(c|x) = P(x|c) P(c) / P(x), i.e., Posterior = Likelihood ratio x Prior, where c = HYPOTHESIS (class); x = EVIDENCE (predictor variable / new data point); P(c) = probability of the hypothesis before seeing the evidence; P(c|x) = probability of the hypothesis after seeing the evidence. Best Used :  When the assumption of feature independence holds TRUE; it can then easily outperform other well-known techniques with less training data.  When the model is expected to encounter scenarios that weren’t in the training data.  When CPU and memory resources are a limiting factor* (… although a dataset is still needed for likelihood estimation …).  When you want a method that doesn’t overfit.  When you want a method that can update itself with continuously arriving data.  When you need a classifier that’s easy to interpret. 168
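A minimal Gaussian naive Bayes sketch with scikit-learn (the Iris dataset is a stand-in assumption); note partial_fit, which lets the model update itself as new data arrives:

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB

    X, y = load_iris(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Each feature is treated as conditionally independent given the class;
    # prediction picks the class c maximizing P(x|c) * P(c).
    nb = GaussianNB().fit(X_tr, y_tr)
    print("accuracy:", nb.score(X_te, y_te))
    print("class priors P(c):", nb.class_prior_)

    # Incremental update with a new batch of labeled data.
    nb.partial_fit(X_te, y_te)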
  • 169. Discriminant Analysis How it Works ?  Classifies data by finding linear combinations of features.  Assumes that the different classes generate data according to Gaussian distributions.  Training a discriminant analysis model involves finding the parameters of a Gaussian distribution for each class.  The distribution parameters are used to calculate boundaries, which can be linear or quadratic functions; these boundaries are then used to determine the class of new data. Best Used …  When memory usage during training is a concern.  When you need a model that is fast to predict.  When you need a simple model that is easy to interpret. 169
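A minimal sketch with scikit-learn (Iris as a stand-in): LDA assumes one Gaussian per class with a shared covariance, giving linear boundaries; QDA fits one covariance per class, giving quadratic boundaries.

    from sklearn.datasets import load_iris
    from sklearn.discriminant_analysis import (
        LinearDiscriminantAnalysis, QuadraticDiscriminantAnalysis)

    X, y = load_iris(return_X_y=True)

    # Training estimates the per-class Gaussian parameters; prediction uses
    # the resulting (linear or quadratic) boundaries.
    for model in (LinearDiscriminantAnalysis(), QuadraticDiscriminantAnalysis()):
        print(type(model).__name__, "accuracy:", model.fit(X, y).score(X, y))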
  • 170. Decision Trees How it Works ?  Represents a procedure for classifying categorical data based on attributes.  Decides which attribute to test at a node by determining the “best” way to separate the data (the splitting point)  picks the attribute that has the highest information gain. [Example figure: a decision tree for the concept buys_computer, indicating whether a customer at AllElectronics is likely to purchase a computer. Each internal (non-leaf) node represents a test on an attribute; each leaf node represents a class (either buys_computer = yes or buys_computer = no).] Best Used :  When handling large datasets.  When there is a need to ignore redundant variables and handle missing data elegantly* (… the amount of missing data should be small …).  When memory usage needs to be minimized.  When decision traceability is needed. 170
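A minimal sketch with scikit-learn (Iris as a stand-in, since the AllElectronics table is not available here); criterion="entropy" selects splits by information gain, and export_text makes every decision traceable:

    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier, export_text

    X, y = load_iris(return_X_y=True)

    # "entropy" picks, at each node, the split with the highest information gain.
    tree = DecisionTreeClassifier(criterion="entropy", max_depth=3,
                                  random_state=0).fit(X, y)

    # Printed rules: one root-to-leaf path per classification decision.
    print(export_text(tree, feature_names=load_iris().feature_names))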
  • 171. Bagged and Boosted Decision Trees  How do Bagging and Boosting get N learners ? Trees are simple, but often produce noisy (bushy) or weak (stunted) classifiers. In these ensemble methods, several “weaker” decision trees are combined into a “stronger” ensemble.  Why are the data elements weighted ? 171
  • 172. Bagged and Boosted Decision Trees  How does the classification stage work ? 172
  • 173. Bagged and Boosted Decision Trees
Similarities | Differences
Both are ensemble methods to get N learners from 1 learner | Bagging: builds the N learners independently. Boosting: tries to add new models that do well where previous models fail.
Both generate several training data sets by random sampling | Bagging: no weighting strategy. Boosting: determines weights for the data to tip the scales in favor of the most difficult cases.
Both make the final decision by averaging the N learners (or taking the majority vote) | Bagging: an equally weighted average. Boosting: a weighted average (i.e., more weight to learners with better performance on the training data).
Both are good at reducing variance and providing higher stability | Bagging: may solve the over-fitting problem, but may not reduce bias. Boosting: may increase the over-fitting problem, but tries to reduce bias.
Best Used :  When there is a need to minimize prediction variance. Boosting > Random Forests > Bagging > Single Tree 173
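A minimal comparison sketch with scikit-learn (the synthetic dataset and default ensemble settings are assumptions):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                                  RandomForestClassifier)
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=500, n_features=20, random_state=0)

    models = {
        "single tree":   DecisionTreeClassifier(random_state=0),
        "bagging":       BaggingClassifier(random_state=0),       # independent learners
        "random forest": RandomForestClassifier(random_state=0),  # bagging + feature subsets
        "boosting":      AdaBoostClassifier(random_state=0),      # reweights hard cases
    }
    for name, model in models.items():
        score = cross_val_score(model, X, y, cv=5).mean()
        print(f"{name:13s} CV accuracy: {score:.3f}")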
  • 175. Linear/Non-linear/Gaussian Process Regression How it Works ?  Describes a continuous response variable as a linear / non-linear / Gaussian-process function of the predictors. Linear regression : Best Used …  When you need an algorithm that is easy to interpret and fast to fit.  When you need a baseline for evaluating other, more complex regression models. Non-linear regression : Best Used …  When data has strong nonlinear trends and cannot be easily transformed into a linear space.  When you need to fit custom models to the data. Gaussian Process regression (Kriging) : Best Used …  When interpolation needs to be performed in the presence of uncertainty. [Plots: linear regression, non-linear regression, kriging.] 175
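A minimal sketch with scikit-learn (a noisy sine is the assumed data): a linear baseline next to a Gaussian process, which returns an uncertainty estimate along with the interpolated mean.

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    X = np.sort(rng.uniform(0, 5, size=(40, 1)), axis=0)
    y = np.sin(X).ravel() + 0.1 * rng.standard_normal(40)

    lin = LinearRegression().fit(X, y)           # fast, easy-to-interpret baseline
    gp = GaussianProcessRegressor().fit(X, y)    # kriging: interpolation + uncertainty

    x_new = [[2.5]]
    mean, std = gp.predict(x_new, return_std=True)
    print("linear baseline:", lin.predict(x_new)[0])
    print(f"GP prediction:  {mean[0]:.3f} +/- {std[0]:.3f}")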
  • 176. SVM Regression / Regression Tree SVM Regression: How it Works ?  Works like the SVM classification algorithm, but is modified to predict a continuous response.  Instead of finding a hyperplane that separates data, it finds a model that deviates from the measured data by no more than a small amount, with parameter values that are as small as possible (to minimize sensitivity to error). SVM Regression: Best Used :  For high-dimensional data, where there will be a large number of predictor variables.  When data is limited and the number of predictor variables is large. Regression Tree: How it Works ?  Works like a decision tree, but is modified to predict a continuous response. Regression Tree : Best Used :  When predictors are categorical (discrete) or behave nonlinearly. 176
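A minimal sketch of both with scikit-learn (the step-like toy response is an assumption); in SVR, epsilon is the "tube" inside which deviations are not penalized:

    import numpy as np
    from sklearn.svm import SVR
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(200, 1))
    y = np.sign(X[:, 0]) + 0.1 * rng.standard_normal(200)  # step-like response

    # Deviations smaller than epsilon cost nothing; the rest are penalized.
    svr = SVR(epsilon=0.1).fit(X, y)
    tree = DecisionTreeRegressor(max_depth=3).fit(X, y)    # handles the jump well

    print("SVR  at x=1:", svr.predict([[1.0]])[0])
    print("tree at x=1:", tree.predict([[1.0]])[0])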
  • 178. Clustering Analysis  Data is partitioned into groups (or clusters) based on some measure of similarity or shared characteristic.  Clusters are formed so that objects in the same cluster are very similar and objects in different clusters are very distinct.  Hard Clustering: each data point belongs to only ONE cluster. Soft Clustering: each data point can belong to MORE than ONE cluster.  When the data grouping is UNKNOWN: search for possible clusters, and use cluster evaluation to look for the “best” number of groups for a given clustering algorithm. [Diagram: clustering taxonomy, including Hierarchical Clustering and Self-Organizing Maps (SOM).] 178
  • 179. Common Hard Clustering Algos: k-Means / k-Medoids k-Means: How it Works ?  Partitions data into k mutually exclusive clusters.  The fitment of a point into a cluster is determined by the distance from that point to the cluster’s center. k-Medoids: How it Works ?  Similar to k-means, but with the requirement that the cluster centers coincide with points in the data. Best Used :  When the number of clusters is known.  For fast clustering of categorical data.  To scale to large data sets. 179
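A minimal k-means sketch with scikit-learn (three synthetic blobs; k = 3 is assumed known, as the slide requires):

    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs

    # Three synthetic groups; k must be chosen up front for k-means.
    X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

    km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
    print("cluster sizes:", [int((km.labels_ == i).sum()) for i in range(3)])
    print("cluster centers:\n", km.cluster_centers_)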
  • 180. Hierarchical Clustering & SOM Hierarchical: How it Works ?  Produces nested sets of clusters by analyzing similarities between pairs of points and grouping objects into a binary, hierarchical tree. Hierarchical : Best Used :  When the number of clusters in the data is not known in advance.  When you want visualization to guide your selection. SOM: How it Works ?  Neural-network-based clustering that transforms a dataset into a topology-preserving 2D map. SOM: Best Used :  To visualize high-dimensional data in 2D or 3D. 180
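A minimal agglomerative-clustering sketch with scikit-learn (the distance threshold is an arbitrary assumption): cutting the tree at a distance threshold avoids fixing the number of clusters in advance.

    from sklearn.cluster import AgglomerativeClustering
    from sklearn.datasets import make_blobs

    X, _ = make_blobs(n_samples=150, centers=3, random_state=0)

    # No need to fix k up front: cut the hierarchy at a distance threshold
    # and let the number of clusters fall out of the tree.
    hc = AgglomerativeClustering(n_clusters=None, distance_threshold=5.0).fit(X)
    print("clusters found:", hc.n_clusters_)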
  • 181. Possible Modes with Unsupervised Learning. [Flow diagram.] Mode 1 (end goal is unsupervised learning): large data -> unsupervised learning -> data clusters as the result. Mode 2 (preprocessing step for supervised learning): large data -> unsupervised learning -> lower-dimensional data / feature selection -> supervised learning model. 181
  • 183. Improving Models  Model improvement in learning means:  increasing accuracy  increasing predictive power  preventing over-fitting (resolving the ambiguity between data and noise)  increasing model parsimony  Essentially, it reduces errors in learning due to noise, bias and variance. Feature Selection  Identifying the most relevant features, i.e., those that provide the best predictive power.  Can be done by adding features, or by removing features that do not improve model performance. Feature Transformation  Recasting existing features into new features using techniques such as principal component analysis, nonnegative matrix factorization, and factor analysis. Hyperparameter Tuning  The process of identifying the set of parameters that provides the best model.  Hyperparameters control how a ML algorithm fits the model to the data. A model is only as good as the features selected to train on !!! 183
  • 184. Feature Selection  Especially useful:  when dealing with high-dimensional data  when the dataset contains a large number of features and a limited number of observations  Reducing the feature space saves storage and computation time  Makes the result easier to understand Stepwise Regression  Sequentially adding or removing features until there is no improvement in prediction accuracy. Sequential Feature Selection  Iteratively adding or removing predictor variables and evaluating the effect of each change on the performance of the model. Regularization  Using shrinkage estimators to remove redundant features by reducing their weights (coefficients) to zero. Neighborhood Component Analysis (NCA)  Finding the weight each feature has in predicting the output, so that the features with lower weights can be discarded. 184
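A minimal sketch of the regularization approach with scikit-learn (the synthetic regression problem and the alpha value are assumptions): the L1 penalty of the lasso shrinks redundant coefficients exactly to zero, so the surviving features are the selected ones.

    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.linear_model import Lasso

    # 20 candidate features, only 5 actually informative.
    X, y = make_regression(n_samples=100, n_features=20, n_informative=5,
                           noise=5.0, random_state=0)

    # The L1 (shrinkage) penalty drives redundant weights to exactly zero.
    lasso = Lasso(alpha=1.0).fit(X, y)
    kept = np.flatnonzero(lasso.coef_)
    print("features kept by regularization:", kept)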
  • 185. Feature Transformation  Feature transformation is a form of dimensionality reduction Principal Component Analysis (PCA)  Performs a linear transformation on the data, so that most of the variance or information in your high-dimensional dataset is captured by the first few principal components.  The first principal component will capture the most variance, followed by the second principal component, and so on. Nonnegative Matrix Factorization  Used when model terms must represent nonnegative quantities, such as physical quantities. Factor Analysis  Identifies underlying correlations between variables in the dataset to provide a representation in terms of a smaller number of unobserved latent factors, or common factors.  Shows the relationship between variables, so that variables (or features) that are not highly correlated can be removed. 185
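A minimal PCA sketch with scikit-learn (Iris as a stand-in; the 95% variance target is an arbitrary assumption):

    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA

    X, _ = load_iris(return_X_y=True)

    # Keep just enough components to capture 95% of the variance.
    pca = PCA(n_components=0.95).fit(X)
    print("components kept:", pca.n_components_)
    print("variance explained per component:", pca.explained_variance_ratio_)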
  • 186. Feature Transformation & Hyper-parameter Tuning  Begin by setting parameters based on a “best guess” of the outcome.  Goal is to find the “best possible” values - that would yield the best model.  As the parameters are adjusted and model performance begins to improve, a note has to be made as to which parameters are effective and which still require tuning.  Three common parameter tuning methods are:  Bayesian optimization  Grid search  Gradient-based optimization  Hyperparameter tuning is an iterative process 186
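A minimal grid-search sketch with scikit-learn (the SVM hyperparameter grid is an arbitrary assumption): every candidate combination is scored by cross-validation and the best one is kept.

    from sklearn.datasets import load_iris
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)

    # Exhaustive grid search: each (C, gamma) pair gets a 5-fold CV score.
    grid = GridSearchCV(SVC(),
                        param_grid={"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]},
                        cv=5)
    grid.fit(X, y)
    print("best hyperparameters:", grid.best_params_)
    print("best CV score:", grid.best_score_)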
  • 187. Choosing the Right Model ?  Why is it so hard to get right?  Each model has its own strengths and weaknesses in a given scenario.  No established set of rules/guidelines.  Closely tied to business case, and understanding of what needs to be accomplished.  What can you do to choose the right model?  How much data do you have and is it continuous?  What type of data is it?  What are you trying to accomplish?  How important is it to visualize the process?  How much detail do you need?  Is storage a limiting factor?  Is response time a limiting factor ?  Is computation cost a limiting factor ? 187
  • 188. Model Over-fitting  Overfitting means that the model is so closely aligned to the training data that it does not know how to respond to new situations.  Why is overfitting difficult to avoid?  It is often the result of insufficient/inaccurate information about the scenario.  How do you avoid overfitting?  Use appropriate training data.  The training data needs to accurately reflect the complexity and diversity of the data the model will be expected to work with.  Use regularization  penalizes large parameters, to keep the model from relying too heavily on individual data points and becoming too rigid  controls the smoothness of the fit  The regularized objective has the form [Error + λf(θ)], where f(θ) grows larger as the components of θ grow larger, and λ represents the strength of the regularization  λ decides how much you want to protect against overfitting  if λ = 0, you are not correcting for overfitting at all  Perform model cross-validation  partitions a dataset and uses a subset to train the algorithm and the remaining data for testing  common techniques: k-fold | holdout 188
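A minimal sketch of both remedies with scikit-learn (the toy quadratic data, polynomial degree and λ values are assumptions): a ridge penalty of the [Error + λf(θ)] form, scored by k-fold cross-validation.

    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures

    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, size=(30, 1))
    y = X[:, 0] ** 2 + 0.1 * rng.standard_normal(30)

    # A degree-9 polynomial happily overfits 30 points; the ridge penalty
    # (lambda = alpha; near-zero means essentially no regularization)
    # shrinks the coefficients and smooths the fit.
    for alpha in (1e-6, 1.0):
        model = make_pipeline(PolynomialFeatures(degree=9), Ridge(alpha=alpha))
        score = cross_val_score(model, X, y, cv=5).mean()  # k-fold validation
        print(f"lambda={alpha}: CV R^2 = {score:.3f}")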
  • 190. The FORTUNE TELLER or NOT … A general rule-of-thumb:  Training - to generate the MODEL - is an expensive operation  Estimation - using the derived MODEL - is lightweight Intelligence (derived through LEARNING) on embedded systems:  On-device training MAY NOT be a good strategy  It may be better to offload it to a resourceful device  On-device estimation using the derived model MAY be a good strategy There are EXCEPTIONS to this rule !!!  Sequential versions of many commonly used learning algorithms have been developed (k-means, etc.), and are part of stream-processing suites. 190
  • 191. Acknowledgment and References This short course on IoT has been compiled from various online resources, text books, and research papers on this topic. While Prasant may not be able to “correctly” recollect the right sources, he – nevertheless – requests all viewers to drop a note, if they come across any discrepancies in this regard. 1. MAC Essentials : A. Bachir, M. Dohler, T. Watteyne and K. K. Leung, "MAC Essentials for Wireless Sensor Networks," IEEE Communications Surveys & Tutorials, vol. 12, no. 2, pp. 222-248, Second Quarter 2010. 2. Machine Learning : https://in.mathworks.com/campaigns/products/offer/machine-learning-with-matlab.html