4. 2. Project Goal
Get viewers' feedback
Viewers' feedback on signage is shown on their faces.
1. Capture video of passing viewers from the camera.
2. Detect human faces.
3. Recognize age/gender & 5 emotion types
(Happy/Surprise/Neutral/Sad/Anger).
4. Send inference results with the current video name
& timestamp to the server.
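Steps 2 to 4 above can be sketched as one per-frame function. This is a minimal sketch, not the project's actual code: `process_frame`, `detect_faces`, `analyze_face`, and `send` are hypothetical names, and the detector, recognizer, and transport are passed in as plain callables so the flow can be read (and tried) without a camera or an NCS.

```python
import time


def process_frame(frame, detect_faces, analyze_face, send, video_name, now=time.time):
    """Run steps 2-4 on one captured frame and return the messages that were sent.

    detect_faces(frame) -> iterable of face crops           (step 2, hypothetical)
    analyze_face(face)  -> dict such as {"emotion": "Happy"} (step 3, hypothetical)
    send(message)       -> delivers one message to the server (step 4, hypothetical)
    """
    messages = []
    for face in detect_faces(frame):
        message = dict(analyze_face(face))   # copy so the recognizer's dict is not mutated
        message["video"] = video_name        # name of the signage video currently playing
        message["timestamp"] = now()         # seconds since the epoch by default
        send(message)
        messages.append(message)
    return messages


# Example with stand-in callables instead of real models:
outbox = []
process_frame(
    frame="frame-0",
    detect_faces=lambda frame: ["face-a"],
    analyze_face=lambda face: {"emotion": "Happy", "gender": "Female"},
    send=outbox.append,
    video_name="ad_01.mp4",
)
```

Passing the stages in as callables keeps the capture loop independent of the inference backend, which is one possible way to structure it.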
Edge Computing
Recognize emotions at the signage site to save network
bandwidth & reduce turnaround time.
Signage video playback
5. 3. Proposed solution
[Diagram: Edge Computing unit with HDMI and USB connections, the Signage Video, and an LED matrix; the recognized emotion and current video name go through CloudMQTT to the web-based statistics.]
7. 5. Emotion Detection Flow
Intel® Movidius Neural Compute Stick (NCS)
Load pre-trained Emotion Detection model into NCS
Realtime video of faces from USB cam
Inferences:
Smile : 0.8
Sad : 0.0
Anger : 0.0
Surprise : 0.0
Neutral : 0.2
Result: “Smile” -> LED array
8. 6. Core Technology required.
AI technology :
• Intel OpenVINO for :
• Human Face Detection.
• Facial analysis on Gender & Emotions.
Web & IoT technology :
• HTML, CSS & Chart.js for displaying statistics charts on the webpage.
• MQTT for transmitting emotion data to the web server.
• Line Bot & Node.js to push reports to the advertiser.
Peripheral control :
• GPIO controls on Raspberry Pi for LED matrix
10. Emotion/Gender Recognition
Image from Camera
OpenCV: Read Input
OpenVINO Inference Engine: FaceDetection Model, EmotionRecognition Model, GenderRecognition Model
OpenCV: Display Output
Output Window: Male, Happy 0.8; Female, Happy 0.7
11. Face Detection Model
Model: MobileNet, Google 2017
Layers: 164
Framework: Caffe
Accuracy: 93% (head height > 64px)
Inputs:
shape: [1x3x384x672], [BxCxHxW]
* B - batch size
* C - number of channels
* H - image height
* W - image width
Outputs:
shape: [1, 1, N, 7]; each of the N detections is
[id, label, conf, xmin, ymin, xmax, ymax]
* id - ID of the image in the batch
* label - predicted class ID
* conf - confidence for the predicted class
* (xmin, ymin) - coordinates of the top-left corner
* (xmax, ymax) - coordinates of the bottom-right corner
FPS: 6.28 (Raspberry Pi 3 with NCS 2)
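The [1, 1, N, 7] output described above can be turned into face boxes roughly as follows. This is a sketch under two assumptions that the slide does not state: the box coordinates are normalized to the range 0 to 1, and detections below a confidence of 0.5 are dropped. `parse_detections` and `min_conf` are hypothetical names; the function accepts nested lists or a NumPy array of the same shape.

```python
def parse_detections(output, frame_width, frame_height, min_conf=0.5):
    """Convert a [1, 1, N, 7] detection blob into pixel-space face boxes.

    Each row is [id, label, conf, xmin, ymin, xmax, ymax]. Coordinates are
    ASSUMED to be normalized to 0..1 and are scaled to the frame size here.
    Returns a list of (left, top, right, bottom, conf) tuples.
    """
    boxes = []
    for _image_id, _label, conf, xmin, ymin, xmax, ymax in output[0][0]:
        if conf < min_conf:          # assumed threshold, tune for the deployment
            continue
        # Scale to pixels and clamp so the box stays inside the frame.
        left = max(0, int(xmin * frame_width))
        top = max(0, int(ymin * frame_height))
        right = min(frame_width, int(xmax * frame_width))
        bottom = min(frame_height, int(ymax * frame_height))
        boxes.append((left, top, right, bottom, float(conf)))
    return boxes


# Two raw detections: one confident face, one low-confidence candidate.
raw = [[[
    [0, 1, 0.9, 0.25, 0.5, 0.75, 1.0],
    [0, 1, 0.2, 0.0, 0.0, 0.1, 0.1],
]]]
faces = parse_detections(raw, frame_width=672, frame_height=384)
```

Each returned box can then be used to crop the face region that is passed on to the age/gender and emotion models.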
12. Age/Gender Recognition Model
Model: Convolutional Neural Network
Layers: 24
Framework: Caffe
Accuracy:
* Gender accuracy: 95.80%
* Avg. age error: 6.99 years (people aged 18 to 75)
Validation dataset: ~20,000 unique subjects representing
diverse ages, genders, and ethnicities.
Inputs:
shape: [1x3x62x62], [1xCxHxW]
Outputs:
Gender shape: [1, 2, 1, 1] - Softmax output
* female
* male
Age shape: [1, 1, 1, 1] - Estimated age divided by 100
FPS: 5.06 (Raspberry Pi 3 with NCS 2),
Face detection + Gender Recognition
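Decoding the two outputs listed above might look like the sketch below. It assumes the gender softmax is ordered female, then male (the order given on the slide) and multiplies the age output by 100 to get years. `decode_age_gender` is a hypothetical helper name; the blobs can be nested lists or NumPy arrays of the stated shapes.

```python
def decode_age_gender(gender_blob, age_blob):
    """Decode a [1, 2, 1, 1] gender softmax and a [1, 1, 1, 1] age output.

    Returns (gender_label, gender_probability, age_in_years).
    """
    female_p = float(gender_blob[0][0][0][0])   # index 0: female (slide order)
    male_p = float(gender_blob[0][1][0][0])     # index 1: male
    gender = "Male" if male_p > female_p else "Female"
    age_years = round(float(age_blob[0][0][0][0]) * 100)  # output is age / 100
    return gender, max(female_p, male_p), age_years


# Example blobs: 90% male, age output 0.25.
gender_out = [[[[0.1]], [[0.9]]]]
age_out = [[[[0.25]]]]
result = decode_age_gender(gender_out, age_out)
```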
13. Emotions Recognition Model
Model: Convolutional Neural Network
Layers: 33
Framework: Caffe
Accuracy: 70.20%
Validation dataset: 2,500 images from AffectNet dataset
Inputs:
shape: [1x3x64x64], [1xCxHxW]
Outputs:
shape: [1, 5, 1, 1] - Softmax output
* Neutral
* Happy
* Sad
* Surprise
* Anger
FPS: 3.78 (Raspberry Pi 3 with NCS 2),
Face detection + Gender Recognition +
Emotion Recognition
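Picking the recognized emotion from the [1, 5, 1, 1] softmax output above comes down to an argmax over the five scores. The sketch below assumes the scores follow the label order listed on the slide; `EMOTIONS` and `decode_emotion` are hypothetical names.

```python
# Label order as listed on the slide (assumed to match the output order).
EMOTIONS = ("Neutral", "Happy", "Sad", "Surprise", "Anger")


def decode_emotion(blob):
    """Return (label, score) for the highest-scoring emotion in a [1, 5, 1, 1] blob."""
    scores = [float(blob[0][i][0][0]) for i in range(len(EMOTIONS))]
    best = max(range(len(scores)), key=scores.__getitem__)  # index of the top score
    return EMOTIONS[best], scores[best]


# Example blob: Neutral 0.2, Happy 0.8, everything else 0.0.
emotion_out = [[[[0.2]], [[0.8]], [[0.0]], [[0.0]], [[0.0]]]]
label, score = decode_emotion(emotion_out)
```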
16. MQTT, webpage & Chart.js
HTML/JavaScript/CSS to sync data
with the webpage & Chart.js to display
real-time emotion detection
statistics.
Mosquitto/MQTT cloud
as MQTT Broker
Recognized Emotion
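Sending a recognized emotion to the MQTT broker could be sketched as below. The topic layout `signage/<site_id>/emotion`, the JSON field names, and the helper `publish_emotion` are assumptions made for illustration, not the project's actual scheme. The transport is passed in as a `publish(topic, payload)` callable so the sketch runs without a broker.

```python
import json


def publish_emotion(publish, site_id, emotion, video_name, timestamp):
    """Build one emotion message and hand it to an MQTT-style publish callable.

    publish(topic, payload) is any callable with that shape (hypothetical here).
    Returns the (topic, payload) pair that was published.
    """
    topic = f"signage/{site_id}/emotion"      # hypothetical topic layout
    payload = json.dumps({
        "emotion": emotion,                   # e.g. "Happy"
        "video": video_name,                  # signage video playing at that moment
        "timestamp": timestamp,               # when the emotion was recognized
    })
    publish(topic, payload)
    return topic, payload


# Example with a stand-in transport that just records what would be sent:
sent = []
publish_emotion(lambda topic, payload: sent.append((topic, payload)),
                site_id="site01", emotion="Happy",
                video_name="ad_01.mp4", timestamp=1700000000)
```

With the paho-mqtt client library, a connected client's `publish` method takes a topic and a payload in that order, so it could be passed in as the `publish` argument.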
17. 8. Emotion Statistics webpage
Number/ratio of emotions recognized
for a given signage video.
Number of emotions recognized,
with timestamp.
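The per-video numbers and ratios shown on the statistics page can be computed from the received messages along these lines. This is a sketch only: `emotion_stats` is a hypothetical helper, and each event is assumed to be a `(video_name, emotion)` pair.

```python
from collections import Counter, defaultdict


def emotion_stats(events):
    """Count emotions per signage video and compute each emotion's share.

    events: iterable of (video_name, emotion) pairs.
    Returns {video_name: {emotion: (count, ratio)}}.
    """
    counts = defaultdict(Counter)
    for video, emotion in events:
        counts[video][emotion] += 1

    stats = {}
    for video, counter in counts.items():
        total = sum(counter.values())
        stats[video] = {emotion: (n, n / total) for emotion, n in counter.items()}
    return stats


# Example: four recognitions during one video, one during another.
events = [
    ("ad_01.mp4", "Happy"), ("ad_01.mp4", "Happy"),
    ("ad_01.mp4", "Neutral"), ("ad_01.mp4", "Sad"),
    ("ad_02.mp4", "Surprise"),
]
stats = emotion_stats(events)
```

The resulting counts and ratios map directly onto the data series a chart would display for each video.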
19. 10. Summary
• The emotion statistics report helps adjust the
advertising budget to optimize ROI.
• The edge computing hardware (Pi 3 + NCS) can
handle both emotion recognition & advertising
video playback, reducing hardware cost for
digital signage (DS) system providers.
• Emotion detection can be used not only for
digital signage but also for other showcase
displays.
20. Team members
• Frank Jiang :
• Project concept & solution initiator
• LED matrix control development
• Vincent Wong :
• Interactive Face Detection System development
• System Integration.
• Andy Lin :
• Digital Signage Video control development.
• Web Application Validation.
• Hugo Wen :
• Web Development
• Line Bot development
21. Please scan the QR code
ID: @915rwsrh
URL: http://ids.aiot01.com/ai/home.html
24. OpenVINO™ advantages
• Development toolkit for high-performance computer vision (CV) and deep learning (DL) inference
• API solution for application designers
• No training overhead.
• Minimal footprint, highly portable code.
• Set of libraries to solve CV/DL deployment problems
• Fastest OpenCV build
• Certified OpenVX implementation
• Deep Learning Inference Engine
• Access to all accelerators and heterogeneous execution model
• Intel CPU, CPU w/integrated graphics
• Vision Processing Unit (VPU) and FPGA
25. High quality DL models (free)
IoT Model Zoo :
• Free reference models for Deep Learning Inference Engine
• Object Detection (Face, People, Vehicles, etc.)
• Object Analysis (Facial attributes, Head Pose, etc.)
• Superior performance on Intel hardware
• Core i5™ : SSD 300 (6 fps) vs People Detection Model (60 fps)