SlideShare uma empresa Scribd logo
1 de 33
Baixar para ler offline
Kevin McGuinness
kevin.mcguinness@dcu.ie
Assistant Professor
School of Electronic Engineering
Dublin City University
http://bit.ly/dlcv2018
#DLUPC
Saliency Prediction
Day 3 Lecture 5
1
The importance of visual attention
The importance of visual attention
The importance of visual attention
The importance of visual attention
Why don’t we see the changes?
We don’t really see the whole image
We only focus on small specific regions: the salient parts
Human beings reliably attend to the same regions of images
when shown
What we perceive
Where we look
What we actually see
Saliency prediction
Produce a computational model of visual attention: predict where humans will look.
Often want to map an image to a heatmap (saliency map).
12
Salient object detection?
Often confused with saliency prediction, but a different task.
Figure from: Progressive Attention Guided Recurrent Network for Salient Object Detection, CVPR 2018 13
Datasets
14
MIT 300
300 natural indoor and
outdoor scenes.
39 observers. 3 sec free view.
ETL 400 ISCAN eye tracker
Test set only: no training data
or public ground truth
http://saliency.mit.edu/results
_mit300.html
A Benchmark of Computational Models of Saliency to Predict Human Fixations [MIT tech report 2012] 15
Fixations and saliency maps
Raw eye tracker data needs to be processed to produce saliency maps
Eye tracker
Fixation
detection
Saliency map
generation
Raw gaze
location-time tracks
Eye fixations
Detect saccades using
distance/velocity thresholding,
clustering
Rendering, Gaussian blur,
normalizing
16
MIT 1003
1003 natural indoor and outdoor
scenes.
15 observers. 3 sec free view.
ETL 400 ISCAN eye tracker
Training dataset for MIT 300
Learning to Predict where Humans Look [ICCV 2009] 17
iSUN
Large scale dataset of natural
scenes
20,608 images with avg. 3
observers each
Collected using webcams and
Amazon Mechanical Turk
Used in LSUN challenge
2015/2016
Xu et al. TurkerGaze: Crowdsourcing Saliency with Webcam based Eye Tracking arXiv 2015. 18
SALICON
Another large scaled dataset of
images from MS COCO dataset
10K train, 5K val, 5K test
Simulated crowdsourced
attention using mouse
movements and simulated
artificial foveation.
Jiang et al. SALICON: Saliency in Context, CVPR 2015
19
Models
20
SalNet: deep visual saliency model
Predict map of visual attention from image pixels
(find the parts of the image that stand out)
● Feedforward 8 layer “fully convolutional”
architecture
● Transfer learning in bottom 3 layers from
pretrained VGG-M model on ImageNet
● Trained on SALICON dataset
Predicted Ground truth
Pan, McGuinness, et al. Shallow and Deep Convolutional Networks for Saliency Prediction, CVPR 2016 http://arxiv.org/abs/1603.00845
21
ImageGroundtruthPrediction
22
ImageGroundtruthPrediction
23
SalGAN
Adversarial loss
Data loss
Junting Pan, Cristian Canton, Kevin McGuinness, Noel E. O’Connor, Jordi Torres, Elisa Sayrol and Xavier Giro-i-Nieto. “SalGAN: Visual
Saliency Prediction with Generative Adversarial Networks.” arXiv. 2017.
24
SalNet and SalGAN benchmarks
25
Deep Gaze
Simple linear model trained on
activations of all conv layers
(upsampled) from AlexNet
Softmax output over full image,
categorical cross entropy.
L1
regularization used to
encourage sparsity.
Kümmerer et al. Deep gaze 1: Boosting saliency prediction with feature maps trained on imagenet. ICLR workshops 2015
26
MLNet
Cornia et al. A Deep Multi-Level Network for Saliency Prediction. ICPR 2016.
27
SALICON
Huang et al., SALICON: Reducing the Semantic Gap in Saliency Prediction by Adapting Deep Neural Networks, ICCV 2015 28
DeepFix
Weights initialized from
VGG16 trained on ImageNet
Dilated convolutions
Location biased convolutions
Inception layers
Kruthiventi et al. DeepFix: A Fully Convolutional
Neural Network for predicting Human Eye Fixations
https://arxiv.org/abs/1510.02927
29
Deep Gaze II
Kümmerer, et al. Understanding low-and high-level contributions to fixation prediction. ICCV 2017
30
From image to video saliency?
Bak et al. Spatio-temporal saliency networks for dynamic saliency prediction. IEEE Trans. Multimedia 2017
SalNet
31
From image to video saliency
Gorji and Clark, Going From Image to Video Saliency: Augmenting Image Salience With Dynamic Attentional Push, CVPR 2018
32
Questions?
33

Mais conteúdo relacionado

Mais procurados

Objects as points (CenterNet) review [CDM]
Objects as points (CenterNet) review [CDM]Objects as points (CenterNet) review [CDM]
Objects as points (CenterNet) review [CDM]Dongmin Choi
 
Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...
Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...
Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...Universitat Politècnica de Catalunya
 
Visual Object Tracking: review
Visual Object Tracking: reviewVisual Object Tracking: review
Visual Object Tracking: reviewDmytro Mishkin
 
CNN Machine learning DeepLearning
CNN Machine learning DeepLearningCNN Machine learning DeepLearning
CNN Machine learning DeepLearningAbhishek Sharma
 
Disentangled Representation Learning of Deep Generative Models
Disentangled Representation Learning of Deep Generative ModelsDisentangled Representation Learning of Deep Generative Models
Disentangled Representation Learning of Deep Generative ModelsRyohei Suzuki
 
DEEPFAKE DETECTION TECHNIQUES: A REVIEW
DEEPFAKE DETECTION TECHNIQUES: A REVIEWDEEPFAKE DETECTION TECHNIQUES: A REVIEW
DEEPFAKE DETECTION TECHNIQUES: A REVIEWvivatechijri
 
Dropout as a Bayesian Approximation
Dropout as a Bayesian ApproximationDropout as a Bayesian Approximation
Dropout as a Bayesian ApproximationSangwoo Mo
 
Computer vision - two view geometry
Computer vision -  two view geometryComputer vision -  two view geometry
Computer vision - two view geometryWael Badawy
 
Multiple object detection
Multiple object detectionMultiple object detection
Multiple object detectionSAURABH KUMAR
 
Wavelet based image fusion
Wavelet based image fusionWavelet based image fusion
Wavelet based image fusionUmed Paliwal
 
Eigenface For Face Recognition
Eigenface For Face RecognitionEigenface For Face Recognition
Eigenface For Face RecognitionMinh Tran
 
[PR12] Generative Models as Distributions of Functions
[PR12] Generative Models as Distributions of Functions[PR12] Generative Models as Distributions of Functions
[PR12] Generative Models as Distributions of FunctionsJaeJun Yoo
 
Applied Deep Learning 11/03 Convolutional Neural Networks
Applied Deep Learning 11/03 Convolutional Neural NetworksApplied Deep Learning 11/03 Convolutional Neural Networks
Applied Deep Learning 11/03 Convolutional Neural NetworksMark Chang
 
Deep learning for object detection
Deep learning for object detectionDeep learning for object detection
Deep learning for object detectionWenjing Chen
 
Computer Vision transformations
Computer Vision  transformationsComputer Vision  transformations
Computer Vision transformationsWael Badawy
 

Mais procurados (20)

Lec15 sfm
Lec15 sfmLec15 sfm
Lec15 sfm
 
Objects as points
Objects as pointsObjects as points
Objects as points
 
Objects as points (CenterNet) review [CDM]
Objects as points (CenterNet) review [CDM]Objects as points (CenterNet) review [CDM]
Objects as points (CenterNet) review [CDM]
 
Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...
Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...
Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...
 
Visual Object Tracking: review
Visual Object Tracking: reviewVisual Object Tracking: review
Visual Object Tracking: review
 
CNN Machine learning DeepLearning
CNN Machine learning DeepLearningCNN Machine learning DeepLearning
CNN Machine learning DeepLearning
 
Disentangled Representation Learning of Deep Generative Models
Disentangled Representation Learning of Deep Generative ModelsDisentangled Representation Learning of Deep Generative Models
Disentangled Representation Learning of Deep Generative Models
 
Pca for semiinar
Pca for semiinarPca for semiinar
Pca for semiinar
 
DEEPFAKE DETECTION TECHNIQUES: A REVIEW
DEEPFAKE DETECTION TECHNIQUES: A REVIEWDEEPFAKE DETECTION TECHNIQUES: A REVIEW
DEEPFAKE DETECTION TECHNIQUES: A REVIEW
 
Dropout as a Bayesian Approximation
Dropout as a Bayesian ApproximationDropout as a Bayesian Approximation
Dropout as a Bayesian Approximation
 
Computer vision - two view geometry
Computer vision -  two view geometryComputer vision -  two view geometry
Computer vision - two view geometry
 
CVPR presentation
CVPR presentationCVPR presentation
CVPR presentation
 
Multiple object detection
Multiple object detectionMultiple object detection
Multiple object detection
 
Wavelet based image fusion
Wavelet based image fusionWavelet based image fusion
Wavelet based image fusion
 
Eigenface For Face Recognition
Eigenface For Face RecognitionEigenface For Face Recognition
Eigenface For Face Recognition
 
[PR12] Generative Models as Distributions of Functions
[PR12] Generative Models as Distributions of Functions[PR12] Generative Models as Distributions of Functions
[PR12] Generative Models as Distributions of Functions
 
Applied Deep Learning 11/03 Convolutional Neural Networks
Applied Deep Learning 11/03 Convolutional Neural NetworksApplied Deep Learning 11/03 Convolutional Neural Networks
Applied Deep Learning 11/03 Convolutional Neural Networks
 
CV_2 Image Processing
CV_2 Image ProcessingCV_2 Image Processing
CV_2 Image Processing
 
Deep learning for object detection
Deep learning for object detectionDeep learning for object detection
Deep learning for object detection
 
Computer Vision transformations
Computer Vision  transformationsComputer Vision  transformations
Computer Vision transformations
 

Semelhante a Visual Attention Prediction Using Saliency Models

最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - 最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - Hiroshi Fukui
 
Review of Pose Recognition Systems
Review of Pose Recognition SystemsReview of Pose Recognition Systems
Review of Pose Recognition Systemsvivatechijri
 
PS1_2014_2012B5A7521P_2012B5A7848P_2012B4A7958H
PS1_2014_2012B5A7521P_2012B5A7848P_2012B4A7958HPS1_2014_2012B5A7521P_2012B5A7848P_2012B4A7958H
PS1_2014_2012B5A7521P_2012B5A7848P_2012B4A7958HSaurabh Kumar
 
Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...
Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...
Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...IRJET Journal
 
lec_11_self_supervised_learning.pdf
lec_11_self_supervised_learning.pdflec_11_self_supervised_learning.pdf
lec_11_self_supervised_learning.pdfAlamgirAkash3
 
Artificial Intelligence for Vision: A walkthrough of recent breakthroughs
Artificial Intelligence for Vision:  A walkthrough of recent breakthroughsArtificial Intelligence for Vision:  A walkthrough of recent breakthroughs
Artificial Intelligence for Vision: A walkthrough of recent breakthroughsNikolas Markou
 
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019Universitat Politècnica de Catalunya
 
Representational Continuity for Unsupervised Continual Learning
Representational Continuity for Unsupervised Continual LearningRepresentational Continuity for Unsupervised Continual Learning
Representational Continuity for Unsupervised Continual LearningMLAI2
 
Satellite Image Classification with Deep Learning Survey
Satellite Image Classification with Deep Learning SurveySatellite Image Classification with Deep Learning Survey
Satellite Image Classification with Deep Learning Surveyijtsrd
 
Semantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network ApproachesSemantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network ApproachesFellowship at Vodafone FutureLab
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Universitat Politècnica de Catalunya
 
Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...
Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...
Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...ijtsrd
 
Quality assessment of stereoscopic 3 d image compression by binocular integra...
Quality assessment of stereoscopic 3 d image compression by binocular integra...Quality assessment of stereoscopic 3 d image compression by binocular integra...
Quality assessment of stereoscopic 3 d image compression by binocular integra...Shakas Technologies
 
Nicolae_Denut_Theses
Nicolae_Denut_ThesesNicolae_Denut_Theses
Nicolae_Denut_ThesesNicolae Denut
 

Semelhante a Visual Attention Prediction Using Saliency Models (20)

最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - 最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に -
 
The Importance of Time in Visual Attention Models
The Importance of Time in Visual Attention ModelsThe Importance of Time in Visual Attention Models
The Importance of Time in Visual Attention Models
 
End to-end convolutional network for saliency prediction
End to-end convolutional network for saliency predictionEnd to-end convolutional network for saliency prediction
End to-end convolutional network for saliency prediction
 
Review of Pose Recognition Systems
Review of Pose Recognition SystemsReview of Pose Recognition Systems
Review of Pose Recognition Systems
 
PS1_2014_2012B5A7521P_2012B5A7848P_2012B4A7958H
PS1_2014_2012B5A7521P_2012B5A7848P_2012B4A7958HPS1_2014_2012B5A7521P_2012B5A7848P_2012B4A7958H
PS1_2014_2012B5A7521P_2012B5A7848P_2012B4A7958H
 
Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...
Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...
Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...
 
lec_11_self_supervised_learning.pdf
lec_11_self_supervised_learning.pdflec_11_self_supervised_learning.pdf
lec_11_self_supervised_learning.pdf
 
Artificial Intelligence for Vision: A walkthrough of recent breakthroughs
Artificial Intelligence for Vision:  A walkthrough of recent breakthroughsArtificial Intelligence for Vision:  A walkthrough of recent breakthroughs
Artificial Intelligence for Vision: A walkthrough of recent breakthroughs
 
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
 
Representational Continuity for Unsupervised Continual Learning
Representational Continuity for Unsupervised Continual LearningRepresentational Continuity for Unsupervised Continual Learning
Representational Continuity for Unsupervised Continual Learning
 
paper
paperpaper
paper
 
Satellite Image Classification with Deep Learning Survey
Satellite Image Classification with Deep Learning SurveySatellite Image Classification with Deep Learning Survey
Satellite Image Classification with Deep Learning Survey
 
AR/SLAM for end-users
AR/SLAM for end-usersAR/SLAM for end-users
AR/SLAM for end-users
 
Semantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network ApproachesSemantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network Approaches
 
Learning where to look: focus and attention in deep vision
Learning where to look: focus and attention in deep visionLearning where to look: focus and attention in deep vision
Learning where to look: focus and attention in deep vision
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
 
Deep Learning for Computer Vision: Saliency Prediction (UPC 2016)
Deep Learning for Computer Vision: Saliency Prediction (UPC 2016)Deep Learning for Computer Vision: Saliency Prediction (UPC 2016)
Deep Learning for Computer Vision: Saliency Prediction (UPC 2016)
 
Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...
Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...
Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...
 
Quality assessment of stereoscopic 3 d image compression by binocular integra...
Quality assessment of stereoscopic 3 d image compression by binocular integra...Quality assessment of stereoscopic 3 d image compression by binocular integra...
Quality assessment of stereoscopic 3 d image compression by binocular integra...
 
Nicolae_Denut_Theses
Nicolae_Denut_ThesesNicolae_Denut_Theses
Nicolae_Denut_Theses
 

Mais de Universitat Politècnica de Catalunya

The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...Universitat Politècnica de Catalunya
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoUniversitat Politècnica de Catalunya
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Universitat Politècnica de Catalunya
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosUniversitat Politècnica de Catalunya
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Universitat Politècnica de Catalunya
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Universitat Politècnica de Catalunya
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Universitat Politècnica de Catalunya
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Universitat Politècnica de Catalunya
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Universitat Politècnica de Catalunya
 
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020Universitat Politècnica de Catalunya
 

Mais de Universitat Politècnica de Catalunya (20)

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Deep Generative Learning for All
Deep Generative Learning for AllDeep Generative Learning for All
Deep Generative Learning for All
 
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
 
The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
 
Open challenges in sign language translation and production
Open challenges in sign language translation and productionOpen challenges in sign language translation and production
Open challenges in sign language translation and production
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
 
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in MinecraftDiscovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in Minecraft
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...
 
Intepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural NetworksIntepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural Networks
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
 
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
 
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
 

Último

Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
Business Analytics using Microsoft Excel
Business Analytics using Microsoft ExcelBusiness Analytics using Microsoft Excel
Business Analytics using Microsoft Excelysmaelreyes
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一F sss
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfchwongval
 
LLMs, LMMs, their Improvement Suggestions and the Path towards AGI
LLMs, LMMs, their Improvement Suggestions and the Path towards AGILLMs, LMMs, their Improvement Suggestions and the Path towards AGI
LLMs, LMMs, their Improvement Suggestions and the Path towards AGIThomas Poetter
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxdolaknnilon
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 

Último (20)

Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
Business Analytics using Microsoft Excel
Business Analytics using Microsoft ExcelBusiness Analytics using Microsoft Excel
Business Analytics using Microsoft Excel
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 
LLMs, LMMs, their Improvement Suggestions and the Path towards AGI
LLMs, LMMs, their Improvement Suggestions and the Path towards AGILLMs, LMMs, their Improvement Suggestions and the Path towards AGI
LLMs, LMMs, their Improvement Suggestions and the Path towards AGI
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptx
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 

Visual Attention Prediction Using Saliency Models

  • 1. Kevin McGuinness kevin.mcguinness@dcu.ie Assistant Professor School of Electronic Engineering Dublin City University http://bit.ly/dlcv2018 #DLUPC Saliency Prediction Day 3 Lecture 5 1
  • 2. The importance of visual attention
  • 3.
  • 4. The importance of visual attention
  • 5. The importance of visual attention
  • 6.
  • 7. The importance of visual attention
  • 8. Why don’t we see the changes? We don’t really see the whole image We only focus on small specific regions: the salient parts Human beings reliably attend to the same regions of images when shown
  • 12. Saliency prediction Produce a computational model of visual attention: predict where humans will look. Often want to map an image to a heatmap (saliency map). 12
  • 13. Salient object detection? Often confused with saliency prediction, but a different task. Figure from: Progressive Attention Guided Recurrent Network for Salient Object Detection, CVPR 2018 13
  • 15. MIT 300 300 natural indoor and outdoor scenes. 39 observers. 3 sec free view. ETL 400 ISCAN eye tracker Test set only: no training data or public ground truth http://saliency.mit.edu/results _mit300.html A Benchmark of Computational Models of Saliency to Predict Human Fixations [MIT tech report 2012] 15
  • 16. Fixations and saliency maps Raw eye tracker data needs to be processed to produce saliency maps Eye tracker Fixation detection Saliency map generation Raw gaze location-time tracks Eye fixations Detect saccades using distance/velocity thresholding, clustering Rendering, Gaussian blur, normalizing 16
  • 17. MIT 1003 1003 natural indoor and outdoor scenes. 15 observers. 3 sec free view. ETL 400 ISCAN eye tracker Training dataset for MIT 300 Learning to Predict where Humans Look [ICCV 2009] 17
  • 18. iSUN Large scale dataset of natural scenes 20,608 images with avg. 3 observers each Collected using webcams and Amazon Mechanical Turk Used in LSUN challenge 2015/2016 Xu et al. TurkerGaze: Crowdsourcing Saliency with Webcam based Eye Tracking arXiv 2015. 18
  • 19. SALICON Another large scaled dataset of images from MS COCO dataset 10K train, 5K val, 5K test Simulated crowdsourced attention using mouse movements and simulated artificial foveation. Jiang et al. SALICON: Saliency in Context, CVPR 2015 19
  • 21. SalNet: deep visual saliency model Predict map of visual attention from image pixels (find the parts of the image that stand out) ● Feedforward 8 layer “fully convolutional” architecture ● Transfer learning in bottom 3 layers from pretrained VGG-M model on ImageNet ● Trained on SALICON dataset Predicted Ground truth Pan, McGuinness, et al. Shallow and Deep Convolutional Networks for Saliency Prediction, CVPR 2016 http://arxiv.org/abs/1603.00845 21
  • 24. SalGAN Adversarial loss Data loss Junting Pan, Cristian Canton, Kevin McGuinness, Noel E. O’Connor, Jordi Torres, Elisa Sayrol and Xavier Giro-i-Nieto. “SalGAN: Visual Saliency Prediction with Generative Adversarial Networks.” arXiv. 2017. 24
  • 25. SalNet and SalGAN benchmarks 25
  • 26. Deep Gaze Simple linear model trained on activations of all conv layers (upsampled) from AlexNet Softmax output over full image, categorical cross entropy. L1 regularization used to encourage sparsity. Kümmerer et al. Deep gaze 1: Boosting saliency prediction with feature maps trained on imagenet. ICLR workshops 2015 26
  • 27. MLNet Cornia et al. A Deep Multi-Level Network for Saliency Prediction. ICPR 2016. 27
  • 28. SALICON Huang et al., SALICON: Reducing the Semantic Gap in Saliency Prediction by Adapting Deep Neural Networks, ICCV 2015 28
  • 29. DeepFix Weights initialized from VGG16 trained on ImageNet Dilated convolutions Location biased convolutions Inception layers Kruthiventi et al. DeepFix: A Fully Convolutional Neural Network for predicting Human Eye Fixations https://arxiv.org/abs/1510.02927 29
  • 30. Deep Gaze II Kümmerer, et al. Understanding low-and high-level contributions to fixation prediction. ICCV 2017 30
  • 31. From image to video saliency? Bak et al. Spatio-temporal saliency networks for dynamic saliency prediction. IEEE Trans. Multimedia 2017 SalNet 31
  • 32. From image to video saliency Gorji and Clark, Going From Image to Video Saliency: Augmenting Image Salience With Dynamic Attentional Push, CVPR 2018 32