Contextless Object Recognition 
with Shape-enriched SIFT and 
Bags of Features 
Marcel Tella Amo 
Directed by Dr. Matthias Zeppelzauer (TU Wien) 
Codirected by Dr. Xavier Giró-i-Nieto (UPC)
Motivation 
Object Recognition and Classification 
Categories 
• Ball 
• Airplane 
• Chair 
• Beaver 
• … 
Ball Airplane Chair 
Shape information
Texture information
Index 
Requirements 
State of the Art 
Design 
Results
Requirements 
Requirements State of the Art Design Results 
Design shape features that can be used in an aggregated framework, such as Bag of Words, with no need for matching or alignment.
Take a successful method: SIFT + shape information
Analyse the effect of the vocabulary size with respect to the size of the shape features.
[Figure labels: SIFT, Shape]
The proposed features should be at least scale, rotation and translation invariant and, if possible, flip invariant as well.
Segmentation is needed to codify the shape.
Study the limitations of shape coding when using a state-of-the-art segmentation.
Manual annotations vs. automatic segmentation
State of the Art 
Object candidate algorithms
Multiscale Combinatorial Grouping (MCG)
Ranking by object plausibility (from high to low)
Arbelaez, P., Pont-Tuset, J., Barron, J. T., Marques, F., Malik, J. (2014). Multiscale Combinatorial Grouping. CVPR.
Shape Context 
G. Mori, S. Belongie, and J. Malik. Efficient shape matching using shape contexts. PAMI, 27(11), 2005.
Interest point descriptors: 
SIFT descriptor 
Simplified example 
Typically 4x4 divisions * 8 bins/hist = 128 features 
dense SIFT 
sparse SIFT 
David G. Lowe, Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision 60 (2004), no. 2, 91-110.
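The "4x4 divisions * 8 bins/hist = 128 features" count can be illustrated with a toy descriptor. This is only a sketch: the function name `simplified_sift_descriptor` is hypothetical, and it leaves out Gaussian weighting, bin interpolation and orientation normalisation, so it is not Lowe's full method.

```python
import math

def simplified_sift_descriptor(patch, cells=4, bins=8):
    """Toy SIFT-like descriptor for a square grayscale patch (list of rows).

    Assumption: a plain gradient-orientation histogram per cell, without the
    weighting and interpolation of the real SIFT descriptor.
    """
    size = len(patch)
    cell = size // cells
    hist = [0.0] * (cells * cells * bins)
    for y in range(1, size - 1):
        for x in range(1, size - 1):
            # Central-difference gradient
            dx = patch[y][x + 1] - patch[y][x - 1]
            dy = patch[y + 1][x] - patch[y - 1][x]
            mag = math.hypot(dx, dy)
            if mag == 0.0:
                continue
            angle = math.atan2(dy, dx) % (2 * math.pi)
            b = min(int(angle / (2 * math.pi) * bins), bins - 1)
            cy = min(y // cell, cells - 1)
            cx = min(x // cell, cells - 1)
            # Each cell owns a contiguous block of `bins` entries
            hist[(cy * cells + cx) * bins + b] += mag
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist

# A 16x16 horizontal ramp: every gradient points in the same direction
ramp = [[float(x) for x in range(16)] for _ in range(16)]
descriptor = simplified_sift_descriptor(ramp)
print(len(descriptor))
```

With the default 4x4 cells and 8 orientation bins, the vector has 4 * 4 * 8 = 128 entries.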
Enrichment of SIFT 
Extra features: absolute spatial location (X, Y) or angle and distance
Rene Grzeszick, Leonard Rothacker, and Gernot A. Fink, "Bag-of-features representations using spatial visual vocabularies for object classification," in IEEE Intl. Conf. on Image Processing, Melbourne, Australia, 2013
Extra features: relative position + aspect ratio + scale ratio + color space
Carreira, J., Caseiro, R., Batista, J., & Sminchisescu, C. (2012). Semantic segmentation with second-order pooling. In Computer Vision-ECCV 2012 (pp. 430-443). Springer Berlin Heidelberg.
128-dimensional SIFT descriptor + extra features
Bag of Words 
Bags of Words - Pipeline
Get descriptors → Clustering (K-means) → Create histograms → Train model (SVM)
Image → Create histogram → Evaluate (SVM)
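The clustering and histogram stages of the pipeline can be sketched in a few lines. This is an illustrative, standard-library version under stated assumptions: the helper names (`kmeans`, `bow_histogram`) are hypothetical, descriptors are plain lists of floats, and the SVM training and evaluation stages are left out.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(words, d):
    """Index of the visual word closest to descriptor d."""
    return min(range(len(words)), key=lambda i: sq_dist(words[i], d))

def kmeans(descriptors, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns k cluster centres (the visual vocabulary)."""
    rng = random.Random(seed)
    words = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest(words, d)].append(d)
        for i, g in enumerate(groups):
            if g:  # keep the old centre if a cluster ends up empty
                words[i] = [sum(c) / len(g) for c in zip(*g)]
    return words

def bow_histogram(descriptors, words):
    """Normalised count of descriptors assigned to each visual word."""
    hist = [0.0] * len(words)
    for d in descriptors:
        hist[nearest(words, d)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

# Toy 2-D "descriptors" forming two groups
train = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
vocabulary = kmeans(train, k=2)
print(bow_histogram([[0, 0], [10, 10], [11, 11]], vocabulary))
```

The histogram length equals the vocabulary size, so every image is mapped to a fixed-length vector regardless of how many descriptors it produced; that vector is what a classifier such as an SVM would consume.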
Design 
Why dense SIFT? 
Main principle: Combination of dense SIFT and Object Candidates 
Distance to the nearest border (DNB) 
Logarithmic distance to the nearest border (LDNB): less influence of large distances
Carreira, J., Caseiro, R., Batista, J., & Sminchisescu, C. (2012). Semantic segmentation with second-order 
pooling. In Computer Vision-ECCV 2012 (pp. 430-443). Springer Berlin Heidelberg.
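A minimal sketch of DNB and LDNB on a binary region mask follows. The slides do not give the exact formulas, so several choices here are assumptions: border pixels are region pixels with a 4-neighbour outside the region, the distance is Euclidean in pixels with no scale normalisation, and the logarithmic variant is taken as log(1 + d). The function names are hypothetical.

```python
import math

def border_pixels(mask):
    """Region pixels that touch the outside of the region (4-connectivity)."""
    h, w = len(mask), len(mask[0])
    border = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if not (0 <= ny < h and 0 <= nx < w) or not mask[ny][nx]:
                    border.append((y, x))
                    break
    return border

def dnb(mask, point):
    """Euclidean distance from point (row, col) to the nearest border pixel."""
    py, px = point
    return min(math.hypot(py - y, px - x) for y, x in border_pixels(mask))

def ldnb(mask, point):
    """Assumed logarithmic variant: log(1 + DNB) compresses large distances."""
    return math.log1p(dnb(mask, point))

# 5x5 square region inside a 7x7 image
mask = [[1 if 1 <= y <= 5 and 1 <= x <= 5 else 0 for x in range(7)]
        for y in range(7)]
print(dnb(mask, (3, 3)), ldnb(mask, (3, 3)))
```

Because the logarithm grows slowly, points deep inside a large region end up with values close to each other, which matches the "less influence of large distances" remark.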
Distance and Angle to the nearest border (DANB) 
Problem: Really similar in 2D but very different values. 
Solution: Codify them as two separate features. 
Rotation Invariant Angle to the nearest border 
Distance to the center (DC) 
η - Angular Scan (ηAS) 
WINNER! 
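The slides do not define the η-Angular Scan in detail. One plausible reading, used in this sketch, is that η rays are cast from the keypoint at equally spaced angles and each ray's length is measured until it would leave the region, so the scan never crosses the contour. The function name, the ray-marching step and the fixed starting angle are assumptions, and rotation invariance is not addressed here.

```python
import math

def angular_scan(mask, point, eta=8, step=1.0):
    """Distances from point (row, col) to the region contour along eta rays.

    Assumption: rays start at angle 0 and are marched in increments of
    `step` pixels; marching stops before the first pixel outside the region.
    """
    h, w = len(mask), len(mask[0])
    py, px = point
    distances = []
    for k in range(eta):
        theta = 2 * math.pi * k / eta
        dy, dx = math.sin(theta), math.cos(theta)
        r = 0.0
        while True:
            y = int(round(py + (r + step) * dy))
            x = int(round(px + (r + step) * dx))
            if not (0 <= y < h and 0 <= x < w) or not mask[y][x]:
                break  # the next sample would cross the contour
            r += step
        distances.append(r)
    return distances

# 5x5 square region inside a 7x7 image, scanned from its centre
mask = [[1 if 1 <= y <= 5 and 1 <= x <= 5 else 0 for x in range(7)]
        for y in range(7)]
print(angular_scan(mask, (3, 3), eta=8))
```

Each keypoint yields η values, so an 8-Angular Scan adds 8 shape features per keypoint.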
Shape Context from a dense SIFT (DSC) 
Note: It crosses the contour of the region like Shape Context. 
ηAS does not! 
Rotation Invariant Region Quantization (RIRQ) 
Main idea: Get spatial information. 
Easily extensible to a pyramid! 
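Lazebnik, S., Schmid, C., & Ponce, J. (2006). Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on (Vol. 2, pp. 2169-2178). IEEE.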
Achieving flip invariance (RIRQ) 
[Figure: quadrants numbered 1 to 4 for a region and its flipped versions; the pairs "4 2" and "2 4" are both sorted to "2 4".]
Where do we integrate our features?
Two main architectures:
• Enriched SIFT (eSIFT): SIFT + Shape features → Visual Vocabulary → Bag of eSIFT visual words
• BoW+Shape: SIFT → Visual Vocabulary → Bag of Words + Shape histogram
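The difference between the two architectures is where the concatenation happens, which can be sketched as follows. The function names and the `weight` parameter are hypothetical; the slides do not say whether or how the shape features are weighted against the SIFT values.

```python
def esift_descriptor(sift, shape_features, weight=1.0):
    """eSIFT: append the shape features to each SIFT descriptor *before*
    the visual vocabulary is built. `weight` is an assumed scaling knob."""
    return list(sift) + [weight * s for s in shape_features]

def bow_plus_shape(bow_histogram, shape_histogram):
    """BoW+Shape: quantize plain SIFT first, then append a separate
    shape histogram to the image-level Bag of Words."""
    return list(bow_histogram) + list(shape_histogram)

sift = [0.0] * 128                     # one keypoint descriptor
shape = [0.1 * i for i in range(8)]    # e.g. 8 shape features for that keypoint
print(len(esift_descriptor(sift, shape)))          # per-keypoint vector length

bow = [0.0] * 500                      # image-level histogram, vocabulary of 500
print(len(bow_plus_shape(bow, [0.0] * 64)))        # per-image vector length
```

In the first case the shape information influences the clustering itself; in the second it only extends the final image representation.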
BoW+Shape: Creation of the shape histograms
[Diagram: SIFT → Visual Vocabulary → Bag of Words; Accumulation of features → Shape histogram]
1. Accumulate the same feature for all points.
2. Create a histogram of X bins for that feature.
3. Concatenate histograms to create the final one.
Example: 8-Angular Scan gives 8 distances (different angles) × # SIFT keypoints
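The three steps can be sketched as below. This is an illustration under assumptions: the function names are hypothetical, and every shape feature is assumed to be already normalised to the range [lo, hi] so that all histograms share the same bin edges.

```python
def histogram(values, bins, lo, hi):
    """Counts of `values` in `bins` equal-width bins over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width) if width > 0 else 0
        counts[max(0, min(idx, bins - 1))] += 1  # clamp out-of-range values
    return counts

def shape_histogram(features, bins=8, lo=0.0, hi=1.0):
    """features: one row per keypoint, one column per shape feature."""
    final = []
    for column in zip(*features):                        # 1. same feature, all points
        final.extend(histogram(column, bins, lo, hi))    # 2. X-bin histogram, 3. concatenate
    return final

# Three keypoints with two shape features each, two bins per feature
features = [[0.1, 0.9], [0.2, 0.95], [0.6, 0.05]]
print(shape_histogram(features, bins=2))
```

For an 8-Angular Scan with 8 bins per feature, the final shape histogram would have 8 * 8 = 64 entries, independent of the number of keypoints.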
Results and conclusions 
The dataset: Caltech-101 
• Well-recognized dataset
• 101 different categories of images
• Ground truth annotations available
• From 40 to 800 images per category
Metrics: Accuracy (%) 
Accuracy (%) = 100 × Correct classifications / (Correct + Incorrect classifications)
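As a small worked example of this metric, the sketch below computes accuracy as a percentage from predicted and true labels. The function name and the label values are illustrative only.

```python
def accuracy_percent(predicted, actual):
    """100 * correct / (correct + incorrect) over paired label lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)

predicted = ["ball", "chair", "beaver", "ball"]
actual = ["ball", "chair", "ball", "ball"]
print(accuracy_percent(predicted, actual))  # 3 of 4 correct -> 75.0
```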
Experiments setup 
• 30 images per category for training and 30-50 for testing.
• 101 categories + Background category.
• Different vocabulary sizes on the X axis.
• Accuracy (%) on the Y axis.
• Experiments and analysis:
  • eSIFT
  • BoW+S
  • eSIFT vs BoW+S
  • Performance achieved
  • Comparison between adding features before or after quantization
  • Number of bins per histogram
  • Ground truth vs MCG Object Candidates
  • Context vs Shape
Results enriched SIFT 
Results BoW+S 
Performance achieved 
Conclusion
With Angular Scan, performance increases from 16% to around 41%.
Comparison between adding features before and after quantization
Conclusion
With Angular Scan, if the number of shape features is high, both architectures tend to converge.
Number of bins per histogram 
Conclusion 
With Angular Scan, 8 bins per histogram gives the best performance.
Ground truth vs MCG Object Candidates 
Conclusions
1. Higher vocabulary sizes make the approach more robust to segmentation errors.
2. Shape-based methods are more sensitive to segmentation errors than texture-based ones.
Context gain vs Shape gain 
Conclusion 
[Figure labels: Object, Context]
Codifying the shape gives better performance than codifying the context of the image.
Future Work
Comparison between our work and Second Order Pooling
PhD thesis of Carles Ventura
Carreira, J., Caseiro, R., Batista, J., & Sminchisescu, C. (2012). Semantic segmentation with second-order pooling. In Computer Vision-ECCV 2012 (pp. 430-443). Springer Berlin Heidelberg.
Future Work
Distance to the nearest border (DNB)
Conclusions
1. Performance increases from 16% to around 41%.
2. With Angular Scan, if the number of shape features is high, both architectures tend to converge.
3. With Angular Scan, 8 bins per histogram gives the best performance.
4. Higher vocabulary sizes make the approach more robust to segmentation errors.
5. Shape-based methods are more sensitive to segmentation errors than texture-based ones.
6. Codifying the shape gives better performance than codifying the context of the image.
Thank you!
Questions?

TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 

Contextless Object Recognition with Shape-enriched SIFT and Bags of Features

  • 1. Contextless Object Recognition with Shape-enriched SIFT and Bags of Features. Marcel Tella Amo. Directed by Dr. Matthias Zeppelzauer (TU Wien). Co-directed by Dr. Xavier Giró-i-Nieto (UPC).
  • 2. Motivation: Object Recognition and Classification. Categories: Ball, Airplane, Chair, Beaver, … Example images: Ball, Airplane, Chair. Cues: shape information, texture information.
  • 3. Index: Requirements, State of the Art, Design, Results
  • 5. Design shape features that can be used in an aggregated framework, such as Bag of Words, with no need for matching or alignment. Take a successful method (SIFT) and enrich it with shape information.
  • 6. Analyse the implication of the vocabulary size with respect to the size of the shape features. (Figure labels: SIFT, Shape.)
  • 7. The proposed features should be at least scale, rotation and translation invariant and, if possible, flip invariant as well.
  • 8. Need for segmentation to codify the shape. Study the limitations of shape coding when using a state-of-the-art segmentation. Manual annotations vs. automatic segmentation.
  • 9. State of the Art
  • 10. Object candidate algorithms: Multiscale Combinatorial Grouping (MCG). Candidates are ranked by object plausibility, from high to low. Arbelaez, P., Pont-Tuset, J., Barron, J. T., Marques, F., Malik, J. (2014). Multiscale Combinatorial Grouping. CVPR.
  • 11. Shape Context. G. Mori, S. Belongie, and J. Malik. Efficient shape matching using shape contexts. PAMI, 27(11), 2005.
  • 12. Interest point descriptors: SIFT descriptor (simplified example). Typically 4x4 divisions * 8 bins/hist = 128 features. Dense SIFT vs. sparse SIFT. David G. Lowe, Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision 60 (2004), no. 2, 91-110.
  • 13. Enrichment of SIFT: the 128-dimensional SIFT descriptor is extended with extra features. Extra features: absolute spatial location (X, Y) or angle and distance. Rene Grzeszick, Leonard Rothacker, and Gernot A. Fink, "Bag-of-features representations using spatial visual vocabularies for object classification," in IEEE Intl. Conf. on Image Processing, Melbourne, Australia, 2013. Extra features: relative position + aspect ratio + scale ratio + color space. Carreira, J., Caseiro, R., Batista, J., & Sminchisescu, C. (2012). Semantic segmentation with second-order pooling. In Computer Vision-ECCV 2012 (pp. 430-443). Springer Berlin Heidelberg.
  • 14. Bag of Words
  • 15. Bag of Words pipeline. Training: get descriptors, clustering (K-means), create histograms, train model (SVM). Testing: image, create histogram, evaluate (SVM).
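The histogram step of the pipeline on slide 15 can be sketched as follows. This is a minimal illustration, assuming a vocabulary of cluster centres has already been produced by K-means; the names `nearest_word` and `bow_histogram` and the toy 2-D descriptors are hypothetical, and the SVM stage is left out.

```python
def nearest_word(descriptor, vocabulary):
    """Return the index of the vocabulary centre closest to the descriptor."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(vocabulary)),
               key=lambda i: sq_dist(descriptor, vocabulary[i]))


def bow_histogram(descriptors, vocabulary):
    """Count visual-word occurrences and L1-normalise the counts."""
    counts = [0] * len(vocabulary)
    for d in descriptors:
        counts[nearest_word(d, vocabulary)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


# Toy data (assumption): real SIFT descriptors would have 128 dimensions.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (4.8, 5.1), (5.2, 5.0)]
histogram = bow_histogram(descriptors, vocabulary)
```

The resulting fixed-length vector is what a classifier such as an SVM would consume, regardless of how many descriptors the image produced.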
  • 17. Why dense SIFT?
  • 18. Main principle: combination of dense SIFT and object candidates.
  • 19. Distance to the nearest border (DNB). Logarithmic distance to the nearest border (LDNB): less influence of big distances. Carreira, J., Caseiro, R., Batista, J., & Sminchisescu, C. (2012). Semantic segmentation with second-order pooling. In Computer Vision-ECCV 2012 (pp. 430-443). Springer Berlin Heidelberg.
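A rough sketch of the DNB and LDNB features from slide 19, assuming the region is given as a binary mask (list of rows of 0/1). The slide does not state the exact logarithmic formula, so `log(1 + d)` is an assumption here, as are the helper names `border_pixels`, `dnb` and `ldnb`.

```python
import math


def border_pixels(mask):
    """Return (row, col) of region pixels that touch the outside of the region."""
    rows, cols = len(mask), len(mask[0])
    border = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                outside = not (0 <= nr < rows and 0 <= nc < cols)
                if outside or not mask[nr][nc]:
                    border.append((r, c))
                    break
    return border


def dnb(point, border):
    """Euclidean distance from a keypoint to the nearest border pixel."""
    return min(math.hypot(point[0] - r, point[1] - c) for r, c in border)


def ldnb(point, border):
    """Assumed logarithmic variant: log(1 + distance) compresses large distances."""
    return math.log1p(dnb(point, border))


# Toy region (assumption): a 5x5 mask that is entirely foreground.
region = [[1] * 5 for _ in range(5)]
region_border = border_pixels(region)
```

For the centre keypoint `(2, 2)` the nearest border pixel is two pixels away, so `dnb` returns 2.0 while `ldnb` returns about 1.1, which illustrates the reduced influence of big distances.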
  • 20. Distance and Angle to the nearest border (DANB). Problem: really similar in 2D but very different values. Solution: codify them in two separate features.
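One possible reading of slide 20 is the wrap-around of raw angle values: 1° and 359° point in almost the same 2D direction but differ by 358 as numbers. The sketch below assumes the "two separate features" are the cosine and sine of the angle; the slide does not confirm this choice, and `encode_angle` is a hypothetical name.

```python
import math


def encode_angle(theta_degrees):
    """Encode an angle as (cos, sin) so that nearby directions get nearby values.

    Assumption: the slide's two features are taken to be cos and sin.
    """
    t = math.radians(theta_degrees)
    return (math.cos(t), math.sin(t))


a = encode_angle(1.0)
b = encode_angle(359.0)
gap = math.hypot(a[0] - b[0], a[1] - b[1])  # small, unlike |1 - 359|
```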
  • 21. Rotation-invariant angle to the nearest border
  • 22. Distance to the center (DC)
  • 23. η-Angular Scan (ηAS): WINNER!
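The η-Angular Scan of slide 23 can be approximated as casting η equally spaced rays from a keypoint and measuring how far each one travels before leaving the region. The sketch below is an assumption-laden illustration, not the author's implementation: the region is a binary mask, rays advance in fixed steps, and `angular_scan`, `n_angles` and `step` are hypothetical names. Rotation invariance is not addressed here.

```python
import math


def angular_scan(mask, point, n_angles=8, step=1.0):
    """Distance from `point` to the region border along n_angles directions.

    Each ray stops at the first sample outside the region, so it never
    crosses the contour. Angles follow image coordinates (rows grow downwards).
    """
    rows, cols = len(mask), len(mask[0])
    max_len = math.hypot(rows, cols)
    distances = []
    for k in range(n_angles):
        theta = 2.0 * math.pi * k / n_angles
        dr, dc = math.sin(theta), math.cos(theta)
        length = 0.0
        while length < max_len:
            r = int(math.floor(point[0] + (length + step) * dr + 0.5))
            c = int(math.floor(point[1] + (length + step) * dc + 0.5))
            if not (0 <= r < rows and 0 <= c < cols) or not mask[r][c]:
                break
            length += step
        distances.append(length)
    return distances


# Toy region (assumption): a 5x5 mask that is entirely foreground.
region = [[1] * 5 for _ in range(5)]
scan4 = angular_scan(region, (2, 2), n_angles=4)
scan8 = angular_scan(region, (2, 2), n_angles=8)
```

With η = 8 each keypoint yields eight distances, which matches the "8 distances (different angles)" example given later for the shape histograms.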
  • 24. Shape Context from a dense SIFT (DSC). Note: it crosses the contour of the region, like Shape Context; ηAS does not.
  • 25. Rotation Invariant Region Quantization (RIRQ). Main idea: get spatial information. Easily extensible to a pyramid! Lazebnik, S., Schmid, C., & Ponce, J. (2006). Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on (Vol. 2, pp. 2169-2178). IEEE.
  • 26. Achieving flip invariance (RIRQ). [Figure: numbered region labels passed through a SORT step.]
  • 27. Where do we integrate our features? Two main architectures. Enriched SIFT (eSIFT): SIFT plus shape features, then visual vocabulary, then bag of eSIFT visual words. BoW+Shape: SIFT, then visual vocabulary, then bag of words, plus a shape histogram.
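The difference between the two architectures on slide 27 is where the concatenation happens, which can be sketched as below. The function names and the `weight` parameter are hypothetical; the slides do not say whether the shape features are scaled before being appended.

```python
def enrich_descriptor(sift, shape_features, weight=1.0):
    """eSIFT: append shape features to each descriptor BEFORE quantization.

    `weight` is an assumed knob for balancing shape against texture.
    """
    return list(sift) + [weight * f for f in shape_features]


def bow_plus_shape(bow_histogram, shape_histogram):
    """BoW+Shape: append a per-image shape histogram AFTER quantization."""
    return list(bow_histogram) + list(shape_histogram)


# Toy sizes (assumption): 128-D SIFT, 8 shape features, 4 visual words.
esift = enrich_descriptor([0.0] * 128, [1.0] * 8)
image_vector = bow_plus_shape([0.25, 0.25, 0.5, 0.0], [2, 1, 1, 2])
```

In the first case the visual vocabulary is learned on 136-dimensional vectors; in the second the vocabulary sees only SIFT and the shape information reaches the classifier as extra histogram bins.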
  • 28. BoW+Shape: creation of the shape histograms. 1. Accumulate the same feature for all points. 2. Create a histogram of X bins for that feature. 3. Concatenate histograms to create the final one. Example: 8-Angular Scan, 8 distances (different angles) × number of SIFT keypoints.
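The three steps on slide 28 can be sketched as follows, assuming the shape features arrive as one row per SIFT keypoint and one column per feature (for example 8 columns for an 8-Angular Scan). The fixed value range `lo`/`hi` and the function names are assumptions made for illustration.

```python
def histogram(values, n_bins, lo, hi):
    """Histogram with n_bins equal-width bins on [lo, hi]; outliers are clamped."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        idx = int((v - lo) / width)
        counts[min(max(idx, 0), n_bins - 1)] += 1
    return counts


def shape_histogram(features, n_bins, lo, hi):
    """Steps 1-3: gather each feature over all keypoints, bin it, concatenate."""
    result = []
    for j in range(len(features[0])):
        column = [row[j] for row in features]  # step 1: same feature, all points
        result.extend(histogram(column, n_bins, lo, hi))  # steps 2 and 3
    return result


# Toy data (assumption): 3 keypoints, 2 shape features each, 2 bins on [0, 10].
features = [[0.0, 9.0], [1.0, 9.5], [6.0, 2.0]]
shape_vec = shape_histogram(features, n_bins=2, lo=0.0, hi=10.0)
```

The output length is the number of features times the number of bins, so an 8-Angular Scan with 8 bins would give a 64-dimensional shape histogram.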
  • 30. The dataset: Caltech-101. Well-recognized dataset. 101 different categories of images. Ground truth annotations available. From 40 to 800 images per category.
  • 31. Metrics: Accuracy (%) = 100 × Correct Classifications / (Correct Classifications + Incorrect Classifications)
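The accuracy metric of slide 31 amounts to the fraction of correct classifications expressed as a percentage. A minimal sketch, with the hypothetical name `accuracy_percent` and made-up labels:

```python
def accuracy_percent(predicted, actual):
    """Correct classifications divided by all classifications, in percent."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Toy labels (assumption), using category names from the motivation slide.
predicted = ["ball", "chair", "ball", "beaver"]
actual = ["ball", "chair", "chair", "beaver"]
score = accuracy_percent(predicted, actual)
```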
  • 32. Experiments setup. 30 images per category in train and 30-50 in test. 101 categories + background category. Different vocabulary sizes on the X axis. Accuracy (%) on the Y axis. Experiments and analysis: eSIFT; BoW+S; eSIFT vs BoW+S; performance achieved; comparison between adding features before or after quantization; number of bins per histogram; ground truth vs MCG Object Candidates; context vs shape.
  • 33. Results: enriched SIFT
  • 34. Results: BoW+S
  • 35. Performance achieved. Conclusion: with Angular Scan, performance increases from 16% to around 41%.
  • 36. Comparison between adding features after and before quantization. Conclusion: in Angular Scan, if the number of shape features is high, both architectures tend to converge.
  • 37. Number of bins per histogram. Conclusion: in Angular Scan, 8 bins is the value that gives the best performance.
  • 38. Ground truth vs MCG Object Candidates. Conclusions: 1. Higher vocabulary values lead to a more robust approach in terms of segmentation errors. 2. Shape-based methods are more sensitive to segmentation errors than texture-based ones.
  • 39. Context gain vs shape gain (figure labels: Object, Context). Conclusion: codifying the shape gives better performance than codifying the context of the image.
  • 40. Future Work: comparison between our work and Second Order Pooling. PhD thesis of Carles Ventura. Carreira, J., Caseiro, R., Batista, J., & Sminchisescu, C. (2012). Semantic segmentation with second-order pooling. In Computer Vision-ECCV 2012 (pp. 430-443). Springer Berlin Heidelberg.
  • 41. Future Work: Distance to the nearest border (DNB)
  • 42. Conclusions: 1. Increase of performance from 16% to around 41%. 2. In Angular Scan, if the number of shape features is high, both architectures tend to converge. 3. In Angular Scan, 8 bins is the value that gives the best performance. 4. Higher vocabulary values lead to a more robust approach in terms of segmentation errors. 5. Shape-based methods are more sensitive to segmentation errors than texture-based ones. 6. Codifying the shape gives better performance than codifying the context of the image. Thank you! Questions?