The document summarizes prior work on poselets and attributes for describing people in images. It discusses how poselets were introduced in 2009 and have since been applied to tasks like segmentation, action recognition, and categorization. It also reviews over 20 prior works from 1990-2011 on discovering and learning attributes from text, images, motion capture data, and for tasks such as image retrieval, active learning, and determining gender. The goal of the current work is to extract attributes from images using a poselet-based approach.
Describing People: A Poselet-based approach to attribute classification
1. Describing People: A Poselet-Based Approach to Attribute Classification
Lubomir Bourdev (1, 2)
Subhransu Maji (1)
Jitendra Malik (1)
(1) EECS, U.C. Berkeley; (2) Adobe Systems Inc.
7. Prior work on Poselets
• Introduced by [Bourdev and Malik, ICCV09]
• Detection with poselets [Bourdev et al, ECCV10]
• Applications
• Segmentation [Brox et al, ECCV10] [Maire et al, ICCV11]
• Actions [Yang et al, CVPR10] [Maji et al, CVPR11] [Yao et al, ICCV11]
• Human parsing [Wang et al, CVPR11]
• Semantic contours [Hariharan et al, ICCV11]
• Subordinate level categorization [Farrell et al, ICCV11]
12. Prior work on Attributes
• Attributes as intermediate parts
• Discovering attributes from text
• Discovering attributes from images
• Attributes from motion capture
• Joint learning of classes & attributes
• Image retrieval with attributes
• Attributes and actions
• Active learning with attributes
• Attributes of people
• Gender attribute
[Cottrell and Metcalfe, NIPS90] [Golomb et al, NIPS90] [Moghaddam & Yang, PAMI02]
[Ferrari & Zisserman, NIPS07] [Kumar et al, ECCV08] [Gallagher and Chen, CVPR08]
[Cao et al, ACM08] [Lampert et al, CVPR09] [Farhadi et al, CVPR09]
[Wang et al, BMVC09] [Wang and Forsyth, ICCV09] [Kumar et al, ICCV09] [Farhadi et al, CVPR10]
[Berg et al, ECCV10] [Wang and Mori, ECCV10] [Sigal et al, ECCV10]
[Branson et al, ECCV10] [Hwang et al, CVPR11] [Parikh and Grauman, CVPR11] [Douze et al, CVPR11]
[Kovashka et al, ICCV11] [Liu et al, CVPR11] [Qiu et al, ICCV11] [Yao et al, ICCV11]
[Dhar et al, CVPR11] [Parikh and Grauman, ICCV11] [Siddiquie et al, CVPR11]
32. Training poselet classifiers
Residual error (example patches): 0.15 0.20 0.10 0.85 0.15 0.35
1. Given a seed patch
2. Find the closest patch for every other person
3. Sort them by residual error
4. Threshold them
5. Use them as positive training examples to train a linear SVM with HOG features
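Steps 2 to 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the six best residuals reuse the example values from the slide, while the person and patch ids, the extra worse candidates, and the 0.40 threshold are hypothetical. How the residual itself is computed and the SVM training of step 5 are not shown.

```python
# Hypothetical candidate patches per person: (patch_id, residual_error).
# The lowest residual for each person reuses a value from the slide;
# ids, the worse candidates, and the threshold below are made up.
candidates = {
    "person_a": [("a1", 0.15), ("a2", 0.60)],
    "person_b": [("b1", 0.45), ("b2", 0.20)],
    "person_c": [("c1", 0.10)],
    "person_d": [("d1", 0.85), ("d2", 0.90)],
    "person_e": [("e1", 0.15)],
    "person_f": [("f1", 0.35), ("f2", 0.70)],
}


def select_positive_examples(candidates, max_residual):
    """Pick the patches that would serve as positive training examples."""
    # Step 2: keep the closest (lowest-residual) patch for each person.
    closest = [min(patches, key=lambda p: p[1]) for patches in candidates.values()]
    # Step 3: sort the per-person best matches by residual error.
    closest.sort(key=lambda p: p[1])
    # Step 4: threshold, discarding matches that are too far from the seed.
    return [p for p in closest if p[1] <= max_residual]


positives = select_positive_examples(candidates, max_residual=0.40)
print(positives)
```

With these made-up inputs the patch with residual 0.85 is rejected and the remaining five would go on to step 5.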
47. Our dataset
• Source: VOC 2010 trainval for Person + H3D
• ~8000 annotations (4000 train + 4000 test)
• 9 binary attributes specified by 5 independent annotators via AMT
• Ground truth label: If 4 of the 5 agree
• Dataset will be made publicly available
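The 4-of-5 agreement rule can be sketched as below. The sketch assumes "4 of the 5 agree" means at least four matching votes, and that examples without such agreement are left unlabeled; the slide does not say how those cases are handled, and the function name is hypothetical.

```python
def ground_truth_label(votes, min_agree=4):
    """Aggregate five binary annotator votes for one attribute.

    Returns True or False when at least `min_agree` annotators agree,
    and None otherwise (treated here as "no ground truth", an assumption).
    """
    yes = sum(1 for v in votes if v)
    no = len(votes) - yes
    if yes >= min_agree:
        return True
    if no >= min_agree:
        return False
    return None


print(ground_truth_label([True, True, True, True, False]))
print(ground_truth_label([True, True, True, False, False]))
```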
52. Our baseline
• Canny-modulated HOG with SPM kernel [Lazebnik et al, CVPR06]
• To help the baseline, we trained a separate SPM for each of four viewpoints:
Full view, head zoom, upper body, legs
• For each attribute we pick the best SPM as our baseline
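Picking the best SPM per attribute amounts to an argmax over the four viewpoint models. In the sketch below the attribute names are taken from the slides, but every score is made up, and the selection criterion (highest score on some held-out data) is an assumption, since the slide only says "best".

```python
# Hypothetical scores per attribute for the four viewpoint-specific SPMs.
# All numbers are illustrative, not results from the talk.
scores = {
    "long hair": {"full view": 0.52, "head zoom": 0.71, "upper body": 0.64, "legs": 0.30},
    "long pants": {"full view": 0.68, "head zoom": 0.35, "upper body": 0.41, "legs": 0.80},
}


def pick_best_spm(scores_by_attribute):
    """Return, for each attribute, the viewpoint whose SPM scored highest."""
    return {
        attribute: max(view_scores, key=view_scores.get)
        for attribute, view_scores in scores_by_attribute.items()
    }


baseline = pick_best_spm(scores)
print(baseline)
```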
53. Precision/recall on our test set
[Figure: precision/recall curves comparing label frequency, SPM, no context, and full model]
54. State-of-the-art Gender Recognition
• We outperform Cognitec (top-notch face recognizer)
• We outperform any gender recognizer based on frontal faces (are there others?)
• 61% of our test examples have frontal faces.
• Even with perfect classification of frontal faces, max AP = 80.5% vs. our AP of 82.4%
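For readers unfamiliar with AP, here is a minimal sketch of one common definition of average precision: the mean of the precision values at the rank of each positive example. It is not necessarily the exact variant behind the numbers above (the slides do not specify it), and the scores and labels in the usage lines are made up.

```python
def average_precision(scores, labels):
    """Mean of precision-at-rank over the positive examples.

    `scores` are classifier confidences, `labels` are 1 (positive) or 0.
    """
    # Rank examples from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, is_positive) in enumerate(ranked, start=1):
        if is_positive:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


# Toy usage with made-up values.
ap = average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
print(round(ap, 4))
```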
55. Confusions
• Men most confused as women (image label: long hair)
• Women most confused as men (image labels: baseball hat, hair hidden)
56. Confusions (continued)
• Non-T-shirt most confused to be T-shirt (image label: annotation errors)
• Short pants most confused to be long pants (image labels: "Are these pants short?", wrong person, occlusion)
60. How poselets help in high-level vision
• The image is a complex function of the viewpoint, pose, appearance, etc.
• Poselets decouple pose and camera view from appearance
61. Google “poselets” to get:
• The set of published poselet papers
• H3D data set + Matlab tools
• Java3D annotation tool + video tutorial
• Matlab code to detect people using poselets
• Our latest trained poselets
62. Poselets website
http://eecs.berkeley.edu/~lbourdev/poselets
[Figure: example people with generated attribute descriptions (hair length, glasses, hat, sleeve length, pants length); one caption reads “A computer vision professor who likes machine learning”, and a "Failure mode" label also appears]