GROUP SALIENCY PROPAGATION
FOR LARGE SCALE AND QUICK IMAGE CO-SEGMENTATION
Koteswar Rao Jerripothula†    Jianfei Cai†    Junsong Yuan§
ROSE Lab, Interdisciplinary Graduate School, Nanyang Technological University
†School of Computer Engineering, Nanyang Technological University
§School of Electrical and Electronic Engineering, Nanyang Technological University
ABSTRACT
Most existing co-segmentation methods are complex and require pre-grouping of images, fine-tuning of a few parameters, initial segmentation masks, etc. These limitations become serious concerns for their application to large-scale datasets. In this paper, a Group Saliency Propagation (GSP) model is proposed in which a single group saliency map is developed and then propagated to segment the entire group. In addition, it is also shown how a pool of these group saliency maps can help in quickly segmenting new input images. Experiments demonstrate that the proposed method achieves competitive performance on several benchmark co-segmentation datasets, including ImageNet, with the added advantage of a significant speed-up.
Index Terms— co-segmentation, propagation, quick,
large scale, group, ImageNet, fusion.
1. INTRODUCTION
Image co-segmentation deals with segmenting out common
objects from a set of images. Exploiting the weak supervision
provided by association of similar images, co-segmentation
has become a promising substitute to tedious interactive seg-
mentation for object extraction and labelling in large-scale
datasets. The concept was originally introduced by [1] to
simultaneously segment out the common object from a pair
of images. Since then, many co-segmentation methods have
been proposed to scale from image pair to multiple images
[2, 3, 4]. The work in [2] proposed a discriminative clustering
framework and [3] used optimization for co-segmentation.
Recently, [5] combined co-segmentation with co-sketch for
effective co-segmentation and [6] proposed to co-segment a
noisy image dataset (where some images may not contain the
common object). Most of these existing works require intensive computation due to the co-labelling of multiple images, and also need fine-tuning of a few parameters, pre-grouping of images, etc.

†This research was carried out at the Rapid-Rich Object Search (ROSE) Lab at the Nanyang Technological University, Singapore. The ROSE Lab is supported by the National Research Foundation, Prime Minister's Office, Singapore, under its IDM Futures Funding Initiative and administered by the Interactive and Digital Media Programme Office. This research is also supported by Singapore MoE AcRF Tier-1 Grant RG30/11.
Our previous work [7] addresses some of these issues by performing segmentation on each individual image, but with the help of a global saliency map (formed via a pairwise warping process). Warping [8] aligns one image w.r.t. another by establishing dense pixel-level correspondence between the two images, and this process is computationally expensive. Therefore, developing such a global saliency map for each image requires intensive computation. In this paper, we aim at improving the efficiency of our previous method so that it can be applied to large-scale datasets without much performance loss.
Geometric Mean Saliency: Since the proposed method is based on our previous work [7], we briefly review its basic idea here to make the paper self-contained. In GMS [7], segmentation is performed on individual images, but using a global saliency map obtained by fusing the single-image saliency maps of a group of similar images. In this way, even if an existing single-image saliency detection method fails to detect the common object as salient in one image, the saliency maps of the other images can help extract the common object by forming a global saliency map. The fusion takes place at the pixel level using the geometric mean, after the saliency maps of the other images are aligned. For each image, the corresponding saliency values in the other images are collected and combined. When we do this for every image, mostly the same corresponding pixels come together each time and their saliency values are combined, under the assumption that the warping technique is precise and accurate. This makes the process repetitive and inefficient. We avoid such repetition in the proposed method by performing this task only once and propagating the resulting information within the group. In this way, we effectively perform the same task as before, but efficiently, and as a result we hardly observe any drop in performance.
In particular, our basic idea is to first cluster visually sim-
ilar images into a group and select a key image to represent
each group. After that we can align all the saliency maps
of other images in the group w.r.t. the key image via warp-
ing technique and fuse them to form a single group saliency map for the entire group.

Fig. 1. Comparison between GMS [7] and our proposed GSP. Instead of performing pairwise matching (GMS [7]), GSP only matches each member image with the key image that represents the whole group.

This group saliency map serves as
a group prior for foreground extraction of all the images in
the group by means of propagation that is done via warping.
Moreover, it can also help to segment a new input image by
the same means. We call our method the Group Saliency Propagation (GSP) model. Note that there are only a few attempts at large-scale co-segmentation, such as [3, 9, 10], where [10] represents the state of the art. Although [10] also relies on propagation, it needs some human-annotated segmentation masks and semantic grouping to initiate the propagation, whereas our method has no such limitation. In addition, our GSP method also enables quick segmentation of any new image by utilizing a pool of pre-computed group saliency maps. Experiments on several benchmark datasets, including ImageNet [11], show that our proposed GSP model can achieve competitive results with a drastic speed-up.
2. GROUP SALIENCY PROPAGATION
This section provides the detailed description of our proposed
method. The group saliency map is obtained by fusing the key
image saliency map and warped saliency maps of other im-
ages at pixel level. Unlike [7] where such fusion of individual
saliency map and warped saliency map is carried out for every
image, here we only fuse the saliency maps for the key image
and treat the fused saliency map as the group saliency map.
Once we obtain the group saliency map for each group, this
group saliency map can help in segmentation of all the images
in the group as well as new input images through propagation.
Notations: Consider a large dataset with $m$ images $\mathcal{I} = \{I_1, I_2, \ldots, I_m\}$. Let $\mathcal{M} = \{M_{I_1}, M_{I_2}, \ldots, M_{I_m}\}$ be the set of corresponding visual saliency maps, $\mathcal{D} = \{D_{I_1}, D_{I_2}, \ldots, D_{I_m}\}$ the set of corresponding weighted GIST descriptors [12, 6] (saliency maps are used as weight maps) for global description, and $\mathcal{L} = \{L_{I_1}, L_{I_2}, \ldots, L_{I_m}\}$ the set of corresponding dense SIFT descriptors [8] for local description. Denote by $I$, $A$, $S$ and $M$ a new input image, its GIST descriptor, its SIFT descriptor and its preprocessed saliency map, respectively.
Fig. 2. Proposed GSP approach: a single group saliency map of the key image helps in segmenting all the other images. (The figure shows the source images and their saliency maps, the key image and member images, warping of member saliency maps to the key image, the group saliency map, the warped group saliency maps, and the GrabCut results.)

Pre-processing: We use the same pre-processing steps as in [7] to ensure that each saliency map covers sufficient regions of the object by adding spatial contrast saliency, and then brighten the saliency map to avoid the over-penalty caused by low saliency values.
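(The exact pre-processing operations are defined in [7]. Purely as an illustrative sketch, and not the authors' actual procedure, the "brightening" step could be a gamma lift that keeps low saliency values from being over-penalized by the later geometric-mean fusion; the gamma curve and its value are our own assumptions.)

```python
import numpy as np

def brighten_saliency(sal, gamma=0.5):
    """Illustrative 'brightening' of a saliency map with values in [0, 1]:
    a gamma curve with gamma < 1 lifts low values so they are not
    over-penalized when maps are later fused with a geometric mean.
    The spatial-contrast-saliency addition from [7] is omitted here."""
    return np.clip(sal, 0.0, 1.0) ** gamma
```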
2.1. Group Saliency Maps
We first perform k-means clustering on the GIST descriptors of the images to cluster the entire dataset into K groups. For each group, the image closest to the cluster center is selected as the key image to represent the whole group. Each group's saliency map is then produced using [7] w.r.t. the key image alone: the saliency maps of the member images are aligned to the key image via the warping technique. Then, following the method in GMS [7], the key image's saliency map and the warped saliency maps from the other members are fused
at pixel level using geometric mean. Let $C_k$ denote the $k$-th cluster as well as the corresponding key image, and let $U^{I_j}_{C_k}$ denote the warped saliency map of image $I_j$ w.r.t. the key image $C_k$. The group saliency map value $G^k(p)$ at any pixel $p$ can now be computed as

$$G^k(p) = \bigg( M_{C_k}(p) \prod_{I_j \in C_k,\ I_j \neq C_k} U^{I_j}_{C_k}(p) \bigg)^{\frac{1}{|C_k|}} \quad (1)$$

where $|\cdot|$ represents cardinality.
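As a rough sketch (not the authors' code), the grouping, key-image selection and Eq. (1) fusion of this subsection can be outlined in Python/NumPy with scikit-learn's k-means; here `warp_to_key` stands in for the SIFT-flow warping of [8] and is assumed to return a map aligned to (and shaped like) the key image:

```python
import numpy as np
from sklearn.cluster import KMeans

def build_group_saliency_maps(gists, saliencies, K, warp_to_key, eps=1e-6):
    """Cluster images into K groups on their GIST descriptors, pick the image
    closest to each cluster center as the key image, and fuse the saliency
    maps warped onto the key image with a per-pixel geometric mean (Eq. 1)."""
    km = KMeans(n_clusters=K, n_init=10).fit(gists)
    group_maps, key_ids = [], []
    for k in range(K):
        members = np.flatnonzero(km.labels_ == k)
        # Key image: the member whose GIST descriptor is nearest the centroid.
        d = np.linalg.norm(gists[members] - km.cluster_centers_[k], axis=1)
        key = members[np.argmin(d)]
        # Key image's own map plus the other members' maps warped to the key.
        maps = [saliencies[key]] + [warp_to_key(saliencies[j], src=j, dst=key)
                                    for j in members if j != key]
        # Geometric mean over |C_k| maps, taken in the log domain for stability.
        stack = np.clip(np.stack(maps), eps, 1.0)
        group_maps.append(np.exp(np.log(stack).mean(axis=0)))
        key_ids.append(key)
    return group_maps, key_ids
```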
For quick co-segmentation, along with the group saliency maps we also need to store the GIST and SIFT descriptors of the key images and the number of members in each group, for the purposes of group assignment, warping and fusion, respectively. Therefore, the pool of group saliency maps contains not only the maps $\mathcal{G} = \{G^1, G^2, \ldots, G^K\}$ but also the corresponding GIST and SIFT descriptors and member counts: $\mathcal{A} = \{D_{C_1}, D_{C_2}, \ldots, D_{C_K}\}$, $\mathcal{S} = \{L_{C_1}, L_{C_2}, \ldots, L_{C_K}\}$ and $\mathcal{C} = \{|C_1|, |C_2|, \ldots, |C_K|\}$, respectively.
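One plausible in-memory layout for this [G, A, S, C] pool; the dataclass and field names are our own and purely illustrative:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class GroupEntry:
    group_map: np.ndarray  # G^k: the group saliency map of the key image
    gist: np.ndarray       # D_{C_k}: key image's GIST, for group assignment
    sift: np.ndarray       # L_{C_k}: key image's dense SIFT, for warping
    size: int              # |C_k|: member count, weighting the fusion in Eq. (2)

# The pool used for quick co-segmentation is then a list of such entries.
pool: list[GroupEntry] = []
```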
2.2. Propagation
In our previous work [7], for a group of n images, when segmenting one image, the saliency maps of the other n − 1 images need to be warped to the current image, which requires computing the costly dense SIFT flow n − 1 times. Thus, segmenting all n images requires computing dense SIFT flows n × (n − 1) times, as shown in the left part of Fig. 1, which is very time consuming, especially for a large-scale dataset.

Fig. 3. Quick co-segmentation of any new input image: first group assignment, then group saliency map warping, and then fusion of saliency maps. (The figure shows input images, the pool of group saliency maps, key image saliency warping, fusion, and the results.)
In order to reduce the computational cost, in this paper we propose to use the group saliency map to propagate the group prior information. In particular, for a cluster of n images, we compute the dense SIFT flows only for the key image, n − 1 times. When segmenting the n − 1 member images, we simply warp the group saliency map back to each of them. In this way, we need only 2 × (n − 1) dense SIFT flows in total, as shown in the right part of Fig. 1. We thus achieve a drastic speed-up of n/2 times through this propagation of group saliency maps; for example, with n = 30, GMS needs 30 × 29 = 870 flows whereas GSP needs only 2 × 29 = 58. An illustration of the proposed approach is given in Fig. 2. In particular, the object prior $O$ for any member image $I_j$ is the group saliency map $G^k$ warped w.r.t. $I_j$, denoted as $U^{C_k}_{I_j}$.
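For concreteness, warping a saliency map by a dense correspondence field (such as the one SIFT flow [8] computes) might look as follows; the displacement-field convention used here is our assumption:

```python
import numpy as np
import cv2

def warp_with_flow(src_map, flow):
    """Warp src_map by a dense flow field of shape (H, W, 2), where
    flow[..., 0] and flow[..., 1] are assumed to be per-pixel x and y
    displacements from target coordinates into the source image, so
    sampling at (x + dx, y + dy) pulls the source value back."""
    h, w = flow.shape[:2]
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (xs + flow[..., 0]).astype(np.float32)
    map_y = (ys + flow[..., 1]).astype(np.float32)
    return cv2.remap(src_map.astype(np.float32), map_x, map_y, cv2.INTER_LINEAR)
```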
Based on the pool of [G, A, S, C] information, we can
also quickly co-segment a new image by 1) assigning the im-
age to a group using the GIST feature, 2) warping the se-
lected group saliency map to the input image, and 3) fusing
the saliency map of the input image with the warped group
saliency map, as shown in Fig. 3. Since one more saliency value is to be included in the calculation of the geometric mean of saliency values, the object prior $O$ value for any pixel $p$ of a new input image $I$ can be computed as

$$O(p) = \Big( U^{C_k}_I(p)^{|C_k|} \times M(p) \Big)^{\frac{1}{|C_k|+1}} \quad (2)$$

where $U^{C_k}_I$ denotes $G^k$ warped w.r.t. $I$.
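A hedged sketch of these three steps, assuming the pool layout above, Euclidean nearest-neighbour GIST matching (our assumption) for step 1, and the warp from the previous sketch for step 2:

```python
import numpy as np

def assign_group(input_gist, pool):
    """Step 1: assign the new image to the group whose key-image GIST
    descriptor is nearest (Euclidean distance assumed)."""
    return int(np.argmin([np.linalg.norm(e.gist - input_gist) for e in pool]))

def quick_object_prior(warped_group_map, input_saliency, group_size, eps=1e-6):
    """Step 3 (Eq. 2): a weighted geometric mean in which the warped group
    saliency map counts |C_k| times and the new image's own saliency map
    counts once."""
    u = np.clip(warped_group_map, eps, 1.0)
    m = np.clip(input_saliency, eps, 1.0)
    return (u ** group_size * m) ** (1.0 / (group_size + 1))
```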
2.3. Foreground Extraction
We obtain the final mask using GrabCut [13], in which regu-
larization is performed with the help of SLIC superpixel over-
segmentation [14]. Particularly, foreground (F) and back-
ground (B) seed locations for GrabCut are determined by
$$p \in \begin{cases} F, & \text{if } O(p) > \tau \\ B, & \text{if } O(p) < \phi \end{cases} \quad (3)$$

where $\phi$ is a global threshold on $O$ which is automatically determined by the common Otsu's method [15], and $\tau$ is the only tuning parameter.
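A sketch of the seeding rule in Eq. (3) with OpenCV; marking the undecided pixels as "probable foreground" is our own choice, and the SLIC-based regularization mentioned above is omitted:

```python
import numpy as np
import cv2

def grabcut_seed_mask(O, tau=0.99):
    """Build a GrabCut seed mask from the object prior O (values in [0, 1]):
    O(p) > tau -> sure foreground, O(p) < phi -> sure background, where phi
    is Otsu's threshold on O; the remaining pixels are left for GrabCut."""
    O8 = np.uint8(np.clip(O, 0.0, 1.0) * 255)
    phi, _ = cv2.threshold(O8, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    phi /= 255.0
    mask = np.full(O.shape, cv2.GC_PR_FGD, dtype=np.uint8)
    mask[O > tau] = cv2.GC_FGD
    mask[O < phi] = cv2.GC_BGD
    return mask
```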
Fig. 4. Sample results on ImageNet [11]. (Categories shown: Cheetah, Cow, Dog, and Mixed.)
Table 1. Jaccard Similarity and time consumed: comparison of GSP with GMS [7] using the default setting

                            Jaccard Similarity      Time (mins.) Consumed
Datasets          Images    GMS [7]     GSP         GMS [7]     GSP
Weizmann Horses   328       0.72        0.72        117         45
MSRC              418       0.68        0.68        105         22
iCoseg            529       0.67        0.66        126         38
CosegRep          572       0.71        0.71        216         35
Internet Images   2470      0.62        0.61        806         133
3. EXPERIMENTAL RESULTS
Several benchmark datasets including ImageNet [11] have
been used to validate the proposed GSP model and compare
it with existing methods. Two objective measures, Jaccard Similarity (J) and Precision (P), are used for evaluation. Jaccard Similarity is defined as the intersection divided by the union of the ground truth and the segmentation result, and Precision is defined as the percentage of pixels correctly labeled.
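Both measures are straightforward to compute on binary masks; a minimal sketch:

```python
import numpy as np

def jaccard_similarity(pred, gt):
    """Intersection over union of the binary segmentation and ground truth."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    return np.logical_and(pred, gt).sum() / np.logical_or(pred, gt).sum()

def pixel_precision(pred, gt):
    """Percentage of pixels labeled correctly (foreground or background)."""
    return 100.0 * (pred.astype(bool) == gt.astype(bool)).mean()
```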
In Table 1, we compare our quantitative results and time consumed with our previous work [7] on 5 benchmark co-segmentation datasets, MSRC [17], iCoseg [18], Coseg-Rep [5], Internet Dataset [6] and Weizmann Horses [19], using the default parameter setting $\tau = 0.99$. In each of our experiments with m images, K is calculated as the nearest integer to m/15, ensuring that there are 15 images per sub-group on average.

Table 2. Quantitative comparison with existing methods on the large-scale dataset ImageNet

Methods     Jaccard Similarity    Precision
[10]        -                     77.3
[16]        0.57                  84.3
Ours (1)    0.55                  83.9
Ours (2)    0.58                  85.6
Ours (3)    0.62                  87.7

Table 3. Comparison of using vs. not using manual categorization in the default setting

                 With categorization     Without categorization
Datasets         J        P              J        P
MSRC             0.678    87.2           0.674    86.9
iCoseg           0.656    88.9           0.632    87.8
CosegRep         0.706    91.2           0.701    90.4
Since all the datasets apart from the Weizmann Horses dataset contain multiple sub-groups, experiments are conducted separately on each sub-group and the average performance is reported. The reported "Time Consumed" is the total time consumed for the entire dataset. It can be noted that performance is hardly affected while there is a significant improvement in speed. As explained in the previous sections, the reason is that we effectively achieve the same objective, i.e., collecting corresponding pixels together and combining their saliency values, while efficiently avoiding the repetition that occurs in our previous work [7].
Recently, evaluation on large datasets such as ImageNet has also become possible with the release of ground truths for nearly 4.5k images by [10]. We apply our method on about 0.4 million images from the 446 synsets to which these 4.5k images belong, and compare our performance with [10] and [16] in Table 2. Here $\tau$ is the only parameter that we tune, and we report three results with different settings of $\tau$: (1) $\tau$ is fixed for all classes, (2) $\tau$ is fixed within a class but varied across classes, and (3) $\tau$ is tuned across all the images of the dataset. As expected, the performance improves progressively from (1) to (3), and it will be interesting to see whether this parameter can be learned. Compared with [16], our method does not require any segmentation masks for initiation but still achieves comparable performance. Some sample segmentation results obtained on ImageNet are shown in Fig. 4. Even with cluttered backgrounds, our method can still segment the foregrounds successfully.
Since we rely on good clustering to group visually similar images, and claim that manual pre-grouping is not absolutely essential for our method, it is interesting to see how our method behaves without exploiting manual categorization, compared to with it, on the benchmark co-segmentation datasets. It can be noted in Table 3 that there is only a very minor drop in performance.

Fig. 5. Some sample results of our quick co-segmentation method. Even though the object is not very salient in the saliency map [20], the proposed approach is able to extract it.

Table 4. Jaccard similarity comparison between quick co-segmentation and other automatic methods, namely GrabCut initialized by a central window and by a saliency map [20]

Datasets           Center    Saliency    Quick
MSRC               0.44      0.47        0.67
Weizmann Horses    0.53      0.58        0.72
To evaluate our quick co-segmentation idea, we first build a pool of the 30k group saliency maps produced while segmenting ImageNet, and then feed in the images of the benchmark datasets one by one. In Table 4, we compare the results of our quick co-segmentation idea with those of simple GrabCut initialized with a central window covering 25% of the image, and with a saliency prior thresholded using Otsu's method. GrabCut initialized by the central window, GrabCut initialized by the saliency prior, and our quick co-segmentation method are labelled Center, Saliency and Quick, respectively. It can be noted that our quick co-segmentation method obtains a significant improvement over the others. Some sample results of this approach are shown in Fig. 5, demonstrating how the foreground gets extracted despite not being particularly salient.
4. CONCLUSION
Image co-segmentation on large-scale datasets remains a challenging problem, as most existing methods cannot be easily extended to big datasets. We efficiently extend our previous work [7] to perform large-scale co-segmentation, as well as quick co-segmentation of new input images, by introducing the idea of propagating a group saliency map within the entire group. Such an approach requires far fewer SIFT flows than the previous method. Experimental results demonstrate that the proposed method performs co-segmentation much faster without much effect on performance.
5. REFERENCES
[1] Carsten Rother, Tom Minka, Andrew Blake, and
Vladimir Kolmogorov, “Cosegmentation of image pairs
by histogram matching-incorporating a global constraint
into mrfs,” in Computer Vision and Pattern Recogni-
tion(CVPR). IEEE, 2006, vol. 1, pp. 993–1000.
[2] Armand Joulin, Francis Bach, and Jean Ponce, “Dis-
criminative clustering for image co-segmentation,” in
Computer Vision and Pattern Recognition (CVPR).
IEEE, 2010, pp. 1943–1950.
[3] Gunhee Kim, Eric P Xing, Li Fei-Fei, and Takeo
Kanade, “Distributed cosegmentation via submodular
optimization on anisotropic diffusion,” in International
Conference on Computer Vision(ICCV). IEEE, 2011,
pp. 169–176.
[4] Armand Joulin, Francis Bach, and Jean Ponce, “Multi-
class cosegmentation,” in Computer Vision and Pattern
Recognition (CVPR). IEEE, 2012, pp. 542–549.
[5] Jifeng Dai, Ying N Wu, Jie Zhou, and Song-Chun Zhu,
“Cosegmentation and cosketch by unsupervised learn-
ing,” in International Conference on Computer Vi-
sion(ICCV), 2013.
[6] Michael Rubinstein, Armand Joulin, Johannes Kopf,
and Ce Liu, “Unsupervised joint object discovery and
segmentation in internet images,” in Computer Vi-
sion and Pattern Recognition (CVPR). IEEE, 2013, pp.
1939–1946.
[7] Koteswar R Jerripothula, Jianfei Cai, Fanman Meng,
and Junsong Yuan, “Automatic image Co-Segmentation
using geometric mean saliency,” in International Con-
ference on Image Processing (ICIP), Paris, France, Oct.
2014, pp. 3282–3286.
[8] Ce Liu, Jenny Yuen, and Antonio Torralba, “Sift flow:
Dense correspondence across scenes and its applica-
tions,” Pattern Analysis and Machine Intelligence, IEEE
Transactions on, vol. 33, no. 5, pp. 978–994, 2011.
[9] Gunhee Kim and Eric P Xing, “On multiple fore-
ground cosegmentation,” in Computer Vision and Pat-
tern Recognition (CVPR). IEEE, 2012, pp. 837–844.
[10] Daniel Kuettel, Matthieu Guillaumin, and Vittorio Fer-
rari, “Segmentation propagation in imagenet,” in Euro-
pean Conference on Computer Vision(ECCV), pp. 459–
473. Springer, 2012.
[11] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li,
and Li Fei-Fei, “Imagenet: A large-scale hierarchi-
cal image database,” in Computer Vision and Pattern
Recognition (CVPR). IEEE, 2009, pp. 248–255.
[12] Aude Oliva and Antonio Torralba, “Modeling the shape
of the scene: A holistic representation of the spa-
tial envelope,” International journal of computer vi-
sion(IJCV), vol. 42, no. 3, pp. 145–175, 2001.
[13] Carsten Rother, Vladimir Kolmogorov, and Andrew
Blake, “Grabcut: Interactive foreground extraction us-
ing iterated graph cuts,” in Transactions on Graphics
(TOG). ACM, 2004, vol. 23, pp. 309–314.
[14] Radhakrishna Achanta, Appu Shaji, Kevin Smith, Aurelien Lucchi, Pascal Fua, and Sabine Süsstrunk, "SLIC superpixels," École Polytechnique Fédérale de Lausanne (EPFL), Tech. Rep., vol. 2, pp. 3, 2010.
[15] Nobuyuki Otsu, “A threshold selection method from
gray-level histograms,” Automatica, vol. 11, no. 285-
296, pp. 23–27, 1975.
[16] Matthieu Guillaumin, Daniel Kuettel, and Vittorio Fer-
rari, “Imagenet auto-annotation with segmentation
propagation,” International Journal of Computer Vi-
sion(IJCV), pp. 1–21, 2014.
[17] Jamie Shotton, John Winn, Carsten Rother, and Antonio
Criminisi, “Textonboost: Joint appearance, shape and
context modeling for multi-class object recognition and
segmentation,” in European Conference on Computer
Vision(ECCV), pp. 1–15. Springer, 2006.
[18] Dhruv Batra, Adarsh Kowdle, Devi Parikh, Jiebo Luo,
and Tsuhan Chen, “icoseg: Interactive co-segmentation
with intelligent scribble guidance,” in Computer Vi-
sion and Pattern Recognition (CVPR). IEEE, 2010, pp.
3169–3176.
[19] Eran Borenstein and Shimon Ullman, “Class-specific,
top-down segmentation,” in European Conference on
Computer Vision(ECCV), pp. 109–122. Springer, 2002.
[20] Ming-Ming Cheng, Guo-Xin Zhang, Niloy J Mitra, Xi-
aolei Huang, and Shi-Min Hu, “Global contrast based
salient region detection,” in Computer Vision and Pat-
tern Recognition (CVPR). IEEE, 2011, pp. 409–416.
