SlideShare uma empresa Scribd logo
1 de 49
Baixar para ler offline
Image Analysis & Retrieval
CS/EE 5590 Special Topics (Class Ids: 44873, 44874)
Fall 2016, M/W 4-5:15pm@Bloch 0012
Lec 11
Object Re-Identification
Zhu Li
Dept of CSEE, UMKC
Office: FH560E, Email: lizhu@umkc.edu, Ph: x 2346.
http://l.web.umkc.edu/lizhu
Z. Li, Image Analysis & Retrv. 2016 p.1
Outline
 ReCap of Lecture 10
 SVM
 HW-2
 Visual Object Re-Identification
 Summary
Z. Li, Image Analysis & Retrv. 2016 p.2
ReCap of Lec 10
Support Vector Machine:
Z. Li, Image Analysis & Retrv. 2016 p.3
SVM Summary
 What is a good classifier ?
 Not only good precision-recall performance at training (empirical
risk function), but also need to consider the model complexity
 Structural Risk: penalizing by VC dimension
 VC dimension
 Is a good measure of model complexity
 How many data points a certain classifier can shatter
 SVM:
 For linear hyperplane decision function, structural risk
minimization is equivalent to gap maximization
 Lagrangian Relaxation & Primal-Dual decomposition
 Support Vectors: mathematically, are data points that has non-zero
Lagrangian associated with
 Kernel Trick: implicit mapping to higher dimensional richer
structure space. Heuristic, may have overfitting risks (eg. RBF)
Z. Li, Image Analysis & Retrv. 2016 p.4
Lagrangian & Primal-Dual Decomposition 
• Primal-Dual Decomposition,  Lagrangian
Z. Li, Image Analysis & Retrv. 2016 p.5
SVM on HoG in Matlab
SVM is only dealing with a 2-class problem
% hogs svm
[n_img, kd]=size(hogs);
hogs_lbl = zeros(1, n_img); hogs_lbl(1:20) = 1;
svm_hogs = svmtrain(hogs, hogs_lbl);
% recog
rec_lbls = svmclassify(svm_hogs, hogs);
fprintf('n error rate = %1.4f', sum(abs(hogs_lbl -
rec_lbls'))/n_img);
stem(abs(hogs_lbl - rec_lbls'), '.'); hold on; grid on;
axis([1 160 0 5]); str=sprintf('err rate=%1.2f ',
sum(abs(hogs_lbl - rec_lbls'))/n_img); title(str);
Z. Li, Image Analysis & Retrv. 2016 p.6
SVM on HoG Performance
For the first 4 image sets: against the rest
2:Faces_easy
26:cougar_face 33:dollar_bill
3:Leopards
Z. Li, Image Analysis & Retrv. 2016 p.7
Homework 2: Aggregation
 Compute SIFT Codebook
 SIFT dimension reduction: kd=24, 32;
 SIFT Kmeans: nc=64, 128.
 SIFT GMM: nc=64, 128;
 VLAD Aggregation:
 Hard assignment VLAD
 Soft assignment VLAD
 Fisher Vector Aggregation:
 1st order FV via vl_feat tools
Matching Experiment
 Like in HW-1
Retrieval Experiment:
 Compute mAP
Z. Li, Image Analysis & Retrv. 2016 p.8
Outline
 ReCap of Lecture 10
 SVM
 HW-2
 Visual Object Re-Identification
 Summary
Z. Li, Image Analysis & Retrv. 2016 p.9
Object Re-Id Outline
The Problem and MPEG CDVS Standardization Scope
CDVS Query Extraction and Compression Pipeline
 Key point Detection and Selection (ALP, CABOX, FS)
 Global Descriptor: Key points aggregation and compression
(SCFV, RVD, AKULA)
 Local Descriptor: Key points and coordinates compression
CDVS Query Processing
 Pair-wise Matching
 Retrieval
 Indexing
CDVA Work
 Handling video input
 Image/Video Understanding
Summary
Z. Li, Image Analysis & Retrv. 2016 p.10
Mobile Visual Search Problem
CDVS: Object Identification: bridging the real and cyber world
Image Understanding/Tagging: associate labels with pixels
Z. Li, Image Analysis & Retrv. 2016 p.11
CDVS Scope
MPEG CDVS Standardization Scope
 Define the visual query bit-stream extracted from images
 Front-end: image feature capture and compression
 Server Back-end: image feature indexing and query processing
Objectives/Challenges:
 Real-time: front end real time performance, e.g, 640x480 @30fps
 Compression: Low bit rate over the air, achieving 20 X compression w.r.t to sending
images, or 10X compression of the raw features.
 Matching Accuracy: >95% accuracy in pair-wise matching (verification) and >90%
precision in identification
 Indexing/Search Efficiency: real time backend response from large (>100m) visual
repository
Z. Li, Image Analysis & Retrv. 2016 p.12
Object Re-Identification via SIFT
 What are the problems ?
 Accuracy
 Speed
 Query Compression
 Localziation
Z. Li, Image Analysis & Retrv. 2016 p.13
Technology Time Line
MPEG-7 CDVS, 8th FP7 Networked Media Concentration meeting, Brussels, December 13, 2011.
(thanks,Yuriy !)
1998 200820042002 2010
MPEG-7 core:
Parts 1-5
MPEG-7
Parts 6-12
2000 2006
MPEG-7
Visual Signatures
Compact Descriptors for
Visual Search
2012
SIFT
algorithm
invented
First
iPhone
SURF
Recognition of
SIFT in CV
community
Major progress in computer vision research
CHoG
MPEG
starts work
on CDVS
First visual search applications
Z. Li, Image Analysis & Retrv. 2016 p.14
CDVS Data Set Performance
• Annotated Data Set:
– Mix of graphics, landmarks,
buildings, objects, video clips, and
paintings.
– Approx 32k imags
• Distraction Set:
– Approx 1m images from various
places
• Image Query Size
– 512bytes to 16K bytes
– 4k~8k bytes are the most useful
objects
Z. Li, Image Analysis & Retrv. 2016 p.15
The (Long)MPEG CDVS Time Line
Meeting / Location/ Date Action Comments
96th meeting, Geneva Mar 25, 2011 Final CfP published
Available databases and
evaluation software.
97th meeting, Torino July 18-23,
2011
Last changes in
databases and
evaluation software
Dataset: 30k annotated images
+ 1m distractor images
98th meeting, Geneva Nov 26 –
Dec 2, 2011
Evaluation of proposals 11 proposals received, selected
TI Lab test model under
consideration
99th meeting, San Jose Feb 10, 2012 WD Working Draft 1
100th meeting, Geneva March 4, 2012 WD Working Draft 2
….. …. …… SIFT patent issue
106th meeting, Geneva Oct, 2013 CD Committee Draft
108th meeting, Valencia Mar, 2014 DIS Finally….
Z. Li, Image Analysis & Retrv. 2016 p.16
Outline
The Problem and CDVS Standardization Scope
CDVS Query Extraction and Compression Pipeline
 Key point Detection and Selection (ALP, CABOX, FS)
 Global Descriptor: Key points aggregation and compression
(SCFV, RVD, AKULA)
 Local Descriptor: Key points and coordinates compression
CDVS Query Processing
 Pair-wise Matching
 Retrieval
 Indexing
CDVA Work
 Handling video input
 Image/Video Understanding
Summary
Z. Li, Image Analysis & Retrv. 2016 p.17
CDVS Pipeline
CDVS Query Processing Pipeline
KD - Keypint Detection
 ALP, BFLoG, CABOX
FS - Feature Selection
GD - Global Descriptor from key points aggregation
 SCFV, RVD, AKULA
LD – Local Descriptor
 SIFT compression, spatial coordinates compression
+
++
+
+
+ +
+
+
Global Descriptor
Local Descriptors
bitstreamKD FS
GD
LD
Descriptor Extraction Descriptor Encoding
Z. Li, Image Analysis & Retrv. 2016 p.18
How SIFT Works
Detection: find extrema in spatial-scale space
 Scale space is represented by LoG filtering output, but
approximated in SIFT by DoG
Key Points Representation: 128 dim
Z. Li, Image Analysis & Retrv. 2016 p.19
Keypoint (SIFT) Detection
Main motivation
 Find invariant and repeatable features to identify objects
o Many options, SURF, SIFT, BRISK, …etc, but SIFT still best
performing
 Circumvent the SIFT patent, which was sold to an unknown third
company
o SIFT patent main claim: DoG filtering to reconstruct scale space
 Extraction Speed: can we be faster than DoG ?
Main Proposals:
 Telecom Italia: ALP – Scale Space SIFT Detector (adopted !)
 Samsung Research: CABOX – Integral Image Domain Fast Box
Filtering (incomplete results)
 Beijing University/ST Micro/Huawei: Freq domain Gaussian
Filtering (deemed not departing far enough from the SIFT patent)
Z. Li, Image Analysis & Retrv. 2016 p.20
m31369: ALP-Scale Space Key point Detector
Telecom Italia: Gianluca Francini, Skjalg Lepsoy, Massimo Balestri
key idea, model the scale space response as a polynomial function, and estimate
its coefficients by LoG filtering at different scales:
ℎ 𝑥, 𝑦, 𝜎 =
𝜎2
∙
𝑑2
𝑑𝑥2
+
𝑑2
𝑑𝑦2
𝑔 𝑥, 𝑦, 𝜎
ℎ 𝑚, 𝑛, 𝜎 ≈
𝑘=1
𝐾
𝛾 𝑘 𝜎 ∙ ℎ 𝑚, 𝑛, 𝜎 𝑘
LoG kernel at any scale can be expressed as l.c. of 4 fixed scale
kernels
𝜸 𝒌 𝝈 ≈ 𝒂 𝒌 𝝈 𝟑
+ 𝒃 𝒌 𝝈 𝟐
+ 𝒄 𝒌 𝝈 + 𝒅 𝒌
Z. Li, Image Analysis & Retrv. 2016 p.21
ALP – Scale Space Response
Express the scale response at (x, y), as a 3rd order polynomial
The polynomial coefficients are obtained by filtering, where Lk
is the LoG filtered image at pre-fixed scale k:
𝛾 𝑘 𝜎
𝜎
Z. Li, Image Analysis & Retrv. 2016 p.22
ALP Scale Space Extrema Detection
ALP filtering scale space response as polynomials
– Scale Space Extrema: 𝑰 ∗ 𝒉 𝟒𝟑𝟎, 𝟏𝟐𝟐, 𝝈 ≈ 𝟐. 𝟖𝟓 𝝈 𝟑
− 𝟑𝟗. 𝟓𝟏 𝝈 𝟐
+ 𝟏𝟕𝟐. 𝟔𝟓 𝝈 − 𝟏𝟕𝟐. 𝟔𝟏
– No Extrema (saddle point): 𝑰 ∗ 𝒉 𝟏𝟓𝟑, 𝟑𝟓𝟔, 𝝈 ≈ 𝟎. 𝟏𝟐 𝝈 𝟑
− 𝟏. 𝟎𝟔 𝝈 𝟐
+ 𝟑. 𝟏𝟓 𝝈 − 𝟐. 𝟓
+
+
Z. Li, Image Analysis & Retrv. 2016 p.23
Displacement refinement
Scale response as a 2nd order polynomial function of
displacement (u, v)
𝐼 ∗ ℎ 𝑥 − 𝑢, 𝑦 − 𝑣, 𝜎
≈ 𝛽5 𝑥, 𝑦, 𝜎 𝑢2
+ 𝛽4 𝑥, 𝑦, 𝜎 𝑣2
+ 𝛽3 𝑥, 𝑦, 𝜎 𝑢𝑣 + 𝛽2 𝑥, 𝑦, 𝜎 𝑢 + 𝛽1 𝑥, 𝑦, 𝜎 𝑣 + 𝛽0 𝑥, 𝑦, 𝜎
Z. Li, Image Analysis & Retrv. 2016 p.24
ALP Performance
Repeatability vs VL_FEAT SIFT:
Z. Li, Image Analysis & Retrv. 2016 p.25
Gaussian Filter Box Filters
Approximate DoG/LoG by a cascade of box filters that can offer early
termination:
• Very fast integral image domain box filtering,
• The box filters are found by solving the following problem, sparse
combination of box filters, via LASSO:
CABOX – Cascade of Box Filters (m30446)
Z. Li, Image Analysis & Retrv. 2016 26
The influence of the used dictionary
determines not only the quality of the
approximation but also the
number of boxes required.
CABOX Results
Z. Li, Image Analysis & Retrv. 2016 27
CABOX Detection Results
More examples of keypoint detection using box filters.
Overlap: 85% Overlap: 88%
• Algorithmically the fastest SIFT detector amongst CE1 contributions
• Ref: V. Fragoso, G. Srivastava, A. Nagar, Z. Li, K. Park, and M. Turk, "Cascade of Box
(CABOX) Filters for Optimal Scale Space Approximation", Proc of the 4th IEEE Int'l
Workshop on Mobile Vision , Columbus, USA, 2014
Z. Li, Image Analysis & Retrv. 2016 p.28
Feature Selection
Why do Feature Selection ?
 Average 1000+ SIFTs extracted for VGA sized images, need to reduce
the number of actual SIFTs sent
 Not all SIFTs are created equal in repeatability in image match,
o model the repeatability as a prob function [Lepsøy, S., Francini, G., Cordara, G.,
& Gusmao, P. P. (2011). Statistical modelling of outliers for fast visual search. IEEE VCIDS 2011.]
of SIFT’s scale, orientation, distance to the center, peak strength,
…,etc :
 Use self-matching (m29359) to improve the offline
repeatability stats robustness
 )()()()()()(),,,,,( 65432
*
1
*
  pffDfdfffpdDr
im0
im1
0 10 20 30 40
0
2
4
6
8
10
12
x 10
4 dsift
im0 -> im1
0 5 10 15 20 25
0
2
4
6
8
10
12
14
x 10
4 dsift
im1 -> im0
𝑓(𝜎)
𝜎 Self-matching via
random out of plane rotationZ. Li, Image Analysis & Retrv. 2016 p.29
Feature Selection
Illustration of FS via offline repeatability PMF
SIFT peak strength pmf
SIFT scale pmf
Combined scale/peak strength pmf
Z. Li, Image Analysis & Retrv. 2016 p.30
Global Descriptor
Why need global descriptor ?
 Key points based query representation is not stateless, it has a
structure, i.e, SIFTs and their positions. This is not good for
retrieval against a large database, complexity O(N)
 Need a “coarser” representation of the information contained
in the image by aggregating local features, for
indexing/hashing purpose.
 Landmark work, Fisher Vector, the best performing solution
in ImageNet challenge, before the CNN deep learning
solution: [Perronnin10] F. Perronnin, J. Sánchez, and T.
Mensink. Improving the fisher kernel for large-scale image
classification. In Proc. ECCV, 2010.
CDVS Global Aggregation Works
 m28061: Beijing University SCFV: for retrieval/identification
 m31491: Samsung AKULA: for matching /verification
 M31426: Univ of Surrey/VisualAtom: RVD: similar to SCFV
Z. Li, Image Analysis & Retrv. 2016 p.31
Global Descriptor – SCFV (m28061)
Beijing University SCFV – Scalable Compressed Fisher Vector
 PCA to bring SIFT down from 128 to 32 dimensions
 Train a GMM of 128 components in 32 dim space, with parameters {ui, σi,
wi }
 Aggregate m=300 SIFT with GMM via Fisher Vector, 1st, and 2nd order,
where γt i is the prob of SIFT xt being generated by GMM component i,
0100011001010
ℊui
X
=
𝜕ℒ X λ
𝜕ui
=
1
300wi t=1
300
γt i (
xt − ui
σi
)
ℊσi
X =
𝜕ℒ X λ
𝜕σi
=
1
600wi t=1
300
γt i [
xt − ui
σi
2
− 1] γt i = p i xt, λ =
wipi(xt|λ)
j=1
128
wjpj(xt|λ)
Z. Li, Image Analysis & Retrv. 2016 p.32
SCFV Distance Function
The SCFV has 32x128 bits, for the 1st order Fisher Vector,
and additional 32x128 bits for the 2nd order FV
Not all GMM components are active, so an 128-bit flag [b1,
b2,…, b128] is also introduced to indicate if it is active.
Rationale: if not many SIFTs are associated with certain
component, then the bits it generates are noise most likely
Distance metric:
A lot of painful work on GMM component turn on logic
optimization to reach a very high performance.
Very fast due to binary ops, can short list a 1m image data
base within 1 sec on desktop computer.
sX,Y =
i=1
128
bi
X
bi
Y
wHa(ui
X
,ui
Y
)(32 − 2Ha(ui
X
, ui
Y
))
32 i=1
128
bi
X
i=1
128
bi
Y
Z. Li, Image Analysis & Retrv. 2016 p.33
SCFV Performance
Matching/Verification (TPR @ 1% FPR)
Retrieval/Identification (mAP)
Z. Li, Image Analysis & Retrv. 2016 p.34
Local Descriptor
SIFT Descriptor Compression
 VisualAtoms/Univ of Surrey: a
handcrafted
transform/quantization scheme +
huffman coding, low memory
cost, slightly less performance
(compared to PVQ), adopted.
 Only binary form is received,
cannot recover the SIFT
h0
h6 h7
h4
h5
h2h3 h1
 

  
 
1
0 and
1
i i
j j
i i i i i
j j j j j
i i
j j
if v QL
v if v QL v QH
if v QH
Z. Li, Image Analysis & Retrv. 2016 p.35
Outline
The Problem and CDVS Standardization Scope
CDVS Query Extraction and Compression Pipeline
 Key point Detection and Selection (ALP, CABOX, FS)
 Global Descriptor: Key points aggregation and compression
(SCFV, RVD, AKULA)
 Local Descriptor: Key points and coordinates compression
CDVS Query Processing
 Pair-wise Matching
 Retrieval
 Indexing
CDVA Work
 Handling video input
 Image/Video Understanding
Summary
Z. Li, Image Analysis & Retrv. 2016 p.36
Pair Wise Matching
Diagram of Image Matching
 First local features are matched,
 if certain number of matched SIFT pairs are identified, then
a Geometric Verification called DISTRAT[] is performed,
to check the consistence of the matching points via distance
ratio check.
 For un-sure image pairs, the global descriptor distance is
computed and a threshold is applied to decide match of non-
match
Z. Li, Image Analysis & Retrv. 2016 p.37
Matching Performance (@ 1% FPR)
• Image Matching Accuracy:
– Mix of graphics, landmarks, buildings, objects, video clips, and paintings.
• Image Identification Accuracy:
– For graphics (cd/book cover, logos, papers), paintings, the performance is in 90% range
– For objects of mixed variety, 78% in average.
– For buildings/landmark, the performance is not reflective of the true potential, as the current
data set has some annotation errors
Z. Li, Image Analysis & Retrv. 2016 p.38
CDVS Retrieval
Retrieval Pipeline
 Short list is generated by GD based k-nn operation via:
 Then for the short list of m candidates, do m times local
descriptor based matching and rank their matching scores
𝑑 𝑅, 𝑄 =
i=1
512
bi
Q
bi
R
W1Ha(ui
Q
,ui
R
)
W2ui
Q(D − 2Ha(ui
Q
, ui
R
))
( i=1
512
bi
Q
)0.3( i=1
512
bi
R
)0.3
Z. Li, Image Analysis & Retrv. 2016 p.39
Retrieval Performance: Mean Average Precision
mAP measures the retrieval performance across all
queries
Image Analysis & Understanding, 2015 p.40
mAP example
 MAP is computed across all query results as the
average precision over the recall
Image Analysis & Understanding, 2015 p.41
CDVS Retrieval Performance
Retrieval Simulation Set Up
 Approx. 17k annotated images mixed with 1m+ distraction image set
 Short Listing: retrieve 500 closest matches by GD and then do pair wise
matching and ranking
 Data sets
mAP
Z. Li, Image Analysis & Retrv. 2016 p.42
MBIT (multi-block indx table) Indexing
GD is partitioned into blocks of 16 bits, and inverted list built.
Shortlisting is by weighted scoring on block wise hamming
distance
Algorithm . MBIT Searching
Input: Query B 𝑞 = {𝑏 𝑚
𝑞
} 𝑚=1
1024
, MBIT T = {𝑇 𝑚 } 𝑚=1
1024
, speedup ratio 𝑇, difference bits 𝐷.
Output: The shortlist {B𝑙}𝑙=1
𝐿
, 𝐿 = 500.
1: Initialize 𝑠(𝑞, 𝑛) = 0, 𝑛 = 1 … 𝑁.
2: for 𝑚 = 1 to 1024 do
4: if the
𝑚+1
2
-th Gaussian of B 𝑞 is not selected then
5: continue;
6: end if;
7: for 𝑑 = 0 to D do
8: Enumerate binary vectors {h 𝑑 } with 𝑑-bit differences with 𝑏 𝑚
𝑞
.
9: For each image 𝑛 in the buckets T 𝑚 (h 𝑑 ), update # 𝑛,𝑑 = # 𝑛,𝑑 + 1.
10: end for
11: end for
12: for 𝑛 = 1 to 𝑁 do
13: Update s(q, 𝑛) = # 𝑛,𝑑
𝐷
𝑑=0 ;
14: end for
15: Sort the image list by their voting score in descending order.
16: Add descriptors of top
𝑁
𝑇
images in the ordered list into subset {B 𝑘} 𝑘=1
𝐾
.
17: Run an exhaustive search within {B 𝑘} 𝑘=1
𝐾
and sort the list by Hamming distance.
18: Return the first 𝐿 = 500 images.
Z. Li, Image Analysis & Retrv. 2016 p.43
(m29361) Bit Mask/Collision Optimized Indexing
Selecting 6-bits segments that are most efficient in discriminating
for shortlisting, allow for permutation of bits
Shortlisting by weighted segment hamming distance also reflecting
the segment entropy
Full paper form: X. Xin, A. Nagar, G. Srivastava, Z. Li*, F.
Fernandes, A. Katsaggelos,Large Visual Repository Search with
Hash Collision Design Optimization. IEEE MultiMedia 20(2): 62-
71 (2013)
Z. Li, Image Analysis & Retrv. 2016 p.44
Outline
The Problem and CDVS Standardization Scope
CDVS Query Extraction and Compression Pipeline
 Key point Detection and Selection (ALP, CABOX, FS)
 Global Descriptor: Key points aggregation and compression
(SCFV, RVD, AKULA)
 Local Descriptor: Key points and coordinates compression
CDVS Query Processing
 Pair-wise Matching
 Retrieval
 Indexing
CDVA Work
 Handling video input
 Image/Video Understanding
Summary
Z. Li, Image Analysis & Retrv. 2016 p.45
Automotive & AR Use Case
Extend to object Identification / event detection for video
input:
 How do we deal with vastly increased data rate ?
o New spatial-temporal interesting points ?
o Key frame based CDVS processing ?
 How do we explore the temporal dimension ?
o Events detection, content classification (video archiving)
Z. Li, Image Analysis & Retrv. 2016 p.46
Image Understanding/Tagging
CDVS is based on a handcrafted feature, i.e, SIFT and SIFT aggregation.
Latest work in Deep Learning pointing to new potentials in CCN to uncover
new structure and knowledge from pixels, to associate with not only object
identity, but also image labels
Z. Li, Image Analysis & Retrv. 2016 p.47
Summary
MPEG CDVS offers the state-of-art tech performance
in visual object re-identification accuracy, speed, and
query compression
Amd work:
 A recoverable SIFT compression scheme, currently SIFT
cannot be recovered from bit stream
 3D key points, wide adoption of RGB+Depth sensors.
 Non-rigid body object identification
CDVA work, still refining the problem definition, main
use cases
 Object identification in video
 Events detection, content classification
 Image Understanding (vs object identification)
Z. Li, Image Analysis & Retrv. 2016 p.48
References
Key References
 Test Model 11: ISO/IEC JTC1/SC29/WG11/N14393
 SoDIS: ISO/IEC DIS 15938-13 Information technology — Multimedia content
description interface — Part 13: Compact descriptors for visual search
 Signal Processing – Image Communication, special issue on visual search and
augmented reality, vol. 28(4), April 2013. Eds. Giovanni Cordara, Miroslaw Bober,
Yuriy A. Reznik:.
 ALP: m31369 CDVS: Telecom Italia’s response to CE1 – Interest point detection
 CABOX: V. Fragoso, G. Srivastava, A. Nagar, Z. Li, K. Park, and M. Turk, "Cascade
of Box (CABOX) Filters for Optimal Scale Space Approximation", Proc of the 4th
IEEE Int'l Workshop on Mobile Vision , Columbus, USA, 2014
 FS: Lepsøy, S., Francini, G., Cordara, G., & Gusmao, P. P. (2011). Statistical modelling
of outliers for fast visual search. IEEE Workshop on Visual Content Identification and
Search (VCIDS 2011). Barcelona, Spain.
 SCFV: L.-Y. Duan, J.Lin, J. Chen, T. Huang, W. Gao: Compact Descriptors for Visual
Search. IEEE Multimedia 21(3): 30-40 (2014)
 RVD: m31426 Improving performance and usability of CDVS TM7 with a Robust
Visual Descriptor (RVD)
 AKULA: A. Nagar, Z. Li, G. Srivastava, K.Park: AKULA - Adaptive Cluster
Aggregation for Visual Search. IEEE DCC 2014: 13-22
Z. Li, Image Analysis & Retrv. 2016 p.49

Mais conteúdo relacionado

Mais procurados

Deep image retrieval learning global representations for image search
Deep image retrieval  learning global representations for image searchDeep image retrieval  learning global representations for image search
Deep image retrieval learning global representations for image searchUniversitat Politècnica de Catalunya
 
Object Detection Beyond Mask R-CNN and RetinaNet III
Object Detection Beyond Mask R-CNN and RetinaNet IIIObject Detection Beyond Mask R-CNN and RetinaNet III
Object Detection Beyond Mask R-CNN and RetinaNet IIIWanjin Yu
 
Survey on optical flow estimation with DL
Survey on optical flow estimation with DLSurvey on optical flow estimation with DL
Survey on optical flow estimation with DLLeapMind Inc
 
Attentive semantic alignment with offset aware correlation kernels
Attentive semantic alignment with offset aware correlation kernelsAttentive semantic alignment with offset aware correlation kernels
Attentive semantic alignment with offset aware correlation kernelsNAVER Engineering
 
Deep image retrieval - learning global representations for image search - ub ...
Deep image retrieval - learning global representations for image search - ub ...Deep image retrieval - learning global representations for image search - ub ...
Deep image retrieval - learning global representations for image search - ub ...Universitat de Barcelona
 
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 
Visual odometry & slam utilizing indoor structured environments
Visual odometry & slam utilizing indoor structured environmentsVisual odometry & slam utilizing indoor structured environments
Visual odometry & slam utilizing indoor structured environmentsNAVER Engineering
 
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 
Optimized linear spatial filters implemented in FPGA
Optimized linear spatial filters implemented in FPGAOptimized linear spatial filters implemented in FPGA
Optimized linear spatial filters implemented in FPGAIOSRJVSP
 
改进的固定点图像复原算法_英文_阎雪飞
改进的固定点图像复原算法_英文_阎雪飞改进的固定点图像复原算法_英文_阎雪飞
改进的固定点图像复原算法_英文_阎雪飞alen yan
 
Histogram based Enhancement
Histogram based Enhancement Histogram based Enhancement
Histogram based Enhancement Vivek V
 
Graph R-CNN for Scene Graph Generation
Graph R-CNN for Scene Graph GenerationGraph R-CNN for Scene Graph Generation
Graph R-CNN for Scene Graph GenerationSangmin Woo
 
Mask-RCNN for Instance Segmentation
Mask-RCNN for Instance SegmentationMask-RCNN for Instance Segmentation
Mask-RCNN for Instance SegmentationDat Nguyen
 
Toward wave net speech synthesis
Toward wave net speech synthesisToward wave net speech synthesis
Toward wave net speech synthesisNAVER Engineering
 
GENERIC APPROACH FOR VISIBLE WATERMARKING
GENERIC APPROACH FOR VISIBLE WATERMARKINGGENERIC APPROACH FOR VISIBLE WATERMARKING
GENERIC APPROACH FOR VISIBLE WATERMARKINGEditor IJCATR
 

Mais procurados (20)

Deep image retrieval learning global representations for image search
Deep image retrieval  learning global representations for image searchDeep image retrieval  learning global representations for image search
Deep image retrieval learning global representations for image search
 
Object Detection Beyond Mask R-CNN and RetinaNet III
Object Detection Beyond Mask R-CNN and RetinaNet IIIObject Detection Beyond Mask R-CNN and RetinaNet III
Object Detection Beyond Mask R-CNN and RetinaNet III
 
Survey on optical flow estimation with DL
Survey on optical flow estimation with DLSurvey on optical flow estimation with DL
Survey on optical flow estimation with DL
 
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
 
Mask R-CNN
Mask R-CNNMask R-CNN
Mask R-CNN
 
Attentive semantic alignment with offset aware correlation kernels
Attentive semantic alignment with offset aware correlation kernelsAttentive semantic alignment with offset aware correlation kernels
Attentive semantic alignment with offset aware correlation kernels
 
Deep image retrieval - learning global representations for image search - ub ...
Deep image retrieval - learning global representations for image search - ub ...Deep image retrieval - learning global representations for image search - ub ...
Deep image retrieval - learning global representations for image search - ub ...
 
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
 
Visual odometry & slam utilizing indoor structured environments
Visual odometry & slam utilizing indoor structured environmentsVisual odometry & slam utilizing indoor structured environments
Visual odometry & slam utilizing indoor structured environments
 
Lec16 subspace optimization
Lec16 subspace optimizationLec16 subspace optimization
Lec16 subspace optimization
 
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
 
Optimized linear spatial filters implemented in FPGA
Optimized linear spatial filters implemented in FPGAOptimized linear spatial filters implemented in FPGA
Optimized linear spatial filters implemented in FPGA
 
改进的固定点图像复原算法_英文_阎雪飞
改进的固定点图像复原算法_英文_阎雪飞改进的固定点图像复原算法_英文_阎雪飞
改进的固定点图像复原算法_英文_阎雪飞
 
Histogram based Enhancement
Histogram based Enhancement Histogram based Enhancement
Histogram based Enhancement
 
Graph R-CNN for Scene Graph Generation
Graph R-CNN for Scene Graph GenerationGraph R-CNN for Scene Graph Generation
Graph R-CNN for Scene Graph Generation
 
Mask-RCNN for Instance Segmentation
Mask-RCNN for Instance SegmentationMask-RCNN for Instance Segmentation
Mask-RCNN for Instance Segmentation
 
Deep 3D Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2018
Deep 3D Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2018Deep 3D Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2018
Deep 3D Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2018
 
Toward wave net speech synthesis
Toward wave net speech synthesisToward wave net speech synthesis
Toward wave net speech synthesis
 
GENERIC APPROACH FOR VISIBLE WATERMARKING
GENERIC APPROACH FOR VISIBLE WATERMARKINGGENERIC APPROACH FOR VISIBLE WATERMARKING
GENERIC APPROACH FOR VISIBLE WATERMARKING
 
CLIM Program: Remote Sensing Workshop, Statistical Emulation with Dimension R...
CLIM Program: Remote Sensing Workshop, Statistical Emulation with Dimension R...CLIM Program: Remote Sensing Workshop, Statistical Emulation with Dimension R...
CLIM Program: Remote Sensing Workshop, Statistical Emulation with Dimension R...
 

Destaque

01424 dc22may20151119
01424 dc22may2015111901424 dc22may20151119
01424 dc22may20151119Sinaut-Sunat
 
Customer and MVP 1 preso
Customer and MVP 1 presoCustomer and MVP 1 preso
Customer and MVP 1 presonickorlando
 
Complexity Literacy Meeting - La scheda del libro pubblicato da Bettoni, Gand...
Complexity Literacy Meeting - La scheda del libro pubblicato da Bettoni, Gand...Complexity Literacy Meeting - La scheda del libro pubblicato da Bettoni, Gand...
Complexity Literacy Meeting - La scheda del libro pubblicato da Bettoni, Gand...Complexity Institute
 
Guia de peces de la Reserva Natural de Uso Integral y Mixta Laguna de Rocha
Guia de peces de la Reserva Natural de Uso Integral y Mixta Laguna de RochaGuia de peces de la Reserva Natural de Uso Integral y Mixta Laguna de Rocha
Guia de peces de la Reserva Natural de Uso Integral y Mixta Laguna de RochaAlfredo Daniel Bodratti Masino
 
Memòria menjador 1r trimestre 2015 16
Memòria menjador 1r trimestre 2015 16Memòria menjador 1r trimestre 2015 16
Memòria menjador 1r trimestre 2015 16Carolina Silla Melgoso
 
PCA-CompChem_seminar
PCA-CompChem_seminarPCA-CompChem_seminar
PCA-CompChem_seminarAnne D'cruz
 
Valutare le fonti e informare i professionisti: ruolo e strategie del bibliot...
Valutare le fonti e informare i professionisti: ruolo e strategie del bibliot...Valutare le fonti e informare i professionisti: ruolo e strategie del bibliot...
Valutare le fonti e informare i professionisti: ruolo e strategie del bibliot...GIDIF-RBM
 
YEAR 9 HISTORY - THE MAYAN EMPIRE
YEAR 9 HISTORY - THE MAYAN EMPIREYEAR 9 HISTORY - THE MAYAN EMPIRE
YEAR 9 HISTORY - THE MAYAN EMPIREGeorge Dumitrache
 
эрэгтэй эмэгтэй хүмүүсийн сэтгэл зүйн ялгаа
эрэгтэй эмэгтэй хүмүүсийн сэтгэл зүйн ялгааэрэгтэй эмэгтэй хүмүүсийн сэтгэл зүйн ялгаа
эрэгтэй эмэгтэй хүмүүсийн сэтгэл зүйн ялгааazora14
 
Биеийн харилцааны хэл (Body language)
Биеийн харилцааны хэл (Body language)Биеийн харилцааны хэл (Body language)
Биеийн харилцааны хэл (Body language)azora14
 
The Social Organization
The Social OrganizationThe Social Organization
The Social OrganizationSoCo Partners
 
Top Productivity Working Hacks by Jan Rezab
Top Productivity Working Hacks by Jan RezabTop Productivity Working Hacks by Jan Rezab
Top Productivity Working Hacks by Jan RezabJan Rezab
 

Destaque (18)

01424 dc22may20151119
01424 dc22may2015111901424 dc22may20151119
01424 dc22may20151119
 
Customer and MVP 1 preso
Customer and MVP 1 presoCustomer and MVP 1 preso
Customer and MVP 1 preso
 
Bachelor of Laws
Bachelor of LawsBachelor of Laws
Bachelor of Laws
 
Complexity Literacy Meeting - La scheda del libro pubblicato da Bettoni, Gand...
Complexity Literacy Meeting - La scheda del libro pubblicato da Bettoni, Gand...Complexity Literacy Meeting - La scheda del libro pubblicato da Bettoni, Gand...
Complexity Literacy Meeting - La scheda del libro pubblicato da Bettoni, Gand...
 
Sinceramiento fiscal aclaraci----n sobre normativa
Sinceramiento fiscal   aclaraci----n sobre normativaSinceramiento fiscal   aclaraci----n sobre normativa
Sinceramiento fiscal aclaraci----n sobre normativa
 
Guia de peces de la Reserva Natural de Uso Integral y Mixta Laguna de Rocha
Guia de peces de la Reserva Natural de Uso Integral y Mixta Laguna de RochaGuia de peces de la Reserva Natural de Uso Integral y Mixta Laguna de Rocha
Guia de peces de la Reserva Natural de Uso Integral y Mixta Laguna de Rocha
 
Memòria menjador 1r trimestre 2015 16
Memòria menjador 1r trimestre 2015 16Memòria menjador 1r trimestre 2015 16
Memòria menjador 1r trimestre 2015 16
 
PCA-CompChem_seminar
PCA-CompChem_seminarPCA-CompChem_seminar
PCA-CompChem_seminar
 
Valutare le fonti e informare i professionisti: ruolo e strategie del bibliot...
Valutare le fonti e informare i professionisti: ruolo e strategie del bibliot...Valutare le fonti e informare i professionisti: ruolo e strategie del bibliot...
Valutare le fonti e informare i professionisti: ruolo e strategie del bibliot...
 
YEAR 9 HISTORY - THE MAYAN EMPIRE
YEAR 9 HISTORY - THE MAYAN EMPIREYEAR 9 HISTORY - THE MAYAN EMPIRE
YEAR 9 HISTORY - THE MAYAN EMPIRE
 
сэтгэл судлал хичээлийн хөтөлбөр
сэтгэл судлал хичээлийн хөтөлбөрсэтгэл судлал хичээлийн хөтөлбөр
сэтгэл судлал хичээлийн хөтөлбөр
 
эрэгтэй эмэгтэй хүмүүсийн сэтгэл зүйн ялгаа
эрэгтэй эмэгтэй хүмүүсийн сэтгэл зүйн ялгааэрэгтэй эмэгтэй хүмүүсийн сэтгэл зүйн ялгаа
эрэгтэй эмэгтэй хүмүүсийн сэтгэл зүйн ялгаа
 
Биеийн харилцааны хэл (Body language)
Биеийн харилцааны хэл (Body language)Биеийн харилцааны хэл (Body language)
Биеийн харилцааны хэл (Body language)
 
Cameroon agriculture-nutrition nexus: actors and key intervention areas
Cameroon agriculture-nutrition nexus: actors and key intervention areas Cameroon agriculture-nutrition nexus: actors and key intervention areas
Cameroon agriculture-nutrition nexus: actors and key intervention areas
 
Cassava value chain for improved nutrition in Central Africa
Cassava value chain for improved nutrition in Central AfricaCassava value chain for improved nutrition in Central Africa
Cassava value chain for improved nutrition in Central Africa
 
Intégration des jeunes dans la chaine de valeur du manioc par le biais des so...
Intégration des jeunes dans la chaine de valeur du manioc par le biais des so...Intégration des jeunes dans la chaine de valeur du manioc par le biais des so...
Intégration des jeunes dans la chaine de valeur du manioc par le biais des so...
 
The Social Organization
The Social OrganizationThe Social Organization
The Social Organization
 
Top Productivity Working Hacks by Jan Rezab
Top Productivity Working Hacks by Jan RezabTop Productivity Working Hacks by Jan Rezab
Top Productivity Working Hacks by Jan Rezab
 

Semelhante a Lec11 object-re-id

Fcv learn yu
Fcv learn yuFcv learn yu
Fcv learn yuzukun
 
Interpretability of Convolutional Neural Networks - Eva Mohedano - UPC Barcel...
Interpretability of Convolutional Neural Networks - Eva Mohedano - UPC Barcel...Interpretability of Convolutional Neural Networks - Eva Mohedano - UPC Barcel...
Interpretability of Convolutional Neural Networks - Eva Mohedano - UPC Barcel...Universitat Politècnica de Catalunya
 
Representative Previous Work
Representative Previous WorkRepresentative Previous Work
Representative Previous Workbutest
 
Representative Previous Work
Representative Previous WorkRepresentative Previous Work
Representative Previous Workbutest
 
Avihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slidesAvihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slideswolf
 
Convolutional neural networks for image classification — evidence from Kaggle...
Convolutional neural networks for image classification — evidence from Kaggle...Convolutional neural networks for image classification — evidence from Kaggle...
Convolutional neural networks for image classification — evidence from Kaggle...Dmytro Mishkin
 
Review: You Only Look One-level Feature
Review: You Only Look One-level FeatureReview: You Only Look One-level Feature
Review: You Only Look One-level FeatureDongmin Choi
 
AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...
AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...
AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...Ravi Kiran B.
 
20110220 computer vision_eruhimov_lecture02
20110220 computer vision_eruhimov_lecture0220110220 computer vision_eruhimov_lecture02
20110220 computer vision_eruhimov_lecture02Computer Science Club
 
Multimodal Residual Networks for Visual QA
Multimodal Residual Networks for Visual QAMultimodal Residual Networks for Visual QA
Multimodal Residual Networks for Visual QAJin-Hwa Kim
 
Deep convnets for global recognition (Master in Computer Vision Barcelona 2016)
Deep convnets for global recognition (Master in Computer Vision Barcelona 2016)Deep convnets for global recognition (Master in Computer Vision Barcelona 2016)
Deep convnets for global recognition (Master in Computer Vision Barcelona 2016)Universitat Politècnica de Catalunya
 
Cvpr2010 open source vision software, intro and training part v open cv and r...
Cvpr2010 open source vision software, intro and training part v open cv and r...Cvpr2010 open source vision software, intro and training part v open cv and r...
Cvpr2010 open source vision software, intro and training part v open cv and r...zukun
 
Deep Learning in Computer Vision
Deep Learning in Computer VisionDeep Learning in Computer Vision
Deep Learning in Computer VisionSungjoon Choi
 
Obscenity Detection in Images
Obscenity Detection in ImagesObscenity Detection in Images
Obscenity Detection in ImagesAnil Kumar Gupta
 

Semelhante a Lec11 object-re-id (20)

Fcv learn yu
Fcv learn yuFcv learn yu
Fcv learn yu
 
Interpretability of Convolutional Neural Networks - Eva Mohedano - UPC Barcel...
Interpretability of Convolutional Neural Networks - Eva Mohedano - UPC Barcel...Interpretability of Convolutional Neural Networks - Eva Mohedano - UPC Barcel...
Interpretability of Convolutional Neural Networks - Eva Mohedano - UPC Barcel...
 
Representative Previous Work
Representative Previous WorkRepresentative Previous Work
Representative Previous Work
 
Representative Previous Work
Representative Previous WorkRepresentative Previous Work
Representative Previous Work
 
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
 
Lec11 rate distortion optimization
Lec11 rate distortion optimizationLec11 rate distortion optimization
Lec11 rate distortion optimization
 
Avihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slidesAvihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slides
 
Convolutional neural networks for image classification — evidence from Kaggle...
Convolutional neural networks for image classification — evidence from Kaggle...Convolutional neural networks for image classification — evidence from Kaggle...
Convolutional neural networks for image classification — evidence from Kaggle...
 
Review: You Only Look One-level Feature
Review: You Only Look One-level FeatureReview: You Only Look One-level Feature
Review: You Only Look One-level Feature
 
Object Detection - Míriam Bellver - UPC Barcelona 2018
Object Detection - Míriam Bellver - UPC Barcelona 2018Object Detection - Míriam Bellver - UPC Barcelona 2018
Object Detection - Míriam Bellver - UPC Barcelona 2018
 
AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...
AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...
AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...
 
20110220 computer vision_eruhimov_lecture02
20110220 computer vision_eruhimov_lecture0220110220 computer vision_eruhimov_lecture02
20110220 computer vision_eruhimov_lecture02
 
Multimodal Residual Networks for Visual QA
Multimodal Residual Networks for Visual QAMultimodal Residual Networks for Visual QA
Multimodal Residual Networks for Visual QA
 
Deep convnets for global recognition (Master in Computer Vision Barcelona 2016)
Deep convnets for global recognition (Master in Computer Vision Barcelona 2016)Deep convnets for global recognition (Master in Computer Vision Barcelona 2016)
Deep convnets for global recognition (Master in Computer Vision Barcelona 2016)
 
Cvpr2010 open source vision software, intro and training part v open cv and r...
Cvpr2010 open source vision software, intro and training part v open cv and r...Cvpr2010 open source vision software, intro and training part v open cv and r...
Cvpr2010 open source vision software, intro and training part v open cv and r...
 
ECCV WS 2012 (Frank)
ECCV WS 2012 (Frank)ECCV WS 2012 (Frank)
ECCV WS 2012 (Frank)
 
Deep Learning for Computer Vision: Object Detection (UPC 2016)
Deep Learning for Computer Vision: Object Detection (UPC 2016)Deep Learning for Computer Vision: Object Detection (UPC 2016)
Deep Learning for Computer Vision: Object Detection (UPC 2016)
 
Deep Learning in Computer Vision
Deep Learning in Computer VisionDeep Learning in Computer Vision
Deep Learning in Computer Vision
 
Obscenity Detection in Images
Obscenity Detection in ImagesObscenity Detection in Images
Obscenity Detection in Images
 
Thesis presentation
Thesis presentationThesis presentation
Thesis presentation
 

Mais de United States Air Force Academy

Mobile Visual Search: Object Re-Identification Against Large Repositories
Mobile Visual Search: Object Re-Identification Against Large RepositoriesMobile Visual Search: Object Re-Identification Against Large Repositories
Mobile Visual Search: Object Re-Identification Against Large RepositoriesUnited States Air Force Academy
 
Tutorial on MPEG CDVS/CDVA Standardization at ICNITS L3 Meeting
Tutorial on MPEG CDVS/CDVA Standardization at ICNITS L3 MeetingTutorial on MPEG CDVS/CDVA Standardization at ICNITS L3 Meeting
Tutorial on MPEG CDVS/CDVA Standardization at ICNITS L3 MeetingUnited States Air Force Academy
 
Light Weight Fingerprinting for Video Playback Verification in MPEG DASH
Light Weight Fingerprinting for Video Playback Verification in MPEG DASHLight Weight Fingerprinting for Video Playback Verification in MPEG DASH
Light Weight Fingerprinting for Video Playback Verification in MPEG DASHUnited States Air Force Academy
 
Subspace Indexing on Grassmannian Manifold for Large Scale Visual Identification
Subspace Indexing on Grassmannian Manifold for Large Scale Visual IdentificationSubspace Indexing on Grassmannian Manifold for Large Scale Visual Identification
Subspace Indexing on Grassmannian Manifold for Large Scale Visual IdentificationUnited States Air Force Academy
 
Scaled Eigen Appearance and Likelihood Prunning for Large Scale Video Duplica...
Scaled Eigen Appearance and Likelihood Prunning for Large Scale Video Duplica...Scaled Eigen Appearance and Likelihood Prunning for Large Scale Video Duplica...
Scaled Eigen Appearance and Likelihood Prunning for Large Scale Video Duplica...United States Air Force Academy
 

Mais de United States Air Force Academy (9)

Lec14 eigenface and fisherface
Lec14 eigenface and fisherfaceLec14 eigenface and fisherface
Lec14 eigenface and fisherface
 
Lec-03 Entropy Coding I: Hoffmann & Golomb Codes
Lec-03 Entropy Coding I: Hoffmann & Golomb CodesLec-03 Entropy Coding I: Hoffmann & Golomb Codes
Lec-03 Entropy Coding I: Hoffmann & Golomb Codes
 
Multimedia Communication Lec02: Info Theory and Entropy
Multimedia Communication Lec02: Info Theory and EntropyMultimedia Communication Lec02: Info Theory and Entropy
Multimedia Communication Lec02: Info Theory and Entropy
 
ECE 4490 Multimedia Communication Lec01
ECE 4490 Multimedia Communication Lec01ECE 4490 Multimedia Communication Lec01
ECE 4490 Multimedia Communication Lec01
 
Mobile Visual Search: Object Re-Identification Against Large Repositories
Mobile Visual Search: Object Re-Identification Against Large RepositoriesMobile Visual Search: Object Re-Identification Against Large Repositories
Mobile Visual Search: Object Re-Identification Against Large Repositories
 
Tutorial on MPEG CDVS/CDVA Standardization at ICNITS L3 Meeting
Tutorial on MPEG CDVS/CDVA Standardization at ICNITS L3 MeetingTutorial on MPEG CDVS/CDVA Standardization at ICNITS L3 Meeting
Tutorial on MPEG CDVS/CDVA Standardization at ICNITS L3 Meeting
 
Light Weight Fingerprinting for Video Playback Verification in MPEG DASH
Light Weight Fingerprinting for Video Playback Verification in MPEG DASHLight Weight Fingerprinting for Video Playback Verification in MPEG DASH
Light Weight Fingerprinting for Video Playback Verification in MPEG DASH
 
Subspace Indexing on Grassmannian Manifold for Large Scale Visual Identification
Subspace Indexing on Grassmannian Manifold for Large Scale Visual IdentificationSubspace Indexing on Grassmannian Manifold for Large Scale Visual Identification
Subspace Indexing on Grassmannian Manifold for Large Scale Visual Identification
 
Scaled Eigen Appearance and Likelihood Prunning for Large Scale Video Duplica...
Scaled Eigen Appearance and Likelihood Prunning for Large Scale Video Duplica...Scaled Eigen Appearance and Likelihood Prunning for Large Scale Video Duplica...
Scaled Eigen Appearance and Likelihood Prunning for Large Scale Video Duplica...
 

Último

ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.MaryamAhmad92
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxPooja Bhuva
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSCeline George
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfDr Vijay Vishwakarma
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxCeline George
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17Celine George
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.christianmathematics
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsKarakKing
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxAreebaZafar22
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxEsquimalt MFRC
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxmarlenawright1
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfPoh-Sun Goh
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...pradhanghanshyam7136
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfNirmal Dwivedi
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17Celine George
 
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Pooja Bhuva
 
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxCOMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxannathomasp01
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...Nguyen Thanh Tu Collection
 

Último (20)

ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptx
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
 
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxCOMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 

Lec11 object-re-id

  • 1. Image Analysis & Retrieval CS/EE 5590 Special Topics (Class Ids: 44873, 44874) Fall 2016, M/W 4-5:15pm@Bloch 0012 Lec 11 Object Re-Identification Zhu Li Dept of CSEE, UMKC Office: FH560E, Email: lizhu@umkc.edu, Ph: x 2346. http://l.web.umkc.edu/lizhu Z. Li, Image Analysis & Retrv. 2016 p.1
  • 2. Outline  ReCap of Lecture 10  SVM  HW-2  Visual Object Re-Identification  Summary Z. Li, Image Analysis & Retrv. 2016 p.2
  • 3. ReCap of Lec 10 Support Vector Machine: Z. Li, Image Analysis & Retrv. 2016 p.3
  • 4. SVM Summary  What is a good classifier ?  Not only good precision-recall performance at training (empirical risk function), but also need to consider the model complexity  Structural Risk: penalizing by VC dimension  VC dimension  Is a good measure of model complexity  How many data points a certain classifier can shatter  SVM:  For linear hyperplane decision function, structural risk minimization is equivalent to gap maximization  Lagrangian Relaxation & Primal-Dual decomposition  Support Vectors: mathematically, are data points that has non-zero Lagrangian associated with  Kernel Trick: implicit mapping to higher dimensional richer structure space. Heuristic, may have overfitting risks (eg. RBF) Z. Li, Image Analysis & Retrv. 2016 p.4
  • 5. Lagrangian & Primal-Dual Decomposition  • Primal-Dual Decomposition,  Lagrangian Z. Li, Image Analysis & Retrv. 2016 p.5
  • 6. SVM on HoG in Matlab SVM is only dealing with a 2-class problem % hogs svm [n_img, kd]=size(hogs); hogs_lbl = zeros(1, n_img); hogs_lbl(1:20) = 1; svm_hogs = svmtrain(hogs, hogs_lbl); % recog rec_lbls = svmclassify(svm_hogs, hogs); fprintf('n error rate = %1.4f', sum(abs(hogs_lbl - rec_lbls'))/n_img); stem(abs(hogs_lbl - rec_lbls'), '.'); hold on; grid on; axis([1 160 0 5]); str=sprintf('err rate=%1.2f ', sum(abs(hogs_lbl - rec_lbls'))/n_img); title(str); Z. Li, Image Analysis & Retrv. 2016 p.6
  • 7. SVM on HoG Performance For the first 4 image sets: against the rest 2:Faces_easy 26:cougar_face 33:dollar_bill 3:Leopards Z. Li, Image Analysis & Retrv. 2016 p.7
  • 8. Homework 2: Aggregation  Compute SIFT Codebook  SIFT dimension reduction: kd=24, 32;  SIFT Kmeans: nc=64, 128.  SIFT GMM: nc=64, 128;  VLAD Aggregation:  Hard assignment VLAD  Soft assignment VLAD  Fisher Vector Aggregation:  1st order FV via vl_feat tools Matching Experiment  Like in HW-1 Retrieval Experiment:  Compute mAP Z. Li, Image Analysis & Retrv. 2016 p.8
  • 9. Outline  ReCap of Lecture 10  SVM  HW-2  Visual Object Re-Identification  Summary Z. Li, Image Analysis & Retrv. 2016 p.9
  • 10. Object Re-Id Outline The Problem and MPEG CDVS Standardization Scope CDVS Query Extraction and Compression Pipeline  Key point Detection and Selection (ALP, CABOX, FS)  Global Descriptor: Key points aggregation and compression (SCFV, RVD, AKULA)  Local Descriptor: Key points and coordinates compression CDVS Query Processing  Pair-wise Matching  Retrieval  Indexing CDVA Work  Handling video input  Image/Video Understanding Summary Z. Li, Image Analysis & Retrv. 2016 p.10
  • 11. Mobile Visual Search Problem CDVS: Object Identification: bridging the real and cyber world Image Understanding/Tagging: associate labels with pixels Z. Li, Image Analysis & Retrv. 2016 p.11
  • 12. CDVS Scope MPEG CDVS Standardization Scope  Define the visual query bit-stream extracted from images  Front-end: image feature capture and compression  Server Back-end: image feature indexing and query processing Objectives/Challenges:  Real-time: front end real time performance, e.g, 640x480 @30fps  Compression: Low bit rate over the air, achieving 20 X compression w.r.t to sending images, or 10X compression of the raw features.  Matching Accuracy: >95% accuracy in pair-wise matching (verification) and >90% precision in identification  Indexing/Search Efficiency: real time backend response from large (>100m) visual repository Z. Li, Image Analysis & Retrv. 2016 p.12
  • 13. Object Re-Identification via SIFT  What are the problems ?  Accuracy  Speed  Query Compression  Localziation Z. Li, Image Analysis & Retrv. 2016 p.13
  • 14. Technology Time Line MPEG-7 CDVS, 8th FP7 Networked Media Concentration meeting, Brussels, December 13, 2011. (thanks,Yuriy !) 1998 200820042002 2010 MPEG-7 core: Parts 1-5 MPEG-7 Parts 6-12 2000 2006 MPEG-7 Visual Signatures Compact Descriptors for Visual Search 2012 SIFT algorithm invented First iPhone SURF Recognition of SIFT in CV community Major progress in computer vision research CHoG MPEG starts work on CDVS First visual search applications Z. Li, Image Analysis & Retrv. 2016 p.14
  • 15. CDVS Data Set Performance • Annotated Data Set: – Mix of graphics, landmarks, buildings, objects, video clips, and paintings. – Approx 32k imags • Distraction Set: – Approx 1m images from various places • Image Query Size – 512bytes to 16K bytes – 4k~8k bytes are the most useful objects Z. Li, Image Analysis & Retrv. 2016 p.15
  • 16. The (Long)MPEG CDVS Time Line Meeting / Location/ Date Action Comments 96th meeting, Geneva Mar 25, 2011 Final CfP published Available databases and evaluation software. 97th meeting, Torino July 18-23, 2011 Last changes in databases and evaluation software Dataset: 30k annotated images + 1m distractor images 98th meeting, Geneva Nov 26 – Dec 2, 2011 Evaluation of proposals 11 proposals received, selected TI Lab test model under consideration 99th meeting, San Jose Feb 10, 2012 WD Working Draft 1 100th meeting, Geneva March 4, 2012 WD Working Draft 2 ….. …. …… SIFT patent issue 106th meeting, Geneva Oct, 2013 CD Committee Draft 108th meeting, Valencia Mar, 2014 DIS Finally…. Z. Li, Image Analysis & Retrv. 2016 p.16
  • 17. Outline The Problem and CDVS Standardization Scope CDVS Query Extraction and Compression Pipeline  Key point Detection and Selection (ALP, CABOX, FS)  Global Descriptor: Key points aggregation and compression (SCFV, RVD, AKULA)  Local Descriptor: Key points and coordinates compression CDVS Query Processing  Pair-wise Matching  Retrieval  Indexing CDVA Work  Handling video input  Image/Video Understanding Summary Z. Li, Image Analysis & Retrv. 2016 p.17
  • 18. CDVS Pipeline CDVS Query Processing Pipeline KD - Keypint Detection  ALP, BFLoG, CABOX FS - Feature Selection GD - Global Descriptor from key points aggregation  SCFV, RVD, AKULA LD – Local Descriptor  SIFT compression, spatial coordinates compression + ++ + + + + + + Global Descriptor Local Descriptors bitstreamKD FS GD LD Descriptor Extraction Descriptor Encoding Z. Li, Image Analysis & Retrv. 2016 p.18
  • 19. How SIFT Works Detection: find extrema in spatial-scale space  Scale space is represented by LoG filtering output, but approximated in SIFT by DoG Key Points Representation: 128 dim Z. Li, Image Analysis & Retrv. 2016 p.19
  • 20. Keypoint (SIFT) Detection Main motivation  Find invariant and repeatable features to identify objects o Many options, SURF, SIFT, BRISK, …etc, but SIFT still best performing  Circumvent the SIFT patent, which was sold to an unknown third company o SIFT patent main claim: DoG filtering to reconstruct scale space  Extraction Speed: can we be faster than DoG ? Main Proposals:  Telecom Italia: ALP – Scale Space SIFT Detector (adopted !)  Samsung Research: CABOX – Integral Image Domain Fast Box Filtering (incomplete results)  Beijing University/ST Micro/Huawei: Freq domain Gaussian Filtering (deemed not departing far enough from the SIFT patent) Z. Li, Image Analysis & Retrv. 2016 p.20
  • 21. m31369: ALP-Scale Space Key point Detector Telecom Italia: Gianluca Francini, Skjalg Lepsoy, Massimo Balestri key idea, model the scale space response as a polynomial function, and estimate its coefficients by LoG filtering at different scales: ℎ 𝑥, 𝑦, 𝜎 = 𝜎2 ∙ 𝑑2 𝑑𝑥2 + 𝑑2 𝑑𝑦2 𝑔 𝑥, 𝑦, 𝜎 ℎ 𝑚, 𝑛, 𝜎 ≈ 𝑘=1 𝐾 𝛾 𝑘 𝜎 ∙ ℎ 𝑚, 𝑛, 𝜎 𝑘 LoG kernel at any scale can be expressed as l.c. of 4 fixed scale kernels 𝜸 𝒌 𝝈 ≈ 𝒂 𝒌 𝝈 𝟑 + 𝒃 𝒌 𝝈 𝟐 + 𝒄 𝒌 𝝈 + 𝒅 𝒌 Z. Li, Image Analysis & Retrv. 2016 p.21
  • 22. ALP – Scale Space Response Express the scale response at (x, y), as a 3rd order polynomial The polynomial coefficients are obtained by filtering, where Lk is the LoG filtered image at pre-fixed scale k: 𝛾 𝑘 𝜎 𝜎 Z. Li, Image Analysis & Retrv. 2016 p.22
  • 23. ALP Scale Space Extrema Detection ALP filtering scale space response as polynomials – Scale Space Extrema: 𝑰 ∗ 𝒉 𝟒𝟑𝟎, 𝟏𝟐𝟐, 𝝈 ≈ 𝟐. 𝟖𝟓 𝝈 𝟑 − 𝟑𝟗. 𝟓𝟏 𝝈 𝟐 + 𝟏𝟕𝟐. 𝟔𝟓 𝝈 − 𝟏𝟕𝟐. 𝟔𝟏 – No Extrema (saddle point): 𝑰 ∗ 𝒉 𝟏𝟓𝟑, 𝟑𝟓𝟔, 𝝈 ≈ 𝟎. 𝟏𝟐 𝝈 𝟑 − 𝟏. 𝟎𝟔 𝝈 𝟐 + 𝟑. 𝟏𝟓 𝝈 − 𝟐. 𝟓 + + Z. Li, Image Analysis & Retrv. 2016 p.23
  • 24. Displacement refinement Scale response as a 2nd order polynomial function of displacement (u, v) 𝐼 ∗ ℎ 𝑥 − 𝑢, 𝑦 − 𝑣, 𝜎 ≈ 𝛽5 𝑥, 𝑦, 𝜎 𝑢2 + 𝛽4 𝑥, 𝑦, 𝜎 𝑣2 + 𝛽3 𝑥, 𝑦, 𝜎 𝑢𝑣 + 𝛽2 𝑥, 𝑦, 𝜎 𝑢 + 𝛽1 𝑥, 𝑦, 𝜎 𝑣 + 𝛽0 𝑥, 𝑦, 𝜎 Z. Li, Image Analysis & Retrv. 2016 p.24
  • 25. ALP Performance Repeatability vs VL_FEAT SIFT: Z. Li, Image Analysis & Retrv. 2016 p.25
  • 26. Gaussian Filter Box Filters Approximate DoG/LoG by a cascade of box filters that can offer early termination: • Very fast integral image domain box filtering, • The box filters are found by solving the following problem, sparse combination of box filters, via LASSO: CABOX – Cascade of Box Filters (m30446) Z. Li, Image Analysis & Retrv. 2016 26
  • 27. The influence of the used dictionary determines not only the quality of the approximation but also the number of boxes required. CABOX Results Z. Li, Image Analysis & Retrv. 2016 27
  • 28. CABOX Detection Results More examples of keypoint detection using box filters. Overlap: 85% Overlap: 88% • Algorithmically the fastest SIFT detector amongst CE1 contributions • Ref: V. Fragoso, G. Srivastava, A. Nagar, Z. Li, K. Park, and M. Turk, "Cascade of Box (CABOX) Filters for Optimal Scale Space Approximation", Proc of the 4th IEEE Int'l Workshop on Mobile Vision , Columbus, USA, 2014 Z. Li, Image Analysis & Retrv. 2016 p.28
  • 29. Feature Selection Why do Feature Selection ?  Average 1000+ SIFTs extracted for VGA sized images, need to reduce the number of actual SIFTs sent  Not all SIFTs are created equal in repeatability in image match, o model the repeatability as a prob function [Lepsøy, S., Francini, G., Cordara, G., & Gusmao, P. P. (2011). Statistical modelling of outliers for fast visual search. IEEE VCIDS 2011.] of SIFT’s scale, orientation, distance to the center, peak strength, …,etc :  Use self-matching (m29359) to improve the offline repeatability stats robustness  )()()()()()(),,,,,( 65432 * 1 *   pffDfdfffpdDr im0 im1 0 10 20 30 40 0 2 4 6 8 10 12 x 10 4 dsift im0 -> im1 0 5 10 15 20 25 0 2 4 6 8 10 12 14 x 10 4 dsift im1 -> im0 𝑓(𝜎) 𝜎 Self-matching via random out of plane rotationZ. Li, Image Analysis & Retrv. 2016 p.29
  • 30. Feature Selection Illustration of FS via offline repeatability PMF SIFT peak strength pmf SIFT scale pmf Combined scale/peak strength pmf Z. Li, Image Analysis & Retrv. 2016 p.30
  • 31. Global Descriptor Why need global descriptor ?  Key points based query representation is not stateless, it has a structure, i.e, SIFTs and their positions. This is not good for retrieval against a large database, complexity O(N)  Need a “coarser” representation of the information contained in the image by aggregating local features, for indexing/hashing purpose.  Landmark work, Fisher Vector, the best performing solution in ImageNet challenge, before the CNN deep learning solution: [Perronnin10] F. Perronnin, J. Sánchez, and T. Mensink. Improving the fisher kernel for large-scale image classification. In Proc. ECCV, 2010. CDVS Global Aggregation Works  m28061: Beijing University SCFV: for retrieval/identification  m31491: Samsung AKULA: for matching /verification  M31426: Univ of Surrey/VisualAtom: RVD: similar to SCFV Z. Li, Image Analysis & Retrv. 2016 p.31
  • 32. Global Descriptor – SCFV (m28061) Beijing University SCFV – Scalable Compressed Fisher Vector  PCA to bring SIFT down from 128 to 32 dimensions  Train a GMM of 128 components in 32 dim space, with parameters {ui, σi, wi }  Aggregate m=300 SIFT with GMM via Fisher Vector, 1st, and 2nd order, where γt i is the prob of SIFT xt being generated by GMM component i, 0100011001010 ℊui X = 𝜕ℒ X λ 𝜕ui = 1 300wi t=1 300 γt i ( xt − ui σi ) ℊσi X = 𝜕ℒ X λ 𝜕σi = 1 600wi t=1 300 γt i [ xt − ui σi 2 − 1] γt i = p i xt, λ = wipi(xt|λ) j=1 128 wjpj(xt|λ) Z. Li, Image Analysis & Retrv. 2016 p.32
  • 33. SCFV Distance Function The SCFV has 32x128 bits, for the 1st order Fisher Vector, and additional 32x128 bits for the 2nd order FV Not all GMM components are active, so an 128-bit flag [b1, b2,…, b128] is also introduced to indicate if it is active. Rationale: if not many SIFTs are associated with certain component, then the bits it generates are noise most likely Distance metric: A lot of painful work on GMM component turn on logic optimization to reach a very high performance. Very fast due to binary ops, can short list a 1m image data base within 1 sec on desktop computer. sX,Y = i=1 128 bi X bi Y wHa(ui X ,ui Y )(32 − 2Ha(ui X , ui Y )) 32 i=1 128 bi X i=1 128 bi Y Z. Li, Image Analysis & Retrv. 2016 p.33
  • 34. SCFV Performance Matching/Verification (TPR @ 1% FPR) Retrieval/Identification (mAP) Z. Li, Image Analysis & Retrv. 2016 p.34
  • 35. Local Descriptor SIFT Descriptor Compression  VisualAtoms/Univ of Surrey: a handcrafted transform/quantization scheme + huffman coding, low memory cost, slightly less performance (compared to PVQ), adopted.  Only binary form is received, cannot recover the SIFT h0 h6 h7 h4 h5 h2h3 h1         1 0 and 1 i i j j i i i i i j j j j j i i j j if v QL v if v QL v QH if v QH Z. Li, Image Analysis & Retrv. 2016 p.35
  • 36. Outline The Problem and CDVS Standardization Scope CDVS Query Extraction and Compression Pipeline  Key point Detection and Selection (ALP, CABOX, FS)  Global Descriptor: Key points aggregation and compression (SCFV, RVD, AKULA)  Local Descriptor: Key points and coordinates compression CDVS Query Processing  Pair-wise Matching  Retrieval  Indexing CDVA Work  Handling video input  Image/Video Understanding Summary Z. Li, Image Analysis & Retrv. 2016 p.36
  • 37. Pair Wise Matching Diagram of Image Matching  First local features are matched,  if certain number of matched SIFT pairs are identified, then a Geometric Verification called DISTRAT[] is performed, to check the consistence of the matching points via distance ratio check.  For un-sure image pairs, the global descriptor distance is computed and a threshold is applied to decide match of non- match Z. Li, Image Analysis & Retrv. 2016 p.37
  • 38. Matching Performance (@ 1% FPR) • Image Matching Accuracy: – Mix of graphics, landmarks, buildings, objects, video clips, and paintings. • Image Identification Accuracy: – For graphics (cd/book cover, logos, papers), paintings, the performance is in 90% range – For objects of mixed variety, 78% in average. – For buildings/landmark, the performance is not reflective of the true potential, as the current data set has some annotation errors Z. Li, Image Analysis & Retrv. 2016 p.38
  • 39. CDVS Retrieval Retrieval Pipeline  Short list is generated by GD based k-nn operation via:  Then for the short list of m candidates, do m times local descriptor based matching and rank their matching scores 𝑑 𝑅, 𝑄 = i=1 512 bi Q bi R W1Ha(ui Q ,ui R ) W2ui Q(D − 2Ha(ui Q , ui R )) ( i=1 512 bi Q )0.3( i=1 512 bi R )0.3 Z. Li, Image Analysis & Retrv. 2016 p.39
  • 40. Retrieval Performance: Mean Average Precision mAP measures the retrieval performance across all queries Image Analysis & Understanding, 2015 p.40
  • 41. mAP example  MAP is computed across all query results as the average precision over the recall Image Analysis & Understanding, 2015 p.41
  • 42. CDVS Retrieval Performance Retrieval Simulation Set Up  Approx. 17k annotated images mixed with 1m+ distraction image set  Short Listing: retrieve 500 closest matches by GD and then do pair wise matching and ranking  Data sets mAP Z. Li, Image Analysis & Retrv. 2016 p.42
  • 43. MBIT (multi-block indx table) Indexing GD is partitioned into blocks of 16 bits, and inverted list built. Shortlisting is by weighted scoring on block wise hamming distance Algorithm . MBIT Searching Input: Query B 𝑞 = {𝑏 𝑚 𝑞 } 𝑚=1 1024 , MBIT T = {𝑇 𝑚 } 𝑚=1 1024 , speedup ratio 𝑇, difference bits 𝐷. Output: The shortlist {B𝑙}𝑙=1 𝐿 , 𝐿 = 500. 1: Initialize 𝑠(𝑞, 𝑛) = 0, 𝑛 = 1 … 𝑁. 2: for 𝑚 = 1 to 1024 do 4: if the 𝑚+1 2 -th Gaussian of B 𝑞 is not selected then 5: continue; 6: end if; 7: for 𝑑 = 0 to D do 8: Enumerate binary vectors {h 𝑑 } with 𝑑-bit differences with 𝑏 𝑚 𝑞 . 9: For each image 𝑛 in the buckets T 𝑚 (h 𝑑 ), update # 𝑛,𝑑 = # 𝑛,𝑑 + 1. 10: end for 11: end for 12: for 𝑛 = 1 to 𝑁 do 13: Update s(q, 𝑛) = # 𝑛,𝑑 𝐷 𝑑=0 ; 14: end for 15: Sort the image list by their voting score in descending order. 16: Add descriptors of top 𝑁 𝑇 images in the ordered list into subset {B 𝑘} 𝑘=1 𝐾 . 17: Run an exhaustive search within {B 𝑘} 𝑘=1 𝐾 and sort the list by Hamming distance. 18: Return the first 𝐿 = 500 images. Z. Li, Image Analysis & Retrv. 2016 p.43
  • 44. (m29361) Bit Mask/Collision Optimized Indexing Selecting 6-bits segments that are most efficient in discriminating for shortlisting, allow for permutation of bits Shortlisting by weighted segment hamming distance also reflecting the segment entropy Full paper form: X. Xin, A. Nagar, G. Srivastava, Z. Li*, F. Fernandes, A. Katsaggelos,Large Visual Repository Search with Hash Collision Design Optimization. IEEE MultiMedia 20(2): 62- 71 (2013) Z. Li, Image Analysis & Retrv. 2016 p.44
  • 45. Outline The Problem and CDVS Standardization Scope CDVS Query Extraction and Compression Pipeline  Key point Detection and Selection (ALP, CABOX, FS)  Global Descriptor: Key points aggregation and compression (SCFV, RVD, AKULA)  Local Descriptor: Key points and coordinates compression CDVS Query Processing  Pair-wise Matching  Retrieval  Indexing CDVA Work  Handling video input  Image/Video Understanding Summary Z. Li, Image Analysis & Retrv. 2016 p.45
  • 46. Automotive & AR Use Case Extend to object Identification / event detection for video input:  How do we deal with vastly increased data rate ? o New spatial-temporal interesting points ? o Key frame based CDVS processing ?  How do we explore the temporal dimension ? o Events detection, content classification (video archiving) Z. Li, Image Analysis & Retrv. 2016 p.46
  • 47. Image Understanding/Tagging CDVS is based on a handcrafted feature, i.e, SIFT and SIFT aggregation. Latest work in Deep Learning pointing to new potentials in CCN to uncover new structure and knowledge from pixels, to associate with not only object identity, but also image labels Z. Li, Image Analysis & Retrv. 2016 p.47
  • 48. Summary MPEG CDVS offers the state-of-art tech performance in visual object re-identification accuracy, speed, and query compression Amd work:  A recoverable SIFT compression scheme, currently SIFT cannot be recovered from bit stream  3D key points, wide adoption of RGB+Depth sensors.  Non-rigid body object identification CDVA work, still refining the problem definition, main use cases  Object identification in video  Events detection, content classification  Image Understanding (vs object identification) Z. Li, Image Analysis & Retrv. 2016 p.48
  • 49. References Key References  Test Model 11: ISO/IEC JTC1/SC29/WG11/N14393  SoDIS: ISO/IEC DIS 15938-13 Information technology — Multimedia content description interface — Part 13: Compact descriptors for visual search  Signal Processing – Image Communication, special issue on visual search and augmented reality, vol. 28(4), April 2013. Eds. Giovanni Cordara, Miroslaw Bober, Yuriy A. Reznik:.  ALP: m31369 CDVS: Telecom Italia’s response to CE1 – Interest point detection  CABOX: V. Fragoso, G. Srivastava, A. Nagar, Z. Li, K. Park, and M. Turk, "Cascade of Box (CABOX) Filters for Optimal Scale Space Approximation", Proc of the 4th IEEE Int'l Workshop on Mobile Vision , Columbus, USA, 2014  FS: Lepsøy, S., Francini, G., Cordara, G., & Gusmao, P. P. (2011). Statistical modelling of outliers for fast visual search. IEEE Workshop on Visual Content Identification and Search (VCIDS 2011). Barcelona, Spain.  SCFV: L.-Y. Duan, J.Lin, J. Chen, T. Huang, W. Gao: Compact Descriptors for Visual Search. IEEE Multimedia 21(3): 30-40 (2014)  RVD: m31426 Improving performance and usability of CDVS TM7 with a Robust Visual Descriptor (RVD)  AKULA: A. Nagar, Z. Li, G. Srivastava, K.Park: AKULA - Adaptive Cluster Aggregation for Visual Search. IEEE DCC 2014: 13-22 Z. Li, Image Analysis & Retrv. 2016 p.49