10. Features for extracting clothing color
• Color histogram
– RG joint histogram in normalized RGB (Nakajima et al., 2003)
– Normalized color histogram (赤塚 et al., 2006)
– RGB joint histogram (本田 et al., 2009)
• Color histogram with spatial layout along the height direction
– Median value in HSL color space for each of the 10 regions into which the body region is divided (Bird et al., 2005)
– Color Rank features; joint histogram of color and height (Lin et al., 2008)
• Histogram built from frequently occurring colors
– Major Color histogram (Cheng et al., 2006)
– Histogram of each codeword in a Color Codebook (Cai et al., 2010)
[Paper excerpt shown on the slide, on the major color spectrum histogram representation]
ABSTRACT: "In this paper, we present a novel algorithm capable of matching single individuals in such a scenario based on appearance features. In order to reduce the variable illumination effects in a typical disjoint camera environment, a cumulative color histogram transformation is first applied to the segmented moving object. Then, an incremental major color spectrum histogram representation (IMCSHR) is used to represent the appearance of a moving object and cope with small pose changes occurring along the track. An IMCSHR-based similarity measurement algorithm is also proposed to measure the similarity of any two segmented moving objects. A final step of post-matching integration along the object's track is eventually applied. Experimental results show that the proposed approach proved capable of providing correct matching in typical situations."
Index Terms: object tracking, major color spectrum histogram representation, disjoint camera views.
From the excerpt's Section 2: the RGB color space yields a total of 16.8 million colors; using a color distance [6], this is scaled down to a very limited number of major colors. Colors that rarely appear are discarded, and colors within a mutual distance threshold are dealt with as a single color.
Figure 1. The Major Color Spectrum Histogram Representation of the 'tn_flower' picture: (a) 'tn flower' picture, (b) MCSHR histogram, (c) MCS…
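The idea behind a major color histogram (merge colors that lie within a distance threshold, then keep only the most frequent ones) can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the cited paper: the function name, the greedy assignment, the Euclidean RGB distance, and the default `dist_threshold` and `coverage` values are all hypothetical choices.

```python
import numpy as np

def major_color_histogram(pixels, dist_threshold=40.0, coverage=0.9):
    """Greedy major-color histogram (illustrative sketch only).

    pixels: (N, 3) array of RGB values from a segmented object.
    Returns (colors, freqs): representative colors and their pixel
    fractions, sorted by decreasing frequency and cut off once the
    cumulative fraction reaches `coverage`.
    """
    pixels = np.asarray(pixels, dtype=float)
    if len(pixels) == 0:
        raise ValueError("pixels must not be empty")
    centers, counts = [], []
    for p in pixels:
        if centers:
            # Distance from this pixel to every existing major color.
            d = np.linalg.norm(np.asarray(centers) - p, axis=1)
            j = int(np.argmin(d))
            if d[j] <= dist_threshold:
                counts[j] += 1  # close enough: treat as the same color
                continue
        centers.append(p)       # otherwise start a new major color
        counts.append(1)
    centers = np.asarray(centers)
    freqs = np.asarray(counts, dtype=float) / len(pixels)
    order = np.argsort(-freqs, kind="stable")
    centers, freqs = centers[order], freqs[order]
    # Keep the most frequent colors until `coverage` is reached;
    # rarely occurring colors beyond that point are discarded.
    keep = int(np.searchsorted(np.cumsum(freqs), coverage) + 1)
    return centers[:keep], freqs[:keep]
```

For example, calling `major_color_histogram(px, coverage=0.85)` on a mostly red and blue object with a few green pixels would be expected to return only the red and blue entries.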
(c) Copyrights OMRON Kyoto Univ. Nagoya Univ. 2011. All Rights Reserved.
13. Spatial Covariance Region (Bak, 2010)
• HOG + boosting detector: detects human body parts
• Covariance features: covariance features computed in each region (correlations between features and their spatial arrangement)
• Distance between the covariance matrices of two people: solve the generalized eigenvalue problem for the pair of matrices, then take a nonlinear sum of the eigenvalues
• Spatial Pyramid Matching: each region is matched at multiple scales
[Paper excerpt shown on the slide]
Figure 3. Illustration of the human and body parts detection results. Detections are indicated by 2D bounding boxes. Colors correspond to different body parts: the full body (yellow), the top (blue), the torso (green), legs (violet), the left arm (light blue) and the right arm (red).
Figure 4. The first two columns show original images of the same person captured from different cameras in different environments. The last two columns show these images after histogram equalization. Panels: (a) original, (b) original, (c) normalized, (d) normalized.
Signature computation: "In this section we propose a scheme to generate the human signature. The human and body parts detector returns regions of interest corresponding to body parts: the full body, the top, the torso, legs, the left arm and the right arm. The top part is composed of the torso and the head (see Figure 3). Once the body parts are detected, the next step is to handle color dissimilarities caused by camera and illumination …"
"… an image, increasing the detail in those regions. Histogram equalisation achieves this aim by stretching the range of the histogram to be as close as possible to a uniform histogram. The approach is based on the idea that amongst all possible histograms, a uniformly distributed histogram has maximum entropy [5]. Maximizing the entropy of a distribution we maximize its information and thus histogram equalization maximizes the information content of the output image. We apply the histogram equalization to each of the color …"
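The covariance-matrix distance named on the slide (generalized eigenvalue problem, then a nonlinear sum of the eigenvalues) can be sketched as below. The sketch assumes the commonly used form, the square root of the sum of squared log generalized eigenvalues; the slide does not spell out the exact formula, so treat that choice, and the function names, as assumptions.

```python
import numpy as np

def region_covariance(features):
    """Covariance descriptor of a region.

    features: (N, d) array, one d-dimensional feature vector per pixel.
    Returns the (d, d) covariance matrix of those vectors.
    """
    return np.cov(np.asarray(features, dtype=float), rowvar=False)

def covariance_distance(c1, c2):
    """Distance between two symmetric positive-definite matrices.

    Uses the generalized eigenvalues lam of c1 v = lam * c2 v and
    returns sqrt(sum(ln(lam)^2)). This specific form is an assumption.
    """
    # Eigenvalues of c2^{-1} c1 are the generalized eigenvalues of (c1, c2).
    lam = np.real(np.linalg.eigvals(np.linalg.solve(c2, c1)))
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))
```

The distance is zero for identical matrices and symmetric in its arguments, since swapping the matrices inverts each eigenvalue and only flips the sign of its logarithm.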
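14. Combining clothing color and pattern features
• 上村 et al. (2004)
– RGB histogram
– Autocorrelogram
• Berdugo et al. (2010)
– Normalized RGB features
– Vertical color ratios, oriented gradients, saliency map
• Farenzena et al. (2010)
– HSV color histogram, Maximally Stable Color Regions
– Recurrent patterns
• Bazzani et al. (2010)
– HSV histogram
– Local & global epitome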
16. SDALF (Farenzena, 2010)
Symmetry-Driven Accumulation of Local Features
• Input image: detect the upper/lower body and the symmetry axes
• HSV histogram: color information, weighted by distance from the symmetry axis, then matched
• MSCR (Forssen, 2007): color, spatial arrangement and region shape, weighted by distance from the symmetry axis, then matched
• RHSP (Recurrent High-Structured Patch): complex patterns with repeating structure, weighted by distance from the symmetry axis, then matched
• Score integration: the three matching results are combined
[Paper excerpt shown on the slide; partly visible byline: M. Cristani; Verona and Genova, Italy]
Figure 1. Sketch of the approach: a) two instances of the same person; b) x- and y-axes of asymmetry and symmetry, respectively; c) weighted histogram back-projection (brighter pixels mean a more important color); d) Maximally Stable Color Regions; e) Recurrent Highly Structured Patches.
"In this paper, we present a novel and versatile appearance-based re-identification method, based on a pon…"
17. Classification of features
• Color space: RGB, normalized RGB, HSV, Opponent Color, Lab
• Photometric preprocessing
• Geometric preprocessing: region partitioning, segmentation, human body part recognition
• Feature extraction: median, histogram, rank, joint histogram, Major Color Histogram/MSCR, gradient/gradient magnitude (HOG, SIFT, SURF, Haar-like), covariance
18. Question 1: In the end, which features are best?
• Combining the individual features produces countless combined features.
• Even the options listed here already give 600 combinations (3 x 5 x 4 x 10)!
[Diagram: the feature taxonomy of slide 17 (color space, photometric preprocessing, geometric preprocessing, feature extraction)]
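The count of 600 is simply the product of the number of options at each stage. The short sketch below enumerates the combinations; the slide gives only the factors 3, 5, 4 and 10, so which factor belongs to which stage is not assumed here and the options are represented by anonymous indices.

```python
from itertools import product

# Option counts per pipeline stage, taken from the "3x5x4x10" on the slide.
# Which stage each count belongs to is not stated, so options are just indices.
option_counts = (3, 5, 4, 10)

# Every combination picks one option index per stage.
combinations = list(product(*(range(n) for n in option_counts)))
print(len(combinations))
```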
19. Question 2: Which components are actually useful?
• Could we use only the useful parts?
• Redundant feature representations affect computation time.
[Diagram: the feature taxonomy of slide 17 (color space, photometric preprocessing, geometric preprocessing, feature extraction)]
20. In search of the optimal feature combination...
• Automatically search for the optimal feature set
– Classifier learning
• Automatically search for the optimal feature components
– Distance metric learning
Learning-based classifiers
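21. Person-specific vs. general discriminative models
• Person-specific discriminative model
– A different discriminative model must be used for each person being matched, so the computational cost is high.
• General discriminative model
– Low computational cost.
– Solved as binary classification of same-person distances vs. different-person distances, or within a nearest-neighbor framework.
[Figure: distributions of the distance between two data samples, one for the same person and one for different people; the classifier is trained to separate the two]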
24. Gray, 2008
• Feature set used
– Color: RGB, YCbCr, HSV
– Texture: Schmid and Gabor filters
• Local regions and features that separate the same-person and different-person distance distributions are selected with AdaBoost.
• Share of each feature channel in the learned classifier:

Feature channel    Percent of classifier weight
R                  11.0 %
G                   9.4 %
B                  12.4 %
Y                   6.4 %
Cb                  6.1 %
Cr                  4.5 %
H                  14.2 %
S                  12.5 %
Schmid             12.0 %
Gabor              11.7 %

• No single feature stands out!
• Many feature combinations have been proposed, but it is hard to rank them!
[Paper excerpts shown on the slide]
Fig. 3. The filters used in the model to describe texture. (a) Rotationally symmetric Schmid filters. (b) Horizontal and vertical Gabor filters.
"Other filters could be added as well, but proved less effective. It has been observed that adding additional features has few drawbacks other than increasing computational and storage requirements. The methodology used to select these specific channels was somewhat haphazard, so it is likely that better feature channels may still be found."
2.3 Feature Regions: "A feature region could be any collection of pixels in the image, but for reasons of computational sanity they will be restricted to a more tractable subset. Some popular subsets of regions include the simple rectangle, a collection of rectangles [19], or a rectangularly shaped region [7]. The motivation for this has been the …"
Fig. 8. A table showing the percent of features from each channel, …
5 Conclusions: "We have presented a novel approach to viewpoint invariant pedestri… that learns a similarity function from a set of training data. … shown that this ensemble of localized features is effective at discrim… pedestrians regardless of the viewpoint change between the …"
25. Prosser, 2010
• Learning with a binary classifier
– A classifier is trained to separate the same-person and different-person distributions of the distance between two data samples.
– This assumes that a hard threshold exists; for a difficult problem such as cross-camera person matching, no such threshold exists.
• Learning with a ranking criterion
– For each given pair of data, no penalty is imposed as long as the ordering relation alone is satisfied.
– For a given same-person/different-person pair, a linear mapping is learned under which the same-person pair is ranked above the different-person pair.
• Feature set used
– Color: RGB, YCbCr, HSV
– Texture: Schmid and Gabor filters
[Paper excerpt shown on the slide]
$$\mathbf{w} = \arg\min_{\mathbf{w},\,\xi}\; \frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{s=1}^{|P|} \xi_s$$
$$\text{s.t.}\quad \mathbf{w}^\top\left(\hat{x}_s^{+} - \hat{x}_s^{-}\right) \ge 1 - \xi_s,\quad s = 1, \dots, |P|, \qquad \xi_s \ge 0,\quad s = 1, \dots, |P|,$$
"where C is a positive parameter that trades margin size against training error. One of the main problems with using an SVM to solve the ranking problem is the potentially large size of P. In problems with lots of queries and/or queries with lots of associated observation feature vectors, the size of P means that forming the $\hat{x}_s^{+} - \hat{x}_s^{-}$ vectors becomes computationally challenging. Particularly, in the case of person re-identification, assume there is a training set consisting of m person images in two camera views. The size of P is proportional to $m^2$, it thus increases rapidly as m increases. SVM-based methods also rely on parameter C, which must be known before training. In order to yield a reasonable model, one must use cross validation to tune model parameters. This step requires the rebuilding of the training/validation set at each iteration, thus further increasing the computational cost and memory usage. Hence, the RankSVM in Eqn (3) is not computationally tractable for large-scale constraint problems due to both computational cost and memory use.
Chapelle and Keerthi [1] proposed a method based on primal RankSVM (PRSVM) that relaxes the constrained RankSVM and formulated a non-constraint model as follows:"
$$\mathbf{w} = \arg\min_{\mathbf{w}}\; \frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{s=1}^{|P|} \max\left(0,\; 1 - \mathbf{w}^\top\left(\hat{x}_s^{+} - \hat{x}_s^{-}\right)\right)^2$$
[Second excerpt, from the paper describing the feature set]
Fig. 3. The filters used in the model to describe texture. (a) Rotationally symmetric Schmid filters. (b) Horizontal and vertical Gabor filters.
2.3 Feature Regions: "A feature region could be any collection of pixels in the image, but for reasons of computational sanity they will be restricted to a more tractable subset. Some popular subsets of regions include the simple rectangle, a collection of rectangles [19], or a rectangularly shaped region [7]. The motivation for this has been the computational savings of computing sums over rectangular regions using an integral image. However we can use our intuition about the problem to significantly …"
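The unconstrained ranking objective quoted on the slide can be minimized directly. The sketch below does so with plain gradient descent on difference vectors $d_s = \hat{x}_s^{+} - \hat{x}_s^{-}$, assuming a squared hinge loss. It is an illustration only: the function names, the learning rate and the iteration count are hypothetical, and the solver used in the cited work is not reproduced here.

```python
import numpy as np

def prsvm_objective(w, diffs, C):
    """0.5*||w||^2 + C * sum(max(0, 1 - w.d_s)^2) over difference vectors d_s."""
    margins = np.maximum(0.0, 1.0 - diffs @ w)
    return 0.5 * (w @ w) + C * np.sum(margins ** 2)

def train_prsvm(diffs, C=1.0, lr=0.01, n_iter=500):
    """Gradient descent on the unconstrained objective (illustrative only).

    diffs: (S, d) array; row s is x_s^+ - x_s^-, i.e. the feature vector of a
    correct match minus that of an incorrect match for the same query.
    """
    diffs = np.asarray(diffs, dtype=float)
    w = np.zeros(diffs.shape[1])
    for _ in range(n_iter):
        margins = np.maximum(0.0, 1.0 - diffs @ w)
        # Gradient of the regularizer plus the squared hinge terms.
        grad = w - 2.0 * C * (margins @ diffs)
        w -= lr * grad
    return w
```

A pair is ranked correctly when `diffs[s] @ w > 0`. Only pairs that violate the margin contribute to the gradient, which matches the slide's point that no penalty is applied once the ordering is satisfied.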
26. 井尻, 2011
• Distance metric learning is carried out in advance on a large amount of data
– Ranking criterion
– Nonlinear kernel: Jensen-Shannon kernel
– Result: a nonlinear distance metric
• Matching: the distance between two color histograms is computed with the learned distance metric, which gives the matching score
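As a rough illustration of comparing two color histograms with a Jensen-Shannon based kernel, the sketch below computes the Jensen-Shannon divergence and turns it into a similarity. The slide names the kernel but not its formula, so the form `ln 2 - JSD` used here is an assumption (one of several definitions in the literature), and the learned metric itself is not shown.

```python
import numpy as np

def js_divergence(p, q):
    """Jensen-Shannon divergence (in nats) between two histograms."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()          # normalize to probability distributions
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0         # terms with a == 0 contribute nothing
        return np.sum(a[mask] * np.log(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def js_kernel(p, q):
    """Similarity in [0, ln 2]; this particular form is an assumption."""
    return np.log(2.0) - js_divergence(p, q)
```

Identical histograms give the maximum kernel value `ln 2`, and histograms with no overlapping bins give a value of zero.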
41. Evaluation datasets
iLIDS
• Airport
• Illumination variation
– Indoor
• Variation in person orientation
– Limited
Figure 5. Sample images of cameras 1 and 3 of the iLids scenario. As one can see, strong differences in environmental conditions, e.g. changes in lighting, are present between the cameras. Other challenges, e.g. occlusion of persons by other persons and luggage, increase severity for both tracking and re-identification.
Figure 6. Sample persons of the iLids scenario.
42. Evaluation datasets
VIPeR
• Street
• Illumination variation
– Outdoor
– Diverse
• Variation in person orientation
– Diverse
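43. Evaluation criterion
• CMC
– Cumulative Match Characteristics
– Within what rank is the correct person matched?
– Rank is plotted on the horizontal axis and the matching rate on the vertical axis.
– In practice, a match at rank 1 is desirable.
– A system is not usable in practice unless the correct person is matched among the top ranks.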
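A CMC curve can be computed from a probe-by-gallery distance matrix as sketched below. The sketch assumes that every probe has exactly one correct match in the gallery; the function and argument names are hypothetical.

```python
import numpy as np

def cmc_curve(dist, probe_ids, gallery_ids):
    """Cumulative Match Characteristic curve.

    dist: (P, G) distance matrix between P probes and G gallery entries.
    Assumes each probe identity appears exactly once in the gallery.
    Returns an array of length G; entry k-1 is the fraction of probes
    whose correct match is within the top k ranks.
    """
    dist = np.asarray(dist, dtype=float)
    probe_ids = np.asarray(probe_ids)
    gallery_ids = np.asarray(gallery_ids)
    order = np.argsort(dist, axis=1)       # gallery indices, nearest first
    ranked_ids = gallery_ids[order]        # (P, G) identities in rank order
    hits = ranked_ids == probe_ids[:, None]
    first_hit = hits.argmax(axis=1)        # 0-based rank of the correct match
    counts = np.bincount(first_hit, minlength=dist.shape[1])
    return np.cumsum(counts) / len(probe_ids)
```

The first element of the returned array is the rank-1 matching rate, which is the value the slide singles out as most important in practice.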