This document discusses two approaches to segmenting vehicles in traffic video: motion segmentation and a Gaussian mixture model. Motion segmentation segments regions with coherent motion estimated using optic flow. The Gaussian mixture model models each pixel as a mixture of Gaussians and segments based on whether the current pixel matches the background model. The document evaluates these approaches on test traffic video sequences and discusses challenges including noise in optic flow estimates and artifacts from global camera motion.
2. Introduction
Motivation
In CA alone, >400 road monitoring cameras, plans to install more
Reduce video bit-rate by object coding
Collect traffic flow data and/or surveillance e.g. count vehicles
passing on highway, draw attention to abnormal driver behavior
Two different approaches to segmentation:
Motion Segmentation
Gaussian Mixture Model
3. Motion Segmentation
Segment regions with a coherent motion
Coherent motion: similar parameters in motion model
Steps:
i. Estimate dense optic flow field
ii. Iterate between motion parameter estimation and
region segmentation
iii. Segmentation by k-means clustering of motion
parameters
Translational model: use motion vectors directly as parameters
4. Optic Flow Estimation
Optic Flow (or spatio-temporal constraint) equation:
Ix · vx + Iy · vy + It = 0
where Ix , Iy , It are the spatial and temporal derivatives
Problems
i. Under-constrained: add ‘smoothness constraint’, i.e. assume flow
field constant over 5x5 neighbourhood window
→ weighted least-squares (LS) solution
ii. ‘Small’ flow assumption often not valid:
e.g. at 1 pixel/frame, an object will take 10 secs (300 frames @ 30 fps)
to move across a width of 300 pixels
→ multi-scale approach
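The windowed least-squares step above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `window_flow`, the use of `np.gradient` for the derivatives, and the default of uniform weights are all assumptions made for the example.

```python
import numpy as np

def window_flow(frame0, frame1, row, col, half=2, weights=None):
    """Estimate (vx, vy) at one pixel from a (2*half+1) x (2*half+1) window.

    Hypothetical helper: solves Ix*vx + Iy*vy + It = 0 in the weighted
    least-squares sense, assuming the flow is constant over the window.
    """
    f0 = frame0.astype(float)
    f1 = frame1.astype(float)
    # Spatial derivatives (central differences) and temporal derivative.
    Iy, Ix = np.gradient(f0)
    It = f1 - f0
    win = (slice(row - half, row + half + 1),
           slice(col - half, col + half + 1))
    ix, iy, it = Ix[win].ravel(), Iy[win].ravel(), It[win].ravel()
    # Uniform weights unless the caller supplies one weight per window pixel.
    w = np.ones_like(ix) if weights is None else np.asarray(weights, float).ravel()
    # Weighted normal equations: (A^T W A) v = -A^T W It.
    A = np.stack([ix, iy], axis=1)
    AtWA = A.T @ (w[:, None] * A)
    AtWb = -A.T @ (w * it)
    # lstsq returns the minimum-norm solution if the window has too little
    # texture to constrain both components (the aperture problem).
    vx, vy = np.linalg.lstsq(AtWA, AtWb, rcond=None)[0]
    return vx, vy
```

Calling `window_flow` at every pixel gives a dense flow field. A 5x5 window corresponds to `half=2`.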
5. Multi-scale Optic Flow Estimation
Iteratively Gaussian filter and sub-sample by 2 to get a ‘pyramid’ of lower-resolution images
Project and interpolate the LS solution from the higher (coarser) level; it then serves as the initial estimate for the current level
Use the estimates to ‘pre-warp’ one frame so that the small-motion assumption is satisfied
LS solution at each level refines the previous estimates
Problem: error propagation; temporal smoothing is essential at higher levels
[Figure: image pyramid. Level 0 (original resolution): 4 pixels/frame; Level 1: 2 pixels/frame; Level 2: 1 pixel/frame]
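The pyramid construction can be sketched as below. This is an illustrative example under stated assumptions: the 5-tap binomial kernel stands in for the Gaussian filter (the slides do not give the kernel), and the names `smooth`, `build_pyramid` and `flow_at_level` are hypothetical.

```python
import numpy as np

def smooth(img):
    """Separable 5-tap binomial filter, a common approximation to a Gaussian."""
    k = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0
    padded = np.pad(img.astype(float), 2, mode="edge")
    # Filter along the horizontal axis, then along the vertical axis.
    rows = sum(k[i] * padded[:, i:i + img.shape[1]] for i in range(5))
    return sum(k[i] * rows[i:i + img.shape[0], :] for i in range(5))

def build_pyramid(img, levels=3):
    """Return [level 0, level 1, ...]; each level is half the size of the last."""
    pyramid = [img.astype(float)]
    for _ in range(levels - 1):
        pyramid.append(smooth(pyramid[-1])[::2, ::2])
    return pyramid

def flow_at_level(flow_level0, level):
    """Displacement (pixels/frame) as seen at a coarser pyramid level."""
    return flow_level0 / (2 ** level)
```

For example, `flow_at_level(4.0, 2)` gives 1.0: a motion of 4 pixels/frame at the original resolution appears as 1 pixel/frame at level 2, where the small-flow assumption holds.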
7. Results:
Optical flow field estimation
Smoothing of motion vectors across motion (object) boundaries, due to the
smoothness constraint (5x5 window) added to solve the optic flow equation
Further exacerbated by the multi-scale approach
Occlusions and other assumption violations (e.g. constant intensity)
lead to ‘noisy’ motion estimates
8. Segmentation
Extract regions of interest by thresholding magnitude of motion vectors
For each connected region, perform k-means clustering using the feature
vector:
[vx, vy, x, y, R, G, B]
(vx, vy: motion vectors; x, y: pixel coordinates; R, G, B: color intensities)
Color intensities give information on object boundaries to counter the
smoothing of motion vectors across edges in optic flow estimate
Remove small, isolated regions
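A minimal k-means sketch over the 7-element feature vectors follows. The per-feature normalisation, the random initialisation, and the function name `kmeans` are assumptions for illustration; the slides do not say how the features were scaled or how the clusters were initialised.

```python
import numpy as np

def kmeans(features, k, iters=20, seed=0):
    """Plain k-means on rows of [vx, vy, x, y, R, G, B]; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(features, dtype=float)
    # Normalise each feature so that flow, position and colour are comparable
    # (an assumption; centers are returned in this normalised space).
    scale = x.std(axis=0)
    scale[scale == 0] = 1.0
    z = (x - x.mean(axis=0)) / scale
    # Initialise the centers from k distinct data points.
    centers = z[rng.choice(len(z), size=k, replace=False)]
    for _ in range(iters):
        # Assign every point to its nearest center (squared Euclidean distance).
        dist = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Move each non-empty cluster's center to the mean of its members.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = z[labels == j].mean(axis=0)
    return labels, centers
```

In use, each connected region would contribute one row per pixel, and the returned `labels` give the segment each pixel belongs to.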
9. Segmentation Results
Simple translational motion model adequate
Camera motion
Unable to segment car in background
2-pixel border at level 2 of image pyramid (5x5 neighbourhood window)
translates to an 8-pixel border region at full resolution (2 pixels x 4)
10. Segmentation Results
Unsatisfactory segmentation when optic flow estimate is noisy
Further work on
Adding temporal continuity constraint for objects
Improving optic flow estimation e.g. Total Least Squares
Assess reliability of each motion vector estimate and incorporate into
segmentation
11. Gaussian Background Mixture Model
Per-pixel model
Each pixel is modeled as a weighted sum of K Gaussians, K = 3 to 5
The weights reflect how frequently each Gaussian is identified as part of
the background
Model updated adaptively with a learning rate and each new observation
$$P(X_t) = \sum_{k=1}^{K} w_{k,t}\, N(X_t;\ \mu_{k,t},\ \Sigma_{k,t})$$
$$X_t = [X_t^r,\ X_t^g,\ X_t^b]^T, \quad \mu_{k,t} = [\mu_{k,t}^r,\ \mu_{k,t}^g,\ \mu_{k,t}^b]^T, \quad \Sigma_{k,t} = \sigma_{k,t}^2 I$$
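To make the per-pixel model concrete, the sketch below evaluates a mixture of K Gaussians at one RGB observation. It assumes an isotropic covariance (variance times the identity) for each Gaussian, which is a common simplification for this kind of model. The function names are hypothetical.

```python
import math

def gaussian_density(x, mean, var):
    """Density of an isotropic Gaussian N(x; mean, var * I) in len(x) dimensions."""
    d = len(x)
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    norm = (2.0 * math.pi * var) ** (d / 2.0)
    return math.exp(-0.5 * sq_dist / var) / norm

def mixture_density(x, weights, means, variances):
    """Weighted sum of K Gaussian densities for one pixel observation x."""
    return sum(w * gaussian_density(x, m, v)
               for w, m, v in zip(weights, means, variances))
```

Here `x` would be the [R, G, B] value of one pixel in the current frame, and `weights`, `means` and `variances` hold the K components of that pixel's model.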
12. Segmentation Algorithm
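Matching Criterion
X_t matches Gaussian k when $\|X_t - \mu_{k,t}\|^2$ is within a threshold multiple of $\sigma_{k,t}^2$
Gaussians are ranked by $w_k / \sigma_k$
If no match found: pixel is foreground
If match found: background is the average of the high-ranking Gaussians;
foreground is the average of the low-ranking Gaussians
Update Formula
Update weights ($M_k$ = 1 for the matched Gaussian, 0 otherwise; $\alpha$: learning rate):
$$w_k \leftarrow (1 - \alpha)\, w_k + \alpha\, M_k$$
Update Gaussian:
Match found (matched Gaussian m, learning rate $\rho$):
$$\mu_{m,t} = (1 - \rho)\, \mu_{m,t-1} + \rho\, X_t$$
$$\sigma_{m,t}^2 = (1 - \rho)\, \sigma_{m,t-1}^2 + \rho\, (X_t - \mu_{m,t})^T (X_t - \mu_{m,t})$$
No match found:
Replace the least probable Gaussian with the new observation.
[Figure: block diagram. New observation → matching and model updating → background / foreground]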
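One step of the matching and update procedure for a single pixel might look like the sketch below. The numeric defaults (`alpha`, `match_thresh`, `init_var`, `init_w`) are assumed values chosen for illustration, and using the same rate for the weights and for the Gaussian parameters is a simplification; none of these come from the slides.

```python
import numpy as np

def update_pixel(x, w, mu, var, alpha=0.01, match_thresh=2.5,
                 init_var=900.0, init_w=0.05):
    """One online update of a K-Gaussian model for a single pixel.

    x: (3,) colour; w: (K,) weights; mu: (K, 3) means; var: (K,) variances.
    The float arrays w, mu, var are updated in place.
    Returns the index of the matched Gaussian, or -1 (pixel is foreground).
    """
    dist2 = ((x - mu) ** 2).sum(axis=1)
    candidates = np.flatnonzero(dist2 < (match_thresh ** 2) * var)
    if candidates.size:
        # Among matching Gaussians, take the highest ranked (largest w / sigma).
        m = candidates[np.argmax(w[candidates] / np.sqrt(var[candidates]))]
        matched = np.zeros_like(w)
        matched[m] = 1.0
        w[:] = (1.0 - alpha) * w + alpha * matched
        rho = alpha  # assumed: same learning rate for the Gaussian parameters
        mu[m] = (1.0 - rho) * mu[m] + rho * x
        var[m] = (1.0 - rho) * var[m] + rho * ((x - mu[m]) ** 2).sum()
    else:
        m = -1
        # No match: decay all weights, then replace the least probable
        # Gaussian (lowest w / sigma) with one centred on the observation.
        worst = np.argmin(w / np.sqrt(var))
        w[:] = (1.0 - alpha) * w
        mu[worst], var[worst], w[worst] = x, init_var, init_w
    w /= w.sum()  # keep the weights normalised
    return int(m)
```

Running this for every pixel of every frame keeps the model adaptive; a return value of -1 marks the pixel as foreground for that frame.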
13. Segmentation Result 1
• Background: “disappearing” electrical pole, blurring in the trees
• lane marks appear in both foreground/background
14. Segmentation Result 2
Cleaner background:
beginning of original sequence is purely background, so background
model was built faster.
16. Parameters matter
The learning rate affects how fast the
background model
incorporates new
observations
K affects how sharp the
detail regions appear
17. Artifacts: Global Motion
Constant small motion caused by hand-held camera
Blurring of background
Lane marks (vertical motion) and electrical pole (horizontal
motion)
18. Global Motion Compensation
We used Phase
Correlation Motion
Estimation
Block-based method
Computationally
inexpensive compared
to block matching
22. Mixture model fails when …
Constant repetitive
motion (jittering)
High contrast between
neighborhood values
(edge regions)
The object would
appear in both
foreground and
background
23. Phase Correlation Motion Estimation
Use block-based Phase
Correlation Function
(PCF) to estimate
translation vectors.
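With $\psi_1$, $\psi_2$ the two image blocks, $\Psi_1$, $\Psi_2$ their Fourier transforms, and $d$ the translation vector:
$$\psi_1(x) = \psi_2(x + d) \;\Longleftrightarrow\; \Psi_1(f) = \Psi_2(f)\, e^{j 2\pi d^T f}$$
$$PCF(x) = \mathcal{F}^{-1}\left\{ \frac{\Psi_1(f)\, \Psi_2^*(f)}{|\Psi_1(f)\, \Psi_2^*(f)|} \right\} = \mathcal{F}^{-1}\left\{ e^{j 2\pi d^T f} \right\} = \delta(x + d)$$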
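A discrete version of block-based phase correlation can be sketched with NumPy's FFT as below. The function name `phase_correlation` and the small `eps` guard are assumptions for the example, and the sign convention is the code's own: the function returns the circular shift that maps the first block onto the second.

```python
import numpy as np

def phase_correlation(block1, block2, eps=1e-12):
    """Return the integer shift (dy, dx) with block2 ~ np.roll(block1, (dy, dx))."""
    F1 = np.fft.fft2(block1)
    F2 = np.fft.fft2(block2)
    cross = np.conj(F1) * F2
    # Keep only the phase of the cross spectrum; eps avoids division by zero.
    pcf = np.fft.ifft2(cross / (np.abs(cross) + eps)).real
    peak = np.unravel_index(np.argmax(pcf), pcf.shape)
    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, pcf.shape))
```

The cost is dominated by three FFTs per block, which is why the approach is cheap relative to an exhaustive block-matching search. The estimated shift can then be used to realign each frame before it is fed to the background model.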
25. Our Experiment
Obtain test data
We shot our own test sequences at the intersection of
Page Mill Rd. and I-280.
Only translational motions included in the sequences
Segmentation
Tun-Yu experimented with the Gaussian mixture model
Wilson experimented with motion segmentation