1. The document proposes a fast structure from motion method for planar image sequences based on parallel projection and optical flow.
2. It formulates the problem using a brightness consistency equation relating pixel intensities in successive images under parallel projection.
3. Depth is estimated by solving an optimization problem that minimizes total variation of the depth map while maximizing photo-consistency between optical flow warped images, formulated in the discrete domain.
Fast Structure From Motion in Planar Image Sequences
1. Fast Structure from Motion for
Planar Image Sequences
Andreas Weishaupt, Luigi Bagnato, Pierre Vandergheynst
ÉCOLE POLYTECHNIQUE
FÉDÉRALE DE LAUSANNE
2. 2
Depth Map
3. 3
Motion
Depth Map
6. 6
Motivations
Cinema 3D
Philips 2D+Depth
Autonomous Navigation (SLAM)
3D scanning/modeling
Target:
- Real-time performance
- Good accuracy
11. 7
Problem Formulation
We consider only 2 consecutive frames I0 and I1
Brightness Consistency Equation (BCE)
$I_1(x + u) - I_0(x) = 0$
f: focal length
t: camera translation
d: distance/depth
Ω: camera rotation
u: optical flow
If we assume that the motion between frames is small
Linearization
$I_1(x) + \nabla I_1(x)^T u - I_0(x) \approx 0$
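A small numerical check can make the small-motion assumption concrete. The sketch below is only an illustration on a 1D signal: the sine "image", the shift `u_true` and the grid are assumptions chosen for the example, not data from the slides.

```python
import numpy as np

# Hypothetical 1D "image": I1(x) = sin(x), and I0(x) = I1(x + u) so the BCE holds exactly.
x = np.linspace(0.0, 2.0 * np.pi, 200)
u_true = 0.01                      # assumed small displacement
I1 = np.sin(x)
I0 = np.sin(x + u_true)
dI1 = np.cos(x)                    # analytic derivative of I1

# Linearized BCE residual: I1(x) + I1'(x) u - I0(x)
residual = I1 + dI1 * u_true - I0

# Residual when the motion is ignored (u = 0), for comparison
residual_no_motion = I1 - I0
```

The linearized residual is of second order in the displacement, which is why the approximation is only used for small motion between frames.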
12. 8
Problem Formulation - Projection Model
They rely on finding pairs of corresponding points in successive images. This has the following consequences:
• The final result depends on the quality of the found correspondence. If the match is not exact the reconstruction will not be accurate.
• Finding correspondences is a computationally expensive task: dense reconstruction cannot be performed in real-time.
• For real-time reconstruction, the recovery has to be limited to some few feature points. Often, tracking of the found feature points is employed to reduce additional computation cost. Tracking introduces a second source of error for 3D reconstruction.
Another class of recent methods to obtain dense depth maps is based on the fusion of sparse depth maps by image registration techniques, e.g. [8]. Those have the advantage that traditional structure from motion systems can be employed. However, in order to provide accurate results, a large number of depth maps has to be input for such methods.
2. MOTION IN PLANAR IMAGE SEQUENCES
Figure 1: pinhole camera model and rigid camera motion.
Figure 2: side view with the projections of camera motion.
The optical flow u is defined as the apparent motion of the brightness pattern between two images. We model the camera movement during acquisition of two successive frames by the rigid 3D translation $t = (t_x, t_y, t_z)^T$. The pinhole camera model is shown in Figure 1. We denote $p = d(r) e_r$, where $r = (x, y, f)$ is a point on the sensor plane, $f$ the focal distance and $d(r)$ the distance or depth from the optical center to a point in the scene.
Z: depth map = inverse of d, $Z(r) = 1/d(r)$
1. Parallel projection
The central projection on the object plane parallel to the sensor plane is given by:
$t_p = \frac{r_z}{\|r\| Z(r)} \cdot \frac{r + \|r\| Z(r)\, t}{r_z + \|r\| Z(r)\, t_z} - \frac{r}{\|r\| Z(r)}$ (2)
Let us define Eq. 2 as the parallel projection.
2. Optical flow
The optical flow can be approximated by the following projection on the sensor plane:
$u = r_z \cdot \frac{r + \|r\| Z(r)\, t}{r_z + \|r\| Z(r)\, t_z} - r$ (3)
Eq. 3 shows the nonlinear dependency of the estimated optical flow on the depth map Z(r) as well as on the translation $t_z$ perpendicular to the sensor plane. In this nonlinear form, it is difficult to include the projection in a variational framework. Nevertheless, combining Eqs. 2 and 3 we find a linearized relationship between the parallel projection and the optical flow:
Linearization
$u = \|r\| Z(r)\, t_p$ (4)
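The relation between the parallel projection, the optical flow and the linearized form (Eqs. 2 to 4) can be checked numerically. The following is a minimal sketch, assuming that the factor written next to Z(r) in the slides is the norm ‖r‖; the function names and the sample values of `r`, `Z` and `t` are hypothetical and serve only as an illustration.

```python
import numpy as np

def parallel_projection(r, Z, t):
    """Eq. 2: parallel projection t_p of the translation t for sensor point r."""
    r, t = np.asarray(r, float), np.asarray(t, float)
    scale = np.linalg.norm(r) * Z            # ||r|| Z(r)
    return (r[2] / scale) * (r + scale * t) / (r[2] + scale * t[2]) - r / scale

def optical_flow(r, Z, t):
    """Eq. 3: optical flow u on the sensor plane."""
    r, t = np.asarray(r, float), np.asarray(t, float)
    scale = np.linalg.norm(r) * Z
    return r[2] * (r + scale * t) / (r[2] + scale * t[2]) - r

# Hypothetical values chosen only for illustration.
r = np.array([0.2, -0.1, 1.0])     # sensor point (x, y, f) with f = 1
Z = 0.5                            # inverse depth, so d = 2
t = np.array([0.05, 0.02, 0.1])    # camera translation (t_x, t_y, t_z)

u = optical_flow(r, Z, t)
t_p = parallel_projection(r, Z, t)
flow_from_projection = np.linalg.norm(r) * Z * t_p   # Eq. 4
```

With these definitions the flow from Eq. 3 and the product in Eq. 4 agree, and for a translation parallel to the sensor plane (t_z = 0) the parallel projection reduces to t itself.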
13. 9
3. TV-L1 DEPTH FROM MOTION
Problem formulation
BC Equation: $I_1(x) + \nabla I_1(x)^T u - I_0(x) \approx 0$
Projection Model: $u = \|r\| Z(r)\, t_p$ (4)
We assume for the moment that we know the camera translation parameters t for two successive frames I0 and I1. Furthermore, we assume that the brightness does not change between those images. Using the definition of optical flow and the projection in Eq. 4, we can express the image residual ρ(Z) as in [6]:
Linearization with respect to a known $u_0$
$\rho(Z) = I_1(x + u_0) + \nabla I_1^T (\|r\| Z\, t_p - u_0) - I_0$ (5)
Data Term - bilinear in Z and $t_p$
If we know the motion t: We cast the depth estimation in a TV-L1 optimization problem
$Z^* = \arg\min_Z \sum_{x \in D} |\nabla Z| + \lambda \sum_{x \in D} |\rho(Z, I_0, I_1)|$ (6)
If we have an estimate of Z: We estimate ego-motion by least squares
$t^* = \arg\min_t \sum_{x \in D} \left( I_1 - I_0 + \nabla I_1^T (\|r\| Z\, t_p - u_0) \right)^2$
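To make the two terms of the TV-L1 problem concrete, the residual of Eq. 5 and the energy of Eq. 6 can be evaluated on arrays. This is a sketch under stated assumptions: the forward-difference, isotropic discretisation of |∇Z|, the array layout `(2, H, W)` for vector fields, and all function and variable names are choices made for this example, not taken from the slides.

```python
import numpy as np

def image_residual(I0, I1_warped, grad_I1, norm_r, Z, t_p, u0):
    """Eq. 5 per pixel; grad_I1, t_p and u0 are stacked as (2, H, W) arrays."""
    flow_update = norm_r * Z * t_p - u0          # ||r|| Z t_p - u_0
    return I1_warped + np.sum(grad_I1 * flow_update, axis=0) - I0

def tv_l1_energy(Z, rho, lam):
    """Eq. 6 with a forward-difference, isotropic discretisation of |grad Z|."""
    dx = np.diff(Z, axis=1, append=Z[:, -1:])    # zero difference at the border
    dy = np.diff(Z, axis=0, append=Z[-1:, :])
    return np.sum(np.sqrt(dx**2 + dy**2)) + lam * np.sum(np.abs(rho))

# Toy data: identical frames and a depth map with one vertical edge.
H, W = 4, 4
Z = np.zeros((H, W))
Z[:, 2:] = 1.0
I0 = np.ones((H, W))
I1 = np.ones((H, W))
zeros2 = np.zeros((2, H, W))
rho = image_residual(I0, I1, zeros2, np.ones((H, W)), Z, zeros2, zeros2)
energy = tv_l1_energy(Z, rho, lam=1.0)
```

In this toy case the data term vanishes and the energy is the total variation of the step edge alone, which shows how the regularizer charges for depth discontinuities by their height and length rather than by their sharpness.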
14.
Camera Ego-Motion Estimation
4. EGO-MOTION ESTIMATION
Let us assume now that we have an estimate of the depth map Z. Given two successive images I0 and I1 we can recover the camera translation parameters by optimizing the L2 norm of the image residual with respect to t:
$t^* = \arg\min_t \sum_{x \in D} \left( I_1 - I_0 + \nabla I_1^T (\|r\| Z\, t_p - u_0) \right)^2$
Setting the derivative to zero (with $u_0 = 0$) gives
$\sum_{x \in D} \left( I_1 - I_0 + \|r\| Z\, t_p^T \nabla I_1 \right) \|r\| Z\, \nabla I_1 = 0$ (12)
Special case of camera movement parallel to the sensor plane, $t = (t_x, t_y, 0)^T$: linear system of equations
In the special case of camera movement parallel to the sensor plane, solving Eq. 12 results in the linear system $A(x)\, b = c(x)$ with
$A(x) = \begin{pmatrix} \sum_{x \in D} \|r\|^2 Z^2 \left(\frac{\partial I_1}{\partial x}\right)^2 & \sum_{x \in D} \|r\|^2 Z^2 \frac{\partial I_1}{\partial x} \frac{\partial I_1}{\partial y} \\ \sum_{x \in D} \|r\|^2 Z^2 \frac{\partial I_1}{\partial x} \frac{\partial I_1}{\partial y} & \sum_{x \in D} \|r\|^2 Z^2 \left(\frac{\partial I_1}{\partial y}\right)^2 \end{pmatrix}$, $b = (t_x, t_y)^T$
and
$c(x) = \begin{pmatrix} -\sum_{x \in D} \|r\| Z \frac{\partial I_1}{\partial x} (I_1 - I_0) \\ -\sum_{x \in D} \|r\| Z \frac{\partial I_1}{\partial y} (I_1 - I_0) \end{pmatrix}$.
General case $t = (t_x, t_y, t_z)^T$: gradient descent
For general camera motion, we can solve Eq. 12 by iterative methods, e.g. Levenberg-Marquardt or gradient descent: $x_{n+1} = x_n + \gamma \nabla E(x_n)$, where x contains the three translation parameters and E is the energy of the image residual,
$\frac{\partial E}{\partial x_i} = \sum_{x \in D} \left( I_1 - I_0 + \nabla I_1^T u \right) \nabla I_1^T \frac{\partial u}{\partial x_i}$.
The partial derivatives of u with respect to the motion parameters are given by the Jacobian matrix
$J_u = \begin{pmatrix} \frac{r_z \|r\| Z}{r_z + \|r\| Z t_z} & 0 & -\frac{r_z \|r\| Z\, (r_x + \|r\| Z t_x)}{(r_z + \|r\| Z t_z)^2} \\ 0 & \frac{r_z \|r\| Z}{r_z + \|r\| Z t_z} & -\frac{r_z \|r\| Z\, (r_y + \|r\| Z t_y)}{(r_z + \|r\| Z t_z)^2} \end{pmatrix}$.
The depth from motion estimation described in Section 3 and the ego-motion estimation described in Section 4 can be combined. Since both parts rely on each other, it is very likely that we can combine them by performing alternating depth and ego-motion estimation. We find that it is best to include the alternation scheme in the multi-scale approach:
1. At the coarsest resolution level L we initialize ${}^{L}t$ by zero and ${}^{L}Z$ by some small constant. We can first solve for ${}^{L}Z$ as explained in Section 3. Since the ego-motion parameters are zero the estimated depth map will be very flat.
2. With the flat depth map as input we estimate the motion parameters according to Section 4.
3. Given the estimated motion parameters ${}^{k+1}t$ and the depth map ${}^{k+1}Z$, we first estimate the optical flow $u_0 = \|r\| Z(r)\, t_p$, then we compute the depth map at level k, ${}^{k}Z$.
4. From the refined depth map ${}^{k}Z$, we compute the motion parameters ${}^{k}t$.
5. Steps 3 and 4 are repeated until the finest resolution is reached and the final depth map ${}^{0}Z$ is obtained.
6. RESULTS
In order to verify our approach, we use synthetic images of size 512 × 512 and ground truth depth maps generated by ray-tracing of a 3D model of a living room. We have generated multiple sequences for various types of camera translation, i.e. for movement parallel and perpendicular to the image plane as well as for linear combinations of both. The purpose is to evaluate first ego-motion estimation and depth from motion separately.
We run the ego-motion estimation with ground truth depth maps on the different sequences and we obtain the translation vector estimates as listed in Table 1. For simplicity we only show the mean and standard deviation of the normalized vectors.
We evaluate the depth from motion part by using the ground truth translation vectors as inputs. We normalize the input images which is convenient for comparing the used parameters. In our experiments, we use 5 levels of resolution
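For the special case of motion parallel to the sensor plane, the least-squares estimate reduces to a 2 × 2 linear solve. The sketch below assembles A and c from image gradients and solves for b = (t_x, t_y). The function name, the use of `np.gradient` for the image derivatives, and the synthetic test data are assumptions made for this illustration, not the presenters' implementation.

```python
import numpy as np

def estimate_inplane_translation(I0, I1, Z, norm_r):
    """Solve A b = c for b = (t_x, t_y), assuming t_z = 0 and u_0 = 0."""
    Iy, Ix = np.gradient(I1)            # derivatives along rows (y) and columns (x)
    w = norm_r * Z                      # ||r|| Z per pixel
    gx, gy = w * Ix, w * Iy
    A = np.array([[np.sum(gx * gx), np.sum(gx * gy)],
                  [np.sum(gx * gy), np.sum(gy * gy)]])
    c = -np.array([np.sum(gx * (I1 - I0)), np.sum(gy * (I1 - I0))])
    return np.linalg.solve(A, c)

# Synthetic check: build I0 so that the linearized residual is exactly zero for t_true.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:32, 0:32]
I1 = np.sin(0.3 * xx) + np.cos(0.2 * yy)
Z = 0.5 + 0.1 * rng.random((32, 32))    # hypothetical inverse depth map
norm_r = np.ones((32, 32))              # hypothetical ||r||, constant for simplicity
t_true = np.array([0.4, -0.2])
Iy, Ix = np.gradient(I1)
I0 = I1 + norm_r * Z * (Ix * t_true[0] + Iy * t_true[1])

t_est = estimate_inplane_translation(I0, I1, Z, norm_r)
```

Because the synthetic I0 satisfies the linearized model exactly, the solve recovers `t_true` up to rounding. On real images the estimate is only as good as the small-motion approximation, which is one reason the slides embed it in a multi-scale scheme.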
15.
Depth Estimation - TV-L1
$\rho(Z) = I_1(x + u_0) + \nabla I_1^T (\|r\| Z\, t_p - u_0) - I_0$ (5)
A depth map Z = Z(r) can be obtained by solving the following optimization problem [2]:
$Z^* = \arg\min_Z \sum_{x \in D} |\nabla Z| + \lambda \sum_{x \in D} |\rho(Z, I_0, I_1)|$ (6)
where D is the discrete domain of pixels and x their position on the image. The left term in Eq. 6 represents the regularization term. Here we set it to the TV norm of Z which imposes a sparseness constraint on Z and acts edge-preserving. The right term is the data term which we set to the image residual as defined in Eq. 5. We have chosen the robust L1 norm as it has some advantages when compared to the usually employed L2 norm [9].
18.
Depth Estimation - TV-L1
Functional Splitting
Eq. 6 is not a strictly convex optimization problem and thus hard to solve. From [2] and [9] we know that a convex relaxation can be formulated:
$Z^* = \arg\min_Z \sum_{x \in D} |\nabla Z| + \frac{1}{2\theta} \sum_{x \in D} (V - Z)^2 + \lambda \sum_{x \in D} |\rho(V)|$, (7)
where V is a close approximation of Z and for θ → 0 we have V → Z. We solve Eq. 7 using an alternative two-step iteration scheme:
1. For fixed Z, solve for V:
$V^* = \arg\min_V \sum_{x \in D} \frac{1}{2\theta} (V - Z)^2 + \lambda \sum_{x \in D} |\rho(V)|$. (8)
2. For fixed V, solve for Z:
$Z^* = \arg\min_Z \sum_{x \in D} |\nabla Z| + \frac{1}{2\theta} \sum_{x \in D} (V - Z)^2$. (9)
Eq. 8 can be solved by the following soft-thresholding:
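The thresholding step for fixed Z can be sketched per pixel in NumPy. Here `g` stands for the scalar ‖r‖ ∇I1ᵀ t_p, i.e. the derivative of the residual with respect to Z. The function name, the guard against division by zero and the sample numbers are assumptions for this example.

```python
import numpy as np

def threshold_step(Z, rho, g, lam, theta, eps=1e-12):
    """Pointwise solution of Eq. 8: returns V given Z, the residual rho(Z) and g."""
    bound = lam * theta * g**2
    safe_g = np.where(np.abs(g) > eps, g, 1.0)       # avoid dividing by zero
    step = np.where(rho < -bound, lam * theta * g,
                    np.where(rho > bound, -lam * theta * g, -rho / safe_g))
    return Z + step

# Three pixels, one for each branch of the thresholding rule.
Z = np.zeros(3)
rho = np.array([-1.0, 1.0, 0.2])
g = np.full(3, 2.0)
V = threshold_step(Z, rho, g, lam=1.0, theta=0.1)
```

With λθg² = 0.4, the first two pixels take a fixed step of size λθ|g| towards a smaller residual, while the third has a small enough residual to be cancelled exactly.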
19.
Depth Estimation - TV-L1
Functional Splitting
Fixed Z:
Eq. 8 can be solved by the following soft-thresholding:
$V = Z + \begin{cases} \lambda\theta\, \|r\| \nabla I_1^T t_p & \text{if } \rho(Z) < -\lambda\theta\, (\|r\| \nabla I_1^T t_p)^2 \\ -\lambda\theta\, \|r\| \nabla I_1^T t_p & \text{if } \rho(Z) > \lambda\theta\, (\|r\| \nabla I_1^T t_p)^2 \\ -\rho(Z) / (\|r\| \nabla I_1^T t_p) & \text{if } |\rho(Z)| \le \lambda\theta\, (\|r\| \nabla I_1^T t_p)^2 \end{cases}$ (10)
Fixed V:
In order to solve Eq. 9, the dual formulation of the TV norm can be exploited. It is given by: $TV(Z) = \max\{ p \cdot \nabla Z : \|p\| \le 1 \}$. With the introduced dual variable p, Eq. 9 can be solved iteratively [3, 2] (Chambolle algorithm):
$p^{n+1} = \frac{p^n + \tau \nabla(\nabla \cdot p^n - V/\theta)}{1 + \tau\, |\nabla(\nabla \cdot p^n - V/\theta)|}$. (11)
In the discrete domain the stability and properties of the solution depend on the implementation of the differential operators. In Eq. 11, ∇ represents the discrete gradient operator and ∇· represents the discrete divergence operator as defined in [3]. From Eq. 11, the depth map can be recovered by $Z = V - \theta \nabla \cdot p$. Furthermore, the depth positivity constraint has to be imposed on the recovered depth map, i.e. if Z(r) < 0 we set Z(r) ← 0.
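The fixed-point iteration of Eq. 11 can be sketched with forward-difference gradients and the matching backward-difference divergence. This is a minimal illustration, not the presenters' code: the boundary handling follows the common convention in which the divergence is the negative adjoint of the gradient, and the step size `tau = 0.125` and the iteration count are assumptions.

```python
import numpy as np

def grad(u):
    """Forward differences with zero gradient at the last row/column."""
    gx, gy = np.zeros_like(u), np.zeros_like(u)
    gx[:, :-1] = u[:, 1:] - u[:, :-1]
    gy[:-1, :] = u[1:, :] - u[:-1, :]
    return gx, gy

def div(px, py):
    """Backward differences, chosen so that div is the negative adjoint of grad."""
    d = np.zeros_like(px)
    d[:, 0] = px[:, 0]
    d[:, 1:-1] = px[:, 1:-1] - px[:, :-2]
    d[:, -1] = -px[:, -2]
    d[0, :] += py[0, :]
    d[1:-1, :] += py[1:-1, :] - py[:-2, :]
    d[-1, :] += -py[-2, :]
    return d

def chambolle_step(V, theta, tau=0.125, n_iter=100):
    """Iterate Eq. 11, then recover Z = V - theta * div(p) and clip to Z >= 0."""
    px, py = np.zeros_like(V), np.zeros_like(V)
    for _ in range(n_iter):
        gx, gy = grad(div(px, py) - V / theta)
        norm = np.sqrt(gx**2 + gy**2)
        px = (px + tau * gx) / (1.0 + tau * norm)
        py = (py + tau * gy) / (1.0 + tau * norm)
    Z = V - theta * div(px, py)
    return np.maximum(Z, 0.0)            # depth positivity constraint

Z_flat = chambolle_step(np.full((8, 8), 2.0), theta=0.5)
```

A constant input has zero gradient, so the dual variable stays at zero and the input is returned unchanged. Alternating this step with the thresholding of Eq. 10 gives the two-step scheme described on the slides.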