My master's thesis extends the problem formulation of learnable compressive subsampling [1], which focuses on learning the best sampling operator in the Fourier domain, adapted to the spectral properties of a training set of images. I formulated the problem as a reconstruction from a finite number of sparse samples with a prior either learned from an external dataset or learned on the fly from the images to be reconstructed. In more detail, I developed two very different methods: one using multiband coding in the spectral domain and the other using a neural network.
The new methods can be applied to many fields of spectroscopy and Fourier optics, for example medical imaging (computed tomography, magnetic resonance spectroscopy) and astronomical imaging (the Square Kilometre Array), where the capability to reconstruct high-quality images in the pixel domain from a limited number of samples in the frequency domain is a key issue.
The proposed methods have been tested on diverse datasets covering facial images, medical data, and multi-band astronomical data, using the mean square error and SSIM, a perceptual measure, to assess reconstruction quality.
Finally, I explored possible applications in data acquisition systems such as computed tomography and radio astronomy. The obtained results demonstrate that the proposed methods have very promising potential for future research and extensions.
For this reason, the work was both presented at the poster session of the EUSIPCO 2018 conference in Rome and submitted as an EU patent application.
[1] L. Baldassarre, Y.-H. Li, J. Scarlett, B. Gözcü, I. Bogunovic, and V. Cevher, "Learning-based compressive subsampling," IEEE Journal of Selected Topics in Signal Processing, vol. 10, no. 4, pp. 809–822, 2016.
1. Injecting image priors into Learnable
Compressive Subsampling
Martino G. Ferrari
April 30, 2018
Supervisors: Prof. S. Voloshynovskiy
O. Taran
University of Geneva
Faculty of Science
2. Table of contents
1. Problem formulation
2. My approach
3. Results
4. Applicability
5. Conclusion
13. Compressive Subsampling

Classical acquisition

Encoder: b = \downarrow_M (h_{lp} * x)
Decoder: \hat{x} = h_{lp} * (\uparrow_R b)

Compressive Subsampling (CS)

Encoder: b = A x
Decoder: \hat{x} = \Delta(b)

where both x and \hat{x} have dimension n, while b has dimension m
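The two encoder variants above can be sketched numerically. This is an illustrative toy, not the thesis implementation: an ideal low-pass filter stands in for h_lp, and a few rows of the DFT matrix stand in for the sampling operator A.

```python
import numpy as np

def classical_encoder(x, M):
    """Classical acquisition sketch (assumed ideal low-pass h_lp):
    low-pass filter in the Fourier domain, then downsample by M."""
    n = len(x)
    X = np.fft.fft(x)
    keep = n // (2 * M)
    X[keep:n - keep] = 0                  # ideal low-pass: drop high frequencies
    return np.real(np.fft.ifft(X))[::M]   # keep every M-th sample

def cs_encoder(x, A):
    """Compressive subsampling: b = A x, with A an m-by-n sampling operator."""
    return A @ x
```

For a Fourier-domain A, each row of the operator is simply a row of the DFT matrix, so the m measurements are m selected frequency coefficients.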
14. CS Decoder

The typical Compressive Subsampling decoding is performed by solving an
optimisation problem, for example the LASSO [1] minimisation:

\hat{x} = \arg\min_{x} \| Ax - b \|_2^2 + \alpha \| x \|_1

Limitations:
• computationally hard
• over-fits the noise
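A minimal sketch of such a LASSO decoder, using plain ISTA (proximal gradient descent) rather than any particular solver from the thesis:

```python
import numpy as np

def ista(A, b, alpha=0.1, n_iter=2000):
    """Solve min_x ||Ax - b||_2^2 + alpha * ||x||_1 via ISTA."""
    L = 2 * np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = 2 * A.T @ (A @ x - b)        # gradient of the quadratic term
        z = x - grad / L                    # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - alpha / L, 0.0)  # soft threshold
    return x
```

The iterative nature of this decoder is exactly the "computationally hard" limitation noted above: every reconstruction requires hundreds of matrix products, in contrast to the linear decoders of the proposed methods.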
33. LSC I - Learning

[Figure: non-cumulated and cumulated spectral energy as a function of frequency]

1. transform dataset: \Psi X = \{\Psi x_1, \ldots, \Psi x_N\}
2. cumulative magnitude: c(k) = \frac{1}{N} \sum_{j=-J}^{k} \sum_{i=1}^{N} |\Psi x_i(j)|^2
3. sub-band splitting: \Psi X = \{\Psi X_1, \ldots, \Psi X_L\}
4. generation of the real C^l_{re} and imaginary C^l_{im} codebooks
5. sampling-pattern computation: \hat{\Omega}_l = \arg\max_{\Omega_l} (P_{\Omega_l} \sigma_{C^l_{re}} + P_{\Omega_l} \sigma_{C^l_{im}})
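The learning steps above can be sketched for 1-D signals. The helper below and its greedy top-energy selection are illustrative assumptions, not the thesis code: it computes the per-frequency average energy over a training set, the cumulative-magnitude curve, and a sampling pattern made of the m most energetic frequencies.

```python
import numpy as np

def learn_sampling_pattern(X, m):
    """Sketch of the LSC learning stage for a training set X of shape (N, n):
    returns the m highest-energy frequency indices and the normalised
    cumulative-magnitude curve."""
    F = np.fft.fft(X, axis=1)                  # Psi x_i for each training signal
    energy = np.mean(np.abs(F) ** 2, axis=0)   # per-frequency average energy
    c = np.cumsum(np.sort(energy)[::-1])       # cumulative magnitude curve
    c = c / c[-1]                              # normalise to [0, 1]
    omega = np.argsort(energy)[::-1][:m]       # pattern: top-m frequencies
    return omega, c
```

A codebook per sub-band (step 4) could then be learned, e.g. by clustering the real and imaginary parts of the sub-band coefficients over the training set.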
34. LSC II - Sampling and Encoding

The sub-band sampling is expressed as:

b^l = P_{\Omega_l} (\Psi x)^l

The code identification is computed in a single step for both the real and imaginary parts:

\hat{c}^l_{re}, \hat{c}^l_{im} = \arg\min_{c^l_{re}, c^l_{im}} \| b^l_{re} - P_{\Omega_l} c^l_{re} \|_2^2 + \| b^l_{im} - P_{\Omega_l} c^l_{im} \|_2^2 + \| |b^l| - P_{\Omega_l} |c^l_{re} + j c^l_{im}| \|_2^2 + \| \arg(b^l) - P_{\Omega_l} \arg(c^l_{re} + j c^l_{im}) \|_{2\pi}
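The single-step code identification can be illustrated by an exhaustive search over a small pair of codebooks. `identify_code` is a hypothetical helper, not the thesis implementation; it evaluates the four cost terms above on the sampled frequencies of one band.

```python
import numpy as np

def identify_code(b, omega, C_re, C_im):
    """Pick the pair of real/imaginary codewords minimising the combined
    real/imaginary/amplitude/phase cost on the sampled frequencies omega.
    C_re, C_im: (K, n) codebooks; b: measured complex coefficients at omega."""
    best, best_cost = (0, 0), np.inf
    for p, c_re in enumerate(C_re):
        for q, c_im in enumerate(C_im):
            c = c_re[omega] + 1j * c_im[omega]
            cost = (np.sum((b.real - c_re[omega]) ** 2)
                    + np.sum((b.imag - c_im[omega]) ** 2)
                    + np.sum((np.abs(b) - np.abs(c)) ** 2)
                    + np.sum(np.angle(b * np.conj(c)) ** 2))  # 2pi-wrapped phase
            if cost < best_cost:
                best, best_cost = (p, q), cost
    return best
```

Using `np.angle` of the product with the conjugate gives a phase distance that wraps correctly modulo 2π, matching the \| \cdot \|_{2\pi} term.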
35. LSC III - Decoding

The decoder is linear and can be expressed as:

\hat{x} = \Psi^* [ (P^T_{\Omega_1} b^1 + P^T_{\Omega^C_1} P_{\Omega^C_1} (\hat{c}^1_{re} + j \hat{c}^1_{im})), \ldots, (P^T_{\Omega_L} b^L + P^T_{\Omega^C_L} P_{\Omega^C_L} (\hat{c}^L_{re} + j \hat{c}^L_{im})) ]

where \Omega^C_l is the complementary set to \Omega_l
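A sketch of this linear decoder, simplified to a single band with the FFT standing in for Ψ; the function name and interface are assumptions. The measured frequencies are kept as-is, the complement is filled from the identified codeword, and the transform is inverted.

```python
import numpy as np

def lsc_decode(b, c_hat, omega, n):
    """Single-band sketch of the linear LSC decoder:
    b      -- measured coefficients at the sampled frequencies omega
    c_hat  -- identified complex codeword, length n
    Returns the reconstructed signal of length n."""
    mask = np.zeros(n, dtype=bool)
    mask[omega] = True
    spectrum = np.zeros(n, dtype=complex)
    spectrum[mask] = b               # P^T_Omega b: measured coefficients
    spectrum[~mask] = c_hat[~mask]   # complement filled from the codeword
    return np.fft.ifft(spectrum)     # Psi^*: inverse Fourier transform
```

Because the decoder is a mask-and-inverse-transform, it runs in O(n log n), in contrast to the iterative LASSO decoder.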
36. DIP - Overview

Implement an hourglass network [4] within the Deep Image Prior [5] framework.

The minimisation problem is as follows:

\hat{\theta} = \arg\min_{\theta} \| b - P_{\Omega} \Psi f_{\theta}(z) \|_2^2 + \beta \Omega_{\theta}(\theta)
37. DIP - Prior Injection

With prior injection, the minimisation problem becomes:

\hat{\theta} = \arg\min_{\theta} \| b - P_{\Omega} \Psi f_{\theta}(z) \|_2^2 + \alpha \| f_{\theta}(z) - c \|_2^2 + \beta \Omega_{\theta}(\theta)

while the final reconstruction is obtained through the following linear operation:

\hat{x} = \Psi^* (P^T_{\Omega} b + P^T_{\Omega^C} P_{\Omega^C} \Psi f_{\hat{\theta}}(z))
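The prior-injection objective (without the network itself and its regulariser Ω_θ) can be evaluated as below. `prior_injection_loss` is a hypothetical helper: f_θ(z) is passed in as a precomputed array, and the FFT stands in for Ψ.

```python
import numpy as np

def prior_injection_loss(f_out, b, omega, prior_c, alpha=0.5):
    """Data term of the prior-injection objective:
    f_out   -- current network output f_theta(z) in the pixel domain
    b       -- measured coefficients at the sampled frequencies omega
    prior_c -- injected prior c in the pixel domain
    Returns ||b - P_Omega Psi f_out||^2 + alpha * ||f_out - c||^2."""
    F = np.fft.fft(f_out)                               # Psi f_theta(z)
    fidelity = np.sum(np.abs(b - F[omega]) ** 2)        # sampled-frequency fit
    prior = alpha * np.sum((f_out - prior_c) ** 2)      # prior-injection term
    return fidelity + prior
```

In the actual framework this loss would drive gradient updates of the network weights θ; only the loss evaluation is sketched here.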
61. Summary
Strengths
• low-sampling-rate recovery
• few prior data needed
• fast encoder/decoder (LSC)
• robust to noise (LSC)
• no prior-training required (DIP)
Weaknesses
• some dependency on signal alignment
• complex decoder (DIP)
• prior-training required (LSC)
Future development
• investigate better sub-band splitting (LSC)
• improve coding models (LSC)
• investigate new prior model and cost function (DIP)
• combine deep models within LSC (DIP + LSC)
* This work has been submitted to EUSIPCO 2018
70. References i

[1] Robert Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B (Methodological), 58(1):267–288, 1996.

[2] Bubacarr Bah, Ali Sadeghian, and Volkan Cevher. Energy-aware adaptive bi-Lipschitz embeddings. CoRR, abs/1307.3457, 2013.

[3] Luca Baldassarre, Yen-Huan Li, Jonathan Scarlett, Baran Gözcü, Ilija Bogunovic, and Volkan Cevher. Learning-based compressive subsampling. IEEE Journal of Selected Topics in Signal Processing, 10(4):809–822, June 2016. arXiv:1510.06188.

71. References ii

[4] Alejandro Newell, Kaiyu Yang, and Jia Deng. Stacked hourglass networks for human pose estimation. CoRR, abs/1603.06937, 2016.

[5] Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky. Deep image prior. arXiv preprint arXiv:1711.10925, 2017.