2. Introduction
A tensor is a general name for a multi-dimensional array. With the growth of information sensing, demand for tensor data analysis is increasing substantially.
[Figure: examples of tensors from 1d to 5d: multi-channel time series (2d), multi-channel time-frequency signals (3d), MRI data for multiple subjects (4d), and multi-channel time-frequency signals for multiple mental tasks and subjects (5d).]
9. Tensor computation (6)
Matrix-matrix product:
(I × J) matrix · (J × K) matrix = (I × K) matrix
Tensor-matrix (mode-n) products:
(I × J × K) tensor ×1 (L × I) matrix = (L × J × K) tensor
(I × J × K) tensor ×2 (L × J) matrix = (I × L × K) tensor
(I × J × K) tensor ×3 (L × K) matrix = (I × J × L) tensor
[Figure: the mode-1 product computed via matricization: the (I × J × K) tensor is unfolded into an (I × JK) matrix, left-multiplied by the (L × I) matrix to give an (L × JK) matrix, and folded back into an (L × J × K) tensor.]
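To make the mode-n product concrete, here is a minimal NumPy sketch (the function name mode_n_product and the unfolding order are my own choices, not from the slides). It unfolds the tensor along the given mode, multiplies, and folds back, exactly as in the figure above.

```python
import numpy as np

def mode_n_product(tensor, matrix, mode):
    """Mode-n product: unfold along `mode`, multiply, fold back."""
    # (In x product of other dims) unfolding, with axis `mode` in front.
    unfolded = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
    # (L x In) @ (In x rest) = (L x rest), then fold back.
    rest = [s for i, s in enumerate(tensor.shape) if i != mode]
    result = (matrix @ unfolded).reshape([matrix.shape[0]] + rest)
    return np.moveaxis(result, 0, mode)

# (I x J x K) tensor x_1 (L x I) matrix = (L x J x K) tensor
X = np.random.randn(4, 5, 6)   # I=4, J=5, K=6
U = np.random.randn(3, 4)      # L=3, I=4
print(mode_n_product(X, U, 0).shape)   # -> (3, 5, 6)
```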
22. Tucker decomposition
Tucker decomposition, or higher-order singular value decomposition (HOSVD), is a mathematical decomposition model for tensors. It can be used for dimensionality reduction (compression), feature extraction (sparse / nonnegative / independent), completion (estimation of missing values), prediction (regression), and so on.
[Figure: matrix factorization Y ≈ A D B^T, with Y (I × J), A (I × R), D (R × R), and B^T (R × J); Tucker decomposition is a generalization of this matrix factorization.]
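As a minimal sketch of how a truncated HOSVD can be computed (my own illustration, not the authors' implementation): take the leading left singular vectors of each mode-n unfolding as the factor matrices, then project the data onto them to obtain the core. It reuses mode_n_product from the sketch above.

```python
import numpy as np

def unfold(tensor, mode):
    # Mode-n unfolding: axis `mode` in front, remaining axes flattened.
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def hosvd(Y, ranks):
    """Truncated HOSVD: factor matrices from the leading left singular
    vectors of each unfolding, core from projecting Y onto them."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(Y, mode), full_matrices=False)
        factors.append(U[:, :r])
    G = Y
    for mode, U in enumerate(factors):
        G = mode_n_product(G, U.T, mode)   # G = Y x1 A^T x2 B^T x3 C^T
    return G, factors
```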
23. Low-rank approximation of tensor
Low-rank approximation in matrix decomposition: e.g., a rank-R approximation.
What is a multilinear tensor rank (MT rank)? E.g., an (R1, R2, R3)-rank approximation of a 3d tensor.
The multilinear tensor rank is the size of the core tensor.
[Figure: rank-R approximation of a matrix, Y ≈ A D B^T with Y (I × J), A (I × R), D (R × R), B^T (R × J); and (R1, R2, R3)-rank approximation of a 3d tensor, Y ≈ G ×1 A ×2 B ×3 C with Y (I1 × I2 × I3), A (I1 × R1), B^T (R2 × I2), C (I3 × R3), and core tensor G (R1 × R2 × R3).]
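Continuing the sketches above, the (R1, R2, R3)-rank approximation is obtained by multiplying the core back by the factor matrices; the tensor sizes here are my own choice, matching the synthetic setting used later in the experiments.

```python
Y = np.random.randn(25, 50, 75)            # (I1 x I2 x I3)
G, (A, B, C) = hosvd(Y, (10, 20, 30))      # core size = MT rank
Y_hat = mode_n_product(mode_n_product(mode_n_product(G, A, 0), B, 1), C, 2)
print(G.shape)                                        # (10, 20, 30)
print(np.linalg.norm(Y - Y_hat) / np.linalg.norm(Y))  # relative error
```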
24. Compression & rank
Appropriate accuracy and compression ratio are both important for data compression, and the rank is a trade-off parameter between the two. The compression ratio changes linearly with the rank, but in many real problems the accuracy changes non-linearly. It is therefore important to estimate an appropriate MT rank for compression.
[Figure: a higher rank gives high accuracy but a low compression ratio; a lower rank gives low accuracy but a high compression ratio. Plots: compression ratio vs. rank (linear) and accuracy vs. rank (non-linear).]
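A small sketch of the storage side of this trade-off (my own illustration, not from the slides): counting the entries stored by a Tucker model relative to the raw tensor shows how the cost grows with the rank.

```python
import numpy as np

def tucker_storage_ratio(dims, ranks):
    """Entries stored by a Tucker model (core + factors) over raw entries."""
    core = np.prod(ranks)
    factors = sum(i * r for i, r in zip(dims, ranks))
    return (core + factors) / np.prod(dims)

for r in (5, 10, 20, 30):
    print(r, tucker_storage_ratio((25, 50, 75), (r, r, r)))
```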
25. Noise reduction & rank
We assume that the observed data are generated by a low-rank Tucker model plus additive noise.
Assumption: the generating model can be characterized as a Tucker model.
The rank is an important parameter for noise reduction.
[Figure: accuracy vs. rank for noisy data: a rank that is too low is insufficient to reconstruct the signal, while a rank that is too high over-fits the noise.]
26. Introduction to research on tensor rank estimation
Research results:
Tensor rank estimation using sparse Tucker decomposition:
T. Yokota, A. Cichocki. Multilinear tensor rank estimation via sparse Tucker decomposition, In Proceedings of SCIS&ISIS2014, pp. 478-483, 2014.
Tensor rank estimation using information criteria:
T. Yokota, N. Lee, and A. Cichocki. Robust Multilinear Tensor Rank Estimation Using Higher Order Singular Value Decomposition and Information Criteria, IEEE Transactions on Signal Processing, vol. 65, issue 5, pp. 1196-1206, 2017.
28. Proposed method & algorithm
Pruning Sparse Tucker Decomposition (PSTD):
L1-norm minimization of the core tensor
Error bound between the input tensor and the reconstructed tensor
Orthogonality constraint on the factor matrices
[Figure: algorithm schedule: each factor matrix is updated in turn by orthogonal least squares while the others are fixed, the core tensor is updated by LASSO (sparse coding), and a coefficient-based pruning step removes unused components; the factor matrices are orthogonal and the core is sparse.]
29. Sub-problem for U
The main problem is split into sub-problems. The sub-problem for U (orthogonal dictionary learning) is formulated via Lagrange's method with a Lagrange coefficient; the update rule computes the least-squares solution and then orthogonalizes it.
[Equations on the slide: the criterion and the update rule for U.]
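One standard way to realize a "least squares + orthogonalization" update (a sketch under my own assumptions, not necessarily the paper's exact rule) is the orthogonal Procrustes solution: the orthogonal matrix closest to the least-squares fit comes from an SVD.

```python
import numpy as np

def update_factor_orthogonal(Yn, Gn):
    """Solve min_U ||Yn - U @ Gn||_F subject to U^T U = I
    (orthogonal Procrustes): the SVD of Yn @ Gn^T gives the answer."""
    P, _, Qt = np.linalg.svd(Yn @ Gn.T, full_matrices=False)
    return P @ Qt   # nearest orthogonal matrix to the LS solution

# Yn: mode-n unfolding of the data; Gn: mode-n unfolding of the
# reconstruction by the core and the other (fixed) factors.
```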
30. Sub-problem for G
The original problem is written in tensor form and then, via Lagrange's method, in vector form (with vectorized Y and G), which gives a LASSO regression for sparse coding under the error bound.
We estimate the optimal λ corresponding to ε by binary search:
a large λ gives a sparse solution with a large error; a small λ gives a dense solution with a small error.
[Figure: the error is a non-linear, monotonically increasing function of λ.]
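A self-contained sketch of this step (the ISTA solver, the search bounds, and the iteration counts are my own choices): since the error grows monotonically with λ, a binary search can locate the λ whose LASSO solution meets the error bound ε.

```python
import numpy as np

def ista(A, y, lam, n_iter=500):
    """Plain ISTA for the LASSO: min_g 0.5*||y - A g||^2 + lam*||g||_1."""
    L = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of gradient
    g = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = g - A.T @ (A @ g - y) / L          # gradient step
        g = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return g

def lambda_by_binary_search(A, y, eps, lo=1e-8, hi=1e2, steps=30):
    """Find lam whose LASSO error matches eps (error increases with lam)."""
    for _ in range(steps):
        lam = np.sqrt(lo * hi)                 # geometric midpoint
        err = np.linalg.norm(y - A @ ista(A, y, lam))
        if err > eps:
            hi = lam                           # error too large: shrink lam
        else:
            lo = lam                           # within the bound: sparsify more
    return lo
```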
31. Pruning step
Once we have a sparse coefficient core tensor G, we detect redundant slices and prune the redundant slices and their dictionaries for all directions. A slice whose values are nearly zero implies that the corresponding dictionary is not used for representing the data (it was deleted by the sparse coding).
[Figure: each slice of G is unfolded and the sum of absolute values (L1-norm) is computed; slices with relatively large or small norms are kept, while slices with norms nearly zero are pruned, together with the corresponding dictionary atoms.]
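A minimal sketch of this pruning (the function name and the tolerance are my own): compute the L1-norm of every slice of G along each mode and drop the near-zero slices together with the matching dictionary columns.

```python
import numpy as np

def prune_core(G, factors, tol=1e-8):
    """Drop core slices (and matching factor columns) whose L1-norm
    is nearly zero, for every mode of the core tensor G."""
    for mode in range(G.ndim):
        slices = np.moveaxis(G, mode, 0).reshape(G.shape[mode], -1)
        keep = np.abs(slices).sum(axis=1) > tol    # slice L1-norms
        G = np.compress(keep, G, axis=mode)
        factors[mode] = factors[mode][:, keep]
    return G, factors
```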
32. Experiments: convergence
Data: synthetic data generated by a Tucker model plus Gaussian noise (SNR = 10 dB).
Generated core tensor: (10 × 20 × 30)
Generated factor matrices: (25 × 10), (50 × 20), (75 × 30)
Results of applying the PSTD algorithm:
Final objective value: 1.5e-2 ± 9.8e-5
MT rank: completely estimated
Sparsity of G: 42.9 ± 0.626 %
Decreased iterations: 99.9 %
[Figure: convergence behavior of the objective value.]
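For reference, generating this synthetic input could look like the following sketch (the seed and noise scaling are my own; it reuses mode_n_product from above).

```python
import numpy as np

rng = np.random.default_rng(0)
# Ground-truth Tucker model: core (10x20x30), factors (25x10), (50x20), (75x30).
G = rng.standard_normal((10, 20, 30))
A, B, C = (rng.standard_normal(s) for s in ((25, 10), (50, 20), (75, 30)))
Y0 = mode_n_product(mode_n_product(mode_n_product(G, A, 0), B, 1), C, 2)
# Additive Gaussian noise scaled to SNR = 10 dB.
noise = rng.standard_normal(Y0.shape)
noise *= np.linalg.norm(Y0) / (np.linalg.norm(noise) * 10 ** (10 / 20))
Y = Y0 + noise
```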
34. Experiments: image compression (1)
We applied PSTD to image compression. We varied
the SNR parameter = {25, 30, …, 45} for the error bound, and
the quantization parameter q over various values.
[Figure: compression pipeline: the (1024 × 1024) image is reshaped into an (8 × 8 × 16384) tensor and decomposed by PSTD; the coefficients are quantized (q), sorted, zero run-length coded, and Huffman coded, with the sorting indices and differences also Huffman coded. The bases of PSTD are compared with the bases of JPEG, whose DC coefficients are quantized (q), difference coded, and Huffman coded, and whose AC coefficients are zero run-length and Huffman coded.]
quantization(q) difference Huffman coding