Presented at the 2014 Spring Meeting of the Acoustical Society of Japan (domestic conference)
Daichi Kitamura, Hiroshi Saruwatari, Satoshi Nakamura, Kazunobu Kondo, Yu Takahashi, Hirokazu Kameoka, "Optimal divergence diversity for superresolution-based nonnegative matrix factorization," Proceedings of 2014 Spring Meeting of Acoustical Society of Japan, 3-2-9, pp.727-730, Tokyo, March 2014.
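9. Directional clustering [Araki, 2007], [Miyabe, 2009]
• Clusters using the amplitude differences between channels
• Binary masking in the spectrogram domain
• Problems
– Multiple sources in the same direction cannot be separated
– Binary masking introduces artificial distortion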
[Figure: directional clustering. The input stereo signal (L, R) is clustered into left, center, and right directions. Each time-frequency bin (frequency vs. time) is labelled L, C, or R, and the separated signal is the element-wise product of the mixture spectrogram and a binary mask.]
10. Hybrid method [Kitamura, 2013]
• A hybrid method that applies superresolution-based SNMF after directional clustering has been proposed
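[Figure: hybrid method. The stereo input (L, R) passes through directional clustering (spatial separation), followed by superresolution-based SNMF (spectral separation).]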
16. Proposed method: flow diagram
[Figure: flow of the proposed method on the time-frequency plane (frequency vs. time). Within superresolution-based SNMF, the rate of chasms is calculated and a yes/no decision selects either the KL-divergence-based cost function (KL) or the EUC-distance-based cost function (EUC). This decision block appears four times in the diagram.]
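The switching step in the flow diagram can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the per-frame definition of the chasm rate, the `threshold` value, and the rule "use EUC when the chasm rate exceeds the threshold, otherwise KL" are all assumptions made for the example, since the slide does not state them. Evaluating the cost only on the bins kept by the mask is also an assumption, and any regularization term of superresolution-based SNMF is omitted.

```python
import numpy as np

def chasm_rate(mask):
    """Per-frame fraction of bins removed by the binary mask (mask == 0).
    Rows are frequency bins, columns are time frames (assumed layout)."""
    return 1.0 - mask.mean(axis=0)

def euc_cost(y, x, mask):
    """Squared Euclidean distance per frame, evaluated on observed bins only."""
    return np.sum(mask * (y - x) ** 2, axis=0)

def kl_cost(y, x, mask, eps=1e-12):
    """Generalized KL divergence per frame, evaluated on observed bins only."""
    return np.sum(mask * (y * np.log((y + eps) / (x + eps)) - y + x), axis=0)

def switched_cost(y, x, mask, threshold=0.5):
    """Pick EUC or KL per frame from the chasm rate.
    The threshold and the direction of the rule are hypothetical."""
    use_euc = chasm_rate(mask) > threshold
    cost = np.where(use_euc, euc_cost(y, x, mask), kl_cost(y, x, mask))
    return cost, use_euc
```

Here `y` would be the magnitude spectrogram of the separated cluster, `x` the current NMF model, and `mask` the binary mask from directional clustering.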
The conventional hybrid method is a simple method that concatenates normal SNMF and directional clustering.
So, this method cannot reconstruct the lost components, the spectral chasms.
This proposed method, the red line, uses a fixed divergence. We already confirmed in the previous result that the divergence-switching method achieves a better result than this red line.
Directional clustering utilizes a clustering method, such as K-means clustering.
The feature used for the clustering is the amplitude difference between channels, namely, the direction of the sources.
From the clustering result, we can obtain a binary mask matrix.
So, the separation is achieved by the element-wise product of the input spectrogram and this mask.
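The clustering and masking steps described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the use of the arctangent of the right/left amplitude ratio as the directional feature, the evenly spaced centroid initialization, and the default of three clusters are choices made for the example, not details taken from the cited methods.

```python
import numpy as np

def directional_clustering_masks(spec_left, spec_right, n_clusters=3, n_iter=50):
    """Cluster time-frequency bins by inter-channel amplitude difference
    with a plain 1-D K-means and return one binary mask per cluster."""
    amp_l = np.abs(spec_left)
    amp_r = np.abs(spec_right)
    # Assumed feature: panning angle, 0 = fully left, pi/2 = fully right.
    feature = np.arctan2(amp_r, amp_l).ravel()
    # Assumed initialization: centroids spread evenly over the feature range.
    centroids = np.linspace(feature.min(), feature.max(), n_clusters)
    for _ in range(n_iter):
        labels = np.argmin(np.abs(feature[:, None] - centroids[None, :]), axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centroids[k] = feature[labels == k].mean()
    # Final assignment with the converged centroids.
    labels = np.argmin(np.abs(feature[:, None] - centroids[None, :]), axis=1)
    labels = labels.reshape(amp_l.shape)
    masks = [(labels == k).astype(float) for k in range(n_clusters)]
    return masks, centroids
```

With three clusters, `masks[1] * spec_left` and `masks[1] * spec_right` would give the center-direction signal. Because each bin is assigned to exactly one cluster, the bins given to the other clusters are zeroed out in this spectrogram.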
As another means of addressing multichannel signal separation, multichannel NMF has also been proposed by Ozerov and Sawada.
This method is a natural extension of NMF, and uses spectral and spatial cues.
But this unified method is a very difficult optimization problem mathematically, because many variables must be optimized with one cost function.
So, this method strongly depends on the initial values.
If the number of sources in the same direction as the target instrument increases, the separation performance of supervised NMF markedly degrades.
This is because several similar bases arise in both the target and the other instruments.
If the left and right sources are close to the center direction, the separation becomes difficult, because directional clustering cannot separate them well.
In addition, basis extrapolation also becomes difficult because the number of chasms in the separated cluster increases in this case.
In contrast, if theta becomes larger, the separation becomes easy.
This is the signal flow of the proposed hybrid method.
In our experiment, superresolution-based supervised NMF is applied only to the center direction because the target source is located in the center direction.
However, if the target source is located on the left or right side, we should apply this NMF to the direction that has the target source, whether or not there is another source in that direction.