[DL Reading Group] VOICEFILTER: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking
1. DEEP LEARNING JP
[DL Papers]
http://deeplearning.jp/
VOICEFILTER: Targeted Voice Separation by
Speaker-Conditioned Spectrogram Masking
Hiroshi Sekiguchi, Morikawa Lab
2. Bibliographic Information
• “VOICEFILTER: Targeted Voice Separation by Speaker-Conditioned
Spectrogram Masking” arXiv:1810.04826v3 [eess.AS] 27 Oct 2018
• Authors: Quan Wang (1), Hannah Muckenhirn (2), Kevin Wilson (1), Prashant
Sridhar (1), Zelin Wu (1), John Hershey (1), Rif A. Saurous (1), Ron J. Weiss (1), Ye
Jia (1), Ignacio Lopez Moreno (1)
(1) Google Inc., USA; (2) Idiap Research Institute, Switzerland
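• Reasons for choosing this paper
• Separation of overlapped speech is my research topic
• Reviews overlapped-speech separation for Google's smart speaker “Google Home”.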
11. Speaker Recognition Network
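• There are two types of speaker recognition
– Text-Dependent Speaker Verification (TD-SV):
the enrolled phrase (“OK Google”) = the test-time phrase (“OK Google”)
– Text-Independent Speaker Verification (TI-SV):
the enrolled phrases (“various words”, varying in both phonemes and word length) need not match the test-time phrase (“Hey
Google”)
→ This work uses the latter framework.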
• Related papers:
– Generalized End-to-End Loss for Speaker Verification,
Li Wan et al., Google
– Long Short-Term Memory Recurrent Neural Network Architectures for Large Scale
Acoustic Modeling,
Hasim Sak et al., Google, USA
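To make the text-independent setting concrete, here is a minimal sketch of how verification could work once each utterance has been mapped to a fixed-size speaker embedding. It is an illustration under assumptions, not the paper's implementation: the function names, the 0.7 threshold, and the hand-made three-dimensional vectors are all hypothetical, and the toy vectors merely stand in for the output of a trained speaker encoder network. Because the score depends only on the embeddings, the enrollment phrases and the test phrase do not have to contain the same words.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (raises on an all-zero vector)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def enroll(utterance_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Build a speaker profile by averaging normalized enrollment embeddings."""
    if not utterance_embeddings:
        raise ValueError("need at least one enrollment utterance")
    normalized = [l2_normalize(e) for e in utterance_embeddings]
    dim = len(normalized[0])
    mean = [sum(e[i] for e in normalized) / len(normalized) for i in range(dim)]
    return l2_normalize(mean)


def cosine_score(profile: Sequence[float], test_embedding: Sequence[float]) -> float:
    """Cosine similarity between a unit-length profile and a test embedding."""
    test = l2_normalize(test_embedding)
    return sum(p * t for p, t in zip(profile, test))


def verify(profile: Sequence[float], test_embedding: Sequence[float],
           threshold: float = 0.7) -> bool:
    """Accept the speaker if the score reaches the (hypothetical) threshold."""
    return cosine_score(profile, test_embedding) >= threshold


# Toy embeddings (assumed values): two enrollment utterances with different
# wording, one test utterance from the same speaker, one from another speaker.
profile = enroll([[1.0, 0.1, 0.0], [0.9, 0.0, 0.1]])
same_speaker = [1.0, 0.05, 0.05]
other_speaker = [0.0, 1.0, 0.2]

print(verify(profile, same_speaker))
print(verify(profile, other_speaker))
```

In practice the threshold would be tuned on held-out data, and the embeddings would come from a trained network rather than being written by hand.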