Progress in Super-Resolution with Deep Learning

A summary of deep-learning-based super-resolution methods

  1. Copyright © DeNA Co.,Ltd. All Rights Reserved. Progress in Super-Resolution with Deep Learning
  2. About the speaker: Hiroto Honda (@hirotomusiker). Manufacturer research lab → DeNA (since January 2017); visiting researcher at ETH Zurich CVL (2013-2014); CVPR NTIRE Workshop Program Committee; AI R&D engineer at DeNA. Current work: Object Detection (OSS: https://github.com/DeNA/Chainer_Mask_R-CNN). Previous work: Low-Level Vision, Computational, Sensor LSI
  3. Contents: (1) Super-resolution is easy to try; (2) Early SISR networks: SRCNN, ESPCN, VDSR; upsampling methods (deconv or pixel shuffle); (3) Baseline method, SRResNet: SRResNet, SRGAN, and EDSR; (4) Super-resolution and perception: how the restored result depends on the loss function; the perception-distortion tradeoff; (5) Summary
  4. What is super-resolution? Low-resolution image → (restoration) → high-resolution image
  5. Super-resolution is easy to try! Resize the original (HR) image to obtain the LR image, then train on the pair. No annotation is needed: it is a form of self-supervised learning.
  6. Progress of super-resolution: accuracy has improved year after year. Chart (2014-2017): PSNR* gain over bicubic [dB] on the Set5 dataset, x4: bicubic 0.0, A+ +1.86, SRCNN +2.06, ESPCN +2.48, VDSR +2.93, SRResNet +3.63, EDSR +4.20. PSNR data from: 5) (https://github.com/jbhuang0604/SelfExSR). * $\mathrm{PSNR} = 10 \log_{10}(255^2 / \mathrm{MSE})$ when the maximum pixel value is 255
  7. Training a super-resolution network: (1) crop a patch HR from the ground-truth image; (2) downsample the patch, LR = g(HR), e.g. with bicubic down-sampling; (3) assemble batches {LR}, {HR}; (4) train the network f with the loss MSE(HR, f(LR)). That's all!
  8. Non-deep methods: dictionary-based algorithms. Baseline: A+ (2014), http://www.vision.ee.ethz.ch/~timofter/publications/Timofte-ACCV-2014.pdf. The LR patch is expressed as a weighted sum of learned dictionary atoms (coefficients in the figure: 0, 0.8, 0.05), with the coefficients optimized; the HR patch is rebuilt from the learned dictionary with the same coefficients.
  9. Section: Early SISR networks: SRCNN, ESPCN, VDSR; upsampling methods (deconv or pixel shuffle)
  10. The first deep super-resolution network: SRCNN. The input is upscaled with bicubic interpolation first (x2 in the figure). Kernel sizes: 9-1-5, 9-3-5, or 9-5-5. Very simple, with low computational cost. from: 1)
  11. VDSR: a deeper SRCNN. 3x3 convolutions, 64 channels, depth D = 5 to 20. from: 3)
  12. Efficient sub-pixel CNN (ESPCN). Unlike SRCNN, it convolves the LR image directly, so it is efficient. Kernel sizes: 5-3-3. from: 2)
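  13. Differences between SRCNN / VDSR and ESPCN. SRCNN and VDSR upsample the input first (bicubic x2) and then convolve; ESPCN convolves at LR size and upsamples at the output (pixel shuffle x2, rearranging channels (ch) into height (h) and width (w)). Post-upsampling is more efficient, but it cannot handle non-integer scale factors such as x1.6.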
  14. Upscaling with a CNN: deconvolution or pixel shuffle? Deconvolution: the number of input pixels contributing to each output position is not uniform, so checkerboard patterns appear. https://distill.pub/2016/deconv-checkerboard/
  15. Upscaling with a CNN: deconvolution or pixel shuffle? What about resizing and then convolving? The checkerboard pattern disappears. Because resizing (a low-pass operation) may lose information, filling with nearest-neighbor interpolation is another option.
  16. Upscaling with a CNN: deconvolution or pixel shuffle? Sub-pixel convolution (a.k.a. PixelShuffle): the channel values at each position are tiled spatially, e.g. 9 channels -> 3x3 sub-pixels. It is not free of checkerboard noise. from: 2)
  17. Section: Baseline method, SRResNet: SRResNet, SRGAN, and EDSR
  18. SRResNet and SRGAN (Twitter, CVPR'17). Generator: 24 residual blocks, 64 channels, a skip connection, and two pixel shuffle x2 stages (channels rearranged into height and width). Three kinds of loss function: MSE loss, perceptual loss (trained VGG), and discriminator loss. The model trained with MSE only is called SRResNet. from: 4)
  19. SRResNet* and SRGAN: network details. Residual blocks with skip connections; pixel shuffle upsampling. from: 4)
  20. Enhanced Deep Super Resolution (EDSR, Seoul National University), specialized further for high accuracy. 32 residual blocks, 256 channels, a skip connection, two x2 upsampling stages, and an l1 loss. from: 5)
  21. PSNR and visual appearance. When PSNR is in the 20s of dB, a 1 dB difference clearly changes how the image looks. from: 5)
  22. Section: Super-resolution and perception: how the restored result depends on the loss function; the perception-distortion tradeoff
  23. Subjective evaluation and PSNR. PSNR by method, shown next to the original: bicubic 21.59 dB, SRResNet 25.53 dB, SRGAN 21.15 dB. from: 4)
  24. SRResNet and SRGAN: the loss makes this much difference. The figure compares results trained with different combinations of MSE loss, perceptual loss using VGG, and discriminator loss, and marks the result with the highest PSNR. from: 4)
  25. Three types of loss function: (1) l1/l2 loss, computed between the generated image and the ground truth; (2) perceptual loss, multi-scale feature matching between the generated image and the ground truth through VGG; (3) GAN loss, where a discriminator judges the generated image and the ground truth as real / fake. The axis runs from low distortion at (1) to good perception at (3).
  26. Perception-distortion tradeoff. No method can achieve low distortion and good perceptual quality at the same time, so understanding the tradeoff is important. from: 8)
  27. What is the goal of super-resolution? Accurate (faithful restoration) or plausible (natural-looking restoration)? Which one to choose depends on the application! from: 4)
  28. Section: Summary
  29. Progress on SISR: accuracy and speed. Chart (2014-2017): PSNR gain over bicubic [dB] on the Set5 dataset, x4: bicubic 0.0, A+ +1.86, SRCNN +2.06, ESPCN +2.48, VDSR +2.93, SRResNet +3.63, EDSR +4.20. Mega-multiplications per input pixel for x2 restoration, as plotted: 0.44, 0.04, 0.74, 1.33, 40.7. The computational cost varies greatly with the image size passed through the CNN and the number of channels in the intermediate layers. PSNR data from: 5)
  30. Benchmark details from the NTIRE 2017 super-resolution challenge. Methods marked in the figure: EDSR, SRResNet, VDSR, ESPCN, SRCNN, A+. from: 9)
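  31. Summary: deep methods are now mainstream in super-resolution; they are accurate but computationally expensive. Stacked residual blocks with skip connections and pixel shuffle upsampling are important. SRResNet-based methods are the baseline. Whether to aim for 'Accurate' or 'Plausible' depends on the application.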
  32. Appendix: Residual Dense Network for Super-Resolution. A DenseNet-based SRResNet. from: 6)
  33. Appendix: Deep Back-Projection Networks For Super-Resolution (best PSNR in the NTIRE '18 x8 bicubic downsampling track). from: 7)
  34. Datasets
      • DIV2K dataset (train, val): https://data.vision.ee.ethz.ch/cvl/DIV2K/
      • Set5 dataset (test): http://people.rennes.inria.fr/Aline.Roumy/results/SR_BMVC12.html
      • B100 dataset (test): https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/
      • Urban100 dataset (test): https://sites.google.com/site/jbhuang0604/publications/struct_sr
  35. Competitions
      • NTIRE2017: New Trends in Image Restoration and Enhancement workshop and challenge on image super-resolution, in conjunction with CVPR 2017. http://www.vision.ee.ethz.ch/ntire17/ report: http://www.vision.ee.ethz.ch/~timofter/publications/Timofte-CVPRW-2017.pdf
      • NTIRE2018: New Trends in Image Restoration and Enhancement workshop and challenge on super-resolution, dehazing, and spectral reconstruction, in conjunction with CVPR 2018. http://www.vision.ee.ethz.ch/ntire18/ report: http://openaccess.thecvf.com/content_cvpr_2018_workshops/papers/w13/Timofte_NTIRE_2018_Challenge_CVPR_2018_paper.pdf
      • PIRM2018: Workshop and Challenge on Perceptual Image Restoration and Manipulation, in conjunction with ECCV 2018. https://www.pirm2018.org/
  36. References
      1) Dong et al., Image Super-Resolution Using Deep Convolutional Networks, https://arxiv.org/abs/1501.00092
      2) Shi et al., Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network, https://arxiv.org/abs/1609.05158
      3) Kim et al., Accurate Image Super-Resolution Using Very Deep Convolutional Networks, https://arxiv.org/pdf/1511.04587
      4) Ledig et al., Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network, https://arxiv.org/abs/1609.04802
      5) Lim et al., Enhanced Deep Residual Networks for Single Image Super-Resolution, https://arxiv.org/abs/1707.02921
      6) Zhang et al., Residual Dense Network for Image Super-Resolution, https://arxiv.org/abs/1802.08797
      7) Haris et al., Deep Back-Projection Networks For Super-Resolution, https://arxiv.org/pdf/1803.02735.pdf
      8) Blau et al., The Perception-Distortion Tradeoff, https://arxiv.org/abs/1711.06077
      9) Timofte et al., NTIRE 2017 Challenge on Single Image Super-Resolution: Methods and Results, http://www.vision.ee.ethz.ch/~timofter/publications/Timofte-CVPRW-2017.pdf
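
Slides 6 and 7 describe the PSNR metric and the recipe for building training pairs (crop an HR patch, downsample it to get LR). The sketch below illustrates both in plain Python, with images as nested lists of pixel values. It is a rough illustration, not the speaker's code: the function names are hypothetical, and `downsample_box` uses simple block averaging as a stand-in for the bicubic down-sampling mentioned on slide 7.

```python
import math


def mse(a, b):
    """Mean squared error between two equally sized 2-D images (lists of rows)."""
    diffs = [(pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def psnr(a, b, max_value=255.0):
    """PSNR in dB: 10 * log10(max_value^2 / MSE); infinite for identical images."""
    err = mse(a, b)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


def downsample_box(hr, scale):
    """Assumed stand-in for g: average each scale x scale block (NOT bicubic)."""
    rows, cols = len(hr) // scale, len(hr[0]) // scale
    return [
        [
            sum(hr[y * scale + dy][x * scale + dx]
                for dy in range(scale) for dx in range(scale)) / scale ** 2
            for x in range(cols)
        ]
        for y in range(rows)
    ]


def make_pair(image, top, left, patch_size, scale):
    """Crop an HR patch and derive its LR counterpart, LR = g(HR)."""
    hr = [row[left:left + patch_size] for row in image[top:top + patch_size]]
    return downsample_box(hr, scale), hr
```

Calling `make_pair(image, 0, 0, 4, 2)` on an 8x8 image returns a 2x2 LR patch and the 4x4 HR patch it came from. A batch is then a list of such pairs, and the network output f(LR) would be scored against HR with `mse` during training and `psnr` at evaluation time.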

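
Slides 14 to 16 compare deconvolution with pixel shuffle for upscaling. The sketch below, again in plain Python with hypothetical function names, illustrates the two points made there. `overlap_counts` counts how many input positions contribute to each output position of a 1-D transposed convolution without padding, which is the unevenness slide 14 blames for checkerboard patterns. `pixel_shuffle` tiles channel values into sub-pixels as on slide 16; the channel ordering it uses is an assumption chosen to match common framework implementations and is not stated in the slides.

```python
def overlap_counts(input_length, kernel_size, stride):
    """Contributions per output position of a 1-D transposed convolution (no padding)."""
    output_length = (input_length - 1) * stride + kernel_size
    counts = [0] * output_length
    for i in range(input_length):
        # Input position i writes to kernel_size consecutive outputs starting at i * stride.
        for k in range(kernel_size):
            counts[i * stride + k] += 1
    return counts


def pixel_shuffle(feature_maps, r):
    """Rearrange nested lists of shape (C * r * r, H, W) into (C, H * r, W * r)."""
    in_channels = len(feature_maps)
    if in_channels % (r * r) != 0:
        raise ValueError("channel count must be divisible by r * r")
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    out_channels = in_channels // (r * r)
    output = [
        [[0] * (width * r) for _ in range(height * r)]
        for _ in range(out_channels)
    ]
    for c in range(out_channels):
        for h in range(height):
            for w in range(width):
                for i in range(r):
                    for j in range(r):
                        # Assumed ordering: channel c*r*r + i*r + j fills sub-pixel (i, j).
                        source = c * r * r + i * r + j
                        output[c][h * r + i][w * r + j] = feature_maps[source][h][w]
    return output
```

With kernel size 3 and stride 2 the interior counts alternate between 1 and 2, while kernel size 4 and stride 2 gives a uniform interior, which is why the article linked on slide 14 suggests a kernel size divisible by the stride. For `pixel_shuffle`, nine 1x1 channels with `r = 3` become a single 3x3 map, matching the "9 channels -> 3x3 sub-pixels" example on slide 16.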