Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Single Image Super Resolution Overview
1. Intelligence Machine Vision Lab
Strictly Confidential
Single Image Super Resolution Overview
SUALAB
Ho Seong Lee
2. 2Type A-3
Contents
• What is Super Resolution?
• Applications of Super Resolution
• Deep Learning for Single Image Super Resolution
• Some Issues for Super Resolution
3. 3Type A-3
What is Super Resolution?
• Super Resolution
• Restore High-Resolution(HR) image(or video) from Low-Resolution(LR) image(or video)
• According to the number of input LR images, SR can be classified SISR or MISR
• Single Image Super Resolution
Efficient & Popular
Super
Resolution
4. 4Type A-3
What is Super Resolution?
• Single Image Super Resolution
• Restore High-Resolution(HR) image(or video) from Low-Resolution(LR) image(or video)
• Ill-Posed Problem.. (Regular Inverse Problem) → We can’t have ground truth from LR image
HR 3
HR 2
HR 1
LR
5. 5Type A-3
What is Super Resolution?
• Interpolation-based Single Image Super Resolution
• In image upscaling task, bicubic or bilinear or Lanczos interpolation is usually used.
• Fast, easy.. but low quality..
Super
Resolution
Deep SR
bilinear
6. 6Type A-3
What is Super Resolution?
• Single Image Super Resolution algorithms
• Interpolation-based method
• Reconstruction-based method
• (Deep) Learning-based method
• Today, I will cover learning-based method
7. 7Type A-3
Applications of Super Resolution
• Satellite image processing
• Medical image processing
• Multimedia Industry and Video Enhancement
Reference: “Super Resolution Applications in Modern Digital Image Processing”, 2016 IJCA
TV & Monitor
HD(1280x720), FHD(1920x180) UHD(3840x2160)
8. 8Type A-3
Deep Learning for Single Image Super Resolution
• Learning-based Single Image Super Resolution
• For tackling regular inverse problem, almost use this paradigm
• [HR(GT) image] + [distortion & down-sampling] → [LR(input) image]
• This is limitation of SISR training
• Overall restoration quality is dependent on the distortion & down-sampling method
Reference: “Deep Learning for Single Image Super-Resolution: A Brief Review”, 2018 IEEE Transactions on Multimedia (TMM)
9. 9Type A-3
Deep Learning for Single Image Super Resolution
• Learning-based Single Image Super Resolution
• [HR(GT) image] + [distortion & down-sampling] → [LR(input) image]
• In CVPR 2017 SR Challenge, many team showed many degradation of quality metric
Reference: http://www.vision.ee.ethz.ch/~timofter/publications/NTIRE2017SRchallenge_factsheets.pdf
10. 10Type A-3
Deep Learning for Single Image Super Resolution
• First Deep Learning architecture for Single Image Super Resolution
• SRCNN(2014) – three-layer CNN, MSE Loss, Early upsampling
• Compared to traditional methods, it shows excellent performance.
Reference: “Image Super-Resolution Using Deep Convolutional Networks”, 2014 ECCV
11. 11Type A-3
Deep Learning for Single Image Super Resolution
• Efficient Single Image Super Resolution
• FSRCNN(2016), ESPCN(2016)
• Use Late Upsampling with deconvolution or sub-pixel convolutional layer
Inefficient in Memory, FLOPS
Reference: “Image Super-Resolution Using Deep Convolutional Networks”, 2014 ECCV
12. 12Type A-3
Deep Learning for Single Image Super Resolution
• FSRCNN(Fast Super-Resolution Convolutional Neural Network)
• Use Deconvolution layer instead of pre-processing(upsampling)
• Faster and more accurate than SRCNN
Reference: “Accelerating the Super-Resolution Convolutional Neural Network”, 2016 ECCV
13. 13Type A-3
Deep Learning for Single Image Super Resolution
• ESPCN(Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel
Convolutional Neural Network)
• Use sub-pixel convolutional layer (pixel shuffler or depth_to_space)
• This sub-pixel convolutional layer is used in recent SR models
Reference: “Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network”, 2016 CVPR
Code: https://github.com/leftthomas/ESPCN
14. 14Type A-3
Deep Learning for Single Image Super Resolution
• Deeper Networks for Super-Resolution
• SRCNN, FSRCNN, ESPCN are shallow network → Why not deep network?
• Failed to train deeper models.. → Use shallow network → how to use deeper network?
Reference: “Image Super-Resolution Using Deep Convolutional Networks”, 2014 ECCV
15. 15Type A-3
Deep Learning for Single Image Super Resolution
• VDSR(Accurate Image Super-Resolution Using Very Deep Convolutional Networks)
• VGG based deeper model(20-layer) for Super-Resolution → large receptive field
• Residual learning & High learning rate with gradient clipping
• MSE Loss, Early upsampling
Reference: “Accurate Image Super-Resolution Using Very Deep Convolutional Networks”, 2016 CVPR
16. 16Type A-3
Deep Learning for Single Image Super Resolution
• Deeper Networks for Super-Resolution after VDSR
• DRCN(Deeply-recursive Convolutional network), 2016 CVPR
• SRResNet, 2017 CVPR
• DRRN(Deep Recursive Residual Network), 2017 CVPR
Reference: “Deep Learning for Single Image Super-Resolution: A Brief Review”, 2018 IEEE Transactions on Multimedia (TMM)
17. 17Type A-3
Deep Learning for Single Image Super Resolution
• Deeper Networks for Super-Resolution after VDSR
• EDSR, MDSR (Enhanced Deep Residual Network, Multi Scale EDSR), 2017 CVPRW
• DenseSR, 2017 CVPR
• MemNet, 2017 CVPR
Reference: “Deep Learning for Single Image Super-Resolution: A Brief Review”, 2018 IEEE Transactions on Multimedia (TMM)
18. 18Type A-3
Deep Learning for Single Image Super Resolution
• Generative Adversarial Network(GAN) for Super-Resolution
• SRGAN(Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network)
• First GAN-based SR Model, MSE Loss → Blurry Output → GAN loss + Content loss = Perceptual loss
Reference: “Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network”, 2017 CVPR
19. 19Type A-3
Deep Learning for Single Image Super Resolution
• Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
• MSE Loss → Blurry Output → GAN loss + Content loss = Perceptual loss
• Replace MSE loss to VGG loss (used in style transfer) and add adversarial loss
Reference: “Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network”, 2017 CVPR
20. 20Type A-3
Deep Learning for Single Image Super Resolution
• Generative Adversarial Network(GAN) for Super-Resolution
• SRGAN, EnhanceNet, SRFeat, ESRGAN
Reference: “A Deep Journey into Super-resolution: A survey”, 2019 arXiv
21. 21Type A-3
Deep Learning for Single Image Super Resolution
Reference: “A Deep Journey into Super-resolution: A survey”, 2019 arXiv
• Deep Learning for Single Image Super Resolution
22. 22Type A-3
Deep Learning for Single Image Super Resolution
Reference: “A Deep Journey into Super-resolution: A survey”, 2019 arXiv
23. 23Type A-3
Deep Learning for Single Image Super Resolution
• Deep Learning for Single Image Super Resolution
24. 24Type A-3
Some Issues for Super Resolution
• Checkerboard artifact
• Deconvolution (Transposed convolution) layer can easily have “uneven overlap”
• Simple solution: use “resize + conv” or “sub-pixel convolutional layer”
Reference: “Deconvolution and Checkerboard Artifacts”, distill blog(https://distill.pub/2016/deconv-checkerboard/)
25. 25Type A-3
Some Issues for Super Resolution
• Loss function
• Propose a various loss function methods in Image Restoration task
• Report the best result when using mixed loss with MS-SSIM loss + 𝒍 𝟏 loss
Reference: “Loss Functions for Image Restoration with Neural Networks”, 2016 IEEE TCI
26. 26Type A-3
Some Issues for Super Resolution
• Loss function
• Recent papers almost use 𝒍 𝟏 loss
Reference: “A Deep Journey into Super-resolution: A survey”, 2019 arXiv
27. 27Type A-3
Some Issues for Super Resolution
• Metric (Distortion measure)
• Almost paper use distortion metric(PSNR and SSIM) as performance metric
• But, high PSNR, SSIM do not guarantee good quality that looks good to human..
Reference: “Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network”, 2017 CVPR
28. 28Type A-3
Some Issues for Super Resolution
• Metric (Human opinion score)
• But, high PSNR, SSIM do not guarantee good quality that looks good to human..
• In SRGAN Paper, Mean Opinion Score(MOS) Testing is done for quantify perceptual quality
• 26 raters, score from 1(bad) to 5(excellent)
• The raters were calibrated on the NN(1) and HR(5) version of 20 images from BSD300 dataset
29. 29Type A-3
Some Issues for Super Resolution
• Metric Paper (The Perception-Distortion Tradeoff, 2018 CVPR)
• Analysis distortion measure(PSNR, SSIM, etc.) vs human opinion score(Perceptual quality)
• Good supplementary video: https://www.youtube.com/watch?v=6Yid4dituqo (PR-12)
Reference: “The Perception-Distortion Tradeoff”, 2018 CVPR
Better
quality
30. 30Type A-3
References
• “Image Super-Resolution Using Deep Convolutional Networks”, 2014 ECCV
• “Super Resolution Applications in Modern Digital Image Processing”, 2016 IJCA
• “Loss Functions for Image Restoration with Neural Networks”, 2016 IEEE TCI
• “Deconvolution and Checkerboard Artifacts”, distill blog(https://distill.pub/2016/deconv-checkerboard/)
• “Accurate Image Super-Resolution Using Very Deep Convolutional Networks”, 2016 CVPR
• “Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network”, 2016
CVPR
• “Accelerating the Super-Resolution Convolutional Neural Network”, 2016 ECCV
• “Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network”, 2017 CVPR
• http://www.vision.ee.ethz.ch/~timofter/publications/NTIRE2017SRchallenge_factsheets.pdf
• “The Perception-Distortion Tradeoff”, 2018 CVPR
• “Deep Learning for Single Image Super-Resolution: A Brief Review”, 2018 IEEE Transactions on Multimedia (TMM)
• “A Deep Journey into Super-resolution: A survey”, 2019 arXiv