Presentation MM-4104, Smart Sharpen using OpenCL in Adobe Photoshop CC – Challenges and Achievements, by Joseph Hsieh at the AMD Developer Summit, November 11-13, 2013.
1. SMART SHARPEN USING OPENCL IN PHOTOSHOP CC
– CHALLENGES AND ACHIEVEMENTS
JOSEPH HSIEH
ADOBE SYSTEMS INC.
2. TOPICS
‒ What is OpenCL?
‒ Why does Adobe Photoshop CC choose OpenCL?
‒ Smart Sharpen in Adobe Photoshop CC
‒ Challenges
‒ What have we learned?
‒ Demo
‒ Summary
‒ Q&A
2 | PRESENTATION TITLE | NOVEMBER 19, 2013 | CONFIDENTIAL
4. WHAT IS OPENCL?
OpenCL is an open standard framework (supported by AMD, Apple, Intel, Nvidia, etc.) that allows
developers to write programs that execute on different hardware platforms – GPUs, CPUs, DSPs, etc.
CUDA, GLSL, and C++ AMP are competing GPGPU (General-Purpose Computing on Graphics Processing Units)
technologies.
6. WHY ADOBE PHOTOSHOP CC CHOOSES OPENCL
Some advanced algorithms are just not fast enough.
GPU computing is suitable for algorithms that benefit from massive parallelization.
Cross-platform (OpenCL and the OpenGL Shading Language (GLSL) are the only two cross-platform GPGPU
solutions).
It is easier to map an algorithm to OpenCL code than to GLSL.
Highly efficient (capable of using fast cache memory).
Learning curve – the syntax is C99 plus vector operations.
OpenCL is available on many of the latest mobile devices.
8. SHARPENING
From Scott Kelby, “I haven’t met a digital camera (or scanned) photo that I didn’t think needed a little
sharpening”.
For a very blurred image, you need deblurring, not sharpening.
9. RENOVATED ADOBE PHOTOSHOP CC SMART SHARPEN FEATURE
Adobe renovated the legacy Smart Sharpen to:
‒ Address the “noise gets boosted when you sharpen” issue.
‒ Improve overall quality (preserve color).
‒ Reduce the halo effect.
10. COMPARISON BETWEEN LEGACY AND CURRENT SMART SHARPEN
11. COMPARISON BETWEEN LEGACY (ACCURATE) AND CURRENT VERSION
12. COMPARISON BETWEEN UNSHARP MASK AND SMART SHARPEN
13. COMPARISON BETWEEN UNSHARP MASK THRESHOLD AND SMART SHARPEN
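As background for this comparison, here is a minimal sketch of the textbook unsharp mask with a threshold. It is not Adobe's Smart Sharpen algorithm, which the slides do not describe. The function names, the 1D signal, and the 3-tap box blur standing in for the usual Gaussian blur are all assumptions made for illustration.

```c
#include <stddef.h>

/* 3-tap box blur with clamped borders. A real unsharp mask normally uses a
   Gaussian blur; the box blur is a simplifying assumption for this sketch. */
static float blur3(const float *src, size_t n, size_t i)
{
    size_t left  = (i == 0) ? 0 : i - 1;
    size_t right = (i + 1 >= n) ? n - 1 : i + 1;
    return (src[left] + src[i] + src[right]) / 3.0f;
}

/* Textbook unsharp mask on a 1D signal with values in [0, 1]:
   dst = src + amount * (src - blurred), applied only where the local
   detail |src - blurred| exceeds the threshold. Output is clamped to [0, 1]. */
void unsharp_mask_1d(const float *src, float *dst, size_t n,
                     float amount, float threshold)
{
    for (size_t i = 0; i < n; ++i) {
        float detail = src[i] - blur3(src, n, i);
        float magnitude = (detail < 0.0f) ? -detail : detail;
        float v = src[i];
        if (magnitude > threshold)
            v += amount * detail;          /* boost edges only */
        if (v < 0.0f) v = 0.0f;            /* clamp to the valid range */
        if (v > 1.0f) v = 1.0f;
        dst[i] = v;
    }
}
```

Raising the threshold leaves low-contrast regions (where noise tends to live) untouched, at the cost of also skipping fine detail.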
14. SHARPENING TIPS
Sharpen the image on a separate layer.
Blending Mode: luminosity
Adjust the opacity percentage.
You may want to denoise before sharpening.
15. PERFORMANCE CONCERN
User experience – responsiveness is important.
‒ CPU Optimization.
‒ Multi-threading
‒ Vectorization using intrinsic instructions
Still not fast enough…
‒ Is the algorithm parallelizable?
‒ Is GPGPU Optimization suitable for our sharpening algorithm?
16. CHALLENGES IN SMART SHARPEN OPENCL DEVELOPMENT
Need to rethink the algorithm.
Limited resources (global memory, local memory, and private memory).
Algorithm is memory bound.
Avoid memory corruption.
Debug strategy.
How to make solid OpenCL kernels?
Optimization strategy.
Driver Issues? (better re-check the OpenCL spec and your kernel first…).
Quality control (GPU on different vendor platforms).
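To make the "limited resources" and "memory bound" points concrete, the sketch below budgets local memory for a tiled filter. It assumes each work-group stages a tile plus a halo of `radius` pixels in local memory, a common pattern that the slides do not confirm was used. The function names and the 32 KB budget in the usage note are hypothetical; on a real device the budget would come from `clGetDeviceInfo` with `CL_DEVICE_LOCAL_MEM_SIZE`.

```c
#include <stddef.h>

/* Bytes of local memory needed to stage a tile_w x tile_h tile plus a halo
   of `radius` pixels on every side. */
size_t tile_local_bytes(size_t tile_w, size_t tile_h, size_t radius,
                        size_t bytes_per_pixel)
{
    return (tile_w + 2 * radius) * (tile_h + 2 * radius) * bytes_per_pixel;
}

/* Largest power-of-two square tile (up to 1024) whose staged footprint fits
   within budget_bytes. Returns 0 if even a 1x1 tile does not fit. */
size_t largest_square_tile(size_t budget_bytes, size_t radius,
                           size_t bytes_per_pixel)
{
    size_t best = 0;
    for (size_t side = 1; side <= 1024; side *= 2) {
        if (tile_local_bytes(side, side, radius, bytes_per_pixel) > budget_bytes)
            break;
        best = side;
    }
    return best;
}
```

For example, with a 32 KB budget and a radius of 4, single-channel float pixels (4 bytes) allow a 64x64 tile, while four-channel float pixels (16 bytes) allow only 32x32. The halo overhead grows relative to the tile as the tile shrinks, which is one reason the filter radius matters so much for a memory-bound kernel.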
17. WHAT WE HAVE LEARNED
Write reference code first (make sure you understand the algorithm correctly).
Divide and conquer (easy to verify).
Always design for easy unit testing.
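The "reference first, then verify" workflow above can be sketched on the CPU as follows. A simple windowed sum serves as the reference, a running-sum version plays the role of the optimized code, and a small helper compares the two. All names are hypothetical, and the box sum is a stand-in for one stage of a real filter, not the Smart Sharpen pipeline.

```c
#include <stddef.h>

/* Returns clamp(i_plus_k - radius, 0, n - 1) without unsigned underflow. */
static size_t clamp_index(size_t i_plus_k, size_t radius, size_t n)
{
    if (i_plus_k < radius) return 0;
    size_t j = i_plus_k - radius;
    return (j >= n) ? n - 1 : j;
}

/* Reference: straightforward O(n * radius) windowed sum, clamped borders. */
void box_sum_reference(const int *src, int *dst, size_t n, size_t radius)
{
    for (size_t i = 0; i < n; ++i) {
        int sum = 0;
        for (size_t k = 0; k <= 2 * radius; ++k)
            sum += src[clamp_index(i + k, radius, n)];
        dst[i] = sum;
    }
}

/* "Optimized": O(n) running sum that adds the sample entering the window
   and subtracts the one leaving it. */
void box_sum_running(const int *src, int *dst, size_t n, size_t radius)
{
    if (n == 0) return;
    int sum = 0;
    for (size_t k = 0; k <= 2 * radius; ++k)
        sum += src[clamp_index(k, radius, n)];
    dst[0] = sum;
    for (size_t i = 1; i < n; ++i) {
        sum += src[clamp_index(i + 2 * radius, radius, n)]; /* index i + radius */
        sum -= src[clamp_index(i - 1, radius, n)];          /* index i - 1 - radius */
        dst[i] = sum;
    }
}

/* Unit-test helper: 1 if both arrays match element by element, else 0. */
int arrays_equal(const int *a, const int *b, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        if (a[i] != b[i]) return 0;
    return 1;
}
```

Keeping each stage this small means a failing kernel can be checked against its own reference in isolation, which is the "divide and conquer" point.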
18. WHAT WE HAVE LEARNED
Optimization strategies (general practices):
‒ Always profile first.
‒ Algorithm parallelization.
‒ Read global memory as little as possible. (???)
‒ Memory access pattern.
‒ Avoid branching.
‒ Avoid edge condition handling.
‒ Arithmetic complexity (% is bad).
‒ Avoid bank conflicts as much as possible. (???)
‒ Maximize the lazy synchronization.
‒ Trade-offs.
‒ Do not draw conclusions before doing the experiments…
‒ Optimization for certain vendor devices?
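Two of the practices above, "% is bad" and "avoid edge condition handling", can be illustrated in plain C. The sketch assumes a power-of-two buffer size for the mask trick and a replicated-border padding scheme. Both are common choices, not something the slides specify, and all function names are hypothetical. In line with the advice to experiment first, any gain should be measured on the target device.

```c
#include <stddef.h>

/* Wrap an index into a ring of `size` entries with a bit mask instead of %.
   Valid only when size is a power of two. */
unsigned wrap_pow2(unsigned i, unsigned size)
{
    return i & (size - 1u);
}

/* Copy src (n >= 1 samples) into `padded`, replicating the first and last
   samples `radius` times on each side. `padded` must hold n + 2 * radius
   floats. */
void pad_replicate(const float *src, size_t n, size_t radius, float *padded)
{
    for (size_t i = 0; i < radius; ++i) {
        padded[i] = src[0];
        padded[radius + n + i] = src[n - 1];
    }
    for (size_t i = 0; i < n; ++i)
        padded[radius + i] = src[i];
}

/* 3-tap sum (radius 1) over the padded buffer. Because the borders were
   handled once during padding, the loop body has no bounds checks. */
void sum3_padded(const float *padded, size_t n, float *dst)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = padded[i] + padded[i + 1] + padded[i + 2];
}
```

The trade-off is extra memory and one extra copy in exchange for a uniform inner loop, which is the kind of decision the "Trade-offs" bullet refers to.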
19. WHAT WE HAVE LEARNED
Quality control across various GPUs.
‒ Design of testing scenarios.
‒ Automation
‒ Collaborate with vendors.
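One way to automate such cross-GPU checks is to compare each device's output against a CPU reference within a tolerance, since floating-point results can differ slightly between vendors and exact equality is usually too strict. The sketch below is a hypothetical harness, not Adobe's test suite; the struct, the function name, and the tolerance value are assumptions.

```c
#include <stddef.h>

/* Summary of how far one device's output strays from the reference. */
typedef struct {
    float  max_abs_diff;   /* largest |device - reference| seen */
    size_t num_outliers;   /* samples whose difference exceeds the tolerance */
} DiffReport;

/* Compare n samples of device output against the reference output. */
DiffReport compare_to_reference(const float *device, const float *reference,
                                size_t n, float tolerance)
{
    DiffReport report = {0.0f, 0};
    for (size_t i = 0; i < n; ++i) {
        float d = device[i] - reference[i];
        if (d < 0.0f) d = -d;                       /* absolute difference */
        if (d > report.max_abs_diff) report.max_abs_diff = d;
        if (d > tolerance) ++report.num_outliers;
    }
    return report;
}
```

A test scenario would then pass when `num_outliers` is zero, and `max_abs_diff` can be logged per device to spot drift between driver versions.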
22. AMD CODEXL – TOOL FOR HETEROGENEOUS COMPUTING
Features:
‒ OpenCL debug
‒ GPU profiling
‒ Analyze OpenCL kernel
‒ Collect OpenCL application trace
‒ Integrated with Microsoft Visual Studio
23. BENEFIT OF USING AMD CODEXL
Helps you to optimize your OpenCL kernel efficiently.
‒ Analyze the usage of VGPR, SGPR, Local Memory, etc.
‒ Profiling.
‒ Host code efficiency.
24. AMD CODEXL – OPENCL KERNEL DEBUG
25. AMD CODEXL – MEMORY OBJECT VIEWER
26. AMD CODEXL – TIMELINE VISUALIZATION
Visualize host code execution, data transfer, and kernel execution.
27. AMD CODEXL – PERFORMANCE COUNTER
28. AMD CODEXL – KERNEL OCCUPANCY VIEWER
30. OPENCL 2.0 NEW FEATURES
Shared virtual memory
‒ Great for APUs and integrated GPUs.
read_write qualifier for Image2D and Image3D memory objects.
Dynamic parallelism: a kernel can enqueue other kernels.
‒ Avoids transferring execution and data between the device and host.
Pipe
‒ FIFO memory object.
33. SUMMARY
Adobe Photoshop CC adopts OpenCL to accelerate great features.
Adobe is dedicated to delivering the best user experience.
‒ Embraces advanced, solid technologies.
‒ Works closely with vendors to ensure fully tested, high-quality software.