Using parallel programming to improve performance of image processing

Implements anisotropic diffusion on the CUDA platform: one thread handles one pixel, and the image is divided into multiple sub-regions that are processed in parallel to exploit multiple cores.

USING PARALLEL PROGRAMMING TO
IMPROVE PERFORMANCE OF IMAGE
PROCESSING
Chan Le – KAIST ’13

INTRODUCTION
 Me:
 Chan Le – 3rd year undergraduate student
 Double major in Computer Science & Management Science
 Vietnamese, KAIST ’13

 Professor:
 Won-Ki Jeong
 GPU-accelerated large-scale biomedical image processing

 Project:
 Apply parallel programming to improve the performance of image processing

MOTIVATION
 Biomedical researchers work with images
 Really big images
 They take a long time to process
 Raw images are hard to analyze and use for research
 Sometimes really noisy
 Need to be preprocessed before use
 Image preprocessing with serial algorithms is slow
 Nowadays, parallel computing is advancing
 Thanks to the popularity of multi-core CPUs and GPUs

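RELATED WORKS: USING PDE IN NOISE-REDUCTION
Pixel (x,y) with intensity It and its four neighbours:

                 IN (x,y+1)
                     ΔN
IW (x-1,y)   ΔW   It (x,y)   ΔE   IE (x+1,y)
                     ΔS
                 IS (x,y-1)

$\Delta_W = I_W - I_t$ (and likewise for $\Delta_N$, $\Delta_E$, $\Delta_S$)

 Heat equation
 At every pixel (x,y) of the image at time t:
$I_{t+1} = I_t + \Delta I$
$\Delta I = (\Delta_W + \Delta_N + \Delta_E + \Delta_S) / 4$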
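The heat-equation update above can be sketched in a few lines of plain Python. This is only an illustrative serial version, not the code from the slides: the function name `heat_step`, the list-of-rows image layout, and the border handling (neighbour lookups are clamped to the image, so a missing neighbour contributes a zero difference) are all assumptions.

```python
def heat_step(img):
    """Return one heat-equation step of a 2D image given as a list of rows."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]  # write to a copy so reads use time t only
    for y in range(h):
        for x in range(w):
            it = img[y][x]
            # Differences to the four neighbours; indices are clamped at the
            # border (an assumption), so the difference there is zero.
            d_n = img[min(y + 1, h - 1)][x] - it
            d_s = img[max(y - 1, 0)][x] - it
            d_e = img[y][min(x + 1, w - 1)] - it
            d_w = img[y][max(x - 1, 0)] - it
            # I_{t+1} = I_t + (dW + dN + dE + dS) / 4
            out[y][x] = it + (d_w + d_n + d_e + d_s) / 4
    return out
```

For a 3x3 image that is zero everywhere except a value of 4 in the centre, one step moves the spike to its four neighbours (each becomes 1.0) and leaves the centre at 0.0, so the total intensity is preserved. Every output pixel depends only on values from time t, which is what makes the update easy to parallelise.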

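RELATED WORKS: ANISOTROPIC DIFFUSION
 Paper: Scale-space and edge detection using anisotropic diffusion (Pietro Perona & Jitendra Malik, 1990)
 Basic idea: add a coefficient c to each of ΔW, ΔN, ΔS, ΔE
$\Delta I = (c_W \Delta_W + c_N \Delta_N + c_E \Delta_E + c_S \Delta_S) / 4$
 How to calculate each c? The paper proposes two conduction functions, where Δ is the difference in that direction and K is a contrast parameter:
$c = e^{-(\Delta / K)^2}$
$c = \frac{1}{1 + (\Delta / K)^2}$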
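A minimal serial sketch of the weighted update may help here. It assumes the exponential conduction function from the Perona and Malik paper, c = exp(-(Δ/K)²), with a contrast parameter `k` whose default of 10.0 is chosen purely for illustration. The names `conduction` and `anisotropic_step` and the clamped border handling are also assumptions, not taken from the slides.

```python
import math


def conduction(delta, k):
    """Exponential conduction coefficient: near 1 for small differences,
    near 0 across strong edges."""
    return math.exp(-(delta / k) ** 2)


def anisotropic_step(img, k=10.0):
    """Return one anisotropic-diffusion step of a 2D image (list of rows)."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            it = img[y][x]
            # Differences to N, S, E, W with indices clamped at the border.
            deltas = (
                img[min(y + 1, h - 1)][x] - it,
                img[max(y - 1, 0)][x] - it,
                img[y][min(x + 1, w - 1)] - it,
                img[y][max(x - 1, 0)] - it,
            )
            # Each difference is weighted by its own coefficient c.
            out[y][x] = it + sum(conduction(d, k) * d for d in deltas) / 4
    return out
```

With `k=10.0`, a step of height 100 between two neighbouring pixels gets a coefficient of exp(-100), so the edge is left essentially untouched, while a difference of 1 gets a coefficient close to 1 and is smoothed almost as in the plain heat equation.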

NVIDIA CUDA
 Serial vs. parallel programs
 Thread: the unit of processing
 In the past: a CPU had only 1 core -> 1 thread at a time
 Nowadays: multiple cores -> multiple threads at a time

 CUDA™ is a parallel computing platform and programming model invented by NVIDIA.
 http://www.nvidia.com/object/cuda_home_new.html

 How could it help?
 CPU: 1-6 cores
 GPU: hundreds of cores
 -> Improves performance by a factor of 10 to 100, depending on the algorithm

MY IMPLEMENTATION
 Implement anisotropic diffusion on the CUDA platform

 1 thread handles 1 pixel
 Divide the image into multiple sub-regions and process them in parallel to exploit multiple cores
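The decomposition described on this slide can be illustrated with a small Python sketch. It is not the CUDA code from the project: `update_pixel` stands in for the work of one thread, each tile stands in for one sub-region, and a `ThreadPoolExecutor` stands in for the cores. All names, the tile size, the value of `k`, and the clamped borders are assumptions. Python threads do not give a real speedup for this kind of work, so the sketch only shows how the image is split and why the sub-regions are independent.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def update_pixel(img, x, y, k=10.0):
    """Work of one 'thread': the new value of pixel (x, y)."""
    h, w = len(img), len(img[0])
    it = img[y][x]
    total = 0.0
    for nx, ny in ((x, y + 1), (x, y - 1), (x + 1, y), (x - 1, y)):
        nx = min(max(nx, 0), w - 1)  # clamp at the borders (assumption)
        ny = min(max(ny, 0), h - 1)
        d = img[ny][nx] - it
        total += math.exp(-(d / k) ** 2) * d
    return it + total / 4


def tiles(h, w, tile):
    """Yield (x0, y0, x1, y1) sub-regions that cover a w-by-h image."""
    for y0 in range(0, h, tile):
        for x0 in range(0, w, tile):
            yield x0, y0, min(x0 + tile, w), min(y0 + tile, h)


def diffuse_parallel(img, tile=2, workers=4):
    """One diffusion step, with each sub-region handled by a pool worker."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]

    def run_tile(region):
        x0, y0, x1, y1 = region
        for y in range(y0, y1):
            for x in range(x0, x1):
                # Reads come from img only and each pixel is written once,
                # so tiles never conflict with each other.
                out[y][x] = update_pixel(img, x, y)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(run_tile, tiles(h, w, tile)))
    return out
```

Because the input image is read-only during a step and the output is a separate buffer, the result is the same as running `update_pixel` over the pixels one by one, whatever order the sub-regions finish in.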

CONCLUSION
 The results of this project could be used to help improve the quality of images before use.

 Utilizing GPU computing power could improve the performance of your program by 100-200 times
 Partial differential equations are good choices when designing parallel algorithms
 However, the performance is limited by the GPU’s memory size
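To make the memory limit concrete, here is a rough back-of-the-envelope estimate in Python. It assumes 4 bytes per pixel (single-precision float) and two full-size buffers, one for the image at time t and one for time t+1. Both figures and the function names are assumptions for illustration, not values from the project.

```python
def diffusion_buffers_bytes(width, height, bytes_per_pixel=4, buffers=2):
    """Bytes needed to hold the input and output images of one step."""
    return width * height * bytes_per_pixel * buffers


def fits_on_device(width, height, device_bytes):
    """True if both buffers fit in the given amount of device memory."""
    return diffusion_buffers_bytes(width, height) <= device_bytes
```

Under these assumptions a 16384 x 16384 image needs 1 GiB per buffer and 2 GiB in total, so it would not fit on a device with 1 GiB of memory and would have to be processed in pieces.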

Slides with figures only (titles):

6. RELATED WORKS
9-10. SOME RESULTS – SMALL
11-12. SOME RESULTS – MEDIUM
13. BENCHMARK  100 iterations
