Programming using
MPI
&
OpenMP
HIGH PERFORMANCE COMPUTING
MODULE 4
DIVYA TIWARI
MEIT
TERNA ENGINEERING COLLEGE
INTRODUCTION
• Parallel computing has made a tremendous impact on a variety of areas from computational
simulations for scientific and engineering applications to commercial applications in data
mining and transaction processing.
• Numerous programming languages and libraries have been developed for explicit parallel
programming.
• They differ in the view of the address space that they make available to the programmer, in the degree of synchronization imposed on concurrent activities, and in the multiplicity of programs.
• Message-passing programming is one of the oldest and most widely used approaches for programming parallel computers; MPI is its de facto standard interface.
Principles
• Key attributes of the message-passing paradigm:
1. It assumes a partitioned address space.
2. It supports only explicit parallelization.
• Implications of a partitioned address space:
1. Each data element must belong to one of the partitions of the space.
2. Interaction requires the cooperation of two processes (the process that has the data and the process that wants to access the data).
• Explicit parallelization:
The programmer is responsible for analysing the underlying serial algorithm/application and identifying ways to decompose the computations and extract concurrency.
Structure of Message Passing Program
• Message-passing programs are often written in one of two paradigms:
1. Asynchronous
2. Loosely synchronous
• In its most general form, the message-passing paradigm supports execution of a different
program on each of the p processes.
• Most message-passing programs are written using the single program multiple data (SPMD)
approach.
Building Blocks: Send and Receive Operations
• Communication between processes is accomplished by sending and receiving messages.
• The basic operations in a message-passing program are send and receive.
• Example:
send(void *sendbuf, int nelems, int dest)
receive(void *recvbuf, int nelems, int source)
sendbuf - points to a buffer that stores the data to be sent.
recvbuf - points to a buffer that stores the data to be received.
nelems - is the number of data units to be sent and received.
dest - is the identifier of the process that receives the data.
source - is the identifier of the process that sends the data.
• Example: process P0 sends the value of a to process P1.

P0:                 P1:
a = 100;            receive(&a, 1, 0);
send(&a, 1, 1);     printf("%d\n", a);
a = 0;

• The semantics of send require that P1 receive the value 100, not 0, even though P0 overwrites a immediately after the send.
• Message-passing operations are of two types:
1. Blocking Message Passing Operations.
i. Blocking Non-Buffered Send/Receive
ii. Blocking Buffered Send/Receive
2. Non-Blocking Message Passing Operations.
Blocking Message Passing Operations
i. Blocking Non-Buffered Send/Receive
ii. Blocking Buffered Send/Receive
a. In the presence of communication hardware with buffers at the send and receive ends, the hardware buffers the data.
b. In the absence of communication hardware, the sender interrupts the receiver and deposits the data in a buffer at the receiver end.
Non-Blocking Message Passing Operations
Space of possible protocols for send and receive operations.
Non-blocking non-buffered send and receive operations:
(a) in the absence of communication hardware
(b) in the presence of communication hardware
MPI: the Message Passing Interface
• Message-passing architectures are used in parallel computers due to their lower cost relative to shared-address-space architectures.
• Message passing is the natural programming paradigm for these machines, which led to the development of many different message-passing libraries.
• Each library worked well on its vendor's own hardware but was incompatible with parallel computers offered by other vendors.
• The differences between libraries were mostly syntactic, but some serious semantic differences required significant re-engineering to port a message-passing program from one library to another.
• The Message Passing Interface, or MPI as it is commonly known, was created to solve this problem.
• The MPI library contains over 125 routines, but the number of key concepts is much smaller.
• It is possible to write fully-functional message-passing programs by using only the six
routines shown below:
1. MPI_Init : Initializes MPI.
2. MPI_Finalize : Terminates MPI.
3. MPI_Comm_size : Determines the number of processes.
4. MPI_Comm_rank : Determines the label of the calling process.
5. MPI_Send : Sends a message.
6. MPI_Recv : Receives a message.
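As a minimal sketch (assuming a working MPI installation; compile with mpicc and launch with mpirun), these six routines are enough for a complete SPMD program:

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int npes, myrank;

    MPI_Init(&argc, &argv);                  /* initialize MPI */
    MPI_Comm_size(MPI_COMM_WORLD, &npes);    /* number of processes */
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);  /* label of the calling process */
    printf("From process %d out of %d, Hello World!\n", myrank, npes);
    MPI_Finalize();                          /* terminate MPI */
    return 0;
}
```

Every process runs the same program (SPMD); the rank returned by MPI_Comm_rank is what lets different processes take different actions.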
Overlapping Communication with
Computation
• The MPI programs developed so far use blocking send and receive operations whenever they need to perform point-to-point communication.
• For Example:
• Consider Cannon's matrix-matrix multiplication program.
• During each iteration of its main computational loop (lines 47– 57), it first computes the matrix
multiplication of the sub-matrices stored in a and b, and then shifts the blocks of a and b, using
MPI_Sendrecv_replace which blocks until the specified matrix block has been sent and received by the
corresponding processes.
• In each iteration, each process spends O(n^3 / p^1.5) time performing the matrix-matrix multiplication and O(n^2 / p) time shifting the blocks of matrices A and B.
• Since the blocks of matrices A and B do not change as they are shifted among the processors, it is preferable to overlap the transmission of these blocks with the matrix-matrix multiplication computation. Many distributed-memory parallel computers have dedicated communication controllers that can transmit messages without interrupting the CPUs.
Non-Blocking Communication Operations
• MPI provides pairs of functions for performing non-blocking send and receive operations, so that communication can be overlapped with computation.
• These functions are:
• MPI_Isend:
starts a send operation but does not complete it; that is, it returns before the data is copied out of the buffer.
• MPI_Irecv:
starts a receive operation but returns before the data has been received and copied into the buffer.
• MPI_Test:
tests whether or not a non-blocking operation has finished.
• MPI_Wait:
waits (i.e., blocks) until a non-blocking operation actually finishes.
• With appropriate hardware support, the transmission and reception of messages can proceed concurrently with the computation the program performs after these functions return.
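A sketch of how these functions could overlap the block shift in Cannon's algorithm with the local multiplication (assumes an MPI environment; the names b_next, local_matmul, up, and down are hypothetical placeholders for the receive buffer, the local multiply routine, and the neighbouring ranks):

```c
MPI_Request reqs[2];

/* start the shift of B without blocking */
MPI_Irecv(b_next, nlocal * nlocal, MPI_DOUBLE, up,   1, comm, &reqs[0]);
MPI_Isend(b,      nlocal * nlocal, MPI_DOUBLE, down, 1, comm, &reqs[1]);

local_matmul(a, b, c, nlocal);               /* compute while the messages are in flight */

MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);   /* block only when b_next is actually needed */
```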
Collective Communication and Computation
Operation
• MPI provides an extensive set of functions for performing many commonly used collective
communication operations.
• All of the collective communication functions provided by MPI take as an argument a
communicator that defines the group of processes that participate in the collective
operation.
• All the processes that belong to this communicator participate in the operation, and all of
them must call the collective communication function.
• Collective communication operations do not act like barriers: a process can go past its call to the collective operation even before other processes have reached theirs.
• They do, however, act like a virtual synchronization step, in the following sense: the parallel program should be written such that it behaves correctly even if a global synchronization is performed before and after the collective call.
• Since the operations are virtually synchronous, they do not require tags. In some collective functions, data must be sent from a single process (the source process) or received by a single process (the target process).
• In these functions, the source or target process is one of the arguments supplied to the routine. All processes in the group (i.e., communicator) must specify the same source or target process.
• For most collective communication operations, MPI provides two different variants. The
first transfers equal-size data to or from each process, and the second transfers data that can
be of different sizes.
1. Barrier
• The barrier synchronization operation is performed in MPI using the MPI_Barrier function.
int MPI_Barrier(MPI_Comm comm)
2. Broadcast
• The one-to-all broadcast operation is performed in MPI using the MPI_Bcast function.
int MPI_Bcast(void *buf, int count, MPI_Datatype datatype,
int source, MPI_Comm comm)
3. Reduction
• The all-to-one reduction operation is performed in MPI using the MPI_Reduce function.
int MPI_Reduce(void *sendbuf, void *recvbuf, int count,
MPI_Datatype datatype, MPI_Op op, int target,
MPI_Comm comm)
4. Prefix
• The prefix-sum operation is performed in MPI using the MPI_Scan function.
int MPI_Scan(void *sendbuf, void *recvbuf, int count,
MPI_Datatype datatype, MPI_Op op, MPI_Comm comm)
5. Gather
• The gather operation is performed in MPI using the MPI_Gather function.
int MPI_Gather(void *sendbuf, int sendcount,
MPI_Datatype senddatatype, void *recvbuf, int recvcount,
MPI_Datatype recvdatatype, int target, MPI_Comm comm)
6. Scatter
• The scatter operation described is performed in MPI using the MPI_Scatter function.
int MPI_Scatter(void *sendbuf, int sendcount,
MPI_Datatype senddatatype, void *recvbuf, int recvcount,
MPI_Datatype recvdatatype, int source, MPI_Comm comm)
7. All-to-All
• The all-to-all personalized communication operation described in Section 4.5 is performed
in MPI by using the MPI_Alltoall function.
int MPI_Alltoall(void *sendbuf, int sendcount,
MPI_Datatype senddatatype, void *recvbuf, int recvcount,
MPI_Datatype recvdatatype, MPI_Comm comm)
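A small sketch combining two of these collectives (assumes an MPI environment): rank 0 broadcasts a value to every process, and each process's contribution is summed back to rank 0.

```c
int n = 0, myrank, contribution, total;

MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
if (myrank == 0) n = 100;

MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);   /* every process now has n == 100 */

contribution = myrank * n;                      /* some per-process value */
MPI_Reduce(&contribution, &total, 1, MPI_INT, MPI_SUM,
           0, MPI_COMM_WORLD);                  /* total is valid on rank 0 only */
```

Every process in MPI_COMM_WORLD must make both calls, and all of them must name the same root (here, process 0).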
OpenMP Parallel Programming Model
• OpenMP is an API that can be used with FORTRAN, C, and C++ for programming shared
address space machines.
• OpenMP directives provide support for concurrency, synchronization and data handling
while obviating the need for explicitly setting up mutexes, condition variables, data scope,
and initialization.
• OpenMP directives in C and C++ are based on the #pragma compiler directives. The
directive itself consists of a directive name followed by clauses.
#pragma omp directive [clause list]
• OpenMP programs execute serially until they encounter the parallel directive.
• The parallel directive is responsible for creating a group of threads.
• The thread that encounters the parallel directive becomes the master of this group and is assigned thread id 0 within the group.
• The parallel directive has the following prototype:
#pragma omp parallel [clause list]
/* structured block */
• Each thread created by this directive executes the structured block specified by the parallel
directive.
• The clause list is used to specify conditional parallelization, number of threads, and data
handling.
• Conditional Parallelization: The clause if (scalar expression) determines whether the
parallel construct results in creation of threads. Only one if clause can be used with a
parallel directive.
• Degree of Concurrency: The clause num_threads (integer expression) specifies the
number of threads that are created by the parallel directive.
• Data Handling: The clause private (variable list) indicates that the set of variables
specified is local to each thread – i.e., each thread has its own copy of each variable in the
list.
• The clause firstprivate (variable list) is similar to the private clause, except that each thread's copy of a variable is initialized to the value that variable had before the parallel directive.
• The clause shared (variable list) indicates that all variables in the list are shared across all the threads, i.e., there is only one copy. Special care must be taken when threads modify these variables, to ensure serializability.
A sample OpenMP program along with its Pthreads translation that might be performed
by an OpenMP compiler.
• OpenMP provides a powerful set of compiler directives:
1. parallel: which precedes a block of code to be executed in parallel by multiple threads.
2. for: which precedes a for loop with independent iterations that may be divided among
threads executing in parallel.
3. parallel for: a combination of the parallel and for directives.
4. sections: which precedes a series of blocks that may be executed in parallel.
5. parallel sections: a combination of the parallel and sections directives.
6. critical: which precedes a critical section.
7. single: which precedes a code block to be executed by a single thread.
Shared Memory Model
• Processors interact and synchronize with each other through shared variables.
Fork/Join Parallelism
• Initially only master thread is active.
• Master thread executes sequential code.
• Fork: Master thread creates or awakens additional threads to execute parallel code.
• Join: At end of parallel code created threads die or are suspended.
Parallel for Loops
• C programs often express data-parallel operations as for loops
for (i = first; i < size; i += prime)
marked[i] = 1;
• OpenMP makes it easy to indicate when the iterations of a loop may execute in
parallel.
• Compiler takes care of generating code that forks/joins threads and allocates the
iterations to threads
Shared and Private Variables
• Shared variable: has same address in execution context of every thread.
• Private variable: has different address in execution context of every thread.
• A thread cannot access the private variables of another thread.
Function omp_get_num_procs
• Returns number of physical processors available for use by the parallel program
int omp_get_num_procs (void)
Function omp_set_num_threads
• Uses the parameter value to set the number of threads to be active in parallel
sections of code.
• May be called at multiple points in a program.
void omp_set_num_threads (int t)
MU Exam Questions
May 2018
• Explain the concept of shared memory programming. 5 marks
• Explain in brief about Performance bottleneck, Data Race and Determinism, Data Race
Avoidance and Deadlock Avoidance. 10 marks
Dec 2018
• Discuss the term collective communication in MPI. 5 marks
• Differentiate between buffered blocking and non-buffered blocking message passing
operation in MPI. 10 marks
May 2019
• Discuss the term collective communication in MPI. 5 marks
• Differentiate between buffered blocking and non-buffered blocking message passing
operation in MPI. 10 marks
• Write a small program demonstrating functional and compiler directives in the OpenMP paradigm and the MPI paradigm.
Research Paper
Programming using MPI and OpenMP

AKTU Computer Networks notes --- Unit 3.pdf
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPT
 
chapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineeringchapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineering
 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
 
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
 
Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call for Papers - International Journal of Intelligent Systems and Applicatio...Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call for Papers - International Journal of Intelligent Systems and Applicatio...
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
 
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 

Programming using MPI and OpenMP

sendbuf - points to a buffer that stores the data to be sent.
recvbuf - points to a buffer that stores the data to be received.
nelems - the number of data units to be sent and received.
dest - the identifier of the process that receives the data.
source - the identifier of the process that sends the data.
• Example: A process sending a piece of data to another process

      send(void *sendbuf, int nelems, int dest)
      receive(void *recvbuf, int nelems, int source)

      P0                          P1

      a = 100;                    receive(&a, 1, 0)
      send(&a, 1, 1);             printf("%d\n", a);
      a = 0;
• Message-passing operations are of two types:
1. Blocking Message Passing Operations
   i. Blocking Non-Buffered Send/Receive
   ii. Blocking Buffered Send/Receive
2. Non-Blocking Message Passing Operations
Blocking Message Passing Operations
i. Blocking Non-Buffered Send/Receive
ii. Blocking Buffered Send/Receive
a. In the presence of communication hardware with buffers at the send and receive ends.
b. In the absence of communication hardware, the sender interrupts the receiver and deposits the data in a buffer at the receiver end.
Non-Blocking Message Passing Operations
Space of possible protocols for send and receive operations.
Non-blocking non-buffered send and receive operations: (a) in the absence of communication hardware; (b) in the presence of communication hardware.
MPI: the Message Passing Interface
• The message-passing architecture is used in parallel computers due to its lower cost relative to shared-address-space architectures.
• Message passing is the natural programming paradigm for these machines, which led to the development of many different message-passing libraries.
• These message-passing libraries worked well on each vendor's own hardware but were incompatible with parallel computers offered by other vendors.
• The differences between libraries were mostly syntactic, but some serious semantic differences required significant re-engineering to port a message-passing program from one library to another.
• The Message Passing Interface, or MPI as it is commonly known, was created essentially to solve this problem.
• The MPI library contains over 125 routines, but the number of key concepts is much smaller.
• It is possible to write fully functional message-passing programs using only the six routines shown below:
1. MPI_Init : Initializes MPI.
2. MPI_Finalize : Terminates MPI.
3. MPI_Comm_size : Determines the number of processes.
4. MPI_Comm_rank : Determines the label of the calling process.
5. MPI_Send : Sends a message.
6. MPI_Recv : Receives a message.
Overlapping Communication with Computation
• The MPI programs developed so far used blocking send and receive operations whenever they needed to perform point-to-point communication.
• Example: consider Cannon's matrix-matrix multiplication program.
• During each iteration of its main computational loop (lines 47–57), it first computes the product of the sub-matrices stored in a and b, and then shifts the blocks of a and b using MPI_Sendrecv_replace, which blocks until the specified matrix block has been sent and received by the corresponding processes.
• In each iteration, each process spends O(n^3/p^1.5) time performing the matrix-matrix multiplication and O(n^2/p) time shifting the blocks of matrices A and B.
• Since the blocks of matrices A and B do not change as they are shifted among the processors, it is preferable to overlap the transmission of these blocks with the matrix-matrix multiplication computation, as many recent distributed-memory parallel computers have dedicated communication controllers that can transmit messages without interrupting the CPUs.
Non-Blocking Communication Operations
• To overlap communication with computation, MPI provides pairs of functions for performing non-blocking send and receive operations:
• MPI_Isend: starts a send operation but does not complete it; that is, it returns before the data is copied out of the buffer.
• MPI_Irecv: starts a receive operation but returns before the data has been received and copied into the buffer.
• MPI_Test: tests whether or not a non-blocking operation has finished.
• MPI_Wait: waits (i.e., gets blocked) until a non-blocking operation actually finishes.
• With the support of appropriate hardware, the transmission and reception of messages can proceed concurrently with the computation performed by the program after these functions return.
Collective Communication and Computation Operations
• MPI provides an extensive set of functions for performing many commonly used collective communication operations.
• All of the collective communication functions provided by MPI take as an argument a communicator that defines the group of processes participating in the collective operation.
• All the processes that belong to this communicator participate in the operation, and all of them must call the collective communication function.
• Even though collective communication operations do not act like barriers (i.e., it is possible for a process to go past its call for the collective operation even before other processes have reached theirs), each acts like a virtual synchronization step in the following sense: the parallel program should be written such that it behaves correctly even if a global synchronization were performed before and after the collective call.
• Since the operations are virtually synchronous, they do not require tags. Some collective functions require data to be sent from a single process (source process) or received by a single process (target process).
• In these functions, the source or target process is one of the arguments supplied to the routine. All the processes in the group (i.e., the communicator) must specify the same source or target process.
• For most collective communication operations, MPI provides two variants: the first transfers equal-size data to or from each process, and the second transfers data that can be of different sizes.
1. Barrier
• The barrier synchronization operation is performed in MPI using the MPI_Barrier function.
      int MPI_Barrier(MPI_Comm comm)
2. Broadcast
• The one-to-all broadcast operation is performed in MPI using the MPI_Bcast function.
      int MPI_Bcast(void *buf, int count, MPI_Datatype datatype, int source, MPI_Comm comm)
3. Reduction
• The all-to-one reduction operation is performed in MPI using the MPI_Reduce function.
      int MPI_Reduce(void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, int target, MPI_Comm comm)
4. Prefix
• The prefix-sum operation is performed in MPI using the MPI_Scan function.
      int MPI_Scan(void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, MPI_Comm comm)
5. Gather
• The gather operation is performed in MPI using the MPI_Gather function.
      int MPI_Gather(void *sendbuf, int sendcount, MPI_Datatype senddatatype, void *recvbuf, int recvcount, MPI_Datatype recvdatatype, int target, MPI_Comm comm)
6. Scatter
• The scatter operation is performed in MPI using the MPI_Scatter function.
      int MPI_Scatter(void *sendbuf, int sendcount, MPI_Datatype senddatatype, void *recvbuf, int recvcount, MPI_Datatype recvdatatype, int source, MPI_Comm comm)
7. All-to-All
• The all-to-all personalized communication operation described in Section 4.5 is performed in MPI using the MPI_Alltoall function.
      int MPI_Alltoall(void *sendbuf, int sendcount, MPI_Datatype senddatatype, void *recvbuf, int recvcount, MPI_Datatype recvdatatype, MPI_Comm comm)
OpenMP Parallel Programming Model
• OpenMP is an API that can be used with FORTRAN, C, and C++ for programming shared-address-space machines.
• OpenMP directives provide support for concurrency, synchronization, and data handling while obviating the need to explicitly set up mutexes, condition variables, data scope, and initialization.
• OpenMP directives in C and C++ are based on the #pragma compiler directives. A directive consists of a directive name followed by clauses:
      #pragma omp directive [clause list]
• OpenMP programs execute serially until they encounter the parallel directive.
• The parallel directive is responsible for creating a group of threads.
• The main thread that encounters the parallel directive becomes the master of this group of threads and is assigned thread id 0 within the group.
• The parallel directive has the following prototype:
      #pragma omp parallel [clause list]
      /* structured block */
• Each thread created by this directive executes the structured block specified by the parallel directive.
• The clause list is used to specify conditional parallelization, the number of threads, and data handling.
• Conditional Parallelization: the clause if (scalar expression) determines whether the parallel construct results in the creation of threads. Only one if clause can be used with a parallel directive.
• Degree of Concurrency: the clause num_threads (integer expression) specifies the number of threads created by the parallel directive.
• Data Handling: the clause private (variable list) indicates that the set of variables specified is local to each thread, i.e., each thread has its own copy of each variable in the list.
• The clause firstprivate (variable list) is similar to the private clause, except that each thread's copy is initialized to the variable's value before the parallel directive.
• The clause shared (variable list) indicates that all variables in the list are shared across all the threads, i.e., there is only one copy. Special care must be taken when threads handle these variables to ensure serializability.
A sample OpenMP program along with its Pthreads translation that might be performed by an OpenMP compiler.
• OpenMP provides a powerful set of compiler directives:
1. parallel: precedes a block of code to be executed in parallel by multiple threads.
2. for: precedes a for loop with independent iterations that may be divided among threads executing in parallel.
3. parallel for: a combination of the parallel and for directives.
4. sections: precedes a series of blocks that may be executed in parallel.
5. parallel sections: a combination of the parallel and sections directives.
6. critical: precedes a critical section.
7. single: precedes a code block to be executed by a single thread.
Shared Memory Model
• Processors interact and synchronize with each other through shared variables.
Fork/Join Parallelism
• Initially only the master thread is active.
• The master thread executes the sequential code.
• Fork: the master thread creates or awakens additional threads to execute parallel code.
• Join: at the end of the parallel code, the created threads die or are suspended.
Parallel for Loops
• C programs often express data-parallel operations as for loops:
      for (i = first; i < size; i += prime)
          marked[i] = 1;
• OpenMP makes it easy to indicate when the iterations of a loop may execute in parallel.
• The compiler takes care of generating code that forks/joins threads and allocates the iterations to threads.
Shared and Private Variables
• Shared variable: has the same address in the execution context of every thread.
• Private variable: has a different address in the execution context of every thread.
• A thread cannot access the private variables of another thread.
Function omp_get_num_procs
• Returns the number of physical processors available for use by the parallel program.
      int omp_get_num_procs (void)
Function omp_set_num_threads
• Uses the parameter value to set the number of threads to be active in parallel sections of code.
• May be called at multiple points in a program.
      void omp_set_num_threads (int t)
MU Exam Questions
May 2018
• Explain the concept of shared memory programming. (5 marks)
• Explain in brief: performance bottleneck, data race and determinism, data race avoidance, and deadlock avoidance. (10 marks)
Dec 2018
• Discuss the term collective communication in MPI. (5 marks)
• Differentiate between buffered blocking and non-buffered blocking message passing operations in MPI. (10 marks)
May 2019
• Discuss the term collective communication in MPI. (5 marks)
• Differentiate between buffered blocking and non-buffered blocking message passing operations in MPI. (10 marks)
• Write a small program demonstrating functional and compiler directives in the OpenMP paradigm and MPI paradigm.