International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064
Volume 1 Issue 2, November 2012
www.ijsr.net
An Adaptive Framework towards Analyzing the Parallel Merge Sort

Husain Ullah Khan¹, Rajesh Tiwari²
¹ME Scholar, Department of Computer Science & Engineering, Shri Shankaracharya College of Engineering & Technology, Bhilai, India, khan.husain@gamil.com
²Department of Computer Science & Engineering, Shri Shankaracharya College of Engineering & Technology, Bhilai, India, raj_tiwari_in@yahoo.com
Abstract: Parallel computing on loosely coupled architectures has evolved rapidly because of the availability of fast, inexpensive processors and advances in communication technologies. The aim of this paper is to evaluate the performance of the parallel merge sort algorithm in parallel programming environments such as MPI. MPI libraries are used to establish communication and synchronization between the processes. Merge sort is analyzed in this paper because it is an efficient divide-and-conquer sorting algorithm, and it is easier to understand than other useful divide-and-conquer strategies. Given the importance of the distributed computing power of workstations or PCs connected in a local area network, our aim is to study the performance of parallel merge sort.
Keywords: parallel computing, parallel algorithms, Message Passing Interface, merge sort, performance analysis.
1. Introduction
Here, we present a parallel version of the well-known merge sort algorithm. The algorithm assumes that the sequence to be sorted is distributed, and so it generates a distributed sorted sequence. For simplicity, we assume that n is an integer multiple of p and that the n data items are distributed evenly among p tasks. Sequential merge sort requires O(n log n) [5] time to sort n elements. Because of the importance of the distributed computing power of workstations or PCs connected in a local area network [7], researchers have been studying the performance of various scientific applications [6][7][8].
2. Methodology of Parallel Merge
The methodology used for the parallel algorithm is a master-slave model in the form of a tree. Each process receives a list of elements from its predecessor process, divides it into two halves, keeps one half for itself, and sends the second half to its successor. To address the corresponding predecessor and successor we use the concept of rank calculation.
For a process having an odd rank it is calculated as:
Myrank_multiple = 2*Myrank + 1;
Temp_myrank = Myrank_multiple;
and for a process having an even rank:
Myrank_multiple = 2*Myrank + 2;
Temp_myrank = Myrank_multiple;
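The rank calculation above can be sketched as a small C helper. The function name is ours and the variables follow the paper's pseudocode; this is an illustrative sketch, not the authors' exact code:

```c
#include <stdio.h>

/* Compute the rank of the successor (helper) process for a process with
 * rank my_rank, following the odd/even rule given above. The returned
 * value corresponds to Temp_myrank in the pseudocode. */
int successor_rank(int my_rank) {
    int myrank_multiple;
    if (my_rank % 2 == 1)              /* odd rank  */
        myrank_multiple = 2 * my_rank + 1;
    else                               /* even rank */
        myrank_multiple = 2 * my_rank + 2;
    return myrank_multiple;            /* Temp_myrank */
}
```

For example, the even-ranked process 0 computes successor 2, while the odd-ranked process 1 computes successor 3.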
The algorithm uses recursive calls both to emulate the transmission of the right halves of the array and to process the left halves. Each process then receives the sorted data from its successor, merges the two sub-lists, and sends the result to its predecessor. This process continues up to the root node.
10 11 20 5 9 2 4 13
10 11 20 5 | 9 2 4 13
10 11 | 20 5 | 9 2 | 4 13
10 | 11 | 20 | 5 | 9 | 2 | 4 | 13
10 11 | 5 20 | 2 9 | 4 13
5 10 11 20 | 2 4 9 13
2 4 5 9 10 11 13 20
Figure 2.1: Communication and sorting in merge sort
The figure above illustrates the divide-and-conquer strategy. The list is divided into two sub-lists, each sub-list is again divided into two, and this is repeated until single elements remain. The merge process then combines the sub-lists in sorted order, finally producing the sorted sequence.
3. Communication and Processing
The sequential time complexity is O(n logn). In case of
parallel algorithm the complexity involves both
communication cost and computational cost. The
communication of sub problems is an overhead expense
that we want to minimize. Also, there’s no reason to
allow an internal node process to sit idle, waiting to
receive two results from its children. Instead, we want
the parent to send half the work to the child process and
then accomplish half of the work itself.
In the division phase, communication takes place at each step as follows: the root process prepares the array to sort and then invokes parallel merge sort [1], while each helper process receives data from its parent process, invokes parallel merge sort, and sends the sorted data back to its parent.
Tcomm = 2(Tstartup + (n/2)Tdata + Tstartup + (n/4)Tdata + …)
Computation occurs only in merging the sub-lists. Merging can be done by stepping through each list, repeatedly moving the smallest remaining front element into the final list. In the worst case it takes 2n-1 steps to merge two sorted lists, each of n numbers, into one sorted list in this manner.
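The merge step just described can be sketched in C. This is an illustrative sketch following the paper's description, not the authors' code:

```c
#include <stdio.h>

/* Merge sorted arrays a[0..n-1] and b[0..m-1] into out by repeatedly
 * moving the smaller front element, as described above. For two sorted
 * lists of n numbers each, this takes at most 2n-1 steps. */
void merge(const int *a, int n, const int *b, int m, int *out) {
    int i = 0, j = 0, k = 0;
    while (i < n && j < m)
        out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
    while (i < n) out[k++] = a[i++];   /* drain leftovers of a */
    while (j < m) out[k++] = b[j++];   /* drain leftovers of b */
}
```

Merging the two sorted halves from Figure 2.1, {5, 10, 11, 20} and {2, 4, 9, 13}, yields the final sequence {2, 4, 5, 9, 10, 11, 13, 20}.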
Therefore, the computation consists of:
Tcomp = 1 (P0; P1; P2; P3)
Tcomp = 3 (P0; P1)
Tcomp = 7 (P0)
Therefore, the total time required is
Ttotal = Tcomp + Tcomm
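As a minimal sketch, the two model terms can be evaluated numerically in C. The helper names tcomp and tcomm are ours, Tstartup and Tdata are treated as symbolic unit costs, and the per-level merge cost 2s-1 follows the analysis above:

```c
/* Tcomp along the root's critical path: merging two sorted s-element
 * lists costs 2s-1 steps, for s = 1, 2, ..., n/2. */
long tcomp(long n) {
    long total = 0;
    for (long s = 1; s <= n / 2; s *= 2)
        total += 2 * s - 1;            /* 1, 3, 7, ... */
    return total;
}

/* Tcomm = 2(Tstartup + (n/2)Tdata + Tstartup + (n/4)Tdata + ...):
 * one startup plus a shrinking data term per division level; the factor
 * of 2 covers sending work down and receiving results back. */
double tcomm(long n, int levels, double t_startup, double t_data) {
    double sum = 0.0;
    long chunk = n / 2;
    for (int i = 0; i < levels; i++) {
        sum += t_startup + chunk * t_data;
        chunk /= 2;
    }
    return 2.0 * sum;
}
```

For n = 8 this gives Tcomp = 1 + 3 + 7 = 11, matching the totals listed above.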
4. Choosing the Parallel Environment
MPI (Message Passing Interface): MPI is an easily used parallel processing environment whether our target system is a single multiprocessor computer with shared memory or a number of networked computers [4]. As its name implies, processing is performed through the exchange of messages among the processes cooperating in the computation. Central to computing within MPI is the
concept of a “communicator”. The MPI communicator
specifies a group of processes inside which
communication occurs. MPI_COMM_WORLD is the
initial communicator, containing all processes involved in
the computation. Each process communicates with the
others through that communicator, and has the ability to
find position within the communicator and also the total
number of processes in the communicator. Through the
communicator, processes have the ability to exchange
messages with each other. The sender of the message
specifies the process to receive the message. In addition,
the sender attaches to the message something called a
message tag, an indication of the kind of message it is.
Since these tags are simply non-negative integers, a large
number is available to the parallel programmer, since that
is the person who decides what the tags are within the
parallel problem solving system being developed. The
process receiving a message specifies both from what
process it is willing to receive a message and what the
message tag is. In addition, however, the receiving
process has the capability of using wild cards, one
specifying that it will accept a message from any sender,
the other specifying that it will accept a message with any
message tag. When the receiving process uses wild card
specifications, MPI provides a means by which the
receiving process can determine the sending process and
the tag used in sending the message. For the parallel
sorting program, we can get by with just one kind of
receive, the one that blocks execution until a message of
the specified sender and tag is available. We first need to initialize the MPI environment; MPI_Init is presumed to be called from the program's main, and so it is passed pointers to the argc and argv that main received from the operating system. We chose MPI to implement message-passing merge sort on single and networked computers because:
(i) MPI is implemented for a broad variety of architectures, including implementations that are freely available;
(ii) MPI is well documented;
(iii) MPI is more popular than other parallel platforms, such as PVM [3] and OpenMP.
Our preference for the implementation language is ANSI C, because C is fast, available on virtually any platform, and easy to program in.
Parallel merge sort is executed by various processes at various levels of the process tree, with the root at level 0, its children at level 1, and so on. The process's level and the MPI process rank are used to calculate a corresponding helper process's rank; merge sort then hands half of the array to that helper process for further sorting. Serial merge sort is invoked when no more MPI helper processes are available.
Procedure for parallel merge sort:
Step 1. Calculate the rank of the process.
Step 2. Assign ranks to the parent process and the child processes.
Step 3. Check for the left and right child according to the rank of the process.
Step 4. Sort the left sub-array.
Step 5. Check the return status from the child processes.
Step 6. Send the sorted data to the parent process.
Step 7. Repeat Step 4 for the right sub-array.
Step 8. Merge the two results back into a new array.
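As noted in Section 2, recursive calls can emulate the transmission of halves to helper processes. A minimal serial C emulation of the steps above, with the recursive call on the right half standing in for the MPI exchange with a hypothetical helper process (a sketch, not the authors' MPI program), might look like:

```c
#include <stdlib.h>
#include <string.h>

/* Step 8: merge the sorted halves a[0..mid) and a[mid..n) in place. */
static void merge_halves(int *a, int n) {
    int mid = n / 2, i = 0, j = mid, k = 0;
    int *tmp = malloc(n * sizeof *tmp);
    while (i < mid && j < n)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i < mid) tmp[k++] = a[i++];
    while (j < n)   tmp[k++] = a[j++];
    memcpy(a, tmp, n * sizeof *a);
    free(tmp);
}

void parallel_merge_sort_emulated(int *a, int n) {
    if (n < 2) return;
    /* Step 4: sort the left sub-array locally. */
    parallel_merge_sort_emulated(a, n / 2);
    /* Steps 6-7: the recursive call on the right half emulates sending
     * it to a helper process and receiving the sorted result back. */
    parallel_merge_sort_emulated(a + n / 2, n - n / 2);
    merge_halves(a, n);
}
```

In the real MPI version, the second recursive call is replaced by an MPI_Send of the right half to the helper's rank followed by an MPI_Recv of the sorted result.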
To execute the above steps successfully, various MPI functions can be used; some of them are:
1. MPI_Init(int *argc, char ***argv) //initialize MPI
2. MPI_Comm_rank (MPI_Comm comm, int *rank)
// This process's position within the communicator
3. MPI_Comm_size (MPI_Comm comm, int *size)
//Total number of processes in the communicator
4. MPI_Send( void *buf, int count, MPI_Datatype
datatype, int dest, int tag, MPI_Comm comm )
// Send a message to process with rank dest using tag
5. MPI_Recv( void *buf, int count, MPI_Datatype
datatype, int source, int tag,MPI_Comm comm
, MPI_Status *status )
//Receive a message with the specified tag from the
process with the rank source
6. MPI_Finalize() //Finalize MPI
5. Performance Evaluation and Result
Output from Communications Test
The processes communicating with each other during the divide phase are shown in tabular form.
Table 5.1: Communication process table

Building a height 3 tree
0 sending data to 4
0 sending data to 2
0 sending data to 1
1 transmitting to 0
0 getting data from 1
2 sending data to 3
3 transmitting to 2
2 getting data from 3
2 transmitting to 0
0 getting data from 2
4 sending data to 6
4 sending data to 5
5 transmitting to 4
4 getting data from 5
6 sending data to 7
7 transmitting to 6
6 getting data from 7
6 transmitting to 4
4 getting data from 6
4 transmitting to 0
0 getting data from 4
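The send pattern in the table follows a regular rule: in a height-h tree, a process at level l sends half of its data to the helper with rank my_rank + 2^(h-l-1). This formula is inferred from the table, not taken from the authors' code; a C sketch:

```c
#include <stdio.h>

/* Rank of the helper that receives the right half from a process at the
 * given level of a height-`height` division tree. */
int helper_rank(int my_rank, int level, int height) {
    return my_rank + (1 << (height - level - 1));
}

/* Print the divide-phase sends rooted at my_rank, mirroring Table 5.1
 * (entry content matches; exact interleaving may differ, since the real
 * processes run concurrently). */
void print_sends(int my_rank, int level, int height) {
    if (level >= height) return;
    int h = helper_rank(my_rank, level, height);
    printf("%d sending data to %d\n", my_rank, h);
    print_sends(my_rank, level + 1, height); /* sender keeps its left half */
    print_sends(h, level + 1, height);       /* helper divides its half    */
}
```

For a height-3 tree, rank 0 first sends to rank 4, then ranks 0 and 4 send to 2 and 6, and finally 0, 2, 4, 6 send to 1, 3, 5, 7, exactly the pairs in the table.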
Single Computer Results in MPI
4 processes mandate root height of 2
Size: 10000000
Parallel: 3.877
Sequential: 11.607
Speed-up: 2.994
8 processes mandate root height of 3
Size: 10000000
Parallel: 3.643
Sequential: 11.57
Speed-up: 3.18
On the other hand, you can force network communication by running the application in an MPI session involving multiple computers and letting MPI think that each computer has only one processor. In this environment, the mismatch between processor speed and network communication speed becomes obvious.
Networked Computer Results in MPI
4 processes mandate root height of 2
Size: 10000000
Parallel: 8.421
Sequential: 11.611
Speed-up: 1.379
8 processes mandate root height of 3
Size: 10000000
Parallel: 8.168
Sequential: 11.879
Speed-up: 1.4548
Any network communication drastically degrades performance. Alternatively, MPI can be told that each computer has four processors. In that case, MPI deals out processes by fours before moving to the next available computer. Thus, if you ask for eight parallel processes, ranks 0 through 3 will be on one computer, with the fastest possible communication, and ranks 4 through 7 will be on another computer. The only inter-computer messaging is then at the root, when rank 0 sends its right half to rank 4.
6. Conclusion
This paper introduced a parallel environment and its implementation using MPI with C for parallel merge sort, and reported performance experiments with MPI. The experiments show that a fast network can make a message-passing (MPI) solution to a problem faster: a fast network avoids a potentially large network overhead the next time the same resources are requested, and so can provide better performance.
References
[1] K. B. Manwade and R. B. Patil, "Parallel Merge Sort on Loosely Coupled Architecture", National Conference, PSG Coimbatore.
[2] The OpenMP specification for parallel programming. Retrieved March 1, 2011, from http://openmp.org.
[3] The Message Passing Interface (MPI) standard. Retrieved March 1, 2011, from http://www.mcs.anl.gov/research/projects/mpi/.
[4] A. Radenski, "Shared Memory, Message Passing, and Hybrid Merge Sorts for Standalone and Clustered SMPs", Proc. PDPTA'11, the 2011 International Conference on Parallel and Distributed Processing Techniques and Applications, CSREA Press (H. Arabnia, Ed.), 2011, pp. 367-373.
[5] B. Wilkinson and M. Allen, "Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers", Pearson.
[6] Kalim Qureshi and Haroon Rashid, "A Practical Performance Comparison of Parallel Matrix Multiplication Algorithms on Network of Workstations", IEE Transaction Japan, Vol. 125, No. 3, 2005.
[7] Kalim Qureshi and Haroon Rashid, "A Practical Performance Comparison of Two Parallel Fast Fourier Transform Algorithms on Cluster of PCs", IEE Transaction Japan, Vol. 124, No. 11, 2004.
[8] Kalim Qureshi and Masahiko Hatanaka, "A Practical Approach of Task Partitioning and Scheduling on Heterogeneous Parallel Distributed Image Computing System", Transaction of IEE Japan, Vol. 120-C, No. 1, Jan. 2000, pp. 151-157.
Husain Ullah Khan is pursuing an M.E. in Computer Technology & Applications at Shri Shankaracharya College of Engineering & Technology, CSVTU, Bhilai, India. His research areas are parallel computing and its enhancement.
Rajesh Tiwari received the M.E. degree in Computer Technology & Applications from Shri Shankaracharya College of Engineering & Technology, Bhilai, India. He is currently pursuing a Ph.D. from CSVTU, Bhilai, and has nine years of teaching experience. His research areas are parallel computing and its enhancement; his research work has been published in many national and international journals.

Performance assessment of control loopsPerformance assessment of control loops
Performance assessment of control loops
 
Capital market in bangladesh an overview
Capital market in bangladesh an overviewCapital market in bangladesh an overview
Capital market in bangladesh an overview
 
Faster and resourceful multi core web crawling
Faster and resourceful multi core web crawlingFaster and resourceful multi core web crawling
Faster and resourceful multi core web crawling
 
Extended fuzzy c means clustering algorithm in segmentation of noisy images
Extended fuzzy c means clustering algorithm in segmentation of noisy imagesExtended fuzzy c means clustering algorithm in segmentation of noisy images
Extended fuzzy c means clustering algorithm in segmentation of noisy images
 
Parallel generators of pseudo random numbers with control of calculation errors
Parallel generators of pseudo random numbers with control of calculation errorsParallel generators of pseudo random numbers with control of calculation errors
Parallel generators of pseudo random numbers with control of calculation errors
 

Último

data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfJiananWang21
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)simmis5
 
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...SUHANI PANDEY
 
Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call for Papers - International Journal of Intelligent Systems and Applicatio...Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call for Papers - International Journal of Intelligent Systems and Applicatio...Christo Ananth
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...Call Girls in Nagpur High Profile
 
Vivazz, Mieres Social Housing Design Spain
Vivazz, Mieres Social Housing Design SpainVivazz, Mieres Social Housing Design Spain
Vivazz, Mieres Social Housing Design Spaintimesproduction05
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTbhaskargani46
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performancesivaprakash250
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordAsst.prof M.Gokilavani
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptMsecMca
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptDineshKumar4165
 
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Bookingroncy bisnoi
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...ranjana rawat
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756dollysharma2066
 

Último (20)

data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdf
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
 
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
 
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
 
Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call for Papers - International Journal of Intelligent Systems and Applicatio...Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call for Papers - International Journal of Intelligent Systems and Applicatio...
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
 
Vivazz, Mieres Social Housing Design Spain
Vivazz, Mieres Social Housing Design SpainVivazz, Mieres Social Housing Design Spain
Vivazz, Mieres Social Housing Design Spain
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPT
 
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.ppt
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
 
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
 

An Adaptive Framework towards Analyzing the Parallel Merge Sort

International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064, Volume 1 Issue 2, November 2012, www.ijsr.net

Husain Ullah Khan 1, Rajesh Tiwari 2

1 ME Scholar, Department of Computer Science & Engineering, Shri Shankaracharya College of Engineering & Technology, Bhilai, India, khan.husain@gamil.com
2 Department of Computer Science & Engineering, Shri Shankaracharya College of Engineering & Technology, Bhilai, India, raj_tiwari_in@yahoo.com

Abstract: Parallel computing on loosely coupled architectures has evolved rapidly because of the availability of fast, inexpensive processors and advances in communication technologies. The aim of this paper is to evaluate the performance of the parallel merge sort algorithm in parallel programming environments such as MPI. The MPI library is used to establish communication and synchronization between the processes. Merge sort is analyzed in this paper because it is an efficient divide-and-conquer sorting algorithm and is easier to understand than other useful divide-and-conquer strategies. Given the importance of the distributed computing power of workstations or PCs connected in a local area network, our aim is to study the performance of parallel merge sort.

Keywords: parallel computing, parallel algorithms, Message Passing Interface, merge sort, performance analysis

1. Introduction

Here, we present a parallel version of the well-known merge sort algorithm. The algorithm assumes that the sequence to be sorted is distributed, and so it generates a distributed sorted sequence. For simplicity, we assume that n is an integer multiple of p and that the n data items are distributed evenly among the p tasks. Sequential merge sort requires O(n log n) [5] time to sort n elements.
Because of the importance of the distributed computing power of workstations or PCs connected in a local area network [7], researchers have been studying the performance of various scientific applications [6][7][8].

2. Methodology of Parallel Merge Sort

The parallel algorithm uses a master-slave model in the form of a tree for parallel sorting. Each process receives a list of elements from its predecessor process, divides it into two halves, keeps one half for itself, and sends the second half to its successor. To address the corresponding predecessor and successor we use rank calculation. For a process having odd rank it is calculated as

Myrank_multiple = 2*Myrank + 1;
Temp_myrank = Myrank_multiple;

and for a process having even rank it is calculated as

Myrank_multiple = 2*Myrank + 2;
Temp_myrank = Myrank_multiple;

The algorithm uses recursive calls both to emulate the transmission of the right halves of the arrays and to process the left halves. After that, each process receives the sorted data from its successor, merges the two sublists, and sends the result to its predecessor. This process continues up to the root node.

10 11 20  5  9  2  4 13
10 11 20  5  |  9  2  4 13
10 11 | 20  5 |  9  2 |  4 13
10 | 11 | 20 |  5 |  9 |  2 |  4 | 13
10 11 |  5 20 |  2  9 |  4 13
 5 10 11 20  |  2  4  9 13
 2  4  5  9 10 11 13 20

Figure 2.1: Communication and sorting in merge sort

The figure above illustrates the divide-and-conquer strategy. The list is divided into two sublists, each sublist is divided again, and this process is repeated until single elements remain. The merge process then starts, sorting as it combines, until we obtain the fully sorted sequence.

3. Communication and Processing

The sequential time complexity is O(n log n). For the parallel algorithm the complexity involves both communication cost and computation cost. The
communication of subproblems is an overhead expense that we want to minimize. Also, there is no reason to allow an internal-node process to sit idle, waiting to receive two results from its children. Instead, we want the parent to send half of the work to the child process and then perform half of the work itself.

Central to computing within MPI [2] is the concept of a "communicator". The MPI communicator specifies a group of processes inside which communication occurs. MPI_COMM_WORLD is the initial communicator, containing all processes involved in the computation. Each process communicates with the others through that communicator, and can determine its position within the communicator as well as the total number of processes in it.

In the division phase, communication takes place as follows: the root process prepares the array to sort and then invokes parallel merge sort [1], while each helper process receives data from its parent process, invokes parallel merge sort, and sends the sorted data back to its parent. The communication cost is

Tcomm = 2(Tstartup + (n/2)Tdata + Tstartup + (n/4)Tdata + ...)

Computation occurs only in merging the sublists. Merging can be done by stepping through each list, moving the smallest element found into the final list first. In the worst case it takes 2n - 1 steps to merge two sorted lists, each of n numbers, into one sorted list in this manner. The computation therefore consists of

Tcomp = 1 (P1; P2; P1; P3)
Tcomp = 3 (P0; P1)
Tcomp = 7 (P0)

and the total time required is

Ttotal = Tcomp + Tcomm
4. Choosing the Parallel Environment

MPI (Message Passing Interface): MPI [4] is an easily used parallel processing environment whether the target system is a single multiprocessor computer with shared memory or a number of networked computers. As its name implies, processing is performed through the exchange of messages among the processes that are cooperating in the computation. Through the communicator, processes exchange messages with each other. The sender of a message specifies the process that is to receive it. In addition, the sender attaches a message tag to the message, an indication of the kind of message it is. Since these tags are simply non-negative integers, a large number of them is available to the parallel programmer, who decides what the tags mean within the parallel problem-solving system being developed. The process receiving a message specifies both the process from which it is willing to receive and the expected message tag. The receiving process may also use wildcards: one specifying that it will accept a message from any sender, the other that it will accept a message with any tag. When the receiving process uses wildcard specifications, MPI provides a means by which the receiving process can determine the sending process and the tag used in sending the message.
For the parallel sorting program, we can get by with just one kind of receive: the one that blocks execution until a message with the specified sender and tag is available. The MPI environment must first be initialized; the presumption is that the initialization call is made from the program's main function, which passes on pointers to the argc and argv it received from the operating system.

We chose MPI to implement message-passing merge sort on single and networked computers because (i) MPI is implemented for a broad variety of architectures, including implementations that are freely available; (ii) MPI is well documented; and (iii) MPI is more popular than other parallel platforms, such as PVM [3] and OpenMP. Our preferred implementation language is ANSI C, because C is fast, available on virtually any platform, and easy to work with.

Parallel merge sort is executed by various processes at various levels of the process tree, with the root at level 0, its children at level 1, and so on. The process's level and its MPI rank are used to calculate the rank of a corresponding helper process; merge sort then delegates the sorting of half of the array to that helper. Serial merge sort is invoked when no more MPI helper processes are available.

Procedure for parallel merge sort:
Step 1. Calculate the rank of the process.
Step 2. Assign ranks to the parent process and the child processes.
Step 3. Check for the left and right child according to the rank of the process.
Step 4. Sort the left subarray.
Step 5. Check the return status from the child processes.
Step 6. Send the sorted data to the parent process.
Step 7. Repeat Step 4 for the right subarray.
Step 8. Merge the two results back into a new array.

To execute the above steps, the following MPI functions can be used:
1. MPI_Init(int *argc, char ***argv) // initialize MPI
2. MPI_Comm_rank(MPI_Comm comm, int *rank) // this process's position within the
communicator
3. MPI_Comm_size(MPI_Comm comm, int *size) // total number of processes in the communicator
4. MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm) // send a message to the process with rank dest, using tag
5. MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status) // receive a message with the specified tag from the process with rank source
6. MPI_Finalize() // finalize MPI

5. Performance Evaluation and Results

Output from the communications test: the processes that communicate with each other during the divide phase are shown in Table 5.1.

Table 5.1: Communication process table (building a height-3 tree)

0 sending data to 4
0 sending data to 2
0 sending data to 1
1 transmitting to 0
0 getting data from 1
2 sending data to 3
3 transmitting to 2
2 getting data from 3
2 transmitting to 0
0 getting data from 2
4 sending data to 6
4 sending data to 5
5 transmitting to 4
4 getting data from 5
6 sending data to 7
7 transmitting to 6
6 getting data from 7
6 transmitting to 4
4 getting data from 6
4 transmitting to 0
0 getting data from 4

Single-computer results in MPI:

4 processes mandate a root height of 2. Size: 10000000; Parallel: 3.877; Sequential: 11.607; Speed-up: 2.994
8 processes mandate a root height of 3. Size: 10000000; Parallel: 3.643; Sequential: 11.57; Speed-up: 3.18

On the other hand, network communication can be forced by running the application in an MPI session involving multiple computers and letting MPI think that each computer has only one processor. In this environment, the mismatch between processor speed and network communication speed becomes obvious.
Networked-computer results in MPI:

4 processes mandate a root height of 2. Size: 10000000; Parallel: 8.421; Sequential: 11.611; Speed-up: 1.379
8 processes mandate a root height of 3. Size: 10000000; Parallel: 8.168; Sequential: 11.879; Speed-up: 1.4548

Any network communication drastically degrades performance. Alternatively, MPI can be told that each computer has four processors. In that case, MPI deals out processes by fours before moving to the next available computer. Thus, if eight parallel processes are requested, ranks 0 through 3 will be on one computer, with the fastest possible communication, and ranks 4 through 7 will be on another. The only inter-computer messaging is then at the root, when rank 0 sends its right half to rank 4.

6. Conclusion

This paper introduced a parallel environment and its implementation using MPI with C for parallel merge sort, and reported performance experiments with MPI. A fast network can make a message-passing (MPI) solution to a problem faster: it avoids a potentially large network overhead the next time the same resources are requested, and so can provide better performance.

References

[1] K. B. Manwade and R. B. Patil, "Parallel merge sort on loosely coupled architecture", National Conference, PSG Coimbatore.
[2] The OpenMP specification for parallel programming. Retrieved on March 1, 2011 from http://openmp.org.
[3] The Message Passing Interface (MPI) standard. Retrieved on March 1, 2011 from http://www.mcs.anl.gov/research/projects/mpi/.
[4] A. Radenski, "Shared Memory, Message Passing, and Hybrid Merge Sorts for Standalone and Clustered SMPs", Proc. PDPTA'11, the 2011 International Conference on Parallel and Distributed Processing Techniques and Applications, CSREA Press (H. Arabnia, Ed.), 2011, pp. 367-373.
[5] B. Wilkinson and M. Allen, "Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers", Pearson.
[6] Kalim Qureshi and Haroon Rashid, "A Practical Performance Comparison of Parallel Matrix Multiplication Algorithms on Network of Workstations", IEE Transactions Japan, Vol. 125, No. 3, 2005.
[7] Kalim Qureshi and Haroon Rashid, "A Practical Performance Comparison of Two Parallel Fast Fourier Transform Algorithms on Cluster of PCs", IEE Transactions Japan, Vol. 124, No. 11, 2004.
[8] Kalim Qureshi and Masahiko Hatanaka, "A Practical Approach of Task Partitioning and Scheduling on Heterogeneous Parallel Distributed Image Computing System", Transactions of IEE Japan, Vol. 120-C, No. 1, Jan. 2000, pp. 151-157.

Husain Ullah Khan: M.E. (pursuing) in Computer Technology & Applications from Shri Shankaracharya College of Engineering & Technology, CSVTU, Bhilai, India. His research areas are parallel computing and its enhancement.

Rajesh Tiwari: Received the M.E. degree in Computer Technology & Applications from Shri Shankaracharya College of Engineering & Technology, Bhilai, India. He is currently pursuing a Ph.D. from CSVTU, Bhilai, and has nine years of teaching experience. His research areas are parallel computing and its enhancement; his research work has been published in many national and international journals.