1. EEDC - Execution Environments for Distributed Computing
34330
Scientific Programming Models
Master in Computer Architecture, Networks and Systems - CANS
Group members:
Francesc Lordan francesc.lordan@bsc.es
Roger Rafanell roger.rafanell@bsc.es
2. Outline
Scientific Programming Models
– Part 1: Introduction
– Part 2: Reference parallel programming models
– Part 3: Novel parallel programming models
– Part 4: Conclusions
– Part 5: Questions
3. Introduction
Scientific applications:
– Solve complex problems
– Are usually long-running applications
– Are implemented as a sequence of steps
– Each step (task) can be hard to compute
– So …
4. Introduction
In terms of execution time…
Scientific applications can no longer be
approached in a purely sequential way!!!
5. Introduction
We need solutions that distribute and
parallelize the work.
6. Introduction: MPI
1980s - early 1990s: distributed-memory parallel computing started
as a collection of incompatible software tools for writing programs.
In 1994, MPI (Message Passing Interface)
became the new reference standard.
It provides:
– Portability
– Performance
– Functionality
– Availability (many implementations)
Good for: parallelizing the processing by distributing the work among
different machines/nodes.
7. Introduction: OpenMP
In the early 90's, vendors of shared-memory machines supplied similar,
directive-based Fortran programming extensions:
The user could extend a serial Fortran program with directives specifying
which loops were to be parallelized.
The compiler would automatically parallelize such loops across the SMP
processors.
Implementations were all functionally similar, but were diverging (as usual).
Good for: parallelizing the computation among all the resources of a
single machine.
8. Reference PM: OpenMP
Programming model:
Computation is done by threads.
Fork-join model: Threads are dynamically created and destroyed.
Programmer can specify which variables are shared among threads
and which are private.
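OpenMP itself targets C, C++ and Fortran, but the fork-join idea above can be sketched in plain Java (an illustration only, not OpenMP): the snippet below forks the iterations of a loop across a thread pool, keeps the per-iteration variables private to each thread, shares the result array, and joins before the final reduction.

import java.util.Arrays;
import java.util.stream.IntStream;

// Fork-join sketch in plain Java (illustration only, not OpenMP):
// the main thread forks the loop iterations across a thread pool
// and joins before the final reduction.
public class ForkJoinSketch {
    public static void main(String[] args) {
        int n = 1_000_000;
        double step = 1.0 / n;
        double[] partial = new double[n];      // shared among all worker threads

        // Roughly the spirit of an OpenMP "parallel for" over i.
        IntStream.range(0, n).parallel().forEach(i -> {
            double x = (i + 0.5) * step;       // x and i are private to each thread
            partial[i] = 4.0 / (1.0 + x * x);
        });

        // Join: all worker threads are done before the reduction runs.
        double pi = step * Arrays.stream(partial).sum();
        System.out.println(pi);                // ~3.14159
    }
}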
11. Reference PM: OpenMP
Strong Points:
– Keeps the sequential version.
– Communication is implicit.
– Easy to program, debug and modify.
– Good performance and scalability.
Weaknesses:
– Communication is implicit (less control).
– Simple and flat memory model (does not run on clusters).
– No support for accelerators.
12. Reference PM: MPI
Programming model:
Computation is done by several processes that execute the same program.
Processes communicate by passing data (send/receive).
The programmer decides:
– Which role each process plays (by branching, e.g. on the process rank).
– Which communications take place, and in what order.
14. Reference PM: MPI
Strong Points:
– Any parallel algorithm can be expressed in terms of the MPI paradigm.
– Data placement problems are rarely observed.
– Suitable for clusters/supercomputers (large number of processors).
– Excellent performance and scalability.
Weaknesses:
– Communication is explicit.
– Re-fitting serial code using MPI often requires refactoring.
– Dynamic load balancing is difficult to implement.
15. Reference PM: The best of both worlds
Hybrid (MPI + OpenMP):
– MPI is most effective for problems with “coarse-grained” parallelism.
– “Fine-grained” parallelization is successfully handled by OpenMP.
When to use hybrid programming?
– The code exhibits limited scaling with MPI.
– The code could make use of dynamic load balancing.
– The code exhibits fine-grained parallelism, or a combination of fine-grained and
coarse-grained parallelism.
Some algorithms, such as computational fluid
dynamics, benefit greatly from a hybrid approach!!!
17. Reference PM: New reference approaches
Heterogeneous parallel computing:
– CUDA (from NVIDIA)
– OpenCL (Open Computing Language)
– Cross-platform
• Implementations for:
– ATI GPUs
– NVIDIA GPUs
– x86 CPUs
– API similar to OpenGL.
– Based on C.
18. Novel PMs
Workflows:
– Based on processes
– Require planning and scheduling
– Need flow control
– In-transit visibility
Novel PMs:
– Complex problems require simple solutions
(not based on the reference PMs)
19. Microsoft Dryad
The Dryad Project investigates a programming model
for writing parallel and distributed programs that scale from
a small cluster to a large data-center.
Theoretical approach (not widely used):
– Last (and only) publication in 2007.
The user defines:
– a set of methods
– a task dependency graph, written in a specific language.
21. MapReduce
The programmer only defines 2 functions:
– Map(KInput, VInput) → list(KTemp, VTemp)
– Reduce(KTemp, list(VTemp)) → list(VTemp)
The library is in charge of all the rest.
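As an illustration, here is a minimal, non-distributed Java sketch of the two user-supplied functions for a word-count job; the names and types are illustrative only and not tied to any particular MapReduce implementation.

import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Word-count sketch of the two user-supplied MapReduce functions.
// A real framework (e.g. Hadoop) wraps these in Mapper/Reducer classes and
// handles partitioning, shuffling and fault tolerance; this only shows the signatures.
public class WordCountSketch {

    // Map(KInput, VInput) -> list(KTemp, VTemp): emit (word, 1) for every word in a line.
    static List<Map.Entry<String, Integer>> map(Long lineNumber, String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // Reduce(KTemp, list(VTemp)) -> list(VTemp): sum all counts emitted for one word.
    static List<Integer> reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return Collections.singletonList(sum);
    }

    public static void main(String[] args) {
        // The framework would group map outputs by key before calling reduce.
        System.out.println(map(1L, "to be or not to be"));     // [to=1, be=1, or=1, not=1, to=1, be=1]
        System.out.println(reduce("to", Arrays.asList(1, 1))); // [2]
    }
}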
22. MapReduce
Weaknesses
– Very specific programming model.
– Not every problem is easy to express as key/value pairs.
Strong points
– Efficiency.
– Simplicity of the model.
– Community and tools.
24. COMPSs overview - Objective
Reduce the development complexity of
Grid/Cluster/Cloud applications to the minimum
– As easy as writing a sequential application.
Target applications: composed of tasks, most of them
repetitive
– Granularity of the tasks at the level of simulations or programs.
– Data: files, objects, arrays, primitive types.
25. COMPSs overview - Main idea
Sequential code:
...
for (i=0; i<N; i++){
T1 (data1, data2);
T2 (data4, data5);
T3 (data2, data5, data6);
T4 (data7, data8);
T5 (data6, data8, data9);
}
...
(Figure: runtime steps)
(a) Task selection + parameter direction (input, output, inout)
(b) Task graph creation based on data dependencies (tasks T1 … T5 across iterations)
(c) Scheduling, data transfer, task execution on the parallel resources (Resource 1 … Resource N)
(d) Task completion, synchronization
26. Programming model - Sample application
Main program
public void main(){
    Double sum = 0.0;
    double pi;
    double step = 1.0d / (double) num_steps;
    for (int i = 0; i < num_steps; i++){
        computeInterval(i, step, sum);
    }
    pi = sum * step;
}
Subroutine
public static void computeInterval(int index, double step, Double acum) {
    // contribution of one interval of the numerical integration of 4/(1+x*x),
    // whose sum approximates pi
    double x = (index - 0.5) * step;
    acum = acum + 4.0 / (1.0 + x * x);
}
27. Programming Model - Task Selection
Task selection interface
public interface PiItf {

    @Method(declaringClass = "Pi")      // implementation: class providing the method
    void computeInterval(
        @Parameter(direction = IN)
        int index,
        @Parameter(direction = IN)
        double step,
        @Parameter(direction = INOUT)   // parameter metadata: direction of each argument
        Double sum
    );
}
28. Programming Model – Main code (NO CHANGES!)
public static void main(String[] args) {
    Double sum = 0.0;
    double pi;
    double step = 1.0d / (double) num_steps;
    for (int i = 0; i < num_steps; i++){
        computeInterval(i, step, sum);
    }
    pi = sum * step;
}
(Figure: resulting task graph: the computeInterval tasks for steps 0, 1, …, N-1, linked through the INOUT sum parameter, followed by a synchronization (SYNCH) on sum.)
29. Programming Model – Real Example
HMMER
Inputs: a protein database and an amino acid query sequence
IQKKSGKWHTLTDLRA VNAVIQPMGPLQPGLP SPAMIPKDWPLIIIDLK DCFFTIPLAEQDCEKFA FTIPAINNKEPATRF
Output: per-model scores
Model     Score    E-value   N
--------  -------  --------  ---
IL6_2     -78.5    0.13      1
COLFI_2   -164.5   0.35      1
pgtp_13   -36.3    0.48      1
clf2      -15.6    3.6       1
PKD_9     -24.0    5         1
35. COMPSs
Strong points
– Sequential programming approach
– Parallelization at task level
– Transparent data management and remote execution
– Can operate on different infrastructures:
• Cluster/Grid
• Cloud (Public/Private)
– PaaS
– IaaS
• Web services
Weaknesses:
– Under continuous development
– Does not offer binding to other languages (currently)
37. Manjrasoft Aneka
.NET based Platform-as-a-Service
Allows the usage of:
– Private Clouds.
– Public Clouds: Amazon EC2, Azure, GoGrid.
Offers mechanisms to control, reserve and monitor the resources.
– Also offers autoscaling mechanisms.
3 programming models:
– Task-based: tasks are put in a bag of executable tasks.
– Thread-based: exposes the .NET thread API, but the threads are created remotely.
– MapReduce
No data dependency analysis!!
38. Microsoft Azure
.NET based Platform-as-a-Service
Computing services
– Web Role: Web Service frontend.
– Worker Role: Backend.
Storage Services
Strong Point
– Scalable architecture.
Weakness
– Platform-tied applications.
39. Conclusions
Scientific problems are usually complex.
The current reference PMs are usually unsuitable for them.
Novel, more flexible PMs have come into the game.
There is still a gap between scientists and user-friendly,
workflow-oriented programming models.
A sea of available solutions (DSLs).
The programming model can be defined as task-based and dependency-aware. In it, the programmer is only required to select a set of methods called from a sequential Java application, for them to be run as parallel tasks on the available distributed resources. Initially, the application starts running sequentially in one node and, whenever a call to a selected method is found, an asynchronous task is created instead, letting the main program continue its execution right away. The created tasks are processed by the runtime, which discovers the dependencies between them, building a task dependency graph. A renaming technique is used to avoid some kinds of dependencies. The parallelism exhibited by the graph is exploited as much as possible, scheduling the dependency-free tasks on the available resources. The scheduling is locality-aware: nodes can cache task data for later use, and a node that already has some or all the input data for a task gets more chances to run it. The runtime also manages these data - performing data copies or transfers if necessary - and controls the completion of tasks.
First, the user has to provide a Java interface which declares the methods that must be executed on the Grid, that is to say, the different kinds of tasks. As mentioned before, a task is a given call to one of these methods from the application code. In addition, the user can use Java annotations to provide: first, the class that implements the method; second, the constraints for each kind of task, i.e. the capabilities that a resource must have to run the task (this is optional); and third, the type and direction of the parameters for each kind of task, which is mandatory. Currently we support the file type, the String type and all the primitive types. A sketch of such an interface is given below.
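As a rough illustration (following the annotation style of the PiItf interface on slide 27; the interface, class and method names below are hypothetical, and the annotation values such as FILE, IN and OUT are assumptions about the COMPSs API rather than confirmed syntax), a task selection interface for an HMMER-like scoring method with file parameters could look like this:

// Hypothetical task selection interface, written in the style of PiItf above.
// HmmerItf, Hmmer and scoreSequence are illustrative names; the annotation
// values (FILE, IN, OUT) are assumptions about the COMPSs annotation API.
public interface HmmerItf {

    @Method(declaringClass = "Hmmer")            // class implementing the task
    void scoreSequence(
        @Parameter(type = FILE, direction = IN)  // input file: the protein database
        String database,
        @Parameter(direction = IN)               // plain String parameter: the query sequence
        String sequence,
        @Parameter(type = FILE, direction = OUT) // output file: the score report
        String reportFile
    );
}

The optional per-task resource constraints mentioned above would be declared with an additional annotation on the method; its exact syntax is not shown here.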