SlideShare uma empresa Scribd logo
1 de 27
Advanced Computer Architecture
Conditions of Parallelism
Conditions of Parallelism
The exploitation of parallelism in computing
requires understanding the basic theory associated
with it. Progress is needed in several areas:
computation models for parallel computing
interprocessor communication in parallel architectures
integration of parallel systems into general environments
Data and Resource Dependencies
Program segments cannot be executed in parallel unless
they are independent.
Independence comes in several forms:
Data dependence: data modified by one segement must not be
modified by another parallel segment.
Control dependence: if the control flow of segments cannot be
identified before run time, then the data dependence between the
segments is variable.
Resource dependence: even if several segments are independent
in other ways, they cannot be executed in parallel if there aren’t
sufficient processing resources (e.g. functional units).
dependence graph
A dependence graph is used to describe the
relations between the statements.
The nodes corresponds to the program
statements and directed edges with different labels
shows the ordered relation among the statements.
Data Dependence - 1
The ordering relationship between statements is
shown by data dependence .following are the five
types data dependences:
Flow dependence(denoted by S1 S2): A
statement S2 is flow dependent on statement S1,if
an execution path is exists from S1 to S2 and if
atleast one output of 1 is used as input to S2.
Anti dependence(denoted by S1 S2): Statement
S2 is independent on statement S1 if S2 follows S1
in program order and if output of S2 overlaps the
input of S1.
Data Dependence - 2
Output dependence (denoted by S1 S2): Two
statement are output dependent if they produce the
same output variable.
I/O dependence (denoted by ): Read and
write are I/O statements . I/O dependence occure
when the same file is referenced by both I/O
statements.
Data Dependence - 3
Unknown dependence:
The subscript of a variable is itself subscripted.
The subscript does not contain the loop index variable.
A variable appears more than once with subscripts
having different coefficients of the loop variable (that is,
different functions of the loop variable).
The subscript is nonlinear in the loop index variable.
Parallel execution of program segments which do
not have total data independence can produce
non-deterministic results.
Example………….
EENG-630 - Chapter 2
I/O dependence example
S1: Read (4), A(I)
S2: Rewind (4)
S3: Write (4), B(I)
S4: Rewind (4)
S1 S3
I/O
Control Dependence
Control dependency arise when the order of
execution of statements cannot be determined
before runtime.
For example , conditional statements (IF) will not
be resolved until runtime. Differnet paths taken
after a conditional branch may introduce or
eliminate data dependence among instruction.
Dependence may also exits between operations
performed in successive iteration of looping
procedures.
Control Dependence
Dependence may also exist between operations
performed in successive iteration of looping
procedure.
The successive iteration of the following loop are
control-independent. example:
for (i=0;i<n;i++) {
a[i] = c[i];
if (a[i] < 0) a[i] = 1;
}
Control Dependence
The successive iteration of the following loop are control-
independent. example:
for (i=1;i<n;i++) {
if (a[i-1] < 0) a[i] = 1;
}
Compiler techniques are needed to get around
control dependence limitations.
Resource Dependence
Data and control dependencies are based on the
independence of the work to be done.
Resource independence is concerned with
conflicts in using shared resources, such as
registers, integer and floating point ALUs, etc.
ALU conflicts are called ALU dependence.
Memory (storage) conflicts are called storage
dependence.
The transformation of a sequentially coded
program into a parallel executable form can be
done manually by the programmer using explicit
parallism
Or done by a compiler detecting implicit parallism
automatically .
In both the approaches ,the decomposition of
program is primary objective.
Bernstein’s Conditions - 1
Bernstein’s revealed a set of conditions which
must exist if two processes can execute in parallel.
Notation
Let P1 and P2 be two processes .
Ii is the set of all input variables for a process Pi .
Oi is the set of all output variables for a process Pi .
If P1 and P2 can execute in parallel (which is written
as P1 || P2), then:
1 2
2 1
1 2
I O
I O
O O
∩ = ∅
∩ = ∅
∩ = ∅
Bernstein’s Conditions - 2
In terms of data dependencies, Bernstein’s
conditions imply that two processes can execute in
parallel if they are flow-independent, ant
independent, and output-independent.
The parallelism relation || is commutative (Pi || Pj
implies Pj || Pi ), but not transitive (Pi || Pj and Pj || Pk
does not imply Pi || Pk ) . Therefore, || is not an
equivalence relation.
Intersection of the input sets is allowed.
Hardware And Software Parallelism
For implementation of parallelism , special
hardware and software support is needed .
Compilation support is also needed to close to gap
between hardware and software .
Hardware Parallelism
Hardware parallelism is defined by machine architecture
and hardware multiplicity.
Hardware parallelism is a function of cost and performance
trade offs.
It displays the resources utilization patterns of
simultaneously executable operations.
It can also indicate the peak performance of the
processor resources.
It can be characterized by the number of
instructions that can be issued per machine cycle.
If a processor issues k instructions per machine
cycle, it is called a k-issue processor.
Conventional processors are one-issue machines.
Examples. Intel i960CA is a three-issue processor
(arithmetic, memory access, branch). IBM RS-
6000 is a four-issue processor (arithmetic, floating-
point, memory access, branch).
A machine with n k-issue processors should be
able to handle a maximum of nk threads
simultaneously.
Software Parallelism
Software parallelism is defined by the control and
data dependence of programs, and is revealed in
the program’s flow graph.
It is a function of algorithm, programming style, and
compiler optimization.
The program flow graph display the patterns of
simultaneously executable operations
Parallelism in a program varies during the
execution period.
Mismatch between software and
hardware parallelism - 1
L1 L2 L3 L4
X1 X2
+ -
A B
Maximum software
parallelism (L=load,
X/+/- = arithmetic).
Cycle 1
Cycle 2
Cycle 3
Mismatch between software and
hardware parallelism - 2
L1
L2
L4
L3X1
X2
+
-
A
B
Cycle 1
Cycle 2
Cycle 3
Cycle 4
Cycle 5
Cycle 6
Cycle 7
Same problem, but
considering the
parallelism on a two-issue
superscalar processor.
Mismatch between software and
hardware parallelism - 3
L1
L2
S1
X1
+
L5
L3
L4
S2
X2
-
L6
BA
Cycle 1
Cycle 2
Cycle 3
Cycle 4
Cycle 5
Cycle 6
Same problem,
with two single-
issue processors
= inserted for
synchronization
Types of Software Parallelism
Control Parallelism – two or more operations can
be performed simultaneously. This can be
detected by a compiler, or a programmer can
explicitly indicate control parallelism by using
special language constructs or dividing a program
into multiple processes.
Control Parallelism appear in the form of pipelining
or multi-functional units which is limited by pipeline
length and by the multiplicity of functional units.
Both are handle by the hardware .
Types of Software Parallelism
Data parallelism – multiple data elements have the
same operations applied to them at the same time.
This offers the highest potential for concurrency
(in SIMD and MIMD modes). Synchronization in
SIMD machines handled by hardware.
Data parallel code is easier to write and to
debug .
Solving the Mismatch Problems
Develop compilation support
Redesign hardware for more efficient exploitation
by compilers
Use large register files and sustained instruction
pipelining.
Have the compiler fill the branch and load delay
slots in code generated for RISC processors.
The Role of Compilers
Compilers used to exploit hardware features to
improve performance.
Interaction between compiler and architecture
design is a necessity in modern computer
development.
It is not necessarily the case that more software
parallelism will improve performance in
conventional scalar processors.
The hardware and compiler should be designed at
the same time.

Mais conteúdo relacionado

Mais procurados

Multivector and multiprocessor
Multivector and multiprocessorMultivector and multiprocessor
Multivector and multiprocessorKishan Panara
 
Applications of paralleL processing
Applications of paralleL processingApplications of paralleL processing
Applications of paralleL processingPage Maker
 
Data flow architecture
Data flow architectureData flow architecture
Data flow architectureSourav Routh
 
parallel language and compiler
parallel language and compilerparallel language and compiler
parallel language and compilerVignesh Tamil
 
Hardware and Software parallelism
Hardware and Software parallelismHardware and Software parallelism
Hardware and Software parallelismprashantdahake
 
Introduction to Parallel and Distributed Computing
Introduction to Parallel and Distributed ComputingIntroduction to Parallel and Distributed Computing
Introduction to Parallel and Distributed ComputingSayed Chhattan Shah
 
System models in distributed system
System models in distributed systemSystem models in distributed system
System models in distributed systemishapadhy
 
Parallel Programing Model
Parallel Programing ModelParallel Programing Model
Parallel Programing ModelAdlin Jeena
 
distributed Computing system model
distributed Computing system modeldistributed Computing system model
distributed Computing system modelHarshad Umredkar
 
Clustering: Large Databases in data mining
Clustering: Large Databases in data miningClustering: Large Databases in data mining
Clustering: Large Databases in data miningZHAO Sam
 
Lecture 1 introduction to parallel and distributed computing
Lecture 1   introduction to parallel and distributed computingLecture 1   introduction to parallel and distributed computing
Lecture 1 introduction to parallel and distributed computingVajira Thambawita
 
Data decomposition techniques
Data decomposition techniquesData decomposition techniques
Data decomposition techniquesMohamed Ramadan
 

Mais procurados (20)

Multivector and multiprocessor
Multivector and multiprocessorMultivector and multiprocessor
Multivector and multiprocessor
 
Underlying principles of parallel and distributed computing
Underlying principles of parallel and distributed computingUnderlying principles of parallel and distributed computing
Underlying principles of parallel and distributed computing
 
Applications of paralleL processing
Applications of paralleL processingApplications of paralleL processing
Applications of paralleL processing
 
Distributed System ppt
Distributed System pptDistributed System ppt
Distributed System ppt
 
Parallel processing
Parallel processingParallel processing
Parallel processing
 
Data flow architecture
Data flow architectureData flow architecture
Data flow architecture
 
parallel language and compiler
parallel language and compilerparallel language and compiler
parallel language and compiler
 
Parallel computing persentation
Parallel computing persentationParallel computing persentation
Parallel computing persentation
 
Hardware and Software parallelism
Hardware and Software parallelismHardware and Software parallelism
Hardware and Software parallelism
 
Introduction to Parallel and Distributed Computing
Introduction to Parallel and Distributed ComputingIntroduction to Parallel and Distributed Computing
Introduction to Parallel and Distributed Computing
 
System models in distributed system
System models in distributed systemSystem models in distributed system
System models in distributed system
 
Loops in flow
Loops in flowLoops in flow
Loops in flow
 
Parallel Programing Model
Parallel Programing ModelParallel Programing Model
Parallel Programing Model
 
Type Checking(Compiler Design) #ShareThisIfYouLike
Type Checking(Compiler Design) #ShareThisIfYouLikeType Checking(Compiler Design) #ShareThisIfYouLike
Type Checking(Compiler Design) #ShareThisIfYouLike
 
distributed Computing system model
distributed Computing system modeldistributed Computing system model
distributed Computing system model
 
Parallel processing
Parallel processingParallel processing
Parallel processing
 
Clustering: Large Databases in data mining
Clustering: Large Databases in data miningClustering: Large Databases in data mining
Clustering: Large Databases in data mining
 
Lecture 1 introduction to parallel and distributed computing
Lecture 1   introduction to parallel and distributed computingLecture 1   introduction to parallel and distributed computing
Lecture 1 introduction to parallel and distributed computing
 
6.distributed shared memory
6.distributed shared memory6.distributed shared memory
6.distributed shared memory
 
Data decomposition techniques
Data decomposition techniquesData decomposition techniques
Data decomposition techniques
 

Semelhante a advanced computer architesture-conditions of parallelism

ACA-Lect10.pptx
ACA-Lect10.pptxACA-Lect10.pptx
ACA-Lect10.pptxmeghana092
 
Lec 4 (program and network properties)
Lec 4 (program and network properties)Lec 4 (program and network properties)
Lec 4 (program and network properties)Sudarshan Mondal
 
Code scheduling constraints
Code scheduling constraintsCode scheduling constraints
Code scheduling constraintsArchanaMani2
 
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docxCS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docxfaithxdunce63732
 
Hardware and software parallelism
Hardware and software parallelismHardware and software parallelism
Hardware and software parallelismSumita Das
 
Reflective and Refractive Variables: A Model for Effective and Maintainable A...
Reflective and Refractive Variables: A Model for Effective and Maintainable A...Reflective and Refractive Variables: A Model for Effective and Maintainable A...
Reflective and Refractive Variables: A Model for Effective and Maintainable A...Vincenzo De Florio
 
Scaling Application on High Performance Computing Clusters and Analysis of th...
Scaling Application on High Performance Computing Clusters and Analysis of th...Scaling Application on High Performance Computing Clusters and Analysis of th...
Scaling Application on High Performance Computing Clusters and Analysis of th...Rusif Eyvazli
 
Traffic Simulator
Traffic SimulatorTraffic Simulator
Traffic Simulatorgystell
 
Disadvantages Of Robotium
Disadvantages Of RobotiumDisadvantages Of Robotium
Disadvantages Of RobotiumSusan Tullis
 
Pipelining and vector processing
Pipelining and vector processingPipelining and vector processing
Pipelining and vector processingKamal Acharya
 
LOCK-FREE PARALLEL ACCESS COLLECTIONS
LOCK-FREE PARALLEL ACCESS COLLECTIONSLOCK-FREE PARALLEL ACCESS COLLECTIONS
LOCK-FREE PARALLEL ACCESS COLLECTIONSijdpsjournal
 
Lock free parallel access collections
Lock free parallel access collectionsLock free parallel access collections
Lock free parallel access collectionsijdpsjournal
 
1000 solved questions
1000 solved questions1000 solved questions
1000 solved questionsKranthi Kumar
 
Machine Learning At Speed: Operationalizing ML For Real-Time Data Streams
Machine Learning At Speed: Operationalizing ML For Real-Time Data StreamsMachine Learning At Speed: Operationalizing ML For Real-Time Data Streams
Machine Learning At Speed: Operationalizing ML For Real-Time Data StreamsLightbend
 

Semelhante a advanced computer architesture-conditions of parallelism (20)

Parallel programming model
Parallel programming modelParallel programming model
Parallel programming model
 
ACA-Lect10.pptx
ACA-Lect10.pptxACA-Lect10.pptx
ACA-Lect10.pptx
 
Lec 4 (program and network properties)
Lec 4 (program and network properties)Lec 4 (program and network properties)
Lec 4 (program and network properties)
 
C question
C questionC question
C question
 
Code scheduling constraints
Code scheduling constraintsCode scheduling constraints
Code scheduling constraints
 
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docxCS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
 
Hardware and software parallelism
Hardware and software parallelismHardware and software parallelism
Hardware and software parallelism
 
Reflective and Refractive Variables: A Model for Effective and Maintainable A...
Reflective and Refractive Variables: A Model for Effective and Maintainable A...Reflective and Refractive Variables: A Model for Effective and Maintainable A...
Reflective and Refractive Variables: A Model for Effective and Maintainable A...
 
Scaling Application on High Performance Computing Clusters and Analysis of th...
Scaling Application on High Performance Computing Clusters and Analysis of th...Scaling Application on High Performance Computing Clusters and Analysis of th...
Scaling Application on High Performance Computing Clusters and Analysis of th...
 
Traffic Simulator
Traffic SimulatorTraffic Simulator
Traffic Simulator
 
Lect12-13_MS_Networks.pptx
Lect12-13_MS_Networks.pptxLect12-13_MS_Networks.pptx
Lect12-13_MS_Networks.pptx
 
Disadvantages Of Robotium
Disadvantages Of RobotiumDisadvantages Of Robotium
Disadvantages Of Robotium
 
Software maintenance
Software maintenanceSoftware maintenance
Software maintenance
 
Pipelining and vector processing
Pipelining and vector processingPipelining and vector processing
Pipelining and vector processing
 
LOCK-FREE PARALLEL ACCESS COLLECTIONS
LOCK-FREE PARALLEL ACCESS COLLECTIONSLOCK-FREE PARALLEL ACCESS COLLECTIONS
LOCK-FREE PARALLEL ACCESS COLLECTIONS
 
Lock free parallel access collections
Lock free parallel access collectionsLock free parallel access collections
Lock free parallel access collections
 
1000 solved questions
1000 solved questions1000 solved questions
1000 solved questions
 
PID2143641
PID2143641PID2143641
PID2143641
 
p850-ries
p850-riesp850-ries
p850-ries
 
Machine Learning At Speed: Operationalizing ML For Real-Time Data Streams
Machine Learning At Speed: Operationalizing ML For Real-Time Data StreamsMachine Learning At Speed: Operationalizing ML For Real-Time Data Streams
Machine Learning At Speed: Operationalizing ML For Real-Time Data Streams
 

Último

ClimART Action | eTwinning Project
ClimART Action    |    eTwinning ProjectClimART Action    |    eTwinning Project
ClimART Action | eTwinning Projectjordimapav
 
Man or Manufactured_ Redefining Humanity Through Biopunk Narratives.pptx
Man or Manufactured_ Redefining Humanity Through Biopunk Narratives.pptxMan or Manufactured_ Redefining Humanity Through Biopunk Narratives.pptx
Man or Manufactured_ Redefining Humanity Through Biopunk Narratives.pptxDhatriParmar
 
4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptxmary850239
 
Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4JOYLYNSAMANIEGO
 
Scientific Writing :Research Discourse
Scientific  Writing :Research  DiscourseScientific  Writing :Research  Discourse
Scientific Writing :Research DiscourseAnita GoswamiGiri
 
week 1 cookery 8 fourth - quarter .pptx
week 1 cookery 8  fourth  -  quarter .pptxweek 1 cookery 8  fourth  -  quarter .pptx
week 1 cookery 8 fourth - quarter .pptxJonalynLegaspi2
 
Textual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSTextual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSMae Pangan
 
Multi Domain Alias In the Odoo 17 ERP Module
Multi Domain Alias In the Odoo 17 ERP ModuleMulti Domain Alias In the Odoo 17 ERP Module
Multi Domain Alias In the Odoo 17 ERP ModuleCeline George
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management systemChristalin Nelson
 
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptxDecoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptxDhatriParmar
 
Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1GloryAnnCastre1
 
Using Grammatical Signals Suitable to Patterns of Idea Development
Using Grammatical Signals Suitable to Patterns of Idea DevelopmentUsing Grammatical Signals Suitable to Patterns of Idea Development
Using Grammatical Signals Suitable to Patterns of Idea Developmentchesterberbo7
 
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQ-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQuiz Club NITW
 
Congestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentationCongestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentationdeepaannamalai16
 
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxBIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxSayali Powar
 
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxGrade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxkarenfajardo43
 
Measures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataMeasures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataBabyAnnMotar
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptxmary850239
 

Último (20)

ClimART Action | eTwinning Project
ClimART Action    |    eTwinning ProjectClimART Action    |    eTwinning Project
ClimART Action | eTwinning Project
 
Man or Manufactured_ Redefining Humanity Through Biopunk Narratives.pptx
Man or Manufactured_ Redefining Humanity Through Biopunk Narratives.pptxMan or Manufactured_ Redefining Humanity Through Biopunk Narratives.pptx
Man or Manufactured_ Redefining Humanity Through Biopunk Narratives.pptx
 
4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx
 
Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4
 
Scientific Writing :Research Discourse
Scientific  Writing :Research  DiscourseScientific  Writing :Research  Discourse
Scientific Writing :Research Discourse
 
week 1 cookery 8 fourth - quarter .pptx
week 1 cookery 8  fourth  -  quarter .pptxweek 1 cookery 8  fourth  -  quarter .pptx
week 1 cookery 8 fourth - quarter .pptx
 
Textual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSTextual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHS
 
Multi Domain Alias In the Odoo 17 ERP Module
Multi Domain Alias In the Odoo 17 ERP ModuleMulti Domain Alias In the Odoo 17 ERP Module
Multi Domain Alias In the Odoo 17 ERP Module
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptxDecoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
 
Faculty Profile prashantha K EEE dept Sri Sairam college of Engineering
Faculty Profile prashantha K EEE dept Sri Sairam college of EngineeringFaculty Profile prashantha K EEE dept Sri Sairam college of Engineering
Faculty Profile prashantha K EEE dept Sri Sairam college of Engineering
 
Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1
 
Using Grammatical Signals Suitable to Patterns of Idea Development
Using Grammatical Signals Suitable to Patterns of Idea DevelopmentUsing Grammatical Signals Suitable to Patterns of Idea Development
Using Grammatical Signals Suitable to Patterns of Idea Development
 
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQ-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
 
Congestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentationCongestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentation
 
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptxINCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
 
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxBIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
 
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxGrade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
 
Measures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataMeasures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped data
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx
 

advanced computer architesture-conditions of parallelism

  • 2. Conditions of Parallelism The exploitation of parallelism in computing requires understanding the basic theory associated with it. Progress is needed in several areas: computation models for parallel computing interprocessor communication in parallel architectures integration of parallel systems into general environments
  • 3. Data and Resource Dependencies Program segments cannot be executed in parallel unless they are independent. Independence comes in several forms: Data dependence: data modified by one segement must not be modified by another parallel segment. Control dependence: if the control flow of segments cannot be identified before run time, then the data dependence between the segments is variable. Resource dependence: even if several segments are independent in other ways, they cannot be executed in parallel if there aren’t sufficient processing resources (e.g. functional units).
  • 4. dependence graph A dependence graph is used to describe the relations between the statements. The nodes corresponds to the program statements and directed edges with different labels shows the ordered relation among the statements.
  • 5. Data Dependence - 1 The ordering relationship between statements is shown by data dependence .following are the five types data dependences: Flow dependence(denoted by S1 S2): A statement S2 is flow dependent on statement S1,if an execution path is exists from S1 to S2 and if atleast one output of 1 is used as input to S2. Anti dependence(denoted by S1 S2): Statement S2 is independent on statement S1 if S2 follows S1 in program order and if output of S2 overlaps the input of S1.
  • 6. Data Dependence - 2 Output dependence (denoted by S1 S2): Two statement are output dependent if they produce the same output variable. I/O dependence (denoted by ): Read and write are I/O statements . I/O dependence occure when the same file is referenced by both I/O statements.
  • 7. Data Dependence - 3 Unknown dependence: The subscript of a variable is itself subscripted. The subscript does not contain the loop index variable. A variable appears more than once with subscripts having different coefficients of the loop variable (that is, different functions of the loop variable). The subscript is nonlinear in the loop index variable. Parallel execution of program segments which do not have total data independence can produce non-deterministic results.
  • 9. EENG-630 - Chapter 2 I/O dependence example S1: Read (4), A(I) S2: Rewind (4) S3: Write (4), B(I) S4: Rewind (4) S1 S3 I/O
  • 10. Control Dependence Control dependency arise when the order of execution of statements cannot be determined before runtime. For example , conditional statements (IF) will not be resolved until runtime. Differnet paths taken after a conditional branch may introduce or eliminate data dependence among instruction. Dependence may also exits between operations performed in successive iteration of looping procedures.
  • 11. Control Dependence Dependence may also exist between operations performed in successive iteration of looping procedure. The successive iteration of the following loop are control-independent. example: for (i=0;i<n;i++) { a[i] = c[i]; if (a[i] < 0) a[i] = 1; }
  • 12. Control Dependence The successive iteration of the following loop are control- independent. example: for (i=1;i<n;i++) { if (a[i-1] < 0) a[i] = 1; } Compiler techniques are needed to get around control dependence limitations.
  • 13. Resource Dependence Data and control dependencies are based on the independence of the work to be done. Resource independence is concerned with conflicts in using shared resources, such as registers, integer and floating point ALUs, etc. ALU conflicts are called ALU dependence. Memory (storage) conflicts are called storage dependence.
  • 14. The transformation of a sequentially coded program into a parallel executable form can be done manually by the programmer using explicit parallism Or done by a compiler detecting implicit parallism automatically . In both the approaches ,the decomposition of program is primary objective.
  • 15. Bernstein’s Conditions - 1 Bernstein’s revealed a set of conditions which must exist if two processes can execute in parallel. Notation Let P1 and P2 be two processes . Ii is the set of all input variables for a process Pi . Oi is the set of all output variables for a process Pi . If P1 and P2 can execute in parallel (which is written as P1 || P2), then: 1 2 2 1 1 2 I O I O O O ∩ = ∅ ∩ = ∅ ∩ = ∅
  • 16. Bernstein’s Conditions - 2 In terms of data dependencies, Bernstein’s conditions imply that two processes can execute in parallel if they are flow-independent, ant independent, and output-independent. The parallelism relation || is commutative (Pi || Pj implies Pj || Pi ), but not transitive (Pi || Pj and Pj || Pk does not imply Pi || Pk ) . Therefore, || is not an equivalence relation. Intersection of the input sets is allowed.
  • 17. Hardware And Software Parallelism For implementation of parallelism , special hardware and software support is needed . Compilation support is also needed to close to gap between hardware and software .
  • 18. Hardware Parallelism Hardware parallelism is defined by machine architecture and hardware multiplicity. Hardware parallelism is a function of cost and performance trade offs. It displays the resources utilization patterns of simultaneously executable operations. It can also indicate the peak performance of the processor resources.
  • 19. It can be characterized by the number of instructions that can be issued per machine cycle. If a processor issues k instructions per machine cycle, it is called a k-issue processor. Conventional processors are one-issue machines. Examples. Intel i960CA is a three-issue processor (arithmetic, memory access, branch). IBM RS- 6000 is a four-issue processor (arithmetic, floating- point, memory access, branch). A machine with n k-issue processors should be able to handle a maximum of nk threads simultaneously.
  • 20. Software Parallelism Software parallelism is defined by the control and data dependence of programs, and is revealed in the program’s flow graph. It is a function of algorithm, programming style, and compiler optimization. The program flow graph display the patterns of simultaneously executable operations Parallelism in a program varies during the execution period.
  • 21. Mismatch between software and hardware parallelism - 1 L1 L2 L3 L4 X1 X2 + - A B Maximum software parallelism (L=load, X/+/- = arithmetic). Cycle 1 Cycle 2 Cycle 3
  • 22. Mismatch between software and hardware parallelism - 2 L1 L2 L4 L3X1 X2 + - A B Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Same problem, but considering the parallelism on a two-issue superscalar processor.
  • 23. Mismatch between software and hardware parallelism - 3 L1 L2 S1 X1 + L5 L3 L4 S2 X2 - L6 BA Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Same problem, with two single- issue processors = inserted for synchronization
  • 24. Types of Software Parallelism Control Parallelism – two or more operations can be performed simultaneously. This can be detected by a compiler, or a programmer can explicitly indicate control parallelism by using special language constructs or dividing a program into multiple processes. Control Parallelism appear in the form of pipelining or multi-functional units which is limited by pipeline length and by the multiplicity of functional units. Both are handle by the hardware .
  • 25. Types of Software Parallelism Data parallelism – multiple data elements have the same operations applied to them at the same time. This offers the highest potential for concurrency (in SIMD and MIMD modes). Synchronization in SIMD machines handled by hardware. Data parallel code is easier to write and to debug .
  • 26. Solving the Mismatch Problems Develop compilation support Redesign hardware for more efficient exploitation by compilers Use large register files and sustained instruction pipelining. Have the compiler fill the branch and load delay slots in code generated for RISC processors.
  • 27. The Role of Compilers Compilers used to exploit hardware features to improve performance. Interaction between compiler and architecture design is a necessity in modern computer development. It is not necessarily the case that more software parallelism will improve performance in conventional scalar processors. The hardware and compiler should be designed at the same time.