CSE 8383 - Advanced
Computer Architecture

            Week-3
     Week of Jan 26, 2004
   engr.smu.edu/~rewini/8383
Contents
   Linear Pipelines
   Nonlinear pipelines
   Instruction Pipelines
   Arithmetic Operations
   Design of Multifunction Pipeline
Linear Pipeline
   Processing Stages are linearly
    connected
   Perform fixed function
   Synchronous Pipeline
       Clocked latches between Stage i and
        Stage i+1
       Equal delays in all stages
   Asynchronous Pipeline (Handshaking)
Latches


     S1               S2              S3


              L1                 L2

Slowest stage determines delay

Equal delays → clock period
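
The point that the slowest stage sets the pace can be illustrated with a minimal sketch. The function name, the nanosecond units, and the `latch_delay_ns` parameter are assumptions made for illustration; the slide itself only states that the slowest stage determines the delay.

```python
def clock_period(stage_delays_ns, latch_delay_ns=0.0):
    """Clock period of a synchronous linear pipeline.

    The clock must be long enough for the slowest stage, plus any
    latch overhead (the latch term is an illustrative assumption).
    """
    return max(stage_delays_ns) + latch_delay_ns


# Three stages with unequal delays: the 14 ns stage sets the clock.
print(clock_period([10.0, 14.0, 12.0], latch_delay_ns=1.0))
```

With unequal stage delays the faster stages sit idle for part of every cycle, which is why equal delays are the design goal.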
Reservation Table
          Time

      1   2   3   4
S1    X
S2        X
S3            X
S4                X
5 tasks on 4 stages
                  Time

      1   2   3   4   5   6   7   8
S1    X   X   X   X   X
S2        X   X   X   X   X
S3            X   X   X   X   X
S4                X   X   X   X   X
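
The space-time diagram above can be generated with a short sketch. It assumes one new task enters the pipeline every cycle and every stage takes exactly one cycle; the function names are hypothetical.

```python
def pipeline_cycles(num_stages, num_tasks):
    """Cycles needed to push num_tasks through a linear pipeline."""
    # The first task needs num_stages cycles; each later task finishes
    # one cycle after its predecessor.
    return num_stages + num_tasks - 1


def space_time(num_stages, num_tasks):
    """rows[s][t] holds the 1-based task number in stage s at cycle t, or 0 if idle."""
    total = pipeline_cycles(num_stages, num_tasks)
    rows = []
    for stage in range(num_stages):
        row = [0] * total
        for task in range(num_tasks):
            row[stage + task] = task + 1
        rows.append(row)
    return rows


for stage, row in enumerate(space_time(4, 5), start=1):
    print(f"S{stage}", row)
print("total cycles:", pipeline_cycles(4, 5))
```

For 5 tasks on 4 stages this gives 4 + 5 - 1 = 8 cycles, matching the eight time columns in the diagram.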
Non Linear Pipelines
   Variable functions
   Feed-Forward
   Feedback
3 stages & 2 functions
       X                  Y



 S1        S2        S3
Reservation Tables for X & Y
      1   2   3   4   5   6   7   8
S1    X                   X       X
S2        X       X
S3            X       X       X

      1   2   3   4   5   6
S1    Y               Y
S2            Y
S3        Y       Y       Y
Linear Instruction Pipelines
   Assume the following instruction
    execution phases:
       Fetch (F)
       Decode (D)
       Operand Fetch (O)
       Execute (E)
       Write results (W)
Pipeline Instruction Execution

F    I1   I2   I3

D         I1   I2   I3

O              I1   I2   I3

E                   I1   I2   I3

W                        I1   I2   I3
Dependencies
   Data Dependency
    (Operand is not ready yet)

   Instruction Dependency
    (Branching)

    Will that Cause a Problem?
Data Dependency
I1 -- Add R1, R2, R3
I2 -- Sub R4, R1, R5
       1    2    3    4    5    6

F      I1   I2
D           I1   I2
O                I1   I2
E                     I1   I2
W                          I1   I2
Solutions
   STALL
   Forwarding
   Write and Read in one cycle
   ….
Instruction Dependency
I1 – Branch o
I2 –
       1    2    3    4    5    6

F      I1   I2
D           I1   I2
O                I1   I2
E                     I1   I2
W                          I1   I2
Solutions
   STALL
   Predict Branch taken
   Predict Branch not taken
   ….
Floating Point Multiplication
   Inputs: (Mantissa1, Exponent1), (Mantissa2,
    Exponent2)
   Add the two exponents → Exponent-out
   Multiply the two mantissas
   Normalize the mantissa and adjust the exponent
   Round the product mantissa to a single-length
    mantissa; the exponent may need adjusting again
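
The four steps above can be sketched as follows. This is an illustration only: it assumes positive operands with mantissas normalized to the range [1, 2), ignores sign, bias, and special values, and uses Python floats to hold the mantissa. The function name and the `mantissa_bits` parameter are hypothetical.

```python
def fp_multiply(a, b, mantissa_bits=23):
    """Multiply two (mantissa, exponent) pairs, each meaning mantissa * 2**exponent."""
    m1, e1 = a
    m2, e2 = b

    # Step 1: add the two exponents.
    exponent = e1 + e2

    # Step 2: multiply the mantissas; the product lies in [1, 4).
    mantissa = m1 * m2

    # Step 3: normalize back into [1, 2) and adjust the exponent.
    if mantissa >= 2.0:
        mantissa /= 2.0
        exponent += 1

    # Step 4: round to a single-length mantissa.
    scale = 1 << mantissa_bits
    mantissa = round(mantissa * scale) / scale

    # Rounding can carry the mantissa up to 2.0, so renormalize if needed.
    if mantissa >= 2.0:
        mantissa /= 2.0
        exponent += 1

    return mantissa, exponent


print(fp_multiply((1.5, 2), (1.5, 3)))   # 6 * 12
```

The second normalization after rounding is the reason a pipeline for this operation needs a renormalize step after the round stage.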
Linear Pipeline for floating-point multiplication

   Add Exponents → Multiply Mantissa → Normalize → Round

   Add Exponents → Partial Products → Accumulator → Normalize → Round → Renormalize
Linear Pipeline for floating-point Addition

   Subtract Exponents → Partial Shift → Add Mantissa → Find Leading 1 → Partial Shift → Round → Renormalize
Combined Adder and Multiplier

   A   Exponents Subtract / Add
   B   Partial Products
   C   Add Mantissa
   D   Renormalize
   E   Round
   F   Partial Shift
   G   Find Leading 1
   H   Partial Shift
Reservation Table for Multiply
    1   2   3   4   5   6   7

A   X
B       X   X
C           X   X
D                   X       X
E                       X
F

G

H
Reservation Table for Addition
    1   2   3   4   5   6   7   8   9
A   Y
B
C               Y
D                                   Y
E                               Y
F       Y   Y
G                   Y
H                       Y   Y
Nonlinear Pipeline Design
   Latency
      The number of clock cycles between two
      initiations of a pipeline
   Collision
      Resource Conflict
   Forbidden Latencies
      Latencies that cause collisions
Nonlinear Pipeline Design
(cont.)
   Latency Sequence
      A sequence of permissible latencies between
      successive task initiations
   Latency Cycle
      A latency sequence that repeats the same
      subsequence indefinitely
   Collision vector
    C = (Cm, Cm-1, …, C2, C1), m <= n-1
    n = number of columns in the reservation table
    m = maximum forbidden latency
    Ci = 1 if latency i causes a collision, 0 otherwise
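
The definitions above translate into a small computation: two marks in the same row of a reservation table that are p columns apart make latency p forbidden. The sketch below assumes a reservation table is given as a dictionary from stage name to the 1-based cycles in which that stage is used; this representation and the function names are illustrative choices.

```python
def forbidden_latencies(table):
    """Sorted forbidden latencies for a single-function reservation table."""
    forbidden = set()
    for cycles in table.values():
        for t1 in cycles:
            for t2 in cycles:
                if t2 > t1:
                    # Two uses of one stage, t2 - t1 cycles apart.
                    forbidden.add(t2 - t1)
    return sorted(forbidden)


def collision_vector(table):
    """Collision vector as a bit string (Cm ... C2 C1), m = largest forbidden latency."""
    forbidden = forbidden_latencies(table)
    m = max(forbidden) if forbidden else 0
    return "".join("1" if i in forbidden else "0" for i in range(m, 0, -1))


# Reservation table for Multiply on the combined adder/multiplier.
multiply = {"A": [1], "B": [2, 3], "C": [3, 4], "D": [5, 7], "E": [6]}
print(forbidden_latencies(multiply), collision_vector(multiply))

# Reservation table for function X of the 3-stage, 2-function pipeline.
x_table = {"S1": [1, 6, 8], "S2": [2, 4], "S3": [3, 5, 7]}
print(forbidden_latencies(x_table), collision_vector(x_table))
```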
Mul – Mul Collision (launch
after 1 cycle)
    1   2   3   4   5   6   7

A   X   Z
B       X   XZ  Z
C           X   XZ  Z
D                   X   Z   X
E                       X   Z
F

G

H
Mul – Mul Collision (launch after
2 cycles)
    1   2   3   4   5   6   7

A   X       Z
B       X   X   Z   Z
C           X   X   Z   Z
D                   X       XZ
E                       X
F

G

H
Mul – Mul Collision (launch
after 3 cycles)
    1   2   3   4   5   6   7

A   X           Z
B       X   X       Z   Z
C           X   X       Z   Z
D                   X       X
E                       X
F

G

H
Collision Vector for Multiply
after Multiply
Forbidden Latencies: 1, 2

Collision vector
0 0 0 0 1 1 → 1 1

Maximum forbidden latency = 2 → m = 2
Example
      X             Y



 S1       S2   S3
Reservation Tables for X & Y
      1   2   3   4   5   6   7   8
S1    X                   X       X
S2        X       X
S3            X       X       X

      1   2   3   4   5   6
S1    Y               Y
S2            Y
S3        Y       Y       Y
Forbidden Latencies
   X after X
   X after Y
   Y after X
   Y after Y
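
All four cases can be handled by one routine that compares the reservation table of the task initiated first with that of the task initiated second. The sketch below reads "X after Y" as "Y is initiated first, X follows"; that reading, the dictionary representation, and the function name are assumptions made for illustration.

```python
def cross_forbidden(first, second):
    """Latencies at which `second`, started after `first`, collides with it.

    Each table maps a stage name to the 1-based cycles in which the
    function uses that stage.
    """
    forbidden = set()
    for stage, first_cycles in first.items():
        for t_first in first_cycles:
            for t_second in second.get(stage, []):
                # The later task reaches the stage at t_second + latency,
                # so a collision needs latency = t_first - t_second > 0.
                if t_first > t_second:
                    forbidden.add(t_first - t_second)
    return sorted(forbidden)


x_table = {"S1": [1, 6, 8], "S2": [2, 4], "S3": [3, 5, 7]}
y_table = {"S1": [1, 5], "S2": [3], "S3": [2, 4, 6]}

print("X after X:", cross_forbidden(x_table, x_table))
print("X after Y:", cross_forbidden(y_table, x_table))
print("Y after X:", cross_forbidden(x_table, y_table))
print("Y after Y:", cross_forbidden(y_table, y_table))
```

When both arguments are the same table, the routine reduces to the ordinary single-function forbidden-latency computation.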
X after X

Latency 2
    1     2     3     4     5     6     7     8
S1  X1          X2                X1          X1X2
S2        X1          X1X2        X2
S3              X1          X1X2        X1X2

Latency 5
    1     2     3     4     5     6     7     8
S1  X1                            X1X2        X1
S2        X1          X1                X2
S3              X1          X1          X1    X2

X after X

Latency 4
    1     2     3     4     5     6     7     8
S1  X1                      X2    X1          X1
S2        X1          X1          X2          X2
S3              X1          X1          X1X2

Latency 7
    1     2     3     4     5     6     7     8
S1  X1                            X1          X1X2
S2        X1          X1
S3              X1          X1          X1
Collision Vector
 Forbidden Latencies: 2, 4, 5, 7
 Collision Vector =

 1011010
Y after Y

Latency 2
    1     2     3     4     5     6
S1  Y1          Y2          Y1
S2              Y1          Y2
S3        Y1          Y1Y2        Y1Y2

Latency 4
    1     2     3     4     5     6
S1  Y1                      Y1Y2
S2              Y1
S3        Y1          Y1          Y1Y2
Collision Vector
   Forbidden Latencies: 2, 4
   Collision Vector =
    1010
Exercise – Find the collision
vector

    1   2   3   4   5   6   7

A   X       X   X

B       X               X

C                   X       X

D               X
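State Diagram for X

   State 1011010 (initial):
      latency 1*      → 1111111
      latency 3 or 6  → 1011011
      latency 8+      → 1011010
   State 1011011:
      latency 3* or 6 → 1011011
      latency 8+      → 1011010
   State 1111111:
      latency 8+      → 1011010

   * marks the minimum-latency edge leaving a state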
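
The state diagram can be derived mechanically from the initial collision vector: for each permitted latency p, shift the current state right by p bits and OR the result with the initial vector. A minimal sketch follows; the function name and the choice to record every latency above m (the "8+" edges) under the single key m + 1 are illustrative assumptions.

```python
def state_diagram(initial):
    """Map each reachable state to {latency: next_state}.

    `initial` is the collision vector as a bit string (Cm ... C1).
    """
    m = len(initial)
    init = int(initial, 2)
    transitions = {}
    pending = [init]
    while pending:
        state = pending.pop()
        if state in transitions:
            continue
        edges = {}
        for latency in range(1, m + 1):
            # Bit C_latency is 0, so this latency causes no collision.
            if not (state >> (latency - 1)) & 1:
                nxt = (state >> latency) | init
                edges[latency] = nxt
                pending.append(nxt)
        # Any latency greater than m always leads back to the initial state.
        edges[m + 1] = init
        transitions[state] = edges

    def fmt(value):
        return format(value, f"0{m}b")

    return {fmt(s): {lat: fmt(n) for lat, n in e.items()}
            for s, e in transitions.items()}


for state, edges in sorted(state_diagram("1011010").items()):
    print(state, edges)
```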
Cycles
 Simple cycles → each state appears
  only once
(3), (6), (8), (1, 8), (3, 8), and (6, 8)
 Greedy cycles → simple cycles whose
  edges are all made with minimum
  latencies from their respective starting
  states
 (1, 8), (3) → one of them gives the minimum
  average latency (MAL): (1, 8) averages 4.5
  and (3) averages 3, so MAL = 3
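
Greedy cycles can be found by starting at each reachable state and always taking the smallest permitted latency until a state repeats. The sketch below does that and then takes the smallest average latency among the greedy cycles as a candidate MAL. The function names are hypothetical, and latencies above m are represented by m + 1.

```python
def greedy_cycles(initial):
    """Greedy cycles of the state diagram built from collision vector `initial`."""
    m = len(initial)
    init = int(initial, 2)

    def permitted(state):
        return [p for p in range(1, m + 1) if not (state >> (p - 1)) & 1]

    def greedy_step(state):
        # Smallest permitted latency, or m + 1 if none is permitted.
        options = permitted(state)
        if options:
            return options[0], (state >> options[0]) | init
        return m + 1, init

    # Collect every state reachable from the initial one.
    reachable, pending = set(), [init]
    while pending:
        state = pending.pop()
        if state not in reachable:
            reachable.add(state)
            pending.extend((state >> p) | init for p in permitted(state))

    cycles = set()
    for start in reachable:
        seen, path, state = {}, [], start
        while state not in seen:
            seen[state] = len(path)
            latency, state = greedy_step(state)
            path.append(latency)
        cycle = tuple(path[seen[state]:])
        # Store one canonical rotation so (8, 1) and (1, 8) count once.
        cycles.add(min(cycle[i:] + cycle[:i] for i in range(len(cycle))))
    return sorted(cycles)


cycles = greedy_cycles("1011010")
averages = {c: sum(c) / len(c) for c in cycles}
print(averages)
print("smallest greedy average:", min(averages.values()))
```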
