Pipelining
CONTENTS
• What is Pipelining
• How Pipelines Work
• Advantages/Disadvantages
• Characterizing Pipelines
• Pipeline classification
What is Pipelining
• A technique used in advanced microprocessors where the microprocessor begins executing a second instruction before the first has been completed.
• A pipeline is a series of stages, where some work is done at each stage. The work is not finished until it has passed through all stages.
• With pipelining, the computer architecture allows the next instructions to be fetched while the processor is performing arithmetic operations, holding them in a buffer close to the processor until each instruction's operation can be performed.
How Pipelines Work
• The pipeline is divided into segments, and each segment can execute its operation concurrently with the other segments. Once a segment completes an operation, it passes the result to the next segment in the pipeline and fetches the next operation from the preceding segment.
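The hand-off between segments can be sketched as an occupancy table. This is a minimal illustration, assuming every segment takes exactly one cycle and nothing stalls; the function name `occupancy` and the 0-based numbering of operations and segments are hypothetical choices for this example.

```python
def occupancy(n_operations, n_segments):
    """Return table[cycle][segment] = index of the operation held there, or None."""
    total_cycles = n_segments + n_operations - 1
    table = []
    for cycle in range(total_cycles):
        row = []
        for segment in range(n_segments):
            op = cycle - segment  # operation i reaches segment s in cycle i + s
            row.append(op if 0 <= op < n_operations else None)
        table.append(row)
    return table


if __name__ == "__main__":
    # Print one row per cycle; "--" marks an idle segment.
    for cycle, row in enumerate(occupancy(3, 4)):
        cells = ["--" if op is None else f"op{op}" for op in row]
        print(f"cycle {cycle}: " + " ".join(cells))
```

Under these assumptions, three operations on four segments finish in 4 + 3 - 1 = 6 cycles, and in the middle cycles several segments are busy at once.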
Example
Instruction Fetch
• The Instruction Fetch (IF) stage is responsible for obtaining the requested instruction from memory. The instruction and the program counter (which is incremented to the next instruction) are stored in the IF/ID pipeline register as temporary storage so that they may be used in the next stage at the start of the next clock cycle.
Instruction Decode
• The Instruction Decode (ID) stage is responsible for decoding the instruction and sending out the various control lines to the other parts of the processor. The instruction is sent to the control unit, where it is decoded, and the source registers are read from the register file.
Execution
• The Execution (EX) stage is where any calculations are performed. The main component in this stage is the ALU, which provides arithmetic and logic capabilities.
Memory and IO
• The Memory and IO (MEM) stage is responsible for storing and loading values to and from memory. It is also responsible for input to or output from the processor. If the current instruction is not of Memory or IO type, then the result from the ALU is passed through to the write back stage.
Write Back
• The Write Back (WB) stage is responsible for writing the result of a calculation, memory access or input into the register file.
Operation Timings
• Estimated timings for each of the stages:
Instruction Fetch     2 ns
Instruction Decode    1 ns
Execution             2 ns
Memory and IO         2 ns
Write Back            1 ns
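As a rough illustration of what these timings imply, the sketch below compares total run time with and without pipelining. It assumes (this is not stated on the slide) that the pipelined clock period is set by the slowest stage and that no stalls occur; the function names are hypothetical.

```python
# Stage latencies in nanoseconds, taken from the timing table above.
STAGE_NS = {"IF": 2, "ID": 1, "EX": 2, "MEM": 2, "WB": 1}


def unpipelined_time_ns(n_instructions):
    # Each instruction runs through all stages before the next one starts.
    return n_instructions * sum(STAGE_NS.values())


def pipelined_time_ns(n_instructions):
    # Assumption: one common clock, as long as the slowest stage, no stalls.
    # The first instruction needs one cycle per stage; every later
    # instruction completes one cycle after its predecessor.
    cycle = max(STAGE_NS.values())
    cycles = len(STAGE_NS) + (n_instructions - 1)
    return cycles * cycle


if __name__ == "__main__":
    for n in (1, 10, 1000):
        print(n, unpipelined_time_ns(n), pipelined_time_ns(n))
```

With these numbers a single instruction is actually slower when pipelined (10 ns against 8 ns), because every stage is stretched to the 2 ns clock. The benefit appears only for long instruction sequences, where the time per instruction approaches one clock period.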
Advantages/Disadvantages
Advantages:
• More efficient use of the processor
• Quicker execution of a large number of instructions
Disadvantages:
• Pipelining involves adding hardware to the chip
• Inability to continuously run the pipeline at full speed because of pipeline hazards, which disrupt the smooth execution of the pipeline.
Characterize Pipelines
1) Hardware or software implementation – pipelining can be implemented in either software or hardware.
2) Large or small scale – Stations in a pipeline can range from simplistic to powerful, and a pipeline can range in length from short to long.
3) Synchronous or asynchronous flow – A synchronous pipeline operates like an assembly line: at a given time, each station is processing some amount of information. An asynchronous pipeline allows a station to forward information at any time.
Characterize Pipelines
4) Buffered or unbuffered flow – One stage of a pipeline sends data directly to another one, or a buffer is placed between each pair of stages.
5) Finite chunks or continuous bit streams – The digital information that passes through a pipeline can consist of a sequence of small data items or an arbitrarily long bit stream.
6) Automatic data feed or manual data feed – Some implementations of pipelines use a separate mechanism to move information, and other
Linear pipelines
• A linear pipeline processor is a series of processing stages and memory access.
• In pipelining, we divide a task into a set of subtasks.
Linear pipelines
• The precedence relation of a set of subtasks {T1, T2, ..., Tk} for a given task T implies that some subtask Tj cannot start until some earlier subtask Ti finishes.
• The interdependencies of all subtasks form the precedence graph.
Linear Pipeline processor
• A linear pipeline processor is a cascade of processing stages which are linearly connected.
• It performs a fixed function over a stream of data flowing from one end to the other.
• External inputs are fed into the pipeline at the first stage, and the final result emerges at the last stage of the pipeline.
Non-linear or dynamic pipelines
• A non-linear pipeline (also called a dynamic pipeline) can be configured to perform various functions at different times. In a dynamic pipeline, there are also feed-forward and feedback connections. A non-linear pipeline also allows very long instruction words.
Non-linear pipelines
• Traditional linear pipelines are static pipelines because they are used to perform fixed functions.
• A non-linear pipeline allows feed-forward and feedback connections.
Instruction pipeline design
• This pipeline reads consecutive instructions from memory while previous instructions are being executed in the other segments.
How instructions execute
• Instruction execution consists of a sequence of operations (phases). Each phase requires one or more clock cycles to execute. These include:
  - Instruction fetch
  - Decode
  - Operand fetch
  - Execute
  - Result storage
Basic terms used in instruction pipeline
• Instruction pipeline cycle: the clock period of the pipeline
• Instruction issue latency: the time (in cycles) required between the issuing of two adjacent instructions
• Instruction issue rate: the number of instructions issued per cycle
• Simple operation latency: the latency, in cycles, of simple operations such as add, load, store, branch and move; complex operations have longer latencies
Mechanism of instruction pipeline
• For smooth flow and working of the instruction pipeline, the following mechanisms are used:
  - Prefetch buffer
  - Sequential buffer
  - Target buffer
  - Loop buffer
Mechanism of instruction pipeline
• Internal data forwarding:
  - It improves the throughput of the pipeline processor.
  - Its core idea is to replace unnecessary memory accesses with register-to-register transfers in a sequence of load-arithmetic-store operations.
Mechanism of instruction pipeline
• Internal data forwarding can be further divided into three types:
  - Store-load forwarding: a store followed by a load of the same word can be replaced by two parallel operations, the store and a register transfer.
  - Load-load forwarding: two loads of the same word can be replaced by one load and one register transfer.
  - Store-store forwarding: two memory updates of the same word can be combined into one, because the second store overwrites the first.
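The three forwarding cases can be sketched as a rewrite of two adjacent operations. This is only an illustration: the tuple encoding and the function name `forward_pair` are hypothetical, and it assumes the two operations are adjacent and that nothing else touches the memory word in between.

```python
def forward_pair(first, second):
    """Rewrite two adjacent memory operations that touch the same word.

    Operations are tuples (hypothetical encoding for this sketch):
      ("store", addr, reg)   memory[addr] <- reg
      ("load",  addr, reg)   reg <- memory[addr]
      ("move",  src,  dst)   dst <- src   (register transfer)
    Returns the replacement sequence as a list.
    """
    kind1, addr1, reg1 = first
    kind2, addr2, reg2 = second
    if "move" in (kind1, kind2) or addr1 != addr2:
        return [first, second]               # nothing to forward
    if kind1 == "store" and kind2 == "load":
        # Store-load: keep the store, feed the load from the stored register.
        return [first, ("move", reg1, reg2)]
    if kind1 == "load" and kind2 == "load":
        # Load-load: one memory read, then a register transfer.
        return [first, ("move", reg1, reg2)]
    if kind1 == "store" and kind2 == "store":
        # Store-store: the second store overwrites the first.
        return [second]
    return [first, second]                   # load then store: unchanged
```

For example, `forward_pair(("store", "M", "R1"), ("load", "M", "R2"))` keeps the store and replaces the load with a transfer from R1 to R2.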
Difficulties with instruction pipeline
• Resource conflicts: caused when two segments access memory at the same time.
• Data dependencies: arise when an instruction depends on the result of a previous instruction which is not yet available.
• Branch difficulties: arise from branch and other instructions that change the sequence of instructions.
Branch difficulties
• The main difficulties arise with conditional branch instructions.
• Until the instruction is actually executed, it is impossible to determine whether the branch will be taken or not.
• Prefetch target and loop buffers are used to handle branch difficulties.
Branch difficulties
• Branch prediction uses some additional logic to guess the outcome of a conditional branch instruction before it is executed.
• A branch can be predicted in two ways:
  - Statically
  - Dynamically
Branch difficulties
• A static branch strategy is usually wired into the processor.
• A dynamic branch strategy uses recent branch history to predict whether or not the branch will be taken the next time it occurs.
• For dynamic prediction, additional hardware called a branch target buffer is used. Delayed branch is another technique for handling branches.
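One common way to use recent branch history is a 2-bit saturating counter per branch. The slides do not name a specific scheme, so the sketch below is an assumed example of dynamic prediction, not the method described in the deck; the class name, the starting counter value and the helper `accuracy` are all hypothetical.

```python
class TwoBitPredictor:
    """2-bit saturating counter per branch address (an assumed scheme)."""

    def __init__(self):
        self.counters = {}  # branch address -> counter value in 0..3

    def predict(self, pc):
        # Values 2 and 3 mean "predict taken"; unseen branches start at 1.
        return self.counters.get(pc, 1) >= 2

    def update(self, pc, taken):
        # Move the counter toward the actual outcome, saturating at 0 and 3.
        c = self.counters.get(pc, 1)
        self.counters[pc] = min(c + 1, 3) if taken else max(c - 1, 0)


def accuracy(predictor, pc, outcomes):
    """Fraction of outcomes predicted correctly, updating after each branch."""
    correct = 0
    for taken in outcomes:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(outcomes)
```

For a loop branch that is taken nine times and then falls through, this predictor misses only the first and the last iteration.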
Branch difficulties
• In delayed branch, the compiler detects the branch instructions and rearranges the machine language code sequence by inserting useful instructions that keep the pipeline operating without interruption.
Arithmetic Pipeline Design
• This technique is used to speed up numerical arithmetic computations.
• Arithmetic operations are performed with finite precision due to the use of fixed-size memory words or registers.
Arithmetic Pipeline Design
• Depending on the function to be implemented, different pipeline stages in an arithmetic unit require different hardware logic.
• All arithmetic operations can be implemented with basic add and shift operations.
Arithmetic Pipeline Design
• For high-speed addition we require a carry-propagation adder (CPA) and a carry-save adder (CSA).
Shortcut Method of finding Latency & Collision Vector
• Forbidden latency set, F = {5} ∪ {2} ∪ {2} = {2, 5}
State Diagram
• The initial collision vector (ICV) is a binary vector formed from F such that
C = (C_n ... C_2 C_1)
where C_i = 1 if i ∈ F and C_i = 0 otherwise
• Thus in our example
F = {2, 5}
C = (1 0 0 1 0)
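The construction of C from F can be checked with a few lines of code. This sketch assumes n is the largest forbidden latency, and it uses the usual state-diagram rule (shift the current state right by the chosen latency, then OR with the initial collision vector), which the slide does not spell out; the function names are hypothetical.

```python
def initial_collision_vector(forbidden):
    """Return bits [C_n, ..., C_2, C_1] with C_i = 1 iff latency i is forbidden."""
    n = max(forbidden)  # assumption: n is the largest forbidden latency
    return [1 if i in forbidden else 0 for i in range(n, 0, -1)]


def permissible_latencies(cv):
    """Latencies i (1..n) whose bit C_i is 0, so a new task may start."""
    n = len(cv)
    return [i for i in range(1, n + 1) if cv[n - i] == 0]


def next_state(state, latency, icv):
    """Shift the state right by `latency` bits, then OR with the ICV."""
    shifted = ([0] * latency + state)[:len(state)]
    return [a | b for a, b in zip(shifted, icv)]


if __name__ == "__main__":
    icv = initial_collision_vector({2, 5})
    print(icv, permissible_latencies(icv))
    for p in permissible_latencies(icv):
        print(p, next_state(icv, p, icv))
```

For F = {2, 5} this reproduces C = (1 0 0 1 0) and gives the permissible latencies 1, 3 and 4 from the initial state.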
Multifunctional pipeline
• A pipeline processor which can perform p distinct functions can be described by p reservation tables overlaid together.
• Each task to be initiated can be associated with a function tag identifying the reservation table to be used.
• Collisions may occur between two or more tasks with the same function tag or with distinct function tags.
Multifunctional pipeline
• The stage usage for each function can be displayed with a different tag in the overlaid reservation table.
Pipeline Hazards
• Data hazards – an instruction uses the result of a previous instruction. A hazard occurs exactly when an instruction tries to read a register in its ID stage that an earlier instruction intends to write in its WB stage.
  - When an instruction depends on the results of a previous instruction
• Control hazards – the location of an instruction depends on a previous instruction, due to branches and other instructions that affect the PC
Pipeline Hazards
• Structural hazards – two instructions need to access the same resource.
  - Resource conflict.
  - Hardware cannot support all possible combinations of instructions in simultaneous overlapped execution
Stalling
• Stalling involves halting the flow of instructions until the required result is ready to be used. However, stalling wastes processor time by doing nothing while waiting for the result.
• A stall is the delay in cycles caused by any of the hazards mentioned above.
• How to calculate speedup:
Speedup = Number of stages / (1 + pipeline stall cycles per instruction)
Stalling
• So what is the speedup for an ideal pipeline with no stalls?
• The number of cycles needed to initially fill up the pipeline could be included in the computation of the average stall per instruction.
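A small calculation makes the speedup formula from the Stalling slides concrete. This sketch reads the formula as the number of stages divided by (1 + stall cycles per instruction), and it assumes every stage takes one cycle. Treating the fill cycles as (stages - 1) stall cycles spread over n instructions is one possible interpretation of the last bullet. Both function names are hypothetical.

```python
def pipeline_speedup(n_stages, stall_cycles_per_instruction=0.0):
    """Speedup over an unpipelined machine, per the formula on the slide."""
    return n_stages / (1.0 + stall_cycles_per_instruction)


def speedup_including_fill(n_stages, n_instructions):
    # Assumption: the (n_stages - 1) cycles needed to fill the pipeline are
    # counted as stall cycles averaged over all instructions.
    fill_stall = (n_stages - 1) / n_instructions
    return pipeline_speedup(n_stages, fill_stall)


if __name__ == "__main__":
    print(pipeline_speedup(5))            # ideal pipeline, no stalls
    print(pipeline_speedup(5, 0.25))      # a quarter-cycle stall on average
    print(speedup_including_fill(5, 4))   # short program, fill time matters
```

With no stalls the speedup equals the number of stages, which answers the question on the slide. Any stall cycles, including fill time on short programs, reduce it.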
Structural hazards
• When more than one instruction in the pipeline needs to access a resource at the same time, the data path is said to have a structural hazard.
• Examples of resources: register file, memory, ALU.
• Solution: stall the pipeline for one clock cycle when the conflict is detected. This results in a pipeline bubble.
Data hazard - solution
• Usually solved by data or register forwarding (bypassing or short-circuiting)
• How is it done?
• The data read from the register file is not really used; the forwarded result is used instead
Data hazard classification
• RAW - Read After Write. Most common; solved by data forwarding.
• WAW - Write After Write: inst i (load) comes before inst j (add), and both write to the same register.
• WAR - Write After Read: inst j tries to write a destination before it is read by inst i, so i incorrectly gets the new value.
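The three classes can be told apart by comparing the registers two instructions read and write. The sketch below is illustrative only: the `(dest, sources)` encoding and the function name are hypothetical, and it reports potential hazards (dependences). Whether they cause a stall depends on the pipeline's timing.

```python
def classify_hazards(first, second):
    """Classify dependences between instruction i (first) and j (second).

    Each instruction is (dest_register, [source_registers]); dest may be
    None for instructions that write no register. i precedes j in program order.
    """
    dest_i, srcs_i = first
    dest_j, srcs_j = second
    hazards = set()
    if dest_i is not None and dest_i in srcs_j:
        hazards.add("RAW")   # j reads what i writes
    if dest_i is not None and dest_i == dest_j:
        hazards.add("WAW")   # both write the same register
    if dest_j is not None and dest_j in srcs_i:
        hazards.add("WAR")   # j writes what i still has to read
    return hazards
```

For example, an instruction writing F0 followed by one reading F0 gives `{"RAW"}`, while an instruction reading F2 followed by one writing F2 gives `{"WAR"}`.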
Dynamic Instruction Scheduling
• With dynamic scheduling, the hardware tries to rearrange the instructions at run time to reduce pipeline stalls.
• Simpler compiler
• Handles dependencies not known at compile time
• Allows code compiled for a different machine to run efficiently.
Out-Of-Order Execution
• With out-of-order execution, the SUBD is allowed to execute before the add.
• This can lead to out-of-order completion, which can cause WAW and WAR hazards.
Scoreboarding
• The scoreboard implements a centralized control scheme that:
  - Detects all resource and data hazards
  - Allows instructions to execute out of order when there are no resource hazards or data dependencies
• First implemented in 1964 by the CDC 6600, which had 18 separate functional units:
  - 4 FP units (2 multiply, 1 add, 1 divide)
  - 7 memory units (5 loads, 2 stores)
  - 7 integer units (add, shift, logical, compare, etc.)
Scoreboard Implications
• Our dynamic pipeline (much simpler):
  - 2 FP multiply (10 EX cycles)
  - 1 FP add (2 EX cycles)
  - 1 FP divide (40 EX cycles)
  - 1 integer unit (1 EX cycle)
Scoreboard Implications
• Out-of-order completion can lead to WAR and WAW hazards.
• Solution for WAW:
  - Detect the WAW hazard before reading operands
  - Stall the write until the other instruction completes
Scoreboard Implications
• Solution for WAR:
  - Detect WAR hazards before writing back to the register file and stall the write back
• This scoreboard does not take advantage of forwarding (i.e. bypasses), since it waits until both results are written back to the register file
• Scoreboard replaces DR, EX, WB with 4 stages
Stages of Scoreboard Control
• Decode+Issue (Issue)
• Read operands (Read)
• Execution (EX)
• Write result (WB)
Parts of the Scoreboard
• Instruction status: which of the 4 steps the instruction is in: Issue, Read, EX, or WB
• Functional unit status: indicates the state of the functional unit (FU). There are 9 fields for each functional unit.
• Register result status: indicates which functional unit will write each register, if one exists. Blank when no pending instructions will write that register
Parts of the Scoreboard
• Busy: indicates whether the unit is busy or not
• Op: operation to perform in the unit (e.g., + or –)
• Fi: destination register
• Fj, Fk: source-register numbers
• Qj, Qk: functional units producing source registers Fj, Fk
• Rj, Rk: flags indicating when Fj, Fk are ready
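The nine functional-unit fields and the register result status map naturally onto a small record and a dictionary. The sketch below shows only the issue step, under the usual scoreboard rule that an instruction issues when its unit is free and no pending instruction writes its destination. The class and function names, and the unit names used as dictionary values, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FunctionalUnitStatus:
    busy: bool = False          # Busy
    op: Optional[str] = None    # Op
    fi: Optional[str] = None    # Fi: destination register
    fj: Optional[str] = None    # Fj, Fk: source registers
    fk: Optional[str] = None
    qj: Optional[str] = None    # Qj, Qk: units producing Fj, Fk
    qk: Optional[str] = None
    rj: bool = False            # Rj, Rk: source operands ready
    rk: bool = False


def can_issue(unit, dest, register_result):
    """No structural hazard (unit free) and no WAW hazard on dest."""
    return not unit.busy and register_result.get(dest) is None


def issue(name, unit, op, dest, src1, src2, register_result):
    """Record the instruction in the unit named `name` and claim dest."""
    unit.busy, unit.op = True, op
    unit.fi, unit.fj, unit.fk = dest, src1, src2
    unit.qj = register_result.get(src1)   # None means no pending writer
    unit.qk = register_result.get(src2)
    unit.rj = unit.qj is None
    unit.rk = unit.qk is None
    register_result[dest] = name
```

Here `register_result` plays the role of the register result status table: a missing or `None` entry corresponds to a blank field.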
Tomasulo Algorithm for Dynamic Scheduling
• For IBM 360/91 in 1967, about 3 years after CDC 6600
• Goal: high performance without special compilers
• Differences between IBM 360 & CDC 6600:
  - IBM has only 2 register specifiers/instr vs. 3 in CDC 6600
  - IBM has register-memory instructions
  - IBM has 4 FP registers vs. 8 in CDC 6600
  - IBM has pipelined functional units (3 adds, 2 multiplies)
Tomasulo Algorithm
• The Tomasulo algorithm is designed to handle name dependencies (WAW and WAR hazards) efficiently
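Name dependencies disappear once every result gets a fresh name. The sketch below illustrates that renaming idea in software only. In the actual algorithm the fresh names are reservation-station tags assigned by hardware, so the `T1`, `T2`, ... names, the `(dest, sources)` encoding and the function name here are hypothetical.

```python
from itertools import count


def rename_registers(instructions):
    """Give every destination a fresh name; sources follow the latest mapping.

    instructions: list of (dest, [sources]) in program order.
    """
    fresh = count(1)
    current = {}     # architectural register -> most recent fresh name
    renamed = []
    for dest, sources in instructions:
        # Read sources first, so an instruction never sees its own new name.
        new_sources = [current.get(s, s) for s in sources]
        new_dest = f"T{next(fresh)}"
        current[dest] = new_dest
        renamed.append((new_dest, new_sources))
    return renamed


if __name__ == "__main__":
    program = [
        ("F0", ["F2", "F4"]),
        ("F6", ["F0", "F8"]),    # reads F8
        ("F8", ["F10", "F14"]),  # WAR on F8 with the previous instruction
        ("F6", ["F10", "F8"]),   # WAW on F6 with the second instruction
    ]
    for line in rename_registers(program):
        print(line)
```

After renaming, no two instructions write the same name and no later write can clobber a register an earlier instruction still has to read, so only the true (RAW) dependences remain.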
Tomasulo Algorithm Advantage
• Prevents registers from being the bottleneck.
• Eliminates WAR and WAW hazards
• Allows loop unrolling in HW
• Common Data Bus: broadcasts results to multiple instructions, but is a central bottleneck
• It provides dynamic scheduling
• It provides register renaming
• Load/store disambiguation
Pipelining 16 computers Artitacher pdfMadhuGupta99385
 
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdfDEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdfAkritiPradhan2
 
Pipelining in Computer System Achitecture
Pipelining in Computer System AchitecturePipelining in Computer System Achitecture
Pipelining in Computer System AchitectureYashiUpadhyay3
 
pipelining-190913185902.pptx
pipelining-190913185902.pptxpipelining-190913185902.pptx
pipelining-190913185902.pptxAshokRachapalli1
 
Topic2a ss pipelines
Topic2a ss pipelinesTopic2a ss pipelines
Topic2a ss pipelinesturki_09
 
AFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORS
AFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORSAFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORS
AFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORScscpconf
 
Affect of parallel computing on multicore processors
Affect of parallel computing on multicore processorsAffect of parallel computing on multicore processors
Affect of parallel computing on multicore processorscsandit
 
10 implementing subprograms
10 implementing subprograms10 implementing subprograms
10 implementing subprogramsMunawar Ahmed
 
pipeline in computer architecture design
pipeline in computer architecture  designpipeline in computer architecture  design
pipeline in computer architecture designssuser87fa0c1
 

Semelhante a Unit 3 (20)

Pipeline & Nonpipeline Processor
Pipeline & Nonpipeline ProcessorPipeline & Nonpipeline Processor
Pipeline & Nonpipeline Processor
 
pipelining
pipeliningpipelining
pipelining
 
Pipelining of Processors
Pipelining of ProcessorsPipelining of Processors
Pipelining of Processors
 
Pipelining of Processors Computer Architecture
Pipelining of  Processors Computer ArchitecturePipelining of  Processors Computer Architecture
Pipelining of Processors Computer Architecture
 
Assembly p1
Assembly p1Assembly p1
Assembly p1
 
Pipelining
PipeliningPipelining
Pipelining
 
Active Network Node in Silicon-Based L3 Gigabit Routing Switch
Active Network Node in Silicon-Based L3 Gigabit Routing SwitchActive Network Node in Silicon-Based L3 Gigabit Routing Switch
Active Network Node in Silicon-Based L3 Gigabit Routing Switch
 
Pipeline Computing by S. M. Risalat Hasan Chowdhury
Pipeline Computing by S. M. Risalat Hasan ChowdhuryPipeline Computing by S. M. Risalat Hasan Chowdhury
Pipeline Computing by S. M. Risalat Hasan Chowdhury
 
Pipelining 16 computers Artitacher pdf
Pipelining   16 computers Artitacher  pdfPipelining   16 computers Artitacher  pdf
Pipelining 16 computers Artitacher pdf
 
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdfDEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
 
Pipelining in Computer System Achitecture
Pipelining in Computer System AchitecturePipelining in Computer System Achitecture
Pipelining in Computer System Achitecture
 
Pipeline
PipelinePipeline
Pipeline
 
pipelining-190913185902.pptx
pipelining-190913185902.pptxpipelining-190913185902.pptx
pipelining-190913185902.pptx
 
Topic2a ss pipelines
Topic2a ss pipelinesTopic2a ss pipelines
Topic2a ss pipelines
 
AFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORS
AFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORSAFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORS
AFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORS
 
Affect of parallel computing on multicore processors
Affect of parallel computing on multicore processorsAffect of parallel computing on multicore processors
Affect of parallel computing on multicore processors
 
ch2.pptx
ch2.pptxch2.pptx
ch2.pptx
 
10 implementing subprograms
10 implementing subprograms10 implementing subprograms
10 implementing subprograms
 
pipeline in computer architecture design
pipeline in computer architecture  designpipeline in computer architecture  design
pipeline in computer architecture design
 
Soc.pptx
Soc.pptxSoc.pptx
Soc.pptx
 

Último

College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performancesivaprakash250
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...ranjana rawat
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSRajkumarAkumalla
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 

Último (20)

College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 

Unit 3

  • 2. CONTENTS: What is Pipelining; How Pipelines Work; Advantages/Disadvantages; Characterizing Pipelines; Pipeline Classification
  • 3. What is Pipelining: A technique used in advanced microprocessors where the microprocessor begins executing a second instruction before the first has been completed. A pipeline is a series of stages, where some work is done at each stage; the work is not finished until it has passed through all stages. With pipelining, the computer architecture allows the next instructions to be fetched while the processor is performing arithmetic operations, holding them in a buffer close to the processor until each instruction operation can be performed.
  • 4. How Pipelines Work: The pipeline is divided into segments, and each segment can execute its operation concurrently with the other segments. Once a segment completes an operation, it passes the result to the next segment in the pipeline and fetches the next operation from the preceding segment.
  • 6.
  • 7. Instruction Fetch: The Instruction Fetch (IF) stage is responsible for obtaining the requested instruction from memory. The instruction and the program counter (which is incremented to the next instruction) are stored in the IF/ID pipeline register as temporary storage so that they may be used in the next stage at the start of the next clock cycle.
  • 8. Instruction Decode: The Instruction Decode (ID) stage is responsible for decoding the instruction and sending out the various control lines to the other parts of the processor. The instruction is sent to the control unit, where it is decoded, and the registers are fetched from the register file.
  • 9. Execution: The Execution (EX) stage is where any calculations are performed. The main component in this stage is the ALU, which provides arithmetic and logic capabilities.
  • 10. Memory and IO: The Memory and IO (MEM) stage is responsible for storing and loading values to and from memory. It is also responsible for input to or output from the processor. If the current instruction is not of Memory or IO type, then the result from the ALU is passed through to the write back stage.
  • 11. Write Back: The Write Back (WB) stage is responsible for writing the result of a calculation, memory access or input into the register file.
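The five stages above overlap in time: while one instruction is in EX, the next is in ID and the one after that is in IF. The following is a minimal sketch of that overlap, assuming an ideal pipeline with no stalls and one stage per clock cycle; the function names `space_time`, `total_cycles` and `render` are made up for illustration.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def space_time(num_instructions, stages=STAGES):
    """Map (instruction index, clock cycle) -> stage name, both 0-based.

    Assumes an ideal pipeline: instruction i enters IF in cycle i and
    advances one stage per cycle with no stalls.
    """
    schedule = {}
    for i in range(num_instructions):
        for s, name in enumerate(stages):
            schedule[(i, i + s)] = name
    return schedule


def total_cycles(num_instructions, num_stages=len(STAGES)):
    """Cycles needed to drain the pipeline: k + n - 1."""
    return num_stages + num_instructions - 1


def render(num_instructions):
    """Return a text space-time diagram, one row per instruction."""
    schedule = space_time(num_instructions)
    rows = []
    for i in range(num_instructions):
        cells = [schedule.get((i, c), "").ljust(4)
                 for c in range(total_cycles(num_instructions))]
        rows.append(f"I{i + 1}: " + "".join(cells).rstrip())
    return "\n".join(rows)


print(render(4))
```

Each printed row is shifted one cycle to the right of the row above it, which is the staircase shape usually drawn for an instruction pipeline.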
  • 12. Operation Timings: Estimated timings for each of the stages: Instruction Fetch 2 ns; Instruction Decode 1 ns; Execution 2 ns; Memory and IO 2 ns; Write Back 1 ns.
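The stage timings above can be turned into a rough speedup estimate. This sketch rests on assumptions the slides do not state: the pipelined clock period equals the slowest stage (2 ns), pipeline-register overhead is ignored, and no hazards occur. The function names are illustrative only.

```python
# Stage timings in nanoseconds, taken from the slide above.
STAGE_NS = {"IF": 2, "ID": 1, "EX": 2, "MEM": 2, "WB": 1}


def non_pipelined_ns(n):
    """Each instruction runs all stages back to back before the next starts."""
    return n * sum(STAGE_NS.values())


def pipelined_ns(n):
    """Assumed model: clock = slowest stage, (k + n - 1) cycles, no stalls."""
    clock = max(STAGE_NS.values())
    return (len(STAGE_NS) + n - 1) * clock


def speedup(n):
    return non_pipelined_ns(n) / pipelined_ns(n)


for n in (1, 10, 1000):
    print(n, non_pipelined_ns(n), pipelined_ns(n), round(speedup(n), 2))
```

Under these assumptions a single instruction is slower when pipelined (10 ns against 8 ns), and the speedup approaches 8 / 2 = 4 as the number of instructions grows, not 5, because the stages are unevenly balanced.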
  • 13. Advantages/Disadvantages: Advantages: more efficient use of the processor; quicker execution of a large number of instructions. Disadvantages: pipelining involves adding hardware to the chip; the pipeline cannot run continuously at full speed because of pipeline hazards, which disrupt its smooth execution.
  • 14. Characterizing Pipelines: 1) Hardware or software implementation: pipelining can be implemented in either software or hardware. 2) Large or small scale: stations in a pipeline can range from simplistic to powerful, and a pipeline can range in length from short to long. 3) Synchronous or asynchronous flow: a synchronous pipeline operates like an assembly line: at a given time, each station is processing some amount of information. An asynchronous pipeline allows a station to forward information at any time.
  • 15. Characterizing Pipelines: 4) Buffered or unbuffered flow: one stage of the pipeline sends data directly to another one, or a buffer is placed between each pair of stages. 5) Finite chunks or continuous bit streams: the digital information that passes through a pipeline can consist of a sequence of small data items or an arbitrarily long bit stream. 6) Automatic data feed or manual data feed: some implementations of pipelines use a separate mechanism to move information, and other
  • 16. Linear pipelines: A linear pipeline processor is a series of processing stages and memory access. In pipelining, we divide a task into a set of subtasks.
  • 17. Linear pipelines: The precedence relation of a set of subtasks {T1, T2, ..., Tk} for a given task T implies that a subtask Tj cannot start until some earlier subtask Ti finishes. The interdependencies of all subtasks form the precedence graph.
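A precedence graph can be written down as a mapping from each subtask to the subtasks that must finish first. The sketch below uses a made-up four-subtask graph and Python's standard `graphlib` module (Python 3.9 or later) to derive a valid execution order; `is_linear` is a hypothetical helper that checks whether the graph is a single chain, which is the case a linear pipeline handles.

```python
from graphlib import TopologicalSorter

# Hypothetical precedence graph: subtask -> set of subtasks that must
# finish before it can start. T1 -> T2 -> T3 -> T4 is a simple chain.
PRECEDENCE = {"T1": set(), "T2": {"T1"}, "T3": {"T2"}, "T4": {"T3"}}


def execution_order(precedence):
    """Return one order of subtasks that respects every precedence edge."""
    return list(TopologicalSorter(precedence).static_order())


def is_linear(precedence):
    """True if the graph is a single chain of subtasks."""
    execution_order(precedence)  # raises graphlib.CycleError on a cycle
    predecessors = [p for deps in precedence.values() for p in deps]
    starts = [t for t, deps in precedence.items() if not deps]
    return (len(starts) == 1
            and all(len(deps) <= 1 for deps in precedence.values())
            and len(predecessors) == len(set(predecessors)))


print(execution_order(PRECEDENCE))
print(is_linear(PRECEDENCE))
```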
  • 18. Linear pipeline processor: A linear pipeline processor is a cascade of processing stages which are linearly connected. It performs a fixed function over a stream of data flowing from one end to the other. External inputs are fed into the pipeline at the first stage, and the final result emerges at the last stage of the pipeline.
  • 19. Non-linear or dynamic pipelines: A non-linear pipeline (also called a dynamic pipeline) can be configured to perform various functions at different times. In a dynamic pipeline, there are also feed-forward or feedback connections. A non-linear pipeline also allows very long instruction words.
  • 20. Non-linear pipelines: Traditional linear pipelines are static pipelines, as they are used to perform a fixed function. A non-linear pipeline allows feed-forward and feedback connections in association
  • 21. Instruction pipeline design: This pipeline reads consecutive instructions from memory while previous instructions are being executed in the other segments.
  • 22. How instructions execute: Instruction execution consists of a sequence of phases. Each phase requires one or more clock cycles to execute. The phases are: instruction fetch; decode; operand fetch; execute; result storage.
  • 23. Basic terms used in instruction pipelines: Instruction pipeline cycle: the clock period of the pipeline. Instruction issue latency: it is clock period. Instruction issue rate: the number of instructions issued per cycle. Simple operation latency: covers simple operations such as add, load, store, branch, and move; it also includes complex operations.
  • 24. Mechanisms of the instruction pipeline: For smooth flow and working of the instruction pipeline, the following mechanisms are used. Prefetch buffers: sequential buffer; target buffer; loop buffer.
  • 25. Mechanisms of the instruction pipeline: Internal data forwarding: it improves the throughput of the pipeline processor. Its core idea is to replace unnecessary memory accesses by register-to-register transfers in a sequence of load, arithmetic and store operations.
  • 26. Mechanisms of the instruction pipeline: Internal data forwarding can be further divided into three types. Store-load forwarding: a store followed by a load of the same word can be replaced by two parallel operations, the store and a register transfer. Load-load forwarding: two loads of the same word can be replaced by one load and one register transfer. Store-store forwarding: two memory updates of the same word can be combined into one, because the second store overwrites the first.
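The three forwarding cases above can be illustrated as a rewrite over adjacent instruction pairs. This is only a sketch: the tuple encoding (`("STORE", address, source_register)`, `("LOAD", destination_register, address)`, `("MOVE", destination_register, source_register)`) and the function name `forward` are assumptions made for the example, and real hardware or compilers must also check for intervening writes, which this sketch avoids by looking only at directly adjacent instructions.

```python
def forward(program):
    """Apply store-load, load-load and store-store forwarding to
    directly adjacent instruction pairs that touch the same address."""
    out = []
    for ins in program:
        prev = out[-1] if out else None
        if prev and prev[0] == "STORE" and ins[0] == "LOAD" and prev[1] == ins[2]:
            # Store-load: the value is still in the stored register,
            # so the load becomes a register transfer.
            out.append(("MOVE", ins[1], prev[2]))
        elif prev and prev[0] == "LOAD" and ins[0] == "LOAD" and prev[2] == ins[2]:
            # Load-load: the second load copies the first load's register.
            out.append(("MOVE", ins[1], prev[1]))
        elif prev and prev[0] == "STORE" and ins[0] == "STORE" and prev[1] == ins[1]:
            # Store-store: the second store overwrites the first.
            out[-1] = ins
        else:
            out.append(ins)
    return out


print(forward([("STORE", "M", "R1"), ("LOAD", "R2", "M")]))
print(forward([("LOAD", "R1", "M"), ("LOAD", "R2", "M")]))
print(forward([("STORE", "M", "R1"), ("STORE", "M", "R2")]))
```

In each case one memory access is removed: the first two outputs keep one memory operation plus a `MOVE`, and the third keeps only the second store.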
  • 27. Difficulties with the instruction pipeline: Resource conflicts: caused when two segments access memory at the same time. Data dependencies: arise when an instruction depends on the result of a previous instruction that is not yet available. Branch difficulties: arise from branch and other instructions that change the sequence of instructions.
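A data dependency of the kind described above can be detected by comparing each instruction's source registers with the destination registers of the instructions just ahead of it. The sketch below assumes a simplified encoding, `(destination, (source registers))`, and a hypothetical `window` parameter for how many preceding instructions may still be in flight; neither comes from the slides. It also ignores the case where an intervening instruction overwrites the same register.

```python
def raw_hazards(program, window=2):
    """Return (producer index, consumer index, register) for every
    instruction that reads a register written by one of the `window`
    instructions directly before it."""
    hazards = []
    for j, (_, sources) in enumerate(program):
        for i in range(max(0, j - window), j):
            dest = program[i][0]
            if dest is not None and dest in sources:
                hazards.append((i, j, dest))
    return hazards


# Hypothetical four-instruction program.
PROGRAM = [
    ("R1", ("R2", "R3")),   # R1 <- R2 op R3
    ("R4", ("R1", "R5")),   # reads R1 right after it is written
    ("R6", ("R7", "R8")),   # independent
    ("R9", ("R1", "R4")),   # R4 written two instructions earlier
]

print(raw_hazards(PROGRAM))
```

With `window=2`, the read of `R1` by the last instruction is not reported, because its producer is three instructions back and is assumed to have completed.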
  • 28. Branch difficulties: The main difficulties arise with conditional branch instructions. Until the instruction is actually executed, it is impossible to determine whether the branch will be taken or not. Prefetch target and loop buffers are used to handle branch difficulties.
  • 29. Branch difficulties: Branch prediction uses some additional logic to guess the outcome of a conditional branch instruction before it is executed. A branch can be predicted in two ways: statically; dynamically.
  • 30. Branch difficulties: Static prediction is usually wired into the processor. A dynamic branch strategy uses recent branch history to predict whether or not the branch will be taken the next time it occurs. For dynamic prediction, additional hardware called a branch target buffer is used. Delayed branch is a further technique for handling branches.
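The slides do not say how recent branch history is recorded. One common scheme, used here purely as an assumption, is a 2-bit saturating counter per branch address: states 0 and 1 predict not taken, states 2 and 3 predict taken, and each actual outcome moves the counter one step. The class name, the starting state and the branch address below are all illustrative.

```python
class TwoBitPredictor:
    """Per-branch 2-bit saturating counters (an assumed scheme)."""

    def __init__(self):
        self.counters = {}  # branch address -> state in 0..3

    def predict(self, pc):
        # Unknown branches start in state 1 (weakly not taken).
        return self.counters.get(pc, 1) >= 2

    def update(self, pc, taken):
        state = self.counters.get(pc, 1)
        self.counters[pc] = min(3, state + 1) if taken else max(0, state - 1)


def accuracy(outcomes, pc=0x40):
    """Fraction of correct predictions for one branch's outcome history."""
    predictor = TwoBitPredictor()
    correct = 0
    for taken in outcomes:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(outcomes)


# A loop branch taken nine times, then falling through once.
print(accuracy([True] * 9 + [False]))
```

For this loop pattern the predictor misses only the first iteration and the final exit.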
  • 31. Branch difficulties: In delayed branch, the compiler detects the branch instructions and rearranges the machine language code sequence by inserting useful instructions that keep the pipeline operating without interruption.
  • 32. Arithmetic Pipeline Design ▪ This technique is used to speed up numerical arithmetic computations. ▪ Arithmetic operations are performed with finite precision because memory words and registers have a fixed size.
  • 33. Arithmetic Pipeline Design ▪ Depending on the function to be implemented, different pipeline stages in an arithmetic unit require different hardware logic. ▪ All arithmetic operations can be implemented with basic add and shift operations.
  • 34. Arithmetic Pipeline Design ▪ High-speed addition requires a carry-propagate adder (CPA) and a carry-save adder (CSA).
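As a rough illustration of how add and shift operations combine with a CSA and a CPA, the sketch below multiplies two non-negative integers. The function names are hypothetical, and Python integers stand in for fixed-width hardware words, so this models the arithmetic only, not the pipeline timing.

```python
def carry_save_add(a, b, c):
    """One CSA level: reduce three operands to a (sum, carry) pair.

    No carry ripples between bit positions; each bit is computed independently.
    """
    partial_sum = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return partial_sum, carry


def add_many(operands):
    """Reduce operands with CSA levels, then finish with one carry-propagate add."""
    ops = list(operands)
    while len(ops) > 2:
        s, c = carry_save_add(ops[0], ops[1], ops[2])
        ops = ops[3:] + [s, c]
    return sum(ops)  # stands in for the final CPA


def shift_add_multiply(x, y):
    """Multiply non-negative integers using only shifts and additions."""
    partial_products = [x << i for i in range(y.bit_length()) if (y >> i) & 1]
    return add_many(partial_products)


print(carry_save_add(3, 5, 7))     # prints (1, 14)
print(shift_add_multiply(13, 11))  # prints 143
```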
  • 35. Shortcut Method of finding Latency & Collision Vector • Forbidden latency set: F = {5} ∪ {2} ∪ {2} = {2, 5}
  • 36. State Diagram ▪ The initial collision vector (ICV) is a binary vector formed from F such that C = (Cn ... C2 C1), where Ci = 1 if i ∈ F and Ci = 0 otherwise. ▪ Thus in our example F = {2, 5} and C = (1 0 0 1 0).
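The construction above can be sketched in a few lines. The function names are hypothetical. The `next_state` rule (shift the current state right by the chosen latency, then OR with the ICV) is the standard way to build the state diagram; it is not spelled out on the slide.

```python
def initial_collision_vector(forbidden, n=None):
    """Return C = (Cn ... C2 C1) as a list, with Ci = 1 if i is a forbidden latency."""
    n = n or max(forbidden)
    return [1 if i in forbidden else 0 for i in range(n, 0, -1)]


def permitted_latencies(state):
    """Latencies i (1..n) whose bit Ci is 0 in the given state."""
    n = len(state)
    return [i for i in range(1, n + 1) if state[n - i] == 0]


def next_state(state, latency, icv):
    """Shift the state right by `latency` bits, then OR it with the ICV."""
    n = len(icv)
    shifted = ([0] * latency + state)[:n]
    return [s | c for s, c in zip(shifted, icv)]


icv = initial_collision_vector({2, 5})
print(icv)                      # prints [1, 0, 0, 1, 0]
print(permitted_latencies(icv)) # prints [1, 3, 4]
print(next_state(icv, 1, icv))  # prints [1, 1, 0, 1, 1]
print(next_state(icv, 3, icv))  # prints [1, 0, 0, 1, 0]
```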
  • 37. Multifunctional pipeline ▪ A pipeline processor that can perform p distinct functions can be described by p reservation tables overlaid together. ▪ Each task to be initiated can be associated with a function tag identifying the reservation table to be used. ▪ Collisions may occur between two or more tasks with the same function tag or with distinct function tags.
  • 38. Multifunctional pipeline ▪ The stage usage for each function can be displayed with a different tag in the overlaid reservation table.
  • 39. Pipeline Hazards ▪ Data hazards: an instruction depends on the result of a previous instruction. A hazard occurs exactly when an instruction tries to read a register in its ID stage that an earlier instruction intends to write in its WB stage, before that write has taken place. ▪ Control hazards: the location of the next instruction depends on a previous instruction. They are caused by branches and other instructions that change the PC.
  • 40. Pipeline Hazards ▪ Structural hazards: two instructions need to access the same resource at the same time. ▪ This is a resource conflict. ▪ The hardware cannot support all possible combinations of instructions in simultaneous overlapped execution.
  • 41. Stalling ▪ Stalling involves halting the flow of instructions until the required result is ready to be used. However, stalling wastes processor time by doing nothing while waiting for the result. ▪ A stall is the delay, in cycles, caused by any of the hazards mentioned above. ▪ How to calculate speedup: Speedup = Number of stages / (1 + pipeline stalls per instruction)
  • 42. Stalling ▪ So what is the speedup for an ideal pipeline with no stalls? With zero stalls per instruction, the formula gives a speedup equal to the number of stages. ▪ The number of cycles needed to initially fill the pipeline can be included when computing the average stall per instruction.
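The speedup formula can be checked numerically with a short sketch. The function names are hypothetical, and `speedup_with_fill` assumes the fill cost is the (number of stages - 1) start-up cycles, averaged over the instructions executed.

```python
def pipeline_speedup(num_stages, stalls_per_instruction):
    """Speedup = number of stages / (1 + pipeline stalls per instruction)."""
    return num_stages / (1 + stalls_per_instruction)


def speedup_with_fill(num_stages, n_instructions, stall_cycles=0):
    """Same formula, counting the pipeline fill cycles as stalls (an assumption)."""
    average_stall = (stall_cycles + num_stages - 1) / n_instructions
    return pipeline_speedup(num_stages, average_stall)


print(pipeline_speedup(5, 0))      # ideal pipeline: prints 5.0
print(pipeline_speedup(5, 0.25))   # prints 4.0
print(speedup_with_fill(5, 4))     # 20 unpipelined cycles vs 8 pipelined: prints 2.5
```

As the instruction count grows, the fill cost is amortized and `speedup_with_fill` approaches the ideal value.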
  • 43. Structural hazards ▪ When more than one instruction in the pipeline needs to access the same resource in the same cycle, the data path is said to have a structural hazard. ▪ Examples of resources: register file, memory, ALU. ▪ Solution: stall the pipeline for one clock cycle when the conflict is detected. This results in a pipeline bubble.
  • 44. Data hazard - solution ▪ Usually solved by data or register forwarding (also called bypassing or short-circuiting). ▪ How is it done? ▪ The stale value read from the register file is not actually used; the result is forwarded directly from the stage that produced it.
  • 45. Data hazard classification ▪ RAW (Read After Write): the most common case, solved by data forwarding. ▪ WAW (Write After Write): instruction i (a load) comes before instruction j (an add) and both write to the same register, so the writes must complete in program order. ▪ WAR (Write After Read): instruction j tries to write a destination before it is read by instruction i, so i incorrectly gets the new value.
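The three cases can be told apart by comparing register names. The sketch below is a simplified illustration with hypothetical names (`Instr`, `classify_hazards`); it reports potential hazards between an earlier instruction i and a later instruction j, and whether they cause a stall depends on the actual pipeline.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    dest: Optional[str]      # register written, or None
    srcs: Tuple[str, ...]    # registers read


def classify_hazards(i, j):
    """Return the set of potential data hazards when i precedes j in program order."""
    hazards = set()
    if i.dest is not None and i.dest in j.srcs:
        hazards.add("RAW")   # j reads what i writes
    if j.dest is not None and j.dest in i.srcs:
        hazards.add("WAR")   # j writes what i reads
    if i.dest is not None and i.dest == j.dest:
        hazards.add("WAW")   # both write the same register
    return hazards


add = Instr("R1", ("R2", "R3"))
sub = Instr("R4", ("R1", "R5"))
print(classify_hazards(add, sub))  # prints {'RAW'}
```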
  • 46. Dynamic Instruction Scheduling ▪ With dynamic scheduling, the hardware tries to rearrange the instructions at run time to reduce pipeline stalls. ▪ Allows a simpler compiler. ▪ Handles dependencies not known at compile time. ▪ Allows code compiled for a different machine to run efficiently.
  • 47. Out-Of-Order Execution ▪ With out-of-order execution, the SUBD is allowed to execute before the ADD. ▪ This can lead to out-of-order completion, which can cause WAW and WAR hazards.
  • 48. Scoreboarding ▪ The scoreboard implements a centralized control scheme that: ▪ Detects all resource and data hazards ▪ Allows instructions to execute out of order when there are no resource hazards or data dependencies ▪ First implemented in 1964 by the CDC 6600, which had 18 separate functional units: ▪ 4 FP units (2 multiply, 1 add, 1 divide) ▪ 7 memory units (5 loads, 2 stores) ▪ 7 integer units (add, shift, logical, compare, etc.)
  • 50. Scoreboard Implications ▪ Our dynamic pipeline (much simpler): ▪ 2 FP multiply (10 EX cycles) ▪ 1 FP add (2 EX cycles) ▪ 1 FP divide (40 EX cycles) ▪ 1 integer unit (1 EX cycle)
  • 51. Scoreboard Implications ▪ Out-of-order completion can lead to WAR and WAW hazards. ▪ Solution for WAW: ▪ Detect the WAW hazard before reading operands ▪ Stall the write until the other instruction completes
  • 52. Scoreboard Implications ▪ Solution for WAR: ▪ Detect WAR hazards before writing back to the register file and stall the write-back ▪ This scoreboard does not take advantage of forwarding (i.e. bypasses), since it waits until both results are written back to the register file ▪ The scoreboard replaces DR, EX, WB with 4 stages
  • 54. Stages of Scoreboard Control ▪ Decode + Issue (Issue) ▪ Read operands (Read) ▪ Execution (EX) ▪ Write result (WB)
  • 55. Parts of the Scoreboard ▪ Instruction status: which of the 4 steps the instruction is in (Issue, Read, EX, or WB). ▪ Functional unit status: indicates the state of the functional unit (FU), with 9 fields for each functional unit. ▪ Register result status: indicates which functional unit will write each register, if one exists. Blank when no pending instruction will write that register.
  • 56. Parts of the Scoreboard ▪ Busy: indicates whether the unit is busy or not ▪ Op: operation to perform in the unit (e.g., + or –) ▪ Fi: destination register ▪ Fj, Fk: source-register numbers ▪ Qj, Qk: functional units producing source registers Fj, Fk ▪ Rj, Rk: flags indicating when Fj, Fk are ready
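The nine functional unit fields and the register result status map naturally onto a small data structure. The sketch below is illustrative only: the names `FunctionalUnit` and `try_issue` are hypothetical, and the issue check (unit free, no pending write to the same destination) is a simplified reading of the issue stage, not a full scoreboard.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FunctionalUnit:
    busy: bool = False          # Busy
    op: Optional[str] = None    # Op
    fi: Optional[str] = None    # Fi: destination register
    fj: Optional[str] = None    # Fj: first source register
    fk: Optional[str] = None    # Fk: second source register
    qj: Optional[str] = None    # Qj: unit producing Fj, if any
    qk: Optional[str] = None    # Qk: unit producing Fk, if any
    rj: bool = False            # Rj: Fj is ready
    rk: bool = False            # Rk: Fk is ready


def try_issue(units: Dict[str, FunctionalUnit], register_result: Dict[str, str],
              unit_name: str, op: str, dest: str, src1: str, src2: str) -> bool:
    """Issue to `unit_name` unless there is a structural or WAW hazard."""
    if units[unit_name].busy or dest in register_result:
        return False  # stall: unit occupied, or another unit will write `dest`
    qj = register_result.get(src1)  # None means "blank": no pending writer
    qk = register_result.get(src2)
    units[unit_name] = FunctionalUnit(True, op, dest, src1, src2,
                                      qj, qk, qj is None, qk is None)
    register_result[dest] = unit_name
    return True


units = {"Mult1": FunctionalUnit(), "Add": FunctionalUnit(), "Div": FunctionalUnit()}
register_result: Dict[str, str] = {}
print(try_issue(units, register_result, "Mult1", "MUL", "F0", "F2", "F4"))  # prints True
print(try_issue(units, register_result, "Add", "ADD", "F6", "F0", "F8"))    # prints True
print(units["Add"].qj, units["Add"].rj)                                     # prints Mult1 False
print(try_issue(units, register_result, "Div", "DIV", "F0", "F2", "F4"))    # WAW: prints False
```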
  • 57. Tomasulo Algorithm for Dynamic Scheduling ▪ Developed for the IBM 360/91 in 1967, about 3 years after the CDC 6600. ▪ Goal: high performance without special compilers. ▪ Differences between the IBM 360 and the CDC 6600: ▪ IBM has only 2 register specifiers per instruction vs. 3 in the CDC 6600 ▪ IBM has register-memory instructions ▪ IBM has 4 FP registers vs. 8 in the CDC 6600 ▪ IBM has pipelined functional units (3 adds, 2 multiplies)
  • 58. Tomasulo Algorithm ▪ The Tomasulo algorithm is designed to handle name dependencies (WAW and WAR hazards) efficiently.
  • 59. Tomasulo Algorithm Advantage ▪ Prevents the register file from being the bottleneck. ▪ Eliminates WAR and WAW hazards. ▪ Allows loop unrolling in hardware. ▪ Common Data Bus: broadcasts results to multiple instructions, but is a central bottleneck. ▪ Provides dynamic scheduling. ▪ Provides register renaming. ▪ Provides load/store disambiguation.
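A toy sketch of why register renaming removes WAR and WAW hazards. It is not the full Tomasulo algorithm: there are no reservation station limits and no common data bus, and the tag names `RS0`, `RS1`, ... are hypothetical stand-ins for reservation station identifiers.

```python
def rename(instructions):
    """Rename each destination to a fresh tag; sources follow the latest mapping.

    Each instruction is a (dest, src1, src2) tuple of register names.
    """
    latest = {}   # architectural register -> tag of its most recent producer
    renamed = []
    for n, (dest, src1, src2) in enumerate(instructions):
        s1 = latest.get(src1, src1)   # read the newest value (keeps RAW intact)
        s2 = latest.get(src2, src2)
        tag = f"RS{n}"
        latest[dest] = tag            # later readers of dest now wait on this tag
        renamed.append((tag, s1, s2))
    return renamed


program = [
    ("F0", "F2", "F4"),    # reads F2
    ("F2", "F6", "F8"),    # WAR on F2 with the first instruction
    ("F0", "F2", "F10"),   # WAW on F0 with the first, RAW on F2 with the second
]
print(rename(program))
# prints [('RS0', 'F2', 'F4'), ('RS1', 'F6', 'F8'), ('RS2', 'RS1', 'F10')]
```

After renaming, no two instructions write the same name and no instruction writes a name that an earlier one reads, so only the true (RAW) dependence on `RS1` remains.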