Pipelining & All Hazards Solution

.AIR UNIVERSITY ISLAMABAD
Student at University of Azad Jammu and Kashmir, at .AIR UNIVERSITY ISLAMABAD
Mansoor Bashir
Presentation
Presentation Topic: Pipelining & All Hazards Solution
Content
Introduction to pipeline hazard
Structural Hazard
Data Hazard
Control Hazard
Pipelining
What is Pipelining?
It is an implementation technique in which multiple tasks are
performed in an overlapped manner.
When can pipelining be implemented?
It can be implemented when a task is divided into two or more subtasks
that can be performed independently.
Classic 5-stage pipeline:
1) Instruction Fetch (Ifetch),
2) Register Read (Reg),
3) Execute (ALU),
4) Data Memory Access (Dmem),
5) Register Write (Reg)
Pipelined Instruction Execution
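The overlapped execution can be sketched as a small diagram generator. This is only an illustration, assuming one instruction enters the pipeline per cycle and no hazards occur; the names `STAGES`, `pipeline_diagram` and `total_cycles` are hypothetical, and the stage abbreviations IF, ID, EX, MEM, WB are the ones used in the later slides.

```python
# Stage abbreviations as used in the timing tables of the later slides.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_diagram(instructions):
    """Return (name, cells) rows; instruction i starts i cycles after the first."""
    rows = []
    for i, name in enumerate(instructions):
        cells = [""] * i + STAGES  # shift each instruction right by one cycle
        rows.append((name, cells))
    return rows


def total_cycles(n_instructions, n_stages=len(STAGES)):
    """Cycles needed when no hazard occurs (assumption: ideal pipeline)."""
    # The first instruction takes n_stages cycles; each later one
    # completes exactly one cycle after its predecessor.
    return n_stages + n_instructions - 1


if __name__ == "__main__":
    for name, cells in pipeline_diagram(["i1", "i2", "i3"]):
        print(f"{name:<4}" + "".join(f"{c:<5}" for c in cells))
    print("cycles for 3 instructions:", total_cycles(3))
```

Without pipelining the same three instructions would need 3 × 5 = 15 cycles, which is where the ideal speedup comes from.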
Pipeline Hazards
• Hazard: a condition or situation that does not allow the
pipeline to operate normally.
• Hazards reduce performance from the ideal speedup
gained by pipelining
• Hazards can force the pipeline to stall
• Eliminating a hazard often requires that some
instructions in the pipeline be allowed to proceed
while others are delayed
– When an instruction is stalled, instructions issued later
than the stalled instruction are also stopped, while the ones
issued earlier must continue
Pipeline Hazards
• No new instructions are fetched during the stall
• Three types of hazards
– Structural hazards
– Data hazards
– Control hazards
Structural Hazards
A structural hazard occurs when a part of the
processor's hardware is needed by two or more
instructions at the same time.
The hardware cannot support that combination of instructions in the same cycle.
Structural hazards can be avoided by stalling,
duplicating the resource, or pipelining the resource.
Structural Hazards
• Consider a Von Neumann architecture (same memory for instructions
and data)
Structural Hazards
• Stall cycle added (commonly called pipeline bubble)
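The single-memory conflict can be sketched as follows. This is a simplified model, assuming one memory port shared by instruction fetch and data access, MEM occurring three cycles after IF, and no other hazards; the function name `fetch_cycles_single_memory` is hypothetical.

```python
def fetch_cycles_single_memory(uses_data_memory):
    """Return the 1-based fetch cycle of each instruction.

    uses_data_memory[i] is True when instruction i is a load or store.
    Assumption: a fetch must wait whenever an earlier instruction uses
    the single memory port for its data access in the same cycle.
    """
    fetch_cycles = []
    busy = set()  # cycles in which the memory port serves a data access
    cycle = 1
    for is_mem in uses_data_memory:
        while cycle in busy:
            cycle += 1  # pipeline bubble: the fetch is delayed one cycle
        fetch_cycles.append(cycle)
        if is_mem:
            busy.add(cycle + 3)  # MEM stage is three cycles after IF
        cycle += 1
    return fetch_cycles


if __name__ == "__main__":
    # A load followed by three other instructions.
    print(fetch_cycles_single_memory([True, False, False, False]))
```

In this model the third instruction after the load is fetched in cycle 5 instead of cycle 4, so it completes in cycle 9 rather than cycle 8.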
Structural Hazards
Topic: Data Hazards
Data Hazards
• Data hazards occur when the pipeline changes the
order of read/write accesses to operands so that the
order differs from the order seen by sequentially
executing instructions on an un-pipelined machine
• Consider the execution of following instructions, on
our pipelined example processor:
– ADD R1, R2, R3
– SUB R4, R1, R5
– AND R6, R1, R7
– OR R8, R1, R9
– XOR R10, R1, R11
Data Hazards
• Using the result of the ADD instruction causes a hazard, since register R1 is
not written until after some of the following instructions have already read it.
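A rough way to check which instructions of the example would read a stale R1 is sketched below. It assumes no forwarding and no stalls, register reads in ID and writes in WB, and (optionally) a register file that writes in the first half of a cycle and reads in the second half; the helper names `parse` and `stale_readers` are hypothetical.

```python
def parse(instr):
    """'ADD R1, R2, R3' -> ('ADD', 'R1', ['R2', 'R3'])."""
    op, rest = instr.split(maxsplit=1)
    regs = [r.strip() for r in rest.split(",")]
    return op, regs[0], regs[1:]


def stale_readers(program, split_regfile=True):
    """Opcodes of instructions that read a register before it is written."""
    hazards = []
    for j, instr_j in enumerate(program):
        op_j, _, srcs = parse(instr_j)
        id_cycle = j + 2  # instruction j is fetched in cycle j + 1
        for i in range(j):
            _, dest, _ = parse(program[i])
            wb_cycle = i + 5  # WB is four cycles after IF
            if split_regfile:
                too_early = id_cycle < wb_cycle   # same-cycle read is safe
            else:
                too_early = id_cycle <= wb_cycle  # same-cycle read is stale
            if dest in srcs and too_early:
                hazards.append(op_j)
                break
    return hazards


PROGRAM = [
    "ADD R1, R2, R3",
    "SUB R4, R1, R5",
    "AND R6, R1, R7",
    "OR R8, R1, R9",
    "XOR R10, R1, R11",
]

if __name__ == "__main__":
    print(stale_readers(PROGRAM))
    print(stale_readers(PROGRAM, split_regfile=False))
```

With the split register file only SUB and AND are affected; without it, OR would also read the old value, while XOR is safe in both cases.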
Data Hazards
• The stalls for the hazard involving the SUB and AND
instructions can be eliminated using a technique called forwarding
Data Hazards
• A store requires an operand during MEM, so forwarding to the memory stage is needed as well.
– The result of the load is forwarded from the output in MEM/WB to the memory
input to be stored
– In addition, the ALUOutput is forwarded to the ALU input for address calculation
for both Load and Store
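The operand selection performed by forwarding hardware can be sketched as a small multiplexer function. This is a minimal model, assuming each pipeline register is represented as a `(dest_reg, value)` tuple or `None`; the name `select_alu_operand` and that representation are hypothetical, not taken from the slides.

```python
def select_alu_operand(src_reg, regfile, ex_mem=None, mem_wb=None):
    """Pick the value for one ALU source register.

    ex_mem / mem_wb: (dest_reg, value) of the instruction currently held
    in that pipeline register, or None (assumed representation).
    Returns (value, source_name).
    """
    # The most recent producer wins, so EX/MEM is checked first.
    if ex_mem is not None and ex_mem[0] == src_reg:
        return ex_mem[1], "EX/MEM"
    if mem_wb is not None and mem_wb[0] == src_reg:
        return mem_wb[1], "MEM/WB"
    # No in-flight producer: use the value read from the register file.
    return regfile[src_reg], "regfile"


if __name__ == "__main__":
    regs = {"R1": 0, "R5": 7}           # R1 in the register file is stale
    add_result = ("R1", 42)             # ADD R1, R2, R3 now sits in EX/MEM
    print(select_alu_operand("R1", regs, ex_mem=add_result))
    print(select_alu_operand("R5", regs, ex_mem=add_result))
```

Checking EX/MEM before MEM/WB matters when two in-flight instructions write the same register: the younger result is the one the consumer should see.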
Data Hazards Classification
• Depending on the order of read and write accesses in the
instructions, data hazards can be classified into three types.
• Consider two instructions i and j, with i occurring before j.
Possible data hazards:
– RAW (Read After Write)
• j tries to read a source before i writes it, so j incorrectly gets the old
value
• This is the most common type of hazard, and the one discussed so far.
– WAW (Write After Write)
• j tries to write an operand before it is written by i. The writes end up being
performed in the wrong order: i overwrites the value written by j, leaving the
destination with the value written by i rather than the one written
by j
• Present only in pipelines that write in more than one pipe stage
– WAR (Write After Read)
• j tries to write a destination before it is read by i, so instruction i
incorrectly gets the new value
• This cannot happen in our example pipeline, since all reads are early and all writes are late
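The three categories can be expressed as set intersections over the registers each instruction reads and writes. The sketch below only reports which dependences exist between i and j; whether a dependence becomes an actual hazard depends on the pipeline. The dictionary layout and the name `classify_dependences` are assumptions made for illustration.

```python
def classify_dependences(i, j):
    """Classify dependences between instruction i and a later instruction j.

    i, j: dicts with 'reads' and 'writes' sets of register names
    (assumed representation).
    """
    kinds = []
    if i["writes"] & j["reads"]:
        kinds.append("RAW")  # j reads what i writes
    if i["writes"] & j["writes"]:
        kinds.append("WAW")  # both write the same register
    if i["reads"] & j["writes"]:
        kinds.append("WAR")  # j writes what i reads
    return kinds


if __name__ == "__main__":
    add = {"reads": {"R2", "R3"}, "writes": {"R1"}}   # ADD R1, R2, R3
    sub = {"reads": {"R1", "R5"}, "writes": {"R4"}}   # SUB R4, R1, R5
    print(classify_dependences(add, sub))
```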
Data Hazards Requiring Stalls
• Unfortunately not all data hazards can be handled by
forwarding. Consider the following sequence:
– LW R1, 0(R2)
– SUB R4, R1, R5
– AND R6, R1, R7
– OR R8, R1, R9
• The problem with this sequence is that the load
operation will not have its data until the end of the MEM
stage.
Data Hazards Requiring Stalls
• The load instruction can forward its result to the AND and OR
instructions, but not to the SUB instruction, since that would mean
forwarding the result in “negative” time
Data Hazards Requiring Stalls
• The load interlock causes a stall to be inserted at clock cycle 4,
delaying the SUB instruction and those that follow by one cycle.
– This delay allows the value to be forwarded successfully in the next clock
cycle
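The effect of the load interlock on timing can be sketched by computing the cycle in which each instruction enters EX. The model assumes in-order issue, full forwarding, and a one-cycle stall only when an instruction uses the destination of the load immediately before it; the tuple layout `(op, dest, srcs)` and the name `ex_cycles` are hypothetical.

```python
def ex_cycles(program):
    """Return the clock cycle in which each instruction enters EX.

    program: list of (op, dest, srcs) tuples (assumed representation).
    """
    cycles = []
    for k, (_, _, srcs) in enumerate(program):
        if k == 0:
            ex = 3  # IF in cycle 1, ID in cycle 2, EX in cycle 3
        else:
            ex = cycles[-1] + 1  # in order, one instruction per cycle
            prev_op, prev_dest, _ = program[k - 1]
            if prev_op == "LW" and prev_dest in srcs:
                ex += 1  # load data is only available after MEM
        cycles.append(ex)
    return cycles


LOAD_SEQUENCE = [
    ("LW", "R1", ["R2"]),         # LW  R1, 0(R2)
    ("SUB", "R4", ["R1", "R5"]),  # SUB R4, R1, R5
    ("AND", "R6", ["R1", "R7"]),  # AND R6, R1, R7
    ("OR", "R8", ["R1", "R9"]),   # OR  R8, R1, R9
]

if __name__ == "__main__":
    print(ex_cycles(LOAD_SEQUENCE))
```

SUB enters EX in cycle 5 instead of cycle 4, and the instructions behind it are pushed back by the same single cycle.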
Data Hazards Requiring Stalls
• Before stall insertion
Clock cycle       1     2     3     4     5     6     7     8
LW  R1, 0(R2)     IF    ID    EX    MEM   WB
SUB R4, R1, R5          IF    ID    EX    MEM   WB
AND R6, R1, R7                IF    ID    EX    MEM   WB
OR  R8, R1, R9                      IF    ID    EX    MEM   WB
• After stall insertion
Clock cycle       1     2     3     4     5     6     7     8     9
LW  R1, 0(R2)     IF    ID    EX    MEM   WB
SUB R4, R1, R5          IF    ID    stall EX    MEM   WB
AND R6, R1, R7                IF    stall ID    EX    MEM   WB
OR  R8, R1, R9                      stall IF    ID    EX    MEM   WB
Compiler Scheduling for Data Hazards
• Consider a typical code fragment, such as A = B + C
Clock cycle       1     2     3     4     5     6     7     8     9
LW  R1, B         IF    ID    EX    MEM   WB
LW  R2, C               IF    ID    EX    MEM   WB
ADD R3, R1, R2                IF    ID    stall EX    MEM   WB
SW  A, R3                           IF    stall ID    EX    MEM   WB
• The ADD instruction must be stalled to allow the load of C to complete
• The SW need not be delayed further, because the forwarding hardware passes the result from MEM/WB directly to the data memory input for storing
Compiler Scheduling for Data Hazards
• Rather than just allowing the pipeline to stall, the
compiler can try to schedule the code to avoid
the stalls by rearranging the instructions
– The compiler tries to avoid generating code
with a load followed immediately by a use of the load
destination register
– This technique is called pipeline scheduling or
instruction scheduling, and it is widely used in
modern compilers
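The benefit of scheduling can be illustrated by counting load-use stalls before and after reordering. The two-statement fragment (a = b + c; d = e - f), its register assignment, and the name `count_load_use_stalls` are hypothetical choices for this sketch; the reordering is done by hand here, not by an actual scheduling algorithm.

```python
def count_load_use_stalls(program):
    """Count stalls caused by using a load result in the very next instruction.

    program: list of (op, dest, srcs) tuples (assumed representation).
    """
    stalls = 0
    for prev, curr in zip(program, program[1:]):
        prev_op, prev_dest, _ = prev
        _, _, curr_srcs = curr
        if prev_op == "LW" and prev_dest in curr_srcs:
            stalls += 1
    return stalls


# Straightforward code for: a = b + c; d = e - f (hypothetical example)
UNSCHEDULED = [
    ("LW", "R1", []),             # LW  R1, b
    ("LW", "R2", []),             # LW  R2, c
    ("ADD", "R3", ["R1", "R2"]),  # uses R2 right after its load
    ("SW", None, ["R3"]),         # SW  a, R3
    ("LW", "R4", []),             # LW  R4, e
    ("LW", "R5", []),             # LW  R5, f
    ("SUB", "R6", ["R4", "R5"]),  # uses R5 right after its load
    ("SW", None, ["R6"]),         # SW  d, R6
]

# Same instructions, reordered so no load is followed by its consumer.
SCHEDULED = [
    ("LW", "R1", []),
    ("LW", "R2", []),
    ("LW", "R4", []),
    ("ADD", "R3", ["R1", "R2"]),
    ("LW", "R5", []),
    ("SW", None, ["R3"]),
    ("SUB", "R6", ["R4", "R5"]),
    ("SW", None, ["R6"]),
]

if __name__ == "__main__":
    print("unscheduled stalls:", count_load_use_stalls(UNSCHEDULED))
    print("scheduled stalls:  ", count_load_use_stalls(SCHEDULED))
```

The reordered version keeps every data dependence intact but places an independent instruction after each load, which is the essence of pipeline scheduling.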
Topic: Control Hazards
Control Hazards
• Control hazards can cause a greater performance loss than data hazards
• When a branch is executed, it may or may not change the
PC (to a value other than its current value + 4)
– If a branch changes the PC to its target address, then it is a
taken branch
– If a branch does not change the PC to its target address, then it is a
not-taken branch
• If instruction i is a taken branch, then the value of the PC will
not change until the end of the MEM stage of that instruction's
execution in the pipeline
– A simple method to deal with branches is to stall the pipeline as soon
as a branch is detected, until the result of the branch is known
Control Hazards
• A branch causes a three-cycle stall in our example processor
pipeline
– One cycle is a repeated IF, which is necessary if the branch is
taken. If the branch is not taken, this IF is redundant
– Two idle cycles
Clock cycle           1     2     3     4     5     6     7     8     9     10
Branch instruction    IF    ID    EX    MEM   WB
Branch successor            IF    stall stall IF    ID    EX    MEM   WB
Branch successor + 1                                IF    ID    EX    MEM   WB
Branch successor + 2                                      IF    ID    EX    MEM
Control Hazards
• The three clock cycles lost for every branch are a
significant loss
– With a 30% branch frequency and an ideal CPI of 1, the effective CPI
becomes 1 + 0.30 × 3 = 1.9, so the machine with branch
stalls achieves only about half of the ideal speedup from
pipelining
– Reducing the branch penalty becomes critical
• The number of clock cycles in a branch stall can be
reduced in two steps:
– Find out whether the branch is taken or not at an early stage in the
pipeline
– Compute the taken PC (the address of the branch target)
earlier
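The "about half" figure follows from a short calculation, sketched here under the assumption of an ideal CPI of 1 and no stalls other than branch stalls. The function names are hypothetical.

```python
def effective_cpi(branch_freq, branch_penalty, ideal_cpi=1.0):
    """Average cycles per instruction when every branch stalls the pipeline."""
    return ideal_cpi + branch_freq * branch_penalty


def fraction_of_ideal_speedup(branch_freq, branch_penalty, ideal_cpi=1.0):
    """Share of the ideal pipelining speedup that survives the branch stalls."""
    return ideal_cpi / effective_cpi(branch_freq, branch_penalty, ideal_cpi)


if __name__ == "__main__":
    for penalty in (3, 1):
        cpi = effective_cpi(0.30, penalty)
        share = fraction_of_ideal_speedup(0.30, penalty)
        print(f"penalty={penalty}: CPI={cpi:.2f}, share of ideal speedup={share:.2f}")
```

With a three-cycle penalty the pipeline keeps roughly 53% of its ideal speedup; cutting the penalty to one cycle would raise that to roughly 77%, which is why reducing the branch penalty is worth extra hardware.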
Control Hazards
The stall from branch hazards can be reduced by moving the zero test and branch target calculation into the ID
phase of the pipeline. This uses a separate adder to compute the branch target address during ID.
Because the branch target addition happens during ID, it happens for all instructions. The
branch condition (Regs[IF/ID.IR6…10] op 0) is also evaluated for all instructions. The selection
of the sequential PC or the branch target PC still occurs during IF, but it now uses values
from the ID phase rather than from the EX/MEM register. In this case, the branch instruction is done by
the end of the ID phase, so the EX, MEM and WB stages are no longer used for branch instructions.
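The next-PC selection with the branch resolved in ID can be sketched as below. This is only an illustration under several assumptions not stated in the slides: the condition `op` is taken to be "equal to zero", the offset is a byte offset relative to PC + 4, and the function name and parameters are hypothetical.

```python
def next_pc_from_id(pc, is_branch, reg_value, offset):
    """Choose the next PC using values available in the ID stage.

    pc: address of the instruction currently in ID.
    Assumptions: branch-if-zero condition, byte offset relative to pc + 4.
    """
    sequential = pc + 4
    # Separate adder: the target is computed for every instruction,
    # whether or not it turns out to be a branch.
    target = sequential + offset
    # The zero test is likewise evaluated for every instruction.
    is_zero = reg_value == 0
    taken = is_branch and is_zero
    return target if taken else sequential


if __name__ == "__main__":
    print(next_pc_from_id(pc=100, is_branch=True, reg_value=0, offset=40))
    print(next_pc_from_id(pc=100, is_branch=True, reg_value=5, offset=40))
    print(next_pc_from_id(pc=100, is_branch=False, reg_value=0, offset=40))
```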
• END
Editor's Notes

  1. A pipeline stall is a delay in the execution of an instruction in an instruction pipeline in order to resolve a hazard.
  2. As a result, when an instruction performs a data reference, it conflicts with an instruction fetch. In this example, the load instruction wants to access memory to load data at the same time that instruction 3 wants to fetch an instruction from memory.
  3. To solve the problem, a stall cycle is added. The effect of the pipeline bubble is to occupy the resources for that instruction slot as it travels through the pipeline. Performance-wise, instruction 3 will not complete during clock cycle 8, but during clock cycle 9. The structural hazard is resolved with a stall cycle: instruction 3 waits until the memory unit becomes free and only then enters the first stage of the pipeline, so one cycle is wasted due to the stall.
  4. All the instructions after ADD use the result from ADD.
  5. The ADD instruction writes its result to register R1 only at the WB stage, but the SUB instruction reads the value during its ID stage. This is what is called a data hazard. Unless precautions are taken, the SUB instruction will read the wrong value and use it. The AND instruction is also affected by this hazard. As we can see from the figure, the write of R1 does not complete until the end of clock cycle 5. Thus, the AND instruction, which reads the registers in clock cycle 4, will receive the wrong result. The XOR instruction operates correctly: it reads its inputs (in clock cycle 6) after the ADD has written its result (in clock cycle 5). The OR instruction can also be made to work without incurring a hazard, using a simple implementation technique: perform the register file reads in the second half of the clock cycle and the writes in the first half.
  6. In certain circumstances, a data hazard can be solved using an implementation technique called forwarding. The idea behind forwarding is that the result produced by ADD is not really needed by the SUB instruction until after it is actually produced. If the result can be moved from where the ADD instruction produces it, the EX/MEM register, to where the SUB needs it, the ALU input latches, then the need for a stall can be avoided. Forwarding works as follows: the ALU result from the EX/MEM register is always fed back to the ALU input latches. If the forwarding hardware detects that the previous ALU operation has written the register corresponding to a source for the current ALU operation, control logic selects the forwarded result as the ALU input, rather than the value read from the register file. Results need to be forwarded not only from the immediately previous instruction, but possibly also from instructions that started two or three cycles earlier.
  7. To optimize branch behavior, both of the steps should be taken.
  8. It uses a separate adder to compute the branch target address during ID. Because the branch target addition happens during ID, it happens for all instructions. The branch condition (Regs[IF/ID.IR6…10] op 0) is also evaluated for all instructions. The selection of the sequential PC or the branch target PC still occurs during IF, but it now uses values from the ID phase rather than from the EX/MEM register. In this case, the branch instruction is done by the end of the ID phase, so the EX, MEM and WB stages are no longer used for branch instructions.