SlideShare uma empresa Scribd logo
1 de 31
Baixar para ler offline
Pipelining
www.sarithdivakar.info
Objectives
• Basics of pipelining and how it can lead to improved performance.
• Hazards that cause performance degradation and techniques to
alleviate their effect on performance
• Role of optimizing compilers, which rearrange the sequence of
instructions to maximize the benefits of pipelined execution
• Replicating hardware units in a superscalar processor so that multiple
pipelines can operate concurrently.
Basic Concepts
• Pipelining overlaps the execution of successive instructions to achieve
high performance
• The speed of execution of programs is improved by
• using faster circuit technology
• arranging the hardware so that more than one operation can be performed at
the same time
• The number of operations performed per second is increased, even
though the time needed to perform any one operation is not changed
• Pipelining is a particularly effective way of organizing concurrent
activity in a computer system.
Pipelined Execution
Pipeline Organization
Pipelining Issues
• Any condition that causes the pipeline to stall is called a hazard
• Data hazard: Value of a source operand of an instruction is not
available when needed
• Other hazards arise from memory delays, branch instructions, and
resource limitations
Writing to Register file
Operation using the old
value of the register
Data Dependencies
• Add R2, R3, #100
• Subtract R9, R2, #30
• The Subtract instruction is stalled for three cycles to delay reading
register R2 until cycle 6 when the new value becomes available.
Operand Forwarding
• Pipeline stalls due to data dependencies can be alleviated through the
use of operand forwarding
Handling Data Dependencies in Software
Add R2, R3, #100
NOP
NOP
NOP
Subtract R9, R2, #30
• Code size increases
• Execution time not reduced
The compiler can attempt to optimize the code to improve performance and reduce the code size by
reordering instructions to move useful instructions into the NOP slots
Memory Delays
• Load instruction may require more than one clock cycle to obtain its
operand from memory.
• Load R2, (R3)
• Subtract R9, R2, #30
Operand forwarding
• Operand forwarding cannot be done because the data read from
memory (the cache, in this case) are not available until they are
loaded into register RY at the beginning of cycle 5
The memory operand, which is now in register RY, can be forwarded to the ALU input in cycle 5.
Branch Delays
• Branch instructions can alter the sequence of execution, but they
must first be executed to determine whether and where to branch.
• Branch instructions occur frequently. In fact, they represent about 20
percent of the dynamic instruction count of most programs.
• Reducing the branch penalty requires the branch target address to be
computed earlier in the pipeline
With a two-cycle branch penalty, the relatively high frequency of branch instructions could
increase the execution time for a program by as much as 40 percent.
Unconditional Branches
• Reducing the branch penalty requires the branch target address to be
computed earlier in the pipeline
Target address is determined in the Decode
stage of the pipeline.
Conditional Branches
• For pipelining, the branch condition must be tested as early as
possible to limit the branch penalty
The Branch Delay Slot
• The location that follows a branch instruction is called the branch
delay slot.
• Rather than conditionally discard the instruction in the delay slot, we
can arrange to have the pipeline always execute this instruction,
whether or not the branch is taken
Branch Prediction
• To reduce the branch penalty further, the processor needs to
anticipate that an instruction being fetched is a branch instruction
• Predict its outcome to determine which instruction should be fetched
• Static Branch Prediction
• Dynamic Branch Prediction
Static Branch Prediction
• The simplest form of branch prediction is to assume that the branch
will not be taken and to fetch the next instruction in sequential
address order
• If the prediction is correct, the fetched instruction is allowed to
complete and there is no penalty
• Misprediction incurs the full branch penalty
Dynamic Branch Prediction
• The processor hardware assesses the likelihood of a given branch
being taken by keeping track of branch decisions every time that a
branch instruction is executed.
• dynamic prediction algorithm can use the result of the most recent
execution of a branch instruction
• The processor assumes that the next time the instruction is executed,
the branch decision is likely to be the same as the last time.
A 2-state algorithm
• When the branch instruction is executed and the branch is taken, the
machine moves to state LT. Otherwise, it remains in state LNT.
• The next time the same instruction is encountered, the branch is
predicted as taken if the state machine is in state LT. Otherwise it is
predicted as not taken.
LT - Branch is likely to be taken
LNT - Branch is likely not to be taken
A 4-state algorithm
ST - Strongly likely to be taken
LT - Likely to be taken
LNT - Likely not to be taken
SNT - Strongly likely not to be taken
• After the branch instruction is executed,
and if the branch is actually taken, the
state is changed to ST; otherwise, it is
changed to SNT.
• The branch is predicted as taken if the
state is either ST or LT. Otherwise, the
branch is predicted as not taken.
Performance Evaluation
• Basic performance equation
• For a non-pipelined processor, the execution time, T, of a program
that has a dynamic instruction count of N is given by
• S is the average number of clock cycles it takes to fetch and execute
one instruction
• R is the clock rate in cycles per second
• Instruction throughput
• Number of instructions executed per second. For non-pipelined
execution, the throughput, Pnp, is given by
• The processor with five cycles to execute all instructions.
• If there are no cache misses, S is equal to 5.
• For the five-stage pipeline each instruction is executed in five cycles,
but a new instruction can ideally enter the pipeline every cycle.
• Thus, in the absence of stalls, S is equal to 1, and the ideal throughput
with pipelining is
• A five-stage pipeline can potentially increase the throughput by a
factor of five
n-stage pipeline has the potential to increase throughput n times
Assignment
Date of Submission: 27-Jun-2016
• How much of this potential increase in instruction
throughput can actually be realizedin practice?
• What is a good value for n?
Superscalar Operation
• Add R2, R3, #100
• Load R5, 16(R6)
• Subtract R7, R8, R9
• Store R10, 24(R11)
References
• "Computer Organization", Carl Hamacher Safwat Zaky Zvonko
Vranesic, McGraw-Hill, 2002

Mais conteúdo relacionado

Mais procurados

Pipeline processing and space time diagram
Pipeline processing and space time diagramPipeline processing and space time diagram
Pipeline processing and space time diagramRahul Sharma
 
Real Time Operating system (RTOS) - Embedded systems
Real Time Operating system (RTOS) - Embedded systemsReal Time Operating system (RTOS) - Embedded systems
Real Time Operating system (RTOS) - Embedded systemsHariharan Ganesan
 
Basic MIPS implementation
Basic MIPS implementationBasic MIPS implementation
Basic MIPS implementationkavitha2009
 
Pipeline and data hazard
Pipeline and data hazardPipeline and data hazard
Pipeline and data hazardWaed Shagareen
 
Booth’s algorithm.(a014& a015)
Booth’s algorithm.(a014& a015)Booth’s algorithm.(a014& a015)
Booth’s algorithm.(a014& a015)Piyush Rochwani
 
Page replacement algorithms
Page replacement algorithmsPage replacement algorithms
Page replacement algorithmsPiyush Rochwani
 
Pipeline processing - Computer Architecture
Pipeline processing - Computer Architecture Pipeline processing - Computer Architecture
Pipeline processing - Computer Architecture S. Hasnain Raza
 
Hardware multithreading
Hardware multithreadingHardware multithreading
Hardware multithreadingFraboni Ec
 
Cache memory
Cache memoryCache memory
Cache memoryAnuj Modi
 
Superscalar processor
Superscalar processorSuperscalar processor
Superscalar processornoor ul ain
 
Counters & time delay
Counters & time delayCounters & time delay
Counters & time delayHemant Chetwani
 
Register transfer and micro-operation
Register transfer and micro-operationRegister transfer and micro-operation
Register transfer and micro-operationNikhil Pandit
 
EC8392 -DIGITAL ELECTRONICS -II YEAR ECE-by S.SESHA VIDHYA /ASP/ ECE/ RMKCET
EC8392 -DIGITAL ELECTRONICS -II YEAR ECE-by S.SESHA VIDHYA /ASP/ ECE/ RMKCETEC8392 -DIGITAL ELECTRONICS -II YEAR ECE-by S.SESHA VIDHYA /ASP/ ECE/ RMKCET
EC8392 -DIGITAL ELECTRONICS -II YEAR ECE-by S.SESHA VIDHYA /ASP/ ECE/ RMKCETSeshaVidhyaS
 
Pipelining
PipeliningPipelining
PipeliningAJAL A J
 
Pipelining powerpoint presentation
Pipelining powerpoint presentationPipelining powerpoint presentation
Pipelining powerpoint presentationbhavanadonthi
 

Mais procurados (20)

Pipeline processing and space time diagram
Pipeline processing and space time diagramPipeline processing and space time diagram
Pipeline processing and space time diagram
 
Real Time Operating system (RTOS) - Embedded systems
Real Time Operating system (RTOS) - Embedded systemsReal Time Operating system (RTOS) - Embedded systems
Real Time Operating system (RTOS) - Embedded systems
 
Basic MIPS implementation
Basic MIPS implementationBasic MIPS implementation
Basic MIPS implementation
 
Pipeline and data hazard
Pipeline and data hazardPipeline and data hazard
Pipeline and data hazard
 
Pipelining & All Hazards Solution
Pipelining  & All Hazards SolutionPipelining  & All Hazards Solution
Pipelining & All Hazards Solution
 
Booth’s algorithm.(a014& a015)
Booth’s algorithm.(a014& a015)Booth’s algorithm.(a014& a015)
Booth’s algorithm.(a014& a015)
 
Page replacement algorithms
Page replacement algorithmsPage replacement algorithms
Page replacement algorithms
 
Pipeline processing - Computer Architecture
Pipeline processing - Computer Architecture Pipeline processing - Computer Architecture
Pipeline processing - Computer Architecture
 
Hardware multithreading
Hardware multithreadingHardware multithreading
Hardware multithreading
 
Array Processor
Array ProcessorArray Processor
Array Processor
 
Branch prediction
Branch predictionBranch prediction
Branch prediction
 
Cache memory
Cache memoryCache memory
Cache memory
 
Superscalar processor
Superscalar processorSuperscalar processor
Superscalar processor
 
Counters & time delay
Counters & time delayCounters & time delay
Counters & time delay
 
Register transfer and micro-operation
Register transfer and micro-operationRegister transfer and micro-operation
Register transfer and micro-operation
 
SCHEDULING ALGORITHMS
SCHEDULING ALGORITHMSSCHEDULING ALGORITHMS
SCHEDULING ALGORITHMS
 
Pci,usb,scsi bus
Pci,usb,scsi busPci,usb,scsi bus
Pci,usb,scsi bus
 
EC8392 -DIGITAL ELECTRONICS -II YEAR ECE-by S.SESHA VIDHYA /ASP/ ECE/ RMKCET
EC8392 -DIGITAL ELECTRONICS -II YEAR ECE-by S.SESHA VIDHYA /ASP/ ECE/ RMKCETEC8392 -DIGITAL ELECTRONICS -II YEAR ECE-by S.SESHA VIDHYA /ASP/ ECE/ RMKCET
EC8392 -DIGITAL ELECTRONICS -II YEAR ECE-by S.SESHA VIDHYA /ASP/ ECE/ RMKCET
 
Pipelining
PipeliningPipelining
Pipelining
 
Pipelining powerpoint presentation
Pipelining powerpoint presentationPipelining powerpoint presentation
Pipelining powerpoint presentation
 

Destaque

INSTRUCTION LEVEL PARALLALISM
INSTRUCTION LEVEL PARALLALISMINSTRUCTION LEVEL PARALLALISM
INSTRUCTION LEVEL PARALLALISMKamran Ashraf
 
Instruction Level Parallelism (ILP) Limitations
Instruction Level Parallelism (ILP) LimitationsInstruction Level Parallelism (ILP) Limitations
Instruction Level Parallelism (ILP) LimitationsJose Pinilla
 
Pipelining and ILP (Instruction Level Parallelism)
Pipelining and ILP (Instruction Level Parallelism) Pipelining and ILP (Instruction Level Parallelism)
Pipelining and ILP (Instruction Level Parallelism) A B Shinde
 
Concept of Pipelining
Concept of PipeliningConcept of Pipelining
Concept of PipeliningSHAKOOR AB
 
Instruction level parallelism
Instruction level parallelismInstruction level parallelism
Instruction level parallelismdeviyasharwin
 
Instruction pipelining
Instruction pipeliningInstruction pipelining
Instruction pipeliningTech_MX
 
Pipelining and vector processing
Pipelining and vector processingPipelining and vector processing
Pipelining and vector processingKamal Acharya
 
Pertemuan 9 pipelining
Pertemuan 9 pipeliningPertemuan 9 pipelining
Pertemuan 9 pipeliningjumiathyasiz
 
Instruction Level Parallelism Compiler optimization Techniques Anna Universit...
Instruction Level Parallelism Compiler optimization Techniques Anna Universit...Instruction Level Parallelism Compiler optimization Techniques Anna Universit...
Instruction Level Parallelism Compiler optimization Techniques Anna Universit...Dr.K. Thirunadana Sikamani
 
A Look Back | A Look Ahead Seattle Foundation Services
A Look Back | A Look Ahead Seattle Foundation ServicesA Look Back | A Look Ahead Seattle Foundation Services
A Look Back | A Look Ahead Seattle Foundation ServicesSeattle Foundation
 
Computer architecture kai hwang
Computer architecture   kai hwangComputer architecture   kai hwang
Computer architecture kai hwangSumedha
 
Advanced computer architecture
Advanced computer architectureAdvanced computer architecture
Advanced computer architectureMd. Mahedi Mahfuj
 
The Future of Data Science
The Future of Data ScienceThe Future of Data Science
The Future of Data Sciencesarith divakar
 
Ceg4131 models
Ceg4131 modelsCeg4131 models
Ceg4131 modelsanandme07
 
Big data processing with apache spark
Big data processing with apache sparkBig data processing with apache spark
Big data processing with apache sparksarith divakar
 

Destaque (20)

INSTRUCTION LEVEL PARALLALISM
INSTRUCTION LEVEL PARALLALISMINSTRUCTION LEVEL PARALLALISM
INSTRUCTION LEVEL PARALLALISM
 
Instruction Level Parallelism (ILP) Limitations
Instruction Level Parallelism (ILP) LimitationsInstruction Level Parallelism (ILP) Limitations
Instruction Level Parallelism (ILP) Limitations
 
Pipelining and ILP (Instruction Level Parallelism)
Pipelining and ILP (Instruction Level Parallelism) Pipelining and ILP (Instruction Level Parallelism)
Pipelining and ILP (Instruction Level Parallelism)
 
Concept of Pipelining
Concept of PipeliningConcept of Pipelining
Concept of Pipelining
 
Instruction level parallelism
Instruction level parallelismInstruction level parallelism
Instruction level parallelism
 
Instruction pipelining
Instruction pipeliningInstruction pipelining
Instruction pipelining
 
Pipelining and vector processing
Pipelining and vector processingPipelining and vector processing
Pipelining and vector processing
 
Pertemuan 9 pipelining
Pertemuan 9 pipeliningPertemuan 9 pipelining
Pertemuan 9 pipelining
 
Instruction Level Parallelism Compiler optimization Techniques Anna Universit...
Instruction Level Parallelism Compiler optimization Techniques Anna Universit...Instruction Level Parallelism Compiler optimization Techniques Anna Universit...
Instruction Level Parallelism Compiler optimization Techniques Anna Universit...
 
Chapter6 pipelining
Chapter6  pipeliningChapter6  pipelining
Chapter6 pipelining
 
A Look Back | A Look Ahead Seattle Foundation Services
A Look Back | A Look Ahead Seattle Foundation ServicesA Look Back | A Look Ahead Seattle Foundation Services
A Look Back | A Look Ahead Seattle Foundation Services
 
Computer architecture kai hwang
Computer architecture   kai hwangComputer architecture   kai hwang
Computer architecture kai hwang
 
Advanced computer architecture
Advanced computer architectureAdvanced computer architecture
Advanced computer architecture
 
The Future of Data Science
The Future of Data ScienceThe Future of Data Science
The Future of Data Science
 
Evolution of Computer
Evolution of ComputerEvolution of Computer
Evolution of Computer
 
Pert.11 pipelining
Pert.11 pipeliningPert.11 pipelining
Pert.11 pipelining
 
Ceg4131 models
Ceg4131 modelsCeg4131 models
Ceg4131 models
 
Lec3 final
Lec3 finalLec3 final
Lec3 final
 
pipelining
pipeliningpipelining
pipelining
 
Big data processing with apache spark
Big data processing with apache sparkBig data processing with apache spark
Big data processing with apache spark
 

Semelhante a Pipelining

Round Robin Algorithm.pptx
Round Robin Algorithm.pptxRound Robin Algorithm.pptx
Round Robin Algorithm.pptxSanad Bhowmik
 
12 processor structure and function
12 processor structure and function12 processor structure and function
12 processor structure and functionSher Shah Merkhel
 
notes_Lecture-8 (Computer Architecture) 3rd Semester 2k11 (1).pdf
notes_Lecture-8 (Computer Architecture) 3rd Semester 2k11 (1).pdfnotes_Lecture-8 (Computer Architecture) 3rd Semester 2k11 (1).pdf
notes_Lecture-8 (Computer Architecture) 3rd Semester 2k11 (1).pdfSatyamMishra828076
 
Process scheduling algorithms
Process scheduling algorithmsProcess scheduling algorithms
Process scheduling algorithmsShubham Sharma
 
Topic2a ss pipelines
Topic2a ss pipelinesTopic2a ss pipelines
Topic2a ss pipelinesturki_09
 
Performance Enhancement with Pipelining
Performance Enhancement with PipeliningPerformance Enhancement with Pipelining
Performance Enhancement with PipeliningAneesh Raveendran
 
IT209 Cpu Structure Report
IT209 Cpu Structure ReportIT209 Cpu Structure Report
IT209 Cpu Structure ReportBis Aquino
 
12 processor structure and function
12 processor structure and function12 processor structure and function
12 processor structure and functiondilip kumar
 
Computer organisation and architecture .
Computer organisation and architecture .Computer organisation and architecture .
Computer organisation and architecture .MalligaarjunanN
 
Fault Tolerance in Distributed Environment
Fault Tolerance in Distributed EnvironmentFault Tolerance in Distributed Environment
Fault Tolerance in Distributed EnvironmentOrkhan Gasimov
 
Operating system 30 preemptive scheduling
Operating system 30 preemptive schedulingOperating system 30 preemptive scheduling
Operating system 30 preemptive schedulingVaibhav Khanna
 
pipelining-190913185902.pptx
pipelining-190913185902.pptxpipelining-190913185902.pptx
pipelining-190913185902.pptxAshokRachapalli1
 
12 processor structure and function
12 processor structure and function12 processor structure and function
12 processor structure and functionAnwal Mirza
 

Semelhante a Pipelining (20)

Conditional branches
Conditional branchesConditional branches
Conditional branches
 
Round Robin Algorithm.pptx
Round Robin Algorithm.pptxRound Robin Algorithm.pptx
Round Robin Algorithm.pptx
 
12 processor structure and function
12 processor structure and function12 processor structure and function
12 processor structure and function
 
notes_Lecture-8 (Computer Architecture) 3rd Semester 2k11 (1).pdf
notes_Lecture-8 (Computer Architecture) 3rd Semester 2k11 (1).pdfnotes_Lecture-8 (Computer Architecture) 3rd Semester 2k11 (1).pdf
notes_Lecture-8 (Computer Architecture) 3rd Semester 2k11 (1).pdf
 
Process scheduling algorithms
Process scheduling algorithmsProcess scheduling algorithms
Process scheduling algorithms
 
Topic2a ss pipelines
Topic2a ss pipelinesTopic2a ss pipelines
Topic2a ss pipelines
 
Performance Enhancement with Pipelining
Performance Enhancement with PipeliningPerformance Enhancement with Pipelining
Performance Enhancement with Pipelining
 
IT209 Cpu Structure Report
IT209 Cpu Structure ReportIT209 Cpu Structure Report
IT209 Cpu Structure Report
 
Chap2 slides
Chap2 slidesChap2 slides
Chap2 slides
 
12 processor structure and function
12 processor structure and function12 processor structure and function
12 processor structure and function
 
CPU Scheduling Part-III.pdf
CPU Scheduling Part-III.pdfCPU Scheduling Part-III.pdf
CPU Scheduling Part-III.pdf
 
14 superscalar
14 superscalar14 superscalar
14 superscalar
 
ch2.pptx
ch2.pptxch2.pptx
ch2.pptx
 
Computer organisation and architecture .
Computer organisation and architecture .Computer organisation and architecture .
Computer organisation and architecture .
 
Fault Tolerance in Distributed Environment
Fault Tolerance in Distributed EnvironmentFault Tolerance in Distributed Environment
Fault Tolerance in Distributed Environment
 
Operating system 30 preemptive scheduling
Operating system 30 preemptive schedulingOperating system 30 preemptive scheduling
Operating system 30 preemptive scheduling
 
pipelining-190913185902.pptx
pipelining-190913185902.pptxpipelining-190913185902.pptx
pipelining-190913185902.pptx
 
12 processor structure and function
12 processor structure and function12 processor structure and function
12 processor structure and function
 
14 superscalar
14 superscalar14 superscalar
14 superscalar
 
Unit - 5 Pipelining.pptx
Unit - 5 Pipelining.pptxUnit - 5 Pipelining.pptx
Unit - 5 Pipelining.pptx
 

Último

TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...Nguyen Thanh Tu Collection
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsKarakKing
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the ClassroomPooky Knightsmith
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfPoh-Sun Goh
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxVishalSingh1417
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxPooja Bhuva
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17Celine George
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Association for Project Management
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsMebane Rash
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptxMaritesTamaniVerdade
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfagholdier
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.christianmathematics
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.MaryamAhmad92
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxDr. Ravikiran H M Gowda
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...pradhanghanshyam7136
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxAreebaZafar22
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and ModificationsMJDuyan
 

Último (20)

TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptx
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptx
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 

Pipelining

  • 2. Objectives • Basics of pipelining and how it can lead to improved performance. • Hazards that cause performance degradation and techniques to alleviate their effect on performance • Role of optimizing compilers, which rearrange the sequence of instructions to maximize the benefits of pipelined execution • Replicating hardware units in a superscalar processor so that multiple pipelines can operate concurrently.
  • 3. Basic Concepts • Pipelining overlaps the execution of successive instructions to achieve high performance • The speed of execution of programs is improved by • using faster circuit technology • arranging the hardware so that more than one operation can be performed at the same time • The number of operations performed per second is increased, even though the time needed to perform any one operation is not changed • Pipelining is a particularly effective way of organizing concurrent activity in a computer system.
  • 6.
  • 7. Pipelining Issues • Any condition that causes the pipeline to stall is called a hazard • Data hazard: Value of a source operand of an instruction is not available when needed • Other hazards arise from memory delays, branch instructions, and resource limitations Writing to Register file Operation using the old value of the register
  • 8. Data Dependencies • Add R2, R3, #100 • Subtract R9, R2, #30 • The Subtract instruction is stalled for three cycles to delay reading register R2 until cycle 6 when the new value becomes available.
  • 9. Operand Forwarding • Pipeline stalls due to data dependencies can be alleviated through the use of operand forwarding
  • 10.
  • 11. Handling Data Dependencies in Software Add R2, R3, #100 NOP NOP NOP Subtract R9, R2, #30 • Code size increases • Execution time not reduced The compiler can attempt to optimize the code to improve performance and reduce the code size by reordering instructions to move useful instructions into the NOP slots
  • 12. Memory Delays • Load instruction may require more than one clock cycle to obtain its operand from memory. • Load R2, (R3) • Subtract R9, R2, #30
  • 13. Operand forwarding • Operand forwarding cannot be done because the data read from memory (the cache, in this case) are not available until they are loaded into register RY at the beginning of cycle 5 The memory operand, which is now in register RY, can be forwarded to the ALU input in cycle 5.
  • 14. Branch Delays • Branch instructions can alter the sequence of execution, but they must first be executed to determine whether and where to branch. • Branch instructions occur frequently. In fact, they represent about 20 percent of the dynamic instruction count of most programs. • Reducing the branch penalty requires the branch target address to be computed earlier in the pipeline With a two-cycle branch penalty, the relatively high frequency of branch instructions could increase the execution time for a program by as much as 40 percent.
  • 15. Unconditional Branches • Reducing the branch penalty requires the branch target address to be computed earlier in the pipeline
  • 16. Target address is determined in the Decode stage of the pipeline.
  • 17. Conditional Branches • For pipelining, the branch condition must be tested as early as possible to limit the branch penalty
  • 18. The Branch Delay Slot • The location that follows a branch instruction is called the branch delay slot. • Rather than conditionally discard the instruction in the delay slot, we can arrange to have the pipeline always execute this instruction, whether or not the branch is taken
  • 19.
  • 20. Branch Prediction • To reduce the branch penalty further, the processor needs to anticipate that an instruction being fetched is a branch instruction • Predict its outcome to determine which instruction should be fetched • Static Branch Prediction • Dynamic Branch Prediction
  • 21. Static Branch Prediction • The simplest form of branch prediction is to assume that the branch will not be taken and to fetch the next instruction in sequential address order • If the prediction is correct, the fetched instruction is allowed to complete and there is no penalty • Misprediction incurs the full branch penalty
  • 22. Dynamic Branch Prediction • The processor hardware assesses the likelihood of a given branch being taken by keeping track of branch decisions every time that a branch instruction is executed. • dynamic prediction algorithm can use the result of the most recent execution of a branch instruction • The processor assumes that the next time the instruction is executed, the branch decision is likely to be the same as the last time.
  • 23. A 2-state algorithm • When the branch instruction is executed and the branch is taken, the machine moves to state LT. Otherwise, it remains in state LNT. • The next time the same instruction is encountered, the branch is predicted as taken if the state machine is in state LT. Otherwise it is predicted as not taken. LT - Branch is likely to be taken LNT - Branch is likely not to be taken
  • 24. A 4-state algorithm ST - Strongly likely to be taken LT - Likely to be taken LNT - Likely not to be taken SNT - Strongly likely not to be taken • After the branch instruction is executed, and if the branch is actually taken, the state is changed to ST; otherwise, it is changed to SNT. • The branch is predicted as taken if the state is either ST or LT. Otherwise, the branch is predicted as not taken.
  • 25. Performance Evaluation • Basic performance equation • For a non-pipelined processor, the execution time, T, of a program that has a dynamic instruction count of N is given by • S is the average number of clock cycles it takes to fetch and execute one instruction • R is the clock rate in cycles per second
  • 26. • Instruction throughput • Number of instructions executed per second. For non-pipelined execution, the throughput, Pnp, is given by • The processor with five cycles to execute all instructions. • If there are no cache misses, S is equal to 5.
  • 27. • For the five-stage pipeline each instruction is executed in five cycles, but a new instruction can ideally enter the pipeline every cycle. • Thus, in the absence of stalls, S is equal to 1, and the ideal throughput with pipelining is • A five-stage pipeline can potentially increase the throughput by a factor of five n-stage pipeline has the potential to increase throughput n times
  • 28. Assignment Date of Submission: 27-Jun-2016 • How much of this potential increase in instruction throughput can actually be realizedin practice? • What is a good value for n?
  • 30. • Add R2, R3, #100 • Load R5, 16(R6) • Subtract R7, R8, R9 • Store R10, 24(R11)
  • 31. References • "Computer Organization", Carl Hamacher Safwat Zaky Zvonko Vranesic, McGraw-Hill, 2002