High Performance Computing
  Jawwad Shamsi
  Lecture #6, 27th January 2010
Recap
  • Cache Coherence
  • NUMA
Today's topics
  • Cache Coherence (continuation)
  • Vector Processing
Cache Coherence
  • In SMP or NUMA systems, several caches may hold copies of the same data item
  • Each copy may have a different value of the data item
  • Coherency must be maintained. How?
Cache Coherence: Two Approaches
  • Write back: main memory is updated only when the modified cache line is flushed
  • Write through: every write updates the cache as well as main memory
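The difference between the two write policies can be sketched with a toy model. This is only an illustration: the `Memory` and `Cache` classes, the `demo` helper, and the address `0x10` are hypothetical names, not part of the lecture, and a real cache works on lines and evictions rather than on single values.

```python
class Memory:
    """Toy main memory: maps address -> value and counts writes."""

    def __init__(self):
        self.data = {}
        self.writes = 0

    def write(self, addr, value):
        self.data[addr] = value
        self.writes += 1


class Cache:
    """Toy cache with a selectable write policy (illustrative only)."""

    def __init__(self, memory, policy):
        if policy not in ("write-through", "write-back"):
            raise ValueError(policy)
        self.memory = memory
        self.policy = policy
        self.lines = {}     # addr -> cached value
        self.dirty = set()  # addresses modified in cache but not yet in memory

    def write(self, addr, value):
        self.lines[addr] = value
        if self.policy == "write-through":
            # Main memory is updated on every write.
            self.memory.write(addr, value)
        else:
            # Main memory is updated later, when the line is flushed.
            self.dirty.add(addr)

    def flush(self):
        for addr in sorted(self.dirty):
            self.memory.write(addr, self.lines[addr])
        self.dirty.clear()


def demo(policy):
    """Write the values 0, 1, 2 to one address, then flush."""
    mem = Memory()
    cache = Cache(mem, policy)
    for value in range(3):
        cache.write(0x10, value)
    value_before_flush = mem.data.get(0x10)
    cache.flush()
    return value_before_flush, mem.data[0x10], mem.writes
```

With write-through, memory already holds the latest value before the flush, at the cost of one memory write per store. With write-back, memory is stale until the flush, and only one memory write takes place.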
Implementations
  • Software solutions
      • Compile-time decision
      • Conservative
      • Inefficient cache utilization
  • Hardware solutions
      • Run-time decision
      • More effective
Hardware-based solutions
  • Directory protocol
  • Snoopy protocol
Directory
  • Centralized controller
  • An individual cache controller makes a request
  • The centralized controller checks the request and issues the necessary commands
  • The controller then updates its information
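Directory
  • Write
      • Processor requests exclusive write access
      • Controller sends a message that invalidates the other copies
  • Read (of a line held exclusively by another processor)
      • Controller issues a command to the holding processor
      • Holding processor writes back to main memory (MM)
      • Read is then permitted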
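The write and read sequences above can be sketched as a small simulation. This is a minimal sketch under simplifying assumptions: the `Directory` class and its fields are hypothetical, each "line" is a single value, and the directory object stands in for both the centralized controller and the caches it manages.

```python
class Directory:
    """Toy centralized directory controller (illustrative only)."""

    def __init__(self):
        self.memory = {}   # addr -> value in main memory
        self.caches = {}   # (pid, addr) -> value cached by processor pid
        self.sharers = {}  # addr -> set of processor ids holding a copy
        self.owner = {}    # addr -> processor id with exclusive write access
        self.log = []      # commands issued by the controller

    def write(self, pid, addr, value):
        # Invalidate every other copy before granting exclusive access.
        for other in sorted(self.sharers.get(addr, set()) - {pid}):
            self.caches.pop((other, addr), None)
            self.log.append(("invalidate", other, addr))
        self.sharers[addr] = {pid}
        self.owner[addr] = pid
        self.caches[(pid, addr)] = value

    def read(self, pid, addr):
        owner = self.owner.get(addr)
        if owner is not None and owner != pid:
            # The holding processor writes back to main memory first.
            self.memory[addr] = self.caches[(owner, addr)]
            self.log.append(("writeback", owner, addr))
            del self.owner[addr]
        if (pid, addr) not in self.caches:
            # Read miss: fetch the (now up-to-date) value from main memory.
            self.caches[(pid, addr)] = self.memory.get(addr, 0)
        self.sharers.setdefault(addr, set()).add(pid)
        return self.caches[(pid, addr)]


def directory_demo():
    d = Directory()
    d.read(0, "x")              # processor 0 caches x (initially 0)
    d.read(1, "x")              # processor 1 caches x as well
    d.write(0, "x", 42)         # processor 1's copy is invalidated
    value = d.read(1, "x")      # forces a write-back by processor 0
    return value, d.log
```

Every request passes through the single `Directory` object, which is also why a centralized controller can become a bottleneck.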
Directory
  • Disadvantage: the centralized controller is a bottleneck
  • Advantage: useful in large-scale systems
Snoopy Protocol
  • An update operation is announced to all caches
  • All cache controllers snoop
  • Suited to a bus architecture
  • Careful: increased bus traffic
Snoopy Protocol: Two Approaches
  • Write invalidate
      • One writer, multiple readers
      • Exclusive: the writer invalidates the other caches' entries
  • Write update
      • Multiple writers
      • All writes are propagated to the other caches
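A rough sketch of how the two snoopy approaches differ when one cache writes. The function `snoopy_write` and the representation of each cache as a plain dictionary are assumptions made for illustration; the loop over caches stands in for the bus broadcast that every controller snoops.

```python
def snoopy_write(caches, writer, addr, value, policy):
    """Apply one write and let every other cache react to the broadcast.

    caches: list of dicts (addr -> value), one per cache (hypothetical model)
    policy: "invalidate" or "update"
    """
    if policy not in ("invalidate", "update"):
        raise ValueError(policy)
    caches[writer][addr] = value
    for cid, cache in enumerate(caches):
        if cid == writer or addr not in cache:
            continue  # caches without a copy ignore the broadcast
        if policy == "invalidate":
            del cache[addr]      # the other copy becomes invalid
        else:
            cache[addr] = value  # the other copy receives the new value
    return caches
```

For example, starting from `[{"x": 1}, {"x": 1}, {}]`, a write of 7 by cache 0 leaves cache 1 empty under write invalidate, while under write update cache 1 holds the new value 7.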
Write Invalidate: The MESI Protocol
  • Used in the P4 processor
  • Data cache: two status bits per line, 4 states
      • Modified
      • Exclusive
      • Shared
      • Invalid
  • See table
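4 Possibilities (MO = Modified, EX = Exclusive, SH = Shared, IN = Invalid)
  • Read miss (state changes in other caches holding the line)
      • EX to SH
      • SH to SH
      • MO to SH
  • Read hit
  • Write miss: RWITM (read with intent to modify)
      • MO to IN
      • SH to IN
  • Write hit
      • SH to IN (copies in other caches)
      • EX to MO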
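The four cases can be sketched as two small transition functions, one for the cache that issues the read or write and one for a cache that snoops it. This follows the standard MESI transitions; the names `State`, `on_local`, and `on_snoop`, and the bus operation strings, are assumptions for illustration, and write-backs to memory are not modelled.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def on_local(state, op, others_have_copy=False):
    """Next state of the requesting cache's line for a local read or write."""
    if op == "read":
        if state is State.INVALID:
            # Read miss: Shared if another cache holds the line, else Exclusive.
            return State.SHARED if others_have_copy else State.EXCLUSIVE
        return state  # read hit: no state change
    if op == "write":
        # Write hit (E or S to M, M stays M) or write miss (RWITM, I to M).
        return State.MODIFIED
    raise ValueError(op)


def on_snoop(state, bus_op):
    """Next state of another cache's copy when it snoops a bus operation."""
    if state is State.INVALID:
        return State.INVALID
    if bus_op == "read":
        # Another processor's read miss: EX, SH and MO all become SH.
        return State.SHARED
    if bus_op in ("rwitm", "invalidate"):
        # Another processor's write miss or write hit: the copy becomes IN.
        return State.INVALID
    raise ValueError(bus_op)
```

For instance, `on_snoop(State.MODIFIED, "read")` gives `State.SHARED` (the "MO to SH" case), and `on_local(State.EXCLUSIVE, "write")` gives `State.MODIFIED` (the "EX to MO" case).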
L1-L2 Cache Consistency
Parallel Programming and Amdahl's Law
  • Suppose a fraction 1/N of the execution time is spent in sequential code
  • The remaining 1 - 1/N is spent in parallel code
Amdahl's Law
  • Speedup: speed gain of using parallel processors vs. a single processor
  • Speedup = 1 / (s + p/N)
  • s = sequential fraction of the code, p = parallel fraction, N = no. of processors
  • Measured speedup: S = T(1) / T(j) for j parallel processors
  • As problem size increases, p may rise and s may decrease
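A short numeric sketch of the formula above, assuming s + p = 1 so that p = 1 - s. The function name `amdahl_speedup` is hypothetical.

```python
def amdahl_speedup(s, n):
    """Speedup 1 / (s + p/n) for sequential fraction s on n processors.

    Assumes the parallel fraction is p = 1 - s.
    """
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must be a fraction between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    p = 1.0 - s
    return 1.0 / (s + p / n)


if __name__ == "__main__":
    # With 10% sequential code the speedup can never exceed 1/s = 10,
    # however many processors are added.
    for n in (1, 2, 10, 100, 1000):
        print(n, round(amdahl_speedup(0.1, n), 2))
```

With s = 0.1 and N = 10 the speedup is 1 / (0.1 + 0.09), roughly 5.26, well short of the ideal 10. This is why a shrinking s for larger problem sizes matters.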

High Performance Computing Cache Coherence and Vector Processing
