1. Scuola Politecnica
Dipartimento di Ingegneria Chimica,
Gestionale, Informatica, Meccanica
Parallel Computer Architectures
Shared-Memory Multiprocessors
Architetture Avanzate dei Calcolatori (Advanced Computer Architectures)
Salvatore La Bua
2. S. La Bua - DICGIM/UniPA Architetture Avanzate dei Calcolatori
Shared Memory Multiprocessors
[Figure: multiprocessors vs. multicomputers]
Taxonomy of Parallel Computers
● SISD
– Single Instruction, Single Data
● Von Neumann architecture
● SIMD
– Single Instruction, Multiple Data
● Vector and Array processor architectures
● MISD
– Multiple Instruction, Single Data
● MIMD
– Multiple Instruction, Multiple Data
● Multiprocessor architectures (shared memory): UMA, NUMA, COMA
● Multicomputer architectures (message passing): MPP (massively parallel processors), COW (clusters of workstations)
Memory Semantics: Consistency Models
● Strict consistency
– Any read of a location x always returns the value of the most recent write to x
● Sequential consistency
– When several CPUs issue reads and writes at once, the hardware picks some interleaving of them
– All CPUs see that same order
● Processor consistency
– Writes by any CPU are seen by all in the order they were issued
– For every memory word, all CPUs see writes to it in the same order
● Weak consistency
– Does not guarantee that writes from a single CPU are seen in order
– Ordering is enforced only at explicit synchronization points
● Release consistency
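– Refines weak consistency: synchronization is split into acquire and release operations
– A CPU's earlier writes need only be complete by the time it performs the release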
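The difference between the models is easiest to see on a small litmus test. The sketch below is illustrative only: it enumerates every interleaving that sequential consistency permits for two hypothetical two-instruction programs (the names `cpu0`, `cpu1`, `r1`, `r2` are assumptions, not from the slides) and collects the possible results.

```python
def interleavings(a, b):
    """Yield every merge of a and b that preserves each program's own order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(schedule):
    """Execute one global order against a single shared memory."""
    memory = {"x": 0, "y": 0}
    regs = {}
    for op, location, arg in schedule:
        if op == "write":
            memory[location] = arg      # arg is the value to store
        else:
            regs[arg] = memory[location]  # arg is the destination register
    return regs["r1"], regs["r2"]


# Hypothetical programs: each CPU writes its own flag, then reads the other's.
cpu0 = [("write", "x", 1), ("read", "y", "r1")]
cpu1 = [("write", "y", 1), ("read", "x", "r2")]

outcomes = {run(s) for s in interleavings(cpu0, cpu1)}
print(sorted(outcomes))  # [(0, 1), (1, 0), (1, 1)]
```

Under sequential consistency the result (0, 0) never appears, because some write must come first in the single global order. A weaker model that lets a read overtake an earlier write from the same CPU may additionally allow (0, 0).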
UMA Symmetric Multiprocessor Architectures
● Uniform Memory Access
● Snooping Caches
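● Coherence protocols
– Write-through: every write also updates main memory
– Write-allocate: a write miss first loads the line into the cache
– Write-back: memory is updated only when a modified line is evicted or requested by another cache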
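The practical difference between write-through and write-back is the amount of memory traffic. The following is a minimal sketch, assuming a hypothetical cache that holds a single line and always allocates on a write miss; the function name `memory_writes` and the access trace are illustrative.

```python
def memory_writes(policy, accesses):
    """Count the writes that reach main memory for a one-line cache.

    policy   -- "write-through" or "write-back"
    accesses -- list of ("r" | "w", address) pairs
    """
    cached_addr = None
    dirty = False
    writes = 0
    for op, addr in accesses:
        if addr != cached_addr:
            # Miss: the old line is evicted; write-back must flush it if dirty.
            if policy == "write-back" and dirty:
                writes += 1
            # Write-allocate (assumed): load the line even on a write miss.
            cached_addr, dirty = addr, False
        if op == "w":
            if policy == "write-through":
                writes += 1   # every write goes straight to memory
            else:
                dirty = True  # memory is updated later, on eviction
    if policy == "write-back" and dirty:
        writes += 1           # final flush so both policies end up consistent
    return writes


trace = [("w", 0)] * 4 + [("r", 1)] + [("w", 0)] * 2
print(memory_writes("write-through", trace))  # 6
print(memory_writes("write-back", trace))     # 2
```

Repeated writes to the same line cost one memory write each under write-through, but only one per eviction under write-back, which is why MESI-style protocols are built on write-back.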
MESI Cache Coherence Protocol
● MESI: Modified-Exclusive-Shared-Invalid
– A write-back protocol
● Each cache entry is in one of four states:
– Invalid
● The cache entry does not contain valid data
– Shared
● Multiple caches may hold the line
● Memory is up to date
– Exclusive
● No other cache holds the line
● Memory is up to date
– Modified
● The entry is valid
● Memory is stale (the cached copy is newer)
● No copies exist in other caches
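The four states can be exercised with a small state machine. This is a simplified sketch rather than a full protocol implementation: it tracks a single memory line, ignores evictions, and the class and method names (`MesiLine`, `read`, `write`, `snapshot`) are assumptions made for illustration.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


class MesiLine:
    """Tracks the MESI state of one memory line in each of n_cpus caches."""

    def __init__(self, n_cpus):
        self.states = [State.INVALID] * n_cpus
        self.memory_writes = 0  # number of write-backs to main memory

    def _others(self, cpu):
        return [i for i in range(len(self.states)) if i != cpu]

    def read(self, cpu):
        if self.states[cpu] is not State.INVALID:
            return  # read hit: no bus traffic, state unchanged
        sharers = False
        for other in self._others(cpu):
            if self.states[other] is State.MODIFIED:
                self.memory_writes += 1  # owner writes the dirty line back
            if self.states[other] is not State.INVALID:
                self.states[other] = State.SHARED
                sharers = True
        self.states[cpu] = State.SHARED if sharers else State.EXCLUSIVE

    def write(self, cpu):
        for other in self._others(cpu):
            if self.states[other] is State.MODIFIED:
                self.memory_writes += 1  # flush before ownership changes
            self.states[other] = State.INVALID  # snooped invalidate
        self.states[cpu] = State.MODIFIED

    def snapshot(self):
        return "".join(s.value for s in self.states)


line = MesiLine(3)
line.read(0);  print(line.snapshot())  # EII
line.read(1);  print(line.snapshot())  # SSI
line.write(1); print(line.snapshot())  # IMI
line.read(2);  print(line.snapshot(), line.memory_writes)  # ISS 1
```

The last step shows the write-back nature of the protocol: memory is updated only when another CPU asks for a line that is held in the Modified state.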
UMA Multiprocessors
● Using crossbar switches
● Using multistage switching networks
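The trade-off between the two interconnects is hardware cost against blocking. As a rough sketch, the code below compares the switch count of an n x n crossbar with that of an omega network, which is used here as one concrete example of a multistage network (the slide does not name a specific topology). The function names are illustrative.

```python
def crossbar_crosspoints(n):
    """An n x n crossbar needs one crosspoint per (CPU, memory) pair."""
    return n * n


def _stages(n):
    stages = n.bit_length() - 1
    if n < 2 or (1 << stages) != n:
        raise ValueError("n must be a power of two, at least 2")
    return stages


def omega_switches(n):
    """An omega network has log2(n) stages of n/2 two-by-two switches."""
    return (n // 2) * _stages(n)


def omega_route(dest, n):
    """Output port chosen at each stage, reading dest's bits MSB first.

    A 0 bit selects the upper output of the switch, a 1 bit the lower one.
    """
    stages = _stages(n)
    if not 0 <= dest < n:
        raise ValueError("dest out of range")
    bits = format(dest, f"0{stages}b")
    return ["upper" if b == "0" else "lower" for b in bits]


print(crossbar_crosspoints(8))  # 64
print(omega_switches(8))        # 12
print(omega_route(6, 8))        # ['lower', 'lower', 'upper']
```

The crossbar is non-blocking but grows as n squared, while the omega network grows as (n/2) log2 n at the price of possible conflicts when two requests need the same switch output.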
NUMA Multiprocessors
● Non-Uniform Memory Access
– Single address space visible to all CPUs
– Remote memory is accessed with ordinary LOAD and STORE instructions
– Access to remote memory is slower than access to local memory
● NC-NUMA
– Non-Cache-Coherent NUMA
● CC-NUMA
– Cache-Coherent NUMA
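Because remote accesses are slower than local ones, data placement dominates NUMA performance. A minimal sketch of the weighted-average access time follows; the latencies and the remote fraction are made-up illustrative numbers, not measurements of any real machine.

```python
def average_access_ns(local_ns, remote_ns, remote_fraction):
    """Mean memory access time when a given fraction of accesses is remote."""
    if not 0.0 <= remote_fraction <= 1.0:
        raise ValueError("remote_fraction must be between 0 and 1")
    return (1.0 - remote_fraction) * local_ns + remote_fraction * remote_ns


# Hypothetical latencies: 100 ns local, 400 ns remote, 25% remote accesses.
print(average_access_ns(100, 400, 0.25))  # 175.0
```

With these assumed numbers, a quarter of the accesses going remote raises the average from 100 ns to 175 ns, which is why NUMA systems try to keep pages close to the CPU that uses them.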
Sun Fire E25K NUMA Multiprocessor
● An example of a shared-memory NUMA multiprocessor
COMA Multiprocessors
● Cache Only Memory Access
– Use each CPU’s main memory as a cache
– Physical address space is split into cache lines, which migrate between nodes on demand
● Problems:
– How are cache lines located?
● A line may be in the hardware cache, in local main memory, or on another node
– When a line is purged, what happens if it is the last copy?
● The last copy cannot be thrown out
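The last-copy rule can be sketched as a guard on eviction. The code below is one possible policy, assumed for illustration only: when a node wants to purge the only remaining copy, the line is first migrated to another node. The class `ComaLine` and its `purge` method are hypothetical names, and real COMA designs may solve this differently.

```python
class ComaLine:
    """Tracks which nodes currently hold a copy of one cache line."""

    def __init__(self, holders):
        self.holders = set(holders)

    def purge(self, node, all_nodes):
        """Evict the line from node; return the migration target, if any."""
        if node not in self.holders:
            return None  # nothing to evict
        if len(self.holders) > 1:
            self.holders.remove(node)  # other copies exist: safe to drop
            return None
        # Last copy: it must survive, so move it to another node first.
        target = next((n for n in sorted(all_nodes) if n != node), None)
        if target is None:
            raise RuntimeError("no other node can take the last copy")
        self.holders = {target}
        return target


nodes = {0, 1, 2}
line = ComaLine({0, 2})
print(line.purge(0, nodes), line.holders)  # None {2}
print(line.purge(2, nodes), line.holders)  # 0 {0}
```

The first purge simply drops a redundant copy, while the second one is forced to relocate the line so that the data is never lost.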