April 28, 2003 ©2001-2003 Howard Huang 1
Cache writes and examples
Today is the last day of caches!
— One important topic we haven’t mentioned yet is writing to a cache.
We’ll see several different scenarios and approaches.
— To introduce some more advanced cache designs, we’ll also present
some of the caching strategies used in modern desktop processors like
Pentiums and PowerPCs.
On Wednesday we’ll turn our attention to peripheral devices and I/O.
Writing to a cache
Writing to a cache raises several additional issues.
First, let’s assume that the address we want to write to is already loaded
in the cache. We’ll assume a simple direct-mapped cache.
If we write a new value to that address, we can store the new data in the
cache, and avoid an expensive main memory access.
Before the write:
Cache:   Index 110   V = 1   Tag 11010   Data 42803
Memory:  Address 1101 0110   Data 42803
Mem[214] = 21763
After the write:
Cache:   Index 110   V = 1   Tag 11010   Data 21763
Memory:  Address 1101 0110   Data 42803
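The index and tag in the example can be read straight off the address bits. Here is a minimal sketch in C, assuming the geometry the figures suggest: 8-bit addresses, a 3-bit index (eight cache entries) and a 5-bit tag, with no block offset. The names INDEX_BITS, cache_index and cache_tag are made up for illustration.

```c
#include <stdint.h>

/* Assumed geometry, matching the figures: 8-bit addresses,
   a 3-bit index (8 cache entries) and a 5-bit tag, no block offset. */
#define INDEX_BITS 3u

/* The low-order bits select the cache entry. */
static inline unsigned cache_index(uint8_t addr) {
    return addr & ((1u << INDEX_BITS) - 1u);
}

/* The remaining high-order bits are stored as the tag. */
static inline unsigned cache_tag(uint8_t addr) {
    return addr >> INDEX_BITS;
}
```

For address 214 (1101 0110) this gives index 6 (110) and tag 26 (11010), the same values shown in the cache entry.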
Inconsistent memory
But now the cache and memory contain different, inconsistent data!
How can we ensure that subsequent loads will return the right value?
This is also problematic if other devices are sharing the main memory, as
in a multiprocessor system.
Cache:   Index 110   V = 1   Tag 11010   Data 21763
Memory:  Address 1101 0110   Data 42803
Write-through caches
A write-through cache solves the inconsistency problem by forcing all
writes to update both the cache and the main memory.
This is simple to implement and keeps the cache and memory consistent.
The bad thing is that forcing every write to go to main memory gives up
much of the advantage of having a cache. The whole point was to avoid
accessing main memory!
Mem[214] = 21763
Cache:   Index 110   V = 1   Tag 11010   Data 21763
Memory:  Address 1101 0110   Data 21763
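A write hit in a write-through cache might be modeled as follows. This is only a sketch under the same assumptions as the figures (an eight-entry direct-mapped cache with one-word blocks and 8-bit addresses). The type and function names, and the mem_writes counter, are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

#define NUM_LINES 8            /* assumed: 3-bit index, as in the figures */

typedef struct {
    bool     valid;
    uint8_t  tag;
    uint32_t data;
} Line;

typedef struct {
    Line     lines[NUM_LINES];
    uint32_t memory[256];      /* stand-in for main memory, 8-bit addresses */
    unsigned mem_writes;       /* number of main memory writes so far */
} WtCache;

/* Write hit under write-through: update the cache line and main memory.
   Returns false on a write miss, which this sketch does not handle. */
static bool wt_write_hit(WtCache *c, uint8_t addr, uint32_t value) {
    Line *line = &c->lines[addr % NUM_LINES];
    if (!line->valid || line->tag != addr / NUM_LINES)
        return false;
    line->data = value;        /* the cache copy is updated ... */
    c->memory[addr] = value;   /* ... and so is main memory, every time */
    c->mem_writes++;
    return true;
}
```

Every successful call increments mem_writes, which is exactly the cost the slide complains about.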
Write-back caches
In a write-back cache, the memory is not updated until the cache block
needs to be replaced (e.g., when loading data into a full cache set).
For example, we might write some data to the cache at first, leaving it
inconsistent with the main memory as shown before.
Subsequent reads to the same memory address will be serviced by the
cache, which contains the correct, updated data.
Mem[214] = 21763
Cache:   Index 110   V = 1   Tag 11010   Data 21763
Memory:  Address 1000 1110   Data 1225
         Address 1101 0110   Data 42803
Finishing the write back
We don’t need to store the new value back to main memory unless the
cache block gets replaced.
For example, on a read from Mem[142], which maps to the same cache
block, the modified cache contents will first be written to main memory.
Only then can the cache block be replaced with data from address 142.
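Step 1: the modified block is written back to main memory.
Cache:   Index 110   V = 1   Tag 11010   Data 21763
Memory:  Address 1000 1110   Data 1225
         Address 1101 0110   Data 21763
Step 2: the cache block is replaced with the data from address 142.
Cache:   Index 110   V = 1   Tag 10001   Data 1225
Memory:  Address 1000 1110   Data 1225
         Address 1101 0110   Data 21763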
Write-back cache discussion
Each block in a write-back cache needs a dirty bit to indicate whether or
not it must be saved to main memory before being replaced—otherwise
we might perform unnecessary writebacks.
Notice the penalty for the main memory access will not be applied until
the execution of some subsequent instruction following the write.
— In our example, the write to Mem[214] affected only the cache.
— But the load from Mem[142] resulted in two memory accesses: one to
save data to address 214, and one to load data from address 142.
— This makes it harder to predict execution time and performance.
The advantage of write-back caches is that not all write operations need
to access main memory, as they do with write-through caches.
— If a single address is frequently written to, then it doesn’t pay to keep
writing that data through to main memory.
— If several bytes within the same cache block are modified, they will
only force one memory write operation at write-back time.
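The dirty bit and the delayed memory write can be sketched in C. This is a toy model, not a description of any real cache: it assumes the eight-entry direct-mapped cache with one-word blocks from the figures, and it assumes that a write miss allocates the block. All names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

#define NUM_LINES 8   /* assumed: 3-bit index, as in the figures */

typedef struct {
    bool     valid;
    bool     dirty;   /* set when the line differs from main memory */
    uint8_t  tag;
    uint32_t data;
} Line;

typedef struct {
    Line     lines[NUM_LINES];
    uint32_t memory[256];            /* stand-in for main memory */
    unsigned mem_reads, mem_writes;  /* main memory access counters */
} WbCache;

/* Make the line for addr hold that address, writing back first if needed. */
static Line *wb_lookup(WbCache *c, uint8_t addr) {
    unsigned index = addr % NUM_LINES;
    uint8_t  tag   = (uint8_t)(addr / NUM_LINES);
    Line    *line  = &c->lines[index];
    if (line->valid && line->tag == tag)
        return line;                           /* hit: no memory access */
    if (line->valid && line->dirty) {          /* save the old block first */
        uint8_t old_addr = (uint8_t)(line->tag * NUM_LINES + index);
        c->memory[old_addr] = line->data;
        c->mem_writes++;
    }
    line->valid = true;                        /* then load the new block */
    line->dirty = false;
    line->tag   = tag;
    line->data  = c->memory[addr];
    c->mem_reads++;
    return line;
}

static uint32_t wb_read(WbCache *c, uint8_t addr) {
    return wb_lookup(c, addr)->data;
}

static void wb_write(WbCache *c, uint8_t addr, uint32_t value) {
    Line *line = wb_lookup(c, addr);   /* allocates on a write miss (assumption) */
    line->data  = value;
    line->dirty = true;                /* memory is now out of date */
}
```

Replaying the example, a write to address 214 that hits leaves mem_writes unchanged, and the later read of address 142 performs both the write-back of 214 and the load of 142.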
Write misses
A second scenario is if we try to write to an address that is not already
contained in the cache; this is called a write miss.
Let’s say we want to store 21763 into Mem[1101 0110] but we find that
address is not currently in the cache.
When we update Mem[1101 0110], should we also load it into the cache?
Cache:   Index 110   V = 1   Tag 00010   Data 123456
Memory:  Address 1101 0110   Data 6378
Write-around caches
With a write-around policy, the write operation goes directly to main
memory without affecting the cache.
This is good when data is written but not immediately used again, in
which case there’s no point in loading it into the cache yet.
for (int i = 0; i < SIZE; i++)
    a[i] = i;
Mem[214] = 21763
Cache:   Index 110   V = 1   Tag 00010   Data 123456
Memory:  Address 1101 0110   Data 21763
Allocate on write
An allocate-on-write strategy would instead load the newly written data
into the cache.
If that data is needed again soon, it will be available in the cache.
Mem[214] = 21763
Cache:   Index 110   V = 1   Tag 11010   Data 21763
Memory:  Address 1101 0110   Data 21763
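The two write-miss policies differ only in whether the cache line is touched on a miss. The sketch below shows this for a write-through cache, again assuming the eight-entry direct-mapped cache with one-word blocks from the figures. The enum, struct and function names are invented for this example.

```c
#include <stdint.h>
#include <stdbool.h>

#define NUM_LINES 8   /* assumed: 3-bit index, as in the figures */

typedef enum { WRITE_AROUND, ALLOCATE_ON_WRITE } MissPolicy;

typedef struct {
    bool     valid;
    uint8_t  tag;
    uint32_t data;
} Line;

typedef struct {
    Line     lines[NUM_LINES];
    uint32_t memory[256];   /* stand-in for main memory */
} Cache;

/* Store to a write-through cache, handling a write miss per the policy. */
static void cache_store(Cache *c, uint8_t addr, uint32_t value, MissPolicy policy) {
    Line   *line = &c->lines[addr % NUM_LINES];
    uint8_t tag  = (uint8_t)(addr / NUM_LINES);
    bool    hit  = line->valid && line->tag == tag;

    c->memory[addr] = value;            /* write-through: memory is always updated */
    if (hit || policy == ALLOCATE_ON_WRITE) {
        line->valid = true;             /* on a miss this replaces the old block */
        line->tag   = tag;
        line->data  = value;
    }
    /* WRITE_AROUND on a miss: the cache is left untouched. */
}
```

With the entry from the write-miss example (tag 00010, data 123456), a write-around store to address 214 leaves the entry alone, while an allocate-on-write store replaces it with tag 11010 and data 21763.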
Modern memory systems
We’ll finish up by describing the memory systems of some real CPUs.
— This will reinforce some of the ideas we’ve learned so far.
— You can also see some more advanced, more modern cache designs.
The book gives an example of the memory system in the MIPS R2000 CPU,
which was popular in the early 90s.
We’ll highlight some of its more interesting features.
— There are separate caches for instructions and data.
— Each cache block is just one word wide.
— A write-through policy is implemented with a write buffer.
Unified vs. split caches
The R2000 uses a split cache design.
— There is one 64KB cache for instructions and a separate 64KB cache
for data.
— This makes sense when programs need to access both instructions and
data in the same clock cycle—we used separate instruction and data
memories in our single-cycle and pipelined datapaths too!
In contrast, a unified cache stores both program code and data.
— This is more flexible, because a value may be stored anywhere in the
cache memory regardless of whether it’s an instruction or data.
— Unified caches normally lead to lower miss rates. As an example, the
book shows that the miss rate of a unified cache for gcc is 4.8%, while
the miss rate for a split cache turned out to be 5.4%.
Block size
Each 64KB cache in the R2000 is divided into 16K one-word blocks.
— This doesn’t give the R2000 much chance to take advantage of spatial
locality, as we mentioned last week.
— Figure 7.11 of the book shows how the miss rates could be reduced if
the R2000 supported four-word blocks instead.
Figure: Block size and miss rate. Miss rates (axis from 0% to 6%) for gcc and spice, with 1-word blocks and with 4-word blocks.
Write buffers
The R2000 cache is write-through, so on write hits data goes to both the
cache and the main memory.
This can result in slow writes, so the R2000 includes a write buffer, which
queues pending writes to main memory and permits the CPU to continue.
Buffers are commonly used when two devices run at different speeds.
— If a producer generates data too quickly for a consumer to handle, the
extra data is stored in a buffer and the producer can continue on with
other tasks, without waiting for the consumer.
— Conversely, if the producer slows down, the consumer can continue
running at full speed as long as there is excess data in the buffer.
For us, the producer is the CPU and the consumer is the main memory.
Producer -> Buffer -> Consumer
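A write buffer behaves like a small first-in, first-out queue between the CPU and main memory. The following is a rough sketch only: the four-entry depth, the struct layout and the function names are assumptions for illustration, not details of the R2000. It also ignores loads that would need a value still waiting in the buffer.

```c
#include <stdint.h>
#include <stdbool.h>

#define WB_CAPACITY 4   /* assumed depth, chosen only for illustration */

typedef struct {
    uint32_t addr;
    uint32_t data;
} PendingWrite;

typedef struct {
    PendingWrite slots[WB_CAPACITY];
    unsigned head;    /* index of the oldest pending write */
    unsigned count;   /* number of pending writes */
} WriteBuffer;

/* Producer side (CPU): queue a write. Returns false if the buffer is
   full, in which case the CPU would have to stall. */
static bool wbuf_push(WriteBuffer *b, uint32_t addr, uint32_t data) {
    if (b->count == WB_CAPACITY)
        return false;
    unsigned tail = (b->head + b->count) % WB_CAPACITY;
    b->slots[tail] = (PendingWrite){ addr, data };
    b->count++;
    return true;
}

/* Consumer side (main memory): retire the oldest write, if there is one. */
static bool wbuf_drain_one(WriteBuffer *b, uint32_t *memory) {
    if (b->count == 0)
        return false;
    PendingWrite w = b->slots[b->head];
    memory[w.addr] = w.data;
    b->head = (b->head + 1) % WB_CAPACITY;
    b->count--;
    return true;
}
```

As long as wbuf_push succeeds the CPU continues immediately, and it only waits when writes arrive faster than memory can drain them.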
Reducing memory stalls
Most newer CPUs include several features to reduce memory stalls.
— With a non-blocking cache, a processor that supports out-of-order
execution can continue in spite of a cache miss. The cache might
process other cache hits, or queue other misses.
— A multiported cache allows multiple reads or writes per clock cycle,
which is necessary for superscalar architectures that execute more
than one instruction simultaneously. (A similar concept popped up in
Homework 3.)
A less expensive alternative to multiporting is used by the Pentium Pro.
— Cache memory is organized into several banks, and multiple accesses
to different banks are permitted in the same clock cycle.
— We saw the same idea last week, in the context of main memory.
Additional levels of caches
Last week we saw that high miss penalties and average memory access
times are partially due to slow main memories.
AMAT = Hit time + (Miss rate × Miss penalty)
How about adding another level into the memory hierarchy?
Most processors include a secondary (L2) cache, which lies between the
primary (L1) cache and main memory in location, capacity and speed.
CPU <-> L1 cache <-> L2 cache <-> Main Memory
Reducing the miss penalty
If the primary cache misses, we might be able to find the desired data in
the L2 cache instead.
— If so, the data can be sent from the L2 cache to the CPU faster than it
could be from main memory.
— Main memory is only accessed if the requested data is in neither the
L1 nor the L2 cache.
The miss penalty is reduced whenever the L2 cache hits.
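Treating the L1 miss penalty as the average access time of the L2 cache gives a two-level version of the AMAT formula. Here is a small sketch in C. The function names are made up, and the numbers in the note that follows are invented for illustration rather than measured.

```c
/* AMAT = Hit time + (Miss rate x Miss penalty), all times in clock cycles. */
static double amat(double hit_time, double miss_rate, double miss_penalty) {
    return hit_time + miss_rate * miss_penalty;
}

/* With an L2 cache, the L1 miss penalty is itself the AMAT of the L2 cache:
   the L2 hit time plus the L2 miss rate times the main memory penalty. */
static double amat_two_level(double l1_hit, double l1_miss_rate,
                             double l2_hit, double l2_miss_rate,
                             double mem_penalty) {
    double l1_miss_penalty = amat(l2_hit, l2_miss_rate, mem_penalty);
    return amat(l1_hit, l1_miss_rate, l1_miss_penalty);
}
```

For instance, with a 1-cycle L1 hit time, a 5% L1 miss rate and a 100-cycle main memory penalty, amat(1, 0.05, 100) is 6 cycles. Adding an L2 cache with a 10-cycle hit time and a 20% miss rate lowers the L1 miss penalty to 30 cycles, so amat_two_level(1, 0.05, 10, 0.20, 100) is 2.5 cycles.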
Multi-level cache design
AMAT = Hit time + (Miss rate × Miss penalty)
Adding an L2 cache can lower the miss penalty, which means that the L1
miss rate becomes less of a factor.
L1 and L2 caches may employ different organizations and policies. For
instance, here is some information about the Pentium 4 data caches.
The secondary cache is usually much larger than the primary one, so L2
cache blocks are typically bigger as well, and can take more advantage of
spatial locality.
Cache   Data size   Associativity   Block size   Write policy    Latency
L1      8 KB        4-way           64 bytes     write-through   2 cycles
L2      512 KB      8-way           64 bytes     write-back      7 cycles
Extending the hierarchy even further
CPU Regs <-> Caches <-> Main Memory <-> Hard Disk
The storage hierarchy can be extended quite a ways.
CPU registers can be considered as small one-word caches
which hold frequently used data.
— However, registers must be controlled explicitly by the
programmer or compiler, whereas caching is done automatically by
the hardware.
Main memory can be thought of as a cache for a hard disk.
— Programs and data are both loaded into main memory
before being executed or edited.
We could go even further, but the picture wouldn’t fit.
— Your hard drive may be a cache for networked
information.
— Companies like Akamai provide mirrors or caches of
other web sites.
Intel Pentium 4 caches
The Intel Pentium 4 has very small L1 caches.
— An execution trace cache holds about 12,000 micro-instructions. This
avoids the cost of having to re-decode instructions (an expensive task
for 8086-based processors) that are read from the cache.
— The primary data cache stores just 8KB.
Intel’s designers chose to minimize primary cache latency at the cost of a
higher miss rate. So level 1 cache hits in the Pentium 4 are very fast, but
there are more misses overall.
The 512KB secondary cache is eight-way associative, with 64-byte blocks.
Cache   Data size   Associativity   Block size   Write policy    Latency
L1      8 KB        4-way           64 bytes     write-through   2 cycles
L2      512 KB      8-way           64 bytes     write-back      7 cycles
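The figures in this table determine how many sets each cache has, and therefore how many address bits form the index. Below is a short sketch of that arithmetic. The helper names are made up, and the sizes are assumed to be powers of two.

```c
/* Number of sets in a cache: total bytes / (block bytes x associativity). */
static unsigned num_sets(unsigned cache_bytes, unsigned block_bytes, unsigned ways) {
    return cache_bytes / (block_bytes * ways);
}

/* log2 of a power of two, e.g. the number of index bits for a set count. */
static unsigned log2_pow2(unsigned n) {
    unsigned bits = 0;
    while (n > 1) {
        n >>= 1;
        bits++;
    }
    return bits;
}
```

With the values above, the 8 KB four-way L1 data cache has 8192 / (64 x 4) = 32 sets (5 index bits), and the 512 KB eight-way L2 cache has 1024 sets (10 index bits). Both use 6 offset bits for their 64-byte blocks.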
Pentium 4 memory system
CPU <-> L1 cache <-(256 bits)-> L2 cache <-(64 bits)-> Main Memory
The Pentium caches are connected by a 256-bit (32-byte) bus, so it takes
two clock cycles to transfer a single 64-byte L1 cache block.
The Pentium caches are inclusive. The L1 caches contain a copy of data
which is also in L2, and the L2 cache contains data that’s also in RAM.
— The same data may exist in all three levels of the hierarchy.
— The effective cache size is only 512KB, since the L1 caches are just a
faster copy of data that’s also in L2.
AMD Athlon XP caches
The AMD Athlon XP is an 8086-compatible processor with several features
aimed at lowering miss rates.
The Athlon has a 128KB split level 1 cache with four-way associativity.
— This is roughly eight times larger than the Pentium 4’s L1 caches.
— As we saw last week, larger caches usually result in fewer misses.
The Athlon’s 512KB secondary cache also has some notable features.
— The higher 16-way associativity further minimizes the miss rate.
— The L1 and L2 caches are exclusive—they never hold the same data.
This makes the effective total cache size 640KB.
Athlon XP memory system
The Athlon L1 and L2 caches are connected by a 64-bit bus, as compared
to the wider 256-bit bus on the Pentiums.
— AMD claims their caches have a miss rate that’s low enough to offset
the larger miss penalty.
— This illustrates another one of many architectural tradeoffs.
CPU <-> L1 cache <-(64 bits)-> L2 cache <-(64 bits)-> Main Memory
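The bus-width and inclusive/exclusive arithmetic from the Pentium 4 and Athlon XP slides can be written out as two small helpers. This is only a sketch: the function names are invented, one bus transfer per clock cycle is assumed, and the 64-byte block size used for the Athlon in the note is an assumption, since the slides give it only for the Pentium 4.

```c
/* Cycles needed to move one cache block across a bus, rounding up.
   Assumes one bus-width transfer per clock cycle. */
static unsigned transfer_cycles(unsigned block_bytes, unsigned bus_bits) {
    unsigned bus_bytes = bus_bits / 8;
    return (block_bytes + bus_bytes - 1) / bus_bytes;
}

/* Distinct data the L1 and L2 caches can hold together, in KB.
   Inclusive: L1 only duplicates L2 (assumes L2 is the larger cache).
   Exclusive: the two caches never overlap, so their sizes add up. */
static unsigned effective_cache_kb(unsigned l1_kb, unsigned l2_kb, int inclusive) {
    return inclusive ? l2_kb : l1_kb + l2_kb;
}
```

A 64-byte block takes transfer_cycles(64, 256) = 2 cycles on the 256-bit bus and would take 8 cycles on a 64-bit bus. Likewise, effective_cache_kb(128, 512, 0) gives the Athlon's 640 KB, while an inclusive 512 KB L2 yields 512 KB whatever the L1 size.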
Motorola PowerPC G4
The Motorola PowerPC G4 has a 64KB, split primary cache with eight-way
associativity.
The on-die 256KB 8-way L2 cache has a 256-bit interface to the L1, just
like the Pentium 4.
The G4 also goes one step further with the memory hierarchy idea, by
supporting an external Level 3 or L3 cache up to 2MB.
— The L3 cache data is not stored on the CPU itself, but the tags are.
Thus, determining L3 cache hits is very fast.
— The L3 cache can also be configured as regular main memory, but
with the advantage of being much faster.
Caches and dies
As manufacturing technology improves, designers
can squeeze more and more cache memory onto a
processor die.
In the old Pentium III days the L2 cache wasn’t even
on the same chip as the processor itself. Companies
sold processor “modules” that were composed of
several chips internally.
The second picture illustrates the size difference
between an older Pentium 4 and a newer one. The
newer one has an area of just 131 mm²!
The last picture shows the dies of an older Athlon
with only 256KB of L2 cache, and a newer version
with 512KB of L2 cache. You can literally see the
doubling of the cache memory.
Picture sources: Tech Report, Tom’s Hardware, Hardware.fr
CPU families
Manufacturers often sell a range of processors based on the same design.
In all cases, one of the main differences is the size of the caches.
— Low-end processors like the Duron and Celeron have half as much L2
cache memory as “standard” versions of the CPU, or even less.
— The high-end Opteron features a 1 MB secondary cache, and Xeons can
be outfitted with up to 2 MB of level 3 cache.
This highlights the importance of caches and memory systems for overall
performance.
Company   Budget    Standard    High-end
Intel     Celeron   Pentium 4   Xeon MP
AMD       Duron     Athlon XP   Opteron
Summary
Writing to a cache poses a couple of interesting issues.
— Write-through and write-back policies keep the cache consistent with
main memory in different ways for write hits.
— Write-around and allocate-on-write are two strategies to handle write
misses, differing in whether updated data is loaded into the cache.
Modern processors include many advanced cache design ideas.
— Split caches can supply instructions and data in the same cycle.
— Non-blocking and multiport caches minimize memory stalls.
— Multilevel cache hierarchies can help reduce miss penalties.
— The speed and width of memory buses also affect performance.
Memory systems are a complex topic, and you should take CS333 to learn
more about it!

More Related Content

What's hot

Multithreading computer architecture
 Multithreading computer architecture  Multithreading computer architecture
Multithreading computer architecture
Haris456
 
Computer architecture
Computer architecture Computer architecture
Computer architecture
Ashish Kumar
 

What's hot (20)

Computer architecture data representation
Computer architecture  data representationComputer architecture  data representation
Computer architecture data representation
 
Memory management
Memory managementMemory management
Memory management
 
Multithreading computer architecture
 Multithreading computer architecture  Multithreading computer architecture
Multithreading computer architecture
 
Basic MIPS implementation
Basic MIPS implementationBasic MIPS implementation
Basic MIPS implementation
 
Virtual Memory
Virtual MemoryVirtual Memory
Virtual Memory
 
Cache Memory
Cache MemoryCache Memory
Cache Memory
 
Multiprocessor Architecture (Advanced computer architecture)
Multiprocessor Architecture  (Advanced computer architecture)Multiprocessor Architecture  (Advanced computer architecture)
Multiprocessor Architecture (Advanced computer architecture)
 
Draw and explain the architecture of general purpose microprocessor
Draw and explain the architecture of general purpose microprocessor Draw and explain the architecture of general purpose microprocessor
Draw and explain the architecture of general purpose microprocessor
 
Memory organization (Computer architecture)
Memory organization (Computer architecture)Memory organization (Computer architecture)
Memory organization (Computer architecture)
 
Memory organization
Memory organizationMemory organization
Memory organization
 
Disk scheduling
Disk schedulingDisk scheduling
Disk scheduling
 
Pipeline
PipelinePipeline
Pipeline
 
Free Space Management, Efficiency & Performance, Recovery and NFS
Free Space Management, Efficiency & Performance, Recovery and NFSFree Space Management, Efficiency & Performance, Recovery and NFS
Free Space Management, Efficiency & Performance, Recovery and NFS
 
Computer architecture
Computer architectureComputer architecture
Computer architecture
 
Storage system architecture
Storage system architectureStorage system architecture
Storage system architecture
 
Parallelism
ParallelismParallelism
Parallelism
 
Computer architecture
Computer architecture Computer architecture
Computer architecture
 
Raid- Redundant Array of Inexpensive Disks
Raid- Redundant Array of Inexpensive DisksRaid- Redundant Array of Inexpensive Disks
Raid- Redundant Array of Inexpensive Disks
 
Multiprocessor architecture
Multiprocessor architectureMultiprocessor architecture
Multiprocessor architecture
 
Cache memory
Cache memoryCache memory
Cache memory
 

Viewers also liked (9)

Prueba slideshare
Prueba slidesharePrueba slideshare
Prueba slideshare
 
Lls oferta generala pcm pentru open
Lls oferta generala pcm pentru openLls oferta generala pcm pentru open
Lls oferta generala pcm pentru open
 
Fisa frunze expresive
Fisa frunze expresiveFisa frunze expresive
Fisa frunze expresive
 
8th Alg - L6.3--Jan12
8th Alg - L6.3--Jan128th Alg - L6.3--Jan12
8th Alg - L6.3--Jan12
 
Combinational circuits
Combinational circuitsCombinational circuits
Combinational circuits
 
6th Math (C1) - Lesson 40
6th Math (C1) - Lesson 406th Math (C1) - Lesson 40
6th Math (C1) - Lesson 40
 
Тарас Леськів “Know your tool – tips and tricks for unity3d developers”
Тарас Леськів “Know your tool – tips and tricks for unity3d developers”Тарас Леськів “Know your tool – tips and tricks for unity3d developers”
Тарас Леськів “Know your tool – tips and tricks for unity3d developers”
 
Vsf ppt on solar impulse v5
Vsf ppt on solar impulse  v5Vsf ppt on solar impulse  v5
Vsf ppt on solar impulse v5
 
MBUDUMA COMMUNICATIONS PROFILE.compressed
MBUDUMA COMMUNICATIONS PROFILE.compressedMBUDUMA COMMUNICATIONS PROFILE.compressed
MBUDUMA COMMUNICATIONS PROFILE.compressed
 

Similar to Write miss

Please do ECE572 requirementECECS 472572 Final Exam Project (W.docx
Please do ECE572 requirementECECS 472572 Final Exam Project (W.docxPlease do ECE572 requirementECECS 472572 Final Exam Project (W.docx
Please do ECE572 requirementECECS 472572 Final Exam Project (W.docx
ARIV4
 
ECECS 472572 Final Exam ProjectRemember to check the errat.docx
ECECS 472572 Final Exam ProjectRemember to check the errat.docxECECS 472572 Final Exam ProjectRemember to check the errat.docx
ECECS 472572 Final Exam ProjectRemember to check the errat.docx
tidwellveronique
 
ECECS 472572 Final Exam ProjectRemember to check the err.docx
ECECS 472572 Final Exam ProjectRemember to check the err.docxECECS 472572 Final Exam ProjectRemember to check the err.docx
ECECS 472572 Final Exam ProjectRemember to check the err.docx
tidwellveronique
 
ECECS 472572 Final Exam ProjectRemember to check the errata
ECECS 472572 Final Exam ProjectRemember to check the errata ECECS 472572 Final Exam ProjectRemember to check the errata
ECECS 472572 Final Exam ProjectRemember to check the errata
EvonCanales257
 

Similar to Write miss (20)

Cache memory
Cache memoryCache memory
Cache memory
 
Memory comp
Memory compMemory comp
Memory comp
 
Cache memory
Cache memoryCache memory
Cache memory
 
Please do ECE572 requirementECECS 472572 Final Exam Project (W.docx
Please do ECE572 requirementECECS 472572 Final Exam Project (W.docxPlease do ECE572 requirementECECS 472572 Final Exam Project (W.docx
Please do ECE572 requirementECECS 472572 Final Exam Project (W.docx
 
MULTI-CORE PROCESSORS: CONCEPTS AND IMPLEMENTATIONS
MULTI-CORE PROCESSORS: CONCEPTS AND IMPLEMENTATIONSMULTI-CORE PROCESSORS: CONCEPTS AND IMPLEMENTATIONS
MULTI-CORE PROCESSORS: CONCEPTS AND IMPLEMENTATIONS
 
MULTI-CORE PROCESSORS: CONCEPTS AND IMPLEMENTATIONS
MULTI-CORE PROCESSORS: CONCEPTS AND IMPLEMENTATIONSMULTI-CORE PROCESSORS: CONCEPTS AND IMPLEMENTATIONS
MULTI-CORE PROCESSORS: CONCEPTS AND IMPLEMENTATIONS
 
Cache.pptx
Cache.pptxCache.pptx
Cache.pptx
 
No sql presentation
No sql presentationNo sql presentation
No sql presentation
 
Configuring Aerospike - Part 2
Configuring Aerospike - Part 2 Configuring Aerospike - Part 2
Configuring Aerospike - Part 2
 
Operating Systems Part III-Memory Management
Operating Systems Part III-Memory ManagementOperating Systems Part III-Memory Management
Operating Systems Part III-Memory Management
 
ECECS 472572 Final Exam ProjectRemember to check the errat.docx
ECECS 472572 Final Exam ProjectRemember to check the errat.docxECECS 472572 Final Exam ProjectRemember to check the errat.docx
ECECS 472572 Final Exam ProjectRemember to check the errat.docx
 
ECECS 472572 Final Exam ProjectRemember to check the err.docx
ECECS 472572 Final Exam ProjectRemember to check the err.docxECECS 472572 Final Exam ProjectRemember to check the err.docx
ECECS 472572 Final Exam ProjectRemember to check the err.docx
 
cashe introduction, and heirarchy basics
cashe introduction, and heirarchy basicscashe introduction, and heirarchy basics
cashe introduction, and heirarchy basics
 
ECECS 472572 Final Exam ProjectRemember to check the errata
ECECS 472572 Final Exam ProjectRemember to check the errata ECECS 472572 Final Exam ProjectRemember to check the errata
ECECS 472572 Final Exam ProjectRemember to check the errata
 
Main Memory Management in Operating System
Main Memory Management in Operating SystemMain Memory Management in Operating System
Main Memory Management in Operating System
 
Limitations of memory system performance
Limitations of memory system performanceLimitations of memory system performance
Limitations of memory system performance
 
Architecture and implementation issues of multi core processors and caching –...
Architecture and implementation issues of multi core processors and caching –...Architecture and implementation issues of multi core processors and caching –...
Architecture and implementation issues of multi core processors and caching –...
 
lecture-2-3_Memory.pdf,describing memory
lecture-2-3_Memory.pdf,describing memorylecture-2-3_Memory.pdf,describing memory
lecture-2-3_Memory.pdf,describing memory
 
virtual memory
virtual memoryvirtual memory
virtual memory
 
Cache memory
Cache memoryCache memory
Cache memory
 

More from marangburu42

More from marangburu42 (20)

Hol
HolHol
Hol
 
Hennchthree 161102111515
Hennchthree 161102111515Hennchthree 161102111515
Hennchthree 161102111515
 
Hennchthree
HennchthreeHennchthree
Hennchthree
 
Hennchthree
HennchthreeHennchthree
Hennchthree
 
Sequential circuits
Sequential circuitsSequential circuits
Sequential circuits
 
Hennchthree 160912095304
Hennchthree 160912095304Hennchthree 160912095304
Hennchthree 160912095304
 
Sequential circuits
Sequential circuitsSequential circuits
Sequential circuits
 
Combinational circuits
Combinational circuitsCombinational circuits
Combinational circuits
 
Karnaugh mapping allaboutcircuits
Karnaugh mapping allaboutcircuitsKarnaugh mapping allaboutcircuits
Karnaugh mapping allaboutcircuits
 
Aac boolean formulae
Aac   boolean formulaeAac   boolean formulae
Aac boolean formulae
 
Virtualmemoryfinal 161019175858
Virtualmemoryfinal 161019175858Virtualmemoryfinal 161019175858
Virtualmemoryfinal 161019175858
 
Io systems final
Io systems finalIo systems final
Io systems final
 
File system interfacefinal
File system interfacefinalFile system interfacefinal
File system interfacefinal
 
File systemimplementationfinal
File systemimplementationfinalFile systemimplementationfinal
File systemimplementationfinal
 
Mass storage structurefinal
Mass storage structurefinalMass storage structurefinal
Mass storage structurefinal
 
All aboutcircuits karnaugh maps
All aboutcircuits karnaugh mapsAll aboutcircuits karnaugh maps
All aboutcircuits karnaugh maps
 
Virtual memoryfinal
Virtual memoryfinalVirtual memoryfinal
Virtual memoryfinal
 
Mainmemoryfinal 161019122029
Mainmemoryfinal 161019122029Mainmemoryfinal 161019122029
Mainmemoryfinal 161019122029
 
Virtualmemorypre final-formatting-161019022904
Virtualmemorypre final-formatting-161019022904Virtualmemorypre final-formatting-161019022904
Virtualmemorypre final-formatting-161019022904
 
Process synchronizationfinal
Process synchronizationfinalProcess synchronizationfinal
Process synchronizationfinal
 

Recently uploaded

一比一原版美国西雅图大学毕业证(Seattle毕业证书)毕业证成绩单留信认证
一比一原版美国西雅图大学毕业证(Seattle毕业证书)毕业证成绩单留信认证一比一原版美国西雅图大学毕业证(Seattle毕业证书)毕业证成绩单留信认证
一比一原版美国西雅图大学毕业证(Seattle毕业证书)毕业证成绩单留信认证
khuurq8kz
 
9711106444 Ghaziabad, Call Girls @ ₹. 1500– Per Shot Per Night 7000 Delhi
9711106444 Ghaziabad, Call Girls @ ₹. 1500– Per Shot Per Night 7000 Delhi9711106444 Ghaziabad, Call Girls @ ₹. 1500– Per Shot Per Night 7000 Delhi
9711106444 Ghaziabad, Call Girls @ ₹. 1500– Per Shot Per Night 7000 Delhi
delhimunirka15
 
FULL ENJOY —📞9711106444 ✦/ Vℐℙ Call Girls in Jasola Vihar, | Delhi🫶
FULL ENJOY —📞9711106444 ✦/ Vℐℙ Call Girls in Jasola Vihar, | Delhi🫶FULL ENJOY —📞9711106444 ✦/ Vℐℙ Call Girls in Jasola Vihar, | Delhi🫶
FULL ENJOY —📞9711106444 ✦/ Vℐℙ Call Girls in Jasola Vihar, | Delhi🫶
delhimunirka15
 
Russian Call Girls Pilibhit Just Call 👉👉 📞 8617370543 Top Class Call Girl Ser...
Russian Call Girls Pilibhit Just Call 👉👉 📞 8617370543 Top Class Call Girl Ser...Russian Call Girls Pilibhit Just Call 👉👉 📞 8617370543 Top Class Call Girl Ser...
Russian Call Girls Pilibhit Just Call 👉👉 📞 8617370543 Top Class Call Girl Ser...
Nitya salvi
 
Mussafah Call Girls +971525373611 Call Girls in Mussafah Abu Dhabi
Mussafah Call Girls +971525373611 Call Girls in Mussafah Abu DhabiMussafah Call Girls +971525373611 Call Girls in Mussafah Abu Dhabi
Mussafah Call Girls +971525373611 Call Girls in Mussafah Abu Dhabi
romeke1848
 
Nehru Nagar, Call Girls ☎️ ((#9711106444)), 💘 Full enjoy Low rate girl💘 Genui...
Nehru Nagar, Call Girls ☎️ ((#9711106444)), 💘 Full enjoy Low rate girl💘 Genui...Nehru Nagar, Call Girls ☎️ ((#9711106444)), 💘 Full enjoy Low rate girl💘 Genui...
Nehru Nagar, Call Girls ☎️ ((#9711106444)), 💘 Full enjoy Low rate girl💘 Genui...
delhimunirka15
 
Call Girls In Chattarpur | Contact Me ☎ +91-9953040155
Call Girls In Chattarpur | Contact Me ☎ +91-9953040155Call Girls In Chattarpur | Contact Me ☎ +91-9953040155
Call Girls In Chattarpur | Contact Me ☎ +91-9953040155
SaketCallGirlsCallUs
 
Pari Chowk Call Girls ☎️ ((#9711106444)), 💘 Full enjoy Low rate girl💘 Genuine...
Pari Chowk Call Girls ☎️ ((#9711106444)), 💘 Full enjoy Low rate girl💘 Genuine...Pari Chowk Call Girls ☎️ ((#9711106444)), 💘 Full enjoy Low rate girl💘 Genuine...
Pari Chowk Call Girls ☎️ ((#9711106444)), 💘 Full enjoy Low rate girl💘 Genuine...
delhimunirka15
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377087607
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377087607FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377087607
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377087607
dollysharma2066
 

Recently uploaded (20)

Orai call girls 📞 8617370543At Low Cost Cash Payment Booking
Orai call girls 📞 8617370543At Low Cost Cash Payment BookingOrai call girls 📞 8617370543At Low Cost Cash Payment Booking
Orai call girls 📞 8617370543At Low Cost Cash Payment Booking
 
一比一原版美国西雅图大学毕业证(Seattle毕业证书)毕业证成绩单留信认证
一比一原版美国西雅图大学毕业证(Seattle毕业证书)毕业证成绩单留信认证一比一原版美国西雅图大学毕业证(Seattle毕业证书)毕业证成绩单留信认证
一比一原版美国西雅图大学毕业证(Seattle毕业证书)毕业证成绩单留信认证
 
9711106444 Ghaziabad, Call Girls @ ₹. 1500– Per Shot Per Night 7000 Delhi
9711106444 Ghaziabad, Call Girls @ ₹. 1500– Per Shot Per Night 7000 Delhi9711106444 Ghaziabad, Call Girls @ ₹. 1500– Per Shot Per Night 7000 Delhi
9711106444 Ghaziabad, Call Girls @ ₹. 1500– Per Shot Per Night 7000 Delhi
 
Theoretical Framework- Explanation with Flow Chart.docx
Theoretical Framework- Explanation with Flow Chart.docxTheoretical Framework- Explanation with Flow Chart.docx
Theoretical Framework- Explanation with Flow Chart.docx
 
Turn Off The Air Con - The Singapore Punk Scene
Turn Off The Air Con - The Singapore Punk SceneTurn Off The Air Con - The Singapore Punk Scene
Turn Off The Air Con - The Singapore Punk Scene
 
Call Girls In Firozabad Escorts ☎️8617370543 🔝 💃 Enjoy 24/7 Escort Service En...
  • 1. Cache writes and examples (April 28, 2003, ©2001-2003 Howard Huang). Today is the last day of caches! One important topic we haven’t mentioned yet is writing to a cache. We’ll see several different scenarios and approaches. To introduce some more advanced cache designs, we’ll also present some of the caching strategies used in modern desktop processors like Pentiums and PowerPCs. On Wednesday we’ll turn our attention to peripheral devices and I/O.
  • 2. Writing to a cache: Writing to a cache raises several additional issues. First, let’s assume that the address we want to write to is already loaded in the cache. We’ll assume a simple direct-mapped cache. If we write a new value to that address, we can store the new data in the cache, and avoid an expensive main memory access. (Figure: before the write, cache index 110 holds V = 1, tag 11010, data 42803, and memory address 1101 0110 holds 42803. After Mem[214] = 21763, the cache block holds 21763 while memory still holds 42803.)
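The figure on this slide implies an 8-bit address split into a 5-bit tag and a 3-bit index. A minimal sketch of that split, assuming one-word blocks with no block offset (the function name `split_address` is illustrative, not from the slides):

```python
INDEX_BITS = 3  # assumption: 8 cache blocks, matching the 3-bit index in the figure

def split_address(addr: int) -> tuple[int, int]:
    """Split an address into (tag, index) for a direct-mapped cache."""
    index = addr & ((1 << INDEX_BITS) - 1)   # low bits select the cache block
    tag = addr >> INDEX_BITS                 # remaining high bits form the tag
    return tag, index

tag, index = split_address(214)
print(f"{tag:05b} {index:03b}")  # prints "11010 110"
```

Address 214 is 1101 0110 in binary, so it lands in block 110 with tag 11010, as in the figure.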
  • 3. Inconsistent memory: But now the cache and memory contain different, inconsistent data! How can we ensure that subsequent loads will return the right value? This is also problematic if other devices are sharing the main memory, as in a multiprocessor system. (Figure: cache index 110 holds tag 11010 and data 21763; memory address 1101 0110 still holds 42803.)
  • 4. Write-through caches: A write-through cache solves the inconsistency problem by forcing all writes to update both the cache and the main memory. This is simple to implement and keeps the cache and memory consistent. The bad thing is that forcing every write to go to main memory negates the advantage of having a cache: the whole point was to avoid accessing main memory! (Figure: after Mem[214] = 21763, both the cache block at index 110 and memory address 1101 0110 hold 21763.)
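A rough sketch of the write-through behavior on a write hit, using the values from the figure. The dictionary layout and the name `write_through` are assumptions made for illustration. On a miss this sketch updates memory only, which is one possible choice rather than something this slide specifies.

```python
memory = {214: 42803}                              # main memory: address -> value
cache = {0b110: {"tag": 0b11010, "data": 42803}}   # one valid block at index 110

def write_through(addr, value):
    """Update the cache block on a hit, and always update main memory."""
    tag, index = addr >> 3, addr & 0b111           # 5-bit tag, 3-bit index
    block = cache.get(index)
    if block is not None and block["tag"] == tag:  # write hit
        block["data"] = value
    memory[addr] = value                           # every write reaches memory

write_through(214, 21763)
```

After the call, the cache and memory agree, at the cost of one memory access per write.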
  • 5. Write-back caches: In a write-back cache, the memory is not updated until the cache block needs to be replaced (e.g., when loading data into a full cache set). For example, we might write some data to the cache at first, leaving it inconsistent with the main memory as shown before. Subsequent reads to the same memory address will be serviced by the cache, which contains the correct, updated data. (Figure: after Mem[214] = 21763, cache index 110 holds tag 11010 and data 21763; memory address 1101 0110 still holds 42803, and address 1000 1110 holds 1225.)
  • 6. Finishing the write back: We don’t need to store the new value back to main memory unless the cache block gets replaced. For example, on a read from Mem[142], which maps to the same cache block, the modified cache contents will first be written to main memory. Only then can the cache block be replaced with data from address 142. (Figure: first, the cache’s 21763 is written back to memory address 1101 0110; then cache index 110 is reloaded with tag 10001 and data 1225 from address 1000 1110.)
  • 7. Write-back cache discussion: Each block in a write-back cache needs a dirty bit to indicate whether or not it must be saved to main memory before being replaced; otherwise we might perform unnecessary writebacks. Notice the penalty for the main memory access will not be applied until the execution of some subsequent instruction following the write. In our example, the write to Mem[214] affected only the cache. But the load from Mem[142] resulted in two memory accesses: one to save data to address 214, and one to load data from address 142. This makes it harder to predict execution time and performance. The advantage of write-back caches is that, unlike with write-through caches, not all write operations need to access main memory. If a single address is frequently written to, then it doesn’t pay to keep writing that data through to main memory. If several bytes within the same cache block are modified, they will only force one memory write operation at write-back time.
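The Mem[214] and Mem[142] example can be replayed with a small simulation. This is a sketch under stated assumptions, not the lecture's own code: a direct-mapped cache with one-word blocks and a 3-bit index, allocation on write misses, and a class name (`WriteBackCache`) invented for illustration.

```python
class WriteBackCache:
    """Direct-mapped write-back cache with a dirty bit per block."""

    def __init__(self, memory):
        self.memory = memory
        self.blocks = {}        # index -> [tag, data, dirty]
        self.mem_accesses = 0   # counts main memory reads and writes

    def _replace(self, tag, index):
        old = self.blocks.get(index)
        if old is not None and old[2]:            # dirty block: write it back first
            self.memory[(old[0] << 3) | index] = old[1]
            self.mem_accesses += 1
        self.blocks[index] = [tag, self.memory[(tag << 3) | index], False]
        self.mem_accesses += 1                    # load the new block

    def read(self, addr):
        tag, index = addr >> 3, addr & 0b111
        block = self.blocks.get(index)
        if block is None or block[0] != tag:      # read miss
            self._replace(tag, index)
        return self.blocks[index][1]

    def write(self, addr, value):
        tag, index = addr >> 3, addr & 0b111
        block = self.blocks.get(index)
        if block is None or block[0] != tag:      # write miss: allocate (assumption)
            self._replace(tag, index)
        self.blocks[index][1] = value
        self.blocks[index][2] = True              # mark dirty; memory is not touched


memory = {214: 42803, 142: 1225}
cache = WriteBackCache(memory)
cache.read(214)              # bring the block in, as the slides assume
cache.mem_accesses = 0
cache.write(214, 21763)      # write hit: only the cache changes
accesses_after_write = cache.mem_accesses
stale = memory[214]          # memory still holds the old value here
value = cache.read(142)      # conflict: write back 214, then load 142
```

The write costs no memory accesses, while the later read of address 142 pays for two, which is the delayed penalty the slide describes.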
  • 8. Write misses: A second scenario is if we try to write to an address that is not already contained in the cache; this is called a write miss. Let’s say we want to store 21763 into Mem[1101 0110] but we find that address is not currently in the cache. When we update Mem[1101 0110], should we also load it into the cache? (Figure: cache index 110 holds V = 1, tag 00010, data 123456; memory address 1101 0110 holds 6378.)
  • 9. Write around caches: With a write around policy, the write operation goes directly to main memory without affecting the cache. This is good when data is written but not immediately used again, in which case there’s no point in loading it into the cache yet. Example: for (int i = 0; i < SIZE; i++) a[i] = i; (Figure: after Mem[214] = 21763, memory address 1101 0110 holds 21763; cache index 110 still holds tag 00010 and data 123456.)
  • 10. Allocate on write: An allocate on write strategy would instead load the newly written data into the cache. If that data is needed again soon, it will be available in the cache. (Figure: after Mem[214] = 21763, cache index 110 holds tag 11010 and data 21763, and memory address 1101 0110 holds 21763.)
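The two write-miss policies can be compared side by side with the values from the figures. The sketch below assumes that memory is updated in both cases, as the figures show; the function name `write_miss` and its `allocate` flag are hypothetical.

```python
def write_miss(cache, memory, addr, value, allocate):
    """Handle a write miss in a direct-mapped cache with a 3-bit index.

    allocate=False models write-around; allocate=True models allocate-on-write.
    """
    memory[addr] = value                      # main memory is updated either way
    if allocate:                              # also place the new data in the cache
        cache[addr & 0b111] = {"tag": addr >> 3, "data": value}

# Starting state from the slides: index 110 holds an unrelated block (tag 00010).
start = {0b110: {"tag": 0b00010, "data": 123456}}

around_cache, around_mem = dict(start), {214: 6378}
write_miss(around_cache, around_mem, 214, 21763, allocate=False)

alloc_cache, alloc_mem = dict(start), {214: 6378}
write_miss(alloc_cache, alloc_mem, 214, 21763, allocate=True)
```

With write-around the old block survives, which suits the array-initialization loop; with allocate-on-write the new value replaces it and is ready for a later read.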
  • 11. Modern memory systems: We’ll finish up by describing the memory systems of some real CPUs. This will reinforce some of the ideas we’ve learned so far. You can also see some more advanced, more modern cache designs. The book gives an example of the memory system in the MIPS R2000 CPU, which was popular in the early 90s. We’ll highlight some of its more interesting features. There are separate caches for instructions and data. Each cache block is just one word wide. A write-through policy is implemented with a write buffer.
  • 12. Unified vs. split caches: The R2000 uses a split cache design. There is one 64KB cache for instructions and a separate 64KB cache for data. This makes sense when programs need to access both instructions and data in the same clock cycle; we used separate instruction and data memories in our single-cycle and pipelined datapaths too! In contrast, a unified cache stores both program code and data. This is more flexible, because a value may be stored anywhere in the cache memory regardless of whether it’s an instruction or data. Unified caches normally lead to lower miss rates. As an example, the book shows that the miss rate of a unified cache for gcc is 4.8%, while the miss rate for a split cache turned out to be 5.4%.
  • 13. Block size: Each 64KB cache in the R2000 is divided into 16K one-word blocks. This doesn’t give the R2000 much chance to take advantage of spatial locality, as we mentioned last week. Figure 7.11 of the book shows how the miss rates could be reduced if the R2000 supported four-word blocks instead. (Figure: “Block size and miss rate”, a bar chart comparing miss rates for gcc and spice with 1-word and 4-word blocks.)
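The 16K figure follows from simple arithmetic, sketched below. The 4-byte word size is an assumption (32-bit MIPS words), and the helper `num_blocks` is illustrative.

```python
CACHE_BYTES = 64 * 1024   # one 64KB cache, from the slides
WORD_BYTES = 4            # assumption: 32-bit MIPS words

def num_blocks(words_per_block):
    """Number of blocks in the cache for a given block size in words."""
    return CACHE_BYTES // (WORD_BYTES * words_per_block)

one_word = num_blocks(1)                   # 16384 blocks, i.e. 16K
four_word = num_blocks(4)                  # 4096 blocks
index_bits = one_word.bit_length() - 1     # 14 index bits for 16K blocks
```

Moving to four-word blocks would cut the block count to 4K, but each miss would bring in three neighboring words as well.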
  • 14. Write buffers: The R2000 cache is write-through, so on write hits data goes to both the cache and the main memory. This can result in slow writes, so the R2000 includes a write buffer, which queues pending writes to main memory and permits the CPU to continue. Buffers are commonly used when two devices run at different speeds. If a producer generates data too quickly for a consumer to handle, the extra data is stored in a buffer and the producer can continue on with other tasks, without waiting for the consumer. Conversely, if the producer slows down, the consumer can continue running at full speed as long as there is excess data in the buffer. For us, the producer is the CPU and the consumer is the main memory. (Figure: Producer, Buffer, Consumer.)
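A write buffer is essentially a bounded FIFO queue between the CPU and memory. The following is a simplified sketch, not the R2000's actual design: the capacity of 2, the class name `WriteBuffer`, and the policy of draining one entry when full are all assumptions.

```python
from collections import deque

class WriteBuffer:
    """FIFO of pending (address, value) writes headed for main memory."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pending = deque()
        self.stalls = 0

    def cpu_write(self, addr, value, memory):
        """Queue a write; the CPU stalls only when the buffer is full."""
        if len(self.pending) == self.capacity:
            self.stalls += 1
            self.drain_one(memory)            # wait for memory to retire one write
        self.pending.append((addr, value))

    def drain_one(self, memory):
        """Memory (the consumer) completes the oldest pending write."""
        if self.pending:
            addr, value = self.pending.popleft()
            memory[addr] = value


memory = {}
buf = WriteBuffer(capacity=2)    # illustrative capacity
for i in range(3):
    buf.cpu_write(i, i * 10, memory)
```

The first two writes are absorbed without waiting; the third finds the buffer full and forces one stall while the oldest write drains.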
  • 15. Reducing memory stalls: Most newer CPUs include several features to reduce memory stalls. With a non-blocking cache, a processor that supports out-of-order execution can continue in spite of a cache miss. The cache might process other cache hits, or queue other misses. A multiported cache allows multiple reads or writes per clock cycle, which is necessary for superscalar architectures that execute more than one instruction simultaneously. (A similar concept popped up in Homework 3.) A less expensive alternative to multiporting is used by the Pentium Pro. Cache memory is organized into several banks, and multiple accesses to different banks are permitted in the same clock cycle. We saw the same idea last week, in the context of main memory.
  • 16. Additional levels of caches: Last week we saw that high miss penalties and average memory access times are partially due to slow main memories. AMAT = Hit time + (Miss rate × Miss penalty). How about adding another level into the memory hierarchy? Most processors include a secondary (L2) cache, which lies between the primary (L1) cache and main memory in location, capacity and speed. (Figure: CPU, L1 cache, L2 cache, main memory.)
  • 17. Reducing the miss penalty: If the primary cache misses, we might be able to find the desired data in the L2 cache instead. If so, the data can be sent from the L2 cache to the CPU faster than it could be from main memory. Main memory is only accessed if the requested data is in neither the L1 nor the L2 cache. The miss penalty is reduced whenever the L2 cache hits. (Figure: CPU, L1 cache, L2 cache, main memory.)
  • 18. Multi-level cache design: AMAT = Hit time + (Miss rate × Miss penalty). Adding an L2 cache can lower the miss penalty, which means that the L1 miss rate becomes less of a factor. L1 and L2 caches may employ different organizations and policies. For instance, here is some information about the Pentium 4 data caches. L1: 8 KB data size, 4-way associativity, 64-byte blocks, write-through policy, 2-cycle latency. L2: 512 KB data size, 8-way associativity, 64-byte blocks, write-back policy, 7-cycle latency. The secondary cache is usually much larger than the primary one, so L2 cache blocks are typically bigger as well, and can take more advantage of spatial locality.
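To see how an L2 cache lowers the miss penalty, the AMAT formula can be applied twice, once per level. In the sketch below the 2-cycle and 7-cycle latencies come from the Pentium 4 table, while both miss rates and the 100-cycle memory penalty are hypothetical numbers chosen only for illustration. Treating the 7 cycles as the L2 hit time seen by an L1 miss is also an assumption.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """AMAT = Hit time + (Miss rate x Miss penalty), all times in cycles."""
    return hit_time + miss_rate * miss_penalty

L1_HIT, L2_HIT = 2, 7     # latencies from the Pentium 4 table
L1_MISS_RATE = 0.10       # hypothetical
L2_MISS_RATE = 0.20       # hypothetical, measured over accesses that reach L2
MEMORY_PENALTY = 100      # hypothetical, in cycles

without_l2 = amat(L1_HIT, L1_MISS_RATE, MEMORY_PENALTY)

# With an L2, the L1 miss penalty is itself the AMAT of the L2 cache.
l1_miss_penalty = amat(L2_HIT, L2_MISS_RATE, MEMORY_PENALTY)
with_l2 = amat(L1_HIT, L1_MISS_RATE, l1_miss_penalty)

print(f"{without_l2:.1f} vs {with_l2:.1f} cycles")  # prints "12.0 vs 4.7 cycles"
```

Under these made-up numbers the L1 miss penalty drops from 100 to 27 cycles, so the same L1 miss rate hurts far less.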
  • 19. Extending the hierarchy even further: The storage hierarchy can be extended quite a ways. CPU registers can be considered as small one-word caches which hold frequently used data. However, registers must be controlled explicitly by the programmer, whereas caching is done automatically by the hardware. Main memory can be thought of as a cache for a hard disk. Programs and data are both loaded into main memory before being executed or edited. We could go even further, but the picture wouldn’t fit. Your hard drive may be a cache for networked information. Companies like Akamai provide mirrors or caches of other web sites. (Figure: CPU registers, caches, main memory, hard disk.)
  • 20. Intel Pentium 4 caches: The Intel Pentium 4 has very small L1 caches. An execution trace cache holds about 12,000 micro-instructions. This avoids the cost of having to re-decode instructions (an expensive task for 8086-based processors) that are read from the cache. The primary data cache stores just 8KB. Intel’s designers chose to minimize primary cache latency at the cost of a higher miss rate. So level 1 cache hits in the Pentium 4 are very fast, but there are more misses overall. The 512KB secondary cache is eight-way associative, with 64-byte blocks. (Table: L1: 8 KB data size, 4-way associativity, 64-byte blocks, write-through policy, 2-cycle latency. L2: 512 KB data size, 8-way associativity, 64-byte blocks, write-back policy, 7-cycle latency.)
  • 21. Pentium 4 memory system: The Pentium caches are connected by a 256-bit (32-byte) bus, so it takes two clock cycles to transfer a single 64-byte L1 cache block. The Pentium caches are inclusive. The L1 caches contain a copy of data which is also in L2, and the L2 cache contains data that’s also in RAM. The same data may exist in all three levels of the hierarchy. The effective cache size is only 512KB, since the L1 caches are just a faster copy of data that’s also in L2. (Figure: CPU, L1 cache, L2 cache and main memory, with bus widths labeled 256 and 64.)
  • 22. AMD Athlon XP caches: The AMD Athlon XP is an 8086-compatible processor with several features aimed at lowering miss rates. The Athlon has a 128KB split level 1 cache with four-way associativity. This is roughly eight times larger than the Pentium 4’s L1 caches. As we saw last week, larger caches usually result in fewer misses. The Athlon’s 512KB secondary cache also has some notable features. The higher 16-way associativity further minimizes the miss rate. The L1 and L2 caches are exclusive: they never hold the same data. This makes the effective total cache size 640KB.
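The effective sizes quoted for the two designs come down to whether L1 contents are duplicated in L2. A small sketch of that arithmetic, where the helper `effective_kb` is illustrative and the Pentium 4 L1 is counted as its 8KB data cache only:

```python
def effective_kb(l1_kb, l2_kb, exclusive):
    """Distinct data that the two cache levels can hold together, in KB."""
    # Exclusive caches never duplicate data, so their capacities add up.
    # Inclusive caches keep L1's contents in L2 too, so the larger level bounds it.
    return l1_kb + l2_kb if exclusive else max(l1_kb, l2_kb)

athlon_xp = effective_kb(128, 512, exclusive=True)    # 640
pentium4 = effective_kb(8, 512, exclusive=False)      # 512
```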
  • 23. Athlon XP memory system: The Athlon L1 and L2 caches are connected by a 64-bit bus, as compared to the wider 256-bit bus on the Pentiums. AMD claims their caches have a miss rate that’s low enough to offset the larger miss penalty. This illustrates another one of many architectural tradeoffs. (Figure: CPU, L1 cache, L2 cache and main memory, with both bus widths labeled 64.)
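The larger miss penalty follows from how many bus cycles one block transfer takes. The sketch below reuses the 64-byte block size stated for the Pentium 4; applying the same block size to the Athlon is an assumption, and `transfer_cycles` is an illustrative name.

```python
def transfer_cycles(block_bytes, bus_bits):
    """Bus cycles needed to move one cache block, rounding up."""
    bus_bytes = bus_bits // 8
    return -(-block_bytes // bus_bytes)   # ceiling division

pentium4 = transfer_cycles(64, 256)   # 2 cycles, as stated for the Pentium 4
athlon = transfer_cycles(64, 64)      # 8 cycles, assuming 64-byte blocks
```

With the narrower bus, each L1 refill occupies the bus four times as long, which is the penalty AMD's lower miss rate is meant to offset.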
  • 24. Motorola PowerPC G4: The Motorola PowerPC G4 has a 64KB, split primary cache with eight-way associativity. The on-die 256KB 8-way L2 cache has a 256-bit interface to the L1, just like the Pentium 4. The G4 also goes one step further with the memory hierarchy idea, by supporting an external Level 3 or L3 cache up to 2MB. The L3 cache data is not stored on the CPU itself, but the tags are. Thus, determining L3 cache hits is very fast. The L3 cache can also be configured as regular main memory, but with the advantage of being much faster.
  • 25. Caches and dies: As manufacturing technology improves, designers can squeeze more and more cache memory onto a processor die. In the old Pentium III days the L2 cache wasn’t even on the same chip as the processor itself. Companies sold processor “modules” that were composed of several chips internally. The second picture illustrates the size difference between an older Pentium 4 and a newer one. The newer one has an area of just 131 mm²! The last picture shows the dies of an older Athlon with only 256KB of L2 cache, and a newer version with 512KB of L2 cache. You can literally see the doubling of the cache memory. (Picture credits: Tech Report, Tom’s Hardware, Hardware.fr.)
  • 26. CPU families: Manufacturers often sell a range of processors based on the same design. In all cases, one of the main differences is the size of the caches. Low-end processors like the Duron and Celeron have half as much L2 cache memory as “standard” versions of the CPU, or less. The high-end Opteron features a 1 MB secondary cache, and Xeons can be outfitted with up to 2 MB of level 3 cache. This highlights the importance of caches and memory systems for overall performance. (Table: Intel: Celeron (budget), Pentium 4 (standard), Xeon MP (high-end). AMD: Duron (budget), Athlon XP (standard), Opteron (high-end).)
  • 27. Summary: Writing to a cache poses a couple of interesting issues. Write-through and write-back policies keep the cache consistent with main memory in different ways for write hits. Write-around and allocate-on-write are two strategies to handle write misses, differing in whether updated data is loaded into the cache. Modern processors include many advanced cache design ideas. Split caches can supply instructions and data in the same cycle. Non-blocking and multiported caches minimize memory stalls. Multilevel cache hierarchies can help reduce miss penalties. The speed and width of memory buses also affect performance. Memory systems are a complex topic, and you should take CS333 to learn more about it!