'volatile' is volatile
mark@veltzer.net
Memory...
● When programming we use memory all the
time
– Reading/Writing data structures on the heap, stack
or data segment.
– Reading/Writing from/to hardware
– By “memory” I do not mean registers in this
presentation (since every core has its own
registers)
What kinds of guarantees do we
want from memory operations?
● That the operation is not optimized away completely
● That the operation does not take place in registers (that are not
visible to other cores by definition)
● Visibility to other cores (bypass/flush/sync CPU caches)
● Visibility to hardware
● Atomicity
● Ordering between two or more such memory operations
● Any combination of the above (possibly none of them...)
● In regular C programming (that is, without using special features), the
compiler and the CPU provide none of the above guarantees
So what guarantees does volatile
provide?
● It's just not clear!
● The first two, yes
● The others, maybe. On many architectures, no.
● Depends on your compiler, its version, your compilation flags,
the astrological sign of the compiler author's best friend...
● To be specific: most volatile implementations do not imply
atomicity or ordering.
● And what does volatile mean for structures bigger than a word,
int or long? Pass me the joint as things are getting hazy...
● Don't use volatile (true in most cases!)
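To make the "no atomicity" point concrete, here is a minimal sketch, assuming a C11 compiler with <stdatomic.h> and POSIX threads. The names count_with_atomics, worker and MAX_THREADS are made up for this illustration. Several threads bump one shared counter. With atomic_fetch_add the total always comes out exact. With a volatile int and counter++ the increment would be a separate load, add and store, and updates could get lost.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define MAX_THREADS 16      /* illustrative limit, not from the slides */

static atomic_int counter;  /* shared counter, incremented atomically */
static int iterations;      /* increments per thread, set before the threads start */

/* Each worker adds 1 to the shared counter `iterations` times. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < iterations; i++) {
        /* One indivisible read-modify-write. A `volatile int` with
         * `counter++` would be load + add + store, so two threads
         * could overwrite each other's update. */
        atomic_fetch_add(&counter, 1);
    }
    return NULL;
}

/* Runs `nthreads` workers and returns the final counter value,
 * or -1 on bad arguments or if a thread could not be created. */
int count_with_atomics(int nthreads, int iters)
{
    pthread_t threads[MAX_THREADS];
    int started = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS || iters < 0)
        return -1;

    atomic_store(&counter, 0);
    iterations = iters;

    while (started < nthreads) {
        if (pthread_create(&threads[started], NULL, worker, NULL) != 0)
            break;
        started++;
    }
    /* Always join whatever was started, even on failure. */
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    if (started != nthreads)
        return -1;

    assert(atomic_load(&counter) <= nthreads * iters);
    return atomic_load(&counter);
}
```

Calling count_with_atomics(4, 10000) should return 40000 on every run. The volatile variant is left out on purpose, since a racy increment is undefined behavior in C11 and its result differs from run to run.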
Memory reordering
● Imagine the following code (with no compiler
optimizations):
    X=5
    Y=6
    X=7
    Y=8
● What states do you expect other cores to see?
    X=5, Y=6
    X=7, Y=6
    X=7, Y=8
● Or maybe
    X=5, Y=6
    X=5, Y=8
    X=7, Y=8
● Yes! The CPU does this. (well, not Intel for two stores like these, but others)
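One way to rule out the surprising state (X=5, Y=8) is to publish Y with release ordering and read it with acquire ordering. The following is a minimal sketch, assuming C11 <stdatomic.h>. The function names writer and reader are invented here, and the reader is simplified so that it only looks at X once it has seen Y==8.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

static int X = 5;         /* plain data, published through Y */
static atomic_int Y = 6;  /* doubles as the "X has been updated" signal */

/* Writer: update X first, then store Y with release ordering, so the
 * store to X cannot become visible after the store to Y. */
void writer(void)
{
    X = 7;
    atomic_store_explicit(&Y, 8, memory_order_release);
}

/* Reader: returns true and fills *x_out once Y == 8 has been observed.
 * The acquire load pairs with the release store in writer(), so a
 * reader that sees Y == 8 is guaranteed to see X == 7 as well. */
bool reader(int *x_out)
{
    assert(x_out != NULL);
    if (atomic_load_explicit(&Y, memory_order_acquire) != 8)
        return false;  /* writer not done yet; reading X now would be a race */
    *x_out = X;
    return true;
}
```

The two functions are meant to run on different threads. Called one after the other on a single thread they simply show the protocol: reader() fails before writer() has run and yields X==7 afterwards.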
Is the Compiler/CPU allowed to do
that?
● Yes. Actually there are many types of reordering that the
Compiler/CPU is allowed to perform
● Common CPU reorderings include:
– Load reordered after load
– Load reordered after store
– Store reordered after load
– Store reordered after store
– Store reordered after atomics
– Load reordered after atomics
– Dependent load reordered (YES! Alpha does this, they should all
be locked up...)
So what do the compiler/CPU
guarantee?
● They guarantee results in one thread.
● This means that they may alter your code, reorder it, discard parts of it, use
different operations than the ones you use and more.
● But all of these changes leave the results you get unchanged within
the thread they are applied to.
● But sometimes you want your code to be left unaltered.
● This is especially true when other threads or hardware are involved.
● In these cases the order matters, the specific operations matter, etc.
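A small sketch of what "same results in one thread" allows. The function names are invented for this example, and what a given compiler actually emits depends on its version and flags. Both functions return the same value, but only the volatile version obliges the compiler to perform every access as written.

```c
#include <assert.h>
#include <stddef.h>

/* Plain accesses: the compiler may drop the first store and answer
 * "2" without re-reading memory, because nothing in this thread can
 * tell the difference. */
int store_twice(int *p)
{
    assert(p != NULL);
    *p = 1;  /* may be optimized away */
    *p = 2;
    return *p;  /* may be folded to the constant 2 */
}

/* Volatile accesses: both stores and the final load must be emitted,
 * in this order. This says nothing about atomicity or about how
 * other cores observe them. */
int store_twice_volatile(volatile int *p)
{
    assert(p != NULL);
    *p = 1;
    *p = 2;
    return *p;
}
```

Within the calling thread the two are indistinguishable. The difference only matters to an observer outside the thread, such as a device register or another core.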
Enter memory barrier/fence
● A machine memory barrier is a special machine instruction or a
special type of memory access instruction that guarantees order
of execution between memory instructions before it and after it.
● __sync_synchronize() in gcc (user space).
● asm volatile ("mfence" ::: "memory") (gcc inline assembly, x86 only)
● (smp_?)mb(),(smp_?)rmb(),(smp_?)wmb() in kernel development.
● In most cases atomic operations imply a memory barrier of some sort,
and C++11 has a nice API with a memory model included.
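A hedged sketch of two user-space ways to ask for a full barrier: the C11 fence from <stdatomic.h> and the gcc builtin named above. The wrapper function names are invented here. The x86-only mfence form is left out so that the sketch stays portable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Store 1, full barrier, then load. The fence keeps the store and the
 * load from being reordered with each other by compiler or CPU. */
int store_then_load_c11(atomic_int *store_to, atomic_int *load_from)
{
    assert(store_to != NULL && load_from != NULL);
    atomic_store_explicit(store_to, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);  /* portable C11 full fence */
    return atomic_load_explicit(load_from, memory_order_relaxed);
}

/* Same shape using the legacy gcc/clang builtin. */
int store_then_load_gcc(atomic_int *store_to, atomic_int *load_from)
{
    assert(store_to != NULL && load_from != NULL);
    atomic_store_explicit(store_to, 1, memory_order_relaxed);
    __sync_synchronize();  /* full memory barrier */
    return atomic_load_explicit(load_from, memory_order_relaxed);
}
```

Each function only orders the calling thread's own store and load. The barrier pays off when another thread runs the mirror-image sequence on the same two variables.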
OK, prove it to me...
● Time for a demo.
● Two threads, when we start we have:
    X=0
    Y=0
● Thread 1:
    X=1
    R1=Y
● Thread 2:
    Y=1
    R2=X
● Could it be that R1==R2==0 at the end?
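The two-thread experiment above can be sketched as follows, assuming C11 atomics and POSIX threads. This is not the demo from the talk. The names count_both_zero, thread1, thread2 and use_fence are invented, and starting fresh threads every round is a simplification that makes the reordering window very narrow.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int X, Y;  /* shared variables from the slide */
static int R1, R2;       /* each written by one thread, read after join */
static bool use_fence;   /* set before the threads are started */

static void *thread1(void *arg)
{
    (void)arg;
    atomic_store_explicit(&X, 1, memory_order_relaxed);
    if (use_fence)
        atomic_thread_fence(memory_order_seq_cst);  /* store stays before load */
    R1 = atomic_load_explicit(&Y, memory_order_relaxed);
    return NULL;
}

static void *thread2(void *arg)
{
    (void)arg;
    atomic_store_explicit(&Y, 1, memory_order_relaxed);
    if (use_fence)
        atomic_thread_fence(memory_order_seq_cst);
    R2 = atomic_load_explicit(&X, memory_order_relaxed);
    return NULL;
}

/* Runs the experiment `rounds` times and returns how often
 * R1 == R2 == 0 was seen, or -1 if a thread could not be created. */
int count_both_zero(int rounds, bool fence)
{
    int hits = 0;

    assert(rounds >= 0);
    use_fence = fence;
    for (int i = 0; i < rounds; i++) {
        pthread_t t1, t2;

        atomic_store(&X, 0);
        atomic_store(&Y, 0);
        if (pthread_create(&t1, NULL, thread1, NULL) != 0)
            return -1;
        if (pthread_create(&t2, NULL, thread2, NULL) != 0) {
            pthread_join(t1, NULL);
            return -1;
        }
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        if (R1 == 0 && R2 == 0)
            hits++;
    }
    return hits;
}
```

With fence set to true the C11 memory model forbids R1==R2==0, so the count must be 0. With fence set to false a non-zero count is allowed, even on x86, although this simple version may need many rounds before it shows one.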
Hey, but I need volatile to overcome
the compiler!
● No, you don't
● There is something called a “compiler barrier”
● Compiler barriers usually offer several features:
– Forces the compiler to sync unsynchronized registers with memory so that memory writes
before the barrier will go to memory (no cache flush, no memory barrier)
– Forces the compiler to read from memory after the barrier even if the compiler thinks it knows
the value of certain memory locations.
– Forces order of memory operations at the compiler level (not machine level) in relation to the
barrier location in the code
● A compiler barrier is not a machine instruction (as opposed to a memory barrier)
● It is a compiler directive, influencing how the compiler will generate machine code
after the directive is given.
● The compiler may emit machine instructions or it may not (depends on many factors)
● Time for another demo...
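A minimal sketch of a compiler barrier, assuming gcc or clang. The macro name compiler_barrier and both function names are invented for this example. The empty asm statement with a "memory" clobber is the common gcc idiom. atomic_signal_fence from C11 is shown as a portable alternative that, in practice, also constrains only the compiler.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* gcc/clang idiom: an empty asm statement that claims to clobber
 * memory. It emits no instruction by itself, but the compiler must
 * not move memory accesses across it or keep values cached in
 * registers over it. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Writes the data, then the flag. The compiler keeps the two stores
 * in this order; the CPU is still free to reorder them. */
void write_data_then_flag(int *data, int *flag)
{
    assert(data != NULL && flag != NULL);
    *data = 42;
    compiler_barrier();
    *flag = 1;
}

/* Same idea with the C11 fence that orders accesses only at the
 * compiler level (it is specified for a thread and its own signal
 * handlers). */
void write_data_then_flag_c11(int *data, int *flag)
{
    assert(data != NULL && flag != NULL);
    *data = 42;
    atomic_signal_fence(memory_order_seq_cst);
    *flag = 1;
}
```

Neither version makes the stores visible to other cores in order. For that, a machine memory barrier or a release store is still needed.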
References
● “What every programmer should know about memory” by
Ulrich Drepper
● “memory-barriers.txt” from the Linux kernel.
● The example for memory barriers shown is derived from
“Memory Reordering Caught in the Act” by Jeff Preshing
● “Volatile_variable” from Wikipedia
● “Memory_barrier” from Wikipedia
● All examples can be found in the linuxapi project on GitHub, by
me.
Questions?
