The volatile keyword in C++ is poorly misunderstood. Lots of developers believe that it makes the declared variable an atomic variable, but that’s wrong. The truth is that you should use std::atomic or related code (e.g., mutex) on a variable if you want to guarantee atomic operations in a multithreaded context.
What I want to discuss in this article is something else. Before I start you should remember that the volatile keyword should be used when you want to tell the compiler to not optimize a given variable. It basically says “don’t perform any optimizations on operations on this memory because something else outside the program (which the compiler is not aware of) may change it”. The most common use of volatile is in memory-mapped I/O. Gadgets and electronic peripherals (e.g., sensors, displays, etc.) may communicate with the software and optimizations could break the code.
The use of the volatile keyword has caught my attention recently and I want to show you a curious thing about it. I will go through some code examples and I will use an amazing online tool called gcc.godbolt.org to look at the assembly code generated by the compiler.
With the GCC compiler
Let’s start with a very simple example without the volatile keyword. As you can see in the code below, we have an int variable and a while loop. The GCC compiler realizes that the variable will never change its value and the loop will never finish. So it optimizes the assembly code by ignoring the variable check and keeps jumping forever back to the L2 label.
Now consider the same code, but let’s add volatile to the declaration of the int variable on line 4. As we can see, the GCC compiler now generates an assembly code that is totally different than the previous example and it respects the value == 5 check inside the loop.
Now let’s change this example a little bit. What would happen if we move the int variable into a struct? Consider the code below without the volatile keyword. The assembly code is the same as in the first example. The GCC compiler ignores the loop check and keeps jumping forever.
If we want to tell the compiler to not optimize the int variable, should we add the volatile keyword to the int variable declaration inside the struct? Let’s see how this would work:
As we can see, the GCC compiler simply ignored the volatile definition inside the struct and optimized away the loop again. A careless developer could easily assume that code was correct and move on to other tasks. Bugs later would require hours of debugging in order to find the source of the problem. The real solution here is to add the volatile keyword to the struct object on line 6:
I’ve tried different versions of GCC they all have the same result regarding this issue. Now let’s try CLANG and see what happens.
With the CLANG compiler
Using clang 3.9 we can start without the volatile keyword. As you can see below, the loop is optimized just like it happened with the GCC compiler.
Now let’s try the volatile keyword in the struct member variable. We have a surprise here: CLANG doesn’t optimize the loop like GCC did. I tried older versions of CLANG and the output is the same.
We get the same result when we move the volatile to the object on line 6.
So which compiler is the correct one?
It seems to me that CLANG is doing a better job than GCC because the loop deals with the struct object and its member variable. So if the volatile modifier is present either in the object or in its member, then the compiler has to respect that in the loop.
If you have more thoughts about this, please leave comments below.
EDIT 1: I posted this discussion on reddit and we have some great comments there.
EDIT 2: This issue was also discussed on the CppCast Episode 76 with Dan Saks, check it out: http://cppcast.com/2016/10/dan-saks/