Embed with Elliot: The Volatile Keyword

Last time on Embed with Elliot we covered the static keyword, which you can use while declaring a variable or function to increase the duration of the variable without enlarging the scope as you would with a global variable. This piqued the curiosity of a couple of our readers, and we thought we’d run over another (sometimes misunderstood) variable declaration option, namely the volatile keyword.

On its face, volatile is very simple. You use it to tell the compiler that the declared variable can change without notice, and this changes the way that the compiler optimizes with respect to this variable. In big-computer programming, you almost never end up using volatile in C. But in the embedded world, we end up using volatile in one trivial and two very important circumstances, so it’s worth taking a look.

Optimization and the Mind of the Compiler

volatile warns the optimizer that a variable can change without notice, so we need to know why that would ever matter. And for that, we’ll take a quick glimpse into the logic of (one type of) compiler optimization.

Since memory space is presumed to be scarce, a compiler’s going to want to make sure you’re not wasting it. One quick and easy way to do so is to check all the variables, and all the code that uses them, and see if they get changed or if they’re just constants in disguise. Simply put, that means that when the compiler sees something silly like this:

int a = 5;
int b = 0;

b = a + 2;
printf("%d\n", b);

the compiler’s going to notice that you never use a anywhere in your code except in that one place after you’ve set it equal to five, and that although you initialized b as zero, all you do with it afterwards is add two to a and store the result in b. The compiler is going to see through all your foolery, and will compile the equivalent of the following code:

printf("%d\n", 7);

That’s optimization! It got rid of two variables that never really varied anyway, and freed up the RAM they would have occupied. But the take-home lesson for us here is that we’ll have to watch out for cases when we give the compiler the false impression that nothing’s changing when it actually is. And that’s the essential use of volatile.
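To make the folding concrete, here is a small sketch. The function names are ours, purely for illustration. With optimization on, a compiler is free to reduce the first function to a plain "return 7", while in the second the volatile qualifier obliges it to really read a from memory before adding. Both return the same value; only the generated code differs.

```c
/* Illustrative names only; neither function appears in the article. */

/* The compiler may fold this whole body down to "return 7;". */
int sum_plain(void) {
    int a = 5;
    int b = 0;
    b = a + 2;
    return b;
}

/* a is volatile, so the read of a must actually be performed
   before the addition, even though the answer is still 7. */
int sum_volatile(void) {
    volatile int a = 5;
    int b = 0;
    b = a + 2;
    return b;
}
```

If you want to see the difference for yourself, compile with something like gcc -O2 -S and compare the assembly for the two functions.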

And before we leave optimization, the “opposite” of volatile is not static as one might linguistically expect. static controls a variable’s lifetime and visibility, not how the optimizer treats it. Indeed, if you want to define a file-scoped, persistent variable that can change without warning, and this is common, you’d want static volatile. (Edit: Pointless, and wrong, aside about const removed after getting bashed in the comments.)

Delay Loops

Maybe the first time a beginner would want to use volatile is in writing a delay routine. The simplest delay routine just has the CPU busy itself with counting up to some really big number. That is, you’ll try something like this:

int i;
for (i=0; i<65000; ++i){
	; // just wait
}

and expect it to slow the computer down, but it might not.

If you have optimization off, or if it’s set to a low-enough level, it’ll actually work. But with any real optimization on, the compiler will notice that you never actually use the variable i for anything, and it’ll optimize the whole loop away, leaving no code at all.

That won’t serve very well as a delay function.

The solution is to declare i as volatile. The compiler now pretends that it doesn’t know when i can change, or when it’s going to be used by other functions, and it runs through the full for loop dutifully, just in case.

volatile int i;
for (i=0; i<65000; ++i){
	; // run anyway, because something _could_ happen to i here
}

(Note that this code works just as well to demonstrate the volatile qualifier on your desktop, but you won’t notice the delay unless you count up to a very large number. We tried out 65,000,000 instead, and it pauses for a tick on our desktop machine.)
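If you’d like to wrap the idea up in a function, a minimal sketch might look like the following. The name busy_wait and the choice of a 32-bit counter are our assumptions, not part of the original example, and the real delay you get depends entirely on your clock speed and compiler.

```c
#include <stdint.h>

/* Hypothetical helper: spin for `count` iterations and report how many ran. */
uint32_t busy_wait(uint32_t count) {
    volatile uint32_t i;   /* volatile: every read and write of i must happen */
    for (i = 0; i < count; ++i) {
        ;                  /* just wait */
    }
    return i;              /* equals count once the loop has finished */
}
```

Calling busy_wait(65000) runs the loop all 65,000 times whatever the optimization level. For a delay of a known length you’d still want a hardware timer or a calibrated routine such as avr-libc’s _delay_ms().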

That was the trivial example. The next two are real bugs that catch real-life beginning embedded programmers all the time.

Interrupt Service Routines (ISRs)

ISRs are by their nature invoked only outside of the normal program flow, and this naturally confuses the compiler. Indeed, ISRs look to the compiler like functions that are never called, so the definition of ISR in AVR-GCC, for instance, includes the special “used” attribute so that the function doesn’t get thrown away entirely. So you can guess that things get ugly when you want to modify or use a variable from within a function that the compiler doesn’t know is even going to be used.

The good news is that you simply declare the variable as a volatile global variable (outside of the scope of main and the ISR in question) and everything works. The solution is super simple, but if you don’t do it, no errors will be thrown and your ISR simply will appear not to be working, because the compiler is free to treat the variable as a constant everywhere outside the ISR.

Here’s what the right way looks like in a simple example:

volatile uint16_t counter=0;

ISR(timer_interrupt_vector){
	++counter;
}

int main(void){
	printf("%u\n", (unsigned)counter); // counter is unsigned, so print it with %u
}

If counter weren’t marked volatile, you can see how it looks like a constant within the context of main, right? Now you won’t fall for this common beginner pitfall.
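You can try the same pattern on a desktop, where a signal handler is the closest standard-C relative of an ISR: it also runs outside the normal program flow. The sketch below is only an analogy under that assumption. The names event_seen, on_signal and trigger_and_check are made up for the example, and raise() stands in for the hardware event.

```c
#include <signal.h>

/* Shared between the handler and normal code, so it must be volatile. */
static volatile sig_atomic_t event_seen = 0;

/* Plays the role of the ISR: runs outside the normal program flow. */
static void on_signal(int signum) {
    (void)signum;
    event_seen = 1;
}

/* Hypothetical helper: install the handler, trigger it, report the flag.
   Returns 1 if the handler ran, -1 if it could not be installed. */
int trigger_and_check(void) {
    event_seen = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR) {
        return -1;
    }
    raise(SIGINT);            /* stands in for the hardware interrupt */
    return (int)event_seen;
}
```

The type volatile sig_atomic_t is what the C standard asks for when a signal handler writes a flag, which is the same demand the ISR makes of counter above.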

Memory-mapped Hardware Registers

The second non-trivial use of volatile in embedded programming is in the declarations of memory-mapped hardware; a microcontroller interacts with the world by reading and writing voltages on its various pins, and the pin states are usually made available to your code as specially mapped locations in memory. When the pins are configured as input, for instance, your code can read a bit from a specific memory location to test if the pin is at a logic high or low voltage level. So far, so good.

But while you see pins on which the external voltages can change without notice, the compiler sees normal-looking memory locations, unless it’s taken care of in your code. In general, this is taken care of by the manufacturer’s libraries, so you’ve been able to pretend that nothing’s afoot. But as long as we’re looking into the volatile qualifier, let’s dig quickly into the chip-specific codebases.

For instance, in the pin definition files that get included with any AVR project, you can dig through a chain of definitions to get down to a volatile definition for the PIN registers which are used for reading input:

#define PINB    _SFR_IO8 (0x03)
// and
#define _SFR_IO8(io_addr) _MMIO_BYTE((io_addr) + __SFR_OFFSET)
// and finally
#define _MMIO_BYTE(mem_addr) (*(volatile uint8_t *)(mem_addr))

Phew! That is, PINB turns into a dereferenced pointer to a volatile eight-bit integer, where the specified memory address is built up as an IO address plus an offset. But the point here is that the memory location behind this pointer is explicitly declared as volatile, so the compiler knows that it’s changeable.
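Here is the same cast-and-dereference trick in a form you can run on a desktop. This is a sketch under one big assumption: an ordinary variable stands in for the hardware register, and every FAKE_ or fake_ name is invented for the example rather than taken from the AVR headers.

```c
#include <stdint.h>

/* Stand-in for a hardware register so the sketch runs on a desktop.
   On a real AVR the address would come from the chip's header files. */
static uint8_t fake_port_storage;

/* Same shape as _MMIO_BYTE: cast an address to a pointer to a
   volatile byte, then dereference it. */
#define FAKE_MMIO_BYTE(addr) (*(volatile uint8_t *)(addr))
#define FAKE_PINB FAKE_MMIO_BYTE(&fake_port_storage)

/* Returns 1 if the given bit (0-7) of the fake input register is set. */
int fake_pin_is_high(uint8_t bit) {
    return (int)((FAKE_PINB >> bit) & 1u);
}
```

Because the macro expands to a dereference, FAKE_PINB can be used like a variable, and every use of it compiles to a real memory access.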

The same goes for ARM chips. In the ARM-standard CMSIS port definitions, the GPIO memory locations are contained inside a bigger struct, GPIO_TypeDef, that takes care of the relative memory offsets, but if you look into those type definitions (here from ST’s implementation), you find:

// in stm32f4xx.h
 __IO uint32_t IDR;      /*!< GPIO port input data register,         Address offset: 0x10      */
// and in core_cm7.h
/* IO definitions (access restrictions to peripheral registers) */
#define     __IO    volatile             /*!< Defines 'read / write' permissions              */

Tadaa! As promised. And it really had to be so, because otherwise the compiler would look at every place where your code reads the data registers, notice that your code never writes to them, and conclude that their values never change.
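The struct approach can be sketched the same way. What follows is a cut-down, hypothetical register block, not ST’s actual GPIO_TypeDef. The names FAKE_IO, FakeGpio and fake_gpio_read are ours, and only the first five registers are included, in the order that puts IDR at offset 0x10 as in the header excerpt.

```c
#include <stddef.h>
#include <stdint.h>

#define FAKE_IO volatile   /* same idea as the __IO definition */

/* Cut-down, hypothetical register block. The field order is what
   produces the address offsets; no arithmetic is needed in the code. */
typedef struct {
    FAKE_IO uint32_t MODER;    /* offset 0x00 */
    FAKE_IO uint32_t OTYPER;   /* offset 0x04 */
    FAKE_IO uint32_t OSPEEDR;  /* offset 0x08 */
    FAKE_IO uint32_t PUPDR;    /* offset 0x0C */
    FAKE_IO uint32_t IDR;      /* offset 0x10: input data register */
} FakeGpio;

/* Read one input pin (0-15) through the volatile IDR field. */
int fake_gpio_read(const FakeGpio *port, unsigned pin) {
    return (int)((port->IDR >> pin) & 1u);
}
```

On real hardware the port pointer would be a fixed peripheral base address cast to the struct type. Here any FakeGpio variable will do for testing.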

Wrapup

So that’s it. volatile simply convinces the compiler not to optimize over a variable that’s apparently constant. But until you’ve been bitten by the resulting bugs, you might not have even known just exactly what types of things the compiler is able to optimize away, and thus which variables you need to flag as volatile. We’ve hit the most common pitfalls, but if we’ve missed any situations where you’d use volatile, please let us know in the comments below.

56 thoughts on “Embed with Elliot: The Volatile Keyword”

  1. Non-volatile variables in my STM32 interrupts were confusing bugs to find, as everything worked with the debugger attached but not when running stand-alone. These two write-ups are very helpful!

    I suggest a write-up on structures and linked lists, once you have a circular linked list of structures my brain melts, lol.

    1. Oooh! Structures is on my list after pointers.

      But as for linked lists, I don’t really use ’em for embedded stuff. The problem/point of linked lists is that you allocate memory on the fly (with malloc or similar) and on a small-RAM system this can get you into unpredictable crashes unless you’re very careful.

      I _do_ want to do a post on keeping track of your free stack memory using a “stack canary” or “stack painting” method, though, which helps you diagnose that case.

      1. Memory management might be a good topic. For .Net guys like me who are used to just grabbing 100’s of megs of memory off of the heap, it would be good to review how embedded systems implement malloc, and how and where locals get stored (stack,datasegment,etc.)

        Also, you can implement a linked list in a fixed size array, if you want.

          1. >Small embedded systems usually don’t use malloc at all.

            Maybe not in the <10K of RAM world but having a heap isn't that rare once you get to applications that need tens of kilobytes of RAM.

        1. You may use a fixed length array in a similar manner to a linked list, but calling something a linked list implies things about the structure and thus access time, insertion time, ability to use pointers/references to data elements, etc. It might not be a big issue depending on the application, but it’s important to understand.

      2. I used to use linked lists on my ARM systems for LCD menus. A structure to list bump state, in and out functions, the message to display, etc. The whole menu system was assembled at compile time so no malloc was needed, but if you jumped right into full menu code without testing the skeleton first you would end up nards deep in a call cycle that would overflow the call stack and really mess stuff up.

      3. Free stack memory? I’d like this too! I’ve been bitten when trying to code even a rudimentary Web server on an Arduino + esp8266, overflowing memory and doing weird shit.
        C is my favourite language, I’ve worked with it professionally in the past, know (knew?) it quite well, but these articles you’re writing are giving me a great refresher, thanks! Although I never got into the nitty gritty of compiler level behaviours, like optimisation, never really had a need for the volatile keyword.
        Very informative, Thanks!

      4. Linked lists do not have to imply using dynamic memory allocation. For example, you could use a static memory / “resource” pool with manual management of these resource items. Basically a poor man’s malloc/free, potentially much more simplified, efficient and deterministic (e.g. only allow one type of data block, always aligned, no fragmentation).

        For example, some years ago I did a “soft-timer” library from scratch, where a #defined number of “timers” could be started, paused, deleted etc; while driven by a single HW timer.
        Each “soft-timer” object is a very basic struct. To avoid dynamic memory mgmt, a fixed static array (“resource pool”) is declared at compile time. The CreateTimer() looks through the pool for an unused struct item, initializes it and inserts it in a linked list of active timer objects – sorted on increasing time to expire. You get the idea. The active timer list is simply a pointer to the struct type, the pool of times is simply an array of such structs; that may or may not be in use. Timers that are not in use can be identified by some struct member (e.g. ticks_to_expire) having an invalid value (e.g. -1).
        At each HW tick, the leading timer’s tick-count is decreased until it expires, its callback is invoked and the object is “freed” (i.e. remains in the pool, only now flagged as no longer used). So another CreateTimer() can subsequently reuse this item again, by only initiating the struct members and inserting it in the sorted list of active objects.

        So, this could work exactly the same way using dynamic memory allocations; only difference would be that the max number of elements would be, you know, Dynamic as opposed to a compile time defined by pool size. In my timer library, some extra pre-processor clutter also allowed the entire lib to be NOP:ed automatically by simply defining MAX_NBR_OF_TIMERS to 0.

        So, static memory management is a neat trick that can allow useful stuff like lists, trees etc in a small MCU project, without the penalty and risks of fully dynamic allocations. For example, the MISRA rules discourage dynamic memory because the system becomes non-deterministic and its worst-case memory usage is often hard to analyse. And more code is needed for repeated error testing and handling e.g. for out-of-memory conditions.

        On a somewhat related note, the whole “how to fix your system to be deterministic and MISRA compliant” is a very interesting topic in itself.
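A rough sketch of the static-pool idea from this comment, not the commenter’s actual library. All names here are hypothetical, and one simplification is made on purpose: every active timer is decremented on each tick, rather than only the leading one.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_TIMERS 4          /* pool size, fixed at compile time */
#define TIMER_UNUSED (-1)     /* marks a free pool slot */

typedef struct SoftTimer {
    int32_t ticks_to_expire;          /* TIMER_UNUSED when the slot is free */
    void (*callback)(void);
    struct SoftTimer *next;           /* next timer in the active list */
} SoftTimer;

static SoftTimer pool[MAX_TIMERS];    /* the "resource pool": no malloc */
static SoftTimer *active = NULL;      /* sorted by increasing expiry */

/* Demo callback so the sketch can be exercised; counts how often it fires. */
int demo_fired = 0;
void demo_callback(void) { ++demo_fired; }

void soft_timers_init(void) {
    for (size_t i = 0; i < MAX_TIMERS; ++i) {
        pool[i].ticks_to_expire = TIMER_UNUSED;
    }
    active = NULL;
}

/* Claim a free slot and insert it in expiry order.
   Returns NULL if ticks < 1 or the pool is full. */
SoftTimer *soft_timer_start(int32_t ticks, void (*callback)(void)) {
    if (ticks < 1) {
        return NULL;
    }
    for (size_t i = 0; i < MAX_TIMERS; ++i) {
        if (pool[i].ticks_to_expire == TIMER_UNUSED) {
            SoftTimer *t = &pool[i];
            SoftTimer **link = &active;
            t->ticks_to_expire = ticks;
            t->callback = callback;
            while (*link != NULL && (*link)->ticks_to_expire <= ticks) {
                link = &(*link)->next;
            }
            t->next = *link;
            *link = t;
            return t;
        }
    }
    return NULL;
}

/* Call once per hardware tick: age every active timer, fire expired ones. */
void soft_timers_tick(void) {
    for (SoftTimer *t = active; t != NULL; t = t->next) {
        --t->ticks_to_expire;
    }
    while (active != NULL && active->ticks_to_expire <= 0) {
        SoftTimer *expired = active;
        active = expired->next;
        expired->ticks_to_expire = TIMER_UNUSED;   /* back to the pool */
        if (expired->callback != NULL) {
            expired->callback();
        }
    }
}
```

On real hardware soft_timers_tick() would be driven from the hardware timer, so anything it shares with the main loop would need the volatile and atomicity care discussed in the article and further down the thread.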

  2. +1 on the ISR example. Should either say “…an example of the *wrong way to…”, or should say “volatile uint16_t counter=0;”.

    Also, the code samples appear as HTML-entity gibberish.

    Sorry to gripe. I’m very much enjoying this series – thanks!

  3. Holy Amateur Hour Batman! All you have to explain is that the volatile emits code that references the variable value in RAM/Cache rather than using a temporary register. And volatile can be an optimization when you are register constrained. Elliot you are seriously causing some damage to the understanding of the C spec here; please leave this kind of stuff up to the real C nerds.

    1. Yeah Elliot, what were you thinking making this stuff approachable to amateurs? Next time just cut and paste the applicable sections of the C99 specification for us professionals to parse.
      Hey! You kids get off my lawn!

      1. The wording could be both precise and approachable. Similarly, a hallmark of a good programmer is the ability to write simultaneously precise and approachable code. The article could state the above and use that as the basis for why you’d use a volatile for an ISR variable. Though I do agree there is more of a need for approachable explanations than there is of precise explanation.

    1. Wouldn’t using volatile in kernel code be a different use case than using it on a microcontroller running bare metal? This series seems to be targeted for embedded programming and not Linux/kernel/OS programming.

    2. The great thing about embedded programming (especially for newbies), is that we (generally) aren’t using volatile to prevent concurrency issues, which is what that Kernel doc is warning against.

    3. Basically that article is just warning programmers to use proper locking mechanisms instead of using volatile and assuming that will provide the wanted synchronization. Lock free concurrent programming is extremely difficult to get right and requires express knowledge of the memory model of the underlying processor. So, use of volatile in that situation is just assumed to probably be wrong.

    4. What everyone else said: kernel code discourages using volatile as a concurrency mechanism in a multi-threaded system because it reduces overall optimization. That’s totally legit. Use mutexes, locks, or whatever in that case. That’s what they’re there for. (Not a kernel coder, btw.)

      But if you read your own link, you’d come down to the relevant quotes:

      “The volatile storage class was originally meant for memory-mapped I/O registers.” and “Pointers to data structures in coherent memory which might be modified by I/O devices can, sometimes, legitimately be volatile.” but substitute “ISRs” for “I/O devices”.

      And that’s actually the point of this series, to bring folks with other coding experience into the embedded context. I picked these examples precisely because they’re likely to trip up people with systems- or web-coding backgrounds who are used to entirely different idioms.

  4. Speaking of volatile and ISRs, is multi-threading common on any embedded systems? As a desktop programmer, I see volatile used for multi-threading, where the keyword means that the value will always be fetched from memory, and not placed in registers, an optimization that can really mess with a multi-threaded application.

    1. Yes, there is definitely multi-threading in embedded systems, although depending on who you ask, some may not call it “common”. I’ve worked with an RTOS on an ARM microcontroller, which pretty much lets you schedule various tasks with time intervals and priorities, aka multi-threading.

      In fact, there are even some multi-*core* microcontrollers out there, one big one being the Parallax Propeller. I haven’t personally worked with it, though, so I don’t know how memory is shared between processors and such. Maybe someone else would care to shed some light on this?

    2. Using volatile as a simple atomic variable in multithreading can be dangerous. It depends on the architecture and size of the thing whether or not the write will be atomic. It is better to utilize a mutex.

  5. “Volatile” is not the opposite of “const”. In fact, you can have something that is both – a volatile const pointer could contain a memory address of a hardware register that you cannot write to.

    1. In that case the pointer is simply const, and it’s the data to which it points that is volatile. The declaration for such a pointer reflects this, eg “volatile char * const p”: ‘volatile’ qualifies the referenced char, ‘const’ qualifies the pointer p.

      1. This is an actual real world case of const volatile usage:

        http://forum.43oh.com/topic/5235-const-infomem-flash-and-volatile-keyword/
        In particular, the compiler had optimized away his reads to flash and replaced them with hardcoded constants; so when he updated his flash it would have no effect. The volatile keyword was needed so the read from flash was forced; the const keyword is needed to enforce no writes to flash AND that the variable is not linked into RAM.

        As you can see, if you let Elliot brain damage your understanding of volatile and const you will run into the same issues spirilis did. Although Elliot’s article is well written, it is not factually correct. It is just some autodidact’s misunderstanding of how C actually works…and it’s just dangerous to anyone who really wants to learn.

        1. Your counter-example is bad, but nonetheless I should make a correction. You’re right in calling me out on being too loosey-goosey in saying that const and volatile are opposites. I went too far. Mea culpa.

          In the article, I should have stopped with “const tells the compiler to throw an error when your code attempts to change the variable”. Because as C-Nerd said, it _is_ possible to have variables that your code shouldn’t change but that nonetheless can change due to outside circumstances, and thus could be qualified volatile as well as const.

          For instance, a microcontroller reading a ROM would be a good counter-example. You might want to qualify the ROM-access pins as being read-only (const) but still warn the compiler that the data coming in on them is subject to change (volatile) as the address changes.

          (Note that qualifying the pins as const is optional, but it helps to keep you from making mistakes in your own code if you try to write to those registers later on. And since all input pins in all micros I know are already qualified volatile, you won’t need to write that out — it’ll be there. But if you want to be super-careful and super-explicit, you’d qualify the pin registers volatile and const.)

          The linked example is crappy, though, because the OP is actually writing to the memory location even though he’s declared it const. Indeed the OP finds that he needs to cast away the “const” every time he writes to it because it’s not really “const”. He’s confused, and that’s not my fault. :)

          Anyway, this is all out of scope for this article. I didn’t even really mean to bring up const, which is another kettle of fish. I should have just left it out. Bah!

          The article was about volatile, and I’ll stand by what I wrote there.

      2. Both the pointer and the pointed-to value could have any combination of volatile and const. a “const volatile char* const volatile” would be a pointer that you cannot write to but which could change out from under you, to a memory location that you aren’t allowed to write to via that pointer, but which also might change out from under you.
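The qualifier placement discussed in this sub-thread can be tried out with a short sketch. An ordinary variable stands in for the read-only hardware location, and all of the names are made up for illustration.

```c
#include <stdint.h>

/* Ordinary variable standing in for a read-only status register. */
static uint8_t fake_status_storage;

/* const pointer (cannot be re-aimed) to const volatile data
   (cannot be written through this pointer, but may change anyway). */
static const volatile uint8_t *const fake_status = &fake_status_storage;

/* Something outside this view changes the value, as hardware would. */
void fake_hardware_update(uint8_t value) {
    fake_status_storage = value;
}

/* Every call performs a fresh read because the data is volatile. */
uint8_t read_fake_status(void) {
    return *fake_status;
}
```

Writing *fake_status = 1; or fake_status = 0; anywhere in this file would be rejected by the compiler, which is the protection const adds on top of volatile.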

  6. You should bring up atomicity. Multibyte accesses are not always atomic, even if you use the ‘volatile’ keyword. You need to take care when reading multi-byte or multi-word values that are modified elsewhere. You can end up with your read being interrupted, and you’ll get a bad value. For example, in your timer example, if the counter contains 0x01FF , and the code starts a read, it might see 0x01 , then get interrupted, and the value becomes 0x0200. The interrupt returns and you read the next byte. Now you should have read 0x0200, but instead you read 0x0100.

    There are a few ways to deal with this. One is to do multiple reads and compare them for equality. Another is to disable interrupts during atomic reads.
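The read-twice-and-compare approach might be sketched like this. The helper name is hypothetical, and the sketch assumes the value changes slowly compared with how long two back-to-back reads take, so the loop settles quickly.

```c
#include <stdint.h>

/* Hypothetical helper: read a 16-bit value that an interrupt may
   update, retrying until two consecutive reads agree. */
uint16_t read_u16_consistent(const volatile uint16_t *shared) {
    uint16_t first;
    uint16_t second;
    do {
        first = *shared;    /* volatile: this read really happens */
        second = *shared;   /* and so does this one */
    } while (first != second);
    return first;
}
```

If an interrupt lands in the middle of one of the reads, the two copies disagree and the loop simply tries again, so the torn 0x0100 value from the example above is never returned.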

  7. It’s not just about reads – volatile also indicates that a write must be written to memory every time as it has side-effects, e.g. a hardware output port, where you want it to write to memory every time the variable is assigned a value.

  8. Even if you use the volatile keyword you are not guaranteed that an operation on a volatile variable is atomic. For example, in the case of the 16 bit integer that is incremented by an interrupt handler you may or may not run into problems. If you are running this on an AVR you will most likely run into problems at some point since AVR is only able to read 8 bits from memory at a time. This means that you may read one part of the variable, be interrupted by the interrupt handler, and continue reading the other, updated part of the variable. If you are programming on a bare metal system the standard solution is to disable interrupts while you read the counter in this case and reenable interrupts right after you have read the counter. However, if you are using an operating system you are much better off using whatever synchronization primitives that the operating system supplies.

    Another note that may save someone some debugging time: Even if you use volatile, the C compiler basically only guarantees that a store instruction will be run. This doesn’t guarantee that the value will actually be saved to main memory. This is mostly a problem if you try to do DMA in a system where the DMA is not cache coherent.

    However, it can also be a problem in situations where you are trying to implement lockless synchronization mechanisms where more than one volatile variable is involved. As a demonstration I implemented Peterson’s algorithm (which is a classic algorithm that implements a mutex) using volatiles for all shared data. However, the program didn’t actually work correctly since the memory ordering of the processor I used (Intel Core i5) had some relaxations that Peterson’s algorithm doesn’t take into account. However, by introducing memory barriers at suitable locations the algorithm could be made to work. (If you are interested in this there is a nice blog entry (not written by me) about this at http://bartoszmilewski.com/2008/11/05/who-ordered-memory-fences-on-an-x86/ )

    TL;DR: Don’t do ad-hoc synchronization if you can avoid it. Use standard locking primitives instead, if at all possible.
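For the bare-metal case described in this comment, the disable-interrupts pattern looks roughly like the sketch below. The fake_irq_ functions are stand-ins so that it compiles on a desktop. On an AVR they would be cli() and sei(), or you could reach for avr-libc’s ATOMIC_BLOCK macro, which also restores the previous interrupt state.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the platform's interrupt control. */
static int fake_irq_enabled = 1;
static void fake_irq_disable(void) { fake_irq_enabled = 0; }
static void fake_irq_enable(void)  { fake_irq_enabled = 1; }

/* Incremented by an ISR on real hardware. */
volatile uint16_t shared_counter = 0;

/* Copy the counter while nothing can interrupt the two-byte read. */
uint16_t read_counter_atomic(void) {
    uint16_t copy;
    fake_irq_disable();
    copy = shared_counter;
    fake_irq_enable();
    return copy;
}
```

Keeping the protected section down to a single copy means interrupts are off for only a handful of instructions. Note that this sketch re-enables interrupts unconditionally, which is only right if they were enabled on entry.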

  9. Great stuff! Thank You! I get frustrated that most of the tutorials are either extremely simple (“Hello World” or “Blink a LED”) or super in-depth and hard to follow. I’ve learned something in each of your posts and I look forward to more in the future.

  10. >But until you’ve been bitten by the resulting bugs, you might not have even known
    >just exactly what types of things the compiler is able to optimize away,

    And you shouldn’t just assume that every compiler will do the same thing once you jam volatile all over the place to try to fix it. I.e. the delay loop optimisation issue outlined in the article depends on the compiler and version.
    If in doubt disassemble the resulting code and actually check what happened.

  11. Why do you write “const is opposite to volatile”?
    volatile const int x;
    – perfectly correct and makes sense (your code cannot change x value but some external events can).
