Welcome to part three of “Interrupts: The Good, the Bad, and the Ugly”. We’ve already professed our love for interrupts, showing how they are useful for solving multiple common microcontroller tasks with aplomb. That was surely Good. And then we dipped into some of the scheduling and priority problems that can crop up, especially if your interrupt service routines (ISRs) run for too long, or do more than they should. That was Bad, but we can combat those problems by writing lightweight ISRs.
This installment, for better or worse, uncovers our least favorite side effect of running interrupts on a small microcontroller: your assumptions about what your code is doing can be wrong, sometimes with disastrous consequences. It’s gonna get Ugly.
TL;DR: Once you’ve started changing variables from inside interrupts, you can no longer count on their values staying constant — you never know when the interrupt is going to strike! Murphy’s law says that it will hit at the worst times. The solution is to temporarily turn off interrupts for critical blocks of code, so that your own ISRs can’t pull the rug out from under your feet. (Sounds easy, but read on!)
In this installment, we’ll cover two faces of essentially the same problem, and demonstrate one and a half solutions. But don’t worry. By the end of this article you should have the confidence to write interrupts without fear, because you’ll know the enemy.
Race Conditions
If you remember from our column on the volatile
keyword, when you’d like to share data between an ISR and your main body of code, you have to use a global variable because the ISR can’t take any arguments. Because the compiler can’t see that the global variables are changing anywhere, we mark them with the volatile
keyword to tell the compiler not to optimize those variables away. We should heed our own advice: variables that are accessed by the ISR can change at any time without notice.
The “obvious” pitfall when sharing variables with ISRs is the race condition: your code sets the variable’s value here and then uses it again there. The “race” in question is whether your code can get fast enough from point A to point B without an interrupt occurring in the middle.
Let’s start off with a quiz, written for an AVR microcontroller:
#include <stdint.h>
#include <avr/interrupt.h>   // ISR(), cli(), sei()

volatile uint16_t raw_accelerometer;
volatile enum {NO, YES} has_new_accelerometer_reading;

ISR(INT1_vect){   // avr-libc names this vector INT1_vect
    raw_accelerometer = read_accelerometer_z();
    has_new_accelerometer_reading = YES;
}
...
int main(void){
    while (1){
        if (has_new_accelerometer_reading == YES){
            if (raw_accelerometer != 0){
                display(raw_accelerometer);
            }
            // Flag that we've handled this sample
            has_new_accelerometer_reading = NO;
        }
    }
}
We’ve apparently already read the last Embed with Elliot, because the ISR here is short and does the minimum it needs to. All the logic and the print command are in the main loop where they won’t cause trouble for other potential interrupts. Gold star!
The problem, however, is that we see occasional zero values printed out on our display. How can that be? We only display the accelerometer value after explicitly testing that it isn’t zero.
The key word in the above sentence is “after”. Think about the worst possible time for an accelerometer update interrupt to occur. How about after testing that raw_accelerometer
isn’t zero, but before printing the value out? Yup, that would do it.
Of course, the chance of an interrupt firing off just exactly during the last instruction in the test statement is relatively small. It won’t happen that often, will it? Bearing in mind that your code runs through the main()
loop a very large number of times, if the accelerometer updates often enough, you can be sure that the glitch will happen. It will just happen infrequently, and that’s the worst kind of bug to trace and squash.
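The glitch is easy to replay on a desktop compiler if we call the “ISR” by hand at exactly the wrong moment. What follows is a minimal host-side sketch, not AVR code: fake_isr(), check_then_display() and the sentinel value are hypothetical stand-ins that we made up for illustration.

```c
#include <stdint.h>

/* Host-side stand-ins (hypothetical, for illustration only). */
static volatile uint16_t raw_accelerometer;
static uint16_t last_displayed;

void display(uint16_t value) { last_displayed = value; }

/* Pretend the accelerometer interrupt fires and delivers `sample`. */
void fake_isr(uint16_t sample) { raw_accelerometer = sample; }

/* The main-loop body from the quiz, with the interrupt injected
   between the zero test and the use of the value. */
uint16_t check_then_display(uint16_t first, uint16_t second)
{
    fake_isr(first);
    last_displayed = 0xFFFF;            /* sentinel: nothing shown yet  */
    if (raw_accelerometer != 0) {
        fake_isr(second);               /* interrupt strikes here       */
        display(raw_accelerometer);     /* re-reads the shared variable */
    }
    return last_displayed;
}
```

Calling check_then_display(123, 0) passes the zero test with 123 and then displays 0, which is the “impossible” output from the quiz.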
Shadow Variables: A Tentative Solution
Here’s a trial solution:
int main(void){
    uint16_t local_accelerometer_data = 0;
    while (1){
        if (has_new_accelerometer_reading == YES){
            local_accelerometer_data = raw_accelerometer;
            if (local_accelerometer_data != 0){
                display(local_accelerometer_data);
            }
            // Flag that we've handled this sample
            has_new_accelerometer_reading = NO;
        }
    }
}
See what happened there? We added a variable that isn’t shared with the ISR, local_accelerometer_data
, and then we work on that copy instead of the variable that the ISR is able to change. Now both the test for zero-ness and the display
statement are guaranteed to be based on the same value.
This is only a half solution because the raw_accelerometer
variable can still change while we’re working on its local shadow copy. Here, it’s no problem, but if we were going to store our local copy back into the shared variable we’d run the risk that the raw_accelerometer
variable had changed while we were working on our local copy, and by writing our local copy back, we’d overwrite the just-changed value. This isn’t trivial, because we’d lose a data point, but at least the work done on the local_accelerometer_data
is correct for the data we had on hand at the time that we copied it.
If you’re working on a fancy-schmancy ARM machine with your luxurious 32-bit integers, this half solution might work for you. For those of us accessing shared 16-bit variables on 8-bit machines, there’s one more complication that we’ll have to cover before we present the silver-bullet solution to all of this.
Atomicity and the 8-Bit Blues
Atomicity is the property of being indivisible. (Like atoms were until the early 1900s.) When you write a single operation in a higher-level language, it can end up compiling into a single instruction in machine code, but more often than not it’ll take a few instructions. If these instructions can get interrupted in a way that changes the outcome, they’re not atomic.
The bad news: almost nothing is atomic. For instance, even something simple like ++i;
can translate into three machine instructions: one to fetch the value of i
from memory into a temporary register, one to add to the register, and a third to write the result back into memory.
The non-atomicity in ++i;
is fairly benign because the temporary register works just like the shadow variable did above. If i
is changed in memory, while the addition operation is taking place in the CPU’s registers, the memory will get overwritten by the addition operation. But at least the value stored back into i
will be the value at the start of the operation plus one.
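To make the three steps concrete, here is a host-side sketch that spells ++i out as load, add, and store, with a pretend interrupt injected after the load. The names fake_isr_increment() and run_increment() are our own inventions for this illustration, and the “interrupt” is just a function call.

```c
#include <stdint.h>

static volatile uint8_t i;   /* shared with the pretend ISR */

/* Hypothetical ISR stand-in: it also bumps the shared counter. */
void fake_isr_increment(void) { i = (uint8_t)(i + 1); }

/* ++i spelled out as load / add / store, with an optional
   interrupt injected right after the load. */
void increment_in_three_steps(int interrupt_in_middle)
{
    uint8_t reg = i;                 /* 1: load from memory       */
    if (interrupt_in_middle) {
        fake_isr_increment();        /* the ISR's update lands... */
    }
    reg = (uint8_t)(reg + 1);        /* 2: add in the register    */
    i = reg;                         /* 3: ...and is overwritten  */
}

uint8_t run_increment(uint8_t start, int interrupt_in_middle)
{
    i = start;
    increment_in_three_steps(interrupt_in_middle);
    return i;
}
```

Starting from 10, the result is 11 whether or not the interrupt fires. The ISR’s own increment is lost, but the stored value is still the starting value plus one.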
On an 8-bit machine, dealing with 16-bit or longer shared variables, much stranger stuff can happen. Accessing 16-bit variables is fundamentally different because even creating the shadow variable version of the variable isn’t atomic, taking one instruction per byte, and is thus interruptible. To see this, we’ll need to turn to a bit of assembly language, and think back to making a local copy of our shared 16-bit raw_accelerometer
variable.
For this example we’re using avr-gcc
and it comes with a whole slew of tools. One such tool is avr-objdump
which takes the compiled machine code and disassembles it back into “readable” assembly language. Specifically, if you’ve compiled the code using the debugging flags (-g
) then the output of something like avr-objdump -S myCode.elf > myCode.lst
returns your original code interspersed with the assembly-language version of it.
    local_accelerometer_data = raw_accelerometer;
  b8:   80 91 00 01     lds r24, 0x0100
  bc:   90 91 01 01     lds r25, 0x0101
Anyway, you don’t have to be an assembly language guru to see that what we thought was one operation takes two different assembly language instructions. The first instruction copies the lower eight bits of our raw_accelerometer
data into the r24
working register, and the second the upper eight bits.
Now where’s the worst place that the ISR could strike and change the shared raw_accelerometer
value out from under our feet? Between the two lds
instructions. When variable access or manipulation takes more than a single machine instruction, there’s yet another chance for a failure of atomicity, and for horrible glitches, caused by variables shared with ISRs. In particular, this breaks our previous shadow-variable “solution” to the race condition.
And notice just how ugly this is. When the interrupt hits between the two lds
commands, the copy of the variable includes the low bits from one reading, and the high bits from an entirely different value. It’s a hybrid of the two values, a number that never was meant to be. And it can arise any time you use a 16-bit variable that’s shared with an ISR.
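A torn read like this can be imitated on any machine by storing the 16-bit value as two separate bytes and letting the “interrupt” land between the two loads. This is a host-side sketch under that assumption; fake_isr_store() and torn_read() are hypothetical helpers, not part of the demo code mentioned in this article.

```c
#include <stdint.h>

/* The shared 16-bit value as an 8-bit CPU sees it: two bytes in RAM. */
static volatile uint8_t shared_lo, shared_hi;

/* Pretend ISR: writes a complete new 16-bit sample. */
void fake_isr_store(uint16_t sample)
{
    shared_lo = (uint8_t)(sample & 0xFF);
    shared_hi = (uint8_t)(sample >> 8);
}

/* Copy the shared value byte by byte, like the two lds instructions,
   with the interrupt landing between them. */
uint16_t torn_read(uint16_t old_value, uint16_t new_value)
{
    fake_isr_store(old_value);
    uint8_t lo = shared_lo;          /* first lds: low byte of the old value  */
    fake_isr_store(new_value);       /* interrupt strikes here                */
    uint8_t hi = shared_hi;          /* second lds: high byte of the new one  */
    return (uint16_t)((hi << 8) | lo);
}
```

If the reading steps from 0x00FF to 0x0100 at the wrong moment, torn_read(0x00FF, 0x0100) returns 0x01FF, a value the accelerometer never produced.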
We’ve cooked up some example code to demo the phenomenon for you. It will run on any AVR microcontroller, including an AVR-based Arduino. The code is written to blink an LED every time a corrupted value is discovered, and it ends up looking like a firefly convention even though we’ve throttled the interrupt speed down to one interrupt every 65,536 CPU cycles. We’re going to need to fix this!
Arduino Aside
The worst failure of atomicity above is caused by sharing a 16-bit variable with the ISR on an 8-bit machine. If we only needed to read eight bits from the accelerometer, the “shadow variable” solution would have worked just fine. We’re reiterating this here because we see a lot of 16-bit int
s used in Arduino code when a shorter data type would suffice.
And we’re not blaming the Arduino users — most of the built-in Arduino example code uses int
for everything, including pin numbers. The 16-bit int
on an AVR Arduino has a range from -32,768 to 32,767. It’s probably good future-proofing for when Arduinos have more than 255 pins, but we can’t imagine needing to access pin number -32,000, whatever that would mean. The rationale behind just using int
for everything, we suppose, is that learning about different data types is tough.
If it were only a matter of wasting RAM or CPU cycles, it’d be OK. But using int
s by default means that you’re introducing all sorts of non-atomicity into your code without knowing it. As soon as you add in interrupts into the mix, as you can see here, it’s a recipe for disaster: don’t use 16-bit numbers on an 8-bit machine unless you need to.
So how does anything ever work on Arduino? The libraries (mostly) aren’t written in this sloppy / naïve fashion (any more). In fact, we’ll take apart the millis()
function once we’ve solved our atomicity problems once and for all. You’ll see that it does exactly the right thing.
Finally, True Atomicity
To re-recap: variables that are shared with an ISR can change without notice, and making a local copy of the variable only half-solves the problem because local copies only work when the operation to make the copy is atomic. How can we make sure that interrupts aren’t changing our variables while we’re not looking? The answer is shockingly simple: turn the interrupts off.
In fact, we didn’t even need to worry about making the shadow local variable at all if we were willing to just turn interrupts on and off again.
int main(void){
    while (1){
        if (has_new_accelerometer_reading == YES){
            cli();    // clears the global interrupt flag
            if (raw_accelerometer != 0){
                display(raw_accelerometer);
            }
            // Flag that we've handled this sample
            has_new_accelerometer_reading = NO;
            sei();    // re-enables interrupts
        }
    }
}
Simple enough, and it works. Heck, on the AVR, the interrupt disable and enable commands each translate directly into a single machine instruction that takes just one cycle. Hard to beat that. Unless you’re disabling interrupts for a relatively long time.
But then we could use the local variable copy trick:
int main(void){
    uint16_t local_accelerometer_data = 0;
    while (1){
        if (has_new_accelerometer_reading == YES){
            cli();    // clears the global interrupt flag
            local_accelerometer_data = raw_accelerometer;
            sei();    // re-enables interrupts
            if (local_accelerometer_data != 0){
                display(local_accelerometer_data);
            }
            // Flag that we've handled this sample
            has_new_accelerometer_reading = NO;
        }
    }
}
Now the assignment of the local variable is all that needs protecting, and so the interrupt is only out of action for a couple cycles. That’s pretty awesome. (Again, remember the caveat about the underlying “raw” data getting out of sync with the local copy.)
Arduino’s Millis
One thing our code above didn’t do was to check if interrupts were set in the first place. We simply assumed that interrupts were on, and after we were done with our critical section, turned them back on. But if they weren’t on in the first place, then we’ve changed something unintentionally. The solution is to record the value of the status register, SREG
, which contains the global interrupt enable bit, and restore it after the critical section is over.
As promised, here is the millisecond counter routine from the Arduino library. (It’s in “wiring.c” if you’re interested.)
unsigned long millis()
{
    unsigned long m;
    uint8_t oldSREG = SREG;

    // disable interrupts while we read timer0_millis or we might get an
    // inconsistent value (e.g. in the middle of a write to timer0_millis)
    cli();
    m = timer0_millis;
    SREG = oldSREG;

    return m;
}
Note that it does the right things for just the right reasons — this is why your Arduino timer code actually works. First, it copies over the value of SREG
which contains the old global interrupt enable bit. Then it creates a local copy of the current (ISR-shared) milliseconds counter. Finally, it restores the SREG
and returns the number of milliseconds. Well done.
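The difference between “cli, then sei” and “save, cli, then restore” only shows up when interrupts were already off on entry. Here is a small host-side sketch with a mock status register; SREG_mock, mock_cli() and mock_sei() are hypothetical stand-ins for the real SREG, cli() and sei(), and we assume the global interrupt enable flag sits in bit 7 as it does on the AVR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Host-side mock of the AVR status register (illustration only). */
#define I_BIT 0x80u                     /* global interrupt enable bit */
static uint8_t SREG_mock;
static void mock_cli(void) { SREG_mock &= (uint8_t)~I_BIT; }
static void mock_sei(void) { SREG_mock |= I_BIT; }

/* Critical section that blindly re-enables interrupts afterwards. */
bool interrupts_on_after_cli_sei(bool enabled_before)
{
    SREG_mock = enabled_before ? I_BIT : 0;
    mock_cli();
    /* ... critical work ... */
    mock_sei();
    return (SREG_mock & I_BIT) != 0;
}

/* Critical section in the style of millis(): save, clear, restore. */
bool interrupts_on_after_save_restore(bool enabled_before)
{
    SREG_mock = enabled_before ? I_BIT : 0;
    uint8_t old_sreg = SREG_mock;
    mock_cli();
    /* ... critical work ... */
    SREG_mock = old_sreg;
    return (SREG_mock & I_BIT) != 0;
}
```

With interrupts off on entry, the first version hands them back switched on, while the second leaves them off, just as it found them.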
Atomic Blocks
Making a general-purpose solution for protecting critical sections like this turns out to be a little bit tricky. At a minimum, we’d like to copy over the SREG
contents as is done in millis()
. The AVR library has a “util/atomic.h” header that defines the ATOMIC_BLOCK
wrapper that basically does that for us.
There are two possible arguments, ATOMIC_RESTORESTATE
and ATOMIC_FORCEON
, and they correspond to the two cases where the block first figures out if the global interrupt flag was set beforehand, or just assumes that it was and sets it at the end, respectively.
And ATOMIC_BLOCK
is actually cleverer than we’ve discussed so far. It includes a code trick that allows you to use return
or break
statements inside the block, and it still takes care of re-setting the global interrupt flag for you.
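As far as we understand it, the trick is a for loop that runs exactly once, combined with GCC’s cleanup variable attribute, which calls a handler whenever the guarded variable goes out of scope. The sketch below imitates that idea on a desktop GCC or Clang with a mock status register. MOCK_ATOMIC_BLOCK, SREG_mock and the helper functions are our own hypothetical names, not the library’s, and the real header differs in its details.

```c
#include <stdint.h>

#define I_BIT 0x80u                    /* global interrupt enable bit */
static uint8_t SREG_mock = I_BIT;      /* mock status register, interrupts on */
static uint8_t timer_ticks = 42;       /* pretend ISR-shared data */
static uint8_t i_bit_inside;           /* I bit as seen inside the block */

/* Cleanup handler: called when `saved` leaves scope, whether by
   falling off the end of the block, break, or return. */
static void restore_sreg(uint8_t *saved) { SREG_mock = *saved; }

/* Clear the I bit and return 1 so the for loop body runs exactly once. */
static uint8_t disable_and_return_one(void)
{
    SREG_mock &= (uint8_t)~I_BIT;
    return 1;
}

#define MOCK_ATOMIC_BLOCK                                                   \
    for (uint8_t saved __attribute__((cleanup(restore_sreg))) = SREG_mock, \
                 todo = disable_and_return_one();                           \
         todo; todo = 0)

uint8_t read_ticks(void)
{
    MOCK_ATOMIC_BLOCK {
        i_bit_inside = SREG_mock & I_BIT;
        return timer_ticks;            /* early return; restore_sreg still runs */
    }
    return 0;                          /* not reached */
}

uint8_t i_bit_was_clear_inside(void) { return i_bit_inside == 0; }
uint8_t interrupts_enabled(void)     { return (SREG_mock & I_BIT) != 0; }
```

After read_ticks() returns, the mock interrupt flag is back on even though the function left the block through a return statement.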
As an example, here’s Arduino’s millis()
re-written to take advantage of the standard GCC AVR library:
unsigned long millis(){
    ATOMIC_BLOCK(ATOMIC_RESTORESTATE){
        return timer0_millis;
    }
}
That’s a lot clearer than the original, in our estimation. Using ATOMIC_BLOCK
saves you a lot of hassle, and makes the code easier to read to boot. Indeed, if you want to save the call overhead, you can simply use timer0_millis
yourself from your main routine as long as you wrap it in an ATOMIC_BLOCK
:
ATOMIC_BLOCK(ATOMIC_RESTORESTATE){
    if (timer0_millis > alarm_time){
        do_whatever();
    }
}
The Last Possible Refinement
If you’re one of those optimizer-type people, you’ll have noticed that only one particular ISR shares our variable with the critical section. Instead of disabling all interrupts, you could imagine disabling only the particular interrupt that’s causing the trouble.
This will also get us atomicity, and we can’t think of any reason not to do so except for code complexity. It means giving up on the one-size-fits-all ATOMIC_BLOCK
-style global interrupt manipulations, but it’s probably worth it if you’ve got real-time constraints on other interrupts that you don’t want to block. Nonetheless, we’ve seen a lot of global-interrupt disabling in practice. Would any readers care to chime in with some real-world examples where only specific interrupts were disabled to protect a critical atomic section?
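For the curious, here is roughly what masking a single interrupt source could look like, written as a host-side sketch. We assume an ATmega328P-style part where the INT1 enable bit lives in the EIMSK register; EIMSK_mock, the bit numbers and the helper functions are stand-ins so that the code runs on a desktop. Only the one enable bit is restored, so changes that other (still active) ISRs make to the rest of the mask are not clobbered.

```c
#include <stdint.h>

/* Mock of the external interrupt mask register (illustration only). */
#define INT0_BIT 0u
#define INT1_BIT 1u
static uint8_t EIMSK_mock = (1u << INT0_BIT) | (1u << INT1_BIT);

static volatile uint16_t raw_accelerometer = 0x1234;
static uint8_t mask_seen;              /* mask value during the copy */

uint16_t copy_with_int1_masked(void)
{
    const uint8_t int1 = (uint8_t)(1u << INT1_BIT);
    uint8_t was_enabled = EIMSK_mock & int1;

    EIMSK_mock &= (uint8_t)~int1;      /* mask only the INT1 source     */
    mask_seen = EIMSK_mock;
    uint16_t local = raw_accelerometer; /* two-byte copy, safe from INT1 */
    if (was_enabled) {
        EIMSK_mock |= int1;            /* put just that bit back        */
    }
    return local;
}

uint8_t mask_during_copy(void) { return mask_seen; }
uint8_t mask_now(void)         { return EIMSK_mock; }
```

During the copy only INT0 remains enabled in the mock register, and afterwards both bits are set again.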
Conclusion
This has been an epic trip through the topic of interrupts, and we hope you liked it. Interrupts are the most powerful tool in the microcontroller arsenal because they directly answer the number-one concern with microcontroller applications: interfacing with real-world peripherals. Mastering interrupts brings you from a microcontroller beginner fully into the intermediate camp, so it’s worth it.
Interrupts also introduce a degree of multitasking, which brings along with it such issues as scheduling and priorities (the bad) and atomicity failures and race conditions (the ugly). Knowing about these issues is half the battle. Are there any other big pitfalls that come with using interrupts that we’ve missed? What are your favorite interrupt horror stories? Post up in the comments.
Thanks! I had run into 16 bit variable issues in ISRs before (and had worked around them in different ways depending on the case), but I had not encountered the ATOMIC_BLOCK macro before. That looks to be quite handy!
Cheers
Where is like button for articles? Elliot does good job with his articles.
+1 – I always look forward to his articles!
Agreed, these are so relevant to me right now it’s scary…Just need one on timers *cough cough Elliot* :). Equation like finding the time is 1/[ (sysclock frequency)/(prescaler)/(if8bit->256, if16bit->65536) ]. Took me forever to get that lol.
These posts are really good! Thanks Elliot.
So is the Main reason atomic_block isn’t used for all 16-bit shared variables instruction length? (I.e it calls about 4 extra assembly codes to read and reset the interrupt register)
Is there a way to do this automatically in the compiler, if only shared variables that occur both inside and outside ISRs are affected?
has_new_accelerometer_reading = NO;
This should probably be done directly before/after copying the value, to prevent losing additional samples that occur during the display step…
In your last code sample you should move ‘has_new_accelerometer_reading = NO;’ before ‘sei();’, otherwise a data point could be missed.
I didn’t make any attempt here to avoid missing an accelerometer data point — this code just drops observations if they’re coming in too fast to be handled.
The “right” way to solve that problem gets us off to another topic: buffering. Other folks in the comments have picked up on this too. I’ll write that up next week.
A well put together article.
Very interesting, useful article. But would you mind getting rid of the &gt; in ‘if (timer0_millis &gt; alarm_time){’ in your last code example?
Oops, I meant you need to fix the &gt; to a >
meh, just use a mutex, problem solved
A mutex requires an operating system.
no it doesnt, its just a semaphore, build your own
If you were to leave the interrupt enabled and ‘just use a mutex’ then you’ll likely run into a deadlock situation. If the application takes the mutex and is then interrupted, the ISR will attempt and fail to gain the mutex. At this point, the ISR can either drop the data and return or it can wait for the mutex. Waiting for the application (which is ‘paused’ during the ISR) to return the mutex is going to lock up the system.
Buffering a serial stream in an IRQ is a solved problem, with well-known solutions using circular queues.
If you need more advanced RTOS features, use or learn to use a proper RTOS. In either case you’ll need to learn something new.
This!
Next installment will be on (circular) buffers, so stay tuned.
As another exercise for readers consider how to handle the input from a gyroscope. The data is the change in orientation and you must accumulate the changes to get the orientation. You cannot miss a data point. How does the final solution here help with that situation?
I suspect, on first thought, I would have the ISR do the little bit of math x+=gyro(x);y+=gyro(y) etc where x,y,z,k… are the largest that can be accessed atomically; then allow the main code to access those globals in read-only mode. If the main loop needed some local copy, it can keep its own copies that are updated from the ISR used globals. Without testing, or dragging out my notes from the dragon book, that should prevent the main loop copies from getting erroneous values. Plus, the main loop could sanity check the values if needed, as short int heading readings shouldn’t be outside of some 0-360 type range.
The “dragon book”? Care to elaborate?
Google down again?
https://en.wikipedia.org/wiki/Dragon_Book
(I’ll leave it as an exercise for the reader to decide which one we are talking about here).
>have the ISR do the little bit of math
BZZ, never ever do that
The ISR shouldn’t be doing any math at all. At most it should buffer the raw data and let your regular code handle the math. It is a lot like buffering serial data.
You have also opened yourself to another issue with math libraries, or any libraries you call from both the IRQ and your main code. The libraries had better be compiled with reentrancy in mind if they use any internal storage.
just use the dummy variable approach, assuming that the main code runs fast enough, otherwise a fifo buffer?
Forget about interrupts and use polling, most elegantly using a multi-core CPU (e.g. Propeller, XMOS).
that works until you are out of cores…
Never used a Propeller, and I really should some day. They really sound like fun.
How do the multiple cores deal with shared resources, like common I/O ports?
For simple applications like reading an ADC or accelerometer value and displaying it there’s another alternative that goes against your “Small interrupt service routine” advice – you do everything with interrupts.
e.g. ADC has new value, generates an interrupt that stores the value. Asynchronously a free running hardware timer expires that generates an interrupt and displays the current value. The main loop is empty, the interrupts are not prioritised. Well… they are if you only have one ISR vector. It is dependent on the order of testing for which interrupt occurred at the beginning of the ISR.
There are two variables shared between the main and interrupt contexts in the primary example above — has_new_accelerometer_reading also needs to be volatile.
The other thing to be wary of is that enum types are also 16 bits in this environment (unless specific compiler flags are used) and as such loads and stores from/to enums types are not atomic. As long as one of the bytes is *always* a fixed value, as in the examples above, this will work, but it is not safe to treat enum types as atomic in the general case.
I never used enums for that. If I needed flags, I just make a char or short and use its bits as flags, which can be named, set or cleared the same way register flags are used. This way if I need 8 different flags, I don’t need 8 variables, but only one. I learned about this method in mikroPascal, while programming a resource-limited microcontroller.
Oh, regarding bit-flags! On the AVR8 micros there’s a neat way to get some really efficient optimizations of bit fiddling. If you can find an unused, available HW register at an address lower than 0x20, you (and/or the compiler) can use single instructions CBI/SBI to set and clear individual flags; single instruction atomic access, no shifting and masking.
This can potentially save quite a bit of code space and exec clks, if you do a lot of flag waving between ISRs or ISR<->main.
Some of the newer ATmegas added some “General Purpose Registers” in the SFR space for such purposes, but even with older MCUs you can generally find some register that’s not being used in your particular application (e.g. UART or I2C baud-rate or data registers, eeprom address or data register etc). Just make sure it’s a peripheral unit that’s not being – and not going to be – used, that you have full R/W access to the bits and that there’d be no unintended HW effect of using them.
The other shared variable: True. Oops. Will fix.
Enums: I always compile with the 8-bit enums flag in AVR-GCC, but you’re right in general that enums are normally the size of an
int
in C, so 16 bits on AVR.
CFLAGS += -funsigned-char -funsigned-bitfields -fpack-struct -fshort-enums
FTW.
I explicitly specify the size of my variables, e.g. uint8_t for unsigned 8-bit, instead of trying to figure out what size an enum is on a certain compiler, to avoid surprises. Declaring the variables using the enum type, while “nice”, abstracts the implementation away and can cause issues here if the compiler decides to use 16 bits…
Can ISR change SREG? If yes, what if in the function millis() the ISR changes the contents of SREG after “oldSREG = SREG”?
Not only can ISRs change SREG, they often do. SREG contains all of the carry and overflow information in an operation. The idea that an ISR should not screw up the main code is exactly why the compiler will generate a prologue and an epilogue when you call an ISR which saves SREG and then disables global interrupts, and reverses it at the end. Otherwise many operations would have a result that affects SREG and can in turn affect the main code, not just the value but also things like loops which often rely on instructions which check to see if the Zero flag is set or not.
There is a bit of control over this behaviour. You can declare ISR(interrupt, ISR_NOBLOCK) when defining an interrupt and it will not clear the global interrupt flag, allowing your interrupt to be overridden by a higher priority interrupt. Or you can declare ISR(interrupt, ISR_NAKED) and the compiler will not generate any prologue and epilogue code. This is quite dangerous behavior and would lead to exactly the problem you are describing. But in general this shouldn’t be done unless you’re really confident in what you are doing.
Thanks!! You just saved me a lot of trouble! Honestly, I completely forgot about the arithmetic flags set in ISR vs. loops in main code. The AVR datasheets warn about saving status register in a general sense but this is a special case for terrible bugs.
Does Dekker’s algorithm help at all ?
For soc’s we’ve designed in the past, a register is added called Irq_suppress. You write a cycle count into it and for N cycles after that, interrupts are blocked. Once N counts down, they go back to their prior unblocked state. Useful for small critical reg manipulation.
I don’t think the Last Possible Refinement will always work. CLI disables interrupts, but will still “queue them up”. After SEI is called the appropriate ISRs still get called.
Disabling a specific interrupt will cause missed interrupts, if the interrupt (would have) occurred between disabling and re-enabling the specific interrupt.
In some cases this won’t matter, while in others it could be problem.
It depends on the architecture of the processor. I have not dealt too much with avr interrupts but I have spent a fair amount of time fiddling with pic interrupts.
On a pic 18 you have a global enable and global flag along with enable and flag for the individual interrupt sources. When the enable flag for an interrupt source is disabled the triggered flag will still get set. When the enable flag is raised if there is a pending interrupt for that source it will fire the isr. You have to be a little bit careful about enabling an interrupt if you have been using a device in polling mode before switching to interrupts.
Dear Elliot,
Thank you very much for the article. Disabling all the interrupts may distort other important processes. A simple semaphore may help. Please review the attached modifications.
Best regards,
Miguel.
*****
volatile uint16_t raw_accelerometer;
enum {NO, YES} has_new_accelerometer_reading;
ISR(INT1_vector){
if(has_new_accelerometer_reading == NO){ // sampling is allowed (green light)
raw_accelerometer = read_accelerometer_z(); // read a sample
has_new_accelerometer_reading = YES; // sampling must wait (red light)
}
}
…
int main(void){
has_new_accelerometer_reading = NO; // allow sampling (green light)
while (1){
if (has_new_accelerometer_reading == YES){ // a sample is available
if (raw_accelerometer != 0){
display(raw_accelerometer);
}
// Flag that we’ve handled this sample
has_new_accelerometer_reading = NO; // allow further sampling (green light)
}
}
}
This is great write-up. I’ve been bit by almost everything in here at one time or another. I’m glad to have this as a resource to point to when my friends are having similar problems.
main loops are so 2077
Note: Elliot seems to have plenty of embedded experience. There are many many ways to write these small pieces of code such that they will corrupt the buffer structure when e.g. interrupts fill the buffer and user-code empties it. Elliot simply writes it in a way that requires no locks.
Thanx for this review, Elliot.