Stress-Testing An Arduino’s EEPROM

Every time one of us flashes an Arduino’s internal memory, a nagging thought in the backs of our minds reminds us that, although everything in life is impermanent, nonvolatile re-writable memory is even more temporary. With a fixed number of writes until any EEPROM module fails, are we wasting writes every time we upload code with a mistake? The short answer is that most of us shouldn’t really be concerned with this unless we do what [AnotherMaker] has done and continually write data until the memory in an Arduino finally fails.

The software for this is fairly simple. It writes zeros to the first 256 ints, reads them back to make sure they were all stored, and then repeats the process with ones. After millions of iterations running continuously over the course of about a month, he finally got his first read failure. Further writes past this point only accelerated the demise of the memory module. With this method he was able to get nearly three million writes before the device failed, which is far beyond the tens or hundreds of thousands typically estimated for a device of this type.
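
The loop described above can be sketched as a small simulation. This is not [AnotherMaker]'s code: `SimulatedEEPROM`, its per-cell write budget, and the `stress_test` function are hypothetical stand-ins, and the model assumes a worn-out cell simply keeps its previous contents.

```python
import random


class SimulatedEEPROM:
    """Toy EEPROM: a cell keeps its old value once its write budget is used up."""

    def __init__(self, size, endurance, seed=0):
        rng = random.Random(seed)
        self.cells = [0xFF] * size
        # Hypothetical per-cell budgets scattered around the nominal endurance.
        self.budget = [int(endurance * rng.uniform(0.9, 1.1)) for _ in range(size)]

    def write(self, addr, value):
        if self.budget[addr] > 0:
            self.budget[addr] -= 1
            self.cells[addr] = value
        # A worn-out cell silently ignores the write.

    def read(self, addr):
        return self.cells[addr]


def stress_test(eeprom, length, patterns=(0x00, 0xFF), max_passes=10_000_000):
    """Return the number of complete passes before the first read-back mismatch."""
    for n in range(max_passes):
        pattern = patterns[n % len(patterns)]  # alternate zeros and ones
        for addr in range(length):
            eeprom.write(addr, pattern)
        for addr in range(length):
            if eeprom.read(addr) != pattern:
                return n
    return max_passes
```

Because the pattern alternates on every pass, a cell that stops accepting writes is caught on the very next read-back, which is why the pass count at failure matches the weakest cell's budget in this model.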

To prove this wasn’t an outlier, [AnotherMaker] repeated the test, and did a few others while writing to a much smaller amount of memory. With this he was able to push the number of cycles to over five million. Assuming the Arduino Nano clone isn’t using an amazingly high-quality EEPROM we can safely assume that most of us have nothing to worry about and our Arduinos will be functional for decades to come. Unless a bad Windows driver accidentally bricks your device.

Thanks to [morgan] for the tip!

29 thoughts on “Stress-Testing An Arduino’s EEPROM”

  1. That’s not surprising. The first page of the datasheet lists 10,000 cycles for the flash (i.e. where the program is stored) and 100,000 for the EEPROM (what was tested here). However, that is with a data retention of 100 years at 25 degrees C. So 20 – 50 times that with a retention measured in milliseconds is hardly surprising.

        1. I believe the thought process behind alternating like that was to ensure that every pass was an actual write. Writing a zero on top of a zero is not the same as changing it to a one as far as the memory is concerned.

        2. Random numbers would only stress the registers in the processor and the RAM, which are both fairly unbreakable. I’d be interested in knowing the actual reason for the failure in this specific test, but EEPROM tends to fail faster when you flip the bits really fast. A random number would decrease the signal frequency into the capacitors, reduce the heat, and reduce the frequency of any inductive voltage spikes (if there are any). From a results/conclusion standpoint, you would have no clue how many times you actually flipped a bit before it failed if you threw randoms at it.

          Random numbers are really good at masking any old data on EEPROM chips, if that is what you were thinking about.

          1. I think that bytes are always erased to 0xff prior to a write, and therefore constantly writing 0x00 to the EEprom is probably the fastest way to wear it out.

            “One” bits do not have to have electrons pulled through the isolation barrier, and therefore do not cause wear during either writing or erasing.

            I am guessing here though. The physical implementation may invert the bits during both reading and writing. The default value according to the datasheet probably shows the “unprogrammed” state. It may be mildly interesting to see the results of writing some bytes with 0 and others with 0xff (or do it bitwise) and then every thousand cycles or so check if the other state is still programmable.

            There may be gotchas with this. Before Arduino, for example, there was an EEPROM library for AVR which had an “update()” function, and if a byte was the same in EEPROM, then the byte was just skipped and not written.

            Also, such libraries are seldom optimised, and instead of checking whether a write is complete, often just the maximum delay is used in the firmware.
            If I remember well, the AVR can generate an interrupt when an EEprom write is complete, and making use of this is probably the fastest way to write blocks to EEprom.
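
The "update" behaviour mentioned above is easy to model, and it shows why a stress test has to alternate its data. This is a minimal sketch: `CountingEEPROM` is a hypothetical class, not the avr-libc or Arduino API, and it only counts how many writes would actually reach the memory.

```python
class CountingEEPROM:
    """Toy EEPROM that counts the writes that actually reach the cells."""

    def __init__(self, size):
        self.cells = [0xFF] * size
        self.physical_writes = 0

    def write(self, addr, value):
        # Unconditional write: always costs one cycle.
        self.cells[addr] = value
        self.physical_writes += 1

    def update(self, addr, value):
        # Read first; only spend a write cycle if the byte would change.
        if self.cells[addr] != value:
            self.write(addr, value)


same = CountingEEPROM(1)
for _ in range(1000):
    same.update(0, 0x00)            # only the first call changes the byte

alternating = CountingEEPROM(1)
for n in range(1000):
    alternating.update(0, 0x00 if n % 2 == 0 else 0xFF)
```

With an update-style call, a thousand identical values cost a single write cycle in this model, while the alternating pattern costs a cycle every time.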

    1. Exactly.

      There are multiple parameters that determine these things and the data sheets only describe a fixed number of cases.

      Run your Arduino (ATmega328p) at 0 to 125 Celsius and 3v3 and it’s specified to run up to 16 MHz.

      Run it at 0 to 125 Celsius and 5v0 and it’s specified to run up to 20 MHz.

      BUT! Run it at 25 Celsius and 5v5 and you’ll easily get 30 MHz and up towards 40 MHz with an external clock oscillator.

      I had one running near 50MHz at one stage but internal heat glitched it and as it was a DIP package there is no real way to get heat out quickly.

      1. As well as the all 1’s and all 0’s test maybe other data patterns should be included? And, as others have written the long term retention needs to be verified.

        1. There are no “1”s and “0”s. There are only blank and programmed. Blank is “1” (or as a byte 0xFF) and programmed is “0” so you can’t (technically) program a “1”.

          Where this matters is “erase”. You can’t erase a single bit (set it back to “1”) on its own. You can erase different “block” sizes in different memory architectures. The minimum erase block size in an architecture might be 128 bytes, for example.

          The ATmega data sheet shows that the EEPROM has 256 blocks of 4 bytes, so here is (most probably) how writing a “0” goes. You write to the memory address register, and the EEPROM manager selects the appropriate 4-byte block, copies it to a 32-bit SRAM register, and starts erasing the block (erasing is the slowest part of the process). You then copy the data to the EEPROM, which modifies the appropriate bytes in the 32-bit SRAM register and copies the 4 bytes (32 bits) back to the block that was erased. All that to write one “0”.
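
The sequence guessed at above can be written down as a toy model. This follows the comment's description, not a verified account of the ATmega's internals, which may well work differently; `PAGE_SIZE`, `program`, and `write_byte` are hypothetical names.

```python
PAGE_SIZE = 4  # bytes per block, as the comment reads the data sheet


def program(old, data):
    """Programming can only clear bits (1 -> 0); it never sets them."""
    return old & data


def write_byte(mem, addr, value):
    """Read-modify-write of the block containing addr, as described above."""
    start = addr - addr % PAGE_SIZE
    buffer = mem[start:start + PAGE_SIZE]   # copy the block to a temporary buffer
    buffer[addr - start] = value            # change only the target byte
    for i in range(PAGE_SIZE):
        mem[start + i] = 0xFF               # erase: every bit back to "1"
    for i in range(PAGE_SIZE):
        mem[start + i] = program(mem[start + i], buffer[i])


mem = bytearray([0x12, 0x34, 0x56, 0x78, 0xAA, 0xBB, 0xCC, 0xDD])
write_byte(mem, 1, 0x0F)   # rewrites bytes 0..3, leaves bytes 4..7 untouched
```

In this model the three neighbouring bytes are erased and reprogrammed along with the one that changed, which is the point the comment is making about cost.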

          A lot of EEPROM and FLASH systems have quite advanced features to make them faster, more reliable, and longer lasting, but none of these specialist features can really be afforded on a bottom-range chip with only 1024 bytes of EEPROM.

          In any case the specs are 10,000 write cycles for the FLASH and 100,000 cycles for the EEPROM and the FLASH retention failure rate is 1PPM over 20 Years @ 85 Celsius and 1PPM over 100 years @ 25 Celsius.

          For a chip in this price range you couldn’t ask for more. If you did need more capacity, however, you could attach a larger external serial memory over SPI.

        1. I don’t know if this would help. You can’t really get close to the die on common packages like DIP, QFP, QFN. From the bottom you get to the lead frame before the die and from the top you get to the bond wires before the die.

          You could try a BGA but I think it would be a lot of effort for very little gain.

          Cooling the die helps but eventually it just comes down to gate propagation delays and routing distances.

  2. “Arduino Nano clone isn’t using an amazingly high-quality EEPROM”. Aren’t clones using the same chip? Refurbished, maybe.

    There was once a company called logic green making compatible chips but it has been out of business for a long time.

  3. One interesting titbit about flash is that the hotter the silicon is while the electrons are trapped (when storing data), the longer the retention period is when the chip is powered off and stored in a freezer inside a ziplock bag with silica gel! (Yep, the colder you store the silicon, the longer it takes the electrons to leak away from the trap.)

    And since EEPROM and flash at their heart are extremely similar technologies in that they both trap electrons (ref: https://si.ventronchip.com/upfile/images/3a/20201207140944579.png ), the same titbit would be true for both.

  4. It’s the data retention time that decreases. By the time you can directly detect an error, you have long passed the cycle count for a usable retention time.
    What good is an EEPROM that holds its data for only one day, for example?

  5. Lots of people worry about what seems a limited write capability of EEPROM and Flash, but often don’t do the real calculations as to what they’re likely to use.

    At a previous job, they were insisting that they needed wear levelling in a simplistic use case. Running the numbers, the flash would’ve lasted 15 years at the expected use rates. That didn’t change their mind though…

    1. yeah this is what i was thinking too, that the worry is its own problem. i worry whenever i am doing development, each time i upload new code i expect it to produce wear. but then i look back at my git history and this project that has been actively maintained for 15 years across 2 different microcontrollers is only 436 commits, and only a fraction of those reflect flash writes. when i’m working on it, i might do a dozen writes an hour, but that’s only for a few hours. most use cases, flash endurance is “good enough”, just don’t write syslog to microsd :)
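
The "real calculations" in this thread amount to one division. A minimal sketch, using the rated figures quoted in the comments (10,000 cycles for flash, 100,000 for EEPROM) and the 436 commits mentioned above; the function names and the one-write-per-hour rate are assumptions for illustration.

```python
def years_until_worn(rated_cycles, writes_per_day):
    """Years until a single cell reaches its rated cycle count."""
    return rated_cycles / (writes_per_day * 365.0)


def fraction_used(writes_so_far, rated_cycles):
    """Share of the rated endurance already consumed."""
    return writes_so_far / rated_cycles


# Assumed use case: one EEPROM write per hour to the same location.
eeprom_years = years_until_worn(100_000, writes_per_day=24)

# Upper bound for the project above: every one of the 436 commits flashed once.
flash_share = fraction_used(436, 10_000)
```

At one write per hour the rated EEPROM figure works out to a bit over eleven years, and 436 uploads would consume under five percent of the rated flash endurance.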

  6. Seems to be a pretty useless test case when not done in a controlled environment (over the full temperature range) and with no useful information about data retention over time. And if you really have an application where the number of write cycles and retention time are critical, you’d probably start using FRAM-based memory anyway.

      1. Yip!
        When I first heard about FRAM I was really excited.
        Then I got more info.
        Wear on read..   8…(

        Now MRAM may be better, but my use case for MRAM/FRAM has faded away.

        Does anyone have compact comments or links on MRAM?

  7. If you want to write more than the rated number of cycles you can use a wear leveling algorithm. Years ago I created a very simple one that just keeps writing the datastruct prefixed with a “counter” from beginning to end of EEPROM and then wrapping around again. The counter would toggle between 0x55 and 0xAA every cycle. It’s easy to figure out where the last write location was. Another benefit to this method is that the data is atomic.
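
One way to read the scheme described above is sketched below. It is an interpretation, not the commenter's code: `RingLog` is a hypothetical class, a `bytearray` stands in for the EEPROM, and the sketch assumes the marker flips once per full lap and is written after the payload so that a record only counts once it is complete.

```python
MARKERS = (0x55, 0xAA)


class RingLog:
    """Fixed-size records written round-robin; the marker byte flips every lap."""

    def __init__(self, mem, payload_size):
        self.mem = mem                      # bytearray standing in for the EEPROM
        self.slot = payload_size + 1        # one marker byte plus the payload
        self.count = len(mem) // self.slot

    def _marker(self, i):
        return self.mem[i * self.slot]

    def _newest(self):
        """Index of the most recent record, or None if nothing was written yet."""
        first = self._marker(0)
        if first not in MARKERS:
            return None                     # still erased (0xFF)
        for i in range(1, self.count):
            if self._marker(i) != first:
                return i - 1                # the lap in progress stops here
        return self.count - 1               # the lap is complete

    def load(self):
        i = self._newest()
        if i is None:
            return None
        start = i * self.slot + 1
        return bytes(self.mem[start:start + self.slot - 1])

    def save(self, payload):
        assert len(payload) == self.slot - 1
        newest = self._newest()
        if newest is None:
            i, marker = 0, MARKERS[0]
        elif newest == self.count - 1:
            # Wrap around and flip the marker for the new lap.
            i = 0
            marker = MARKERS[1] if self._marker(0) == MARKERS[0] else MARKERS[0]
        else:
            i, marker = newest + 1, self._marker(0)
        start = i * self.slot
        self.mem[start + 1:start + self.slot] = payload  # payload first
        self.mem[start] = marker                         # marker last commits it
```

Each slot is written once per lap, so with N slots a given cell sees roughly one Nth of the saves. Writing the marker last means an interrupted save leaves the previous record as the newest one.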

  8. I used to work for a company that made a device that moved along a track. We wrote code that would record the force along the track in each direction each time it was activated. EEPROM write/erase cycles were a huge concern. We contacted Microchip and they provided, in writing, a statement that the EEPROM would last many times more than what is specified in the spec sheet. The engineers said that almost all EEPROMs will last at least 10 times what is in the spec sheet. But to protect themselves from lawsuits they greatly understate the figure. Using their new numbers we calculated that the device, even with the most use possible, would last at least 10 years in the field.

  9. Still, the point remains even with the critical commentary…. We hobbyist users have nothing to worry about uploading code. Well, at least I don’t. Upload code a few times and then the device just sits there doing its thing. … And not expensive to replace anyway if you fry your board/chip.

    1. There are cases where it DOES matter. Let’s say you want a handheld device to have multiple applications, but there isn’t enough room in program flash for all of them. “Ah!,” you say, “I can store multiple programs in the SD card that I already have for data logging, and have the chip reflash itself from that, when the user wants to change applications.” But then, “Oh,” you continue, “what about flash wear?” The difference between 10,000 cycles and 100,000 cycles then becomes significant.

  10. The most vital part of these nonvolatile memories is the temperature of the die when the erase/write cycle happens. The “MAX” in the datasheet is typically the number of cycles when hot.

    The erase/write operation is a charge transfer: as the cell wears, it gets leakier, and more charge has to be transferred for it to “stick”. Older cells take longer to charge up.

  11. Years ago, as a deterrent to cloned devices, I had the bright idea of burning out an EEPROM location in an MCU as part of initial setup and then have the code check if that location was writable. (Much the way programs on floppy disks used to check the location where a hole had been punched in the medium.).

    However, the EEPROM proved MUCH more durable than the datasheets suggested, and the millions of cycles needed to reliably destroy a location made the setup time unworkably long.

    We went back to the tried-and-true method of having Version n+1 of the product ready by the time the Version n clones hit the market!
