Filesystems designed for desktop computers are not the best bet for embedded systems. Even those who know this fragment of truth still fall into the trap and pay for it later on while surrounded by the rubble that once was a functioning project. Here’s how it happens.
The project starts small, with modest storage needs. It’s just a temperature logger and you want to store that data, so you stick on a little EEPROM. That works pretty well! But you need to store a little more data, so the EEPROM gets paired with a small blob of NOR flash, which is much larger but still pretty easy to work with. Device settings go to EEPROM, data logs go to NOR. That works for a time, but then you remember that people on the Internet are all about the Internet of Things, so it’s time to add WiFi. You start serving a few static pages with that surprisingly capable processor and bump into storage problems again, so the NOR flash gets replaced with an SD card and now the logs go there too. Suddenly you’re dealing with multiple files and want access on a computer, so a real filesystem is in order. FAT is easy, so the card grows a FAT filesystem. Everything is great, but you start to notice patches missing from the logs. Then the SD card gets totally corrupted. What’s going on? Let’s take a look at the problem, and how to reach embedded file nirvana.
How Filesystems Organize Files
You’ve had a surprise Learning Opportunity (AKA, you screwed something up). In the cold garden shed the power supply for your little device isn’t quite as reliable as you thought and it tends to suddenly shut down. It turns out FAT (and many other filesystems) aren’t especially tolerant of the kinds of faults you find in these environments. You know what is? littleFS, a BSD-licensed open source filesystem by ARM, designed for small devices.
Let’s get pedantic for a moment; what is a filesystem anyway? It’s nothing more than the way data is organized on a storage medium. For a very simple system with simple storage needs, like our original EEPROM temperature logger, there might not really be a “system” per se. EEPROMs are typically good at byte-resolution addressing, so the most direct way to store temperature data might be as simple as “each byte is a sample.” Coupled with knowledge of the sampling interval and time of first boot, that would be sufficient for a simple application. In such a system you might say that the EEPROM contains one “file”: the single list of temperature records.
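To make the “each byte is a sample” idea concrete, here is a minimal sketch in C. Everything in it is an assumption made for illustration: the EEPROM is a plain RAM array so it runs on a desktop, and the size, the one-minute sampling interval, and the function names are all made up.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical parameters; none of these come from a real device. */
#define EEPROM_SIZE        1024u  /* bytes available (assumed) */
#define SAMPLE_INTERVAL_S  60u    /* one sample per minute (assumed) */

/* Stand-in for the EEPROM: a RAM array, so the sketch runs on a PC. */
static uint8_t eeprom[EEPROM_SIZE];
static size_t  sample_count;      /* a real device must also persist this */

/* Store one temperature as one byte: whole degrees C offset by 40,
 * so the byte range 0..255 covers -40..215 C.
 * Returns 0 on success, -1 if the EEPROM is full or the value won't fit. */
int log_sample(int temp_c)
{
    if (sample_count >= EEPROM_SIZE || temp_c < -40 || temp_c > 215)
        return -1;
    eeprom[sample_count++] = (uint8_t)(temp_c + 40);
    return 0;
}

/* Read a sample back as degrees C. */
int read_sample(size_t index)
{
    return (int)eeprom[index] - 40;
}

/* No timestamps are stored: the time of a sample is implied by its
 * address, the sampling interval, and the time of first boot. */
uint32_t sample_time(uint32_t first_boot_epoch, size_t index)
{
    return first_boot_epoch + (uint32_t)index * SAMPLE_INTERVAL_S;
}
```

The address of a byte is the only metadata there is, which is exactly why this layout stops being enough once a second kind of data shows up.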
In more complex systems it wouldn’t be surprising to discover that more organization is needed. The advanced temperature logger had a small static website, a file of configuration settings, and the data logs. You could certainly keep all of these in one “file” and hard code the address offsets of each region, but that would get brittle when storage regions need to change size or if the SD card is ever read by another device. An alternative may be to store information about where each region begins and ends in a pre-specified structure somewhere on the disk. And once you have such a structure it’s not hard to imagine adding other metadata like which regions of the disk are already erased or in use, timestamps, sizes, permissions, etc. You see where this is going. Even directories could be created by adding hierarchical levels of metadata. Now things are starting to look more recognizably like a desktop filesystem. A relatively simple embedded system may not need one, but once the complexity ramps up, adding a filesystem can make storage much easier to deal with.
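The “pre-specified structure” described above might look something like the following sketch: a small table at a known location that records where each region starts and how long it is. The struct layout, the field names, and the helper functions are hypothetical and only meant to illustrate the idea, not any real on-disk format.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define MAX_REGIONS 4

/* One entry per region of the storage medium (illustrative layout). */
struct region {
    char     name[8];  /* short label, NUL-padded */
    uint32_t offset;   /* byte offset where the region begins */
    uint32_t length;   /* size of the region in bytes */
};

/* The table itself lives at a fixed, agreed-upon place on the disk. */
struct region_table {
    uint32_t      count;
    struct region regions[MAX_REGIONS];
};

/* Look a region up by name instead of hard coding its address. */
const struct region *find_region(const struct region_table *t, const char *name)
{
    for (uint32_t i = 0; i < t->count && i < MAX_REGIONS; i++) {
        if (strncmp(t->regions[i].name, name, sizeof t->regions[i].name) == 0)
            return &t->regions[i];
    }
    return NULL;
}

/* Append a region directly after the previous one.
 * Returns 0 on success, -1 if the table is full or the name is too long. */
int add_region(struct region_table *t, const char *name, uint32_t length)
{
    if (t->count >= MAX_REGIONS || strlen(name) >= sizeof t->regions[0].name)
        return -1;

    uint32_t start = (uint32_t)sizeof *t;  /* data begins after the table */
    if (t->count > 0) {
        const struct region *prev = &t->regions[t->count - 1];
        start = prev->offset + prev->length;
    }

    struct region *r = &t->regions[t->count++];
    memset(r->name, 0, sizeof r->name);
    strcpy(r->name, name);                 /* length was checked above */
    r->offset = start;
    r->length = length;
    return 0;
}
```

Resizing a region now means updating one table entry rather than recompiling every piece of code that knew the old address. It also means the table is the single most precious thing on the disk.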
Data Corruption on the SD Card
Back to our problematic temperature logger. Why did the data get corrupted? Imagine the SD card with those critical sectors of metadata. What happens if one gets damaged? Sometimes they might be recoverable, but without that all-important metadata the disk turns into so much entropy. It’s kaput.
How do you make sure that things never get corrupted? Journaling is a popular option in which changes to be written to disk are first stored in a “journal” of pending operations. If the disk becomes damaged, ideally this journal can be replayed to recover. Metadata could be written in more than one place on disk and compared, or written in a certain order with CRCs and other consistency checks themselves written in a specific order. Data could be retained in RAM until it was definitely, absolutely, for sure written to disk. An exhaustive list of options is a better fit for a PhD than this article.
It might be obvious that some of these choices would work better than others. It follows that schemes to increase durability on a desktop computer or server or phone may not be well suited to the risks inherent to an embedded device with an erratic power source. A moment’s thought is enough to realize that many of the ideas mentioned above would fail catastrophically if the power was removed in the middle of a write operation. It is possible to design filesystems with power loss in mind, but that’s a specific feature to watch out for.
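As a rough illustration of two of the ideas above (metadata kept in more than one place, and a CRC written last), here is a sketch that keeps two copies of a metadata block and simulates the power dying mid-write. It is not how littleFS or any particular filesystem does it. The struct, the two-slot scheme, and the `bytes_before_power_loss` parameter are all invented for this example, and the “disk” is just an array in RAM.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320); slow but tiny. */
static uint32_t crc32_calc(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }
    return ~crc;
}

/* Hypothetical metadata block; the field names are illustrative only. */
struct meta {
    uint32_t sequence;  /* increases with every update */
    uint32_t log_head;  /* offset of the next free log byte */
    uint32_t crc;       /* written last, covers the fields above */
};

#define META_BODY offsetof(struct meta, crc)

/* Two slots standing in for two separate sectors on the medium. */
static struct meta slots[2];

static int meta_valid(const struct meta *m)
{
    return m->crc == crc32_calc((const uint8_t *)m, META_BODY);
}

/* Return the newest slot whose CRC checks out, or NULL if neither does. */
const struct meta *meta_load(void)
{
    const struct meta *best = NULL;
    for (int i = 0; i < 2; i++) {
        if (meta_valid(&slots[i]) &&
            (best == NULL || slots[i].sequence > best->sequence))
            best = &slots[i];
    }
    return best;
}

/* Write new metadata into the slot that does NOT hold the current copy,
 * so the last good copy is never touched. Only the first
 * bytes_before_power_loss bytes reach the slot, simulating a power cut. */
void meta_store(uint32_t log_head, size_t bytes_before_power_loss)
{
    const struct meta *cur = meta_load();
    struct meta next;

    next.sequence = cur ? cur->sequence + 1 : 1;
    next.log_head = log_head;
    next.crc = crc32_calc((const uint8_t *)&next, META_BODY);

    int target = cur ? 1 - (int)(cur - slots) : 0;
    size_t n = bytes_before_power_loss < sizeof next
                   ? bytes_before_power_loss : sizeof next;
    memcpy(&slots[target], &next, n);
}
```

A torn write leaves the target slot with a body and a CRC that don’t agree, so `meta_load()` quietly falls back to the older copy and the logger loses one update instead of the whole card. Overwriting the only copy in place, which is roughly what happens to a FAT table, offers no such fallback.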
littleFS Loves Microcontroller Designs
Back to littleFS, which it turns out meets many of our criteria for a reliable small embedded filesystem. It’s durable against surprise power loss (the documentation credits this as the main focus of the project). It can wear level, in stark contrast to FAT, which is especially important on small flash devices that aren’t good for as many write cycles as ye olde spinning platter. And it supports all the static allocation and maximum RAM and ROM guarantees you want when your CPU has 16k of RAM. Plus the entire thing is a single source/header pair, with a second pair for optional utilities!
The best part of littleFS isn’t its consistency guarantees or its licensing (ok, we do really love a good licensing scheme though); it’s the documentation, of course! Crack open that header file to see what it’s all about and you’re greeted by a great level of documentation. The source file? Even better! Just the right level of comment verbosity, explaining most of the higher level logic but not every line. Honestly it probably errs on the side of too much documentation.
But that’s not all! Just when you think the documentation can’t get any better you find SPEC.md and DESIGN.md. SPEC covers the technical details. If you already know how the filesystem works, SPEC is what helps you write additional tooling and debug a raw disk. DESIGN is everything else. DESIGN is a 1200-line description of everything. The choices made to get to the final implementation. Existing designs to compare to. The theory of operation, with ASCII diagrams so good we used them to decorate this article.
Anyway, if it’s not obvious, I love this project and definitely intend to use it the next time I have storage to organize on a small embedded system. Even if you don’t intend to, if you’ve read this far, it might be worth a skim through DESIGN as a primer on how to write great documentation and what to think about when putting together a filesystem. And as always, if you’ve tried littleFS out or have another favorite uC filesystem, tell us in the comments! We’d love to hear how it worked out.