Everyone has a bad day right? Monday was a particularly bad day for the folks at Sparkfun. Customer support tickets started piling up, leading to the discovery that they had shipped out as many as 1,934 MicroViews without bootloaders.
MicroView is the tiny OLED enabled, Arduino based, microcontroller system which had a wildly successful Kickstarter campaign earlier this year. [Marcus Schappi], the project creator, partnered up with SparkFun to get the MicroViews manufactured and shipped out to backers. This wasn’t a decision made on a whim, Sparkfun had proven themselves by fulfilling over 11,000 Makey Makey boards to backers of that campaign.
Rather than downplay the issue, Sparkfun CEO [Nathan Seidle] has taken to the company blog to explain what happened, how it happened, and what they’re going to do to make it right for their customers. This positions them as the subject of our Fail of the Week column where we commiserate instead of criticize.
First things first, anyone who receives an affected MicroView is getting a second working unit shipped out by the beginning of November. Furthermore, the bootloaderless units can be brought to life relatively easily. [Nate] provided a hex file with the correct bootloader. Anyone with an Atmel AVR In-System Programming (ISP) programmer and a steady hand can bring their MicroView to life. Several users have already done just that. The bootloader only has to be flashed via ISP once. After that, the MicroView will communicate via USB to a host PC. Sparkfun will publish a full tutorial in a few weeks.
Click past the break to read the rest of the story.
So what went wrong? The crux of the problem is a common one to manufacturing: An incomplete production test. For many of their products, Sparkfun loads a single hex file containing the production test and the optiboot bootloader. The test code proves out the functionality of the device, and the bootloader allows the customer to flash the device with their own sketches. The problem is the bootloader normally connects to a PC host via USB. Enumerating a USB connection can take up to 30 seconds. That’s way too slow for volume production.
Sparkfun opted to skip the bootloader test, since all the pins used to load firmware were electrically tested by their production test code. This has all worked fine for years – until now. The production team made a change to the test code on July 18th. The new hex file was released without the bootloader. The production test ran fine, and since no one was testing the bootloader, the problem wasn’t caught until it was out in the wild.
The Sparkfun crew are taking several steps to make sure this never happens again.They’re using a second ATmega chip on their test fixture to verify the bootloader without the slow PC enumeration step. Sparkfun will also avoid changing firmware during a production run. If firmware has to change, they’re planning to beta test before going live on the production line. Finally, Sparkfun is changing the way they approach large scale production. In [Nathan’s] own words:
Moving from low volume to mid-volume production requires a very different approach. SparkFun has made this type of mistake before (faulty firmware on a device) but it was on a smaller scale and we were agile enough to fix the problem before it became too large. As we started producing very large production runs we did not realize quality control and testing would need very different thinking. This was a painful lesson to learn but these checks and balances are needed. If it didn’t happen on Microview it would have happened on a larger production run someday in the future.
Everyone has bad days, this isn’t the first time Sparkfun has lost money due to a mistake. However, they’re doing the right thing by attacking it head on and fixing not only the immediate issue but the underlying thought process which allowed the problem to arise.