The year was 1996, and the European Space Agency was poised for commercial supremacy in space. Their new Ariane 5 rocket could launch two three-ton satellites into space. It had more power than anything that had come before.
The rocket rose up towards the heavens on a pillar of flame, carrying four very expensive and very uninsured satellites. Thirty-seven seconds later it self-destructed. A rocket that had taken roughly seven billion dollars to develop rained down as RUD (rapid unscheduled disassembly) on the local beaches near the Guiana Space Centre in French Guiana, on the northern coast of South America. A video of the failed launch is after the break.
The cause of all this was a single improper type cast in a bit of code that wasn’t even supposed to run during the actual launch. Talk about a fail.
Two bits of code were involved: one that measured the sideways velocity, and one that used it in the guidance system. The measurement side stored the value in a 64-bit floating-point variable, but the guidance side used a 16-bit signed integer. The code was borrowed from an earlier, slower rocket (the Ariane 4) whose velocity would never grow too large to fit in those 16 bits. The Ariane 5, however, could be described with a Daft Punk song, and quickly overflowed this value.
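To make the mismatch concrete, here is a minimal Python sketch of narrowing a wide value into 16 signed bits. The flight software was not written in Python, and the function names below are made up for illustration. The sketch only shows the general hazard: a signed 16-bit integer tops out at 32767, so anything larger either gets mangled or has to be rejected.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def wrap_to_int16(value: float) -> int:
    """Unchecked narrowing: keep only the low 16 bits (two's complement)."""
    raw = int(value) & 0xFFFF
    return raw - 0x10000 if raw >= 0x8000 else raw


def checked_to_int16(value: float) -> int:
    """Checked narrowing: refuse values that do not fit in 16 signed bits."""
    as_int = int(value)
    if not INT16_MIN <= as_int <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a signed 16-bit integer")
    return as_int


def saturate_to_int16(value: float) -> int:
    """Clamped narrowing: pin out-of-range values to the nearest limit."""
    return max(INT16_MIN, min(INT16_MAX, int(value)))


if __name__ == "__main__":
    slow_rocket = 1000.0   # hypothetical reading that fits comfortably
    fast_rocket = 40000.0  # hypothetical reading that exceeds 32767

    print(wrap_to_int16(slow_rocket))      # 1000
    print(wrap_to_int16(fast_rocket))      # -25536, silently wrong
    print(saturate_to_int16(fast_rocket))  # 32767
    try:
        checked_to_int16(fast_rocket)
    except OverflowError as err:
        print("conversion refused:", err)
```

Which of the three behaviours you get depends on the language and on how the conversion is written. The point is that the "slow rocket" value passes through all three unchanged, so nothing looks wrong until the input range changes.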
The code that caused the overflow was actually a bit of pre-launch software that aligned the rocket. It was supposed to be turned off before the rocket fired, but since launches got delayed so often, the engineers let it keep running and time out 40 seconds into the flight so they didn’t have to keep restarting it.
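The interaction between that 40-second window and a fast-growing value can be sketched in a few lines of Python. Everything numeric here apart from the 40-second timeout and the 16-bit limit is invented for illustration: the linear ramp, the rates, and the function name are assumptions, not telemetry or the real algorithm.

```python
INT16_MAX = 32767
ALIGNMENT_TIMEOUT_S = 40  # per the article: the alignment code ran 40 s into launch


def first_overflow_second(rate_per_second: float,
                          timeout_s: int = ALIGNMENT_TIMEOUT_S):
    """Return the first whole second at which the reading no longer fits in
    16 signed bits while the alignment code is still running, or None if the
    code times out before that happens."""
    for t in range(timeout_s):
        reading = rate_per_second * t  # hypothetical linear ramp
        if int(reading) > INT16_MAX:
            return t
    return None


if __name__ == "__main__":
    print(first_overflow_second(500.0))   # None: stays below 32767 for all 40 s
    print(first_overflow_second(1000.0))  # 33: 33000 is the first value over 32767
    print(first_overflow_second(1000.0, timeout_s=0))  # None: code off at liftoff
```

The last call is the interesting one: with the alignment code switched off at liftoff, the same fast ramp never reaches the conversion at all.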
The ESA never placed blame on a single contractor. The programmers had made assumptions. The engineers had made reasonable shortcuts to make their job easier. It had all made it through inspections, approvals, and finally the launch event.
They certainly learned from the event; the Ariane 5 rocket has flown 82 out of 86 missions successfully since then. It has at least five more launches contracted before it is retired in 2023 in favor of the Ariane 6 rocket, which is being developed now. This event also changed the way critical software and redundant systems were tested, and it brought the dangers of code failure to wide public attention.
If you want to read more, there is a great discussion on Reddit which tipped us off to this fail, a quite thorough Wikipedia article, and the original article that ran in the New York Times is mirrored here.