When a computer crashes, it usually doesn’t leave debris. But when a computer happens to be descending towards the lunar surface and glitches out, that’s a very different story. Turns out that’s what happened on April 26th, as the Japanese Hakuto-R Lunar lander made its mark on the Moon…by crashing into it. [Scott Manley] dove in to try and understand the software bug that caused an otherwise flawless mission to go splat.
The lander began the descent sequence as expected at 100 km above the surface. However, as it descended, the altitude sensor reported the altitude as much lower than it was. It thought it was at zero altitude once it reached about 5 km above the surface. Confused by the fact it hadn’t yet detected physical contact with the surface, the craft continued to slowly descend until it ran out of fuel and plunged to the surface.
Ultimately it all came down to sensor fusion. The lander merges several noisy sensors, such as accelerometers, gyroscopes, and radar, into one cohesive source of truth. The craft passed over a particularly large cliff that caused the radar altimeter reading to suddenly jump by 3 km. Like good filtering software, the craft reasoned that the sensor must be returning spurious data and filtered it out. From then on it was estimating its altitude from its acceleration alone. As anyone who has tried to track an object through space using just gyros and accelerometers can attest, errors accumulate, and suddenly you’re not where you think you are.
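To make the failure mode concrete, here is a minimal sketch of a fusion loop that latches a sensor as faulty after one implausible jump. It is not ispace's flight software: the function name `fuse_altitude`, the 1 km gate, and every number in the example are assumptions chosen only to mirror the story above.

```python
def fuse_altitude(radar_readings, descent_rate, dt, start_alt, gate=1000.0):
    """Dead-reckon altitude and correct it with radar while the radar looks plausible.

    Hypothetical sketch: once a reading disagrees with the estimate by more
    than `gate` metres, the radar is latched as faulty and never used again.
    """
    estimate = start_alt
    radar_trusted = True
    history = []
    for radar in radar_readings:
        # Predict: propagate the estimate from the inertially derived descent rate.
        estimate -= descent_rate * dt
        # Gate: a large disagreement marks the radar as faulty for good.
        if radar_trusted and abs(radar - estimate) > gate:
            radar_trusted = False
        # Correct: a deliberately simplistic update that adopts the radar value.
        if radar_trusted:
            estimate = radar
        history.append(estimate)
    return history, radar_trusted


# Made-up radar track: the ground drops away by 3 km on the fourth sample.
radar_track = [7900, 7800, 7700, 10600, 10500, 10400]
history, trusted = fuse_altitude(radar_track, descent_rate=100.0, dt=1.0, start_alt=8000.0)
error = radar_track[-1] - history[-1]  # how far the estimate sits below the real height
```

In this toy run the radar is correct the whole time, yet the estimate ends up 3 km too low because the step in the terrain looked like a sensor fault.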
We know what you’re thinking: surely they would have run landing simulations to catch errors like these? Ironically, they did; it’s just that after the simulations were run, the landing site for Hakuto-R was changed. Unfortunately, nobody thought to re-run the simulations, and now the Moon has a new lawn ornament.
We’ve previously written about why lunar landings are so hard. While knowing what led to the crash will hopefully prevent a similar fate for future missions, the reality is that remotely landing a robot on a dusty world without the help of GPS is fiendishly difficult and likely will be for some time.
Change management is hard
“I wonder if it will be friends with me?”
Oh no, not again. Said the potted plant.
Sensors are hard. They have a virtually infinite number of ways they can fail. Usually this is taken into account in an onboard process called Fault Detection, Isolation, and Recovery (FDIR). I’m surprised that Hakuto-R didn’t include some kind of strikeout / retry logic to try and reset faulty sensors. However, I imagine the team at iSpace considered the powered descent environment to be too fast and too delicate to waste time trying to reset sensors and see if they come back online.
Everything usually has to be done onboard during powered descent anyway. The light lag to the moon precludes remote control and these landers usually have incredibly small bandwidth budgets if they’re maneuvering and can’t point their high-gain antenna(s) back to Earth.
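The light lag mentioned above is easy to estimate. This is a back-of-the-envelope sketch that assumes the mean Earth-Moon distance of roughly 384,400 km; the real figure varies over the lunar orbit.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458   # defined value
MEAN_EARTH_MOON_KM = 384_400.0      # approximate mean distance (assumption)


def one_way_delay_s(distance_km=MEAN_EARTH_MOON_KM):
    """Signal travel time in seconds over the given distance."""
    return distance_km / SPEED_OF_LIGHT_KM_S


one_way = one_way_delay_s()
round_trip = 2 * one_way  # see something, then have a command arrive
```

That is about 1.3 seconds each way, so a ground operator reacting to telemetry is always at least two and a half seconds behind the vehicle, before any processing time is counted.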
Wasn’t even faulty sensors. Faulty logic killed it.
Yes, but the software thought the sensor was faulty. As the previous poster indicated, the software did not reset and re-evaluate the integrity of the “faulty” sensor. This resulted in its demise.
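One way to picture the strikeout / retry idea from this thread is a gate that puts a rejected sensor on probation instead of discarding it forever. This is a hypothetical sketch, not anything from the actual lander: the class name `RetryGate` and all thresholds are invented for illustration.

```python
class RetryGate:
    """Reject a sensor on a large disagreement, re-admit it once it is self-consistent."""

    def __init__(self, gate=1000.0, max_step=200.0, readmit_after=3):
        self.gate = gate                    # max allowed |reading - estimate| while trusted
        self.max_step = max_step            # max plausible change between successive readings
        self.readmit_after = readmit_after  # consistent readings needed to end probation
        self.trusted = True
        self.consistent = 0
        self.last = None

    def accept(self, reading, estimate):
        """Return True if this reading should be used by the navigation filter."""
        if self.trusted:
            if abs(reading - estimate) > self.gate:
                self.trusted = False
                self.consistent = 0
        else:
            # On probation: count readings that agree with the sensor's own history.
            if self.last is not None and abs(reading - self.last) <= self.max_step:
                self.consistent += 1
            else:
                self.consistent = 0
            if self.consistent >= self.readmit_after:
                self.trusted = True
        self.last = reading
        return self.trusted


gate = RetryGate()
readings = [7900, 10600, 10500, 10400, 10300]
estimates = [7900, 7800, 7700, 7600, 7500]
decisions = [gate.accept(r, e) for r, e in zip(readings, estimates)]
```

A caller would also need to re-seed its altitude estimate from the sensor at the moment it is re-admitted, otherwise the very next comparison would reject it again.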
Didn’t read into the details, but “normally“ you have three sensors with the same function and you do majority voting. If all three report the same strange data, the software can assume it’s true. But hey, if only foresight was as good as hindsight.
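The three-sensor majority vote described here can be sketched as a mid-value select. The function name `vote` and the tolerance are assumptions, and nothing in the source says how many altimeters Hakuto-R actually carried.

```python
def vote(a, b, c, tolerance):
    """Return the middle reading if at least two of the three agree, else None."""
    low, mid, high = sorted((a, b, c))
    # The middle value always belongs to the closest-agreeing pair.
    if mid - low <= tolerance or high - mid <= tolerance:
        return mid
    return None  # no two sensors agree: the caller has to fall back to something else


one_outlier = vote(5000, 5010, 8000, tolerance=50)    # two agree, one is off
all_jump = vote(8000, 8010, 7990, tolerance=50)       # all three report the same "strange" value
no_agreement = vote(1000, 5000, 9000, tolerance=50)
```

Note that when all three sensors jump together, as they would over a real cliff, the vote passes the jump through rather than filtering it out.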
The problem here is that the sensor was reporting correct information. It was the interpretation of that information which was incorrect. They probably should have just thrown out a subset of samples, instead of filtering all of them. Or not trying to
‘Normally’ goes out the window in a small team like this, trying something audacious on a small budget and schedule. I know from experience.
Maybe they shouldn’t have used Google Moon…
I have the impression that even vehicle ECUs are designed better than that… going “ope, sensor bad or unplugged, will go best guess open loop, until it comes back or reads something sensible”
Some other rambling thoughts: This is also a violation of the “test like you fly, fly like you test” maxim. They made a change to the mission profile and didn’t fully test it. Hindsight is 20/20 of course, but many space missions have failed due to last-minute changes. The engineers at iSpace must have been aware of this, and they also must have been able to account for changes to their mission trajectory. The fact that iSpace didn’t simulate their final landing trajectory is especially confusing, because it means that they changed their landing site at the last minute and chose not to simulate it.
The combination of orbital dynamics and required sun angles on landing (for power, as well as proper shadows on the lunar surface if you’re using optical navigation) means that lunar missions usually accumulate delays in one month increments, but that assumes that the landing site is a constant point on the surface of the moon. Why didn’t iSpace simply accept another month of delay and try and land on their original site?
It can happen in all kinds of “sensor” scenarios. I worked in ground-based avionics on ILS and that has a requirement of shutting down when a signal is out of tolerance for only a few seconds in CAT III, especially CAT III C (zero visibility). For these critical systems, they also have something called a field monitor that is literally in the field near the runway (not on, of course). Problem is, they experience perturbations of the “signal-in-space” from landings either on the runway monitored or other runways that may be used for taxiing. They didn’t want to pay to send me on-site, so I got someone to install a test SW version (via an act of god) to collect a bunch of data for landings, including a description of those landings. A prior study by Mitre Corp. (a few years before I did it) claimed that this would need 3 HP1000 minicomputers to do the real-time analysis of the signals and make the determination. I did it as part of the data measurement cycle of an existing Intel 80C196 processor running at 11.05972 MHz. It was a non-linear filter I developed that used the original data as the test set to verify the algorithm did not “alarm” for any of the temporary anomalous conditions (including helicopters) for a total of over 1/4 million sensor measurements. My bosses (plural) were upset it took me (alone, after I got the data) 3 months to come up with a solution, but it’s been working great since 1993. You don’t know what can be done until it’s tried!
A sad mistake, but it pales in comparison to the incredible lunacy of the Mars Climate Orbiter loss in 1999 that occurred because Lockheed Martin used Imperial measurements while JPL used metric.
Surely because Lockheed Martin used Imperial measurements while the rest of the world used metric.
Indeed. Even the US army knows metric, I once read (I’m European). And the 24 hour system, too. I mean, we can think of them what we like, but they’re at least able to understand the importance of error-free communication. I wish other US institutions would be as progressive here. Nothing worse than false pride. Imperial has its place, as much as an accent/dialect has. But for interaction with others on a larger scale, it’s misplaced.
Here in Europe, we accepted English for international communication, for example. It’s been used in-country, too. Airplane pilots sometimes still use their national language to communicate with towers within their country, but English has priority. Even in countries like France or Germany. Imagine if, because of national pride, we insisted on using our national languages rather than English. Yikes! A French airplane crossing the Rhine would be greeted in German. Must be a nightmare for the pilots. 😂
Yes! And A4 paper!
“Nothing worse than false pride.”
As with most things, it’s not even remotely that simple.
The real reasons the US refuses to go metric
https://www.youtube.com/watch?v=qbdx2nOQKKo
Now just a centon…why aren’t clocks and calendars in metric? Seems like every yahren we go through the metric discussion as if there aren’t enough centars in a cycle. Maybe some centuron we will get on the same unit of measure… 😁
Interesting video. Can personally confirm that as a chemist and physician my whole world is in metric system.
One thing though is that historically the US was the dominant manufacturer of most material goods, and to this day it is a high-end manufacturer. As such, whatever system we used is ingrained: shops have millions of dollars of tooling, and refitting entirely for metric would be impossible, not to mention that they would still need imperial tooling for repairs.
With modern CNC and stuff it’s less of an issue going forward but still for historical reasons very real.
Basically people will use whatever is easiest / cheapest and I really doubt blind patriotism is a significant contributor. Less “‘Merica!” And more “I don’t want a new car, whole new set of tools in my garage, re-label every road and building sign for domestic infrastructure …” and so on
@craig, I had a ’99 car which for every part, it was a coin flip if a particular fastener would be metric or imperial. Doubled the number of tools needed for anything.
The U.S. landed “Surveyor” safely on the moon approximately 65 years ago.
Salute NASA🫡
And just how much of the Moon has it surveyed in the meantime?
B^)
I grew up in a US Army household and can confirm that I had to learn the difference between klicks (kilometers) and miles at a very young age. It was annoyingly helpful.
If you plan on taking the war elsewhere, it’s helpful to know how to read the road signs.
G-d dammit, I wish my country would just quit being so stubborn and go 100% metric!
If it’s good enough for the rest of the world, it should be good enough for us. Hey, we might even sell more stuff if we made it to metric measurements.
The units of measure issue was real, but the bigger issue was that the results of the calculations of the ground team were prioritized over conflicting results that had been performed on the spacecraft. (this was told to me by a high school friend who had very intimate knowledge – in fact, for several hours believed the crash was his fault). NASA changed procedures after the incident.
Over-Engineered. Less software/algorithms and more common sense would have been useful here.
An analog computer using Lidar wouldn’t have made this mistake, I think.
Or any other analogue solution. They have the advantage of being real-time capable. They leave no room for miscalculations, either. Same goes for mechanical solutions, maybe. Instead of hi-tec sensors, just use a long stick or a long rod of steel wire. Once it touches ground, the probe can be certain to crash soon. ;)
But it had humans at the “wheel” with visuals to the ground. Men were doing the “data fusion”. Should we send AI? I think not, at least not yet.
Wait a minute, this is space… there’s more to this issue than just running out of fuel. It should then have run out of fuel landing somewhere else higher, too.
Not knowing its altitude, it descended slowly, all the time expending fuel to provide thrust against the force of moon gravity. This can use up a lot more fuel than just spending a minimum of time landing at a known elevation.
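The cost of descending slowly can be put in rough numbers. This sketch assumes a purely vertical powered descent in which the engine cancels lunar gravity the whole way down; it ignores horizontal velocity and the change in mass, and the two descent times are made-up examples rather than mission data.

```python
LUNAR_GRAVITY = 1.62  # m/s^2, approximate surface value


def gravity_loss(descent_time_s, gravity=LUNAR_GRAVITY):
    """Delta-v in m/s spent purely on holding the craft up against gravity."""
    return gravity * descent_time_s


fast = gravity_loss(60.0)    # a brisk one-minute final descent (hypothetical)
slow = gravity_loss(300.0)   # five minutes of creeping downward (hypothetical)
extra = slow - fast          # delta-v that bought no progress toward landing
```

Every extra second of hovering costs about 1.6 m/s of delta-v, so a lander that believes it is near the ground and creeps down for kilometres burns through its reserve quickly.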
. . The glitch was, Japan made a real attempt . . You have to admire the effort
I don’t know much about spaceflight on the Moon, but here on Earth, as a pilot of gliders, there is a huge difference between your altitude “AGL” (above ground level) and MSL (mean sea level). Your MSL can say whatever it wants but if a mountain is in your way, you will impact terrain. That’s frowned upon for a number of reasons.
This spacecraft’s sensors were measuring both the AGL with the laser rangefinder and what would be the equivalent reference altitude (mean geodic? Not sure what they use for moon) via the inertial navigation system and in that sense, both were giving correct readings. Especially if there were multiple redundant systems, somewhere in the algorithm should have been a determination that terrain is the difference. The laser rangefinder info absolutely should NOT have been filtered out as inappropriate.
Also mental that even if all that were true, the guidance system would not have a reference descent and approach profile to compare the real time data to predicted. It didn’t know it was going to overfly a cliff? Even for a sunny day glider flight (or SEL) you look at the proposed terrain. Even the Apollo craft did this with the radar and inertial systems being referenced to expected glide slope. To not do even such basic preflight for terrestrial flights is inexcusable. For high stakes moon landing that is truly baffling and inviting exactly this outcome. It was inevitable.
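The AGL-versus-reference-altitude point above can be sketched as a consistency check against an expected terrain profile. Everything here is hypothetical: the function name, the tolerance, and the idea that the lander carried a terrain height for each point along its track are illustrative assumptions.

```python
def classify_radar_jump(inertial_alt, radar_agl, expected_terrain, tolerance=200.0):
    """Decide whether a surprising AGL reading is explained by known terrain.

    inertial_alt     -- altitude above the reference surface, from inertial navigation (m)
    radar_agl        -- measured height above the ground directly below (m)
    expected_terrain -- terrain height relative to the reference surface at this point (m)
    """
    # If both instruments are right, their difference is the terrain height.
    implied_terrain = inertial_alt - radar_agl
    if abs(implied_terrain - expected_terrain) <= tolerance:
        return "terrain"
    return "suspect"


# Over a crater floor 3 km below the reference surface, an 8 km reading makes sense.
over_crater = classify_radar_jump(5000.0, 8000.0, expected_terrain=-3000.0)
# With a map that expects flat ground, the same reading looks like a fault.
flat_map = classify_radar_jump(5000.0, 8000.0, expected_terrain=0.0)
```

The second case is the situation a lander is in when the terrain model it was validated against no longer matches the ground it is flying over.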
I think selenography is equivalent to geography, so that should mean selenodic is equivalent to geodic.. but if the grad student was trolling, lunatic.
They just should have used a parachute.
I wonder: could you use an optical curve evaluator to measure how far from the surface of the moon you are? I mean it’s a smaller body than Earth so perhaps you could get some usable results that way?
Or would you have too wide a margin as you get near the surface to be useful? Keep in mind that this thing was quite a few kilometers up from where it was thinking it was.
Note: I mean the curve of the horizon of course.
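For a rough feel of the numbers, the dip of the horizon below the local horizontal follows from simple geometry if the Moon is treated as a smooth sphere. That is a big assumption, since real mountains and crater rims on the horizon would add error, and the mean radius used here is approximate.

```python
import math

MOON_RADIUS_KM = 1737.4  # approximate mean radius


def horizon_dip_deg(altitude_km, radius_km=MOON_RADIUS_KM):
    """Angle in degrees between the local horizontal and the horizon of a smooth sphere."""
    return math.degrees(math.acos(radius_km / (radius_km + altitude_km)))


def altitude_from_dip_km(dip_deg, radius_km=MOON_RADIUS_KM):
    """Invert the relation: altitude implied by a measured horizon dip."""
    return radius_km * (1.0 / math.cos(math.radians(dip_deg)) - 1.0)


dip_at_5km = horizon_dip_deg(5.0)
dip_at_1km = horizon_dip_deg(1.0)
recovered = altitude_from_dip_km(dip_at_5km)
```

On a perfect sphere the dip is a little over 4 degrees at 5 km and about 2 degrees at 1 km, so the signal exists in principle; whether a camera could measure it well enough over rough terrain is a separate question this sketch does not answer.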