In what is probably the longest-distance tech support operation in history, the Voyager mission team succeeded in hacking their way around some defective memory and convincing their space probe to send sensor data back to Earth again. And for the record, Voyager is a 46-year-old system at a distance of now 24 billion kilometers, or 22.5 light-hours, from Earth.
While the time delay that distance implies must have made for quite a tense couple of days of waiting between sending the patch and finding out if it worked, the age of the computers onboard probably actually helped, in a strange way. Because the code is old-school machine language, one absolutely has to know all the memory addresses where each subroutine starts and ends. You don’t call a function like do_something(); rather, you call it by loading an address in memory and jumping to it.
This means that the ground crew, in principle, knows where every instruction lives. If they also knew where all of the busted memory cells were, it would be a “simple” programming exercise to jump around the bad bits and rewrite all of the subroutine calls accordingly if larger chunks had to be moved. By “simple”, I of course mean “incredibly high stakes, and you’d better make sure you’ve got it right the first time.”
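To make that concrete, here’s a loose sketch in C of what “calling by address” and patching around a bad region looks like. The addresses and routine names are made up for illustration, and the real Voyager code is hand-laid-out machine language rather than anything like C:

    #include <stdint.h>

    typedef void (*routine_t)(void);

    /* Hypothetical absolute addresses where two subroutines happen to live. */
    #define READ_SENSORS_ADDR   ((uintptr_t)0x0C40)
    #define SEND_TELEMETRY_ADDR ((uintptr_t)0x0E10)

    void run_cycle(void)
    {
        /* "Calling" a routine is just loading a known address and jumping to it. */
        ((routine_t)READ_SENSORS_ADDR)();
        ((routine_t)SEND_TELEMETRY_ADDR)();
    }

    /* If the words at 0x0C40 turn out to sit in a failed stretch of memory,
     * the fix is to copy that routine somewhere healthy and then rewrite
     * every caller so it jumps to the new address instead. */

The point is that nothing resolves these addresses for you at run time: every caller has the number baked in, which is exactly why the ground crew can, and must, account for every word.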
In a way, it’s a fantastic testament to simpler systems that they were able to patch their code around the memory holes. Think about trying to do this with a modern operating system that uses address space layout randomization, for instance. Of course, the purpose there is to make hacking directly on the memory harder, and that’s the opposite of what you’d want in a space probe.
Nonetheless, it’s a testament to careful work and clever software hacking that they managed to get Voyager back online. May she send for another 46 years!
Amazing hack job by the support team! And what a testament to the engineers who designed Voyager!
As a ham radio operator who builds my own gear from scratch, I am astonished that the meager signals transmitted by Voyager can be received here on Earth 24 billion kilometers distant! That is the ultimate record of miles-per-watt. How amazing that Voyager can still generate enough power to stay alive, let alone transmit and receive across 24,000,000,000 kilometers, traveling beyond the solar system.
As an amateur stove operator who cooks my own breakfast from basic ingredients, I am also astonished. Great work NASA. Go Science!
They have several 70+ meter antennas around the world; that helps a lot in catching those signals.
It’s a classic finite state machine situation: given a machine with a known initial state, and not knowing its current state, how can the operator of the machine return it to a state where its outputs are intelligible, by strategically choosing inputs?
Well, no. If the program is corrupted then you don’t know what states are even possible.
Activate the reset input.
As I understand it, this thing has a bootloader, and it is hard-wired (literally, by threading a wire through specific cores). The memory that got corrupted was of another type that was susceptible to wear, similar to how Flash is now.
No, the CCS/AACS both run on plated-wire memory, which is still reprogrammable. Plated wire is just core memory, except instead of literal magnetic cores you plate the magnetic material onto the wire, meaning the wires can be threaded automatically. They don’t need power to retain data, though.
You’re thinking of core rope, which is hard wired, like what Apollo used.
The memory that got corrupted was in the FDS, and it was CMOS RAM: so not flash, but just like normal SRAM (doesn’t need refreshing, does need power).
“Think about trying to do this with a modern operating system…”
The BADRAM option in GRUB, memmap in the Linux kernel options… on Windows, you’re probably screwed.
Not really.
https://github.com/prsyahmi/BadMemory
If you’re using *nix, your head is likely screwed, too. Linux hits you harder, even. It’s sect-like. Windows is honest about being bad, at least.
What on earth are you wobbling on about?
You could write an RTOS for modern hardware and it would behave pretty much like a C64, hahahaha. It would just be way faster and have way more memory.
I don’t know if the hardware itself would be that reliable, though.
Re: BADRAM. Ah cool! Well that solves that.
Getting around problems that should be solved at a different level.
(E.g., power regulator issues on the motherboard causing marginally functional RAM to fail in some places.)
>on windows, you’re probably screwed
“Starting in Windows version 19042, bad memory pages are stored in the registry under HKLM\SYSTEM\CurrentControlSet\Control\WHEA\BadPages. In previous versions of Windows, this information is stored in the BCD system store. This list contains the PFNs for all memory pages that the PFA has predicted are likely to fail. When Windows starts, it excludes these memory pages from system use.”
In a system that’s complex enough to have ASLR, it wouldn’t be a problem; the OS would just be told which physical page is bad and would never map it through the MMU. The application code wouldn’t care.
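In toy form (this is not any real kernel’s code, just the shape of the idea), the physical-frame allocator simply never hands out frames that were flagged bad at boot, so no virtual address ever maps onto them and applications never notice:

    #include <stdbool.h>
    #include <stddef.h>

    #define NUM_FRAMES 1024

    static bool frame_bad[NUM_FRAMES];   /* filled from a boot-time bad-RAM list */
    static bool frame_used[NUM_FRAMES];

    /* Return a usable physical frame number, or -1 if none are free. */
    static int alloc_frame(void)
    {
        for (size_t i = 0; i < NUM_FRAMES; i++) {
            if (!frame_bad[i] && !frame_used[i]) {
                frame_used[i] = true;
                return (int)i;
            }
        }
        return -1;
    }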
>trying to do this with a modern operating system
Actually it’s pretty easy… since the kernel knows the exact physical addresses, it can simply not allocate anything into that region. Linux has the badram option, for example.
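For anyone curious, a minimal sketch of what those options look like (the addresses below are invented): GRUB’s setting takes address/mask pairs, and the kernel’s memmap= parameter can mark a region reserved outright.

    # /etc/default/grub: exclude RAM matching these address,mask pairs
    GRUB_BADRAM="0x7dd60000,0xffffc000"

    # or on the kernel command line: reserve 64K starting at 0x7dd60000
    # (if you put this in a GRUB config file, the '$' has to be escaped)
    memmap=64K$0x7dd60000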
>address space layout randomization
Please actually read the article you link. It has nothing to do with how (where) things are laid out in the PHYSICAL memory.
“In a way, it’s a fantastic testament to simpler systems that they were able to patch their code around the memory holes. Think about trying to do this with a modern operating system that uses address space layout randomization, for instance. Of course, the purpose there is to make hacking directly on the memory harder, and that’s the opposite of what you’d want in a space probe. ”
There’s another, lesser-known technique: using processor registers as storage.
It’s possible to write code in such a way that it can work without any RAM.
The most popular example that comes to mind is diagnostic software meant for the IBM PC 5150.
It comes as a ROM set and is installed in place of the PC BIOS.
Also like the dead test cartridge for the C64. It operates entirely out of ROM, testing the RAM first to make sure the stack is usable before calling any subroutines. If it fails, it uses screen flashes to indicate which RAM chip is bad. Pretty smart.
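For flavor, here’s the idea of such a stack-free test sketched in C. Real dead-test code is hand-written assembly with no subroutine calls at all, so nothing touches RAM until it has passed; treat this as pseudocode for the idea, with a made-up address range, not the actual cartridge code:

    #include <stdint.h>

    #define RAM_START ((volatile uint8_t *)0x0400)  /* hypothetical test range */
    #define RAM_END   ((volatile uint8_t *)0x8000)

    /* Walk a couple of patterns across RAM before trusting it with a stack. */
    int ram_ok(void)
    {
        for (volatile uint8_t *p = RAM_START; p < RAM_END; p++) {
            *p = 0x55;
            if (*p != 0x55) return 0;   /* bad cell: report via LED/screen flashes */
            *p = 0xAA;
            if (*p != 0xAA) return 0;
        }
        return 1;                        /* safe to set up a stack and call subroutines */
    }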
> You don’t call a function like do_something(); rather, you call it by loading an address in memory and jumping to it.
Normally you use an assembler, which counts instructions for you and assigns that address a symbolic name like “do_something”.
I don’t think they wrote that thing in raw hex 1802 machine code. You could, but you wouldn’t.
> This means that the ground crew, in principle, knows where every instruction lives.
You know that for any compiled code, too, if you have a decent tool chain. Even with ASLR, you should be able to extract the addresses if you’re authorized. But I don’t think anybody’s going to ASLR embedded code in a space probe, precisely because they might need to do something like this.
I have patched compiled code live in running systems in the field. It’s unforgiving, but it’s not magic.
The real win is that the code is, by necessity, *simple*, and there’s not that much of it.
It wasn’t actually that high stakes, in the sense that a mistake wouldn’t really have done anything terrible. Voyager isolated commanding in one computer, attitude/articulation in another, and flight data in a third.
The issue was with flight data, because it used CMOS. Screwing it up would’ve just meant programming it (via the other computers) again.
Hi! Sorry, there’s never been a COSMAC 1802 on board either Voyager!
True. Galileo used six of them:
The CPUs of Spacecraft
https://www.cpushack.com/space-craft-cpu.html
The Voyager flight software would have been written around 1975 and had to fit in a few kilowords of RAM. I can guarantee you that no assembler was used, nor did any such thing as a “tool chain” even exist. The programmers almost certainly worked with quadrille pads (or, if they were really fancy, custom-printed worksheets) organized to account for every word, and possibly even every bit, of that RAM. Since it was a real-time system running at kilohertz rates, they probably also accounted for instruction processing cycles in some of the more critical routines. The full listing for such a system, ready for markup if necessary, would fit comfortably on a clipboard.
My understanding is that one of the modern Voyager team’s handicaps is that they did not have these detailed handwritten listings. However, after the “poke” experiment, part of the response they got was a memory dump, which is probably one of the factors that allowed them to recreate the listings in enough detail to perform the real fix.
This kind of programming remained common through the early 1980s because there were lots of embedded systems with just a few kilowords of memory and no runtime human user interface.
The big handicap they have is that they don’t have a working simulator for the FDS, since it, well, broke. The other issue is that I’m pretty sure both the FDS and CCS/AACS are custom processors so yeah, you’re totally right that there’s almost certainly no toolchain since the extant userbase is a total count of 1.
Voyager’s 18-bit CCS command computer was borrowed from the earlier Viking Orbiter. The Orbiter only had a CCS computer (and its backup), no AACS or FDS computers.
Operations on the Voyager spacecraft and Viking Orbiters were done using “sequences”. For example: orient the spacecraft just so, turn on the tape recorder, turn on the camera, wait N seconds, turn off the camera, and turn off the tape recorder. Think of the CCS flight code as the bash executable and the sequences as shell scripts.
Sequences were designed by sequence engineers. Commonly used sequences were stored online in a library. Sequence engineers designing new sequences could pull in existing sequences from the library in the same way a C program calls C Library functions.
The memory load for a sequence includes (i) the sequence of operations to perform and (ii) the flight software (if not already resident) needed to perform those operations. For example, between planets, you don’t need the camera software loaded. On approaching a planet, the first sequence that uses the camera has to also load the flight software for the camera.
This 1975 paper, “Viking Orbiter Uplink Command Generation and Validation via Simulation”, by Maurice B. McEvoy describes the support software for Viking Orbiter sequencing in some detail. [https://informs-sim.org/wsc75papers/1975_0054.pdf] (There are other papers about Voyager sequencing, but none I’ve found go into as much detail about the software as McEvoy did for Viking.)
Since Voyager uses the same CCS computer as the Orbiter, I assume the same or similar support software could also be borrowed from Viking. (You’ll see similar program names in the Voyager and Orbiter papers with the Orbiter names all having an “O” prefix. The corresponding programs for the Viking Lander had an “L” prefix.)
Anyway, the UNIVAC mainframe-based toolchain described by McEvoy is pretty sophisticated. As noted above, a sequence includes the flight software needed for the operations in the sequence. For a sequence in the library, its flight software is stored as macro-assembly language source. When a finalized sequence is processed into a memory load, all of the new flight software source is assembled and fed to a relocatable linker and loader.
That was for the Viking Orbiter and I assume the same or something similar was done for Voyager sequences.
Voyager’s 18-bit AACS attitude control computer was a modified CCS computer. I don’t know if the modifications required a modified CCS assembler or if the vanilla CCS assembler could have been used.
As you have noted, even with an assembler the programmers would still have been paying close attention to RAM and counting instruction cycles.
The 16-bit FDS computer, as Pat says, was custom-designed. A friend of mine was designing 8080-based hardware and assembly language software using one of those blue-box Intel development systems for NASA in 1977, so I’m pretty sure the Voyager folks at JPL could have written an FDS assembler fairly easily.
All the above being said, a 1985 paper published prior to Voyager 2’s encounter with Uranus, “Voyager Flight Engineering: Preparing for Uranus”, by McLaughlin and Wolff, said this: “The AACS and CCS programs were modified without being reassembled as is the case with all AACS and CCS changes since launch.” Interestingly, in addition to in-lab simulations, they also used Voyager 1 as a testbed for some of the planned Voyager 2 operations! [https://arc.aiaa.org/doi/abs/10.2514/6.1985-287] (abstract; the full text can be found if you search around on the internet)
———-
“Computers in Spaceflight” by James Tomayko is a good source for the flight computers:
Chapter 5.6, “Viking Computer Systems”, has the details of the CCS computer also used on Voyager. [https://web.archive.org/web/20231123211500/https://history.nasa.gov/computers/Ch5-6.html]
Chapter 6.2, “Voyager – The flying computer center”, has additional information about the CCS and AACS computers and details the FDS computer. [https://web.archive.org/web/20231123211500/https://history.nasa.gov/computers/Ch6-2.html]
Awesome detail! Thanks for all of this.
> The Voyager flight software would have been written around 1975 and had to fit in a few kilowords of RAM. I can guarantee you that no assembler was used, nor did any such thing as a “tool chain” even exist.
You’re thinking of more like 1955. Maybe earlier.
I *wrote* a certain amount of machine code starting in maybe 1979, and spent a lot of time with people who were writing more than I was and had been doing it for a while. Assemblers were commonplace and expected, not anything new or fancy. So were actual compilers, for that matter, although you wouldn’t have used one for a space probe.
Those of us poor hobbyists who didn’t actually have access to assemblers or big enough machines to run them on (a group which did not include NASA) would usually still write out symbolic assembly code and then manually do what an assembler would have done, as opposed to trying to get the whole memory layout right on the first pass.
I knew people who could read various kinds of octal or hex machine code at sight… but even they still tended to annotate both their own code and code that they were reverse-engineering with symbolic instructions… and labels. Often on quadrille pads, actually.
Yes, you did pay attention to bytes, but you tried *really hard* not to put yourself in a position where needing to add or remove a couple of bytes in a routine forced you to go through and manually change numbers scattered all over your whole program. If you were going to write a lot of machine-level code, the first thing you’d write would be an assembler. Or, more realistically, you’d get it from whoever sold you the processor.
You did try to give yourself the ability to patch binaries later. You might even tell your assembler to leave some free bytes between routines, rather than leaving all of the unused memory at the end. But that wasn’t actually a very common strategy as far as I could tell.
Working with raw machine code is a royal pain in the ass even if the instruction decoder is really simple. Once people had computers, it didn’t take very long to hit on the idea of making them help with their own programming.
I don’t remember anybody saying “toolchain”, but we had tool chains. And the bit about tool chains was actually about *modern* access to addresses, anyway. The real point is that machine code *is* still patchable, even now and even in code compiled from very high level languages. You can absolutely do it with modern code. That’s how you exploit memory safety bugs.
best OTA update ever. but not signed.
Danger of becoming a bitcoin miner.
so very much lol; thanks for that
A lot of what drove the design constraints of the day is weight. Old stuff was heavy, so there was a limit as to how much you could put on the satellite. If you took an equivalent weight budget and used modern technology you could have several multiples worth of spare everything available should a problem occur, and as others have pointed out have automatic remapping around failures.
It’s almost a 50 year old spacecraft. They had redundancies – they’ve just burned through them already. Both the spacecraft have a *ton* of failed components.
But the good part of this is that we now have long-term data on how certain parts last.
The next “Voyager” probe could then be made with that same technology again, while other parts that didn’t last could be replaced by something else, something more enduring (modern, high-capacity core memory).
Assuming that we (humans) are still able to reproduce 1970s technology 1:1 using the old fabrication processes.
Maybe NASA/JPL needs help from other states that still have the “know-how” to produce 1970s technology.
(The USA aren’t exactly good at preserving “things”, I’m afraid. Everything old gets trashed, not saved. Storage costs. People rather love to produce cheap and sub-standard, to save money and exploit everyone as much as possible. Quality is too costly, after all. Except for military use. But that’s another story.)
What’s also needed, of course, is a power source that will last for centuries. Considering the travel time and mission length, that might be a priority.
The RTGs on the Voyagers did last longer than expected, but not as long as the isotope itself possibly could still last. The RTG material itself was sort of a limiting factor, too.
Of course that won’t be happening, though, because “of progress”.
People don’t like to set a specific standard in stone for centuries (data format, transceiver technology).
But that’s exactly what’s needed for a multi-century mission:
The radio stations on Earth must keep supporting the same type of communication.
Similarly to how Latin was common language for centuries.
Or how morse code can still be understood in emergency.
Alas, they won’t do that, I’m afraid. Instead, people will keep improving specs on paper over and over again (“it needs to run on Linux!”), the years will pass by, and nothing will happen.
In the end, a hundred years have passed by and there still won’t be a successor mission to the Voyagers.
Then some catastrophe happens and the space programs will end altogether.
Or we will see a couple of half-thought-through probes sent out into deep space that fail halfway through their mission, or the money will suddenly get short and active probes will be abandoned.
That’s just my point of view, of course. I’d love to be proven wrong.
“The USA aren’t exactly good at preserving “things”, I’m afraid. Everything old gets trashed, not saved. Storage costs. People rather love to produce cheap and sub-standard, to save money and exploit everyone as much as possible. Quality is too costly, after all. ”
Germany isn’t that much better in that respect. I ought to know. I’m an American. I’ve lived in Germany now for over 30 years. I am in a position to compare the two countries. There are crap products there and here.
I’d really appreciate it if you’d cease bashing the USA in every conversation you take part in.
Saying “the USA isn’t exactly good at preserving things” under an article about a 50-year-old spacecraft which is still being actively used for science is pretty insane.
Anyone who starts out by saying “why haven’t we been sending out more Voyagers” doesn’t fully understand how amazing those missions are.
In 1964, JPL realized that the planets were going to align in the 1970s *exactly right* to allow sequential gravity assists to hit all the outer planets. This is literally a once-in-a-lifetime setup – the periodicity of that alignment is 175 years. The paper on this was published in 1966.
The Voyager craft were launched in ’77. This means that the US realized the importance and rarity of this mission and, in *11 years*, funded and built two probes that have now lasted almost *50 years*.
The Voyager probes aren’t just a testament to engineering. They’re a testament to *humanity itself*.
“The RTGs on the Voyagers did last longer than expected”
What the heck are you talking about? They’re lasting exactly as long as expected. It’s an RTG. It’s not like a battery, you don’t get “lucky” with charge/discharge wear or something. They knew the power curve 30 years ago. They chose to use a few tricks in the power system to buy themselves more time, but that has nothing to do with the RTG.
“In the end, a hundred years have passed by and there still won’t be a successor mission to the Voyagers.”
Golly gee, I wonder why that is! It might have something to do with the fact that the planetary alignment used by the Voyagers *only happens once every 175 years*.
That alignment is what allowed them to get out as far as they did, since that’s how they got the *speed* they did.
I think you may enjoy reading “We are Bob”
IIRC the reason there will never be another Voyager is that the alignment of planets required for Voyager to “slingshot” out of the solar system will not reoccur for 27000 years….
“is probably the longest-distance tech support operation in history”
leave out the ‘probably’ !
> May she send for another 46 years!
Voyager’s last message to Earth will be sometime near 2035, with the decaying power source and the sheer distance (AKA “Free Space Path Loss”) being the limiting factors. Unless we launch a massive number of relays to chase after them, but the first relay would need a truly massive antenna.
But it would be a cool idea for a future probe to the nearest star system, Alpha Centauri (only 4.246 light years away). Send a probe followed by a new relay every few years to keep in touch. Of course, they would need to be bunched close enough together to allow for multiple relays failing over the duration of the trip there and the duration of the messages being relayed back to Earth.
For reference, Voyager 1 (traveling at 17.1 kilometers per second; 10.6 miles per second) will only need another 18,050 years until it is 1 light year away from Earth! And 19,390 years until Voyager 2 (15.4 kilometers per second; 9.6 miles per second) is 1 light year away from Earth!
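To put a number on that free-space path loss point, here’s a quick back-of-envelope sketch, assuming the roughly 8.4 GHz X-band downlink and the 24 billion km figure from the article:

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        const double pi = 3.14159265358979;
        const double d  = 24e12;        /* distance in metres (~24 billion km) */
        const double f  = 8.4e9;        /* X-band downlink frequency in Hz */
        const double c  = 299792458.0;  /* speed of light in m/s */

        /* FSPL(dB) = 20 * log10(4 * pi * d * f / c) */
        double fspl_db = 20.0 * log10(4.0 * pi * d * f / c);
        printf("free-space path loss: about %.0f dB\n", fspl_db);  /* ~318 dB */
        return 0;
    }

Roughly 318 dB of loss is why the 70-meter dishes mentioned above are needed just to pull the signal out of the noise.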
Voyager’s science team is very clearly aiming for 2027, the 50-year anniversary. I’d imagine they will probably begin staffing down after that because there’s just no science return at that point.
2035 is when they’re expected to run out of power to run a single science instrument. The DSN has enough margin to stay in contact with the Voyagers to a distance of 200 AU, which won’t be reached for another 25 years.
Oh, thanks.
There’s virtually no chance that Voyager could actually reach the 2050s limit, though, since they’ll run out of hydrazine in the 2040s. According to the Voyager Communication document in the Voyager library at GSFC ( https://voyager.gsfc.nasa.gov/Library/ ), estimates for hydrazine for V1/V2 are 2040 and 2048 respectively.
It’s likely that as the science instruments shut down the staffing will drop and they’ll just check in on Voyager periodically as a PR thing, much like they did with Pioneer.
The same data table you referenced also estimates both spacecraft will run out of electrical power in 2023, so I might take those figures with a grain of salt.
That’s nominal power, for all the VIM instruments – they *did* nominally run out of power in ’23.
https://www.theregister.com/2023/04/27/nasa_tweaks_voyager_2s_power/
It’s different for V1 because it lost a scientific instrument earlier anyway so it didn’t matter.
My favorite is not that they are the farthest computers, but they are the longest continuously operating computers. They’ve been up for 50 years without going down once. Sure they step back into a safer operating mode, but they’ve never powered down and stopped completely.
There’s a nice documentary on the remaining Voyager team, “It’s Quieter in the Twilight”, on Amazon Prime.
And check out “Good Night Oppy” as well while you’re there. It’s about Opportunity, which was sent on a 90-sol (92.5 Earth days) mission that lasted 5,352 sols (8 Mars years), or 5,498 days (15 Earth years).
Yeah ..that’s a good one too.
Thanks. Didn’t know about that one and it’s freeview with Amazon Prime.
PBS has a 90-minute 2017 documentary, “The Farthest”, that presents a general overall history of Voyager with a lot of old news clips. A good complement to the lower-key, more personal (and sadder to me) Twilight movie. [https://www.pbs.org/the-farthest/] (Watch for free if you have a local PBS station. If you don’t, there’s a way to search for and select a station – just pick an arbitrary state and choose one of its stations.)
“probably the longest-distance tech support operation in history”
Is the word “probably” really necessary?
I am glad to see Voyager is still out there. I know deep in my heart that Voyager is connecting and expanding knowledge of Earth’s existence and place in the cosmos, for the better of all and not for the worse.
As ever, really enjoying all the experts in the comments section who know better than NASA how their probe works and how they should have done it better.
I have written in machine code on 8086 processors
Would love to know how they diagnosed a faulty bit/block of memory. Fantastic job!