A couple of weeks ago I was at a party where, out of the corner of my eye, I noticed what looked like a giant phone book sitting open on a table. It was printed on perforated green-and-white paper and bound in a binder whose cover looked a little the worse for wear. I had a closer look with my friend James Kinsey. What we read was astonishing: Program 63, 64, 65, lunar descent and landing. Error codes 1201, 1202. Comments printed in the code, code segments hastily circled with pen. Was this what we thought we were looking at? And who brings this to a party?
We were looking at what is rumored to be the only remaining paper copy of the Lunar Module’s source code. This is the source code book used by Don Eyles during Apollo missions and for development. Don was responsible for writing code for the Lunar Module, specifically about 2000 lines that actually landed man on the moon. He turned up at the MIT Faculty Club in June and brought this original printout along, in its own suitcase of course.
How it works
Both the Apollo Command Module (CM) and Lunar Module (LM) were humankind’s first fly-by-wire aircraft. This means that the human does not actually fly it manually by stick; the computer is the one that controls the throttle, thrusters, and gimbals (servo motors that control the direction of the larger thrusters) in real time. This was done by necessity. Humans could not fly these spacecraft manually, especially the LM during descent, which was very unstable.
One Page Per Millisecond
The core of the system was the Apollo Guidance Computer (AGC). Part real-time OS, part National Instruments Data Acquisition System, the AGC was a multi-threaded computer and feedback control system in communication with everything from radars to telescopes to gyroscopes and accelerometers, controlling massive space engines and performing all tasks in real time. Numerous programs shared CPU cycles, and an executive program kept track of and prioritized it all. You can read about it for yourself; the best book on this topic is Digital Apollo.
Don graciously took us on a tour of the code, explaining problems with various missions and how they were solved in the code. One example was an issue with the LM descent engine’s frequency response. The manufacturer did not specify an accurate transfer function for the engine’s throttle, causing the system to be unstable. Remember, this is a feedback control system; this would be equivalent to not knowing the frequency response in the output circuit of an op-amp.
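To see why an inaccurate model of the throttle matters, here is a toy discrete-time loop in Python. It is not the LM’s guidance law: the gain, the delay values, and the `simulate` function are all made up for illustration. The controller is tuned as if the engine responded immediately; when the real response lags by a few steps, the same gain makes the loop oscillate with growing amplitude.

```python
def simulate(gain, delay, steps=60, target=1.0):
    """Integrating plant driven by a proportional controller.

    `delay` is the number of steps by which the measurement the
    controller sees lags the true output (a stand-in for an
    unmodelled throttle response). All numbers are hypothetical.
    """
    # history[-1] is the current output; pad so the delayed sample exists
    history = [0.0] * (delay + 1)
    for _ in range(steps):
        measured = history[-1 - delay]            # stale measurement
        correction = gain * (target - measured)   # proportional action
        history.append(history[-1] + correction)
    return history


as_designed = simulate(gain=0.8, delay=0)   # model matches the hardware
as_built = simulate(gain=0.8, delay=3)      # hardware lags the model

print("error with no lag:  ", abs(as_designed[-1] - 1.0))
print("worst late error with lag:", max(abs(x - 1.0) for x in as_built[-10:]))
```

With no lag the output settles on the target; with the lag the error keeps growing, which is the sense in which a loop designed around the wrong transfer function is unstable.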
The error codes I mentioned earlier (1201, 1202, etc.) were famously displayed during Apollo 11’s landing. These made Neil Armstrong very nervous during the last portion of the lunar landing. Don explained that these errors were due to leaving the docking radar on by accident. Unfortunately this scenario was never simulated or practiced on the ground.
The docking radar was sending numerous interrupts to the processor, taking away precious cycles. This caused the AGC to warn Neil Armstrong that it was over-taxed and that it would cease any non-essential programs. Essential programs, like flying the LM, maintaining its stability, and navigation, remained; non-essential programs, like updating the DSKY, were paused.
Although the picture in the video below can be a bit choppy at times, hearing what Don has to say is where it shines. In addition to the examples above, he mentions so many interesting details, like the rule of thumb during code development: each page of code takes about 1 ms of CPU time. In this clip Dana Yoerger is asking most of the questions.
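The idea of shedding low-priority work when cycles run short can be sketched in a few lines of Python. This is a simplified illustration, not the AGC’s Executive: the `Job` class, the `run_cycle` function, the priorities, and the millisecond figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int    # higher number runs first
    cost_ms: float   # CPU time the job needs in this cycle


def run_cycle(jobs, budget_ms, stolen_ms=0.0):
    """Run jobs in priority order until the cycle's time is used up.

    `stolen_ms` models time lost to unexpected interrupts.
    Returns (names that ran, names that were shed, alarm flag).
    """
    available = budget_ms - stolen_ms
    ran, shed = [], []
    for job in sorted(jobs, key=lambda j: j.priority, reverse=True):
        if job.cost_ms <= available:
            available -= job.cost_ms
            ran.append(job.name)
        else:
            shed.append(job.name)   # not enough time left: skip it
    return ran, shed, bool(shed)


jobs = [
    Job("display update", priority=5, cost_ms=4.0),
    Job("guidance", priority=30, cost_ms=10.0),
    Job("attitude control", priority=25, cost_ms=6.0),
]

print(run_cycle(jobs, budget_ms=20.0))                  # everything fits
print(run_cycle(jobs, budget_ms=20.0, stolen_ms=3.0))   # display is shed
```

In the second call the three stolen milliseconds leave room for guidance and attitude control only, so the display update is dropped and the alarm flag is raised.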
The Seventies: Weird-Looking Freak Saves Apollo 14!
Don was a rebel working for the military-industrial complex, knowing that he served a greater purpose, which was to help land humankind on another world for the first time. This 1971 Rolling Stone piece provides a fascinating perspective on what motivated hackers like Don in the late ’60s.
It is truly spectacular to run into engineers from these world-changing projects and to hear what they have to say. If you are one such engineer, or can connect us with one, please don’t hesitate to reach out using the article tips line.
Very Heavy Source Code
I did not hide my enthusiasm for this history lesson. I think it was for this reason that Don asked if I wanted to carry the code back to his car. He was joking, but I took him up on the chance to handle an artifact that will be sealed behind glass and displayed at a museum sooner or later. The source code was very heavy; I had to switch arms about three times over a journey of only two city blocks.
Like many involved with the Apollo program, Don is very personable, kind, will not brag about himself, and is hesitant to take credit. This is a refreshing attitude in an era of reality television and self-promotion. We’ll learn more soon; keep an eye out for Don’s book, due to be published within the next year or so.
Author Bio: Gregory L. Charvat takes every opportunity to learn from those who were involved with the Apollo program. Greg’s author bio can be found here.
Umph! Or in plain English: that’s something I’d have loved to have hands-on contact with and be able to read with my own eyes… Thanks for this article.
Always refreshing to see people who earned the right to be humble. Absolutely fascinating to listen to him, and I have the 430-page AGC book :)
Since most of us will never be able to get our grubby little hands on this, is there an online copy of this source code? Also, the Avro Arrow had a fly-by-wire system before Apollo, just saying.
http://www.ibiblio.org/apollo/links.html#Software_manuals_and_listings
…is but one of the online resources.
The Arrow had an analog system, not a digital one; it more or less just replaced hydraulics with wires. Apollo had a digital FBW system.
It’s on GitHub; I have provided a link in the text.
And remember, when the Avro Arrow was cancelled, many of the top engineers at Avro were recruited by NASA, and helped form the core of the US Space Program. I recommend the book “Arrows to the Moon” by Chris Gainor for an in-depth look.
Also Wikipedia says:
“Following the cancellation of the Avro Arrow project, CF-105 chief aerodynamicist Jim Chamberlin led a team of 25 engineers to NASA’s Space Task Group to become lead engineers, program managers, and heads of engineering in NASA’s manned space programs—projects Mercury, Gemini and Apollo.[93] The Space Task Group team eventually grew to 32 Avro engineers and technicians, and became emblematic of what many Canadians viewed as a “brain drain” to the United States.[93] Among the former Arrow team engineers to go south were Tecwyn Roberts (NASA’s first flight dynamics officer on Project Mercury and later director of networks at the Goddard Space Flight Center) John Hodge (flight director and manager on the cancelled Space Station Freedom project), Dennis Fielder (director of the Space Station Task Force, later the Space Station), Owen Maynard (chief of the LM engineering office in the Apollo Program Office) and Rod Rose (technical assistant for the Space Shuttle program).”
Let’s not forget Margaret Hamilton, the amazing software engineer who led the team that wrote that code, and the rest of the roughly half-million lines of code in Apollo.
Before you even mention Hamilton, why not Laning and Klumpp? I mean, before throwing random names around, one should perhaps first take care of the people more closely associated with the result.
See this 1979 history of Apollo’s on-board guidance and navigation systems by MIT Instrumentation Laboratory technical design director David G. Hoag: http://klabs.org/history/history_docs/mit_docs/1711.pdf
Specifically (p18-19):
> Parts of the computer programming were accomplished early and were essentially independent of mission objectives. These included the basic code for the computer executive system, sequence control, timing and interrupt structions, unchanged since originally designed by Dr. Laning, and the management of the interfaces with the computer display and keyboard unit, telemetry, etc. Also completed relatively early were the complex but not time-critical data processing routines of navigation, guidance targeting, trajectory extrapolation and lunar ephemeris calculations. Much of the analytical and algorithmic foundation for these came from Battin’s earlier work for the unmanned space mission studies. For Apollo, Dr. Battin, Dr. James Miller, and Norman Sears, and other analysts made significant improvements in the efficiency and performance of these routines, many of which were of fundamental significance.
…
> The very early programs for the first few unmanned earth orbital test flights were each put together by a small dedicated group led by a chief engineer-programmer. For the first command module flight, Alex Kosmala spent many weeks of long hours leading the design and coding of program CORONA. Similarly, Daniel Lickly’s great personal effort produced the program SOLARIUM. Each of these was an amazing tour-de-force which was impractical for the more complex manned missions. Each of these later missions was assigned the responsibility of a senior engineer who assumed a more technical management role for the program.
> The task first was to partition the job suitably for the analysts, specification writers, programmers, test engineers, and documentation specialists. The leader established schedules and progress milestones, reassigned resources to solve inevitable problems, and generally was responsible for the quality of the program. Names notable here are Dr. James Miller for the first Lunar Module program SUNBURST, Dr. Frederic Martin for the Command Module program COLOSSUS, and George Cherry for the Lunar Module program LUMINARY. These last two were the programs used for the lunar landing missions. Martin and Cherry also did a substantial part of the design of the powered flight guidance steering functions for these programs. Alan Klumpp made major contributions to the landing program in the Lunar Module. Daniel Lickly established the atmospheric entry design for the Command Module.
> Much of the detailed code of these programs was written by a team of specialists led by Margaret Hamilton. The task assignments to these individuals included, in addition to writing the code, the testing to certify that the program element met requirements. Overall testing of the assembled collection of program elements necessarily took the use of considerable human and machine resources. The programs had to be as near error-free as possible and any anomalies had to be understood and recorded for possible affect on the mission. Actually, no program errors were ever uncovered during the missions.
Wow, an amazing source! Thank you. I’m puzzled why I missed that – or forgot that I have it somewhere here, assuming I actually do. Mindell’s book is generally my go-to source for Apollo guidance and control story. Now I see this is one of the resources mentioned at its end.
Richard “Dick” Horace Battin was also one of the leaders in the Apollo software project.
That deserves a link.
https://en.wikipedia.org/wiki/Margaret_Hamilton_(scientist)
http://www.klabs.org/history/apollo_11_alarms/eyles_2004/eyles_2004.htm
The above link doesn’t even mention M. Hamilton, whose role on the Apollo computer project was overstated.
Is that though because it is “of its time”? (Think Franklin and DNA.)
Unlikely. It’s because 1) she had very little to do with the things described, and in any case, 2) up to 400 people worked on that thing at one point, so even a random unrelated mention of her would be extremely unlikely.
Excellent article as usual, Greg!
We are not worthy!
The speed at which he made the code modification for Apollo 14 to fix the random abort request AND give the crew manual abort capability is nothing shy of amazing. Steely eyed missile man!
‘We are not worthy’ is exactly what I told Don; he’s such a nice guy. Hopefully he will publish his book soon. It should be a fascinating read.
Curious if anyone has done a write up on the interpreter? Based on the description in the video, it sounded like a threaded interpreter (list of subroutine addresses) which is a method often used in FORTH. I’d like to know why they used that; he says something about it handling matrix math but that doesn’t seem like it would be a “fit” for threaded code. Was it for compression? Threaded code takes less space by removing the repeated CALL instructions. If the matrix math routines ended up being mostly a list of subroutine calls (multiply this by this, now that by that, etc…) then it makes sense.
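For anyone unfamiliar with the term, threaded code in the FORTH sense can be sketched as below. This is a generic Python illustration and not the AGC Interpreter: the routine names `push`, `dot3`, and `run` are invented here, and function references stand in for subroutine addresses.

```python
def push(value):
    """Build a routine that pushes a constant onto the stack."""
    def word(stack):
        stack.append(value)
    return word


def dot3(stack):
    """Pop two 3-vectors and push their dot product."""
    b = stack.pop()
    a = stack.pop()
    stack.append(sum(x * y for x, y in zip(a, b)))


def run(thread, stack=None):
    """The whole 'inner interpreter': step through the list of routines."""
    stack = [] if stack is None else stack
    for word in thread:
        word(stack)
    return stack


# The "program" is just a list of routine references, with no call
# instruction repeated in front of each one.
program = [push((1, 2, 3)), push((4, 5, 6)), dot3]
print(run(program))   # [32]
```

The space saving the comment speculates about comes from the program being a bare list of references; the cost is the extra dispatch loop in `run`.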
Yes, I’d like to learn more about this too. If anyone can chime in it would be much appreciated.
I’d suggest consulting O’Brien’s text, “The Apollo Guidance Computer: Architecture and Operation” (https://www.amazon.com/Apollo-Guidance-Computer-Architecture-Operation/dp/1441908765), for details on this level.
The Interpreter’s primary function was to create a virtual machine w/architecture and instructions independent from the actual physical AGC hardware. This allowed complex functions to be executed on the AGC’s limited hardware, including vector operations and transcendental functions, and also allowed more complicated data types to be manipulated (like single, double, or triple precision variables, vectors, and matrices).
Quoting from O’Brien:
“The Interpreter complements rather than replaces basic AGC coding, as not all the mission software programming requires the capabilities of the interpreter. Additionally, mission programs remain wholly dependent on Executive routines for process management, I/O, interrupts and other system level functions. As such, while the Interpreter has a number of features in common with a hardware CPU, it is less a machine emulator than it is an extension of the programming environment. Programs may begin with basic AGC instructions and then enter the Interpreter to evaluate a complex expression, only to return to executing basic instructions when the calculation is complete. Fundamentally, the role of the Interpreter is to extend the AGC’s architecture using software written in the AGC’s native code. Elements such as specific registers or status bits that are not found in the physical machine are implemented in erasable storage, using software logic to define the rules of their operation.”
O’Brien then spends about 50 pages breaking down the Interpreter’s instruction format, its opcode encoding and addressing, its instruction classes, its indexing, how it executes instructions, how it performs vector math, how it stores data, and much, much more.
One wonders if this isn’t the kind of thinking that led Wozniak to develop SWEET16 during the Apple days.
Although it is linked in the article above, the page:
http://www.doneyles.com/LM/Tales.html
should be given more attention by anyone really interested in what happened with those 1201, 1202 codes and WHY they came up. It was NOT a checklist error. Leaving the radar on was thought to be perfectly acceptable and so there was no entry in the checklist telling the astronauts to turn it off. It was a documentation error which led to the possibility of a randomly occurring signal phase error in an antenna position sensor. The computer was spending all its spare time trying to get that antenna turned to the correct location because the sensor was sending garbage data. Really worth the read.
This is worth emphasizing, as it’s one of the big urban legends of the Apollo era (right up there with “All the plans for the Saturn V were destroyed!”). As Eyles himself explains in that link, Aldrin switching the rendezvous radar switch into the SLEW position prior to landing was NOT a checklist error; it was established procedure. If the crew had to trigger an abort during landing, the rendezvous radar would be needed to pick up the CM’s position and get that information into the PGNCS or AGS. However, it was anticipated that in the event of an abort the crew would be really, really busy, so the procedure was to turn the radar on before landing so that it would already be on if an abort happened and the crew would have one less switch to worry about.
Eyles then explains that the 1201 and 1202 errors were the result not of a checklist or procedural error, but of a ground design documentation error—several components of the rendezvous radar and the LM’s computer weren’t electrically phase-aligned, even though they were supposed to be. The components being supplied out-of-phase power wasn’t something that would be simulated in the big cockpit trainers (though the problem was duplicated on full-electrical ground systems after the mission) and so it wasn’t something that was planned for.
The two sources of power being out of phase sent invalid shaft and trunnion angle readings from the rendezvous radar to the computer components responsible for monitoring the radar’s state, which in turn started issuing about 12,800 increment/decrement interrupts per second to the LM’s guidance computer, trying to get the computer to move the radar to a valid position. Dealing with those extra unexpected 12,800 interrupts per second took up about as much free task capacity as the LM’s guidance computer had to spare.
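As a rough sanity check on that load, the arithmetic can be worked through in Python. The 12,800 per second figure is from the comment above; the cost per interrupt is an assumption, taken here as one AGC memory cycle of about 11.72 microseconds, a commonly quoted figure that does not appear in this thread. The function name is invented for the example.

```python
def stolen_fraction(interrupts_per_second, cycle_time_us):
    """Fraction of each second consumed if every interrupt costs one cycle."""
    return interrupts_per_second * cycle_time_us * 1e-6


RATE = 12_800             # increment/decrement interrupts per second (from the text)
MEMORY_CYCLE_US = 11.72   # ASSUMPTION: cost of one interrupt, in microseconds

fraction = stolen_fraction(RATE, MEMORY_CYCLE_US)
print(f"{fraction:.1%} of the computer's time")   # 15.0% of the computer's time
```

Under that assumption the spurious interrupts eat roughly 15% of the machine, which is consistent with the description of them consuming about all of the spare capacity.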
Program flow in the AGC is tightly regulated, and the amount of temp storage designed into the hardware was a function of the expected program flow. Because it wasn’t supposed to be possible for those extra interrupts to happen, and because the computer was being given more to do within its cycle time than it was able, tasks started slipping off its plate. Two separate temporary storage areas filled up: first the core set areas (which triggered the 1202 alarm) and then the vector accumulators (which triggered the 1201). Fortunately, the brilliant design of the AGC’s Executive let the AGC flush its overflowing temp storage areas and restart its task list without breaking anything, and as the landing left P64 and transitioned into P66, the problems went away.
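The flush-and-restart behaviour can be sketched in Python with a single fixed-size pool. This is a loose illustration under stated assumptions, not the Executive’s actual logic: `SlotPool`, `ExecutiveOverflow`, `schedule`, the pool size, and the job names are all hypothetical, and the alarm number is simply passed in as a label without modelling which storage area raises which code.

```python
class ExecutiveOverflow(Exception):
    """Raised when a job is requested and no storage slot is free."""

    def __init__(self, alarm_code):
        super().__init__(f"alarm {alarm_code}")
        self.alarm_code = alarm_code


class SlotPool:
    """Fixed number of temporary-storage slots, sized for the expected load."""

    def __init__(self, size, alarm_code):
        self.size = size
        self.alarm_code = alarm_code
        self.in_use = []

    def allocate(self, job_name):
        if len(self.in_use) >= self.size:
            raise ExecutiveOverflow(self.alarm_code)
        self.in_use.append(job_name)

    def flush(self):
        self.in_use.clear()


def schedule(pool, requests, restart_list):
    """Try to start each requested job; on overflow, alarm and restart."""
    alarms = []
    for job in requests:
        try:
            pool.allocate(job)
        except ExecutiveOverflow as overflow:
            alarms.append(overflow.alarm_code)   # tell the crew
            pool.flush()                         # drop everything queued
            for essential in restart_list:       # bring back only what matters
                pool.allocate(essential)
    return alarms


pool = SlotPool(size=3, alarm_code=1202)
alarms = schedule(
    pool,
    requests=["guidance", "backlog job 1", "backlog job 2", "display update"],
    restart_list=["guidance"],
)
print(alarms, pool.in_use)   # [1202] ['guidance']
```

The point of the design is that an overflow is survivable: the backlog is discarded, the essential work is re-established from a known list, and the only visible symptom is the alarm.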
I’ve got a write-up of the whole incident, including some more explanation and what-ifs, right here: http://arstechnica.com/science/2015/07/no-a-checklist-error-did-not-almost-derail-the-first-moon-landing/ However, Eyles is absolutely the authority and I’d say his explanation (http://www.doneyles.com/LM/Tales.html) is the definitive one.
OK, what I want to know is how Steve Bales is remembered for not calling an abort with the 1202 alarms and saving the mission but Jack Garman is never mentioned in documentaries, even though Bales is seemingly just relaying word by word what Garman is telling him to say, doesn’t sound as sure about it and even gets it wrong one time, prompting a sarconic correction…
“Sarconic” — a blend of sarcastic and sardonic?
TIL the instructions for the pinnacle of human achievement are being used as a coaster for nerd parties rather than being preserved in a museum
What kind of “party” do you expect to see printed source code?
My kind of party!
Just today an Alto was almost booted: https://www.youtube.com/watch?v=PR5LkQugBE0
70s are back!
LLRV was actually the first Digital Fly By Wire aircraft, I believe. Flew several years before either Apollo CM or LM, and like the LM, was so unstable that it needed that FBW. Not sure if it used the same code from the LM?
Falling with style…
Why should it be sealed up for exhibition? At least let our favorite OCR guy scan the pages into a PDF and put it online for historical reference!
The entire Luminary source code listing shown in the above photos, over 1800 pages, has been available online for almost two decades.
Thanks for the tip!
http://lambda-the-ultimate.org/node/3522
I am curious to determine a bit about the design of the internal electronics and plumbing as these had to operate after or during a total vacuum when the hatch was opened for excursions. Most electrolytic capacitors of that era would not likely operate in a vacuum for long.
The only electrolytic capacitors aboard were likely hermetically sealed wet-slug tantalums.
I would love to see this document scanned and made available somewhere. I taught assembly in college and would LOVE to see it.
“Both the Apollo Command Module (CM) and Lunar Module (LM) were humankind’s first fly-by-wire aircraft. ” — I don’t know if they were the first implementations of fly-by-wire, but they were not aircraft! There is no air where they flew!
Hahaha, semantics. You really got him this time, lol.
http://www.ascii-code.com/ascii-art/vehicles/airplanes.php
The F16 followed just a few years later…
2000 lines to land on the frickin’ moon and here I am up to my nipples in 200,000 lines of crap to generate bloated XML…
Real Programmers…:-)
http://web.mit.edu/humor/Computers/real.programmers
Ummm, we never actually went to the moon
go home troll boy…
Okay, WE didn’t go to the Moon but a dozen very skilled people did.
If we didn’t go to the Moon, then where did the reflectors and tire-tracks come from? o_O
I would have loved to see the words “we scanned the pages so our viewers can read through them online.” :P
So. Totally. Cool.
pioneers
Great article! You know what would be cool? Locating an old picture of Don Eyles sitting at the command center during the landing, reading the code. I would like to see that. Thanks for the story though.
Now on github: https://github.com/chrislgarry/Apollo-11
Lies that just won’t die.