How IBM Stumbled Onto RISC

There are plenty of inventions in the world that were almost complete accidents but are still ubiquitous in our day-to-day lives: things like bubble wrap, which was originally intended to be wallpaper, or superglue, a plastic compound whose sticky properties were only discovered later on. IBM found itself in a similar position in the 1970s while working on a type of mainframe computer made to serve as a telephone switch. The phone switch was eventually abandoned in favor of a general-purpose processor, but not before the team stumbled onto the RISC design that became the IBM 801.

As [Paul] explains, the dominant design philosophy at the time was to provide a large number of specialized instructions to do specific tasks within the processor. When designing the special-purpose phone switch processor, IBM removed many of these instructions, and after the project was cancelled, ran tests on the incomplete platform to see how it performed as a general-purpose computer. They found that by eliminating all but a few instructions and running those without a microcode layer, the performance gains were far greater than expected: up to three times as fast as comparable hardware.
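
To see what dropping the microcode layer buys, here's a toy C sketch (purely illustrative, not the 801's actual design): a microcoded machine runs a small internal program of micro-operations for every architectural instruction, while the 801 approach exposed those simple steps directly, one per cycle.

```c
/* Toy illustration (not the 801's real microarchitecture): a microcoded
 * CPU expands one "complex" instruction into several internal micro-ops,
 * each costing a cycle. RISC exposes those steps as architectural
 * instructions and lets the compiler schedule them. */
#include <stdio.h>

enum uop { UOP_LOAD_A, UOP_LOAD_B, UOP_ADD, UOP_STORE, UOP_END };

/* Hypothetical microcode for a memory-to-memory ADD instruction:
 * several internal steps hide behind one opcode. */
static const enum uop add_mem_mem[] = {
    UOP_LOAD_A, UOP_LOAD_B, UOP_ADD, UOP_STORE, UOP_END
};

int main(void) {
    int cycles = 0;
    for (const enum uop *u = add_mem_mem; *u != UOP_END; u++)
        cycles++;                  /* one internal cycle per micro-op */
    printf("one CISC-style add = %d hidden cycles\n", cycles);
    /* The 801 bet: make each of those steps a one-cycle instruction
     * with no interpreter in between. */
    return 0;
}
```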

These first forays into the world of simplified processor architecture not only paved the way for the RISC platforms we know today, such as ARM and RISC-V, but also helped CISC platforms make tremendous performance gains. In fact, RISC-V is a direct descendant of these early RISC processors, with three intermediate designs between then and now. If you want to play with RISC-V yourself, our own [Jonathan Bennett] took a look at a recent RISC-V SBC and its software this past March.

Thanks to [Stephen] for the tip!

Photo via Wikimedia Commons

37 thoughts on “How IBM Stumbled Onto RISC”

    1. Agreed, the powers that be at the time were hampered by the Apple/Motorola/IBM agreements to develop the PowerPC, and that carried over into poor license terms. They wanted a killer processor, made it happen, then screwed themselves with poor marketing and sales.

      1. EISA was smoother, though.
        All ISA slots in a system could be replaced by EISA ones without losing support for older expansion cards.
        Let’s imagine if Amigas had been upgraded that way.
        The PC side could have been modernized without breaking any compatibility. A new PC bridgeboard w/ EISA was all it would have needed to make use of it. But then (after VLB) came lame PCI. *sigh*

      1. I remember that at the time of the Apple/IBM/Motorola deal, the engine that Motorola was going to go with was the Motorola 88110 and its follow-ons. I think that the IBM Austin guys were a bit anxious about that engine, since it was a very good RISC implementation. I never got the chance to code for it – I had done some work on the AMD 29K family before, and in looking for a replacement for Unix workstations to stretch performance further I was looking at the 88110, about the time that the AIM alliance was announced and the PowerPC 601 chip arrived.

        BTW I had a very intense discussion at CeBIT Hanover in ’90 or ’91 with the Acorn guys about the ARM processor, and they were clearly targeting high volume and ultra-low power consumption. They didn’t see ARM and PowerPC as competitors at all. Indeed they didn’t see PowerPC as a true RISC engine: while it had some of the characteristics of reduced instruction set complexity – consistent addressing modes, uniform instruction lengths, and so on – it was a huge instruction set compared to ARM or other RISC engines.

    2. In between the 801 and POWER was the (RISC) ROMP architecture. Used in the IBM PC/RT machines, running either AIX 2 (rather not) or AOS (the Academic Operating System, BSD-based).

    3. I really like PowerPC, I’ve just upgraded my PowerBook 1400/117 to a 166MHz CPU today! Great stuff!

      But to say it’s better than ARM is to miss the point: ARM was (originally) designed, simulated and laid out by just two people, Steve Furber and Sophie Wilson, in a couple of years (well, to reach ARM1). PowerPC on the other hand had a decade of IBM research behind it in the form of the 801, ROMP and POWER architectures (with input from Motorola’s 88000 design) and at least 100 people working on various implementations across IBM and Motorola.

      And PowerPC was designed to compete with the fastest processors around during the 1990s and early 2000s, whereas ARM was designed to power a new generation of home and educational computers. PowerPC made it into supercomputers; ARM kickstarted the PDA, feature phone and smartphone era, thanks to its brilliantly low power consumption.

      PowerPC couldn’t compete against Intel, because Intel had thousands of engineers vs Motorola/IBM’s hundreds. We saw that with the G5, which was late, slower than it should have been, and too power hungry to run in a laptop (again, I have an iMac G5, and I think it was great for its day, a 64-bit CPU before Intel implemented x64).

      ARM didn’t need to compete at the high end, but it was safe from Intel because Intel couldn’t compete at the low end. So ARM has been able to do what the microprocessor did to minicomputers: catch up over a period of time.

      And the ARM MacBooks are superb, a worthy RISC successor to PowerPC Macs (I’m using my MacBook M2 to write this!).

  1. I remember hearing as a kid that there was some CISC platform with a single assembly language/machine code instruction for “n factorial”. I’m guessing that it would have been one of the mainframes, but have no idea which. Anyone know ?

    1. That isn’t what makes a CISC a CISC though. There are instructions that calculate trigonometric functions in most RISC processors with FPUs, and even a FJCVTZS (Floating-point Javascript Convert to Signed fixed-point, rounding toward Zero) instruction in ARMv8, I think. Special-purpose instructions are not what makes an architecture RISC or CISC. In all cases these weird instructions operate only on registers and likely take only one processor bus cycle to execute. Contrast this with the MOVSD instruction on x86, which moves data pointed to by the ESI register to the address in the EDI register and increments these registers to point to the next dword. Three bus cycles at least: one for the instruction fetch, one to load the data at the address in ESI, and another to store a copy of the data to the address in EDI. This is what is meant by “complex” in CISC. RISC processors in contrast have dedicated instructions that do load and store only, so that the majority of instructions run in only one bus cycle.
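
For readers who want the bus-cycle count spelled out, here is a rough C model of a single MOVSD; the regs_t struct and flat memory array are illustrative stand-ins, not any real emulator's API:

```c
#include <stdint.h>
#include <string.h>

typedef struct { uint32_t esi, edi; } regs_t;  /* only the two registers we need */

/* A rough model of one x86 MOVSD (illustrative only): after the
 * instruction fetch (bus cycle 1) it still needs a memory read and a
 * memory write. The pointer increments are register-only and cost no
 * bus traffic. Direction flag assumed clear (incrementing). */
void movsd(regs_t *r, uint8_t *mem) {
    uint32_t tmp;
    memcpy(&tmp, mem + r->esi, 4);   /* bus cycle 2: read dword at [ESI]  */
    memcpy(mem + r->edi, &tmp, 4);   /* bus cycle 3: write dword to [EDI] */
    r->esi += 4;
    r->edi += 4;
}

/* A load-store RISC spells this out as separate load, store, and
 * add-immediate instructions, each touching the bus at most once. */
```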

  2. Oddball special-purpose instructions like that are not what makes an architecture CISC though. Most RISC processors that have FPUs have instructions that compute square roots and trigonometric functions, and there is even an FJCVTZS instruction in ARMv8: Floating-point Javascript Convert to Signed fixed-point, rounding toward Zero. How different are these in principle from an instruction that computes a factorial? The thing is these instructions all do their black magic only on registers, so each of them takes only one processor bus cycle to execute, just to fetch the instruction. Contrast this with the MOVSD instruction in x86. Three bus cycles at least: instruction fetch, read memory pointed to by the ESI register, store to memory at EDI. This is what is meant by “complex” in CISC. RISC instruction sets tend to be much more regular and usually have a load-store architecture that limits the instructions that use the bus.

  3. Stormwyrm has it correct. When we started with RISC, the main benefit was that we knew how much data to pre-fetch into the pipeline – how wide an instruction was, how long the operands were – so the speed demons could operate at full memory bus capacity. The perceived problem with the brainiac CISC instruction sets was that you had to fetch the first part of the instruction to work out what the operands were, how long they were and where to collect them from. Many clock cycles would pass by to run a single instruction. RISC engines could execute any instruction in one clock cycle. So the so-called speed demons could out-pace the brainiacs, even if you occasionally had to assemble a sequence of RISC instructions to do the same work as one CISC instruction. Since it wasn’t humans working out the optimal string of RISC instructions, but a compiler, who would it trouble if reading assembler for RISC made so much less sense than reading CISC assembler?

    Now, what we failed to comprehend was that CISC engines would get so fast that they could execute a complex instruction in the same single external clock cycle – when supported by pre-fetch, heavy pipelining, out-of-order execution, branch target caching, register renaming, broadside cache loading and multiple redundant execution units. The only way RISC could have outpaced CISC was to run many simple execution units in parallel (since they individually would be much simpler and more compact on the silicon). However, parallel execution was too hard for most compilers to exploit in the general case.
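
The "we knew how much data to pre-fetch" point is easy to show in code. Here is a quick sketch of decoding a fixed-width RISC-V instruction (the example word encodes add a0, a0, a1): every field falls out with constant shifts and masks, with no need to partially decode the instruction just to learn its length.

```c
/* Sketch: why fixed-width encodings made RISC decode cheap. Every
 * base RISC-V instruction is 32 bits, so extracting the fields is a
 * handful of shifts and masks, all in parallel in hardware. */
#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint32_t insn = 0x00B50533;          /* add a0, a0, a1 (R-type) */
    uint32_t opcode = insn & 0x7f;       /* bits 0-6   */
    uint32_t rd  = (insn >> 7)  & 0x1f;  /* bits 7-11  */
    uint32_t rs1 = (insn >> 15) & 0x1f;  /* bits 15-19 */
    uint32_t rs2 = (insn >> 20) & 0x1f;  /* bits 20-24 */
    printf("opcode=0x%02x rd=x%u rs1=x%u rs2=x%u\n",
           (unsigned)opcode, (unsigned)rd, (unsigned)rs1, (unsigned)rs2);
    return 0;
}
```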

    1. Digression: Given that “serious” modern processors, including RISC, are doing all of that… Any recommendations on tutorials/books to help us learn how to write modern optimizing compilers? Most of Aho/Ullman still applies since it’s well above the instruction level, but when one gets down to trying to do actual instruction assignment I suspect my 1980s assembler reflexes are suboptimal at best.

      “Modern Instruction-Level Coding For Dinosaurs”, anyone?

      (I may finally be able to find time to revisit Xalan’s compiler and apply some of what we learned from Xylem to it.)

      1. There are dozens of compiler books newer than any edition of the Dragon Book.

        Maybe you want “Engineering a Compiler” by Keith Cooper and Linda Torczon. Maybe “Modern Compiler Design” by Dick Grune and others. Allen and Kennedy’s “Optimizing Compilers for Modern Architectures” or Muchnick’s “Advanced Compiler Design and Implementation” come to mind. Maybe Peter Lee’s “Topics in Advanced Language Implementation” or Wolfe’s “High Performance Compilers for Parallel Computing”. Maybe something by Terrence Pratt, O.G. Kakde, Benjamin Pierce, Reinhard Wilhelm, Alain Darte, or one of the many books about LLVM would be handy.
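
As a taste of the instruction-level thinking those books teach, here is a classic transformation (a hypothetical example, not drawn from any one of them): splitting a reduction into independent accumulator chains so a pipelined or superscalar core can overlap the additions instead of serializing on one register.

```c
/* Each add here waits for the previous one: a single long dependence
 * chain, so a pipelined FPU sits mostly idle between adds. */
double sum_serial(const double *a, int n) {
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Two independent accumulators let the core overlap the adds. Modern
 * optimizing compilers do this unroll-and-reassociate themselves,
 * though for floating point usually only under relaxed-math flags,
 * since reassociation changes rounding. */
double sum_unrolled(const double *a, int n) {
    double s0 = 0.0, s1 = 0.0;
    int i;
    for (i = 0; i + 1 < n; i += 2) {
        s0 += a[i];
        s1 += a[i + 1];
    }
    if (i < n)
        s0 += a[i];   /* odd leftover element */
    return s0 + s1;
}
```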

      1. I have the PowerPC 601 instruction set book in my museum in the loft – I must dig it out sometime. I liked it a lot – Motorola put a 601 on a development board format that ran at 25MHz, which we shipped to some OEM customers (like a bunch of people working for the Ministry of Informatics in Lithuania, as it happens). Then the PReP PCI board arrived with PCI slots and a 66MHz 601, and then a series of 604-based motherboards. Somehow I never got past the stage of shipping samples and dev boards – always waiting for someone other than IBM or Apple to ship a product – though my colleagues in Austin, TX did have success, after I left the organisation, with games consoles.

      2. And there was the Parsytec GigaCluster – they had a kind of plug-in array processor that was originally based on Inmos Transputers, but Inmos were late to market with the T9000 chip, and they switched to the Motorola 601.

  4. Rumor had it that part of the reason IBM didn’t jump on RISC more strongly was that in the mainframe age there was much more emphasis on being able to sell a range of machines, including more limited ones at lower prices, and “we don’t know how to make a less performant RISC that runs at the same clock speed.”

    1. Knowing IBM as I did, they might have been a little nervous about cannibalising the mainframe market, so would wish to continue to differentiate the workstations / midrange away from S/390. They did use the 64-bit POWER architecture for the AS/400 follow-on though. The big question was whether they led the market enough to use PowerPC to displace Intel, and whether IBM could make Power Personal Systems, with its PCI bus boards, a success against Wintel. It was an ambitious initiative, but without a non-Apple OS it was doomed. Microsoft did run NT 3.51 on PowerPC, but NT 4 and the 32-bit consumer Windows never ran on PowerPC (and I don’t think the concept of the PowerPC 615 helped).

      1. NT 4 did run on PowerPC, though good luck finding any machines that could actually run it; those unfortunately were quite rare. Support for PowerPC was dropped after NT 4.0 Service Pack 2.

        1. Well, I never knew that. Maybe I had switched to AIX 4.1 by then, since I was changing jobs at work and it was more important for me to have a machine at home I could practice some AIX sysadmin skills on. I think I had that 604-based machine at home, built into a homebrew box, for about eight years before replacing it about 20 years ago with an iMac G3 running Yellow Dog Linux with its vast 13GB hard disk.

  5. The start of the second paragraph is a little confusing – I think it was meant to say something like: ” the major design philosophy at the time was to have a large number of different instructions to do specific tasks within the processor”.

    Not mentioned, but one of the early advantages of RISC was reclaiming the substantial die area that CISC designs spent on microcode.

  6. Nice article.
    I recall informal rules of thumb defining RISC, like “single-cycle execution (in a pipelined implementation), no microcode”, lending itself to single-chip VLSI implementations of processors.

    Also, academic research projects yielding workstation-based VLSI design tools (the Mead/Conway era), along with government-subsidized fabrication (MOSIS), made RISC CPU production possible beyond traditional semiconductor companies.
    https://en.wikipedia.org/wiki/Berkeley_RISC
    https://www.acm.org/hennessy-patterson-turing-lecture

    Even with RISC generating huge interest – e.g. the Intel i860 – other efforts recognized the impact of CISC x86. IBM had its own “Blue Lightning” x86 variant. There was a RISC-influenced, x86-compatible design from NexGen that became the Nx586. Transmeta’s Crusoe extended x86 compatibility to VLIW, the erstwhile successor to both RISC and CISC.
    https://www.techspot.com/article/2619-nexgen-cpu-history/
