RISC-V: Why The ISA Battles Aren’t Over Yet

A computer processor uses a so-called Instruction Set Architecture (ISA) to talk with the world outside of its own circuitry. This ISA consists of a number of instructions that essentially define the functionality of that processor, which explains why so many ISAs still exist today. It’s hard to find that one ISA that works for as many distinct use cases as possible, after all.

A fairly new ISA is RISC-V, the first version of which was created back in 2010 at the University of California, Berkeley. Intended to be a fully open ISA, targeting both students (as a learning tool) and industrial users, it is claimed to incorporate a number of design choices that should make it more attractive for a number of applications.

In this article I’ll take a look behind the marketing to take stock of how exactly RISC-V differs from other open ISAs, including Power, SPARC and MIPS.

Welcome to the World of RISC

A Reduced Instruction Set Computer (RISC) is a type of ISA which focuses on creating an instruction set that requires only a limited number of processor cycles to execute a single instruction. Ideally, an instruction would take exactly one cycle. This is in contrast to a Complex Instruction Set Computer (CISC), which focuses on reducing the number of instructions needed for an application, which decreases code storage requirements.

These days CISC is essentially no more, with the Motorola m68k ISA put out to pasture, and any CPU based on Intel’s x86 CISC ISA and successors (like AMD’s 64-bit extensions) being internally a RISC processor with a CISC ISA decoder front-end that breaks CISC instructions into the RISC instructions (micro-opcodes) for its CPU core. At least as far as the CISC versus RISC ISA wars go, here we can say that RISC decidedly won.

Many Flavors of RISC

Though RISC ISAs such as Alpha and PA-RISC met their unfortunate demise due to corporate policies rather than any inadequacies in their ISA design itself, we’re fortunately still left with a healthy collection of RISC ISAs today, most notably:

  • SuperH (with open J2 implementation)
  • ARM (fully proprietary)
  • MIPS (open, royalty-free)
  • Power (open, royalty-free)
  • AVR (proprietary)
  • SPARC (open, royalty-free)
  • OpenRISC (open, royalty-free)

RISC-V as a newcomer places its 9 years of (academic) development against the 34+ years of MIPS, 33+ years of SPARC, and the Power ISA which has its roots in development IBM did back in the early 1970s. Considering the hype around this new ISA, there must be something different about it.

This is all the more striking considering that OpenRISC, which was developed with many of the same goals as RISC-V back in 2000, never made much of a splash, even though it is being used commercially.

A Shifting Landscape

It’s important to note that back in 2010 when RISC-V was being developed, SPARC had been an open ISA for a long time, with ESA’s LEON SPARC implementation in VHDL having been available since 1997. Since 2010, MIPS and IBM’s Power ISA have also joined the ranks of open and royalty-free ISAs, with open source designs in Verilog, VHDL and others made available. MIPS has been a standard teaching tool for processor ISAs since the 1990s (often in the form of the closely related DLX architecture), with many students writing their own minimalistic MIPS core as part of their curriculum.

Because of the existing contenders in these areas, RISC-V cannot simply distinguish itself by being open, royalty-free, having a more mature ISA, or better freely available HDL cores. Instead its ISA must have features that make it attractive from the standpoint of power efficiency or other metrics, allowing it to process data more efficiently or faster than the competition.

Here one defining characteristic is that the RISC-V ISA isn’t a singular ISA, but over 20 individual ISAs, each focusing on a specific set of functionality, such as bit manipulation, user-level interrupts, atomic instructions, single- and double-precision floating point, integer multiplication and division, and so on. Also interesting in the RISC-V ecosystem is that adding custom instruction sets without any kind of approval process is encouraged.
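To make this modular naming a bit more concrete, here is a minimal Python sketch that splits an ISA string such as RV32IMAC into its base width and its single-letter extensions. The function name parse_isa_string and the small lookup table are assumptions made for this illustration, not part of any official tool. Only the extensions named in this article are covered, and multi-letter extension names are deliberately ignored.

```python
import re

# Single-letter extensions mentioned in the article; this table is
# illustrative and far from complete.
EXTENSIONS = {
    "I": "base integer",
    "M": "integer multiplication and division",
    "A": "atomic instructions",
    "F": "single-precision floating point",
    "D": "double-precision floating point",
    "Q": "quad-precision floating point",
    "C": "compressed instructions",
}


def parse_isa_string(isa):
    """Split a string like 'RV32IMAC' into (xlen, [extension letters])."""
    match = re.fullmatch(r"RV(32|64)([A-Z]+)", isa.upper())
    if match is None:
        raise ValueError(f"not a simple RISC-V ISA string: {isa!r}")
    xlen = int(match.group(1))
    letters = list(match.group(2))
    unknown = [letter for letter in letters if letter not in EXTENSIONS]
    if unknown:
        raise ValueError(f"extensions not covered by this sketch: {unknown}")
    return xlen, letters


xlen, letters = parse_isa_string("RV32IMAC")
print(xlen, letters)   # 32 ['I', 'M', 'A', 'C']
for letter in letters:
    print(f"  {letter}: {EXTENSIONS[letter]}")
```

A core advertising RV32IMAC therefore promises the 32-bit base plus three optional sets, and software built for it cannot assume anything beyond that.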

Ignoring the Future

One interesting choice in the RISC-V ISA itself concerns conditional execution, with RISC-V having no provision for a condition code register (status register), or a carry bit. Conditional branches compare two registers directly instead, and without condition codes there are no predicated instructions, so even a short if/else has to be compiled into real branches. The processor then either has to guess which way each branch goes, discarding the speculative work when the guess turns out to be wrong, or stall until the comparison has been resolved. As branch prediction is an implementation choice rather than something the RISC-V ISA mandates, this could come with a big performance and energy cost penalty on simpler cores.
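To illustrate what the missing carry bit means in practice, the following Python sketch models a 64-bit addition built from two 32-bit halves, the way it is typically done on an ISA without flags: the carry is recovered by checking whether the low-word sum wrapped around, using an unsigned less-than comparison. The helper names are made up for this example, and Python integers are masked to 32 bits to imitate register width.

```python
MASK32 = 0xFFFFFFFF


def add32(a, b):
    """32-bit wrapping add, as a register-width ADD would behave."""
    return (a + b) & MASK32


def add64_without_carry_flag(a_lo, a_hi, b_lo, b_hi):
    """Add two 64-bit values held as (lo, hi) pairs of 32-bit halves.

    No carry flag is involved: the carry out of the low word is derived
    by comparing the wrapped sum against one of the operands.
    """
    sum_lo = add32(a_lo, b_lo)
    carry = 1 if sum_lo < a_lo else 0   # wrapped around => carry out
    sum_hi = add32(add32(a_hi, b_hi), carry)
    return sum_lo, sum_hi


# 0x00000000_FFFFFFFF + 1 carries into the high word.
print(add64_without_carry_flag(0xFFFFFFFF, 0, 1, 0))   # (0, 1)
```

Compared to an ISA with an add-with-carry instruction, the comparison is an extra step for every word of a multi-word addition.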

Since every other major architecture uses predication to improve performance, especially for blocks of shorter jumps such as those produced by a big if/else block or switch statement, it’s quite daring to omit this feature. The design rationale provided by the RISC-V developers is that fast, out-of-order CPUs can overcome this limitation through brute processing force. Interestingly, they do not consider the larger code size produced for code without predication to be an issue, despite being proud of how compact their compressed instructions generally are.
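As a rough illustration of the trade-off, the Python sketch below contrasts a branching maximum with a branch-free version that turns the comparison result into a mask and uses it to select one of the two values. This is the kind of sequence a compiler can fall back on when an ISA offers no predicated instructions. The function names are invented here, and values are treated as unsigned 32-bit quantities.

```python
MASK32 = 0xFFFFFFFF


def max_branching(a, b):
    """Straightforward version: the CPU has to resolve a conditional branch."""
    if a > b:
        return a
    return b


def max_branchless(a, b):
    """Branch-free select for unsigned 32-bit values.

    cond is 0 or 1, which is what a set-if-less-than style comparison
    leaves in a register; negating it gives an all-zeros or all-ones mask.
    """
    cond = int(a > b)
    mask = (-cond) & MASK32              # 0x00000000 or 0xFFFFFFFF
    return (a & mask) | (b & ~mask & MASK32)


print(max_branchless(7, 3), max_branchless(3, 7))   # 7 7
```

The branch-free form always executes its handful of extra ALU operations, whichever value ends up being selected, so it trades code size for predictable timing.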

Here the somewhat schizophrenic nature of the RISC-V development process begins to shine through. Though it’s supposed to be a good fit for embedded, presumably low-clocked processors, its lack of predication will likely hurt it here in raw performance compared to equivalent ARM-based microcontrollers, whose Thumb-2 compact instruction set is also more efficient than the RISC-V compact ISA.
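For readers wondering how 16-bit compressed instructions can be mixed freely with regular 32-bit ones, the sketch below follows the length-encoding rule from the RISC-V specification: the two lowest bits of the first 16-bit parcel are 11 for a 32-bit instruction, and any other value marks a 16-bit compressed instruction. The function name is made up for this example, and encodings longer than 32 bits are deliberately not handled.

```python
def instruction_length(first_parcel):
    """Return the instruction length in bytes, given its first 16-bit parcel."""
    if (first_parcel & 0b11) != 0b11:
        return 2          # compressed (C extension) instruction
    if ((first_parcel >> 2) & 0b111) != 0b111:
        return 4          # standard 32-bit instruction
    raise ValueError("longer-than-32-bit encodings are not handled here")


print(instruction_length(0x0001))   # 2 (the compressed NOP)
print(instruction_length(0x0013))   # 4 (low half of the 32-bit NOP, 0x00000013)
```

Because the length can be read from the first parcel alone, a decoder does not need a separate mode switch to interleave both instruction sizes.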

Choosing Uncertainty Over Certainty with RISC-V

At this point, the only parts of the RISC-V ISA which are ‘frozen’ – meaning that they can be implemented without any fundamental changes expected – are the Base Integer sets for the 32- and 64-bit versions, along with the extensions for integer multiplication and division, atomics, single-, double- and quad-precision floating point, and compressed instructions.

Extensions such as the hypervisor, bit manipulation, transactional memory, and user-level interrupts are still in flux and thus unsuitable for anything but experimental use, further fragmenting the whole RISC-V ecosystem. This clearly shows that RISC-V isn’t a ‘finished’ ISA, but still very much in the early stages of development. While its core is usable, the embedded instruction set isn’t finished either, and there’s no readily available performance data to back up claims that it can handily outperform any competition.
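For anyone planning a design around these sets, a quick check along the following lines may help. This is only a sketch that records the status of each part as described in this article. The table and the function name parts_in_flux are assumptions for illustration, and later specification releases may well differ.

```python
# Status as described in this article; treat the table as a snapshot
# rather than as a reference.
STATUS = {
    "base integer": "frozen",
    "integer multiplication and division": "frozen",
    "atomics": "frozen",
    "single-precision floating point": "frozen",
    "double-precision floating point": "frozen",
    "quad-precision floating point": "frozen",
    "compressed instructions": "frozen",
    "hypervisor": "in flux",
    "bit manipulation": "in flux",
    "transactional memory": "in flux",
    "user-level interrupts": "in flux",
}


def parts_in_flux(required):
    """Return the required ISA parts that are not listed as frozen.

    Anything missing from the table is treated as not frozen.
    """
    return sorted(part for part in required if STATUS.get(part) != "frozen")


wanted = ["base integer", "atomics", "bit manipulation", "hypervisor"]
print(parts_in_flux(wanted))   # ['bit manipulation', 'hypervisor']
```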

Worse is probably the immaturity of the available HDL cores and software tools for RISC-V. With the stabilization of the ISA sets taking time, it’s no surprise that few cores and tools offer or expect anything beyond the basic (RV32I or RV64I) functionality. Without many more ISA sets being finished and incorporated into silicon, a bystander might entertain the thought that the major contribution of RISC-V to this renewed ISA war lies neither in any inherent superiority nor in any long-term commercial viability of its own.

Showing How It’s To Be Done

Back in 2000 when the OpenRISC project took off, it appeared that the market didn’t quite have the appetite for open and freely available ISAs and associated processor designs. Today that seems to be quite different, and it was RISC-V, not OpenRISC, that kicked off the change in corporate thinking that led IBM to open up its Power ISA, with the MIPS ISA and, to a limited extent, even the ARM ISA following suit. RISC-V having DARPA funding when OpenRISC did not probably played a role here too, but who is counting?

Regardless of such details, it seems that the computer hardware industry has embarked on a new path, one where even a hobbyist has access to a number of well-supported HDL cores and is free to experiment with the ISA. Right now one can pick between fully open MIPS, SPARC, Power, RISC-V, and SuperH cores, with maybe some day a fully open ARM core becoming reality as well.

In some ways it evokes flashbacks to the 1980s, when amidst the rapidly growing home computer market, multiple CPU manufacturers struggled to make their ISA and their chips to be the most popular, with Zilog’s Z80 and of course the 6502 being strong 8-bit contenders before a little upstart called ‘Intel’ began to make inroads, culminating in the seemingly complete disappearance of ISA diversity on the desktop and most recently in video consoles.

Here’s to Diversity

I wouldn’t go so far as to say I have a longing for the days of dissimilar platforms (lest someone call me a daft bastard). Anyone working in the software industry during the formative years of personal computing will find themselves regressing through the traumatic memories of porting software between the Commodore 64 and ZX Spectrum. Thinking that we have it so much better now is not such an extreme position to take.

That said, everyone with a sense of what competition means can see that a world with only Intel, or only AMD, or only ARM, or only RISC-V processors in everything would be rather dull indeed. It is the bouncing off of ideas, of comparing differences, that keeps people thinking and that keeps innovation going. Modern software practices should mean that cross-platform compatibility isn’t as much of an issue as it was back in the 1980s and 1990s.

Here’s to an open, diverse future in the world of ISAs.

36 thoughts on “RISC-V: Why The ISA Battles Aren’t Over Yet”

  1. Thank you for the post, interesting review :-)
    I wonder why very wide instruction words with multiple interleaving, which have great potential to mix SIMD & MIMD together, seem to have stalled to a degree as far as instruction structure development goes. They have mostly stayed at the high end, with not much if anything filtering down to the lower end, which still seems predominantly SISD and is hardly even moving to SIMD…
    Cheers

    1. Because compilers ended up not being able to use VLIW effectively. I spent whole days getting the last tick out of microcode for the Pixar image computer. That sort of architecture is great for functions where someone is willing to optimize that way. You can end up with a highly performing image library. But not so much for general-purpose compiler code.

    2. memory bandwidth is the bottleneck on modern systems, the trend is toward compression of the code for higher performance. Super-tiny instructions are the way to go for best performance, this is common knowledge for a long time, see ARM Thumb. ARM has had Neon SIMD features since 2009. MIPS has had SIMD features since 2006. Today SIMD features are available on ARM Cortex M4 chips, billions sold, that run on watch batteries and cost less than $1.

        1. Get a real job with a shoddy boss and then be idealistic about developing great code. You’ll talk differently once you are employed but need to apply for a welfare check because your ideals have brought a 6 month delay and you won’t be paid. This is real life, not academia.

      1. Which is why RISC-V has the C or Compressed instructions. The performance is actually better than Thumb2 once library calls for register spills are taken into account. Read Andrew Waterman’s paper on RISC-V compressed instructions.

      2. Bandwidth? Why not just slap a HBM2 module on top of the cpu? That’ll solve any memory bottleneck you code into. Also, the author seems to think not having prediction will be an energy/performance detriment. The Asynchronous nature of these chips makes prediction nigh obsolete, IMPROVING performance, efficiency, and security.

        1. Read again. He doesn’t talk about prediction, but predication. I don’t know enough about low-level CPUs, if this is important at all. IBM, not a small player, thought it was not important enough to include it in the PowerPC. But of course, a PowerPC processor has a branch prediction unit.

        2. HBM2 is still a massive bottleneck, with round trip of a single memory request in hundreds of clock cycles.

          And this is exactly the reason why OoO is so efficient. You cannot statically schedule instructions if you have unpredictable delays.

          Now, predication is really bad, it makes OoO vastly more complicated than it should have been. Implicit dependencies that may really never become dependencies block it from reordering instructions.

    1. Hmm, I didn’t know that !
      Have you felt anything which suggests a worry ?
      Raises potentially important issues of :-
      Side effects, efficiency, speed, compatibility, correction issues as in Consequences Or anything ???
      Could there be a clear benefit or overall by virtue of compiler to assembler patterns is it perchance neutral non sequitur by final analysis ?

  2. It is not true that every other major architecture has the predication feature. As the Wikipedia article says, PowerPC, for example, doesn’t have this feature. And I don’t think that you need more energy to do branch prediction without predication, it just gets a bit more complicated to implement in hardware. But every big CPU does this already, independently of predication.

    And you are right that many of the advanced features are still in flux. But nevertheless there are official documents for it at riscv.org, so I don’t see how this causes fragmentation of the RISC-V ecosystem. There are many discussions on the official RISC-V mailing lists about these features, and when the vendors tested different ideas, the best idea will be documented in the official RISC-V standard documents. Much better than the proprietary development process and implementations of the other CPU architectures like ARM.

    And what is already frozen, the RV32I and RV64I base architecture and the floating point extension, could potentially replace every single low to medium power microcontroller on the market these days. Then the vendors don’t have to pay for an ARM license, so the chips could get cheaper. So a win for everyone, except maybe ARM. You didn’t get paid by ARM for writing this article? :-)

    1. The big thing with microcontrollers is though not performance.
      But rather real time control, something a high performance oriented architecture rarely ever even focuses on.

      Though, this is more down to hardware implementation of the ISA. (unless the ISA makes requirements that certain features needs to be one cycle long, and others multiple cycles, not to mention “odd” fetch systems and such where one reads multiple instructions that together works with the data/registers. But I haven’t seen such in RISC-V though.)

      Though, ARM as well isn’t really well suited for real time control for all types of applications.
      So RISC-V will likely not replace AVR and PIC, among other microcontroller architectures out there.

      After all, performance isn’t critical in most microcontroller applications. (And if performance is more important, then maybe offload that to a SoC via a UART or something and leave the time critical stuff to the micro.)

      This is a big reason why micros aren’t using instruction pipelines, typically do 1 instruction every 2 or more cycles, and don’t do any out of order execution. (Though, there are multi-cored micros on the market, but I haven’t poked at them myself, but I can see potential for them in some applications. With enough cores, they could likely rival FPGAs in a lot of applications.)

  3. Technically, RISC isn’t about getting down to one-cycle instructions, since clearly, later RISC generations were able, with super-pipelining and multiple execution units to exceed one instruction per cycle (as did competing architectures).

    The key idea for RISC is to optimise execution for silicon space: to make common operations fast and uncommon operations work. Reducing the instruction set is a by-product of that: infrequent instructions are a waste of silicon as are infrequent addressing modes for accessing data.

    Even early RISC processors added to the complexity of their CPUs by increasing the number of operand fields in an instruction; e.g. 3 operand ALU instructions or even with the 32-bit ARM ISA instructions that combine an ALU operation (with 3 operands) where a constant or register shift or rotate is applied to one source, before the operation and the operation itself being conditional as well as a flag for writing back the result.

    Or consider the pipelining of early RISC CPUs, which added considerable complexity to the design, even to the extent of having to provide forwarding of operands to prevent stalls. CISC processors didn’t employ these kinds of architectural features until later.

    In all cases, it’s really just a question of shifting the use of Silicon to a more efficient use-case – and that too can vary depending on whether we prize potential performance, or code density or power efficiency or, in the case of ARM, a RISC processor designed for Assembler programming and 6502 emulation (because its primary application was an implementation of BBC Basic).

  4. Just to give my own two cents to the topic, but: “A Reduced Instruction Set Computer (RISC) is a type of ISA which focuses on creating an instruction set that requires only a limited number of processor cycles to execute a single instruction. ” Is a sentence I find very incorrect.

    A RISC architecture is following the philosophy that fewer instructions means that it will be easier to work with and implement in hardware. Since we can always express more advanced functions and operations with code, instead of spending “expensive” hardware resources on the task. (After all, only a few lines of code could need tens of thousands of transistors to be effectively realized in hardware. (While the tiny amount of RAM to store that code could only need a few hundred transistors.))

    Unlike CISC architectures, where one doesn’t make such limitations and rather just adds every imaginable instruction one can need. This is one reason why x86, for example, has some 6000+ different instruction calls, while ARM only has a few hundred. (There isn’t any CPU on the market supporting all these different instruction calls, since they rarely if ever have all instruction set extensions.)

    But the main difference between these philosophies is that in RISC we migrate functions into the software domain, and in CISC we do it with hardware instead. “The line in the sand” between them is somewhere around 30-50 instructions, depending on who one asks. (Not including instruction calls, as an example, x86 has 96 conditional jump instruction calls, each one having its own set of conditions, but they are all the same instruction in the end. So “conditional jump” is 1 instruction, with 96 different variations in its conditions.)

    And to continue:

    “any CPU based on Intel’s x86 CISC ISA and successors (like AMD’s 64-bit extensions) being internally a RISC processor with a CISC ISA decoder front-end that breaks CISC instructions into the RISC instructions (micro-opcodes) for its CPU core. At least as far as the CISC versus RISC ISA wars go, here we can say that RISC decidedly won.”

    Is also rather incorrect. Yes, a lot of instruction calls do initiate microcode execution, but there is still a hefty amount of instructions that make use of dedicated hardware, since the dedicated hardware executes far faster (and can typically be done in parallel together with other tasks thanks to out of order execution). Practically every x86 CPU is largely still a CISC processor, even from the hardware perspective. (And even ARM is also wandering towards being a CISC architecture today.)

    As far as RISC-V goes, the sentence: “Also interesting in the RISC-V ecosystem is that adding custom instruction sets without any kind of approval process is encouraged.” Is likely a big reason for its being seen as popular, though from a machine code perspective, this means that it will be literally hell to work with, and potentially riddled with security flaws as far as the eye can see….

    (Not like we don’t have software examples of such environments already. At least in software we can give a library a long, usually unique name. Good luck having a unique instruction call in hardware where you typically only have 8-32 bits to work with. Not to mention that two manufacturers might have slightly different implementations of what should be the same thing… (Each making their own “improvements” to the function, thereby diverging them from each other for edge cases, or one/more manufacturer unknowingly introducing security flaws…))

    Then there is the whole segment about predication and branch prediction. Well, knowledge is an amazingly double edged sword. But you don’t need a condition register to implement efficient branch prediction. (Just like one doesn’t need an accumulator register (this can make out of order execution simpler), or state flags, and frankly, one doesn’t even need a program counter to build a CPU to be fair…. But then one is going a bit too far out of line in my own opinion.)

    But I have to agree, “schizophrenic” is a very well fitting word for describing RISC-V in general, it is a mixed bag without any real goals, other than trying to be “The one ISA for all future processors”, something it frankly will never be, mostly due to the very unorganized nature of an open source system encouraging people to just add in new features without any form of approval, validation or compatibility-checking process…

    In the end, RISC-V is just another Instruction Set Architecture.

    But here is the thing, I don’t design or pick a CPU based on the ISA. I pick a CPU based on its performance in a given task/workload. And when I design computing systems, I start at the workload and build hardware for it, the ISA will form naturally based on that hardware. So as far as I am concerned, RISC-V is just a premade bridge that might not at all fit my application. And designing an ISA is so dead simple that it honestly takes more effort to read up on RISC-V than it is to just make a new ISA for the new architecture one designed.

    For those who, on the other hand, like the idea that RISC-V gives their hardware the ability to just run someone else’s code, then yes, the development/learning-curve becomes a bit easier if you don’t need to learn a new ISA each time you switch CPU (or if you sell a processor, your potential customer will be easier to win over if they already know the software part of it). But then I like to ask, when did you ever program in raw machine code?….

    After all, the more CPU oriented optimizations will be implementation specific (as in hardware), not ISA specific. (Not to mention that learning the machine code of a processor is typically not that hard, and the vast majority of people only go down to assembler, that is still rather far from machine code… And most manufacturers do give you a fairly substantial amount of documentation, after all, they want you to use their system. (NDAs can though at times be a thing.))

    So frankly, I see what RISC-V wants to be, but yes, I can also offer you a free, open source, instruction set architecture with all the bells and whistles, and then state that it is up to you to figure out how to implement it. And then make most of it optional for you to even include, so you can more easily implement it. Then state that you are free to add more instructions and features that I frankly was too lazy to think of myself. Or you can just make your own ISA that actually does what you want in a way that is logical for your hardware and your users of it. (And you save time too, since making an ISA is dead simple…)

      1. WD eats their own cooking. They’re not selling R-V to customers, they’re selling products with R-V inside of them. For most of their end-users it makes exactly 0 difference compared to how it was before RISC-V.

        1. Though, WD does save some money, since they don’t need to do as much development on the software side of things. And I would be surprised if their chip differs much at all from an off the shelf RISC-V core. Potentially maybe having a handful of more “storage” oriented instructions, or encryption acceleration.

      2. Fun thing is, I am not skeptical about RISC-V.

        It is an architecture, like any other.

        Though with a focus on being open source, and letting people generally do what they feel like without much organization behind those development efforts. This is both a nice thing, and a horrid one at once. (Since it can lead to multiple conflicting functions using the same instruction calls. Meaning that one can’t just identify if an instruction exists, but also what instruction it is, etc…)

        But I do have to say that the lofty goal of RISC-V becoming the go to ISA for future processors is laughably unrealistic for a whole slew of reasons.

        It will though most likely still be used where it makes sense.

    1. It comes down to tools and infrastructure and community. And RISC-V is getting all of the above. It might be ‘easy’ to roll a new ISA but it is not easy to roll a Clang or GCC backend, a pile of books, a couple dozen (seriously) decent quality free cores you can run on your FPGA, and some silicon implementations from major manufacturers to boot.

      Honestly, you’ll find RISC-V in a lot of things very soon. Will you find it as the main CPU on your desktop? No more than you find ARM there, no. And probably not as the main processor in most mobile devices. But it’ll be all over the place in peripherals and in microcontroller applications. Or as the supporting processors in FPGAs. Because one can grab the core and get compilers, documentation for it, and also software engineers skilled in it.

      1. I agree: The real advantage of RISC-V is that it comes with an open source tool chain, and the base ISA is simple enough for anyone to support in a small custom CPU/soft processor.

        Apart from that I agree with the article that RISC-V is not the be-all and end-all ISA that some seem to hint at, simply because there is no such thing as a universally optimal ISA.

        To add to the mix of ISA:s, here’s another RISC ISA: the MRISC32. See:

        https://www.bitsnbites.eu/the-mrisc32-a-vector-first-cpu-design/

        https://www.bitsnbites.eu/some-features-of-the-mrisc32-isa/

  5. Predication harms OoO, that’s why even ARM dropped it. And its advantages for simple in order cores are negligible too. If you want a unified ISA suitable for both OoO and in-order, you have no choice but to avoid predication and other similar implicit instruction effects.

  6. I’ve found RISC-V to be an ideal soft processor ISA for FPGAs. It has well-defined custom instruction opcode spaces which are easy to support in the provided assembler and compiler tool flows. Custom features and extensions implemented in the FPGA are not just provided for, they’re actually encouraged. ISA validation tests are included which makes writing your own core far smoother.

  7. From the article: “multiple CPU manufacturers struggled to make their ISA and their chips to be the most popular, with Zilog’s Z80 and of course the 6502 being strong 8-bit contenders before a little upstart called ‘Intel’ began to make inroads”.

    Intel, a little upstart? You have this quite backwards.
    Intel was founded in 1968, so they were well-established by 1980. They’d built several generations of microprocessor by then (as well as having *invented* the microprocessor).

    The Z80 was an affordable clone of the Intel 8080. Zilog was founded by people who left Intel after working on the 8080.

    Intel was only an upstart compared to Texas Instruments and Fairchild Semiconductor, which were founded a decade earlier. Even so, one of Intel’s founders (Robert Noyce) was a co-inventor of the integrated circuit.

  8. “At least as far as the CISC versus RISC ISA wars go, here we can say that RISC decidedly won”

    Oh, really? Are you sure?
    The current dominant ISAs are x86 and ARM.

    x86 is not a RISC, of course. And the “there is a RISC inside” narrative is quite misleading (I could be less polite). It’s about using pipelining instead of sequential instructions. The internal format of micro-ops and pipeline signals do not make a RISC instruction set, it’s just that the microarchitecture of pipelined CPUs can be similar for RISC and CISC after the decode and microcode stages.

    ARM has always designed ISAs with quite complex instructions, such as combined shift/op, predication, flexible addressing modes, multiple register load/store, etc.

    The tendency these last 20 years has been to add instructions. Lots and lots of new instructions for floating point, SIMD/media/3D, crypto,…

    The initial idea of RISC (“reduced”!), getting rid of the cruft of seldom used instructions and addressing modes, has failed. Of course it is possible to make small embedded cores with just a minimalistic instruction set. But for highest performance, all modern CPUs have hundreds of instructions.

    An important reason is that frequency gains have been a thing of the past since the early 2000s, and OoO tricks are not sufficient to occupy all execution resources. And now transistors are cheap and parts can be powered off to lower current draw. So adding a special instruction just for calculating a CRC or for some combination of multiply add multiple on 8 or 16 bits quantities makes sense, even if it represents 0.01% of the average use of the CPU. As performance and complexity are now more dependent on caches, prefetching, branch prediction, as the main issues are about memory latency, having a bloated ALU, FPU and instruction set isn’t a disadvantage.

    The ideas of RISC from 30 years ago, minimalistic early SPARC and MIPS instruction sets, are obsolete as is the distinction between RISC and CISC.

  9. The 6502 is usually reported as having 3510 transistors. As of 2019, the largest transistor count in a commercially available microprocessor is 39.54 billion MOSFETs, in AMD’s Zen 2. That’s roughly equivalent to 11 million 6502’s.

  10. A rather late reply, but I think it’s worth pointing out that the MIPS Open program stopped two days after the post: https://news.mynavi.jp/article/20191115-923870/

    In other news Cobham Gaisler, makers of the open-source LEON CPU as the last remaining SPARC implementation (Oracle and Fujitsu already having bailed), is now moving to RISC-V: https://www.hpcwire.com/off-the-wire/de-risc-to-create-first-risc-v-fully-european-platform-for-space/
