Now The V In RISC-V Stands For VRoom

March 26, 2022

Hundreds of variations of open-source CPUs written in an HDL seem to float around the internet these days (and that’s a great thing). Many implement RISC-V, an open instruction set architecture (ISA), and are small toy processors useful for learning and small tasks. However, if you’re [Paul Campbell], you go for a high-end super-scalar, out-of-order, speculative, 8 IPC monster of a RISC-V CPU known as VRoom!.

That might seem a bit like word soup to the uninitiated in the processor design world (which is admittedly relatively small), but what sets this apart from something like VexRiscv is the scale and complexity. Rather than executing one instruction at a time in program order, it issues multiple instructions per clock and completes them concurrently in whatever order it can handle. VexRiscv is a good 32-bit modular design that can run Linux. It pulls a solid 1.57 DMIPS/MHz with everything turned on. VRoom! already clocks in at a mighty 6.5 DMIPS/MHz, roughly four times that, with more performance gains expected. It peaks at 8 instructions every clock cycle, with a dual register file and a clever committing system to keep up.
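
To make the in-order versus out-of-order distinction concrete, here is a minimal toy model in Python. It is only a sketch and not a description of VRoom!'s actual microarchitecture: the `Instr` type, the `cycles_to_finish` function, the register names, and the latencies are all made up for illustration. It also assumes perfect register renaming (only true read-after-write dependencies matter), unlimited execution units, and no branches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction: writes `dest`, reads `srcs`, takes `latency` cycles."""
    dest: str
    srcs: tuple
    latency: int = 1


def producers(program):
    """For each instruction, list the earlier instructions whose results it reads."""
    last_writer = {}
    deps = []
    for i, ins in enumerate(program):
        deps.append([last_writer[s] for s in ins.srcs if s in last_writer])
        last_writer[ins.dest] = i
    return deps


def cycles_to_finish(program, width, in_order):
    """Return the cycle at which the last result is available.

    `width` is the maximum number of instructions issued per cycle.
    With `in_order=True`, issue stops at the first instruction whose
    operands are not ready; otherwise later ready instructions may pass it.
    """
    deps = producers(program)
    done_at = {}  # instruction index -> cycle its result becomes available
    pending = list(range(len(program)))
    cycle = 0
    while pending:
        issued = []
        for i in pending:
            if len(issued) == width:
                break
            ready = all(d in done_at and done_at[d] <= cycle for d in deps[i])
            if ready:
                done_at[i] = cycle + program[i].latency
                issued.append(i)
            elif in_order:
                break  # an in-order machine stalls behind the blocked instruction
        pending = [i for i in pending if i not in issued]
        cycle += 1
    return max(done_at.values(), default=0)


# Registers x, y and z are assumed to hold values before the program starts.
demo_program = [
    Instr("a", ("x",), latency=3),  # slow operation, e.g. a load
    Instr("b", ("a",)),             # must wait for the slow result
    Instr("c", ("y",)),             # independent work
    Instr("d", ("z",)),             # independent work
    Instr("e", ("c", "d")),         # depends on the two independent results
]

print("in-order, 1-wide:    ", cycles_to_finish(demo_program, 1, True))
print("out-of-order, 2-wide:", cycles_to_finish(demo_program, 2, False))
```

On this five-instruction program the single-issue in-order model needs 7 cycles, while the two-wide out-of-order model finishes in 4, because the independent instructions slip past the one waiting on the slow result. A real core adds renaming hardware, a commit stage, and branch speculation on top of this idea.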

VRoom! is written in SystemVerilog to leverage Verilator (a handy linting and simulation framework), and while there is some C that generates different files, we’d wager it is pretty run-of-the-mill compared to a TypeScript-based project. VRoom! currently boots Linux thanks to an AWS-FPGA instance (a Xilinx VU9P UltraScale+), though it has to be trimmed to fit. [Paul] has big plans, working his way up to a server-class chip with lots of cores and a huge cache.

It’s all on GitHub under a GPLv3 license; go check it out! [Paul] also has a talk with lots of great details. If you’re interested in getting into RISC-V but a server-class chip isn’t your speed, we hear Espressif is starting to use RISC-V cores in their ever-popular ESP series.

24 thoughts on “Now The V In RISC-V Stands For VRoom”

Paul Campbell says:

March 26, 2022 at 4:15 am

Paul Campbell here – AMA

A minor nit – I’m currently building in Verilator rather than Icarus (I’ve started using System Verilog Interfaces) – I’d love to still be using Icarus (I’m a big fan of ‘X’ values in simulations for finding bugs).

I released some minor changes (noted in the blog): DMIPS/MHz is up to 6.5. I’m in the middle of architectural tuning, and I expect it to continue to increase.

1. Meow says:
  
  March 26, 2022 at 4:55 am
  
  Thanks for highlighting you’re keeping a blog.
  
  I’d love to subscribe, but I cannot find RSS or Atom feed for it.
  
  1. Paul Campbell says:
    
    March 26, 2022 at 5:50 am
    
    Rats I thought I’d fixed that – I’ll go and poke at it a bit more tomorrow – sorry
    
2. Matthew Carlson says:
  
  March 26, 2022 at 5:48 am
  
  I quite liked your explanation of why x values are handy for both verification and synthesis. Good luck finding funding/hardware to keep developing this on.
  
  I wonder if there’s any good open source tooling that splits a design across multiple FPGAs. The stuff I’m familiar with is all proprietary, and professionally I switched to firmware a few years ago.
  
  1. Paul Campbell says:
    
    March 26, 2022 at 4:26 pm
    
    I think that splitting something across chip boundaries is a tough problem. If you want real speed you likely need to pipeline the die crossings with flops at each end, and to minimise the number of wires – probably this is something you are going to need to actually build into your architecture
    
3. Kevin Harrelson says:
  
  March 26, 2022 at 6:35 am
  
  You could try SV unions. They are about as good as interfaces. Modports are nice in theory, but generally not worth the hassle.
  
  1. Paul Campbell says:
    
    March 26, 2022 at 4:45 pm
    
    The big thing I’m after is arbitrary sized arrays of structured things – if I could pass arrays of structures I’d be happy – the main problem I’m trying to deal with using interfaces is how to pass N instances of an interface that can be parameterised at compile time (for example building a system with 6 address units and a 6-read-port TLB for simulation, and a 4 address unit system with 4 TLB read ports for the FPGA testing environment) – this design is heavily parameterised so I can do quick architectural exploration, but simple Verilog makes that hard in some areas
    
4. Matthew Carlson says:
  
  March 26, 2022 at 7:09 am
  
  Thank you! Made a slight tweak to reflect this. There was a ~ in front of 6.5 in your notes, so I didn’t want to misquote you if you were still playing around with it. The Icarus is just a mistake on my part, sorry.
  
  1. Paul Campbell says:
    
    March 26, 2022 at 4:30 pm
    
    No problem – since I’d just written a note about having to give up Icarus I wanted to give credit where it’s due – Icarus is still a great simulator. The ~6.5 (really 6.49) was only announced in yesterday’s blog post – it’s still a moving target – next big change will be a couple of weeks out.
    
5. M says:
  
  March 27, 2022 at 5:42 pm
  
  Have you heard of the work being done by LibreSoC?
  
Tony Liechty says:

March 26, 2022 at 4:52 am

That’s pretty cool! Do you see this going into FPGAs or ASICs more? The picture you have inside a Xilinx/AMD FPGA is pretty cool too. That seems pretty performant vs arm cores, and love that it’s open source.

1. Paul Campbell says:
  
  March 26, 2022 at 5:50 am
  
  This is more something that one would build as an ASIC – for this sort of thing FPGAs are more a tool to get lots of testing done – it really needs actual datapaths and an ASIC
  
2. Sweeney says:
  
  March 26, 2022 at 3:08 pm
  
  It’s rather large for FPGA use. You’d want a small and efficient core design for an FPGA project, not one that takes up most of a 2.5M LE Virtex (in cut down form, for testing).
  
bill Rowe says:

March 26, 2022 at 6:29 am

I wondered whether, because you’re starting from the ground up, you were able to avoid the security exposures in speculative execution.

1. Alexander Wikström says:
  
  March 26, 2022 at 8:27 am
  
  Out of order execution and speculative execution aren’t mutually inclusive.
  
  Out of order is about executing instructions that have all their data available.
  
  Speculative execution, meanwhile, runs instructions before it has been determined whether they should run at all, typically in regard to branches and branch prediction.
  
  Though, a lot of the issues with speculative execution are not about executing a branch that shouldn’t have been taken, but rather the fact that a lot of CPUs just didn’t check if the thread was allowed to read what it asked for to start with in this edge case scenario, likely for overall performance reasons.
  
  1. 789 says:
    
    March 26, 2022 at 9:05 am
    
    Ok, and? That’s not relevant to this discussion. We know that Vroom has speculative execution. It’s stated in the 3rd sentence of this article, and in the side blurb on the Vroom website.
    
    1. Gravis says:
      
      March 27, 2022 at 6:36 am
      
      I believe the point here was that it’s out-of-order execution rather than speculative execution.
      
      1. 789 says:
        
        March 27, 2022 at 7:12 am
        
        But it is speculative execution. This is explicitly said many times. At no point was Vroom being capable of speculative execution a question.
        
2. Sweeney says:
  
  March 26, 2022 at 3:30 pm
  
  The security problems were with speculative fetches not checking for access permissions before the fetch. Speculative execution isn’t bad per se, you just need to ensure that security is observed in the correct sequence also.
  
  1. Paul Campbell says:
    
    March 26, 2022 at 7:30 pm
    
    There’s more to it than just not speculating past privilege (though that’s important). You can potentially leak information by doing clock-level timing of how long things took, and therefore discover whether or not something is in a cache – or whether a test in another privilege level succeeded or not (did it hit in the BTC?). That leaks a bit of data (it’s why VRoom! spends the gates to have multiple BTCs for each priv mode)
    
3. Paul Campbell says:
  
  March 26, 2022 at 4:38 pm
  
  There’s actually a slide on that in the architectural talk – it’s particularly important if you’re aiming for something really big running VMs.
  
  It’s a REALLY hard problem, I’ve been able to learn from others’ mistakes – for example we won’t speculate past a TLB miss (or fault) and fill a cache line. Those things tend to leave performance on the floor. The other thing I’ve been doing is trying to muddy the signal – caches with lots of associative sets, or even fully associative, random replacement algorithms (again that leaves a little performance behind) – RISC-V has an architectural cycle-accurate counter; removing access to that for VMs and/or just user mode is another step in that direction (I think there’s a move for a standard way to do this).
  
  This is just some of the stuff I’ve been doing – it’s a continual issue
  
Suimi says:

March 26, 2022 at 7:59 am

Really impressive work!! The blog is fascinating :-)

j s says:

March 26, 2022 at 7:28 pm

It’s a good thing he was able to rent time on a VU9P. Just the chip alone is over US$60,000.

1. Paul Campbell says:
  
  March 28, 2022 at 12:27 am
  
  and it’s only $1.50 an hour – $60k is probably too high; in reality you can buy boards with VU9P-sized chips for ~$5-10k – when the bitcoin bubble busts they’ll likely be at reasonable prices for mere mortals
  