RISC, Tagged Memory, and Minion Cores

Buy a computing device nowadays, and you’re probably getting something that speaks x86 or ARM. There are other architectures out there for general purpose computing, though, from dual-core MIPS boards to some very strange silicon making its way into dev boards. lowRISC is the latest endeavour from a few notable silicon designers, aiming to run Linux ‘well’ while adding a few novel security features that haven’t been put together this way before.

Two interesting features make the lowRISC notable. The first is tagged memory. This has appeared before in older, weirder computers as a sort of metadata attached to memory: a few extra bits stored alongside each memory word can mark it as executable or non-executable, serve as a memory watchpoint, assist garbage collection, or act as a lock on that word. New instructions are added to the ISA to manipulate and monitor these tags, with the aim of stopping the single most common security problem: the buffer overflow. It’s an extremely interesting application of tagged memory, and something that isn’t really found in a modern architecture.
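As a rough illustration of the idea (not lowRISC’s actual tag format or tag instructions, which haven’t been finalized), here’s a minimal C sketch of software-simulated tagged memory, where every word carries a couple of tag bits that get checked on each store:

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical tag bits -- lowRISC's real encoding will differ. */
#define TAG_NONE      0x0
#define TAG_READ_ONLY 0x1  /* word is locked against writes  */
#define TAG_NO_EXEC   0x2  /* word must never be jumped into */

#define MEM_WORDS 1024

static uint32_t mem[MEM_WORDS];   /* data words             */
static uint8_t  tag[MEM_WORDS];   /* tag bits, one per word */

/* A tagged store: trap if the target word is locked. */
static void store_word(size_t addr, uint32_t value)
{
    if (addr >= MEM_WORDS || (tag[addr] & TAG_READ_ONLY)) {
        fprintf(stderr, "tag violation: store to word %zu\n", addr);
        abort();   /* real hardware would raise an exception instead */
    }
    mem[addr] = value;
}

int main(void)
{
    tag[10] = TAG_READ_ONLY;     /* e.g. a saved return address        */
    store_word(9, 0xdeadbeef);   /* fine                               */
    store_word(10, 0x41414141);  /* a "buffer overflow" -- trapped     */
    return 0;
}
```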

The second neat feature of the lowRISC is its minion cores. These are small programmable cores tied to the processor’s I/O that work a lot like the programmable logic in a Zynq SOC or the PRUs inside the BeagleBone. Basically, they provide programmable I/O: implementing SPI, I2C, I2S, or SDIO in software, offloading that work from the main core, and handling devices that require very precise timing.
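To get a feel for the kind of job a minion core would take over, here’s a hedged sketch of a bit-banged SPI byte transfer in C. The pin numbers and gpio helpers are hypothetical stand-ins for whatever pin-access primitives the minions actually expose:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical pin numbers -- placeholders, not real minion I/O names. */
#define PIN_SCK  0
#define PIN_MOSI 1
#define PIN_MISO 2

/* Stub GPIO layer so the sketch runs on a PC; on a minion core these
 * would be tightly-timed pin accesses. */
static int pin_state[3];
static void gpio_write(int pin, int level) { pin_state[pin] = level; }
static int  gpio_read(int pin)             { return pin_state[pin]; }

/* Clock one byte out on MOSI while sampling MISO (SPI mode 0). */
static uint8_t spi_transfer_byte(uint8_t out)
{
    uint8_t in = 0;
    for (int bit = 7; bit >= 0; bit--) {
        gpio_write(PIN_MOSI, (out >> bit) & 1); /* present data bit  */
        gpio_write(PIN_SCK, 1);                 /* rising clock edge */
        in = (uint8_t)((in << 1) | (gpio_read(PIN_MISO) & 1));
        gpio_write(PIN_SCK, 0);                 /* falling edge      */
    }
    return in;
}

int main(void)
{
    printf("sent 0xA5, read back 0x%02X\n", spi_transfer_byte(0xA5));
    return 0;
}
```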

The current goal of the lowRISC team is to develop the hardware on an FPGA, with some beta silicon following in a year’s time. The first complete chip will be an embedded SOC, hopefully released sometime around late 2016 or early 2017. The ultimate goal is an SOC with a GPU that would be used in mobile phones, set-top boxes, and Raspi- and BeagleBone-like dev boards. There’s no shortage of talent on the team, which includes [Robert Mullins] and [Alex Bradbury] of the University of Cambridge and the Raspberry Pi Foundation, researchers at UC Berkeley, and [Bunnie Huang].

It’s a project still in its infancy, but the features these people are going after are very interesting, and they simply aren’t being done on other platforms.

[Alex Bradbury] gave a talk on lowRISC at ORConf last October. You can check out the presentation here.

Building new, weird CPUs in FPGAs


The popularization of FPGAs in the hobbyist market means a lot more than custom LED controllers and clones of classic computer systems. FPGAs are also a great tool for experimenting with computer architecture and creating new, weird CPUs that don’t abide by the conventions the industry has used for 40 years. [Victor] is designing a new CPU that rethinks how code reaches distant memory locations, and in the process he even came up with a bit of example code that runs on an ARM microcontroller.

Most of the time, the machine code running on your desktop or laptop isn’t that interesting; it’s just long strings of instructions processed one after another. The magic of a computer comes from comparisons and branches: an if statement or a jump, where the CPU runs one of two pieces of code depending on a value in a register. There is the problem of reach, though: if a piece of code makes a direct call to another piece of code, the target of the jump must fit within the instruction itself. On an ARM processor, only 24 bits are available to encode that offset, meaning a jump can only reach about 16 MB on either side of the call site. Going any further requires extra instructions, with the performance hit that comes along with them.
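To make the reach arithmetic concrete, here’s a small C sketch that sign-extends a 24-bit branch field and turns it into a byte offset. The halfword scaling is an assumption chosen to match the ±16 MB figure above (Thumb-style encoding); it isn’t a full decoder for any particular ARM instruction:

```c
#include <stdint.h>
#include <stdio.h>

/* Sign-extend a 24-bit field to 32 bits. */
static int32_t sext24(uint32_t field)
{
    return (int32_t)(field << 8) >> 8;
}

/* Assumed scaling: the offset counts 2-byte halfwords, so a signed
 * 24-bit field reaches +/- 2^23 halfwords = roughly +/- 16 MB. */
static uint32_t branch_target(uint32_t pc, uint32_t field)
{
    return pc + (uint32_t)(sext24(field) * 2);
}

int main(void)
{
    uint32_t max_fwd = 0x7FFFFF;   /* largest positive 24-bit field */
    printf("max forward reach: %d bytes (%.1f MB)\n",
           sext24(max_fwd) * 2, sext24(max_fwd) * 2 / (1024.0 * 1024.0));
    printf("branch at 0x1000 with field 0x000100 lands at 0x%X\n",
           branch_target(0x1000, 0x000100));
    return 0;
}
```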

[Victor]’s solution to this problem is a bit of circuitry that acts as a sliding window onto address space. Instead of storing a literal target address, every branch in the code is stored as a location relative to whatever is in the program counter, with the window sliding along as execution moves. The result is an easy way to JMP to code very far away in memory with less of a performance hit.
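The write-up leaves the details open, so the following C sketch is only one interpretation of the idea, not [Victor]’s actual design: branch “tokens” hold short signed offsets relative to a window base that tracks the program counter, rather than absolute addresses:

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed parameters: a 16-bit signed token, word-scaled, and a window
 * whose base follows the program counter. All of this is illustrative. */
typedef struct {
    uint32_t base;   /* where the window currently sits */
} window_t;

/* Keep the window anchored to the current program counter. */
static void window_slide(window_t *w, uint32_t pc)
{
    w->base = pc;
}

/* Resolve a short branch token into a full target address. */
static uint32_t resolve(const window_t *w, int16_t token)
{
    return w->base + (uint32_t)((int32_t)token * 4); /* 4-byte words */
}

int main(void)
{
    window_t w;
    uint32_t pc = 0x00400000;

    window_slide(&w, pc);
    printf("token +100 -> 0x%08X\n", resolve(&w, 100));  /* nearby code */

    pc = 0x0A000000;       /* execution has moved far away...           */
    window_slide(&w, pc);  /* ...and the window moved with it           */
    printf("token -8   -> 0x%08X\n", resolve(&w, -8));
    return 0;
}
```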

There’s a proof-of-concept of this sliding window scheme that [Victor] whipped up for an NXP ARM Cortex-M3 microcontroller, and he’ll be working on an implementation of the concept in a new CPU over on his git.

Massively parallel CPU processes 256 shades of gray


The 1980s were a heyday for strange computer architectures; instead of the von Neumann architecture found in today’s desktop computers or the Harvard architecture of a microcontroller, a lot of companies experimented with exotic parallel designs. While rarely used today, these were some of the most powerful computers of their day and served as the main research tools of that decade’s AI renaissance.

Over at the Norwegian University of Science and Technology, a huge group of students (13 members!) designed a modern take on the massively parallel computer. It’s called 256 Shades of Gray, and it processes 320×240 pixel 8-bit grayscale graphics like no microcontroller could.

The idea for the project was to create an array-based parallel image processor with an architecture similar to the Goodyear MPP formerly used by NASA or the Connection Machine found in the control room of Jurassic Park. Unlike those earlier machines, this array processor is implemented in an FPGA, giving rise to the team’s Lena processor. It is in turn controlled by a 32-bit AVR microcontroller with a custom-built VGA output.

The entire machine can process 10 frames per second of 320×240 grayscale video. There’s a presentation video available (in Norwegian), but the highlight might be their demo of the Game of Life rendered in real time on their computer. An awesome build, and a very cool experience for all the members of the class.
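As a rough illustration of what an array processor parallelizes, here’s a plain sequential C sketch of one Game of Life step over a 320×240 grid. On a machine like this, each processing element would update its own cell (or strip of cells) simultaneously instead of looping over them:

```c
#include <stdint.h>
#include <string.h>

#define W 320
#define H 240

/* One Game of Life step. Sequential C for clarity; an array processor
 * would update every cell in parallel, one per processing element. */
static void life_step(const uint8_t in[H][W], uint8_t out[H][W])
{
    for (int y = 0; y < H; y++) {
        for (int x = 0; x < W; x++) {
            int n = 0;
            for (int dy = -1; dy <= 1; dy++) {
                for (int dx = -1; dx <= 1; dx++) {
                    if (dx == 0 && dy == 0)
                        continue;
                    int yy = (y + dy + H) % H;   /* wrap at the edges */
                    int xx = (x + dx + W) % W;
                    n += in[yy][xx] ? 1 : 0;
                }
            }
            /* Standard rules: survive with 2-3 neighbours, born with 3.
             * Live cells are written as full white for the VGA output. */
            out[y][x] = (in[y][x] ? (n == 2 || n == 3) : (n == 3)) ? 255 : 0;
        }
    }
}

int main(void)
{
    static uint8_t a[H][W], b[H][W];
    memset(a, 0, sizeof a);
    a[10][10] = a[10][11] = a[10][12] = 255;   /* a blinker */
    life_step(a, b);
    return 0;
}
```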