Down The Intel Microcode Rabbit Hole

The aptly-named [chip-red-pill] team is offering you a chance to go down the Intel rabbit hole. If you learned how to build CPUs back in the 1970s, you were taught that your instruction decoder would, for example, note a register-to-register move and then light up one register to write to a common bus and another register to read from it. These days, it isn’t that simple. Rather than implementing each instruction directly in hardware, modern processors translate instructions into an underlying instruction set: each one has microcode that causes the right things to happen at the right time. But Intel encrypts their microcode. Of course, what can be encrypted can also be decrypted.
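
To make the hardwired-versus-microcoded contrast concrete, here is a toy control-store sketch in C. Every name, bit, and layout below is invented purely for illustration; it bears no relation to Intel’s real (and encrypted) microcode format.

```c
#include <stdint.h>
#include <stdio.h>

/* Toy example only: a microcoded control unit stores, for each macro
   instruction, a short list of control words saying which lines to assert
   on each clock. The names and layout here are invented for illustration. */
enum {
    CTL_SRC_OUT_EN = 1u << 0,  /* source register drives the common bus      */
    CTL_DST_IN_EN  = 1u << 1,  /* destination register latches from the bus  */
    CTL_PC_INC     = 1u << 2,  /* advance the program counter                */
    CTL_END        = 1u << 3,  /* last micro-step of this macro instruction  */
};

typedef struct {
    uint32_t control;          /* control lines asserted this micro-cycle */
    uint8_t  src, dst;         /* register-file selects                   */
} micro_op;

/* "MOV r2, r1": one cycle on the bus, then finish and bump the PC. */
static const micro_op mov_reg_reg[] = {
    { CTL_SRC_OUT_EN | CTL_DST_IN_EN, 1, 2 },
    { CTL_PC_INC | CTL_END,           0, 0 },
};

int main(void)
{
    for (unsigned i = 0; i < sizeof mov_reg_reg / sizeof mov_reg_reg[0]; i++)
        printf("step %u: control=0x%X src=r%u dst=r%u\n", i,
               (unsigned)mov_reg_reg[i].control,
               mov_reg_reg[i].src, mov_reg_reg[i].dst);
    return 0;
}
```

A hardwired decoder bakes the equivalent of that table into gates; a microcoded one stores it in a control ROM, which is exactly the kind of thing you can dump and, in Intel’s case, decrypt.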

Using vulnerabilities, you can activate an undocumented debugging mode called red unlock. That makes it possible to dump the microcode, and the decryption keys are inside the dump itself. The team presented a paper on the technique at OffensiveCon22, and you can see a video about it below.

Continue reading “Down The Intel Microcode Rabbit Hole”

Scott’s CPU From The Bottom Up

It isn’t for everyone, but if you work much with computers at a low level, you’ll probably sooner or later entertain the idea of creating your own CPU. There was a time when that was a giant undertaking, but with today’s tools and FPGAs it is… well, not easy, but certainly easier. If you have the urge to try your own, you might have a look at [Simply Explained’s] video series called “Building Scott’s CPU.”

The 11 videos cover everything from basic transistor logic to sequential circuits, moving on to things like ALUs, clock units, and how jump instructions work.

Continue reading “Scott’s CPU From The Bottom Up”

Calculating Pi On The 4004 CPU, Intel’s First Microprocessor

These days we are blessed with multicore 64-bit monster CPUs that can calculate an entire moon mission’s worth of instructions in the blink of an eye. Once upon a time, though, the state of the art was much less capable; Intel’s first microprocessor, the 4004, was built on a humble 4-bit architecture with limited instructions. [Mark] decided calculating pi on this platform would be a good challenge. 

It’s not the easiest thing to do; a 4-bit processor can’t easily store long numbers, and the 4004 doesn’t have any native floating point capability. AND and XOR instructions aren’t available either, and there’s only 10,240 bits of RAM to play with. These limitations guided [Mark]’s choice of algorithm for calculating the only truly round number.

Continue reading “Calculating Pi On The 4004 CPU, Intel’s First Microprocessor”
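
If you’re wondering what kind of algorithm survives constraints like those, here is a minimal C sketch of the classic Rabinowitz-Wagon spigot, which produces pi one digit at a time using nothing but small-integer arithmetic. It’s purely illustrative; the write-up doesn’t say this is the approach [Mark] settled on.

```c
#include <stdio.h>

#define NDIGITS 30                     /* how many digits of pi to print */
#define LEN     (NDIGITS * 10 / 3 + 1) /* mixed-radix scratch length     */

int main(void)
{
    static int a[LEN];
    int i, j, q, x, nines = 0, predigit = 0;

    for (i = 0; i < LEN; i++)
        a[i] = 2;          /* pi = 2 + 1/3*(2 + 2/5*(2 + 3/7*(2 + ...))) */

    for (j = 0; j < NDIGITS; j++) {
        q = 0;
        for (i = LEN - 1; i >= 0; i--) {   /* multiply by 10, carry left */
            x = 10 * a[i] + q * (i + 1);
            a[i] = x % (2 * i + 1);
            q = x / (2 * i + 1);
        }
        a[0] = q % 10;                 /* leftmost cell is plain base 10 */
        q /= 10;

        if (q == 9) {                  /* hold 9s until the next carry   */
            nines++;
        } else if (q == 10) {          /* carry ripples into held digits */
            printf("%d", predigit + 1);
            while (nines > 0) { putchar('0'); nines--; }
            predigit = 0;
        } else {
            if (j > 0)                 /* skip the meaningless leading 0 */
                printf("%d", predigit);
            while (nines > 0) { putchar('9'); nines--; }
            predigit = q;
        }
    }
    printf("%d\n", predigit);          /* digits print as 3141592...     */
    return 0;
}
```

Even this would need heavy reworking to fit in 4-bit registers, but it shows why digit-at-a-time, integer-only methods are attractive on hardware this limited.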

Now The V In RISC-V Stands For VRoom

Hundreds of variations of open-source CPUs written in an HDL seem to float around the internet these days (and that’s a great thing). Many implement RISC-V, an open-source instruction set architecture (ISA), and are small toy processors useful for learning and simple tasks. However, if you’re [Paul Campbell], you go for a high-end super-scalar, out-of-order, speculative, 8 IPC monster of a RISC-V CPU known as VRoom!.

That might seem a bit like word soup to the uninitiated in the processor design world (which is admittedly relatively small), but what makes this different from VexRISC is the scale and complexity. Rather than executing one instruction at a time sequentially, it executes multiple instructions at once, completing them concurrently in whatever order it can handle. The VexRISC chip is a good 32-bit modular design that can run Linux, and it pulls a solid 1.57 DMIPS/MHz with everything turned on. VRoom already clocks in at a mighty 6.5 DMIPS/MHz, with more performance gains on the way. It peaks at 8 instructions every clock cycle, with a dual register file and a clever committing system to keep up.

VRoom is written in SystemVerilog to leverage Verilator (a handy linting and simulation framework), and while there is some C that generates different files, we’d wager it is pretty run-of-the-mill compared to a TypeScript-based project. VRoom currently boots Linux thanks to an AWS-FPGA instance (a Xilinx VU9P Ultrascale), though it has to be trimmed to fit. [Paul] has big plans, working his way up to a server-class chip with lots of cores and a huge cache.

It’s all on GitHub under a GPLv3 license; go check it out! [Paul] also has a talk with lots of great details. If you’re interested in getting into RISC-V but a server-class design isn’t your speed, we heard Espressif is starting to use RISC-V cores in their ever-popular ESP series.

Tilting At Windmills Nine Bits At A Time

In the old days — we are talking like the 1960s and 1970s — computers were often built for very specific purposes using either discrete logic or “bit slice” chips. Either way, more bits meant more money, so these computers were frequently made with just enough bits to meet the required precision. We don’t think that was what was on [Mad Ned’s] mind, though, when he decided to implement a 9-bit CPU called QIXOTE-1 on an FPGA.

Like many hobby projects, this one started with an FPGA board in search of a problem. At first, [Ned] planned to create a custom computer along with a custom language, then use them to produce a video game. A quick search of the Internet showed that to be a common enough project, with one guy we’ve talked about here on Hackaday before knocking it out of the park.

[Ned] then thought about just doing a no-software video game. Too late to be the first to do that. Not to be deterred, he decided to duplicate the PDP-8. Whoops. That’s been done before, too. Wanting something original, he finally decided on a custom CPU. Since bytes are usually — if not technically — 8 bits, this CPU calls its 9-bit words nonads and uses octal, which maps nicely to three digits per nonad, as the quick example below shows.
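
Here’s the arithmetic in miniature (a toy C snippet of ours, nothing to do with [Ned]’s actual tooling): a nonad’s nine bits break cleanly into three 3-bit groups, one octal digit apiece.

```c
#include <stdio.h>

int main(void)
{
    /* A 9-bit nonad splits evenly into three 3-bit groups, so exactly three
       octal digits; an 8-bit byte needs an awkward two-and-two-thirds. */
    unsigned nonad = 0511;                 /* any value up to 0777 fits */
    printf("%03o -> %o %o %o\n", nonad,
           (nonad >> 6) & 7, (nonad >> 3) & 7, nonad & 7);
    return 0;
}
```

Run it and 0511 comes back as the digits 5, 1, 1.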

This first post talks about the story behind the CPU and gives a short overview of its capabilities, but we are waiting for future posts to show more of what’s behind the curtain in what [Ned] calls “Holy Nonads, Part 010.”

The downside to doing a custom CPU is that you have to build your own tools. You can always, of course, duplicate an existing design and steal its toolchain. Or go universal.

Three-Dimensional Design Yields Compact Seven-Segment Hex Displays

Computers, from the simplest to the most complex, aren’t very useful if they can’t provide feedback to a user. Whether that interface takes the form of a monitor, a speaker, or a simple LED, there’s almost always some kind of output. One of the most ubiquitous is the ever-present seven-segment display. They’re small, they’re easy to use, and, perhaps most important, they’re cheap.

While the displays themselves are relatively compact, they often require some sort of driver circuitry — something that translates a digit into voltages on the correct pins. These drivers can take up valuable space, especially on a breadboard, and can sometimes make using seven-segment displays cumbersome. Thankfully, [John Lonergan] has a great solution: driver boards that sit completely beneath the displays. His dual seven-segment hex display project was born out of necessity — he needed it for the breadboard CPU SPAM-1, which was getting a bit too bulky. Each module is two seven-segment displays atop a small PCB. Beneath the displays lives an 8-bit PIC microcontroller, which acts as a driver for both of them.
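
Translating a hex digit to segments really is as simple as a lookup table. The C sketch below is a generic example using the common gfedcba bit order; it illustrates the idea and is not [John]’s actual PIC firmware.

```c
#include <stdint.h>
#include <stdio.h>

/* Segment patterns for hex digits 0-F, one bit per segment in gfedcba
   order (bit 0 = a ... bit 6 = g). A small micro can map a 4-bit input
   nibble straight to these patterns and drive the display pins. */
static const uint8_t seg7[16] = {
    0x3F, 0x06, 0x5B, 0x4F, 0x66, 0x6D, 0x7D, 0x07,   /* 0-7        */
    0x7F, 0x6F, 0x77, 0x7C, 0x39, 0x5E, 0x79, 0x71,   /* 8-9, A-F   */
};

int main(void)
{
    uint8_t value = 0xA5;                /* show one byte on two digits */
    printf("high digit pattern: 0x%02X\n", seg7[value >> 4]);
    printf("low  digit pattern: 0x%02X\n", seg7[value & 0x0F]);
    return 0;
}
```

On real hardware the patterns would go out to the segment pins instead of printf, with the micro presumably latching or multiplexing the two digits.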

It’s so easy to restrict ourselves to thinking in two dimensions when working on electronic design — even designing multilayer PCBs often feels like working on several, distinct two-dimensional areas rather than one three-dimensional one. The concept of stacking components to save space, while fairly straightforward to implement, is a great example of the kind of problem-solving we love to see here at Hackaday. Of course, if you like the idea of 3D circuit design, you have to check out some of these incredible circuit sculptures we’ve featured in the past.

Continue reading “Three-Dimensional Design Yields Compact Seven-Segment Hex Displays”

Vintage Computers With A Real Turbo

In prior centuries, it was common practice to tie the operation of a program to a computer’s clock speed. As computers got faster and faster, the programs tied to that slower clock speed sometimes had trouble running. To patch the issue temporarily, some computers in the early 90s included a “TURBO” button that actually slowed the computer’s clock down in order to help older software run without breaking in often unpredictable ways. [Ted Fried] decided to turn this idea on its head, though, by essentially building a TURBO button into the hardware of old computers, one that greatly increases their execution speed without causing software mayhem.

To accomplish this, he is running CPU emulators on Teensys (Teensies?) configured as drop-in replacements for the physical CPUs of several retro computers, such as the Apple II, VIC-20, and Commodore 64, rather than as emulators for an entire system. Each one can run in cycle-accurate mode, making it essentially identical to the computer’s original hardware, or it can be placed into an accelerated mode that takes advantage of the Teensy 4.1’s 800 MHz processor, which is orders of magnitude faster than the original hardware. This allows (most of) the original hardware to still be used while running programs at wildly faster speeds, without needing to worry about any programming hiccups due to the increased clock speed.
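
As a rough sketch of how such a dual-mode core might decide when to slow down (our own illustration in C, not [Ted]’s firmware), the emulated CPU only drops back to the vintage bus speed when it has to:

```c
#include <stdbool.h>
#include <stdio.h>

/* Illustrative only: an emulated CPU core that runs flat-out on the fast
   host processor, falling back to the vintage board's bus clock only when
   it must, or when full cycle accuracy is requested. */

static bool cycle_accurate = false;   /* false = accelerated ("turbo") mode */

/* Stand-in for waiting on the host machine's ~1 MHz clock edge; on real
   hardware this would poll a pin wired to the bus clock. */
static void wait_for_bus_clock(void)
{
    /* ... busy-wait on the clock pin here ... */
}

static void run_cycle(bool touches_bus)
{
    if (cycle_accurate || touches_bus)
        wait_for_bus_clock();        /* stay in lockstep with the old board */
    /* otherwise this internal-only cycle runs at full host speed */
}

int main(void)
{
    /* A pretend instruction: two internal cycles, then one bus access. */
    run_cycle(false);
    run_cycle(false);
    run_cycle(true);
    printf("mode: %s\n", cycle_accurate ? "cycle accurate" : "accelerated");
    return 0;
}
```

The key point is that video, I/O, and memory accesses still happen at the speed the rest of the machine expects, which is why most software keeps working.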

The video below demonstrates [Ted]’s creation running in an Apple II, but he has several other cores for other retro computers. It’s certainly a unique way to squeeze more computing power out of these antique machines. Some Apple II computers had a 4 MHz clock, which seems incredibly slow by modern standards, so the 800 MHz Teensy would have been considered wizardry at the time; believe it or not, though, it’s actually necessary to go the other direction for some applications and slow this computer down to a 1 MHz crawl.

Continue reading “Vintage Computers With A Real Turbo”