Teach Yourself Verilog With This Tiny CPU Design

You probably couldn’t write a decent novel if you’d never read a novel. Learning to do something often involves studying what other people did before you. One problem with trying to learn new technology is finding something simple enough to start your studies.

[InfiniteNOP] wanted to get his feet wet writing CPUs and developed a simple 8-bit architecture that would be a good start for a classroom or self-study. It is a work in progress, so there may be a few bugs in it still to squash, but squashing bugs might be educational too. You can read the documentation in the HACKING file for details on the architecture. Briefly, the instruction’s top four bits encode the operation, while the last four bits select the register operands (there are four registers).

[InfiniteNOP] used the Xilinx tools to simulate and synthesize the CPU, but we thought it might be a good excuse to play with EDAPlayground. You can find a testbench that works with EDAPlayground, although you’ll probably want to update the CPU files to match the latest version.

Continue reading “Teach Yourself Verilog With This Tiny CPU Design”

Designing A CPU In VHDL For FPGAs: OMG.

If you’ve been thinking about playing around with FPGAs and/or are interested in CPU design, [Domipheus] has started a blog post series that you should check out. Normally we’d wait until the whole series is done to post about it, but it’s looking so good, that we thought we’d share it with you while it’s still in progress. So far, there are five parts.

minispartan6In Part One, [Domipheus] goes through his rationale and plans for the CPU. If you’re at all interested in following along, this post is a must-read. The summary, though, is that he’s aiming to make a stripped-down 16-bit processor on a Spartan 6+ FPGA with basic arithmetic and control flow, and write an assembler for it.

In Part Two, [Domipheus] goes over the nitty-gritty of getting VHDL code rendered and uploaded to the FPGA, and as an example builds up the CPU’s eight registers. If you’re new to FPGAs, pay special attention to the test bench code at the end of the post. Xilinx’s ISE package makes building a test suite for your FPGA code pretty easy, and given the eventual complexity of the system, it’s a great idea to have tests set up for each stage. Testing will be a recurring theme throughout the rest of the posts.

In Part Three, [Domipheus] works through his choices for the instruction set and starts writes up the instruction set decoder. In Part Four, we get to see an ALU and the jump commands are implemented. Part Five builds up a bare-bones control unit and connects the decoder, ALU, and registers together to do some math and count up.

pipe

We can’t wait for further installments. If you’re interested in this sort of thing, and are following [Domipheus]’s progress, be sure to let him know: we gotta keep him working.

Of course, this isn’t the first time anyone’s built a soft-CPU in an FPGA. (The OMG was added mostly to go along with the other TLAs.) Here’s a tiny one, a big one, and a bizarre one.

Discrete Transistor Computer Is Not Discreet

Every few years, we hear about someone building a computer from first principles. This doesn’t mean getting a 6502 or Z80, wiring it up, and running BASIC. I’m talking about builds from the ground up, starting with logic chips or even just transistors.

[James Newman]’s 16-bit CPU built from transistors is something he’s been working on for a little under a year now, and it’s shaping up to be one of the most impressive computer builds since the days of Cray and Control Data Corporation.

The 10,000 foot view of this computer is a machine with a 16-bit data bus, a 16-bit address bus, all built out of individual circuit boards containing single OR, AND, XOR gates, decoders, multiplexers, and registers.  These modules are laid out on 2×1.5 meter frames, each of them containing a schematic of the computer printed out with a plotter. The individual circuit modules sit right on top of this schematic, and if you have enough time on your hands, you can trace out every signal in this computer.

The architecture of the computer is more or less the same as any 16-bit processor. Three are four general purpose registers, a 16 bit program counter, a stack pointer, and a status register. [James] already has an assembler and simulator, and the instruction set is more or less what you would expect from a basic microprocessor, although this thing does have division and multiplication instructions.

The first three ‘frames’ of this computer, containing the general purpose registers, the state and status registers, and the ALU, are already complete. Those circuits are mounted on towering frames made of aluminum extrusion. [James] already has 32 bytes of memory wired up, with each individual bit having its own LED. This RAM display will be used for the Game of Life simulation once everything is working.

While this build may seem utterly impractical, it’s not too different from a few notable and historical computers. The fastest computer in the world from 1964 to ’69 was built from individual transistors, and had even wider busses and more registers. The CDC6600 was capable of running at around 10MHz, many times faster than the estimated maximum speed of [James]’ computer – 25kHz. Still, building a computer on this scale is an amazing accomplishment, and something we can’t wait to see running the Game of Life.

Thanks [aleksclark], [Michael], and [wulfman] for sending this in.

A block diagram of the Oldland CPU core

The Oldland CPU 32-bit FPGA Core

Field Programmable Gate Arrays (FPGAs) let you program any logic you’d like onto a chip. You write your logic using a hardware description language, then flash it to the FPGA. You can even design your own processor and flash it to the chip.

That’s exactly what [jamieiles] has done with the Oldland CPU. It’s an open source 32 bit CPU core that you can synthesize for use on an FPGA. Not only can you browse through all the Verilog code in the Github repo, but there’s also a bunch of tools for working with this CPU core.

Included with the package is oldland-rtlsim, which lets you simulate the processor on a PC. The oldland-debug tool lets you connect to the processor for programming and debugging over JTAG. Finally, there’s a GNU toolchain port that lets you build C code for the device.

Going one step futher, [jamieiles] built a full SoC around the Oldland core. This has SPI, UART, timers, and more features you’d expect to find in a microcontroller. It can be flashed to the relatively cheap Terasic DE0-Nano board.

[jamieiles] has also ported u-boot to the processor, and the next thing on the list is the Linux kernel. If you’ve ever been interested in how CPUs actually work, this is a neat project to look through. If you want more open source CPU cores, check out OpenCores.

Resurrecting Capcom’s Kabuki

About a dozen old Capcom arcade titles were designed to run on a custom CPU. It was called the Kabuki, and although most of the core was a standard Z80, a significant portion of the die was dedicated to security. The problem back then was arcade board clones, and when the power was removed from a Kabuki CPU, the memory contents of this security setup were lost, the game wouldn’t play, and 20 years later, people writing emulators were tearing their hair out.

Now that these games are decades old, the on-chip security for the Kabuki CPU is a problem for those who have taken up the task of preserving these old games. However, now these CPUs can be decuicided, programming the chip and placing them in an arcade board without losing their memory contents.

Earlier we saw [ArcadeHacker] a.k.a. [Eduardo]’s efforts to resurrect these old CPUs. He was able to run new code on the Kabuki, but to run the original, unmodified ROMs that came in these arcade games required hardware. Now [ArcadeHacker] has it.

The setup consists of a chip clip that clamps over the Kabuki CPU. With a little bit of Arduino code, the security keys for original, unmodified ROMs can be flashed, put into the arcade board (where the contents of the memory are backed up by a battery), and the clip released. [ArcadeHacker] figures this is how each arcade board was programmed in the factory.

If you’re looking for an in-depth technical description of how to program a Kabuki, [ArcadeHacker] has an incredibly detailed PDF right here.

Continue reading “Resurrecting Capcom’s Kabuki”

Mario Hack

Reprogramming Super Mario World From Inside The Game

[SethBling] recently set a world record speed run of the classic Super Nintendo game Super Mario World on the original SNES hardware. He managed to beat the game in five minutes and 59.6 seconds. How is this possible? He actually reprogrammed the game by moving specific objects to very specific places and then executing a glitch. This method of beating the game was originally discovered by Twitch user [Jeffw356] but it was performed on an emulator. [SethBling] was able to prove that this “credits warp” glitch works on the original hardware.

If you watch the video below, you’ll see [SethBling] visit one of the first available levels in the game. He then proceeds to move certain objects in the game to very specific places. What he’s doing here is manipulating the game’s X coordinate table for the sprites. By moving objects to specific places, he’s manipulating a section of the game’s memory to hold specific values and a specific order. It’s a meticulous process that likely took a lot of practice to get right.

Once the table was setup properly, [SethBling] needed a way to get the SNES to execute the X table as CPU instructions. In Super Mario World, there are special items that Mario can obtain that act as a power up. For example, the mushroom will make him grow in size. Each sprite in the game has a flag to tell the SNES that the item is able to act as a power up. Mario can either collect the power up by himself, or he can use his friendly dinosaur Yoshi to eat the power up, which will also apply the item’s effects to Mario.

The next part of the speed run involves something called the item swap glitch. In the game, Mario can collect coins himself, or Yoshi can also collect them by eating them. A glitch exists where Yoshi can start eating a coin, but Mario jumps off of Yoshi and collects the coin himself simultaneously. The result is that the game knows there is something inside of Yoshi’s mouth but it doesn’t know what. So he ends up holding an empty sprite with no properties. The game just knows that it’s whatever sprite is in sprite slot X.

Now comes the actual item swap. There is an enemy in the game called Chargin’ Chuck. This sprite happens to have the flag set as though it’s a power up. Normally this doesn’t matter because it also has a set flag to tell the game that it cannot be eaten by Yoshi. Also, Chuck is an enemy so it actually hurts Mario rather than act as a power up. So under normal circumstances, this sprite will never actually act as a power up. The developers never programmed the game to properly handle this scenario, because it was supposed to be impossible.

If the coin glitch is performed in a specific location within the level, a Chargin’ Chuck will spawn just after the coin is collected. When the Chuck spawns, it will take that empty sprite slot and suddenly the game believes that Yoshi is holding the Chuck in his mouth. This triggers the power up condition, which as we already know was never programmed into the game. The code ends up jumping to an area of memory that doesn’t contain normal game instructions.

The result of all of this manipulation and glitching is that all of the values in the sprite X coordinate table are executed as CPU instructions. [SethBling] setup this table to hold values that tell the game to jump to the end credits. The console executes them and does as commanded, and the game is over just a few minutes after it began. The video below shows the speed run but doesn’t get too far into the technical details, but you can read more about it here.

This isn’t the first time we’ve seen this type of hack. Speed runs have been performed on Pokemon with very similar techniques. Another hacker managed to program and execute a version of single player pong all from within Pokemon Blue. We can’t wait to see what these game hackers come up with next. Continue reading “Reprogramming Super Mario World From Inside The Game”

Reverse Engineering Capcom’s Crypto CPU

There are a few old Capcom arcade titles – Pang, Cadillacs and Dinosaurs, and Block Block – that are unlike anything else ever seen in the world of coin-ops. They’re old, yes, but what makes these titles exceptional is the CPU they run on. The brains in the hardware of these games is a Kabuki, a Z80 CPU that had a few extra security features. why would Capcom produce such a thing? To combat bootleggers that would copy and reproduce arcade games without royalties going to the original publisher. It’s an interesting part of arcade history, but also a problem for curators: this security has killed a number of arcade machines, leading [Eduardo] to reverse engineering and document the Kabuki in full detail.

While the normal Z80 CPU had a pin specifically dedicated to refreshing DRAM, the Kabuki repurposed this pin for the security functions on the chip. With this pin low, the Kabuki was a standard Z80. When the pin was pulled high, it served as a power supply input for the security features. The security – just a few bits saved in memory – was battery backed, and once this battery was disconnected, the chip would fail, killing the game.

Plugging Kabuki into an old Amstrad CPC 6128 without the security pin pulled high allowed [Eduardo] to test all the Z80 instructions, and with that no surprises were found; the Kabuki is fully compatible with every other Z80 on the planet. Determining how Kabuki works with that special security pin pulled high is a more difficult task, but the Mame team has it nailed down.

The security system inside Kabuki works through a series of bitswaps, circular shifts, XORs, each translation different if the byte is an opcode or data. The process of encoding and decoding the security in Kabuki is well understood, but [Eduardo] had a few unanswered questions. What happens after Kabuki lost power and the memory contents – especially the bitswap, address, and XOR keys – vanished? How was the Kabuki programmed in the factory? Is it possible to reprogram these security keys, allowing one Kabuki to play games it wasn’t manufactured for?

[Eduardo] figured being able to encrypt new, valid code was the first step to running code encrypted with different keys. To test this theory, he wrote a simple ‘Hello World’ for the Capcom hardware that worked perfectly under Mame. While the demo worked perfectly under Mame, it didn’t work when burned onto a EPROM and put into real Capcom hardware.

That’s where this story ends, at least for the time being. The new, encrypted code is valid, Mame runs the encrypted code, but until [Eduardo] or someone else can figure out any additional configuration settings inside the Kabuki, this project is dead in the track. [Eduardo] will be back some time next week tearing the Kabuki apart again, trying to unravel the mysteries of what makes this processor work.