Video gamers know about cheat codes, but assembly language programmers are often in search of undocumented instructions. One way to find them is to map out all of a CPU’s opcodes, try the values where there are holes, and see what happens. Not good enough for [Ken Shirriff]. He prefers examining the CPU’s microcode and deducing what each part of it does.
Microcode is a feature of many modern CPUs. The CPU runs several “microcode” instructions to process a single opcode. For the Intel 8086, there are 512 micro-instructions, each with 21 bits. Each micro-instruction has two parts: a part that moves a source to a destination and another that performs some other operation, such as an ALU operation. [Ken] explains it all in the post, including several hidden registers you can’t see, but the microcode can.
Some of the undocumented instructions are probably not useful. They are either impractical or duplicate a function you can already do another way. Not all of the instructions are there for technical reasons. For example, opcode D6, commonly known as SALC for “Set AL to Carry”, seems to exist only as a trap for anyone making a carbon copy of Intel’s microcode. When other companies like NEC made 8086 clones, finding the same undocumented instruction in a clone would strongly suggest they just copied Intel’s intellectual property (in NEC’s case, they didn’t).
Other cases happen where an instruction just doesn’t make sense. For example, you can pop all segment registers, and though it is not documented, you can deduce that POP CS should be opcode 0F. The problem is there is no sane reason to pop CS off the stack. The instruction works; it just isn’t useful. The opcodes from 60-6F are conditional jumps that are no different from the instructions at 70-7F because of decoding. There is no reason to document both identical instruction ranges.
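To make the encoding argument concrete, here is a minimal Python sketch. The POP opcodes follow the 8086’s 000ss111 bit pattern for segment registers, which is how you land on 0F for POP CS. The function names are made up for illustration, and treating the 60-6F range as 70-7F with one bit ignored is an assumed model of the decoder, not something lifted from the microcode.

```python
# Two-bit segment register field used by the 8086 instruction encoding.
SEGMENT_REGISTERS = {"ES": 0b00, "CS": 0b01, "SS": 0b10, "DS": 0b11}


def pop_segment_opcode(name: str) -> int:
    """Return the one-byte POP opcode for a segment register (pattern 000ss111)."""
    return (SEGMENT_REGISTERS[name] << 3) | 0x07


def canonical_jump_opcode(opcode: int) -> int:
    """Map an undocumented 0x60-0x6F opcode to its documented 0x70-0x7F twin.

    Assumption: the alias is modeled as the decoder ignoring bit 4, so each
    0x6n opcode behaves like 0x7n. Opcodes outside 0x60-0x6F pass through.
    """
    if 0x60 <= opcode <= 0x6F:
        return opcode | 0x10
    return opcode


if __name__ == "__main__":
    for reg in ("ES", "CS", "SS", "DS"):
        print(f"POP {reg}: {pop_segment_opcode(reg):02X}")
    print(f"60 behaves like {canonical_jump_opcode(0x60):02X}")
```

Filling in the one missing slot of a regular bit pattern is exactly the kind of deduction that points at POP CS, even though the documentation leaves it out.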
The plot thickens when you go to two-byte instructions. You’ll find plenty of instructions of dubious value. You don’t hear much about undocumented instructions anymore. Why? Because modern CPUs have enough circuitry to dedicate some to detecting illegal instructions and raising an exception instead of executing them. But the 8086 was squeezed too tight to allow for such a luxury. Good thing for people like us who enjoy solving puzzles.
Though it is largely forgotten today, the Intel 80286 was for a while in the 1980s the processor of choice and designated successor to the 8086 in the world of PCs. It brought a new mode that could address up to 16 MB of memory, and a welcome speed boost over machines using an 8086 or 8088. As with many microprocessors, it has a few undocumented features, and it’s a couple of these that [rep lodsb] takes a look at. Along the way we learn a bit about the 286, and about why Intel had some of these undocumented instructions in the first place.
If you used a 286 it was probably as an end-user sitting in front of a PC-AT or clone. During manufacture and testing though, the processor had need of some extra functions, both for testing the chip itself and for debugging designs using it. It’s in these fields that the undocumented instructions sit, and they relate to an in-circuit emulator, a 286 with a debug port on some of its unused pins, which would have sat on a plug-in daughterboard for systems under test. The 286 was famous for its fancy protected mode taking rather a long time to switch to, and these instructions relate to loading and saving states before and after the switch.
The 286’s time as the new hotness was soon blasted away by the 386 with its support for virtual memory, so for most of us it remains as simply a faster way that we ran 8086 code for a few years. 286 machines still appear here from time to time, even being connected to the internet.
For an old CPU, finding all the valid instructions wasn’t very hard. You simply tried them all. Sure, really old CPUs might make it hard to tell what the instruction did, but once CPUs got illegal instruction traps, you could quickly just scan possible opcodes and see what didn’t throw an exception. Modern processors, though, are quite another thing. For example, you might run a random instruction that locks up the machine, or miss an instruction that would have been valid because the CPU is in the wrong mode. [Can Bölük] has a novel solution: By speculatively executing the target instruction and then monitoring the microcode sequencer, he can determine if the CPU is decoding an instruction even if it refuses to execute it.
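The old-school “try them all” scan is easy to sketch in Python. Everything here is hypothetical: `ToyCPU` and `IllegalInstruction` stand in for real hardware and its illegal instruction trap, and the opcode values are arbitrary. This shows only the brute-force idea, not [Can Bölük]’s speculative-execution technique.

```python
class IllegalInstruction(Exception):
    """Raised by the toy CPU when an opcode is not implemented."""


class ToyCPU:
    """Hypothetical stand-in for real hardware with an illegal instruction trap."""

    def __init__(self, implemented):
        self.implemented = frozenset(implemented)

    def execute(self, opcode: int) -> None:
        # A real scan would run the opcode on hardware and catch the trap.
        if opcode not in self.implemented:
            raise IllegalInstruction(f"{opcode:#04x}")


def find_undocumented(cpu, documented, opcode_space=range(256)):
    """Return opcodes that execute without trapping but are not documented."""
    found = []
    for opcode in opcode_space:
        try:
            cpu.execute(opcode)
        except IllegalInstruction:
            continue  # the trap fired, so this opcode is genuinely invalid
        if opcode not in documented:
            found.append(opcode)
    return found


if __name__ == "__main__":
    cpu = ToyCPU(implemented={0x00, 0x01, 0xD6})
    hits = find_undocumented(cpu, documented={0x00, 0x01})
    print([f"{op:02X}" for op in hits])
```

The sketch also makes the weak spots obvious: it assumes every opcode either traps cleanly or returns, which is exactly what breaks down when an instruction locks up the machine or is only valid in another mode.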
Some unknown instructions may have power for good or evil, such as the recently announced undocumented instructions that can apparently rewrite the microcode. We expect to see a post soon on how to reprogram your Intel processor to run as a 6502 natively.
There was a time when owning a computer meant you probably knew most or all of the instructions it could execute. Your modern PC, though, has a lot of instructions, many of them meant for specialized operating system, encryption, or digital signal processing features.
There are known undocumented instructions in a lot of x86-class CPUs, too. What’s more, these days your x86 CPU might really be a virtual machine running on a different processor, or your CPU could have a defect or a bug. Maybe you want to run sandsifter, a program that searches for erroneous or undocumented instructions. Who knows what is lurking in your CPU?