The 80386 was — arguably — Intel’s first modern CPU. The 8086 was commercially successful, but the paged memory model was stifling. The 80286 also had a protected mode, which differed from the 386’s. [Ken Shirriff] takes the 386 apart for us in a recent blog post.
The 286’s protected mode was less successful than the 386 because of several key limitations as it was a 16-bit processor with a 24-bit address bus. It still required segment changes to access larger amounts of memory, and it had no good way to call back into real mode for compatibility reasons. The 386 fixed all that. You could adopt a segment strategy if you wanted to. But you could also load the segment registers once to point to a 4 GB linear address space and then essentially forget them. You also had a virtual 86 mode that could simulate real mode with some work.
The CPU used a 1-micron process, compared to the 1.5-micron process used earlier. The chip had 285,000 transistors (although the 80386SL had many more). That was ten times the number of devices on the 8086. The cheaper 386SX did use the 1.5 micron process for a while, but with a 16-bit external bus, this was feasible. While 285,000 sounds like a lot, a Core i9 has around 4.2 billion transistors. Times have changed.
A smaller design also allowed chips like the 386SL for laptops. The CPU took up only about a fourth of the die. The rest held bus controllers and cache interfaces to cut costs on laptops. That’s why it had so many more transistors.
[Ken] does his usual in-depth analysis of both the die and the history behind this historic device. We spent a lot of time writing protected mode 386 code, and it was nice to see the details of a very old friend. These days, you can get a pretty capable CPU system on a solderless breadboard, but designing a working 386 system took a few extra parts. The 80286 was a stepping stone between the 8086 and 80386, but even it had some secrets to give up.
[Ken Shirriff] is looking inside chips again. This time, the subject is the MK4116 — a 16 Kbit DRAM chip. Even without a calculator, you know that’s a whopping 2 Kbytes, and while that doesn’t sound impressive, in the late 1970s, it was a modern miracle.
The chip showed up in computers ranging from the TRS-80 to the Xerox Alto and was even a mainstay of arcade video games. While [Ken] thought it would be a pretty predictable teardown, he found several surprises.
Static RAM chips use flip flops and retain their state as long as power is on. That’s convenient, but each flip flop takes multiple transistors, so there is a limit to how many bits you can put on a particular size chip. Dynamic RAM increases that limit because it is nothing more than a capacitor and a single transistor. This increases memory density, but the problem is that the capacitor doesn’t hold charge indefinitely. The computer or an associated circuit had to refresh the memory periodically to maintain the contents.
One of the key innovations for this chip was the use of multiplexed address lines so it could use a smaller package. Inside, two banks of capacitors store the bits, and, usually, a computer would use eight chips to store a byte. Of course, each memory bit is made to be as compact as possible. This chip is also made to be very low power when idle. The secret is that it doesn’t use load transistors but instead uses an active pull-up tied to the system clock. Another interesting feature is the sense amplifier, which has to measure the tiny noisy voltage from the capacitors.
You’ll see all this and more in [Ken’s] write-up. Chips from that era were relatively easy to take apart compared to today’s devices. Want to know how it’s done? [Ken] can tell you. He is well-known for doing a lot of cool stuff, with ICs and even old mainframe and space hardware.
Video gamers know about cheat codes, but assembly language programmers are often in search of undocumented instructions. One way to find them is to map out all of a CPU’s opcodes and where there are holes, try those values, and see what happens. Not good enough for [Ken Shirriff]. He prefers examining the CPU’s microcode and deducing what each part of it does.
Microcode is a feature of many modern CPUs. The CPU runs several “microcode” instructions to process a single opcode. For the Intel 8086, there are 512 micro instructions, each with 21 bits. Each instruction has two parts: a part that moves a source to a destination and another that performs some other operation, such as an ALU operation. [Ken] explains it all in the post, including several hidden registers you can’t see, but the microcode can.
Some of the undocumented instructions are probably not useful. They are either impractical or duplicate a function you can already do another way. Not all of the instructions are there for technical reasons. For example, opcode D6, commonly known as SALC for “Set AL to Carry”, seems to exist only as a trap for anyone making a carbon copy of Intel’s microcode. When other companies like NEC made 8086 clones, having an undocumented instruction would strongly suggest they just copied Intel’s intellectual property (in NECs case, they didn’t).
Other cases happen where an instruction just doesn’t make sense. For example, you can pop all segment registers, and though it is not documented, you can deduce that POP CS should be opcode 0F. The problem is there is no sane reason to pop CS off the stack. The instruction works; it just isn’t useful. The opcodes from 60-6F are conditional jumps that are no different from the instructions at 70-7F because of decoding. There is no reason to document both identical instruction ranges.
The plot thickens when you go to two-byte instructions. You’ll find plenty of instructions of dubious value. You don’t hear much about undocumented instructions anymore. Why? Because modern CPUs have enough circuitry to dedicate some to detecting illegal instructions and halting the CPU. But the 8086 was squeezed too tight to allow for such a luxury. Good thing for people like us who enjoy solving puzzles.
One of the interesting features of the 8086 back in 1978 was the provision for “string” instructions. These took the form of prefixes that would repeat the next instruction a certain number of times. The next instruction was meant to be one of a few string instructions that operated on memory regions and updated pointers to the memory region with each repeated operation. [Ken Shirriff] examines the 8086 die up close and personal to explain how the 8086 microcode pulled this off and it is a great read, as usual.
In general, the string instructions wanted memory pointers in the SI and DI registers and a count in CX. The flags also have a direction bit that determines if the SI and DI registers will increase or decrease on each execution. The repeat prefix could also have conditions on it. In other words, a REP prefix will execute the following string instruction until CX is zero. The REPZ and REPNZ prefixes would do the same but also stop early if the zero flag was set (REPZ) or not set (REPNZ) after each operation. The instructions can work on 8-bit data or 16-bit data and oddly, as [Ken] points out — the microcode is the same either way.
[Ken] does a great job of explaining it all, so we won’t try to repeat it here. But it is more complicated than you’d initially expect. Partially this is because the instruction can be interrupted after any operation. Also, changing the SI and DI registers not only have to account for increment or decrement, but also needs to understand the byte or word size in play. Worse still, an unaligned word had to be broken up into two different accesses. A lot of logic to put in a relatively small amount of silicon.
Even if you never design a microcoded CPU, the discussion is fascinating, and the microphotography is fun to look at, too. We always enjoy [Ken’s] posts on little CPUs and big computers.
Few CPUs have had the long-lasting influence that the 8086 did. It is hard to believe that when your modern desktop computer boots, it probably thinks it is an 8086 from 1978 until some software gooses it into a more modern state. When [Ken] was examining an 8086 die, however, he noticed that part of the die didn’t look like the rest. Turns out, Intel had a bug in the original version of the 8086. In those days you couldn’t patch the microcode. It was more like a PC board — you had to change the layout and make a new one to fix it.
The affected area is the Group Decode ROM. The area is responsible for categorizing instructions based on the type of decoding they require. While it is marked as a ROM, it is more of a programmable logic array. The bug was pretty intense. If an interrupt followed either a MOV SS or POP SS instruction, havoc ensues.
Texas Instruments isn’t the name you usually hear associated with the first microprocessor. But the TI TMX 1795 was an 8008 chip produced months before the 8008. It was never available commercially, though, so it has been largely forgotten by most people. But not [Ken Shirriff]. You can see a demo from 2015 of the device in the video below, too.
The reason the chips have the same architecture is they were built to replace the same large circuit board inside a Datapoint 2200 programmable terminal. These were big beasts that could be programmed in BASIC or PL/B.
Datapoint asked Intel to shrink the board to a chip due to heating problems — but after delays, they instead replaced the power supply and lost interest in the device. TI heard about the affair and wanted in on the deal. However, Datapoint was unimpressed. The chip didn’t tolerate voltage fluctuations very well, since they had replaced the power supply and had a new CPU design that was faster than the chip would be. They were also unimpressed with how much stuff you had to add to get a complete system.
So why did the Intel 8008 work out in the marketplace but the TI chip didn’t? After all, Datapoint decided not to use the 8008, also. But as [Ken] points out, the 8008 was much smaller than the TI chip and, thus, was more cost-effective to produce.
As usual, [Ken]’s posts are always interesting and enlightening. He’s looked at a lot of old computers. He’s even dug into old space hardware. Great stuff!
On the chip is a mixture of analog and digital circuitry, with some elements of a more traditional game chip alongside a ROM, a PLA, and a serial CPU core. The PLA stores pixel data while the ROM stores the CPU code, and the CPU serves to perform calculations necessary to the games themselves. He hasn’t fully reverse-engineered either, but the two areas of the chip are mask-programmed to produce the different games with which the chip could be found.
So the answer to the original question is that there is a CPU on board, but it’s not a 6502 and the operation is a hybrid between dedicated game chip and CPU-controlled chip. What we find interesting is that this serial CPU core might have as we mused in the previous piece made the heart of a usable 1970s microcontroller, was this a missed opportunity on the part of MOS? We’ll never know, but at least another piece of early video game history has been uncovered.