A quick look at the pinouts of the Intel 8086 and 8088 processors reveals a 20 bit address bus. There was high demand for the ability to address 1 meg (2^20 bytes) of address space, and Intel delivered. However, a curious individual would wonder how Intel could achieve such a feat with only 16 bit registers. Intel solved this riddle by combining two registers, which also let them stay compatible with code written for the 8008, 8080 & 8085. The process they use can be a bit confusing when trying to figure out where to locate your code in the ROM. In this article, we are going to go over the basics of how the Physical Address is calculated and how to locate your code correctly in ROM.
In a monumental effort to confuse young budding computer scientists in the late '70s, Intel had the processor view its 1 meg of address space through four 64k chunks at a time, with each chunk located by a 16 bit Segment Register (CS, DS, SS and ES). The value in a Segment Register, called the Segment Address, can be thought of as the base address (offset 0000h) of one of the 64k chunks. The address within the 64k chunk is given by an Offset Address. The combination of the Segment Address with the Offset Address is called the Logical Address, and can be transformed to form the elusive Physical Address. In a normal instruction fetch, the Segment Address is stored in the Code Segment (CS) register and the Offset Address is taken from the Instruction Pointer (IP). So the Logical Address will be CS:IP, or for example FFF0h:C000h.
The Physical Address is formed by multiplying the Segment Address by 16, and then adding the Offset Address to it. Multiplying the Segment Address by 16 turns it into a 20 bit value by appending four zero bits (one hex zero) to the right side. This calculation is done by a dedicated adder within the processor. But you need to know how the addresses in your program are turned into a Physical Address if you want to know where to locate the code in ROM. This will become more clear below.
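To make the arithmetic concrete, here is a minimal Python sketch of the calculation described above. The function name `physical_address` is just an illustration and not anything from Intel's documentation. The mask to 20 bits models the fact that the 8086 and 8088 only have 20 address lines, so a sum that overflows wraps around to the start of memory.

```python
def physical_address(segment: int, offset: int) -> int:
    """Return the 20-bit physical address for an 8086/8088 segment:offset pair."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shifting left by 4 bits is the same as multiplying by 16.
    # Masking to 20 bits models the address wrapping around at 1 meg.
    return ((segment << 4) + offset) & 0xFFFFF


print(f"{physical_address(0x1234, 0x0010):05X}")  # 12350
print(f"{physical_address(0xFFFF, 0x0000):05X}")  # FFFF0
```

Note that many different segment:offset pairs land on the same Physical Address, since segments that differ by 1 are only 16 bytes apart.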
The Reset Vector
Now that we are thoroughly confused about the extremely logical and straight-forward internal workings of the x86 address calculations, we can move on to why this information is useful. When the 8086 processor recovers from a hardware reset, the very first address it puts out is FFFF0. This means PINS A0 – A3 are LOW, and A4 – A19 are HIGH.
FFFF0 is the Physical Address. So the Logical Address would be FFFF:0000, with FFFF coming from the Code Segment (CS) Register and 0000 coming from the Instruction Pointer (IP) Register. These are the states of the registers upon reset.
Now, you might have noticed that FFFF0 is really really really really close to the very top of our address space. Indeed, the end of the 1 meg is only 16 bytes away. So the first instruction has to be a far jump to somewhere lower in memory, loading the Code Segment and Instruction Pointer with the place where your program actually starts. What a brilliant design!
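If you want to double check the pin states and the room left after the reset address, a few lines of Python will do it. This is only a sketch; the helper name `pin_levels` is made up for illustration.

```python
RESET_ADDRESS = 0xFFFF0   # first address driven after a hardware reset
ADDRESS_LINES = 20        # A0 through A19


def pin_levels(address: int, lines: int = ADDRESS_LINES) -> dict:
    """Map each address pin name to 'HIGH' or 'LOW' for the given address."""
    return {f"A{n}": "HIGH" if (address >> n) & 1 else "LOW" for n in range(lines)}


levels = pin_levels(RESET_ADDRESS)
low_pins = [pin for pin, level in levels.items() if level == "LOW"]
print(low_pins)  # ['A0', 'A1', 'A2', 'A3']

# Bytes available from the reset address to the end of the 1 meg space.
room = (1 << ADDRESS_LINES) - RESET_ADDRESS
print(room)  # 16
```

Sixteen bytes is not much, but a direct far jump only takes five of them (one opcode byte plus a 16 bit offset and a 16 bit segment).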
Why Knowing This is Important
Want to roll your own x86 computer from scratch? Consider this schematic (pdf warning) from [Scott’s] 8088 SBC project. Take a look at the processor – he’s only using 16 of the address lines. For the ROM, he’s using a 2764 8k x 8 EPROM, which has 13 address pins. So the question is: where in the ^*#$ do you locate the code in the ROM??? Wait…is it…0000h? Ohhh no, that would be WAY too easy.
First, we have to figure out the reset vector address that will be placed on the EPROM’s address pins. The 8088 will put FFFF0 on its address bus. But from the hardware’s perspective, this address is actually 7FF0.
But wait, there’s more! The 2764 EPROM only has 13 address pins, A0 – A12. This means that when the processor puts FFFF0 onto the address bus, the address seen by the EPROM will actually be 1FF0.
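The truncation above is just a bit mask. Here is a small Python sketch, assuming the address lines that are not wired to a device are simply ignored by it (chip select decoding is not modeled, and `address_seen` is a hypothetical helper name).

```python
def address_seen(physical: int, connected_lines: int) -> int:
    """Keep only the low `connected_lines` bits of a physical address."""
    return physical & ((1 << connected_lines) - 1)


RESET_ADDRESS = 0xFFFF0

# The 2764 has 13 address pins, A0 through A12.
print(f"{address_seen(RESET_ADDRESS, 13):04X}")  # 1FF0

# For comparison, keeping the low 15 bits gives the 7FF0 figure.
print(f"{address_seen(RESET_ADDRESS, 15):04X}")  # 7FF0
```

Whatever the processor puts on the upper lines, the EPROM can only ever see a value between 0000 and 1FFF.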
If you still haven’t had enough, now is where you figure out how to get your code (with the reset instruction) into the correct place on the ROM. In this case, the FAR jmp must be located at 1FF0. This is generally done with what is known as a Locator – a program that strips the .EXE generated by the linker into something that can be loaded into ROM. There are not many of these programs around, and if you’re lucky enough to get your hands on one, please let us know in the comments. I have yet to locate Intel’s TLOC.exe, and Paradigm has ignored my requests for theirs.
Below is a hex dump showing the correct placement of the reset vector for [Scott’s] 8088 SBC. The EA hex instruction is a far jump. Far means outside of the 64k segment.
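Without a Locator, one rough way to get the same layout is to build the ROM image by hand. The Python sketch below is not the tool used for [Scott’s] board; it is a minimal illustration under a few assumptions: the program is placed at offset 0 of the 8k image, the entry point FE00:0000 is a hypothetical choice that happens to land on EPROM offset 0000 once only A0 – A12 are seen, and the board's decoding actually selects the ROM for that address. The function names are made up. The far jump encoding itself (opcode EA, then the 16 bit offset, then the 16 bit segment, both little-endian) is standard 8086.

```python
import struct

ROM_SIZE = 0x2000        # 2764 EPROM: 8k x 8
RESET_OFFSET = 0x1FF0    # where the EPROM sees the reset address FFFF0


def far_jump(segment: int, offset: int) -> bytes:
    """Encode JMP FAR segment:offset (opcode EA, offset then segment, little-endian)."""
    return struct.pack("<BHH", 0xEA, offset, segment)


def build_rom(code: bytes, entry_segment: int, entry_offset: int) -> bytes:
    """Place `code` at the start of the ROM and a far jump at the reset location."""
    if len(code) > RESET_OFFSET:
        raise ValueError("code overlaps the reset vector")
    rom = bytearray(b"\xFF" * ROM_SIZE)   # FF is the erased state of an EPROM
    rom[0:len(code)] = code
    jump = far_jump(entry_segment, entry_offset)
    rom[RESET_OFFSET:RESET_OFFSET + len(jump)] = jump
    return bytes(rom)


# Hypothetical entry point FE00:0000 (physical FE000, EPROM offset 0000).
rom = build_rom(b"\xF4", 0xFE00, 0x0000)   # F4 = HLT as a placeholder program
print(rom[RESET_OFFSET:RESET_OFFSET + 5].hex(" "))  # ea 00 00 00 fe
```

The resulting 8192 byte image can be written to a file and handed to an EPROM programmer; the five bytes at 1FF0 are what the processor fetches first after reset.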
Anyone motivated to make their own x86 SBC now? [Wichit] made this 80C188 SBC, and provides a good starting point. I’ll stick with Arduino.
Note: The screenshots for the binary/hex converter came from here.