Translate Your CP/M Code To 8086, And Leave The 1970s Behind!

“Bring our home computing out of the 1970s and into the 1980s and beyond” is the irresistible promise made by the creator of 8088ify, a piece of software which translates CP/M executables from their 8080-based originals to assembler code that should run on an 8088 under MS-DOS. How can we resist such a futuristic promise, even though the code wasn’t written to the sound of Donna Summer or the Village People back in the day, but here in 2021 for PCjam, a celebration of the original IBM PC’s 40th anniversary?

As the writer of this code, [ibara], points out, Intel intended the 8088 to be a ready upgrade path for the 8080 and designed its instruction set, while not directly compatible, to make translation between the two a straightforward process. There was commercial software for the task at the time, but until now there has been nothing with an open-source licence. 8088ify is written in ANSI C for portability across platforms and compilers, and can even be compiled under CP/M itself.

PCjam is well worth a look, and if any of you fancy a go at writing for the earliest MS-DOS machines we’d like to suggest you create something for it. Meanwhile if you’d like to explore CP/M, you can run a bare metal emulator on the Raspberry Pi.

Header: Thomas Nguyen, CC BY-SA 4.0.

34 thoughts on “Translate Your CP/M Code To 8086, And Leave The 1970s Behind!”

  1. I remember CP/M quite well– not certain if it’s “fondly”, but I certainly used it long enough. Does anybody still use CP/M these days for anything non-hobby related? Are there still pockets of commercial or scientific application? There’s still something romantic about writing your own BDOS… it’s been a very long time since I’ve done it.

      1. He spent an awful lot on a full blown CP/M system, and then started writing the column. He stuck with CP/M because he had software written for it. For actual writing, you don’t need much. But I doubt he stuck with it to the end.

        He did try other systems, I think he even bought some. But once the column took off, he was getting so much stuff for review that I’m surprised he had time to keep writing books.

        Some of his declarations stopped making sense decades ago.

    1. No, the 8088/8086 are the x86 architecture. The 8080 (and 8008) was the prior architecture. They are not binary compatible.
      The CP/M and MSDOS APIs are also very different.

      If you have the original assembly code that was used to generate 8080 machine code, you could use that to assemble 8088 machine code that would run.
      Any CP/M calls in your code wouldn’t do anything however, and certainly have no effect on DOS. You’d need to rewrite all of that in your assembly code first.

    2. The 8086 and 8088 were binary compatible and had the same internal architecture and assembly language.

      The 8080 was not binary compatible and used a significantly different assembly language, primarily due to a significant difference in its register set and a lack of the more advanced addressing modes of the 8086/88.

      (The NEC V20 was pin and code compatible with the 8088, but also had an 8080 emulation mode which allowed running 8080 code directly. For a while I had a PC clone running legacy CP/M programs from the DOS command line – it could run the programs directly or open a CP/M prompt. I was developing embedded Z80 code at the time, and until I wrote my own there wasn’t a cross-assembler for DOS that ran as fast as M80, so I did the CP/M port to tide me over.)

      That being said, the architecture of the 8086/88 was designed to allow 8080 assembly code to be translated relatively easily into 8086 using automated tools. The 8080 had A, B, C, D, E, H and L registers, all of which were 8 bits. In addition, B and C could be paired as the 16-bit BC register, D and E as DE, and H and L as HL.

      The 8086 (and descendants) has the 8-bit registers AH and AL (AX), BH and BL (BX), CH and CL (CX), and DH and DL (DX), which pair up into the 16-bit registers named in brackets. This allowed for easy mapping of registers from 8080 to 8086. The PC register was still 16 bits as in the 8080, but the CS (code segment), DS (data segment), ES (extra segment) and SS (stack segment) registers allowed 16-bit addresses to be mapped anywhere in the 1 MB address space on 16-byte boundaries.
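The register pairing and segment arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the mapping shown is the conventional one from Intel's own 8080-to-8086 conversion tooling (A to AL, BC to CX, DE to DX, HL to BX) and is assumed here rather than confirmed for 8088ify, and `translate_mov` and `physical_address` are hypothetical helper names.

```python
# Conventional 8080 -> 8086 register mapping (an assumption for this
# sketch; not confirmed to be what 8088ify itself uses).
REG_MAP = {
    "A": "AL",
    "B": "CH", "C": "CL",   # BC pair -> CX
    "D": "DH", "E": "DL",   # DE pair -> DX
    "H": "BH", "L": "BL",   # HL pair -> BX
}

def translate_mov(dst, src):
    """Translate a register-to-register 8080 'MOV dst,src' (hypothetical helper)."""
    return f"MOV {REG_MAP[dst]},{REG_MAP[src]}"

def physical_address(segment, offset):
    """Real-mode 8086 address: segment * 16 + offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF

if __name__ == "__main__":
    print(translate_mov("A", "B"))
    print(hex(physical_address(0xB800, 0x0010)))
```

Because a segment register is shifted left by four bits before the offset is added, each segment can start on any 16-byte boundary, and a CP/M-sized 64KB program fits entirely inside one segment without ever touching the segment registers.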

      1. > The PC register was still 16 bits as in the 8080, but the CS (code segment), DS (data segment)
        > ES (extra segment) and SS (stack segment) allowed 16-bit addresses to be mapped anywhere
        > in the 1Mb address space on 16-byte boundaries.

        Yeah, it was a great idea at the time.

        But 2 years later, it was already obsoleted by CPUs with linear address space. Imo, Intel was very, very lucky that IBM chose that CPU for their PC, otherwise it would have died quite quickly (like it should have, in my quite 68000-biased opinion :D).

        Oh well, Intel made up for it with the 80386. That still segmented the memory, but at least the segments could now be up to 4GB instead of 64KB. They turned it into a feature.

        The 64KB barrier was the reason that most games that used 256 colors (8-bit) were using 320×200 mode (as opposed to most home computers at the time supporting 320×240). 320×200 fits within 64KB. 320×240 needs 75KB, and so has to do tricky things to switch segments and still keep up performance.

        Ach, the good old times.
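The framebuffer arithmetic in the comment above is easy to check. A minimal sketch, assuming one byte per pixel for a 256-colour mode; `framebuffer_bytes` is a hypothetical helper name:

```python
SEGMENT_BYTES = 64 * 1024  # one 8086 segment: 65536 bytes

def framebuffer_bytes(width, height, bits_per_pixel=8):
    """Bytes needed for a packed framebuffer (hypothetical helper)."""
    return width * height * bits_per_pixel // 8

if __name__ == "__main__":
    for width, height in [(320, 200), (320, 240)]:
        size = framebuffer_bytes(width, height)
        fits = size <= SEGMENT_BYTES
        print(f"{width}x{height}: {size} bytes, fits in one segment: {fits}")
```

320×200 comes to 64000 bytes, just under the 65536-byte segment limit, while 320×240 comes to 76800 bytes (75KB) and so cannot be addressed through a single segment.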

    3. The 8088 couldn’t directly run 8080 code. I’m not even sure it was assembly compatible.

      Intel offered software or a service to help the transition.

      Byte, I think in 1980 or 81, had an article on the topic, maybe using macros. A brief search shows a two-part article about translators in June and July of 1982, with some background. I’m not sure if that’s the article I remember.

  2. I worked with a guy who swore CP/M was going to be THE operating system of the future. Our company brought out the NCR Decision Mate V. He bought one and invested heavily in the expansion packs that slid in the back to add a modem, printer port, extra memory, etc. I found the structure of CP/M a bit backwards: pip a:=b: (copy files from b to a) instead of the MS-DOS way of copying from b: to a:. There were more things that I found not exactly logical. I went with the TRS-80 with TRSDOS and DOSPlus. About a year later NCR brought out the PC4i running MS-DOS. He ended up with a huge boat anchor. Now it is worth a lot to collectors.

    1. It WAS THE operating system of the future until the next operating system of the future came out, and then the next, and then the next….

      And now it’s Linux until the next operating system of the future comes out. And believe me it will, it always has. And then users will be reminiscing about how good Linux was… And we (or they if I’m dead by then) will be retrofitting hardware to run that ancient Linux OS.

  3. Great! Now I can continue to use my punchcards, reel-to-reel tapes, toggle switches, and my paper tape reader with my original version of BASIC!

    tl;dr Emulators are where it’s at.

      1. You didn’t use fan-fold on a teletype – you used rolls. Anyone who was anyone used a teletype because it came with its own paper tape punch and reader.

        All you needed was a current-loop serial port…

        1. Wasn’t the paper tape oiled? Or was it from passing through an over-oiled puncher? A question that I’ve had since I learned programming in the 70’s. :-)

          1. One of my favorite memories is when I realized the relationship between which holes were punched and the character they represented. I went on to write a program into which you would type a sentence, and it would ‘write’ it out on the tape in a 5×7 pattern, dot matrix style. I don’t remember if my teacher or grade 7 classmates thought it was cool, but I thought it was!

  4. The 8086/8088 were designed to have an assembler that would take 8085 source and generate 8086 object code. Apparently the early versions of MS-BASIC for 8086 used this, but the memory move subroutine was pretty bad because the idiom didn’t translate very well to the 8086. This is also suspected to be the reason that the undocumented 8085 instructions remained undocumented, because they would have complicated this process.

    This is almost the same idea, except instead of being a cross-assembler, it translates 8085 source code into 8086 source code. This is much easier than writing an entire assembler, and it has a bonus in that the translated code is easy to inspect for correct translation. It is also helped by MS-DOS supporting the CP/M API from the start.

    Doing this from existing binary code would be much more difficult. The instruction lengths are not all going to be the same, so you have to analyze the code to re-assemble it properly. It’s basically the same problem as trying to get a good disassembly from a raw binary. You don’t just need to identify code and data, but also address references stored in data tables. Even worse, sometimes subroutines are written to expect multiple bytes in-line after the call, screwing up a naive disassembly. It just can’t be done without understanding how the code works. I’ve done this many times, and it’s not trivial.
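The in-line data problem described above can be shown with a toy linear-sweep decoder. This is a sketch only: the opcode table covers a handful of real 8080 opcodes rather than the full instruction set, and the program bytes, the routine at 0x0120, and the `linear_sweep` helper are made up for illustration.

```python
# Lengths for a small subset of 8080 opcodes (deliberately incomplete).
LENGTHS = {
    0x00: ("NOP", 1),
    0x3E: ("MVI A", 2),
    0xC3: ("JMP", 3),
    0xC9: ("RET", 1),
    0xCD: ("CALL", 3),
}

def linear_sweep(code, data_offsets=frozenset()):
    """Decode bytes in order; offsets in data_offsets are forced to data."""
    out = []
    pc = 0
    while pc < len(code):
        if pc in data_offsets:
            out.append((pc, "DB"))
            pc += 1
            continue
        name, length = LENGTHS.get(code[pc], ("DB", 1))
        out.append((pc, name))
        pc += length
    return out

# CALL 0x0120, then one in-line data byte '>' (0x3E), then RET.
# The imagined routine at 0x0120 reads the byte and returns past it.
program = bytes([0xCD, 0x20, 0x01, 0x3E, 0xC9])

if __name__ == "__main__":
    print(linear_sweep(program))        # naive: '>' is decoded as MVI A
    print(linear_sweep(program, {3}))   # with the data byte marked by hand
```

In the naive pass the data byte 0x3E is decoded as a two-byte MVI A, which swallows the real RET as its operand. Only the knowledge that offset 3 is data, which comes from understanding what the called routine does, restores the correct listing.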
