Bootstrapping An MSDOS Assembler With Batch Files

You have a clean MSDOS system, and you need to write some software for it. What do you do? You could use debug, of course. But debug has no labels, so while it will turn mnemonics into machine code, you still have to work out the addresses on your own. That wasn’t good enough for [mniip], who created an assembler using mostly batch files. There are a few .COM files, which it appears you create with debug the first time through, but there is also source for them that you can build with the assembler itself on subsequent builds.

Why? We aren’t entirely sure. But it is definitely a hack. The technique sort of reminded us of our own universal cross assembler — sort of.

There are a few things that make this work. First, there are not many 8086 instructions to worry about. Second, you have to use a special format, essentially prefixing every op code with CALL. This keeps the assembler from having to parse op codes: each instruction is its own batch file, and you actually call the batch file with the name of the instruction. For example:


REM H e l l o , w
CALL DB 72 101 108 108 111 44 32 119

Labels are another nuance. You have to CALL LABEL to introduce a label. To use the label in an instruction, you have to surround it with percent signs, the same syntax a batch file uses to expand an environment variable.
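To see why the CALL format and percent-sign labels get you so far, here is a rough sketch in Python. It is not [mniip]'s code and makes no claim about how the real batch files work. It assumes, purely for illustration, that labels sit in a variable table that is substituted into %NAME% references before each instruction handler runs, and that two passes are made so forward references resolve. The `assemble` and `expand` functions, the `JMP` handler, and the sample program are all hypothetical.

```python
import re

def expand(line, env):
    """Replace %NAME% with env[NAME]; unknown names become 0 (first pass only)."""
    return re.sub(r"%(\w+)%", lambda m: str(env.get(m.group(1), 0)), line)

def assemble(source, origin=0x100):
    """Toy two-pass assembler for lines of the form 'CALL <OP> <args...>'."""
    env = {}  # label name -> address, standing in for environment variables
    out = bytearray()
    for _ in range(2):  # pass 1 learns label addresses, pass 2 emits final bytes
        out = bytearray()
        for raw in source.splitlines():
            words = expand(raw, env).split()
            if len(words) < 2 or words[0].upper() != "CALL":
                continue  # blank lines, REM comments, anything that is not a CALL
            op, args = words[1].upper(), words[2:]
            if op == "DB":
                # Raw data bytes, given in decimal as in the article's snippet
                out += bytes(int(a) & 0xFF for a in args)
            elif op == "LABEL":
                # Record the current address under the label's name
                env[args[0]] = origin + len(out)
            elif op == "JMP":
                # 8086 short jump: EB rel8, relative to the next instruction.
                # This sketch does not check that the target is in range.
                rel = int(args[0]) - (origin + len(out) + 2)
                out += bytes([0xEB, rel & 0xFF])
            else:
                raise ValueError("unknown instruction: " + op)
    return bytes(out)

PROGRAM = """
CALL LABEL START
CALL JMP %END%
REM H i
CALL DB 72 105
CALL LABEL END
CALL JMP %START%
"""

print(assemble(PROGRAM).hex())  # eb024869ebfa
```

The point of the sketch is that nothing ever parses a mnemonic. The first word after CALL picks a handler, and label references are already plain numbers by the time the handler sees them.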

Of course, as a practical matter, you could use gcc to build a proper assembler. But where’s the sport in that?

26 thoughts on “Bootstrapping An MSDOS Assembler With Batch Files”

  1. It’s a clever hack, though not a very useful one. He has 143 batch files totaling 2403 lines at 580 KB! For reference, the MASM 6.11 minimal command-line run-time, including linker and librarian, is 700 KB. Sure, one *could* use edlin or copy con to enter all 2403 lines by hand rather than copying MASM over on a floppy, but why? And presumably an earlier version of MASM or another assembler would be smaller. Most of the batch files are not simple, either. It would be as tedious to hand enter the batch files as to enter the printed hex dump of a working assembler in debug (for a true bootstrap experience).

  2. Such luxury to have DOS to bootstrap an assembler.

    In the days long before the internet there was no such thing as code distribution or sharing except for some pages in a magazine that were usually written in hex.

    If you wanted an assembler then you had to write one in hex. That was such a daunting task that people tackled a disassembler first. It was much easier to write: the input is far better defined, you don’t have to store a lot of variables, and it’s simply a single pass process.

    By the time you have finished writing a disassembler you know how to convert every mnemonic to hex, either from memory or by thinking the process through as the op-code decoder does, so there is no longer any real need to write an assembler.

    So every time someone asks you if you have finished the “assembler” that you once said you would write, you just answer “not yet” while you are programming directly in hex, without telling them you no longer intend to.

    1. You’re missing the point. The *ONLY* way you could get the functionality written about was to enter it into the computer yourself. It was a GREAT way to get young minds into programming.

      Now you just download it and use it, but never learn anything from it.

      1. Loved that about DOS.

        I once wrote a realtime fairly high speed datalogger in assembly using debug. The little .com streamed the contents from the LPT buffer to a file on the HDD along with a time stamp in CSV. Found I could double the write speed if I removed the loop to close the file. The way to stop datalogging was to power down.

        Of course, the system then complained of a corrupted file system on reboot. A very simple program was then written and called on boot to properly close the file. Worked pretty well on curbside (dumpster grade) 286 machines back in the day.

        1. Great job! A few years ago I made a crude oscilloscope for the LPT. It had 500 000 samples/second acquisition speed. But I only got there after I started using a very tight GOTO loop :) in compiled QBasic, instead of LOOP/FOR/WHILE structured cr.p. Some other optimizations were needed as well. But the thing flew.

    1. c:\> debug
      -a 100
      1373:0100 mov ah,9
      1373:0102 mov dx,108
      1373:0105 int 21
      1373:0107 ret
      1373:0108 db "Hello world!$"
      1373:0115
      -n c:\
      -r bx
      BX 0000
      :
      -r cx
      CX 0000
      :15
      -w
      Writing 00015 bytes
      -q

      c:\> c:\
      Hello world!
