One Instruction To Rule Them All: C Compiler Emits Only MOV

How many instructions do you need to successfully compile C code? Let’s see, you’d need some jump instructions, some arithmetic functions, and — of course — move instructions, right? Turns out you only need the move instruction, which — on x86, at least — is Turing complete.

While the effort is a bit tongue-in-cheek, we have to admit that if you were trying to create your own CPU, this would make for a simple architecture and might have power or complexity advantages, so maybe someone will find a practical use for it after all. If you wanted a C compiler for a simple CPU, this wouldn’t require much to emulate at a byte-code level, either.

How does it work? The best way to answer that is to look at the presentation slides on the GItHub site. Everything works through clever usage of memory locations and dummy addresses to do nothing. Efficient? Maybe not. But it does work.

For example, if you want to test to see if x and y are equal

    mov [x],0
    mov [y],1
    mov R,[x]

That hides a lot of detail, of course, but the idea is that if Y doesn’t equal X, you’ll write a 1 to some random memory location (which better not be important; that’s one of the details). If X and Y are equal, you’ll overwrite with a 1. So R winds up with a 0 if X and Y are different or a 1 if they are the same.

In practical terms, if you want to translate something like “if (x==y) x=100” you wind up setting an integer pointer to either point to X or to a dummy location using the above trick (just substitute the address of X for 1 and the dummy address for the 0). Then you store the 100 to that integer pointer. With no branches, each line of code executes.

In fact, though, the compiler makes a few compromises including using a jump to call external functions and a floating point instruction. You can avoid both by recompiling libraries and adding the MOV-only floating point emulator, which is entertaining, but probably not very performant.

This reminded us of several “one instruction” computers we’ve seen in the past (including mine). While this is a bit of a strange compiler target, it is hardly the strangest one we’ve seen.

17 thoughts on “One Instruction To Rule Them All: C Compiler Emits Only MOV

  1. I would point out that mov isn’t actually a single instruction but rather is a type of instruction. The difference between instructions is the addressing mode and of course it’s operands. Still an impressive feat though.

    1. That contradicts with the assembly and machine code books I learned with. Many of us count “instructions” using the assembly mnemonics, not the hex opcodes; afterall, we read assembly instructions when reading asm–we don’t read hex opcode’s–they aren’t even displayed.

      Wikipedia has continued this count disparity; it lists the 6502 as having 56 instructions, with the microchip pic having “35 to 80”. However, 6502 uses up 151 of the possible 256 opcodes… And microchips asm “instructions” are 1:1 with opcodes, even though there may be a half dozen that are just different targets for mov. But again, when I’m reading assembly code, I read it as asm instructions, not hex opcode’s, and I prefer the “instruction count” method over the “opcode instruction count” method.

      (And yes, I had to hand translate that mov to hex myself, because I couldn’t afford an assembler… Uphill, both ways, with snow on my keyboard! ;)

      1. I too agree that it is a difference between an instruction and the various calls it can make.

        For an example, X86 has 90 something different conditional jumps, but it is all just the same instruction, but comparing different registers and looking for different results from the comparison.

        One could though argue that an integer comparison and a float one aren’t the same instruction. Or that comparing equality, or greater/smaller are also different instructions. Since different hardware is typically needed.

        Though, then microcode comes in and muddles the waters even more, but at least here we can call them “microcode instructions” and put them in a category of their own.

        But generally as long as it is just a change in what registers the instruction call interacts with, it is typically not seen as a different instruction. But beyond that, the lines gets fuzzy and fast.

        In the end, instruction count is a fairly debatable topic, but at least there is some logical outer bounds for both sides.

        I at least haven’t seen any practical definitions that makes the instruction count fixed without having to make an actual list of what is and isn’t a new instruction within a set of “instructions”/up-codes of similar nature.

      2. When looking at automata theory/theoretical computer science, you usually would care about the kind of operations you need, reducing them to the absolute minimum to make a working compiler or Turing complete language. Having an instruction that has widely differing complexity depending on what variation of the instruction you pick would be considered “cheating”.

        It’s essentially hiding the complexity, while it should be obvious in software.

        Getting data from a memory location, or getting it indirectly is a big difference, if you assume a Turning machine, the latter can be a lot more instructions/higher runtime.

        The way the instruction executes should be similar, because otherwise, you are not really reducing the amount of types of instructions.

      3. These things are easy to confuse with modern CPUs that generally don’t have different addressing modes within one mnemonic.

        I use an old CPU, the Z80 as an example –

        Mnemonic: LD r, n
        Load the bit value “n” into the the location “r” where “r” one of :
        A – A register, direct immediate
        B – B register, direct immediate
        C – C register, direct immediate
        D -D register, direct immediate
        E – E register, direct immediate
        H – H register, direct immediate
        L – L register, direct immediate
        (HL) – the memory location indexed by HL, indirect addressing

        The mnemonic is not an instruction as it cannot be executed by the CPU as the mnemonic has not specified any parameters for “n” or “r”

        To go one step further you could specify these –

        LD B, 0hF0

        which would translate to 8 bit opcode: 8 bit immediate value

        and then you have an instruction that can be executed by the CPU.

        As there are 8 choices for “r”, there are 8 instructions for this one mnemonic.

        Also note that the last instruction (HL) has different addressing mode and a completely different cycle length even though it has the same byte length

        LD (HL), n

        8 bit opcode: 8 bit immediate value

        However the LD B, 0hF0 does two memory reads and LD (HL), 0hF0 does two memory reads and then a memory write.

        There are also other discrepancies in the way instruction decode at a hardware level on the CPU and how they are written as mnemonics or instructions.

        Typical ones:

        SCF, set carry flag
        RCF reset carry flag
        CCF compliment carry flag

        At a hardware level these are usual the equivalent to four opcodes which more of less do this –

        Load carry, 1
        Load carry, 0
        Load carry, carry complement
        Load carry, carry

        The last one never becomes an instruction however that opcode will execute and basically do nothing

    1. I was thinking this, too. This is a great way to obfuscate your compiled code if you’re willing to take the considerable performance hit.

      Although it would probably be easy for antivirus software to detect as a heuristic. Mostly MOVs? Probably something suspicious.

      Maybe this would be better applied to obfuscate particular sections of your binary rather than the whole thing to avoid setting off warning bells.

Leave a Reply to JoelCancel reply

Please be kind and respectful to help make the comments section excellent. (Comment Policy)

This site uses Akismet to reduce spam. Learn how your comment data is processed.