One Instruction To Rule Them All: C Compiler Emits Only MOV

May 21, 2021

How many instructions do you need to successfully compile C code? Let’s see, you’d need some jump instructions, some arithmetic functions, and — of course — move instructions, right? Turns out you only need the move instruction, which — on x86, at least — is Turing complete.

While the effort is a bit tongue-in-cheek, we have to admit that if you were trying to create your own CPU, this would make for a simple architecture and might have power or complexity advantages, so maybe someone will find a practical use for it after all. If you wanted a C compiler for a simple CPU, this wouldn’t require much to emulate at a byte-code level, either.

How does it work? The best way to answer that is to look at the presentation slides on the GItHub site. Everything works through clever usage of memory locations and dummy addresses to do nothing. Efficient? Maybe not. But it does work.

For example, if you want to test to see if x and y are equal

    mov [x],0
    mov [y],1
    mov R,[x]

That hides a lot of detail, of course, but the idea is that if Y doesn’t equal X, you’ll write a 1 to some random memory location (which better not be important; that’s one of the details). If X and Y are equal, you’ll overwrite with a 1. So R winds up with a 0 if X and Y are different or a 1 if they are the same.

In practical terms, if you want to translate something like “if (x==y) x=100” you wind up setting an integer pointer to either point to X or to a dummy location using the above trick (just substitute the address of X for 1 and the dummy address for the 0). Then you store the 100 to that integer pointer. With no branches, each line of code executes.

In fact, though, the compiler makes a few compromises including using a jump to call external functions and a floating point instruction. You can avoid both by recompiling libraries and adding the MOV-only floating point emulator, which is entertaining, but probably not very performant.

This reminded us of several “one instruction” computers we’ve seen in the past (including mine). While this is a bit of a strange compiler target, it is hardly the strangest one we’ve seen.

17 thoughts on “One Instruction To Rule Them All: C Compiler Emits Only MOV”

Joel says:

May 21, 2021 at 8:17 pm

Congratulations, the gcc project finally is -O3 flag safe on all x86 cpus after several decades.
;-)

Report comment

Reply
Gravis says:

May 21, 2021 at 8:39 pm

I would point out that mov isn’t actually a single instruction but rather is a type of instruction. The difference between instructions is the addressing mode and of course it’s operands. Still an impressive feat though.

Report comment

Reply
1. Tenaja says:
  
  May 21, 2021 at 10:03 pm
  
  That contradicts with the assembly and machine code books I learned with. Many of us count “instructions” using the assembly mnemonics, not the hex opcodes; afterall, we read assembly instructions when reading asm–we don’t read hex opcode’s–they aren’t even displayed.
  
  Wikipedia has continued this count disparity; it lists the 6502 as having 56 instructions, with the microchip pic having “35 to 80”. However, 6502 uses up 151 of the possible 256 opcodes… And microchips asm “instructions” are 1:1 with opcodes, even though there may be a half dozen that are just different targets for mov. But again, when I’m reading assembly code, I read it as asm instructions, not hex opcode’s, and I prefer the “instruction count” method over the “opcode instruction count” method.
  
  (And yes, I had to hand translate that mov to hex myself, because I couldn’t afford an assembler… Uphill, both ways, with snow on my keyboard! ;)
  
  Report comment
  
  Reply
  1. Alexander Wikström says:
    
    May 22, 2021 at 1:04 am
    
    I too agree that it is a difference between an instruction and the various calls it can make.
    
    For an example, X86 has 90 something different conditional jumps, but it is all just the same instruction, but comparing different registers and looking for different results from the comparison.
    
    One could though argue that an integer comparison and a float one aren’t the same instruction. Or that comparing equality, or greater/smaller are also different instructions. Since different hardware is typically needed.
    
    Though, then microcode comes in and muddles the waters even more, but at least here we can call them “microcode instructions” and put them in a category of their own.
    
    But generally as long as it is just a change in what registers the instruction call interacts with, it is typically not seen as a different instruction. But beyond that, the lines gets fuzzy and fast.
    
    In the end, instruction count is a fairly debatable topic, but at least there is some logical outer bounds for both sides.
    
    I at least haven’t seen any practical definitions that makes the instruction count fixed without having to make an actual list of what is and isn’t a new instruction within a set of “instructions”/up-codes of similar nature.
    
    Report comment
    
    Reply
  2. None says:
    
    May 22, 2021 at 5:50 am
    
    When looking at automata theory/theoretical computer science, you usually would care about the kind of operations you need, reducing them to the absolute minimum to make a working compiler or Turing complete language. Having an instruction that has widely differing complexity depending on what variation of the instruction you pick would be considered “cheating”.
    
    It’s essentially hiding the complexity, while it should be obvious in software.
    
    Getting data from a memory location, or getting it indirectly is a big difference, if you assume a Turning machine, the latter can be a lot more instructions/higher runtime.
    
    The way the instruction executes should be similar, because otherwise, you are not really reducing the amount of types of instructions.
    
    Report comment
    
    Reply
  3. RÖB says:
    
    May 22, 2021 at 10:12 pm
    
    These things are easy to confuse with modern CPUs that generally don’t have different addressing modes within one mnemonic.
    
    I use an old CPU, the Z80 as an example –
    
    Mnemonic: LD r, n
    Load the bit value “n” into the the location “r” where “r” one of :
    A – A register, direct immediate
    B – B register, direct immediate
    C – C register, direct immediate
    D -D register, direct immediate
    E – E register, direct immediate
    H – H register, direct immediate
    L – L register, direct immediate
    (HL) – the memory location indexed by HL, indirect addressing
    
    The mnemonic is not an instruction as it cannot be executed by the CPU as the mnemonic has not specified any parameters for “n” or “r”
    
    To go one step further you could specify these –
    
    LD B, 0hF0
    
    which would translate to 8 bit opcode: 8 bit immediate value
    
    and then you have an instruction that can be executed by the CPU.
    
    As there are 8 choices for “r”, there are 8 instructions for this one mnemonic.
    
    Also note that the last instruction (HL) has different addressing mode and a completely different cycle length even though it has the same byte length
    
    LD (HL), n
    
    8 bit opcode: 8 bit immediate value
    
    However the LD B, 0hF0 does two memory reads and LD (HL), 0hF0 does two memory reads and then a memory write.
    
    There are also other discrepancies in the way instruction decode at a hardware level on the CPU and how they are written as mnemonics or instructions.
    
    Typical ones:
    
    SCF, set carry flag
    RCF reset carry flag
    CCF compliment carry flag
    
    At a hardware level these are usual the equivalent to four opcodes which more of less do this –
    
    Load carry, 1
    Load carry, 0
    Load carry, carry complement
    Load carry, carry
    
    The last one never becomes an instruction however that opcode will execute and basically do nothing
    
    Report comment
    
    Reply
Alphatek says:

May 22, 2021 at 2:56 am

Move machines (apparently now called transport triggered architecture) have been around for well over 30 years. Nice to see one in the modern world though.

Report comment

Reply
1. YGDES says:
  
  May 22, 2021 at 2:25 pm
  
  Here this is not a TTA system…
  
  Report comment
  
  Reply
raster says:

May 22, 2021 at 4:10 am

Mmm… a trojan using that kind of code could be really, really hard to understand…

Report comment

Reply
1. David W Lambert says:
  
  May 22, 2021 at 6:02 am
  
  And the fun one could have with the “random” addresses.
  
  Report comment
  
  Reply
2. Tacker says:
  
  May 23, 2021 at 10:11 am
  
  I was thinking this, too. This is a great way to obfuscate your compiled code if you’re willing to take the considerable performance hit.
  
  Although it would probably be easy for antivirus software to detect as a heuristic. Mostly MOVs? Probably something suspicious.
  
  Maybe this would be better applied to obfuscate particular sections of your binary rather than the whole thing to avoid setting off warning bells.
  
  Report comment
  
  Reply
Justin S Davis says:

May 22, 2021 at 8:22 am

It would be simple (and fun) to make a CPU with only a mov instruction…

https://hackaday.io/project/21298-one-instruction-ttl-computer

Report comment

Reply
Timescale says:

May 22, 2021 at 8:36 am

Alright, but how about introducing self modifying code to this concept? Ducks….

Report comment

Reply
Instruction says:

May 22, 2021 at 9:23 am

The same argument could be to have a cpu with a single ‘execute C source’ instruction.

Report comment

Reply
8zKJgranJPeLuSjtt8PT says:

May 22, 2021 at 2:49 pm

Those links are wrong. The attribution is wrong. This is xoreaxeaxeax’s work. https://github.com/xoreaxeaxeax/movfuscator. The links are just forks of the original repo.

Here is the original presentation: https://www.youtube.com/watch?v=HlUe0TUHOIc. In it he does note that he works there but still this is just the fork not the original.

Report comment

Reply
Anonymous malware analyst says:

May 23, 2021 at 7:04 am

I would hate to be tasked with reverse engineering malware compiled in this way.

Report comment

Reply
1. Heath says:
  
  May 23, 2021 at 4:27 pm
  
  https://github.com/kirschju/demovfuscator
  
  Report comment
  
  Reply

Hackaday

One Instruction To Rule Them All: C Compiler Emits Only MOV

17 thoughts on “One Instruction To Rule Them All: C Compiler Emits Only MOV”

Leave a Reply to JoelCancel reply

Search

Never miss a hack

If you missed it

Supercon 2024: Exploring The Ocean With Open Source Hardware

Porting COBOL Code And The Trouble With Ditching Domain Specific Languages

Jenny’s (Not Quite) Daily Drivers: Raspberry Pi 1

In 2025, The Philly Maker Faire Finds Its Groove

Which Browser Should I Use In 2025?

Our Columns

FLOSS Weekly Episode 829: This Machine Kills Vogons

Announcing The Hackaday Pet Hacks Contest

Keebin’ With Kristina: The One With John Lennon’s Typewriter

Linux Fu: Stopping A Runaway

Hackaday Links: April 13, 2025

17 thoughts on “One Instruction To Rule Them All: C Compiler Emits Only MOV”

Leave a Reply to JoelCancel reply

Search

Never miss a hack

Subscribe

If you missed it

Our Columns