In the 1970s, CPUs still took wildly different approaches to even basic features, and the Intel 8086 is a case in point. Whereas the 6502 used separate circuits for its operations, and the Intel 8085 a clump of reconfigurable gates, the 8086 uses microcode together with two lookup tables to configure its ALU. This complexity is one of the reasons the Intel 8086 is so unique, and [Ken Shirriff] takes an in-depth look at its workings at both the functional and the die level.
These lookup tables handle the ALU configuration – as in the schematic above – making for a very flexible but also complex system in which the same microcode can be shared by multiple instructions. This is effectively the very definition of a CISC-style processor, a legacy the x86 ISA still carries even though today’s x86 CPUs are internally more RISC-like. Decoding a single instruction and having it cascade into any of a variety of microcode sequences and control signals is very powerful, but it comes with plenty of trade-offs.
Of course, as semiconductor and design technology improved, many of these trade-offs and disadvantages became less relevant. [Ken] also raises the interesting point that this style of ALU control resembles what modern-day FPGAs do, with their reconfigurable logic built from LUTs that allow for on-the-fly reconfiguration.
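For the curious, here is a minimal sketch of the general idea in C. It is emphatically not the 8086’s real microcode word format or LUT contents, just an illustration of how a lookup-table entry can decide what the ALU does while several instructions share the same micro-routine:

    /* Toy model: a micro-instruction carries an index into an ALU
     * lookup table, so the same micro-step serves many instructions
     * and only the LUT entry decides the actual operation. */
    #include <stdint.h>
    #include <stdio.h>

    typedef uint16_t (*alu_op)(uint16_t a, uint16_t b);

    static uint16_t op_add(uint16_t a, uint16_t b) { return a + b; }
    static uint16_t op_sub(uint16_t a, uint16_t b) { return a - b; }
    static uint16_t op_and(uint16_t a, uint16_t b) { return a & b; }
    static uint16_t op_xor(uint16_t a, uint16_t b) { return a ^ b; }

    /* The "lookup table" that configures the ALU for a micro-step. */
    static alu_op alu_lut[4] = { op_add, op_sub, op_and, op_xor };

    /* A micro-instruction: LUT index plus (simplified) register selects. */
    struct microinst {
        uint8_t lut_index;
        uint8_t src_a, src_b, dst;
    };

    static uint16_t regs[8];

    static void execute(const struct microinst *u)
    {
        regs[u->dst] = alu_lut[u->lut_index](regs[u->src_a], regs[u->src_b]);
    }

    int main(void)
    {
        regs[0] = 7; regs[1] = 5;
        /* Two different "instructions" reuse the same micro-step;
         * only the LUT index differs. */
        struct microinst add_step = { 0, 0, 1, 2 };
        struct microinst sub_step = { 1, 0, 1, 3 };
        execute(&add_step);
        execute(&sub_step);
        printf("add=%u sub=%u\n", (unsigned)regs[2], (unsigned)regs[3]);
        return 0;
    }

The real chip naturally does this with control lines and ROM bits rather than function pointers, but the sharing principle is the same.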

Modern x64 CPUs are pretty much ARM-like, because that offers greater performance in daily applications and games.
I don’t get what you mean. x86 is still very much CISC, and the overheads of being CISC will always be present. CISC doesn’t mean slow or power-inefficient, though, especially today, where the inefficiencies are mostly non-architectural.
They’re RISC-CISC hybrids, I think. RISC core with a CISC front-end, in layman’s terms.
And this combo isn’t so bad, actually.
CISC instructions get broken down into smaller micro-ops, which can be dispatched to several pipelines for parallel processing.
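Roughly like this, as a made-up C sketch (the micro-op names and layout here are invented, not any real CPU’s internal encoding):

    /* A memory-destination CISC instruction gets "cracked" into
     * simpler micro-ops that separate pipeline ports can schedule. */
    #include <stdio.h>

    enum uop_kind { UOP_LOAD, UOP_ALU_ADD, UOP_STORE };

    struct uop {
        enum uop_kind kind;
        const char *text;
    };

    int main(void)
    {
        /* x86 "add [mem], eax" expressed as three micro-ops */
        struct uop cracked[] = {
            { UOP_LOAD,    "tmp   <- load [mem]"  },
            { UOP_ALU_ADD, "tmp   <- tmp + eax"   },
            { UOP_STORE,   "[mem] <- store tmp"   },
        };
        for (unsigned i = 0; i < 3; i++)
            printf("uop %u: %s\n", i, cracked[i].text);
        return 0;
    }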
The really complex part is cache coherency and predicting which instruction or code sequence will be needed next.
A cache flush caused by a cache miss can mean a big performance penalty.
That’s why self-modifying code fell out of favor. It didn’t work efficiently on 486 and higher CPUs anymore.
The last true x86 CISC designs were the 286/386, perhaps.
Though by the ’90s, 386 systems often had external cache on the motherboard,
which suffered cache misses from self-modifying code.
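As a toy illustration of why self-modifying code and prefetching/caching don’t get along (a simplified C model, not cycle-accurate for any particular CPU):

    /* A prefetch queue grabs bytes ahead of execution, so a store that
     * rewrites an already-fetched byte isn't seen until the queue (or
     * cache line) gets flushed, e.g. by a jump on the 8086. */
    #include <stdint.h>
    #include <stdio.h>

    #define QUEUE_LEN 4

    int main(void)
    {
        uint8_t code[8] = { 10, 20, 30, 40, 50, 60, 70, 80 };  /* fake opcodes */
        uint8_t queue[QUEUE_LEN];

        for (int i = 0; i < QUEUE_LEN; i++)   /* prefetch ahead */
            queue[i] = code[i];

        code[2] = 99;   /* the program "modifies itself" in memory */

        /* Execution still sees the stale prefetched copy... */
        printf("executed byte 2 = %u (memory says %u)\n",
               (unsigned)queue[2], (unsigned)code[2]);

        /* ...until something forces a re-fetch. */
        for (int i = 0; i < QUEUE_LEN; i++)
            queue[i] = code[i];
        printf("after flush, byte 2 = %u\n", (unsigned)queue[2]);
        return 0;
    }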
And the 8086/8088 did address calculations in the ALU, which was also time-consuming.
By contrast, the much more sophisticated 286 had a dedicated circuit for such calculations (it also had a real MMU by the way).
The 8018x and NEC V20/V30 may have had one as well, but I’m not sure right now.
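For reference, the effective-address math itself is just a couple of additions. Here is a small C sketch of the real-mode calculation (cycle costs left out, since they vary per addressing mode):

    /* E.g. MOV AX, [BX+SI+disp]:
     * EA = base + index + displacement (with 16-bit wrap-around),
     * linear = segment * 16 + EA.
     * The 8086 ran these adds through its one shared ALU; the 286
     * added dedicated address-calculation hardware instead. */
    #include <stdint.h>
    #include <stdio.h>

    static uint16_t effective_address(uint16_t base, uint16_t index, uint16_t disp)
    {
        return (uint16_t)(base + index + disp);
    }

    static uint32_t linear_address(uint16_t segment, uint16_t ea)
    {
        return ((uint32_t)segment << 4) + ea;
    }

    int main(void)
    {
        uint16_t bx = 0x1000, si = 0x0020, disp = 0x0004, ds = 0x2000;
        uint16_t ea = effective_address(bx, si, disp);
        printf("EA=%04X linear=%05X\n", (unsigned)ea,
               (unsigned)linear_address(ds, ea));
        return 0;
    }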
All in all, the 8086 was “okay”, though. Laughing about it wasn’t necessary. By late-’70s standards it was fairly decent, actually.
The crippled 8088 derivative was much worse (performance roughly cut in half by its 8-bit bus, putting it behind the 6510).
Still, I think the NEC V30 was the better overall design.
It compared to the 8086 the way the Z80 did to the ancient 8080.
If needed, the NEC chips could also “emulate” (mimic) 8080 instruction handling through a register-mapping technique,
making them more of a true 8080 descendant than the original 8086, even.
Paradoxically, that means the NEC V30 processes things at a level closer to the “bare metal” than the 8086 (more natively),
especially if it relies less on microcode (which seems to be the case).
I know, it’s a bit of an unpopular opinion, maybe. 🙁