WheatForce: Learning From CPU Architecture Mistakes

April 1, 2026

Nothing ever made is truly perfect, and indeed CPU architectures like x86, RISC-V, ARM, and PowerPC all have their own upsides and downsides. Today, I aim to make an architecture that learns from all these mistakes and improves architecture design for everyone.

I’ve consulted with many people opinionated on the matter, from both a software and a hardware perspective. I have kept all their feedback in mind while creating this initial draft of the WheatForce architecture (PDF). It is inspired by pieces from many architectures: segmentation inspired by x86, hash table-like paging from PowerPC, dynamic endianness control from RISC-V and PowerPC, and more. Let’s look into each feature in a little bit more detail.

Segmentation

The local descriptor table (left) points to main memory (right) via its segment descriptors. x86’s segmentation scheme, by [John] on Wikipedia

Segmentation is a powerful virtual-memory feature that is tragically underused today. I believe this is due to limited flexibility, so I have added an improvement over the model that x86 used: every single register can now use its own segment selector. With this added flexibility, one can surely make better use of the address translation powers of segmentation with minimal extra overhead.
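
To make the idea concrete, here is a minimal Python sketch of segment-based address translation with one selector per register. It is only an illustration under assumptions: the descriptor layout (a base and a size in bytes), the table contents, and the names `SegmentDescriptor`, `translate`, and `load_address` are all hypothetical and are not taken from the WheatForce specification. Real x86 descriptors also carry access rights and encode the limit differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentDescriptor:
    base: int   # linear address where the segment starts
    limit: int  # size of the segment in bytes (simplified assumption)


class SegmentFault(Exception):
    """Raised when an offset falls outside its segment."""


def translate(table, selector, offset):
    """Translate (selector, offset) into a linear address."""
    desc = table[selector]
    if not 0 <= offset < desc.limit:
        raise SegmentFault(f"offset {offset:#x} outside segment {selector}")
    return desc.base + offset


# Hypothetical descriptor table and per-register selector assignment.
descriptor_table = {
    0: SegmentDescriptor(base=0x0000, limit=0x1000),
    1: SegmentDescriptor(base=0x8000, limit=0x0100),
}
register_selector = {"r1": 0, "r2": 1}  # each register carries its own selector


def load_address(register, offset):
    """Resolve an access made through `register` at `offset`."""
    return translate(descriptor_table, register_selector[register], offset)
```

In this sketch the same offset resolves to a different linear address depending on which register is used for the access, since `r1` and `r2` point at different descriptors.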

Hash Table-Like Paging

PowerPC’s hash table-like paging makes its paging vastly superior to the likes of x86, RISC-V and ARM by decreasing the number of required cache line fetches drastically. Much like a true hash table, the keys (or input addresses) are hashed and then used as an index into the table. From there, that row of the table is searched for a cell with a matching virtual address, which can be accelerated greatly due to superior cache locality of the entries in this row.
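
The lookup described above can be sketched roughly as follows. This is a toy model, not PowerPC's actual format: the page size, the number of rows, the row width of eight entries, the hash function, and the `HashedPageTable` class are all assumptions chosen for illustration, and details such as a secondary hash or eviction are left out.

```python
PAGE_SHIFT = 12    # 4 KiB pages (assumed)
GROUP_COUNT = 64   # rows in the table (assumed, power of two)
GROUP_SIZE = 8     # entries searched per row (assumed)


def hash_vpn(vpn):
    """Toy hash: fold upper bits of the virtual page number into a row index."""
    return (vpn ^ (vpn >> 6)) & (GROUP_COUNT - 1)


class HashedPageTable:
    def __init__(self):
        # Each row holds up to GROUP_SIZE (vpn, pfn) pairs, or None if free.
        self.groups = [[None] * GROUP_SIZE for _ in range(GROUP_COUNT)]

    def insert(self, vpn, pfn):
        """Map virtual page `vpn` to physical frame `pfn`; False if the row is full."""
        row = self.groups[hash_vpn(vpn)]
        for i, entry in enumerate(row):
            if entry is not None and entry[0] == vpn:
                row[i] = (vpn, pfn)  # update an existing mapping
                return True
        for i, entry in enumerate(row):
            if entry is None:
                row[i] = (vpn, pfn)  # take the first free slot
                return True
        return False  # a real system would evict or try a secondary hash

    def translate(self, vaddr):
        """Return the physical address for `vaddr`, or None on a miss."""
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        # Only one row is searched, so the candidates sit next to each other.
        for entry in self.groups[hash_vpn(vpn)]:
            if entry is not None and entry[0] == vpn:
                return (entry[1] << PAGE_SHIFT) | offset
        return None
```

A miss here simply returns `None`; in hardware it would raise a page fault for software to resolve.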

Dynamic Endianness Control

A diagram of PowerPC’s paging structures, from the PowerPC manual

RISC-V and PowerPC both have some real potential for better compatibility with their dynamic endianness control. However, both these architectures can only change the endianness from a privileged context. To make this more flexible, WheatForce can change the data endianness at any time with a simple instruction. Now, user software can directly interoperate between big-endian and little-endian data structures, eliminating the need for a costly byte-swap sequence that would need many instructions. Finally, you can have your cake and eat it too!
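
As a small illustration of what is being traded off, the Python sketch below reads the same four bytes under both byte orders and compares that with an explicit byte swap. The `big_endian` flag is only a stand-in for whatever mode bit such an instruction would toggle; it and the helper names `load_u32` and `byte_swap32` are assumptions, not part of any real ISA or of the WheatForce specification.

```python
def load_u32(memory, address, big_endian):
    """Load a 32-bit unsigned value using the currently selected byte order."""
    raw = bytes(memory[address:address + 4])
    return int.from_bytes(raw, "big" if big_endian else "little")


def byte_swap32(value):
    """Reverse the byte order of a 32-bit value (the software alternative)."""
    return int.from_bytes(value.to_bytes(4, "little"), "big")


memory = bytearray([0x12, 0x34, 0x56, 0x78])

as_big = load_u32(memory, 0, big_endian=True)      # 0x12345678
as_little = load_u32(memory, 0, big_endian=False)  # 0x78563412
```

Here `byte_swap32(as_big)` gives the same value as `as_little`, which is the conversion a mode switch would make implicit.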

Conclusion

WheatForce has observed the mistakes of all architectures before it, and integrates parts of all its predecessors. You can read the full specification on GitHub. After you’ve read it, do let me know what you think of it.

20 thoughts on “WheatForce: Learning From CPU Architecture Mistakes”

Greg A says:

April 1, 2026 at 7:23 am

my general feeling is most of these details don’t really matter too much. superscalar / out-of-order / speculative execution has pretty much solved all of the obvious mistakes. I remember when i started out on a 286, the common knowledge was that instructions take time. Then in the second half of the 1990s the lore was that only conditional branches take time. And then i gradually became aware that truly, the only thing that takes time is a cache miss. And now i have finally come to accept that and integrate it into my practices and i’m still astonished how true it is. I will do something like remove half the instructions from an inner loop — including 6 conditional branches — and it makes 0 improvement…but then i reduce the size of the loop a little (to visit less memory) and it makes a huge gain because now it fits in a better layer of cache.

But anyways, i’m surprised to see paging architecture on this list. I’m pretty ignorant about it but it seems to me, if the data itself is in cache, then its address translation will probably be in the TLB. If the data is outside of cache, it’s gonna be slow to get the data itself regardless of how slow it is to do the lookup. I’m not denying that there’s an advantage…it just doesn’t seem super significant to me. I guess that’s my own bias, i’ve just come to believe that cache misses are infinitely expensive and there’s no point optimizing them :)

1. Johan says:
  
  April 1, 2026 at 7:44 am
  
  Yeah…I think he is going to relearn all the mistakes we made the past 50 years…especially that gazillion selector registers.
  
  As to the cache differences you mentioned?…I agree. Having to access main memory is like flying at warp speed, and then hitting a 20 mile long patch of loose sea sand, plowing to an almost complete halt. I still cry myself to sleep sometimes thinking of where we could have been if we didn’t go down the DRAM rabbit hole…
  
  1. Jan says:
    
    April 1, 2026 at 10:54 am
    
    “where we could have been if we didn’t go down the DRAM rabbit hole…”
    
    Interesting thought. But I suspect if DRAM didn’t exist we would have had different computers with more expensive and smaller memory. Although we’d never know.
    
Joseph Eoff says:

April 1, 2026 at 7:42 am

After you’ve read it, do let me know what you think of it.

April Fool’s day!

1. Pat says:
  
  April 1, 2026 at 8:25 am
  
  Took until I read “dynamic endianness control” and then it was yup, April 1.
  
  1. JohnU says:
    
    April 2, 2026 at 1:54 am
    
    Don’t some processors have that? I’m sure I’ve used one where you could switch the endianness.
    
    1. Christian says:
      
      April 2, 2026 at 10:58 am
      
      ARM does, I think. But as far as I know every OS builder just left it on the little endian setting.
      
Stuart Little says:

April 1, 2026 at 8:15 am

As a celiac, I cannot tolerate the architecture.

Paul says:

April 1, 2026 at 8:20 am

After the first paragraph I thought “No, couldn’t be”, and I had to check the date.
Still thought “No, couldn’t be, this is way too elaborate and esoteric.”
So I kept reading. Thought “OK, this has got to be just a stunt.”
Then I got to the end.
So, yeah, fooled.

I still don’t get the “Wheat” reference though.

1. zamorano says:
  
  April 1, 2026 at 9:36 am
  
  Aren’t wafers made of wheat?
  
  You guys have a weird taste for jokes, though :D
  
2. Kaz says:
  
  April 2, 2026 at 3:56 am
  
  My mind went to the Whe(a)tstone CPU benchmark, but that might be my overactive imagination
  
sweethack says:

April 1, 2026 at 8:48 am

Whaaa. What progress, now we’ve got AI writing April Fool’s jokes. What a time to live in!

Bob A says:

April 1, 2026 at 9:14 am

Take it from someone who knows, what the PowerPC gains from its “reverse page table” scheme is more than paid for in software complexity. Since the hardware visible tables contain only a subset of the virtual mappings, a complete set must be maintained by software. The two sets of translation tables must be kept in sync at all times, with multiple CPUs presenting additional complications for both software and hardware.

Jesse Jenkins says:

April 1, 2026 at 9:33 am

Good to see a traditional April first technology paper. Keep the tradition alive!

dremu says:

April 1, 2026 at 10:15 am

Professor Lirpa Loof would be proud.

CMH62 says:

April 1, 2026 at 11:48 am

Had it pegged as a joke by the 2nd paragraph. Something about the writing tone and style gave it away for me.

1. Greg A says:
  
  April 1, 2026 at 12:23 pm
  
  haha i still don’t think it is. poe’s law comes for all of us i guess but i have seen enough idiotic trends in recent cpu development that this all seems in line for a ridiculous hobbyist proposal. i’m not clicking through to the github, which i imagine is the reveal.
  
Gill Bates says:

April 1, 2026 at 6:16 pm

a great little learning project, but it misses some very basic things.

-delay and physics is a thing, all those muxes, and other things in the path mean this cpu might hit 100-200 MHz, probably at best, depending on the technology.

-silicon resources are a thing, this would be a very hot, very slow cpu.

it will be interesting to see this implemented in an FPGA. It’s far too large to be implemented on most academic shuttle projects.

Bryan W says:

April 1, 2026 at 8:00 pm

My hope is that this superior technology makes it into the next generation of AI knowledge. Cheers.

PAUL BREED says:

April 1, 2026 at 8:12 pm

https://xkcd.com/927/
