Although generative language models have found little widespread, profitable adoption outside of putting artists out of work and giving tech companies an easy scapegoat for cutting staff, their underlying technology remains a fascinating area of study. Stepping back to the more innocent time of the late 2010s, before the cultural backlash, we could examine these models in their early stages. Or we could go even further back and watch much older hardware grind through these machine learning algorithms, to understand more about their fundamentals. [Damien] has put a 60s-era IBM as well as a PDP-11 to work training a transformer algorithm in order to take a closer look at it.
The task [Damien] trains his transformer on is reversing a list of digits. That's trivial for a Python program, but much more difficult for a transformer to learn. The model relies solely on self-attention and a residual connection. To fit within the 32KB memory limit of the PDP-11, it employs fixed-point arithmetic and lookup tables in place of computationally expensive functions. Training uses stochastic gradient descent with hand-tuned learning rates, reaching 100% accuracy in 350 steps. In practical terms, that brought the training time down from hours or days to around five minutes.
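To give a flavor of the fixed-point-plus-lookup-table trick, here is a minimal Python sketch of an integer-only softmax, the kind of computationally expensive function such tables typically replace in an attention layer. The Q4.12 format, table size, and input range below are illustrative assumptions, not details from [Damien]'s implementation.

```python
import math

FRAC_BITS = 12            # Q4.12: 4 integer bits, 12 fractional bits (assumed format)
ONE = 1 << FRAC_BITS      # fixed-point representation of 1.0

def to_fixed(x: float) -> int:
    return int(round(x * ONE))

def from_fixed(x: int) -> float:
    return x / ONE

# Precompute exp() over [-8, 0) once. After subtracting the row maximum,
# softmax only ever needs exp of non-positive arguments, so a small
# one-sided table suffices: 256 two-byte entries is just 512 bytes.
TABLE_SIZE = 256
STEP = (8 * ONE) // TABLE_SIZE
EXP_TABLE = [to_fixed(math.exp(-(i * STEP) / ONE)) for i in range(TABLE_SIZE)]

def fexp(x: int) -> int:
    """exp(x) for fixed-point x <= 0; anything below -8 underflows to zero."""
    idx = (-x) // STEP
    return EXP_TABLE[idx] if idx < TABLE_SIZE else 0

def fixed_softmax(scores: list[int]) -> list[int]:
    """Attention weights computed entirely in integer arithmetic."""
    m = max(scores)
    exps = [fexp(s - m) for s in scores]
    total = sum(exps)
    # Rescale so the weights sum to (approximately) ONE, i.e. 1.0.
    return [(e << FRAC_BITS) // total for e in exps]
```

On a machine with no floating-point hardware, each `exp()` becomes a single table load and each normalization a couple of integer shifts and divides, which goes a long way toward turning hours of training into minutes.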
Not only does a project like this help us understand these tools, it also goes a long way toward demonstrating that not every task needs a gigawatt datacenter to be useful. In fact, we've seen plenty of large language models and other generative AI running on computers no more powerful than an ESP32 or, if you need slightly more computing power, on consumer-grade PCs with or without GPUs.