Large Language Models On Small Computers

September 7, 2024

As technology progresses, we generally expect processing capabilities to scale up. Every year, we get more processor power, faster speeds, greater memory, and lower cost. However, we can also use improvements in software to get things running on what might otherwise be considered inadequate hardware. Taking this to the extreme, while large language models (LLMs) like GPT are running out of data to train on and having difficulty scaling up, [DaveBben] is experimenting with scaling down instead, running an LLM on the smallest computer that could reasonably run one.

Of course, some concessions have to be made to get an LLM running on underpowered hardware. In this case, the computer of choice is an ESP32, so the model was shrunk from the reported trillions of parameters of something like GPT-4, or the hundreds of billions of GPT-3, down to only 260,000. The weights come from the tinyllamas checkpoint, and llama2.c is the implementation that [DaveBben] chose for this setup, as it can be streamlined to run a bit better on something like the ESP32. The specific chip is the ESP32-S3FH4R2, which was chosen for its large amount of RAM compared to other versions, since even this small model needs a minimum of 1 MB to run. It also has two cores, which will both work as hard as possible under (relatively) heavy loads like these, and the clock speed of the CPU can be maxed out at 240 MHz.
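To put that 1 MB figure in perspective, here is a rough back-of-the-envelope sketch of how much memory the weights alone would take. It assumes each parameter is stored as a 32-bit float, which the article does not state, and the names `weight_bytes` and `BYTES_PER_FLOAT32` are made up for illustration. It also ignores activations and any other working buffers, so treat the result as a floor rather than a measurement.

```python
# Rough estimate of weight storage for a tiny model.
# Assumption (not from the article): weights are 32-bit floats, 4 bytes each.
BYTES_PER_FLOAT32 = 4


def weight_bytes(n_params: int, bytes_per_param: int = BYTES_PER_FLOAT32) -> int:
    """Return the number of bytes needed to hold n_params weights."""
    return n_params * bytes_per_param


if __name__ == "__main__":
    n_params = 260_000  # parameter count quoted in the article
    total = weight_bytes(n_params)
    # Report in decimal megabytes (1 MB = 1,000,000 bytes).
    print(f"{n_params:,} parameters -> {total:,} bytes ({total / 1e6:.2f} MB)")
```

Under that assumption the weights come to roughly a megabyte, which is in line with the minimum quoted above. Passing a smaller `bytes_per_param` (for example 1, for a hypothetical 8-bit quantized version) shows how much headroom a lower-precision format might buy.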

Admittedly, [DaveBben] is mostly doing this just to see if it can be done, since even the most powerful of ESP32 processors won’t be able to do much useful work with a large language model. It does turn out to be possible, and it is somewhat impressive given that the ESP32 has about as much processing capability as a 486 or maybe an early Pentium chip. If you’re willing to devote a few more resources to an LLM, though, you can self-host it and use it in much the same way as an online model such as ChatGPT.

6 thoughts on “Large Language Models On Small Computers”

deL says:

September 7, 2024 at 2:20 am

Forget extortionate Raspberry Pi AI add-ons, many SBC processors now have inbuilt neural processing units (NPUs)…and they absolutely book:
https://www.rock-chips.com/uploads/pdf/2022.8.26/192/RK3566%20Brief%20Datasheet.pdf

For my next trick…
👽

1. scott_tx says:
  
  September 7, 2024 at 1:14 pm
  
  NPUs don't help with LLM inference.
  
  1. deL says:
    
    September 7, 2024 at 1:34 pm
    
    Sorry, I must be mistaken:
    
    https://github.com/rockchip-linux/rknn-toolkit2
    https://github.com/rockchip-linux/rknpu2
    
clancydaenlightened says:

September 7, 2024 at 7:19 am

Well, you could add another ESP32.

How many ESP32s can fit on a board the size of an original Atari or NES?

How well can you use glue logic on the GPIO to do useful work? A CPLD, an FPGA, or a fancy custom chip?

1. alialiali says:
  
  September 8, 2024 at 5:28 am
  
  My understanding is that LLMs tend to be memory-bound, so perhaps fast peripheral memory is more important than more cores.
  
Done says:

September 7, 2024 at 10:32 am

The “Large” in “LLM” does not really apply here. If my decision space consists of only two choices, say “heads” and/or “tails”, how difficult can it be to “train” the “LLM”?

