Nanochat Lets You Build Your Own Hackable LLM

Few people know LLMs (Large Language Models) as thoroughly as [Andrej Karpathy], and luckily for us all, he shares that knowledge in useful open-source projects. His latest is nanochat, which he bills as a way to create “the best ChatGPT $100 can buy”.

What is it, exactly? nanochat is a minimal and hackable software project — encapsulated in a single speedrun.sh script — for creating a simple ChatGPT clone from scratch, including a web interface. The codebase is about 8,000 lines of clean, readable code with minimal dependencies, making every single part of the process accessible for tampering.

An accessible, end-to-end codebase for creating a simple ChatGPT clone makes every part of the process hackable.

The $100 is the cost of doing the computational grunt work of creating the model, which takes about 4 hours on a single node of eight NVIDIA H100 GPUs. The result is a 1.9 billion parameter micro-model, trained on some 38 billion tokens from an open dataset. This model is, as [Andrej] describes in his announcement on X, a “little ChatGPT clone you can sort of talk to, and which can write stories/poems, answer simple questions.” A walk-through of what that whole process looks like makes it as easy as possible to get started.
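The quoted figures are easy to sanity-check with back-of-the-envelope arithmetic. A minimal sketch — note that the per-GPU-hour rental rate below is an assumption for illustration, not a number from the nanochat docs:

```python
# Rough cost arithmetic for the "$100 speedrun".
gpus = 8          # H100 GPUs in one node
hours = 4         # approximate wall-clock training time
rate_usd = 3.00   # ASSUMED on-demand price per GPU-hour; actual rates vary

total_cost = gpus * hours * rate_usd
print(f"~${total_cost:.0f} of compute")  # in the ballpark of the quoted $100

# The data-to-model ratio is also instructive: 38B tokens over 1.9B
# parameters is about 20 tokens per parameter, roughly the
# compute-optimal ratio suggested by the Chinchilla scaling work.
tokens_per_param = 38e9 / 1.9e9
print(f"{tokens_per_param:.0f} tokens per parameter")
```

At an assumed $3/GPU-hour, the math lands at $96, which is presumably where the “$100” framing comes from.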

Unsurprisingly, a mere $100 doesn’t create a meaningful competitor to modern commercial offerings. However, significant improvements can be had by scaling up the process. A $1,000 version (detailed here) is far more coherent and capable, able to solve simple math and coding problems and take multiple-choice tests.

[Andrej Karpathy]’s work lends itself well to modification and experimentation, and we’re sure this tool will be no exception. His past work includes a method of training a GPT-2 LLM using only pure C code, and years ago we saw his work on a character-based Recurrent Neural Network (mis)used to generate baroque music by cleverly representing MIDI events as text.

11 thoughts on “Nanochat Lets You Build Your Own Hackable LLM”

  1. I’ve not looked into LLMs too deeply, besides running a few small models offline and using ChatGPT to solve tricky word puzzles from Blue Prince. These advances are quite interesting and may be what is needed to create very specific and useful models. In my case, I’d like to put in my own knowledge base and vetted sources on locks, lock picking, and physical security. Possibly in combination with embedded security and electronics, as those are my knowledge fields. It won’t be a model I can share, but it may be able to help me find tricky connections and deep insights.

    As an experiment, I would like to see the LLM trained on just the whole of Hackaday and the projects it uses as content. You may not be able to publish the dataset or model, but at least HAD could share how well it works. Other institutions could do the same.

  2. I saw this elsewhere yesterday and decided to dig into the source code posted to GitHub. The project is written almost entirely in Python. If this is standard practice for LLM training, it could help to explain, at least in part, the large amount of computing power and ultimately electrical power required for the training process. Python is an interpreted language that can be orders of magnitude less efficient than a compiled language like C. Am I missing something?

    1. Yes, you are. Python is a high level language just used to direct things happening at a lower level. All the computation is happening in compiled libraries written for efficient execution on the target hardware. Those libraries are written in C, or CUDA, or something else as the case & target hardware varies.

  3. Yes. The actual computation is usually done by well-optimized machine code, mostly running on GPUs or even ASICs. Python is just the glue that manages and directs the whole show.
