An Animated Walkthrough Of How Large Language Models Work

November 20, 2024

If you wonder how Large Language Models (LLMs) work and aren’t afraid of getting a bit technical, don’t miss [Brendan Bycroft]’s LLM Visualization. It is an interactive, step-by-step walkthrough of a GPT large language model, complete with an animated 3D block diagram of everything going on under the hood. Check it out!

The nano-gpt model used in the demonstration has only around 85,000 parameters, but its operating principles are the same as those of larger models.
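For a sense of where a figure like that comes from, here is a rough tally in Python. The configuration it uses (3 transformer blocks, an embedding width of 48, a 3-token vocabulary, an 11-token context, and an output layer that adds no parameters of its own) is an assumption made for illustration and is not stated in this article. The function name `count_gpt_params` is likewise made up.

```python
def count_gpt_params(n_layer=3, n_embd=48, n_vocab=3, n_ctx=11):
    """Tally weights and biases for a small GPT-style model.

    All default values are assumptions for illustration only.
    """
    token_embedding = n_vocab * n_embd       # one vector per token
    position_embedding = n_ctx * n_embd      # one vector per position
    layer_norm = 2 * n_embd                  # scale and shift

    # Attention: one fused query/key/value projection plus an output projection.
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)

    # Feed-forward network: expand to 4x the width, then project back.
    hidden = 4 * n_embd
    mlp = (n_embd * hidden + hidden) + (hidden * n_embd + n_embd)

    # Each block has two layer norms, one attention layer, and one MLP.
    block = 2 * layer_norm + attention + mlp

    return {
        "embeddings": token_embedding + position_embedding,
        "blocks": n_layer * block,
        "final_layer_norm": layer_norm,
    }


counts = count_gpt_params()
total = sum(counts.values())
print(counts, total)
```

Under these assumed settings the total lands in the neighborhood of 85,000, and nearly all of it sits in the transformer blocks rather than in the embeddings.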

The demonstration walks through a simple task and shows every step. The task is this: using the nano-gpt model, take a sequence of six letters and put them into alphabetical order.

A GPT model is a highly complex prediction engine. The whole process begins with tokenizing the input (breaking up words and assigning numerical values to the chunks) and ends with choosing an appropriate output from a list of probabilities. There are of course many more steps in between, and different ways to adjust the model’s behavior. All of these are made quite clear by [Brendan]’s process breakdown.
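The two ends of that pipeline can be sketched in a few lines of Python. This is only a toy under stated assumptions: the three-letter vocabulary, the hand-written scores standing in for the model’s output, and the helper names `tokenize`, `softmax`, and `choose` are all invented here, and everything the real model does between these two steps is skipped.

```python
import math
import random

VOCAB = ["A", "B", "C"]  # assumed toy vocabulary, one token per letter


def tokenize(text):
    """Map each letter to its numerical index in the vocabulary."""
    return [VOCAB.index(ch) for ch in text]


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def choose(probs, greedy=True, rng=random):
    """Pick an output index: the most likely one, or a weighted random draw."""
    if greedy:
        return max(range(len(probs)), key=probs.__getitem__)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


tokens = tokenize("CBABBC")        # six letters in, six indices out
logits = [2.0, 0.5, -1.0]          # made-up scores standing in for the model
probs = softmax(logits)
next_token = VOCAB[choose(probs)]  # greedy choice of the next letter
print(tokens, probs, next_token)
```

Temperature is one example of a knob for adjusting behavior: calling `softmax(logits, temperature=2.0)` spreads the probability more evenly, so a sampled (non-greedy) choice becomes less predictable.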

We’ve previously covered “how LLMs work, explained without math,” which eschews gritty technical details in favor of focusing on functionality. It’s also nice to see an approach like this one, which embraces the technical elements of exactly what is going on.

We’ve also seen a much higher-level peek at how a modern AI model like Anthropic’s Claude works when it processes requests, extracting human-understandable concepts that illustrate what’s going on under the hood.

9 thoughts on “An Animated Walkthrough Of How Large Language Models Work”

shinsukke says:

November 20, 2024 at 5:07 am

It’s all still magic to me, but now the scale makes it look even more impossibly incomprehensible

Thanks I guess /s

clancydaenlightened says:

November 20, 2024 at 7:21 am

Every time I see GPT

I read GUID Partition Table

That’s how old I am…..

1. Gravis says:
  
  November 20, 2024 at 9:53 am
  
  I have the exact same problem. ;_;
  
2. rclark says:
  
  November 20, 2024 at 11:19 am
  
  Same here….
  
  But I understand, I think, with all these acronyms floating around. Just this morning I asked networking when they were going to cut over our RTUs (Remote Terminal Units) to the new firewall. Got a blank stare… I mean ‘to our field devices’…. Oh, now I know what you are talking about …
  
3. Hugo Oran says:
  
  November 20, 2024 at 11:30 am
  
  KIA as Killed in action. NIO as Nicht in Ordnung.
  
4. Hugo Oran Anamnesis says:
  
  November 20, 2024 at 11:31 am
  
  LLM as Lunar Landing Module.
  
Paul says:

November 20, 2024 at 7:29 am

Fascinating though the LLM description is, I’m even more interested in the presentation notebook thing. I thought it was just a Jupyter notebook or something, but I’ve never seen it like that. Can’t find info on it. Something home-grown? Sadly the question mark icon (top left) does nothing for me.

1. Sean says:
  
  November 20, 2024 at 4:38 pm
  
  It appears to be a custom Next.js project https://github.com/bbycroft/llm-viz
  