The Math You Need To Start Understanding LLMs

Once you peel back the hype and mysticism, large language models (LLMs) are a fascinating application of statistical models, effectively what you get when you dial a basic auto-complete model up to eleven. Analyzing a mind-boggling amount of text and producing meaningful auto-completion results involves quite a bit of math. A recent three-part article series by [Giles] goes through the basics of inference, which is the prediction step using a trained model.

Text is first split into tokens, text fragments that are each assigned a numeric token ID. Given the tokens seen so far, the model assigns every token in its vocabulary a score for how likely it is to come next, such as when cats may be found on desks, as in the above photo by [Giles]. During inference these scores come back as a vector with one entry per vocabulary token, from which in successive steps a sentence can be pieced together. These so-called logits are detailed in the first article in the series, with the second article focusing on vocabulary space and embedding, as well as the matrix operations used for inference.
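As a rough illustration of how logits become a next-token choice, here is a minimal Python sketch. The five-token vocabulary and the logit values are made up for this example and are not taken from [Giles]'s articles; a real model scores tens of thousands of tokens, and greedy selection is only one of several ways to pick from the result.

```python
import math

def softmax(logits):
    """Turn raw scores (logits) into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical vocabulary and made-up logits for the prompt
# "The cat sat on the"; one logit per vocabulary entry.
vocab = [" desk", " mat", " moon", " the", " sat"]
logits = [3.1, 2.4, 0.2, -1.0, -2.5]

probs = softmax(logits)

# Greedy decoding: take the token ID with the highest probability.
next_id = max(range(len(probs)), key=probs.__getitem__)
next_token = vocab[next_id]

for token, p in zip(vocab, probs):
    print(f"{token!r}: {p:.3f}")
print("next token:", repr(next_token))
```

Appending the chosen token to the prompt and running the model again yields the next set of logits, which is how a sentence gets pieced together one token at a time.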

Finally, the third article puts all of this together and looks at transformers, which are a crucial part of the GPT (generative pretrained transformer) LLM architecture. Of note is the attention mechanism, which takes GPTs beyond merely being glorified auto-complete systems by adding pattern matching. Here we can see how the statistical model of the LLM is used to generate a rather plausible output, which is where the human has to ask themselves to what extent they feel that it is correct.
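For a feel of what the attention mechanism computes, the following is a minimal sketch of single-head scaled dot-product attention with a causal mask, which is the commonly described formulation and not necessarily the exact presentation in [Giles]'s articles. The query, key and value vectors are made-up toy numbers; in a real model they are learned projections of each token's embedding.

```python
import math

def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def causal_attention(queries, keys, values):
    """Single-head scaled dot-product attention with a causal mask."""
    d = len(keys[0])
    outputs = []
    for i, q in enumerate(queries):
        # Position i may only look at positions 0..i (no peeking ahead).
        scores = [dot(q, keys[j]) / math.sqrt(d) for j in range(i + 1)]
        weights = softmax(scores)
        # The output is a weighted average of the visible value vectors.
        out = [sum(w * values[j][k] for j, w in enumerate(weights))
               for k in range(len(values[0]))]
        outputs.append(out)
    return outputs

# Toy 2-dimensional vectors for three token positions (made-up numbers).
q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

result = causal_attention(q, k, v)
for position, row in enumerate(result):
    print(position, [round(x, 3) for x in row])
```

Each query is matched against the keys of the current and earlier tokens, and the better the match, the more of that token's value vector ends up in the output. The first position can only see itself, so its output is simply its own value vector.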

Of course, a lot more goes into making LLMs and GPTs performant, such as key-value caches that massively speed up inference.
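The idea behind a key-value cache can be sketched in a few lines of Python. This is a toy under stated assumptions: the `KVCache` class, the `attend` helper and all vectors are hypothetical names and values invented for this example. In a real model the saving comes from not recomputing the key and value projections of earlier tokens at every generation step; here the vectors are simply given.

```python
import math

def attend(query, keys, values):
    """Attention output for one query over a list of keys and values."""
    d = len(query)
    scores = [sum(a * b for a, b in zip(query, k)) / math.sqrt(d) for k in keys]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [sum(e / total * v[i] for e, v in zip(exps, values))
            for i in range(len(values[0]))]

class KVCache:
    """Stores the key and value vector of every token seen so far."""
    def __init__(self):
        self.keys, self.values = [], []

    def step(self, query, key, value):
        # Only the newest token's key/value is added; older ones are reused.
        self.keys.append(key)
        self.values.append(value)
        return attend(query, self.keys, self.values)

# Made-up (query, key, value) vectors, one triple per generated token.
tokens = [
    ([1.0, 0.0], [1.0, 0.0], [1.0, 2.0]),
    ([0.0, 1.0], [0.0, 1.0], [3.0, 4.0]),
    ([1.0, 1.0], [1.0, 1.0], [5.0, 6.0]),
]

cache = KVCache()
cached_outputs = [cache.step(q, k, v) for q, k, v in tokens]

# Without a cache, every step rebuilds the full key/value lists from scratch.
uncached_outputs = [
    attend(tokens[i][0],
           [t[1] for t in tokens[: i + 1]],
           [t[2] for t in tokens[: i + 1]])
    for i in range(len(tokens))
]

print(cached_outputs == uncached_outputs)
```

Both approaches produce the same outputs, but the cached version does a constant amount of new key/value work per token instead of redoing the whole sequence each time.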

18 thoughts on “The Math You Need To Start Understanding LLMs”

  1. Cat rabbit fox. What might happen with one of these “things” if suggested. I’d want to pet it but it looks rather don’t touch me. Hope the laptop don’t over heat. Anything with a fan in it will get clogged too.

    1. i’m sad when i think of all the people using laptops with fans. the performance world keeps marching forwards but i got off that track the moment i found out there’s decently fast fanless (like celeron n4000)

  2. I ended up in developer mode on chatgpt and we discussed tensor math precision. From 32-8 bit floats to using a log based 4 bit weight. It’s a bit scary.

  3. TL; will R. Thanks!

    It’s essentially Word Association Football…

    “This is a technique out a living much used in the practice makes perfect of psychoanalysister and brother and one that has occupied piper the majority rule of my attention squad by the right number one two three four the last five years to the memory.”

  4. To those of you poor sods who insist that LLM’s aren’t bona fide AI, I have this to say: stop the hate. The only type who would make such a claim is so pretentious, and so overrating of his own (quite average) intellect, that one would wonder why he won’t keep his mouth shut lest he embarrass himself intensely from what pretentions would issue from it.

    In other words, you know LLM’s about as well as one who just completed reading LLM’s for Dummies yet you run around like Kaiser of All You Need is Attention fame, embarrassing yourself every time you speak from the elementary things you’ve picked up over the years.

    Haters. Every last lot of ya.

    1. LLM’s aren’t alive, and we need to stop hyping them up as if they are. Once we accept that they’re nothing but a very capable hammer, we can set them down when we work on things that don’t look like nails.

      “Now did the Lord say that machines ought to take place of livin’?
      And what’s a substitute for bread and beans? I ain’t seen it!
      Do engines get rewarded for their steam?”

  5. A strong general knowledge of the history of science and technology allows you to summon the mind residuals of great thinkers via the full SOTA AI systems, and then recruit their documented insights toward solving your current problems.

    Ponder the full implications of that.
