An Animated Walkthrough Of How Large Language Models Work

If you wonder how Large Language Models (LLMs) work and aren’t afraid of getting a bit technical, don’t miss [Brendan Bycroft]’s LLM Visualization. It is an interactive, animated, step-by-step walkthrough of a GPT large language model, complete with a 3D block diagram of everything going on under the hood. Check it out!

nano-gpt has only around 85,000 parameters, but the operating principles are all the same as for larger models.

The demonstration walks through a simple task and shows every step. The task is this: using the nano-gpt model, take a sequence of six letters and put them into alphabetical order.

A GPT model is a highly complex prediction engine, so the whole process begins with tokenizing the input (breaking up words and assigning numerical values to the chunks) and ends with choosing an appropriate output from a list of probabilities. There are of course many more steps in between, and different ways to adjust the model’s behavior. All of these are made quite clear by [Brendan]’s process breakdown.
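As a rough illustration of the two ends of that pipeline, here is a minimal Python sketch. It is not code from [Brendan]’s visualization: the three-letter vocabulary, the function names, and the example scores are assumptions made for illustration, and temperature is shown only as one commonly used way to adjust how a model picks its output.

```python
import math
import random

# Hypothetical vocabulary for illustration; not taken from the visualization.
VOCAB = ["A", "B", "C"]
TOKEN_ID = {ch: i for i, ch in enumerate(VOCAB)}

def tokenize(text):
    """Map each character to its numeric token id."""
    return [TOKEN_ID[ch] for ch in text]

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def pick_token(probs, rng=None):
    """Greedy pick by default; sample from the distribution if an rng is given."""
    if rng is None:
        return max(range(len(probs)), key=lambda i: probs[i])
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

tokens = tokenize("CBABBC")            # six letters in, six token ids out
probs = softmax([0.1, 2.0, -1.0])      # made-up scores standing in for model output
next_letter = VOCAB[pick_token(probs)]
```

Everything between `tokenize` and `softmax` (the embeddings, attention layers and so on) is what the visualization actually spends its time on; the sketch only brackets it.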

We’ve previously covered how LLMs work, explained without math, an approach that eschews gritty technical details in favor of focusing on functionality. It’s also nice to see an approach like this one, which embraces the technical elements of exactly what is going on.

We’ve also seen a much higher-level peek at how a modern AI model like Anthropic’s Claude works when it processes requests, extracting human-understandable concepts that illustrate what’s going on under the hood.

Playing Chess Against LLMs And The Mystery Of Instruct Models

At first glance, trying to play chess against a large language model (LLM) seems like a daft idea, as its weighted nodes have, at most, been trained on some chess-adjacent texts. It has no concept of board state, stratagems, or even whatever a ‘rook’ or ‘knight’ piece is. This daftness is indeed demonstrated by [Dynomight] in a recent blog post (Substack version), where the Stockfish chess AI is pitted against a range of LLMs, from a small Llama model to GPT-3.5. Although the outcomes (see featured image) are largely as you’d expect, there is one surprise: the gpt-3.5-turbo-instruct model, which seems quite capable of giving Stockfish a run for its money, albeit on Stockfish’s lower settings.

Each model was given the same query, telling it to be a chess grandmaster, to use standard notation, and to choose its next move. The stark difference between the instruct model and the others calls for investigation. OpenAI describes the instruct model as an ‘InstructGPT 3.5 class model’, which leads us to this page on OpenAI’s site and an associated 2022 paper that describes how InstructGPT is effectively the standard GPT model heavily fine-tuned using human feedback.


AI Face Anonymizer Masks Human Identity In Images

We’re all pretty familiar with AI’s ability to create realistic-looking images of people that don’t exist, but here’s an unusual use of that technology for a different purpose: masking people’s identity without altering the substance of the image itself. The result is that the content and “purpose” (for lack of a better term) of the image remain unchanged, while it becomes impossible to identify the actual person in it. This invites some interesting privacy-related applications.

Originals on left, anonymized versions on the right. The substance of the images has not changed.

The paper for Face Anonymization Made Simple has all the details, but the method boils down to using diffusion models to take an input image, automatically pick out identity-related features, and alter them in a way that looks more or less natural. For this purpose, identity-related features essentially means key parts of a human face. Other elements of the photo (background, expression, pose, clothing) are left unchanged. As a concept it’s been explored before, but researchers show that this versatile method is both simpler and better-performing than others.

Diffusion models are the essence of AI image generators like Stable Diffusion. The fact that they can be run locally on personal hardware has opened the doors to all kinds of interesting experimentation, like this haunted mirror and other interactive experiments. Forget tweaking dull sliders like “brightness” and “contrast” for an image. How about altering the level of “moss”, “fire”, or “cookie” instead?

Here’s Code For That AI-Generated Minecraft Clone

A little while ago Oasis was showcased on social media, billing itself as the world’s first playable “AI video game” that responds to complex user input in real-time. Code is available on GitHub for a down-scaled local version if you’d like to take a look. There’s a bit more detail and background in the accompanying project write-up, which talks about both the potential as well as the numerous limitations.

We suspect the focus on supporting complex user input (such as mouse look and an item inventory) is what the creators feel distinguishes it meaningfully from AI-generated DOOM. The latter was a concept that demonstrated AI image generators could (kinda) function as real-time game engines.

Image generators are, in a sense, prediction machines. The idea is that by providing a trained model with a short history of what just happened plus the user’s input as context, it can generate a pretty usable prediction of what should happen next, and do it quickly enough to be interactive. Run that in a loop, and you get some pretty impressive clips to put on social media.
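The loop described above can be sketched in a few lines of Python. This is not the Oasis code: `predict_next_frame` is a hypothetical stand-in for a trained model, a “frame” is reduced to a single integer position, and the context length of four is an arbitrary assumption.

```python
from collections import deque

def predict_next_frame(history, action):
    """Hypothetical stand-in for a trained model.

    A real system would generate an image; here a 'frame' is just an
    integer position nudged by the user's input.
    """
    step = {"left": -1, "right": 1, "stay": 0}[action]
    return history[-1] + step

def run_loop(first_frame, actions, context_len=4):
    """Feed recent frames plus the user's input back into the predictor."""
    history = deque([first_frame], maxlen=context_len)  # short rolling context
    frames = []
    for action in actions:
        frame = predict_next_frame(list(history), action)
        history.append(frame)  # the model's own output becomes future context
        frames.append(frame)
    return frames

frames = run_loop(0, ["right", "right", "left", "stay"])
```

Note that the only memory is the rolling `history`, and it is filled with the predictor’s own output rather than any stored game state.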

It is a neat idea, and we certainly applaud the creativity of bending an image generator to this kind of application, but we can’t help but notice the limitations. Sit and stare at something, or walk through dark or repetitive areas, and the system loses its grip and things rapidly enter a downward spiral we can only describe as “dreamily broken”.

It may be more a demonstration of a concept than a properly functioning game, but it’s still a very clever way to leverage image generation technology. If you’d prefer AI to keep the game itself untouched, though, take a look at neural networks trained to use the DOOM level creator tools.

Using AI To Help With Assembly

Although generative AI and large language models have been pushed as direct replacements for certain kinds of workers, plenty of businesses actually doing this have found that the new technology can cause more problems than it solves when it is given free rein over tasks. While this might not be true indefinitely, the real use case for these tools right now is as a kind of assistant for certain kinds of work. For this they can be incredibly powerful, as [Ricardo] demonstrates here, using Amazon Q to help with game development on the Commodore 64.

The first step here was to generate code that would show a sprite moving across the screen. The AI first generated code in all caps, as was the style at the time of the C64, but in [Ricardo]’s development environment this caused some major problems, so the code was converted to lowercase. A more impressive conversion was done in the next steps, as the program needed to take advantage of the optimizations found in the Assembly language. With the code converted to 6502 Assembly that can run on the virtual Commodore, [Ricardo] was eventually able to show four sprites moving across the screen after several iterations with the AI, as well as change the style of the sprites to arbitrary designs.

Although the post is a bit over-optimistic on Amazon Q as a tool specifically for developers, it might have some benefits over other generative AIs, especially if it’s capable of handling the chore of programming in Assembly language. We’d love to hear from anyone with real-world experience with this and whether it is truly worth the extra cost over something like Copilot or GPT-4. For any of these generative AI models, though, it’s probably worth trying them out while they’re in their early stages. Keep in mind that there’s a lot more than programming that can be done with some of them as well.

AI Not Needed For Hackaday Projects

It was Supercon this weekend, and Hackaday staffers made their way to Pasadena for what was by all accounts an excellent event. Now they’re all on their way home on red-eye flights and far from their benches, so spare a thought for the lonely editor holding the fort while they’ve been having fun. The supply of cool hacks for your entertainment must continue, so what’s to be done? Fortunately Hackaday writer [Anne Ogborn] has the answer, in the form of an automated Hackaday article generator.

We once had a commenter make a withering insult that one of our contributors’ writing styles looked like the work of an AI-driven bot, a sentence that the writer in question treasures enough to have incorporated in their Hackaday email signature. [Anne] is a data scientist and Prolog programmer by trade so knows a bit about AI, and she has no need for such frippery. Instead she’s made a deck of cards, each marked with a common theme among the work featured here, and generating new article titles is a simple case of drawing cards from the pack and assembling the resulting sentence.

The result is both amusing and, we think, uncannily on the mark. Who wouldn’t want an ESP8266 powered cardboard drone? We think it will make a valuable addition to the Hackaday armoury, to be brought out on days such as the first of April, when there’s always an unexpected shortage of hacks. Video below the break.
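[Anne]’s generator is a physical deck of cards, but for those without scissors to hand, the draw-and-assemble idea might be sketched in Python along these lines. The split into modifier and noun cards is our assumption to keep the word order sensible, and apart from the ESP8266 powered cardboard drone the card texts are made up.

```python
import random

# Hypothetical card texts; only "ESP8266 powered", "cardboard" and "drone"
# come from the example title mentioned in the article.
MODIFIER_CARDS = ["ESP8266 powered", "cardboard", "3D printed", "solar"]
NOUN_CARDS = ["drone", "clock", "robot"]

def draw_title(rng, n_modifiers=2):
    """Draw modifier cards without replacement, then one noun card."""
    modifiers = rng.sample(MODIFIER_CARDS, n_modifiers)
    noun = rng.choice(NOUN_CARDS)
    return " ".join(modifiers + [noun])

title = draw_title(random.Random())
```

Passing in a seeded `random.Random` makes a given draw repeatable, which a real shuffled deck does not.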


All You Need For Artificial Intelligence Is A Commodore 64

Artificial intelligence has always been around us, with [Timothy J. O’Malley]’s 1985 book on AI projects for the Commodore 64 being one example of this. With AI defined as being the theory and development of systems that can perform tasks that normally require human intelligence (e.g. visual perception, speech recognition, decision-making), this book is a good introduction to the many ways that computer systems have for decades been able to learn, make decisions and in general become more human-like, even if there’s no electronic personality behind the actions.

In the book’s first chapter, [Timothy] isn’t afraid to toss in some opinions about the true nature of intelligence and thinking. Starting with the concept that intelligence is based around storing information and being able to derive meaning from connections between stored pieces of information, the idea of a basic AI as one would use in a game for the computer opponent arises. A number of ways of implementing such an AI are explored in the first and subsequent chapters, using Towers of Hanoi, chess, Nim and other games.
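To give a flavor of how small such a computer opponent can be, here is a Python sketch of the well-known nim-sum strategy for Nim. This is the standard textbook approach, not necessarily what the book implements in C64 BASIC, and it assumes normal play, where whoever takes the last object wins.

```python
from functools import reduce
from operator import xor

def nim_move(heaps):
    """Return (heap_index, new_size) for a winning move, or None.

    Assumes normal play: the player who takes the last object wins.
    None means every available move leaves the opponent a winning position.
    """
    nim_sum = reduce(xor, heaps, 0)
    if nim_sum == 0:
        return None
    for i, size in enumerate(heaps):
        target = size ^ nim_sum
        if target < size:  # shrinking this heap makes the overall XOR zero
            return (i, target)

move = nim_move([3, 4, 5])
```

For heaps of 3, 4 and 5 the XOR is 2, so the sketch reduces the first heap from 3 to 1, leaving 1, 4 and 5, whose XOR is zero.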

After this we look at natural language processing (referencing ELIZA as an example), followed by heuristics, pattern recognition and AI for robotics. Although much of this may seem outdated in this modern age of LLMs and neural networks, it’s important to realize that much of what we consider ‘bleeding edge’ today has its roots in AI research performed in the 1950s and 1960s. As [Timothy] rightfully states in the final chapter, there is no real limit to how far you can push this type of AI as long as you have more hardware and storage to throw at the problem. This is how we now have datacenters full of GPU-equipped systems churning through vector space calculations for the sake of today’s LLM and diffusion model take on ‘AI’.

Using a Commodore 64 to demonstrate the (lack of) validity of claims is not a new idea: a group of researchers recently used one of these breadbin marvels to run an Ising model with a tensor network, outperforming IBM’s quantum processor. As they say, just because it’s new and shiny doesn’t necessarily mean that it is actually better.