Double-Dose Of AI Turns Daily Tasks Into Works Of Art

Not so long ago, “Magic Mirror” builds were all the rage, and we have to admit getting our daily reminders and newsfeeds on an LCD display sitting behind a partially reflective mirror is not without its charms. But styles ebb and flow, so we don’t see too many of those builds anymore. This e-ink daily calendar reminder hearkens back to those Magic Mirrors, only with a double twist of AI.

This project is the work of [Ilkka Turunen], and right up front we’ll say the results are just gorgeous. A lot of that has to do with the 10.3″ e-ink display used, but even more with the creative use of not one but two machine learning systems. The first is ChatGPT, which [Ilkka] uses to parse the day’s online calendar entries and grab the most significant events to generate a prompt for DALL-E. The generated DALL-E prompt includes specific instructions that guide the style of the image, which honestly is where most of the artistry lies: in [Ilkka]’s aesthetic choices, like suggesting that the images look like a 19th-century lithograph or a satirical comic from a turn-of-the-(last)-century newspaper. The prompt is then sent off to DALL-E for rendering, and the resulting image is displayed.

It has to be said that the prompts that ChatGPT generates based on the combination of [Ilkka]’s aesthetic preferences and the random events of the day are strikingly complex. The chatbot really seems to be showing some imagination these days; DALL-E is no slouch either in turning those words into images.
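If you’re curious how such a ChatGPT-to-DALL-E handoff can be wired up, here’s a minimal Python sketch using the OpenAI client library. The calendar data, model names, and style instructions below are placeholders for illustration, not [Ilkka]’s actual code.

```python
# Hypothetical sketch: summarize the day's calendar into a DALL-E prompt, then render it.
# Assumes OPENAI_API_KEY is set in the environment; event data here is stand-in text.
from openai import OpenAI

client = OpenAI()

def build_image_prompt(events: list[str]) -> str:
    """Ask a chat model to distill calendar entries into a single styled image prompt."""
    style = ("Describe the scene as a 19th-century lithograph, in the spirit of a "
             "satirical illustration from a turn-of-the-century newspaper.")
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[
            {"role": "system",
             "content": "Pick the most significant event and write one image prompt for it. " + style},
            {"role": "user", "content": "\n".join(events)},
        ],
    )
    return response.choices[0].message.content

def render(prompt: str) -> str:
    """Send the generated prompt to DALL-E and return a URL for the resulting image."""
    image = client.images.generate(model="dall-e-3", prompt=prompt, size="1024x1024")
    return image.data[0].url

if __name__ == "__main__":
    events = ["09:00 Dentist appointment", "13:00 Project review", "18:30 Badminton"]
    print(render(build_image_prompt(events)))
    # A real build would fetch events from an online calendar, then dither and
    # push the rendered image to the e-ink panel instead of printing a URL.
```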

Like the idea of an e-ink daily reminder but prefer a less artistic presentation? This should help.

Continue reading “Double-Dose Of AI Turns Daily Tasks Into Works Of Art”

3D Human Models From A Single Image

You’ve seen it in movies and shows — the hero takes a blurry still picture, and with a few keystrokes, generates a view from a different angle or sometimes even a full 3D model. Turns out, thanks to machine learning and work by several researchers, this might be possible. As you can see in the video below, using “shape-guided diffusion,” the researchers were able to take a single image of a person and recreate a plausible 3D model.

Of course, the work relies on machine learning. As you’ll see in the video, this isn’t a new idea, but previous attempts have been less than stellar. This new method uses shape prediction first, followed by an estimate of the back view appearance. The algorithm then guesses what the views between the initial photograph and the back view should look like, using the 3D shape estimate as a guideline. Even then, there is some post-processing to join the intermediate images together into a model.

The result looks good, although the video does point out some areas where the method still falls short. For example, unusual lighting can affect the results.

This beats spinning around a person or a camera to get many images. Scanning people in 3D is a much older dream than you might expect.

Continue reading “3D Human Models From A Single Image”

Synthesizing 360-Degree Views From Single Source Images

ZeroNVS is one of those research projects that is rather more impressive than it may look at first glance. On one hand, the 3D reconstructions — we urge you to click that first link to see them — look a bit grainy and imperfect. But on the other hand, it was reconstructed using a single still image as an input.

Most results look great, but some — like this bike visible through a park bench — come out a bit strange. A valiant effort for a single-image input, all things considered.

How is this done? It’s built on NeRFs (neural radiance fields), which leverage machine learning, but with yet another new twist. Existing methods mainly focus on single objects against masked backgrounds, but this approach makes the technique applicable to a variety of complex, in-the-wild images without the need to train new models.

There are a ton of sample outputs on the project summary page that are worth a browse if you find this sort of thing at all interesting. Some of the 360-degree reconstructions look rough, some are impressive, and some are a bit amusing. For example, indoor shots tend to reconstruct rooms that look good, but lack doorways.

There is a research paper for those seeking additional details and a GitHub repository for the code, but the implementation requires some significant hardware.

AI In A Box Envisions AI As A Private, Offline, Hackable Module

With AI in a Box, [Useful Sensors] aims to embed a variety of complementary AI tools into a small, private, self-contained module with no internet connection. It can do live voice recognition and captioning, live translation, and natural language conversational interaction with a local large language model (LLM). Intriguingly, it’s specifically designed with features to make it hack-friendly, such as the ability to act as a voice keyboard by sending live transcribed audio as keystrokes over USB.

Based on the RockChip 3588S SoC, the unit aims to have an integrated speaker, display, and microphone.
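The box itself types as a USB HID peripheral, but as a rough desktop-side analogue of the voice-keyboard idea, here’s a Python sketch that records a short clip, transcribes it locally with Whisper, and “types” the result. The model size and audio parameters are assumptions, not anything from [Useful Sensors]’ firmware.

```python
# Rough desktop analogue of the "voice keyboard" idea: transcribe locally, then emit keystrokes.
# Assumes the openai-whisper, sounddevice, soundfile, and pyautogui packages are installed.
import sounddevice as sd
import soundfile as sf
import whisper
import pyautogui

SAMPLE_RATE = 16_000   # Whisper models expect 16 kHz audio
SECONDS = 5            # length of each recording window

def record_clip(path: str = "clip.wav") -> str:
    """Grab a short clip from the default microphone and save it as a WAV file."""
    audio = sd.rec(int(SECONDS * SAMPLE_RATE), samplerate=SAMPLE_RATE, channels=1)
    sd.wait()
    sf.write(path, audio, SAMPLE_RATE)
    return path

def transcribe(path: str) -> str:
    """Run a small Whisper model entirely on the local machine."""
    model = whisper.load_model("base")  # placeholder model size
    return model.transcribe(path)["text"].strip()

if __name__ == "__main__":
    text = transcribe(record_clip())
    # The AI in a Box does this step as a USB HID keyboard; here we just synthesize
    # keystrokes on the local desktop instead.
    pyautogui.write(text + " ", interval=0.02)
```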

Right now it’s wrapping up a pre-order phase, and aims to ship units around the end of January 2024. The project is based around the RockChip 3588S SoC and is open source (GitHub repository), but since it’s still in development, there’s not a whole lot visible in the repository yet. However, a key part of getting good performance is [Useful Sensors]’s own transformers library for the RockChip NPU (neural processing unit).

Things like high-quality local voice recognition and locally-hosted LLMs like LLaMa have gotten a massive boost thanks to recent advances in machine learning, and it looks like this project aims to tie them together in a self-contained package.
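On the LLM side, running a model locally on a PC can be as simple as the following sketch using the llama-cpp-python bindings; the model file and parameters are assumptions, and the box itself leans on [Useful Sensors]’ NPU-accelerated transformers library rather than this approach.

```python
# Minimal local-LLM sketch using the llama-cpp-python bindings; not the AI in a Box firmware.
# Assumes a quantized GGUF model has already been downloaded (the file name is a placeholder).
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-2-7b-chat.Q4_K_M.gguf",
    n_ctx=2048,    # context window
    n_threads=4,   # tune for the host CPU
)

result = llm(
    "Q: What could a fully offline voice assistant be useful for? A:",
    max_tokens=128,
    stop=["Q:"],
)
print(result["choices"][0]["text"].strip())
```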

Perhaps private digital assistants can become more useful when users can have the freedom to modify and integrate them as they see fit. Digital assistants hosted by the big tech companies are often frustrating, and others have observed that this is ultimately because they primarily exist to serve their makers more than they help users.

Continue reading “AI In A Box Envisions AI As A Private, Offline, Hackable Module”

Full Self-Driving, On A Budget

Self-driving is currently the Holy Grail in the automotive world, with a number of companies racing to build general-purpose autonomous vehicles that can get from point A to point B with no user input. While no one has brought one to market yet, at least one company has promised this feature and taken customers’ money for it, only to continually move the goalposts for delivery because of how challenging the problem has turned out to be. But it doesn’t need to be that hard or expensive to solve, at least in some situations.

The situation in question is driving on a single stretch of highway, and only steering is handled, so there’s no accelerator or brake pedal input. The highway is driven normally, using a webcam to take images of the route and an Arduino to capture data about the steering angle. The idea here is that with enough training the Arduino could eventually steer the car. But first, some math needs to happen on the training data: since the steering wheel is almost always not turning the car, the data has to be balanced so that actual steering events aren’t treated as mere statistical anomalies. After the training, the system does a surprisingly good job at “driving” based on this data, and does it on a budget not much bigger than a laptop, a microcontroller, and a webcam.
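To give a flavor of the kind of model involved, here’s a hypothetical PyTorch sketch that regresses a steering angle from downscaled webcam frames, including a crude way of thinning out the straight-ahead samples that dominate the log. The CSV format and thresholds are stand-ins, not the project’s actual data pipeline.

```python
# Sketch of a steering-angle regressor in PyTorch; not the project's actual code.
# Assumes a log "drive_log.csv" with lines like: frames/0001.jpg,-0.03 (image path, steering angle).
import csv, random
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
from torchvision.io import read_image
from torchvision.transforms.functional import resize

class DriveLog(Dataset):
    def __init__(self, csv_path: str, keep_straight: float = 0.1):
        self.samples = []
        with open(csv_path) as f:
            for path, angle in csv.reader(f):
                angle = float(angle)
                # Straight-ahead frames dominate the recording, so keep only a fraction
                # of them; otherwise real steering events look like statistical noise.
                if abs(angle) < 0.02 and random.random() > keep_straight:
                    continue
                self.samples.append((path, angle))

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        path, angle = self.samples[idx]
        img = resize(read_image(path).float() / 255.0, [66, 200])
        return img, torch.tensor([angle], dtype=torch.float32)

# Tiny convolutional network: frame in, single steering angle out.
model = nn.Sequential(
    nn.Conv2d(3, 24, 5, stride=2), nn.ReLU(),
    nn.Conv2d(24, 36, 5, stride=2), nn.ReLU(),
    nn.Conv2d(36, 48, 5, stride=2), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(48, 64), nn.ReLU(),
    nn.Linear(64, 1),
)

loader = DataLoader(DriveLog("drive_log.csv"), batch_size=32, shuffle=True)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

for epoch in range(10):
    for frames, angles in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(frames), angles)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: last batch loss {loss.item():.4f}")
```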

Admittedly, this project was a proof-of-concept to investigate machine learning, neural networks, and other statistical algorithms used in these sorts of systems, and doesn’t actually drive any cars on any roadways. Even the creator says he wouldn’t trust it himself, but that he was pleasantly surprised by the results of such a simple system. It could also be expanded to handle the brake and accelerator pedals with separate neural networks. It’s not our first budget-friendly self-driving system, either. This one makes it happen with the enormous computing resources of a single Android smartphone.

Continue reading “Full Self-Driving, On A Budget”

Keeping Badgers At Bay With TensorFlow

Human-animal conflict is always a contentious issue, and finding ways to prevent damage without causing harm to the animals often requires creative solutions. [James Milward] needed a humane method to stop badgers and foxes from uprooting his garden, leading him to create the Furbinator 3000, a system that combines computer vision with audio deterrents.

[James] initially tried using scent repellents (which were ignored) and blocking access to his garden (which resulted in more digging), but found some success with commercial ultrasonic audio repellent devices. However, these had to be manually turned off during the day so the PIR motion sensors weren’t constantly triggered by [James] and his family, and the integrated solar panels couldn’t keep up with the load.

This presented a good opportunity to try his hand at practical machine vision. He already had a substantial number of sample images from the Ring cameras in his garden, which he turned into a functional TensorFlow Lite model with about 2.5 hours of training. He linked it with event-activated RTSP streams from his Ring cameras using the ring-mqtt library. To minimize false positives on stationary objects, he incorporated a motion filter into the processing pipeline. When it identifies a fox or badger with reasonable confidence, it generates an MQTT event.
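A stripped-down version of that detection loop might look something like the Python sketch below: pull frames from an RTSP stream, apply a crude frame-difference motion gate, run a TensorFlow Lite detector, and publish an MQTT event. The stream URL, model file, label indices, and topic are placeholders rather than [James]’s actual setup.

```python
# Simplified sketch of a camera -> TensorFlow Lite -> MQTT pipeline; placeholders throughout.
import cv2
import numpy as np
import paho.mqtt.client as mqtt
from tflite_runtime.interpreter import Interpreter

RTSP_URL = "rtsp://camera.local/stream"   # placeholder; ring-mqtt exposes per-event streams
ANIMAL_LABELS = {0: "badger", 1: "fox"}   # depends entirely on how the model was trained
CONFIDENCE = 0.6
MOTION_THRESHOLD = 5.0                    # mean absolute frame difference that counts as motion

interpreter = Interpreter(model_path="garden_animals.tflite")  # placeholder model file
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
outputs = interpreter.get_output_details()

client = mqtt.Client()  # paho-mqtt 1.x style; 2.x also wants a CallbackAPIVersion argument
client.connect("mqtt.local", 1883)

cap = cv2.VideoCapture(RTSP_URL)
prev_gray = None

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if prev_gray is not None and np.mean(cv2.absdiff(gray, prev_gray)) > MOTION_THRESHOLD:
        # Something moved: resize to the model's input shape and run the detector.
        h, w = inp["shape"][1], inp["shape"][2]
        tensor = np.expand_dims(cv2.resize(frame, (w, h)), 0).astype(inp["dtype"])
        interpreter.set_tensor(inp["index"], tensor)
        interpreter.invoke()
        # Typical SSD-style detectors emit classes and scores per detection;
        # the exact output ordering varies from model to model.
        classes = interpreter.get_tensor(outputs[1]["index"])[0]
        scores = interpreter.get_tensor(outputs[2]["index"])[0]
        for cls, score in zip(classes, scores):
            label = ANIMAL_LABELS.get(int(cls))
            if label and score > CONFIDENCE:
                client.publish("garden/intruder", label)  # downstream, this fires the deterrent
    prev_gray = gray
```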

[James] modified the ultrasonic devices to react to these events, using an ESP8266-based WeMos D1 Mini Pro development board, and added an external 5 V power supply for sustained operation. All development was performed in a Docker container, which simplified deployment on a Raspberry Pi 4.

After implementing the system, [James] woke up to the satisfying sight of his garden remaining untouched overnight, a victory that even earned him some coverage by the BBC.

Thanks for the tip, [Laurent]!

Compact, Gesture-Based Remote Control Over Bluetooth

[AlexMiller11] shared a project for a DIY gesture-sensing remote control that acts like a Bluetooth keyboard, capable of controlling media and presentations on a computer with a high degree of accuracy.

The device recognizes eight different gestures and controls a host PC over Bluetooth.

The hardware is a Silicon Labs xG24 dev kit, a small IoT-focused board that can be powered by a CR2032 cell. Part of what makes it all work is the six-axis IMU sensor, but the rest is the software that interprets the sensor data and figures out what motions the user is trying to make. That happens with a Neuton.AI model and SDK, a tiny but effective machine learning framework for small devices.

How does it actually work? The device acts as a Bluetooth HID, and gets connected to a PC in the same way as a regular Bluetooth keyboard. Once that’s done, recognized gestures are printed out over the serial port as well as sent via Bluetooth to the host machine. Media can then be played or paused, volume adjusted, presentations controlled, and more. More details are on the project’s GitHub repository. There’s also a demo video that explains exactly what’s going on, embedded below the page break.
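The heavy lifting on the board happens in C with the Neuton SDK, but the general shape of the problem translates to a quick desktop prototype: window the six-axis IMU samples, boil each window down to a few features, and feed them to a small classifier. The CSV format, gesture names, and choice of a random forest below are our own inventions for illustration, not the Neuton.AI model.

```python
# Desktop prototype of IMU gesture classification; not the Neuton.AI model running on the xG24.
# Assumes "gestures.csv" rows of label,ax,ay,az,gx,gy,gz with 100 consecutive rows per recording.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

WINDOW = 100  # IMU samples per gesture recording

def features(window: np.ndarray) -> np.ndarray:
    """Collapse a (WINDOW, 6) block of accel/gyro samples into a flat feature vector."""
    return np.concatenate([window.mean(axis=0), window.std(axis=0),
                           window.min(axis=0), window.max(axis=0)])

def load_dataset(path: str = "gestures.csv"):
    raw = np.loadtxt(path, delimiter=",")     # shape: (n_recordings * WINDOW, 7)
    labels = raw[::WINDOW, 0].astype(int)     # one label per recording
    windows = raw[:, 1:].reshape(-1, WINDOW, 6)
    return np.stack([features(w) for w in windows]), labels

X, y = load_dataset()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

clf = RandomForestClassifier(n_estimators=100).fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))

GESTURES = {0: "swipe_left", 1: "swipe_right", 2: "double_tap"}  # invented names
print("example prediction:", GESTURES.get(int(clf.predict(X_test[:1])[0])))
```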

Machine learning is a way of using software to solve the kinds of problems humans are not very good at writing programs to solve, and accurate gesture recognition is a good example. Not all such applications require heaps of overheating GPUs, either. We’ve seen the concept of a neural network stripped down to its bare essentials running on an Arduino Uno, for those who would like to better appreciate the fundamentals.

Continue reading “Compact, Gesture-Based Remote Control Over Bluetooth”