Nanochat Lets You Build Your Own Hackable LLM

Few people know LLMs (Large Language Models) as thoroughly as [Andrej Karpathy], and luckily for us all, he channels that knowledge into useful open-source projects. His latest is nanochat, which he bills as a way to create “the best ChatGPT $100 can buy”.

What is it, exactly? nanochat is a minimal and hackable software project, encapsulated in a single speedrun.sh script, for creating a simple ChatGPT clone from scratch, including a web interface. The codebase is about 8,000 lines of clean, readable code with minimal dependencies, leaving every single part of the process open to tinkering.


The $100 is the cost of the computational grunt work of creating the model, which takes about four hours on a single node with eight NVIDIA H100 GPUs. The result is a 1.9-billion-parameter micro-model, trained on some 38 billion tokens from an open dataset. This model is, as [Andrej] describes in his announcement on X, a “little ChatGPT clone you can sort of talk to, and which can write stories/poems, answer simple questions.” A walk-through of what the whole process looks like makes it as easy as possible to get started.
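
For a sense of what happens under the hood, here is a toy sketch of the kind of next-token-prediction training loop that speedrun.sh automates at much larger scale. To be clear, this is not nanochat's actual code: it assumes PyTorch, uses a throwaway two-layer transformer, and trains on random tokens purely so the example is self-contained.

```python
# Toy sketch of a next-token training loop -- NOT nanochat's actual code.
# Assumes PyTorch; the model is a tiny stand-in and the "dataset" is random
# tokens, just to keep the example runnable on its own.
import torch
import torch.nn as nn

VOCAB, CTX, DIM = 256, 64, 128

class TinyGPT(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        self.pos = nn.Embedding(CTX, DIM)
        layer = nn.TransformerEncoderLayer(DIM, nhead=4,
                                           dim_feedforward=4 * DIM,
                                           batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(DIM, VOCAB)

    def forward(self, idx):
        t = idx.shape[1]
        x = self.embed(idx) + self.pos(torch.arange(t, device=idx.device))
        # Causal mask so each position only attends to earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(t).to(idx.device)
        x = self.blocks(x, mask=mask)
        return self.head(x)

model = TinyGPT()
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):
    batch = torch.randint(0, VOCAB, (8, CTX + 1))  # stand-in for tokenized text
    inputs, targets = batch[:, :-1], batch[:, 1:]  # predict each next token
    logits = model(inputs)
    loss = loss_fn(logits.reshape(-1, VOCAB), targets.reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 20 == 0:
        print(f"step {step}: loss {loss.item():.3f}")
```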

Unsurprisingly, a mere $100 doesn’t buy a meaningful competitor to modern commercial offerings. However, significant improvements come from scaling up the process: a $1,000 version (detailed here) is far more coherent and capable, able to solve simple math and coding problems and take multiple-choice tests.

[Andrej Karpathy]’s projects lend themselves well to modification and experimentation, and we’re sure this one will be no exception. His past work includes training a GPT-2 LLM using nothing but pure C code, and years ago we saw his character-based Recurrent Neural Network (mis)used to generate baroque music by cleverly representing MIDI events as text.

Your LLM Won’t Stop Lying Any Time Soon

Researchers call it “hallucination”; you might more accurately refer to it as confabulation, hornswoggle, hogwash, or just plain BS. Anyone who has used an LLM has encountered it; some people seem to find it behind every prompt, while others dismiss it as an occasional annoyance, but nobody claims it doesn’t happen. A recent paper by researchers at OpenAI (PDF) tries to drill down a bit deeper into just why that happens, and whether anything can be done about it.

Spoiler alert: not really. Not unless we completely rethink the way we’re training these models, anyway. The analogy used in the conclusion is to an undergraduate in an exam room. Every right answer earns a point, but wrong answers aren’t penalized, so why the heck not guess? You might not pass an exam that way going in blind, but if you have studied (i.e., sucked up the entire internet without permission for training data) then you might get a few extra points. For an LLM in training, as for a student’s final grade, every point scored on the exam is a good point.
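
The incentive is easy to see with a little back-of-the-envelope arithmetic. The numbers below are purely illustrative (they are not taken from the OpenAI paper), but they show why a grader that never penalizes wrong answers rewards guessing over admitting uncertainty.

```python
# Illustrative exam arithmetic: on a 4-choice question, compare the expected
# score of blind guessing with abstaining ("I don't know").
P_CORRECT = 0.25  # chance a blind guess is right on a 4-choice question

def expected_guess_score(right: float, wrong: float) -> float:
    """Expected points from guessing, given the reward for right and wrong answers."""
    return P_CORRECT * right + (1 - P_CORRECT) * wrong

abstain = 0.0  # saying "I don't know" earns nothing under either scheme

# How most benchmarks (and training objectives) grade: wrong answers cost nothing.
print("no penalty :", expected_guess_score(right=1.0, wrong=0.0), "vs abstain", abstain)
# Guessing averages +0.25, so a model that always guesses scores better.

# Penalize confident wrong answers and the incentive flips toward abstaining.
print("with penalty:", expected_guess_score(right=1.0, wrong=-0.5), "vs abstain", abstain)
# Guessing now averages -0.125, so admitting uncertainty is the better strategy.
```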


Hackaday Links: October 5, 2025

What the Flock? It’s probably just some quirk of The Almighty Algorithm, but ever since we featured a story on Flock’s crime-fighting drones last week, we’ve been flooded with other stories about the company, some of which aren’t very flattering. The first thing pushed our way was this handy interactive map of the company’s network of automatic license plate readers. We had no idea how extensive the network was, and while our location is relatively free from these devices, at least ones operated on behalf of state, county, or local law enforcement, we did learn to our dismay that our local Lowe’s saw fit to install three of these cameras at the entrances to their parking lot. Not wishing to have our comings and goings documented, we’ll be taking our home-improvement dollars elsewhere for now.


Macintosh System 7 Ported To X86 With LLM Help

You can use large language models for all sorts of things these days, from writing terrible college papers to bungling legal cases. Or, you can employ them to more interesting ends, such as porting Macintosh System 7 to the x86 architecture, like [Kelsi Davis] did.

When Apple created the Macintosh lineup in the 1980s, it based the computer around Motorola’s 68K CPU architecture. These 16-bit/32-bit CPUs were plenty capable for the time, but the platform ultimately didn’t have the same expansive future as Intel’s illustrious x86 architecture that underpinned rival IBM-compatible machines.

[Kelsi Davis] decided to port the Macintosh System 7 OS to run on native x86 hardware, which would be challenging enough even with full access to the source code. Instead, she performed the task by analyzing and reverse engineering the System 7 binaries with the aid of Ghidra and a large language model. Soon enough, she had the classic System 7 desktop running on QEMU, with a fully functional Finder and the GUI working as expected. [Kelsi] credits the LLM with helping her achieve this feat in just three days, versus what she’d expect to be a multi-year effort working unassisted.
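
For the curious, the general workflow looks something like the sketch below. This is not [Kelsi Davis]’s actual tooling: it assumes you have already exported decompiled C from Ghidra, one function per file, and have the OpenAI Python client installed; the prompt, model name, and file layout are all placeholder choices of our own.

```python
# Sketch of a generic "Ghidra plus LLM" annotation pass -- not the actual
# tooling used in the project. Assumes decompiled C has been exported to
# ./decompiled/*.c and that OPENAI_API_KEY is set in the environment.
from pathlib import Path
from openai import OpenAI

client = OpenAI()

PROMPT = (
    "This C code was decompiled from a classic Mac OS (System 7, 68K) binary. "
    "Explain what the function does, suggest a descriptive name for it, and "
    "note any Toolbox calls or data structures you recognize."
)

for src in sorted(Path("decompiled").glob("*.c")):
    code = src.read_text()
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[
            {"role": "system", "content": PROMPT},
            {"role": "user", "content": code},
        ],
    )
    annotation = resp.choices[0].message.content
    # Save the LLM's analysis alongside the decompiled source for later review.
    src.with_suffix(".analysis.md").write_text(annotation)
    print(f"annotated {src.name}")
```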

Files are on GitHub for the curious. We love a good port around these parts; we particularly enjoyed these efforts to recreate Portal on the N64. If you’re doing your own advanced tinkering with Macintosh software from yesteryear, don’t hesitate to let us know.

A man holds a novelty Georgia license plate (P00-5000) in front of the tailgate of a black F-150 Lightning pickup. A transparent sticker overlays the plate with black specks, giving it the appearance of digital mud.

A Deep Dive On Creepy Cameras

George Orwell might’ve predicted the surveillance state, but it’s still surprising how many entities took 1984 as a how-to manual instead of a cautionary tale. [Benn Jordan] decided to take a closer look at the creepy cameras invading our public spaces and how to circumvent them.

[Jordan] starts us off with an overview of how machine learning “AI” is used in Automated License Plate Reader (ALPR) cameras, along with some of the history behind their usage in the United States. Basically, when you drive by one of these cameras, an “image segmentation model or something similar” detects the license plate and then runs optical character recognition (OCR) on the plate contents. The system will also catalog any bumper stickers along with the make and model of the car, so it can make a pretty good guess that a vehicle is yours even if the OCR isn’t 100% sure of the exact plate sequence.
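
Conceptually the pipeline is simple: find the plate, crop it, and OCR it. Here is a minimal sketch of that detect-then-OCR flow, assuming the ultralytics YOLO package, OpenCV, and Tesseract via pytesseract; the “license_plate.pt” weights are a placeholder for a model actually trained to spot plates.

```python
# Minimal detect-then-OCR sketch of an ALPR-style pipeline (illustrative only).
import cv2
import pytesseract
from ultralytics import YOLO

detector = YOLO("license_plate.pt")   # placeholder: a plate-detection model
frame = cv2.imread("passing_car.jpg")

for result in detector(frame):
    for box in result.boxes.xyxy:     # one bounding box per detected plate
        x1, y1, x2, y2 = map(int, box.tolist())
        plate = frame[y1:y2, x1:x2]

        # Crude preprocessing before OCR: grayscale and Otsu threshold.
        gray = cv2.cvtColor(plate, cv2.COLOR_BGR2GRAY)
        _, binary = cv2.threshold(gray, 0, 255,
                                  cv2.THRESH_BINARY + cv2.THRESH_OTSU)

        # --psm 7 tells Tesseract to treat the crop as a single line of text.
        text = pytesseract.image_to_string(binary, config="--psm 7").strip()
        print("plate guess:", text)
```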

Where the video gets really interesting is when [Jordan] starts disassembling, building, and designing countermeasures to these systems. We get a teardown of a Motorola ALPR for in-vehicle use that is better at being closed hardware than it is at reading license plates, and [Jordan] uses a Raspberry Pi 5, a Hailo AI board, and You Only Look Once (YOLO) recognition software to build a “computer vision system that’s much more accurate than anything on the market for law enforcement” for $250.

[Jordan] was also able to develop a transparent sticker that renders a license plate unreadable to the ALPR but still plainly visible to a human observer. What’s interesting is that, depending on the pattern, the system would either read the plate as an incorrect alphanumeric sequence or fail to detect a license plate entirely. It turns out that filtering all the rectangles in the world down to just license plates is a tricky problem if you’re a computer. You can find the code on his GitHub if you want to take a gander.

You’ve probably heard about using IR LEDs to confuse security cameras, but what about yarn? If you’re looking for more artistic uses for AI image processing, how about this camera that only takes nudes or this one that generates a picture based on geographic data?


Pong Cloned By Neural Network

Although not the first video game ever produced, Pong was the first to achieve commercial success, and it has had a tremendous influence on our culture as a whole. In its day, Pong’s popularity ushered in the arcade era that would last for more than two decades. Today it remains popular partly because it is so approachable: the gameplay is simple, the original machine used hardwired logic, and it offers a snapshot of the state of computing at the time. For these reasons, [Nick Bild] decided to recreate this arcade classic, but not in the traditional way: he trained a neural network to become the game instead.


This Week In Security: The AI Hacker, FortMajeure, And Project Zero

One of the hot topics currently is using LLMs for security research. Poor-quality reports written by LLMs have become the bane of vulnerability disclosure programs, but there is an equally interesting effort underway to put LLMs to work doing actually useful research. One such story comes from [Romy Haik] at ULTRARED, who set out to build an AI hacker. This isn’t an over-eager newbie naively asking an AI to find vulnerabilities; [Romy] knows what he’s doing. We know this because he tells us plainly that the LLM-driven hacker failed spectacularly.

The plan was to build a multi-LLM orchestra, with a single AI at the top that maintains state through the entire process. Multiple LLMs sit below it, deciding what to do next and exactly how to approach the problem, and actually generating commands for the security tools in play. Then yet another AI takes the output and figures out whether the attack was successful. The tooling was assembled, and [Romy] set it loose on a few intentionally vulnerable VMs.
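
In rough Python, the loop looks something like the sketch below. This is our own rendering of the architecture as described, not ULTRARED’s code: llm() stands in for whatever chat-completion API the stack actually calls, and run_tool() for executing a tool against the lab VM; the target address and step count are invented.

```python
# Sketch of a multi-LLM "orchestra" for automated pentesting (illustrative only).
import subprocess

def llm(role_prompt: str, context: str) -> str:
    """Placeholder for a real chat-completion call with a role-specific prompt."""
    raise NotImplementedError

def run_tool(command: str) -> str:
    """Run a shell command against the target lab VM and capture its output."""
    return subprocess.run(command, shell=True, capture_output=True,
                          text=True, timeout=300).stdout

state = "Target: http://10.0.0.5 (intentionally vulnerable lab VM). No findings yet."

for step in range(20):
    # The top-level orchestrator keeps the overall state and picks the next goal.
    goal = llm("You are the orchestrator. Given the engagement state, "
               "pick the single next objective.", state)

    # A lower-level operator LLM turns that goal into one concrete command.
    command = llm("You are the operator. Emit exactly one shell command "
                  "to pursue this objective.", goal)
    output = run_tool(command)

    # A separate judge LLM decides whether the output shows a real finding.
    verdict = llm("You are the judge. Did this output demonstrate a real, "
                  "exploitable vulnerability? Answer and explain briefly.", output)

    state += f"\nStep {step}: goal={goal!r} verdict={verdict!r}"
```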

As we hinted at above, the results were fascinating but dismal. The LLM stack successfully found one Remote Code Execution (RCE) flaw, one SQL injection, and three Cross-Site Scripting (XSS) flaws. The whole post is, sneakily, something of an advertisement for ULTRARED’s actual automated scanner, which uses more conventional methods to scan for vulnerabilities. But it makes for a useful comparison: that scanner found nearly 100 vulnerabilities among the collection of targets.

The AI did what you’d expect, finding plenty of false positives. Ask an AI to describe a vulnerability, and it will gladly do so, no real vulnerability required. But the real problem was the multitude of times the AI stack did demonstrate a problem and failed to realize it. [Romy] has thoughts on why this attempt failed, and two points stand out. The first is that while the LLM can be creative in making attacks, it’s really terrible at accurately analyzing the results. The second is one of the most important things to keep in mind about today’s AIs: the model doesn’t actually want to find a vulnerability. One of the marks of a security researcher is a near obsession with landing that great score.