Are We Surrendering Our Thinking To Machines?

“Once men turned their thinking over to machines in the hope that this would set them free. But that only permitted other men with machines to enslave them.” — so said [Frank Herbert] in his magnum opus, Dune, or rather in the Orange Catholic Bible that made up part of the book’s rich worldbuilding. A recent study demonstrating “cognitive surrender” in large language model (LLM) users, as reported in Ars Technica, is going to add more fuel to that Butlerian fire.

Cognitive surrender is, in short, exactly what [Herbert] was warning of: giving over your thinking to machines. In the study, people were asked a series of questions and — except for the necessary “brain-only” control group — given access to a rigged LLM to help them answer. It was rigged in that it gave wrong answers 50% of the time, which, while a far higher error rate than most LLMs manage, is a difference in degree, not in kind. Hallucination is unavoidable; here it was just made controllably frequent for the sake of the study.
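To make the setup concrete, a “rigged” assistant of this kind is easy to imagine: a thin wrapper that flips a coin and substitutes a planted wrong answer half the time. This is just an illustrative sketch, not the paper’s actual harness, and all the names in it are made up:

```python
import random

def rigged_llm(real_answer: str, wrong_answer: str, error_rate: float = 0.5) -> str:
    """Return the correct answer, or a planted wrong one error_rate of the time.

    A participant who accepts every answer uncritically inherits the full
    error rate; one who sanity-checks the output should catch the
    deliberately wrong answers.
    """
    if random.random() < error_rate:
        return wrong_answer  # the "hallucination", made controllably frequent
    return real_answer
```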

The hallucinations in the study were errors that the participants should have been able to see through, if they’d thought about the answers. Eighty percent of the time, they did not. That is to say: presented with an obviously wrong answer from the machine, only in 20% of cases did the participants bother to question it. The remainder were experiencing what the researchers dubbed “cognitive surrender”: they turned their thinking over to the machines. There’s a lot more meat to this than we can summarize here, of course, but the whole paper is available free for your perusal.

Giving over thinking to machines is nothing new, of course; it’s probably been a couple decades since the first person drove into a lake on faulty GPS directions, for example. One might even argue that since LLMs are correct much more than 50% of the time, it is statistically wise to listen to them. In that case, however, one might be encouraged to read Dune.

Thanks to [Monika] for the tip!

So Expensive, A Caveman Can Do It

A few years back, a company ran an ad campaign featuring a discouraged caveman, angry because the company claimed its website was “so easy, even a caveman could do it.” Maybe that inspired [JuliusBrussee] to create caveman, a tool for reducing costs when using Claude Code.

The trick is that Claude, like other LLMs, operates on tokens. Tokens aren’t quite words, but they are essentially words or word fragments. Most LLM plans also charge you by the token, so fewer tokens mean lower costs. However, LLMs can be quite verbose, unless you make them talk like a caveman.
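You can get a feel for the savings by counting tokens yourself. Claude’s tokenizer isn’t public, but OpenAI’s tiktoken library gives a ballpark-comparable count; the two phrasings below are invented for illustration:

```python
import tiktoken  # pip install tiktoken; OpenAI's tokenizer, used as a stand-in here

enc = tiktoken.get_encoding("cl100k_base")

verbose = ("I have carefully reviewed the function in question, and I believe "
           "the most likely cause of the failure is an off-by-one error in the "
           "loop bound on the final iteration.")
caveman = "Bug: off-by-one in loop bound."

for label, text in (("verbose", verbose), ("caveman", caveman)):
    print(f"{label}: {len(enc.encode(text))} tokens")
```

The exact counts will differ under Claude’s own tokenizer, but the ratio between the two styles is the point.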

Continue reading “So Expensive, A Caveman Can Do It”

Controlling Vintage Mac OS With AI

Classic Mac OS was prized for its clean, accessible GUI when it first hit the scene in the 1980s. Back then, developers hadn’t even conceived of all the weird gewgaws that would eventually be shoehorned into modern operating systems, least of all AI agents that seem to be permeating everything these days. And yet! [SeanFDZ] found a way to cram Claude or other AI agents into the vintage Mac world.

The result of [Sean]’s work is AgentBridge, a tool for interfacing modern AI agents with vintage Mac OS (7-9). AgentBridge itself runs as an application within Mac OS. It works by reading and writing text files in a shared folder which can also be accessed by Claude or whichever AI agent is in use. AgentBridge takes commands from its “inbox”, executes them via the Mac Toolbox, and then writes outputs to its “outbox” where they can be picked up and processed by the AI agent. The specifics of how the shared folder works are up to you—you can use a network share, a shared folder in an emulation environment, or just about any other setup that lets the AI agent and AgentBridge access the same folder.
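From the modern side, talking to AgentBridge is then just file I/O. Here’s a minimal sketch of that inbox/outbox dance; the folder layout, file naming, and polling interval are our assumptions for illustration, not AgentBridge’s documented protocol:

```python
import time
from pathlib import Path

SHARE = Path("/Volumes/macshare")        # assumed: any folder both machines can see
INBOX, OUTBOX = SHARE / "inbox", SHARE / "outbox"

def send_command(name: str, command: str, timeout: float = 30.0) -> str:
    """Drop a command file in the inbox, then poll the outbox for the reply."""
    (INBOX / f"{name}.txt").write_text(command)
    reply = OUTBOX / f"{name}.txt"
    deadline = time.time() + timeout
    while time.time() < deadline:
        if reply.exists():
            return reply.read_text()
        time.sleep(0.5)                  # the vintage Mac is not in a hurry
    raise TimeoutError(f"no reply to {name!r}")
```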

It’s hard to imagine any mainstream use cases for having a fleet of AI-controlled Macintosh SE/30s. Still, that doesn’t mean we don’t find the concept hilarious. Meanwhile, have you considered the prospect of artificial intelligence running on the Commodore 64?

Ask Hackaday: What Will An LLM Be Good For In The Plateau Of Productivity?

A friend of mine has been a software developer for most of the last five decades, and has worked with everything from 1960s mainframes to the machines of today. She recently tried AI coding tools to see what all the fuss is about, as a helper to her extensive coding experience rather than as a zero-work vibe coding tool. Her reaction stuck with me; she referenced her grandfather who had been born in rural America in the closing years of the nineteenth century, and recalled him describing the first time he saw an automobile.

Après Nous, Le Krach

The Gartner hype cycle graph. Jeremykemp, CC BY-SA 3.0.

We are living amid a wave of AI slop and unreasonable hype, so it’s an easy win to dunk on LLMs, but as the whole thing climbs towards the peak of inflated expectations on the Gartner hype cycle, perhaps it’s time to look forward. The current AI hype is inevitably going to crash and burn, but what comes afterwards? The long tail of the plateau of productivity will contain those applications in which LLMs are a success, but what will they be? We have yet to hack together a working crystal ball, but perhaps it’s worth gazing into the future anyway.

Continue reading “Ask Hackaday: What Will An LLM Be Good For In The Plateau Of Productivity?”

What About The Droid Attack On The Repos?

You might not have noticed, but we here at Hackaday are pretty big fans of Open Source — software, hardware, you name it. We’ve also spilled our fair share of electronic ink on things people are doing with AI. So naturally when [Jeff Geerling] declares on his blog (and in a video embedded below) that AI is destroying open source, well, we had to take a look.

[Jeff]’s article highlights a problem he and many others who manage open source projects have noticed: they’re getting flooded with agentic slop pull requests (PRs). It’s now to the point that GitHub will let you turn off PRs completely, at which point you’ve given up a key piece of the ‘hub’s functionality. That ability to share openly with everyone seemed like a big source of strength for open source projects, but [Jeff] here is joining his voice with others like [Daniel Stenberg] of curl fame, who has dropped bug bounties over a flood of spurious AI-generated PRs.

It’s a problem for maintainers, to be sure, but it’s as much a human problem as an AI one. After all, someone set up that AI agent and pointed it at your project. While changing the incentive structure, like removing bug bounties, might discourage such behavior, [Jeff] offers no bounties and has the same problem. Ultimately, it may be necessary for open source projects to become a little less open, only allowing invited collaborators to submit PRs, which is also now an option on GitHub.

Combine invitation-only access with a strong policy against agentic AI and LLM code, and you can still run a quality project. The cost of such measures is that the random user with no connection to the project can no longer find and squash bugs. As unlikely as that sounds, it happens! Rather, it did. If the random user is just going to throw their AI agent at the problem, it’s not doing anybody any good.

First they came for our RAM, now they’re here for our repos. If it wasn’t for getting distracted by the cute cat pictures we might just start to think vibe coding could kill open source. Extra bugs were bad enough, but now we can’t even trust the PRs to help us squash them!

Continue reading “What About The Droid Attack On The Repos?”

Bruteforcing Accidental Antenna Designs

Antenna design is often referred to as a black art or witchcraft, even by those experienced in the space. To that end, [Janne] wondered—could years of honed skill be replaced by bruteforcing the problem with the aid of some GPUs? Iterative experiments ensued.

[Janne]’s experience in antenna design was virtually non-existent prior to starting: a VNA on hand, but no other knowledge of the craft. Previously, they had worked around this by simply copying vendor reference designs when putting antennas on PCBs. However, knowing that a need for something more specific sometimes arises, they wanted a tool that could help in that regard.

The root of the project came from a research paper using an FDTD tool running on GPUs to inversely design photonic nanostructures. Since light and radio waves are both electromagnetic radiation, just at wildly different frequencies, [Janne] realized this could be tweaked into service as an RF antenna design tool. The core simulation engine of the FDTD tool, along with its gradient solver, was hammered into working as an antenna simulator, with [Janne] using LLMs to also tack on a validation system using openEMS, an open-source electromagnetic field solver. The aim was to ensure the results had some validity to real-world physics, particularly important given [Janne] left most of the coding up to large language models. A reward function development system was then implemented to create antenna designs, rank them on fitness, and then iterate further.
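Strip away the GPU plumbing and the outer loop is a familiar generate-rank-iterate scheme. As a much-simplified sketch (the real project also exploits the FDTD tool’s gradient solver; `simulate_s11` here is a placeholder for the actual solver, and the reward is a toy):

```python
import random

def simulate_s11(design: list[float]) -> float:
    # Placeholder for the GPU FDTD solver: return the simulated S11
    # (return loss) in dB at the target frequency for this geometry.
    return random.uniform(-30.0, 0.0)

def reward(design: list[float]) -> float:
    return -simulate_s11(design)       # deeper S11 notch (better match) -> higher reward

def mutate(design: list[float]) -> list[float]:
    return [g + random.gauss(0, 0.1) for g in design]   # perturb geometry parameters

population = [[random.random() for _ in range(8)] for _ in range(32)]
for generation in range(100):
    population.sort(key=reward, reverse=True)           # rank designs on fitness
    survivors = population[:8]                          # keep the fittest
    population = survivors + [mutate(random.choice(survivors)) for _ in range(24)]
```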

The designs produced by this arcane system are… a little odd, and perhaps not what a human might have created. They also didn’t particularly impress in the performance stakes when [Janne] produced a few on real PCBs. However, they do more or less line up with their predicted modelled performance, which was promising. Code is on GitHub if you want to dive into experimenting yourself. Experienced hands may like to explore the nitty-gritty details to see if the LLMs got the basics right.

We’ve featured similar “evolutionary” techniques before, including one project that aimed to develop a radio. If you’ve found ways to creatively generate functional hardware from boatloads of mathematics, be sure to let us know on the tipsline!

Living In The (LLM) Past

In the early days of AI, a common example program was the hexapawn game. This extremely simplified chess variant was the basis of a program that learned to play with your help. When the computer made a bad move, you’d punish it. However, people quickly realized they could punish good moves to ensure they always won against the computer. Large language models (LLMs) seem to know “everything,” but everything is whatever happens to be on the Internet, seahorse emojis and all. That got [Hayk Grigorian] thinking, so he built TimeCapsule LLM: an AI trained on historical data only.
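As an aside, that hexapawn punishment mechanic is simple enough to sketch. In the classic matchbox-computer style, each board state holds a pool of candidate moves, and a loss removes the last move played; the board and move encodings below are left abstract, and this is a bare-bones illustration rather than a faithful reimplementation:

```python
import random

# state -> pool of candidate moves; a move can appear more than once,
# so its multiplicity acts as a weight (the "beads in a matchbox" trick)
policy: dict[str, list[str]] = {}

def choose_move(state: str, legal_moves: list[str]) -> str:
    pool = policy.setdefault(state, list(legal_moves))
    return random.choice(pool)

def punish(history: list[tuple[str, str]]) -> None:
    """After a loss, remove the last move played from its state's pool.

    (In the full matchbox scheme, an emptied pool punishes the move
    before it, too; that recursion is omitted for brevity.)
    """
    state, move = history[-1]
    if move in policy.get(state, []):
        policy[state].remove(move)

# A dishonest trainer simply calls punish() after moves the machine
# played *well*, which is exactly how players learned to always win.
```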

Sure, you could tell a modern chatbot to pretend it was in, say, 1875 London and answer accordingly. However, you have to remember that chatbots are statistical in nature, so they could easily slip in modern knowledge. Since TimeCapsule only knows data from 1875 and earlier, it will be happy to tell you that travel to the moon is impossible, for example. If you ask a traditional LLM to roleplay, it will often hint at things you know to be true, but that nobody of that particular time period would have known.

Chatting with ChatGPT and telling it that it was a person living in Glasgow in 1200 limited its knowledge somewhat. Yet it was also able to hint about North America and the existence of the atom. Granted, the Norse apparently found North America around the year 1000, and Democritus wrote about indivisible matter in the fifth century BC. But that knowledge would not have been widespread among common people in the year 1200. Training on period texts would surely give a better representation of a historical person.

The model uses texts from 1800 to 1875 published in London. In total, the training corpus contains about 90 GB of text files. Is this practical? There is academic interest in recreating period-accurate models to study history. Some also see it as a way to surface the biases of the period and contrast them with the biases found in data today. Of course, unlike the Internet, surviving documents from the 1800s are less likely to have trivialities in them, so it isn’t clear just how accurate a model like this would be for that sort of purpose.
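The interesting engineering here isn’t the model architecture so much as the corpus curation. A minimal sketch of the filtering step, assuming each document carries year and place metadata (the field names and records are invented for illustration):

```python
all_documents = [
    {"title": "A Treatise on Steam", "year": 1831, "place": "London", "text": "..."},
    {"title": "On Wireless Waves",   "year": 1902, "place": "London", "text": "..."},
]

def in_period(doc: dict) -> bool:
    """Keep only London texts published from 1800 through 1875."""
    return doc.get("place") == "London" and 1800 <= doc.get("year", 0) <= 1875

corpus = [doc for doc in all_documents if in_period(doc)]   # drops the 1902 text
```

Leak-proofing is the hard part: a single anachronistic document in 90 GB can teach the model things nobody in 1875 knew.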

Instead of reading the news, LLMs can write it. Just remember that the statistical nature of LLMs makes them easy to manipulate during training, too.


Featured Art: Royal Courts of Justice in London about 1870, Public Domain