AI For The Skeptics: Attempting To Do Something Useful With It

There are some subjects that, as a writer, you know need to be written about, but at the same time you feel it necessary to steel yourself for the inevitable barrage of criticism once your work reaches its audience. Of these the latest is AI, or more specifically the current enthusiasm for Large Language Models, or LLMs. On one side we have the people who’ve drunk a little too much of the Kool-Aid and are frankly a bit annoying on the subject, while on the other we have those who are infuriated by the technology. Given the tide of low-quality AI slop to be found online, we can see the latter group’s point.

This is the second in what may become an occasional series looking at the subject from the perspective of wanting to find the useful stuff behind the hype; what is likely to fall by the wayside, and what as yet unheard of applications will turn this thing into something more useful than a slop machine or an agent that might occasionally automate some of your tasks correctly. In the previous article I examined the motivation of that annoying Guy In A Suit who many of us will have encountered who wants to use AI for everything because it’s shiny and new, while in this one I’ll try to do something useful with it myself.

Continue reading “AI For The Skeptics: Attempting To Do Something Useful With It”

New Linux Kernel Rules Put The Onus On Humans For AI Tool Usage

It’s fair to say that the topic of so-called ‘AI coding assistants’ is somewhat controversial. With arguments against them ranging from code quality to copyright issues, there are many valid reasons to be at least hesitant about accepting their output in a project, especially one as massive as the Linux kernel. With a recent update to the Linux kernel documentation the use of these tools has now been formalized.

The upshot for such Large Language Model (LLM) tools is that any commit that uses generated code has to be signed off by a human developer, and this human will ultimately bear responsibility for the code quality as well as any issues that the code may cause, including legal ones. The use of AI tools also has to be declared with the Assisted-by: tag in contributions so that their use can be tracked.
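As a rough illustration of how those two requirements could be checked mechanically, here is a minimal Python sketch that scans a commit message for the trailers. Only the tag names Signed-off-by: and Assisted-by: come from the text above; the function names, the sample message, and the tool name are hypothetical, and the sketch accepts any non-empty value after the tag rather than assuming a particular format.

```python
import re

# Match the two trailer names at the start of a line. The value format is
# not specified in the text above, so any non-empty value is accepted.
TRAILER_RE = re.compile(r"^(Signed-off-by|Assisted-by):[ \t]*(\S.*)$", re.MULTILINE)


def find_trailers(message):
    """Collect Signed-off-by and Assisted-by values from a commit message."""
    found = {"Signed-off-by": [], "Assisted-by": []}
    for name, value in TRAILER_RE.findall(message):
        found[name].append(value.strip())
    return found


def declares_ai_assistance(message):
    """True if the message carries at least one Assisted-by trailer."""
    return bool(find_trailers(message)["Assisted-by"])


def check_commit(message):
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    if not find_trailers(message)["Signed-off-by"]:
        problems.append("no Signed-off-by trailer found")
    return problems


# Hypothetical commit message, for illustration only.
example = (
    "subsys: fix off-by-one in buffer handling\n"
    "\n"
    "Longer description of the change.\n"
    "\n"
    "Assisted-by: ExampleTool\n"
    "Signed-off-by: Jane Dev <jane@example.org>\n"
)

print(find_trailers(example))
print(declares_ai_assistance(example), check_commit(example))
```

A script like this cannot tell whether a tool was actually used, of course; it can only confirm that a human sign-off is present and report whether assistance was declared.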

When it comes to other open source projects the approach varies, with NetBSD having banished anything tainted by ‘AI’, cURL shuttering its bug bounty program due to AI code slop, and Mesa’s developers demanding that you understand generated code which you submit, following a tragic slop-cident.

Meanwhile there are also rising concerns that these LLM-based tools may be killing open source through ‘vibe-coding’, along with legal concerns over whether LLM-generated code respects the original license of the code that was ingested into the model during training. Clearly we haven’t seen the end of these issues yet.

Reverse-Engineering Human Cognition And Decision Making In A Modern Age

Cognitive processes are not something that we generally pay much attention to until something goes wrong, but they cover the entire scope of us ingesting sensory information, the processing and recalling thereof, as well as any resulting decisions made based on such internal deliberation.

Within that context there has also long been a struggle between those who feel that it’s fine for humans to rely on available technologies to make tasks like information recall and calculations easier, and those who insist that a human should be perfectly capable of doing such tasks without any assistance. Plato argued that reading and writing hurt our ability to memorize, and for the longest time it was deemed inappropriate for students to even consider taking one of those newfangled digital calculators into an exam, while now we have many arguing that using an ‘AI’ is the equivalent of using a calculator.

At the root of this conundrum lies the distinction between that which enhances and that which hampers human cognition. When does one merely offload tasks to a device or object, and when does one harm one’s own cognition?

Continue reading “Reverse-Engineering Human Cognition And Decision Making In A Modern Age”

TurboQuant: Reducing LLM Memory Usage With Vector Quantization

Large language models (LLMs) aren’t actually giant computer brains. Instead, they are massive vector spaces in which the probabilities of tokens occurring in a specific order are encoded. With billions of parameters at N bits per parameter, a full model needs N billion bits of storage for every billion parameters. Since increasing the number of parameters makes the models appear smarter, correspondingly the size of these models and their associated caches has been increasing rapidly.
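To put numbers on that arithmetic, here is a small Python sketch. The 7-billion-parameter model and the bit widths are hypothetical values picked for illustration, and the calculation covers only the raw parameter storage, not caches or runtime overhead.

```python
def model_size_bytes(num_parameters, bits_per_parameter):
    """Raw storage for the parameters alone: parameters x bits, in bytes."""
    return num_parameters * bits_per_parameter / 8


def to_gigabytes(num_bytes):
    """Convert bytes to decimal gigabytes (1 GB = 10**9 bytes)."""
    return num_bytes / 1e9


# Hypothetical model size and precisions, for illustration only.
params = 7_000_000_000
for bits in (16, 8, 4):
    size_gb = to_gigabytes(model_size_bytes(params, bits))
    print(f"{bits:>2} bits per parameter: {size_gb} GB")
```

Halving the number of bits per parameter halves the storage, which is why quantization is such an attractive lever.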

Vector quantization (VQ) is a method that can compress the vectors calculated during inference to take up less space without significant loss of data. Google’s recently published pre-print paper on TurboQuant covers an LLM-oriented VQ algorithm that’s claimed to provide up to a 6x compression level with no negative impact on inference times.

The tokens aren’t directly encoded in the vector space, but their associated key value is, which along with the single token per inference process creates the need for a key-value (KV) cache, the size of which scales with the size of the model. Compressing the KV cache using VQ thus reduces its size and correspondingly speeds up look-ups due to the smaller footprint in memory. One catch here is that, due to the nature of quantization, some accuracy will be lost. The trick is thus to apply VQ in such a way that this loss does not affect the output in a noticeable manner.
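The basic trade-off can be seen in a toy example. The Python sketch below is plain textbook vector quantization with a fixed codebook, not the TurboQuant algorithm; the codebook, the sample vectors, and all function names are made up for illustration. Each vector is replaced by the index of its nearest codebook entry, which saves space at the cost of a small reconstruction error.

```python
import math


def nearest_index(vector, codebook):
    """Index of the codebook entry closest to vector (squared Euclidean distance)."""
    best_index, best_dist = 0, math.inf
    for i, centroid in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vector, centroid))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def encode(vectors, codebook):
    """Replace every vector by the index of its nearest codebook entry."""
    return [nearest_index(v, codebook) for v in vectors]


def decode(indices, codebook):
    """Look the indices back up; the result only approximates the input."""
    return [codebook[i] for i in indices]


def mean_squared_error(original, restored):
    """Average squared difference per component: the accuracy that was lost."""
    total = sum((a - b) ** 2 for o, r in zip(original, restored) for a, b in zip(o, r))
    count = sum(len(o) for o in original)
    return total / count


# Hypothetical 2-D codebook and data; real caches use far larger vectors.
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 1.0), (0.0, -1.0)]
vectors = [(0.9, 1.1), (0.1, -0.2), (-0.8, 0.7), (0.2, -0.9)]

indices = encode(vectors, codebook)
restored = decode(indices, codebook)
print(indices)
print(mean_squared_error(vectors, restored))
```

With four codebook entries each vector is stored as a 2-bit index instead of two floating point numbers. How the codebook is chosen, and how the error is kept from showing up in the model's output, is where an algorithm such as the one described in the paper does its real work.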

Other aspects that had to be taken into account by the TurboQuant algorithm were fast computation to keep up with real-time requirements, along with compatibility with so-called ‘AI accelerator’ hardware.

Continue reading “TurboQuant: Reducing LLM Memory Usage With Vector Quantization”

Are We Surrendering Our Thinking To Machines?

“Once, men turned their thinking over to machines in the hope that this would set them free. But that only permitted other men with machines to enslave them.” — so said [Frank Herbert] in his magnum opus, Dune, or rather in the OC Bible that made up part of the book’s rich worldbuilding. A recent study demonstrating “cognitive surrender” in large language model (LLM) users, as reported in Ars Technica, is going to add more fuel to that Butlerian fire.

Cognitive surrender is, in short, exactly what [Herbert] was warning of: giving over your thinking to machines. In the study, people were asked a series of questions and, except for the necessary “brain-only” control group, given access to a rigged LLM to help them answer. It was rigged in that it would give wrong answers 50% of the time, which, while a higher error rate than that of most LLMs, is only a difference in degree, not in kind. Hallucination is unavoidable; here it was just made controllably frequent for the sake of the study.

The hallucinations in the study were errors that the participants should have been able to see through, if they’d thought about the answers. Eighty percent of the time, they did not. That is to say: presented with an obviously wrong answer from the machine, only in 20% of cases did the participants bother to question it. The remainder were experiencing what the researchers dubbed “cognitive surrender”: they turned their thinking over to the machines. There’s a lot more meat to this than we can summarize here, of course, but the whole paper is available free for your perusal.

Giving over thinking to machines is nothing new, of course; it’s probably been a couple decades since the first person drove into a lake on faulty GPS directions, for example. One might even argue that since LLMs are correct much more than 50% of the time, it is statistically wise to listen to them. In that case, however, one might be encouraged to read Dune.

Thanks to [Monika] for the tip!

The Heat Island Effect Is Warming Up The AI Data Center Controversy

There’s been a lot of virtual ink spilled in environmental circles about the cooling water requirements of data centers, but less consideration of what happens with all the heat coming out of these buildings. Naturally, it’s going to warm the surrounding environment, but how much? Around 2 C (3.6 F) on average, and potentially much more than that, according to a recent study on the data heat island effect.

It’s common sense, of course: heat removed from the data center doesn’t go away. That heat might go into a body of water if one is available, but otherwise it’s out into the atmosphere to warm up everybody else’s day. In some places — like a Canadian winter — that might not be so bad. In others, where climate change and urban heat islands are cranking up the summertime temperatures, it very much could be. Especially if you’re in the worst-case scenario micro-climate described by the paper, which saw a predicted increase of 9.1 C (16 F).

Now, these results are theoretical and need to be ground-truthed, but anyone who has huddled next to the air-exchange unit of a large building for warmth knows there’s something to them. Unfortunately there don’t seem to be before-and-after measurements available for existing data centers, AI or otherwise, to show exactly what their heat output is doing in the real world, but the urban heat island effect from all the dark asphalt in our cities is well known. Cooling paint and green roofs can help with that, but they won’t do much for the megawatts being pumped out to keep your cousin’s AI girlfriend online.

Some would argue that all this heat wouldn’t be a problem if we could launch the data centers outside the environment — just have a care the front doesn’t fall off.


Image of data center cooling by Анна from Pixabay

So Expensive, A Caveman Can Do It

A few years back a company had an ad campaign with a discouraged caveman who was angry because the company claimed their website was “so easy, even a caveman could do it.” Maybe that inspired [JuliusBrussee] to create caveman, a tool for reducing costs when using Claude Code.

The trick is that Claude, like other LLMs, operates on tokens. Tokens aren’t quite words, but they are essentially words or word fragments. Most LLM plans also charge you by the token. So fewer tokens means lower costs. However, LLMs can be quite verbose, unless you make them talk like a caveman.
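The cost argument is easy to sketch in Python. Everything specific here is an assumption made for illustration: the price per million tokens is invented, and splitting on whitespace is only a crude stand-in for a real tokenizer, which would produce different counts.

```python
def rough_token_count(text):
    """Very rough proxy for a tokenizer: count whitespace-separated words."""
    return len(text.split())


def estimate_cost(token_count, price_per_million_tokens):
    """Cost of a number of tokens at a given (hypothetical) price."""
    return token_count * price_per_million_tokens / 1_000_000


# Hypothetical replies and price, for illustration only.
verbose = "I would be happy to help you with that. The file has been updated successfully."
terse = "File updated."
price = 3.0  # made-up currency units per million tokens

for label, reply in (("verbose", verbose), ("terse", terse)):
    tokens = rough_token_count(reply)
    print(label, tokens, estimate_cost(tokens, price))
```

The saving on a single reply is negligible, but it scales linearly with the number of tokens generated over a long session.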

Continue reading “So Expensive, A Caveman Can Do It”