Why Model Collapse In LLMs Is Inevitable With Self-Learning

There is a persistent belief in the ‘AI’ community that large language models (LLMs) have the ability to learn and self-improve by tweaking the weights in their vector space. Although there’s scant evidence that tweaking a probability vector space is anything like the learning process in biological brains, we nevertheless get sold the idea that artificial general intelligence (AGI) is just around the corner if we do just enough tweaking.

Instead of emerging super intelligence, the most likely outcome is what is called model collapse, with a recent paper by [Hector Zenil] going over the details on why self-training/learning in LLMs and similar systems is a fool’s errand. For those who just want the brief summary with all the memes, [Metin] wrote a blog post covering the basics.

In the end an LLM as well as a diffusion model (DM) is a statistical model of input data using which a statistically likely output can be generated (inferred) based on an input query. It follows intuitively that by using said output  to adjust the model with, the model will over time converge on a kind of statistical singularity rather than some ‘AI singularity’ event. This is also why these models need to be constantly trained with external, human-generated data in order to prevent such a collapse.

In the paper by [Hector] a mathematical model is created to demonstrate that an LLM, DM or similar statistical model undergoes degenerative dynamics whenever said external input is reduced. Although in the paper a mechanism is suggested to counter the entropy decay within the model, the ultimate point is that a statistical model cannot improve itself without continuous external anchoring.

The idea of LLMs being at all intelligent in any sense has been a contentious one, with the concept of language models being equated with ‘AI’ dating back to the 20th century, including as fun home computer projects. Much of the problem probably lies in humans projecting intelligent behavior onto these statistical models, turning LLMs into ‘counterfeit humans’, not helped by how closely generated text can resemble something written by a human, even if completely confabulated.

Thanks to [deshipu] for the tip.

How Anthropic’s Model Context Protocol Allows For Easy Remote Execution

As part of the effort to push Large Language Model (LLM) ‘AI’ into more and more places, Anthropic’s Model Context Protocol (MCP) has been adopted as the standard to connect LLMs with various external tools and systems in a client-server model. A light oversight with the architecture of this protocol is that remote command execution (RCE) of arbitrary commands is effectively an essential part of its design, as covered in a recent article by [OX Security].

The details of this flaw are found in a detailed breakdown article, which applies to all implementations regardless of the programming language. Essentially the StdioServerParameters that are passed to the remote server to create a new local instance on said server can contain any command and arguments, which are executed in a server-side shell.

Continue reading “How Anthropic’s Model Context Protocol Allows For Easy Remote Execution”

2025: As The Hardware World Turns

If you’re reading this, that means you’ve successfully made it through 2025! Allow us to be the first to congratulate you — that’s another twelve months of skills learned, projects started, and hacks….hacked. The average Hackaday reader has a thirst for knowledge and an insatiable appetite for new challenges, so we know you’re already eager to take on everything 2026 has to offer.

But before we step too far into the unknown, we’ve found that it helps to take a moment and reflect on where we’ve been. You know how the saying goes: those that don’t learn from history are doomed to repeat it. That whole impending doom bit obviously has a negative connotation, but we like to think the axiom applies for both the lows and highs in life. Sure you should avoid making the same mistake twice, but why not have another go at the stuff that worked? In fact, why not try to make it even better this time?

As such, it’s become a Hackaday tradition to rewind the clock and take a look at some of the most noteworthy stories and trends of the previous year, as seen from our rather unique viewpoint in the maker and hacker world. With a little luck, reviewing the lessons of 2025 can help us prosper in 2026 and beyond.

Continue reading “2025: As The Hardware World Turns”

FLOSS Weekly Episode 841: Drupal And AI: The Right Tool For Everything

This week Jonathan and Katherine talk with Jamie Abrahams about Drupal, and how AI just makes sense. No, really. Jamie makes a compelling case that Drupal is a really good tool for building AI workflows. We cover security, personal AI, and more!

Continue reading “FLOSS Weekly Episode 841: Drupal And AI: The Right Tool For Everything”

Why AI Usage May Degrade Human Cognition And Blunt Critical Thinking Skills

Any statement regarding the potential benefits and/or hazards of AI tends to be automatically very divisive and controversial as the world tries to figure out what the technology means to them, and how to make the most money off it in the process. Either meaning Artificial Inference or Artificial Intelligence depending on who you ask, AI has seen itself used mostly as a way to ‘assist’ people. Whether in the form of a chat client to answer casual questions, or to generate articles, images and code, its proponents claim that it’ll make workers more efficient and remove tedium.

In a recent paper published by researchers at Microsoft and Carnegie Mellon University (CMU) the findings from a survey are however that the effect is mostly negative. The general conclusion is that by forcing people to rely on external tools for basic tasks, they become less capable and prepared of doing such things themselves, should the need arise. A related example is provided by Emanuel Maiberg in his commentary on this study when he notes how simple things like memorizing phone numbers and routes within a city are deemed irrelevant, but what if you end up without a working smartphone?

Does so-called generative AI (GAI) turn workers into monkeys who mindlessly regurgitate whatever falls out of the Magic Machine, or is there true potential for removing tedium and increasing productivity?

Continue reading “Why AI Usage May Degrade Human Cognition And Blunt Critical Thinking Skills”

Credit: Daniel Baxter

Mechanical Intelligence And Counterfeit Humanity

It would seem fair to say that the second half of last century up till the present day has been firmly shaped by our relation with technology and that of computers in particular. From the bulking behemoths at universities, to microcomputers at home, to today’s smartphones, smart homes and ever-looming compute cloud, we all have a relationship with computers in some form. One aspect of computers which has increasingly become underappreciated, however, is that the less we see them as physical objects, the more we seem inclined to accept them as humans. This is the point which [Harry R. Lewis] argues in a recent article in Harvard Magazine.

Born in 1947, [Harry R. Lewis] found himself at the forefront of what would become computer science and related disciplines, with some of his students being well-know to the average Hackaday reader, such as [Bill Gates] and [Mark Zuckerberg]. Suffice it to say, he has seen every attempt to ‘humanize’ computers, ranging from ELIZA to today’s ChatGPT. During this time, the line between humans and computers has become blurred, with computer systems becoming increasingly more competent at imitating human interactions even as they vanished into the background of daily life.

These counterfeit ‘humans’ are not capable of learning, of feeling and experiencing the way that humans can, being at most a facsimile of a human for all but that what makes a human, which is often referred to as ‘the human experience’. More and more of us are communicating these days via smartphone and computer screens with little idea or regard for whether we are talking to a real person or not. Ironically, it seems that by anthropomorphizing these counterfeit humans, we risk becoming less human in the process, while also opening the floodgates for blaming AI when the blame lies square with the humans behind it, such as with the recent Air Canada chatbot case. Equally ridiculous, [Lewis] argues, is the notion that we could create a ‘superintelligence’ while training an ‘AI’ on nothing but the data scraped off the internet, as there are many things in life which cannot be understood simply by reading about them.

Ultimately, the argument is made that it is humanistic learning that should be the focus point of artificial intelligence, as only this way we could create AIs that might truly be seen as our equals, and beneficial for the future of all.

Uncovering ChatGPT Usage In Academic Papers Through Excess Vocabulary

Frequencies of PubMed abstracts containing certain words. Black lines show counterfactual extrapolations from 2021–22 to 2023–24. The first six words are affected by ChatGPT; the last three relate to major events that influenced scientific writing and are shown for comparison. (Credit: Kobak et al., 2024)
Frequencies of PubMed abstracts containing certain words. Black lines show counterfactual extrapolations from 2021–22 to 2023–24. The first six words are affected by ChatGPT; the last three relate to major events that influenced scientific writing and are shown for comparison. (Credit: Kobak et al., 2024)

That students these days love to use ChatGPT for assistance with reports and other writing tasks is hardly a secret, but in academics it’s becoming ever more prevalent as well. This raises the question of whether ChatGPT-assisted academic writings can be distinguished somehow. According to [Dmitry Kobak] and colleagues this is the case, with a strong sign of ChatGPT use being the presence of a lot of flowery excess vocabulary in the text. As detailed in their prepublication paper, the frequency of certain style words is a remarkable change in the used vocabulary of the published works examined.

For their study they looked at over 14 million biomedical abstracts from 2010 to 2024 obtained via PubMed. These abstracts were then analyzed for word usage and frequency, which shows both natural increases in word frequency (e.g. from the SARS-CoV-2 pandemic and Ebola outbreak), as well as massive spikes in excess vocabulary that coincide with the public availability of ChatGPT and similar LLM-based tools.

In total 774 unique excess words were annotated. Here ‘excess’ means ‘outside of the norm’, following the pattern of ‘excess mortality’ where mortality during one period noticeably deviates from patterns established during previous periods. In this regard the bump in words like respiratory are logical, but the surge in style words like intricate and notably would seem to be due to LLMs having a penchant for such flowery, overly dramatized language.

The researchers have made the analysis code available for those interested in giving it a try on another corpus. The main author also addressed the question of whether ChatGPT might be influencing people to write more like an LLM. At this point it’s still an open question of whether people would be more inclined to use ChatGPT-like vocabulary or actively seek to avoid sounding like an LLM.