Large language models (LLMs) are wholly dependent on the quality of the data they are trained on. While suggestions that people eat rocks are funny to you and me, in the case of LLMs intended to assist medical professionals, false claims or statements produced by such a model can have dire consequences, ranging from incorrect diagnoses to much worse. A recent study published in Nature Medicine by [Daniel Alexander Alber] et al. demonstrates just how easily this kind of data poisoning can occur.
According to their findings, only 0.001% of training tokens have to be replaced with medical misinformation in order to create models that are likely to produce medically erroneous statements. Most concerning is that such a corrupted model isn’t readily detected using standard medical LLM benchmarks. Filters for erroneous content exist, but these tend to be limited in scope due to the overhead they add. Post-training adjustments can be made, as can the addition of retrieval-augmented generation (RAG), but none of this helps against the confident bull excrement that results from the corrupted training data.
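To get a feel for what a 0.001% poisoning rate means in absolute terms, here is a rough back-of-the-envelope sketch in Python. The corpus sizes are assumptions chosen for illustration, not figures from the study.

```python
# Back-of-the-envelope sketch: how many tokens does a 0.001% poisoning
# rate translate to? The corpus sizes below are hypothetical examples.

POISON_RATE = 0.001 / 100  # 0.001% expressed as a fraction

for corpus_tokens in (30e9, 300e9, 1e12):  # assumed training corpus sizes
    poisoned = corpus_tokens * POISON_RATE
    print(f"{corpus_tokens:>15,.0f} training tokens -> ~{poisoned:,.0f} poisoned tokens")
```

Even for a trillion-token corpus, that works out to on the order of ten million poisoned tokens, a quantity that can hide in a handful of bogus documents scattered through a web-scale scrape.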
The mitigation approach that the researchers developed cross-references LLM output against biomedical knowledge graphs, relegating the LLM mostly to generating natural language. In this approach, LLM outputs are matched against the graphs, and any LLM ‘fact’ that cannot be verified is flagged as potential misinformation. In a test with 1,000 random passages, the approach detected issues with a claimed effectiveness of 91.9%.
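As a rough illustration of the verification idea, here is a minimal Python sketch that checks extracted (subject, relation, object) claims against a toy knowledge graph. The graph contents and the extract_claims() helper are hypothetical stand-ins, not the researchers’ actual pipeline.

```python
# Minimal sketch of knowledge-graph verification: extract factual claims
# from LLM output and flag any that the graph cannot confirm.

# Toy biomedical knowledge graph as a set of (subject, relation, object) triples.
KNOWLEDGE_GRAPH = {
    ("metformin", "treats", "type 2 diabetes"),
    ("warfarin", "interacts_with", "aspirin"),
}

def extract_claims(llm_output: str) -> list[tuple[str, str, str]]:
    """Placeholder: a real system would use relation extraction / NER here."""
    # Hard-coded claims for illustration only.
    return [
        ("metformin", "treats", "type 2 diabetes"),
        ("metformin", "treats", "hypertension"),
    ]

def verify(llm_output: str) -> list[dict]:
    """Mark each extracted claim as verified or as potential misinformation."""
    results = []
    for claim in extract_claims(llm_output):
        status = "verified" if claim in KNOWLEDGE_GRAPH else "potential misinformation"
        results.append({"claim": claim, "status": status})
    return results

for result in verify("Metformin treats type 2 diabetes and hypertension."):
    print(result["status"], "-", result["claim"])
```

The point of the design is that the graph, not the LLM, is the source of truth: the model only supplies the natural-language phrasing, and anything it asserts that the graph cannot back up gets flagged for review.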
Naturally, this does not guarantee that misinformation never makes it past these knowledge graphs, and it largely leaves the original problem with LLMs in place, namely that their outputs can never be fully trusted. The study also makes abundantly clear how easy it is to corrupt an LLM via its training data, and underlines the broader problem that AI makes mistakes we don’t expect.
A really good reason why LLMs should not be used in any application where they could put lives in danger. Also, the preponderance of “confident bull excrement” AND the simultaneous preponderance of equally confident bull excrement saying the exact opposite in the politicosphere is probably creating schizophrenic AIs. Don’t trust anything without a face. Or with the wrong number of fingers.
Humans are also vulnerable to this.
But we can get useful work out of humans even if they are wrong about stuff.
LLMs’ greatest contribution thus far is to sow mistrust of their outputs by hallucinating or miscorrelating information. Natural language is not merely textual; LLMs use a logic that is correct within its own black box but has no useful meaning in actual reality. The reason an LLM can synthesize computer code better than it can make a medical diagnosis is that programming languages are far less expressive and far more specific in what they communicate.
“only 0.001% of training tokens have to be replaced”
good luck sneaking 100 bibles’ worth of poison into something like GPT-4 and having an effect…