Most AI Content Is Trash, Just Like Everything Else

November 3, 2023

[Max Woolf] has been working in the AI space since 2015, and among other work has created numerous useful open-source tools. He also recently wrote a thoughtful blog post that attempts to put into words his feelings on the state of things in the wake of experiencing a bit of an AI backlash-related burnout. Essentially, people effortlessly creating vast amounts of bad AI content has caused a bigger problem than we may realize.

How so? Well, Sturgeon’s law (summarized as “ninety percent of everything is crud”) applies to AI as much as it does to anything else. Theodore Sturgeon was a science fiction author and critic (and writer of multiple Star Trek episodes) who observed in the 1950s that while Science Fiction — the hot new popular thing at the time — was often derided by critics as being little more than low quality pap, so was everything else. It was true that most Science Fiction was garbage. But most work in other fields was of similarly low quality, and thus Science Fiction was really no different. It’s all trash, except for the parts one likes. Just like anything else.

What makes this observation particularly applicable to the current AI landscape is that, according to [Max], the incredible ease of use makes AI’s “ninety percent crud” very large indeed, and the attached backlash is similarly big. The remaining ten percent of AI that is absolutely fantastic and full of possibilities? It’s practically invisible due to how quickly the industry is moving, the speed with which the big players are vying to control it, and how unfashionable it has become to admit one is using AI tools at all.

[Max] knows the scene better than most. One of his projects is simpleaichat, a tool aimed not just at making it easier for people to integrate AI into projects, but also at piercing the hype around AI to reveal how these tools actually work. Sadly, a general AI backlash has made developing these tools feel rather less rewarding than it once did.

31 thoughts on “Most AI Content Is Trash, Just Like Everything Else”

AZdave says:

November 3, 2023 at 7:30 pm

Of course AI uses the tons of information it was trained on, but I assume that AI is also programmed to learn from the interactions it has with the people using it. For example, ChatGPT often gives the wrong answers to a task it has been given but will acknowledge the right answer when you correct it. I suspect that the corrected answer is used to influence the later responses. I don’t know that for a fact, but it seems reasonable. Given the extreme levels of intentional disinformation in modern society, that seems scarier to me than anything else.

1. MmmDee says:
  
  November 3, 2023 at 8:04 pm
  
  I think in a perfect world, you are right in suggesting (but not assuming) feedback should be used by an AI system to correct an errant response; however, using ChatGPT, I find my correct feedback is not incorporated to correct an output, instead a task will often regenerate a similar response containing the same error (even after the AI acknowledges it provided an incorrect response). I hope other AI systems are much better at recognizing and incorporating correct feedback.
  
  For now, I simply look at AI systems as automated decision trees/flowcharts.
  
  1. CMH62 says:
    
    November 3, 2023 at 8:19 pm
    
    What’s scary too is that most of us will never know what “rules” the programmers have overlaid on the AI system to restrict what info you receive. As an exercise only, sometime ask ChatGPT to give you all possible letter combinations of the letters that you know will spell some word which is not polite to utter in public. It will omit the objectionable combination; it will tell you it has given you all possible combinations; it will misnumber the number of answers so as to avoid having to print the offending combination on screen, etc. etc. Again, that’s only an exercise, but it’s illuminating on the lack of transparency with which these systems can be manipulated.
    
    1. Joseph Eoff says:
      
      November 4, 2023 at 1:04 am
      
      “it will misnumber the number of answers so as to avoid having to print the offending combination on screen”
      
      Nope. It will misnumber the number of answers because chatbots can’t count.
      
  2. Ewald says:
    
    November 4, 2023 at 1:52 am
    
    The user feedback is not incorporated in real time; that would require huge amounts of processing power and make the model vulnerable to misdirection. Instead, the feedback is (probably) used as training data for the next iteration of the model, which will undergo all the necessary steps before being phased into production.
    
2. WereCatf says:
  
  November 3, 2023 at 8:09 pm
  
  “but I assume that AI is also programmed to learn from the interactions it has with the people using it”: It’s not. If it was, the same thing would happen all over again as happened with e.g. Microsoft’s Tay, i.e. people would troll it until it learned all the wrong lessons and it’d become a massively racist nazibot. There’s a reason why the data they’re trained on gets audited first in an effort to weed out the worst kind of data and they don’t just use unfiltered live data for it.
  
  1. AZdave says:
    
    November 3, 2023 at 9:30 pm
    
    OK … but for AI to truly learn it will have to eventually do just that. I think it’s inevitable that it will happen sometime. Maybe programmers will include boundary conditions … i.e., “values” … but maybe they won’t.
    
3. Foxhood says:
  
  November 3, 2023 at 9:32 pm
  
  Not really. If you told it its factual answer was wrong, it is just as likely to apologize for being ‘wrong’. It is a built-in response embed that gets injected the moment the servers detect you refuting it, just like how it will claim it is merely a machine when asked about its intelligence, or refuse to talk about a problematic subject the developers blacklisted. It’s all theatrics.
  
  Probably a good thing, as any form of direct feedback is just begging for the chatbot to get turned into another Tay that runs around shouting “Heil Hitler” at everyone. If you give trolls an opportunity to corrupt it, they will…
  
  The conversations are stored, but they will only become part of the dataset once it has been scrubbed through by humans and even then only the human prompts get fed (to avoid self-contamination).
  
4. None says:
  
  November 4, 2023 at 1:12 am
  
  It does not actually learn at all from corrections. It will keep making the same mistakes over and over again, quite often.
  
  1. The Commenter Formerly Known As Ren says:
    
    November 4, 2023 at 6:02 am
    
    Just like some humans.
    IOW, Artificial Stupidity.
    
  2. AZdave says:
    
    November 4, 2023 at 10:48 am
    
    Yeah, I actually asked ChatGPT if it learned from its human interactions and it said that it did not. It’s not really artificial intelligence until it can learn from its mistakes, of course …. it’s otherwise just an interface to a massive database.
    
    1. Erin says:
      
      November 4, 2023 at 1:29 pm
      
      It’s not even an interface to a database. It has a gigantic matrix of text tokens. It uses its “attention window” of text from both you and it (some portion of the conversation), along with a gigantic lump of weighted matrices from a ridiculously large corpus of text, to generate words that its training indicates are a good response. Nothing is in a database. With plugins it CAN include a database or website query in all that soup, but it does not evaluate, and is incapable of evaluating, whether its response is correct. It will lie brazenly to you just because that’s a high-scoring response. If you call it out, it will mollify you because that’s a high-scoring response.
      
      Your asking it about itself is a fool’s errand. Use real resources instead.
      
    2. No No says:
      
      November 5, 2023 at 5:15 am
      
      Except humans are often likely to be wrong too. Are you learning from your mistakes if they aren’t mistakes and the person correcting you is wrong? How much incorrect information were you taught in schools?
      
      AI is not a meat bag bumbling around whose illusion of general intelligence helps it survive, as humans are, and so our bias as to what intelligence is needs to be examined carefully when we’re developing tools intended to be intelligent.
      
      1. Dude says:
        
        November 7, 2023 at 4:22 am
        
        Most of the time, humans know that they don’t know something and will tell you so, instead of confabulating stuff.
        
Foxhood says:

November 3, 2023 at 9:48 pm

That is honestly the Holy Grail of AI Research.

Sadly, while we have made a convincing “intelligence” via a massive model that can brute-force the English language, plus a whole framework surrounding it to make it seem like a well-behaved bot, we are truthfully no closer to an intelligence capable of self-attending based on ethics and values than we were two decades ago. :/

Do not mistake a breakthrough in a single aspect as signalling advancement of AI research as a whole. The field is rather divided into small slices that each advance independently of each other. E.g. while Generators spiked, the regular Agents we run in robotics/game-environments are still exactly the same.

1. TG says:
  
  November 4, 2023 at 1:16 am
  
  I wonder what an actual intelligence will be like if it develops language long before even basic agency. Gonna be fun to psychoanalyze..
  
Foxhood says:

November 3, 2023 at 9:49 pm

Dangit. Reply got turned into a regular comment. Was meant for AZdave’s response to WereCatf.

Feinfinger (M-x butterfly) says:

November 4, 2023 at 2:18 am

I tried to get results from today’s AI 5 times and its answers always were factually wrong. In today’s alternate facts era, most users sure won’t care and will swallow everything.

Please prove me wrong…

Skip Flem says:

November 4, 2023 at 3:14 am

GS lives!!!
(genuine stupidity)

Nick Sargeant says:

November 4, 2023 at 4:15 am

This reminds me of the design differences between using Mealy and Moore machines in more traditional logic design. Moore machines are more “stable” due to their clocked state rather than reacting instantly to changes of input.

https://en.wikipedia.org/wiki/Mealy_machine#Comparison_of_Mealy_machines_and_Moore_machines

I imagine the situation will be critical where clinical decisions are made using AI tools.

1. SparkyGSX says:
  
  November 4, 2023 at 5:29 am
  
  Clinical decisions should never be made by AI or any other algorithm. Instead, the AI could process medical images (MRI, X-ray, etc.), and mark areas where the specialist should look more carefully.
  
  The current form of AI is very good at convincing people it’s not complete trash, but it fails horribly when asked something non-trivial by an expert in a given field.
  
  ChatGPT doesn’t really understand anything, it can only mash public sources together, to give you an answer you could have found with a traditional search engine, but such that you will never find what the original source was.
  
nospam says:

November 4, 2023 at 7:54 am

AI is just the current fad, the way blockchain was supposed to be the bee’s knees 5 years ago.

SETH says:

November 4, 2023 at 7:58 am

Current gen mainstream AI is all reliant on supercomputers and data centers. Actual non garbage AI output will come about when the machine learning is deployable on user end devices. This way a smart watch can actively train on the user’s real world interactions and then coherent useful outputs can be generated. There is value in this approach as it can be best utilized in improving and assisting in daily routines, and would have widespread use in medical applications.

DALL-E and ChatGPT, however, are these big ugly amalgamations of the contemporary internetscape: prejudice and racism still there, along with biases, hallucinations, and inaccuracies. We fear modern AI because it excels at our lowest, most base qualities. It is not easy to see our reflection so clearly.

1. Anonymous says:
  
  November 4, 2023 at 8:23 am
  
  >We fear modern AI because it excels at our lowest most base qualities.
  I don’t fear it, because I actually enjoy human nature. The world isn’t a textbook, it’s okay to be wrong and say wrong things.
  
EPS10N says:

November 4, 2023 at 10:29 am

I enjoyed reading all the comments. I agree with most of them… Is there any real advance in AI recently? Sure, our computing power is now enough to run these behemoths of algorithms. It’s still just a fancy board of flip switches.
Advances come mostly from applying technologies from the middle of the past century… neural networks, memristors and so on. Without a body that can feel the world it’s in and act upon it, there won’t be any AI agent. Do our devices feel tired when their batteries are low? Do they seek to get charged? Same old Mind-Body Problem.

1. Francois Otis says:
  
  November 5, 2023 at 12:23 am
  
  It will never be human… it’s artificial intelligence… It’s intelligence without the human condition.
  
adobeflashhater again says:

November 4, 2023 at 8:28 pm

let’s see. Short of a, functional, summation I get from things?
Basically we take the latest & greatest super powered search engine (for parsing overall data bases) and then trawl it through all of our “social” media (for daily language usage morphology & societal norms ) and then call it an A.I.?

Meh.
Euphemisms and colloquialisms, trendy new phrases, etc.
The trolls will still be able to play hell on it through that tired “language is ever evolving” catchall/hole that -already- keeps languages mired in misunderstandings.

Bah. I’m too busy hunting for my latest “universal” cord, to go look for that XKCD cartoon about standards.

Monsonite says:

November 5, 2023 at 3:38 am

What happens when a significant percentage of training data has become watered down and factually incorrect, or corrupted for the purposes of disinformation and misdirection?

It would be sensible for all AI generated content to be marked up with metadata stating that it is AI generated and should not be used for further training purposes.

When you start using garbage for input, you can only get even more garbage output. And for us human readers, a BS warning should be clearly visible.

I recently read an article on a well respected, popular magazine web site, where not only had the article been entirely created by ChatGPT, but also lengthy replies to comments posted. The article and replies were misleading, factually incorrect and had AI generated propaganda that could only have been from the oil and gas industry.

It reminds me of when farmers in UK, and elsewhere, in the interests of profit, began to feed their cattle on protein that had been contaminated with brain and spinal cord material from slaughtered cattle. The result was BSE, commonly known as “Mad Cow Disease”, which jumped species to humans, deer and cats.

We should be careful what we wish for, and it doesn’t help that nut-job billionaires are actively telling world leaders that AI will replace all human employment.

Let’s unpack this a bit:

White Collar – possibly

Blue Collar – hasn’t yet happened from automation revolution, unlikely to happen anytime soon from AI.

Agricultural workers – maybe in the West for prairie cereal crops, but not for sub-Saharan subsistence farmers or third world nomadic sheep and goat herds.

This is what happens when billionaires get stuck in their own little worlds, spout their often nonsensical viewpoints and get special access to the ears of politicians.

Before we get over excited about Artificial Intelligence – we still have to do an awful lot more work on Human Stupidity.

1. No No says:
  
  November 5, 2023 at 5:21 am
  
  Can we put BS warnings on human Internet comments too? It’s not like you or I aren’t also running on the garbage in / garbage out principle, and yet people only seem to point out this shortcoming as a problem for AI, rather than the far, far more general problem it is. For instance, your comment on it being AI generated propaganda sounds remarkably specious, as though it couldn’t be your own biases at play.
  
2. Foxhood says:
  
  November 5, 2023 at 4:33 pm
  
  @Monsonite, You’d think we would mark with Meta-data, but uhm… we don’t…
  
  Everybody was in such a rush to get their LLMs and image generators out into the field before competitors and regulators could react that there are no rules or standards, be it internal or from an external party. Some sites add AI metadata or can be recognized as AI generated by a disclaimer, but few actually do it, and nothing stops bad actors from lying about AI having done the work for them. There are also myths, like the idea that a robot “noAI” tag will cause scrapers to ignore the site in question. They don’t. It is genuinely lawless at the present time.
  
  The result is that the datasets are kind of in danger of not being able to get updated anymore, as contamination of the input for new models by the output of old ones becomes increasingly likely as AI is abused more and more.
  
SlyUser says:

January 25, 2024 at 4:27 am

From my experience using the Writesonic AI tool, it has its merits, offering decent performance for generating content, but its drawback lies in its tendency to generate repetitive content when not prompted to explore deeper meanings. It shines with guidance and specific prompts, delivering impressive results. As for accuracy, the tool relies heavily on the user’s reference material. If the reference material is inaccurate, the generated information is likely to follow suit. I’d also like to note that it sometimes finds a specific sentence structure to describe elements and uses that same flow over and over again. Hoping it gets better, but it’s still very handy to have! https://www.slyautomation.com/blog/writesonic-revolutionizing-content-creation-with-ai-powered-efficiency/

Hackaday
