AI’s Existence Is All It Takes To Be Accused Of Being One

New technologies bring with them the threat of change. AI tools are one of the latest such developments. But as is often the case, when technological threats show up, they end up looking awfully human.

Recently, [E. M. Wolkovich] submitted a scientific paper for review that — to her surprise — was declared “obviously” the work of ChatGPT. No part of that was true. Like most people, [E. M. Wolkovich] finds writing a somewhat difficult process. Her paper represents a lot of time and effort. But despite zero evidence, this casual accusation of fraud in a scientific context was just sort of… accepted.

There are several reasons this is concerning. One is that, in principle, the scientific community wouldn’t dream of leveling an accusation of fraud like data manipulation without evidence. But a reviewer had no qualms about casually claiming [Wolkovich]’s writing wasn’t hers, effectively calling her a liar. Worse, at the editorial level, this baseless accusation was accepted and passed along with vague agreement instead of any sort of pushback.

Showing Your Work Isn’t Enough

Interestingly, [Wolkovich] writes everything in plain text using the LaTeX typesetting system, hosted on GitHub, complete with change commits. That means she could easily show her entire change history, from outline to finished manuscript, which should be enough to convince just about anyone that she isn’t a chatbot.

But pondering this raises a very good question: is [Wolkovich] having to prove she isn’t a chatbot a desirable outcome of this situation? We don’t think it is, nor is this an idle question. We’ve seen how even when an artist can present their full workflow to prove an AI didn’t make their art, enough doubt is sown by the accusation to poison the proceedings (not to mention greatly demoralizing the creator in the process).

Better Standards Would Help

[Wolkovich] uses this opportunity to reflect on and share what this situation indicates about useful change. Now that AI tools exist, guidelines that acknowledge them should be created. Explicit standards about when and how AI tools can be used in the writing process, how those tools should be acknowledged if used, and a process to handle accusations of misuse would all be positive changes.

Because as it stands, it’s hard to see [Wolkovich]’s experience as anything other than an illustration of how a scientific community’s submission and review process was corrupted not by undeclared or thoughtless use of AI but by the simple fact that such tools exist. This seems like both a problem that will only get worse with time (right now, it is fairly easy to detect chatbots) and one that will not solve itself.

29 thoughts on “AI’s Existence Is All It Takes To Be Accused Of Being One”

  1. Even the smallest LLMs can make more sense than some humans.

    ollama run tinydolphin "Recently, [E. M. Wolkovich] submitted a scientific paper for review that — to her surprise — was declared “obviously” the work of ChatGPT. No part of that was true. Like most people, [E. M. Wolkovich] finds writing a somewhat difficult process. Her paper represents a lot of time and effort. But despite zero evidence, this casual accusation of fraud in a scientific context was just sort of… accepted."

    This situation seems to be a mix of miscommunication between the parties involved, including E. M. Wolkovich, who was probably not aware that ChatGPT is capable of writing convincing
    academic papers based on her input.

    As a language model like ChatGPT can understand and produce grammatically correct sentences, it's quite possible that ChatGPT's output could be seen as “obviously” the work of E.M.
    Wolkovich, who is not an AI model but a human writer trying to write in a sophisticated manner.

    Either way, this situation highlights how human language understanding can sometimes get misinterpreted or misrepresented by technology, and how important it is for humans to stay
    informed about the capabilities of these models.

  2. I cannot find the draft, would be an interesting read. I keep seeing AI-generated texts in unusual contexts as well. Would be good to read something that sounds AI-written but is human-written to assess my rAIdar

    1. I couldn’t find it either, so I looked over the abstract for her previous paper. I’ve also got some experience with ChatGPT output, and FWIW, I’m on the review board list for the Hackaday Journal thing (which I suspect died years ago but no one has noticed yet).

      IMO, the abstract does not have a ChatGPT vibe to it. ChatGPT tends to be wordy, soft, and indirect, and it leans on the passive voice. I recently asked it to write a thank-you letter for me as a comparison to my own writing, and it was all of that. (I used my own version instead.)

      The abstract has several sentences each with separate thoughts that butt up against one another – it doesn’t “flow” like a well-written paragraph does. Not a problem because the paper has multiple findings and just listing a sentence for each of them is perfectly appropriate for an abstract, but I think ChatGPT would have made the paragraph flow more smoothly.

      I think the right way to approach this is to turn the spotlight on the reviewers who judged the paper as written by AI. Not in a punitive sense, but to find out a) what makes them believe the paper was written by ChatGPT and b) whether any of them has any experience with ChatGPT output, and maybe to have a conversation about proper procedure when ChatGPT is suspected.

      Accusations of fraud are a big deal. Maybe the reviewer should have had the journal ask the researcher for evidence before making the accusation.

  3. I have no experience with scientific paper review, so I may sound naive: what does checking for the possible use of ChatGPT have to do with it? If I am an impressive scientist but I’m terrible at writing, what is the problem if I hire someone to help me convey the scientific message as long as I certify that it is my message?
    More generally, I don’t see, at first glance, any problem with using AI for any work, including novel writing. It’s just a tool that helps the writer elaborate the product they want to deliver. As long as they approve what is being published, to me, it is their work.

    1. Science has a bit of a fraud problem. Lots of faking of data and plagiarism to get papers published due to the associated importance of being a “Published” academic. AI has exacerbated this problem as it can generate convincing fake data more easily and then create fake theory around it that makes it sound logical. This has proven a massive headache as checking regular fraudulent papers is already bad enough as is, with WAY too many slipping through the cracks. And even when used in a benign capacity it introduces the risk of false information getting inserted (LLMs can’t handle novel information, which papers are full of). So just tossing away papers where GPT involvement is suspected is considered the safer option at present…

      Seriously. The (peer) reviewing part of science is an utter mess!!

      On novels. I view it much like art. If you want to use it, fine. Just don’t expect me to pay as if you spent hours on what it did for you within seconds.

        1. It’s especially bad in China and India, where professional advancement is tied to the amount you’ve published. That’s created a whole industry of ‘journals’ that will publish anything for a fee.

          This is tied to a badly-misused metric that’s supposed to represent the quality of someone’s work, but is ridiculously easy to game: the H-index. It’s a score that combines the number of papers you’ve written, and the number of times those papers are cited by someone else. The trick is that the numbers have to match: your score is the largest number h such that h of your papers have each been cited at least h times. If you’ve written 5 papers that have been cited 5 times each, your score is 5. It doesn’t matter if you’ve also written one paper that’s been cited a thousand times.
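
For what it's worth, the matching rule described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming the usual definition of the H-index (the largest h such that h papers have at least h citations each). The function name `h_index` and the sample citation counts are made up for the example.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            # The top `rank` papers all have at least `rank` citations.
            h = rank
        else:
            break
    return h


# Hypothetical citation counts, mirroring the example in the comment.
five_papers = [5, 5, 5, 5, 5]
with_one_hit = five_papers + [1000]

print(h_index(five_papers))   # 5
print(h_index(with_one_hit))  # 5: the thousand-citation paper changes nothing
```

Because the score only rises when the paper count and the per-paper citation floor rise together, a pile of mutually citing low-quality papers can move it more than one landmark paper does.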

          Faking a high H-index is basically the same as building link farms to game search engines, and the ‘journals’ play the role of making it all look official. The H-index also hides all the actual publications from bureaucrats and HR managers who want a convenient number and wouldn’t be capable of evaluating real papers for themselves, so it’s a perfect storm of garbage.

          The problem is that those papers get mixed into the results legitimate scientists find when they do their research. Instead of dissecting the original paper and seeing that it’s crap, people can end up using an idea with no real support because they don’t bother to try and reproduce the results for themselves.

          And that leads us to the work of Dr. John Ioannidis: a meticulous and highly respected scientist whose goal is to encourage good science by pointing out bad habits and bad practices. He’s the one who tried to replicate the 49 most highly-cited medical papers (basically the gold standard of properly done research) and was only able to reproduce the results of 20, with another 11 producing results that didn’t challenge the original paper. The remaining 18 were shown to be false or to have overstated their results.

          Down in the realm of everyday scientific publishing, the rough estimate is that 90% of published studies don’t report anything new, don’t have a large enough sample population to support their conclusions, have flaws in their methodology that make their results invalid, cite sources in ways that contradict the conclusions of the source, or have outright manipulated their data to get the result the authors wanted.

          The truth of the situation is that ‘scientific publication’ has always been a glorified version of the fallacy of authority (not “sometimes the experts are wrong”, which is true far more often than people like to think, but “if you can’t argue from the evidence, you don’t know if you’re citing the authorities correctly”). Connecting academic and professional success to ‘publication’, tying funding to ‘originality’, and letting for-profit companies be the gatekeepers of access to published work has produced a system that can barely work for people who approach it with the best knowledge and the highest ethics.

  4. eh. Don’t feel like you’re the only one singled out.
    I get captcha challenged by gmail in a bit more than 1 out of 3 attempts to check email.
    If the weather so much as causes a blink in the internet service, I get blocked and have to do the phone code shit all over again.

    eBay throws a captcha for roughly 50% of my logins. They will often break in at the payment page with another one.
    Great, now I get to go back and reload every-damn-thing into my cart again.

    And somehow, I seem to have gotten onto the cloudflare shitlist a few years ago.
    No relief in sight. Still run into that fucking
    “just a moment/you look like a robot” screen at undecipherable intervals.
    All of this on multiple devices running different operating systems & browsers.
    Browser history set to clear on closing.

    1. “Browser history set to clear on closing.”

      That’s the reason you keep getting the CAPTCHA.

      You do a CAPTCHA, and it notes that you are human. That’s stored locally in a cookie in your browser.

      With your browser set to clear on closing, the cookie that says “human” gets deleted and you get to keep doing CAPTCHAs.

      1. Which is one of the problems with the way cookies are used: often I get an inscrutable cookie that holds all the information a website wants to store about me, which means I can’t keep the part that says I am human and leave everything else in the bin for my next visit.

      2. Joseph, I figured that at first. Tried leaving the browser open for several days (music streaming uninterrupted) and still get hit with captchas.
        No retries were needed at passwords. Ticked the show password option to see that the correct things were entered before submitting.
        During all of this, if the browser is left open, eBay will open and greet me by account name, still showing shopping history, yet they throw captchas at random points of interaction.
        I’ve even left the browser as a fresh install, default settings, and still have the same troubles. Gone so far as to zero-write the drive, kill power to the PC etc, reboot and reinstall the OS a couple of times.
        Kinda lost for ideas at this point.

        1. Dear Sir, please unplug yourself from the power socket in the wall. Being connected to the ground gives you up.
          Also reset the router and router password and make half of your neighbours disappear. You know which ones. Then you can enjoy our solar panels and wind generators while relaxing on the freshly dug ground in the backyard.

  5. I think this speaks to wider problems with the scientific review process.

    Reviewers are often unpaid, and probably many of them just want to churn through as many papers, esp. rejections as they can. They are after all human, and as prone to laziness as the rest of us.

    SCIgen predates AI as we talk about it today, but it has gotten some totally stupid papers past editorial staff on more than one occasion. Either way you look at this equation, the review process lacks the kind of scientific rigour we associate with it.

    We still see the effects of papers which should have been rejected early on in the form of, among others, the MMR vaccine scare created by Andrew Wankfield.

    I often wonder if there’s a sensible way to ease validation of scientific claims, perhaps with AI tools, embracing them as models that are good at language rather than fearing them as tools for plagiarism and misinformation.

    1. “Does modern AI pass the Turing test?”

      I love that you’re asking this question when the article literally includes:

      “right now, it is fairly easy to detect chatbots”

      Regardless of what people *think* a “Turing test” is, by Turing’s original idea, if you can detect a chatbot fairly easily, it doesn’t pass the test.

  6. “This seems like both a problem that will only get worse with time (right now, it is fairly easy to detect chatbots)”

    I think this is naive “these things always improve”-type thinking. The reason why chatbots are easy to detect is that their training set is fundamentally limited. It’s easy to say “but it could be everything!” – except I don’t think people realize exactly *how much larger* the equivalent “training set” is for humans. Just from a *visual* standpoint your brain (including your visual cortex) has been flooded with something close to a *zettabit* of data by the time you’re 18. And that’s just visual, and *one person*!

    So hearing people talk about LLMs being trained on “thousands of gigabytes” as if that’s a large number is just hilarious. And that absurd amount of data is *unique* for each person.

    And the thing is, there’s no incentive for LLM AI-type companies to *make* their stuff undetectable. They want them to be repeatable, usable tools. No one wants a hammer that sometimes becomes a screwdriver just so you can’t detect that it’s a hammer.

  7. I’m also guilty of accusing some technical support departments I need to contact as a software engineer. So many cookie cutter answers that don’t work, do the opposite, or just repeat the same thing in response to follow-up questions. It’s really annoying that you never get to talk to a real human being.
    ChatGPT actually did a better job of pointing me in the right direction, though its answer was completely unrelated to my problems lol! It’s like talking to someone about a problem and up pops an answer while doing it. You thank that person but he/she insists they didn’t do anything ;)

    1. “I’m also guilty of accusing some technical support departments I need to contact as a software engineer.”

      Oh, God, I just assumed they’re incompetent and script-following! I never even stopped to think about that…

      Sadly, with what I work with, ChatGPT is completely unhelpful because the cookie-cutter (incorrect) responses are so prevalent it has absolutely zero clue.

  8. That’s why people should mention sources/references in their works.

    Anyone who writes a dissertation/doctoral thesis should properly document them.

    It’s something you’d learn in elementary school. Sigh.

    1. Writing citations is easy (though I’ll always have a seething hatred for any line in a bibliography that says “unpublished correspondence”). The hard part is reading them.

      Serious test: go find a dozen scientific papers from 30 years ago, skip down to the citations, and try to find a readable copy of each one. You’ll find yourself running into at least half a dozen different paywalls that only talk to dues-paying members. You’ll also hit a significant number of books that are no longer in print and magazines that no longer exist.

      Anecdotally, since I can’t find the original source (though it was probably related to coalition-s), I remember seeing an article by a college professor who was working with a grad student from somewhere else. The professor had given the student a list of reference documents and the student couldn’t get them. The professor had the benefit of the university’s subscription to many different journals and archives but the student didn’t. The professor started adding up the fees and guessed that the student would have to pay something like $1k per month for access to what the professor considered basic information in the field.

      1. These are all valid points, indeed.
        I just meant to say that an AI is less likely to provide references.

        Especially those references from obsolete media (which often we can get access to by searching national libraries the old way; by making a phone call and ordering photocopies via mail or telefax).

        A real human being might, however.
        This alone is an indication that there’s some human work involved in the writing process.

        Even if it’s merely proofreading and providing sources of information.

        Because this means that the human writer has at least overseen the work and made corrections.

        Like an elementary schooler who did the homework together with their parents.
        Even if there’s involvement that’s not entirely their own, the final processing was still being done by the pupil, the human writer.

        And that’s what matters most, I think.
        Because, at the heart, it’s not about coming up with the source information all on our own, but to draw conclusions from it and make new discoveries.
