New Linux Kernel Rules Put The Onus On Humans For AI Tool Usage

It’s fair to say that the topic of so-called ‘AI coding assistants’ is somewhat controversial. With arguments against them ranging from code quality to copyright issues, there are many valid reasons to be at least hesitant about accepting their output in a project, especially one as massive as the Linux kernel. With a recent update to the Linux kernel documentation, the use of these tools has now been formalized.

The upshot for such Large Language Model (LLM) tools is that any commit containing generated code has to be signed off by a human developer, and that human ultimately bears responsibility for the code’s quality as well as for any issues it may cause, including legal ones. The use of AI tools also has to be declared with the Assisted-by: tag in contributions so that their use can be tracked.
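For illustration, such a tag sits in the commit message as a trailer line, alongside the usual Signed-off-by line; a minimal sketch, where the tool name, author, and commit details are all hypothetical:

```
fix: check for NULL before dereferencing the buffer

Avoid a NULL pointer dereference when the allocation fails.

Assisted-by: ExampleCodeAssistant (hypothetical tool name)
Signed-off-by: Jane Developer <jane@example.com>
```

The Signed-off-by line is the existing Developer’s Certificate of Origin mechanism; the new rules extend that model by making the human signer answerable for the tool-assisted portions as well.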

When it comes to other open source projects the approach varies, with NetBSD having banished anything tainted by ‘AI’, cURL shuttering its bug bounty program due to AI code slop, and Mesa’s developers demanding that you understand generated code which you submit, following a tragic slop-cident.

Meanwhile there are also rising concerns that these LLM-based tools may be killing open source through ‘vibe-coding’, along with legal concerns whether LLM-generated code respects the original license of the code that was ingested into the training model. Clearly we haven’t seen the end of these issues yet.

14 thoughts on “New Linux Kernel Rules Put The Onus On Humans For AI Tool Usage”

  1. It’s never about the tools, it’s how you use them.

    “Slop” is not generated by AI itself, it is triggered and planted by human. If motives are wrong, humans do not need AI to generate “slop”.

    1. It would be hard to argue against the fact that LLMs enable humans to generate far more slop than they could without them. The average human developer doesn’t write more than 1,000 lines of code a day; with LLMs, you can trivially generate 50,000 in one shot. Meanwhile, the average experienced engineer can only reliably review about 500 lines of code in a change, depending on its complexity. This makes slop inevitable, and is why I call these tools “slop machines”.

      Also, a lot of recent LLM workflows do not actually keep a human in the loop before slop is released onto the world. People can give an AI agent instructions and permission to “fix all bugs and add all features I need in open source projects that I run into”. True, that is a human decision, but no human is tracking just how much slop they have released. There are hundreds of bots out there right now trying to contribute to open source projects without oversight.

      Accountability matters. With things like this, though, a fly-by-nighter has nothing to lose, and like any good gambler they might figure they have everything to gain. It’s someone else’s project, after all.

      Responsible use of these technologies does exist, but the current trends in AI software development include it less and less.

      1. Maybe it would lead to contributors being correctly identified, something like Google’s plan to identify developers through government-issued IDs.

        Not that it is something nice (or even good), but if the fly-by-nighters cause this much trouble, I can see many projects moving to something like that. Maybe some central scheme, coordinated by the FSF or another organization.

      2. There are hundreds of bots out there right now trying to contribute to open source projects without oversight right now.

        It’s already game over. AI content farmers are generating millions of websites full of slop to climb search rankings and capture advertising revenue from Google etc., who claim they don’t support automatically generated content… but they really don’t care who serves the ads, where, or who (if anyone) actually watches them.

        This is the major driver behind the recent zombie internet phenomenon, where you just can’t find anything anymore past the AI generated slop.

        https://www.technologyreview.com/2023/06/26/1075504/junk-websites-filled-with-ai-generated-text-are-pulling-in-money-from-programmatic-ads/

        1. One can ask the same thing: what drives people to spam open source projects with slop? If you eliminate the obvious, like grabbing cash bounties, or the occasional fool who thinks they’re a programming genius because of LLM-induced psychosis, what’s the point?

          It’s reputation farming. People are pretending to contribute to projects so they can write it on their CVs and scam their way into better-paying jobs, work visas, or grants from some institute etc. It’s basically the modern version of a fake diploma from a non-existent university.

          1. It’s all of those things but also others. Some people are just tickled that their chat bots can almost do this. So they just do it for fun. Again, what do they care?

        2. “from Google etc. who claim they don’t support automatically generated content…”
          What? Google encourages it and supplies more and more tools to help people put AI on YT and such.
          And in interviews they are enthusiastic about AI content.

    2. It would never be about the tools; except that the tools you have tend to change how you use them.

      Obviously hammer possession doesn’t literally render you incapable of properly operating screws; but the tools you have and are familiar with have an exceptionally strong tendency to bleed into your ways of thinking about and solving problems. Sometimes this can be a virtue, sometimes not so much.

      Especially, in practice, when the point of this set of standards is substantially to keep a fair number of people working on the same project from stepping on one another’s toes too much.

      If people literally can’t tell that your tools have changed, you probably aren’t upsetting anyone (unless you are generating submarine risk by slapping in chunks of unauthorized copyrighted code or the like). But if MCP Mike starts vomiting forth 50,000-line pull requests, there’s going to be significant tension between his theory that he’s ‘helping’ and other maintainers’ displeasure that it now takes effectively zero time to dump review work on others; which is where having a standard to point to, rather than just yelling until someone leaves, comes in.

      Anyone who is beating the system by perfectly emulating someone who is participating in the system is…probably not actually ‘beating’ the system from the maintainer’s perspective; that’s just behaving.

      1. Obviously hammer possession doesn’t literally render you incapable of properly operating screws

        It’s not the fact that you now own a hammer, but the fact that you never owned a screwdriver to begin with.

        Lots of people have now gained the proverbial hammer as the only tool in their box, and they’re going around hammering stuff with it because it kinda-sorta works…

    3. Slop has contextual meaning, like the word spam: it implies unwanted generated content. It isn’t an assessment of the quality of a particular response, but rather an attitude towards such responses in general. It is important to subvert the main evangelical AI narrative because of these companies’ dishonesty and lack of transparency in the face of their claims of life/labor destabilization.

  2. It’s quite unclear what counts as “using AI tools” for a patch.

    For example:

    I search some API info online and end up reading it on one of those AI-generated SEO trap sites.
    I generate a basic example with AI and then heavily build upon it with my own code.
    I fully generate the code and just briefly review it.

    The “Assisted-by” header does not make any distinction between these.

    1. That’s by design, because many of the sloppers love to “creatively interpret” rules and definitions.

      So here’s a hard rule that states if “AI” was involved at all, disclose it.

  3. We are entering the era of slop generated from slop. Garbage in truly becomes garbage out. Slop content is being generated at a scale impossible for humans to counter with fact-checked information.

    I recently saw a lip-synced video on YT generated from video and audio scraped from historical lectures by Richard Feynman. The new AI-generated transcript used in the video had major issues that would be spotted by even a half-intelligent human reading it before rendering the slop video: it claimed that the gravity on Mars is 38 g instead of 38% of Earth’s gravity (0.38 g). But the goal is to make as much money as fast as possible by fooling people into viewing, not accuracy of information. These slop hallucinations will be scraped in turn… and logical thinking will eventually be suppressed as facts and delusions become harder for younger people to differentiate.

  4. I think lack of accountability for commercial software development is the root issue. AI just magnifies the problem.
    In any other engineering field, gross negligence has consequences.
