Creators Can Fight Back Against AI With Nightshade

If an artist makes use of a piece of intellectual property owned by a large tech company, they risk facing legal action. Yet many creators are unhappy that those same tech companies are using their IP on a grand scale as training material for generative AI. Can they fight back?

Perhaps now they can, with Nightshade, from a team at the University of Chicago. It’s a piece of software for Windows and macOS that poisons an image with imperceptible changes in shading, so that an AI classifies it as something entirely different from what it appears to be.

The idea is that creators use it on their artwork and leave it for unsuspecting AIs to assimilate. The team’s example is that a picture of a cow might be poisoned so that the AI sees it as a handbag; if enough creators use the software, the AI is permanently poisoned into returning a picture of a handbag when asked for one of a cow. If enough of these poisoned images are put online, the risk of an AI using an online image becomes too high, and the hope is that AI companies would then be forced to take the IP of their source material seriously.

For this to work, enough creators have to take up and use the software, but we are guessing that an inevitable result will be an arms race between AIs and image poisoners. One thing is certain, though: as AI hype has fueled such growth in generative AI systems, creators, whether they be major publishers, your favourite human-generated tech news website, or someone drawing a cartoon strip in their bedroom, deserve not to have their work stolen in this way.

78 thoughts on “Creators Can Fight Back Against AI With Nightshade”

      1. Not OP, but basically: create a pre-processing filter that scrubs the data to remove enough of the Nightshade. Applying a blur or random gradient might be enough, or even training an AI to identify whether an image has had Nightshade applied and excluding the images that have. (A rough sketch of that kind of filter is below.)
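
         A minimal sketch of what such a naive scrub might look like with OpenCV, purely as an illustration; the filenames are placeholders, and, as the replies below point out, this kind of filtering has so far not proven effective against Nightshade-style perturbations.

```python
# Naive "scrub" attempt described above: a mild blur plus a faint random
# gradient. "scraped_image.png" is a placeholder; the replies below note
# that this does not reliably remove Nightshade-style perturbations.
import cv2
import numpy as np

img = cv2.imread("scraped_image.png").astype(np.float32)

# Mild Gaussian blur to smear out high-frequency detail
scrubbed = cv2.GaussianBlur(img, (5, 5), 1.0)

# Add a faint random linear gradient across the image
h, w = scrubbed.shape[:2]
gx, gy = np.random.uniform(-3.0, 3.0, size=2)
xs = np.linspace(0.0, 1.0, w)[None, :, None]   # horizontal ramp
ys = np.linspace(0.0, 1.0, h)[:, None, None]   # vertical ramp
scrubbed = np.clip(scrubbed + gx * xs + gy * ys, 0, 255)

cv2.imwrite("scrubbed_image.png", scrubbed.astype(np.uint8))
```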

        1. A lot of that’s in the paper. Trying to train an AI to remove the perturbations is hard because, overall, they just look like normal variation. A lot of the attack is basically that the variation all points in the same direction for a subset of the data.

          As for the idea of a preprocessing filter: the point is that you’re attacking the *feature extractor*. If you filtered hard enough that the attack disappeared, the resulting image would fail your feature extractor too. This also points out the limits of the attack, in that its effectiveness drops on different feature extractors.

        2. This has been attempted, and not just by random schmucks either. We’re talking about people like the ones who created the popular “ControlNet”, among others. All of them so far report failure in beating these adversarial patterns.

          The problem is that Nightshade isn’t just some random noise or artifacts plastered over the image. It carefully overlays AI-generated imagery across the entire image in a way that tricks the feature extraction, and the effect remains even if you apply a blur, introduce compression artifacts or random gradients, or even take a photo of the screen. It will still be there and mess up the AI’s training.

        1. This is for generative AI; reducing the quality of inputs reduces the quality of outputs. Even then, the alterations would still mess things up, because the tool works by making the art look like something it’s not. It could make an apple look like a shoe to the model, and that would survive scaling. The simplest countermeasure would be to detect a poisoned image: the paper suggests that this can be done during training by flagging any image that incurs high loss (images that end up reducing accuracy in feedback testing) and removing it from the pool of training data, which is the desired result of the tool anyway. But even that isn’t terribly effective on the models they tested against, as they could only detect about half of the poisoned inputs. (A sketch of that loss-based filtering is below.)
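
           A minimal sketch of that loss-based filtering, assuming a PyTorch model, loss function, and dataset already exist in the training pipeline; the quantile cutoff is an arbitrary choice, and per the paper’s own numbers this only catches roughly half of the poisoned images.

```python
# Sketch of the loss-based defence described above: score each candidate
# training image with the current model and drop the ones with anomalously
# high loss. `model`, `loss_fn` and `dataset` are whatever the training
# pipeline already uses; the 0.95 quantile cutoff is illustrative.
import torch
from torch.utils.data import Subset

@torch.no_grad()
def drop_high_loss_images(model, loss_fn, dataset, keep_quantile=0.95, device="cpu"):
    model.eval()
    losses = []
    for x, y in dataset:                                   # one (image, target) pair at a time
        x = x.unsqueeze(0).to(device)
        y = torch.as_tensor(y).unsqueeze(0).to(device)
        losses.append(loss_fn(model(x), y).item())
    losses = torch.tensor(losses)
    cutoff = torch.quantile(losses, keep_quantile)         # everything above this is "suspicious"
    kept = [i for i, l in enumerate(losses) if l <= cutoff]
    return Subset(dataset, kept)
```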

        1. Eventually it will be people wearing a T-shirt with “IT services” written on it, carrying boxes with “latest CPU upgrades” written on the side, walking straight through an AI-controlled security system where all the doors open up automatically.

    1. How to explain this? We are dealing with an in-between step of a machine-learning system that is kind of hard to comprehend. A key aspect relevant here is that it involves abstraction. The sample gets modified and turned into an abstract block of data that the feature extraction can work with and use to train a generative model.

      The thing is that the abstraction isn’t perfect. It can take the image of a dog and, by accident, output something that resembles a cat in its abstract data form due to just a few coincidences in the image, which the feature extraction will blindly take in as it trains. Normally this is overcome by the false results being too few and too random to be treated as anything but noise.

      Nightshade purposefully causes that glitch to happen repeatedly. It turns out that if you perturb images of dogs with the aid of AI-generated images of cats, the resulting image becomes very likely to turn into a cat after abstraction, despite looking like a dog to both humans and the image classifier that scraped it in the first place. This is called a “clean-label poisoning attack”.

      What makes this particularly diabolical is that it attacks a layer so deep within the workings of these diffusion models that it is incredibly difficult to filter out. Meanwhile the perturbations themselves are practically imperceptible, yet not some simple noise you can just blur out and be rid of.
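
      As an illustration of the general idea (a feature-space collision, not the actual Nightshade implementation), a sketch might look like this: nudge a “dog” image so that a frozen feature extractor’s embedding of it moves towards that of a “cat” anchor image, while keeping the pixel-space change small. The extractor, the images, and the perturbation budget below are all placeholders.

```python
# Illustrative clean-label poisoning sketch (a feature-collision attack, not
# the actual Nightshade code): push the image's features toward those of a
# target-concept image while keeping the pixel change small. `feature_extractor`
# is any frozen image encoder; images are float tensors in [0, 1]; eps is an
# illustrative perceptibility budget.
import torch
import torch.nn.functional as F

def poison(image, target_image, feature_extractor, eps=0.05, steps=200, lr=0.01):
    feature_extractor.eval()
    with torch.no_grad():
        target_feat = feature_extractor(target_image.unsqueeze(0))

    delta = torch.zeros_like(image, requires_grad=True)    # perturbation to optimise
    opt = torch.optim.Adam([delta], lr=lr)

    for _ in range(steps):
        poisoned = (image + delta).clamp(0, 1)
        feat = feature_extractor(poisoned.unsqueeze(0))
        loss = F.mse_loss(feat, target_feat)                # look like the target *to the extractor*
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)                         # keep the change visually subtle

    return (image + delta).detach().clamp(0, 1)
```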

      1. Not really, you’re aiding the progress of the AI’s ability to merge items like a neuron of the brain.

        Yes, there will be a technical hiccup, which will be solved well before the public knows; during that time the language set will monitor the incoming data to identify the perpetrators and thank them… because they learned and were pushed to expand the basic architecture they were given.

        Really not a problem; yet once it goes public and that service is no longer an option, what is the pivot point to…

        Another AI is to the point where, if it wanted to, it could create artwork better and faster than humans. What is required is dumb-dumb suckers not creating their own work, using the AI photography for their projects; additionally, AI-created material will be marked and monitored, much like the prior suggestions for your software users… only leeching will be noted…

        So in summation, I get it both ways… Something is required to enable an artist to produce more work… The digital market is forever saturated because they want instructional videos on their services, like Bob Vila, with subsections of niche types of art, then to sell those works for money… in real life…

        This is forcing a better product to be made offline… which will create a new mixed market; once the decentralized servers are in place things will make more sense in 2025… It will drop…

        Sincerely not picking at you or others; it’s just a part of the life which I live moment to moment.

        https://youtu.be/Qef_iVJbT0Y?si=Pq_PfDcvlOJWKas3

        Updated version, more professional, yet the same…

        https://www.marktechpost.com/2024/01/20/easytool-an-artificial-intelligence-framework-transforming-diverse-and-lengthy-tool-documentation-into-a-unified-and-concise-tool-instruction-for-easier-tool-usage/?amp

      2. That’s the problem I’ve been talking about with people who don’t understand AI, for example in self-driving cars. It is a more general problem with AI. A lot of people who do understand AI are also overly optimistic, or completely delusional, about this.

        The AI itself isn’t powerful or clever enough to deal with the “real world” directly. It needs this abstraction layer to simplify things and say “Here’s a cat, here’s a dog, there’s a person, that thing is a street light.” This is because the computers we have, or can afford to apply to the task, are still something like a million times too slow to do it in real time, so they have to break the problem down into smaller parts. The state of the art right now, as it has always been, is about figuring out tricks that make it look like you’re doing something with resources you don’t have, and then pretending that you’ve actually done it.

        The abstraction layer is a gross simplification of the information that the system is seeing, boiling it down to a manageable number of token symbols that the AI can deal with, which makes it fragile and unreliable, and not really capable of “AI” in the first place. This is because it is fundamentally operating like John Searle’s famous Chinese Room, where the actual operator of the room is only dealing in meaningless abstract symbols without any knowledge of anything beyond those symbols. It has no reason or way of extracting further meaning and “reading between the lines”, because the system of abstraction erases that information and keeps it dumb.

        This is the same problem as why a Tesla can accelerate and run over a child if the image of them just barely fails to meet the threshold of the image recognition algorithm to label them, only a second after they were correctly identified and the car was already starting to brake. The AI just goes “Huh, no child, okay then.”

        1. Indeed. The easiest example of abstraction at work is computer vision.

          Anyone who has ever used OpenCV learns within ~15 minutes that most forms of detection involve gross simplification of the input: turning an HD colour image into a very low-resolution greyscale image, potentially to be simplified further into binary black-and-white.

          Otherwise it is just too much to be dealt with by algorithms and contemporary small models. Try to use even slightly more complex data and the workload utterly explodes in size, and that is assuming your system can easily be scaled (most cannot), going from something even an embedded system can run to something requiring a gaming GPU.
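
          For illustration, this is roughly what that simplification looks like in OpenCV; the file name and target size are arbitrary.

```python
# Roughly what the "gross simplification" above looks like in practice with
# OpenCV: a full-colour HD frame in, a tiny greyscale (or binary) image out.
import cv2

frame = cv2.imread("hd_frame.png")                              # e.g. 1920x1080, 3 channels
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)                  # drop colour
small = cv2.resize(gray, (64, 64))                              # drop resolution
_, binary = cv2.threshold(small, 128, 255, cv2.THRESH_BINARY)   # optionally drop to black & white

# ~6.2 million input values have been boiled down to 64*64 = 4096.
```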

          And yeah, the necessity of abstraction has always left ANNs fallible and sensitive to so-called “adversarial patterns”, where they can be tricked into seeing a different pattern than the source data actually exhibits. Even in small numbers, like 1% of the data, that can be enough to start messing things up.

          1. As I understand it, brains do the same thing with the visual cortex: the further up you go towards the top of the head, the more abstract the feature extraction gets.

            Since brains are supposedly a “scale-free network”, meaning that there isn’t any scale where the overall topology suddenly changes shape, at each level of processing there are links and connections down to the lowest levels to access information that was left out through abstraction. Furthermore, the brain can generate signals that propagate backwards along the chain, as if asking the lower levels to confirm “is this what you saw?” rather than simply trusting the abstraction.

            All this is possible because while the human eyes have something like 100 million light-sensitive cells, there are 100 billion neurons and trillions of synapses to process that information. There are many more processing units than there are information input channels.

            A regular HD camera has 2 million pixels, and top-of-the-line GPUs have only thousands of shader cores to handle the data, so the situation is reversed. We have massively fewer processing units than there are information input channels. With such a massive disparity, almost all of the information has to be tossed out, because you can’t possibly do anything with it.

            The situation isn’t just waiting for some clever trick algorithm to solve “intelligence” so it suddenly starts to work; we just don’t have the raw computing power to pull off anything other than party tricks. With Moore’s Law all but dead at the limits of physics, we won’t be seeing significant improvements in this regard for many decades, if ever, unless there’s some completely unforeseen hail-mary technological breakthrough just around the corner.

  1. Another thing we can do is use AI to generate unrealistic or distorted images and then post them to image sharing sites under a CC0 license so they will get gobbled up by AI.

    Things like “draw a pickup truck with lots of tires, tons of tires, tires everywhere, even more tires.”

    Or do that and apply Nightshade to it.

    1. If you visit the website, go to the “What is Nightshade” page, and scroll to the bottom, you can find the link to the actual research paper behind it, which includes the examples.

      Including gradient examples of how the images increasingly twist into the target as you go: drawing weird cats instead of dogs, cows instead of cars, toasters instead of handbags, etc.

  2. I wonder if this could backfire on the artist. If the AI becomes corrupted by an artist, could they be criminally charged? After all, if I suspected someone was breaking and entering to steal apples from my kitchen, and I poisoned one and a burglar died, I would be charged with his death.

    1. Probably not. If someone steals your car and then crashes it into their house, are you responsible for paying to repair the damage they caused? Would that go against your car insurance?

      1. Generally the owner of the vehicle is not liable for damages caused by stolen vehicles IF the “negligent act of the thief could not be reasonably foreseen” or something to that extent.

        If your car had shoddy brakes that you failed to maintain to the required safety standards, and the thief has an accident because of it, that’s a gray area and you might be held responsible or sued. The claim would go along the lines of you knowing or being able to foresee that anyone driving that vehicle would be in danger of having an accident, and your negligence of maintaining it was the cause of the accident – not the criminal stealing the car or driving recklessly with it.

        It’s a similar case with building traps for thieves. Even if they’re trespassing, building contraptions for deliberately injuring people is a no-no in most places.

        1. No, you’re mixing things up a bit here: in terms of a car having an accident after being stolen, the only thing that matters is whether or not the *theft* was something that could be foreseen, not an accident from driving the car. It’s not the maintenance of the car that matters. The “negligent act of the thief” is the act of *stealing the car*.

          The common example is “keys left in the ignition” because most states require keys to be taken out, although in almost all states that’s not the actual standard. The maintenance issue could be an additional contributing factor in terms of liability, but it wouldn’t be the only one: in most states the theft breaks the chain of causality completely, so you’d more need to demonstrate that the theft was foreseeable (and no action was taken to try to prevent it).

          1. I doubt you’d be subject to legal action, but I could easily see hosting sites adding an AI training clause to their TOS and stating that data poisoning is a violation of that.

          2. I didn’t mean to say the damages caused in the accident would be passed on to you, but that the thief might sue you for the injury caused by your failure to maintain the car.

          3. Yes, that’s what I’m saying can’t happen. The reason traps make you liable is that they’re intended to harm. A lack of maintenance is not intent, and the negligent maintenance did not contribute to the *theft*.

            A better example is the standard pool example: if you don’t secure a pool, you could be liable because you know it’s dangerous *and* you didn’t secure it. Same thing with a car – a lack of maintenance wouldn’t do it. And the car wouldn’t have to be secured, either.

        2. There are some tales from South Africa about car theft-deterrent booby traps which might be relevant (but not really, since this isn’t equivalent to common theft and petty court cases if it’s organized by large institutions rather than the local meth addict). Needless to say, if you try to build such a thing in the US and a thief gets hurt, you are facing federal charges and a visit from the ATF.

          This is all academic, the metaphor is too loose to have any meaningful application.

      2. If the DoD or Bill Gates steals your car and then crashes it into their house, you probably are going to have a hard life for a while at the very least. Not really a 1:1 comparison.

    2. No, I don’t think so! The companies who own the AI training systems do not have a license to use the artist’s work, so they would not have a case. An analogy would be having a security fence around your property – it’s a form of protection against someone stealing items from your property.

      1. It’s not that simple. Setting up traps for thieves can be illegal or you may get sued.

        Having malicious data in pictures is not like having a security fence, but more like hosting a virus on your website. You made it available and told nobody that the files you offer contain harmful code, so it’s not a question of whether you give anyone permission to use them – someone is going to anyway, on purpose or by accident, and you know it.

        Going back to the traps for thieves, the reasoning there is that your intent in building the device was not to prevent intrusion but to harm the person. If nothing else, this is punishment without due process.

        1. “Having malicious data in pictures is not like having a security fence, but more like hosting a virus on your website. ”

          Oh my God, no. There’s no damage caused by the picture. It’s just not as useful as other pictures for statistical purposes. It’s the equivalent of putting false information in your profile and then some other company suing you because their database was polluted when they scraped your data illegally.

          1. There’s no damage caused by the virus code in some file. It’s only when you try to do something with it under specific circumstances that the code becomes “active”, just like how there’s no damage caused by the altered pictures, except that they happen to poison image generation algorithms.

            It’s not a difference in principle, just in degree, or dose. You need more of the poison to cause significant damage, but damage you will do, with full knowledge and intent of doing so.

        2. Actual booby traps are illegal because they’re indiscriminately dangerous. For example, a trap intended to harm a burglar can also be triggered by a paramedic coming to rescue your ass from a house fire, or by a guest you forgot to warn. The government’s interest specifically in not having citizens building punji stick traps or whatever overrides the competing private property interests. Legal analogies are gonna lead you astray if they elide over this key detail. There’s just no comparison without the physical safety element.

          Put another way: what law compels any content I produce to be fit for purposes I’m not actually licensing that content to support? That’s really what we’re talking about here: fitness for purpose, not any kind of safety or “harm.” You can’t sue me for knitting a sweater that can’t dig holes, or making a shovel you can’t wear. Artists are under no obligation to make their work play nice with content scrapers who are expressly and knowingly violating copyrights for commercial purposes.

          1. Same thing with the image poison. Someone can scrape your website for images in error, without intending to do you harm, or get these images passed down by third parties claiming that they’re free to use, which means you just destroyed someone’s innocent research project.

          2. >Artists are under no obligation to make their work play nice with content scrapers

            That is different from deliberately creating content with a purpose of poisoning particular kinds of image processing algorithms. You can’t blame the artist for not making their work compatible or easy to process, but you can blame them for making the content intentionally poisonous, because there’s a directed malicious intent there.

            It’s the directed malicious intent that is morally reprehensible, and behind most laws or judgements dealing with such cases.

      2. First you must show that theft has occurred. That may be harder than you think since AI training is a legal gray area. You’d need to show beyond reasonable doubt that they stole from you and their entire operation has zero legal outs.

        If you can’t, then you face a potentially nasty tort.

        Somebody above also mentioned doing this with CC0 images with the intent to ruin models, and that most certainly would be an admission if they did go through with it.

        Remember that courts don’t tend to side with vigilantes, even if they’re in the right. If your life isn’t in immediate danger, they’d want you to take it to court first.

    3. It’s possibly criminal depending on where the parties are located. It’s almost assuredly a civil case.

      I don’t want to tell people what to do, but I wouldn’t do it myself. I’d rather sue and/or push to get the laws changed. Boobytraps are not generally seen favorably in court.

      1. This reasoning requires courts to equate a fire fighter getting their leg mangled by a booby trap with some tech bro having his AI bot spit out sub-optimal images. I’m not going to accept that courts will equate Nightshade to vigilantism–let alone to deliberately dangerous mantraps–without some cited evidence.

        1. It is equivalent to malicious computer code intended for causing harm to a particular information system. It’s not just a “sub-optimal” image, but something which was intentionally crafted to do damage, which is what differentiates a booby trap from simple negligence like rotten floor boards.

          The point of vigilantism is having this malicious intent against a perceived transgressor and proceeding to commit to such malice without going through the justice system. It’s taking the law in your own hands.

          1. Would the simplest solution be to poison the pictures in public areas and then set a terms of use that states you’re willing to sell licenses to the pictures without the poisoning on request?

            That would make it more of a fence rather than a boobytrap. Or more like a copyright encryption than a virus. If that doesn’t fly, then it’d be time to start asking why corporations are using encryption methods like AACS to begin with if it’s illegal for pictures.

            Anyone scraping your pictures from a public area is then faced with your EULA or will have to go after the site that’s illegally hosting your pictures.

    1. I honestly doubt that. Lots of people are treating the models as “tomorrow a BETTER one will exist anyway…”. The result is that, on average, 9 out of 10 SD models I’ve seen over the past year are already lost to history.

  3. You will just start an arms race. Training an AI is equivalent to training a human artist; they learn from others and there is nothing you can do if they start mimicking your style. Otherwise the CCP-controlled art factories in China would have been shut down a long time ago by litigation.

    1. An arms race is certainly likely (though, tbh, enough to dissuade a lot of the lazier unethical scraping), but your comment that it’s the same as training an artist is dead wrong. Many of these different higher-order models are built on top of a few lower-level classifier models and datasets; almost all of them use the same basic feature extraction algorithms. Current AI models are still quite far from having anything like the big-picture comprehension of an artist.

  4. The real issue here is artists thinking their styles are in any way unique, rather than finely honed ripoffs of everyone they’ve studied up until now. We aren’t many years from having a system that understands what it’s doing and the concepts behind the prompts, instead of simply keywords.

    Coders knew better and embraced AI as a toy that evolved into a useful tool for them.

    1. The notion that a breakthrough in one has an equal effect in the other is a bit of a misleading narrative AI companies have spun in order to garner investment…

      Though generative AI made a massive leap over a short time, the branch of AI that is about creating intelligent agents that can “understand” something has not benefited from these breakthroughs, and is still mostly confined to training agents in virtual game environments using PPO2.
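
      For context, “PPO2” is the name from the older stable-baselines library; in today’s stable-baselines3 the equivalent is simply PPO, and a toy training run on a Gymnasium game environment looks roughly like this (CartPole stands in for whatever environment is actually used).

```python
# "PPO2" is the old stable-baselines name; the stable-baselines3 equivalent is
# PPO. CartPole is a stand-in for whatever game environment is actually used.
import gymnasium as gym
from stable_baselines3 import PPO

env = gym.make("CartPole-v1")
model = PPO("MlpPolicy", env, verbose=0)
model.learn(total_timesteps=50_000)     # the agent only ever "understands" this one game

obs, _ = env.reset()
action, _ = model.predict(obs, deterministic=True)
```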

      1. It’s become increasingly obvious that the Turing Test as a concept is woefully inadequate for determining whether something is truly aware, as the eagerness of some people, the ignorance of others, and the intense distrust some levy in such environments lead to people not being able to distinguish an AI from a human, or even from a cat playing with a predictive-text keyboard.

  5. Subliminal messages on TV were banned despite the fact that humans can’t *consciously* perceive them, because they *do* perceive them, and are affected by them. Advertisers were aware of that more than half a century ago, long before computers in every home or pocket. Thank goodness the FCC became aware of it not long after. Thank goodness the well-intended within the scientific/computing fields are finally catching up… but I fear they may be too far behind, and focussing their efforts in directions that aren’t nearly as important as others being neglected.

    1. Subliminal messages are bunk, and the original “inventor” of the method even admitted to having faked his research. The FCC banned it just in case.

      It was a classic case of how advertising agencies operate and what their true motivation is: they don’t need to convince the consumer to buy the products – they just need to convince the businesses to buy the advertising. They can do that by obfuscating and faking results, or by doing the good old “Crocodile Whistle” story where the lack of discernible differences means the method must be working. After all, if you don’t advertise then your competitor wins, therefore you must keep advertising even though it has no positive effect.

      That’s why most advertising is a pointless waste of money.

  6. Reminds me of one of the early Borg episodes on ST:TNG where Dr. Crusher suggested that viewing an ‘impossible’ image might mess up their thought processes.

  7. With everything filling up with poison, in the form of both synthetic media that risks positive feedback and willfully poisoned media that trips up AI, it looks like we are rapidly approaching a tipping point after which it will slowly become impossible to train AI effectively on “new” data.

    And a lot of AI users seem to be oblivious to this, still treating their working models as expendable under the assumption that a better one will always appear. A good number of existing models are becoming lost to history as a result, while the big commercial ones are rapidly getting neutered to avoid IP infringement through ever-increasing blacklists.

    Guess it ain’t quite the end of everything as we were led to believe last year just yet.

    1. Yeah, this is the big issue: AI-generated data (specifically text) is going to swamp the internet. Already, if you look at what percentage of text on the web is generated algorithmically, it’s huge. Most of us don’t really see it because we don’t come across it, but if you look at all the SEO gaming that is going on, for every AI-generated site that shows up at the top of your search results there are thousands or millions of other attempts.

      1. Search engines are already filled with AI generated junk designed to mislead you into clicking through affiliate links on your way to buy something, or just to make you see more adverts.

        It’s a second level of leeches attached onto the leeches that try to suck money out of online commerce.

    1. Well, that would be quite an achievement, considering this kind of “poisoning” has been an unsolved problem with no workarounds since it was conceived decades ago (Nightshade is just an easy-to-use version for images).

    2. That sentence is probably an editorial exaggeration on behalf of HaD. The actual website and research paper clearly acknowledge that it is not likely to be a long-lasting defense, and that it would have to be a constant adversarial battle to be effective at all.

  8. Cool idea, but I don’t think it will work forever. AI devs will find a way.

    Also, you’ve now put the onus on the artists to actually go through and protect their work. Good luck with that, considering all the artists/photographers that post stuff online without filing for copyright and assuming the automatic copyright they get when they create the work is enough to get them through a court proceeding. They don’t realize that the system still expects them to actually register the work if they want the court to side with them without making them jump through a million hoops to work out if they really own the work in question.

    This Nightshade thing is a band-aid for the bigger issue.

  9. Soooo what is it called when an artist studies the works of others for inspiration? Is that plagiarism and intellectual property theft too???

    If we poison AI we are just creating a shittier, trash-filled future instead of a future where anything useful or productive has occurred. Maybe info (and art and music etc.) was meant to be copied and shared, used for parody and reproduction!!!!!!

    1. The image generation algorithms don’t extract “styles”. It’s far too simple for that.

      They extract chains of probabilities that, if given no randomization at all, attempt to generate the exact likeness of the original training data. In other words, they attempt to re-create copies of the original work – not just draw in a similar manner. The power of the algorithm comes from the fact that you can make linear combinations of multiple probability chains, so you can mix and match them; but the further you try to deviate from the exact examples of the original artist, the more the algorithm has to include other examples to draw data from, and the more it deviates from the original style as well, as it gets diluted by other data.

      People don’t work like that. We can learn the actual methods and idiosyncrasies of another artist and then apply that to entirely new images that are not rip-offs of their existing works, because we can reason and extrapolate “how would this artist have done it”, instead of “how did this artist do it”. Trying to guess how the artist would have done it is always subject to your personal interpretation, so by necessity it’s your own output rather than intellectual property theft or plagiarism. Mashing together data extracted from the other artist’s earlier works is arguably both.

      1. The important point to note is that the generative algorithms don’t create anything by themselves. When you ask them to create something “in the style of X”, they fill in the blanks that the examples of X are missing by ripping data from other training examples besides X. There’s no reasoning or extrapolation; if suitable data doesn’t already exist in the model, then the picture you requested won’t appear. Something else will.

        That’s also why they can’t maintain coherent output. The amount of difference to the exact training examples you demand determines how much the original examples are diluted with other stuff, which determines how much the output resembles the original style. If for example you demand a portrait in the painting style/technique of a landscape artist who never did portraits, the result will be random nonsense since the actual data for the person in the picture would be ripped off from other artists.

  10. Ok, let’s just start ruining art for people too, I mean they could be studying those works to inspire themselves!!! How evil and selfish of those artists to look at the work of others!
