Google’s Inception Sees This Turtle As A Gun; Image Recognition Camouflage

The good people at MIT’s Computer Science and Artificial Intelligence Laboratory [CSAIL] have found a way of tricking Google’s InceptionV3 image classifier into seeing a rifle where there is actually a turtle. This is achieved by presenting the classifier with what are called ‘adversarial examples’.

Adversarial examples are a proven concept for 2D stills. In 2014, [Goodfellow], [Shlens] and [Szegedy] added imperceptible noise to the image of a panda, which was from then on classified as a gibbon. This method relies on the image reaching the classifier undisturbed, and the attack can be defeated by zooming, blurring or rotating the image.
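To get a feel for how little is involved in that 2D attack, here is a minimal sketch of the fast gradient sign method from that paper, written against a generic PyTorch classifier. The model, image tensor and epsilon value are placeholders for illustration, not the paper’s actual code.

```python
# A minimal sketch of the 2014-style attack (fast gradient sign method).
# Assumes a generic PyTorch classifier; model, image and epsilon are placeholders.
import torch
import torch.nn.functional as F

def fgsm_example(model, image, true_label, epsilon=0.007):
    """image: (1, 3, H, W) tensor in [0, 1]; true_label: (1,) class index."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()
    # Nudge every pixel slightly in the direction that increases the loss.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()
```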

The applicability to real-world shenanigans had therefore been seriously limited, but this changes everything. This weaponized turtle is a color 3D print that is reliably misclassified by the algorithm from any point of view. To achieve this, some knowledge about the classifier is required to generate misleading input. Image transformations such as rotation, scaling and skewing, but also color corrections and even printing errors, are applied to the input during optimization, so that the result misleads the algorithm no matter how it is viewed. The whole process is documented in [CSAIL]’s paper on the method; a rough sketch of the idea is shown below.
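The core idea is to average the attack over a distribution of random transformations before optimizing. The following is only an illustration of that approach, not the authors’ code: the transformation set, target class index and hyperparameters are assumptions.

```python
# Illustrative sketch: optimize a texture so InceptionV3 keeps reporting a rifle
# under random viewpoints, scales and color shifts. Not the paper's code; the
# transforms, target index (413, commonly "assault rifle") and weights are guesses.
import torch
import torch.nn.functional as F
import torchvision
from torchvision import transforms

model = torchvision.models.inception_v3(weights="DEFAULT").eval()
original = torch.rand(3, 299, 299)            # stand-in for the turtle texture
texture = original.clone().requires_grad_(True)
target = torch.tensor([413])                  # assumed "assault rifle" class index
optimizer = torch.optim.Adam([texture], lr=0.01)

simulate_view = transforms.Compose([          # stand-ins for pose, distance,
    transforms.RandomRotation(30),            # lighting and print error
    transforms.RandomResizedCrop(299, scale=(0.8, 1.0)),
    transforms.ColorJitter(0.2, 0.2, 0.2),
])

for step in range(500):
    optimizer.zero_grad()
    # Average the target-class loss over several random transformations so the
    # result fools the classifier from (ideally) any point of view.
    loss = sum(F.cross_entropy(model(simulate_view(texture).unsqueeze(0)), target)
               for _ in range(8)) / 8
    loss = loss + 0.05 * (texture - original).pow(2).mean()  # stay turtle-like
    loss.backward()
    optimizer.step()
```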

What this amounts to is camouflage from machine vision. Assuming that the method also works the other way around, the possibility of disguising guns (or anything else) as turtles has serious implications for automated security systems.

As this turtle targets the Inception algorithm, it should be able to fool the DIY image recognition talkbox that Hackaday’s own [Steven Dufresne] built.

Thanks to [Adam] for the tip.

52 thoughts on “Google’s Inception Sees This Turtle As A Gun; Image Recognition Camouflage”

  1. I’m wondering what will happen with self-driving cars and trucks if somebody makes some camouflaged roads or something – better to learn the lesson now than later, I guess…

    1. Surely you’d be building camouflaged walls, which you move in front of security trucks that now operate without a driver, because that’s safer for all involved?

      Sometimes I think it’s a good thing I’m not a terrorist.

      1. Even if self-driving cars make terror attacks easy and the number of people killed by them each year increases significantly, there could still be a very large net reduction in deaths because everyday accidents would be less frequent. But the odd thing about us humans is we tend to care more about HOW people die than how MANY people die.

    2. Which do you think would be more effective at repelling the self-drivers: covering your car in red octagons, or a mirror-chrome vinyl wrap so they think they’re about to have a head-on collision with themselves?

    3. Nothing, because they use multiple systems like radar and lidar, not just machine vision. This approach is absolutely necessary when you consider safety, as relying on a single system for navigation would mean complete failure due to one piece of equipment. It’s kind of like how airplanes have multiple redundant systems should any one component give out, making it possible to land instead of falling out of the sky like a rock.

  2. This is probably the best demonstration I’ve seen yet that neural network vision systems clearly haven’t copied a real brain. I’m seriously puzzled about what the network IS using to identify the toy turtle as a rifle. The “Cat = guacamole” one is even weirder – clearly the network isn’t using the most obvious feature of guacamole, namely, its greenness.

      1. Human vision and machine vision can both be deceived. We live fairly safely by discouraging deception of humans (three-card tricksters and poorly maintained traffic lights are discouraged). Should we now also discourage the deception of machine learning algorithms?
        I, for one, think that for non-trivial uses we should discourage the use of machine learning algorithms that differ too much from our perceptions – it’s not us who are seeing things wrong, it is the AI.

      2. The vision subsystem is just a part of our system though. That’s the difference between “AI” and AI – real AI wouldn’t detect the turtle as a gun as it wouldn’t fit the mental model of what a gun would have to look like.

        IOW pattern matching isn’t intelligence.

  3. Reminds me of the time Tesla autopilot mistook the side of a semi truck for empty sky and the driver died. Clearly these algorithms are missing a lot of contextual cues that are obvious to humans.

      1. The AI, from inside its box, manages to annoy the driver through the “cry wolf” algorithm until they ignore the actual danger ahead of them.

        Musk was right all along, but he foolishly thinks he can still stop them.

    1. It doesn’t stop at visuals either – auditory perceptions too:
      A crisp packet* that got screwed up and put in a bin… the room warms up due to the heat of a bunch of PCs whose PSUs are known to blow… The crisp packet* starts to creak its way to warmth: the heart-attack sessions start…

      Finally, someone notices that none of the PCs have gone dead, stands around the bins listening, and finds the culprit.

      *Just an example – screwing up a bunch of backlight filters into the plastic recycling bin gives loud (relative to the PCs) creaking, clicks and scrapes as they try to unfurl/unfold.

    1. His head’s a bit tubular and the legs are at some weird angles too. That’s beside the walnut-effect shell. If you saw this crawling about you’d definitely ring Richard Attenborough. The fact that machine vision is susceptible to optical illusions too isn’t surprising, or even news. It takes in a lot of information, compresses huge amounts of it, to produce one of a small selection of outputs. Of course it’s going to make mistakes. Probably there’ll never be a visual system that doesn’t. In this case they’ve had to go to quite extreme lengths, that’s no ordinary looking turtle.

  4. So instead of William Gibson/Bruce Sterling’s shirts that won’t be recorded (“Zero History”), we have patterns that reliably fool a machine into a fake interpretation. That’s not much different from a “human” optical illusion, but the real trouble starts when you base real physical/medical/legal consequences on faulty algorithms.

  5. This is a rifle.
    Some people might try to tell you this is a turtle
    They might scream ‘Turtle, turtle, turtle,’ over and over and over again.
    They might put ‘turtle’ in all caps.
    You might even start to believe that this is a turtle.
    But it’s not.
    This is a rifle.

    ~~Google 2017~~
