After seeing how Google’s Duplex AI was able to book a table at a restaurant by fooling a human maître d’ into thinking it was human, I wondered if it might be possible for us mere hackers to pull off the same feat. What could you or I do without Google’s legions of ace AI programmers and racks of neural network training hardware? Let’s look at the ways we can make a natural language bot of our own. As you’ll see, it’s entirely doable.
In European medieval folklore, a practitioner of magic may call for assistance from a familiar spirit who takes an animal form disguise. [Alex Glow] is our modern-day Merlin who invoked the magical incantations of 3D printing, Arduino, and Raspberry Pi to summon her familiar Archimedes: The AI Robot Owl.
The key attraction in this build is Google’s AIY Vision kit. Specifically the vision processing unit that tremendously accelerates image classification tasks running on an attached Raspberry Pi Zero W. It no longer consumes several seconds to analyze each image, classification can now run several times per second, all performed locally. No connection to Google cloud required. (See our earlier coverage for more technical details.) The default demo application of a Google AIY Vision kit is a “joy detector” that looks for faces and attempts to determine if a face is happy or sad. We’ve previously seen this functionality mounted on a robot dog.
[Alex] aimed to go beyond the default app (and default box) to create Archimedes, who was to reward happy people with a sticker. As a moving robotic owl, Archimedes had far more crowd appeal than the vision kit’s default cardboard box. All the kit components have been integrated into Archimedes’ head. One eye is the expected Pi camera, the other eye is actually the kit’s piezo buzzer. The vision kit’s LED-illuminated button now tops the dapper owl’s hat.
Archimedes was created to join in Google’s promotion efforts. Their presence at this Maker Faire consisted of two tents: one introductory “Learn to Solder” tent where people can create a blinky LED badge, and the other tent is focused on their line of AIY kits like this vision kit. Filled with demos of what the kits can do aside from really cool robot owls.
Hopefully these promotional efforts helped many AIY kits find new homes in the hands of creative makers. It’s pretty exciting that such a powerful and inexpensive neural net processor is now widely available, and we look forward to many more AI-powered hacks to come.
It’s 2018, and while true hoverboards still elude humanity, some future predictions have come true. It’s now possible to talk to computers, and most of the time they might even understand you. Speech recognition is usually achieved through the use of neural networks to process audio, in a way that some suggest mimics the operation of the human brain. However, as it turns out, they can be easily fooled.
The attack begins with an audio sample, generally of a simple spoken phrase, though music can also be used. The desired text that the computer should hear instead is then fed into an algorithm along with the audio sample. This function returns a low value when the output of the speech recognition system matches the desired attack phrase. The input audio file is gradually modified using the mathematics of gradient descent, creating a result that to a human sounds like one thing, and to a machine, something else entirely.
The audio files are available on the site for your own experimental purposes. In a noisy environment with poor audio coupling between speakers and a Google Pixel, results were poor – OK Google only heard the human phrase, not the encoded attack phrase. Given that the sound quality was poor, and the files were generated with a different speech model, this is not entirely surprising. We’d love to hear the results of your experiments in the comments.
It’s all a part of [Nicholas]’s PhD studies around the strengths and pitfalls of neural networks. It highlights the fact that neural networks don’t always work in the way we think they do. Google’s Inception is susceptible to similar attacks with images, as we’ve seen recently.
[Thanks to Wolfgang for the tip!]
The good people at MIT’s Computer Science and Artificial Intelligence Laboratory [CSAIL] have found a way of tricking Google’s InceptionV3 image classifier into seeing a rifle where there actually is a turtle. This is achieved by presenting the classifier with what is called ‘adversary examples’.
Adversary examples are a proven concept for 2D stills. In 2014 [Goodfellow], [Shlens] and [Szegedy] added imperceptible noise to the image of a panda that from then on was classified as gibbon. This method relies on the image being undisturbed and can be overcome by zooming, blurring or rotating the image.
The applicability for real world shenanigans has been seriously limited but this changes everything. This weaponized turtle is a color 3D print that is reliably misclassified by the algorithm from any point of view. To achieve this, some knowledge about the classifier is required to generate misleading input. The image transformations, such as rotation, scaling and skewing but also color corrections and even print errors are added to the input and the result is then optimized to reliably mislead the algorithm. The whole process is documented in [CSAIL]’s paper on the method.
What this amounts to is camouflage from machine vision. Assuming that the method also works the other way around, the possibility of disguising guns (or anything else) as turtles has serious implications for automated security systems.
As this turtle targets the Inception algorithm, it should be able to fool the DIY image recognition talkbox that Hackaday’s own [Steven Dufresne] built.
Thanks to [Adam] for the tip.
In case you didn’t make it to the ISCA (International Society for Computers and their Applications) session this year, you might be interested in a presentation by [Joel Emer] an MIT professor and scientist for NVIDIA. Along with another MIT professor and two PhD students ([Vivienne Sze], [Yu-Hsin Chen], and [Tien-Ju Yang]), [Emer’s] presentation covers hardware architectures for deep neural networks.
The presentation covers the background on deep neural networks and basic theory. Then it progresses to deep learning specifics. One interesting graph shows how neural networks are getting better at identifying objects in images every year and as of 2015 can do a better job than a human over a set of test images. However, the real key is using hardware to accelerate the performance of networks.
Hardware acceleration is important for several reasons. For one, many applications have lots of data associated. Also, training can involve many iterations which can take a long time.
What if every time you learned something new, you forgot a little of what you knew before? That sort of overwriting doesn’t happen in the human brain, but it does in artificial neural networks. It’s appropriately called catastrophic forgetting. So why are neural networks so successful despite this? How does this affect the future of things like self-driving cars? Just what limit does this put on what neural networks will be able to do, and what’s being done about it?
The way a neural network stores knowledge is by setting the values of weights (the lines in between the neurons in the diagram). That’s what those lines literally are, just numbers assigned to pairs of neurons. They’re analogous to the axons in our brain, the long tendrils that reach out from one neuron to the dendrites of another neuron, where they meet at microscopic gaps called synapses. The value of the weight between two artificial neurons is roughly like the number of axons between biological neurons in the brain.
To understand the problem, and the solutions below, you need to know a little more detail.
As a fun project I thought I’d put Google’s Inception-v3 neural network on a Raspberry Pi to see how well it does at recognizing objects first hand. It turned out to be not only fun to implement, but also the way I’d implemented it ended up making for loads of fun for everyone I showed it to, mostly folks at hackerspaces and such gatherings. And yes, some of it bordering on pornographic — cheeky hackers.
An added bonus many pointed out is that, once installed, no internet access is required. This is state-of-the-art, standalone object recognition with no big brother knowing what you’ve been up to, unlike with that nosey Alexa.
But will it lead to widespread useful AI? If a neural network can recognize every object around it, will that lead to human-like skills? Read on. Continue reading “DIY Raspberry Neural Network Sees All, Recognizes Some”