We’ve seen plenty of examples of neural networks listening to speech, reading characters, or identifying images. KickView had a different idea. They wanted to learn to recognize radio signals. Not just any radio signals, but Orthogonal Frequency Division Multiplexing (OFDM) waveforms.
OFDM is a modulation method used by WiFi, cable systems, and many other systems. In particular, they look at an 802.11g signal with a bandwidth of 20 MHz. The question is given a receiver for 802.11g, how can you reliably detect that an 802.11ac signal — up to 160 MHz — is using your channel? To demonstrate the technique they decided to detect 20 MHz signals using a 5 MHz bandwidth.
Increasingly these days drones are being used for urban surveillance, delivery, and examining architectural structures. To do this autonomously often involves using “map-localize-plan” techniques wherein first, the location is determined on a map using GPS, and then based on that, control commands are produced.
A neural network that does steering and collision prediction can compliment the map-localize-plan techniques. However, the neural network needs to be trained using video taken from actual flying drones. But generating that training video involves many hours of flying drones at street level putting vehicles and pedestrians at risk. To train their DroNet, Researchers from the University of Zurich and the Universidad Politecnica de Madrid have come up with safer sources for that video, video recorded from driving cars and bicycles.
For the drone steering predictions, they used over 70,000 images and corresponding steering angles from the publically available car driving data from Udacity’s Open Source Self-Driving project. For the collision predictions, they mounted a GoPro camera to the handlebars of a bicycle and drove around a city. Video recording began when the bicycle was distant from an object and stopped when very close to the object. In total, they collected 32,000 images.
To use the trained network, images from the drone’s forward-facing camera were fed into the network and the output was a steering angle and a probability of collision, which was turned into a velocity. The drone remained at a constant height above ground, though it did work well from 1.5 meters to 5 meters up. It successfully navigated road lanes and avoided moving pedestrians and bicycles. Intersections did confuse it though, likely due to the open spaces messing with the collision predictions. But we think that shouldn’t be a problem when paired with map-localize-plan techniques as a direction to move through the intersection would be chosen for it using the location on the map.
As you can see in the video below, it not only does a decent job of flying down lanes but it also flies well in a parking garage and a hallway, even though it wasn’t trained for either of these.
Humans are very good at watching others and imitating what they do. Show someone a video of flipping a switch to turn on a CNC machine and after a single viewing they’ll be able to do it themselves. But can a robot do the same?
Bear in mind that we want the demonstration video to be of a human arm and hand flipping the switch. When the robot does it, the camera that is its eye will be seeing its robot arm and gripper. So somehow it’ll have to know that its robot parts are equivalent to the human parts in the demonstration video. Oh, and the switch in the demonstration video may be a different model and make, and the CNC machine may be a different one, though we’ll at least put the robot within reach of its switch.
Researchers from Google Brain and the University of Southern California have done it. In their paper describing how, they talk about a few different experiments but we’ll focus on just one, getting a robot to imitate pouring a liquid from a container into a cup.
It’s 2018, and while true hoverboards still elude humanity, some future predictions have come true. It’s now possible to talk to computers, and most of the time they might even understand you. Speech recognition is usually achieved through the use of neural networks to process audio, in a way that some suggest mimics the operation of the human brain. However, as it turns out, they can be easily fooled.
The attack begins with an audio sample, generally of a simple spoken phrase, though music can also be used. The desired text that the computer should hear instead is then fed into an algorithm along with the audio sample. This function returns a low value when the output of the speech recognition system matches the desired attack phrase. The input audio file is gradually modified using the mathematics of gradient descent, creating a result that to a human sounds like one thing, and to a machine, something else entirely.
The audio files are available on the site for your own experimental purposes. In a noisy environment with poor audio coupling between speakers and a Google Pixel, results were poor – OK Google only heard the human phrase, not the encoded attack phrase. Given that the sound quality was poor, and the files were generated with a different speech model, this is not entirely surprising. We’d love to hear the results of your experiments in the comments.
Google has announced their soon to be available Vision Kit, their next easy to assemble Artificial Intelligence Yourself (AIY) product. You’ll have to provide your own Raspberry Pi Zero W but that’s okay since what makes this special is Google’s VisionBonnet board that they do provide, basically a low power neural network accelerator board running TensorFlow.
The VisionBonnet is built around the Intel® Movidius™ Myriad 2 (aka MA2450) vision processing unit (VPU) chip. See the video below for an overview of this chip, but what it allows is the rapid processing of compute-intensive neural networks. We don’t think you’d use it for training the neural nets, just for doing the inference, or in human terms, for making use of the trained neural nets. It may be worth getting the kit for this board alone to use in your own hacks. An alternative is to get Modivius’s Neural Compute Stick, which has the same chip on a USB stick for around $80, not quite double the Vision Kit’s $45 price tag.
The Vision Kit isn’t out yet so we can’t be certain of the details, but based on the hardware it looks like you’ll point the camera at something, press a button and it will speak. We’ve seen this before with this talking object recognizer on a Pi 3 (full disclosure, it was made by yours truly) but without the hardware acceleration, a single object recognition took around 10 seconds. In the vision kit we expect the recognition will be in real-time. So the Vision Kit may be much more dynamic than that. And in case it wasn’t clear, a key feature is that nothing is done on the cloud here, all processing is local.
The kit comes with three different applications: an object recognition one that can recognize up to 1000 different classes of objects, another that recognizes faces and their expressions, and a third that detects people, cats, and dogs. While you can get up to a lot of mischief with just that, you can run your own neural networks too. If you need a refresher on TensorFlow then check out our introduction. And be sure to check out the Myriad 2 VPU video below the break.
Identifying ham radio signals used to be easy. Beeps were Morse code, voice was AM unless it sounded like Donald Duck in which case it was sideband. But there are dozens of modes in common use now including TV, digital data, digital voice, FM, and more coming on line every day. [Randaller] used CUDA to build a neural network that could interface with an RTL-SDR dongle and can classify the signals it hears. Since it is a neural network, it isn’t so much programmed to do it as it is trained. The proof of concept has training to distinguish FM, SECAM, and tetra. However, you can train it to recognize other modulation schemes if you want to invest the time into it.
Around the Hackaday secret bunker, we’ve been talking quite a bit about machine learning and neural networks. There’s been a lot of renewed interest in the topic recently because of the success of TensorFlow. If you are adept at Python and remember your high school algebra, you might enjoy [Oliver Holloway’s] tutorial on getting started with Tensorflow in Python.
[Oliver] gives links on how to do the setup with notes on Python versions. Then he shows some basic setup operations. From there, he has the software “learn” how to classify random points that either fall into a circle or don’t. Granted, this is easy enough to do with traditional programming, so it isn’t a great practical example, but it is illustrative for learning purposes.
Given that it is easy to algorithmically decide which points are in the circle and which are not, it is simple to develop training data. It is also easy to look at the result and see how close it is to the actual circle. You’ll see that it takes a lot of slow learning before the result space looks like a circle and not a triangle or some other odd shape.