DroNet: learning to fly by driving

Delivery Drones Can Learn From Driving And Cycling

Drones are increasingly being used for urban surveillance, delivery, and inspecting architectural structures. Doing this autonomously often involves “map-localize-plan” techniques, in which the drone’s location is first determined on a map using GPS and control commands are then produced based on that location.

A neural network that does steering and collision prediction can complement the map-localize-plan techniques. However, the neural network needs to be trained using video taken from actual flying drones, and generating that training video involves many hours of flying drones at street level, putting vehicles and pedestrians at risk. To train their DroNet, researchers from the University of Zurich and the Universidad Politecnica de Madrid have come up with safer sources for that video: video recorded from driving cars and bicycles.

DroNet

For the drone steering predictions, they used over 70,000 images and corresponding steering angles from the publicly available car driving data from Udacity’s Open Source Self-Driving project. For the collision predictions, they mounted a GoPro camera to the handlebars of a bicycle and rode around a city. Video recording began when the bicycle was distant from an object and stopped when it was very close to the object. In total, they collected 32,000 images.

To use the trained network, images from the drone’s forward-facing camera were fed into the network and the output was a steering angle and a probability of collision, which was turned into a velocity. The drone remained at a constant height above ground, though it did work well from 1.5 meters to 5 meters up. It successfully navigated road lanes and avoided moving pedestrians and bicycles. Intersections did confuse it though, likely due to the open spaces messing with the collision predictions. But we think that shouldn’t be a problem when paired with map-localize-plan techniques as a direction to move through the intersection would be chosen for it using the location on the map.
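How a steering angle and a collision probability might become flight commands can be sketched in a few lines. This is only an illustration, not the researchers' controller: the function name, the smoothing factors alpha and beta, the maximum velocity, and the assumption that the steering output is normalized to the range -1 to 1 are all hypothetical. The idea is simply that a high collision probability pushes the forward velocity toward zero, and that both commands are low-pass filtered so the drone does not jerk around from frame to frame.

```python
import math


def update_commands(prev_velocity, prev_yaw, collision_prob, steering,
                    max_velocity=1.0, alpha=0.7, beta=0.5):
    """Turn one network output into smoothed velocity and yaw commands.

    collision_prob is in [0, 1]; steering is assumed to be in [-1, 1].
    alpha and beta are hypothetical smoothing factors, not published values.
    """
    # The more likely a collision, the lower the target forward velocity.
    target_velocity = (1.0 - collision_prob) * max_velocity
    velocity = (1.0 - alpha) * prev_velocity + alpha * target_velocity

    # Map the normalized steering output to a yaw angle of up to 90 degrees.
    target_yaw = (math.pi / 2.0) * steering
    yaw = (1.0 - beta) * prev_yaw + beta * target_yaw
    return velocity, yaw


# Example: cruising at 0.8 m/s, the network suddenly reports a likely collision.
velocity, yaw = update_commands(0.8, 0.0, collision_prob=0.9, steering=-0.2)
```

Calling this once per camera frame, feeding each result back in as the previous value, gives a drone that slows smoothly as obstacles approach rather than stopping abruptly.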

As you can see in the video below, it not only does a decent job of flying down lanes but it also flies well in a parking garage and a hallway, even though it wasn’t trained for either of these.


Neural Networking: Robots Learning From Video

Humans are very good at watching others and imitating what they do. Show someone a video of flipping a switch to turn on a CNC machine and after a single viewing they’ll be able to do it themselves. But can a robot do the same?

Bear in mind that we want the demonstration video to be of a human arm and hand flipping the switch. When the robot does it, the camera that is its eye will be seeing its robot arm and gripper. So somehow it’ll have to know that its robot parts are equivalent to the human parts in the demonstration video. Oh, and the switch in the demonstration video may be a different model and make, and the CNC machine may be a different one, though we’ll at least put the robot within reach of its switch.

Sound difficult?

Researchers from Google Brain and the University of Southern California have done it. In their paper describing how, they talk about a few different experiments but we’ll focus on just one, getting a robot to imitate pouring a liquid from a container into a cup.


Fooling Speech Recognition With Hidden Voice Commands

It’s 2018, and while true hoverboards still elude humanity, some future predictions have come true. It’s now possible to talk to computers, and most of the time they might even understand you. Speech recognition is usually achieved through the use of neural networks to process audio, in a way that some suggest mimics the operation of the human brain. However, as it turns out, they can be easily fooled.

The attack begins with an audio sample, generally of a simple spoken phrase, though music can also be used. The desired text that the computer should hear instead is then fed into an algorithm along with the audio sample. The algorithm relies on a loss function that returns a low value when the output of the speech recognition system matches the desired attack phrase. The input audio file is gradually modified using gradient descent to drive that value down, creating a result that sounds like one thing to a human and something else entirely to a machine.
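To make the gradient descent step concrete, here is a toy sketch in plain Python. It is not the attack from the research: the "recognizer" is a made-up linear scorer over two candidate phrases rather than a neural network, and the weights, the three-sample "audio clip", the learning rate and the margin are all hypothetical. What it does share with the real thing is the loop: compute a value that is low when the target phrase wins, then nudge the input samples along the negative gradient until it does.

```python
def recognize(weights, audio):
    """Toy linear 'recognizer': returns one score per candidate phrase."""
    return [sum(w * a for w, a in zip(row, audio)) for row in weights]


def craft_adversarial(weights, audio, target, lr=0.05, margin=0.1, max_steps=1000):
    """Gradient descent on a margin loss until phrase `target` scores highest."""
    adv = list(audio)
    for _ in range(max_steps):
        scores = recognize(weights, adv)
        # Strongest competing phrase (anything other than the target).
        rival = max((i for i in range(len(scores)) if i != target),
                    key=lambda i: scores[i])
        # Low (non-positive) once the target beats its rival by the margin.
        loss = scores[rival] - scores[target] + margin
        if loss <= 0:
            break
        # For a linear model the gradient of the loss with respect to the
        # audio samples is just the difference of the two weight rows.
        grad = [wr - wt for wr, wt in zip(weights[rival], weights[target])]
        adv = [a - lr * g for a, g in zip(adv, grad)]
    return adv


# Hypothetical two-phrase model and a three-sample "audio clip".
weights = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]
original = [1.0, 0.2, 0.3]
adversarial = craft_adversarial(weights, original, target=1)
```

With a real speech model the gradient comes from backpropagation through the network instead of a subtraction, and the attacker also has to keep the change small enough that a listener still hears the original phrase.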

The audio files are available on the site for your own experimental purposes. In a noisy environment with poor audio coupling between speakers and a Google Pixel, results were poor – OK Google only heard the human phrase, not the encoded attack phrase. Given that the sound quality was poor, and the files were generated with a different speech model, this is not entirely surprising. We’d love to hear the results of your experiments in the comments.

It’s all a part of [Nicholas]’s PhD studies around the strengths and pitfalls of neural networks. It highlights the fact that neural networks don’t always work in the way we think they do. Google’s Inception is susceptible to similar attacks with images, as we’ve seen recently.

[Thanks to Wolfgang for the tip!]

Google's AIY Vision Kit exploded view

Google’s AIY Vision Kit Augments Pi With Vision Processor

Google has announced their soon-to-be-available Vision Kit, their next easy-to-assemble Artificial Intelligence Yourself (AIY) product. You’ll have to provide your own Raspberry Pi Zero W, but that’s okay since what makes this special is the VisionBonnet board that Google does provide, basically a low-power neural network accelerator board running TensorFlow.

AIY VisionBonnet with Myriad 2 (MA2450) chip

The VisionBonnet is built around the Intel® Movidius™ Myriad 2 (aka MA2450) vision processing unit (VPU) chip. See the video below for an overview of this chip, but what it allows is the rapid processing of compute-intensive neural networks. We don’t think you’d use it for training the neural nets, just for doing the inference, or in human terms, for making use of the trained neural nets. It may be worth getting the kit for this board alone to use in your own hacks. An alternative is to get Movidius’s Neural Compute Stick, which has the same chip on a USB stick for around $80, not quite double the Vision Kit’s $45 price tag.

The Vision Kit isn’t out yet so we can’t be certain of the details, but based on the hardware it looks like you’ll point the camera at something, press a button and it will speak. We’ve seen this before with this talking object recognizer on a Pi 3 (full disclosure, it was made by yours truly) but without the hardware acceleration, a single object recognition took around 10 seconds. With the Vision Kit we expect the recognition to happen in real time, so it may be much more dynamic than that. And in case it wasn’t clear, a key feature is that nothing is done in the cloud here; all processing is local.

The kit comes with three different applications: an object recognition one that can recognize up to 1000 different classes of objects, another that recognizes faces and their expressions, and a third that detects people, cats, and dogs. While you can get up to a lot of mischief with just that, you can run your own neural networks too. If you need a refresher on TensorFlow then check out our introduction. And be sure to check out the Myriad 2 VPU video below the break.


Neural Network Learns SDR Ham Radio

Identifying ham radio signals used to be easy. Beeps were Morse code, voice was AM unless it sounded like Donald Duck, in which case it was sideband. But there are dozens of modes in common use now including TV, digital data, digital voice, FM, and more coming online every day. [Randaller] used CUDA to build a neural network that can interface with an RTL-SDR dongle and classify the signals it hears. Since it is a neural network, it isn’t so much programmed to do it as it is trained. The proof of concept is trained to distinguish FM, SECAM, and TETRA. However, you can train it to recognize other modulation schemes if you want to invest the time into it.


Tensorflow Tutorial Uses Python

Around the Hackaday secret bunker, we’ve been talking quite a bit about machine learning and neural networks. There’s been a lot of renewed interest in the topic recently because of the success of TensorFlow. If you are adept at Python and remember your high school algebra, you might enjoy [Oliver Holloway’s] tutorial on getting started with TensorFlow in Python.

[Oliver] gives links on how to do the setup with notes on Python versions. Then he shows some basic setup operations. From there, he has the software “learn” how to classify random points that either fall into a circle or don’t. Granted, this is easy enough to do with traditional programming, so it isn’t a great practical example, but it is illustrative for learning purposes.

Given that it is easy to algorithmically decide which points are in the circle and which are not, it is simple to develop training data. It is also easy to look at the result and see how close it is to the actual circle. You’ll see that it takes a lot of slow learning before the result space looks like a circle and not a triangle or some other odd shape.
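Generating that kind of labeled training data takes only a few lines of plain Python. This is a minimal sketch rather than [Oliver]'s code: the function name, the sampling square from -1 to 1, the radius of 0.5 and the fixed seed are assumptions made for illustration.

```python
import random


def make_circle_dataset(n_points, radius=0.5, seed=0):
    """Return (x, y, label) tuples; label is 1 inside the circle, 0 outside.

    Points are drawn uniformly from the square [-1, 1] x [-1, 1]. The radius
    and the square are illustrative choices, not values from the tutorial.
    """
    rng = random.Random(seed)  # seeded so the dataset is reproducible
    data = []
    for _ in range(n_points):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        # The "traditional programming" answer: compare distance to radius.
        label = 1 if x * x + y * y <= radius * radius else 0
        data.append((x, y, label))
    return data


training_data = make_circle_dataset(1000)
```

The same rule that labels the data also gives you a ground truth to compare the trained network against, which is what makes the circle such a convenient learning exercise.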


TensorFlow Lite demos

Smarter Phones In Your Hacks With TensorFlow Lite

One way to run a compute-intensive neural network on a hack has been to put a decent laptop onboard. But wouldn’t it be great if you could go smaller and cheaper by using a phone instead? If your neural network was written using Google’s TensorFlow framework then you’ve had the option of using TensorFlow Mobile, but it doesn’t use any of the phone’s accelerated hardware, and so it might not have been fast enough.

TensorFlow Lite architecture

Google has just released a new solution, the developer preview of TensorFlow Lite for iOS and Android, and announced plans to support Raspberry Pi 3. On Android, the bottom layer is the Android Neural Networks API which makes use of the phone’s DSP, GPU and/or any other specialized hardware to speed up computations. Failing that, it falls back on the CPU.

Currently, fewer operators are supported than with TensorFlow Mobile, but more will be added. (Most of what you do in TensorFlow is done through operators, or ops. See our introduction to TensorFlow article if you need a refresher on how TensorFlow works.) The Lite version is intended to be the successor to Mobile. As with Mobile, you’d only do inference on the device. That means you’d train the neural network elsewhere, perhaps on a GPU-rich desktop or on a GPU farm over the network, and then make use of the trained network on your device.

What are we envisioning here? How about replacing the MacBook Pro on the self-driving RC cars we’ve talked about with a much smaller, lighter and less power-hungry Android phone? The phone even has a camera and an IMU built-in, though you’d need a way to talk to the rest of the hardware in lieu of GPIO.

You can try out TensorFlow Lite fairly easily by going to their GitHub and downloading a pre-built binary. We suspect that’s what was done to produce the first of the demonstration videos below.
