The current wave of excitement around machine learning kicked off when graphics processors were repurposed to make training deep neural networks practical. Nvidia found themselves the engine of a new revolution and seized their opportunity to help push frontiers of research. Their research lab in Seattle will focus on one such field: making robots smart enough to work alongside humans in an IKEA kitchen.
Today’s robots are mostly industrial machines that require workspaces designed for robots. They run day and night, performing repetitive tasks, usually inside cages to keep squishy humans out of harm’s way. Robots will need to be a lot smarter about their surroundings before we could safely dismantle those cages. While there are some industrial robots making a start in this arena, they have a hard time justifying their price premium. (Example: financial difficulty of Rethink Robotics, who made the Baxter and Sawyer robots.)
So there’s a lot of room for improvement in this field, and this evolution will need a training environment offering tasks of varying difficulty levels for robots. Anywhere from the rigorous structured environment where robots work well today, to a dynamic unstructured environment where robots are hopelessly lost. Lab lead Dr. Dieter Fox explained how a kitchen is ideal. A meticulously cleaned and organized kitchen is very similar to an industrial setting. From there, we can gradually make a kitchen more challenging for a robot. For example: today’s robots can easily pick up a can with its rigid regular shape, but what about a half-full bag of flour? And from there, learn to pick up a piece of fresh fruit without bruising it. These tasks share challenges with many other tasks outside of a kitchen.
This isn’t about building a must-have home cooking robot, it’s about working through the range of challenges shared with common kitchen tasks. The lab has a lot of neat hardware, but its success will be measured by the software, and like all research, published results should be reproducible by other labs. You don’t have a high-end robotics lab in your house, but you do have a kitchen. That’s why it’s not just any kitchen, but an IKEA kitchen, to take advantage of the fact they are standardized, affordable, and available around the world for other robot researchers to benchmark against.
Most of us can experiment in a kitchen, IKEA or not. We have access to all the other tools we need: affordable AI hardware from Google, from Beaglebone, and from Nvidia. And we certainly have no shortage of robot arms and manipulators on these pages, ranging from a small laser-cut MeArm to our 2018 Hackaday Prize winner Dexter.
Deep Neural Networks can be pretty good at identifying images — almost as good as they are at attracting Silicon Valley venture capital. But they can also be fairly brittle, and a slew of research projects over the last few years have been working on making the networks’ image classification less likely to be deliberately fooled.
One particular line of attack involves adding particularly-crafted noise to an image that flips some bits in the deep dark heart of the network, and makes it see something else where no human would notice the difference. We got tipped with a YouTube video of a one-pixel attack, embedded below, where changing a single pixel in the image would fool the network. Take that robot overlords!
Or not so fast. Reading the fine-print in the cited paper paints a significantly less gloomy picture for Deep Neural Nets. First, the images in question were 32 pixels by 32 pixels to begin with, so each pixel matters, especially after it’s run through a convolution step with a few-pixel window. The networks they attacked weren’t the sharpest tools in the shed either, with somewhere around a 68% classification success rate. What this means is that the network was unsure to begin with for many of the test images — making it flip from its marginally best (correct) first choice to a second choice shouldn’t be all that hard.
Humans are very good at watching others and imitating what they do. Show someone a video of flipping a switch to turn on a CNC machine and after a single viewing they’ll be able to do it themselves. But can a robot do the same?
Bear in mind that we want the demonstration video to be of a human arm and hand flipping the switch. When the robot does it, the camera that is its eye will be seeing its robot arm and gripper. So somehow it’ll have to know that its robot parts are equivalent to the human parts in the demonstration video. Oh, and the switch in the demonstration video may be a different model and make, and the CNC machine may be a different one, though we’ll at least put the robot within reach of its switch.
Researchers from Google Brain and the University of Southern California have done it. In their paper describing how, they talk about a few different experiments but we’ll focus on just one, getting a robot to imitate pouring a liquid from a container into a cup.
Around the Hackaday secret bunker, we’ve been talking quite a bit about machine learning and neural networks. There’s been a lot of renewed interest in the topic recently because of the success of TensorFlow. If you are adept at Python and remember your high school algebra, you might enjoy [Oliver Holloway’s] tutorial on getting started with Tensorflow in Python.
[Oliver] gives links on how to do the setup with notes on Python versions. Then he shows some basic setup operations. From there, he has the software “learn” how to classify random points that either fall into a circle or don’t. Granted, this is easy enough to do with traditional programming, so it isn’t a great practical example, but it is illustrative for learning purposes.
Given that it is easy to algorithmically decide which points are in the circle and which are not, it is simple to develop training data. It is also easy to look at the result and see how close it is to the actual circle. You’ll see that it takes a lot of slow learning before the result space looks like a circle and not a triangle or some other odd shape.
In case you didn’t make it to the ISCA (International Society for Computers and their Applications) session this year, you might be interested in a presentation by [Joel Emer] an MIT professor and scientist for NVIDIA. Along with another MIT professor and two PhD students ([Vivienne Sze], [Yu-Hsin Chen], and [Tien-Ju Yang]), [Emer’s] presentation covers hardware architectures for deep neural networks.
The presentation covers the background on deep neural networks and basic theory. Then it progresses to deep learning specifics. One interesting graph shows how neural networks are getting better at identifying objects in images every year and as of 2015 can do a better job than a human over a set of test images. However, the real key is using hardware to accelerate the performance of networks.
Hardware acceleration is important for several reasons. For one, many applications have lots of data associated. Also, training can involve many iterations which can take a long time.