Deep Learning — the use of neural networks with modern techniques to tackle problems ranging from computer vision to speech recognition and synthesis — is certainly a current buzzword. However, at the core is a set of powerful methods for organizing self-learning systems. Multi-layer neural networks aren’t new, but there is a resurgence of interest primarily due to the availability of massively parallel computation platforms disguised as video cards.
The problem is getting started in something like this. There are plenty of scholarly papers that can be hard to wade through. Or you can grab some code from GitHub and try to puzzle it out.
Speech synthesis is nothing new, but it has gotten better lately. It is about to get even better thanks to DeepMind’s WaveNet project. The Alphabet (or is it Google?) project uses neural networks to analyze audio data and it learns to speak by example. Unlike other text-to-speech systems, WaveNet creates sound one sample at a time and affords surprisingly human-sounding results.
Before you rush to comment “Not a hack!” you should know we are seeing projects pop up on GitHub that use the technology. For example, there is a concrete implementation by [ibab]. [Tomlepaine] has an optimized version. In addition to learning English, they successfully trained it for Mandarin and even to generate music. If you don’t want to build a system out yourself, the original paper has audio files (about midway down) comparing traditional parametric and concatenative voices with the WaveNet voices.
Another interesting project is the reverse path — teaching WaveNet to convert speech to text. Before you get too excited, though, you might want to note this quote from the read me file:
“We’ve trained this model on a single Titan X GPU during 30 hours until 20 epochs and the model stopped at 13.4 ctc loss. If you don’t have a Titan X GPU, reduce batch_size in the train.py file from 16 to 4.”
Last time we checked, you could get a Titan X for a little less than $2,000.
There is a multi-part lecture series on reinforced learning (the foundation for DeepMind). If you wanted to tackle a project yourself, that might be a good starting point (the first part appears below).
Neural networks ought to be very appealing to hackers. You can easily implement them in hardware or software and relatively simple networks can perform powerful functions. As the jobs we ask of neural networks get more complex, the networks require more artificial neurons. That’s why researchers are pursuing dense integrated neuron chips that could do for neural networks what integrated circuits did for conventional computers.
Researchers at Princeton have announced the first photonic neural network. We recently talked about how artificial neurons work in conventional hardware and software. The artificial neurons look for inputs to reach a threshold which causes them to “fire” and trigger inputs to other neurons.
To map this function to an optical device, the researchers created tiny circular waveguides in a silicon substrate. Light circulates in the waveguide and, when released, modulates the output of a laser. Each waveguide works with a specific wavelength of light. This allows multiple “inputs” (in the form of different wavelengths) to sum together to modulate the laser.
The team used a 49-node network to model a differential equation. The photonic system was nearly 2,000 times faster than other techniques. You can read the actual paper online if you are interested in more details.
There’s been a lot of work done lately on both neural networks and optical computing. Perhaps this fusion will advance both arts.
Last time, I talked about a simple kind of neural net called a perceptron that you can cause to learn simple functions. For the purposes of experimenting, I coded a simple example using Excel. That’s handy for changing things on the fly, but not so handy for putting the code in a microcontroller. This time, I’ll show you how the code looks in C++ and also tell you more about what you can do when faced with a more complex problem.
When you want a person to do something, you train them. When you want a computer to do something, you program it. However, there are ways to make computers learn, at least in some situations. One technique that makes this possible is the perceptron learning algorithm. A perceptron is a computer simulation of a nerve, and there are various ways to change the perceptron’s behavior based on either example data or a method to determine how good (or bad) some outcome is.
What’s a Perceptron?
I’m no biologist, but apparently a neuron has a bunch of inputs and if the level of those inputs gets to a certain level, the neuron “fires” which means it stimulates the input of another neuron further down the line. Not all inputs are created equally: in the mathematical model of them, they have different weighting. Input A might be on a hair trigger, while it might take inputs B and C on together to wake up the neuron in question. Continue reading “Machine Learning: Foundations”→
It’s overkill, but it’s really cool. [Bob Bond] took an NVIDIA Jetson TX1 single-board computer and a webcam and wirelessly combined them with his lawn sprinklers. Now, when his neighbors’ cats come to poop in his yard, a carefully trained neural network detects them and gets them wet.
It is absolutely the case that this could have been done with a simple motion sensor, but if the neural network discriminates sufficiently well between cats and (for instance) his wife, this is an improved solution for sure. Because the single-board computer he’s chosen for the project has a ridiculous amount of horsepower, he can afford to do a lot of image processing, so there’s a chance that everyone on two legs will stay dry. And the code is up on GitHub for you to see, if you’re interested.
[Bob] promises more detail about the neural network in the future. We can’t wait. (And we’d love to see a sentry-turret style build in the future. Think of the water savings!)
We wake up this morning to the news that Google’s deep-search neural network project called AlphaGo has beaten the second ranked world Go master (who happens to be a human being). This is the first of five matches between the two adversaries that will play out this week.
On one hand, this is a sign of maturing technology. It has been almost twenty years since Deep Blue beat Gary Kasparov, the reigning chess world champion at the time. Although there are still four games to play against Lee Sedol, it was recently reported that AlphaGo beat European Go champion Fan Hui in five games straight. Go is generally considered a more difficult game for machine minds to play than chess. This is because Go has a much larger pool of possible moves at any given time.
Does This Matter?
Okay, the news part of this event has been covered: machine beats man. Does it matter? Will this affect your life and how? We want to hear what you think in the comments below. But I’m going to keep going with some of my thoughts on the topic.
Let’s look first at what AlphaGo did to win. At its core, the game of Go is won by figuring out where your opponent will likely make a low-percentage move and then capitalizing on that choice. Know Your Enemy has been a tenet of strategy for a few millennia now and it holds true in the digital age. In addition to the rules of the game, AlphaGo was fed a healthy diet of 30 million positions from expert games. This builds behavior recognition into the system. Not just what moves can be made, but what moves are most likely to be made.
DeepMind, the company behind AlphaGo which was acquired by Google in 2014, has published a paper in Nature about their approach. They were even nice enough to let us read without dealing with a paywall. The secret sauce is the learning process which at its core tries to mimic how living entities learn: observe repetitively while assigning values to outcomes. This is key as it leads past “intellect”, to “intelligence” (the “I” in AI that everyone seems to be waiting for). But this is a bastardized version of “intelligence”. AlphaGo is able to recognize and predict behavior, then make choices that lead to a desired outcome. This is more than intellect as it does value the purpose of an opponent’s decisions. But it falls short of intelligence as AlphaGo doesn’t consciously understand the purpose it has detected. In my mind this is exactly what we need. Truly successful machine learning will be able to make sense out of sometimes irrational input.
The paper from Nature doesn’t go into details about Go, but it explains the approach of the learning system applied to Atari 2600. The algorithm was given 210×160 color video at 60Hz as an input and then told it could use a joystick with one button. From there it taught itself to play 49 games. It was not told the purpose or the rules of the games, but it was given examples of scores from human performance and rewarded for its own quality performances. The chart above shows that it learned to play 29 of them at or above human skill levels.