AlphaGo is the deep learning program that can beat humans at the game Go. You can read Google’s highly technical paper on it, but you’ll have to wade through some very academic language. [Aman Agarwal] has done us a favor. He took the original paper and dissected the important parts of in in plain English. If the title doesn’t make sense to you, you need to read more XKCD.
[Aman] says his treatment will be useful for anyone who doesn’t want to become an expert on neural networks but still wants to understand this important breakthrough. He also thinks people who don’t have English as a first language may find his analysis useful. By the way, the actual Go matches where AlphaGo beat [Sedol] were streamed and you can watch all the replays on YouTube (the first match appears below).
We wake up this morning to the news that Google’s deep-search neural network project called AlphaGo has beaten the second ranked world Go master (who happens to be a human being). This is the first of five matches between the two adversaries that will play out this week.
On one hand, this is a sign of maturing technology. It has been almost twenty years since Deep Blue beat Gary Kasparov, the reigning chess world champion at the time. Although there are still four games to play against Lee Sedol, it was recently reported that AlphaGo beat European Go champion Fan Hui in five games straight. Go is generally considered a more difficult game for machine minds to play than chess. This is because Go has a much larger pool of possible moves at any given time.
Does This Matter?
Okay, the news part of this event has been covered: machine beats man. Does it matter? Will this affect your life and how? We want to hear what you think in the comments below. But I’m going to keep going with some of my thoughts on the topic.
Let’s look first at what AlphaGo did to win. At its core, the game of Go is won by figuring out where your opponent will likely make a low-percentage move and then capitalizing on that choice. Know Your Enemy has been a tenet of strategy for a few millennia now and it holds true in the digital age. In addition to the rules of the game, AlphaGo was fed a healthy diet of 30 million positions from expert games. This builds behavior recognition into the system. Not just what moves can be made, but what moves are most likely to be made.
DeepMind, the company behind AlphaGo which was acquired by Google in 2014, has published a paper in Nature about their approach. They were even nice enough to let us read without dealing with a paywall. The secret sauce is the learning process which at its core tries to mimic how living entities learn: observe repetitively while assigning values to outcomes. This is key as it leads past “intellect”, to “intelligence” (the “I” in AI that everyone seems to be waiting for). But this is a bastardized version of “intelligence”. AlphaGo is able to recognize and predict behavior, then make choices that lead to a desired outcome. This is more than intellect as it does value the purpose of an opponent’s decisions. But it falls short of intelligence as AlphaGo doesn’t consciously understand the purpose it has detected. In my mind this is exactly what we need. Truly successful machine learning will be able to make sense out of sometimes irrational input.
The paper from Nature doesn’t go into details about Go, but it explains the approach of the learning system applied to Atari 2600. The algorithm was given 210×160 color video at 60Hz as an input and then told it could use a joystick with one button. From there it taught itself to play 49 games. It was not told the purpose or the rules of the games, but it was given examples of scores from human performance and rewarded for its own quality performances. The chart above shows that it learned to play 29 of them at or above human skill levels.