AlphaGo is the deep learning program that can beat humans at the game Go. You can read Google’s highly technical paper on it, but you’ll have to wade through some very academic language. [Aman Agarwal] has done us a favor. He took the original paper and dissected the important parts of it in plain English. If the title doesn’t make sense to you, you need to read more XKCD.
[Aman] says his treatment will be useful for anyone who doesn’t want to become an expert on neural networks but still wants to understand this important breakthrough. He also thinks people who don’t have English as a first language may find his analysis useful. By the way, the actual Go matches where AlphaGo beat [Lee Sedol] were streamed and you can watch all the replays on YouTube (the first match appears below).
Interestingly, the explanation doesn’t assume you know how to play Go, but it does presuppose you have an understanding of some kind of two-player board game. As an example of the kind of language you’ll find in the original paper (which is linked in the post), you might see:
The policy network is trained on randomly sampled state-action pairs (s,a) using stochastic gradient ascent to maximize the likelihood of the human move a selected in state s.
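This is followed by some math equations. The post explains stochastic gradient ascent and even contrasts it with its better-known sibling, stochastic gradient descent, the optimizer usually paired with backpropagation. The only real difference is direction: ascent climbs toward a maximum (here, the likelihood of the human move), while descent walks downhill toward a minimum (typically a loss).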
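If you'd rather see that sentence from the paper as code, here's a minimal sketch of the idea. To be clear, this is our toy illustration and not AlphaGo's implementation: the "policy" is just a table of softmax logits rather than a deep network, the three "board states" and three "moves" are made up, and the names `train`, `log_likelihood`, and `pairs` are our own. It samples one (s, a) pair at a time and nudges the parameters in the direction that makes the human move more likely.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def log_likelihood(theta, pairs):
    """Average log-probability the policy assigns to the recorded moves."""
    return sum(math.log(softmax(theta[s])[a]) for s, a in pairs) / len(pairs)


def train(pairs, n_states, n_actions, lr=0.1, steps=2000, seed=0):
    """Stochastic gradient ascent on log p(a | s) for a tabular softmax policy."""
    rng = random.Random(seed)
    theta = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(steps):
        s, a = rng.choice(pairs)          # randomly sampled state-action pair
        probs = softmax(theta[s])
        for k in range(n_actions):
            # Gradient of log softmax: 1 for the chosen move, minus p(k | s).
            grad = (1.0 if k == a else 0.0) - probs[k]
            # Ascent: step WITH the gradient. Descent on the negative
            # log-likelihood would produce exactly the same update.
            theta[s][k] += lr * grad
    return theta


# Hypothetical "human expert" data: (state, move) pairs with a little noise.
pairs = [(0, 1)] * 9 + [(0, 2)] + [(1, 0)] * 8 + [(1, 2)] * 2 + [(2, 2)] * 10

untrained = [[0.0] * 3 for _ in range(3)]
theta = train(pairs, n_states=3, n_actions=3)

print("avg log-likelihood before:", log_likelihood(untrained, pairs))
print("avg log-likelihood after: ", log_likelihood(theta, pairs))
```

The average log-likelihood should rise after training, meaning the policy now agrees with the recorded moves more often. Flip the sign of the update and negate the objective, and you have the gradient descent most people know from training neural networks.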
We have to say, we’d like to see more academic papers taken apart like this for people who are interested but not experts in that field. We covered the AlphaGo match at the time. Personally, we are always suckers for a good chess computer.