DeepMind, a company that was acquired by Google, made headlines when the program AlphaGo Zero managed to become the best Go player in the world, without using any human knowledge, a feat reported in this blog less than two months ago.
Now, just a few weeks after that result, DeepMind reports, in an article posted on arXiv.org, that the program AlphaZero obtained a similar result for the game of chess.
Computer programs have been the world’s best chess players for a long time now, basically since Deep Blue defeated the reigning world champion, Garry Kasparov, in 1997. Deep Blue, like almost all the other top chess programs, was deeply specialized in chess, and played the game using handcrafted position evaluation functions (based on grandmaster games) coupled with deep search methods. Deep Blue evaluated more than 200 million positions per second, using a very deep search (between 6 and 8 moves, sometimes more) to identify the best possible move.
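To make the idea of "handcrafted evaluation plus deep search" concrete, here is a minimal sketch in Python. It is not Deep Blue's algorithm: it uses a toy subtraction game (players alternately remove 1 to 3 stones, and whoever takes the last stone wins) instead of chess, and plain depth-limited negamax search. All names (`legal_moves`, `evaluate`, `negamax`, `best_move`) are hypothetical and chosen only for illustration.

```python
from typing import List

MAX_TAKE = 3  # toy rule (assumption): a player removes 1, 2 or 3 stones per turn


def legal_moves(stones: int) -> List[int]:
    """All numbers of stones the player to move is allowed to take."""
    return [take for take in range(1, MAX_TAKE + 1) if take <= stones]


def evaluate(stones: int) -> float:
    """Handcrafted estimate for the player to move.

    Used only when the search runs out of depth. It encodes a rule of thumb
    for this toy game: facing a multiple of MAX_TAKE + 1 stones is bad.
    """
    return -0.5 if stones % (MAX_TAKE + 1) == 0 else 0.5


def negamax(stones: int, depth: int) -> float:
    """Score of the position for the player to move, searching `depth` plies."""
    if stones == 0:
        return -1.0  # the opponent just took the last stone: a loss
    if depth == 0:
        return evaluate(stones)  # too deep to search further: fall back on the heuristic
    # A position is as good as the best move available; the opponent's
    # score after that move is negated to get our own.
    return max(-negamax(stones - take, depth - 1) for take in legal_moves(stones))


def best_move(stones: int, depth: int) -> int:
    """Pick the move with the highest searched score (needs stones >= 1, depth >= 1)."""
    return max(legal_moves(stones), key=lambda take: -negamax(stones - take, depth - 1))
```

For example, `best_move(10, 4)` returns 2, leaving the opponent with 8 stones. A real chess engine follows the same pattern at a vastly larger scale, with a far richer evaluation function and many search refinements.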
Modern computer programs use a similar approach, and have attained super-human levels, with the best programs (Komodo and Stockfish) reaching an Elo rating higher than 3300. The best human players have Elo ratings between 2800 and 2900. This difference implies that they have less than a one in ten chance of beating the top chess programs, since a difference of 366 points in Elo rating (anywhere on the scale) means a probability of winning of about 90% for the higher-rated player.
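The relationship between rating difference and winning chances can be checked with a few lines of Python. This is a minimal sketch assuming the standard logistic Elo formula, E = 1 / (1 + 10^((R_b - R_a) / 400)); the function name `expected_score` and the sample rating of 2850 for a top human are illustrative choices, not taken from the DeepMind article. Strictly speaking, the formula gives an expected score, which counts a draw as half a win, so the probability of an outright win is somewhat lower.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B (logistic Elo model).

    A win counts as 1, a draw as 0.5 and a loss as 0.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    # A 366-point gap, seen from the stronger player's side.
    print(f"366-point favourite: {expected_score(2366, 2000):.3f}")

    # Assumed top human (2850) against a 3300-rated program.
    print(f"Top human vs. program: {expected_score(2850, 3300):.3f}")
```

With this formula, a 366-point gap gives the stronger player an expected score of roughly 0.89, close to the 90% quoted above, and a 2850-rated human facing a 3300-rated program would be expected to score about 0.07. Only the difference between the two ratings matters, which is why the same gap means the same odds anywhere on the scale.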
In contrast, AlphaZero learned the game without using any human-generated knowledge, simply by playing against another copy of itself, the same approach used by AlphaGo Zero. As the authors describe, AlphaZero learned to play at a super-human level, systematically beating the best existing chess program (Stockfish), and in the process rediscovering centuries of human-generated knowledge, such as common opening moves (Ruy Lopez, Sicilian, French and Reti, among others).
The flexibility of AlphaZero (which also learned to play Go and Shogi) provides convincing evidence that general-purpose learners are within the reach of the technology. As a side note, the author of this blog, who was a fairly decent chess player in his youth, reached an Elo rating of 2000. This means that he has less than a one in ten chance of beating someone with a rating of 2400, who has less than a one in ten chance of beating the world champion, who in turn has less than a one in ten chance of beating AlphaZero. Quite humbling…
Image by David Lapetina, available at Wikimedia Commons.