What did researchers find by studying how the AI program 'AlphaZero' learns chess?



DeepMind, an artificial intelligence (AI) developer and subsidiary of Alphabet, and Google Brain, Google's AI-focused research division, have collaborated with chess grandmaster Vladimir Kramnik on a project that compares how chess play has evolved among humans with how it evolves in a chess AI.

Acquisition of Chess Knowledge in AlphaZero | ChessBase
https://en.chessbase.com/post/acquisition-of-chess-knowledge-in-alphazero

In this project, the team studies how AlphaZero learns to play chess, using the game records stored in ChessBase (a huge chess database), the neural network chess engine 'AlphaZero', and various components of the open source chess engine 'Stockfish'.

Chess engines are tools that chess professionals and amateurs use to study their moves, and in recent years powerful engines built on neural networks, such as AlphaZero, have emerged. AlphaZero learns chess entirely on its own through reinforcement learning: by playing games against itself, it optimizes how it moves and positions its pieces without needing a handwritten evaluation function. Through this self-play, the engine comes to move and position its pieces with insight.

When the research team compared the progress of AlphaZero's chess with the history of how human play has evolved, 'an amazing pattern' reportedly appeared. For example, an analysis of the openings in ChessBase's human game records shows that in the 1500s everyone opened with 'e4', but that other first moves such as 'd4', 'Nf3', and 'c4' later became popular.
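A tally like the one described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the research team's actual method: the `first_move_shares` helper, the `(year, first_move)` record layout, and the toy records are all hypothetical, and none of the data comes from ChessBase.

```python
from collections import Counter, defaultdict


def first_move_shares(games, era_length=100):
    """Return {era_start: {move: share}} for (year, first_move) pairs.

    `games` is an assumed, simplified record format; a real database
    export would need to be parsed into this shape first.
    """
    counts = defaultdict(Counter)
    for year, move in games:
        # Bucket each game into an era, e.g. 1560 -> 1500 for 100-year eras.
        era_start = (year // era_length) * era_length
        counts[era_start][move] += 1

    shares = {}
    for era_start in sorted(counts):
        total = sum(counts[era_start].values())
        shares[era_start] = {
            move: count / total for move, count in counts[era_start].items()
        }
    return shares


# Made-up toy records for illustration only, NOT real ChessBase data.
toy_games = [
    (1560, "e4"), (1575, "e4"),
    (1910, "e4"), (1925, "d4"), (1930, "d4"), (1950, "Nf3"),
    (2005, "e4"), (2010, "d4"), (2015, "c4"), (2019, "Nf3"),
]

shares = first_move_shares(toy_games)
for era, moves in shares.items():
    print(era, moves)
```

With real game records, plotting each era's shares side by side would give the kind of per-era comparison the article describes.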

The graph below summarizes the first moves of human chess by era.



AlphaZero, on the other hand, chooses its first move completely at random at the beginning of training, but optimizes it through self-play. As a result, although it plays a wide variety of moves right after self-learning begins, it gradually comes to prefer first moves such as 'd4' and 'e4'. The three graphs show the first-move analysis for different versions of AlphaZero. Depending on the version, AlphaZero also often plays 'a6' on the fourth move. 'This is a classic move, and it is very interesting that no chess AI plays a unique move,' the team notes.
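One rough way to put a number on the shift from random first moves to a few preferred ones is the Shannon entropy of the first-move distribution. The sketch below is an assumption made for illustration, not a measure the research team is reported to have used, and the two distributions are hypothetical. The only chess fact it relies on is that the starting position has 20 legal first moves (16 pawn moves and 4 knight moves).

```python
import math


def entropy_bits(distribution):
    """Shannon entropy, in bits, of a {move: probability} mapping."""
    return -sum(p * math.log2(p) for p in distribution.values() if p > 0)


# The starting position has 20 legal first moves.
LEGAL_FIRST_MOVES = 20

# Hypothetical "start of training" policy: every legal move equally likely.
uniform = {f"move_{i}": 1 / LEGAL_FIRST_MOVES for i in range(LEGAL_FIRST_MOVES)}

# Hypothetical "late training" policy concentrated on two moves.
concentrated = {"d4": 0.5, "e4": 0.5}

print(entropy_bits(uniform))       # equals log2(20) for a uniform choice
print(entropy_bits(concentrated))  # 1 bit for an even two-way choice
```

Entropy falls as the engine settles on fewer first moves, so tracking it across training checkpoints would show the narrowing described above.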



The chess engine Stockfish has its own evaluation function, which it uses to assess the position of the pieces. AlphaZero, on the other hand, has no such handwritten evaluation function, and evaluates a position with a single value on a scale from '+1 (victory is certain)' to '-1 (defeat is certain)'. The graph below shows an analysis of this simple AlphaZero position evaluation in terms of Stockfish's evaluation function. Stockfish's evaluation function is based on six criteria: 'Imbalance', 'King Safety', 'Material', 'Mobility', 'Space', and 'Threats'. The analysis shows that AlphaZero places particular emphasis on 'Material (score)' when positioning its pieces. However, as the time spent on self-learning increases, the importance of 'Material' decreases and the importance of 'King Safety' increases instead.
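To illustrate the general idea of explaining a single position value in terms of named evaluation criteria, the sketch below fits least-squares weights that map six per-position scores to a value. It is a minimal sketch on synthetic data, not the team's actual analysis: the helper functions, the `EARLY` and `LATE` weight vectors, and the generated positions are all hypothetical, and no Stockfish or AlphaZero output is involved.

```python
import random

COMPONENTS = ["Imbalance", "King Safety", "Material", "Mobility", "Space", "Threats"]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_weights(features, values):
    """Least-squares weights via the normal equations (X^T X) w = X^T y."""
    k = len(features[0])
    xtx = [[sum(row[i] * row[j] for row in features) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(features, values)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def synthetic_positions(true_weights, n=200, seed=0):
    """Generate made-up component scores and the values they imply."""
    rng = random.Random(seed)
    features = [[rng.uniform(-1, 1) for _ in true_weights] for _ in range(n)]
    values = [sum(w * f for w, f in zip(true_weights, row)) for row in features]
    return features, values


def dominant_component(weights):
    """Name of the component with the largest absolute weight."""
    return COMPONENTS[max(range(len(weights)), key=lambda i: abs(weights[i]))]


# Hypothetical weightings, chosen only to mimic the reported trend.
EARLY = [0.0, 0.1, 0.8, 0.05, 0.0, 0.05]   # mostly 'Material'
LATE = [0.05, 0.4, 0.3, 0.1, 0.05, 0.1]    # 'King Safety' has overtaken it

early_fit = fit_weights(*synthetic_positions(EARLY))
late_fit = fit_weights(*synthetic_positions(LATE, seed=1))
print(dominant_component(early_fit), dominant_component(late_fit))
```

Because the synthetic values are exact linear combinations of the scores, the fit recovers the chosen weights. Real evaluations would be noisy and nonlinear, so a fit like this would only approximate which criteria a value estimate tracks.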



Looking at how the emphasized indicators change over the course of self-learning, the research team writes, 'This change is surprisingly human-like.'

Tom McGrath of the research team said, 'It's great and interesting to look at AlphaZero's moves and search for something we can understand. Sometimes nothing clear can be found in a move, and there are many things that don't fit or that you have to ponder. Of course, there is still a lot to do, and we still don't know how the elements relate to each other, but we can make some progress. That is encouraging, and I look forward to being able to understand the meaning of the other factors.'

in Software, Posted by logu_ii