What's behind Google AlphaZero's domination over Stockfish
Some notes and impressions from the gigantic battle: Google DeepMind's AI AlphaZero vs Stockfish.
A few days ago the pretender dominated the king of the chess engine rivalry: a new program devoured the best chess-playing engine so far. It was advertised as learning from scratch, and after only 4 hours of training it was capable of destroying the king of the virtual chess world. The interesting point is that it learns chess by playing against itself, which is made possible by reinforcement learning.
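To give a feel for what "learning by playing against itself" means, here is a minimal sketch on a toy game. It is not AlphaZero's method (AlphaZero combines a deep neural network with tree search); it is a plain tabular Monte Carlo learner for Nim, where players alternately take 1 to 3 stones and whoever takes the last stone wins. The function names `self_play_train` and `best_move`, and all parameter values, are assumptions made up for this illustration.

```python
import random
from collections import defaultdict


def self_play_train(episodes=20000, start=10, epsilon=0.2, lr=0.1, seed=0):
    """Learn Nim purely from games the learner plays against itself."""
    rng = random.Random(seed)
    # q[(stones, take)] estimates the final result for the player making that move.
    q = defaultdict(float)

    for _ in range(episodes):
        stones, history = start, []
        while stones > 0:
            moves = [t for t in (1, 2, 3) if t <= stones]
            if rng.random() < epsilon:
                take = rng.choice(moves)  # explore a random legal move
            else:
                take = max(moves, key=lambda t: q[(stones, t)])  # exploit
            history.append((stones, take))
            stones -= take

        # The player who made the last move won. Walk the game backwards,
        # flipping the sign each ply because the two sides alternate.
        reward = 1.0
        for state_action in reversed(history):
            q[state_action] += lr * (reward - q[state_action])
            reward = -reward
    return q


def best_move(q, stones):
    """Greedy move according to the learned table."""
    return max((t for t in (1, 2, 3) if t <= stones), key=lambda t: q[(stones, t)])


q = self_play_train()
for stones in (1, 2, 3):
    print(stones, "stones left -> take", best_move(q, stones))
```

Nobody tells the program which moves are good; the only signal is who won each self-played game. AlphaZero applies the same basic idea at a vastly larger scale, with a neural network in place of the lookup table.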
Some numbers
(source: Wikipedia, AlphaZero)
- $25 million - the price of the hardware for a single AlphaGo Zero instance
- from 4 to 72 hours of training to master the game and crush the rival chess engines
- 4.9 million games of self-play in just the first three days, played in quick succession (this is the figure reported for AlphaGo Zero)
- 4 hours of self-play were enough for AlphaZero to defeat Stockfish
- 64 GPU workers, 19 CPU parameter servers, 4 TPUs
- Stockfish evaluates about 70 million positions per second
- AlphaZero searches about 80,000 positions per second (but uses its deep neural network to focus only on the most promising moves)
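To put the two search rates listed above side by side, here is a quick back-of-the-envelope calculation. The variable names are made up for this illustration; the two rates are the approximate figures quoted in the list.

```python
# Approximate search rates quoted above (positions per second).
stockfish_rate = 70_000_000
alphazero_rate = 80_000

# How many positions Stockfish examines for each one AlphaZero examines.
ratio = stockfish_rate / alphazero_rate
print(f"Stockfish looks at roughly {ratio:.0f}x more positions per second")
```

So AlphaZero won while examining nearly three orders of magnitude fewer positions, which suggests that its strength comes from how selectively it searches, not from raw speed.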
Criticism
- Stockfish parameter settings were not optimal
- Stockfish is designed to use an opening database, but the match was played without one
- Training time is extremely short for such a task, but expensive hardware is needed and the rules are fixed
- The hardware was in AlphaZero's favour
- The version of Stockfish used was 'outdated'