An artificial-intelligence program called AlphaGo Zero has mastered the game of Go without any human data or guidance. A computer scientist and two members of the American Go Association discuss the implications.

See Article p.354

To beat world champions at the game of Go, the computer program AlphaGo has relied largely on supervised learning from millions of human expert moves. David Silver and colleagues have now produced a system called AlphaGo Zero, which is based purely on reinforcement learning and learns solely from self-play. Starting from random moves, it can reach superhuman level in just a couple of days of training and five million games of self-play, and can now beat all previous versions of AlphaGo. Because the machine independently discovers the same fundamental principles of the game that took humans millennia to conceptualize, the work suggests that such principles have some universal character, beyond human bias.
               