Best-Neat-9439

Best-Neat-9439 t1_j2yn1nx wrote

>There are also AI that can improve themselves more than the human given data. The AlphaGo project started off with human Go matches as training data, and evolved into tabula-rasa training by self play. By the end, the AI beats the best human.

Neither AlphaGo Zero nor AlphaZero was trained with supervised learning on human games. Both were trained with reinforcement learning through self-play, combined with MCTS, so it's not pure RL but more like RL + planning. It's then not surprising that they can beat humans: their "ground truth" doesn't come from humans anyway.
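
To make the "ground truth doesn't come from humans" point concrete, here is a toy sketch. It is not AlphaGo Zero's or AlphaZero's algorithm: there is no neural network and no MCTS, just tabular Q-learning in negamax form on a single-pile Nim game (take 1 to 3 stones, whoever takes the last stone wins). The game, the function names, and all hyperparameters are assumptions picked for illustration. The only thing it is meant to show is that every training target is derived from the rules and from self-play outcomes, never from human moves.

```python
import random


def legal_moves(pile):
    """Moves allowed in this toy Nim variant: take 1, 2 or 3 stones."""
    return [t for t in (1, 2, 3) if t <= pile]


def train_self_play(max_pile=10, episodes=20000, alpha=0.5, epsilon=0.3, seed=0):
    """Learn move values purely from self-play (hypothetical toy setup)."""
    rng = random.Random(seed)
    # q[(pile, take)] estimates the final outcome (+1 win, -1 loss)
    # for the player who is about to move.
    q = {(p, t): 0.0 for p in range(1, max_pile + 1) for t in legal_moves(p)}

    for _ in range(episodes):
        pile = rng.randint(1, max_pile)
        while pile > 0:
            moves = legal_moves(pile)
            if rng.random() < epsilon:
                take = rng.choice(moves)  # explore
            else:
                take = max(moves, key=lambda t: q[(pile, t)])  # exploit
            rest = pile - take
            if rest == 0:
                target = 1.0  # taking the last stone wins
            else:
                # The opponent moves next, so our value is the negation
                # of the opponent's best value (negamax backup).
                target = -max(q[(rest, t)] for t in legal_moves(rest))
            q[(pile, take)] += alpha * (target - q[(pile, take)])
            pile = rest  # the same table now plays the other side
    return q


def greedy_move(q, pile):
    """Best move according to the learned table."""
    return max(legal_moves(pile), key=lambda t: q[(pile, t)])


if __name__ == "__main__":
    q = train_self_play()
    for pile in range(1, 11):
        print(pile, greedy_move(q, pile))
```

For piles that are not a multiple of 4, the learned policy should settle on leaving the opponent a multiple of 4, which is the known optimal strategy for this game. No human game records are involved at any point.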

16