What's up guys! Today I created an AI using reinforcement learning (RL)
that plays Tic Tac Toe with you. Using a Q-network, we train the AI with the Adam optimizer for 10,000 episodes ...
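To make the training loop concrete, here is a minimal sketch of the idea, not the actual code from this post. It makes several assumptions: the Q-network is reduced to a single linear layer (the real architecture isn't shown here), Adam is written by hand so the example needs only the standard library, the agent always plays first against a random opponent, and every name (`train`, `play_turn`, `best_move`) and hyperparameter (`gamma`, `epsilon`, `lr`) is hypothetical.

```python
import random

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
             (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 1 or -1 if that player has three in a row, else 0."""
    for a, b, c in WIN_LINES:
        if board[a] != 0 and board[a] == board[b] == board[c]:
            return board[a]
    return 0


def q_values(params, board):
    """Linear Q-function: one value per cell, Q[a] = W[a] . board + bias[a]."""
    return [sum(params[a * 9 + i] * board[i] for i in range(9)) + params[81 + a]
            for a in range(9)]


class Adam:
    """Hand-written Adam update over a flat list of parameters."""

    def __init__(self, n, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m, self.v, self.t = [0.0] * n, [0.0] * n, 0

    def step(self, params, grads):
        self.t += 1
        for i, g in enumerate(grads):
            self.m[i] = self.beta1 * self.m[i] + (1 - self.beta1) * g
            self.v[i] = self.beta2 * self.v[i] + (1 - self.beta2) * g * g
            m_hat = self.m[i] / (1 - self.beta1 ** self.t)  # bias correction
            v_hat = self.v[i] / (1 - self.beta2 ** self.t)
            params[i] -= self.lr * m_hat / (v_hat ** 0.5 + self.eps)


def play_turn(board, action, rng):
    """Agent (1) plays `action`, then a random opponent (-1) replies.

    Mutates `board` and returns (reward, done).
    """
    board[action] = 1
    if winner(board) == 1:
        return 1.0, True
    empty = [i for i in range(9) if board[i] == 0]
    if not empty:
        return 0.0, True  # draw
    board[rng.choice(empty)] = -1
    if winner(board) == -1:
        return -1.0, True
    return 0.0, all(cell != 0 for cell in board)


def best_move(params, board):
    """Greedy move: the empty cell with the highest Q-value."""
    q = q_values(params, board)
    return max((i for i in range(9) if board[i] == 0), key=lambda i: q[i])


def train(episodes=10_000, gamma=0.9, epsilon=0.1, seed=0):
    """Q-learning with an epsilon-greedy policy; returns the learned parameters."""
    rng = random.Random(seed)
    params = [rng.uniform(-0.1, 0.1) for _ in range(90)]  # 81 weights + 9 biases
    opt = Adam(len(params))
    for _ in range(episodes):
        board, done = [0] * 9, False
        while not done:
            state = board[:]
            if rng.random() < epsilon:
                action = rng.choice([i for i in range(9) if board[i] == 0])
            else:
                action = best_move(params, state)
            reward, done = play_turn(board, action, rng)
            # TD target: reward plus discounted best Q-value of the next state.
            target = reward
            if not done:
                next_q = q_values(params, board)
                target += gamma * max(next_q[i] for i in range(9) if board[i] == 0)
            # Gradient of 0.5 * (Q(s, a) - target)^2; only row `action` is non-zero.
            td_error = q_values(params, state)[action] - target
            grads = [0.0] * len(params)
            for i in range(9):
                grads[action * 9 + i] = td_error * state[i]
            grads[81 + action] = td_error
            opt.step(params, grads)
    return params
```

Calling `train()` with its defaults runs the 10,000 episodes mentioned above, and `best_move(params, board)` then picks the AI's move for a given board. A linear model is too weak to play Tic Tac Toe well, so a real Q-network would add hidden layers; the loop around it stays the same.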
Introduction
Both Proximal Policy Optimization (PPO) and Group Relative Policy Optimization (GRPO) are reinforcement learning (RL) algorithms. In this blog, I am...