Automatic Generation of a Sub-optimal Agent Population with Learning
Most modern solutions for video game balancing are tailored to specific games. We are currently researching general methods for automatic multiplayer game balancing. The problem is modeled as a meta-game, in which gameplay changes the rules of another game. In this way, a Machine Learning agent that learns to play the meta-game learns how to change a base game according to some balancing metric. One issue, however, lies in generating a high volume of gameplay training data in which agents of different skill levels compete against each other. To this end, we propose the automatic generation of a population of surrogate agents by sampling the learning process. In Reinforcement Learning, an agent learns by trial and error, gradually improving its policy, that is, the mapping from world state to the action to perform. This means that at each successful evolutionary step the agent follows a sub-optimal strategy or, eventually, the optimal strategy. We store the agent's policy at the end of each training episode. The process is evaluated in simple environments with distinct properties. The quality of the generated population is evaluated by the diversity of the difficulty the agents have in solving their tasks.
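As a rough illustration of the idea of storing the policy at the end of each training episode, the sketch below trains a tabular Q-learning agent and keeps a copy of its Q-table after every episode. Everything concrete here is an assumption made for illustration and is not taken from the abstract: the choice of Q-learning, the six-state corridor environment, the hyperparameters, and the names `N_STATES`, `step`, `greedy`, `train_population`, and `steps_to_goal`. The number of steps a snapshot needs to reach the goal is used only as one possible proxy for task difficulty.

```python
import random
from typing import List, Tuple

# Hypothetical toy environment: a corridor with states 0..5 and the goal at state 5.
N_STATES = 6
ACTIONS = (-1, +1)  # index 0 moves left, index 1 moves right

QTable = List[List[float]]


def step(state: int, action_idx: int) -> Tuple[int, float, bool]:
    """Apply an action and return (next_state, reward, done)."""
    nxt = min(max(state + ACTIONS[action_idx], 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else -0.01), done


def greedy(q_row: List[float], rng: random.Random) -> int:
    """Return the index of the highest-valued action, breaking ties at random."""
    best = max(q_row)
    return rng.choice([i for i, v in enumerate(q_row) if v == best])


def train_population(episodes: int = 50, alpha: float = 0.5, gamma: float = 0.95,
                     epsilon: float = 0.2, max_steps: int = 200,
                     seed: int = 0) -> List[QTable]:
    """Run tabular Q-learning and snapshot the Q-table after every episode."""
    rng = random.Random(seed)
    q: QTable = [[0.0, 0.0] for _ in range(N_STATES)]
    population: List[QTable] = []
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                action = greedy(q[state], rng)
            nxt, reward, done = step(state, action)
            # Standard Q-learning update; no bootstrapping from the terminal state.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
        # Copy every row so later updates do not alter this stored agent.
        population.append([row[:] for row in q])
    return population


def steps_to_goal(policy: QTable, max_steps: int = 200, seed: int = 0) -> int:
    """Follow a stored policy greedily; return steps used (max_steps if it fails)."""
    rng = random.Random(seed)
    state = 0
    for n in range(1, max_steps + 1):
        state, _, done = step(state, greedy(policy[state], rng))
        if done:
            return n
    return max_steps


if __name__ == "__main__":
    population = train_population()
    difficulties = [steps_to_goal(p) for p in population]
    print("agents stored:", len(population))
    print("distinct difficulty levels:", len(set(difficulties)))
```

Each stored snapshot acts as one surrogate agent. Counting the distinct values in `difficulties` is only one simple way to look at the diversity of the population; the abstract does not specify the metric actually used.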