[00535] Reinforcement Learning with Variable Exploration
Session Time & Room : 5D (Aug.25, 15:30-17:10) @E711
Type : Contributed Talk
Abstract : Reinforcement learning is a powerful machine learning technique, but it becomes unreliable when multiple agents learn simultaneously. Our work applies Q-learning to the Iterated Prisoner's Dilemma, an ideal setting for studying AI cooperation. We investigate how different frameworks for varying the exploration rate affect performance by helping agents escape local optima. One result is that shorter learning periods produce more cooperation, potentially indicating incentive alignment. This extends previous studies by carefully considering the ways the exploration rate might vary over time.
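
For readers unfamiliar with the setup, the following is a minimal sketch of two independent Q-learners playing the Iterated Prisoner's Dilemma with a time-varying exploration rate. It is not the authors' implementation: the linear decay schedule, the payoff values (5, 3, 1, 0), the one-round memory state, and the names `epsilon_at` and `run_ipd` are all assumptions chosen for illustration. The abstract does not specify which schedules or hyperparameters were studied.

```python
import random

# Assumed textbook payoffs (T=5, R=3, P=1, S=0); not taken from the talk.
# Actions: 0 = cooperate, 1 = defect. PAYOFF[(a, b)] = (reward to A, reward to B).
PAYOFF = {(0, 0): (3, 3), (0, 1): (0, 5), (1, 0): (5, 0), (1, 1): (1, 1)}


def epsilon_at(step, total_steps, eps_start=1.0, eps_end=0.01):
    """Hypothetical schedule: linearly decay exploration from eps_start to eps_end."""
    frac = min(step / max(total_steps - 1, 1), 1.0)
    return eps_start + frac * (eps_end - eps_start)


def run_ipd(total_steps=5000, alpha=0.1, gamma=0.9, seed=0):
    """Train two independent epsilon-greedy Q-learners; return Q-tables and cooperation rate."""
    rng = random.Random(seed)
    # Each agent's state is the previous round seen from its side: (own action, other's action).
    states = [(a, b) for a in (0, 1) for b in (0, 1)]
    q = [{s: [0.0, 0.0] for s in states} for _ in range(2)]
    state = [(0, 0), (0, 0)]  # assumed starting state: both cooperated
    coop_count = 0
    for step in range(total_steps):
        eps = epsilon_at(step, total_steps)
        actions = []
        for i in range(2):
            if rng.random() < eps:
                actions.append(rng.randrange(2))  # explore: random action
            else:
                values = q[i][state[i]]
                actions.append(0 if values[0] >= values[1] else 1)  # exploit
        rewards = PAYOFF[(actions[0], actions[1])]
        next_state = [(actions[0], actions[1]), (actions[1], actions[0])]
        for i in range(2):
            # Standard Q-learning update toward reward plus discounted best next value.
            td_target = rewards[i] + gamma * max(q[i][next_state[i]])
            q[i][state[i]][actions[i]] += alpha * (td_target - q[i][state[i]][actions[i]])
        state = next_state
        coop_count += actions.count(0)
    return q, coop_count / (2 * total_steps)
```

Calling `run_ipd(total_steps=...)` with different horizons, or swapping `epsilon_at` for another schedule, is one way to compare how the exploration schedule influences the final cooperation rate in this toy setting.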