Registered Data

[00535] Reinforcement Learning with Variable Exploration

Session Time & Room : 5D (Aug.25, 15:30-17:10) @E711
Type : Contributed Talk
Abstract : Reinforcement learning is a powerful machine learning technique, but it is unreliable when multiple agents learn simultaneously. Our work applies Q-learning to the Iterated Prisoner's Dilemma, an ideal setting in which to study AI cooperation. We investigate how different frameworks for varying the exploration rate affect performance by helping learners escape local optima. One result is that shorter learning periods produce more cooperation, potentially indicating incentive alignment. This work extends previous studies by carefully considering the ways in which the exploration rate might vary over time.
Classification : 68T05, 91A26, 37N40, 91A05
Format : Talk at Waseda University
Author(s) :
- Brian Mintz (Dartmouth College)
- Feng Fu (Dartmouth College)
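
Illustrative sketch (not taken from the talk): the abstract describes two Q-learners playing the Iterated Prisoner's Dilemma with an exploration rate that varies over time. The Python below is one minimal way such a setup might look. Everything specific in it is an assumption made for illustration: the payoff values (5, 3, 1, 0), the state encoding (each player's previous pair of moves), the exponentially decaying `epsilon_schedule`, and the names `QAgent` and `play`. The authors' actual exploration frameworks and parameters are not given in this record.

```python
import random

# Assumed Prisoner's Dilemma payoffs (T=5, R=3, P=1, S=0); not from the talk.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
ACTIONS = ("C", "D")


def epsilon_schedule(t, eps0=0.5, decay=0.999, eps_min=0.01):
    """Hypothetical schedule: exploration rate decays exponentially to a floor."""
    return max(eps_min, eps0 * decay ** t)


class QAgent:
    """Tabular Q-learner with epsilon-greedy action selection."""

    def __init__(self, rng, alpha=0.1, gamma=0.9):
        self.rng, self.alpha, self.gamma = rng, alpha, gamma
        self.q = {}  # state -> {action: estimated value}

    def values(self, state):
        return self.q.setdefault(state, {a: 0.0 for a in ACTIONS})

    def act(self, state, eps):
        # Explore with probability eps, otherwise pick the greedy action.
        if self.rng.random() < eps:
            return self.rng.choice(ACTIONS)
        v = self.values(state)
        return max(ACTIONS, key=lambda a: v[a])

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update.
        target = reward + self.gamma * max(self.values(next_state).values())
        v = self.values(state)
        v[action] += self.alpha * (target - v[action])


def play(rounds=5000, seed=0):
    """Run two learners against each other; return the fraction of cooperative moves."""
    rng = random.Random(seed)
    a, b = QAgent(rng), QAgent(rng)
    # Each agent's state is (own last move, opponent's last move).
    state_a = state_b = ("C", "C")
    coop = 0
    for t in range(rounds):
        eps = epsilon_schedule(t)
        move_a, move_b = a.act(state_a, eps), b.act(state_b, eps)
        r_a, r_b = PAYOFF[(move_a, move_b)]
        next_a, next_b = (move_a, move_b), (move_b, move_a)
        a.update(state_a, move_a, r_a, next_a)
        b.update(state_b, move_b, r_b, next_b)
        state_a, state_b = next_a, next_b
        coop += (move_a == "C") + (move_b == "C")
    return coop / (2 * rounds)
```

Swapping `epsilon_schedule` for a different function of `t` (constant, periodic, or reset at intervals) is one way to compare exploration frameworks under this sketch; changing `rounds` gives a rough analogue of a shorter or longer learning period.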