Registered Data

[00535] Reinforcement Learning with Variable Exploration

  • Session Time & Room : 5D (Aug.25, 15:30-17:10) @E711
  • Type : Contributed Talk
  • Abstract : Reinforcement learning is a powerful machine learning technique, but it is unreliable when multiple agents learn simultaneously. Our work applies Q-learning to the Iterated Prisoner's Dilemma, an ideal setting in which to study AI cooperation. We investigate how different frameworks for variable exploration rates affect performance by helping agents escape local optima. One result is that shorter learning periods produce more cooperation, potentially indicating incentive alignment. This extends previous studies by carefully considering the ways the exploration rate might vary over time.
  • Classification : 68T05, 91A26, 37N40, 91A05
  • Format : Talk at Waseda University
  • Author(s) :
    • Brian Mintz (Dartmouth College)
    • Feng Fu (Dartmouth College)