|
Publication: Campbell, Jeffrey, Givigi, Sidney N. and Schwartz, Howard M.
"Multiple Model Q-learning for Stochastic Asynchronous Rewards" |
Abstract:
The main contribution of this work is a novel
machine reinforcement learning algorithm for problems where
a Poissonian stochastic time delay is present in the agent's
reinforcement signal. Despite the presence of the reinforcement
noise, the algorithm can craft a suitable control policy for
the agent's environment. The novel approach can deal with
reinforcements that may be received out of order in time
or may even overlap, a case not previously considered in
the literature. The proposed algorithm is simulated and its
performance is compared to a standard Q-learning algorithm.
Through simulation, the proposed method is found to improve
the performance of a learning agent in an environment with
Poissonian-type stochastically delayed rewards.
Keywords: Reinforcement learning, Markov Decision Process, stochastic time delay, reward, cost, jitter, multiple models |