Back-propagation as reinforcement in prediction tasks
Grüning, A (2005) Back-propagation as reinforcement in prediction tasks. In: ICANN 2005: 15th International Conference on Artificial Neural Networks, 2005-09-11 - 2005-09-15, Warsaw, Poland.
Available under License: See the attached licence file.
The back-propagation (BP) training scheme is widely used for training network models in cognitive science despite its well-known technical and biological shortcomings. In this paper we contribute to making the BP training scheme more acceptable from a biological point of view in cognitively motivated prediction tasks, overcoming one of its major drawbacks. Traditionally, recurrent neural networks in symbolic time-series prediction (e.g. language) are trained with gradient-descent-based learning algorithms, notably with back-propagation (BP) through time. A major drawback for the biological plausibility of BP is that it is a supervised scheme in which a teacher has to provide a fully specified target answer. Yet agents in natural environments often receive only a summary feedback about the degree of success or failure, a view adopted in reinforcement learning schemes. In this work we show that, for simple recurrent networks in prediction tasks for which there is a probability interpretation of the network's output vector, Elman BP can be reimplemented as a reinforcement learning scheme whose expected weight updates agree with those of traditional Elman BP, using ideas from the AGREL learning scheme (van Ooyen and Roelfsema 2003) for feed-forward networks.
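The core idea of the abstract — that a reinforcement scheme with a single scalar reward can have the same *expected* weight update as supervised BP on a probabilistic output layer — can be illustrated with a minimal sketch. This is not the paper's exact derivation or the precise AGREL rule; it is a hedged toy construction (softmax output, one-hot target, hypothetical input and logit values) showing one way the equivalence can hold: the network stochastically selects one output unit, receives reward 1 only if that unit is the correct prediction, and scales the update by the inverse selection probability, so that averaging over selections recovers the cross-entropy BP update exactly.

```python
import math

def softmax(z):
    # Numerically stable softmax over a list of logits.
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def bp_update(p, t, x):
    # Supervised BP (cross-entropy) weight update, i.e. the negative gradient:
    # dW[k][i] = (t_k - p_k) * x_i, with t the index of the one-hot target.
    return [[((1.0 if k == t else 0.0) - p[k]) * xi for xi in x]
            for k in range(len(p))]

def rl_update(p, s, r, x):
    # Reward-modulated update after stochastically selecting output unit s
    # and receiving scalar reward r (1 if s was the correct prediction, else 0).
    # Scaling by 1/p_s is the assumed AGREL-style modulation in this sketch.
    g = r / p[s]
    return [[g * ((1.0 if k == s else 0.0) - p[k]) * xi for xi in x]
            for k in range(len(p))]

# Hypothetical values for illustration only.
x = [0.5, -1.0, 2.0]   # input activations
z = [0.3, -0.2, 1.1]   # output logits
t = 2                  # index of the correct next symbol
p = softmax(z)

# Expected reinforcement update: E_s[rl_update] = sum_s p_s * rl_update(s).
# Only s == t earns reward, so the p_t and 1/p_t factors cancel,
# and the expectation reduces to the supervised BP update.
expected = [[sum(p[s] * rl_update(p, s, 1.0 if s == t else 0.0, x)[k][i]
                 for s in range(len(p)))
             for i in range(len(x))]
            for k in range(len(p))]
bp = bp_update(p, t, x)
```

Here the teacher's fully specified target vector is replaced by a single success/failure signal, yet on average the weights move exactly as under supervised BP, which is the flavour of result the paper establishes for Elman networks.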
|Item Type:||Conference or Workshop Item (Conference Paper)|
|Divisions:||Faculty of Engineering and Physical Sciences > Computing Science|
|Identification Number:||https://doi.org/10.1007/11550907_86|
|Additional Information:||The original publication is available at http://www.springerlink.com|
|Depositing User:||Symplectic Elements|
|Date Deposited:||31 Aug 2012 14:33|
|Last Modified:||23 Sep 2013 19:35|