Author: 小柯机器人 (Xiaoke Robot) Published: 2024/6/23 16:09:45

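The group of Heiko H. Schütt at New York University (USA) has reported a new advance: they found that reward prediction error neurons implement an efficient code for reward. The paper was published in Nature Neuroscience on June 19, 2024.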




Title: Reward prediction error neurons implement an efficient code for reward

Author: Schütt, Heiko H., Kim, Dongjae, Ma, Wei Ji

Issue&Volume: 2024-06-19

Abstract: We use efficient coding principles borrowed from sensory neuroscience to derive the optimal neural population to encode a reward distribution. We show that the responses of dopaminergic reward prediction error neurons in mouse and macaque are similar to those of the efficient code in the following ways: the neurons have a broad distribution of midpoints covering the reward distribution; neurons with higher thresholds have higher gains, more convex tuning functions and lower slopes; and their slope is higher when the reward distribution is narrower. Furthermore, we derive learning rules that converge to the efficient code. The learning rule for the position of the neuron on the reward axis closely resembles distributional reinforcement learning. Thus, reward prediction error neuron responses may be optimized to broadcast an efficient reward signal, forming a connection between efficient coding and reinforcement learning, two of the most successful theories in computational neuroscience.

DOI: 10.1038/s41593-024-01671-x

Source: https://www.nature.com/articles/s41593-024-01671-x


Nature Neuroscience: founded in 1998 and published by the Springer Nature publishing group. Latest IF: 28.771