[ONOS Seminar Series] Professor Heiko Schütt: Reward prediction error neurons implement an efficient code for reward
Description
How rewards are encoded in the brain is of great interest, as we need this information for a wide range of cognitive activities, for example to make decisions, learn, or plan. In a recent project, we used efficient coding principles borrowed from sensory neuroscience to derive the optimal neural population to encode a reward distribution. We showed that the responses of dopaminergic reward prediction error neurons in mouse and macaque are similar to those of the efficient code in the following ways: the neurons have a broad distribution of midpoints covering the reward distribution; neurons with higher thresholds have higher gains, more convex tuning functions and lower slopes; and their slope is higher when the reward distribution is narrower. Furthermore, we derived learning rules that converge to the efficient code. The learning rule for the position of the neuron on the reward axis closely resembles distributional reinforcement learning. Thus, reward prediction error neuron responses may be optimized to broadcast an efficient reward signal, forming a connection between efficient coding and reinforcement learning, two of the most successful theories in computational neuroscience. Going beyond the results of this project, I will discuss where else efficient coding principles might be relevant and how our results may integrate into our overall understanding of reward-based behaviours.
ZOOM LINK/L4E01