An anatomical substrate of credit assignment in reinforcement learning

Jorgen Kornfeld

Michal Januszewski

Michale S. Fee

Philipp Schubert

Viren Jain

Winfried Denk

bioRxiv (2020)

Abstract

How is experience used to improve performance? In both biological and artificial systems, the optimization of parameters that affect behavior requires a process that determines whether a parameter affects the outcome and then modifies the parameter accordingly. Central to the recent bloom of artificial intelligence has been the error-backpropagation algorithm (Rumelhart, Hinton, and Williams 1986), which computationally retraces the signal from the output to each synapse (weight) and allows a large number of parameters to be optimized in parallel at high learning rates. Biological systems, however, lack an obvious mechanism to retrace the signal path. Here we show, by combining high-throughput volume electron microscopy (Denk and Horstmann 2004) and automated connectomic analysis (Januszewski et al. 2018; Dorkenwald et al. 2017; Schubert et al. 2019), that the synaptic architecture of songbird basal ganglia supports a form of local credit assessment proposed in a model of songbird reinforcement learning (Fee and Goldberg 2011). We show that three of this model’s major predictions hold true. First, cortical axons that encode exploratory motor variability terminate predominantly on dendritic shafts of spiny neurons. Second, cortical axons that encode timing seek out spines, which enable calcium-based coincidence detection (Yuste and Denk 1995) and appear to be capable of creating and storing eligibility traces (Yagishita et al. 2014). Third, synapse pairs that share a cortical timing axon presynaptically and a medium spiny dendrite postsynaptically are substantially more similar in size than expected, indicating a history of Hebbian plasticity (Bartol et al. 2015; Kasthuri et al. 2015). Combined with numerical simulations, these data provide strong evidence for a model of basal ganglia learning with a biologically plausible credit assignment mechanism.
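To make the idea of local credit assessment concrete, the following is a minimal sketch of a generic reward-modulated perturbation rule, not the numerical simulations referred to in the abstract. Each "timing" synapse keeps a local eligibility value formed by the coincidence of its timing input with an exploratory "variability" input, and a single scalar reward then decides whether that synapse is strengthened or weakened. The target values, the quadratic reward, the running-average baseline, and all parameter names (`lr`, `noise_sd`, `baseline_rate`) are illustrative assumptions.

```python
import random

# Hypothetical desired output at each of five time steps (illustrative values only).
TARGET = [0.8, -0.6, 0.4, -0.2, 0.9]


def mean_squared_error(weights, target=TARGET):
    """Average squared distance between the learned weights and the target."""
    return sum((w - t) ** 2 for w, t in zip(weights, target)) / len(target)


def train(n_trials=3000, lr=0.02, noise_sd=0.5, baseline_rate=0.1, seed=0):
    """Learn one weight per time step using only locally available signals.

    Assumed setup: one timing input is active per time step, the variability
    input adds zero-mean noise to the output, and reward is a global scalar.
    """
    rng = random.Random(seed)
    n_steps = len(TARGET)
    weights = [0.0] * n_steps    # one timing-input synapse per time step
    baseline = [None] * n_steps  # running estimate of expected reward per step

    for _ in range(n_trials):
        for t in range(n_steps):
            timing = 1.0                            # timing input active at step t
            variability = rng.gauss(0.0, noise_sd)  # exploratory perturbation
            output = weights[t] * timing + variability

            # Global evaluation signal: higher when the output is near the target.
            reward = -(output - TARGET[t]) ** 2
            if baseline[t] is None:
                baseline[t] = reward

            # Local eligibility: coincidence of timing and variability inputs.
            eligibility = timing * variability

            # Three-factor update: the reward surprise gates the local eligibility.
            weights[t] += lr * (reward - baseline[t]) * eligibility

            # Slowly track the expected reward for this time step.
            baseline[t] += baseline_rate * (reward - baseline[t])
    return weights


if __name__ == "__main__":
    learned = train()
    print("error before learning:", mean_squared_error([0.0] * len(TARGET)))
    print("error after learning: ", mean_squared_error(learned))
```

No signal is retraced from the output back to individual synapses in this sketch: each weight change depends only on quantities available at that synapse (its timing input and the coincident variability input) plus one broadcast reward value, which is the sense in which credit is assigned locally.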

Research Areas

General Science
