Published March 23, 2020 | Version v1
Software Open

Enhancement of hippocampal spatial decoding using a dynamic Q-learning method with a relative reward using theta phase precession

  • 1. National Yang Ming University
  • 2. National Cheng Kung University
  • 3. Taipei Medical University
  • 4. Hualien Tzu Chi Hospital
  • 5. National Laboratory Animal Center
  • 6. Zhejiang University
  • 7. University of Washington

Description

    The winners of the 2014 Nobel Prize in Physiology or Medicine, Professors John O’Keefe, May‐Britt Moser, and Edvard I. Moser, found that an internal global positioning system (GPS) in the brain allows us to navigate flexibly through the world we live in: exploring new areas, returning quickly to remembered places, and taking shortcuts. They confirmed that place cells in the hippocampus and grid cells in the entorhinal cortex (EC) are responsible for a higher-order cognitive map of the environment. Indeed, these abilities feel so easy and natural that it is not immediately obvious how complex the underlying processes really are. In contrast, spatial navigation remains a substantial challenge for artificial agents, whose abilities are far outstripped by those of mammals.

    Hippocampal place cells and interneurons in mammals have been shown to have stable place fields and theta phase precession profiles that encode spatial information from the environment. The activity of hippocampal CA1 neurons can represent both the animal’s current location and prospective information about the goal location. Reinforcement learning algorithms such as Q-learning have been adopted to build navigation models of place cells that address goal-directed navigation problems.
    In this study, we propose dynamical Q-learning (dQ-learning), named for its adaptive reward function based on theta phase precession, which has recently been associated with a rat’s experiences at destinations. The method uses information from both place cells and interneurons as inputs to predict the animal’s trajectory. We evaluated the convergence rates and learning performance of tQ-learning and dQ-learning with different cell types. The results demonstrate that dQ-learning improves both learning performance and convergence rate, and that place cells and interneurons with phase precession may provide valuable information for improving trajectory prediction. To investigate whether the enhancement of hippocampal spatial decoding with the dQ-learning method was effective in goal-directed navigation, experimental data were recorded from rats implanted with microelectrodes and trained in a water-reward task. During the task, electrophysiological recordings of spikes and local field potentials (LFPs) were acquired together with movement trajectories. The proposed dQ-learning algorithm achieved better learning performance, with good prediction accuracy and a high convergence rate. The adaptive reward function and the cell types used as inputs were found to be critical factors for hippocampal spatial decoding with the dQ-learning method.
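    To make the idea of a Q-learning update with an adaptive reward more concrete, the following is a minimal sketch in Python. It is not the code distributed with this record, and it does not reproduce the dQ-learning reward function, which the description above does not specify. The linear track, the two actions, the function names (`q_update`, `relative_reward`, `run_track`), and the per-state `phase_weights` are all hypothetical, chosen only to show where a reward that depends on a phase-related signal could enter the standard tabular update.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update; returns the new Q(state, action)."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


def relative_reward(base_reward, phase_weight):
    """Hypothetical adaptive reward: scale the base reward by a weight in [0, 1].

    The weight stands in for a theta-phase-derived signal. The actual
    reward function of dQ-learning is not given in the description, so
    this scaling is an assumption made purely for illustration.
    """
    return base_reward * phase_weight


def run_track(n_states=5, episodes=200, phase_weights=None, epsilon=0.1, seed=0):
    """Train on a toy linear track whose goal is the last state."""
    rng = random.Random(seed)
    if phase_weights is None:
        phase_weights = [1.0] * n_states  # constant weights: plain Q-learning
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]  # actions: 0 = left, 1 = right
    for _ in range(episodes):
        state = 0
        while state != goal:
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                action = 0 if q[state][0] > q[state][1] else 1  # exploit
            next_state = max(0, state - 1) if action == 0 else state + 1
            base = 1.0 if next_state == goal else 0.0
            reward = relative_reward(base, phase_weights[next_state])
            q_update(q, state, action, reward, next_state)
            state = next_state
    return q


if __name__ == "__main__":
    q_table = run_track()
    for state, values in enumerate(q_table):
        print(state, [round(v, 3) for v in values])
```

    With constant weights the sketch reduces to ordinary Q-learning. Passing a different `phase_weights` list changes only the reward term, which is the part of the update that the description identifies as adaptive.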

Files

Files (18.3 MB)

md5:403a310e37269281ecd2dd822de1eb76 (18.3 MB)