Neural Computation Workshop 2025 (FY2024)

February 22, 2025

Neural Computation Workshop 2024 group photo

About the Workshop

Aim:

The aim is for current and former members of the Doya Unit to exchange recent progress and new ideas.
 

Date:
Saturday, February 22, 2025
Location:
OIST Seminar Room B250, Center Building
 

Program:
8:30-9:00 Registration
9:00 Opening
9:10 Special lecture: Jun Tani (OIST)

Session 1
Chair: Katsuhiko Miyazaki (OIST)
10:00 Eiji Uchibe (ATR)
10:30 Yuji Kanagawa (OIST)
10:50 Break

Session 2
Chair: Jovan Rebolledo Mendez (OIST)
11:10 Junichiro Yoshimoto (Fujita Health University)
11:40 Carlos Enrique Gutierrez (SoftBank Corp. AI strategy office, visiting researcher at OIST)
12:10 Yukako Yamane (OIST)

12:30 Lunch & Poster

Session 3
Chair: Razvan Gamanut (OIST)
14:00 Akihiro Funamizu (University of Tokyo)
14:30 Alan Rodrigues (Hiroshima University)
15:00 Kevin Max (OIST)
15:20 Break

Session 4
Chair: Kayoko Miyazaki (OIST)
15:40 Makoto Otsuka (LiLz Inc., Tohoku University)
16:10 Miles Desforges (Araya Inc.)
16:40 Jianning Chen (OIST)
17:00 Discussion
17:40 Closing
18:30 Dinner at the Ocean Terrace

Poster Presentation List

#1 Naoto Yoshida (Kyoto University)
     "Minimalist approach to meta-reinforcement learning in embodied agents"
#2 Jianning Chen (OIST)
     "Extracting the Dynamics of Meta-parameters in Reinforcement Learning from Choice and Learning Behavior"
#3 Razvan Gamanut (OIST)
     "A computational bottom-up, mesoscale approach for the study of the interaction between claustrum and cortex"

Abstracts:

Eiji Uchibe
Advanced Telecommunications Research Institute International (ATR)

Human-in-the-loop policy learning by co-evolution of human and generative imitation learning

Generative imitation learning can find a reasonable policy from a limited number of successful trajectories of states and actions. However, its performance is upper-bounded by the given trajectories, which are usually provided before imitation learning begins. Recently, reinforcement learning from human feedback (RLHF) has been receiving attention as a method to optimize the reward function with the help of human feedback. This paper proposes a human-in-the-loop generative imitation learning method that learns from human demonstrations and evaluations in a systematic way. We adopt Model-Based Entropy-Regularized Imitation Learning (MB-ERIL), which estimates the reward function, the policy, and the state transition dynamics from the given demonstrations. Next, the estimated policy and state transition dynamics are used to generate a set of trajectories, which are evaluated by a human operator. Then, the reward function is modified based on RLHF. The modified reward function is further used in forward reinforcement learning to improve the policy. The proposed method is evaluated on a benchmark task, and experimental results show that our human-in-the-loop method achieved the best performance.
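
As a rough sketch of the loop described above (our own illustration, not MB-ERIL itself), the toy below replaces the benchmark task with a 1-D chain MDP, the human operator with a simulated preference oracle, and the RLHF step with a simple additive reward update; only the structure of the loop is retained:

```python
# Toy sketch of the human-in-the-loop loop: everything here (chain MDP,
# simulated "human" oracle, additive reward update) is a hypothetical
# stand-in, not MB-ERIL or the authors' RLHF procedure.
import numpy as np

rng = np.random.default_rng(0)
N_STATES, GOAL = 10, 9
true_reward = -np.ones(N_STATES)          # known only to the "human"
true_reward[GOAL] = 10.0

def rollout(policy, length=30):
    """Generate one trajectory with the current stochastic policy."""
    s, traj = 0, []
    for _ in range(length):
        a = rng.choice(2, p=policy[s])    # 0 = step left, 1 = step right
        traj.append(s)
        s = min(max(s + (1 if a == 1 else -1), 0), N_STATES - 1)
    return traj

def human_prefers(t1, t2):
    """Simulated operator: prefers the trajectory with higher true return."""
    return sum(true_reward[s] for s in t1) > sum(true_reward[s] for s in t2)

def softmax_policy(r_hat, beta=1.0):
    """Derive a policy from the current reward estimate (forward-RL proxy)."""
    pol = np.zeros((N_STATES, 2))
    for s in range(N_STATES):
        q = r_hat[[max(s - 1, 0), min(s + 1, N_STATES - 1)]]
        e = np.exp(beta * (q - q.max()))
        pol[s] = e / e.sum()
    return pol

r_hat = np.zeros(N_STATES)                # learned reward estimate
for _ in range(300):
    policy = softmax_policy(r_hat)
    t1, t2 = rollout(policy), rollout(policy)
    win, lose = (t1, t2) if human_prefers(t1, t2) else (t2, t1)
    for s in win:                         # preference-based reward update
        r_hat[s] += 0.1
    for s in lose:
        r_hat[s] -= 0.1
print("learned reward peaks at state", int(np.argmax(r_hat)))
```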

Yuji Kanagawa
Okinawa Institute of Science and Technology

Reward evolution in stable population and prey-predator dynamics

Animal brains have evolved to help animals survive and produce more offspring. The reward system is the most fundamental brain function for this purpose, evaluating external stimuli and providing learning cues to reinforce positive behaviors like eating and drinking while avoiding dangerous ones. Despite its importance, the environmental conditions under which the reward system evolved remain unclear. Based on biological evidence, we hypothesize that the evolution of foraging is critical for the evolution of the reward system. To test this hypothesis, we conduct an evolutionary simulation of many agents with foraging abilities that learn behavior by reinforcement learning from genetically encoded rewards. Our results show that food rewards evolve stably in a stable population. Moreover, we test how aversive negative rewards for threats, corresponding to fear, can evolve in an environment with predators. We observed negative rewards for observing threats only in the predator-prey setting, suggesting the importance of predators in the evolution of fear.
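
As a rough illustration of this setup (our sketch, not the paper's simulation), the toy below evolves a population whose one-gene genome encodes the reward for eating food; each agent learns within its lifetime by simple Q-learning from that genetic reward, and lifetime energy intake drives selection:

```python
# Toy sketch of reward evolution: a one-gene genome encodes the reward for
# eating; learning uses the genetic reward, selection uses actual energy.
# A hypothetical stand-in for the paper's simulation.
import numpy as np

rng = np.random.default_rng(1)

def lifetime_fitness(food_reward_gene, steps=200, alpha=0.2, beta=3.0):
    """One lifetime: learn by Q-learning from the genetic reward."""
    q = np.zeros(2)                       # 0 = ignore, 1 = approach food
    energy = 0.0
    for _ in range(steps):
        p = np.exp(beta * (q - q.max()))
        p /= p.sum()
        a = rng.choice(2, p=p)
        ate = (a == 1) and (rng.random() < 0.8)
        energy += 1.0 if ate else 0.0     # fitness = energy actually gained
        r = food_reward_gene if ate else 0.0
        q[a] += alpha * (r - q[a])        # learning signal is the *gene*
    return energy

pop = rng.normal(0.0, 1.0, size=50)       # initial food-reward genes
for generation in range(30):
    fitness = np.array([lifetime_fitness(g) for g in pop])
    parents = pop[np.argsort(fitness)[-25:]]            # truncation selection
    pop = rng.choice(parents, size=50) + rng.normal(0.0, 0.1, size=50)
print(f"mean evolved food reward: {pop.mean():.2f}")    # tends to be positive
```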

Junichiro Yoshimoto
Fujita Health University

Multidisciplinary alliance for translational research in neuropsychiatry in Fujita Health University

Fujita Health University has been selected for the Program for Forming Japan’s Peak Research Universities (J-PEAKS), a prestigious initiative aimed at fostering world-class research institutions in Japan. Our proposed project focuses on establishing a leading research center for neuropsychiatric disorders and an academia-driven drug discovery ecosystem. A key feature of this initiative is the seamless in-house collaboration among researchers from diverse disciplines, including neuroscience, behavioral pharmacology, neurology, psychiatry, and computer science. This multidisciplinary alliance facilitates the development of an innovative translational research framework for elucidating the pathophysiology of neuropsychiatric disorders and advancing novel therapeutic strategies. In this presentation, we will introduce the framework of this research initiative and present a preliminary study conducted within this framework.

Carlos Enrique Gutierrez
SoftBank Corp. AI strategy office, visiting researcher at Okinawa Institute of Science and Technology

LLM Agents for Neuroscience Workflows

Large Language Model (LLM) agents have emerged as a significant research topic. Over the past year, these agents have been proposed for various applications across industries. Beyond the classical definition of agents as entities that perceive their environment and act upon it, agent components such as memory, reasoning, and planning capabilities are being upgraded as a direct consequence of LLM advancements. In industry, these agents are being designed to operate autonomously over extended periods, utilizing various tools to accomplish tasks or perform actions following predefined workflows.
In neuroscience, where complex data processing, analysis, and brain modeling are common, these agents could potentially automate numerous tasks within existing pipelines. However, the relationship between LLM agents and neuroscience is bidirectional: while neuroscience workflows can benefit from agent automation, current agent architectures could significantly benefit from neuroscience findings to enhance their core components. We will explore the potential synergies and mutual benefits of combining these two fields, examining how their integration could advance both artificial intelligence and neuroscientific research.

Yukako Yamane
Okinawa Institute of Science and Technology

Neural representation of figure-ground and their ambiguity

Figure-ground (FG) segregation is one of the earliest “cognitive” functions, and ambiguity of FG judgement across stimuli and individuals has been widely reported. For instance, our colleagues have conducted psychological experiments using local natural contours and reported a wide range of ambiguity in FG judgement across the contours. On the other hand, it has been proposed that “sensory” signal ambiguity emerges as neural activity variability (Orban et al., 2016). Here, we investigated how FG signals emerge in the intermediate-level visual cortex, V4, focusing on stimulus ambiguity and neural response variability. We examined single-cell ambiguity in FG determination and response variability. We also examined neural population activity using svGPFA (sparse variational Gaussian process factor analysis), which summarizes the activity of neurons with a Poisson observation model under a Gaussian process, yielding several latent factors. I report the results of these analyses and discuss how “cognitive” ambiguity can be represented in neural activity.

Akihiro Funamizu (first author: Hayato Maeda)
Institute for Quantitative Biosciences (IQB), University of Tokyo

Modeling the neural basis of multiple strategies with deep reinforcement learning

Humans and animals use multiple strategies in decision making. Agents combine a model-based strategy, which predicts contexts using state transitions, and a model-free strategy, which updates expected rewards of choices (action values) from direct experience (Daw et al., 2011). Agents also use an inference-based strategy which predicts a hidden task structure from sensory cues (Cazettes et al., 2023). Early studies investigating habit and goal-directed behaviors propose that the brain has parallel independent circuits for multiple strategies (Daw et al., 2005). In contrast, recent human studies and brain-wide electrophysiology in rodents show that multiple strategies are represented in overlapping brain regions (Collins & Cockburn, 2020). It is thus still unclear how the brain implements multiple strategies.
Recently, deep reinforcement learning (RL) with recurrent neural networks has shown that model-free and model-based strategies can both be modeled within a single framework (Wang et al., 2018). Also, the RL model captures multiple strategies with slow weight changes and fast trial-by-trial updates (Duan et al., 2016; Hattori et al., 2023). Here we updated the deep RL model and captured the choice behaviors of mice from our previous study during a perceptual decision-making task with multiple behavioral strategies (Wang et al., 2024). Our model implements model-free and inference-based strategies by only updating action values, without explicitly computing state prediction errors. The proposed model captured the choice behavior of mice better than conventional deep RL models. The activity of artificial units in the model matched the activity of mouse neurons in the frontal cortex. In addition, our network model captured the choices of humans and rodents in the two-step decision task (Daw et al., 2011) simply by being trained to maximize outcomes. These results support the view that multiple behavioral strategies are implemented by a single learning rule using action values and are globally represented in the brain.
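
A minimal sketch of the meta-RL ingredient this abstract builds on (Wang et al., 2018), written under our own simplifying assumptions rather than as the authors' model: a recurrent policy that receives only its previous action and reward must carry action-value-like information in its hidden state to solve a reversal bandit:

```python
# Conceptual sketch (not the authors' model): a recurrent policy trained with
# REINFORCE on a two-armed reversal bandit. All sizes and settings are
# illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)

class RecurrentAgent(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.gru = nn.GRUCell(3, hidden)    # input: [prev_a0, prev_a1, prev_r]
        self.head = nn.Linear(hidden, 2)    # logits over the two arms

    def forward(self, x, h):
        h = self.gru(x, h)
        return self.head(h), h

agent, hidden = RecurrentAgent(), 32
opt = torch.optim.Adam(agent.parameters(), lr=1e-3)

for episode in range(2000):
    good_arm = torch.randint(2, (1,)).item()     # which arm pays off 80%
    h, x = torch.zeros(1, hidden), torch.zeros(1, 3)
    log_probs, rewards = [], []
    for t in range(30):
        if t == 15:
            good_arm = 1 - good_arm              # mid-episode reversal
        logits, h = agent(x, h)
        dist = torch.distributions.Categorical(logits=logits)
        a = dist.sample()
        r = float(torch.rand(1).item() < (0.8 if a.item() == good_arm else 0.2))
        log_probs.append(dist.log_prob(a))
        rewards.append(r)
        x = torch.zeros(1, 3)                    # next input: last action, reward
        x[0, a.item()] = 1.0
        x[0, 2] = r
    # REINFORCE with undiscounted returns-to-go and a mean baseline.
    returns, G = [], 0.0
    for r in reversed(rewards):
        G += r
        returns.insert(0, G)
    returns = torch.tensor(returns)
    returns = returns - returns.mean()
    loss = -(torch.stack(log_probs).squeeze() * returns).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
    if episode % 500 == 0:
        print(f"episode {episode}: reward {sum(rewards):.0f}/30")
```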

Alan Rodrigues
Hiroshima University

Shared Neural and Interoceptive Dysfunctions in Psychiatric and Developmental Disorders

Autism Spectrum Disorder (ASD) is characterized by disturbances in social cognition, communication, and repetitive behaviors, while Major Depressive Disorder (MDD) is associated with negative mood, anhedonia, and reduced motivation for social interactions. Despite their distinct clinical manifestations, neuroimaging studies suggest shared neural disturbances, particularly in the insular cortex and broader neural networks, as potential common mechanisms across developmental and psychiatric disorders. However, the specific symptoms arising from these neural alterations remain unclear. Here, we present early analyses demonstrating that ASD and MDD share interoceptive dysregulations, particularly cardiovascular disturbances. Our findings reveal widespread disruptions in cardiac signals in MDD, alongside a loss of sympathetic and parasympathetic cardiac representation in the insular cortex. Beyond the well-documented social impairments in ASD, we identified a strong association between ASD traits and heightened depressive, anxiety, and stress symptoms. Additionally, reduced volume and disrupted structural connectivity of the ventromedial prefrontal cortex were linked to both high ASD traits and depressive symptoms. Furthermore, in a sample of healthy participants, autistic traits showed a significant correlation with heart rate variability (HRV) signals, reinforcing the role of autonomic nervous system dysfunction in ASD. These findings highlight a potential overlap between neurodevelopmental and psychiatric disorders, suggesting that autonomic and neural dysregulation may serve as a common pathway linking ASD and MDD. Finally, using the Active Inference approach, we simulate how cardiovascular disturbances may emerge in MDD. This model demonstrates how prolonged stress, combined with impaired learning rates and reduced precision in predictive mechanisms, contributes to persistent heart rate elevation in MDD.

 

Kevin Max
Okinawa Institute of Science and Technology

Synthetic Biology Meets Neuromorphic Computing: Towards a Bio-Inspired Olfactory Perception System

In this work, we explore how the combination of synthetic biology, neuroscience modeling, and neuromorphic electronic systems offers a new approach to creating an artificial system that mimics the natural sense of smell. We argue that a co-design approach offers significant advantages in replicating the complex dynamics of odor sensory processes. We present an intermediate hybrid system of a synthetic sensory neuron that provides three key features: a) receptor-gated ion channels, b) electrode adhesion, and c) event-based encoding and computing. This research seeks to develop a platform for ultra-sensitive, specific, and energy-efficient odor detection, with potential implications for environmental monitoring, medical diagnostics, and security.

 

Makoto Otsuka
LiLz Inc., Tohoku University

From Remote Inspection to Quantum-Enhanced Data Cleansing

My talk is twofold. In the first half, I will introduce LiLz Inc. (https://lilz.jp/), a local startup in Okinawa that specializes in remote inspection through AI and IoT technologies. As a co-founder, I will outline our recent progress and showcase our latest hardware and software products.
In the latter half, I will present a technique we recently proposed for removing mislabeled instances from contaminated training datasets—a common challenge that degrades model generalization in real-world applications. Our method utilizes a quantum annealer in conjunction with surrogate model-based black-box optimization. It iteratively refines the quality of training subsets to lower the validation loss computed against a clean validation set. Experiments on a noisy majority bit task demonstrate that our algorithm effectively prioritizes the removal of high-risk mislabeled instances and leverages the more diverse samples generated by the physical quantum annealer from D-Wave Systems, outperforming simulation-based samplers provided by OpenJij and Neal. Detailed results and methodology are available in our preprint (https://arxiv.org/abs/2501.06916).
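
To make the loop concrete, here is an illustrative sketch under stated assumptions: a linear surrogate and a simple simulated-annealing sampler stand in for the full surrogate model and the D-Wave hardware; the preprint linked above describes the actual method:

```python
# Illustrative sketch of surrogate-based data cleansing: select a binary mask
# over training instances that minimizes validation loss. Simulated annealing
# replaces the quantum annealer; a linear surrogate replaces the full model.
import numpy as np

rng = np.random.default_rng(2)
N = 40                                        # training instances
X = rng.normal(size=(N, 5))
w_true = rng.normal(size=5)
y = (X @ w_true > 0).astype(float)
flip = rng.choice(N, 10, replace=False)       # contaminate 10 labels
y[flip] = 1 - y[flip]
Xv = rng.normal(size=(200, 5))                # clean validation set
yv = (Xv @ w_true > 0).astype(float)

def val_loss(mask):
    """Fit a linear classifier on the selected subset; return val error."""
    idx = mask.astype(bool)
    if idx.sum() < 5:
        return 1.0
    w, *_ = np.linalg.lstsq(X[idx], 2 * y[idx] - 1, rcond=None)
    return float(np.mean((Xv @ w > 0) != yv))

masks = [rng.integers(0, 2, N) for _ in range(20)]
losses = [val_loss(m) for m in masks]
for _ in range(30):
    # Fit a linear surrogate of validation loss as a function of mask bits.
    A = np.c_[np.vstack(masks), np.ones(len(masks))]
    coef, *_ = np.linalg.lstsq(A, np.array(losses), rcond=None)
    # "Sample" a new candidate mask by annealing on the surrogate.
    m = masks[int(np.argmin(losses))].copy()
    for _ in range(200):
        j = rng.integers(N)
        delta = coef[j] * (1 - 2 * m[j])      # surrogate change if bit flips
        if delta < 0 or rng.random() < np.exp(-delta / 0.1):
            m[j] ^= 1
    masks.append(m)
    losses.append(val_loss(m))

best = masks[int(np.argmin(losses))]
print("mislabeled instances removed:", int((best[flip] == 0).sum()), "/ 10")
```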

Miles Desforges
Araya Inc.

Introduction to Araya, research teams and research DX

I would like to introduce Araya, where I currently work. Araya has a strong connection with OIST, as many OIST members have collaborated with us, and several OIST alumni are now part of our team. This represents a promising career pipeline from OIST to industry, which I hope to further strengthen. During this presentation, I will provide a brief overview of the ongoing projects and initiatives at Araya, with the aim of generating interest in our work and opportunities. We actively welcome interns and offer full-time positions for those interested in joining our team.
Additionally, I will highlight the work of my group, Research DX, which focuses on developing innovative solutions to support researchers in their work. If any of the projects or solutions I present resonate with you, I would be happy to discuss potential collaborations or explore how we can assist with your research needs.

Jianning Chen
Okinawa Institute of Science and Technology

External and Internal States Affect Behavioral Strategy Selection in Meta-reinforcement Learning

Humans and animals employ multiple behavioral strategies to guide learning and decision-making, including choice perseveration, win-stay-lose-shift (WSLS), model-free reinforcement learning (RL), and model-based RL. The selection of these strategies depends on their predicted outcomes, reliability, and computational cost, which vary with the learner’s internal state, learning progress, and environmental features. Strategy switching may be governed by changes in meta-parameters within RL, a framework known as meta-RL. Here, we present results from two-step tasks with mice and computational modeling to examine the effect of relevant factors on strategy selection in meta-RL.
In our experiment, we manipulated the reward probabilities of two options to change uncertainty levels and task difficulty, while reaction time measures characterized the internal states of the animals. We then extended a regression model and a hidden Markov model to infer the deployed strategy. Finally, we employed multi-step particle filtering to estimate the dynamics of meta-parameters, providing a computational perspective on strategy switching mechanisms.
Our results suggest that mice adaptively switch strategies based on learning progress, internal states, and the environment. As learning progresses, they transition from simple model-free RL to model-based RL, accompanied by an increasing inverse temperature parameter. Furthermore, model-free RL contributes heavily to choices in an aggressive state, characterized by premature, invalid, and rapid responses. This study discusses key factors influencing strategy selection through the optimization of meta-parameters.
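
As a concrete illustration of the estimation idea (a simplified stand-in for the multi-step particle filtering used in this study, with the learning rate assumed known), the sketch below tracks a drifting inverse temperature of a simulated Q-learning agent with a bootstrap particle filter:

```python
# Sketch: track a drifting inverse temperature (beta) from choices alone with
# a bootstrap particle filter. A simplified stand-in for the multi-step
# particle filtering in the study; the learning rate ALPHA is assumed known.
import numpy as np

rng = np.random.default_rng(3)
T, ALPHA = 500, 0.3

# Simulate an agent whose true beta grows as learning progresses.
true_beta = np.linspace(0.5, 5.0, T)
q = np.zeros(2)
choices, rewards = np.zeros(T, int), np.zeros(T)
for t in range(T):
    p = np.exp(true_beta[t] * (q - q.max()))
    p /= p.sum()
    c = rng.choice(2, p=p)
    r = float(rng.random() < (0.8 if c == 0 else 0.2))
    q[c] += ALPHA * (r - q[c])
    choices[t], rewards[t] = c, r

# Bootstrap particle filter over log(beta).
N = 1000
particles = rng.normal(0.0, 1.0, N)           # log-beta particles
q = np.zeros(2)
est = np.zeros(T)
for t in range(T):
    particles += rng.normal(0.0, 0.05, N)     # random-walk transition model
    beta = np.exp(particles)
    logits = beta[:, None] * (q - q.max())    # choice likelihood per particle
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    w = probs[:, choices[t]]
    w /= w.sum()
    est[t] = float(beta @ w)                  # posterior mean of beta
    particles = particles[rng.choice(N, N, p=w)]   # multinomial resampling
    q[choices[t]] += ALPHA * (rewards[t] - q[choices[t]])  # replay the update

print(f"estimated beta near start / end: {est[10]:.2f} / {est[-10]:.2f}")
```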