Memristive Neural Network Circuit of Operant Conditioning with Reward Delay and Variable Punishment Intensity

Bei Chen, Fazhan Liu, Herbert Ho Ching Iu, Han Bao, Quan Xu

Research output: Contribution to journal, Article, peer-review

Abstract

Operant conditioning is an essential learning mechanism for organisms and a fundamental theory for reinforcement learning in artificial intelligence. This paper proposes a neural network circuit based on non-volatile memristors that mimics the process of operant conditioning, such as the effects of reinforcement (positive reward or negative punishment) on the acquisition and maintenance of certain behaviors. This circuit is composed of two components: a reward operant conditioning circuit and a punishment operant conditioning circuit. These reward and punishment operant conditioning circuits not only simulate the process of exploration, acquisition, and satiety, but also reveal the effect of reward delay and punishment intensity on the acquisition of operant conditioning. This research holds the potential for practical application in training robots to make decisions. By adjusting reward delay and punishment intensity, the learning speed and effectiveness of robots can be enhanced.

Original language: English
Pages (from-to): 1
Number of pages: 1
Journal: IEEE Transactions on Circuits and Systems II: Express Briefs
Publication status: E-pub ahead of print - 5 Oct 2023
