CM-DQN: A Value-Based Deep Reinforcement Learning Model to Simulate Confirmation Bias
Journal Title: Engineering and Technology Journal - Year 2024, Vol 9, Issue 07
Abstract
In human decision-making tasks, individuals learn through trials and prediction errors. While learning a task, some individuals are influenced more by good outcomes, while others weigh bad outcomes more heavily. Such confirmation bias can lead to different learning effects. In this study, we propose a new deep reinforcement learning algorithm, CM-DQN, which applies different update strategies to positive and negative prediction errors in order to simulate the human decision-making process when the task's states are continuous and its actions are discrete. We test it in the Lunar Lander environment under confirmatory bias, disconfirmatory bias, and no bias to observe the learning effects. Moreover, as a contrast experiment, we apply the confirmation model, which uses the same idea as our proposed algorithm, to a multi-armed bandit problem (an environment with discrete states and discrete actions) to simulate algorithmically the impact of different confirmation biases on the decision-making process. In both experiments, confirmatory bias yields a better learning effect.
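The idea of treating positive and negative prediction errors differently can be illustrated with a minimal tabular sketch for the multi-armed bandit case. This is not the paper's implementation: the abstract does not state the update rule, so the function names (`biased_update`, `run_bandit`), the parameters `alpha_pos` and `alpha_neg`, the epsilon-greedy policy, and the Bernoulli rewards are all assumptions made for illustration. Here a confirmatory agent is modeled as one with `alpha_pos > alpha_neg` for the chosen arm, a disconfirmatory agent as the reverse, and an unbiased agent as one with equal rates.

```python
import random


def biased_update(q, reward, alpha_pos, alpha_neg):
    """Move a value estimate toward the reward, with a learning rate
    chosen by the sign of the prediction error (illustrative assumption)."""
    delta = reward - q  # prediction error
    alpha = alpha_pos if delta > 0 else alpha_neg
    return q + alpha * delta


def run_bandit(arm_probs, alpha_pos, alpha_neg, steps=1000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a Bernoulli bandit (hypothetical setup).

    Returns the final value estimates and the average reward per step.
    """
    rng = random.Random(seed)
    q = [0.5] * len(arm_probs)  # neutral initial estimates
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(arm_probs))  # explore
        else:
            arm = max(range(len(q)), key=q.__getitem__)  # exploit
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        q[arm] = biased_update(q[arm], reward, alpha_pos, alpha_neg)
        total += reward
    return q, total / steps


if __name__ == "__main__":
    # Three hypothetical agents on the same two-armed bandit.
    for name, a_pos, a_neg in [
        ("confirmatory", 0.3, 0.1),
        ("unbiased", 0.2, 0.2),
        ("disconfirmatory", 0.1, 0.3),
    ]:
        estimates, avg_reward = run_bandit([0.2, 0.8], a_pos, a_neg)
        print(name, [round(v, 3) for v in estimates], round(avg_reward, 3))
```

In a DQN-style setting the same sign test would be applied to the temporal-difference error instead of the raw reward error; how CM-DQN does this in detail is not specified in the abstract.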
Authors and Affiliations
Jiacheng Shen, Lihan Feng