CM-DQN: A Value-Based Deep Reinforcement Learning Model to Simulate Confirmation Bias
Journal Title: Engineering and Technology Journal - Year 2024, Vol 9, Issue 07
Abstract
In human decision-making tasks, individuals learn through trials and prediction errors. While learning a task, some individuals are more influenced by good outcomes, while others weigh bad outcomes more heavily. Such confirmation bias can lead to different learning effects. In this study, we propose a new Deep Reinforcement Learning algorithm, CM-DQN, which applies different update strategies to positive and negative prediction errors in order to simulate human decision-making in tasks with continuous states and discrete actions. We test it in the Lunar Lander environment under confirmatory bias, disconfirmatory bias, and no bias to observe the learning effects. As a contrast experiment, we also apply the confirmation model, which rests on the same idea as our proposed algorithm, to a multi-armed bandit problem (an environment with discrete states and discrete actions) to simulate algorithmically the impact of different confirmation biases on the decision-making process. In both experiments, confirmatory bias yields a better learning effect.
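The idea of updating differently on positive and negative prediction errors can be illustrated in the bandit setting with a minimal sketch. This is not the authors' implementation: the function names, the learning rates `lr_pos` and `lr_neg`, the epsilon-greedy policy, and the arm probabilities are all assumptions chosen for illustration.

```python
import random


def biased_update(q, reward, lr_pos, lr_neg):
    """Update one value estimate, choosing the learning rate by the sign
    of the prediction error (hypothetical helper, for illustration only)."""
    delta = reward - q                       # prediction error
    lr = lr_pos if delta > 0 else lr_neg     # asymmetric learning rate
    return q + lr * delta


def run_bandit(lr_pos, lr_neg, probs=(0.3, 0.7), steps=2000,
               epsilon=0.1, seed=0):
    """Run an epsilon-greedy agent on a Bernoulli bandit (assumed setup).

    Returns the final value estimates and the average reward per step.
    """
    rng = random.Random(seed)
    q = [0.5] * len(probs)                   # neutral initial estimates
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:           # explore
            arm = rng.randrange(len(probs))
        else:                                # exploit the current best arm
            arm = max(range(len(probs)), key=lambda i: q[i])
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        q[arm] = biased_update(q[arm], reward, lr_pos, lr_neg)
        total += reward
    return q, total / steps


if __name__ == "__main__":
    # Three assumed settings: lr_pos > lr_neg, lr_pos < lr_neg, and equal rates.
    for name, lp, ln in [("confirmatory", 0.2, 0.05),
                         ("disconfirmatory", 0.05, 0.2),
                         ("unbiased", 0.1, 0.1)]:
        q, avg = run_bandit(lp, ln)
        print(name, [round(v, 3) for v in q], round(avg, 3))
```

In this sketch a larger `lr_pos` than `lr_neg` stands in for confirmatory bias on the chosen arm, and the reverse for disconfirmatory bias. In a DQN-style agent the same sign-dependent weighting would presumably apply to the temporal-difference error rather than to a tabular estimate, but the abstract does not give those details.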
Authors and Affiliations
Jiacheng Shen, Lihan Feng