Knowledge transfer between heterogeneous reinforcement learning agents
Journal Title: Science Paper Online - Year 2010, Vol 5, Issue 2
Abstract
Existing knowledge transfer methods are suitable only for homogeneous reinforcement learning agents. To address this limitation, a Q-learning algorithm is proposed that can transfer knowledge between heterogeneous agents with different state and action spaces. The main idea is as follows. Based on a task that both an old agent and a new agent have already learned, a neural network learns offline a mapping between the Q-value functions of the two agents. The learned mapping is then used to obtain the new agent's Q-values in a new task that the old agent has already learned but the new agent has not. The proposed algorithm reduces the number of trials the new agent needs and thereby improves its learning speed. Simulation results on 10×10 mazes illustrate the validity of the proposed algorithm.
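The transfer step described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not give the network architecture or say how the two agents' states are paired, so the sketch assumes that both agents occupy the same 10×10 grid cells (only the action sets differ: 4 moves for the old agent, 8 for the new one), that the grid is obstacle-free so each Q-table has a closed form, and that the mapping is a one-hidden-layer network from the old agent's Q-values in a cell to the new agent's Q-values in that cell. All names (`open_grid_q`, `train_q_mapping`, `map_q`) are hypothetical.

```python
import numpy as np

MOVES4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]               # assumed old agent
MOVES8 = MOVES4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]    # assumed new agent


def open_grid_q(goal, moves, size=10, gamma=0.9):
    """Closed-form Q-table for an obstacle-free grid (toy stand-in for a
    Q-table learned by Q-learning). Reward is 1 on entering the goal."""
    diagonal = len(moves) == 8
    q = np.zeros((size * size, len(moves)))
    for r in range(size):
        for c in range(size):
            for a, (dr, dc) in enumerate(moves):
                nr = min(max(r + dr, 0), size - 1)   # moves are clipped at walls
                nc = min(max(c + dc, 0), size - 1)
                dy, dx = abs(nr - goal[0]), abs(nc - goal[1])
                dist = max(dy, dx) if diagonal else dy + dx
                q[r * size + c, a] = gamma ** dist
    return q


def train_q_mapping(q_old, q_new, hidden=16, epochs=1000, lr=0.1, seed=0):
    """Fit a one-hidden-layer network f so that f(q_old[s]) approximates
    q_new[s], using full-batch gradient descent on mean squared error."""
    rng = np.random.default_rng(seed)
    n_in, n_out = q_old.shape[1], q_new.shape[1]
    w1 = rng.normal(0.0, 0.5, (n_in, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.5, (hidden, n_out))
    b2 = np.zeros(n_out)
    losses = []
    for _ in range(epochs):
        h = np.tanh(q_old @ w1 + b1)
        err = h @ w2 + b2 - q_new
        losses.append(float(np.mean(err ** 2)))
        g_pred = 2.0 * err / err.size                 # d(loss)/d(prediction)
        g_h = (g_pred @ w2.T) * (1.0 - h ** 2)        # back through tanh
        w2 -= lr * (h.T @ g_pred)
        b2 -= lr * g_pred.sum(axis=0)
        w1 -= lr * (q_old.T @ g_h)
        b1 -= lr * g_h.sum(axis=0)
    return (w1, b1, w2, b2), losses


def map_q(params, q_old):
    """Apply the learned mapping to an old-agent Q-table."""
    w1, b1, w2, b2 = params
    return np.tanh(q_old @ w1 + b1) @ w2 + b2


# Task A (goal in one corner): both agents already have Q-tables.
q_old_a = open_grid_q((9, 9), MOVES4)
q_new_a = open_grid_q((9, 9), MOVES8)
params, losses = train_q_mapping(q_old_a, q_new_a)

# Task B (different goal): only the old agent has a Q-table.
q_old_b = open_grid_q((0, 5), MOVES4)
q_new_b_init = map_q(params, q_old_b)   # initial Q-table for the new agent
```

In this sketch `q_new_b_init` would replace a zero or random initial Q-table when the new agent starts ordinary Q-learning on task B. How well such a mapping generalizes to the new task depends on the choice of inputs and network, which the abstract does not specify.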
Authors and Affiliations
Bo Liu, Ruhai Lei
The FEM analysis of the fracture characters of the rock under shear-box load
This paper introduces the research state of mode Ⅱ fracture, pointing out that the mode Ⅱ load is not due to the mode Ⅱ fracture. Using the shear-box loading condition to restrict mode Ⅰ fracture, therefore the true mode...
Research advance on chemical reaction in microchemical technology
Microchemical technology has become a very active topic in both the chemical industry and academia. In this paper, the feasibility of chemical reaction processes in microreactors is studied based on the application of hom...
Combining H∞ and disturbance-observer-based control for a class of uncertain nonlinear systems with neural term
The disturbance rejection and attenuation problem is investigated for a class of uncertain nonlinear systems with neutral-term via the combined H∞ control and disturbance‐observer‐based control. The unknown external dist...
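Study of a neural-network-based model of the crystallization phosphorus recovery process
Taking the crystallization phosphorus removal process as the object of study, a BP (back-propagation) neural network model of the system is established. The study shows that the BP model of crystallization phosphorus removal converges well; its predictions of the effluent phosphorus mass concentration and of the removal rate agree well with the measured values, and the relative prediction error (apart from a few points) remains essentially within ±5%. The mod...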
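Quantitative analysis of the benefits of distributed generation
Because conventional energy sources will inevitably be depleted, countries attach growing importance to environmental protection, and existing power systems have certain drawbacks, distributed generation has become a new hot topic in power system research. However, for lack of effective quantitative methods and tools, power companies and the general public have little understanding of the benefits of distributed generation, so that in...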