Knowledge transfer between heterogeneous reinforcement learning agents
Journal Title: Science Paper Online - Year 2010, Vol 5, Issue 2
Abstract
Existing knowledge transfer methods are suitable only for homogeneous reinforcement learning agents. To address this limitation, a Q-learning algorithm is proposed that can transfer knowledge between heterogeneous agents with different state and action spaces. The main idea is as follows. Using a task that both an old agent and a new agent have already learned, a neural network learns offline a mapping between the Q-value functions of the two agents. The learned mapping is then used to obtain the new agent's Q-values in a new task that the old agent has already learned but the new agent has not. The proposed algorithm reduces the number of trials the new agent needs and thereby improves its learning speed. Simulation results on 10×10 mazes illustrate the validity of the proposed algorithm.
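The offline mapping step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the network size, the learning rate, the 4-action old agent, the 8-action new agent, the assumption that corresponding states of the two agents are already paired, and the synthetic Q-values are all hypothetical choices made only so the example runs.

```python
import math
import random

def make_mlp(n_in, n_hidden, n_out, seed=0):
    """Create weights for a one-hidden-layer tanh network (sizes are assumptions)."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)],
        "b2": [0.0] * n_out,
    }

def forward(net, x):
    """Return (hidden activations, outputs) for one input vector x."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(net["w1"], net["b1"])]
    y = [sum(w * hi for w, hi in zip(row, h)) + b
         for row, b in zip(net["w2"], net["b2"])]
    return h, y

def mse(net, pairs):
    """Mean squared error of the network over (input, target) pairs."""
    total = 0.0
    for x, t in pairs:
        _, y = forward(net, x)
        total += sum((yi - ti) ** 2 for yi, ti in zip(y, t)) / len(t)
    return total / len(pairs)

def train(net, pairs, epochs=100, lr=0.05):
    """Offline stochastic gradient descent on the squared error."""
    for _ in range(epochs):
        for x, t in pairs:
            h, y = forward(net, x)
            dy = [yi - ti for yi, ti in zip(y, t)]
            # Back-propagate to the hidden layer before touching w2.
            dh = [(1 - hj * hj) * sum(dy[k] * net["w2"][k][j] for k in range(len(dy)))
                  for j, hj in enumerate(h)]
            for k, d in enumerate(dy):
                for j, hj in enumerate(h):
                    net["w2"][k][j] -= lr * d * hj
                net["b2"][k] -= lr * d
            for j, d in enumerate(dh):
                for i, xi in enumerate(x):
                    net["w1"][j][i] -= lr * d * xi
                net["b1"][j] -= lr * d
    return net

def fake_new_q(q_old):
    """Synthetic stand-in for the new agent's learned Q-values (NOT paper data)."""
    n, s, e, w = q_old
    return [n, s, e, w, (n + e) / 2, (n + w) / 2, (s + e) / 2, (s + w) / 2]

# Shared, already-learned task: paired Q-vectors of the old and new agent.
rng = random.Random(1)
old_q_task_a = [[rng.random() for _ in range(4)] for _ in range(40)]
pairs = [(q, fake_new_q(q)) for q in old_q_task_a]

net = make_mlp(n_in=4, n_hidden=8, n_out=8)
loss_before = mse(net, pairs)
train(net, pairs)
loss_after = mse(net, pairs)

# New task: only the old agent has learned it; map its Q-values across.
old_q_task_b = [0.2, 0.9, 0.4, 0.1]  # hypothetical Q-values at one state
_, q_new_init = forward(net, old_q_task_b)
print(loss_before, loss_after)
print(q_new_init)
```

In this sketch the mapped values `q_new_init` would serve as the new agent's starting Q-values in the new task, which is one plausible way such a mapping could reduce the number of trials; the abstract does not specify how the mapped values are used during subsequent learning.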
Authors and Affiliations
Bo Liu, Ruhai Lei
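Research on CPC broadcast content based on network selection in heterogeneous networks
Based on the CPC (Cognitive Pilot Channel) broadcast channel for heterogeneous networks already proposed by the EU E2R project, this paper studies the content that the CPC can broadcast. By broadcasting over the CPC the functional and performance parameters of the radio access technologies available to a terminal, the terminal can evaluate in real time the network load and the QoS guarantees that can be provided, and select a suitable network...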
Centrifugal cast microstructure of semisolid hypereutectic high chromium cast iron and its quantitative analysis
In this paper, a semisolid slurry of hypereutectic high chromium cast iron prepared by the slope cooling body method is fabricated into a semisolid annular part by centrifugal casting. The microstructure is...
The study of nitrogen fugacity on the N-containing binaries
The expression for the nitrogen fugacity reveals a high pressure limit that differs from temperature to temperature. An approach for defining the high and the low pressure regions is proposed. By the topologic characteri...
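Study of a neural network-based model of the crystallization phosphorus recovery process
Taking the crystallization phosphorus removal process as the object of study, a systematic BP (back-propagation) neural network model is established. The study shows that the BP model of crystallization phosphorus removal exhibits good convergence; the model's predictions of effluent phosphorus mass concentration and removal rate agree well with measured values, and the relative prediction error (apart from a few individual points) remains essentially within ±5%. The mod...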
Study on application of laminated interface element method in meso-scopic numerical analysis of concrete
According to the objective grading curve and filling ratio of aggregates, a meso-scopic numerical specimen of concrete is first built based on the advanced numerical generation and filling method for aggregates with rand...