An Analysis of Q-Learning Algorithms with Strategies of Reward Function
Journal Title: International Journal on Computer Science and Engineering - Year 2011, Vol 3, Issue 2
Abstract
Q-Learning is a Reinforcement Learning technique that learns an action-value function giving the expected utility of taking a given action in a given state and following a fixed policy thereafter. One of its strengths is that it can compare the expected utility of the available actions without requiring a model of the environment, and it can be used on-line. Reinforcement Learning is an approach in which the agent needs no teacher to learn how to solve a problem. The only signal the agent uses to learn from its actions is the so-called reward, a number that tells the agent whether its last action was good or not. This paper discusses different strategies for Q-Learning algorithms and their reward functions.
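As an illustration of the action-value update described in the abstract, the following is a minimal sketch of tabular Q-Learning in Python. It is not taken from the paper: the corridor environment, the reward of 1 at the goal, the function name q_learning, and the parameter values (alpha, gamma, epsilon) are all assumptions chosen for the example. The agent never consults a model of the environment; it only observes the next state and the reward.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-Learning on a hypothetical 1-D corridor (an assumption).

    States are 0..n_states-1 and the actions are 0 (left) and 1 (right).
    Entering the last state yields reward 1 and ends the episode; every
    other step yields reward 0.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                action = rng.choice(
                    [a for a in (0, 1) if q[state][a] == best])

            # Environment step: the agent only sees the outcome.
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # Q-Learning update: move Q(s, a) toward
            # reward + gamma * max over a' of Q(s', a').
            # The goal row is never updated, so its value stays 0.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


if __name__ == "__main__":
    table = q_learning()
    # Greedy action per non-terminal state (1 means "move right").
    print([row.index(max(row)) for row in table[:-1]])
```

In this sketch the reward is the only learning signal: changing the line that computes reward changes what the agent learns, which is the kind of reward-function strategy the paper sets out to discuss.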
Authors and Affiliations
Ms. S. Manju, Dr. Ms. M. Punithavalli