An Analysis of Q-Learning Algorithms with Strategies of Reward Function

Journal Title: International Journal on Computer Science and Engineering - Year 2011, Vol 3, Issue 2

Abstract

Q-Learning is a Reinforcement Learning technique that works by learning an action-value function, which gives the expected utility of taking a given action in a given state and following a fixed policy thereafter. One of its strengths is that it can compare the expected utility of the available actions without requiring a model of the environment. Reinforcement Learning is an approach in which the agent needs no teacher to learn how to solve a problem. The only signal the agent uses to learn from its actions is the so-called reward, a number that tells the agent whether its last action was good or not. Because Q-Learning does not need a model of its environment, it can also be used on-line. This paper discusses different strategies for Q-Learning algorithms and for the reward function.
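The model-free update described in the abstract can be illustrated with a minimal tabular sketch. This is not the authors' code: the corridor environment, the function names q_update and train_corridor, and the values chosen for the learning rate alpha, the discount factor gamma, and the exploration rate epsilon are all assumptions made for illustration. The update itself is the standard one-step Q-learning rule, Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    alpha (learning rate) and gamma (discount factor) are illustrative defaults.
    """
    best_next = max(q[next_state])            # greedy estimate of future utility
    td_target = reward + gamma * best_next    # reward plus discounted lookahead
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


def train_corridor(n_states=5, episodes=500, epsilon=0.1, seed=0):
    """Learn Q-values on a hypothetical corridor of n_states cells.

    Actions: 0 = step left, 1 = step right. Entering the last cell gives
    reward 1 and ends the episode; every other step gives reward 0.
    The agent never consults a transition model, only observed outcomes.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            if rng.random() < epsilon:
                action = rng.randrange(2)     # explore
            else:
                best = max(q[state])          # exploit, breaking ties at random
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            q_update(q, state, action, reward, next_state)
            state = next_state
    return q


if __name__ == "__main__":
    for s, (left, right) in enumerate(train_corridor()):
        print(f"state {s}: left={left:.3f} right={right:.3f}")
```

The reward is the only feedback the agent receives, so the choice of reward function (here a single reward at the goal) determines what the learned values encode. In this sketch one would expect the values for stepping right to grow toward the goal, though the exact numbers depend on the assumed parameters.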

Authors and Affiliations

Ms. S. Manju, Dr. Ms. M. Punithavalli

How To Cite

Ms. S. Manju, Dr. Ms. M. Punithavalli (2011). An Analysis of Q-Learning Algorithms with Strategies of Reward Function. International Journal on Computer Science and Engineering, 3(2), 814-820. https://europub.co.uk/articles/-A-102749