Modular Multi-Objective Deep Reinforcement Learning with Decision Values

Journal Title: Annals of Computer Science and Information Systems - Year 2018, Vol 15, Issue

Abstract

In this work we present a method for using Deep Q-Networks (DQNs) in multi-objective environments. Deep Q-Networks provide remarkable performance in single-objective problems when learning from high-level visual state representations. However, in many scenarios (e.g. in robotics and games), the agent needs to pursue multiple objectives simultaneously. We propose an architecture in which separate DQNs are used to control the agent's behaviour with respect to particular objectives. In this architecture we introduce decision values to improve the scalarization of multiple DQNs into a single action. Our architecture enables the decomposition of the agent's behaviour into controllable and replaceable sub-behaviours learned by distinct modules. Moreover, it allows the priorities of particular objectives to be changed post-learning while preserving the overall performance of the agent. To evaluate our solution we used a game-like simulator in which an agent, provided with high-level visual input, pursues multiple objectives in a 2D world.
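
The abstract does not spell out how the per-objective DQN outputs are combined, so the following is only a minimal sketch of one plausible scalarization, not the author's implementation. It assumes each module returns a vector of Q-values plus a scalar decision value, that Q-values are normalized per module, and that the final score of an action is the priority-weighted, decision-value-weighted sum across modules. The names `normalize`, `select_action`, `q_collect` and `q_avoid` are hypothetical.

```python
from typing import List, Sequence


def normalize(q_values: Sequence[float]) -> List[float]:
    """Scale one module's Q-values so their absolute values sum to 1.

    Assumption: some normalization is needed so that modules with
    different reward scales can be compared; this choice is illustrative.
    """
    total = sum(abs(q) for q in q_values)
    if total == 0.0:
        return [0.0 for _ in q_values]
    return [q / total for q in q_values]


def select_action(
    q_per_module: Sequence[Sequence[float]],
    decision_values: Sequence[float],
    priorities: Sequence[float],
) -> int:
    """Combine several modules' Q-values into a single action index.

    Each module contributes priority * decision_value * normalized Q-value
    to the score of every action; the highest-scoring action is returned.
    """
    n_actions = len(q_per_module[0])
    scores = [0.0] * n_actions
    for q_values, d, w in zip(q_per_module, decision_values, priorities):
        for action, q in enumerate(normalize(q_values)):
            scores[action] += w * d * q
    return max(range(n_actions), key=scores.__getitem__)


# Two hypothetical objective modules over three actions.
q_collect = [1.0, 3.0, 0.0]  # e.g. a "collect items" objective
q_avoid = [4.0, 0.0, 0.0]    # e.g. an "avoid hazards" objective

# Equal priorities: the module reporting the higher decision value dominates.
action_a = select_action([q_collect, q_avoid], [1.0, 0.1], [1.0, 1.0])
action_b = select_action([q_collect, q_avoid], [0.1, 1.0], [1.0, 1.0])

# Priorities changed after learning, with the Q-values left untouched.
action_c = select_action([q_collect, q_avoid], [1.0, 1.0], [3.0, 1.0])
```

In this sketch the priorities are plain multipliers applied at action-selection time, which is one way the post-learning re-prioritization described above could be realized without retraining any module.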

Authors and Affiliations

Tomasz Tajmajer

  • EP ID EP569798
  • DOI 10.15439/2018F231

How To Cite

Tomasz Tajmajer (2018). Modular Multi-Objective Deep Reinforcement Learning with Decision Values. Annals of Computer Science and Information Systems, 15, 85-93. https://europub.co.uk/articles/-A-569798