Task independent capsule-based agents for deep Q-learning

Singh, Akash; De Schepper, Tom; Mets, Kevin; Hellinckx, Peter; Oramas, José; Latré, Steven

doi:10.1007/978-3-030-93842-0_4

Title

Task independent capsule-based agents for deep Q-learning

Author

Singh, Akash

De Schepper, Tom

Mets, Kevin

Hellinckx, Peter

Oramas, José

Latré, Steven

Abstract

In recent years, Capsule Networks (CapsNets) have achieved promising results in tasks such as object recognition thanks to their invariance to pose and lighting. They have been proposed as an alternative to relationally insensitive and translation-invariant Convolutional Neural Networks (CNNs). It has been shown empirically that CapsNets can achieve competitive performance while requiring significantly fewer parameters. This is a desirable characteristic for deep reinforcement learning, which is known to be sample-inefficient during training. In this paper, we propose DCapsQN, a task-independent CapsNet-based architecture for the deep reinforcement learning setting. We experiment in the model-free reinforcement learning setting, more specifically in Deep Q-Learning, using the Atari suite as the testbed for our analysis. To the best of our knowledge, this work constitutes the first CapsNet-based deep reinforcement learning architecture to learn state-action value functions without the need for task-specific adaptation. Our results show that, in this setting, DCapsQN requires 92% fewer parameters than the baseline. Moreover, despite its smaller size, DCapsQN provides significant boosts in performance (score), ranging from 10% to 77%, while further stabilising Deep Q-Learning. This is supported by our empirical results, which show that DCapsQN agents outperform the benchmark Double-DQN agent with prioritized experience replay in eight of the nine selected environments.

Language

English

Source (book)

Artificial Intelligence and Machine Learning : 33rd Benelux Conference on Artificial Intelligence, BNAIC/Benelearn 2021, Esch-sur-Alzette, Luxembourg, November 10–12, 2021, Revised Selected Papers

Source (series)

Communications in computer and information science ; 1530

Publication

Cham : Springer, 2022

ISBN

978-3-030-93841-3

DOI

10.1007/978-3-030-93842-0_4

Volume/pages

p. 69-85

Full text (Publisher's DOI)

https://doi.org/10.1007/978-3-030-93842-0_4

Faculty/Department

Faculty of Sciences. Mathematics and Computer Science

Faculty of Applied Engineering Sciences

Research group

Internet Data Lab (IDLab)

Publication type

P1 Proceeding

Subject

Engineering sciences. Technology

Computer. Automation

Affiliation

Publications with a UAntwerp address

Identifier

c:irua:201535

Creation

12.12.2023

Last edited

17.06.2024

To cite this reference

https://hdl.handle.net/10067/2015350151162165141