A case for feature-based successor features for transfer in reinforcement learning

Bagot, Louis; Mets, Kevin; De Schepper, Tom; Latré, Steven

Title

Author

Bagot, Louis

Mets, Kevin

De Schepper, Tom

Latré, Steven

Abstract

Successor Features stand at the boundary between model-free and model-based Reinforcement Learning. By predicting a sum of features instead of a sum of rewards, they enable very efficient transfer learning through the General Policy Improvement Theorem. Recent work has shifted the focus of the feature space from learnt features to a well-chosen set of base rewards. While this framework greatly improves stability, it discards the flexibility to generalize outside the base reward space. In this paper, we aim to rekindle interest in "representation-based" Successor Features for transfer learning, by clarifying the possible design choices and providing simple cases where they prevail. In a robot arm scenario, we find that they more easily transfer to unseen tasks without suffering from instabilities during training. We provide visual interpretation of the learnt features to explain this performance.

Language

English

Source (book)

34th Benelux Conference on Artificial Intelligence and the 31st Belgian Dutch Conference on Machine Learning (BNAIC/BENELEARN 2022), 7-9 November, 2022, Mechelen, Belgium

Publication

2022

Volume/pages

p. 1-16

Full text (open access)

https://repository.uantwerpen.be/docstore/d:irua:20857

Faculty/Department				Faculty of Sciences. Mathematics and Computer Science; Faculty of Applied Engineering Sciences

Research group				Internet Data Lab (IDLab)

Publication type				P3 Proceeding

Subject				Engineering sciences. Technology; Computer. Automation

Affiliation				Publications with a UAntwerp address

Source file

https://bnaic2022.uantwerpen.be/wp-content/uploads/BNAICBeNeLearn_2022_submission_8036.pdf

Identifier

Creation

14.12.2023

Last edited

17.06.2024

To cite this reference

https://hdl.handle.net/10067/2015610151162165141