Hongyu Zang

Abstract:

In visual offline Reinforcement Learning (RL), pretraining an encoder on existing datasets offers a unique advantage. However, it is difficult to accurately capture the information that matters for decision-making from visual state inputs, which are often cluttered with redundant information. This limitation hampers the encoder’s ability to generalize to unseen environments. To address this challenge, we propose to pretrain a robust encoder via Control-relevant Saliency Map (C-SMEP), a novel approach designed to enhance the encoder’s generalization capability in visual offline RL. Using a Behavior Cloning (BC) style action prediction module, C-SMEP computes the gradients of the predicted actions with respect to the input to determine how control-relevant each pixel of an image-based observation is. Under certain assumptions, we provide theoretical performance guarantees when C-SMEP is integrated into conservative or pessimistic offline RL algorithms. Empirical experiments on the DeepMind Control (DMC) suite show that C-SMEP significantly outperforms state-of-the-art baseline methods in challenging unseen environments, demonstrating its advantages in generalization and interpretability.
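The gradient-based pixel importance described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the single-layer tanh policy, the names `predict_action` and `saliency_map`, the toy 2x2 image, and the choice to sum absolute gradients over action dimensions are all assumptions made purely for illustration.

```python
import math

def predict_action(weights, bias, pixels):
    """Toy BC-style policy (assumed): action_k = tanh(sum_i W[k][i] * x[i] + b[k])."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, pixels)) + b)
        for row, b in zip(weights, bias)
    ]

def saliency_map(weights, bias, pixels):
    """Per-pixel saliency: sum over action dims of |d action_k / d x_i|."""
    actions = predict_action(weights, bias, pixels)
    saliency = [0.0] * len(pixels)
    for row, a in zip(weights, actions):
        slope = 1.0 - a * a  # derivative of tanh, expressed via its output
        for i, w in enumerate(row):
            saliency[i] += abs(slope * w)
    return saliency

# Hypothetical 2x2 observation, flattened, with one action dimension.
weights = [[2.0, 0.0, -1.0, 0.0]]
bias = [0.0]
pixels = [0.5, 0.9, 1.0, 0.1]

sal = saliency_map(weights, bias, pixels)
print(sal)  # pixels with zero weight get zero saliency
```

In this toy setting, pixels that do not influence the predicted action receive zero saliency regardless of their intensity, which is the property a control-relevant map is meant to capture. A real encoder would obtain the same gradients through automatic differentiation rather than a hand-written derivative.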


Under Review.
