Divergence Minimization

Paper accepted at ICLR!

Our paper, LS-IQ: Implicit Reward Regularization for Inverse Reinforcement Learning, was accepted at the International Conference on Learning Representations (ICLR)! We achieve fast and stable inverse reinforcement learning by using a squared reward regularizer on a mixture distribution between the expert and the policy distribution. We show that this specific choice of regularizer results in a bounded divergence, a bounded optimal reward function, and a bounded Q-function. This stands in stark contrast to the previously used regularizer, which resulted in an unbounded reward function and caused instability. We also show that this regularizer gives a new reinforcement learning perspective on the original objective, as sketched below.
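For the curious, here is a minimal sketch of the kind of regularizer described above. The notation is our simplified reading, not the paper's exact formulation: \(\rho_E\) and \(\rho_\pi\) denote the expert and policy occupancy measures, and \(\alpha\) is the mixing weight.

\[
\psi(r) \;=\; \mathbb{E}_{(s,a)\sim\rho_{\mathrm{mix}}}\!\big[\, r(s,a)^{2} \,\big],
\qquad
\rho_{\mathrm{mix}} \;=\; \alpha\,\rho_{E} \;+\; (1-\alpha)\,\rho_{\pi}.
\]

In words, the reward is penalized quadratically under samples from both the expert and the current policy, which is what keeps the implicit divergence, the optimal reward, and the Q-function bounded.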

We evaluate our approach on complex locomotion tasks, such as locomotion with the Atlas robot.

Interested? Check out our paper and our GitHub repo.

Paper accepted at ICLR!

Our paper, Time-Efficient Reinforcement Learning with Stochastic Stateful Policies, was accepted at the International Conference on Learning Representations (ICLR) 2024! We introduce a novel training …

Read More »

LocoMuJoCo accepted at ROL@NeurIPS

Introducing the first imitation learning benchmark tailored towards locomotion. This benchmark comes with many different environments and motion capture datasets, facilitating research in locomotion. We …

Read More »