Hi, I’m Firas Al-Hafez

I am a PhD student in robot learning at the Intelligent Autonomous Systems Group (IAS) at TU Darmstadt. My research interests include reinforcement learning, imitation learning, and robotics.


Paper accepted at ICLR!

Our paper, Time-Efficient Reinforcement Learning with Stochastic Stateful Policies, was accepted at the International Conference on Learning Representations (ICLR) 2024! We introduce a novel training …

Read More »

LocoMuJoCo accepted at ROL@NeurIPS

Introducing the first imitation learning benchmark tailored towards locomotion. This benchmark comes with many different environments and motion capture datasets, facilitating research in locomotion. We …

Read More »


Highlighted Publications

Here you can find featured publications. My current work focuses on reinforcement learning and imitation learning for locomotion; before that, I worked on learning manipulation skills. For those intrigued, take a stroll through the publications showcased below or explore my Google Scholar.

Time-Efficient Reinforcement Learning with Stochastic Stateful Policies

ICLR 2024

F. Al-Hafez, G. Zhao, J. Peters and D. Tateo

We introduce a novel training approach for stateful policies, decomposing them into a stochastic internal state kernel and a stateless policy, which are jointly optimized using our stochastic stateful policy gradient. This method overcomes the drawbacks of Backpropagation Through Time (BPTT), providing a faster and simpler alternative, as demonstrated in evaluations on complex continuous control tasks such as humanoid locomotion.
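To make the decomposition concrete, here is a minimal PyTorch sketch (not the paper's reference implementation; the module names, Gaussian parameterization, and exact factorization are illustrative). Because the next internal state is sampled from a learned kernel rather than produced by a deterministic recurrence, gradients never have to flow backward through time:

```python
import torch
import torch.nn as nn

class StatefulPolicy(nn.Module):
    """Sketch of the decomposition: a stochastic internal state kernel
    p(z' | s, z) plus a stateless action policy p(a | s, z).
    All names and distribution choices are illustrative."""

    def __init__(self, obs_dim, act_dim, state_dim, hidden=64):
        super().__init__()
        self.state_kernel = nn.Sequential(
            nn.Linear(obs_dim + state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 2 * state_dim),
        )
        self.action_policy = nn.Sequential(
            nn.Linear(obs_dim + state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 2 * act_dim),
        )

    def forward(self, obs, z):
        x = torch.cat([obs, z], dim=-1)
        # Sample the next internal state from a learned Gaussian kernel.
        # Sampling (instead of a deterministic recurrence) is what lets
        # the policy gradient avoid backpropagation through time.
        mu_z, log_std_z = self.state_kernel(x).chunk(2, dim=-1)
        z_next = torch.distributions.Normal(mu_z, log_std_z.exp()).sample()
        mu_a, log_std_a = self.action_policy(x).chunk(2, dim=-1)
        action = torch.distributions.Normal(mu_a, log_std_a.exp()).sample()
        return action, z_next
```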

LocoMuJoCo: A Comprehensive Imitation Learning Benchmark for Locomotion

ROL Workshop at NeurIPS 2023
🎉 Won the Outstanding Presentation Award

F. Al-Hafez, G. Zhao, J. Peters and D. Tateo

Introducing LocoMuJoCo, the first imitation learning benchmark tailored towards locomotion! It comes with a diverse set of environments ranging from musculoskeletal models to the brand-new Unitree H1 robot. We provide many motion capture datasets optimized for each humanoid embodiment. 
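As a rough usage sketch (the registration call and environment name below are assumptions, not the documented API; consult the LocoMuJoCo repository for the real interface), the environments can be driven through a standard gym-style loop:

```python
# Hypothetical usage sketch; see the LocoMuJoCo repository for the actual API.
import gymnasium as gym
import loco_mujoco  # assumed to register the LocoMuJoCo environments

env = gym.make("LocoMujoco", env_name="UnitreeH1.walk")  # name is illustrative
obs, info = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # stand-in for a learned policy
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
```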

LS-IQ: Implicit Reward Regularization for Inverse Reinforcement Learning

ICLR 2023

F. Al-Hafez, D. Tateo, O. Arenz, G. Zhao and J. Peters

Implicit reward functions in inverse reinforcement learning methods offer the potential to greatly streamline and accelerate imitation learning. Nonetheless, their optimization poses significant challenges due to instability. Here, we demonstrate the efficacy of a particular implicit reward function regularizer in substantially enhancing the stability of these methods.
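In spirit (a simplified sketch, not the paper's exact objective; the variable names and the mixture weighting are assumptions), the idea is to recover the implicit reward from the critic and penalize its squared magnitude so that it stays bounded during training:

```python
import torch

def implicit_reward(q, v_next, gamma=0.99):
    # Reward recovered from the critic: r(s, a) = Q(s, a) - gamma * V(s').
    return q - gamma * v_next

def regularized_expert_objective(q_e, v_next_e, q_p, v_next_p, alpha=0.5):
    # Squared-reward penalty on a mixture of expert and policy samples
    # (chi^2-style regularization); keeping the implicit reward bounded
    # is what stabilizes the optimization. `alpha` is an assumption.
    r_e = implicit_reward(q_e, v_next_e)
    r_p = implicit_reward(q_p, v_next_p)
    penalty = alpha * (r_e ** 2).mean() + (1 - alpha) * (r_p ** 2).mean()
    return r_e.mean() - penalty  # maximize expert reward, keep it bounded
```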

Redundancy Resolution as Action Bias in Policy Search

CoRL 2021

F. Al-Hafez and J. Steil

We integrate redundancy resolution seamlessly into policy search by introducing a redundant action bias. This enables us to define primary goals in the reward while delegating secondary objectives to the redundant action bias, greatly simplifying reward shaping. The figure on the right shows two different solutions for the same primary goal, obtained with different biases.
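One common way to realize such a bias is a null-space projection at the velocity level. The sketch below assumes exactly that setting (a redundant manipulator under velocity control) and is a generic illustration, not the paper's exact formulation:

```python
import numpy as np

def biased_action(policy_action, jacobian, secondary_velocity):
    """Add a redundant action bias via null-space projection.
    Assumes velocity-level control of a redundant manipulator;
    all names are illustrative."""
    J_pinv = np.linalg.pinv(jacobian)
    null_proj = np.eye(jacobian.shape[1]) - J_pinv @ jacobian
    # The secondary objective acts only in the null space of the task
    # Jacobian, so it cannot disturb the primary goal set by the reward.
    return policy_action + null_proj @ secondary_velocity
```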

Latest Blog Posts

I write blog posts about some of my projects and about machine learning in general. I have not written many yet, but there are some interesting ones coming in the near future, so stay tuned and check back regularly!

ML Blog

Finding the Optimal Learning Rate using Bayesian Optimization

In this blog post, I give a short introduction to Bayesian optimization for finding a near-optimal learning rate. There are many great tutorials on the theory of Bayesian optimization; the main objective of this post is a hands-on tutorial for hyperparameter optimization. Since I cover the theory only very briefly, I recommend reading up on it first before working through this tutorial. I train a small ResNet implemented in PyTorch on the Kuzushiji-MNIST (or K-MNIST) dataset

Read More »
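The core loop of such a tutorial looks roughly like this (an illustrative sketch with scikit-optimize, not the post's actual code; the training function below is a toy stand-in for a real run):

```python
import numpy as np
from skopt import gp_minimize
from skopt.space import Real

def train_and_evaluate(lr):
    # Toy stand-in for a real training run (e.g., a few epochs of the
    # ResNet on K-MNIST); here a smooth curve with a minimum near 1e-3.
    return (np.log10(lr) + 3.0) ** 2 + 0.1

def objective(params):
    (lr,) = params
    return train_and_evaluate(lr)  # gp_minimize minimizes this value

result = gp_minimize(
    objective,
    dimensions=[Real(1e-5, 1e-1, prior="log-uniform")],
    n_calls=20,      # each call corresponds to one (expensive) training run
    random_state=0,
)
print("best learning rate:", result.x[0])
```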