Bio

Ashish Gaurav

March 11, 2025

Hi, my name is Ashish. I am a recent Ph.D. (CS) graduate from the University of Waterloo, where I was supervised by Prof. Pascal Poupart. Prior to this, I completed my M.Math (CS) at the University of Waterloo, supervised by Prof. Krzysztof Czarnecki. Before that, I was a B.E. (CSE) student in India at the Birla Institute of Technology, Mesra.

Current research

As a Ph.D. candidate, my research focused on developing methods for inverse reinforcement learning (IRL), particularly in the constrained MDP setting, i.e., learning a constraint instead of a reward. This encompassed the following objectives:

designing inverse constraint learning (ICL) algorithms that can learn constraints from expert demonstrations, and more particularly, algorithms that can learn different types of constraints
understanding the use cases of constraint learning over traditional reward learning
benchmarking various algorithms for constraint learning
applying ICL algorithms to real-world domains like robotics and highway driving
determining the relevance of input features (from the state-action space) for ICL

Past research

As a Master’s student, I was part of the Waterloo Intelligent Systems Engineering (WISE) lab. I was also part of the behavior planning team for Autonomoose, Waterloo’s self-driving car project. My research topics were:

safe reinforcement learning, particularly through a Linear Temporal Logic (LTL)-based reward specification which ensures that the agent's behaviours satisfy the LTL specification after training
continual learning, both in the context of classification and reinforcement learning; in particular, using continual reinforcement learning for an autonomous driving task curriculum with increasing task complexity
behaviour planning for autonomous driving
out-of-distribution detection in the context of classification