Bio

Ashish Gaurav

Hi, my name is Ashish. I am a recent Ph.D. (CS) graduate from the University of Waterloo, where I was supervised by Prof. Pascal Poupart. Prior to this, I completed my M.Math (CS) at the University of Waterloo, supervised by Prof. Krzysztof Czarnecki. Earlier, I was a B.E. (CSE) student in India at the Birla Institute of Technology, Mesra.

Current research

As a Ph.D. candidate, I focused my research on developing methods for inverse reinforcement learning (IRL), particularly in the constrained MDP setting, i.e., learning a constraint instead of a reward. This encompasses the following objectives:

  • designing inverse constraint learning (ICL) algorithms that can learn constraints from expert demonstrations, in particular algorithms that can learn different types of constraints
  • understanding the use cases of constraint learning over traditional reward learning
  • benchmarking various algorithms for constraint learning
  • applying ICL algorithms to real-world domains like robotics and highway driving
  • determining the relevance of input features (from the state-action space) for ICL

Past research

As a Master’s student, I was part of the Waterloo Intelligent Systems Engineering (WISE) lab. I was also part of the behaviour planning team for Autonomoose, Waterloo’s self-driving car project. My research topics were:

  • safe reinforcement learning, particularly through a Linear Temporal Logic (LTL) based reward specification which ensures that the agent's behaviours satisfy the LTL specification after training
  • continual learning, both in the context of classification and reinforcement learning; in particular, using continual reinforcement learning for an autonomous driving task curriculum with increasing task complexity
  • behaviour planning for autonomous driving
  • out-of-distribution detection in the context of classification