Hello! I am Ashish.
I am currently a Ph.D. (CS) candidate at the University of Waterloo, supervised by Prof. Pascal Poupart. Previously, I completed my M.Math (CS) at the University of Waterloo, supervised by Prof. Krzysztof Czarnecki. A lifetime ago, I was a B.E. (CSE) student in India at the Birla Institute of Technology, Mesra.
Current research
As a Ph.D. candidate, my research focuses on developing methods for inverse reinforcement learning (IRL), particularly in the constrained MDP setting. I am also interested in the following related questions:
- how can we capture human preferences better? when should we use rewards and/or constraints?
- can we better characterize the underlying unidentifiability/ambiguity of reward and/or constraint recovery from expert demonstrations?
- if we know something about the causal structure, can we improve reward and constraint recovery and/or specification?
Past research
I was part of the Waterloo Intelligent Systems Engineering (WISE) lab during my Master’s programme. I was also part of the behavior planning team for Autonomoose, Waterloo’s self-driving car project. Broadly, I worked on the following topics:
- safe reinforcement learning
- reward specification based on Linear Temporal Logic (LTL)
- continual learning, both in the context of classification and reinforcement learning
- out-of-distribution detection