BIU learning club – Alon Cohen – Efficient Rate Optimal Regret for Adversarial Contextual MDPs Using Online Function Approximation
April 16 @ 12:00 pm - 1:00 pm IDT
Efficient Rate Optimal Regret for Adversarial Contextual MDPs Using Online Function Approximation
We study learning in tabular finite-horizon Markov Decision Processes (MDPs) with adversarially chosen contexts, known as contextual MDPs (CMDPs). Following recent literature, we assume a realizable function class that maps contexts to MDPs, as well as access to online regression oracles that fit the best function given prior observations. This setting is challenging because it does not admit the standard "optimism in the face of uncertainty" approach, which requires constructing statistical confidence sets around the estimated MDP parameters. We present what is, to the best of our knowledge, the first efficient rate-optimal regret minimization algorithm for adversarial CMDPs that operates under the minimal standard assumption of online function approximation. We see this as an additional step towards addressing more general function approximation in a provable manner.
Alon Cohen is an assistant professor at the School of Electrical Engineering at Tel-Aviv University. Before joining TAU, Alon received his PhD from the Faculty of IE&M at the Technion under the supervision of Prof. Tamir Hazan, after which he worked at Google Research, hosted by Prof. Yishay Mansour of Tel-Aviv University.
Alon’s research deals with Reinforcement and Machine Learning theory, Online Learning, Optimization, and more.