BIU learning club – Alon Cohen – Efficient Rate Optimal Regret for Adversarial Contextual MDPs Using Online Function Approximation

Title: Efficient Rate Optimal Regret for Adversarial Contextual MDPs Using Online Function Approximation

Abstract: We study learning tabular finite-horizon Markov Decision Processes with adversarially chosen contexts. Following recent literature, we assume a realizable function class that maps contexts to MDPs, as well as access to online regression oracles that fit the best function given prior observations. This setting ...