This repo contains the code for an orchestrator that decides which RL algorithm to use depending on the type of environment: bandit, contextual bandit, or reinforcement learning.
Djallel Bouneffouf
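
As a rough illustration of the idea described above, the sketch below picks an algorithm family from two properties of the environment. It is a minimal sketch and not the repo's actual implementation: the names `EnvDescription`, `classify`, and `select_algorithm`, the two boolean fields, and the algorithm chosen for each environment type (UCB1, LinUCB, Q-learning) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class EnvType(Enum):
    BANDIT = "bandit"
    CONTEXTUAL_BANDIT = "contextual_bandit"
    REINFORCEMENT_LEARNING = "reinforcement_learning"


@dataclass(frozen=True)
class EnvDescription:
    """Hypothetical summary of an environment (not taken from the repo)."""
    has_context: bool           # agent observes features before acting
    actions_affect_state: bool  # actions change future observations


def classify(env: EnvDescription) -> EnvType:
    """Map environment properties to one of the three environment types."""
    if env.actions_affect_state:
        # Actions have long-term consequences: full RL problem.
        return EnvType.REINFORCEMENT_LEARNING
    if env.has_context:
        # Context is observed, but actions do not change what comes next.
        return EnvType.CONTEXTUAL_BANDIT
    # No context and no state transitions: plain multi-armed bandit.
    return EnvType.BANDIT


# Illustrative mapping only; the algorithms used in the repo may differ.
ALGORITHM_FOR = {
    EnvType.BANDIT: "UCB1",
    EnvType.CONTEXTUAL_BANDIT: "LinUCB",
    EnvType.REINFORCEMENT_LEARNING: "Q-learning",
}


def select_algorithm(env: EnvDescription) -> str:
    """Return the name of the algorithm assigned to the environment type."""
    return ALGORITHM_FOR[classify(env)]
```

For example, `select_algorithm(EnvDescription(has_context=True, actions_affect_state=False))` returns the contextual-bandit entry of the mapping. Checking `actions_affect_state` first reflects the assumption that any environment with state transitions is treated as a reinforcement learning problem, whether or not it also exposes context.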