Multi-level decision making with causal bandits
Multi-armed bandits are a standard formalism for representing simple yet realistic decision-making problems in which a policy-maker has to find an optimal balance between exploiting well-known options and exploring new alternatives. Traditionally, such decision-making problems are encoded using a single model; in reality, however, a decision-maker may have multiple related models of the same problem at different levels of resolution, each one providing information about the value and the effects of the available choices. In this talk we will recall the standard multi-armed bandit framework, extend it to a causal setting, and explain how multiple models can be related via causal abstraction. Finally, we will discuss a few theoretical results about transporting information across models via abstraction, using basic algorithms inspired by reinforcement learning.
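As background for the exploration/exploitation trade-off the talk builds on, here is a minimal illustrative sketch of a multi-armed bandit with an epsilon-greedy policy. The Bernoulli arms, the `epsilon` parameter, and the specific update rule are assumptions chosen for illustration, not the speaker's algorithm or the causal-abstraction method discussed in the talk.

```python
import random

class BernoulliBandit:
    """A K-armed bandit where arm a yields reward 1 with probability probs[a]."""
    def __init__(self, probs):
        self.probs = probs

    def pull(self, arm):
        return 1.0 if random.random() < self.probs[arm] else 0.0

def epsilon_greedy(bandit, n_rounds, epsilon=0.1):
    """Balance exploration and exploitation: with probability epsilon
    pick a uniformly random arm, otherwise the best empirical arm."""
    k = len(bandit.probs)
    counts = [0] * k      # number of pulls per arm
    values = [0.0] * k    # running mean reward per arm
    for _ in range(n_rounds):
        if random.random() < epsilon:
            arm = random.randrange(k)                     # explore
        else:
            arm = max(range(k), key=lambda a: values[a])  # exploit
        reward = bandit.pull(arm)
        counts[arm] += 1
        # incremental update of the empirical mean reward
        values[arm] += (reward - values[arm]) / counts[arm]
    return values, counts

# Hypothetical arm probabilities; the policy should concentrate on arm 2.
values, counts = epsilon_greedy(BernoulliBandit([0.2, 0.5, 0.8]), n_rounds=5000)
print(values, counts)
```

In the multi-model setting the talk addresses, one can imagine several such bandits at different levels of resolution, with an abstraction map grouping fine-grained arms into coarser ones so that reward estimates gathered in one model inform choices in another.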