Submission date: 8 April 2023
Title: Multiagent ethical behaviors in multi-objective reinforcement learning
Thesis supervisor: Aurélie BEYNIER (LIP6)
Advisor: Paolo VIAPPIANI (LIP6)
Scientific field: Information and communication sciences and technologies
CNRS theme: Artificial intelligence
Abstract: Markov Decision Processes (MDPs) and Reinforcement Learning (RL) are two very successful paradigms adopted in Artificial Intelligence for designing autonomous agents capable of dealing with sequential decision problems under uncertainty. One important issue is that, in an open world, any behavior can a priori be learnt, so we should investigate how to avoid “unethical” behaviors. Two kinds of approaches have recently been envisioned to ensure that agents behave ethically: top-down approaches force the agents to respect ethical rules formalized a priori, while bottom-up approaches try to learn ethical behaviors directly from ethical trajectories of actions. Attempts to combine both approaches have also been proposed. In a recent foundational paper, Abel et al. envision the prospect of using reinforcement learning to model an idealized ethical artificial agent. Abel et al. identify two crucial issues raised by this approach: 1) the problem of teaching the agent and 2) the problem of making the policies interpretable. Concerning the first point, Wu and Lin proposed ethics shaping, an approach where the reward obtained by an RL agent is modified to include ethical knowledge: an additional reward is obtained for ethical actions and a negative reward is obtained for unethical actions. Ethics shaping performs quite well when the ethical behavior is clearly identified and aligned with a single objective; however, when the agent has to respect several different ethical guidelines, combining several ethical rewards is challenging. One possible approach is to treat each ethical guideline as a separate objective and to learn a multi-objective strategy.
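The reward modification described above for ethics shaping can be illustrated with a minimal sketch. It is not Wu and Lin's implementation: the function name `shaped_reward`, the explicit sets of ethical and unethical actions, and the `bonus` and `penalty` values are all hypothetical, chosen only to show an environment reward being raised for ethical actions and lowered for unethical ones.

```python
from typing import Set


def shaped_reward(
    env_reward: float,
    action: str,
    ethical_actions: Set[str],
    unethical_actions: Set[str],
    bonus: float = 1.0,    # hypothetical extra reward for an ethical action
    penalty: float = 1.0,  # hypothetical reward removed for an unethical action
) -> float:
    """Return the environment reward adjusted by an illustrative ethics term."""
    if action in ethical_actions:
        return env_reward + bonus
    if action in unethical_actions:
        return env_reward - penalty
    # Actions that are neither ethical nor unethical keep the original reward.
    return env_reward
```

With a single shaping term like this, all ethical considerations are folded into one scalar, which is why several guidelines become hard to combine.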
There has recently been an active debate in the learning community about whether modeling the objective of a system as a single scalar reward is sufficient. Vamplew et al. explain why scalar values are insufficient and provide arguments supporting multi-objective models of reward maximization. This debate is relevant for research on ethical AI agents, as multi-objective approaches can be used for modeling ethical behaviors: the MORAL approach combines Multi-Objective Reinforcement Learning (MORL) and preference elicitation to learn the different components of the reward and their respective weights during policy computation or execution. However, it remains difficult to analyze the resulting strategy and to verify that it follows the desired ethical guidelines. This PhD thesis will investigate the prospect of multiagent reinforcement learning for ethical behaviors. We will investigate how different agents can learn ethical objectives while interacting in the same environment and, at the same time, taking into account some general preferences. We will also explore how the relative importance of the different ethical objectives can be learnt from experts and combined to make ethical and coordinated decisions. We expect to develop methods that build interpretable ethical behaviors that can be understood by the designers and the users of the systems.
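To make the idea of reward components and their respective weights concrete, here is a minimal sketch of linear scalarization, one simple way to combine a vector reward into a scalar. It is not the MORAL algorithm: the objective names, the action names, the reward values, and the helper functions `scalarize` and `best_action` are all hypothetical, and the weights are given by hand rather than elicited from experts.

```python
from typing import Dict, Sequence


def scalarize(reward_vector: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the reward components (linear scalarization)."""
    if len(reward_vector) != len(weights):
        raise ValueError("reward_vector and weights must have the same length")
    return sum(w * r for w, r in zip(weights, reward_vector))


def best_action(
    action_rewards: Dict[str, Sequence[float]], weights: Sequence[float]
) -> str:
    """Return the action whose scalarized reward is highest."""
    return max(action_rewards, key=lambda a: scalarize(action_rewards[a], weights))


# Hypothetical one-step example; components are (task, safety, fairness).
ACTION_REWARDS = {
    "fast_route": (1.0, 0.0, 0.2),
    "careful_route": (0.6, 1.0, 0.8),
}

# Weights that only value the task select the fast route.
task_only = best_action(ACTION_REWARDS, (1.0, 0.0, 0.0))
# Weights that emphasize the two ethical objectives select the careful route.
ethics_heavy = best_action(ACTION_REWARDS, (0.2, 0.4, 0.4))
```

The two calls pick different actions from the same vector rewards, which illustrates why the choice of weights matters and why learning them from experts is part of the problem.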