Doctoral research project number: 4724


Submission date: 1 January 1900
Titre: Virtual humanoids learning motion skills
Thesis supervisor: Olivier SIGAUD (ISIR (EDITE))
Scientific domain: Information and communication sciences and technologies
CNRS theme: Not defined

Abstract: Human motion is highly complex. It requires many abilities developed over a long learning process, such as advanced balance, good coordination, the ability to exploit inertial effects, to rapidly plan new contacts, and to precisely control interaction forces. In a nutshell, the objective of this PhD thesis is to design specific learning algorithms that enable virtual humanoids to autonomously construct repertoires of motion skills, with the aim of acquiring these abilities. We will explore machine learning for humanoid motion as a field in its own right. This differs markedly from existing research, which develops either optimization-based methods that rely on machine learning only for parameter adjustment, or generic learning frameworks that struggle to cope with the specific complexities of humanoid motion. The global objective is to design unsupervised and reinforcement learning algorithms that can acquire motion skills and mobilize them adequately. Eventually, the virtual humanoid should be able to generate complex motions from simple inputs: for instance, defining a few waypoints for the head trajectory should lead to the automatic generation of walking motions and jumps.

The research will be organized around three axes corresponding to three important ingredients of robot motion: motion features and state representations, motion primitives, and skills sequencing and abstraction.

1. Motion features and state representations. A large number of efficient algorithms for bipedal walking and balance are based on simplified models of the dynamics and on the control of meaningful quantities such as the Zero Moment Point [1], the Capture Point [2], or the centroidal angular momentum [3], to cite just a few. This PhD thesis will explore learning algorithms that take as inputs not only raw joint angles, velocities and torques, but also well-selected vectors of physically meaningful measurements.
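As an illustration of such a physically meaningful feature, the instantaneous capture point of the linear inverted pendulum model [2] can be computed directly from the center-of-mass state. The sketch below is ours (function name and interface are illustrative, not taken from the cited works):

```python
import math

def capture_point(com_pos, com_vel, com_height, g=9.81):
    """Instantaneous capture point of the linear inverted pendulum model.

    com_pos, com_vel: horizontal CoM position and velocity, e.g. (x, y) tuples.
    com_height: constant CoM height z0 assumed by the model (meters).
    Returns the ground point where the robot should step to come to a stop.
    """
    omega = math.sqrt(g / com_height)  # natural frequency of the pendulum
    return tuple(p + v / omega for p, v in zip(com_pos, com_vel))
```

Feeding such quantities to a learning algorithm alongside raw joint states is exactly the kind of redundant, physics-informed input this axis will study.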
This will raise several challenges, such as evaluating the usefulness of these redundant inputs, or constructing new relevant features based on physical principles. Additionally, efforts will be made to extend to humanoid motion a recent work that proposed to learn appropriate state representations based on priors that are specific to robotics [4] (for instance, the proportionality between control inputs and the rate of change of some features).

2. Motion primitives. It has been demonstrated in practice that convolutional neural networks (CNNs) are very useful for image processing [5]. Convolutions serve as a generic and flexible operation that achieves meaningful dimensionality reduction by exploiting the link between proximity and information redundancy in pixel grids. But the sensory input used in the control of virtual or physical humanoid robots is very different from images, especially in the context of this PhD thesis, in which visual input will never be considered (in simulation, the robot posture and configuration are always known). Furthermore, the control output is much richer than in classical applications of reinforcement learning. This PhD proposes to search for types of computations that, similarly to CNNs for image processing, will improve the efficiency of learning algorithms for humanoid motion control. Several types of computations will be considered, all relying on specificities of humanoids, such as their underactuation, redundancy and hierarchical kinematics. Considering relevant functions should help make the learning of controllers more tractable. Interesting results have been obtained with the classical framework of dynamic movement primitives [6], but the use of fixed parametrized structures for the controllers limits the flexibility and expressiveness of the skills that can be learned.
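To make the DMP framework concrete, here is a minimal sketch of a one-dimensional discrete dynamic movement primitive in the spirit of [6]. The Euler integration scheme and parameter values are illustrative choices of ours; with all basis weights at zero, the system reduces to a stable spring-damper converging to the goal:

```python
import numpy as np

def rollout_dmp(y0, goal, weights, centers, widths, tau=1.0,
                dt=0.001, alpha=25.0, beta=6.25, alpha_x=8.0):
    """Minimal 1-D discrete dynamic movement primitive, Euler-integrated.

    The learned trajectory shape is encoded by `weights` of Gaussian basis
    functions placed at `centers` (with bandwidths `widths`) over the
    canonical phase variable x, which decays from 1 to 0 over duration tau.
    """
    y, v, x = y0, 0.0, 1.0
    traj = [y]
    for _ in range(int(tau / dt)):
        psi = np.exp(-widths * (x - centers) ** 2)       # basis activations
        f = (psi @ weights) * x / (psi.sum() + 1e-10)    # nonlinear forcing term
        dv = (alpha * (beta * (goal - y) - v) + (goal - y0) * f) / tau
        dx = -alpha_x * x / tau                          # canonical system
        v += dv * dt
        y += v * dt
        x += dx * dt
        traj.append(y)
    return np.array(traj)
```

The fixed parametrization mentioned above is visible here: once `centers` and `widths` are chosen, only the linear `weights` are learned, which is precisely what limits the expressiveness of the resulting skills.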
To avoid this pitfall while keeping complexity low, we will build upon a recent work [7] that introduces a diffeomorphic matching algorithm to deform vector fields. This offers a flexible way to incrementally modify time-invariant controllers without losing key topological properties such as asymptotic stability. Another research direction concerns the incorporation into learning processes of two very successful approaches for humanoid robot trajectory generation: motion planning and optimization (see [8]).

3. Skills sequencing and abstraction. This axis concerns the orchestration of all the learning tasks and the organization of the end-to-end framework that defines and trains new skills, and creates a repertoire from which skills can be appropriately leveraged by the humanoid depending on the situation and on the task to achieve. Following what humans seem to do, expert methods for humanoid motion generation are usually based on several layers of algorithms connecting low-level controllers to higher-level decision modules. These global approaches can be called "tiered strategies". The PhD candid

Doctoral candidate: Matheron Guillaume