Doctoral research project number: 3452

Description

Submission date: 1 January 1900
Title: Signal and image processing study of a non-vocalized speech interface for medical and telecommunication applications
Thesis supervisor: Bruce DENBY (Institut Langevin (EDITE))
Scientific field: Information and communication sciences and technologies
CNRS theme: Not defined

Abstract: In the past few years there has been significant interest in a new type of speech recognition system that uses sensors, such as ultrasound imaging or electromyographic sensors, to obtain direct information on the movement of the articulators (essentially the tongue and lips). Such systems can restore the ability to speak for persons who have lost the use of their vocal cords due to cancer or injury, and can also enable speech recognition in very noisy environments where the acoustic signal is corrupted. A third possibility is the so-called Silent Speech Interface, which would allow a user to engage in spoken communication without activating his or her vocal cords, thus providing a secure and unobtrusive means of verbal communication in a variety of everyday situations; cellphone manufacturers are particularly interested in this last possibility. At the Sigma Lab, ESPCI ParisTech, we have for a few years now been developing a Silent Speech Interface based on real-time ultrasound imaging of the tongue and video imaging of the lips during speech, and we have recently begun to obtain some very interesting results. The purpose of the PhD thesis will be to use image and signal processing techniques, including machine learning, to improve system performance, using as a basis a large, recently acquired database of multispeaker silent speech ultrasound and video data. The goal will then be to move towards a high-performance and genuinely practical system with applications in a wide range of areas in both medicine and industry.

Doctoral student: Xu Kele