Submission date: 5 April 2021
Title: Deep learning for assisting diagnosis of neurological diseases using a very large-scale clinical data warehouse
Thesis supervisor: Olivier COLLIOT (ICM)
Thesis supervisor: Didier DORMONT (ICM)
Scientific field: Information and communication sciences and technologies
CNRS theme: Not defined
Abstract:
Keywords: deep learning, medical imaging, big data.
Summary. Neurological diseases are a major public health concern. Early and accurate diagnosis is essential to provide adequate care for patients and to design effective clinical trials for new treatments. In recent years, very large hospital data warehouses have been established. In particular, the data warehouse of the AP-HP (Assistance Publique-Hôpitaux de Paris) gathers data from all the hospitals of the greater Paris area, including clinical data, diagnoses, medical reports and medical imaging data (MRI, PET, CT). For instance, it contains over 130,000 MRIs from adult patients with various types of disorders. This resource offers an exceptional opportunity to train effective deep learning models. Very recently, our team was the first to publish a deep learning tool for neuroimaging data built using the AP-HP data warehouse (Bottani et al., 2021). This tool performs automatic quality control of T1-weighted MRI data and thus selects the data that are usable for training deep learning models. The aim of this project is to design and validate deep learning methods for computer-assisted diagnosis of neurological disorders using a very large dataset (over 100,000 patients) from the AP-HP data warehouse. The first objective will be to design an approach for differential diagnosis from T1-weighted MRI data. A major challenge will be handling a very large set of possible diagnoses (several hundred), some of which may co-exist in the same patient.
This will require the design of dedicated deep learning architectures that account for these specificities. A second objective will be to extend the work to other types of brain imaging data (other MRI sequences such as T2-weighted, FLAIR and diffusion MRI, as well as CT and PET). To that end, we will first extend the automatic quality control approach that we proposed for T1-weighted MRI to other modalities. Then, we will design a computer-aided diagnosis method that can use multimodal data as input. Finally, if time permits, we will explore the design of models that could automatically generate medical reports from imaging data. This is a challenging task that has so far been addressed only for much simpler data such as 2D X-ray radiographs. To that end, we will propose new architectures adapted to encoding images for subsequent text generation.
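
As an illustration of the first objective, where several diagnoses may co-exist in the same patient, the output stage can be framed as multi-label classification: one independent sigmoid per diagnosis rather than a single softmax over mutually exclusive classes. The sketch below is only a minimal, hypothetical example of that framing and not the architecture proposed in this project. The diagnosis names, the logit values and the 0.5 decision threshold are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def multilabel_probs(logits):
    # One independent probability per diagnosis; unlike a softmax,
    # the probabilities are not forced to sum to 1.
    return [sigmoid(z) for z in logits]

def binary_cross_entropy(probs, targets, eps=1e-12):
    # Mean binary cross-entropy over all diagnoses for one patient.
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)

def predict_labels(probs, names, threshold=0.5):
    # Every diagnosis whose probability reaches the threshold is reported,
    # so a patient can receive zero, one or several diagnoses.
    return [n for n, p in zip(names, probs) if p >= threshold]

# Hypothetical label set (the project mentions several hundred diagnoses).
DIAGNOSES = ["diagnosis_A", "diagnosis_B", "diagnosis_C", "diagnosis_D"]

# Hypothetical network outputs (logits) and ground truth for one patient.
logits = [2.0, -1.5, 0.8, -3.0]
targets = [1, 0, 1, 0]

probs = multilabel_probs(logits)
loss = binary_cross_entropy(probs, targets)
predicted = predict_labels(probs, DIAGNOSES)

print(predicted)
print(round(loss, 3))
```

In this toy case two diagnoses exceed the threshold at once, which a softmax output with a single arg-max decision could not express.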