Description
Filing date: 15 November 2023
Title: Causal inference and machine learning methods for heterogeneous biological data
Thesis supervisor:
Hervé ISAMBERT (PC_Curie)
Thesis supervisor:
Maria Carla PARRINI (Institut Curie, Immunity and Cancer)
Scientific field: Information and communication sciences and technologies
CNRS theme: Information sciences and life sciences
Abstract: Time-lapse imaging microscopy and single-cell transcriptomics, now routinely used in cell and developmental biology labs, produce massive amounts of video images and gene expression data at single-cell resolution. However, this wealth of heterogeneous data remains largely under-explored due to the lack of unsupervised methods and tools to analyze it. This highlights the need to develop new Machine Learning and Artificial Intelligence strategies to better exploit the richness and complexity of the information contained in space- and time-resolved cell and developmental biology data.
The Isambert lab has developed novel causal inference methods and tools (https://miic.curie.fr, MIIC R package) to learn cause-effect relationships in a variety of biological or clinical datasets, from single-cell transcriptomic and genomic alteration data (Verny et al 2017, Sella et al 2018, Desterke et al 2020) to medical records of patients (Cabeli et al 2020, Sella et al 2022, Ribeiro Dantas et al 2023). These machine learning methods combine multivariate information analysis with interpretable graphical models (Li et al 2019, Cabeli et al 2021, Ribeiro Dantas et al 2023) and outperform other methods on a broad range of benchmarks, achieving better results with ten to a hundred times fewer samples. They have also recently been adapted to analyze time series data such as live-cell time-lapse images of “tumor-on-chips”, which are micro-tumors reconstituted in vitro (Simon et al 2023).
The present PhD project will extend these causal inference and unsupervised Machine Learning methods to analyze large-scale heterogeneous data, with applications to time-lapse 3D imaging (i.e., 4D imaging) and single-cell transcriptomic data on 3D multicellular systems. These systems include “gastruloids”, which are early mammalian development models derived from embryonic stem cells, and will be studied in collaboration with our biologist and biophysicist partners from the multidisciplinary MecaCell3D consortium. In particular, the methods will be applied to analyze “tumor-on-chip” ecosystems, in collaboration with MC Parrini (Institut Curie).
PhD candidate: Montagne Louis