Doctoral research project number: 8324

Description

Submission date: 12 April 2022
Title: A machine learning approach to discover relations between the topology of metabolic networks and clinical phenotypes
Thesis supervisor: Nataliya SOKOLOVSKA (NutrioMICS (ED130))
Co-supervisor: Hédi SOULA (NUTRIOMICS (ED394))
Scientific field: Information and communication sciences and technologies
CNRS theme: Artificial intelligence

Abstract: Our main focus is the reconstruction and analysis of microbiota metabolic networks. These networks are reconstructed from metagenomic data and describe the functional abilities of the microbiota. Because they incorporate meta-information (biochemical reactions, redundancy, etc.), they are a step forward compared to traditional differential genomic analysis, which only asks which genes are over- or under-expressed. Although it is now possible to reconstruct these networks as graphs, mathematical and statistical operations on graphs remain challenging and limited, especially clustering methods. Recently, the machine learning community has started to develop graph embedding methods, in which graphs are transformed into meaningful compact representations. Our first avenue of research is to develop a new statistical method for embedding directed networks that naturally identifies communities within graphs, i.e. sub-graphs. This task can be performed, for example, with stochastic block models, a technique for learning graphs that contain clusters or communities, as well as more general groups such as multipartite structures. Our second avenue of research is the development of new metrics that capture relations between the structure of metabolic graphs and their class (phenotype).

To apply: Please contact Hédi Soula (hedi.soula@sorbonne-universite.fr) and Nataliya Sokolovska (nataliya.sokolovska@sorbonne-universite.fr). The ideal candidate holds a Master's degree in Bioinformatics, Computer Science, Mathematics, Systems Biology, or Engineering. The candidate will propose and develop novel machine learning methods for efficient structured data analysis, and is expected to work on the theoretical foundations of these methods as well as to implement them (Python). We expect the candidate to be interested in biological and medical applications.
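
As a purely illustrative aside on the stochastic block models mentioned in the abstract, the sketch below samples a small directed graph with two planted communities and then recovers them from the adjacency matrix. It is not the method the project proposes. The function names, block sizes, and edge probabilities are hypothetical, and the recovery step uses a simple spectral heuristic (the sign of the second left singular vector) as a stand-in for proper stochastic block model inference.

```python
import numpy as np


def sample_directed_sbm(block_sizes, edge_probs, rng):
    """Sample a directed stochastic block model.

    edge_probs[r, s] is the probability of an edge from a node in
    block r to a node in block s (illustrative values, not from the project).
    """
    edge_probs = np.asarray(edge_probs, dtype=float)
    # Planted community label of every node, e.g. [0, 0, ..., 1, 1, ...].
    labels = np.repeat(np.arange(len(block_sizes)), block_sizes)
    # n x n matrix of edge probabilities between every ordered pair of nodes.
    probs = edge_probs[labels][:, labels]
    adjacency = (rng.random(probs.shape) < probs).astype(int)
    np.fill_diagonal(adjacency, 0)  # no self-loops
    return adjacency, labels


def spectral_two_communities(adjacency):
    """Split nodes into two groups using the second left singular vector.

    This is a simple spectral heuristic, used here instead of fitting
    the block model by likelihood-based inference.
    """
    u, _, _ = np.linalg.svd(adjacency.astype(float))
    return (u[:, 1] > 0).astype(int)


def agreement(predicted, labels):
    """Fraction of nodes correctly grouped, up to swapping the two labels."""
    match = np.mean(predicted == labels)
    return max(match, 1.0 - match)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Dense within-block edges, sparse and asymmetric between-block edges.
    adjacency, labels = sample_directed_sbm(
        [50, 50], [[0.6, 0.1], [0.2, 0.6]], rng
    )
    predicted = spectral_two_communities(adjacency)
    print("agreement with planted communities:", agreement(predicted, labels))
```

The asymmetric probability matrix makes the sampled graph genuinely directed, which is why the sketch uses a singular value decomposition rather than an eigendecomposition of a symmetrized matrix.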