Description
Submission date: 1 January 1900
Title: Development of deep-learning approaches to discover metabolic interaction networks of environmental microbial communities from large metagenomic datasets, domain co-occurrence and protein coevolution
Thesis supervisor:
Hugues RICHARD (LCQB)
Thesis supervisor:
Alessandra CARBONE (LCQB)
Scientific field: Information and communication sciences and technologies
CNRS theme: Not defined
Abstract:
Background
Metagenomics consists of taking an environmental sample and sequencing the DNA or RNA extracted from the microorganisms present in it. The resulting sequences make it possible to build a catalog of species by comparison with already known microbial species (taxonomy), to reconstruct the genomes of previously unknown species (assembly), and above all to characterize the metabolic functions performed by the community by analyzing the properties of its genes (functional annotation). Metagenomics has huge potential for discovery, because over 99% of bacterial species cannot be cultivated in the laboratory. For this reason, several large-scale projects have been launched in recent years to characterize the bacterial diversity of the oceans (Tara Oceans project, GOS, ...), microbes commensal to humans (Human Microbiome Project Consortium, 2012), soil composition (Fierer et al., 2012), the urban environment (Afshinnekoo et al., 2015), and microbial communities subject to extreme environmental conditions (eXtreme Microbiome Project).
Metagenomics has led to an important conceptual change in the analysis of microbial populations: biologists no longer study single genomes independently but rather collections of hundreds or thousands of genomes that perform one or more functions as a community. Detailed descriptions of the functions performed by communities are expected to play an important role in our understanding of the equilibrium of the bacterial flora, bioremediation, the mechanisms of antibiotic resistance, and the development of bacteria that could produce usable energy.
Understanding microbial communities means two things: (i) characterizing them phylogenetically and (ii) describing the metabolic functions that they perform (e.g. annotating their protein domains). However, (ii) is usually overlooked or only partially completed because of its complexity, and the description of communities is nowadays mostly limited to a taxonomic description of their bacterial species. The metabolic analysis is complex because current functional annotation strategies have low sensitivity, and because it is difficult to assign individual genes obtained from a metagenomic sample to species.
Doctoral student: David Laurent