Description
Filing date: 10 July 2023
Title: Unveiling sequence signatures of protein specificity
Thesis supervisor:
Martin WEIGT (LCQB)
Thesis supervisor:
Olivier TENAILLON (IAME Inserm U1137)
Scientific field: Information and communication sciences and technologies
CNRS theme: Information sciences and life sciences
Abstract: Proteins are defined by their amino-acid sequence, which encodes the information relevant to their 3D structure and their biological function. To unveil the sequence-structure/function relationship in proteins, we propose a tight collaboration between computational and experimental approaches. On the one hand, rapidly growing sequence databases provide ample information for data-driven modeling approaches based on statistical learning and artificial intelligence. On the other hand, modern high-throughput experiments allow for the phenotypic screening, experimental evolution and de novo synthesis of thousands of sequence variants. The main aim of this predominantly computational project is to develop and test efficient modeling tools for an unprecedented quantitative characterization of how protein sequence space is organized. These tools will integrate readily available information from sequence databases with the most up-to-date experimental data. A particular aim is to obtain statistical models that are both interpretable and generative, and that shed light on how functional specificity is finely encoded in a protein's amino-acid sequence.
Doctoral candidate: Netti Roberto