Doctoral research project number: 8913

Description

Submission date: 3 April 2025
Title: Development and applications of a local protein surficial similarity search method
Thesis supervisor: Matthieu MONTES (LCQB)
Scientific field: Information and communication sciences and technologies
CNRS theme: Information sciences and life sciences

Abstract: Proteins are central to most biological processes, and understanding their interactions is essential. Proteins can be described through their sequence, structure, surface and/or function(s). The protein surface is an abstract geometric representation of a protein's potential interactions, structure, fold, and sequence. Proteins sharing a related function display similar surfaces, and this similarity can be independent of their sequence and/or structure similarity. We have developed Protein LOcal Surficial Similarity Screening (PLO3S), a fast, global protein surface shape comparison method based on a local spectral descriptor, Surface Wave Interpolated Maps (SWIM). SWIM is a wave kernel signature (WKS) conformally projected onto a 2D plane. In the present project, we aim to develop a refined local shape comparison method for three applications: identification and retrieval of protein-protein or protein-small molecule binding sites, drug off-target prediction (identifying potential unwanted targets for a given candidate or drug molecule), and drug design for polypharmacological effects (designing molecular binders that target multiple proteins). To this end, we will modify the SWIM descriptor and comparison algorithm to prioritize local surface features, significantly improving sensitivity to the shape variations that result from protein dynamics, which is critical for accurate binding site identification. This work will include exploring novel feature extraction techniques and developing advanced distance metrics for comparing local surface patches. We will use a graph representation of protein surfaces that encodes spatial relationships and surficial properties, embedding information such as hydrophobicity and charge into the graph structure. In particular, Graph Neural Networks (GNNs) will be developed to identify key local features and predict interactions from these graph representations. The performance of the improved method will be evaluated against (1) existing state-of-the-art methods, using benchmarking datasets of known protein-protein and protein-small molecule interactions, and (2) the methods submitted by participants in the SHREC community benchmark track on molecular data, which is recurrently organized by our team.
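
As an illustration only, the graph representation mentioned in the abstract could be sketched as follows. This is a minimal sketch under stated assumptions, not the project's implementation: the names SurfaceNode, build_surface_graph and mean_neighbor_features, the distance cutoff, and the toy hydrophobicity and charge values are all hypothetical. Surface points become nodes carrying the two properties, points closer than the cutoff are linked, and one round of neighbor averaging stands in for the aggregation step on which a GNN layer is built.

```python
from dataclasses import dataclass
from math import dist
from typing import Dict, List, Tuple


@dataclass
class SurfaceNode:
    """Hypothetical surface point with two surficial properties."""
    xyz: Tuple[float, float, float]  # position of the surface point
    hydrophobicity: float            # illustrative value, not real data
    charge: float                    # illustrative value, not real data


def build_surface_graph(
    nodes: List[SurfaceNode], cutoff: float
) -> Dict[int, List[Tuple[int, float]]]:
    """Link every pair of points whose Euclidean distance is <= cutoff.

    Returns an adjacency list mapping a node index to (neighbor, distance).
    """
    edges: Dict[int, List[Tuple[int, float]]] = {i: [] for i in range(len(nodes))}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            d = dist(nodes[i].xyz, nodes[j].xyz)
            if d <= cutoff:
                edges[i].append((j, d))
                edges[j].append((i, d))
    return edges


def mean_neighbor_features(
    nodes: List[SurfaceNode], edges: Dict[int, List[Tuple[int, float]]]
) -> List[Tuple[float, float]]:
    """Average (hydrophobicity, charge) over each node and its neighbors."""
    out = []
    for i, node in enumerate(nodes):
        group = [node] + [nodes[j] for j, _ in edges[i]]
        out.append((
            sum(n.hydrophobicity for n in group) / len(group),
            sum(n.charge for n in group) / len(group),
        ))
    return out


# Toy example: two nearby points and one isolated point.
nodes = [
    SurfaceNode((0.0, 0.0, 0.0), 1.0, 0.0),
    SurfaceNode((1.0, 0.0, 0.0), 0.0, -1.0),
    SurfaceNode((5.0, 0.0, 0.0), -1.0, 1.0),
]
edges = build_surface_graph(nodes, cutoff=2.0)
features = mean_neighbor_features(nodes, edges)
```

In this toy example the first two points are linked and share averaged features, while the isolated third point keeps its own values. A learned GNN would replace the plain average with trainable, possibly distance-weighted, aggregation.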