Doctoral research project number: 8088


Submission date: 19 March 2021
Thesis supervisor: Daniel RACOCEANU (ICM)
Scientific field: Information and communication sciences and technologies
CNRS theme: Not defined

Abstract: In many biomedical studies, microscopic images are an essential biomarker that can make a decisive difference in biomedical quantification, assessment, understanding and discovery. Despite recent developments in machine learning and deep learning algorithms, the traceability and reproducibility of the proposed algorithms remain major obstacles to their efficient and effective use. We therefore need to master fundamentals such as: i) data quality (an appropriate balance of positive and negative examples, realistic data augmentation / synthesis, efficient and effective normalization), ii) domain adaptation / transfer learning (within a family of closely related applications), iii) explainability / traceability (of a proposed quantification or second opinion) and iv) ergonomic interactivity between the users (biologists and physicians) and the operational framework (relevance feedback, prior knowledge integration). In this PhD study, we propose to design, test and implement a Deep Reinforcement Learning framework for biomedical microscopic image analysis, combining methods that deliver high performance (quality and speed) and high traceability / explainability, and that facilitate the framework's usability in biomedical research and discovery. Besides the analysis module, the framework will integrate a justification generator and a relevance-feedback integrator able to formalize any prior knowledge provided. The overall framework will work interactively, as a cycle of analysis, justification and feedback integration. To this end, we propose to align the decision-making process of our deep learning models with the reasoning process of the end-user decision-maker (biologist, physician, etc.). First, we will favor knowledge-based hybrid approaches, which improve not only explainability but also overall performance compared to black-box data-driven approaches.
In addition, we propose to modify the model at the architecture level so that it generates interpretable explanations for its own decisions. The explanations will involve pattern recognition features (confidence degrees from different levels of the deep learning structure), presented as a comprehensive visual and textual explanation alongside the final decision (pattern recognition-driven explanation). Furthermore, we will provide a saliency map highlighting the parts of the input data that are important for the final decision (data-driven explanation). Causality will also allow us to consolidate the traceability / explainability of the generated results. In this sense, the use of Layer-Wise Relevance Propagation, gradient-based localization and reinforcement learning to generate feedback not only allows the expert to interact with and verify the validity / consistency of the second opinion, but also to visually localize and quantify regions of potential interest for further in-depth analysis. Finally, we plan to provide the end user with constant feedback about the operating conditions, and to apply pre- and post-processing to limit abnormalities in the data that could influence the results (i.e. data correlated with sensitive attributes, to be tested using a Generative Adversarial Network option). To carry out this study, our team will draw on its strong expertise in Deep Learning and semantics, which will serve as a solid basis for supporting relevance feedback from biologists and thus facilitate the adoption of this technology within routine research processes.
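
As a purely illustrative sketch of what a data-driven saliency map could look like, the example below uses occlusion sensitivity: each patch of the image is hidden in turn and the resulting drop in the model score is recorded. This is a simpler stand-in technique, not the Layer-Wise Relevance Propagation or gradient-based localization named in the abstract, and it is not the project's implementation. The names `occlusion_saliency` and `toy_score`, the patch size and the toy image are all assumptions introduced for illustration; `toy_score` merely stands in for a trained classifier.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def occlusion_saliency(
    image: Image,
    score_fn: Callable[[Image], float],
    patch: int = 2,
    fill: float = 0.0,
) -> List[List[float]]:
    """Map each pixel to the score drop observed when its patch is hidden."""
    height, width = len(image), len(image[0])
    baseline = score_fn(image)
    saliency = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            # Copy the image and overwrite one patch with the fill value.
            occluded = [row[:] for row in image]
            for y in rows:
                for x in cols:
                    occluded[y][x] = fill
            # A large drop means the hidden patch mattered for the decision.
            drop = baseline - score_fn(occluded)
            for y in rows:
                for x in cols:
                    saliency[y][x] = drop
    return saliency


def toy_score(image: Image) -> float:
    """Hypothetical stand-in for a classifier: mean of the top-left 2x2 block."""
    return sum(image[y][x] for y in range(2) for x in range(2)) / 4.0


# Toy 4x4 image whose only bright structure sits in the top-left corner.
image = [
    [1.0, 1.0, 0.2, 0.2],
    [1.0, 1.0, 0.2, 0.2],
    [0.2, 0.2, 0.2, 0.2],
    [0.2, 0.2, 0.2, 0.2],
]
saliency_map = occlusion_saliency(image, toy_score, patch=2)
```

In this toy case only the top-left patch receives a non-zero saliency value, which is the kind of localized highlight an expert could inspect to check whether a second opinion rests on a biologically plausible region.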