Doctoral research project number: 8657

Description

Submission date: 14 February 2024
Title: Controllable Code Generation
Thesis supervisor: Benoit SAGOT (Inria-Paris (ED-130))
Scientific field: Information and communication sciences and technologies
CNRS theme: Natural language and speech processing

Abstract: Deep learning, and large language models in particular, are disrupting the field of code generation. From program synthesis to code completion, code translation, and automatic testing, new tools are emerging that augment humans across software engineering tasks. GitHub released Copilot, OpenAI released ChatGPT, and Google is empowering its programmers with ML-based completion. Yet research remains to be done on semantically correct code generation, symbolic regression of programs, better interfaces, and stronger problem-solving skills for new models. This PhD project, conducted in collaboration between the Fundamental AI Research lab (FAIR) at Meta AI and Inria Paris (project-team ALMAnaCH), has the following aims:
1. Design and develop novel code generation methods: find new architectures, prompting strategies, or training methods that offer better accuracy or useful trade-offs between model size, computational speed, and performance. These methods will be tested on common code generation benchmarks such as HumanEval (Chen et al. 2021), MBPP (Austin et al. 2021), or CodeNet (Puri et al. 2021).
2. Develop new applications of machine learning for code-related tasks (e.g. code optimization, code explanation, code reviews).
3. Develop sequence generation and comprehension methods that can be used for tasks other than code generation, for instance to handle long sequences of natural language text.



Doctoral candidate: Chambon Pierre