Submission date: 12 April 2022
Title: Design of an AI hardware accelerator for edge computing
Thesis supervisor: Haralampos STRATIGOPOULOS (LIP6)
Thesis supervisor: Hassan ABOUSHADY (LIP6)
Scientific field: Information and communication sciences and technologies
CNRS theme: Artificial intelligence
Abstract: Artificial Intelligence (AI) and Machine Learning (ML) algorithms have been a subject of interest for several decades. Although AI and ML have gone through hype cycles of enthusiasm and disappointment, recent algorithmic advances, in particular Deep Neural Networks (DNNs), together with the availability of big data and the rapid growth of computing power, have renewed interest and led to applications in numerous fields, e.g., robotics, medicine, autonomous vehicles, computer vision, speech recognition, natural language processing, and gaming.
DNN models are computationally intensive, requiring on the order of millions of operations. From a hardware perspective, this poses severe challenges in data storage, frequent data movement, and processing speed on conventional Central Processing Units (CPUs) with a traditional Von Neumann architecture, a bottleneck commonly known as the “memory wall” problem. Hence, there is a pressing need to design dedicated, customized processors for AI, referred to as AI hardware accelerators, which belong to the larger family of domain-specific computing paradigms. Widely used AI hardware accelerators today are Graphics Processing Units (GPUs) and Field-Programmable Gate Arrays (FPGAs), but orders-of-magnitude improvements in energy and speed can be achieved with Application-Specific Integrated Circuits (ASICs).
Another strong incentive for designing AI hardware accelerators is to push the execution of AI algorithms from the cloud closer to the sources of data, onto edge devices. This is driven by energy, bandwidth, speed, availability, and privacy requirements.
More specifically, edge computing reduces the data transfer requirement, which saves energy. Given the forecast that several tens of billions of edge devices will be connected to the internet in the near future, edge computing would also save bandwidth. Several applications, e.g., autonomous vehicles, require low-latency, real-time computation, which is slowed down by communication with the cloud. Other applications require high availability and therefore need to be independent of the internet. Finally, handling data locally offers privacy, as opposed to transmitting sensitive data to the cloud.
Edge AI is a challenging objective since edge devices have limited resources and are often battery-operated. Design efforts towards embedded or application-specific AI hardware accelerators are intense and ongoing, and there are several design flavors. Analog and mixed-signal implementations can offer orders of magnitude lower power consumption than their digital counterparts; they may therefore be better suited for edge computing, since they can act directly on sensory data from world-machine interfaces. One way to reduce energy consumption is approximate computing, which refers either to using approximate arithmetic units in the processing elements of the hardware neural network or to performing network compression or quantization. Quantization reduces the precision of the weights and neuron activation values by transforming floating-point numbers into narrow, few-bit integers. Another design paradigm with tremendous potential is in-memory computing, where the matrix-vector multiplications of the neural network are performed within the memory itself. In-memory computing has two main embodiments, namely performing arithmetic and logic operations within the on-chip SRAM or on memristive crossbar arrays.
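The quantization step described above can be illustrated with a minimal Python/NumPy sketch. It is only an illustration under stated assumptions, not the scheme the thesis will adopt: the function names `quantize_symmetric` and `dequantize`, the 4-bit width, the example weights, and the choice of a single symmetric scale per tensor are all hypothetical.

```python
import numpy as np

def quantize_symmetric(x, num_bits=4):
    """Map a float array to signed few-bit integers with one shared scale.

    Hypothetical helper: symmetric, per-tensor, round-to-nearest quantization
    is an assumption chosen for simplicity (valid here for num_bits <= 8).
    """
    qmax = 2 ** (num_bits - 1) - 1            # largest integer level, 7 for 4 bits
    max_abs = float(np.max(np.abs(x)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the integer codes."""
    return q.astype(np.float32) * scale

# Made-up example weights, not taken from any real model.
weights = np.array([0.7, -0.33, 0.12, 0.0, -0.7], dtype=np.float32)
q, scale = quantize_symmetric(weights, num_bits=4)
restored = dequantize(q, scale)
max_error = float(np.max(np.abs(weights - restored)))

print("integer codes:", q)
print("scale:", scale)
print("max reconstruction error:", max_error)
```

With round-to-nearest, the reconstruction error of each value is bounded by about half the scale, which is the precision given up in exchange for narrower arithmetic and storage.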
Finally, another trend is spiking neural networks, the third generation of neural networks, which aim at bridging the gap between biological neural networks and machine learning in terms of speed and energy consumption.
The objective of this thesis will be the design of a lightweight, low-energy AI hardware accelerator to be embedded onto edge devices. The particular application we are interested in is spectrum sensing. The edge device will be connected to other devices as well as to the cloud. It is important to analyze the spectrum of the signals it receives in real time for two principal reasons: (a) to optimize spectrum utilization, i.e., to give priority to under-utilized frequency bands so as not to congest the wireless network; and (b) to detect incoming signals that exhibit suspicious behavior for security purposes, e.g., jamming signals or signals of a side-channel attack attempting to steal sensitive data from the chip or to force the chip into denial of service. First, a DNN model will be designed for incoming signal classification. Then, a dedicated AI hardware accelerator will be designed, onto which the DNN model will be mapped. The thesis will focus on the circuit-level implementation of the DNN accelerator and will go as far as chip fabrication in an advanced technology node. The thesis will also study the security properties of the DNN accelerator, including resilience to adversarial attacks, backdoor attacks, DNN model theft, and fault injection attacks.
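To make objective (a) above concrete, the following Python/NumPy sketch shows a conventional energy-detection baseline that flags which frequency bands are occupied and which are under-utilized. It is not the DNN classifier the thesis proposes. The function name `band_occupancy`, the number of bands, the 10 dB threshold, the median-based noise-floor estimate, and the synthetic test signal are all assumptions made for illustration.

```python
import numpy as np

def band_occupancy(samples, num_bands=8, threshold_db=10.0):
    """Return a boolean array marking bands whose power exceeds the noise floor.

    Hypothetical helper: equal-width bands, a Hann window, and a median-based
    noise-floor estimate are illustrative assumptions. The median is only a
    reasonable noise estimate when fewer than half of the bands are occupied.
    """
    windowed = samples * np.hanning(len(samples))
    power = np.abs(np.fft.rfft(windowed)) ** 2        # one-sided power spectrum
    usable = (len(power) // num_bands) * num_bands    # drop leftover bins
    band_power = power[:usable].reshape(num_bands, -1).mean(axis=1)
    noise_floor = np.median(band_power)
    band_db = 10.0 * np.log10(band_power / noise_floor)
    return band_db > threshold_db

# Synthetic test signal (made-up values): a 300 Hz tone in white noise.
rng = np.random.default_rng(0)
fs, n = 1000.0, 1024
t = np.arange(n) / fs
signal = np.sin(2 * np.pi * 300.0 * t) + 0.1 * rng.standard_normal(n)

occupied = band_occupancy(signal)
free_bands = np.flatnonzero(~occupied)    # candidate under-utilized bands
print("occupied:", occupied)
print("under-utilized bands:", free_bands)
```

A fixed-threshold detector like this cannot distinguish a legitimate transmitter from a jammer, which is one motivation for classifying incoming signals with a learned model instead.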