NA³Os
Neural Approximate Accelerator Architecture Optimization for DNN Inference on Lightweight FPGAs
Embedded Machine Learning (ML) is a fast-growing
field comprising ML algorithms, hardware, and software capable of
performing on-device sensor data analysis at extremely low power,
thus enabling a range of always-on and battery-powered applications and
services. Running ML-based applications on embedded edge devices
attracts substantial research and business interest for many reasons,
including accessibility, privacy, latency, cost, and security. Embedded
ML is primarily represented by artificial intelligence (AI) at the edge
(EdgeAI) and on tiny, ultra-resource-constrained devices, a.k.a. TinyML.
TinyML demands both energy efficiency and low latency while
retaining accuracy at acceptable levels, thus mandating
optimization of the software and hardware stack.
GPUs form the
default platform for DNN training workloads due to their highly
parallel computation, which originates from their massive number of processing
cores. However, GPUs are often not an optimal solution for DNN inference
acceleration because of their high energy cost and lack of
reconfigurability, especially for high-sparsity models or customized
architectures. Field Programmable Gate Arrays (FPGAs), on the other hand,
offer potentially lower latency and higher
efficiency than GPUs while providing high customization and faster
time-to-market, combined with a potentially longer useful life than ASIC
solutions.
In the context of TinyML, NA³Os focuses on a neural
approximate accelerator-architecture co-search targeting specifically
lightweight FPGA devices. This project investigates design techniques to
optimally and automatically map DNNs to resource-constrained FPGAs
while exploiting principles of approximate computing. Our particular
topics of investigation include:
- Efficient mapping of DNN operations onto approximate hardware components (e.g., multipliers, adders, DSP Blocks, BRAMs).
- Techniques for fast and automated design space exploration of mappings of DNNs defined by a set of approximate operators and a set of FPGA platform constraints.
- Investigation of a hardware-aware neural architecture co-search methodology targeting FPGA-based DNN accelerators.
- Evaluation of robustness vs. energy efficiency tradeoffs.
- Finally, all developed methods shall be evaluated experimentally by providing a proper synthesis path and comparing the quality of generated solutions against the state of the art.
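To make the second topic more concrete, the sketch below shows a brute-force design space exploration that assigns one of several approximate multiplier variants to each DNN layer under an FPGA LUT budget, minimizing a simple additive error model. All variant names, LUT costs, error figures, and the layer set are illustrative assumptions for this sketch, not operators, cost models, or results from the project.

```python
from itertools import product

# Hypothetical approximate multiplier library: LUT cost per multiplier
# instance and the mean output error each variant is assumed to introduce.
APPROX_MULTS = {
    "exact":  {"luts": 60, "error": 0.00},
    "trunc4": {"luts": 40, "error": 0.02},  # mild partial-product truncation
    "trunc8": {"luts": 25, "error": 0.08},  # aggressive truncation
}

def explore(layers, lut_budget):
    """Exhaustively assign one multiplier variant per layer; return the
    lowest-error configuration that fits within the LUT budget."""
    best = None
    for choice in product(APPROX_MULTS, repeat=len(layers)):
        # Total LUT cost: per-multiplier cost times multipliers per layer.
        luts = sum(APPROX_MULTS[v]["luts"] * layer["mults"]
                   for layer, v in zip(layers, choice))
        if luts > lut_budget:
            continue  # violates the platform constraint
        # Toy quality model: errors accumulate additively across layers.
        err = sum(APPROX_MULTS[v]["error"] for v in choice)
        if best is None or err < best[1]:
            best = (choice, err, luts)
    return best

# Illustrative three-layer network with multiplier counts per layer.
layers = [{"name": "conv1", "mults": 16},
          {"name": "conv2", "mults": 32},
          {"name": "fc",    "mults": 8}]

config, err, luts = explore(layers, lut_budget=2500)
print(config, round(err, 4), luts)
# → ('trunc4', 'trunc4', 'exact') 0.04 2400
```

The exhaustive search grows exponentially with the number of layers, which is precisely why the project targets fast, automated exploration techniques rather than enumeration.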
Publications
- Sabih M., Karim A., Wittmann J., Hannig F., Teich J.:
Hardware/Software Co-Design of RISC-V Extensions for Accelerating Sparse DNNs on FPGAs
International Conference on Field Programmable Technology (FPT 2024) (Sydney, Australia, 10. December 2024 - 12. December 2024)
- Deutel M., Hannig F., Mutschler C., Teich J.:
On-Device Training of Fully Quantized Deep Neural Networks on Cortex-M Microcontrollers
In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (2024)
ISSN: 0278-0070
DOI: 10.1109/TCAD.2024.3484354
URL: https://ieeexplore.ieee.org/document/10726519
- Sabih M., Sesli B., Hannig F., Teich J.:
Accelerating DNNs using Weight Clustering on RISC-V Custom Functional Units
Conference on Design, Automation and Test in Europe (DATE) (Valencia, 25. March 2024 - 27. March 2024)
In: Proceedings of the Conference on Design, Automation and Test in Europe (DATE) 2024
- Sabih M., Yayla M., Hannig F., Teich J., Chen JJ.:
Robust and Tiny Binary Neural Networks using Gradient-based Explainability Methods
EuroMLSys '23: Proceedings of the 3rd Workshop on Machine Learning and Systems (Rome, Italy, 8. May 2023 - 8. May 2023)
In: Eiko Yoneki, Luigi Nardi (ed.): EuroMLSys '23: Proceedings of the 3rd Workshop on Machine Learning and Systems, New York, NY, United States: 2023
DOI: 10.1145/3578356.3592595
URL: https://dl.acm.org/doi/10.1145/3578356.3592595
- Sabih M., Hannig F., Teich J.:
Fault-Tolerant Low-Precision DNNs using Explainable AI
Workshop on Dependable and Secure Machine Learning (DSML) (Virtual Workshop, 21. June 2021 - 24. June 2021)
In: 2021 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W) 2021
DOI: 10.1109/DSN-W52860.2021.00036
URL: https://ieeexplore.ieee.org/document/9502445/