Multi-Objective Hardware-Mapping Co-Optimisation for Multi-DNN Workloads on Chiplet-based Accelerators
By Abhijit Das (Universitat Politécnica de Catalunya, Spain), Enrico Russo (University of Catania, Italy) and Maurizio Palesi (University of Catania, Italy)
The need to efficiently execute different Deep Neural Networks (DNNs) on the same computing platform, coupled with the requirement for easy scalability, makes Multi-Chip Module (MCM)-based accelerators a preferred design choice. Such an accelerator brings together heterogeneous sub-accelerators in the form of chiplets, interconnected by a Network-on-Package (NoP). This paper addresses the challenge of selecting the most suitable sub-accelerators, configuring them, determining their optimal placement in the NoP, and mapping the layers of a predetermined set of DNNs spatially and temporally. The objective is to minimise execution time and energy consumption during parallel execution while also minimising the overall cost, specifically the silicon area, of the accelerator.
This paper presents MOHaM, a framework for multi-objective hardware-mapping co-optimisation for multi-DNN workloads on chiplet-based accelerators. MOHaM exploits a multi-objective evolutionary algorithm that has been specialised for the given problem by incorporating several customised genetic operators. MOHaM is evaluated against state-of-the-art Design Space Exploration (DSE) frameworks on different multi-DNN workload scenarios. The solutions discovered by MOHaM are Pareto optimal compared to those by the state-of-the-art. Specifically, MOHaM-generated accelerator designs can reduce latency by up to 96% and energy by up to 96.12%.
To read the full article, click here
Related Chiplet
- Interconnect Chiplet
- 12nm EURYTION RFK1 - UCIe SP based Ka-Ku Band Chiplet Transceiver
- Bridglets
- Automotive AI Accelerator
- Direct Chiplet Interface
Related Technical Papers
- Taming the Tail: NoI Topology Synthesis for Mixed DL Workloads on Chiplet-Based Accelerators
- On hardware security and trust for chiplet-based 2.5D and 3D ICs: Challenges and Innovations
- SCAR: Scheduling Multi-Model AI Workloads on Heterogeneous Multi-Chiplet Module Accelerators
- Communication Characterization of AI Workloads for Large-scale Multi-chiplet Accelerators
Latest Technical Papers
- PICNIC: Silicon Photonic Interconnected Chiplets with Computational Network and In-memory Computing for LLM Inference Acceleration
- Optimizing Attention on GPUs by Exploiting GPU Architectural NUMA Effects
- Inter-chip Clock Network Synthesis on Passive Interposer of 2.5D Chiplet Considering Transmission Line Effect
- Simulation-Driven Evaluation of Chiplet-Based Architectures Using VisualSim
- CHIPSIM: A Co-Simulation Framework for Deep Learning on Chiplet-Based Systems