Inter-Layer Scheduling Space Exploration for Multi-model Inference on Heterogeneous Chiplets

By Mohanad Odema, Hyoukjun Kwon, Mohammad Abdullah Al Faruque (University of California)

To address the growing compute demand of recent multi-model workloads that include heavy models such as large language models, we propose deploying heterogeneous chiplet-based multi-chip module (MCM) accelerators. We develop an advanced scheduling framework for heterogeneous MCM accelerators that comprehensively considers complex heterogeneity and inter-chiplet pipelining. Our experiments with this framework on GPT-2 and ResNet-50 models on a 4-chiplet system show up to 2.2x and 1.9x increases in throughput and energy efficiency, respectively, compared to a monolithic accelerator with an optimized output-stationary dataflow.
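To make the idea of inter-layer scheduling with inter-chiplet pipelining more concrete, the following is a minimal toy sketch and not the authors' framework. It assumes each chiplet runs one contiguous group of layers as a pipeline stage, that the chiplet order is fixed, and that inter-chiplet communication cost and energy are ignored. The function names `pipeline_throughput` and `explore`, and the inputs `layer_costs` and `chiplet_speeds`, are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def pipeline_throughput(stage_latencies):
    """Steady-state throughput of a pipeline is bounded by its slowest stage."""
    return 1.0 / max(stage_latencies)


def explore(layer_costs, chiplet_speeds):
    """Brute-force every split of consecutive layers into one stage per chiplet.

    layer_costs: hypothetical work per layer (arbitrary units).
    chiplet_speeds: hypothetical work per unit time for each chiplet,
        listed in pipeline order (heterogeneous chiplets have different speeds).
    Returns (best_throughput, best_cut_points).
    """
    n, k = len(layer_costs), len(chiplet_speeds)
    best = (0.0, None)
    # Choose k - 1 cut points between layers; each choice is one candidate schedule.
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0, *cuts, n)
        stage_latencies = [
            sum(layer_costs[bounds[i]:bounds[i + 1]]) / chiplet_speeds[i]
            for i in range(k)
        ]
        tput = pipeline_throughput(stage_latencies)
        if tput > best[0]:
            best = (tput, cuts)
    return best


if __name__ == "__main__":
    # Four layers, two chiplets; the first chiplet is twice as fast (made-up numbers).
    print(explore([4, 2, 2, 4], [2, 1]))
```

In this toy setting the search assigns more layers to the faster chiplet so that stage latencies balance. A realistic scheduler would also need to account for per-chiplet dataflow, communication between chiplets, and multiple models sharing the package, none of which is modeled here.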


