Inter-Layer Scheduling Space Exploration for Multi-model Inference on Heterogeneous Chiplets

By Mohanad Odema, Hyoukjun Kwon, Mohammad Abdullah Al Faruque (University of California)

To address the growing compute demand of recent multi-model workloads that include heavy models such as large language models, we propose deploying heterogeneous chiplet-based multi-chip module (MCM) accelerators. We develop an advanced scheduling framework for heterogeneous MCM accelerators that comprehensively considers complex heterogeneity and inter-chiplet pipelining. Our experiments with the framework on GPT-2 and ResNet-50 models on a 4-chiplet system show up to 2.2x and 1.9x increases in throughput and energy efficiency, respectively, compared to a monolithic accelerator with an optimized output-stationary dataflow.
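
As a rough illustration of what inter-layer scheduling with inter-chiplet pipelining involves, the toy sketch below splits a model's layers into contiguous pipeline stages, one per chiplet, and picks the split with the smallest bottleneck stage latency. This is not the authors' framework: the function name `best_pipeline`, the `layer_costs` table, and the fixed stage-to-chiplet order are all assumptions made for illustration. It also ignores inter-chiplet communication, energy, and multi-model interleaving, which a real scheduler would need to model.

```python
from itertools import combinations


def best_pipeline(layer_costs, n_chiplets):
    """Exhaustively search contiguous layer-to-chiplet splits.

    layer_costs[i][c] is a hypothetical latency of layer i on chiplet c.
    Stage c (the c-th contiguous group of layers) runs on chiplet c.
    Returns (bottleneck_latency, stage_boundaries).
    """
    n_layers = len(layer_costs)
    best = (float("inf"), None)
    # Pick n_chiplets - 1 cut points; each cut starts a new pipeline stage.
    for cuts in combinations(range(1, n_layers), n_chiplets - 1):
        bounds = (0, *cuts, n_layers)
        stage_latency = [
            sum(layer_costs[i][c] for i in range(bounds[c], bounds[c + 1]))
            for c in range(n_chiplets)
        ]
        # In steady state, pipeline throughput is limited by the slowest stage.
        bottleneck = max(stage_latency)
        if bottleneck < best[0]:
            best = (bottleneck, bounds)
    return best


# Hypothetical costs: chiplet 0 favors the early layers, chiplet 1 the late ones.
costs = [[1, 2], [1, 2], [4, 1], [4, 1]]
bottleneck, bounds = best_pipeline(costs, n_chiplets=2)
```

With these made-up costs, the search places layers 0-1 on chiplet 0 and layers 2-3 on chiplet 1, so each stage takes 2 time units. Exhaustive enumeration is only practical for small cases; the size of this search space is what motivates a dedicated exploration framework.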

Related Chiplet

DPIQ Tx PICs
IMDD Tx PICs
Near-Packaged Optics (NPO) Chiplet Solution
High Performance Droplet
Interconnect Chiplet

Related Technical Papers

SCAR: Scheduling Multi-Model AI Workloads on Heterogeneous Multi-Chiplet Module Accelerators
Compass: Mapping Space Exploration for Multi-Chiplet Accelerators Targeting LLM Inference Serving Workloads
Five Workflows for Tackling Heterogeneous Integration of Chiplets for 2.5D/3D
Chiplets on Wheels : Review Paper on holistic chiplet solutions for autonomous vehicles

Latest Technical Papers

Spatiotemporal thermal characterization for 3D stacked chiplet systems based on transient thermal simulation
Interconnect-Aware Logic Resynthesis for Multi-Die FPGAs
Scope: A Scalable Merged Pipeline Framework for Multi-Chip-Module NN Accelerators
Scaling Routers with In-Package Optics and High-Bandwidth Memories
TDPNavigator-Placer: Thermal- and Wirelength-Aware Chiplet Placement in 2.5D Systems Through Multi-Agent Reinforcement Learning
