THERMOS: Thermally-Aware Multi-Objective Scheduling of AI Workloads on Heterogeneous Multi-Chiplet PIM Architectures
By Alish Kanani 1, Lukas Pfromm 1, Harsh Sharma 2, Janardhan Rao Doppa 2, Partha Pratim Pande 2, Umit Y. Ogras 3
1 University of Wisconsin–Madison, USA
2 Washington State University, USA
3 University of Wisconsin–Madison, USA
Abstract
Chiplet-based integration enables large-scale systems that combine diverse technologies, enabling higher yield, lower costs, and scalability, making them well-suited to AI workloads. Processing-in-Memory (PIM) has emerged as a promising solution for AI inference, leveraging technologies such as ReRAM, SRAM, and FeFET, each offering unique advantages and trade-offs. A heterogeneous chiplet-based PIM architecture can harness the complementary strengths of these technologies to enable higher performance and energy efficiency. However, scheduling AI workloads across such a heterogeneous system is challenging due to competing performance objectives, dynamic workload characteristics, and power and thermal constraints. To address this need, we propose THERMOS, a thermally-aware, multi-objective scheduling framework for AI workloads on heterogeneous multi-chiplet PIM architectures. THERMOS trains a single multi-objective reinforcement learning (MORL) policy that is capable of achieving Pareto-optimal execution time, energy, or a balanced objective at runtime, depending on the target preferences. Comprehensive evaluations show that THERMOS achieves up to 89% faster average execution time and 57% lower average energy consumption than baseline AI workload scheduling algorithms with only 0.14% runtime and 0.022% energy overhead.
To read the full article, click here
Related Chiplet
- Interconnect Chiplet
- 12nm EURYTION RFK1 - UCIe SP based Ka-Ku Band Chiplet Transceiver
- Bridglets
- Automotive AI Accelerator
- Direct Chiplet Interface
Related Technical Papers
- SCAR: Scheduling Multi-Model AI Workloads on Heterogeneous Multi-Chiplet Module Accelerators
- Communication Characterization of AI Workloads for Large-scale Multi-chiplet Accelerators
- Inter-Layer Scheduling Space Exploration for Multi-model Inference on Heterogeneous Chiplets
- Multi-Objective Hardware-Mapping Co-Optimisation for Multi-DNN Workloads on Chiplet-based Accelerators
Latest Technical Papers
- AuthenTree: A Scalable MPC-Based Distributed Trust Architecture for Chiplet-based Heterogeneous Systems
- THERMOS: Thermally-Aware Multi-Objective Scheduling of AI Workloads on Heterogeneous Multi-Chiplet PIM Architectures
- LaZagna: An Open-Source Framework for Flexible 3D FPGA Architectural Exploration
- Corsair: An In-memory Computing Chiplet Architecture for Inference-time Compute Acceleration
- Thermal Implications of Non-Uniform Power in BSPDN-Enabled 2.5D/3D Chiplet-based Systems-in-Package using Nanosheet Technology