MCMComm: Hardware-Software Co-Optimization for End-to-End Communication in Multi-Chip-Modules
By Ritik Raj, Shengjie Lin, William Won and Tushar Krishna
Georgia Institute of Technology, GA, USA
Increasing AI computing demands and slowing transistor scaling have led to the advent of Multi-Chip-Module (MCM) based accelerators. By partitioning large chips into smaller chiplets, MCMs enable cost-effective scalability, higher yield, and modular reuse. However, MCMs incur a higher communication cost, which requires critical analysis and optimization. This paper makes three main contributions: (i) an end-to-end, off-chip congestion-aware and packaging-adaptive analytical framework for detailed analysis; (ii) hardware-software co-optimization incorporating diagonal links, on-chip redistribution, and non-uniform workload partitioning to optimize the framework; and (iii) the use of metaheuristics (genetic algorithms, GA) and mixed integer quadratic programming (MIQP) to solve the optimized framework. Experimental results demonstrate significant performance improvements for CNNs and Vision Transformers, with up to 1.58x and 2.7x improvement in energy-delay product (EDP) using GA and MIQP, respectively.
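As a rough illustration of the metric quoted above, the sketch below computes an energy-delay product and an improvement ratio. It assumes the conventional definition of EDP as energy multiplied by delay, and that an "Nx improvement" means the baseline EDP divided by the optimized EDP; the abstract does not spell out either definition. The function names and all numbers are hypothetical and are not taken from the paper.

```python
def edp(energy_joules: float, delay_seconds: float) -> float:
    """Energy-delay product (assumed definition: energy * delay). Lower is better."""
    return energy_joules * delay_seconds


def edp_improvement(baseline_edp: float, optimized_edp: float) -> float:
    """Ratio of baseline EDP to optimized EDP; values above 1 indicate improvement."""
    return baseline_edp / optimized_edp


# Hypothetical energy and delay values, chosen only for illustration.
baseline = edp(energy_joules=2.0, delay_seconds=0.5)
optimized = edp(energy_joules=1.6, delay_seconds=0.4)

print(f"EDP improvement: {edp_improvement(baseline, optimized):.2f}x")
```

Because EDP multiplies the two quantities, a 20% reduction in both energy and delay, as in these made-up numbers, compounds into a larger reduction in the product than either reduction alone.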