AIG-CIM: A Scalable Chiplet Module with Tri-Gear Heterogeneous Compute-in-Memory for Diffusion Acceleration
By Yiqi Jing1, Meng Wu1, Jiaqi Zhou1, Yiyang Sun1, Yufei Ma1,2, Ru Huang1, Le Ye1,3, Tianyu Jia1
1 School of Integrated Circuits, Peking University, Beijing, China; 2 Institute for Artificial Intelligence, Peking University, Beijing, China; 3 Advanced Institute of Information Technology of Peking University, Hangzhou, China
ABSTRACT
Diffusion models have gained significant attention in the field of Artificial Intelligence Generated Content. While Diffusion models demonstrate impressive image generation capability, they face hardware deployment challenges due to their unique model architecture and computation requirements. In this paper, we present a hardware accelerator design, AIG-CIM, which incorporates tri-gear heterogeneous digital compute-in-memory to address the flexible data reuse demands of Diffusion models. Our framework offers a collaborative design methodology for large generative models, spanning the computational circuit level to the multi-chip-module system level. We implemented and evaluated the AIG-CIM accelerator in TSMC 22nm technology. Across several Diffusion inference workloads, scalable AIG-CIM chiplets achieve a 21.3× latency reduction, up to 231.2× throughput improvement, and three orders of magnitude energy efficiency improvement compared to an RTX 3090 GPU.