3D Stacked HBM and Compute Accelerators for LLM: Optimizing Thermal Management and Power Delivery Efficiency
By Janak Sharda 1; Madison Manley 1; Jungyoun Kwak 1; Chinsung Park 2; Muhannad Bakir 1; Shimeng Yu 1
1 Department of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, 30308 USA
2 SK Hynix, South Korea
*Work done during Ph.D. at Georgia Institute of Technology
Abstract:
Advanced packaging is becoming essential for designing hardware accelerators for large language models (LLMs). Architectures such as 2.5D integration of memory with logic have been proposed, but bandwidth limits the throughput of the complete system. Recent works have proposed memory-on-logic systems, in which high bandwidth memory (HBM) is 3D stacked on top of logic to improve throughput by 64× and energy efficiency by 3×. However, the high power consumption of logic dies and the high thermal resistance of HBM can create thermal and power delivery challenges in such heterogeneously integrated stacks. In this work, we explore design configurations including logic-on-memory, memory-on-logic, and several hybrid arrangements. Further, we model the DRAM dies accurately and propose mitigation strategies that improve throughput by a further 16% for the memory-on-logic system, reduce the IR drop of the logic-on-memory system by 640 mV, and achieve 4× higher throughput for a hybrid system compared to the 2.5D integrated system.
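To see why die ordering matters thermally, the following is a minimal sketch of a one-dimensional series thermal-resistance model, not the modeling approach used in this work. It assumes a single heat sink on top of the stack, that all heat leaves through that sink, and that memory-on-logic places the HBM between the logic die and the sink. The function name `stack_temperatures` and every power and resistance value are hypothetical, chosen only for illustration.

```python
def stack_temperatures(layers, r_sink, t_ambient=25.0):
    """Estimate per-layer temperatures in a vertical die stack.

    layers: list of (name, power_w, r_th_k_per_w), ordered from the
            heat sink downward. All values here are illustrative.
    r_sink: heat sink thermal resistance to ambient, in K/W.
    """
    total_power = sum(power for _, power, _ in layers)
    # The sink carries the heat of the whole stack.
    temp = t_ambient + r_sink * total_power
    temps = {}
    remaining = total_power
    for name, power, r_th in layers:
        # Heat from this layer and every layer below it crosses this layer.
        temp += r_th * remaining
        temps[name] = temp
        remaining -= power
    return temps


# Hypothetical values: a 100 W logic die with low thermal resistance,
# and a 5 W HBM stack with high thermal resistance.
memory_on_logic = stack_temperatures(
    [("hbm", 5.0, 0.5), ("logic", 100.0, 0.05)], r_sink=0.1
)
logic_on_memory = stack_temperatures(
    [("logic", 100.0, 0.05), ("hbm", 5.0, 0.5)], r_sink=0.1
)
print(memory_on_logic)
print(logic_on_memory)
```

Under these assumed numbers, placing the HBM between the logic die and the sink forces all of the logic power through the high-resistance memory stack, so both dies run much hotter. Placing the logic next to the sink lowers temperatures, but in that arrangement power for the logic die would have to pass through the memory dies, which is consistent with the IR-drop concern the abstract raises for logic-on-memory.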