LEGOSim: A Unified Parallel Simulation Framework for Multi-chiplet Heterogeneous Integration
By Tiantian Lin 1, Cheng Qiu 2, Xiaohang Wang 1, Ling Wang 3, Zhulin Zheng 1, Yingtao Jiang 4, Amit Kumar Singh 5, Jieming Yin 6, Sihai Qiu 7, Xiaodong Li 8, Xin Tang 8, Jie Song 8, Mingzhe Zhang 8, Kui Ren 1
1 The State Key Laboratory of Blockchain and Data Security, Zhejiang University and Hangzhou High-Tech Zone (Binjiang), Institute of Blockchain and Data Security, Hangzhou, China
2 South China University of Technology, Guangzhou, China
3 The University of Western Australia, Western Australia, Australia
4 University of Nevada, Las Vegas, Las Vegas, USA
5 University of Essex, Essex, United Kingdom
6 Nanjing University of Posts and Telecommunications, Nanjing, China
7 Beijing Smart-chip Microelectronics Technology Co., Ltd, Beijing, China
8 Ant Group, Beijing, China
Abstract
The rise of multi-chiplet integration challenges existing simulators such as gem5 and GPGPU-Sim, which cannot efficiently simulate heterogeneous multi-chiplet systems because they are unable to modularly integrate heterogeneous chiplets and incur high synchronization overheads in parallel simulation. To address these limitations, this paper introduces LEGOSim, a unified parallel simulation framework that flexibly integrates various open-source and in-house chiplet simulators as processes in a parallel simulation, referred to as "simlets", with minimal modifications. It introduces an on-demand synchronization protocol with adaptive time quanta and non-global fencing, so that synchronization occurs only when necessary, reducing overhead while maintaining correctness. The framework also integrates a Network-on-Interposer (NoI) simulator for modeling inter-chiplet communication, enabling accurate assessment of the performance of various interconnection architectures. Evaluated with diverse benchmarks, LEGOSim shows high accuracy in simulating multi-chiplet architectures such as SIMBA and a CiM-based accelerator, with average errors of 3.79% and 3.94%, respectively. It significantly reduces synchronization overhead, by up to 99.9% compared to per-cycle synchronization and by 66.1% compared to time quantum synchronization, without synchronization errors. Five case studies show that LEGOSim also provides precise system performance metrics and stall cause reporting, simplifying tasks such as performance analysis and optimization, and can be used for design space exploration of various multi-chiplet systems.
Keywords: Architectural simulation, multi-chiplet system simulation.
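The abstract contrasts three synchronization policies: per-cycle, fixed time quantum, and on-demand. The toy model below is a minimal sketch of why synchronizing only when simlets interact can need far fewer synchronization points. It is not LEGOSim's protocol: the function names, the cycle counts, and the assumption that one synchronization is needed per distinct cycle with inter-chiplet traffic are all hypothetical, and adaptive time quanta and non-global fencing are not modeled. The numbers are invented for illustration and are unrelated to the measurements reported in the paper.

```python
import math


def per_cycle_syncs(total_cycles: int) -> int:
    """Per-cycle policy: one global barrier at every simulated cycle."""
    return total_cycles


def quantum_syncs(total_cycles: int, quantum: int) -> int:
    """Fixed time quantum policy: one barrier at the end of each quantum."""
    return math.ceil(total_cycles / quantum)


def on_demand_syncs(comm_cycles) -> int:
    """Toy on-demand policy (assumption): synchronize once per distinct
    cycle in which some inter-chiplet communication takes place."""
    return len(set(comm_cycles))


def reduction(baseline: int, candidate: int) -> float:
    """Percentage of synchronization points saved relative to a baseline."""
    return 100.0 * (baseline - candidate) / baseline


if __name__ == "__main__":
    total_cycles = 10_000                  # hypothetical simulation length
    quantum = 100                          # hypothetical fixed quantum
    comm_cycles = [120, 120, 4_500, 9_990]  # hypothetical traffic timestamps

    pc = per_cycle_syncs(total_cycles)
    tq = quantum_syncs(total_cycles, quantum)
    od = on_demand_syncs(comm_cycles)

    print(f"per-cycle: {pc}, time quantum: {tq}, on-demand: {od}")
    print(f"on-demand vs per-cycle:    {reduction(pc, od):.2f}% fewer syncs")
    print(f"on-demand vs time quantum: {reduction(tq, od):.2f}% fewer syncs")
```

In this sketch the saving depends entirely on how sparse the inter-chiplet traffic is; a workload that communicates in nearly every cycle would see little benefit from the on-demand policy.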