Chiplet-Based RISC-V SoC with Modular AI Acceleration
By P. Ramkumar (American University of Sharjah), S. S. Bharadwaj (Birla Institute of Technology and Science)
Abstract
Achieving high performance, energy efficiency, and cost-effectiveness while maintaining architectural flexibility is a critical challenge in the development and deployment of edge AI devices. Monolithic SoC designs struggle to strike this balance, mainly because manufacturing yields are low (below 16%) for 360 mm² dies at advanced process nodes. This paper presents a novel chiplet-based RISC-V SoC architecture that addresses these limitations through modular AI acceleration and intelligent system-level optimization. The proposed design integrates four key innovations on a 30 mm × 30 mm silicon interposer: adaptive cross-chiplet Dynamic Voltage and Frequency Scaling (DVFS); AI-aware Universal Chiplet Interconnect Express (UCIe) protocol extensions featuring streaming flow control units and compression-aware transfers; distributed cryptographic security across heterogeneous chiplets; and intelligent sensor-driven load migration. The architecture combines a 7 nm RISC-V CPU chiplet with dual 5 nm AI accelerators (15 TOPS INT8 each), 16 GB HBM3 memory stacks, and dedicated power management controllers. Experimental results on industry-standard benchmarks, including MobileNetV2, ResNet-50, and real-time video processing, show that the AI-optimized configuration achieves a ~14.7% latency reduction, a 17.3% throughput improvement, and a 16.2% power reduction compared to previous basic chiplet implementations. Together, these improvements translate to a 40.1% efficiency gain, corresponding to ~3.5 mJ per MobileNetV2 inference (860 mW / 244 images/s), while maintaining sub-5 ms real-time capability across all evaluated workloads. These results demonstrate that modular chiplet designs can achieve near-monolithic computational density while enabling the cost efficiency, scalability, and upgradeability that are crucial for next-generation edge AI devices.
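The energy and efficiency figures quoted in the abstract can be sanity-checked with a few lines of arithmetic. The sketch below is not taken from the paper: the function names are hypothetical, and the efficiency formula assumes that "efficiency" means inferences per joule (throughput divided by power), which the abstract does not state explicitly. Under that assumption, the reported throughput and power changes land within rounding of the quoted 40.1% gain.

```python
def energy_per_inference_mj(power_mw: float, throughput_ips: float) -> float:
    """Energy per inference in millijoules: mW / (inferences/s) = mJ."""
    return power_mw / throughput_ips


def efficiency_gain(throughput_improvement: float, power_reduction: float) -> float:
    """Relative gain in inferences per joule versus the baseline.

    Assumes efficiency = throughput / power (an interpretation, not
    something the abstract defines).
    """
    return (1.0 + throughput_improvement) / (1.0 - power_reduction) - 1.0


# Figures quoted in the abstract for MobileNetV2.
energy_mj = energy_per_inference_mj(power_mw=860.0, throughput_ips=244.0)
gain = efficiency_gain(throughput_improvement=0.173, power_reduction=0.162)

print(f"{energy_mj:.2f} mJ per inference")  # 3.52 mJ per inference
print(f"{gain:.1%} efficiency gain")        # 40.0% efficiency gain
```

The small gap between the computed 40.0% and the reported 40.1% is consistent with the inputs having been rounded to one decimal place before publication.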