Modeling and Controlling Many-Core HPC Processors: an Alternative to PID and Moving Average Algorithms
By Giovanni Bambini*, Alessandro Ottaviano†, Christian Conficoni*, Andrea Tilli*, Luca Benini*†, Andrea Bartolini*
* Department of Electrical, Electronic, and Information Engineering, University of Bologna, Italy
† Integrated Systems Laboratory, ETH Zurich, Switzerland
The race towards performance increase and computing power has led to chips with heterogeneous and complex designs, integrating an ever-growing number of cores on the same monolithic chip or chiplet silicon die. Higher integration density, compounded with the slowdown of technology-driven power reduction, implies that power and thermal management become increasingly relevant. Unfortunately, existing research lacks a detailed analysis and modeling of thermal, power, and electrical coupling effects and how they have to be jointly considered to perform dynamic control of complex and heterogeneous Multi-Processor System on Chips (MPSoCs). To close the gap, in this work, we first provide a detailed thermal and power model targeting a modern High Performance Computing (HPC) MPSoC. We consider real-world coupling effects such as actuators’ non-idealities and the exponential relation between the dissipated power, the temperature state, and the voltage level in a single processing element. We analyze how these factors affect the control algorithm behavior and the type of challenges that they pose. Based on the analysis, we propose a thermal capping strategy inspired by Fuzzy control theory to replace the state-of-the-art PID controller, as well as a root-finding iterative method to optimally choose the shared voltage value among cores grouped in the same voltage domain. We evaluate the proposed controller with model-in-the-loop and hardware-in-the-loop co-simulations. We show an improvement over state-of-the-art methods of up to 5× the maximum exceeded temperature while providing an average of 3.56% faster application execution runtime across all the evaluation scenarios.
To read the full article, click here
Related Chiplet
- DPIQ Tx PICs
- IMDD Tx PICs
- Near-Packaged Optics (NPO) Chiplet Solution
- High Performance Droplet
- Interconnect Chiplet
Related Technical Papers
- MFIT : Multi-FIdelity Thermal Modeling for 2.5D and 3D Multi-Chiplet Architectures
- Fault Modeling, Testing, and Repair for Chiplet Interconnects
- Fast and Accurate Jitter Modeling for Statistical BER Analysis for Chiplet Interconnect and Beyond
- Toward Open-Source Chiplets for HPC and AI: Occamy and Beyond
Latest Technical Papers
- Lifecycle Cost-Effectiveness Modeling for Redundancy-Enhanced Multi-Chiplet Architectures
- DISTIL: A Distributed Spiking Neural Network Accelerator on 2.5D Chiplet Systems
- Multi-Partner Project: COIN-3D -- Collaborative Innovation in 3D VLSI Reliability
- EOTPR Fine Pitch Probing for Die-to-Die Interconnect Failure Analysis
- InterPUF: Distributed Authentication via Physically Unclonnable Functions and Multi-party Computation for Reconfigurable Interposers