In-Field Monitoring of Chiplets via ML-Driven On-Die Telemetry

By Nir Sever, Sr Director Business Development, proteanTecs

The rapid adoption of chiplet-based architectures for AI acceleration is redefining system complexity. Heterogeneous dies, stacked memories, and ultra-fast interconnects deliver exceptional performance-per-watt, yet they also amplify exposure to process variation, thermal stress, accelerated aging, and dynamic, workload-induced voltage droop. Traditional pre-silicon sign-off alone can no longer guarantee field reliability, power profiles, or efficient bring-up.

This presentation will highlight a holistic design-through-in-field methodology that embeds a hardware monitoring IP system with dedicated on-die agents, creating self-monitoring AI chiplet solutions aligned with market needs. The lightweight agents are inserted during design with negligible PPA impact, and they stream high-resolution telemetry covering path delay, droop, thermal behavior, latent defects, aging, and workload signatures.

Once deployed, embedded firmware and software applications running on the host SiP enable real-time insights and actions:

1. Power reduction by dynamically reclaiming guard-bands to reduce voltage, with a reliability safety net.
2. Failure and silent data corruption (SDC) prevention through early detection of marginal timing under dynamic workloads and environmental conditions.
3. Accelerated RMA root-cause analysis in the field, with correlation to vendor production data.

By incorporating proteanTecs' continuous in-field deep-data monitoring into chiplet-based designs, we deliver a scalable path to resilient, energy-efficient AI systems. The presentation will feature customer case studies and real silicon findings from 7nm and 5nm designs.
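As an illustration of the guard-band reclamation idea in the abstract above, here is a minimal Python sketch of a margin-driven voltage control loop. It is not the proteanTecs implementation. The class name, the thresholds, the voltage values, and the linear `toy_margin_ps` model are all hypothetical and chosen only to make the example runnable. The controller lowers the supply while the worst measured timing margin stays comfortable, and raises it quickly when the margin falls below a safety threshold (the "reliability safety net").

```python
from dataclasses import dataclass


@dataclass
class GuardBandController:
    """Hypothetical margin-driven voltage controller (all values are assumptions)."""
    v_nominal_mv: int = 750           # assumed signed-off supply voltage
    v_floor_mv: int = 680             # assumed lowest voltage the loop may request
    step_mv: int = 5                  # assumed regulator step size
    reclaim_margin_ps: float = 30.0   # lower voltage only while margin exceeds this
    safety_margin_ps: float = 15.0    # raise voltage when margin drops below this

    def next_voltage(self, v_now_mv: int, worst_margin_ps: float) -> int:
        # Safety net: back off with a double step, never above nominal.
        if worst_margin_ps < self.safety_margin_ps:
            return min(v_now_mv + 2 * self.step_mv, self.v_nominal_mv)
        # Reclaim guard-band: one step down, never below the floor.
        if worst_margin_ps > self.reclaim_margin_ps:
            return max(v_now_mv - self.step_mv, self.v_floor_mv)
        # Inside the hold band: keep the current voltage.
        return v_now_mv


def toy_margin_ps(v_mv: int, droop_ps: float = 0.0) -> float:
    """Toy linear margin model standing in for real on-die telemetry."""
    return (v_mv - 650) * 0.5 - droop_ps


def settle(ctrl: GuardBandController, v_start_mv: int,
           droop_ps: float = 0.0, max_steps: int = 100) -> int:
    """Iterate the controller until the requested voltage stops changing."""
    v = v_start_mv
    for _ in range(max_steps):
        v_next = ctrl.next_voltage(v, toy_margin_ps(v, droop_ps))
        if v_next == v:
            break
        v = v_next
    return v


ctrl = GuardBandController()
v_quiet = settle(ctrl, ctrl.v_nominal_mv)       # steps down while margin allows
v_droop = settle(ctrl, v_quiet, droop_ps=20.0)  # a droop event triggers the safety net
print(v_quiet, v_droop)
```

The gap between the two thresholds acts as a hold band, so the loop does not oscillate around a single set point. In a real system, the margin reading would come from the on-die agents rather than from a model.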

Introduction
The New Frontier: Chiplet-Based AI Architectures
AI Systems Challenges
It’s All About Timing Visibility
Monitoring Margin to Timing Failure
Customer Use Cases: Select Examples
Dynamic Power Reduction in Mission Mode
Real-Time Health Monitoring for Failure Prevention
Local and Remote Diagnostics
Closing the Visibility Gap
Live Demo at Our Booth