FoldedHexaTorus: An Inter-Chiplet Interconnect Topology for Chiplet-based Systems using Organic and Glass Substrates

By Patrick Iff, Maciej Besta and Torsten Hoefler
ETH Zurich

Abstract

Chiplet-based systems are rapidly gaining traction in the market. Two packaging options for such systems are the established organic substrates and the emerging glass substrates. These substrates are used to implement the inter-chiplet interconnect (ICI), which is crucial for overall system performance. To guide the development of ICIs, we introduce three design principles for ICI network topologies on organic and glass substrates. Based on our design principles, we propose the novel FoldedHexaTorus network topology. Our evaluation shows that the FoldedHexaTorus achieves significantly higher throughput than state-of-the-art topologies while maintaining low latency.

Code: https://github.com/spcl/FoldedHexaTorus

I. INTRODUCTION

Technology scaling has long fueled the ever-increasing performance per cost of processors and accelerators. However, since the 22 nm process, each transition to a scaled-down process has been accompanied by a surge in nonrecurring (NR) cost of over 50% [1]. As a result, designing chips in cutting-edge processes is only economically viable at high production volumes. Chiplets promise a solution to this problem, as a single chiplet can be reused for multiple products, while the NR cost due to design and validation is incurred only once. Additional advantages of chiplets include improved yield (and hence lower cost) due to their smaller size compared to monolithic chips, and the option to integrate heterogeneous chiplets (built with different processes) in a single package.
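
The yield argument can be made concrete with a short sketch. It assumes the textbook Poisson yield model Y = exp(-A * D0), that every die is tested before assembly, and a purely hypothetical defect density and system area. It ignores the area overhead of D2D interfaces and any packaging or assembly losses, so it only illustrates the trend, not the cost of a real product.

```python
import math


def poisson_yield(area_mm2: float, defects_per_mm2: float) -> float:
    """Fraction of defect-free dies under the Poisson model Y = exp(-A * D0)."""
    return math.exp(-area_mm2 * defects_per_mm2)


def silicon_per_good_system(total_area_mm2: float, num_dies: int,
                            defects_per_mm2: float) -> float:
    """Expected silicon area fabricated per working system.

    Assumes the logic is split evenly into `num_dies` dies and that only
    known-good dies are assembled (an assumption of this sketch).
    """
    die_area = total_area_mm2 / num_dies
    # Each good die costs die_area / yield on average; a system needs num_dies.
    return total_area_mm2 / poisson_yield(die_area, defects_per_mm2)


D0 = 0.001          # hypothetical defect density: 0.1 defects per cm^2
TOTAL_AREA = 800.0  # hypothetical total logic area in mm^2

monolithic = silicon_per_good_system(TOTAL_AREA, 1, D0)
chiplets = silicon_per_good_system(TOTAL_AREA, 8, D0)

print(f"monolithic: {monolithic:.0f} mm^2 of silicon per good system")
print(f"8 chiplets: {chiplets:.0f} mm^2 of silicon per good system")
```

Under these assumptions, smaller dies waste less silicon on defects, which is the yield advantage described above.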

Splitting a monolithic chip into multiple chiplets creates the need for a high-throughput inter-chiplet interconnect (ICI), which is crucial for communication-intensive workloads such as machine learning training and inference or scientific simulations. The ICI is built using die-to-die (D2D) links [2], [3], [4], which are implemented on organic or glass substrates [5], [6], as well as silicon interposers [7] or bridges [8], [9]. While silicon interposers and bridges offer higher bandwidth, they come with higher production costs. Therefore, our work focuses on organic and glass substrates. Another advantage of these substrates over silicon interposers is that, since they use a different fabrication process, they are not bound by the reticle limit and thus allow the construction of massive systems [10].

A major determinant of ICI throughput is the topology of links between chiplets. For systems based on passive silicon interposers or silicon bridges, the ICI topology is restricted to connecting only adjacent chiplets, resulting in topologies such as Mesh and HexaMesh [11]. On active silicon interposers, the link length is unrestricted, and many topologies have been proposed [12], [7], [13], [14], [15]. For organic and glass substrates, the link length is less restricted than on passive interposers (due to superior loss characteristics), but more restricted than on active silicon interposers (due to the absence of repeaters), opening up a new and largely unexplored design space for ICI topologies.

In this work, we develop design principles for ICI topologies on organic and glass substrates (contribution 1). These design principles reveal that, to achieve high throughput, the ICI topology must have a low network radix, a low network diameter, and short links, three properties that are inherently in conflict with one another [16]. By searching for a sweet spot in this design space, we conceive the novel FoldedHexaTorus topology (contribution 2). The FoldedHexaTorus features a constant network radix of six, a constant link length only slightly longer than the chiplet side, and a network diameter of less than √N, where N is the total number of chiplets. Our evaluation (contribution 3) shows that, for chiplet-based systems with organic and glass substrates, FoldedHexaTorus outperforms topologies for passive and active silicon interposers, as well as network-on-chip (NoC) topologies.
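
The radix and diameter trade-off can be explored with a small breadth-first-search sketch. It is not the FoldedHexaTorus construction, which this section does not specify. Instead it assumes a generic k x k hexagonal torus in which every node has four axis-aligned neighbors plus two diagonal ones, and compares it with a 2D mesh and a 2D torus. All names (`MESH`, `HEX`, `diameter`, `radix`) are illustrative, and physical link length and chiplet placement are not modeled.

```python
from collections import deque

# Neighbor offsets on a k x k grid (illustrative assumption, not the paper's layout).
MESH = [(1, 0), (-1, 0), (0, 1), (0, -1)]
HEX = MESH + [(1, 1), (-1, -1)]  # two extra diagonal links give radix six


def neighbors(node, k, offsets, wrap):
    """Yield the neighbors of `node`; with wrap=True the grid edges wrap around."""
    x, y = node
    for dx, dy in offsets:
        nx, ny = x + dx, y + dy
        if wrap:
            yield nx % k, ny % k
        elif 0 <= nx < k and 0 <= ny < k:
            yield nx, ny


def eccentricity(src, k, offsets, wrap):
    """Largest hop count from `src` to any other node, found by BFS."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for nb in neighbors(node, k, offsets, wrap):
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return max(dist.values())


def diameter(k, offsets, wrap):
    """Network diameter: the maximum eccentricity over all nodes."""
    return max(eccentricity((x, y), k, offsets, wrap)
               for x in range(k) for y in range(k))


def radix(k, offsets, wrap):
    """Maximum number of distinct neighbors of any node."""
    return max(len(set(neighbors((x, y), k, offsets, wrap)))
               for x in range(k) for y in range(k))


k = 6  # N = k * k = 36 chiplets, so sqrt(N) = 6
for name, offsets, wrap in [("mesh", MESH, False),
                            ("torus", MESH, True),
                            ("hexagonal torus", HEX, True)]:
    print(f"{name}: radix {radix(k, offsets, wrap)}, "
          f"diameter {diameter(k, offsets, wrap)}")
```

For this grid size, the assumed hexagonal torus reaches a diameter below √N at radix six, whereas the mesh and the plain torus do not. The wraparound links of such a torus would be long if laid out naively, which is the kind of link-length concern the third design property addresses.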
