Limits of hotspot detection and prediction in microprocesors [sic]


Abstract/Contents

Abstract
Microprocessor hotspots are a major reliability concern with heat fluxes as much as 20 times greater than those found elsewhere on the chip. Chip hotspots also augment thermo-mechanical stress at chip-package interfaces which can lead to failure during cycling. Because highly localized, transient chip cooling is both technically challenging and costly, chip manufacturers are using dynamic thermal management (DTM) techniques that reduce hotspots by throttling chip power. Uncertainty in heat flux profiles and chip thermal response leads to either excessively conservative DTM schemes and underutilized computational potential or device overheating and associated system failure risks. Improved techniques for quantifying uncertainty and accurately predicting transient thermal response are needed for maximizing reliable chip performance. A review is conducted of recent advancements in sensor design, laboratory thermometry, sensor allocation, and thermal signal processing for dynamic thermal management. Representative examples of DTM implementation are provided. Quantitative error estimates are compared for semiconductor thermal sensors and thermometry techniques, and improvements in thermal sensor placement and signal processing are presented. A simulation method is developed to determine the accuracy and resolution at which hotspot heat fluxes can be measured using distributed temperature sensors. The model is based on a novel, computationally-efficient, inverse heat transfer solution. The uncertainties in the hotspot location and intensity are computed for randomized chip heat flux profiles for varying sensor spacing, sensor vertical proximity, sensor error, and chip thermal properties. For certain cases the inverse solution method decreases mean absolute error in the heat flux profile by more than 30%. These results and simulation methods can be used to determine the optimal spacing of distributed temperature sensor arrays for hotspot management in chips. 
To enable on-chip modeling of transient temperature response in a semiconductor device subjected to arbitrarily varying power excitations, an original model compression technique is employed. A network identification deconvolution (NID) method is used to characterize device thermal response from either numerical or experimental results. To compute the transient response to an arbitrary power input, a highly-efficient technique based on digital signal processing is employed. An Infinite Impulse Response (IIR) filter dramatically reduces the required computations to achieve accurate response. The technique provides the best possible scaling of overall computation time and significantly reduces memory constraints. This improvement enables implementation of sophisticated runtime dynamic thermal management algorithms for high-power integrated circuit architectures. In sum, the present doctoral research offers a multi-faceted approach to managing measurement uncertainty in dynamic thermal management schemes and predicting hotspot response to facilitate optimal chip performance within reliable operating conditions.
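The computational advantage of a recursive (IIR) formulation over direct convolution can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes the device thermal response has already been reduced (for example by an identification step such as NID) to a small Foster-type network of hypothetical (R_i, tau_i) pairs, and that power is piecewise constant over each sample interval. The function names and parameter values are illustrative assumptions only.

```python
import math


def foster_iir_response(power, dt, stages):
    """Temperature rise via a bank of first-order IIR filters.

    power  : power samples [W], held constant over each interval of length dt
    stages : hypothetical (R_i [K/W], tau_i [s]) pairs of a Foster network
    Returns the temperature rise [K] at the end of each interval.
    Cost is O(N * M) for N samples and M stages, with O(M) stored state.
    """
    # Each stage has one pole a_i = exp(-dt / tau_i).
    coeffs = [(r, math.exp(-dt / tau)) for r, tau in stages]
    state = [0.0] * len(stages)
    out = []
    for p in power:
        for i, (r, a) in enumerate(coeffs):
            # Exact update of one RC stage for constant power over dt.
            state[i] = a * state[i] + r * (1.0 - a) * p
        out.append(sum(state))
    return out


def direct_convolution_response(power, dt, stages):
    """Reference result by superposing step responses; cost is O(N^2)."""
    def z(t):
        # Step response (thermal impedance) of the assumed Foster network.
        if t <= 0:
            return 0.0
        return sum(r * (1.0 - math.exp(-t / tau)) for r, tau in stages)

    out = []
    for n in range(1, len(power) + 1):
        total = 0.0
        for k in range(n):
            # Pulse power[k] acts on [k*dt, (k+1)*dt); evaluate at t = n*dt.
            total += power[k] * (z((n - k) * dt) - z((n - k - 1) * dt))
        out.append(total)
    return out


if __name__ == "__main__":
    stages = [(0.5, 1e-3), (1.5, 2e-2)]   # assumed values, not from the thesis
    power = [10.0, 0.0, 25.0, 25.0, 5.0, 0.0, 0.0, 40.0]
    fast = foster_iir_response(power, 1e-3, stages)
    slow = direct_convolution_response(power, 1e-3, stages)
    print(max(abs(x - y) for x, y in zip(fast, slow)))
```

Under these assumptions the two functions agree to floating-point precision, while the recursive form keeps only one state value per stage rather than the full power history, which is the kind of saving in computation and memory the abstract refers to.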

Description

Alternative title Limits of hotspot detection and prediction in microprocessors
Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2012
Issuance monographic
Language English

Creators/Contributors

Associated with Miler, Josef
Associated with Stanford University, Department of Mechanical Engineering
Primary advisor Goodson, Kenneth E, 1967-
Thesis advisor Goodson, Kenneth E, 1967-
Thesis advisor Asheghi, Mehdi
Thesis advisor Kenny, Thomas William
Advisor Asheghi, Mehdi
Advisor Kenny, Thomas William

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Josef L. Miler.
Note Submitted to the Department of Mechanical Engineering.
Thesis Thesis (Ph.D.)--Stanford University, 2012.
Location electronic resource

Access conditions

Copyright
© 2012 by Josef Miler
