Run-time probabilistic detection of miscalibrated thermal sensors in many-core systems
Abstract
Many-core architectures use large numbers of small temperature sensors to detect thermal gradients and guide thermal management schemes. This paper describes a technique for identifying thermal sensors that are operating outside a required accuracy. Unlike previous on-chip temperature estimation approaches, our algorithms are optimized to run on-line while thermal management decisions are being made. The accuracy of a sensor is determined by comparing its readings to expected values from a probability distribution function derived from surrounding sensors. Experiments show that a sensor operating outside a desired accuracy can be identified with a detection rate of over 90% and an average false alarm rate of less than 6%, at a confidence level of 90%. The run time of our method is shown to be around 3x lower than that of a recently published temperature estimation method, enhancing its suitability for run-time implementation.
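The neighbor-based check described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes, purely for illustration, that the expected value of a sensor follows a normal distribution fitted to the readings of its surrounding sensors, and it flags a reading that falls outside the two-sided 90% interval. The function name is_suspect, the min_sigma floor, and the sample temperatures are all hypothetical.

```python
from statistics import NormalDist, mean, stdev


def is_suspect(reading, neighbor_readings, confidence=0.90, min_sigma=0.5):
    """Flag a reading that lies outside the two-sided confidence interval
    of a normal distribution fitted to the neighboring sensors' readings.

    Assumptions (not from the paper): a Gaussian model, at least two
    neighbors, and a floor on sigma so that near-identical neighbors do
    not produce a zero-width interval.
    """
    mu = mean(neighbor_readings)
    sigma = max(stdev(neighbor_readings), min_sigma)
    # Two-sided critical value, e.g. about 1.645 for 90% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return abs(reading - mu) > z * sigma


# Hypothetical readings in degrees Celsius from four surrounding sensors.
neighbors = [71.8, 72.4, 73.1, 72.0]
print(is_suspect(72.6, neighbors))  # consistent with its neighbors
print(is_suspect(80.0, neighbors))  # far from its neighbors
```

A real implementation would also have to account for genuine thermal gradients between neighboring sensors, which this sketch ignores.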
Published In
Sponsors
- EDAA: European Design Automation Association
- ECSI
- EDAC: Electronic Design Automation Consortium
- SIGDA: ACM Special Interest Group on Design Automation
- IEEE CEDA
- The Russian Academy of Sciences
Publisher
EDA Consortium
San Jose, CA, United States
Publication History
Published: 18 March 2013
Qualifiers
- Research-article
Conference
DATE '13
Acceptance Rates
Overall Acceptance Rate 518 of 1,794 submissions, 29%