
Reliability-aware dynamic energy management in dependable embedded real-time systems

Published: 07 January 2011

Abstract

Recent studies show that voltage scaling, an efficient energy management technique, has a direct and negative effect on system reliability because it increases the rate of transient faults (e.g., those induced by cosmic particles). In this article, we propose energy management schemes that explicitly take system reliability into consideration. The proposed reliability-aware schemes dynamically schedule a recovery for each task that is to be scaled down, which recuperates the reliability lost to energy management. Based on the amount of available slack, the application size, and the change in fault rate, we analyze when it is profitable to reclaim slack for energy savings without sacrificing system reliability. Checkpointing techniques are further explored to use the slack efficiently. Analytical and simulation results show that the proposed schemes achieve energy savings comparable to those of ordinary (reliability-ignorant) energy management schemes while preserving system reliability. Ordinary energy management schemes, which ignore the effect of voltage scaling on the fault rate, can drastically decrease system reliability.
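The idea of pairing a scaled-down task with a scheduled recovery can be sketched in a few lines. The abstract does not state a fault model, so everything numeric here is an assumption made for illustration: the exponential dependence of the fault rate on the normalized frequency, the parameters `lam0`, `d`, and `f_min`, and the function names themselves are hypothetical and are not taken from the article.

```python
import math

def fault_rate(f, lam0=1e-6, d=2.0, f_min=0.1):
    """Assumed model: the transient fault rate grows exponentially as the
    normalized frequency f (0 < f_min <= f <= 1) is scaled down."""
    return lam0 * 10 ** (d * (1.0 - f) / (1.0 - f_min))

def task_reliability(c, f, **kw):
    """Probability that a task needing c time units at f = 1 completes
    without a transient fault when it runs at frequency f."""
    return math.exp(-fault_rate(f, **kw) * c / f)

def plan_with_recovery(c, slack, f_min=0.1, **kw):
    """Reserve c units of slack for a full-speed recovery and spend the
    remainder on scaling the task down. Returns (frequency, reliability).
    If the slack cannot hold a recovery, the task stays at full speed."""
    r_full = task_reliability(c, 1.0, f_min=f_min, **kw)
    if slack < c:
        return 1.0, r_full
    # The task may stretch over c + (slack - c) = slack time units.
    f = max(f_min, c / slack)
    r_scaled = task_reliability(c, f, f_min=f_min, **kw)
    # The recovery runs only if the scaled execution fails.
    return f, r_scaled + (1.0 - r_scaled) * r_full

if __name__ == "__main__":
    freq, rel = plan_with_recovery(c=10.0, slack=30.0)
    print(freq, rel, task_reliability(10.0, 1.0))
```

Under these assumptions the combined reliability is `r_scaled + (1 - r_scaled) * r_full`, which can never fall below `r_full`, so the task with its recovery is at least as reliable as the unscaled task. A scaled execution without a recovery, by contrast, is less reliable than the full-speed one.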


Information & Contributors

Information

Published In

ACM Transactions on Embedded Computing Systems, Volume 10, Issue 2
December 2010
457 pages
ISSN:1539-9087
EISSN:1558-3465
DOI:10.1145/1880050
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 07 January 2011
Accepted: 01 May 2006
Received: 01 January 2006
Published in TECS Volume 10, Issue 2

Author Tags

  1. Power management
  2. dynamic voltage scaling

Qualifiers

  • Research-article
  • Research
  • Refereed

Cited By

  • (2024) A Two-Phase Algorithm for Reliable and Energy-Efficient Heterogeneous Embedded Systems. IEICE Transactions on Information and Systems E107.D:10 (1285-1296). DOI: 10.1587/transinf.2023EDP7262. Online publication date: 1-Oct-2024
  • (2024) Energy-efficient triple modular redundancy scheduling on heterogeneous multi-core real-time systems. Journal of Parallel and Distributed Computing 191 (104915). DOI: 10.1016/j.jpdc.2024.104915. Online publication date: Sep-2024
  • (2023) A Minimizing Energy Consumption Scheme for Real-Time Embedded System Based on Metaheuristic Optimization. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 42:7 (2276-2289). DOI: 10.1109/TCAD.2022.3215690. Online publication date: Jul-2023
  • (2023) Energy management of fault-tolerant real-time embedded systems through switching-activity-based techniques. Microprocessors and Microsystems 102 (104929). DOI: 10.1016/j.micpro.2023.104929. Online publication date: Oct-2023
  • (2022) Thermal Aware System-Wide Reliability Optimization for Automotive Distributed Computing Applications. IEEE Transactions on Vehicular Technology 71:10 (10442-10457). DOI: 10.1109/TVT.2022.3185978. Online publication date: Oct-2022
  • (2022) Power-Aware Checkpointing for Multicore Embedded Systems. IEEE Transactions on Parallel and Distributed Systems (1-15). DOI: 10.1109/TPDS.2022.3188568. Online publication date: 2022
  • (2022) Fixed-Priority Scheduling for Reliable and Energy-Aware (m, k)-Deadlines Enforcement With Standby-Sparing. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 41:3 (502-515). DOI: 10.1109/TCAD.2021.3061522. Online publication date: Mar-2022
  • (2022) Work-in-Progress: Optimal Checkpointing Strategy for Real-time Systems with Both Logical and Timing Correctness. 2022 IEEE Real-Time Systems Symposium (RTSS) (515-518). DOI: 10.1109/RTSS55097.2022.00055. Online publication date: Dec-2022
  • (2022) A Survey of Fault-Tolerance Techniques for Embedded Systems From the Perspective of Power, Energy, and Thermal Issues. IEEE Access 10 (12229-12251). DOI: 10.1109/ACCESS.2022.3144217. Online publication date: 2022
  • (2022) Mapping series-parallel streaming applications on hierarchical platforms with reliability and energy constraints. Journal of Parallel and Distributed Computing 163 (45-61). DOI: 10.1016/j.jpdc.2022.01.016. Online publication date: May-2022
