research-article
Open access

Low-Power High-Efficiency Video Decoding using General-Purpose Processors

Published: 09 January 2015

Abstract

In this article, we investigate how code optimization techniques and the low-power states of general-purpose processors improve the power efficiency of HEVC decoding. For offline video decoding, we analyze in detail the power and performance efficiency of SIMD instructions, multicore architectures, and low-power active and idle states. For real-time video decoding, we additionally evaluate the power efficiency of techniques such as “race to idle” and “exploiting slack” with dynamic voltage and frequency scaling (DVFS). The results show that “exploiting slack” is more power efficient than “race to idle” on all evaluated platforms, which represent smartphone, tablet, laptop, and desktop computing systems.
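
The two real-time strategies named in the abstract can be compared with a simple per-frame energy model. The following is a minimal sketch, not the article's measurement method: the `OperatingPoint` type, both function names, and every frequency, power, and cycle figure are hypothetical values chosen only for illustration. It assumes a fixed cycle count per frame, one idle power level, and no transition costs between states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    """One DVFS setting (hypothetical values, not taken from the article)."""
    freq_hz: float         # clock frequency
    active_power_w: float  # average power while decoding at this frequency


def frame_energy(cycles: float, period_s: float,
                 point: OperatingPoint, idle_power_w: float) -> float:
    """Energy over one frame period: decode at `point`, then idle."""
    busy_s = cycles / point.freq_hz
    if busy_s > period_s:
        raise ValueError("operating point misses the frame deadline")
    return point.active_power_w * busy_s + idle_power_w * (period_s - busy_s)


def race_to_idle(cycles, period_s, points, idle_power_w):
    """Decode at the highest frequency, then idle for the rest of the period."""
    fastest = max(points, key=lambda p: p.freq_hz)
    return frame_energy(cycles, period_s, fastest, idle_power_w)


def exploit_slack(cycles, period_s, points, idle_power_w):
    """Decode at the lowest frequency that still meets the frame deadline."""
    feasible = [p for p in points if cycles / p.freq_hz <= period_s]
    if not feasible:
        raise ValueError("no operating point meets the frame deadline")
    slowest = min(feasible, key=lambda p: p.freq_hz)
    return frame_energy(cycles, period_s, slowest, idle_power_w)


# Hypothetical platform: three DVFS points and a 25 frames/s stream.
POINTS = [
    OperatingPoint(freq_hz=0.4e9, active_power_w=0.3),  # too slow for the deadline
    OperatingPoint(freq_hz=1.0e9, active_power_w=1.0),
    OperatingPoint(freq_hz=2.0e9, active_power_w=3.0),
]
CYCLES_PER_FRAME = 20e6
FRAME_PERIOD_S = 1 / 25
IDLE_POWER_W = 0.1

race_j = race_to_idle(CYCLES_PER_FRAME, FRAME_PERIOD_S, POINTS, IDLE_POWER_W)
slack_j = exploit_slack(CYCLES_PER_FRAME, FRAME_PERIOD_S, POINTS, IDLE_POWER_W)
print(f"race to idle:  {race_j * 1e3:.1f} mJ per frame")
print(f"exploit slack: {slack_j * 1e3:.1f} mJ per frame")
```

With these made-up numbers the slower operating point uses less energy per frame because active power grows faster than frequency. A different power curve or a much lower idle power could reverse the ordering, so the sketch shows only how the comparison is set up, not the article's results.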

Published In

ACM Transactions on Architecture and Code Optimization, Volume 11, Issue 4
January 2015
797 pages
ISSN:1544-3566
EISSN:1544-3973
DOI:10.1145/2695583
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 January 2015
Accepted: 01 November 2014
Revised: 01 October 2014
Received: 01 March 2014
Published in TACO Volume 11, Issue 4

Author Tags

  1. Video decoding
  2. Low-power computing
  3. Parallel processing

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • European Community's Seventh Framework Programme [FP7/2007-2013] under the LPGPU Project (www.lpgpu.org)

Article Metrics

  • Downloads (last 12 months): 66
  • Downloads (last 6 weeks): 16
Reflects downloads up to 13 Jan 2025

Cited By

  • (2022) AgileWatts: An Energy-Efficient CPU Core Idle-State Architecture for Latency-Sensitive Server Applications. 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO), 835-850. DOI: 10.1109/MICRO56248.2022.00063. Online publication date: Oct-2022
  • (2020) Performance-Based Pricing in Multi-Core Geo-Distributed Cloud Computing. IEEE Transactions on Cloud Computing 8:4, 1079-1092. DOI: 10.1109/TCC.2016.2628368. Online publication date: 1-Oct-2020
  • (2018) Joint DVFS and Parallelism for Energy Efficient and Low Latency Software Video Decoding. IEEE Transactions on Parallel and Distributed Systems 29:4, 858-872. DOI: 10.1109/TPDS.2017.2779812. Online publication date: 1-Apr-2018
  • (2018) Highly parallel HEVC decoding for heterogeneous systems with CPU and GPU. Signal Processing: Image Communication 62, 93-105. DOI: 10.1016/j.image.2017.12.009. Online publication date: Mar-2018
  • (2017) Reducing computational complexity in HEVC decoder for mobile energy saving. 2017 25th European Signal Processing Conference (EUSIPCO), 1026-1030. DOI: 10.23919/EUSIPCO.2017.8081363. Online publication date: Aug-2017
  • (2017) Cooperative DVFS for energy-efficient HEVC decoding on embedded CPU-GPU architecture. Proceedings of the 54th Annual Design Automation Conference 2017, 1-6. DOI: 10.1145/3061639.3062216. Online publication date: 18-Jun-2017
  • (2017) Efficient DVFS for low power HEVC software decoder. Journal of Real-Time Image Processing 13:1, 39-54. DOI: 10.1007/s11554-016-0624-9. Online publication date: 1-Mar-2017
  • (2017) Architecture-aware optimization of an HEVC decoder on asymmetric multicore processors. Journal of Real-Time Image Processing 13:1, 25-38. DOI: 10.1007/s11554-016-0606-y. Online publication date: 1-Mar-2017
  • (2016) Scalable HEVC decoder for mobile devices: Trade-off between energy consumption and quality. 2016 Conference on Design and Architectures for Signal and Image Processing (DASIP), 18-25. DOI: 10.1109/DASIP.2016.7853791. Online publication date: Oct-2016
