Leveraging Automatic High-Level Synthesis Resource Sharing to Maximize Dynamical Voltage Overscaling with Error Control

Published: 02 November 2021

    Abstract

    Approximate computing has emerged as a way to further reduce the power consumption of integrated circuits (ICs) by trading errors at the output for simpler, more efficient logic. So far, the main approach has been to simplify the hardware by pruning the circuit until the maximum error threshold is reached. A critical issue, though, is the training data used to guide the pruning: the output error can significantly exceed the maximum error if the final workload does not match the training data. Most previous work therefore assumes that the training data matches the workload data distribution. In this work, we present a method that dynamically overscales the supply voltage at runtime based on the observed workload distribution. This makes it possible to adaptively select the supply voltage that yields the largest power savings while ensuring that the error never exceeds the maximum error threshold. The approach also allows the original error-free circuit to be restored if no matching workload distribution is found. The proposed method further leverages the ability of High-Level Synthesis (HLS) to automatically generate circuits with different properties by setting different synthesis constraints, in order to maximize the available timing slack and, hence, the power savings. Experimental results show that the proposed method saves on average 47.08% of power compared to the exact output circuit and 20.25% more than a traditional approximation method.
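The runtime selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `WorkloadProfile` record, the `select_vdd` function, the mean/standard-deviation matching rule, the tolerance, and all voltage values are assumptions made for illustration. The sketch only shows the general idea of picking the lowest pre-characterized voltage whose workload distribution matches the observed inputs, and falling back to the error-free setting otherwise.

```python
from dataclasses import dataclass
from statistics import fmean, pstdev
from typing import Iterable, Sequence


@dataclass(frozen=True)
class WorkloadProfile:
    """Hypothetical operating point characterized offline."""
    mean: float  # mean of the input values seen during characterization
    std: float   # population standard deviation of those inputs
    vdd: float   # overscaled supply voltage (V) assumed to keep the error within the threshold


def select_vdd(window: Sequence[float],
               profiles: Iterable[WorkloadProfile],
               nominal_vdd: float = 1.0,
               tol: float = 1.0) -> float:
    """Return the lowest profiled voltage whose distribution matches the window.

    A profile "matches" (by assumption) when both its mean and its standard
    deviation are within `tol` of the window's. If nothing matches, the
    nominal (error-free) voltage is returned.
    """
    m, s = fmean(window), pstdev(window)
    candidates = [p.vdd for p in profiles
                  if abs(p.mean - m) <= tol and abs(p.std - s) <= tol]
    return min(candidates, default=nominal_vdd)


# Illustrative profiles; the numbers are made up.
profiles = [WorkloadProfile(10.0, 2.0, 0.8), WorkloadProfile(50.0, 5.0, 0.7)]
print(select_vdd([8, 10, 12, 10], profiles))   # matches the first profile
print(select_vdd([100, 200, 300], profiles))   # no match, nominal voltage
```

Returning the nominal voltage when no profile matches mirrors the fallback to the original error-free circuit mentioned in the abstract; a real system would need a distribution test and a characterization flow that this sketch does not attempt to model.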

    Cited By

    • (2023) Constraint-Aware Multi-Technique Approximate High-Level Synthesis for FPGAs. ACM Transactions on Reconfigurable Technology and Systems 16:4 (1-28). DOI: 10.1145/3624481. Online publication date: 9-Oct-2023
    • (2023) An Energy-Efficient Generic Accuracy Configurable Multiplier Based on Block-Level Voltage Overscaling. IEEE Transactions on Emerging Topics in Computing 11:4 (851-867). DOI: 10.1109/TETC.2023.3279419. Online publication date: Oct-2023
    • (2022) A Multiplier-Less Level-3 Haar Wavelet Transform Approximation Requiring Five Additions Only. 2022 IEEE 15th Dallas Circuit And System Conference (DCAS) (1-6). DOI: 10.1109/DCAS53974.2022.9845632. Online publication date: 17-Jun-2022

          Published In

          ACM Transactions on Design Automation of Electronic Systems, Volume 27, Issue 2
          March 2022
          217 pages
          ISSN: 1084-4309
          EISSN: 1557-7309
          DOI: 10.1145/3494074

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          Published: 02 November 2021
          Accepted: 01 July 2021
          Revised: 01 May 2021
          Received: 01 February 2021
          Published in TODAES Volume 27, Issue 2

          Author Tags

          1. Approximate computing
          2. dynamic error control
          3. voltage overscaling
          4. low-power
          5. high-level synthesis
          6. resource sharing

          Qualifiers

          • Research-article
          • Refereed
