Enhancing Speculative Execution With Selective Approximate Computing

Published: 14 February 2019

    Abstract

    Speculative execution is an optimization technique used in modern processors in which predicted instructions are executed in advance to overlap the latencies of slow operations. Branch prediction and load value speculation are examples of speculative execution used in modern pipelined processors to avoid execution stalls. However, speculative execution incurs a performance penalty on a misprediction, because the speculated work must be rolled back. In this work, we propose to aid speculative execution with approximate computing by relaxing the execution rollback penalty associated with a misprediction. We propose a sensitivity analysis method for data and branches in a program that identifies the data load and branch instructions that can be executed without any rollback in the pipeline while still ensuring a user-specified quality of service of the application with a probabilistic reliability. Our analysis is based on statistical methods, particularly hypothesis testing and Bayesian analysis. We perform an architectural simulation of our proposed approximate execution and report the benefits in terms of CPU cycles and energy utilization on selected applications from the AxBench, ACCEPT, and PARSEC 3.0 benchmark suites.
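    The abstract does not spell out the test procedure, so the following is only a minimal sketch of how hypothesis testing could decide whether an instruction is safe to execute without rollback. It assumes a Wald-style sequential probability ratio test over a stream of per-run outcomes, where each outcome records whether the user-specified quality of service was met when the candidate load or branch was approximated. The function name `sprt_safe_to_approximate`, the parameters `theta`, `delta`, `alpha`, and `beta`, and the outcome stream are all illustrative assumptions, not the authors' implementation.

```python
import math
from typing import Iterable, Optional


def sprt_safe_to_approximate(
    outcomes: Iterable[bool],
    theta: float = 0.9,   # assumed required probability of meeting QoS
    delta: float = 0.05,  # assumed half-width of the indifference region
    alpha: float = 0.05,  # assumed bound on wrongly rejecting "safe"
    beta: float = 0.05,   # assumed bound on wrongly accepting "safe"
) -> Optional[bool]:
    """Sequentially test H0: p >= theta + delta against H1: p <= theta - delta.

    `outcomes` yields True when a run with the candidate instruction
    approximated still met the QoS target (hypothetical input).
    Returns True if H0 is accepted (treat as safe to approximate),
    False if H1 is accepted, and None if the samples run out first.
    """
    p0 = theta + delta
    p1 = theta - delta
    accept_h1 = math.log((1 - beta) / alpha)  # upper stopping threshold
    accept_h0 = math.log(beta / (1 - alpha))  # lower stopping threshold

    llr = 0.0  # log-likelihood ratio of H1 relative to H0
    for met in outcomes:
        if met:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr <= accept_h0:
            return True
        if llr >= accept_h1:
            return False
    return None


# Hypothetical usage: 30 consecutive runs that all met the QoS target.
print(sprt_safe_to_approximate([True] * 30))
```

    A sequential test of this kind stops as soon as the evidence is strong enough in either direction, which keeps the number of profiling runs small for instructions that are clearly sensitive or clearly insensitive.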

    Published In

    ACM Transactions on Design Automation of Electronic Systems, Volume 24, Issue 2
    March 2019
    287 pages
    ISSN: 1084-4309
    EISSN: 1557-7309
    DOI: 10.1145/3306156
    Editor: Naehyuck Chang
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 14 February 2019
    Accepted: 01 January 2019
    Revised: 01 November 2018
    Received: 01 April 2018
    Published in TODAES Volume 24, Issue 2

    Author Tags

    1. Approximate computing
    2. Bayesian analysis
    3. Hypothesis testing
    4. Speculative execution

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • National Institute of Technology Meghalaya and Visvesvaraya Ph.D. Scheme, Government of India

    Cited By

    • (2024) Adaptive approximate computing in edge AI and IoT applications: A review. Journal of Systems Architecture 150 (103114). DOI: 10.1016/j.sysarc.2024.103114. Online publication date: May 2024.
    • (2022) Accelerating Decision Tree Ensemble with Guided Branch Approximation. Proceedings of the 12th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, 24-32. DOI: 10.1145/3535044.3535048. Online publication date: 9 June 2022.
    • (2022) Usable Circuits with Imperfect Scan Logic. 2022 IEEE 31st Asian Test Symposium (ATS), 156-161. DOI: 10.1109/ATS56056.2022.00039. Online publication date: November 2022.
    • (2019) Approximate computing for multithreaded programs in shared memory architectures. Proceedings of the 17th ACM-IEEE International Conference on Formal Methods and Models for System Design, 1-9. DOI: 10.1145/3359986.3361209. Online publication date: 9 October 2019.
    • Approximation Opportunities in Edge Computing Hardware: A Systematic Literature Review. ACM Computing Surveys. DOI: 10.1145/3572772.
