DOI: 10.1145/3437801.3446108

ApproxTuner: a compiler and runtime system for adaptive approximations

Published: 17 February 2021

Abstract

Manually optimizing the tradeoffs between accuracy, performance and energy for resource-intensive applications with flexible accuracy or precision requirements is extremely difficult. We present ApproxTuner, an automatic framework for accuracy-aware optimization of tensor-based applications while requiring only high-level end-to-end quality specifications. ApproxTuner implements and manages approximations in algorithms, system software, and hardware.
The key contribution in ApproxTuner is a novel three-phase approach to approximation-tuning that consists of development-time, install-time, and run-time phases. Our approach decouples tuning of hardware-independent and hardware-specific approximations, thus providing retargetability across devices. To enable efficient autotuning of approximation choices, we present a novel accuracy-aware tuning technique called predictive approximation-tuning, which significantly speeds up autotuning by analytically predicting the accuracy impacts of approximations.
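The predictive-tuning idea can be illustrated with a toy sketch. None of the names below are ApproxTuner's real API, and the additive per-operation error model and the knob/speedup numbers are invented purely for illustration. The structural point is what matters: the autotuner scores each candidate configuration with a cheap analytical accuracy prediction derived from one-time per-operation profiles, instead of an expensive end-to-end run per candidate.

```python
import itertools

# One-time profiling pass: per (op, knob), an estimated accuracy
# degradation in percentage points (hypothetical numbers).
ERROR_PROFILE = {
    ("conv1", "fp16"): 0.1, ("conv1", "perforate_2x"): 0.6,
    ("conv2", "fp16"): 0.1, ("conv2", "perforate_2x"): 0.9,
    ("fc1",   "fp16"): 0.05, ("fc1",  "sample_50pct"): 0.4,
}
# Hypothetical per-knob speedups over the exact implementation.
SPEEDUP = {"exact": 1.0, "fp16": 1.5, "perforate_2x": 2.0, "sample_50pct": 1.8}

def predict_accuracy(baseline_acc, config):
    """Analytically predict end-to-end accuracy by assuming per-op
    degradations compose additively (a simple first-order model)."""
    loss = sum(ERROR_PROFILE.get((op, knob), 0.0)
               for op, knob in config.items() if knob != "exact")
    return baseline_acc - loss

def predict_speedup(config):
    """Toy cost model: average the per-op speedups."""
    ups = [SPEEDUP[knob] for knob in config.values()]
    return sum(ups) / len(ups)

def autotune(ops, knobs_per_op, baseline_acc, max_loss_pct):
    """Score every configuration with the predictive models and keep
    the fastest one within the end-to-end quality threshold."""
    best = None
    for choice in itertools.product(*(knobs_per_op[op] for op in ops)):
        config = dict(zip(ops, choice))
        acc = predict_accuracy(baseline_acc, config)
        if baseline_acc - acc <= max_loss_pct:
            s = predict_speedup(config)
            if best is None or s > best[0]:
                best = (s, config, acc)
    return best

ops = ["conv1", "conv2", "fc1"]
knobs = {
    "conv1": ["exact", "fp16", "perforate_2x"],
    "conv2": ["exact", "fp16", "perforate_2x"],
    "fc1":   ["exact", "fp16", "sample_50pct"],
}
speedup, config, acc = autotune(ops, knobs, baseline_acc=84.0, max_loss_pct=1.0)
print(config, round(acc, 2), round(speedup, 2))
# → {'conv1': 'perforate_2x', 'conv2': 'fp16', 'fc1': 'fp16'} 83.25 1.67
```

Every configuration here is scored in microseconds; empirical tuning would instead run the full benchmark per candidate, which is the gap the reported 12.8x-20.4x tuning speedups come from.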
We evaluate ApproxTuner across 10 convolutional neural networks (CNNs) and a combined CNN and image processing benchmark. For the evaluated CNNs, using only hardware-independent approximation choices, we achieve a mean speedup of 2.1x (max 2.7x) on a GPU and a mean speedup of 1.3x (max 1.9x) on a CPU, while staying within 1 percentage point of inference accuracy loss. For two different accuracy-prediction models, ApproxTuner speeds up tuning by 12.8x and 20.4x compared to conventional empirical tuning while achieving comparable benefits.




    Published In

    PPoPP '21: Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
    February 2021
    507 pages
    ISBN:9781450382946
    DOI:10.1145/3437801

    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Author Tags

    1. approximate computing
    2. compilers
    3. deep neural networks
    4. heterogeneous systems

    Qualifiers

    • Research-article

    Acceptance Rates

PPoPP '21 Paper Acceptance Rate: 31 of 150 submissions (21%)
Overall Acceptance Rate: 230 of 1,014 submissions (23%)

Cited By

• Fractal: Joint Multi-Level Sparse Pattern Tuning of Accuracy and Performance for DNN Pruning. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3 (2024), 416-430. DOI: 10.1145/3620666.3651351
• Felix: Optimizing Tensor Programs with Gradient Descent. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3 (2024), 367-381. DOI: 10.1145/3620666.3651348
• Mobiprox: Supporting Dynamic Approximate Computing on Mobiles. IEEE Internet of Things Journal 11, 9 (2024), 16873-16886. DOI: 10.1109/JIOT.2024.3365957
• Harnessing Approximate Computing for Machine Learning. 2024 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), 585-591. DOI: 10.1109/ISVLSI61997.2024.00110
• Approximate Computing: Concepts, Architectures, Challenges, Applications, and Future Directions. IEEE Access 12 (2024), 146022-146088. DOI: 10.1109/ACCESS.2024.3467375
• Hardware-Aware Static Optimization of Hyperdimensional Computations. Proceedings of the ACM on Programming Languages 7, OOPSLA2 (2023), 1-30. DOI: 10.1145/3622797
• Approx-RM: Reducing Energy on Heterogeneous Multicore Processors under Accuracy and Timing Constraints. ACM Transactions on Architecture and Code Optimization 20, 3 (2023), 1-25. DOI: 10.1145/3605214
• HPAC-Offload: Accelerating HPC Applications with Portable Approximate Computing on the GPU. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (2023), 1-14. DOI: 10.1145/3581784.3607095
• Approximate Computing: Hardware and Software Techniques, Tools and Their Applications. Journal of Circuits, Systems and Computers 33, 4 (2023). DOI: 10.1142/S0218126624300010
• Approximate High-Performance Computing: A Fast and Energy-Efficient Computing Paradigm in the Post-Moore Era. IT Professional 25, 2 (2023), 7-15. DOI: 10.1109/MITP.2023.3254642
