DOI: 10.1109/SC.2018.00051
research-article

ADAPT: algorithmic differentiation applied to floating-point precision tuning

Published: 26 July 2019

Abstract

HPC applications use floating-point arithmetic extensively to solve computational problems. Mixed-precision computing seeks to use the lowest-precision data type that is sufficient to achieve a desired accuracy, improving performance and reducing power consumption. Manually optimizing a program to use mixed precision is challenging, as it requires not only extensive knowledge of the numerical behavior of the algorithm but also estimates of the rounding errors. In this work, we present ADAPT, a scalable approach for mixed-precision analysis of HPC workloads that uses algorithmic differentiation to provide accurate estimates of the final output error. ADAPT provides a floating-point precision sensitivity profile while incurring an overhead of only a constant multiple of the original computation, irrespective of the number of variables analyzed. The sensitivity profile can be used to make algorithmic choices and to develop mixed-precision configurations of a program. We evaluate ADAPT on six benchmarks and a proxy application (LULESH) and show that we are able to achieve a speedup of 1.2x on the proxy application.
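
The idea in the abstract can be illustrated with a small, hedged sketch. This is not ADAPT's implementation; it is a toy reverse-mode differentiation tape in Python that assumes a first-order error model, in which demoting a variable v from double to single precision changes the output by roughly |dy/dv * (v - fl32(v))|. The names `Var`, `TAPE`, `to_float32`, and `sensitivity_profile` are hypothetical and chosen only for this illustration. One reverse sweep produces the derivative of the output with respect to every recorded variable, which is why the cost stays a constant multiple of the original computation regardless of how many variables are profiled.

```python
import struct

TAPE = []  # every Var, in creation order (a valid topological order)


def to_float32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


class Var:
    """A value recorded on the tape for reverse-mode differentiation."""

    def __init__(self, value, parents=(), name=None):
        self.value = value
        self.parents = parents  # pairs of (parent Var, local partial derivative)
        self.name = name        # only named variables appear in the profile
        self.adjoint = 0.0
        TAPE.append(self)

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __sub__(self, other):
        return Var(self.value - other.value, ((self, 1.0), (other, -1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def sensitivity_profile(output):
    """Estimate, per named variable, the output error caused by storing
    that variable in binary32 (first-order model, an assumption here)."""
    for v in TAPE:
        v.adjoint = 0.0
    output.adjoint = 1.0
    # Single reverse sweep: propagate adjoints from the output to all inputs.
    for v in reversed(TAPE):
        for parent, partial in v.parents:
            parent.adjoint += v.adjoint * partial
    profile = {}
    for v in TAPE:
        if v.name is not None:
            rounding = v.value - to_float32(v.value)
            profile[v.name] = abs(v.adjoint * rounding)
    return profile


a = Var(0.1, name="a")
b = Var(0.5, name="b")         # exactly representable in binary32
c = Var(16777217.0, name="c")  # 2**24 + 1, not representable in binary32
y = a * c + b
profile = sensitivity_profile(y)
for name, estimate in sorted(profile.items()):
    print(name, estimate)
```

In this toy profile, `b` contributes no estimated error and would be a safe candidate for single precision, while `a` and `c` contribute nonzero estimates that would have to be weighed against an error threshold. How ADAPT itself chooses thresholds and builds configurations is described in the paper, not here.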

Published In

SC '18: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis
November 2018
932 pages

In-Cooperation

  • IEEE CS

Publisher

IEEE Press

Qualifiers

  • Research-article

Conference

SC18

Acceptance Rates

Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Article Metrics

  • Downloads (Last 12 months): 7
  • Downloads (Last 6 weeks): 3
Reflects downloads up to 22 Dec 2024

Cited By

  • (2024) FPCC: Detecting Floating-Point Errors via Chain Conditions, Proceedings of the ACM on Programming Languages, 10.1145/3689764, 8:OOPSLA2, (1504-1531), Online publication date: 8-Oct-2024
  • (2024) MixPert: Optimizing Mixed-Precision Floating-Point Emulation on GPU Integer Tensor Cores, Proceedings of the 25th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems, 10.1145/3652032.3657567, (34-45), Online publication date: 20-Jun-2024
  • (2024) A Holistic Approach to Automatic Mixed-Precision Code Generation and Tuning for Affine Programs, Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 10.1145/3627535.3638484, (55-67), Online publication date: 2-Mar-2024
  • (2024) Predicting Performance and Accuracy of Mixed-Precision Programs for Precision Tuning, Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, 10.1145/3597503.3623338, (1-13), Online publication date: 20-May-2024
  • (2023) Myths and legends in high-performance computing, International Journal of High Performance Computing Applications, 10.1177/10943420231166608, 37:3-4, (245-259), Online publication date: 1-Jul-2023
  • (2023) Automatic Search Guided Code Optimization Framework for Mixed-Precision Scientific Applications, Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, 10.1145/3624062.3624108, (399-403), Online publication date: 12-Nov-2023
  • (2023) HPAC-Offload: Accelerating HPC Applications with Portable Approximate Computing on the GPU, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 10.1145/3581784.3607095, (1-14), Online publication date: 12-Nov-2023
  • (2023) Approximate High-Performance Computing: A Fast and Energy-Efficient Computing Paradigm in the Post-Moore Era, IT Professional, 10.1109/MITP.2023.3254642, 25:2, (7-15), Online publication date: 1-Mar-2023
  • (2022) moTuner, Proceedings of the 19th ACM International Conference on Computing Frontiers, 10.1145/3528416.3530231, (94-102), Online publication date: 17-May-2022
  • (2022) Towards an Approximation-Aware Computational Workflow Framework for Accelerating Large-Scale Discovery Tasks, Proceedings of the 2022 Workshop on Advanced tools, programming languages, and PLatforms for Implementing and Evaluating algorithms for Distributed systems, 10.1145/3524053.3542746, (7-14), Online publication date: 25-Jul-2022