Efficiency of Reproducible Level 1 BLAS

Chohra, Chemseddine; Langlois, Philippe; Parello, David

doi:10.1007/978-3-319-31769-4_8

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9553)

Included in the following conference series:

International Symposium on Scientific Computing, Computer Arithmetic, and Validated Numerics

Abstract

Numerical reproducibility failures appear in massively parallel floating-point computations. One way to guarantee reproducibility is to extend IEEE-754 correct rounding to longer computation sequences, e.g. to the BLAS routines. Is the extra cost of numerical reproducibility acceptable in practice? We present solutions and experiments for the level 1 BLAS and draw conclusions about their efficiency.
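To make the reproducibility issue concrete, here is a minimal sketch, not taken from the chapter, of how the order of additions changes a floating-point sum while a correctly rounded sum does not depend on order. The helper `naive_sum` and the sample data are hypothetical; `math.fsum` from the Python standard library stands in for a correctly rounded summation and is not the authors' algorithm.

```python
import math

def naive_sum(values):
    """Left-to-right recursive summation, rounding after every addition."""
    total = 0.0
    for v in values:
        total += v
    return total

# The same four numbers in two different orders, as might happen when a
# parallel reduction schedules its partial sums differently between runs.
data = [1e100, 1.0, -1e100, 1.0]
reordered = [1.0, 1.0, 1e100, -1e100]

naive_a = naive_sum(data)        # the first 1.0 is absorbed by 1e100
naive_b = naive_sum(reordered)   # both 1.0 terms are absorbed by 1e100

# math.fsum tracks exact partial sums and rounds once at the end,
# so the result is the same for any ordering of the inputs.
exact_a = math.fsum(data)
exact_b = math.fsum(reordered)

print(naive_a, naive_b)   # the two naive results differ
print(exact_a, exact_b)   # the two correctly rounded results agree
```

The exact sum of the inputs is 2, which is representable, so a correctly rounded summation must return 2.0 for every ordering. This is the property the abstract proposes to extend to BLAS routines; the cost of obtaining it is the question the chapter studies.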

Author information

Authors and Affiliations

Digits, Architectures et Logiciels Informatiques, Univ. Perpignan Via Domitia, 66860, Perpignan, France
Chemseddine Chohra, Philippe Langlois & David Parello
Laboratoire d’Informatique Robotique et de Microélectronique de Montpellier, Univ. Montpellier II, UMR 5506, CNRS, 34095, Montpellier, France
Chemseddine Chohra, Philippe Langlois & David Parello

Corresponding author

Correspondence to Chemseddine Chohra.

Editor information

Editors and Affiliations

Institute of Computer Science, University of Würzburg, Würzburg, Germany
Marco Nehmeier
Universität Würzburg, Würzburg, Germany
Jürgen Wolff von Gudenberg
Department of Mathematics, Uppsala University, Uppsala, Sweden
Warwick Tucker

About this paper

Cite this paper

Chohra, C., Langlois, P., Parello, D. (2016). Efficiency of Reproducible Level 1 BLAS. In: Nehmeier, M., Wolff von Gudenberg, J., Tucker, W. (eds) Scientific Computing, Computer Arithmetic, and Validated Numerics. SCAN 2015. Lecture Notes in Computer Science, vol 9553. Springer, Cham. https://doi.org/10.1007/978-3-319-31769-4_8

DOI: https://doi.org/10.1007/978-3-319-31769-4_8
Published: 09 April 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31768-7
Online ISBN: 978-3-319-31769-4
eBook Packages: Computer Science, Computer Science (R0)
