Efficiency of Reproducible Level 1 BLAS

  • Conference paper
Scientific Computing, Computer Arithmetic, and Validated Numerics (SCAN 2015)

Abstract

Numerical reproducibility failures appear in massively parallel floating-point computations. One way to guarantee reproducibility is to extend the IEEE-754 correct rounding to larger computing sequences, e.g. to the BLAS. Is the extra cost of numerical reproducibility acceptable in practice? We present solutions and experiments for the level 1 BLAS and draw conclusions about their efficiency.
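As a minimal illustration of the problem stated in the abstract, the sketch below sums the same four values in two different orders. It is not the authors' implementation: the helper names `naive_sum` and `correctly_rounded_sum` are hypothetical, and Python's standard `math.fsum` stands in for a correctly rounded summation. The point it shows is that a sum rounded only once does not depend on the order of the operands, whereas ordinary recursive summation does.

```python
import math

def naive_sum(values):
    """Recursive summation: one rounding after every addition."""
    total = 0.0
    for v in values:
        total += v
    return total

def correctly_rounded_sum(values):
    """Stand-in for a correctly rounded sum (assumption: math.fsum,
    not the algorithms of the paper). The result is rounded once,
    so it is independent of the summation order."""
    return math.fsum(values)

# The same multiset of values in two orders, as could result from two
# different parallel reduction schedules.
data = [1e16, 1.0, -1e16, 1.0]
reordered = [1e16, -1e16, 1.0, 1.0]

# The first 1.0 is absorbed by 1e16 in `data`, but not in `reordered`.
print(naive_sum(data), naive_sum(reordered))
print(correctly_rounded_sum(data), correctly_rounded_sum(reordered))
```

Here the naive results differ between the two orders while the correctly rounded results agree. The question raised by the paper is how much such a guarantee costs when applied to level 1 BLAS routines.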

Author information

Correspondence to Chemseddine Chohra.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Chohra, C., Langlois, P., Parello, D. (2016). Efficiency of Reproducible Level 1 BLAS. In: Nehmeier, M., Wolff von Gudenberg, J., Tucker, W. (eds) Scientific Computing, Computer Arithmetic, and Validated Numerics. SCAN 2015. Lecture Notes in Computer Science, vol 9553. Springer, Cham. https://doi.org/10.1007/978-3-319-31769-4_8

  • DOI: https://doi.org/10.1007/978-3-319-31769-4_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-31768-7

  • Online ISBN: 978-3-319-31769-4

  • eBook Packages: Computer Science, Computer Science (R0)
