
Jean-Michel Muller

rounded multiplication by arbitrary precision constants
To analyze a priori the accuracy of an algorithm in floating-point arithmetic, one usually derives a uniform error bound on the output, valid for most inputs and parametrized by the precision p. To show further that this bound is sharp, a common way is to build an input example for which the error committed by the algorithm comes close to that bound, or even attains it. Such inputs may be given as floating-point numbers in one of the IEEE standard formats (say, for p = 53) or, more generally, as expressions parametrized by p, that can be viewed as symbolic floating-point numbers. With such inputs, a sharpness result can thus be established for virtually all reasonable formats instead of just one of them. This, however, requires the ability to run the algorithm on those inputs and, in particular, to compute the correctly-rounded sum, product, or ratio of two symbolic floating-point numbers. The goal of this paper is to show how these basic arithmetic operations can be performed automatically...
Floating-point numbers have an intuitive meaning when it comes to physics-based numerical computations, and they have thus become the most common way of approximating real numbers in computers. The IEEE-754 Standard has played a large part in making floating-point arithmetic ubiquitous today, by specifying its semantics in a strict yet useful way as early as 1985. In particular, floating-point operations should be performed as if their results were first computed with an infinite precision and then rounded to the target format. A consequence is that floating-point arithmetic satisfies the ‘standard model’ that is often used for analysing the accuracy of floating-point algorithms. But that is only scraping the surface, and floating-point arithmetic offers much more. In this survey we recall the history of floating-point arithmetic as well as its specification mandated by the IEEE-754 Standard. We also recall what properties it entails and what every programmer should know when designi...
The 2Sum and Fast2Sum algorithms are important building blocks in numerical computing. They are used (implicitly or explicitly) in many compensated algorithms (such as compensated summation or compensated polynomial evaluation). They are also used for manipulating floating-point expansions. We show that these algorithms are much more robust than is usually believed: the returned result makes sense even when the rounding function is not round-to-nearest, and they are almost immune to overflow.
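For readers unfamiliar with the two building blocks named in this abstract, here is a minimal Python sketch of their textbook formulations. It is not taken from the paper: the function names `fast_two_sum` and `two_sum` are illustrative, and the sketch assumes that Python floats are IEEE-754 binary64 values with round-to-nearest, and that no overflow occurs. Under those assumptions, the returned pair `(s, t)` satisfies `s + t == a + b` exactly, which the usage lines check with exact rational arithmetic.

```python
from fractions import Fraction


def fast_two_sum(a: float, b: float) -> tuple[float, float]:
    """Fast2Sum, textbook form: 3 operations.

    Assumption: |a| >= |b| (the usual sufficient condition in radix 2).
    Returns (s, t) with s = RN(a + b) and t the rounding error of that sum.
    """
    s = a + b      # rounded sum
    z = s - a      # the part of b that made it into s
    t = b - z      # the part of b that was lost
    return s, t


def two_sum(a: float, b: float) -> tuple[float, float]:
    """2Sum, textbook form: 6 operations, no ordering assumption on a and b."""
    s = a + b
    a_prime = s - b
    b_prime = s - a_prime
    delta_a = a - a_prime
    delta_b = b - b_prime
    t = delta_a + delta_b
    return s, t


def exact_sum_preserved(a: float, b: float, s: float, t: float) -> bool:
    """Check s + t == a + b in exact rational arithmetic."""
    return Fraction(s) + Fraction(t) == Fraction(a) + Fraction(b)


# Usage: the error term recovers what the rounded sum dropped.
s1, t1 = two_sum(0.1, 0.2)
s2, t2 = fast_two_sum(1.0, 2.0 ** -60)   # |a| >= |b| holds here
print(exact_sum_preserved(0.1, 0.2, s1, t1))
print(exact_sum_preserved(1.0, 2.0 ** -60, s2, t2))
```

2Sum costs twice as many operations as Fast2Sum but needs no comparison of the operands, which is why it is often preferred when the relative magnitude of the inputs is unknown.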
Let us denote by $Q(N; \lambda)$ the number of solutions of the diophantine equation $A^2 + B^2 = C^2 + C$ satisfying $N \le A \le B \le C \le \lambda N - \frac{1}{2}$. We prove that, for fixed $\lambda$ and $N \to \infty$, there exists a constant $\alpha(\lambda)$ such that $Q(N; \lambda) = \alpha(\lambda) N + O(N^{7/8} \log N)$. When $\lambda = 2$, $Q(2^{n-1}; 2)$ counts the number of solutions of $A^2 + B^2 = C^2 + C$ with the same number, $n$, of binary digits; these solutions are interesting in the problem of computing the function $(a, b) \mapsto \sqrt{a^2 + b^2}$ in radix-2 floating-point arithmetic. By elementary arguments, $Q(N; \lambda)$ can be expressed in terms of four sums of the type