
Commit 02ddd49

Change floating-point output format for improved performance.

Previously, floating-point output was done by rounding to a specific decimal precision; by default, to 6 or 15 decimal digits (losing information) or as requested using extra_float_digits. Drivers that wanted exact float values, and applications like pg_dump that must preserve values exactly, set extra_float_digits=3 (or sometimes 2 for historical reasons, though this isn't enough for float4).

Unfortunately, decimal rounded output is slow enough to become a noticeable bottleneck when dealing with large result sets or COPY of large tables when many floating-point values are involved.

Floating-point output can be done much faster when the output is not rounded to a specific decimal length, but rather is chosen as the shortest decimal representation that is closer to the original float value than to any other value representable in the same precision. The recently published Ryu algorithm by Ulf Adams is both relatively simple and remarkably fast.

Accordingly, change float4out/float8out to output shortest decimal representations if extra_float_digits is greater than 0, and make that the new default. Applications that need rounded output can set extra_float_digits back to 0 or below, and take the resulting performance hit.

We make one concession to portability for systems with buggy floating-point input: we do not output decimal values that fall exactly halfway between adjacent representable binary values (which would rely on the reader doing round-to-nearest-even correctly). This is known to be a problem at least for VS2013 on Windows.

Our version of the Ryu code originates from https://github.com/ulfjack/ryu/ at commit c9c3fb1979, but with the following (significant) modifications:

- Output format is changed to use fixed-point notation for small exponents, as printf would, and also to use lowercase 'e', a minimum of 2 exponent digits, and a mandatory sign on the exponent, to keep the formatting as close as possible to previous output.
- The output of exact midpoint values is disabled as noted above.
- The integer fast-path code is changed somewhat (since we have fixed-point output and the upstream did not).
- Our project style has been largely applied to the code with the exception of C99 declaration-after-statement, which has been retained as an exception to our present policy.
- Most of upstream's debugging and conditionals are removed, and we use our own configure tests to determine things like uint128 availability.

Changing the float output format obviously affects a number of regression tests. This patch uses an explicit setting of extra_float_digits=0 for test output that is not expected to be exactly reproducible (e.g. due to numerical instability or differing algorithms for transcendental functions).

Conversions from floats to numeric are unchanged by this patch. These may appear in index expressions and it is not yet clear whether any change should be made, so that can be left for another day.

This patch assumes that the only supported floating point format is now IEEE format, and the documentation is updated to reflect that.

Code by me, adapting the work of Ulf Adams and other contributors.

References: https://dl.acm.org/citation.cfm?id=3192369

Reviewed-by: Tom Lane, Andres Freund, Donald Dong
Discussion: https://postgr.es/m/87r2el1bx6.fsf@news-spur.riddles.org.uk

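The round-trip requirement described in the message can be tried outside the server. The following is a minimal sketch in plain C, not the Ryu code this commit adds: all four function names are hypothetical, the search simply retries printf with increasing precision (which is exactly the slow path Ryu avoids), and IEEE binary32 floats are assumed. It illustrates why 8 significant digits (the float4 default of 6 plus extra_float_digits=2) are not always enough, while 9 are.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: print v with `digits` significant digits, parse the
 * text back, and report whether the same float4 comes out. */
static int
float4_roundtrips(float v, int digits)
{
	char		buf[64];

	snprintf(buf, sizeof(buf), "%.*g", digits, (double) v);
	return strtof(buf, NULL) == v;
}

/* Hypothetical helper: naive stand-in for shortest output.  Tries 1..9
 * significant digits and keeps the first text that parses back to v.
 * Ryu computes its result directly instead of looping like this. */
static void
float4_shortest(float v, char *buf, size_t len)
{
	assert(buf != NULL && len >= 32);
	for (int digits = 1; digits <= 9; digits++)
	{
		snprintf(buf, len, "%.*g", digits, (double) v);
		if (strtof(buf, NULL) == v)
			return;
	}
}

/* Next float4 above a positive finite v.  Assumes IEEE binary32, where
 * adding one to the bit pattern yields the adjacent larger value. */
static float
next_float4(float v)
{
	uint32_t	bits;

	memcpy(&bits, &v, sizeof(bits));
	bits++;
	memcpy(&v, &bits, sizeof(v));
	return v;
}

/* Scan upward from `start` for a value that 8 digits cannot preserve.
 * Returns 0 if none is found within `limit` steps. */
static float
find_8digit_failure(float start, int limit)
{
	float		v = start;

	for (int i = 0; i < limit; i++)
	{
		if (!float4_roundtrips(v, 8))
			return v;
		v = next_float4(v);
	}
	return 0.0f;
}
```

Calling find_8digit_failure(10.0f, 1000) should return a nonzero value a few steps above 10, because between 10 and 16 adjacent float4 values lie closer together than adjacent 8-digit decimals. The same value passes float4_roundtrips(v, 9).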
1 parent f397e08 commit 02ddd49


50 files changed, +5466 -368 lines changed

configure (+9 -1)
@@ -732,6 +732,7 @@ CPP
 BITCODE_CXXFLAGS
 BITCODE_CFLAGS
 CFLAGS_VECTOR
+PERMIT_DECLARATION_AFTER_STATEMENT
 LLVM_BINPATH
 LLVM_CXXFLAGS
 LLVM_CFLAGS
@@ -5261,6 +5262,7 @@ if test "$GCC" = yes -a "$ICC" = no; then
 CFLAGS="-Wall -Wmissing-prototypes -Wpointer-arith"
 CXXFLAGS="-Wall -Wpointer-arith"
 # These work in some but not all gcc versions
+save_CFLAGS=$CFLAGS
 
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether ${CC} supports -Wdeclaration-after-statement, for CFLAGS" >&5
 $as_echo_n "checking whether ${CC} supports -Wdeclaration-after-statement, for CFLAGS... " >&6; }
@@ -5301,7 +5303,13 @@ if test x"$pgac_cv_prog_CC_cflags__Wdeclaration_after_statement" = x"yes"; then
 fi
 
 
-# -Wdeclaration-after-statement isn't applicable for C++
+# -Wdeclaration-after-statement isn't applicable for C++. Specific C files
+# disable it, so AC_SUBST the negative form.
+PERMIT_DECLARATION_AFTER_STATEMENT=
+if test x"save_$CFLAGS" != x"$CFLAGS"; then
+PERMIT_DECLARATION_AFTER_STATEMENT=-Wno-declaration-after-statement
+fi
+
 # Really don't want VLAs to be used in our dialect of C
 
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether ${CC} supports -Werror=vla, for CFLAGS" >&5

configure.in (+8 -1)
@@ -476,8 +476,15 @@ if test "$GCC" = yes -a "$ICC" = no; then
 CFLAGS="-Wall -Wmissing-prototypes -Wpointer-arith"
 CXXFLAGS="-Wall -Wpointer-arith"
 # These work in some but not all gcc versions
+save_CFLAGS=$CFLAGS
 PGAC_PROG_CC_CFLAGS_OPT([-Wdeclaration-after-statement])
-# -Wdeclaration-after-statement isn't applicable for C++
+# -Wdeclaration-after-statement isn't applicable for C++. Specific C files
+# disable it, so AC_SUBST the negative form.
+PERMIT_DECLARATION_AFTER_STATEMENT=
+if test x"save_$CFLAGS" != x"$CFLAGS"; then
+PERMIT_DECLARATION_AFTER_STATEMENT=-Wno-declaration-after-statement
+fi
+AC_SUBST(PERMIT_DECLARATION_AFTER_STATEMENT)
 # Really don't want VLAs to be used in our dialect of C
 PGAC_PROG_CC_CFLAGS_OPT([-Werror=vla])
 # -Wvla is not applicable for C++

contrib/btree_gist/expected/float4.out (+10 -10)
@@ -33,11 +33,11 @@ SELECT count(*) FROM float4tmp WHERE a > -179.0;
 (1 row)
 
 SELECT a, a <-> '-179.0' FROM float4tmp ORDER BY a <-> '-179.0' LIMIT 3;
-    a     | ?column?
-----------+----------
-     -179 |        0
- -189.024 |  10.0239
- -158.177 |  20.8226
+     a      | ?column?
+------------+-----------
+       -179 |         0
+ -189.02386 | 10.023865
+ -158.17741 | 20.822586
 (3 rows)
 
 CREATE INDEX float4idx ON float4tmp USING gist ( a );
@@ -82,10 +82,10 @@ SELECT a, a <-> '-179.0' FROM float4tmp ORDER BY a <-> '-179.0' LIMIT 3;
 (3 rows)
 
 SELECT a, a <-> '-179.0' FROM float4tmp ORDER BY a <-> '-179.0' LIMIT 3;
-    a     | ?column?
-----------+----------
-     -179 |        0
- -189.024 |  10.0239
- -158.177 |  20.8226
+     a      | ?column?
+------------+-----------
+       -179 |         0
+ -189.02386 | 10.023865
+ -158.17741 | 20.822586
 (3 rows)
 

contrib/btree_gist/expected/float8.out (+10 -10)
@@ -33,11 +33,11 @@ SELECT count(*) FROM float8tmp WHERE a > -1890.0;
 (1 row)
 
 SELECT a, a <-> '-1890.0' FROM float8tmp ORDER BY a <-> '-1890.0' LIMIT 3;
-      a       |  ?column?
---------------+------------
-        -1890 |          0
- -2003.634512 | 113.634512
-  -1769.73634 |  120.26366
+      a       |      ?column?
+--------------+--------------------
+        -1890 |                  0
+ -2003.634512 | 113.63451200000009
+  -1769.73634 | 120.26366000000007
 (3 rows)
 
 CREATE INDEX float8idx ON float8tmp USING gist ( a );
@@ -82,10 +82,10 @@ SELECT a, a <-> '-1890.0' FROM float8tmp ORDER BY a <-> '-1890.0' LIMIT 3;
 (3 rows)
 
 SELECT a, a <-> '-1890.0' FROM float8tmp ORDER BY a <-> '-1890.0' LIMIT 3;
-      a       |  ?column?
---------------+------------
-        -1890 |          0
- -2003.634512 | 113.634512
-  -1769.73634 |  120.26366
+      a       |      ?column?
+--------------+--------------------
+        -1890 |                  0
+ -2003.634512 | 113.63451200000009
+  -1769.73634 | 120.26366000000007
 (3 rows)
 
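The new distance values in float8.out are the same numbers as before, printed without rounding. A minimal sketch of why, assuming the `<->` operator on float8 amounts to an absolute difference (the helper names below are hypothetical, and the shortest-output search is a naive printf loop rather than Ryu):

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Old-style output: round to 15 significant digits, the float8 default
 * when extra_float_digits is 0. */
static void
float8_rounded(double v, char *buf, size_t len)
{
	assert(buf != NULL && len >= 32);
	snprintf(buf, len, "%.15g", v);
}

/* Naive stand-in for shortest output: the first precision (1..17) whose
 * text parses back to exactly the same double. */
static void
float8_shortest(double v, char *buf, size_t len)
{
	assert(buf != NULL && len >= 32);
	for (int digits = 1; digits <= 17; digits++)
	{
		snprintf(buf, len, "%.*g", digits, v);
		if (strtod(buf, NULL) == v)
			return;
	}
}

/* Assumed model of the float8 distance operator: |a - b|. */
static double
float8_distance(double a, double b)
{
	double		d = a - b;

	return d < 0 ? -d : d;
}

/* Do the rounded and the shortest form print the same text for v? */
static int
formats_agree(double v)
{
	char		rounded[64];
	char		shortest[64];

	float8_rounded(v, rounded, sizeof(rounded));
	float8_shortest(v, shortest, sizeof(shortest));
	return strcmp(rounded, shortest) == 0;
}
```

With a = -2003.634512 and b = -1890.0, the two formats agree for a itself but not for the distance. The stored double nearest to 2003.634512 is slightly larger than the decimal, and subtracting 1890 leaves that excess visible at 17 digits while 15-digit rounding hides it.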

contrib/cube/expected/cube.out (+19 -13)
@@ -81,21 +81,21 @@ SELECT 'NaN'::cube AS cube;
 (1 row)
 
 SELECT '.1234567890123456'::cube AS cube;
-        cube
-----------------------
- (0.123456789012346)
+         cube
+----------------------
+ (0.1234567890123456)
 (1 row)
 
 SELECT '+.1234567890123456'::cube AS cube;
-        cube
-----------------------
- (0.123456789012346)
+         cube
+----------------------
+ (0.1234567890123456)
 (1 row)
 
 SELECT '-.1234567890123456'::cube AS cube;
-         cube
------------------------
- (-0.123456789012346)
+         cube
+-----------------------
+ (-0.1234567890123456)
 (1 row)
 
 -- simple lists (points)
@@ -943,9 +943,9 @@ SELECT cube_distance('(42,42,42,42)'::cube,'(137,137,137,137)'::cube);
 (1 row)
 
 SELECT cube_distance('(42,42,42)'::cube,'(137,137)'::cube);
-  cube_distance
--------------------
- 140.762210837994
+   cube_distance
+--------------------
+ 140.76221083799445
 (1 row)
 
 -- Test of cube function (text to cube)
@@ -1356,8 +1356,9 @@ SELECT cube_size('(42,137)'::cube);
          0
 (1 row)
 
--- Test of distances
+-- Test of distances (euclidean distance may not be bit-exact)
 --
+SET extra_float_digits = 0;
 SELECT cube_distance('(1,1)'::cube, '(4,5)'::cube);
  cube_distance
 ---------------
@@ -1370,6 +1371,7 @@ SELECT '(1,1)'::cube <-> '(4,5)'::cube as d_e;
    5
 (1 row)
 
+RESET extra_float_digits;
 SELECT distance_chebyshev('(1,1)'::cube, '(4,5)'::cube);
  distance_chebyshev
 --------------------
@@ -1557,6 +1559,7 @@ RESET enable_bitmapscan;
 INSERT INTO test_cube VALUES ('(1,1)'), ('(100000)'), ('(0, 100000)'); -- Some corner cases
 SET enable_seqscan = false;
 -- Test different metrics
+SET extra_float_digits = 0;
 SELECT *, c <-> '(100, 100),(500, 500)'::cube as dist FROM test_cube ORDER BY c <-> '(100, 100),(500, 500)'::cube LIMIT 5;
             c            |       dist
 -------------------------+------------------
@@ -1567,6 +1570,7 @@ SELECT *, c <-> '(100, 100),(500, 500)'::cube as dist FROM test_cube ORDER BY c
  (1444, 403),(1346, 344) |              846
 (5 rows)
 
+RESET extra_float_digits;
 SELECT *, c <=> '(100, 100),(500, 500)'::cube as dist FROM test_cube ORDER BY c <=> '(100, 100),(500, 500)'::cube LIMIT 5;
             c            | dist
 -------------------------+------
@@ -1751,6 +1755,7 @@ SELECT c~>(-4), c FROM test_cube ORDER BY c~>(-4) LIMIT 15; -- descending by upp
 -- Same queries with sequential scan (should give the same results as above)
 RESET enable_seqscan;
 SET enable_indexscan = OFF;
+SET extra_float_digits = 0;
 SELECT *, c <-> '(100, 100),(500, 500)'::cube as dist FROM test_cube ORDER BY c <-> '(100, 100),(500, 500)'::cube LIMIT 5;
             c            |       dist
 -------------------------+------------------
@@ -1761,6 +1766,7 @@ SELECT *, c <-> '(100, 100),(500, 500)'::cube as dist FROM test_cube ORDER BY c
  (1444, 403),(1346, 344) |              846
 (5 rows)
 
+RESET extra_float_digits;
 SELECT *, c <=> '(100, 100),(500, 500)'::cube as dist FROM test_cube ORDER BY c <=> '(100, 100),(500, 500)'::cube LIMIT 5;
             c            | dist
 -------------------------+------

contrib/cube/expected/cube_sci.out (+9 -9)
@@ -87,20 +87,20 @@ SELECT '-1e-300'::cube AS cube;
 (1 row)
 
 SELECT '1234567890123456'::cube AS cube;
-          cube
--------------------------
- (1.23456789012346e+15)
+          cube
+-------------------------
+ (1.234567890123456e+15)
 (1 row)
 
 SELECT '+1234567890123456'::cube AS cube;
-          cube
--------------------------
- (1.23456789012346e+15)
+          cube
+-------------------------
+ (1.234567890123456e+15)
 (1 row)
 
 SELECT '-1234567890123456'::cube AS cube;
-          cube
---------------------------
- (-1.23456789012346e+15)
+           cube
+--------------------------
+ (-1.234567890123456e+15)
 (1 row)
 

contrib/cube/sql/cube.sql (+7 -1)
@@ -336,10 +336,12 @@ SELECT cube_inter('(1,2,3)'::cube, '(5,6,3)'::cube); -- point args
 SELECT cube_size('(4,8),(15,16)'::cube);
 SELECT cube_size('(42,137)'::cube);
 
--- Test of distances
+-- Test of distances (euclidean distance may not be bit-exact)
 --
+SET extra_float_digits = 0;
 SELECT cube_distance('(1,1)'::cube, '(4,5)'::cube);
 SELECT '(1,1)'::cube <-> '(4,5)'::cube as d_e;
+RESET extra_float_digits;
 SELECT distance_chebyshev('(1,1)'::cube, '(4,5)'::cube);
 SELECT '(1,1)'::cube <=> '(4,5)'::cube as d_c;
 SELECT distance_taxicab('(1,1)'::cube, '(4,5)'::cube);
@@ -395,7 +397,9 @@ INSERT INTO test_cube VALUES ('(1,1)'), ('(100000)'), ('(0, 100000)'); -- Some c
 SET enable_seqscan = false;
 
 -- Test different metrics
+SET extra_float_digits = 0;
 SELECT *, c <-> '(100, 100),(500, 500)'::cube as dist FROM test_cube ORDER BY c <-> '(100, 100),(500, 500)'::cube LIMIT 5;
+RESET extra_float_digits;
 SELECT *, c <=> '(100, 100),(500, 500)'::cube as dist FROM test_cube ORDER BY c <=> '(100, 100),(500, 500)'::cube LIMIT 5;
 SELECT *, c <#> '(100, 100),(500, 500)'::cube as dist FROM test_cube ORDER BY c <#> '(100, 100),(500, 500)'::cube LIMIT 5;
 
@@ -412,7 +416,9 @@ SELECT c~>(-4), c FROM test_cube ORDER BY c~>(-4) LIMIT 15; -- descending by upp
 -- Same queries with sequential scan (should give the same results as above)
 RESET enable_seqscan;
 SET enable_indexscan = OFF;
+SET extra_float_digits = 0;
 SELECT *, c <-> '(100, 100),(500, 500)'::cube as dist FROM test_cube ORDER BY c <-> '(100, 100),(500, 500)'::cube LIMIT 5;
+RESET extra_float_digits;
 SELECT *, c <=> '(100, 100),(500, 500)'::cube as dist FROM test_cube ORDER BY c <=> '(100, 100),(500, 500)'::cube LIMIT 5;
 SELECT *, c <#> '(100, 100),(500, 500)'::cube as dist FROM test_cube ORDER BY c <#> '(100, 100),(500, 500)'::cube LIMIT 5;
 SELECT c~>1, c FROM test_cube ORDER BY c~>1 LIMIT 15; -- ascending by left bound

contrib/pg_trgm/expected/pg_strict_word_trgm.out (+2)
@@ -1,6 +1,8 @@
 DROP INDEX trgm_idx2;
 \copy test_trgm3 from 'data/trgm2.data'
 ERROR:  relation "test_trgm3" does not exist
+-- reduce noise
+set extra_float_digits = 0;
 select t,strict_word_similarity('Baykal',t) as sml from test_trgm2 where 'Baykal' <<% t order by sml desc, t;
                   t                  |   sml
 -------------------------------------+----------

contrib/pg_trgm/expected/pg_trgm.out (+2)
@@ -10,6 +10,8 @@ WHERE opc.oid >= 16384 AND NOT amvalidate(opc.oid);
 --backslash is used in tests below, installcheck will fail if
 --standard_conforming_string is off
 set standard_conforming_strings=on;
+-- reduce noise
+set extra_float_digits = 0;
 select show_trgm('');
  show_trgm
 -----------

contrib/pg_trgm/expected/pg_word_trgm.out (+2)
@@ -1,5 +1,7 @@
 CREATE TABLE test_trgm2(t text COLLATE "C");
 \copy test_trgm2 from 'data/trgm2.data'
+-- reduce noise
+set extra_float_digits = 0;
 select t,word_similarity('Baykal',t) as sml from test_trgm2 where 'Baykal' <% t order by sml desc, t;
                   t                  |   sml
 -------------------------------------+----------

contrib/pg_trgm/sql/pg_strict_word_trgm.sql (+3)
@@ -2,6 +2,9 @@ DROP INDEX trgm_idx2;
 
 \copy test_trgm3 from 'data/trgm2.data'
 
+-- reduce noise
+set extra_float_digits = 0;
+
 select t,strict_word_similarity('Baykal',t) as sml from test_trgm2 where 'Baykal' <<% t order by sml desc, t;
 select t,strict_word_similarity('Kabankala',t) as sml from test_trgm2 where 'Kabankala' <<% t order by sml desc, t;
 select t,strict_word_similarity('Baykal',t) as sml from test_trgm2 where t %>> 'Baykal' order by sml desc, t;

contrib/pg_trgm/sql/pg_trgm.sql (+3)
@@ -9,6 +9,9 @@ WHERE opc.oid >= 16384 AND NOT amvalidate(opc.oid);
 --standard_conforming_string is off
 set standard_conforming_strings=on;
 
+-- reduce noise
+set extra_float_digits = 0;
+
 select show_trgm('');
 select show_trgm('(*&^$@%@');
 select show_trgm('a b c');

contrib/pg_trgm/sql/pg_word_trgm.sql (+3)
@@ -2,6 +2,9 @@ CREATE TABLE test_trgm2(t text COLLATE "C");
 
 \copy test_trgm2 from 'data/trgm2.data'
 
+-- reduce noise
+set extra_float_digits = 0;
+
 select t,word_similarity('Baykal',t) as sml from test_trgm2 where 'Baykal' <% t order by sml desc, t;
 select t,word_similarity('Kabankala',t) as sml from test_trgm2 where 'Kabankala' <% t order by sml desc, t;
 select t,word_similarity('Baykal',t) as sml from test_trgm2 where t %> 'Baykal' order by sml desc, t;

contrib/seg/expected/seg.out (+3 -3)
@@ -1127,7 +1127,7 @@ FROM test_seg WHERE s @> '11.2..11.3' OR s IS NULL ORDER BY s;
 2.1 | 6.95 | 11.8
 2.3 | Infinity | Infinity
 2.3 | Infinity | Infinity
-2.4 | 6.85 | 11.3
+2.4 | 6.8500004 | 11.3
 2.5 | 7 | 11.5
 2.5 | 7.15 | 11.8
 2.6 | Infinity | Infinity
@@ -1155,7 +1155,7 @@ FROM test_seg WHERE s @> '11.2..11.3' OR s IS NULL ORDER BY s;
 4.5 | 59.75 | 115
 4.7 | 8.25 | 11.8
 4.8 | 8.15 | 11.5
-4.8 | 8.2 | 11.6
+4.8 | 8.200001 | 11.6
 4.8 | 8.65 | 12.5
 4.8 | Infinity | Infinity
 4.9 | 8.45 | 12
@@ -1244,7 +1244,7 @@ FROM test_seg WHERE s @> '11.2..11.3' OR s IS NULL ORDER BY s;
 9 | 10.5 | 12
 9 | Infinity | Infinity
 9.2 | 10.6 | 12
-9.4 | 10.8 | 12.2
+9.4 | 10.799999 | 12.2
 9.5 | 10.75 | 12
 9.5 | 10.85 | 12.2
 9.5 | Infinity | Infinity
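
The three changed rows in seg.out can be reproduced with single-precision arithmetic. This is a sketch under a stated assumption: the diff does not name the columns, so the code assumes the middle column is the midpoint of the outer two, computed by adding them as float4 and halving. The helper names are hypothetical, and the shortest-output search is a naive printf loop rather than Ryu.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Naive stand-in for shortest float4 output: the first precision (1..9)
 * whose text parses back to exactly the same float. */
static void
float4_shortest(float v, char *buf, size_t len)
{
	assert(buf != NULL && len >= 32);
	for (int digits = 1; digits <= 9; digits++)
	{
		snprintf(buf, len, "%.*g", digits, (double) v);
		if (strtof(buf, NULL) == v)
			return;
	}
}

/* Assumed midpoint computation: add in single precision, then halve.
 * The volatile store forces the sum to be rounded to float4 even on
 * platforms that evaluate float expressions in a wider type. */
static float
float4_midpoint(float lower, float upper)
{
	volatile float sum = lower + upper;

	return sum / 2.0f;
}

/* Does v print differently under the old 6-digit rounding than under
 * shortest round-trip output? */
static int
float4_output_changed(float v)
{
	char		rounded[64];
	char		shortest[64];

	snprintf(rounded, sizeof(rounded), "%.6g", (double) v);
	float4_shortest(v, shortest, sizeof(shortest));
	return strcmp(rounded, shortest) != 0;
}
```

Under that assumption, float4_midpoint(2.4f, 11.3f) prints as 6.8500004 in shortest form and as 6.85 with six digits, matching the first changed row. A midpoint that is exact in float4, such as float4_midpoint(2.5f, 11.5f), prints as 7 either way.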
