Commit 620b49a

hash: Increase the number of possible overflow bitmaps by 8x.

Per a report from AP, it's not that hard to exhaust the supply of bitmap pages if you create a table with a hash index and then insert a few billion rows - and then you start getting errors when you try to insert additional rows. In the particular case reported by AP, there's another fix that we can make to improve recycling of overflow pages, which is another way to avoid the error, but there may be other cases where this problem happens and that fix won't help. So let's buy ourselves as much headroom as we can without rearchitecting anything.

The comments claim that the old limit was 64GB, but it was really only 32GB, because we didn't use all the bits in the page for bitmap bits - only the largest power of 2 that could fit after deducting space for the page header and so forth. Thus, we have 4kB per page for bitmap bits, not 8kB. The new limit is thus actually 8 times the old *real* limit but only 4 times the old *purported* limit.

Since this breaks on-disk compatibility, bump HASH_VERSION. We've already done this earlier in this release cycle, so this doesn't cause any incremental inconvenience for people using pg_upgrade from releases prior to v10. However, users who use pg_upgrade to reach 10beta3 or later from 10beta2 or earlier will need to REINDEX any hash indexes again.

Amit Kapila and Robert Haas

Discussion: http://postgr.es/m/20170704105728.mwb72jebfmok2nm2@zip.com.au

1 parent c30f177 commit 620b49a
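
The limits quoted in the commit message can be checked with a little arithmetic. The sketch below is not taken from the PostgreSQL source; the function names are hypothetical, and it assumes that each bit of a bitmap page tracks exactly one overflow page and that the block size is 8kB.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of overflow space that a set of bitmap pages can track.
 * Assumption (not from the commit): one bitmap bit per overflow page.
 * The function names here are hypothetical.
 */
uint64_t
overflow_limit_bytes(uint64_t max_bitmaps,
                     uint64_t bitmap_bytes_per_page,
                     uint64_t block_size)
{
    assert(max_bitmaps > 0 && bitmap_bytes_per_page > 0 && block_size > 0);

    uint64_t bits_per_bitmap_page = bitmap_bytes_per_page * 8;
    uint64_t overflow_pages = max_bitmaps * bits_per_bitmap_page;

    return overflow_pages * block_size;
}

/* Convert a byte count to whole gibibytes (2^30 bytes). */
uint64_t
to_gib(uint64_t bytes)
{
    return bytes >> 30;
}
```

Under these assumptions, 128 bitmaps with 8kB of bitmap bits give the purported 64GB, 128 bitmaps with 4kB give the real 32GB, and 1024 bitmaps with 4kB give the new 256GB.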

5 files changed, +19 -15 lines changed

contrib/pageinspect/expected/hash.out

+3 -3
@@ -43,9 +43,9 @@ ERROR: invalid overflow block number 5
 SELECT magic, version, ntuples, bsize, bmsize, bmshift, maxbucket, highmask,
 lowmask, ovflpoint, firstfree, nmaps, procid, spares, mapp FROM
 hash_metapage_info(get_raw_page('test_hash_a_idx', 0));
46-
-[ RECORD 1 ]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
46+
-[ RECORD 1 ]-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------
 magic | 105121344
-version | 3
+version | 4
 ntuples | 1
 bsize | 8152
 bmsize | 4096
@@ -58,7 +58,7 @@ firstfree | 0
 nmaps | 1
 procid | 450
spares | {0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
61-
mapp | {5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
61+
mapp | {5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
 
 SELECT magic, version, ntuples, bsize, bmsize, bmshift, maxbucket, highmask,
 lowmask, ovflpoint, firstfree, nmaps, procid, spares, mapp FROM
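
The expected output above shows bsize | 8152 next to bmsize | 4096, which matches the commit message's point that only the largest power of 2 that fits is used for bitmap bits. The following is a minimal sketch of that relationship, not the code PostgreSQL uses. Both function names are hypothetical, and the shift value is derived here rather than shown in the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Largest power of two that is <= usable_bytes (requires usable_bytes >= 1). */
uint32_t
largest_pow2_at_most(uint32_t usable_bytes)
{
    assert(usable_bytes >= 1);

    uint32_t p = 1;

    /* Double p while the doubled value would still fit. */
    while (p <= usable_bytes / 2)
        p *= 2;
    return p;
}

/*
 * log2 of the number of bits held by a bitmap of bitmap_bytes bytes.
 * bitmap_bytes must be a power of two.
 */
uint32_t
bitmap_shift(uint32_t bitmap_bytes)
{
    assert(bitmap_bytes >= 1 && (bitmap_bytes & (bitmap_bytes - 1)) == 0);

    uint32_t shift = 3;         /* 8 bits per byte = 2^3 */

    while (bitmap_bytes > 1)
    {
        bitmap_bytes >>= 1;
        shift++;
    }
    return shift;
}
```

With 8152 usable bytes, largest_pow2_at_most returns 4096, so about half of the page carries bitmap bits. bitmap_shift(4096) returns 15, that is, 32768 bits per bitmap page.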

contrib/pgstattuple/expected/pgstattuple.out

+2 -2
@@ -134,7 +134,7 @@ create index test_hashidx on test using hash (b);
 select * from pgstathashindex('test_hashidx');
 version | bucket_pages | overflow_pages | bitmap_pages | unused_pages | live_items | dead_items | free_percent
 ---------+--------------+----------------+--------------+--------------+------------+------------+--------------
-3 | 4 | 0 | 1 | 0 | 0 | 0 | 100
+4 | 4 | 0 | 1 | 0 | 0 | 0 | 100
 (1 row)
 
 -- these should error with the wrong type
@@ -235,7 +235,7 @@ select pgstatindex('test_partition_idx');
 select pgstathashindex('test_partition_hash_idx');
 pgstathashindex
 ---------------------
-(3,8,0,1,0,0,0,100)
+(4,8,0,1,0,0,0,100)
 (1 row)
 
 drop table test_partitioned;

doc/src/sgml/pageinspect.sgml

+9 -4
@@ -687,8 +687,13 @@ test=# SELECT * FROM hash_bitmap_info('con_hash_index', 2052);
 <function>hash_metapage_info</function> returns information stored
 in meta page of a <acronym>HASH</acronym> index. For example:
 <screen>
-test=# SELECT * FROM hash_metapage_info(get_raw_page('con_hash_index', 0));
691-
-[ RECORD 1 ]-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+test=# SELECT magic, version, ntuples, ffactor, bsize, bmsize, bmshift,
+test-# maxbucket, highmask, lowmask, ovflpoint, firstfree, nmaps, procid,
+test-# regexp_replace(spares::text, '(,0)*}', '}') as spares,
+test-# regexp_replace(mapp::text, '(,0)*}', '}') as mapp
+test-# FROM hash_metapage_info(get_raw_page('con_hash_index', 0));
+-[ RECORD 1 ]-------------------------------------------------------------------------------
+spares | {0,0,0,0,0,0,1,1,1,1,1,1,1,1,3,4,4,4,45,55,58,59,508,567,628,704,1193,1202,1204}
 magic | 105121344
 version | 3
 ntuples | 500500
@@ -703,8 +708,8 @@ ovflpoint | 28
 firstfree | 1204
 nmaps | 1
 procid | 450
706-
spares | {0,0,0,0,0,0,1,1,1,1,1,1,1,1,3,4,4,4,45,55,58,59,508,567,628,704,1193,1202,1204,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
707-
mapp | {65,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
+spares | {0,0,0,0,0,0,1,1,1,1,1,1,1,1,3,4,4,4,45,55,58,59,508,567,628,704,1193,1202,1204}
+mapp | {65}
 </screen>
 </para>
 </listitem>

doc/src/sgml/pgstattuple.sgml

+1 -1
@@ -368,7 +368,7 @@ pending_tuples | 0
 <programlisting>
 test=&gt; select * from pgstathashindex('con_hash_index');
 -[ RECORD 1 ]--+-----------------
-version | 2
+version | 4
 bucket_pages | 33081
 overflow_pages | 0
 bitmap_pages | 1

src/include/access/hash.h

+4 -5
@@ -158,8 +158,7 @@ typedef HashScanOpaqueData *HashScanOpaque;
 #define HASH_METAPAGE 0 /* metapage is always block 0 */
 
 #define HASH_MAGIC 0x6440640
-#define HASH_VERSION 3 /* 3 signifies multi-phased bucket allocation
- * to reduce doubling */
+#define HASH_VERSION 4
 
 /*
  * spares[] holds the number of overflow pages currently allocated at or
@@ -182,10 +181,10 @@ typedef HashScanOpaqueData *HashScanOpaque;
  * after HASH_SPLITPOINT_GROUPS_WITH_ONE_PHASE).
  *
  * There is no particular upper limit on the size of mapp[], other than
- * needing to fit into the metapage. (With 8K block size, 128 bitmaps
- * limit us to 64 GB of overflow space...)
+ * needing to fit into the metapage. (With 8K block size, 1024 bitmaps
+ * limit us to 256 GB of overflow space...)
  */
-#define HASH_MAX_BITMAPS 128
+#define HASH_MAX_BITMAPS 1024
 
 #define HASH_SPLITPOINT_PHASE_BITS 2
 #define HASH_SPLITPOINT_PHASES_PER_GRP (1 << HASH_SPLITPOINT_PHASE_BITS)
