Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit 2e668c5

Browse files
committed
Avoid crash during EvalPlanQual recheck of an inner indexscan.
Commit 09529a7 changed nodeIndexscan.c and nodeIndexonlyscan.c to postpone initialization of the indexscan proper until the first tuple fetch. It overlooked the question of mark/restore behavior, which means that if some caller attempts to mark the scan before the first tuple fetch, you get a null pointer dereference. The only existing user of mark/restore is nodeMergejoin.c, which (somewhat accidentally) will never attempt to set a mark before the first inner tuple unless the inner child node is a Material node. Hence the case can't arise normally, so it seems sufficient to document the assumption at both ends. However, during an EvalPlanQual recheck, ExecScanFetch doesn't call IndexNext but just returns the jammed-in test tuple. Therefore, if we're doing a recheck in a plan tree with a mergejoin with inner indexscan, it's possible to reach ExecIndexMarkPos with iss_ScanDesc still null, as reported by Guo Xiang Tan in bug #15032. Really, when there's a test tuple supplied during an EPQ recheck, touching the index at all is the wrong thing: rather, the behavior of mark/restore ought to amount to saving and restoring the es_epqScanDone flag. We can avoid finding a place to actually save the flag, for the moment, because given the assumption that no caller will set a mark before fetching a tuple, es_epqScanDone must always be set by the time we try to mark. So the actual behavior change required is just to not reach the index access if a test tuple is supplied. The set of plan node types that need to consider this issue are those that support EPQ test tuples (i.e., call ExecScan()) and also support mark/restore; which is to say, IndexScan, IndexOnlyScan, and perhaps CustomScan. It's tempting to try to fix the problem in one place by teaching ExecMarkPos() itself about EPQ; but ExecMarkPos supports some plan types that aren't Scans, and also it seems risky to make assumptions about what a CustomScan wants to do here. Also, the most likely future change here is to decide that we do need to support marks placed before the first tuple, which would require additional work in IndexScan and IndexOnlyScan in any case. Hence, fix the EPQ issue in nodeIndexscan.c and nodeIndexonlyscan.c, accepting the small amount of code duplicated thereby, and leave it to CustomScan providers to fix this bug if they have it. Back-patch to v10 where commit 09529a7 came in. In earlier branches, the index_markpos() call is a waste of cycles when EPQ is active, but no more than that, so it doesn't seem appropriate to back-patch further. Discussion: https://postgr.es/m/20180126074932.3098.97815@wrigleys.postgresql.org
1 parent ba8c2df commit 2e668c5

File tree

5 files changed

+144
-1
lines changed

5 files changed

+144
-1
lines changed

src/backend/executor/nodeIndexonlyscan.c

+45
Original file line numberDiff line numberDiff line change
@@ -421,11 +421,39 @@ ExecEndIndexOnlyScan(IndexOnlyScanState *node)
421421

422422
/* ----------------------------------------------------------------
423423
* ExecIndexOnlyMarkPos
424+
*
425+
* Note: we assume that no caller attempts to set a mark before having read
426+
* at least one tuple. Otherwise, ioss_ScanDesc might still be NULL.
424427
* ----------------------------------------------------------------
425428
*/
426429
void
427430
ExecIndexOnlyMarkPos(IndexOnlyScanState *node)
428431
{
432+
EState *estate = node->ss.ps.state;
433+
434+
if (estate->es_epqTuple != NULL)
435+
{
436+
/*
437+
* We are inside an EvalPlanQual recheck. If a test tuple exists for
438+
* this relation, then we shouldn't access the index at all. We would
439+
* instead need to save, and later restore, the state of the
440+
* es_epqScanDone flag, so that re-fetching the test tuple is
441+
* possible. However, given the assumption that no caller sets a mark
442+
* at the start of the scan, we can only get here with es_epqScanDone
443+
* already set, and so no state need be saved.
444+
*/
445+
Index scanrelid = ((Scan *) node->ss.ps.plan)->scanrelid;
446+
447+
Assert(scanrelid > 0);
448+
if (estate->es_epqTupleSet[scanrelid - 1])
449+
{
450+
/* Verify the claim above */
451+
if (!estate->es_epqScanDone[scanrelid - 1])
452+
elog(ERROR, "unexpected ExecIndexOnlyMarkPos call in EPQ recheck");
453+
return;
454+
}
455+
}
456+
429457
index_markpos(node->ioss_ScanDesc);
430458
}
431459

@@ -436,6 +464,23 @@ ExecIndexOnlyMarkPos(IndexOnlyScanState *node)
436464
void
437465
ExecIndexOnlyRestrPos(IndexOnlyScanState *node)
438466
{
467+
EState *estate = node->ss.ps.state;
468+
469+
if (estate->es_epqTuple != NULL)
470+
{
471+
/* See comments in ExecIndexOnlyMarkPos */
472+
Index scanrelid = ((Scan *) node->ss.ps.plan)->scanrelid;
473+
474+
Assert(scanrelid > 0);
475+
if (estate->es_epqTupleSet[scanrelid - 1])
476+
{
477+
/* Verify the claim above */
478+
if (!estate->es_epqScanDone[scanrelid - 1])
479+
elog(ERROR, "unexpected ExecIndexOnlyRestrPos call in EPQ recheck");
480+
return;
481+
}
482+
}
483+
439484
index_restrpos(node->ioss_ScanDesc);
440485
}
441486

src/backend/executor/nodeIndexscan.c

+45
Original file line numberDiff line numberDiff line change
@@ -849,11 +849,39 @@ ExecEndIndexScan(IndexScanState *node)
849849

850850
/* ----------------------------------------------------------------
851851
* ExecIndexMarkPos
852+
*
853+
* Note: we assume that no caller attempts to set a mark before having read
854+
* at least one tuple. Otherwise, iss_ScanDesc might still be NULL.
852855
* ----------------------------------------------------------------
853856
*/
854857
void
855858
ExecIndexMarkPos(IndexScanState *node)
856859
{
860+
EState *estate = node->ss.ps.state;
861+
862+
if (estate->es_epqTuple != NULL)
863+
{
864+
/*
865+
* We are inside an EvalPlanQual recheck. If a test tuple exists for
866+
* this relation, then we shouldn't access the index at all. We would
867+
* instead need to save, and later restore, the state of the
868+
* es_epqScanDone flag, so that re-fetching the test tuple is
869+
* possible. However, given the assumption that no caller sets a mark
870+
* at the start of the scan, we can only get here with es_epqScanDone
871+
* already set, and so no state need be saved.
872+
*/
873+
Index scanrelid = ((Scan *) node->ss.ps.plan)->scanrelid;
874+
875+
Assert(scanrelid > 0);
876+
if (estate->es_epqTupleSet[scanrelid - 1])
877+
{
878+
/* Verify the claim above */
879+
if (!estate->es_epqScanDone[scanrelid - 1])
880+
elog(ERROR, "unexpected ExecIndexMarkPos call in EPQ recheck");
881+
return;
882+
}
883+
}
884+
857885
index_markpos(node->iss_ScanDesc);
858886
}
859887

@@ -864,6 +892,23 @@ ExecIndexMarkPos(IndexScanState *node)
864892
void
865893
ExecIndexRestrPos(IndexScanState *node)
866894
{
895+
EState *estate = node->ss.ps.state;
896+
897+
if (estate->es_epqTuple != NULL)
898+
{
899+
/* See comments in ExecIndexMarkPos */
900+
Index scanrelid = ((Scan *) node->ss.ps.plan)->scanrelid;
901+
902+
Assert(scanrelid > 0);
903+
if (estate->es_epqTupleSet[scanrelid - 1])
904+
{
905+
/* Verify the claim above */
906+
if (!estate->es_epqScanDone[scanrelid - 1])
907+
elog(ERROR, "unexpected ExecIndexRestrPos call in EPQ recheck");
908+
return;
909+
}
910+
}
911+
867912
index_restrpos(node->iss_ScanDesc);
868913
}
869914

src/backend/executor/nodeMergejoin.c

+4
Original file line numberDiff line numberDiff line change
@@ -1502,6 +1502,10 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
15021502
*
15031503
* Currently, only Material wants the extra MARKs, and it will be helpful
15041504
* only if eflags doesn't specify REWIND.
1505+
*
1506+
* Note that for IndexScan and IndexOnlyScan, it is *necessary* that we
1507+
* not set mj_ExtraMarks; otherwise we might attempt to set a mark before
1508+
* the first inner tuple, which they do not support.
15051509
*/
15061510
if (IsA(innerPlan(node), Material) &&
15071511
(eflags & EXEC_FLAG_REWIND) == 0 &&

src/test/isolation/expected/eval-plan-qual.out

+33
Original file line numberDiff line numberDiff line change
@@ -184,3 +184,36 @@ step readwcte: <... completed>
184184
id value
185185

186186
1 tableAValue2
187+
188+
starting permutation: wrjt selectjoinforupdate c2 c1
189+
step wrjt: UPDATE jointest SET data = 42 WHERE id = 7;
190+
step selectjoinforupdate:
191+
set enable_nestloop to 0;
192+
set enable_hashjoin to 0;
193+
set enable_seqscan to 0;
194+
explain (costs off)
195+
select * from jointest a join jointest b on a.id=b.id for update;
196+
select * from jointest a join jointest b on a.id=b.id for update;
197+
<waiting ...>
198+
step c2: COMMIT;
199+
step selectjoinforupdate: <... completed>
200+
QUERY PLAN
201+
202+
LockRows
203+
-> Merge Join
204+
Merge Cond: (a.id = b.id)
205+
-> Index Scan using jointest_id_idx on jointest a
206+
-> Index Scan using jointest_id_idx on jointest b
207+
id data id data
208+
209+
1 0 1 0
210+
2 0 2 0
211+
3 0 3 0
212+
4 0 4 0
213+
5 0 5 0
214+
6 0 6 0
215+
7 42 7 42
216+
8 0 8 0
217+
9 0 9 0
218+
10 0 10 0
219+
step c1: COMMIT;

src/test/isolation/specs/eval-plan-qual.spec

+17-1
Original file line numberDiff line numberDiff line change
@@ -21,13 +21,16 @@ setup
2121
CREATE TABLE table_b (id integer, value text);
2222
INSERT INTO table_a VALUES (1, 'tableAValue');
2323
INSERT INTO table_b VALUES (1, 'tableBValue');
24+
25+
CREATE TABLE jointest AS SELECT generate_series(1,10) AS id, 0 AS data;
26+
CREATE INDEX ON jointest(id);
2427
}
2528

2629
teardown
2730
{
2831
DROP TABLE accounts;
2932
DROP TABLE p CASCADE;
30-
DROP TABLE table_a, table_b;
33+
DROP TABLE table_a, table_b, jointest;
3134
}
3235

3336
session "s1"
@@ -78,6 +81,17 @@ step "updateforss" {
7881
UPDATE table_b SET value = 'newTableBValue' WHERE id = 1;
7982
}
8083

84+
# these tests exercise mark/restore during EPQ recheck, cf bug #15032
85+
86+
step "selectjoinforupdate" {
87+
set enable_nestloop to 0;
88+
set enable_hashjoin to 0;
89+
set enable_seqscan to 0;
90+
explain (costs off)
91+
select * from jointest a join jointest b on a.id=b.id for update;
92+
select * from jointest a join jointest b on a.id=b.id for update;
93+
}
94+
8195

8296
session "s2"
8397
setup { BEGIN ISOLATION LEVEL READ COMMITTED; }
@@ -104,6 +118,7 @@ step "readforss" {
104118
WHERE ta.id = 1 FOR UPDATE OF ta;
105119
}
106120
step "wrtwcte" { UPDATE table_a SET value = 'tableAValue2' WHERE id = 1; }
121+
step "wrjt" { UPDATE jointest SET data = 42 WHERE id = 7; }
107122
step "c2" { COMMIT; }
108123

109124
session "s3"
@@ -135,3 +150,4 @@ permutation "wx2" "partiallock" "c2" "c1" "read"
135150
permutation "wx2" "lockwithvalues" "c2" "c1" "read"
136151
permutation "updateforss" "readforss" "c1" "c2"
137152
permutation "wrtwcte" "readwcte" "c1" "c2"
153+
permutation "wrjt" "selectjoinforupdate" "c2" "c1"

0 commit comments

Comments
 (0)