Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit 20b9fa3

Browse files
committed
Build inherited extended stats on partitioned tables
Commit 859b300 disabled building of extended stats for inheritance trees, to prevent updating the same catalog row twice. While that resolved the issue, it also means there are no extended stats for declaratively partitioned tables, because there are no data in the non-leaf relations. That also means declaratively partitioned tables were not affected by the issue 859b300 addressed, which means this is a regression affecting queries that calculate estimates for the whole inheritance tree as a whole (which includes e.g. GROUP BY queries). But because partitioned tables are empty, we can invert the condition and build statistics only for the case with inheritance, without losing anything. And we can consider them when calculating estimates. It may be necessary to run ANALYZE on partitioned tables, to collect proper statistics. For declarative partitioning there should no prior statistics, and it might take time before autoanalyze is triggered. For tables partitioned by inheritance the statistics may include data from child relations (if built 859b300), contradicting the current code. Report and patch by Justin Pryzby, minor fixes and cleanup by me. Backpatch all the way back to PostgreSQL 10, where extended statistics were introduced (same as 859b300). Author: Justin Pryzby Reported-by: Justin Pryzby Backpatch-through: 10 Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
1 parent 74527c3 commit 20b9fa3

File tree

6 files changed

+77
-19
lines changed

6 files changed

+77
-19
lines changed

src/backend/commands/analyze.c

+17-1
Original file line numberDiff line numberDiff line change
@@ -549,6 +549,7 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
549549
{
550550
MemoryContext col_context,
551551
old_context;
552+
bool build_ext_stats;
552553

553554
pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
554555
PROGRESS_ANALYZE_PHASE_COMPUTE_STATS);
@@ -612,13 +613,28 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
612613
thisdata->attr_cnt, thisdata->vacattrstats);
613614
}
614615

616+
/*
617+
* Should we build extended statistics for this relation?
618+
*
619+
* The extended statistics catalog does not include an inheritance
620+
* flag, so we can't store statistics built both with and without
621+
* data from child relations. We can store just one set of statistics
622+
* per relation. For plain relations that's fine, but for inheritance
623+
* trees we have to pick whether to store statistics for just the
624+
* one relation or the whole tree. For plain inheritance we store
625+
* the (!inh) version, mostly for backwards compatibility reasons.
626+
* For partitioned tables that's pointless (the non-leaf tables are
627+
* always empty), so we store stats representing the whole tree.
628+
*/
629+
build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh);
630+
615631
/*
616632
* Build extended statistics (if there are any).
617633
*
618634
* For now we only build extended statistics on individual relations,
619635
* not for relations representing inheritance trees.
620636
*/
621-
if (!inh)
637+
if (build_ext_stats)
622638
BuildRelationExtStatistics(onerel, totalrows, numrows, rows,
623639
attr_cnt, vacattrstats);
624640
}

src/backend/statistics/dependencies.c

+6-3
Original file line numberDiff line numberDiff line change
@@ -1418,10 +1418,13 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
14181418
int unique_exprs_cnt;
14191419

14201420
/*
1421-
* When dealing with inheritance trees, ignore extended stats (which were
1422-
* built without data from child rels, and thus do not represent them).
1421+
* When dealing with regular inheritance trees, ignore extended stats
1422+
* (which were built without data from child rels, and thus do not
1423+
* represent them). For partitioned tables data there's no data in the
1424+
* non-leaf relations, so we build stats only for the inheritance tree.
1425+
* So for partitioned tables we do consider extended stats.
14231426
*/
1424-
if (rte->inh)
1427+
if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
14251428
return 1.0;
14261429

14271430
/* check if there's any stats that might be useful for us. */

src/backend/statistics/extended_stats.c

+6-3
Original file line numberDiff line numberDiff line change
@@ -1698,10 +1698,13 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
16981698
RangeTblEntry *rte = planner_rt_fetch(rel->relid, root);
16991699

17001700
/*
1701-
* When dealing with inheritance trees, ignore extended stats (which were
1702-
* built without data from child rels, and thus do not represent them).
1701+
* When dealing with regular inheritance trees, ignore extended stats
1702+
* (which were built without data from child rels, and thus do not
1703+
* represent them). For partitioned tables data there's no data in the
1704+
* non-leaf relations, so we build stats only for the inheritance tree.
1705+
* So for partitioned tables we do consider extended stats.
17031706
*/
1704-
if (rte->inh)
1707+
if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
17051708
return sel;
17061709

17071710
/* check if there's any stats that might be useful for us. */

src/backend/utils/adt/selfuncs.c

+19-12
Original file line numberDiff line numberDiff line change
@@ -3913,19 +3913,23 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
39133913
Oid statOid = InvalidOid;
39143914
MVNDistinct *stats;
39153915
StatisticExtInfo *matched_info = NULL;
3916-
RangeTblEntry *rte = planner_rt_fetch(rel->relid, root);
3917-
3918-
/*
3919-
* When dealing with inheritance trees, ignore extended stats (which were
3920-
* built without data from child rels, and thus do not represent them).
3921-
*/
3922-
if (rte->inh)
3923-
return false;
3916+
RangeTblEntry *rte;
39243917

39253918
/* bail out immediately if the table has no extended statistics */
39263919
if (!rel->statlist)
39273920
return false;
39283921

3922+
/*
3923+
* When dealing with regular inheritance trees, ignore extended stats
3924+
* (which were built without data from child rels, and thus do not
3925+
* represent them). For partitioned tables data there's no data in the
3926+
* non-leaf relations, so we build stats only for the inheritance tree.
3927+
* So for partitioned tables we do consider extended stats.
3928+
*/
3929+
rte = planner_rt_fetch(rel->relid, root);
3930+
if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
3931+
return false;
3932+
39293933
/* look for the ndistinct statistics object matching the most vars */
39303934
nmatches_vars = 0; /* we require at least two matches */
39313935
nmatches_exprs = 0;
@@ -5242,11 +5246,14 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
52425246
break;
52435247

52445248
/*
5245-
* When dealing with inheritance trees, ignore extended stats (which
5246-
* were built without data from child rels, and so do not represent
5247-
* them).
5249+
* When dealing with regular inheritance trees, ignore extended
5250+
* stats (which were built without data from child rels, and thus
5251+
* do not represent them). For partitioned tables data there's no
5252+
* data in the non-leaf relations, so we build stats only for the
5253+
* inheritance tree. So for partitioned tables we do consider
5254+
* extended stats.
52485255
*/
5249-
if (rte->inh)
5256+
if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE)
52505257
break;
52515258

52525259
/* skip stats without per-expression stats */

src/test/regress/expected/stats_ext.out

+19
Original file line numberDiff line numberDiff line change
@@ -217,6 +217,25 @@ SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinh* WHERE a = 0 AND b
217217
(1 row)
218218

219219
DROP TABLE stxdinh, stxdinh1, stxdinh2;
220+
-- Ensure inherited stats ARE applied to inherited query in partitioned table
221+
CREATE TABLE stxdinp(i int, a int, b int) PARTITION BY RANGE (i);
222+
CREATE TABLE stxdinp1 PARTITION OF stxdinp FOR VALUES FROM (1) TO (100);
223+
INSERT INTO stxdinp SELECT 1, a/100, a/100 FROM generate_series(1, 999) a;
224+
CREATE STATISTICS stxdinp ON a, b FROM stxdinp;
225+
VACUUM ANALYZE stxdinp; -- partitions are processed recursively
226+
SELECT 1 FROM pg_statistic_ext WHERE stxrelid = 'stxdinp'::regclass;
227+
?column?
228+
----------
229+
1
230+
(1 row)
231+
232+
SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinp GROUP BY 1, 2');
233+
estimated | actual
234+
-----------+--------
235+
10 | 10
236+
(1 row)
237+
238+
DROP TABLE stxdinp;
220239
-- basic test for statistics on expressions
221240
CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
222241
-- expression stats may be built on a single expression column

src/test/regress/sql/stats_ext.sql

+10
Original file line numberDiff line numberDiff line change
@@ -134,6 +134,16 @@ SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinh* GROUP BY 1, 2');
134134
SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinh* WHERE a = 0 AND b = 0');
135135
DROP TABLE stxdinh, stxdinh1, stxdinh2;
136136

137+
-- Ensure inherited stats ARE applied to inherited query in partitioned table
138+
CREATE TABLE stxdinp(i int, a int, b int) PARTITION BY RANGE (i);
139+
CREATE TABLE stxdinp1 PARTITION OF stxdinp FOR VALUES FROM (1) TO (100);
140+
INSERT INTO stxdinp SELECT 1, a/100, a/100 FROM generate_series(1, 999) a;
141+
CREATE STATISTICS stxdinp ON a, b FROM stxdinp;
142+
VACUUM ANALYZE stxdinp; -- partitions are processed recursively
143+
SELECT 1 FROM pg_statistic_ext WHERE stxrelid = 'stxdinp'::regclass;
144+
SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinp GROUP BY 1, 2');
145+
DROP TABLE stxdinp;
146+
137147
-- basic test for statistics on expressions
138148
CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ);
139149

0 commit comments

Comments
 (0)