Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit aa087ec

Browse files
committed
Add pg_stats_ext view for extended statistics
Regular per-column statistics are stored in pg_statistics catalog, which is however rather difficult to read, so we also have pg_stats view with a human-reablable version of the data. For extended statistic the catalog was fairly easy to read, so we did not have such human-readable view so far. Commit 9b6babfa2d however did split the catalog into two, which makes querying harder. Furthermore, we want to show the multi-column MCV list in a way similar to per-column stats (and not as a bytea value). This commit introduces pg_stats_ext view, joining the two catalogs and massaging the data to produce human-readable output similar to pg_stats. It also considers RLS and access privileges - the data is shown only when the user has access to all columns the extended statistic is defined on. Bumped CATVERSION due to adding new system view. Author: Dean Rasheed, with improvements by me Reviewed-by: Dean Rasheed, John Naylor Discussion: https://postgr.es/m/CAEZATCUhT9rt7Ui%3DVdx4N%3D%3DVV5XOK5dsXfnGgVOz_JhAicB%3DZA%40mail.gmail.com
1 parent 6cbfb78 commit aa087ec

File tree

4 files changed

+263
-1
lines changed

4 files changed

+263
-1
lines changed

doc/src/sgml/catalogs.sgml

+192
Original file line numberDiff line numberDiff line change
@@ -6373,6 +6373,28 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
63736373
about those tables that are readable by the current user.
63746374
</para>
63756375

6376+
<para>
6377+
<structname>pg_statistic</structname> should not be readable by the
6378+
public, since even statistical information about a table's contents
6379+
might be considered sensitive. (Example: minimum and maximum values
6380+
of a salary column might be quite interesting.)
6381+
<link linkend="view-pg-stats"><structname>pg_stats</structname></link>
6382+
is a publicly readable view on
6383+
<structname>pg_statistic</structname> that only exposes information
6384+
about those tables that are readable by the current user.
6385+
</para>
6386+
6387+
<para>
6388+
Similarly, <structname>pg_statistic_ext_data</structname> should not be
6389+
readable by the public, since the contents might be considered sensitive.
6390+
(Example: most common combination of values in columns might be quite
6391+
interesting.)
6392+
<link linkend="view-pg-stats-ext"><structname>pg_stats_ext</structname></link>
6393+
is a publicly readable view on <structname>pg_statistic_ext_data</structname>
6394+
(after joining with <structname>pg_statistic_ext</structname>) that only exposes
6395+
information about those tables and columns that are readable by the current user.
6396+
</para>
6397+
63766398
<table>
63776399
<title><structname>pg_statistic</structname> Columns</title>
63786400

@@ -8343,6 +8365,11 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
83438365
<entry>planner statistics</entry>
83448366
</row>
83458367

8368+
<row>
8369+
<entry><link linkend="view-pg-stats-ext"><structname>pg_stats_ext</structname></link></entry>
8370+
<entry>extended planner statistics</entry>
8371+
</row>
8372+
83468373
<row>
83478374
<entry><link linkend="view-pg-tables"><structname>pg_tables</structname></link></entry>
83488375
<entry>tables</entry>
@@ -10922,6 +10949,171 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
1092210949

1092310950
</sect1>
1092410951

10952+
<sect1 id="view-pg-stats-ext">
10953+
<title><structname>pg_stats_ext</structname></title>
10954+
10955+
<indexterm zone="view-pg-stats-ext">
10956+
<primary>pg_stats_ext</primary>
10957+
</indexterm>
10958+
10959+
<para>
10960+
The view <structname>pg_stats_ext</structname> provides access to
10961+
the information stored in the <link
10962+
linkend="catalog-pg-statistic-ext"><structname>pg_statistic_ext</structname></link>
10963+
and <link linkend="catalog-pg-statistic-ext-data"><structname>pg_statistic_ext_data</structname></link>
10964+
catalogs. This view allows access only to rows of
10965+
<structname>pg_statistic_ext</structname> and <structname>pg_statistic_ext_data</structname>
10966+
that correspond to tables the user has permission to read, and therefore
10967+
it is safe to allow public read access to this view.
10968+
</para>
10969+
10970+
<para>
10971+
<structname>pg_stats_ext</structname> is also designed to present the
10972+
information in a more readable format than the underlying catalog
10973+
&mdash; at the cost that its schema must be extended whenever new types
10974+
of extended statistics are added to <structname>pg_statistic_ext</structname>.
10975+
</para>
10976+
10977+
<table>
10978+
<title><structname>pg_stats_ext</structname> Columns</title>
10979+
10980+
<tgroup cols="4">
10981+
<thead>
10982+
<row>
10983+
<entry>Name</entry>
10984+
<entry>Type</entry>
10985+
<entry>References</entry>
10986+
<entry>Description</entry>
10987+
</row>
10988+
</thead>
10989+
<tbody>
10990+
<row>
10991+
<entry><structfield>schemaname</structfield></entry>
10992+
<entry><type>name</type></entry>
10993+
<entry><literal><link linkend="catalog-pg-namespace"><structname>pg_namespace</structname></link>.nspname</literal></entry>
10994+
<entry>Name of schema containing table</entry>
10995+
</row>
10996+
10997+
<row>
10998+
<entry><structfield>tablename</structfield></entry>
10999+
<entry><type>name</type></entry>
11000+
<entry><literal><link linkend="catalog-pg-class"><structname>pg_class</structname></link>.relname</literal></entry>
11001+
<entry>Name of table</entry>
11002+
</row>
11003+
11004+
<row>
11005+
<entry><structfield>statistics_schemaname</structfield></entry>
11006+
<entry><type>name</type></entry>
11007+
<entry><literal><link linkend="catalog-pg-namespace"><structname>pg_namespace</structname></link>.nspname</literal></entry>
11008+
<entry>Name of schema containing extended statistic</entry>
11009+
</row>
11010+
11011+
<row>
11012+
<entry><structfield>statistics_name</structfield></entry>
11013+
<entry><type>name</type></entry>
11014+
<entry><literal><link linkend="catalog-pg-statistic-ext"><structname>pg_statistic_ext</structname></link>.stxname</literal></entry>
11015+
<entry>Name of extended statistics</entry>
11016+
</row>
11017+
11018+
<row>
11019+
<entry><structfield>statistics_owner</structfield></entry>
11020+
<entry><type>oid</type></entry>
11021+
<entry><literal><link linkend="catalog-pg-authid"><structname>pg_authid</structname></link>.oid</literal></entry>
11022+
<entry>Owner of the extended statistics</entry>
11023+
</row>
11024+
11025+
<row>
11026+
<entry><structfield>attnames</structfield></entry>
11027+
<entry><type>name[]</type></entry>
11028+
<entry><literal><link linkend="catalog-pg-attribute"><structname>pg_attribute</structname></link>.attname</literal></entry>
11029+
<entry>Names of the column the extended statistics is defined on</entry>
11030+
</row>
11031+
11032+
<row>
11033+
<entry><structfield>kinds</structfield></entry>
11034+
<entry><type>text[]</type></entry>
11035+
<entry></entry>
11036+
<entry>Types of exdended statistics enabled for this record</entry>
11037+
</row>
11038+
11039+
<row>
11040+
<entry><structfield>n_distinct</structfield></entry>
11041+
<entry><type>pg_ndistinct</type></entry>
11042+
<entry></entry>
11043+
<entry>N-distinct counts for combinations of columns. If greater than
11044+
zero, the estimated number of distinct values in the combination. If
11045+
less than zero, the negative of the number of distinct values divided
11046+
by the number of rows.
11047+
(The negated form is used when <command>ANALYZE</command> believes that
11048+
the number of distinct values is likely to increase as the table grows;
11049+
the positive form is used when the column seems to have a fixed number
11050+
of possible values.) For example, -1 indicates a unique combination of
11051+
columns in which the number of distinct combinations is the same as the
11052+
number of rows.
11053+
</entry>
11054+
</row>
11055+
11056+
<row>
11057+
<entry><structfield>dependencies</structfield></entry>
11058+
<entry><type>pg_dependencies</type></entry>
11059+
<entry></entry>
11060+
<entry>Functional dependency statistics</entry>
11061+
</row>
11062+
11063+
<row>
11064+
<entry><structfield>most_common_vals</structfield></entry>
11065+
<entry><type>anyarray</type></entry>
11066+
<entry></entry>
11067+
<entry>
11068+
A list of the most common combinations in the columns. (Null if
11069+
no values seem to be more common than any others.)
11070+
</entry>
11071+
</row>
11072+
11073+
<row>
11074+
<entry><structfield>most_common_val_nulls</structfield></entry>
11075+
<entry><type>anyarray</type></entry>
11076+
<entry></entry>
11077+
<entry>
11078+
A list of NULL flags for the most common combinations of values.
11079+
(Null when <structfield>most_common_vals</structfield> is.)
11080+
</entry>
11081+
</row>
11082+
11083+
<row>
11084+
<entry><structfield>most_common_freqs</structfield></entry>
11085+
<entry><type>real[]</type></entry>
11086+
<entry></entry>
11087+
<entry>
11088+
A list of the frequencies of the most common combinations,
11089+
i.e., number of occurrences of each divided by total number of rows.
11090+
(Null when <structfield>most_common_vals</structfield> is.)
11091+
</entry>
11092+
</row>
11093+
11094+
<row>
11095+
<entry><structfield>most_common_base_freqs</structfield></entry>
11096+
<entry><type>real[]</type></entry>
11097+
<entry></entry>
11098+
<entry>
11099+
A list of the base frequencies of the most common combinations,
11100+
i.e., product of per-value frequencies.
11101+
(Null when <structfield>most_common_vals</structfield> is.)
11102+
</entry>
11103+
</row>
11104+
</tbody>
11105+
</tgroup>
11106+
</table>
11107+
11108+
<para>
11109+
The maximum number of entries in the array fields can be controlled on a
11110+
column-by-column basis using the <command>ALTER TABLE SET STATISTICS</command>
11111+
command, or globally by setting the
11112+
<xref linkend="guc-default-statistics-target"/> run-time parameter.
11113+
</para>
11114+
11115+
</sect1>
11116+
1092511117
<sect1 id="view-pg-tables">
1092611118
<title><structname>pg_tables</structname></title>
1092711119

src/backend/catalog/system_views.sql

+41
Original file line numberDiff line numberDiff line change
@@ -253,6 +253,47 @@ CREATE VIEW pg_stats WITH (security_barrier) AS
253253

254254
REVOKE ALL on pg_statistic FROM public;
255255

256+
CREATE VIEW pg_stats_ext WITH (security_barrier) AS
257+
SELECT cn.nspname AS schemaname,
258+
c.relname AS tablename,
259+
sn.nspname AS statistics_schemaname,
260+
s.stxname AS statistics_name,
261+
pg_get_userbyid(s.stxowner) AS statistics_owner,
262+
( SELECT array_agg(a.attname ORDER BY a.attnum)
263+
FROM unnest(s.stxkeys) k
264+
JOIN pg_attribute a
265+
ON (a.attrelid = s.stxrelid AND a.attnum = k)
266+
) AS attnames,
267+
s.stxkind AS kinds,
268+
sd.stxdndistinct AS n_distinct,
269+
sd.stxddependencies AS dependencies,
270+
m.most_common_vals,
271+
m.most_common_val_nulls,
272+
m.most_common_freqs,
273+
m.most_common_base_freqs
274+
FROM pg_statistic_ext s JOIN pg_class c ON (c.oid = s.stxrelid)
275+
JOIN pg_statistic_ext_data sd ON (s.oid = sd.stxoid)
276+
LEFT JOIN pg_namespace cn ON (cn.oid = c.relnamespace)
277+
LEFT JOIN pg_namespace sn ON (sn.oid = s.stxnamespace)
278+
LEFT JOIN LATERAL
279+
( SELECT array_agg(values) AS most_common_vals,
280+
array_agg(nulls) AS most_common_val_nulls,
281+
array_agg(frequency) AS most_common_freqs,
282+
array_agg(base_frequency) AS most_common_base_freqs
283+
FROM pg_mcv_list_items(sd.stxdmcv)
284+
) m ON sd.stxdmcv IS NOT NULL
285+
WHERE NOT EXISTS
286+
( SELECT 1
287+
FROM unnest(stxkeys) k
288+
JOIN pg_attribute a
289+
ON (a.attrelid = s.stxrelid AND a.attnum = k)
290+
WHERE NOT has_column_privilege(c.oid, a.attnum, 'select') )
291+
AND (c.relrowsecurity = false OR NOT row_security_active(c.oid));
292+
293+
REVOKE ALL on pg_statistic_ext FROM public;
294+
GRANT SELECT (oid, stxrelid, stxname, stxnamespace, stxowner, stxkeys, stxkind)
295+
ON pg_statistic_ext TO public;
296+
256297
CREATE VIEW pg_publication_tables AS
257298
SELECT
258299
P.pubname AS pubname,

src/include/catalog/catversion.h

+1-1
Original file line numberDiff line numberDiff line change
@@ -53,6 +53,6 @@
5353
*/
5454

5555
/* yyyymmddN */
56-
#define CATALOG_VERSION_NO 201906151
56+
#define CATALOG_VERSION_NO 201906152
5757

5858
#endif

src/test/regress/expected/rules.out

+29
Original file line numberDiff line numberDiff line change
@@ -2284,6 +2284,35 @@ pg_stats| SELECT n.nspname AS schemaname,
22842284
JOIN pg_attribute a ON (((c.oid = a.attrelid) AND (a.attnum = s.staattnum))))
22852285
LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
22862286
WHERE ((NOT a.attisdropped) AND has_column_privilege(c.oid, a.attnum, 'select'::text) AND ((c.relrowsecurity = false) OR (NOT row_security_active(c.oid))));
2287+
pg_stats_ext| SELECT cn.nspname AS schemaname,
2288+
c.relname AS tablename,
2289+
sn.nspname AS statistics_schemaname,
2290+
s.stxname AS statistics_name,
2291+
pg_get_userbyid(s.stxowner) AS statistics_owner,
2292+
( SELECT array_agg(a.attname ORDER BY a.attnum) AS array_agg
2293+
FROM (unnest(s.stxkeys) k(k)
2294+
JOIN pg_attribute a ON (((a.attrelid = s.stxrelid) AND (a.attnum = k.k))))) AS attnames,
2295+
s.stxkind AS kinds,
2296+
sd.stxdndistinct AS n_distinct,
2297+
sd.stxddependencies AS dependencies,
2298+
m.most_common_vals,
2299+
m.most_common_val_nulls,
2300+
m.most_common_freqs,
2301+
m.most_common_base_freqs
2302+
FROM (((((pg_statistic_ext s
2303+
JOIN pg_class c ON ((c.oid = s.stxrelid)))
2304+
JOIN pg_statistic_ext_data sd ON ((s.oid = sd.stxoid)))
2305+
LEFT JOIN pg_namespace cn ON ((cn.oid = c.relnamespace)))
2306+
LEFT JOIN pg_namespace sn ON ((sn.oid = s.stxnamespace)))
2307+
LEFT JOIN LATERAL ( SELECT array_agg(pg_mcv_list_items."values") AS most_common_vals,
2308+
array_agg(pg_mcv_list_items.nulls) AS most_common_val_nulls,
2309+
array_agg(pg_mcv_list_items.frequency) AS most_common_freqs,
2310+
array_agg(pg_mcv_list_items.base_frequency) AS most_common_base_freqs
2311+
FROM pg_mcv_list_items(sd.stxdmcv) pg_mcv_list_items(index, "values", nulls, frequency, base_frequency)) m ON ((sd.stxdmcv IS NOT NULL)))
2312+
WHERE ((NOT (EXISTS ( SELECT 1
2313+
FROM (unnest(s.stxkeys) k(k)
2314+
JOIN pg_attribute a ON (((a.attrelid = s.stxrelid) AND (a.attnum = k.k))))
2315+
WHERE (NOT has_column_privilege(c.oid, a.attnum, 'select'::text))))) AND ((c.relrowsecurity = false) OR (NOT row_security_active(c.oid))));
22872316
pg_tables| SELECT n.nspname AS schemaname,
22882317
c.relname AS tablename,
22892318
pg_get_userbyid(c.relowner) AS tableowner,

0 commit comments

Comments
 (0)