Commit f2b8839

Add pg_relation_check_pages() to check on-disk pages of a relation
This makes use of CheckBuffer() introduced in c780a7a, adding a SQL wrapper able to do checks for all the pages of a relation. By default, all the fork types of a relation are checked, and it is possible to check only a given relation fork. Note that if the relation given in input has no physical storage or is temporary, then no errors are generated, allowing full-database checks when coupled with a simple scan of pg_class for example.

This is not limited to clusters with data checksums enabled, as clusters without data checksums can still apply checks on pages using the page headers or for the case of a page full of zeros.

This function returns a set of tuples consisting of:
- The physical file where a broken page has been detected (without the segment number as that can be AM-dependent, which can be guessed from the block number for heap). A relative path from PGDATA is used.
- The block number of the broken page.

By default, only superusers have access to this function but execution rights can be granted to other users.

The feature introduced here is still minimal, and more improvements could be done, like:
- Addition of a start and end block number to run checks on a range of blocks, which would apply only if one fork type is checked.
- Addition of some progress reporting.
- Throttling, with configuration parameters in function input or potentially some cost-based GUCs.

Regression tests are added for positive cases in the main regression test suite, and TAP tests are added for cases involving the emulation of page corruptions.

Bump catalog version.

Author: Julien Rouhaud, Michael Paquier
Reviewed-by: Masahiko Sawada, Justin Pryzby
Discussion: https://postgr.es/m/CAOBaU_aVvMjQn=ge5qPiJOPMmOj5=ii3st5Q0Y+WuLML5sR17w@mail.gmail.com
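The remark that the segment number "can be guessed from the block number for heap" can be illustrated with a small standalone sketch. It assumes a default build, where a relation is split into 1GB segment files of 131072 blocks of 8kB each; builds configured with other sizes would differ. The names `segment_for_block`, `segment_file_for_block` and `SKETCH_RELSEG_SIZE` are hypothetical helpers for illustration and are not part of this commit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: default build, 131072 blocks of 8kB per 1GB segment file. */
#define SKETCH_RELSEG_SIZE 131072u

/* Segment number that holds the given block of a heap relation fork. */
uint32_t
segment_for_block(uint32_t blkno)
{
    return blkno / SKETCH_RELSEG_SIZE;
}

/*
 * Build the name of the segment file holding blkno.  Segment 0 uses the
 * bare path (the form reported by the SQL function), later segments get
 * a ".N" suffix.  Returns what snprintf() returns, that is the length of
 * the complete name.
 */
int
segment_file_for_block(const char *path, uint32_t blkno,
                       char *buf, size_t buflen)
{
    uint32_t segno = segment_for_block(blkno);

    assert(path != NULL && buf != NULL && buflen > 0);

    if (segno == 0)
        return snprintf(buf, buflen, "%s", path);
    return snprintf(buf, buflen, "%s.%u", path, (unsigned) segno);
}
```

Given a row returned by the function, a caller could pass its path and failed block number to `segment_file_for_block` to find which file to inspect on disk.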
1 parent c780a7a commit f2b8839

12 files changed: +644 -2 lines changed

doc/src/sgml/func.sgml

+50
@@ -26182,6 +26182,56 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 
   </sect2>
 
+  <sect2 id="functions-data-sanity">
+   <title>Data Sanity Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-data-sanity-table"/>
+    provide ways to check the sanity of data files in the cluster.
+   </para>
+
+   <table id="functions-data-sanity-table">
+    <title>Data Sanity Functions</title>
+    <tgroup cols="3">
+     <thead>
+      <row><entry>Name</entry> <entry>Return Type</entry> <entry>Description</entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry>
+        <literal><function>pg_relation_check_pages(<parameter>relation</parameter> <type>regclass</type> [, <parameter>fork</parameter> <type>text</type> <literal>DEFAULT</literal> <literal>NULL</literal> ])</function></literal>
+       </entry>
+       <entry><type>setof record</type></entry>
+       <entry>Check the pages of a relation.
+       </entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+   <indexterm>
+    <primary>pg_relation_check_pages</primary>
+   </indexterm>
+   <para id="functions-check-relation-note" xreflabel="pg_relation_check_pages">
+    <function>pg_relation_check_pages</function> iterates over all blocks of a
+    given relation and verifies if they are in a state where they can safely
+    be loaded into the shared buffers. If defined,
+    <replaceable>fork</replaceable> specifies that only the pages of the given
+    fork are to be verified. Fork can be <literal>'main'</literal> for the
+    main data fork, <literal>'fsm'</literal> for the free space map,
+    <literal>'vm'</literal> for the visibility map, or
+    <literal>'init'</literal> for the initialization fork. The default of
+    <literal>NULL</literal> means that all the forks of the relation are
+    checked. The function returns a list of blocks that are considered as
+    corrupted with the path of the related file. Use of this function is
+    restricted to superusers by default but access may be granted to others
+    using <command>GRANT</command>.
+   </para>
+
+  </sect2>
+
  </sect1>
 
  <sect1 id="functions-trigger">
src/backend/catalog/system_views.sql

+9
@@ -1300,6 +1300,14 @@ LANGUAGE INTERNAL
 STRICT VOLATILE
 AS 'pg_create_logical_replication_slot';
 
+CREATE OR REPLACE FUNCTION pg_relation_check_pages(
+    IN relation regclass, IN fork text DEFAULT NULL,
+    OUT path text, OUT failed_block_num bigint)
+RETURNS SETOF record
+LANGUAGE internal
+VOLATILE PARALLEL RESTRICTED
+AS 'pg_relation_check_pages';
+
 CREATE OR REPLACE FUNCTION
   make_interval(years int4 DEFAULT 0, months int4 DEFAULT 0, weeks int4 DEFAULT 0,
                 days int4 DEFAULT 0, hours int4 DEFAULT 0, mins int4 DEFAULT 0,
@@ -1444,6 +1452,7 @@ AS 'unicode_is_normalized';
 -- can later change who can access these functions, or leave them as only
 -- available to superuser / cluster owner, if they choose.
 --
+REVOKE EXECUTE ON FUNCTION pg_relation_check_pages(regclass, text) FROM public;
 REVOKE EXECUTE ON FUNCTION pg_start_backup(text, boolean, boolean) FROM public;
 REVOKE EXECUTE ON FUNCTION pg_stop_backup() FROM public;
 REVOKE EXECUTE ON FUNCTION pg_stop_backup(boolean, boolean) FROM public;

src/backend/utils/adt/Makefile

+1
@@ -69,6 +69,7 @@ OBJS = \
 	oid.o \
 	oracle_compat.o \
 	orderedsetaggs.o \
+	pagefuncs.o \
 	partitionfuncs.o \
 	pg_locale.o \
 	pg_lsn.o \

src/backend/utils/adt/pagefuncs.c

+229
@@ -0,0 +1,229 @@
+/*-------------------------------------------------------------------------
+ *
+ * pagefuncs.c
+ *	  Functions for features related to relation pages.
+ *
+ * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/utils/adt/pagefuncs.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/relation.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "storage/bufmgr.h"
+#include "storage/lmgr.h"
+#include "storage/smgr.h"
+#include "utils/builtins.h"
+#include "utils/syscache.h"
+
+static void check_one_relation(TupleDesc tupdesc, Tuplestorestate *tupstore,
+							   Oid relid, ForkNumber single_forknum);
+static void check_relation_fork(TupleDesc tupdesc, Tuplestorestate *tupstore,
+								Relation relation, ForkNumber forknum);
+
+/*
+ * callback arguments for check_pages_error_callback()
+ */
+typedef struct CheckPagesErrorInfo
+{
+	char	   *path;
+	BlockNumber blkno;
+} CheckPagesErrorInfo;
+
+/*
+ * Error callback specific to check_relation_fork().
+ */
+static void
+check_pages_error_callback(void *arg)
+{
+	CheckPagesErrorInfo *errinfo = (CheckPagesErrorInfo *) arg;
+
+	errcontext("while checking page %u of path %s",
+			   errinfo->blkno, errinfo->path);
+}
+
+/*
+ * pg_relation_check_pages
+ *
+ * Check the state of all the pages for one or more fork types in the given
+ * relation.
+ */
+Datum
+pg_relation_check_pages(PG_FUNCTION_ARGS)
+{
+	Oid			relid;
+	ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+	TupleDesc	tupdesc;
+	Tuplestorestate *tupstore;
+	MemoryContext per_query_ctx;
+	MemoryContext oldcontext;
+	ForkNumber	forknum;
+
+	/* Switch into long-lived context to construct returned data structures */
+	per_query_ctx = rsinfo->econtext->ecxt_per_query_memory;
+	oldcontext = MemoryContextSwitchTo(per_query_ctx);
+
+	/* Build a tuple descriptor for our result type */
+	if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+		elog(ERROR, "return type must be a row type");
+
+	tupstore = tuplestore_begin_heap(true, false, work_mem);
+	rsinfo->returnMode = SFRM_Materialize;
+	rsinfo->setResult = tupstore;
+	rsinfo->setDesc = tupdesc;
+
+	MemoryContextSwitchTo(oldcontext);
+
+	/* handle arguments */
+	if (PG_ARGISNULL(0))
+	{
+		/* Just leave if nothing is defined */
+		PG_RETURN_VOID();
+	}
+
+	/* By default all the forks of a relation are checked */
+	if (PG_ARGISNULL(1))
+		forknum = InvalidForkNumber;
+	else
+	{
+		const char *forkname = text_to_cstring(PG_GETARG_TEXT_PP(1));
+
+		forknum = forkname_to_number(forkname);
+	}
+
+	relid = PG_GETARG_OID(0);
+
+	check_one_relation(tupdesc, tupstore, relid, forknum);
+	tuplestore_donestoring(tupstore);
+
+	return (Datum) 0;
+}
+
+/*
+ * Perform the check on a single relation, possibly filtered with a single
+ * fork.  This function will check if the given relation exists or not, as
+ * a relation could be dropped after checking for the list of relations and
+ * before getting here, and we don't want to error out in this case.
+ */
+static void
+check_one_relation(TupleDesc tupdesc, Tuplestorestate *tupstore,
+				   Oid relid, ForkNumber single_forknum)
+{
+	Relation	relation;
+	ForkNumber	forknum;
+
+	/* Check if the relation exists, leaving if there is no such relation */
+	if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(relid)))
+		return;
+
+	relation = relation_open(relid, AccessShareLock);
+
+	/*
+	 * Sanity checks, returning no results if not supported.  Temporary
+	 * relations and relations without storage are out of scope.
+	 */
+	if (!RELKIND_HAS_STORAGE(relation->rd_rel->relkind) ||
+		relation->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
+	{
+		relation_close(relation, AccessShareLock);
+		return;
+	}
+
+	RelationOpenSmgr(relation);
+
+	for (forknum = 0; forknum <= MAX_FORKNUM; forknum++)
+	{
+		if (single_forknum != InvalidForkNumber && single_forknum != forknum)
+			continue;
+
+		if (smgrexists(relation->rd_smgr, forknum))
+			check_relation_fork(tupdesc, tupstore, relation, forknum);
+	}
+
+	relation_close(relation, AccessShareLock);
+}
+
+/*
+ * For a given relation and fork, do the real work of iterating over all pages
+ * and doing the check.  Caller must hold an AccessShareLock lock on the given
+ * relation.
+ */
+static void
+check_relation_fork(TupleDesc tupdesc, Tuplestorestate *tupstore,
+					Relation relation, ForkNumber forknum)
+{
+	BlockNumber blkno,
+				nblocks;
+	SMgrRelation smgr = relation->rd_smgr;
+	char	   *path;
+	CheckPagesErrorInfo errinfo;
+	ErrorContextCallback errcallback;
+
+	/* Number of output arguments in the SRF */
+#define PG_CHECK_RELATION_COLS 2
+
+	Assert(CheckRelationLockedByMe(relation, AccessShareLock, true));
+
+	/*
+	 * We remember the number of blocks here.  Since caller must hold a lock
+	 * on the relation, we know that it won't be truncated while we are
+	 * iterating over the blocks.  Any block added after this function started
+	 * will not be checked.
+	 */
+	nblocks = RelationGetNumberOfBlocksInFork(relation, forknum);
+
+	path = relpathbackend(smgr->smgr_rnode.node,
+						  smgr->smgr_rnode.backend,
+						  forknum);
+
+	/*
+	 * Error context to print some information about blocks and relations
+	 * impacted by corruptions.
+	 */
+	errinfo.path = pstrdup(path);
+	errinfo.blkno = 0;
+	errcallback.callback = check_pages_error_callback;
+	errcallback.arg = (void *) &errinfo;
+	errcallback.previous = error_context_stack;
+	error_context_stack = &errcallback;
+
+	for (blkno = 0; blkno < nblocks; blkno++)
+	{
+		Datum		values[PG_CHECK_RELATION_COLS];
+		bool		nulls[PG_CHECK_RELATION_COLS];
+		int			i = 0;
+
+		/* Update block number for the error context */
+		errinfo.blkno = blkno;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Check the given buffer */
+		if (CheckBuffer(smgr, forknum, blkno))
+			continue;
+
+		memset(values, 0, sizeof(values));
+		memset(nulls, 0, sizeof(nulls));
+
+		values[i++] = CStringGetTextDatum(path);
+		values[i++] = Int64GetDatum((int64) blkno);
+
+		Assert(i == PG_CHECK_RELATION_COLS);
+
+		/* Save the corrupted blocks in the tuplestore. */
+		tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+	}
+
+	pfree(path);
+
+	/* Pop the error context stack */
+	error_context_stack = errcallback.previous;
+}
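CheckBuffer() comes from c780a7a and is not part of this diff, so the per-page test is not visible here. The commit message only says that, without data checksums, pages can still be checked through their headers or for being full of zeros. The standalone sketch below models the scan loop of check_relation_fork() over an in-memory array of pages using such a simplified test. `SketchHeader` and its three fields are a hypothetical, much reduced stand-in for a page header, and every `sketch_` name is an assumption made for illustration; the real checks are more thorough.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLCKSZ 8192

/* Hypothetical, reduced page header: three offsets at the start of a page. */
typedef struct SketchHeader
{
    uint16_t lower;     /* start of free space */
    uint16_t upper;     /* end of free space */
    uint16_t special;   /* start of special space */
} SketchHeader;

/* Return 1 if every byte of the page is zero. */
int
sketch_page_is_zero(const unsigned char *page)
{
    for (size_t i = 0; i < SKETCH_BLCKSZ; i++)
    {
        if (page[i] != 0)
            return 0;
    }
    return 1;
}

/* A page passes if it is all zeros or if its offsets are ordered sanely. */
int
sketch_page_ok(const unsigned char *page)
{
    SketchHeader hdr;

    if (sketch_page_is_zero(page))
        return 1;

    memcpy(&hdr, page, sizeof(hdr));
    return hdr.lower >= sizeof(hdr) &&
        hdr.lower <= hdr.upper &&
        hdr.upper <= hdr.special &&
        hdr.special <= SKETCH_BLCKSZ;
}

/*
 * Scan nblocks pages and record the block numbers that fail the check.
 * At most max_failed entries are stored; the return value is the total
 * number of failing blocks.  Block numbers are widened to 64 bits, like
 * the bigint output column of the SQL function.
 */
size_t
sketch_scan(const unsigned char *pages, uint32_t nblocks,
            int64_t *failed, size_t max_failed)
{
    size_t nfailed = 0;

    assert(pages != NULL && failed != NULL);

    for (uint32_t blkno = 0; blkno < nblocks; blkno++)
    {
        const unsigned char *page = pages + (size_t) blkno * SKETCH_BLCKSZ;

        if (sketch_page_ok(page))
            continue;

        if (nfailed < max_failed)
            failed[nfailed] = (int64_t) blkno;
        nfailed++;
    }
    return nfailed;
}
```

As in the backend loop, a page that passes is skipped with `continue`, and only failing block numbers are collected for the caller.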

src/include/catalog/catversion.h

+1-1
@@ -53,6 +53,6 @@
  */
 
 /*							yyyymmddN */
-#define CATALOG_VERSION_NO	202010201
+#define CATALOG_VERSION_NO	202010281
 
 #endif

src/include/catalog/pg_proc.dat

+7
@@ -10958,6 +10958,13 @@
   proallargtypes => '{oid,text,int8,timestamptz}', proargmodes => '{i,o,o,o}',
   proargnames => '{tablespace,name,size,modification}',
   prosrc => 'pg_ls_tmpdir_1arg' },
+{ oid => '9147', descr => 'check pages of a relation',
+  proname => 'pg_relation_check_pages', procost => '10000', prorows => '20',
+  proisstrict => 'f', proretset => 't', provolatile => 'v', proparallel => 'r',
+  prorettype => 'record', proargtypes => 'regclass text',
+  proallargtypes => '{regclass,text,text,int8}', proargmodes => '{i,i,o,o}',
+  proargnames => '{relation,fork,path,failed_block_num}',
+  prosrc => 'pg_relation_check_pages' },
 
 # hash partitioning constraint function
 { oid => '5028', descr => 'hash partition CHECK constraint',
