Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit a1ef01f

Browse files
committed
Improve pg_dump's dependency-sorting logic to enforce section dump order.
As of 9.2, with the --section option, it is very important that the concept of "pre data", "data", and "post data" sections of the output be honored strictly; else a dump divided into separate sectional files might be unrestorable. However, the dependency-sorting logic knew nothing of sections and would happily select output orderings that didn't fit that structure. Doing so was mostly harmless before 9.2, but now we need to be sure it doesn't do that. To fix, create dummy objects representing the section boundaries and add dependencies between them and all the normal objects. (This might sound expensive but it seems to only add a percent or two to pg_dump's runtime.) This also fixes a problem introduced in 9.1 by the feature that allows incomplete GROUP BY lists when a primary key is given in GROUP BY. That means that views can depend on primary key constraints. Previously, pg_dump would deal with that by simply emitting the primary key constraint before the view definition (and hence before the data section of the output). That's bad enough for simple serial restores, where creating an index before the data is loaded works, but is undesirable for speed reasons. But it could lead to outright failure of parallel restores, as seen in bug #6699 from Joe Van Dyk. That happened because pg_restore would switch into parallel mode as soon as it reached the constraint, and then very possibly would try to emit the view definition before the primary key was committed (as a consequence of another bug that causes the view not to be correctly marked as depending on the constraint). Adding the section boundary constraints forces the dependency-sorting code to break the view into separate table and rule declarations, allowing the rule, and hence the primary key constraint it depends on, to revert to their intended location in the post-data section. This also somewhat accidentally works around the bogus-dependency-marking problem, because the rule will be correctly shown as depending on the constraint, so parallel pg_restore will now do the right thing. (We will fix the bogus-dependency problem for real in a separate patch, but that patch is not easily back-portable to 9.1, so the fact that this patch is enough to dodge the only known symptom is fortunate.) Back-patch to 9.1, except for the hunk that adds verification that the finished archive TOC list is in correct section order; the place where it was convenient to add that doesn't exist in 9.1.
1 parent 77ed0c6 commit a1ef01f

File tree

4 files changed

+238
-30
lines changed

4 files changed

+238
-30
lines changed

src/bin/pg_dump/pg_backup_archiver.c

+34
Original file line numberDiff line numberDiff line change
@@ -247,8 +247,42 @@ SetArchiveRestoreOptions(Archive *AHX, RestoreOptions *ropt)
247247
curSection = SECTION_PRE_DATA;
248248
for (te = AH->toc->next; te != AH->toc; te = te->next)
249249
{
250+
/*
251+
* When writing an archive, we also take this opportunity to check
252+
* that we have generated the entries in a sane order that respects
253+
* the section divisions. When reading, don't complain, since buggy
254+
* old versions of pg_dump might generate out-of-order archives.
255+
*/
256+
if (AH->mode != archModeRead)
257+
{
258+
switch (te->section)
259+
{
260+
case SECTION_NONE:
261+
/* ok to be anywhere */
262+
break;
263+
case SECTION_PRE_DATA:
264+
if (curSection != SECTION_PRE_DATA)
265+
write_msg(modulename,
266+
"WARNING: archive items not in correct section order\n");
267+
break;
268+
case SECTION_DATA:
269+
if (curSection == SECTION_POST_DATA)
270+
write_msg(modulename,
271+
"WARNING: archive items not in correct section order\n");
272+
break;
273+
case SECTION_POST_DATA:
274+
/* ok no matter which section we were in */
275+
break;
276+
default:
277+
exit_horribly(modulename, "unexpected section code %d\n",
278+
(int) te->section);
279+
break;
280+
}
281+
}
282+
250283
if (te->section != SECTION_NONE)
251284
curSection = te->section;
285+
252286
te->reqs = _tocEntryRequired(te, curSection, ropt);
253287
}
254288
}

src/bin/pg_dump/pg_dump.c

+131-4
Original file line numberDiff line numberDiff line change
@@ -210,6 +210,11 @@ static void dumpACL(Archive *fout, CatalogId objCatId, DumpId objDumpId,
210210
const char *acls);
211211

212212
static void getDependencies(Archive *fout);
213+
214+
static DumpableObject *createBoundaryObjects(void);
215+
static void addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
216+
DumpableObject *boundaryObjs);
217+
213218
static void getDomainConstraints(Archive *fout, TypeInfo *tyinfo);
214219
static void getTableData(TableInfo *tblinfo, int numTables, bool oids);
215220
static void makeTableDataInfo(TableInfo *tbinfo, bool oids);
@@ -270,6 +275,7 @@ main(int argc, char **argv)
270275
int numTables;
271276
DumpableObject **dobjs;
272277
int numObjs;
278+
DumpableObject *boundaryObjs;
273279
int i;
274280
enum trivalue prompt_password = TRI_DEFAULT;
275281
int compressLevel = -1;
@@ -691,6 +697,17 @@ main(int argc, char **argv)
691697
*/
692698
getDependencies(fout);
693699

700+
/* Lastly, create dummy objects to represent the section boundaries */
701+
boundaryObjs = createBoundaryObjects();
702+
703+
/* Get pointers to all the known DumpableObjects */
704+
getDumpableObjects(&dobjs, &numObjs);
705+
706+
/*
707+
* Add dummy dependencies to enforce the dump section ordering.
708+
*/
709+
addBoundaryDependencies(dobjs, numObjs, boundaryObjs);
710+
694711
/*
695712
* Sort the objects into a safe dump order (no forward references).
696713
*
@@ -700,14 +717,13 @@ main(int argc, char **argv)
700717
* will dump identically. Before 7.3 we don't have dependencies and we
701718
* use OID ordering as an (unreliable) guide to creation order.
702719
*/
703-
getDumpableObjects(&dobjs, &numObjs);
704-
705720
if (fout->remoteVersion >= 70300)
706721
sortDumpableObjectsByTypeName(dobjs, numObjs);
707722
else
708723
sortDumpableObjectsByTypeOid(dobjs, numObjs);
709724

710-
sortDumpableObjects(dobjs, numObjs);
725+
sortDumpableObjects(dobjs, numObjs,
726+
boundaryObjs[0].dumpId, boundaryObjs[1].dumpId);
711727

712728
/*
713729
* Create archive TOC entries for all the objects to be dumped, in a safe
@@ -7184,6 +7200,10 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
71847200
dobj->dependencies, dobj->nDeps,
71857201
dumpBlobs, NULL);
71867202
break;
7203+
case DO_PRE_DATA_BOUNDARY:
7204+
case DO_POST_DATA_BOUNDARY:
7205+
/* never dumped, nothing to do */
7206+
break;
71877207
}
71887208
}
71897209

@@ -11672,7 +11692,7 @@ dumpDefaultACL(Archive *fout, DefaultACLInfo *daclinfo)
1167211692
daclinfo->dobj.namespace ? daclinfo->dobj.namespace->dobj.name : NULL,
1167311693
NULL,
1167411694
daclinfo->defaclrole,
11675-
false, "DEFAULT ACL", SECTION_NONE,
11695+
false, "DEFAULT ACL", SECTION_POST_DATA,
1167611696
q->data, "", NULL,
1167711697
daclinfo->dobj.dependencies, daclinfo->dobj.nDeps,
1167811698
NULL, NULL);
@@ -14027,6 +14047,113 @@ getDependencies(Archive *fout)
1402714047
}
1402814048

1402914049

14050+
/*
14051+
* createBoundaryObjects - create dummy DumpableObjects to represent
14052+
* dump section boundaries.
14053+
*/
14054+
static DumpableObject *
14055+
createBoundaryObjects(void)
14056+
{
14057+
DumpableObject *dobjs;
14058+
14059+
dobjs = (DumpableObject *) pg_malloc(2 * sizeof(DumpableObject));
14060+
14061+
dobjs[0].objType = DO_PRE_DATA_BOUNDARY;
14062+
dobjs[0].catId = nilCatalogId;
14063+
AssignDumpId(dobjs + 0);
14064+
dobjs[0].name = pg_strdup("PRE-DATA BOUNDARY");
14065+
14066+
dobjs[1].objType = DO_POST_DATA_BOUNDARY;
14067+
dobjs[1].catId = nilCatalogId;
14068+
AssignDumpId(dobjs + 1);
14069+
dobjs[1].name = pg_strdup("POST-DATA BOUNDARY");
14070+
14071+
return dobjs;
14072+
}
14073+
14074+
/*
14075+
* addBoundaryDependencies - add dependencies as needed to enforce the dump
14076+
* section boundaries.
14077+
*/
14078+
static void
14079+
addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
14080+
DumpableObject *boundaryObjs)
14081+
{
14082+
DumpableObject *preDataBound = boundaryObjs + 0;
14083+
DumpableObject *postDataBound = boundaryObjs + 1;
14084+
int i;
14085+
14086+
for (i = 0; i < numObjs; i++)
14087+
{
14088+
DumpableObject *dobj = dobjs[i];
14089+
14090+
/*
14091+
* The classification of object types here must match the SECTION_xxx
14092+
* values assigned during subsequent ArchiveEntry calls!
14093+
*/
14094+
switch (dobj->objType)
14095+
{
14096+
case DO_NAMESPACE:
14097+
case DO_EXTENSION:
14098+
case DO_TYPE:
14099+
case DO_SHELL_TYPE:
14100+
case DO_FUNC:
14101+
case DO_AGG:
14102+
case DO_OPERATOR:
14103+
case DO_OPCLASS:
14104+
case DO_OPFAMILY:
14105+
case DO_COLLATION:
14106+
case DO_CONVERSION:
14107+
case DO_TABLE:
14108+
case DO_ATTRDEF:
14109+
case DO_PROCLANG:
14110+
case DO_CAST:
14111+
case DO_DUMMY_TYPE:
14112+
case DO_TSPARSER:
14113+
case DO_TSDICT:
14114+
case DO_TSTEMPLATE:
14115+
case DO_TSCONFIG:
14116+
case DO_FDW:
14117+
case DO_FOREIGN_SERVER:
14118+
case DO_BLOB:
14119+
/* Pre-data objects: must come before the pre-data boundary */
14120+
addObjectDependency(preDataBound, dobj->dumpId);
14121+
break;
14122+
case DO_TABLE_DATA:
14123+
case DO_BLOB_DATA:
14124+
/* Data objects: must come between the boundaries */
14125+
addObjectDependency(dobj, preDataBound->dumpId);
14126+
addObjectDependency(postDataBound, dobj->dumpId);
14127+
break;
14128+
case DO_INDEX:
14129+
case DO_TRIGGER:
14130+
case DO_DEFAULT_ACL:
14131+
/* Post-data objects: must come after the post-data boundary */
14132+
addObjectDependency(dobj, postDataBound->dumpId);
14133+
break;
14134+
case DO_RULE:
14135+
/* Rules are post-data, but only if dumped separately */
14136+
if (((RuleInfo *) dobj)->separate)
14137+
addObjectDependency(dobj, postDataBound->dumpId);
14138+
break;
14139+
case DO_CONSTRAINT:
14140+
case DO_FK_CONSTRAINT:
14141+
/* Constraints are post-data, but only if dumped separately */
14142+
if (((ConstraintInfo *) dobj)->separate)
14143+
addObjectDependency(dobj, postDataBound->dumpId);
14144+
break;
14145+
case DO_PRE_DATA_BOUNDARY:
14146+
/* nothing to do */
14147+
break;
14148+
case DO_POST_DATA_BOUNDARY:
14149+
/* must come after the pre-data boundary */
14150+
addObjectDependency(dobj, preDataBound->dumpId);
14151+
break;
14152+
}
14153+
}
14154+
}
14155+
14156+
1403014157
/*
1403114158
* selectSourceSchema - make the specified schema the active search path
1403214159
* in the source database.

src/bin/pg_dump/pg_dump.h

+5-2
Original file line numberDiff line numberDiff line change
@@ -97,6 +97,7 @@ typedef enum
9797
DO_OPERATOR,
9898
DO_OPCLASS,
9999
DO_OPFAMILY,
100+
DO_COLLATION,
100101
DO_CONVERSION,
101102
DO_TABLE,
102103
DO_ATTRDEF,
@@ -118,7 +119,8 @@ typedef enum
118119
DO_DEFAULT_ACL,
119120
DO_BLOB,
120121
DO_BLOB_DATA,
121-
DO_COLLATION
122+
DO_PRE_DATA_BOUNDARY,
123+
DO_POST_DATA_BOUNDARY
122124
} DumpableObjectType;
123125

124126
typedef struct _dumpableObject
@@ -520,7 +522,8 @@ extern bool simple_string_list_member(SimpleStringList *list, const char *val);
520522

521523
extern void parseOidArray(const char *str, Oid *array, int arraysize);
522524

523-
extern void sortDumpableObjects(DumpableObject **objs, int numObjs);
525+
extern void sortDumpableObjects(DumpableObject **objs, int numObjs,
526+
DumpId preBoundaryId, DumpId postBoundaryId);
524527
extern void sortDumpableObjectsByTypeName(DumpableObject **objs, int numObjs);
525528
extern void sortDumpableObjectsByTypeOid(DumpableObject **objs, int numObjs);
526529

0 commit comments

Comments
 (0)