Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit ef65566

Browse files
committed
Further hacking on ruleutils' new column-alias-assignment code.
After further thought about implicit coercions appearing in a joinaliasvars list, I realized that they represent an additional reason why we might need to reference the join output column directly instead of referencing an underlying column. Consider SELECT x FROM t1 LEFT JOIN t2 USING (x) where t1.x is of type date while t2.x is of type timestamptz. The merged output variable is of type timestamptz, but it won't go to null when t2 does, therefore neither t1.x nor t2.x is a valid substitute reference. The code in get_variable() actually gets this case right, since it knows it shouldn't look through a coercion, but we failed to ensure that the unqualified output column name would be globally unique. To fix, modify the code that trawls for a dangerous situation so that it actually scans through an unnamed join's joinaliasvars list to see if there are any non-simple-Var entries.
1 parent bb686c9 commit ef65566

File tree

3 files changed

+94
-39
lines changed

3 files changed

+94
-39
lines changed

src/backend/utils/adt/ruleutils.c

Lines changed: 53 additions & 39 deletions
Original file line numberDiff line numberDiff line change
@@ -164,10 +164,10 @@ typedef struct
164164
* since they just inherit column names from their input RTEs, and we can't
165165
* rename the columns at the join level. Most of the time this isn't an issue
166166
* because we don't need to reference the join's output columns as such; we
167-
* can reference the input columns instead. That approach fails for merged
168-
* FULL JOIN USING columns, however, so when we have one of those in an
169-
* unnamed join, we have to make that column's alias globally unique across
170-
* the whole query to ensure it can be referenced unambiguously.
167+
* can reference the input columns instead. That approach can fail for merged
168+
* JOIN USING columns, however, so when we have one of those in an unnamed
169+
* join, we have to make that column's alias globally unique across the whole
170+
* query to ensure it can be referenced unambiguously.
171171
*
172172
* Another problem is that a JOIN USING clause requires the columns to be
173173
* merged to have the same aliases in both input RTEs. To handle that, we do
@@ -301,7 +301,7 @@ static bool refname_is_unique(char *refname, deparse_namespace *dpns,
301301
static void set_deparse_for_query(deparse_namespace *dpns, Query *query,
302302
List *parent_namespaces);
303303
static void set_simple_column_names(deparse_namespace *dpns);
304-
static bool has_unnamed_full_join_using(Node *jtnode);
304+
static bool has_dangerous_join_using(deparse_namespace *dpns, Node *jtnode);
305305
static void set_using_names(deparse_namespace *dpns, Node *jtnode);
306306
static void set_relation_column_names(deparse_namespace *dpns,
307307
RangeTblEntry *rte,
@@ -2570,7 +2570,7 @@ set_deparse_for_query(deparse_namespace *dpns, Query *query,
25702570
{
25712571
/* Detect whether global uniqueness of USING names is needed */
25722572
dpns->unique_using =
2573-
has_unnamed_full_join_using((Node *) query->jointree);
2573+
has_dangerous_join_using(dpns, (Node *) query->jointree);
25742574

25752575
/*
25762576
* Select names for columns merged by USING, via a recursive pass over
@@ -2630,25 +2630,26 @@ set_simple_column_names(deparse_namespace *dpns)
26302630
}
26312631

26322632
/*
2633-
* has_unnamed_full_join_using: search jointree for unnamed FULL JOIN USING
2633+
* has_dangerous_join_using: search jointree for unnamed JOIN USING
26342634
*
2635-
* Merged columns of a FULL JOIN USING act differently from either of the
2636-
* input columns, so they have to be referenced as columns of the JOIN not
2637-
* as columns of either input. And this is problematic if the join is
2638-
* unnamed (alias-less): we cannot qualify the column's name with an RTE
2639-
* name, since there is none. (Forcibly assigning an alias to the join is
2640-
* not a solution, since that will prevent legal references to tables below
2641-
* the join.) To ensure that every column in the query is unambiguously
2642-
* referenceable, we must assign such merged columns names that are globally
2643-
* unique across the whole query, aliasing other columns out of the way as
2644-
* necessary.
2635+
* Merged columns of a JOIN USING may act differently from either of the input
2636+
* columns, either because they are merged with COALESCE (in a FULL JOIN) or
2637+
* because an implicit coercion of the underlying input column is required.
2638+
* In such a case the column must be referenced as a column of the JOIN not as
2639+
* a column of either input. And this is problematic if the join is unnamed
2640+
* (alias-less): we cannot qualify the column's name with an RTE name, since
2641+
* there is none. (Forcibly assigning an alias to the join is not a solution,
2642+
* since that will prevent legal references to tables below the join.)
2643+
* To ensure that every column in the query is unambiguously referenceable,
2644+
* we must assign such merged columns names that are globally unique across
2645+
* the whole query, aliasing other columns out of the way as necessary.
26452646
*
26462647
* Because the ensuing re-aliasing is fairly damaging to the readability of
26472648
* the query, we don't do this unless we have to. So, we must pre-scan
26482649
* the join tree to see if we have to, before starting set_using_names().
26492650
*/
26502651
static bool
2651-
has_unnamed_full_join_using(Node *jtnode)
2652+
has_dangerous_join_using(deparse_namespace *dpns, Node *jtnode)
26522653
{
26532654
if (IsA(jtnode, RangeTblRef))
26542655
{
@@ -2661,24 +2662,38 @@ has_unnamed_full_join_using(Node *jtnode)
26612662

26622663
foreach(lc, f->fromlist)
26632664
{
2664-
if (has_unnamed_full_join_using((Node *) lfirst(lc)))
2665+
if (has_dangerous_join_using(dpns, (Node *) lfirst(lc)))
26652666
return true;
26662667
}
26672668
}
26682669
else if (IsA(jtnode, JoinExpr))
26692670
{
26702671
JoinExpr *j = (JoinExpr *) jtnode;
26712672

2672-
/* Is it an unnamed FULL JOIN with USING? */
2673-
if (j->alias == NULL &&
2674-
j->jointype == JOIN_FULL &&
2675-
j->usingClause)
2676-
return true;
2673+
/* Is it an unnamed JOIN with USING? */
2674+
if (j->alias == NULL && j->usingClause)
2675+
{
2676+
/*
2677+
* Yes, so check each join alias var to see if any of them are not
2678+
* simple references to underlying columns. If so, we have a
2679+
* dangerous situation and must pick unique aliases.
2680+
*/
2681+
RangeTblEntry *jrte = rt_fetch(j->rtindex, dpns->rtable);
2682+
ListCell *lc;
2683+
2684+
foreach(lc, jrte->joinaliasvars)
2685+
{
2686+
Var *aliasvar = (Var *) lfirst(lc);
2687+
2688+
if (aliasvar != NULL && !IsA(aliasvar, Var))
2689+
return true;
2690+
}
2691+
}
26772692

26782693
/* Nope, but inspect children */
2679-
if (has_unnamed_full_join_using(j->larg))
2694+
if (has_dangerous_join_using(dpns, j->larg))
26802695
return true;
2681-
if (has_unnamed_full_join_using(j->rarg))
2696+
if (has_dangerous_join_using(dpns, j->rarg))
26822697
return true;
26832698
}
26842699
else
@@ -2768,16 +2783,16 @@ set_using_names(deparse_namespace *dpns, Node *jtnode)
27682783
*
27692784
* If dpns->unique_using is TRUE, we force all USING names to be
27702785
* unique across the whole query level. In principle we'd only need
2771-
* the names of USING columns in unnamed full joins to be globally
2772-
* unique, but to safely assign all USING names in a single pass, we
2773-
* have to enforce the same uniqueness rule for all of them. However,
2774-
* if a USING column's name has been pushed down from the parent, we
2775-
* should use it as-is rather than making a uniqueness adjustment.
2776-
* This is necessary when we're at an unnamed join, and it creates no
2777-
* risk of ambiguity. Also, if there's a user-written output alias
2778-
* for a merged column, we prefer to use that rather than the input
2779-
* name; this simplifies the logic and seems likely to lead to less
2780-
* aliasing overall.
2786+
* the names of dangerous USING columns to be globally unique, but to
2787+
* safely assign all USING names in a single pass, we have to enforce
2788+
* the same uniqueness rule for all of them. However, if a USING
2789+
* column's name has been pushed down from the parent, we should use
2790+
* it as-is rather than making a uniqueness adjustment. This is
2791+
* necessary when we're at an unnamed join, and it creates no risk of
2792+
* ambiguity. Also, if there's a user-written output alias for a
2793+
* merged column, we prefer to use that rather than the input name;
2794+
* this simplifies the logic and seems likely to lead to less aliasing
2795+
* overall.
27812796
*
27822797
* If dpns->unique_using is FALSE, we only need USING names to be
27832798
* unique within their own join RTE. We still need to honor
@@ -5344,9 +5359,8 @@ get_variable(Var *var, int levelsup, bool istoplevel, deparse_context *context)
53445359
* If it's a simple reference to one of the input vars, then recursively
53455360
* print the name of that var instead. When it's not a simple reference,
53465361
* we have to just print the unqualified join column name. (This can only
5347-
* happen with columns that were merged by USING or NATURAL clauses in a
5348-
* FULL JOIN; we took pains previously to make the unqualified column name
5349-
* unique in such cases.)
5362+
* happen with "dangerous" merged columns in a JOIN USING; we took pains
5363+
* previously to make the unqualified column name unique in such cases.)
53505364
*
53515365
* This wouldn't work in decompiling plan trees, because we don't store
53525366
* joinaliasvars lists after planning; but a plan tree should never

src/test/regress/expected/create_view.out

Lines changed: 28 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1243,6 +1243,34 @@ select pg_get_viewdef('vv4', true);
12431243
FULL JOIN tt8 tt8y(x_1, z, z2) USING (x_1);
12441244
(1 row)
12451245

1246+
-- Implicit coercions in a JOIN USING create issues similar to FULL JOIN
1247+
create table tt7a (x date, xx int, y int);
1248+
alter table tt7a drop column xx;
1249+
create table tt8a (x timestamptz, z int);
1250+
create view vv2a as
1251+
select * from (values(now(),2,3,now(),5)) v(a,b,c,d,e)
1252+
union all
1253+
select * from tt7a left join tt8a using (x), tt8a tt8ax;
1254+
select pg_get_viewdef('vv2a', true);
1255+
pg_get_viewdef
1256+
----------------------------------------------------------------
1257+
SELECT v.a, +
1258+
v.b, +
1259+
v.c, +
1260+
v.d, +
1261+
v.e +
1262+
FROM ( VALUES (now(),2,3,now(),5)) v(a, b, c, d, e)+
1263+
UNION ALL +
1264+
SELECT x AS a, +
1265+
tt7a.y AS b, +
1266+
tt8a.z AS c, +
1267+
tt8ax.x_1 AS d, +
1268+
tt8ax.z AS e +
1269+
FROM tt7a +
1270+
LEFT JOIN tt8a USING (x), +
1271+
tt8a tt8ax(x_1, z);
1272+
(1 row)
1273+
12461274
--
12471275
-- Also check dropping a column that existed when the view was made
12481276
--

src/test/regress/sql/create_view.sql

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -389,6 +389,19 @@ select pg_get_viewdef('vv2', true);
389389
select pg_get_viewdef('vv3', true);
390390
select pg_get_viewdef('vv4', true);
391391

392+
-- Implicit coercions in a JOIN USING create issues similar to FULL JOIN
393+
394+
create table tt7a (x date, xx int, y int);
395+
alter table tt7a drop column xx;
396+
create table tt8a (x timestamptz, z int);
397+
398+
create view vv2a as
399+
select * from (values(now(),2,3,now(),5)) v(a,b,c,d,e)
400+
union all
401+
select * from tt7a left join tt8a using (x), tt8a tt8ax;
402+
403+
select pg_get_viewdef('vv2a', true);
404+
392405
--
393406
-- Also check dropping a column that existed when the view was made
394407
--

0 commit comments

Comments
 (0)