Commit 3db4056

Fix problems with parentheses around sub-SELECT --- for the last time, I hope.

I finally realized that we were going at it backwards: when there are excess parentheses, they need to be treated as part of the sub-SELECT, not as part of the surrounding expression. Although either choice yields an unambiguous grammar, only this way produces a grammar that is LALR(1).

With the old approach we were guaranteed to fail on either 'SELECT (((SELECT 2)) + 3)' or 'SELECT (((SELECT 2)) UNION SELECT 2)' depending on which way we resolve the initial shift/reduce conflict. With the new way, the same reduction track can be followed in both cases until we have advanced far enough to know whether we are done with the sub-SELECT or not.
1 parent efd6cad commit 3db4056
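
The rule described in the commit message can be illustrated outside of yacc. What follows is a minimal hand-written sketch in C, not the LALR(1) parser from gram.y: it only counts how many leading parentheses end up belonging to the sub-SELECT when parentheses are absorbed into it for as long as possible. The function subselect_paren_layers and its helpers are hypothetical names introduced for this illustration; keywords are assumed to be upper-case, and string literals and comments are not handled.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Return a pointer to the first non-blank character at or after p. */
static const char *skip_blanks(const char *p)
{
    while (*p == ' ' || *p == '\t' || *p == '\n')
        p++;
    return p;
}

/* True if the text at p starts with the upper-case keyword kw as a whole word. */
static int at_keyword(const char *p, const char *kw)
{
    size_t len = strlen(kw);

    return strncmp(p, kw, len) == 0 &&
           !isalnum((unsigned char) p[len]) && p[len] != '_';
}

/* True if the text at p continues a SELECT (set operation or trailing clause). */
static int continues_select(const char *p)
{
    static const char *const kws[] = {
        "UNION", "INTERSECT", "EXCEPT", "ORDER", "LIMIT", "FOR"
    };
    size_t i;

    for (i = 0; i < sizeof(kws) / sizeof(kws[0]); i++)
        if (at_keyword(p, kws[i]))
            return 1;
    return 0;
}

/*
 * Hypothetical helper: given the text of an expression, return how many of
 * its leading '(' are treated as part of the sub-SELECT when parentheses
 * are absorbed into the sub-SELECT for as long as possible.
 */
int subselect_paren_layers(const char *expr)
{
    const char *p = skip_blanks(expr);
    int open = 0;    /* leading '(' not yet closed */
    int nested = 0;  /* depth of parentheses opened after the first SELECT */
    int layers = 0;  /* closed leading layers assigned to the sub-SELECT */

    while (*p == '(')
    {
        open++;
        p = skip_blanks(p + 1);
    }
    if (open == 0 || !at_keyword(p, "SELECT"))
        return 0;               /* no parenthesized sub-SELECT here */
    p += strlen("SELECT");

    for (; *p != '\0' && open > 0; p++)
    {
        if (*p == '(')
            nested++;
        else if (*p == ')' && nested > 0)
            nested--;
        else if (*p == ')')
        {
            const char *next = skip_blanks(p + 1);

            /* This ')' closes a leading layer: absorb it into the SELECT. */
            open--;
            layers++;
            /* Anything but ')' or a SELECT continuation ends the sub-SELECT. */
            if (*next != ')' && *next != '\0' && !continues_select(next))
                break;
        }
    }
    return layers;
}
```

Under these assumptions, subselect_paren_layers("(((SELECT 2)) + 3)") yields 2, leaving only the outermost pair to the surrounding expression, while subselect_paren_layers("(((SELECT 2)) UNION SELECT 2)") yields 3, because the UNION shows that the SELECT structure is still going on after the inner parentheses close.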

1 file changed, +81 -77 lines changed

src/backend/parser/gram.y (+81 -77)
@@ -11,7 +11,7 @@
  *
  *
  * IDENTIFICATION
- *    $Header: /cvsroot/pgsql/src/backend/parser/gram.y,v 2.214 2001/01/06 10:50:02 petere Exp $
+ *    $Header: /cvsroot/pgsql/src/backend/parser/gram.y,v 2.215 2001/01/15 20:36:36 tgl Exp $
  *
  * HISTORY
  *    AUTHOR            DATE            MAJOR EVENT
@@ -146,7 +146,8 @@ static void doNegateFloat(Value *v);
         UnlistenStmt, UpdateStmt, VacuumStmt, VariableResetStmt,
         VariableSetStmt, VariableShowStmt, ViewStmt, CheckPointStmt
 
-%type <node>    select_no_parens, select_clause, simple_select
+%type <node>    select_no_parens, select_with_parens, select_clause,
+                simple_select
 
 %type <node>    alter_column_action
 %type <ival>    drop_behavior
@@ -2666,16 +2667,7 @@ RuleActionMulti: RuleActionMulti ';' RuleActionStmtOrEmpty
                 }
         ;
 
-/*
- * Allowing RuleActionStmt to be a SelectStmt creates an ambiguity:
- * is the RuleActionList "((SELECT foo))" a standalone RuleActionStmt,
- * or a one-entry RuleActionMulti list?  We don't really care, but yacc
- * wants to know.  We use operator precedence to resolve the ambiguity:
- * giving this rule a higher precedence than ')' will force a reduce
- * rather than shift decision, causing the one-entry-list interpretation
- * to be chosen.
- */
-RuleActionStmt: SelectStmt    %prec TYPECAST
+RuleActionStmt: SelectStmt
         | InsertStmt
         | UpdateStmt
         | DeleteStmt
@@ -3262,32 +3254,48 @@ opt_cursor: BINARY { $$ = TRUE; }
  * The rule returns either a single SelectStmt node or a tree of them,
  * representing a set-operation tree.
  *
- * To avoid ambiguity problems with nested parentheses, we have to define
- * a "select_no_parens" nonterminal in which there are no parentheses
- * at the outermost level.  This is used in the production
- *        c_expr: '(' select_no_parens ')'
- * This gives a unique parsing of constructs where a subselect is nested
- * in an expression with extra parentheses: the parentheses are not part
- * of the subselect but of the outer expression.  yacc is not quite bright
- * enough to handle the situation completely, however.  To prevent a shift/
- * reduce conflict, we also have to attach a precedence to the
- *        SelectStmt: select_no_parens
- * rule that is higher than the precedence of ')'.  This means that when
- * "((SELECT foo" has been parsed in an expression context, and the
- * next token is ')', the parser will follow the '(' SelectStmt ')' reduction
- * path rather than '(' select_no_parens ')'.  The upshot is that excess
- * parens don't work in this context: SELECT ((SELECT foo)) will give a
- * parse error, whereas SELECT ((SELECT foo) UNION (SELECT bar)) is OK.
- * This is ugly, but it beats not allowing excess parens anywhere...
- *
- * In all other contexts, we can use SelectStmt which allows outer parens.
+ * There is an ambiguity when a sub-SELECT is within an a_expr and there
+ * are excess parentheses: do the parentheses belong to the sub-SELECT or
+ * to the surrounding a_expr?  We don't really care, but yacc wants to know.
+ * To resolve the ambiguity, we are careful to define the grammar so that
+ * the decision is staved off as long as possible: as long as we can keep
+ * absorbing parentheses into the sub-SELECT, we will do so, and only when
+ * it's no longer possible to do that will we decide that parens belong to
+ * the expression.  For example, in "SELECT (((SELECT 2)) + 3)" the extra
+ * parentheses are treated as part of the sub-select.  The necessity of doing
+ * it that way is shown by "SELECT (((SELECT 2)) UNION SELECT 2)".  Had we
+ * parsed "((SELECT 2))" as an a_expr, it'd be too late to go back to the
+ * SELECT viewpoint when we see the UNION.
+ *
+ * This approach is implemented by defining a nonterminal select_with_parens,
+ * which represents a SELECT with at least one outer layer of parentheses,
+ * and being careful to use select_with_parens, never '(' SelectStmt ')',
+ * in the expression grammar.  We will then have shift-reduce conflicts
+ * which we can resolve in favor of always treating '(' <select> ')' as
+ * a select_with_parens.  To resolve the conflicts, the productions that
+ * conflict with the select_with_parens productions are manually given
+ * precedences lower than the precedence of ')', thereby ensuring that we
+ * shift ')' (and then reduce to select_with_parens) rather than trying to
+ * reduce the inner <select> nonterminal to something else.  We use UMINUS
+ * precedence for this, which is a fairly arbitrary choice.
+ *
+ * To be able to define select_with_parens itself without ambiguity, we need
+ * a nonterminal select_no_parens that represents a SELECT structure with no
+ * outermost parentheses.  This is a little bit tedious, but it works.
+ *
+ * In non-expression contexts, we use SelectStmt which can represent a SELECT
+ * with or without outer parentheses.
  */
 
-SelectStmt: select_no_parens            %prec TYPECAST
+SelectStmt: select_no_parens            %prec UMINUS
+        | select_with_parens            %prec UMINUS
+        ;
+
+select_with_parens: '(' select_no_parens ')'
             {
-                $$ = $1;
+                $$ = $2;
             }
-        | '(' SelectStmt ')'
+        | '(' select_with_parens ')'
             {
                 $$ = $2;
             }
@@ -3318,13 +3326,7 @@ select_no_parens: simple_select
         ;
 
 select_clause: simple_select
-            {
-                $$ = $1;
-            }
-        | '(' SelectStmt ')'
-            {
-                $$ = $2;
-            }
+        | select_with_parens
         ;
 
 /*
@@ -3342,8 +3344,10 @@ select_clause: simple_select
  *        (SELECT foo UNION SELECT bar) ORDER BY baz
  * not
  *        SELECT foo UNION (SELECT bar ORDER BY baz)
- * Likewise FOR UPDATE and LIMIT.  This does not limit functionality,
- * because you can reintroduce sort and limit clauses inside parentheses.
+ * Likewise FOR UPDATE and LIMIT.  Therefore, those clauses are described
+ * as part of the select_no_parens production, not simple_select.
+ * This does not limit functionality, because you can reintroduce sort and
+ * limit clauses inside parentheses.
  *
  * NOTE: only the leftmost component SelectStmt should have INTO.
  * However, this is not checked by the grammar; parse analysis must check it.
@@ -3614,11 +3618,11 @@ table_ref: relation_expr
                     $1->name = $2;
                     $$ = (Node *) $1;
                 }
-        | '(' SelectStmt ')' alias_clause
+        | select_with_parens alias_clause
                 {
                     RangeSubselect *n = makeNode(RangeSubselect);
-                    n->subquery = $2;
-                    n->name = $4;
+                    n->subquery = $1;
+                    n->name = $2;
                     $$ = (Node *) n;
                 }
         | joined_table
@@ -3788,15 +3792,15 @@ relation_expr: relation_name
                     $$->inhOpt = INH_DEFAULT;
                     $$->name = NULL;
                 }
-        | relation_name '*'                %prec '='
+        | relation_name '*'
                 {
                     /* inheritance query */
                     $$ = makeNode(RangeVar);
                     $$->relname = $1;
                     $$->inhOpt = INH_YES;
                     $$->name = NULL;
                 }
-        | ONLY relation_name               %prec '='
+        | ONLY relation_name
                 {
                     /* no inheritance */
                     $$ = makeNode(RangeVar);
@@ -4146,27 +4150,27 @@ opt_interval: datetime { $$ = makeList1($1); }
  * Define row_descriptor to allow yacc to break the reduce/reduce conflict
  *  with singleton expressions.
  */
-row_expr: '(' row_descriptor ')' IN '(' SelectStmt ')'
+row_expr: '(' row_descriptor ')' IN select_with_parens
                 {
                     SubLink *n = makeNode(SubLink);
                     n->lefthand = $2;
                     n->oper = (List *) makeA_Expr(OP, "=", NULL, NULL);
                     n->useor = FALSE;
                     n->subLinkType = ANY_SUBLINK;
-                    n->subselect = $6;
+                    n->subselect = $5;
                     $$ = (Node *)n;
                 }
-        | '(' row_descriptor ')' NOT IN '(' SelectStmt ')'
+        | '(' row_descriptor ')' NOT IN select_with_parens
                 {
                     SubLink *n = makeNode(SubLink);
                     n->lefthand = $2;
                     n->oper = (List *) makeA_Expr(OP, "<>", NULL, NULL);
                     n->useor = TRUE;
                     n->subLinkType = ALL_SUBLINK;
-                    n->subselect = $7;
+                    n->subselect = $6;
                     $$ = (Node *)n;
                 }
-        | '(' row_descriptor ')' all_Op sub_type '(' SelectStmt ')'
+        | '(' row_descriptor ')' all_Op sub_type select_with_parens
                 {
                     SubLink *n = makeNode(SubLink);
                     n->lefthand = $2;
@@ -4176,10 +4180,10 @@ row_expr: '(' row_descriptor ')' IN '(' SelectStmt ')'
                     else
                         n->useor = FALSE;
                     n->subLinkType = $5;
-                    n->subselect = $7;
+                    n->subselect = $6;
                     $$ = (Node *)n;
                 }
-        | '(' row_descriptor ')' all_Op '(' SelectStmt ')'
+        | '(' row_descriptor ')' all_Op select_with_parens
                 {
                     SubLink *n = makeNode(SubLink);
                     n->lefthand = $2;
@@ -4189,7 +4193,7 @@ row_expr: '(' row_descriptor ')' IN '(' SelectStmt ')'
                     else
                         n->useor = FALSE;
                     n->subLinkType = MULTIEXPR_SUBLINK;
-                    n->subselect = $6;
+                    n->subselect = $5;
                     $$ = (Node *)n;
                 }
         | '(' row_descriptor ')' all_Op '(' row_descriptor ')'
@@ -4291,9 +4295,9 @@ a_expr: c_expr
          * If you add more explicitly-known operators, be sure to add them
          * also to b_expr and to the MathOp list above.
          */
-        | '+' a_expr %prec UMINUS
+        | '+' a_expr %prec UMINUS
                 { $$ = makeA_Expr(OP, "+", NULL, $2); }
-        | '-' a_expr %prec UMINUS
+        | '-' a_expr %prec UMINUS
                 { $$ = doNegate($2); }
         | '%' a_expr
                 { $$ = makeA_Expr(OP, "%", NULL, $2); }
@@ -4458,12 +4462,12 @@ a_expr: c_expr
                         makeA_Expr(OP, "<", $1, $4),
                         makeA_Expr(OP, ">", $1, $6));
                 }
-        | a_expr IN '(' in_expr ')'
+        | a_expr IN in_expr
                 {
                     /* in_expr returns a SubLink or a list of a_exprs */
-                    if (IsA($4, SubLink))
+                    if (IsA($3, SubLink))
                     {
-                        SubLink *n = (SubLink *)$4;
+                        SubLink *n = (SubLink *)$3;
                         n->lefthand = makeList1($1);
                         n->oper = (List *) makeA_Expr(OP, "=", NULL, NULL);
                         n->useor = FALSE;
@@ -4474,7 +4478,7 @@ a_expr: c_expr
                     {
                         Node *n = NULL;
                         List *l;
-                        foreach(l, (List *) $4)
+                        foreach(l, (List *) $3)
                         {
                             Node *cmp = makeA_Expr(OP, "=", $1, lfirst(l));
                             if (n == NULL)
@@ -4485,12 +4489,12 @@ a_expr: c_expr
                         $$ = n;
                     }
                 }
-        | a_expr NOT IN '(' in_expr ')'
+        | a_expr NOT IN in_expr
                 {
                     /* in_expr returns a SubLink or a list of a_exprs */
-                    if (IsA($5, SubLink))
+                    if (IsA($4, SubLink))
                     {
-                        SubLink *n = (SubLink *)$5;
+                        SubLink *n = (SubLink *)$4;
                         n->lefthand = makeList1($1);
                         n->oper = (List *) makeA_Expr(OP, "<>", NULL, NULL);
                         n->useor = FALSE;
@@ -4501,7 +4505,7 @@ a_expr: c_expr
                     {
                         Node *n = NULL;
                         List *l;
-                        foreach(l, (List *) $5)
+                        foreach(l, (List *) $4)
                         {
                             Node *cmp = makeA_Expr(OP, "<>", $1, lfirst(l));
                             if (n == NULL)
@@ -4512,14 +4516,14 @@ a_expr: c_expr
                         $$ = n;
                     }
                 }
-        | a_expr all_Op sub_type '(' SelectStmt ')'
+        | a_expr all_Op sub_type select_with_parens
                 {
                     SubLink *n = makeNode(SubLink);
                     n->lefthand = makeList1($1);
                     n->oper = (List *) makeA_Expr(OP, $2, NULL, NULL);
                     n->useor = FALSE; /* doesn't matter since only one col */
                     n->subLinkType = $3;
-                    n->subselect = $5;
+                    n->subselect = $4;
                     $$ = (Node *)n;
                 }
         | row_expr
@@ -4539,9 +4543,9 @@ b_expr: c_expr
                 { $$ = $1; }
         | b_expr TYPECAST Typename
                 { $$ = makeTypeCast($1, $3); }
-        | '+' b_expr %prec UMINUS
+        | '+' b_expr %prec UMINUS
                 { $$ = makeA_Expr(OP, "+", NULL, $2); }
-        | '-' b_expr %prec UMINUS
+        | '-' b_expr %prec UMINUS
                 { $$ = doNegate($2); }
         | '%' b_expr
                 { $$ = makeA_Expr(OP, "%", NULL, $2); }
@@ -4908,24 +4912,24 @@ c_expr: attr
                     n->agg_distinct = FALSE;
                     $$ = (Node *)n;
                 }
-        | '(' select_no_parens ')'
+        | select_with_parens            %prec UMINUS
                 {
                     SubLink *n = makeNode(SubLink);
                     n->lefthand = NIL;
                     n->oper = NIL;
                     n->useor = FALSE;
                     n->subLinkType = EXPR_SUBLINK;
-                    n->subselect = $2;
+                    n->subselect = $1;
                     $$ = (Node *)n;
                 }
-        | EXISTS '(' SelectStmt ')'
+        | EXISTS select_with_parens
                 {
                     SubLink *n = makeNode(SubLink);
                     n->lefthand = NIL;
                     n->oper = NIL;
                     n->useor = FALSE;
                     n->subLinkType = EXISTS_SUBLINK;
-                    n->subselect = $3;
+                    n->subselect = $2;
                     $$ = (Node *)n;
                 }
         ;
@@ -5037,14 +5041,14 @@ trim_list: a_expr FROM expr_list
                 { $$ = $1; }
         ;
 
-in_expr: SelectStmt
+in_expr: select_with_parens
                 {
                     SubLink *n = makeNode(SubLink);
                     n->subselect = $1;
                     $$ = (Node *)n;
                 }
-        | in_expr_nodes
-                { $$ = (Node *)$1; }
+        | '(' in_expr_nodes ')'
+                { $$ = (Node *)$2; }
         ;
 
 in_expr_nodes: a_expr
