Commit 83f1c7b

Fix ECPG's handling of type names that match SQL keywords.

Previously, ECPG could only cope with variable declarations whose type names either weren't any SQL keyword, or were at least partially reserved. If you tried to use something in the unreserved_keyword category, you got a syntax error.

This is pretty awful, not only because it says right on the tin that those words are not reserved, but because the set of such keywords tends to grow over time. Thus, an ECPG program that was just fine last year could fail when recompiled with a newer SQL grammar. We had to work around this recently when STRING became a keyword, but it's time for an actual fix instead of a band-aid.

To fix, borrow a trick from C parsers and make the lexer's behavior change when it sees a word that is known as a typedef. This is not free of downsides: if you try to use such a name as a SQL keyword in EXEC SQL later in the program, it won't be recognized as a SQL keyword, leading to a syntax error there instead. So in a real sense this is just trading one hazard for another. But there is an important difference: with this, whether your ECPG program works depends only on what typedef names and SQL commands are used in the program text. If it compiles today it'll still compile next year, even if more words have become SQL keywords.

Discussion: https://postgr.es/m/3661437.1653855582@sss.pgh.pa.us
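
The lexer trick described in the commit message can be illustrated with a small, self-contained C sketch. This is not the ECPG code: the names `token_class`, `in_list`, and `classify_word` are hypothetical, the keyword and typedef tables are plain string arrays, matching is exact (no case folding), and the C-keyword lookup that the real lexer performs afterwards is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token classes; a real lexer would return parser token numbers. */
enum token_class
{
	TOK_IDENT,
	TOK_SQL_KEYWORD
};

/* Return true if word appears in a NULL-terminated array of names. */
static bool
in_list(const char *word, const char *const *list)
{
	assert(word != NULL && list != NULL);
	for (; *list != NULL; list++)
	{
		if (strcmp(*list, word) == 0)
			return true;
	}
	return false;
}

/*
 * Classify one word.  A name already declared as a typedef is reported as a
 * plain identifier even if it also appears in the SQL keyword table, so the
 * result does not change when the keyword table grows.
 */
static enum token_class
classify_word(const char *word,
			  const char *const *typedef_names,
			  const char *const *sql_keywords)
{
	if (in_list(word, typedef_names))
		return TOK_IDENT;		/* typedef overrides SQL keyword */
	if (in_list(word, sql_keywords))
		return TOK_SQL_KEYWORD;
	return TOK_IDENT;
}
```

With a typedef table containing `string`, `classify_word("string", ...)` yields `TOK_IDENT` whether or not `string` has been added to the keyword table, which is the stability property the commit message argues for. The cost is the one the message names: the same word can no longer be used as a SQL keyword in that program.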
1 parent e64cdab commit 83f1c7b

9 files changed: +236 -106 lines changed


doc/src/sgml/ecpg.sgml

+38 -1
@@ -1483,6 +1483,10 @@ EXEC SQL END DECLARE SECTION;
 
    <sect4>
     <title>Typedefs</title>
+    <indexterm>
+     <primary>typedef</primary>
+     <secondary>in ECPG</secondary>
+    </indexterm>
 
     <para>
      Use the <literal>typedef</literal> keyword to map new types to already
@@ -1497,8 +1501,41 @@ EXEC SQL END DECLARE SECTION;
 <programlisting>
 EXEC SQL TYPE serial_t IS long;
 </programlisting>
-     This declaration does not need to be part of a declare section.
+     This declaration does not need to be part of a declare section;
+     that is, you can also write typedefs as normal C statements.
     </para>
+
+    <para>
+     Any word you declare as a typedef cannot be used as a SQL keyword
+     in <literal>EXEC SQL</literal> commands later in the same program.
+     For example, this won't work:
+<programlisting>
+EXEC SQL BEGIN DECLARE SECTION;
+    typedef int start;
+EXEC SQL END DECLARE SECTION;
+...
+EXEC SQL START TRANSACTION;
+</programlisting>
+     ECPG will report a syntax error for <literal>START
+     TRANSACTION</literal>, because it no longer
+     recognizes <literal>START</literal> as a SQL keyword,
+     only as a typedef.
+     (If you have such a conflict, and renaming the typedef
+     seems impractical, you could write the SQL command
+     using <link linkend="ecpg-dynamic">dynamic SQL</link>.)
+    </para>
+
+    <note>
+     <para>
+      In <productname>PostgreSQL</productname> releases before v16, use
+      of SQL keywords as typedef names was likely to result in syntax
+      errors associated with use of the typedef itself, rather than use
+      of the name as a SQL keyword. The new behavior is less likely to
+      cause problems when an existing ECPG application is recompiled in
+      a new <productname>PostgreSQL</productname> release with new
+      keywords.
+     </para>
+    </note>
    </sect4>
 
    <sect4>

src/interfaces/ecpg/preproc/ecpg.trailer

+117 -53
@@ -564,8 +564,29 @@ var_type: simple_type
 			$$.type_index = mm_strdup("-1");
 			$$.type_sizeof = NULL;
 		}
-		| ECPGColLabelCommon '(' precision opt_scale ')'
+		| NUMERIC '(' precision opt_scale ')'
 		{
+			$$.type_enum = ECPGt_numeric;
+			$$.type_str = mm_strdup("numeric");
+			$$.type_dimension = mm_strdup("-1");
+			$$.type_index = mm_strdup("-1");
+			$$.type_sizeof = NULL;
+		}
+		| DECIMAL_P '(' precision opt_scale ')'
+		{
+			$$.type_enum = ECPGt_decimal;
+			$$.type_str = mm_strdup("decimal");
+			$$.type_dimension = mm_strdup("-1");
+			$$.type_index = mm_strdup("-1");
+			$$.type_sizeof = NULL;
+		}
+		| IDENT '(' precision opt_scale ')'
+		{
+			/*
+			 * In C parsing mode, NUMERIC and DECIMAL are not keywords, so
+			 * they will show up here as a plain identifier, and we need
+			 * this duplicate code to recognize them.
+			 */
 			if (strcmp($1, "numeric") == 0)
 			{
 				$$.type_enum = ECPGt_numeric;
@@ -587,15 +608,98 @@ var_type: simple_type
 			$$.type_index = mm_strdup("-1");
 			$$.type_sizeof = NULL;
 		}
-		| ECPGColLabelCommon ecpg_interval
+		| VARCHAR
 		{
-			if (strlen($2) != 0 && strcmp ($1, "datetime") != 0 && strcmp ($1, "interval") != 0)
-				mmerror (PARSE_ERROR, ET_ERROR, "interval specification not allowed here");
+			$$.type_enum = ECPGt_varchar;
+			$$.type_str = EMPTY; /*mm_strdup("varchar");*/
+			$$.type_dimension = mm_strdup("-1");
+			$$.type_index = mm_strdup("-1");
+			$$.type_sizeof = NULL;
+		}
+		| FLOAT_P
+		{
+			/* Note: DOUBLE is handled in simple_type */
+			$$.type_enum = ECPGt_float;
+			$$.type_str = mm_strdup("float");
+			$$.type_dimension = mm_strdup("-1");
+			$$.type_index = mm_strdup("-1");
+			$$.type_sizeof = NULL;
+		}
+		| NUMERIC
+		{
+			$$.type_enum = ECPGt_numeric;
+			$$.type_str = mm_strdup("numeric");
+			$$.type_dimension = mm_strdup("-1");
+			$$.type_index = mm_strdup("-1");
+			$$.type_sizeof = NULL;
+		}
+		| DECIMAL_P
+		{
+			$$.type_enum = ECPGt_decimal;
+			$$.type_str = mm_strdup("decimal");
+			$$.type_dimension = mm_strdup("-1");
+			$$.type_index = mm_strdup("-1");
+			$$.type_sizeof = NULL;
+		}
+		| TIMESTAMP
+		{
+			$$.type_enum = ECPGt_timestamp;
+			$$.type_str = mm_strdup("timestamp");
+			$$.type_dimension = mm_strdup("-1");
+			$$.type_index = mm_strdup("-1");
+			$$.type_sizeof = NULL;
+		}
+		| INTERVAL ecpg_interval
+		{
+			$$.type_enum = ECPGt_interval;
+			$$.type_str = mm_strdup("interval");
+			$$.type_dimension = mm_strdup("-1");
+			$$.type_index = mm_strdup("-1");
+			$$.type_sizeof = NULL;
+		}
+		| STRING
+		{
+			if (INFORMIX_MODE)
+			{
+				/* In Informix mode, "string" is automatically a typedef */
+				$$.type_enum = ECPGt_string;
+				$$.type_str = mm_strdup("char");
+				$$.type_dimension = mm_strdup("-1");
+				$$.type_index = mm_strdup("-1");
+				$$.type_sizeof = NULL;
+			}
+			else
+			{
+				/* Otherwise, legal only if user typedef'ed it */
+				struct typedefs *this = get_typedef("string", false);
+
+				$$.type_str = (this->type->type_enum == ECPGt_varchar || this->type->type_enum == ECPGt_bytea) ? EMPTY : mm_strdup(this->name);
+				$$.type_enum = this->type->type_enum;
+				$$.type_dimension = this->type->type_dimension;
+				$$.type_index = this->type->type_index;
+				if (this->type->type_sizeof && strlen(this->type->type_sizeof) != 0)
+					$$.type_sizeof = this->type->type_sizeof;
+				else
+					$$.type_sizeof = cat_str(3, mm_strdup("sizeof("), mm_strdup(this->name), mm_strdup(")"));
 
+				struct_member_list[struct_level] = ECPGstruct_member_dup(this->struct_member_list);
+			}
+		}
+		| IDENT ecpg_interval
+		{
 			/*
-			 * Check for type names that the SQL grammar treats as
-			 * unreserved keywords
+			 * In C parsing mode, the above SQL type names are not keywords,
+			 * so they will show up here as a plain identifier, and we need
+			 * this duplicate code to recognize them.
+			 *
+			 * Note that we also handle the type names bytea, date, and
+			 * datetime here, but not above because those are not currently
+			 * SQL keywords. If they ever become so, they must gain duplicate
+			 * productions above.
 			 */
+			if (strlen($2) != 0 && strcmp ($1, "datetime") != 0 && strcmp ($1, "interval") != 0)
+				mmerror (PARSE_ERROR, ET_ERROR, "interval specification not allowed here");
+
 			if (strcmp($1, "varchar") == 0)
 			{
 				$$.type_enum = ECPGt_varchar;
@@ -686,45 +790,8 @@ var_type: simple_type
 			}
 			else
 			{
-				/* this is for typedef'ed types */
-				struct typedefs *this = get_typedef($1);
-
-				$$.type_str = (this->type->type_enum == ECPGt_varchar || this->type->type_enum == ECPGt_bytea) ? EMPTY : mm_strdup(this->name);
-				$$.type_enum = this->type->type_enum;
-				$$.type_dimension = this->type->type_dimension;
-				$$.type_index = this->type->type_index;
-				if (this->type->type_sizeof && strlen(this->type->type_sizeof) != 0)
-					$$.type_sizeof = this->type->type_sizeof;
-				else
-					$$.type_sizeof = cat_str(3, mm_strdup("sizeof("), mm_strdup(this->name), mm_strdup(")"));
-
-				struct_member_list[struct_level] = ECPGstruct_member_dup(this->struct_member_list);
-			}
-		}
-		| STRING
-		{
-			/*
-			 * It's quite horrid that ECPGColLabelCommon excludes
-			 * unreserved_keyword, meaning that unreserved keywords can't be
-			 * used as type names in var_type. However, this is hard to avoid
-			 * since what follows ecpgstart can be either a random SQL
-			 * statement or an ECPGVarDeclaration (beginning with var_type).
-			 * Pending a bright idea about how to fix that, we must
-			 * special-case STRING (and any other unreserved keywords that are
-			 * likely to be needed here).
-			 */
-			if (INFORMIX_MODE)
-			{
-				$$.type_enum = ECPGt_string;
-				$$.type_str = mm_strdup("char");
-				$$.type_dimension = mm_strdup("-1");
-				$$.type_index = mm_strdup("-1");
-				$$.type_sizeof = NULL;
-			}
-			else
-			{
-				/* this is for typedef'ed types */
-				struct typedefs *this = get_typedef("string");
+				/* Otherwise, it must be a user-defined typedef name */
+				struct typedefs *this = get_typedef($1, false);
 
 				$$.type_str = (this->type->type_enum == ECPGt_varchar || this->type->type_enum == ECPGt_bytea) ? EMPTY : mm_strdup(this->name);
 				$$.type_enum = this->type->type_enum;
@@ -751,7 +818,7 @@ var_type: simple_type
 				{
 					/* No */
 
-					this = get_typedef(name);
+					this = get_typedef(name, false);
 					$$.type_str = mm_strdup(this->name);
 					$$.type_enum = this->type->type_enum;
 					$$.type_dimension = this->type->type_dimension;
@@ -1657,17 +1724,14 @@ ColLabel: ECPGColLabel { $$ = $1; }
 		| ECPGunreserved_interval { $$ = $1; }
 		;
 
-ECPGColLabel: ECPGColLabelCommon { $$ = $1; }
+ECPGColLabel: ecpg_ident { $$ = $1; }
 		| unreserved_keyword { $$ = $1; }
-		| reserved_keyword { $$ = $1; }
-		| ECPGKeywords_rest { $$ = $1; }
-		| CONNECTION { $$ = mm_strdup("connection"); }
-		;
-
-ECPGColLabelCommon: ecpg_ident { $$ = $1; }
 		| col_name_keyword { $$ = $1; }
 		| type_func_name_keyword { $$ = $1; }
+		| reserved_keyword { $$ = $1; }
 		| ECPGKeywords_vanames { $$ = $1; }
+		| ECPGKeywords_rest { $$ = $1; }
+		| CONNECTION { $$ = mm_strdup("connection"); }
 		;
 
 ECPGCKeywords: S_AUTO { $$ = mm_strdup("auto"); }
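
The grammar comments above note that in C parsing mode the built-in SQL type names arrive as plain identifiers, so the IDENT productions must recognize them by string comparison before falling back to the typedef list. A rough C sketch of that fallback order follows. The enum `sketch_type` and the function `type_from_ident` are hypothetical stand-ins, not ECPG's `ECPGt_*` values or grammar actions, and only some of the type names mentioned in the diff are listed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for this sketch only. */
enum sketch_type
{
	SKETCH_VARCHAR,
	SKETCH_FLOAT,
	SKETCH_NUMERIC,
	SKETCH_DECIMAL,
	SKETCH_TIMESTAMP,
	SKETCH_INTERVAL,
	SKETCH_USER_TYPEDEF			/* not built in: consult the typedef list */
};

/*
 * Map an identifier to a built-in type if its spelling matches one;
 * otherwise report that it must be looked up as a user-defined typedef.
 */
static enum sketch_type
type_from_ident(const char *ident)
{
	static const struct
	{
		const char *name;
		enum sketch_type type;
	}			builtin[] = {
		{"varchar", SKETCH_VARCHAR},
		{"float", SKETCH_FLOAT},
		{"numeric", SKETCH_NUMERIC},
		{"decimal", SKETCH_DECIMAL},
		{"timestamp", SKETCH_TIMESTAMP},
		{"interval", SKETCH_INTERVAL},
	};

	assert(ident != NULL);
	for (size_t i = 0; i < sizeof(builtin) / sizeof(builtin[0]); i++)
	{
		if (strcmp(ident, builtin[i].name) == 0)
			return builtin[i].type;
	}
	return SKETCH_USER_TYPEDEF;
}
```

The table form is a choice made for brevity here; the grammar in the diff uses a chain of `strcmp` tests, which behaves the same way for this purpose.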

src/interfaces/ecpg/preproc/ecpg.type

+0 -1
@@ -3,7 +3,6 @@
 %type <str> ECPGCKeywords
 %type <str> ECPGColId
 %type <str> ECPGColLabel
-%type <str> ECPGColLabelCommon
 %type <str> ECPGConnect
 %type <str> ECPGCursorStmt
 %type <str> ECPGDeallocateDescr

src/interfaces/ecpg/preproc/pgc.l

+13 -4
@@ -983,10 +983,19 @@ cppline {space}*#([^i][A-Za-z]*|{if}|{ifdef}|{ifndef}|{import})((\/\*[^*/]*\*+
 					{
 						int kwvalue;
 
-						/* Is it an SQL/ECPG keyword? */
-						kwvalue = ScanECPGKeywordLookup(yytext);
-						if (kwvalue >= 0)
-							return kwvalue;
+						/*
+						 * User-defined typedefs override SQL keywords, but
+						 * not C keywords. Currently, a typedef name is just
+						 * reported as IDENT, but someday we might need to
+						 * return a distinct token type.
+						 */
+						if (get_typedef(yytext, true) == NULL)
+						{
+							/* Is it an SQL/ECPG keyword? */
+							kwvalue = ScanECPGKeywordLookup(yytext);
+							if (kwvalue >= 0)
+								return kwvalue;
+						}
 
 						/* Is it a C keyword? */
 						kwvalue = ScanCKeywordLookup(yytext);

src/interfaces/ecpg/preproc/preproc_extern.h

+1 -1
@@ -93,7 +93,7 @@ extern void add_variable_to_head(struct arguments **, struct variable *, struct
 extern void add_variable_to_tail(struct arguments **, struct variable *, struct variable *);
 extern void remove_variable_from_list(struct arguments **list, struct variable *var);
 extern void dump_variables(struct arguments *, int);
-extern struct typedefs *get_typedef(char *);
+extern struct typedefs *get_typedef(const char *name, bool noerror);
 extern void adjust_array(enum ECPGttype, char **, char **, char *, char *, int, bool);
 extern void reset_variables(void);
 extern void check_indicator(struct ECPGtype *);

src/interfaces/ecpg/preproc/variable.c

+9 -4
@@ -497,15 +497,20 @@ check_indicator(struct ECPGtype *var)
 }
 
 struct typedefs *
-get_typedef(char *name)
+get_typedef(const char *name, bool noerror)
 {
 	struct typedefs *this;
 
-	for (this = types; this && strcmp(this->name, name) != 0; this = this->next);
-	if (!this)
+	for (this = types; this != NULL; this = this->next)
+	{
+		if (strcmp(this->name, name) == 0)
+			return this;
+	}
+
+	if (!noerror)
 		mmfatal(PARSE_ERROR, "unrecognized data type name \"%s\"", name);
 
-	return this;
+	return NULL;
 }
 
 void
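
The reworked `get_typedef` serves two callers: the grammar, which wants a fatal error for an unknown type name, and the lexer, which only wants to ask whether a name is a typedef. The following standalone sketch mirrors that contract under simplifying assumptions: `struct typedef_entry` and `lookup_typedef` are hypothetical names, the list head is passed as a parameter rather than read from a global, and `fprintf` plus `exit` stand in for `mmfatal`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for a typedef list node; only what the lookup needs. */
struct typedef_entry
{
	const char *name;
	struct typedef_entry *next;
};

/*
 * Find name in the list.  On a miss, return NULL if noerror is true;
 * otherwise print a message and exit (assumed stand-in for mmfatal).
 */
static struct typedef_entry *
lookup_typedef(struct typedef_entry *head, const char *name, bool noerror)
{
	assert(name != NULL);

	for (struct typedef_entry *cur = head; cur != NULL; cur = cur->next)
	{
		if (strcmp(cur->name, name) == 0)
			return cur;
	}

	if (!noerror)
	{
		fprintf(stderr, "unrecognized data type name \"%s\"\n", name);
		exit(EXIT_FAILURE);
	}

	return NULL;
}
```

A lexer-style caller would pass `noerror = true` and treat a non-NULL result as "this word is a typedef, do not look it up as a SQL keyword"; a grammar-style caller would pass `false` and can then dereference the result without a NULL check.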
