Commit 9f0a017

Fix COPY FROM for null marker strings that correspond to invalid encoding.
The COPY documentation says "COPY FROM matches the input against the null string before removing backslashes". It is therefore reasonable to presume that null markers like E'\\0' will work ... and they did, until someone put the tests in the wrong order during microoptimization-driven rewrites. Since then, we've been failing if the null marker is something that would de-escape to an invalidly-encoded string. Since null markers generally need to be something that can't appear in the data, this represents a nontrivial loss of functionality; surprising nobody noticed it earlier.

Per report from Jeff Davis. Backpatch to 8.4 where this got broken.
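The ordering described above can be illustrated with a minimal, self-contained sketch. This is not the code from copy.c: `read_field`, `toy_verify`, and the `FieldResult` enum are hypothetical names, the de-escaper handles only backslash-octal and backslash-literal sequences, and `toy_verify` is a toy stand-in for `pg_verifymbstr` that simply rejects NUL and high-bit bytes. The point is only that the raw input is compared against the null marker before any validation of the de-escaped bytes can fail.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Outcome of reading one field (illustrative names, not PostgreSQL's). */
typedef enum { FIELD_NULL, FIELD_DATA, FIELD_INVALID } FieldResult;

/*
 * Toy stand-in for pg_verifymbstr(): rejects embedded NUL and high-bit
 * bytes. The real function validates against the database encoding.
 */
static bool
toy_verify(const char *s, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (s[i] == '\0' || ((unsigned char) s[i] & 0x80))
            return false;
    return true;
}

/*
 * Read one raw field: de-escape speculatively into out, then decide.
 * out must have room for raw_len + 1 bytes.
 */
static FieldResult
read_field(const char *raw, size_t raw_len,
           const char *null_marker, char *out, size_t *out_len)
{
    size_t n = 0;
    bool saw_non_ascii = false;

    for (size_t i = 0; i < raw_len; i++)
    {
        unsigned char c = (unsigned char) raw[i];

        if (c == '\\' && i + 1 < raw_len)
        {
            c = (unsigned char) raw[++i];
            if (c >= '0' && c <= '7')
            {
                /* Up to three octal digits form one byte. */
                unsigned val = c - '0';
                int digits = 1;

                while (digits < 3 && i + 1 < raw_len &&
                       raw[i + 1] >= '0' && raw[i + 1] <= '7')
                {
                    val = (val << 3) + (unsigned) (raw[++i] - '0');
                    digits++;
                }
                c = (unsigned char) (val & 0377);
                if (c == '\0' || (c & 0x80))
                    saw_non_ascii = true;
            }
            /* Any other escaped character is taken literally. */
        }
        out[n++] = (char) c;
    }

    /* 1. Compare the RAW input against the null marker first. */
    if (raw_len == strlen(null_marker) &&
        strncmp(raw, null_marker, raw_len) == 0)
        return FIELD_NULL;

    /* 2. Only now is the de-escaped text known to be data: validate it. */
    if (saw_non_ascii && !toy_verify(out, n))
        return FIELD_INVALID;

    out[n] = '\0';
    *out_len = n;
    return FIELD_DATA;
}
```

Under these assumptions, a raw field consisting of a backslash followed by `0` yields `FIELD_NULL` when that same two-character string is the null marker, even though its de-escaped form is a NUL byte that `toy_verify` would reject. With a different null marker the same input yields `FIELD_INVALID`.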
1 parent 811a2cb commit 9f0a017

3 files changed, 60 insertions(+), 16 deletions(-)

src/backend/commands/copy.c

Lines changed: 29 additions & 16 deletions
@@ -3098,7 +3098,17 @@ CopyReadAttributesText(CopyState cstate)
 		start_ptr = cur_ptr;
 		cstate->raw_fields[fieldno] = output_ptr;
 
-		/* Scan data for field */
+		/*
+		 * Scan data for field.
+		 *
+		 * Note that in this loop, we are scanning to locate the end of field
+		 * and also speculatively performing de-escaping. Once we find the
+		 * end-of-field, we can match the raw field contents against the null
+		 * marker string. Only after that comparison fails do we know that
+		 * de-escaping is actually the right thing to do; therefore we *must
+		 * not* throw any syntax errors before we've done the null-marker
+		 * check.
+		 */
 		for (;;)
 		{
 			char c;
@@ -3211,26 +3221,29 @@ CopyReadAttributesText(CopyState cstate)
 			*output_ptr++ = c;
 		}
 
-		/* Terminate attribute value in output area */
-		*output_ptr++ = '\0';
-
-		/*
-		 * If we de-escaped a non-7-bit-ASCII char, make sure we still have
-		 * valid data for the db encoding. Avoid calling strlen here for the
-		 * sake of efficiency.
-		 */
-		if (saw_non_ascii)
-		{
-			char *fld = cstate->raw_fields[fieldno];
-
-			pg_verifymbstr(fld, output_ptr - (fld + 1), false);
-		}
-
 		/* Check whether raw input matched null marker */
 		input_len = end_ptr - start_ptr;
 		if (input_len == cstate->null_print_len &&
 			strncmp(start_ptr, cstate->null_print, input_len) == 0)
 			cstate->raw_fields[fieldno] = NULL;
+		else
+		{
+			/*
+			 * At this point we know the field is supposed to contain data.
+			 *
+			 * If we de-escaped any non-7-bit-ASCII chars, make sure the
+			 * resulting string is valid data for the db encoding.
+			 */
+			if (saw_non_ascii)
+			{
+				char *fld = cstate->raw_fields[fieldno];
+
+				pg_verifymbstr(fld, output_ptr - fld, false);
+			}
+		}
+
+		/* Terminate attribute value in output area */
+		*output_ptr++ = '\0';
 
 		fieldno++;
 		/* Done if we hit EOL instead of a delim */
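One detail of the hunk above is that the length passed to `pg_verifymbstr` changes from `output_ptr - (fld + 1)` to `output_ptr - fld`. A plausible reading is that the two expressions describe the same field length, and the adjustment disappears because the terminator is now written after the check rather than before it. The following sketch, using a hypothetical `measure_field` helper and `FieldLengths` struct that are not part of PostgreSQL, computes both expressions at the points where each ordering would evaluate them.

```c
#include <stddef.h>
#include <string.h>

/* Lengths computed at the two points in the sequence (illustrative only). */
typedef struct
{
    size_t before_terminator;   /* new ordering: output_ptr - fld */
    size_t after_terminator;    /* old ordering: output_ptr - (fld + 1) */
} FieldLengths;

/*
 * Copy n bytes of data into buf as a field, then terminate it, measuring
 * the field length both before and after the terminator is appended.
 * buf must have room for n + 1 bytes.
 */
static FieldLengths
measure_field(const char *data, size_t n, char *buf)
{
    FieldLengths r;
    char *fld = buf;
    char *output_ptr = buf;

    memcpy(output_ptr, data, n);
    output_ptr += n;

    /* Check happens before termination: no adjustment is needed. */
    r.before_terminator = (size_t) (output_ptr - fld);

    /* Terminate attribute value in output area. */
    *output_ptr++ = '\0';

    /* Check happens after termination: skip the terminator byte. */
    r.after_terminator = (size_t) (output_ptr - (fld + 1));

    return r;
}
```

Both members come out equal to `n`, which is consistent with the diff dropping the `+ 1` once the terminator is moved below the check.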

src/test/regress/expected/copy2.out

Lines changed: 16 additions & 0 deletions
@@ -239,6 +239,22 @@ a\.
 \.b
 c\.d
 "\."
+-- test handling of nonstandard null marker that violates escaping rules
+CREATE TEMP TABLE testnull(a int, b text);
+INSERT INTO testnull VALUES (1, E'\\0'), (NULL, NULL);
+COPY testnull TO stdout WITH NULL AS E'\\0';
+1	\\0
+\0	\0
+COPY testnull FROM stdin WITH NULL AS E'\\0';
+SELECT * FROM testnull;
+ a  | b  
+----+----
+  1 | \0
+    | 
+ 42 | \0
+    | 
+(4 rows)
+
 DROP TABLE x, y;
 DROP FUNCTION fn_x_before();
 DROP FUNCTION fn_x_after();

src/test/regress/sql/copy2.sql

Lines changed: 15 additions & 0 deletions
@@ -164,6 +164,21 @@ c\.d
 
 COPY testeoc TO stdout CSV;
 
+-- test handling of nonstandard null marker that violates escaping rules
+
+CREATE TEMP TABLE testnull(a int, b text);
+INSERT INTO testnull VALUES (1, E'\\0'), (NULL, NULL);
+
+COPY testnull TO stdout WITH NULL AS E'\\0';
+
+COPY testnull FROM stdin WITH NULL AS E'\\0';
+42	\\0
+\0	\0
+\.
+
+SELECT * FROM testnull;
+
+
 DROP TABLE x, y;
 DROP FUNCTION fn_x_before();
 DROP FUNCTION fn_x_after();
