Commit f10a20e

Doc: fix thinko in description of how to escape a backslash in bytea.
Also clean up some discussion that had been left in a very confused state thanks to half-hearted adjustments for the change to standard_conforming_strings being the default.

Discussion: https://postgr.es/m/154954987367.1297.4358910045409218@wrigleys.postgresql.org
1 parent 3d462f0 commit f10a20e

1 file changed, +26 -32 lines changed

doc/src/sgml/datatype.sgml

+26-32
@@ -1335,9 +1335,9 @@ SELECT b, char_length(b) FROM test2;
 per byte, most significant nibble first. The entire string is
 preceded by the sequence <literal>\x</literal> (to distinguish it
 from the escape format). In some contexts, the initial backslash may
-need to be escaped by doubling it, in the same cases in which backslashes
-have to be doubled in escape format; details appear below.
-The hexadecimal digits can
+need to be escaped by doubling it
+(see <xref linkend="sql-syntax-strings"/>).
+For input, the hexadecimal digits can
 be either upper or lower case, and whitespace is permitted between
 digit pairs (but not within a digit pair nor in the starting
 <literal>\x</literal> sequence).
@@ -1379,9 +1379,7 @@ SELECT '\xDEADBEEF';
 values <emphasis>must</emphasis> be escaped, while all octet
 values <emphasis>can</emphasis> be escaped. In
 general, to escape an octet, convert it into its three-digit
-octal value and precede it
-by a backslash (or two backslashes, if writing the value as a
-literal using escape string syntax).
+octal value and precede it by a backslash.
 Backslash itself (octet decimal value 92) can alternatively be represented by
 double backslashes.
 <xref linkend="datatype-binary-sqlesc"/>
@@ -1398,7 +1396,7 @@ SELECT '\xDEADBEEF';
 <entry>Description</entry>
 <entry>Escaped Input Representation</entry>
 <entry>Example</entry>
-<entry>Output Representation</entry>
+<entry>Hex Representation</entry>
 </row>
 </thead>

@@ -1422,7 +1420,7 @@ SELECT '\xDEADBEEF';
 <row>
 <entry>92</entry>
 <entry>backslash</entry>
-<entry><literal>'\'</literal> or <literal>'\\134'</literal></entry>
+<entry><literal>'\\'</literal> or <literal>'\134'</literal></entry>
 <entry><literal>SELECT '\\'::bytea;</literal></entry>
 <entry><literal>\x5c</literal></entry>
 </row>
@@ -1442,39 +1440,35 @@ SELECT '\xDEADBEEF';
 <para>
 The requirement to escape <emphasis>non-printable</emphasis> octets
 varies depending on locale settings. In some instances you can get away
-with leaving them unescaped. Note that the result in each of the examples
-in <xref linkend="datatype-binary-sqlesc"/> was exactly one octet in
-length, even though the output representation is sometimes
-more than one character.
+with leaving them unescaped.
 </para>

 <para>
-The reason multiple backslashes are required, as shown
-in <xref linkend="datatype-binary-sqlesc"/>, is that an input
-string written as a string literal must pass through two parse
-phases in the <productname>PostgreSQL</productname> server.
-The first backslash of each pair is interpreted as an escape
-character by the string-literal parser (assuming escape string
-syntax is used) and is therefore consumed, leaving the second backslash of the
-pair. (Dollar-quoted strings can be used to avoid this level
-of escaping.) The remaining backslash is then recognized by the
-<type>bytea</type> input function as starting either a three
-digit octal value or escaping another backslash. For example,
-a string literal passed to the server as <literal>'\001'</literal>
-becomes <literal>\001</literal> after passing through the
-escape string parser. The <literal>\001</literal> is then sent
-to the <type>bytea</type> input function, where it is converted
-to a single octet with a decimal value of 1. Note that the
-single-quote character is not treated specially by <type>bytea</type>,
-so it follows the normal rules for string literals. (See also
-<xref linkend="sql-syntax-strings"/>.)
+The reason that single quotes must be doubled, as shown
+in <xref linkend="datatype-binary-sqlesc"/>, is that this
+is true for any string literal in a SQL command. The generic
+string-literal parser consumes the outermost single quotes
+and reduces any pair of single quotes to one data character.
+What the <type>bytea</type> input function sees is just one
+single quote, which it treats as a plain data character.
+However, the <type>bytea</type> input function treats
+backslashes as special, and the other behaviors shown in
+<xref linkend="datatype-binary-sqlesc"/> are implemented by
+that function.
+</para>
+
+<para>
+In some contexts, backslashes must be doubled compared to what is
+shown above, because the generic string-literal parser will also
+reduce pairs of backslashes to one data character;
+see <xref linkend="sql-syntax-strings"/>.
 </para>

 <para>
 <type>Bytea</type> octets are output in <literal>hex</literal>
 format by default. If you change <xref linkend="guc-bytea-output"/>
 to <literal>escape</literal>,
-<quote>non-printable</quote> octet are converted to
+<quote>non-printable</quote> octets are converted to their
 equivalent three-digit octal value and preceded by one backslash.
 Most <quote>printable</quote> octets are output by their standard
 representation in the client character set, e.g.:
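
The two layers described in the revised text (generic string-literal parsing first, then the bytea input function) can be illustrated with a small Python model. This is only a sketch under stated assumptions, not PostgreSQL's actual input code: the function names `sql_literal_value`, `bytea_escape_in` and `bytea_hex_in` are hypothetical, the literal layer assumes `standard_conforming_strings` is on (so only doubled single quotes are collapsed), and non-ASCII characters are simply encoded as UTF-8.

```python
import string


def sql_literal_value(literal: str) -> str:
    """Hypothetical model of the generic string-literal parser with
    standard_conforming_strings on: strip the outer quotes and reduce
    each pair of single quotes to one. Backslashes pass through."""
    if len(literal) < 2 or literal[0] != "'" or literal[-1] != "'":
        raise ValueError("not a quoted string literal")
    return literal[1:-1].replace("''", "'")


def bytea_escape_in(text: str) -> bytes:
    """Hypothetical model of bytea escape-format input: a backslash must be
    followed by another backslash or by three octal digits (000-377)."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != "\\":
            out += ch.encode("utf-8")  # assumption: UTF-8 server encoding
            i += 1
        elif text[i + 1 : i + 2] == "\\":
            out.append(0x5C)  # doubled backslash -> one backslash octet
            i += 2
        else:
            digits = text[i + 1 : i + 4]
            if (len(digits) == 3 and digits[0] in "0123"
                    and all(d in "01234567" for d in digits)):
                out.append(int(digits, 8))
                i += 4
            else:
                raise ValueError("invalid input syntax for type bytea")
    return bytes(out)


def bytea_hex_in(text: str) -> bytes:
    """Hypothetical model of bytea hex-format input: \\x followed by digit
    pairs in either case, with whitespace allowed only between pairs."""
    if not text.startswith("\\x"):
        raise ValueError("hex format must start with \\x")
    body = text[2:]
    out = bytearray()
    i = 0
    while i < len(body):
        if body[i] in " \t\n\r":
            i += 1
            continue
        pair = body[i : i + 2]
        if len(pair) != 2 or any(c not in string.hexdigits for c in pair):
            raise ValueError("invalid hexadecimal data")
        out.append(int(pair, 16))
        i += 2
    return bytes(out)


# The corrected table row: '\\' and '\134' both yield the single octet 0x5c.
print(bytea_escape_in(sql_literal_value(r"'\\'")).hex())
print(bytea_escape_in(sql_literal_value(r"'\134'")).hex())
```

In this model the doubled single quote is handled entirely by `sql_literal_value`, while every backslash rule lives in `bytea_escape_in`, which mirrors the division of labor the new paragraph describes.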
