
Commit 49bca9e

Fix inadequately-sized output buffer in contrib/unaccent.
The output buffer size in unaccent_lexize() was calculated as input string length times pg_database_encoding_max_length(), which effectively assumes that replacement strings aren't more than one character. While that was all that we previously documented it to support, the code actually has always allowed replacement strings of arbitrary length; so if you tried to make use of longer strings, you were at risk of buffer overrun. To fix, use an expansible StringInfo buffer instead of trying to determine the maximum space needed a-priori.

This would be a security issue if unaccent rules files could be installed by unprivileged users; but fortunately they can't, so in the back branches the problem can be labeled as improper configuration by a superuser. Nonetheless, a memory stomp isn't a nice way of reacting to improper configuration, so let's back-patch the fix.
1 parent 4dc3df9 commit 49bca9e
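
To make the overrun described in the commit message concrete, here is a minimal arithmetic sketch. The function names (`old_buffer_size`, `bytes_needed`, `would_overrun`) are hypothetical and are not part of unaccent.c; they only restate the old sizing formula, `len * pg_database_encoding_max_length() + 1`, and compare it with the space needed when every input character is replaced by a longer string.

```c
#include <assert.h>
#include <stddef.h>

/* Size of the fixed buffer the old code allocated: len * max_len + 1 (for \0). */
size_t
old_buffer_size(size_t len, size_t encoding_max_len)
{
    return len * encoding_max_len + 1;
}

/* Bytes needed if each of nchars characters becomes a replacelen-byte string. */
size_t
bytes_needed(size_t nchars, size_t replacelen)
{
    return nchars * replacelen + 1;
}

/*
 * Returns 1 if the worst case (every character replaced) would not fit in
 * the old fixed-size buffer, 0 otherwise.  len is the input length in bytes,
 * nchars the number of characters in it.
 */
int
would_overrun(size_t len, size_t nchars, size_t encoding_max_len,
              size_t replacelen)
{
    assert(nchars <= len);      /* every character occupies at least one byte */
    return bytes_needed(nchars, replacelen) >
        old_buffer_size(len, encoding_max_len);
}
```

For illustration, assume a maximum encoding length of 4 bytes and a rules file that maps a one-byte character to a five-byte string. Three such input characters give an old buffer of 3 * 4 + 1 = 13 bytes, while the output needs 3 * 5 + 1 = 16 bytes, so `would_overrun(3, 3, 4, 5)` returns 1.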

1 file changed, +27 -24 lines changed

contrib/unaccent/unaccent.c

@@ -15,6 +15,7 @@
 
 #include "catalog/namespace.h"
 #include "commands/defrem.h"
+#include "lib/stringinfo.h"
 #include "tsearch/ts_cache.h"
 #include "tsearch/ts_locale.h"
 #include "tsearch/ts_public.h"
@@ -263,46 +264,48 @@ unaccent_lexize(PG_FUNCTION_ARGS)
     TrieChar *rootTrie = (TrieChar *) PG_GETARG_POINTER(0);
     char *srcchar = (char *) PG_GETARG_POINTER(1);
     int32 len = PG_GETARG_INT32(2);
-    char *srcstart,
-         *trgchar = NULL;
-    int charlen;
-    TSLexeme *res = NULL;
-    TrieChar *node;
+    char *srcstart = srcchar;
+    TSLexeme *res;
+    StringInfoData buf;
+
+    /* we allocate storage for the buffer only if needed */
+    buf.data = NULL;
 
-    srcstart = srcchar;
     while (srcchar - srcstart < len)
     {
+        TrieChar *node;
+        int charlen;
+
         charlen = pg_mblen(srcchar);
 
         node = findReplaceTo(rootTrie, (unsigned char *) srcchar, charlen);
         if (node && node->replaceTo)
         {
-            if (!res)
+            if (buf.data == NULL)
             {
-                /* allocate res only if it's needed */
-                res = palloc0(sizeof(TSLexeme) * 2);
-                res->lexeme = trgchar = palloc(len * pg_database_encoding_max_length() + 1 /* \0 */ );
-                res->flags = TSL_FILTER;
+                /* initialize buffer */
+                initStringInfo(&buf);
+                /* insert any data we already skipped over */
                 if (srcchar != srcstart)
-                {
-                    memcpy(trgchar, srcstart, srcchar - srcstart);
-                    trgchar += (srcchar - srcstart);
-                }
+                    appendBinaryStringInfo(&buf, srcstart, srcchar - srcstart);
             }
-            memcpy(trgchar, node->replaceTo, node->replacelen);
-            trgchar += node->replacelen;
-        }
-        else if (res)
-        {
-            memcpy(trgchar, srcchar, charlen);
-            trgchar += charlen;
+            appendBinaryStringInfo(&buf, node->replaceTo, node->replacelen);
         }
+        else if (buf.data != NULL)
+            appendBinaryStringInfo(&buf, srcchar, charlen);
 
         srcchar += charlen;
     }
 
-    if (res)
-        *trgchar = '\0';
+    /* return a result only if we made at least one substitution */
+    if (buf.data != NULL)
+    {
+        res = (TSLexeme *) palloc0(sizeof(TSLexeme) * 2);
+        res->lexeme = buf.data;
+        res->flags = TSL_FILTER;
+    }
+    else
+        res = NULL;
 
     PG_RETURN_POINTER(res);
 }
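
The pattern the patch adopts (allocate the output lazily, copy any already-skipped prefix on the first substitution, and let the buffer grow as needed) can be sketched outside PostgreSQL in plain C. Everything below is an assumption for illustration: `GrowBuf`, `growbuf_append`, and `replace_bytes` are hypothetical names, `malloc`/`realloc` stand in for palloc-based StringInfo, and the input is treated as single-byte characters rather than going through `pg_mblen()` and the trie lookup.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for StringInfoData: a buffer that grows on demand. */
typedef struct GrowBuf
{
    char   *data;               /* NULL until the first append */
    size_t  len;                /* bytes used, excluding the trailing NUL */
    size_t  cap;                /* bytes allocated */
} GrowBuf;

/* Append n bytes, enlarging the allocation if needed; keeps data NUL-terminated. */
void
growbuf_append(GrowBuf *buf, const char *bytes, size_t n)
{
    size_t  need = buf->len + n + 1;

    if (need > buf->cap)
    {
        size_t  newcap = buf->cap ? buf->cap : 16;
        char   *p;

        while (newcap < need)
            newcap *= 2;
        p = realloc(buf->data, newcap);
        if (p == NULL)
            abort();            /* sketch only: no graceful out-of-memory path */
        buf->data = p;
        buf->cap = newcap;
    }
    memcpy(buf->data + buf->len, bytes, n);
    buf->len += n;
    buf->data[buf->len] = '\0';
}

/*
 * Replace every byte equal to 'from' with the string 'to'.
 * Returns NULL if nothing was replaced, else a malloc'd NUL-terminated
 * string that the caller must free.
 */
char *
replace_bytes(const char *src, size_t len, char from, const char *to)
{
    GrowBuf buf = {NULL, 0, 0};
    size_t  tolen;
    size_t  i;

    assert(src != NULL && to != NULL);
    tolen = strlen(to);

    for (i = 0; i < len; i++)
    {
        if (src[i] == from)
        {
            /* first substitution: copy any data we already skipped over */
            if (buf.data == NULL && i > 0)
                growbuf_append(&buf, src, i);
            growbuf_append(&buf, to, tolen);
        }
        else if (buf.data != NULL)
            growbuf_append(&buf, &src[i], 1);
    }
    return buf.data;
}
```

Because the buffer grows with the output, the length of the replacement string no longer needs to be bounded in advance. As in the patched function, a NULL result signals that no substitution took place, so callers can keep using the original input unchanged.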
