Commit a93b3b9

Fix a bug in the tsvector stats collection function that caused a crash if
the sample contained just one tsvector containing only one lexeme.
1 parent fb645f6 commit a93b3b9
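
Judging from the diff alone, the fix moves the reads of `sort_table[num_mcelem - 1]` and `sort_table[0]` inside the existing `if (num_mcelem > 0)` block, so they no longer run when nothing will be stored. The following is a minimal sketch of that guard pattern, not the PostgreSQL code itself: the `Item` type and the `freq_range` helper are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a tracked entry; not the real TrackItem. */
typedef struct
{
    int frequency;
} Item;

/*
 * Read the smallest and largest frequency from a table that is assumed
 * to be sorted by descending frequency.
 *
 * Returns 1 and fills the outputs when the table has entries.
 * Returns 0 and leaves the outputs untouched when n <= 0, so that
 * table[n - 1] is never evaluated with a negative index.
 */
static int
freq_range(Item **table, int n, int *minfreq, int *maxfreq)
{
    assert(minfreq != NULL && maxfreq != NULL);

    if (n <= 0)
        return 0;

    *minfreq = table[n - 1]->frequency;
    *maxfreq = table[0]->frequency;
    return 1;
}
```

With a two-entry table holding frequencies 5 and 2, `freq_range` reports 2 as the minimum and 5 as the maximum. With `n` equal to 0 it returns 0 without touching the table.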

1 file changed: +22 -21 lines changed
src/backend/tsearch/ts_typanalyze.c

@@ -7,7 +7,7 @@
  *
  *
  * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/tsearch/ts_typanalyze.c,v 1.2 2008/09/19 19:03:40 tgl Exp $
+ *	  $PostgreSQL: pgsql/src/backend/tsearch/ts_typanalyze.c,v 1.3 2008/11/27 21:17:39 heikki Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -290,33 +290,34 @@ compute_tsvector_stats(VacAttrStats *stats,
 		if (num_mcelem > track_len)
 			num_mcelem = track_len;
 
-		/* Grab the minimal and maximal frequencies that will get stored */
-		minfreq = sort_table[num_mcelem - 1]->frequency;
-		maxfreq = sort_table[0]->frequency;
-
-		/*
-		 * We want to store statistics sorted on the lexeme value using first
-		 * length, then byte-for-byte comparison. The reason for doing length
-		 * comparison first is that we don't care about the ordering so long
-		 * as it's consistent, and comparing lengths first gives us a chance
-		 * to avoid a strncmp() call.
-		 *
-		 * This is different from what we do with scalar statistics -- they get
-		 * sorted on frequencies. The rationale is that we usually search
-		 * through most common elements looking for a specific value, so we can
-		 * grab its frequency. When values are presorted we can employ binary
-		 * search for that. See ts_selfuncs.c for a real usage scenario.
-		 */
-		qsort(sort_table, num_mcelem, sizeof(TrackItem *),
-			  trackitem_compare_lexemes);
-
 		/* Generate MCELEM slot entry */
 		if (num_mcelem > 0)
 		{
 			MemoryContext old_context;
 			Datum *mcelem_values;
 			float4 *mcelem_freqs;
 
+			/* Grab the minimal and maximal frequencies that will get stored */
+			minfreq = sort_table[num_mcelem - 1]->frequency;
+			maxfreq = sort_table[0]->frequency;
+
+			/*
+			 * We want to store statistics sorted on the lexeme value using
+			 * first length, then byte-for-byte comparison. The reason for
+			 * doing length comparison first is that we don't care about the
+			 * ordering so long as it's consistent, and comparing lengths first
+			 * gives us a chance to avoid a strncmp() call.
+			 *
+			 * This is different from what we do with scalar statistics -- they
+			 * get sorted on frequencies. The rationale is that we usually
+			 * search through most common elements looking for a specific
+			 * value, so we can grab its frequency. When values are presorted
+			 * we can employ binary search for that. See ts_selfuncs.c for a
+			 * real usage scenario.
+			 */
+			qsort(sort_table, num_mcelem, sizeof(TrackItem *),
+				  trackitem_compare_lexemes);
+
 			/* Must copy the target values into anl_context */
 			old_context = MemoryContextSwitchTo(stats->anl_context);
 
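
The comment moved by this hunk describes ordering lexemes by length first and then byte-for-byte, so that a later lookup can use binary search. The sketch below illustrates that idea under simplifying assumptions and is not the code from ts_typanalyze.c or ts_selfuncs.c: `LexemeFreq`, `lexeme_cmp`, `sort_by_lexeme`, and `lookup_frequency` are hypothetical names, the array holds structs rather than `TrackItem *` pointers, and lexemes are assumed to be ordinary NUL-terminated strings.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical simplified entry; the real TrackItem layout differs. */
typedef struct
{
    const char *lexeme;
    size_t      length;     /* number of bytes in lexeme */
    double      frequency;
} LexemeFreq;

/*
 * Compare by length first, then by bytes.  The resulting order is not
 * alphabetical, but it is consistent, and entries of different length
 * are ordered without calling strncmp() at all.
 */
static int
lexeme_cmp(const void *a, const void *b)
{
    const LexemeFreq *x = a;
    const LexemeFreq *y = b;

    if (x->length != y->length)
        return (x->length < y->length) ? -1 : 1;
    return strncmp(x->lexeme, y->lexeme, x->length);
}

/* Sort the entries into lookup order; does nothing for an empty array. */
static void
sort_by_lexeme(LexemeFreq *items, size_t n)
{
    if (n > 0)
        qsort(items, n, sizeof(LexemeFreq), lexeme_cmp);
}

/*
 * Binary-search a sorted array for a lexeme.
 * Returns its frequency, or 0.0 when the lexeme is absent or n is 0.
 */
static double
lookup_frequency(const LexemeFreq *items, size_t n, const char *lexeme)
{
    LexemeFreq        key;
    const LexemeFreq *hit;

    assert(lexeme != NULL);

    if (n == 0)
        return 0.0;

    key.lexeme = lexeme;
    key.length = strlen(lexeme);
    key.frequency = 0.0;

    hit = bsearch(&key, items, n, sizeof(LexemeFreq), lexeme_cmp);
    return hit ? hit->frequency : 0.0;
}
```

Sorting the lexemes "cat", "a", "bird", "ant" with this comparator yields "a", "ant", "cat", "bird": shorter lexemes come first, and ties in length fall back to byte order. The same comparator must be used for both the sort and the search, otherwise `bsearch` can miss entries that are present.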