Commit 821b821

Still more fixes for lossy-GiST-distance-functions patch.
Fix confusion in documentation, substantial memory leakage if float8 or float4 are pass-by-reference, and assorted comments that were obsoleted by commit 98edd61.
1 parent 284bef2 commit 821b821

5 files changed, +55 -41 lines changed

doc/src/sgml/gist.sgml

Lines changed: 22 additions & 26 deletions
@@ -208,12 +208,6 @@
 </tgroup>
 </table>
 
-<para>
-Currently, ordering by the distance operator <literal>&lt;-&gt;</>
-is supported only with <literal>point</> by the operator classes
-of the geometric types.
-</para>
-
 <para>
 For historical reasons, the <literal>inet_ops</> operator class is
 not the default class for types <type>inet</> and <type>cidr</>.
@@ -805,28 +799,30 @@ my_distance(PG_FUNCTION_ARGS)
 </para>
 
 <para>
-Some approximation is allowed when determining the distance, as long as
-the result is never greater than the entry's actual distance. Thus, for
-example, distance to a bounding box is usually sufficient in geometric
-applications. For an internal tree node, the distance returned must not
-be greater than the distance to any of the child nodes. If the returned
-distance is not accurate, the function must set *recheck to false. (This
-is not necessary for internal tree nodes; for them, the calculation is
-always assumed to be inaccurate). The executor will calculate the
-accurate distance after fetching the tuple from the heap, and reorder
-the tuples if necessary.
+Some approximation is allowed when determining the distance, so long
+as the result is never greater than the entry's actual distance. Thus,
+for example, distance to a bounding box is usually sufficient in
+geometric applications. For an internal tree node, the distance
+returned must not be greater than the distance to any of the child
+nodes. If the returned distance is not exact, the function must set
+<literal>*recheck</> to true. (This is not necessary for internal tree
+nodes; for them, the calculation is always assumed to be inexact.) In
+this case the executor will calculate the accurate distance after
+fetching the tuple from the heap, and reorder the tuples if necessary.
 </para>
 
 <para>
-If the distance function returns *recheck=true for a leaf node, the
-original ordering operator's return type must be float8 or float4, and
-the distance function's return value must be comparable with the actual
-distance operator. Otherwise, the distance function's return type can
-be any finit <type>float8</> value, as long as the relative order of
-the returned values matches the order returned by the ordering operator.
-(Infinity and minus infinity are used internally to handle cases such as
-nulls, so it is not recommended that <function>distance</> functions
-return these values.)
+If the distance function returns <literal>*recheck = true</> for any
+leaf node, the original ordering operator's return type must
+be <type>float8</> or <type>float4</>, and the distance function's
+result values must be comparable to those of the original ordering
+operator, since the executor will sort using both distance function
+results and recalculated ordering-operator results. Otherwise, the
+distance function's result values can be any finite <type>float8</>
+values, so long as the relative order of the result values matches the
+order returned by the ordering operator. (Infinity and minus infinity
+are used internally to handle cases such as nulls, so it is not
+recommended that <function>distance</> functions return these values.)
 </para>
 
 </listitem>
@@ -857,7 +853,7 @@ LANGUAGE C STRICT;
 struct, whose 'key' field contains the same datum in the original,
 uncompressed form. If the opclass' compress function does nothing for
 leaf entries, the fetch method can return the argument as is.
-</para>
+</para>
 
 <para>
 The matching code in the C module could then follow this skeleton:
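
Note on the distance-function contract documented in this file's diff: the following is a minimal, self-contained sketch, not PostgreSQL code. The types `ValueSet` and `Interval` and every function name are hypothetical stand-ins for an indexed object and its bounding key; a real GiST distance function receives a GISTENTRY and uses the fmgr calling convention instead. The sketch only illustrates the rule that an approximate result must never exceed the true distance, and that `*recheck` must be set to true whenever the result may be inexact.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical 1-D stand-ins; these are NOT PostgreSQL structs. */
typedef struct { const double *vals; size_t n; } ValueSet;  /* the real object */
typedef struct { double lo, hi; } Interval;                 /* its bounding key */

static double absd(double v) { return v < 0.0 ? -v : v; }

/* Exact distance: gap between q and the closest member of the set. */
static double exact_distance(ValueSet s, double q)
{
    assert(s.n > 0);
    double best = absd(s.vals[0] - q);
    for (size_t i = 1; i < s.n; i++)
    {
        double d = absd(s.vals[i] - q);
        if (d < best)
            best = d;
    }
    return best;
}

/*
 * Lossy distance: computed from the bounding interval only. Every member
 * of the set lies inside [lo, hi], so this result can never be greater
 * than the exact distance. Because it may be smaller, the caller is told
 * to recheck.
 */
static double lossy_distance(Interval key, double q, bool *recheck)
{
    assert(recheck != NULL);
    *recheck = true;            /* result is only a lower bound */
    if (q < key.lo)
        return key.lo - q;
    if (q > key.hi)
        return q - key.hi;
    return 0.0;                 /* q falls inside the bounding interval */
}
```

For the set {1, 9} with bounding interval [1, 9] and query 5, the lossy function reports 0 while the exact distance is 4; that gap is the case the recheck flag exists for.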

src/backend/access/gist/gistget.c

Lines changed: 19 additions & 10 deletions
@@ -191,20 +191,18 @@ gistindex_keytest(IndexScanDesc scan,
 /*
 * Call the Distance function to evaluate the distance. The
 * arguments are the index datum (as a GISTENTRY*), the comparison
-* datum, and the ordering operator's strategy number and subtype
-* from pg_amop.
+* datum, the ordering operator's strategy number and subtype from
+* pg_amop, and the recheck flag.
 *
 * (Presently there's no need to pass the subtype since it'll
 * always be zero, but might as well pass it for possible future
 * use.)
 *
-* Distance functions get a recheck argument as well. In this
-* case the returned distance is the lower bound of distance and
-* needs to be rechecked. We return single recheck flag which
-* means that both quals and distances are to be rechecked. We
-* initialize the flag to 'false'. The flag was added in version
-* 9.5 and the distance operators written before that won't know
-* about the flag, and are never lossy.
+* If the function sets the recheck flag, the returned distance is
+* a lower bound on the true distance and needs to be rechecked.
+* We initialize the flag to 'false'. This flag was added in
+* version 9.5; distance functions written before that won't know
+* about the flag, but are expected to never be lossy.
 */
 recheck = false;
 dist = FunctionCall5Coll(&key->sk_func,
@@ -475,11 +473,22 @@ getNextNearest(IndexScanDesc scan)
 {
 if (so->orderByTypes[i] == FLOAT8OID)
 {
+#ifndef USE_FLOAT8_BYVAL
+/* must free any old value to avoid memory leakage */
+if (!scan->xs_orderbynulls[i])
+pfree(DatumGetPointer(scan->xs_orderbyvals[i]));
+#endif
 scan->xs_orderbyvals[i] = Float8GetDatum(item->distances[i]);
 scan->xs_orderbynulls[i] = false;
 }
 else if (so->orderByTypes[i] == FLOAT4OID)
 {
+/* convert distance function's result to ORDER BY type */
+#ifndef USE_FLOAT4_BYVAL
+/* must free any old value to avoid memory leakage */
+if (!scan->xs_orderbynulls[i])
+pfree(DatumGetPointer(scan->xs_orderbyvals[i]));
+#endif
 scan->xs_orderbyvals[i] = Float4GetDatum((float4) item->distances[i]);
 scan->xs_orderbynulls[i] = false;
 }
@@ -491,7 +500,7 @@ getNextNearest(IndexScanDesc scan)
 * calculated by the distance function to that. The
 * executor won't actually need the order by values we
 * return here, if there are no lossy results, so only
-* insist on the datatype if the *recheck is set.
+* insist on converting if the *recheck flag is set.
 */
 if (scan->xs_recheckorderby)
 elog(ERROR, "GiST operator family's FOR ORDER BY operator must return float8 or float4 if the distance function is lossy");
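
Note on the leak fix in getNextNearest above: a minimal, self-contained sketch of the same pattern in plain C, not PostgreSQL code. `OrderBySlots`, `slots_init`, `slots_set`, `slots_free`, and the `live_allocs` counter are all hypothetical names introduced for illustration, and `malloc`/`free` stand in for `palloc`/`pfree`. The sketch assumes every value is pass-by-reference; in the commit the equivalent guard is compiled in only when `USE_FLOAT8_BYVAL` or `USE_FLOAT4_BYVAL` is not defined. Starting with all null flags set to true mirrors the gistscan.c change in this commit, so the first store has nothing to free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for an array of pass-by-reference values. */
typedef struct
{
    double **vals;   /* each non-null entry owns one heap-allocated double */
    bool    *nulls;  /* true means the slot holds no value */
    int      n;
} OrderBySlots;

static int live_allocs = 0;  /* illustration only: counts unfreed values */

/* Allocate the slots with every entry marked null. */
static bool slots_init(OrderBySlots *s, int n)
{
    s->vals = calloc((size_t) n, sizeof *s->vals);
    s->nulls = malloc((size_t) n * sizeof *s->nulls);
    s->n = 0;
    if (s->vals == NULL || s->nulls == NULL)
    {
        free(s->vals);
        free(s->nulls);
        s->vals = NULL;
        s->nulls = NULL;
        return false;
    }
    for (int i = 0; i < n; i++)
        s->nulls[i] = true;
    s->n = n;
    return true;
}

/* Store a new value, releasing the previous one first if there was one. */
static bool slots_set(OrderBySlots *s, int i, double v)
{
    assert(i >= 0 && i < s->n);
    double *p = malloc(sizeof *p);
    if (p == NULL)
        return false;
    *p = v;
    live_allocs++;
    if (!s->nulls[i])
    {
        /* without this free, every overwrite would leak one value */
        free(s->vals[i]);
        live_allocs--;
    }
    s->vals[i] = p;
    s->nulls[i] = false;
    return true;
}

/* Release all remaining values and the arrays themselves. */
static void slots_free(OrderBySlots *s)
{
    for (int i = 0; i < s->n; i++)
    {
        if (!s->nulls[i])
        {
            free(s->vals[i]);
            live_allocs--;
        }
    }
    free(s->vals);
    free(s->nulls);
    s->vals = NULL;
    s->nulls = NULL;
    s->n = 0;
}
```

Overwriting the same slot repeatedly keeps at most one live value per slot, which is the property the added `pfree` calls are meant to restore.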

src/backend/access/gist/gistscan.c

Lines changed: 4 additions & 2 deletions
@@ -88,8 +88,9 @@ gistbeginscan(PG_FUNCTION_ARGS)
 so->qual_ok = true; /* in case there are zero keys */
 if (scan->numberOfOrderBys > 0)
 {
-scan->xs_orderbyvals = palloc(sizeof(Datum) * scan->numberOfOrderBys);
+scan->xs_orderbyvals = palloc0(sizeof(Datum) * scan->numberOfOrderBys);
 scan->xs_orderbynulls = palloc(sizeof(bool) * scan->numberOfOrderBys);
+memset(scan->xs_orderbynulls, true, sizeof(bool) * scan->numberOfOrderBys);
 }
 
 scan->opaque = so;
@@ -284,6 +285,8 @@ gistrescan(PG_FUNCTION_ARGS)
 GIST_DISTANCE_PROC, skey->sk_attno,
 RelationGetRelationName(scan->indexRelation));
 
+fmgr_info_copy(&(skey->sk_func), finfo, so->giststate->scanCxt);
+
 /*
 * Look up the datatype returned by the original ordering operator.
 * GiST always uses a float8 for the distance function, but the
* GiST always uses a float8 for the distance function, but the
@@ -297,7 +300,6 @@ gistrescan(PG_FUNCTION_ARGS)
297300
* first time.
298301
*/
299302
so->orderByTypes[i] = get_func_rettype(skey->sk_func.fn_oid);
300-
fmgr_info_copy(&(skey->sk_func), finfo, so->giststate->scanCxt);
301303

302304
/* Restore prior fn_extra pointers, if not first time */
303305
if (!first_time)

src/backend/executor/nodeIndexscan.c

Lines changed: 6 additions & 0 deletions
@@ -459,10 +459,16 @@ reorderqueue_pop(IndexScanState *node)
 {
 HeapTuple result;
 ReorderTuple *topmost;
+int i;
 
 topmost = (ReorderTuple *) pairingheap_remove_first(node->iss_ReorderQueue);
 
 result = topmost->htup;
+for (i = 0; i < node->iss_NumOrderByKeys; i++)
+{
+if (!node->iss_OrderByTypByVals[i] && !topmost->orderbynulls[i])
+pfree(DatumGetPointer(topmost->orderbyvals[i]));
+}
 pfree(topmost->orderbyvals);
 pfree(topmost->orderbynulls);
 pfree(topmost);

src/include/access/relscan.h

Lines changed: 4 additions & 3 deletions
@@ -95,12 +95,13 @@ typedef struct IndexScanDescData
 /*
 * When fetching with an ordering operator, the values of the ORDER BY
 * expressions of the last returned tuple, according to the index. If
-* xs_recheck is true, these need to be rechecked just like the scan keys,
-* and the values returned here are a lower-bound on the actual values.
+* xs_recheckorderby is true, these need to be rechecked just like the
+* scan keys, and the values returned here are a lower-bound on the actual
+* values.
 */
 Datum *xs_orderbyvals;
 bool *xs_orderbynulls;
-bool xs_recheckorderby; /* T means ORDER BY exprs must be rechecked */
+bool xs_recheckorderby;
 
 /* state data for traversing HOT chains in index_getnext */
 bool xs_continue_hot; /* T if must keep walking HOT chain */
