
Commit b8279a7

Fix behavior of ~> (cube, int) operator
~> (cube, int) operator was especially designed for knn-gist search. However, it appears that knn-gist search can't work correctly with current behavior of this operator when dataset contains cubes of variable dimensionality. In this case, the same value of second operator argument can point to different dimension depending on dimensionality of particular cube. Such behavior is incompatible with gist indexing of cubes, and knn-gist doesn't work correctly for it.

This patch changes behavior of ~> (cube, int) operator by introducing dimension numbering where value of second argument unambiguously identifies number of dimension. With new behavior, this operator can be correctly supported by knn-gist. Relevant changes to cube operator class are also included.

Backpatch to v9.6 where operator was introduced.

Since behavior of ~> (cube, int) operator is changed, depending entities must be refreshed after upgrade. Such as, expression indexes using this operator must be reindexed, materialized views must be rebuilt, stored procedures and client code must be revised to correctly use new behavior. That should be mentioned in release notes.

Noticed by: Tomas Vondra
Author: Alexander Korotkov
Reviewed by: Tomas Vondra, Andrey Borodin
Discussion: https://www.postgresql.org/message-id/flat/a9657f6a-b497-36ff-e56-482a2c7e3292@2ndquadrant.com
1 parent 08adf68 commit b8279a7
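
The ambiguity described in the commit message can be reproduced outside PostgreSQL with a small standalone sketch. It is not code from the patch: `DemoCube`, `coord_old`, and `coord_new` are hypothetical names, the array layout (all first-corner values, then all second-corner values) is assumed from the `x[index + DIM(cube)]` accesses in the diff, and `assert` stands in for `ereport`. Only non-point cubes are modelled.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for NDBOX (not the real struct): x holds dim values
 * for the first corner followed by dim values for the second corner.
 */
typedef struct
{
    int           dim;
    const double *x;
} DemoCube;

static double dmin(double a, double b) { return a < b ? a : b; }
static double dmax(double a, double b) { return a > b ? a : b; }

/* Numbering before the patch: 1..dim are lower bounds, dim+1..2*dim upper. */
double coord_old(DemoCube c, int coord)
{
    assert(coord > 0 && coord <= 2 * c.dim);
    if (coord <= c.dim)
        return dmin(c.x[coord - 1], c.x[coord - 1 + c.dim]);
    return dmax(c.x[coord - 1], c.x[coord - 1 - c.dim]);
}

/* Numbering after the patch: 2N-1 is the lower bound of dimension N, 2N the upper. */
double coord_new(DemoCube c, int coord)
{
    assert(coord > 0);
    if (coord > 2 * c.dim)
        return 0.0;             /* dimension absent from this cube */

    int    index = (coord - 1) / 2;
    int    upper = ((coord - 1) % 2 == 1);
    double a = c.x[index];
    double b = c.x[index + c.dim];

    return upper ? dmax(a, b) : dmin(a, b);
}
```

With the cubes ((1,2),(5,6)) and ((1,2,3),(7,8,9)), `coord_old` maps argument 3 to the upper bound of the first dimension for the 2-D cube but to the lower bound of the third dimension for the 3-D cube. `coord_new` treats 3 as the lower bound of the second dimension in both cases.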

5 files changed: +512 -286 lines changed

contrib/cube/cube.c

Lines changed: 95 additions & 25 deletions
@@ -1336,15 +1336,55 @@ g_cube_distance(PG_FUNCTION_ARGS)
 
     if (strategy == CubeKNNDistanceCoord)
     {
+        /*
+         * Handle ordering by ~> operator. See comments of cube_coord_llur()
+         * for details
+         */
         int coord = PG_GETARG_INT32(1);
+        bool isLeaf = GistPageIsLeaf(entry->page);
 
-        if (DIM(cube) == 0)
-            retval = 0.0;
-        else if (IS_POINT(cube))
-            retval = cube->x[(coord - 1) % DIM(cube)];
+        /* 0 is the only unsupported coordinate value */
+        if (coord <= 0)
+            ereport(ERROR,
+                    (errcode(ERRCODE_ARRAY_ELEMENT_ERROR),
+                     errmsg("cube index %d is out of bounds", coord)));
+
+        if (coord <= 2 * DIM(cube))
+        {
+            /* dimension index */
+            int index = (coord - 1) / 2;
+            /* whether this is upper bound (lower bound otherwise) */
+            bool upper = ((coord - 1) % 2 == 1);
+
+            if (IS_POINT(cube))
+            {
+                retval = cube->x[index];
+            }
+            else
+            {
+                if (isLeaf)
+                {
+                    /* For leaf just return required upper/lower bound */
+                    if (upper)
+                        retval = Max(cube->x[index], cube->x[index + DIM(cube)]);
+                    else
+                        retval = Min(cube->x[index], cube->x[index + DIM(cube)]);
+                }
+                else
+                {
+                    /*
+                     * For non-leaf we should always return lower bound,
+                     * because even upper bound of a child in the subtree can
+                     * be as small as our lower bound.
+                     */
+                    retval = Min(cube->x[index], cube->x[index + DIM(cube)]);
+                }
+            }
+        }
         else
-            retval = Min(cube->x[(coord - 1) % DIM(cube)],
-                         cube->x[(coord - 1) % DIM(cube) + DIM(cube)]);
+        {
+            retval = 0.0;
+        }
     }
     else
     {
@@ -1491,43 +1531,73 @@ cube_coord(PG_FUNCTION_ARGS)
 }
 
 
-/*
- * This function works like cube_coord(),
- * but rearranges coordinates of corners to get cube representation
- * in the form of (lower left, upper right).
- * For historical reasons that extension allows us to create cubes in form
- * ((2,1),(1,2)) and instead of normalizing such cube to ((1,1),(2,2)) it
- * stores cube in original way. But to get cubes ordered by one of dimensions
- * directly from the index without extra sort step we need some
- * representation-independent coordinate getter. This function implements it.
+/*----
+ * This function works like cube_coord(), but rearranges coordinates in the
+ * way suitable to support coordinate ordering using KNN-GiST. For historical
+ * reasons this extension allows us to create cubes in form ((2,1),(1,2)) and
+ * instead of normalizing such cube to ((1,1),(2,2)) it stores cube in original
+ * way. But in order to get cubes ordered by one of dimensions from the index
+ * without explicit sort step we need this representation-independent coordinate
+ * getter. Moreover, indexed dataset may contain cubes of different dimensions
+ * number. Accordingly, this coordinate getter should be able to return
+ * lower/upper bound for particular dimension independently on number of cube
+ * dimensions.
+ *
+ * Long story short, this function uses following meaning of coordinates:
+ * # (2 * N - 1) -- lower bound of Nth dimension,
+ * # (2 * N) -- upper bound of Nth dimension.
+ *
+ * When given coordinate exceeds number of cube dimensions, then 0 returned
+ * (reproducing logic of GiST indexing of variable-length cubes).
  */
 Datum
 cube_coord_llur(PG_FUNCTION_ARGS)
 {
     NDBOX *cube = PG_GETARG_NDBOX(0);
     int coord = PG_GETARG_INT32(1);
+    bool inverse = false;
+    float8 result;
 
-    if (coord <= 0 || coord > 2 * DIM(cube))
+    /* 0 is the only unsupported coordinate value */
+    if (coord <= 0)
         ereport(ERROR,
                 (errcode(ERRCODE_ARRAY_ELEMENT_ERROR),
                  errmsg("cube index %d is out of bounds", coord)));
 
-    if (coord <= DIM(cube))
+    if (coord <= 2 * DIM(cube))
     {
+        /* dimension index */
+        int index = (coord - 1) / 2;
+        /* whether this is upper bound (lower bound otherwise) */
+        bool upper = ((coord - 1) % 2 == 1);
+
         if (IS_POINT(cube))
-            PG_RETURN_FLOAT8(cube->x[coord - 1]);
+        {
+            result = cube->x[index];
+        }
         else
-            PG_RETURN_FLOAT8(Min(cube->x[coord - 1],
-                                 cube->x[coord - 1 + DIM(cube)]));
+        {
+            if (upper)
+                result = Max(cube->x[index], cube->x[index + DIM(cube)]);
+            else
+                result = Min(cube->x[index], cube->x[index + DIM(cube)]);
+        }
     }
     else
     {
-        if (IS_POINT(cube))
-            PG_RETURN_FLOAT8(cube->x[(coord - 1) % DIM(cube)]);
-        else
-            PG_RETURN_FLOAT8(Max(cube->x[coord - 1],
-                                 cube->x[coord - 1 - DIM(cube)]));
+        /*
+         * Return zero if coordinate is out of bound. That reproduces logic of
+         * how cubes with low dimension number are expanded during GiST
+         * indexing.
+         */
+        result = 0.0;
     }
+
+    /* Inverse value if needed */
+    if (inverse)
+        result = -result;
+
+    PG_RETURN_FLOAT8(result);
 }
 
 /* Increase or decrease box size by a radius in at least n dimensions. */
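
The non-leaf branch of g_cube_distance() above always returns the lower bound, even when the upper bound was requested. A one-dimensional sketch may help show why. It assumes that an internal GiST key covers the union of the keys beneath it and that the value reported for an internal entry must not exceed the value of any entry in its subtree. `Interval`, `bounding`, and `order_key` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: the extent of a cube along a single dimension, lo <= hi. */
typedef struct
{
    double lo;
    double hi;
} Interval;

/* Smallest interval covering all children, as an internal key would. */
Interval bounding(const Interval *children, size_t n)
{
    assert(n > 0);
    Interval u = children[0];

    for (size_t i = 1; i < n; i++)
    {
        if (children[i].lo < u.lo)
            u.lo = children[i].lo;
        if (children[i].hi > u.hi)
            u.hi = children[i].hi;
    }
    return u;
}

/*
 * Ordering value for a key: the requested bound on a leaf, but always the
 * lower bound on an internal entry, mirroring the isLeaf branch in the diff.
 */
double order_key(Interval key, int upper, int is_leaf)
{
    if (is_leaf && upper)
        return key.hi;
    return key.lo;
}
```

For children [1, 2] and [4, 9] the covering interval is [1, 9]. Ordering by upper bound, the leaf values are 2 and 9, so reporting 9 for the internal entry would overestimate the first child. Reporting the lower bound 1 stays at or below every leaf value.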
