Commit 73e3566

Improve comments about btree's use of ScanKey data structures: there are two basically different kinds of scankeys, and we ought to try harder to indicate which is used in each place in the code. I've chosen the names "search scankey" and "insertion scankey", though you could make about as good an argument for "operator scankey" and "comparison function scankey".
1 parent e38217d commit 73e3566

5 files changed: +78 -47 lines changed


src/backend/access/nbtree/README

+21 -10
@@ -1,4 +1,4 @@
-$PostgreSQL: pgsql/src/backend/access/nbtree/README,v 1.8 2003/11/29 19:51:40 pgsql Exp $
+$PostgreSQL: pgsql/src/backend/access/nbtree/README,v 1.9 2006/01/17 00:09:00 tgl Exp $
 
 This directory contains a correct implementation of Lehman and Yao's
 high-concurrency B-tree management algorithm (P. Lehman and S. Yao,
@@ -325,15 +325,26 @@ work sometimes, but could cause failures later on depending on
 what else gets put on their page.
 
 "ScanKey" data structures are used in two fundamentally different ways
-in this code. Searches for the initial position for a scan, as well as
-insertions, use scankeys in which the comparison function is a 3-way
-comparator (<0, =0, >0 result). These scankeys are built within the
-btree code (eg, by _bt_mkscankey()) and used by _bt_compare(). Once we
-are positioned, sequential examination of tuples in a scan is done by
-_bt_checkkeys() using scankeys in which the comparison functions return
-booleans --- for example, int4lt might be used. These scankeys are the
-ones originally passed in from outside the btree code. Same
-representation, but different comparison functions!
+in this code, which we describe as "search" scankeys and "insertion"
+scankeys. A search scankey is the kind passed to btbeginscan() or
+btrescan() from outside the btree code. The sk_func pointers in a search
+scankey point to comparison functions that return boolean, such as int4lt.
+There might be more than one scankey entry for a given index column, or
+none at all. (We require the keys to appear in index column order, but
+the order of multiple keys for a given column is unspecified.) An
+insertion scankey uses the same array-of-ScanKey data structure, but the
+sk_func pointers point to btree comparison support functions (ie, 3-way
+comparators that return int4 values interpreted as <0, =0, >0). In an
+insertion scankey there is exactly one entry per index column. Insertion
+scankeys are built within the btree code (eg, by _bt_mkscankey()) and are
+used to locate the starting point of a scan, as well as for locating the
+place to insert a new index tuple. (Note: in the case of an insertion
+scankey built from a search scankey, there might be fewer keys than
+index columns, indicating that we have no constraints for the remaining
+index columns.) After we have located the starting point of a scan, the
+original search scankey is consulted as each index entry is sequentially
+scanned to decide whether to return the entry and whether the scan can
+stop (see _bt_checkkeys()).
 
 Notes about data representation
 -------------------------------
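
Reviewer sketch (not part of the commit): the README text above contrasts boolean-returning "search" scankeys with 3-way "insertion" scankeys. The standalone C below is only an illustration of that contrast under simplifying assumptions: integer columns, and hypothetical types and function names (SearchEntry, InsertionEntry, sketch_int_lt and so on) that are not the real PostgreSQL ScanKey definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the PostgreSQL structs. */
typedef bool (*BoolOperator)(int value, int argument);       /* int4lt-like */
typedef int (*ThreeWayComparator)(int argument, int value);  /* <0, 0, >0 */

typedef struct SearchEntry
{
    int          column;    /* 0-based index column the qual applies to */
    BoolOperator op;        /* evaluates "column value OP argument" */
    int          argument;
} SearchEntry;

typedef struct InsertionEntry
{
    ThreeWayComparator cmp; /* entry i always describes column i */
    int                argument;
} InsertionEntry;

bool sketch_int_lt(int a, int b) { return a < b; }
bool sketch_int_ge(int a, int b) { return a >= b; }
int  sketch_int_cmp(int a, int b) { return (a > b) - (a < b); }

/* Search-style use: every qual must hold; a column may have 0..n quals. */
bool
search_key_matches(const SearchEntry *keys, size_t nkeys, const int *tuple)
{
    for (size_t i = 0; i < nkeys; i++)
    {
        if (!keys[i].op(tuple[keys[i].column], keys[i].argument))
            return false;
    }
    return true;
}

/* Insertion-style use: column-by-column 3-way result, key relative to tuple. */
int
insertion_key_compare(const InsertionEntry *keys, size_t nkeys, const int *tuple)
{
    for (size_t i = 0; i < nkeys; i++)
    {
        int result = keys[i].cmp(keys[i].argument, tuple[i]);

        if (result != 0)
            return result;
    }
    return 0;
}
```

In this sketch a search key can carry two quals on the same column (for example a lower and an upper bound), while an insertion key has one entry per leading column and can therefore order itself against any tuple.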

src/backend/access/nbtree/nbtinsert.c

+4 -3
@@ -8,7 +8,7 @@
 *
 *
 * IDENTIFICATION
-* $PostgreSQL: pgsql/src/backend/access/nbtree/nbtinsert.c,v 1.130 2006/01/11 08:43:11 neilc Exp $
+* $PostgreSQL: pgsql/src/backend/access/nbtree/nbtinsert.c,v 1.131 2006/01/17 00:09:00 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -80,7 +80,7 @@ _bt_doinsert(Relation rel, BTItem btitem,
 BTStack stack;
 Buffer buf;
 
-/* we need a scan key to do our search, so build one */
+/* we need an insertion scan key to do our search, so build one */
 itup_scankey = _bt_mkscankey(rel, itup);
 
 top:
@@ -331,7 +331,8 @@ _bt_check_unique(Relation rel, BTItem btitem, Relation heapRel,
 * If 'afteritem' is >0 then the new tuple must be inserted after the
 * existing item of that number, noplace else. If 'afteritem' is 0
 * then the procedure finds the exact spot to insert it by searching.
-* (keysz and scankey parameters are used ONLY if afteritem == 0.)
+* (keysz and scankey parameters are used ONLY if afteritem == 0.
+* The scankey must be an insertion-type scankey.)
 *
 * NOTE: if the new key is equal to one or more existing keys, we can
 * legitimately place it anywhere in the series of equal keys --- in fact,

src/backend/access/nbtree/nbtpage.c

+2 -2
@@ -9,7 +9,7 @@
 *
 *
 * IDENTIFICATION
-* $PostgreSQL: pgsql/src/backend/access/nbtree/nbtpage.c,v 1.90 2005/11/22 18:17:06 momjian Exp $
+* $PostgreSQL: pgsql/src/backend/access/nbtree/nbtpage.c,v 1.91 2006/01/17 00:09:01 tgl Exp $
 *
 * NOTES
 * Postgres btree pages look like ordinary relation pages. The opaque
@@ -813,7 +813,7 @@ _bt_pagedel(Relation rel, Buffer buf, bool vacuum_full)
 * better drop the target page lock first.
 */
 _bt_relbuf(rel, buf);
-/* we need a scan key to do our search, so build one */
+/* we need an insertion scan key to do our search, so build one */
 itup_scankey = _bt_mkscankey(rel, &(targetkey->bti_itup));
 /* find the leftmost leaf page containing this key */
 stack = _bt_search(rel, rel->rd_rel->relnatts, itup_scankey, false,

src/backend/access/nbtree/nbtsearch.c

+42 -24
@@ -8,7 +8,7 @@
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * IDENTIFICATION
-* $PostgreSQL: pgsql/src/backend/access/nbtree/nbtsearch.c,v 1.99 2005/12/07 19:37:53 tgl Exp $
+* $PostgreSQL: pgsql/src/backend/access/nbtree/nbtsearch.c,v 1.100 2006/01/17 00:09:01 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -29,6 +29,9 @@ static bool _bt_endpoint(IndexScanDesc scan, ScanDirection dir);
 * _bt_search() -- Search the tree for a particular scankey,
 * or more precisely for the first leaf page it could be on.
 *
+* The passed scankey must be an insertion-type scankey (see nbtree/README),
+* but it can omit the rightmost column(s) of the index.
+*
 * When nextkey is false (the usual case), we are looking for the first
 * item >= scankey. When nextkey is true, we are looking for the first
 * item strictly greater than scankey.
@@ -127,15 +130,18 @@ _bt_search(Relation rel, int keysz, ScanKey scankey, bool nextkey,
 * data that appeared on the page originally is either on the page
 * or strictly to the right of it.
 *
-* When nextkey is false (the usual case), we are looking for the first
-* item >= scankey. When nextkey is true, we are looking for the first
-* item strictly greater than scankey.
-*
 * This routine decides whether or not we need to move right in the
 * tree by examining the high key entry on the page. If that entry
 * is strictly less than the scankey, or <= the scankey in the nextkey=true
 * case, then we followed the wrong link and we need to move right.
 *
+* The passed scankey must be an insertion-type scankey (see nbtree/README),
+* but it can omit the rightmost column(s) of the index.
+*
+* When nextkey is false (the usual case), we are looking for the first
+* item >= scankey. When nextkey is true, we are looking for the first
+* item strictly greater than scankey.
+*
 * On entry, we have the buffer pinned and a lock of the type specified by
 * 'access'. If we move right, we release the buffer and lock and acquire
 * the same on the right sibling. Return value is the buffer we stop at.
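
Reviewer sketch (not part of the commit): the move-right rule stated in the comment above can be written as a single predicate on the 3-way result of comparing the insertion scankey with the page's high key. The function name and parameters are hypothetical, and the `is_rightmost` guard is an added assumption (a rightmost page has no high key to compare against); the real routine also deals with buffers and locks, which are omitted here.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * cmp_key_vs_highkey is the 3-way result (<0, 0, >0) of comparing the
 * insertion scankey to the high key, scankey on the left-hand side.
 *
 *   nextkey = false: move right if high key <  scankey, i.e. result >  0
 *   nextkey = true:  move right if high key <= scankey, i.e. result >= 0
 */
bool
should_move_right(int cmp_key_vs_highkey, bool nextkey, bool is_rightmost)
{
    if (is_rightmost)
        return false;   /* assumption: nothing to the right to move to */

    return nextkey ? (cmp_key_vs_highkey >= 0) : (cmp_key_vs_highkey > 0);
}
```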
@@ -194,14 +200,13 @@ _bt_moveright(Relation rel,
 /*
 * _bt_binsrch() -- Do a binary search for a key on a particular page.
 *
+* The passed scankey must be an insertion-type scankey (see nbtree/README),
+* but it can omit the rightmost column(s) of the index.
+*
 * When nextkey is false (the usual case), we are looking for the first
 * item >= scankey. When nextkey is true, we are looking for the first
 * item strictly greater than scankey.
 *
-* The scankey we get has the compare function stored in the procedure
-* entry of each data struct. We invoke this regproc to do the
-* comparison for every key in the scankey.
-*
 * On a leaf page, _bt_binsrch() returns the OffsetNumber of the first
 * key >= given scankey, or > scankey if nextkey is true. (NOTE: in
 * particular, this means it is possible to return a value 1 greater than the
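
Reviewer sketch (not part of the commit): the nextkey semantics described for the leaf-page case can be illustrated with a binary search over a plain sorted array. This assumes integer keys and 0-based array indexes rather than OffsetNumbers, and `first_at_or_after` is a hypothetical name. As in the comment above, the result can be one position past the last element.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return the index of the first element >= key (nextkey = false) or the
 * first element > key (nextkey = true). Returns n if no such element exists.
 */
size_t
first_at_or_after(const int *items, size_t n, int key, bool nextkey)
{
    size_t low = 0;
    size_t high = n;
    int    cmpval = nextkey ? 0 : 1;    /* threshold for "keep going right" */

    while (low < high)
    {
        size_t mid = low + (high - low) / 2;
        /* 3-way comparison of key against items[mid]: -1, 0 or 1 */
        int    result = (key > items[mid]) - (key < items[mid]);

        if (result >= cmpval)
            low = mid + 1;      /* items[mid] is before the target position */
        else
            high = mid;
    }
    return low;
}
```

Folding nextkey into a single threshold means the loop body is the same for both cases; only the treatment of equal keys changes.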
@@ -301,8 +306,11 @@ _bt_binsrch(Relation rel,
 /*----------
 * _bt_compare() -- Compare scankey to a particular tuple on the page.
 *
+* The passed scankey must be an insertion-type scankey (see nbtree/README),
+* but it can omit the rightmost column(s) of the index.
+*
 * keysz: number of key conditions to be checked (might be less than the
-* total length of the scan key!)
+* number of index columns!)
 * page/offnum: location of btree item to be compared to.
 *
 * This routine returns:
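
Reviewer sketch (not part of the commit): the point that keysz may be smaller than the number of index columns can be shown with a comparison that inspects only the leading columns. The sketch assumes a hypothetical two-column integer index (NCOLS) and hypothetical function names; columns beyond keysz are simply left unconstrained, so every tuple sharing the prefix compares as equal.

```c
#include <assert.h>
#include <stddef.h>

#define NCOLS 2     /* hypothetical two-column index */

/* 3-way comparison of the first keysz columns of key against tuple. */
int
prefix_compare(const int *key, size_t keysz, const int *tuple)
{
    for (size_t i = 0; i < keysz && i < NCOLS; i++)
    {
        if (key[i] < tuple[i])
            return -1;
        if (key[i] > tuple[i])
            return 1;
    }
    return 0;   /* checked columns are equal; the rest are not examined */
}

/* Index of the first tuple whose leading keysz columns are >= key. */
size_t
first_with_prefix(int tuples[][NCOLS], size_t n, const int *key, size_t keysz)
{
    size_t low = 0;
    size_t high = n;

    while (low < high)
    {
        size_t mid = low + (high - low) / 2;

        if (prefix_compare(key, keysz, tuples[mid]) > 0)
            low = mid + 1;
        else
            high = mid;
    }
    return low;
}
```

With tuples sorted as (1,5), (2,1), (2,7), (3,0), a one-column key of 2 lands on (2,1), while the full key (2,7) lands on (2,7).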
@@ -464,12 +472,17 @@ _bt_next(IndexScanDesc scan, ScanDirection dir)
 /*
 * _bt_first() -- Find the first item in a scan.
 *
-* We need to be clever about the type of scan, the operation it's
-* performing, and the tree ordering. We find the
-* first item in the tree that satisfies the qualification
-* associated with the scan descriptor. On exit, the page containing
+* We need to be clever about the direction of scan, the search
+* conditions, and the tree ordering. We find the first item (or,
+* if backwards scan, the last item) in the tree that satisfies the
+* qualifications in the scan key. On exit, the page containing
 * the current index tuple is read locked and pinned, and the scan's
 * opaque data entry is updated to include the buffer.
+*
+* Note that scan->keyData[], and the so->keyData[] scankey built from it,
+* are both search-type scankeys (see nbtree/README for more about this).
+* Within this routine, we build a temporary insertion-type scankey to use
+* in locating the scan start position.
 */
 bool
 _bt_first(IndexScanDesc scan, ScanDirection dir)
@@ -537,6 +550,9 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
 * equality quals survive preprocessing, however, it doesn't matter which
 * one we use --- by definition, they are either redundant or
 * contradictory.
+*
+* The selected scan keys (at most one per index column) are remembered by
+* storing their addresses into the local startKeys[] array.
 *----------
 */
 strat_total = BTEqualStrategyNumber;
@@ -631,9 +647,10 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
 return _bt_endpoint(scan, dir);
 
 /*
-* We want to start the scan somewhere within the index. Set up a
-* 3-way-comparison scankey we can use to search for the boundary point we
-* identified above.
+* We want to start the scan somewhere within the index. Set up an
+* insertion scankey we can use to search for the boundary point we
+* identified above. The insertion scankey is built in the local
+* scankeys[] array, using the keys identified by startKeys[].
 */
 Assert(keysCount <= INDEX_MAX_KEYS);
 for (i = 0; i < keysCount; i++)
@@ -681,19 +698,20 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
 }
 }
 
-/*
+/*----------
 * Examine the selected initial-positioning strategy to determine exactly
 * where we need to start the scan, and set flag variables to control the
 * code below.
 *
 * If nextkey = false, _bt_search and _bt_binsrch will locate the first
-* item >= scan key. If nextkey = true, they will locate the first item >
-* scan key.
+* item >= scan key. If nextkey = true, they will locate the first
+* item > scan key.
 *
-* If goback = true, we will then step back one item, while if goback =
-* false, we will start the scan on the located item.
+* If goback = true, we will then step back one item, while if
+* goback = false, we will start the scan on the located item.
 *
 * it's yet other place to add some code later for is(not)null ...
+*----------
 */
 switch (strat_total)
 {
@@ -774,8 +792,8 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
 }
 
 /*
-* Use the manufactured scan key to descend the tree and position
-* ourselves on the target leaf page.
+* Use the manufactured insertion scan key to descend the tree and
+* position ourselves on the target leaf page.
 */
 stack = _bt_search(rel, keysCount, scankeys, nextkey, &buf, BT_READ);
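
Reviewer sketch (not part of the commit): the hunks above describe the nextkey and goback flags but do not show the body of the switch on strat_total. The mapping below is derived only from the comment's definitions of the two flags, applied to a sorted integer array; it is an illustration, not a copy of the PostgreSQL code. The enum values, StartPlan, plan_start and scan_start_index are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical local strategy codes, defined here so the sketch stands alone. */
enum
{
    SKETCH_LESS = 1,
    SKETCH_LESS_EQUAL,
    SKETCH_EQUAL,
    SKETCH_GREATER_EQUAL,
    SKETCH_GREATER
};

typedef struct StartPlan
{
    bool nextkey;   /* false: locate first item >= key; true: first item > key */
    bool goback;    /* true: step back one item after locating */
} StartPlan;

StartPlan
plan_start(int strategy, bool backward)
{
    StartPlan plan = {false, false};    /* default: first item >= key */

    switch (strategy)
    {
        case SKETCH_LESS:           /* last item < key */
            plan.goback = true;
            break;
        case SKETCH_LESS_EQUAL:     /* last item <= key */
            plan.nextkey = true;
            plan.goback = true;
            break;
        case SKETCH_EQUAL:          /* first equal item, or last if backward */
            plan.nextkey = backward;
            plan.goback = backward;
            break;
        case SKETCH_GREATER:        /* first item > key */
            plan.nextkey = true;
            break;
        default:                    /* SKETCH_GREATER_EQUAL and anything else */
            break;
    }
    return plan;
}

/* Starting index in a sorted array; -1 means "before the first item". */
long
scan_start_index(const int *items, long n, int key, int strategy, bool backward)
{
    StartPlan plan = plan_start(strategy, backward);
    long      low = 0;
    long      high = n;

    while (low < high)
    {
        long mid = low + (high - low) / 2;
        bool go_right = plan.nextkey ? (items[mid] <= key) : (items[mid] < key);

        if (go_right)
            low = mid + 1;
        else
            high = mid;
    }
    return plan.goback ? low - 1 : low;
}
```

For the array {10, 20, 20, 30} and key 20, this yields index 0 for "<", 2 for "<=", 1 for ">=" and 3 for ">", which matches the "first item, or last item for a backwards scan" wording in the _bt_first() header comment.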

src/backend/access/nbtree/nbtutils.c

+9 -8
@@ -8,7 +8,7 @@
 *
 *
 * IDENTIFICATION
-* $PostgreSQL: pgsql/src/backend/access/nbtree/nbtutils.c,v 1.67 2005/12/07 19:37:53 tgl Exp $
+* $PostgreSQL: pgsql/src/backend/access/nbtree/nbtutils.c,v 1.68 2006/01/17 00:09:01 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -23,7 +23,7 @@
 
 /*
 * _bt_mkscankey
-* Build a scan key that contains comparison data from itup
+* Build an insertion scan key that contains comparison data from itup
 * as well as comparator routines appropriate to the key datatypes.
 *
 * The result is intended for use with _bt_compare().
@@ -67,11 +67,12 @@ _bt_mkscankey(Relation rel, IndexTuple itup)
 
 /*
 * _bt_mkscankey_nodata
-* Build a scan key that contains comparator routines appropriate to
-* the key datatypes, but no comparison data. The comparison data
-* ultimately used must match the key datatypes.
+* Build an insertion scan key that contains 3-way comparator routines
+* appropriate to the key datatypes, but no comparison data. The
+* comparison data ultimately used must match the key datatypes.
 *
-* The result cannot be used with _bt_compare(). Currently this
+* The result cannot be used with _bt_compare(), unless comparison
+* data is first stored into the key entries. Currently this
 * routine is only called by nbtsort.c and tuplesort.c, which have
 * their own comparison routines.
 */
@@ -160,7 +161,7 @@ _bt_formitem(IndexTuple itup)
 /*----------
 * _bt_preprocess_keys() -- Preprocess scan keys
 *
-* The caller-supplied keys (in scan->keyData[]) are copied to
+* The caller-supplied search-type keys (in scan->keyData[]) are copied to
 * so->keyData[] with possible transformation. scan->numberOfKeys is
 * the number of input keys, so->numberOfKeys gets the number of output
 * keys (possibly less, never greater).
@@ -485,7 +486,7 @@ _bt_preprocess_keys(IndexScanDesc scan)
 * accordingly. See comments for _bt_preprocess_keys(), above, about how
 * this is done.
 *
-* scan: index scan descriptor
+* scan: index scan descriptor (containing a search-type scankey)
 * page: buffer page containing index tuple
 * offnum: offset number of index tuple (must be a valid item!)
 * dir: direction we are scanning in
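
Reviewer sketch (not part of the commit): the README change says the search scankey is consulted for each entry to decide both whether to return it and whether the scan can stop. The single-column, forward-only sketch below illustrates that two-part answer. The `stops_forward_scan` flag is a hypothetical simplification introduced here to mark quals whose failure means no later entry can match; the names Qual, check_entry and forward_scan are likewise assumptions and do not reflect the real _bt_checkkeys() signature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef bool (*QualFn)(int value, int argument);

typedef struct Qual
{
    QualFn op;                  /* boolean operator, search-scankey style */
    int    argument;
    bool   stops_forward_scan;  /* hypothetical: failure ends a forward scan */
} Qual;

bool qual_ge(int value, int argument) { return value >= argument; }
bool qual_lt(int value, int argument) { return value < argument; }

/*
 * Returns true if value satisfies every qual. On failure, *continue_scan is
 * set to false when the failed qual proves that no later entry can match.
 */
bool
check_entry(const Qual *quals, size_t nquals, int value, bool *continue_scan)
{
    *continue_scan = true;
    for (size_t i = 0; i < nquals; i++)
    {
        if (!quals[i].op(value, quals[i].argument))
        {
            if (quals[i].stops_forward_scan)
                *continue_scan = false;
            return false;
        }
    }
    return true;
}

/* Forward scan from 'start' over sorted values; returns the match count. */
size_t
forward_scan(const int *values, size_t n, size_t start,
             const Qual *quals, size_t nquals, int *out)
{
    size_t nout = 0;

    for (size_t i = start; i < n; i++)
    {
        bool continue_scan;

        if (check_entry(quals, nquals, values[i], &continue_scan))
            out[nout++] = values[i];
        else if (!continue_scan)
            break;              /* upper bound passed; stop early */
    }
    return nout;
}
```

Here the starting position would come from an insertion-style search, after which only the boolean quals are evaluated per entry.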
