
Commit 07f0f6a

Speed up tail processing when hashing aligned C strings
After encountering the NUL terminator, the word-at-a-time loop exits and we must hash the remaining bytes. Previously we calculated the terminator's position and re-loaded the remaining bytes from the input string. We already have all the data we need in a register, so let's just mask off the bytes we need and hash them immediately. The mask can be cheaply computed without knowing the terminator's position. We still need that position for the length calculation, but the CPU can now do that in parallel with other work, shortening the dependency chain.

Ants Aasma and John Naylor

Discussion: https://postgr.es/m/CANwKhkP7pCiW_5fAswLhs71-JKGEz1c1%2BPC0a_w1fwY4iGMqUA%40mail.gmail.com
1 parent b1484a3 commit 07f0f6a

1 file changed

+34
-10
lines changed


src/include/common/hashfn_unstable.h

@@ -219,8 +219,9 @@ static inline size_t
 fasthash_accum_cstring_aligned(fasthash_state *hs, const char *str)
 {
 	const char *const start = str;
-	size_t remainder;
+	uint64 chunk;
 	uint64 zero_byte_low;
+	uint64 mask;
 
 	Assert(PointerIsAligned(start, uint64));
 
@@ -239,7 +240,7 @@ fasthash_accum_cstring_aligned(fasthash_state *hs, const char *str)
 	 */
 	for (;;)
 	{
-		uint64 chunk = *(uint64 *) str;
+		chunk = *(uint64 *) str;
 
 #ifdef WORDS_BIGENDIAN
 		zero_byte_low = haszero64(pg_bswap64(chunk));
@@ -254,14 +255,37 @@ fasthash_accum_cstring_aligned(fasthash_state *hs, const char *str)
 		str += FH_SIZEOF_ACCUM;
 	}
 
-	/*
-	 * The byte corresponding to the NUL will be 0x80, so the rightmost bit
-	 * position will be in the range 7, 15, ..., 63. Turn this into byte
-	 * position by dividing by 8.
-	 */
-	remainder = pg_rightmost_one_pos64(zero_byte_low) / BITS_PER_BYTE;
-	fasthash_accum(hs, str, remainder);
-	str += remainder;
+	if (zero_byte_low & 0xFF)
+	{
+		/*
+		 * The next byte in the input is the NUL terminator, so we have
+		 * nothing to do.
+		 */
+	}
+	else
+	{
+		/*
+		 * Create a mask for the remaining bytes so we can combine them into
+		 * the hash. The mask also covers the NUL terminator, but that's
+		 * harmless. The mask could contain 0x80 in bytes corresponding to the
+		 * input past the terminator, but only where the input byte is zero or
+		 * one, so also harmless.
+		 */
+		mask = zero_byte_low | (zero_byte_low - 1);
+#ifdef WORDS_BIGENDIAN
+		/* need to mask the upper bytes */
+		mask = pg_bswap64(mask);
+#endif
+		hs->accum = chunk & mask;
+		fasthash_combine(hs);
+
+		/*
+		 * The byte corresponding to the NUL will be 0x80, so the rightmost
+		 * bit position will be in the range 15, 23, ..., 63. Turn this into
+		 * byte position by dividing by 8.
+		 */
+		str += pg_rightmost_one_pos64(zero_byte_low) / BITS_PER_BYTE;
+	}
 
 	return str - start;
 }
