Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit c60e520

Browse files
committed
Use memcpy instead of a byte loop in pglz_decompress
The byte loop used in pglz_decompress() because of possible overlap may be quite inefficient, so this commit replaces it with memcpy. The gains do depend on the data (compressibility) and hardware, but seem to be quite significant. Author: Andrey Borodin Reviewed-by: Michael Paquier, Konstantin Knizhnik, Tels Discussion: https://postgr.es/m/469C9ED9-348C-4FE7-A7A7-B0FA671BEE4C@yandex-team.ru
1 parent 6d61c3f commit c60e520

File tree

1 file changed

+41
-11
lines changed

1 file changed

+41
-11
lines changed

src/common/pg_lzcompress.c

+41-11
Original file line numberDiff line numberDiff line change
@@ -714,11 +714,13 @@ pglz_decompress(const char *source, int32 slen, char *dest,
714714
if (ctrl & 1)
715715
{
716716
/*
717-
* Otherwise it contains the match length minus 3 and the
718-
* upper 4 bits of the offset. The next following byte
719-
* contains the lower 8 bits of the offset. If the length is
720-
* coded as 18, another extension tag byte tells how much
721-
* longer the match really was (0-255).
717+
* Set control bit means we must read a match tag. The match
718+
* is coded with two bytes. First byte uses lower nibble to
719+
* code length - 3. Higher nibble contains upper 4 bits of the
720+
* offset. The next following byte contains the lower 8 bits
721+
* of the offset. If the length is coded as 18, another
722+
* extension tag byte tells how much longer the match really
723+
* was (0-255).
722724
*/
723725
int32 len;
724726
int32 off;
@@ -731,16 +733,44 @@ pglz_decompress(const char *source, int32 slen, char *dest,
731733

732734
/*
733735
* Now we copy the bytes specified by the tag from OUTPUT to
734-
* OUTPUT. It is dangerous and platform dependent to use
735-
* memcpy() here, because the copied areas could overlap
736-
* extremely!
736+
* OUTPUT (copy len bytes from dp - off to dp). The copied
737+
* areas could overlap, to preven possible uncertainty, we
738+
* copy only non-overlapping regions.
737739
*/
738740
len = Min(len, destend - dp);
739-
while (len--)
741+
while (off < len)
740742
{
741-
*dp = dp[-off];
742-
dp++;
743+
/*---------
744+
* When offset is smaller than length - source and
745+
* destination regions overlap. memmove() is resolving
746+
* this overlap in an incompatible way with pglz. Thus we
747+
* resort to memcpy()-ing non-overlapping regions.
748+
*
749+
* Consider input: 112341234123412341234
750+
* At byte 5 here ^ we have match with length 16 and
751+
* offset 4. 11234M(len=16, off=4)
752+
* We are decoding first period of match and rewrite match
753+
* 112341234M(len=12, off=8)
754+
*
755+
* The same match is now at position 9, it points to the
756+
* same start byte of output, but from another position:
757+
* the offset is doubled.
758+
*
759+
* We iterate through this offset growth until we can
760+
* proceed to usual memcpy(). If we would try to decode
761+
* the match at byte 5 (len=16, off=4) by memmove() we
762+
* would issue memmove(5, 1, 16) which would produce
763+
* 112341234XXXXXXXXXXXX, where series of X is 12
764+
* undefined bytes, that were at bytes [5:17].
765+
*---------
766+
*/
767+
memcpy(dp, dp - off, off);
768+
len -= off;
769+
dp += off;
770+
off += off;
743771
}
772+
memcpy(dp, dp - off, len);
773+
dp += len;
744774
}
745775
else
746776
{

0 commit comments

Comments
 (0)