[HACKERS] Lazy hash table for XidInMVCCSnapshot (helps Zipfian a bit)
From | Sokolov Yura |
---|---|
Subject | [HACKERS] Lazy hash table for XidInMVCCSnapshot (helps Zipfian a bit) |
Date | |
Msg-id | 35960b8af917e9268881cd8df3f88320@postgrespro.ru |
Responses |
Re: [HACKERS] Lazy hash table for XidInMVCCSnapshot (helps Zipfian a bit)
|
List | pgsql-hackers |
Good day, everyone.

While trying to improve YCSB performance on a zipfian distribution, I found
that significant time is spent in XidInMVCCSnapshot scanning the
snapshot->xip array. While the overall CPU time is not too noticeable, it has
a measurable impact on scalability.

First I tried sorting snapshot->xip in GetSnapshotData and searching the
sorted array. But since snapshot->xip is not touched if no transaction
contention occurs, always sorting xip is not the best option.

Then I sorted the xip array on demand in XidInMVCCSnapshot, only when a
search in snapshot->xip occurs (i.e. lazy sorting). It performs much better,
but since the sort is O(N log N), its execution time is noticeable for a
large number of clients.

The third approach (present in the attached patch) is building a hash table
lazily on the first search in the xip array.

Note: the hash table is not built if the number of "in-progress" xids is
less than 60. Tests show there is no big benefit from building it for
smaller arrays (at least on Intel Xeon).

For this letter I tested with pgbench and random_exponential, updating rows
from pgbench_tellers (scale=300, so 3000 rows in the table). Scripts are in
the attached archive. With this test configuration, the numbers are quite
close to the numbers from the YCSB benchmark workloada.

Test machine is 4xXeon CPU E7-8890 - 72 cores (144 HT), fsync=on,
synchronous_commit=off.

Results:

clients |   master | hashsnap
--------+----------+----------
     25 |    67652 |    70017
     50 |   102781 |   102074
     75 |    81716 |    79440
    110 |    68286 |    69223
    150 |    56168 |    60713
    200 |    45073 |    48880
    250 |    36526 |    40893
    325 |    28363 |    32497
    400 |    22532 |    26639
    500 |    17423 |    21496
    650 |    12767 |    16461
    800 |     9599 |    13483

(Note: if pgbench_accounts is updated (30000000 rows), then the exponential
distribution behaves differently from zipfian with the parameter used.)

Remarks:
- it could be combined with "Cache data in GetSnapshotData"
  https://commitfest.postgresql.org/14/553/
- if CSN ever lands, there will be no need for this optimization at all.

PS.
Excuse me for the following little promotion of the lwlock patch
https://commitfest.postgresql.org/14/1166/

clients |   master | hashsnap | hashsnap+lwlock
--------+----------+----------+-----------------
     25 |    67652 |    70017 |          127601
     50 |   102781 |   102074 |          134545
     75 |    81716 |    79440 |          128655
    110 |    68286 |    69223 |          110420
    150 |    56168 |    60713 |           86715
    200 |    45073 |    48880 |           68266
    250 |    36526 |    40893 |           56638
    325 |    28363 |    32497 |           45704
    400 |    22532 |    26639 |           38247
    500 |    17423 |    21496 |           32668
    650 |    12767 |    16461 |           25488
    800 |     9599 |    13483 |           21326

With regards,
--
Sokolov Yura aka funny_falcon
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company
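
For readers who want a concrete picture of the third approach in the message above, here is a minimal standalone sketch of a lazily built hash table over an array of in-progress xids. It is not the code from the attached patch. The names ToySnapshot, hash_xid, build_hash and xid_in_xip, the hash function, the linear probing scheme and the table sizing are all assumptions made for illustration. Only the idea of building the table on the first lookup, and the threshold of 60, come from the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t Xid;               /* stand-in for TransactionId */

#define LAZY_HASH_THRESHOLD 60      /* from the mail: fewer xids than this, just scan */
#define EMPTY_SLOT 0                /* assumption: xid 0 is never stored in xip */

/* Hypothetical, simplified snapshot: only the fields this sketch needs. */
typedef struct ToySnapshot
{
    Xid        *xip;        /* in-progress xids, unsorted */
    uint32_t    xcnt;       /* number of entries in xip */
    Xid        *hash;       /* open-addressing table, NULL until first lookup */
    uint32_t    hash_mask;  /* table size minus one (size is a power of two) */
} ToySnapshot;

/* Arbitrary multiplicative hash; the patch may use something else. */
uint32_t
hash_xid(Xid xid)
{
    uint32_t    h = xid * 0x9E3779B1u;

    return h ^ (h >> 15);
}

/* Build the table once; on allocation failure the caller keeps scanning. */
void
build_hash(ToySnapshot *snap)
{
    uint32_t    size = 1;
    uint32_t    i;

    assert(snap->hash == NULL);

    /* Keep the load factor at or below 50% so probe sequences stay short. */
    while (size < snap->xcnt * 2)
        size <<= 1;

    snap->hash = calloc(size, sizeof(Xid));
    if (snap->hash == NULL)
        return;
    snap->hash_mask = size - 1;

    for (i = 0; i < snap->xcnt; i++)
    {
        uint32_t    pos = hash_xid(snap->xip[i]) & snap->hash_mask;

        /* Linear probing: walk forward to the next free slot. */
        while (snap->hash[pos] != EMPTY_SLOT)
            pos = (pos + 1) & snap->hash_mask;
        snap->hash[pos] = snap->xip[i];
    }
}

/* Returns true if xid is among the snapshot's in-progress xids. */
bool
xid_in_xip(ToySnapshot *snap, Xid xid)
{
    uint32_t    i;

    /* Lazy step: pay for the table only when a search actually happens. */
    if (snap->hash == NULL && snap->xcnt >= LAZY_HASH_THRESHOLD)
        build_hash(snap);

    if (snap->hash != NULL)
    {
        uint32_t    pos = hash_xid(xid) & snap->hash_mask;

        while (snap->hash[pos] != EMPTY_SLOT)
        {
            if (snap->hash[pos] == xid)
                return true;
            pos = (pos + 1) & snap->hash_mask;
        }
        return false;
    }

    /* Small array (or failed allocation): plain linear scan. */
    for (i = 0; i < snap->xcnt; i++)
    {
        if (snap->xip[i] == xid)
            return true;
    }
    return false;
}
```

A caller would fill xip and xcnt, leave hash set to NULL, and call xid_in_xip for each visibility check. A snapshot that is never searched pays nothing. One that is searched pays the O(N) build cost once, instead of the O(N log N) of lazy sorting.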
Attachments