Comparing changes

In a plain join, we can just summarily discard an input tuple with null join key(s), since it cannot match anything from the other side of the join (assuming a strict join operator). However, if the tuple comes from the outer side of an outer join then we have to emit it with null-extension of the other side. Up to now, hash joins did that by inserting the tuple into the hash table as though it were a normal tuple. This is unnecessarily inefficient though, since the required processing is far simpler than for a potentially-matchable tuple. Worse, if there are a lot of such tuples they will bloat the hash bucket they go into, possibly causing useless repeated attempts to split that bucket or increase the number of batches. We have a report of a large join vainly creating many thousands of batches when faced with such input. This patch improves the situation by keeping such tuples out of the hash table altogether, instead pushing them into a separate tuplestore from which we return them later. (One might consider trying to return them immediately; but that would require substantial refactoring, and it doesn't work anyway for the case where we rescan an unmodified hash table.) This works even in parallel hash joins, because whichever worker reads a null-keyed tuple can just return it; there's no need for consultation with other workers. Thus the tuplestores are local storage even in a parallel join.

…keys This branch was automatically generated by a robot using patches from an email thread registered at: https://commitfest.postgresql.org/patch/5738 The branch will be overwritten each time a new patch version is posted to the thread, and also periodically to check for bitrot caused by changes on the master branch. Patch(es): https://www.postgresql.org/message-id/3061845.1746486714@sss.pgh.pa.us Author(s): Tom Lane

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Comparing changes

Open a pull request

Commits on May 22, 2025

This comparison is taking too long to generate.

Uh oh!