Removing unneeded self joins
От | Alexander Kuzmenkov |
---|---|
Тема | Removing unneeded self joins |
Дата | |
Msg-id | 64486b0b-0404-e39e-322d-0801154901f3@postgrespro.ru обсуждение исходный текст |
Ответы |
Re: Removing unneeded self joins
Re: Removing unneeded self joins Re: Removing unneeded self joins |
Список | pgsql-hackers |
Hi hackers, There is a join optimization we don't do -- removing inner join of a table with itself on a unique column. Such joins are generated by various ORMs, so from time to time our customers ask us to look into this. Most recently, it was discussed on the list in relation to an article comparing the optimizations that some DBMS make [1]. I started to explore what can be done about this. Attached is a proof of concept patch. It works for some simple cases: create table tt(a int primary key, b text); explain select p.* from tt p join (select * from tt where b ~~ 'a%') q on p.a = q.a; QUERY PLAN ────────────────────────────────────────────────────── Seq Scan on tt p (cost=0.00..25.88 rows=6 width=36) Filter: (b ~~ 'a%'::text) It also works for semi-joins like `explain select p.* from tt p where exists (select * from tt where b ~~ 'a%' and a = p.a);`. This requires a preparatory step of reducing unique semi joins to inner joins, and we already do this (reduce_unique_semijoin). What this patch tries to do is to remove these inner joins when a single join is being planned (populate_joinrel_with_paths). The main entry point is reduce_self_unique_join. First, it proves that both input relations are uniquely constrained by the same index given the particular join clauses. We already have a way to find such indexes (relation_has_unique_index_for), so I was able to reuse this. What I'm not sure about is how to properly remove the join after that. For now, I just pretend that the join relation being built is the outer baserel, add to it the restrictions from the inner relation, and then plan it as usual. Maybe there is a less hacky way to do it? I've seen elsewhere a suggestion to use an AppendPath for a similar purpose, but here we can't just use the outer relation we've already planned because the restriction list is different. I'd be glad to hear your thoughts on this. [1] https://www.postgresql.org/message-id/flat/CAMjNa7cC4X9YR-vAJS-jSYCajhRDvJQnN7m2sLH1wLh-_Z2bsw%40mail.gmail.com#CAMjNa7cC4X9YR-vAJS-jSYCajhRDvJQnN7m2sLH1wLh-_Z2bsw@mail.gmail.com -- Alexander Kuzmenkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Вложения
В списке pgsql-hackers по дате отправления: