Re: POC, WIP: OR-clause support for indexes
From: Andrei Lepikhov
Subject: Re: POC, WIP: OR-clause support for indexes
Date:
Msg-id: e3338e82-a28d-4631-9eec-b9c0984b32d5@postgrespro.ru
In response to: Re: POC, WIP: OR-clause support for indexes (jian he <jian.universality@gmail.com>)
Responses: Re: POC, WIP: OR-clause support for indexes
List: pgsql-hackers
On 13/2/2024 07:00, jian he wrote:
> + newa = makeNode(ArrayExpr);
> + /* array_collid will be set by parse_collate.c */
> + newa->element_typeid = scalar_type;
> + newa->array_typeid = array_type;
> + newa->multidims = false;
> + newa->elements = aexprs;
> + newa->location = -1;
>
> I am confused by the comment `array_collid will be set by
> parse_collate.c`, can you further explain it?

I wonder whether the second paragraph of the comments on commit b310b6e
is enough to dive into the details.

> if the OR expression's right arm is not a plain Const, but has a
> collation specification, e.g.
> `where a = 'a' collate "C" or a = 'b' collate "C";`
>
> then the rightop is not a Const, it will be a CollateExpr, and it will
> not be used in the transformation.

Yes, it is done for simplicity right now. I'm not sure about the corner
cases of merging such expressions.

> set enable_or_transformation to on;
> explain(timing off, analyze, costs off)
> select count(*) from test where (x = 1 or x = 2 or x = 3 or x = 4 or x
> = 5 or x = 6 or x = 7 or x = 8 or x = 9 ) \watch i=0.1 c=10
> 35.376 ms
>
> The time is the last result of the 10 iterations.

The reason here is parallel workers. If you look into the plans, you will
see parallel workers without the optimization and their absence with it:

Gather (cost=1000.00..28685.37 rows=87037 width=12) (actual rows=90363 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Parallel Seq Scan on test
        Filter: ((x = 1) OR (x = 2) OR (x = 3) OR (x = 4) OR (x = 5) OR
                 (x = 6) OR (x = 7) OR (x = 8) OR (x = 9))

Seq Scan on test (cost=0.02..20440.02 rows=90600 width=12) (actual rows=90363 loops=1)
  Filter: (x = ANY ('{1,2,3,4,5,6,7,8,9}'::integer[]))

With 90363 tuples actually returned, the row estimate is 87037 (less
precise) without the transformation and 90600 (more precise) with it.
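A back-of-envelope way to see why the planner drops the workers in the
transformed case: Gather charges parallel_setup_cost once plus
parallel_tuple_cost per tuple pushed to the leader (a simplification of
PostgreSQL's cost_gather; the numbers below use the stock GUC defaults
and the 90600-row estimate from the plan above):

```python
# Simplified Gather overhead, on top of the partial scan's own cost:
#   overhead = parallel_setup_cost + parallel_tuple_cost * rows
parallel_setup_cost = 1000.0   # PostgreSQL default
parallel_tuple_cost = 0.1      # PostgreSQL default
rows = 90600                   # planner's estimate for the ANY() filter

overhead = parallel_setup_cost + parallel_tuple_cost * rows
print(overhead)  # 10060.0
```

With roughly 10060 of pure Gather overhead added to the partial scan, the
serial plan at total cost 20440.02 wins, so lowering those two GUCs is
exactly what brings the workers back.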
But if you play with parallel_tuple_cost and parallel_setup_cost, you
will end up with these parallel workers:

Gather (cost=0.12..11691.03 rows=90600 width=12) (actual rows=90363 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Parallel Seq Scan on test
        Filter: (x = ANY ('{1,2,3,4,5,6,7,8,9}'::integer[]))
        Rows Removed by Filter: 303212

And some profit of about 25% on my laptop. I'm not sure about the origins
of such behavior, but it seems to be an issue of parallel workers, not of
this specific optimization.

-- 
regards, Andrei Lepikhov
Postgres Professional
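Stepping back to the transformation itself, its core idea can be sketched
as: collect the constants from same-column, same-operator OR arms and fold
them into a single `= ANY(array)` test. This toy Python model (not the
patch's C implementation; the function name and tuple shape are invented
for illustration, and it skips the CollateExpr case discussed above) shows
the grouping:

```python
# Toy model of the OR-to-ANY rewrite: fold "col = const OR col = const..."
# into "col = ANY (ARRAY[...])", leaving other clauses untouched.
from collections import defaultdict

def or_to_any(clauses):
    """clauses: (column, operator, constant) tuples joined by OR."""
    groups = defaultdict(list)
    for col, op, const in clauses:          # bucket by (column, operator)
        groups[(col, op)].append(const)
    parts = []
    for (col, op), consts in groups.items():
        if op == "=" and len(consts) > 1:   # fold multi-constant groups
            parts.append(f"{col} = ANY (ARRAY[{', '.join(map(str, consts))}])")
        else:                               # keep singletons / other ops
            parts.extend(f"{col} {op} {c}" for c in consts)
    return " OR ".join(parts)

print(or_to_any([("x", "=", 1), ("x", "=", 2), ("x", "=", 3)]))
# x = ANY (ARRAY[1, 2, 3])
print(or_to_any([("x", "=", 1), ("y", "<", 5)]))
# x = 1 OR y < 5
```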