Re: [HACKERS] WIP: Aggregation push-down
От | Antonin Houska |
---|---|
Тема | Re: [HACKERS] WIP: Aggregation push-down |
Дата | |
Msg-id | 23468.1551369774@localhost обсуждение исходный текст |
Ответ на | Re: [HACKERS] WIP: Aggregation push-down (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: [HACKERS] WIP: Aggregation push-down
|
Список | pgsql-hackers |
Tom Lane <tgl@sss.pgh.pa.us> wrote: > Antonin Houska <ah@cybertec.at> writes: > > Michael Paquier <michael@paquier.xyz> wrote: > >> Latest patch set fails to apply, so moved to next CF, waiting on > >> author. > > > Rebased. > > This is in need of rebasing again :-(. I went ahead and pushed the 001 > part, since that seemed fairly uncontroversial. ok, thanks. > I did not spend a whole lot of time looking at the patch today, but > I'm still pretty distressed at the data structures you've chosen. > I remain of the opinion that a grouped relation and a base relation > are, er, unrelated, even if they happen to share the same relid set. > So I do not see the value of the RelOptInfoSet struct you propose here, ok. As you suggested upthread, I try now to reuse the join_rel_list / join_rel_hash structures, see v11-001-Introduce_RelInfoList.patch. > and I definitely don't think there's any value in having, eg, > create_seqscan_path or create_index_path dealing with this stuff. Originally I tried to aggregate any path that ever gets passed to agg_path(), but that's probably not worth the code complexity. Now the partial aggregation is only applied to paths that survived agg_path() on the plain relation. > I also don't like changing create_nestloop_path et al to take a PathTarget > rather than using the RelOptInfo's pathtarget; IMO, it's flat out wrong > for a path to generate a tlist different from what its parent RelOptInfo > says that the relation produces. Likewise, the current patch version is less invasive, so create_nestloop_path et al are not touched at all. > I think basically the way this ought to work is that we generate baserel > paths pretty much the same as today, and then we run through the baserels > and see which ones are potentially worth doing partial aggregation on, > and for each one that is, we create a separate "upper relation" RelOptInfo > describing that. The paths for this RelOptInfo would be > partial-aggregation paths using paths from the corresponding baserel as > input. Then we'd run a join search that only considers joining grouped > rels with plain rels (I concur that joining two grouped rels is not worth > coping with, at least for now). make_join_rel() is the core of my implementation: besides joining two plain relations it tries to join plain relation to grouped one, and also to aggregate the join of the two plain relations. I consider this approach less invasive and more efficient than running the whole standard_join_search again for the grouped rels. The problem of numeric-like data types (i.e. types for wich equality of two values of the grouping key does not justify putting them into the same group because information like scale would be discarded this way) remains open. My last idea was to add a boolean flag to operator class which tells that equality implies "bitwise equality", and to disallow aggregate push-down if SortGroupClause.eqop is in an opclass which has this field FALSE. I'd like to hear your opinion before I do any coding. -- Antonin Houska https://www.cybertec-postgresql.com
Вложения
В списке pgsql-hackers по дате отправления: