Re: Merging statistics from children instead of re-sampling everything
From | Andrey Lepikhov |
---|---|
Subject | Re: Merging statistics from children instead of re-sampling everything |
Date | |
Msg-id | bdb0bea2-a0da-1f1d-5c92-96ff90c198eb@postgrespro.ru |
In response to | Re: Merging statistics from children instead of re-sampling everything (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Responses |
Re: Merging statistics from children instead of re-sampling everything
|
List | pgsql-hackers |
On 21/1/2022 01:25, Tomas Vondra wrote:
> But I don't have a very good idea what to do about statistics that we
> can't really merge. For some types of statistics it's rather tricky to
> reasonably merge the results - ndistinct is a simple example, although
> we could work around that by building and merging hyperloglog counters.

I think that, as a first step in this direction, we can reduce the number of tuples pulled. We don't really need to pull all tuples from a remote server. To construct a reservoir, we can pull only a sample of tuples. The reservoir method needs only a few arguments to return a sample equivalent to the one you would get by reading the tuples locally.

Also, to fetch these parts of the sample asynchronously, we can get the size of each partition in a preliminary step of the analysis.

In my opinion, even this solution can drastically reduce the severity of the problem.

--
regards,
Andrey Lepikhov
Postgres Professional
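
The sampling idea above can be illustrated with a minimal sketch. This is not PostgreSQL code and not a proposed patch: `reservoir_sample` and `allocate_targets` are hypothetical names, the reservoir uses the textbook Algorithm R, and splitting the sample in proportion to partition size is an assumption made for the example (it only approximates one uniform sample over the whole table).

```python
import random
from typing import Iterable, List, Sequence, TypeVar

T = TypeVar("T")


def reservoir_sample(rows: Iterable[T], k: int, rng: random.Random) -> List[T]:
    """Algorithm R: uniform sample of up to k rows from a stream of unknown length."""
    reservoir: List[T] = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Keep the new row with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = row
    return reservoir


def allocate_targets(sizes: Sequence[int], total: int) -> List[int]:
    """Split a total sample size across partitions in proportion to their row counts.

    The sizes are what the hypothetical preliminary step would collect.
    """
    all_rows = sum(sizes)
    if all_rows == 0:
        return [0] * len(sizes)
    total = min(total, all_rows)
    targets = [total * n // all_rows for n in sizes]
    # Hand out the rows lost to integer rounding, largest remainder first.
    order = sorted(
        range(len(sizes)),
        key=lambda p: (total * sizes[p]) % all_rows,
        reverse=True,
    )
    for p in order[: total - sum(targets)]:
        targets[p] += 1
    return targets


if __name__ == "__main__":
    rng = random.Random(42)
    partitions = [range(0, 1000), range(1000, 4000), range(4000, 10000)]
    targets = allocate_targets([len(p) for p in partitions], 100)
    # Each partition could be sampled independently (and so asynchronously).
    sample = [
        row
        for part, k in zip(partitions, targets)
        for row in reservoir_sample(part, k, rng)
    ]
    print(targets, len(sample))
```

In the remote case, each server would run its own reservoir and ship only its share of rows, so the coordinator would receive about `total` tuples rather than every tuple of every partition.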