Re: Add parallel columns for seq scan and index scan on pg_stat_all_tables and _indexes

From	Alena Rybakina
Subject	Re: Add parallel columns for seq scan and index scan on pg_stat_all_tables and _indexes
Date	October 8, 2024, 18:20:17
Msg-id	27273ef8-e211-4ad9-ba55-d3947bd179b8@postgrespro.ru
In reply to	Re: Add parallel columns for seq scan and index scan on pg_stat_all_tables and _indexes (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>)
List	pgsql-hackers

On 07.10.2024 03:41, Michael Paquier wrote:
> On Mon, Oct 07, 2024 at 12:43:18AM +0300, Alena Rybakina wrote:
>> Maybe I'm not aware of the whole context of the thread and maybe my
>> questions will seem a bit stupid, but honestly
>> it's not entirely clear to me how these statistics will help to adjust the
>> number of parallel workers.
>> We may have situations where, due to overestimation of the cardinality during
>> query optimization, a number of parallel workers were unjustifiably
>> generated, and vice versa:
>> due to a high workload, only a few workers were generated.
>> How do we identify such cases so as not to increase or decrease the number
>> of parallel workers when it is not necessary?
> Well.  For spiky workloads, these numbers alone are not going to help.
> They could help if you can map them to the number of times a query
> related to these tables has been called, which is something that
> pg_stat_statements would be able to show more information about.
>
> FWIW, I have doubts that these numbers attached to this portion of the
> system are always useful.  For OLTP workloads, parallel workers are
> unlikely to be spawned because even with JOINs we won't work with a high
> number of tuples that require them.  This could be interesting for
> analytics, however complex query sequences mean that we'd still need
> to look at all the plans involving the relations where there is an
> imbalance of planned/spawned workers, because these can usually
> involve quite a few gather nodes.  At the end of the day, it seems to
> me that we would still need data that involves statements to track
> down specific plans that are starving.  If your application does not
> have that many statements, looking at individual plans is OK, but if
> you have hundreds of them to dig into, this is time-consuming and
> stats at table/index level don't offer data in terms of stuff that
> stands out and needs adjustments.
>
> And this is without the argument of further bloating the stats entries
> for each table, even if it matters less now that these stats have
> been moved to shmem.

To be honest, it’s not entirely clear to me how these statistics will 
help in setting up parallel workers.

As I understand it, we need additional data for analytics, which is
available in pg_stat_statements, but how would that work in practice?
Could you perhaps demonstrate this?
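
A minimal sketch of the kind of per-statement comparison discussed above,
assuming each statement exposes a call count plus planned and launched worker
totals. The StatementStats fields, the starving_statements helper, and the 0.8
threshold are hypothetical names and values chosen for illustration; they are
not actual pg_stat_statements columns or settings.

```python
from dataclasses import dataclass


@dataclass
class StatementStats:
    # Hypothetical per-statement counters; field names are illustrative only.
    query: str
    calls: int
    workers_planned: int
    workers_launched: int


def starving_statements(stats, min_calls=1, max_ratio=0.8):
    """Return (query, launched/planned ratio) pairs below max_ratio, worst first."""
    flagged = []
    for s in stats:
        # Skip statements that were rarely called or never planned any workers.
        if s.calls < min_calls or s.workers_planned == 0:
            continue
        ratio = s.workers_launched / s.workers_planned
        if ratio < max_ratio:
            flagged.append((s.query, ratio))
    # Lowest ratio first, so the most starved statements stand out.
    return sorted(flagged, key=lambda item: item[1])


# Made-up sample data, not measurements from a real system.
sample = [
    StatementStats("SELECT ... FROM orders", 100, 400, 400),
    StatementStats("SELECT ... FROM events", 50, 200, 80),
    StatementStats("SELECT ... FROM users", 1000, 0, 0),
]

for query, ratio in starving_statements(sample):
    print(f"{query}: {ratio:.0%} of planned workers launched")
```

With data at the statement level, only the statements whose launched/planned
ratio falls short are flagged, which is the information that counters kept per
table or per index would not single out.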

-- 
Regards,
Alena Rybakina
Postgres Professional
