Re: Add parallel columns for seq scan and index scan on pg_stat_all_tables and _indexes
От | Alena Rybakina |
---|---|
Тема | Re: Add parallel columns for seq scan and index scan on pg_stat_all_tables and _indexes |
Дата | |
Msg-id | 27273ef8-e211-4ad9-ba55-d3947bd179b8@postgrespro.ru обсуждение исходный текст |
Ответ на | Re: Add parallel columns for seq scan and index scan on pg_stat_all_tables and _indexes (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>) |
Список | pgsql-hackers |
On 07.10.2024 03:41, Michael Paquier wrote: > On Mon, Oct 07, 2024 at 12:43:18AM +0300, Alena Rybakina wrote: >> Maybe I'm not aware of the whole context of the thread and maybe my >> questions will seem a bit stupid, but honestly >> it's not entirely clear to me how this statistics will help to adjust the >> number of parallel workers. >> We may have situations when during overestimation of the cardinality during >> query optimization a several number of parallel workers were unjustifiably >> generated and vice versa - >> due to a high workload only a few number of workers were generated. >> How do we identify such cases so as not to increase or decrease the number >> of parallel workers when it is not necessary? > Well. For spiky workloads, only these numbers are not going to help. > If you can map them with the number of times a query related to these > tables has been called, something that pg_stat_statements would be > able to show more information about. > > FWIW, I have doubts that these numbers attached to this portion of the > system are always useful. For OLTP workloads, parallel workers would > unlikely be spawned because even with JOINs we won't work with a high > number of tuples that require them. This could be interested with > analytics, however complex query sequences mean that we'd still need > to look at all the plans involving the relations where there is an > unbalance of planned/spawned workers, because these can usually > involve quite a few gather nodes. At the end of the day, it seems to > me that we would still need data that involves statements to track > down specific plans that are starving. If your application does not > have that many statements, looking at individial plans is OK, but if > you have hundreds of them to dig into, this is time-consuming and > stats at table/index level don't offer data in terms of stuff that > stands out and needs adjustments. > > And this is without the argument of bloating more the stats entries > for each table, even if it matters less now that these stats are in > shmem lately. To be honest, it’s not entirely clear to me how these statistics will help in setting up parallel workers. As I understand, we need additional tools for analytics, which are available in pg_stat_statements, but how then does it work? maybe you have the opportunity to demonstrate this? -- Regards, Alena Rybakina Postgres Professional
В списке pgsql-hackers по дате отправления: