Asynchronous MergeAppend
From | Alexander Pyhalov |
---|---|
Subject | Asynchronous MergeAppend |
Date | |
Msg-id | 59be194c5a409fb9fc9f2031581b8a44@postgrespro.ru |
Responses |
Re: Asynchronous MergeAppend
|
List | pgsql-hackers |
Hello.

I'd like to make the MergeAppend node async-capable, like the Append node. Currently, when the planner chooses a MergeAppend plan, asynchronous execution is not possible. With the attached patches you can see plans like

EXPLAIN (VERBOSE, COSTS OFF)
SELECT * FROM async_pt WHERE b % 100 = 0 ORDER BY b, a;
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Merge Append
   Sort Key: async_pt.b, async_pt.a
   ->  Async Foreign Scan on public.async_p1 async_pt_1
         Output: async_pt_1.a, async_pt_1.b, async_pt_1.c
         Remote SQL: SELECT a, b, c FROM public.base_tbl1 WHERE (((b % 100) = 0)) ORDER BY b ASC NULLS LAST, a ASC NULLS LAST
   ->  Async Foreign Scan on public.async_p2 async_pt_2
         Output: async_pt_2.a, async_pt_2.b, async_pt_2.c
         Remote SQL: SELECT a, b, c FROM public.base_tbl2 WHERE (((b % 100) = 0)) ORDER BY b ASC NULLS LAST, a ASC NULLS LAST

This can be quite profitable: in our test cases, MergeAppend async execution on remote servers was up to two times faster.

The code for asynchronous execution in MergeAppend was mostly borrowed from the Append node. The significant difference is that ExecMergeAppendAsyncGetNext() must return a tuple from the specified slot. The subplan number determines the tuple slot into which data should be retrieved. When a subplan is ready to provide some data, that data is cached in ms_asyncresults. When we get a tuple for the subplan specified in ExecMergeAppendAsyncGetNext(), ExecMergeAppendAsyncRequest() returns true and the loop in ExecMergeAppendAsyncGetNext() ends.

We can fetch data only for subplans that either don't have a cached result ready or have already returned it to the upper node. The flag that tracks this is stored in ms_has_asyncresults. As we can get data for a subplan either before or after the loop in ExecMergeAppendAsyncRequest(), we check this flag twice in that function.
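To illustrate the caching described above, here is a minimal, self-contained sketch in C. It is not code from the patch: ToyMergeAppend and all toy_* functions are hypothetical names, rows are plain non-negative integers, and a round-robin cursor stands in for the real event wait. The request and wait steps are collapsed into one function, so the sketch checks the flag once, not twice. It only shows the idea that a wait may deliver a row for a subplan other than the requested one, and that such a row is cached until the merge asks for it.

```c
#include <assert.h>
#include <stdbool.h>

#define NSUBPLANS 2
#define NO_TUPLE  (-1)   /* toy end-of-data marker; rows must be >= 0 */

/* Hypothetical stand-in for the async-related fields of the node state. */
typedef struct ToyMergeAppend
{
    const int *data[NSUBPLANS];             /* pre-sorted "remote" rows */
    int        ndata[NSUBPLANS];
    int        pos[NSUBPLANS];              /* next row each subplan delivers */
    int        asyncresults[NSUBPLANS];     /* cached row per subplan */
    bool       has_asyncresults[NSUBPLANS]; /* cached row not consumed yet? */
    int        cursor;                      /* which subplan "answers" next */
    int        nwaits;                      /* number of simulated waits */
} ToyMergeAppend;

static void
toy_init(ToyMergeAppend *ma, const int *a, int na, const int *b, int nb)
{
    *ma = (ToyMergeAppend) {0};
    ma->data[0] = a; ma->ndata[0] = na;
    ma->data[1] = b; ma->ndata[1] = nb;
    ma->cursor = 1;              /* let subplan 1 answer first */
}

/* A subplan may be fetched only if it holds no unconsumed cached row. */
static bool
toy_can_fetch(const ToyMergeAppend *ma, int i)
{
    return !ma->has_asyncresults[i] && ma->pos[i] < ma->ndata[i];
}

/* Simulated wait: one fetchable subplan delivers a row into the cache. */
static void
toy_event_wait(ToyMergeAppend *ma)
{
    for (int tries = 0; tries < NSUBPLANS; tries++)
    {
        int i = ma->cursor;

        ma->cursor = (ma->cursor + 1) % NSUBPLANS;
        if (toy_can_fetch(ma, i))
        {
            ma->asyncresults[i] = ma->data[i][ma->pos[i]++];
            ma->has_asyncresults[i] = true;
            ma->nwaits++;
            return;
        }
    }
    assert(false && "toy_event_wait called with nothing to fetch");
}

/* Return the next row of subplan mplan, waiting until it is cached. */
static int
toy_async_get_next(ToyMergeAppend *ma, int mplan)
{
    while (!ma->has_asyncresults[mplan])
    {
        if (ma->pos[mplan] >= ma->ndata[mplan])
            return NO_TUPLE;                 /* subplan exhausted */
        toy_event_wait(ma);                  /* may cache another subplan */
    }
    ma->has_asyncresults[mplan] = false;     /* consumed: fetchable again */
    return ma->asyncresults[mplan];
}

/* Merge all subplans into out[] in ascending order; returns row count. */
static int
toy_merge(ToyMergeAppend *ma, int *out)
{
    int head[NSUBPLANS];
    int n = 0;

    for (int i = 0; i < NSUBPLANS; i++)
        head[i] = toy_async_get_next(ma, i);

    for (;;)
    {
        int best = -1;

        for (int i = 0; i < NSUBPLANS; i++)
            if (head[i] != NO_TUPLE && (best < 0 || head[i] < head[best]))
                best = i;
        if (best < 0)
            return n;
        out[n++] = head[best];
        head[best] = toy_async_get_next(ma, best);
    }
}
```

In this toy, asking for the first row of subplan 0 takes two waits, because subplan 1 answers first and its row is cached. The following request for subplan 1 is then served from the cache without any further wait.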
Unlike ExecAppendAsyncEventWait(), it seems ExecMergeAppendAsyncEventWait() doesn't need a timeout, as there's no need to get a result from a synchronous subplan if a tuple from an async one was explicitly requested.

Also, we had to fix postgres_fdw to avoid looking directly at Append fields. The accessors to Append fields may look strange, but they allow us to avoid some code duplication. I suppose there could be even less duplication if we reworked the async Append implementation, but so far I haven't tried to do this, to avoid a big diff from master.

Also, mark_async_capable() assumes that the path corresponds to the plan. This may not be true when create_[merge_]append_plan() inserts a Sort node. In this case mark_async_capable() can treat the Sort plan node as some other node type and crash, so there's a small fix for this.

--
Best regards,
Alexander Pyhalov,
Postgres Professional