|
| 1 | +From pgsql-sql-owner+M5999=candle.pha.pa.us=pgman@postgresql.org Mon Dec 17 01:39:56 2001 |
| 2 | +Return-path: <pgsql-sql-owner+M5999=candle.pha.pa.us=pgman@postgresql.org> |
| 3 | +Received: from rs.postgresql.org (server1.pgsql.org [64.39.15.238] (may be forged)) |
| 4 | + by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id fBH6du410376 |
| 5 | + for <pgman@candle.pha.pa.us>; Mon, 17 Dec 2001 01:39:56 -0500 (EST) |
| 6 | +Received: from postgresql.org (postgresql.org [64.49.215.8]) |
| 7 | + by rs.postgresql.org (8.11.6/8.11.6) with ESMTP id fBH6VoR80062 |
| 8 | + for <pgman@candle.pha.pa.us>; Mon, 17 Dec 2001 00:36:11 -0600 (CST) |
| 9 | + (envelope-from pgsql-sql-owner+M5999=candle.pha.pa.us=pgman@postgresql.org) |
| 10 | +Received: from sss.pgh.pa.us ([192.204.191.242]) |
| 11 | + by postgresql.org (8.11.3/8.11.4) with ESMTP id fBH6Lgm62418 |
| 12 | + for <pgsql-sql@postgresql.org>; Mon, 17 Dec 2001 01:21:42 -0500 (EST) |
| 13 | + (envelope-from tgl@sss.pgh.pa.us) |
| 14 | +Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) |
| 15 | + by sss.pgh.pa.us (8.11.4/8.11.4) with ESMTP id fBH6LHi29550; |
| 16 | + Mon, 17 Dec 2001 01:21:17 -0500 (EST) |
| 17 | +To: "Christopher Kings-Lynne" <chriskl@familyhealth.com.au> |
| 18 | +cc: "Stephan Szabo" <sszabo@megazone23.bigpanda.com>, |
| 19 | + "MindTerm" <mindterm@yahoo.com>, pgsql-sql@postgresql.org |
| 20 | +Subject: Re: [SQL] performance tuning in large function / transaction |
| 21 | +In-Reply-To: <GNELIHDDFBOCMGBFGEFOIENDCAAA.chriskl@familyhealth.com.au> |
| 22 | +References: <GNELIHDDFBOCMGBFGEFOIENDCAAA.chriskl@familyhealth.com.au> |
| 23 | +Comments: In-reply-to "Christopher Kings-Lynne" <chriskl@familyhealth.com.au> |
| 24 | + message dated "Mon, 17 Dec 2001 12:06:14 +0800" |
| 25 | +Date: Mon, 17 Dec 2001 01:21:16 -0500 |
| 26 | +Message-ID: <29547.1008570076@sss.pgh.pa.us> |
| 27 | +From: Tom Lane <tgl@sss.pgh.pa.us> |
| 28 | +Precedence: bulk |
| 29 | +Sender: pgsql-sql-owner@postgresql.org |
| 30 | +Status: OR |
| 31 | + |
| 32 | +"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes: |
| 33 | +> Is it true that the IN command is implemented sort of as a linked list |
| 34 | +> linear time search? Is there any plan for a super-fast implementation of |
| 35 | +> 'IN'? |
| 36 | + |
| 37 | +This deserves a somewhat long-winded answer. |
| 38 | + |
| 39 | +Postgres presently supports two kinds of IN (I'm not sure whether SQL92 |
| 40 | +allows any additional kinds): |
| 41 | + |
| 42 | +1. Scalar-list IN: foo IN ('bar', 'baz', 'quux', ...) |
| 43 | + |
| 44 | +2. Sub-select IN: foo IN (SELECT bar FROM ...) |
| 45 | + |
| 46 | +In the scalar-list form, a variable is compared to an explicit list of |
| 47 | +constants or expressions. This form is exactly equivalent to |
| 48 | + foo = 'bar' OR foo = 'baz' OR foo = 'quux' OR ... |
| 49 | +and is converted into that form by the parser. The planner is capable |
| 50 | +of converting a WHERE clause of this kind into multiple passes of |
| 51 | +indexscan, when foo is an indexed column and all the IN-list elements |
| 52 | +are constants. Whether it actually will make that conversion depends |
| 53 | +on the usual vagaries of pg_statistic entries, etc. But if it's a |
| 54 | +unique or fairly-selective index, and there aren't a huge number of |
| 55 | +entries in the IN list, a multiple indexscan should be a good plan. |
| 56 | + |
| 57 | +In the sub-select form, we pretty much suck: for each tuple in the outer |
| 58 | +query, we run the inner query until we find a matching value or the |
| 59 | +inner query ends. This is basically a nested-loop scenario, with the |
| 60 | +only (minimally) redeeming social value being that the planner realizes |
| 61 | +it should pick a fast-start plan for the inner query. I think it should |
| 62 | +be possible to convert this form into a modified kind of join (sort of |
| 63 | +the reverse of an outer join: rather than at least one result per |
| 64 | +lefthand row, at most one result per lefthand row), and then we could |
| 65 | +use join methods that are more efficient than nested-loop. But no one's |
| 66 | +tried to make that happen yet. |
| 67 | + |
| 68 | + regards, tom lane |
| 69 | + |
| 70 | +---------------------------(end of broadcast)--------------------------- |
| 71 | +TIP 2: you can get off all lists at once with the unregister command |
| 72 | + (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) |
| 73 | + |
| 74 | +From pgsql-sql-owner+M6000=candle.pha.pa.us=pgman@postgresql.org Mon Dec 17 01:49:56 2001 |
| 75 | +Return-path: <pgsql-sql-owner+M6000=candle.pha.pa.us=pgman@postgresql.org> |
| 76 | +Received: from rs.postgresql.org (server1.pgsql.org [64.39.15.238] (may be forged)) |
| 77 | + by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id fBH6nu410869 |
| 78 | + for <pgman@candle.pha.pa.us>; Mon, 17 Dec 2001 01:49:56 -0500 (EST) |
| 79 | +Received: from postgresql.org (postgresql.org [64.49.215.8]) |
| 80 | + by rs.postgresql.org (8.11.6/8.11.6) with ESMTP id fBH6fLR80303 |
| 81 | + for <pgman@candle.pha.pa.us>; Mon, 17 Dec 2001 00:45:51 -0600 (CST) |
| 82 | + (envelope-from pgsql-sql-owner+M6000=candle.pha.pa.us=pgman@postgresql.org) |
| 83 | +Received: from mail.iinet.net.au (symphony-05.iinet.net.au [203.59.3.37]) |
| 84 | + by postgresql.org (8.11.3/8.11.4) with SMTP id fBH6XFm62784 |
| 85 | + for <pgsql-sql@postgresql.org>; Mon, 17 Dec 2001 01:33:15 -0500 (EST) |
| 86 | + (envelope-from chriskl@familyhealth.com.au) |
| 87 | +Received: (qmail 30765 invoked by uid 666); 17 Dec 2001 06:33:10 -0000 |
| 88 | +Received: from unknown (HELO houston.familyhealth.com.au) (203.59.231.6) |
| 89 | + by mail.iinet.net.au with SMTP; 17 Dec 2001 06:33:10 -0000 |
| 90 | +Received: (from root@localhost) |
| 91 | + by houston.familyhealth.com.au (8.11.6/8.11.6) id fBH6XBH96532 |
| 92 | + for pgsql-sql@postgresql.org; Mon, 17 Dec 2001 14:33:11 +0800 (WST) |
| 93 | + (envelope-from chriskl@familyhealth.com.au) |
| 94 | +Received: from mariner (mariner.internal [192.168.0.101]) |
| 95 | + by houston.familyhealth.com.au (8.11.6/8.9.3) with SMTP id fBH6X7p96337; |
| 96 | + Mon, 17 Dec 2001 14:33:07 +0800 (WST) |
| 97 | +From: "Christopher Kings-Lynne" <chriskl@familyhealth.com.au> |
| 98 | +To: "Tom Lane" <tgl@sss.pgh.pa.us> |
| 99 | +cc: "Stephan Szabo" <sszabo@megazone23.bigpanda.com>, |
| 100 | + "MindTerm" <mindterm@yahoo.com>, <pgsql-sql@postgresql.org> |
| 101 | +Subject: [SQL] 'IN' performance |
| 102 | +Date: Mon, 17 Dec 2001 14:33:40 +0800 |
| 103 | +Message-ID: <GNELIHDDFBOCMGBFGEFOEENFCAAA.chriskl@familyhealth.com.au> |
| 104 | +MIME-Version: 1.0 |
| 105 | +Content-Type: text/plain; |
| 106 | + charset="iso-8859-1" |
| 107 | +Content-Transfer-Encoding: 7bit |
| 108 | +X-Priority: 3 (Normal) |
| 109 | +X-MSMail-Priority: Normal |
| 110 | +X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0) |
| 111 | +X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 |
| 112 | +Importance: Normal |
| 113 | +In-Reply-To: <29547.1008570076@sss.pgh.pa.us> |
| 114 | +X-scanner: scanned by Inflex 0.1.5c - (http://www.inflex.co.za/) |
| 115 | +Precedence: bulk |
| 116 | +Sender: pgsql-sql-owner@postgresql.org |
| 117 | +Status: OR |
| 118 | + |
| 119 | +> In the sub-select form, we pretty much suck: for each tuple in the outer |
| 120 | +> query, we run the inner query until we find a matching value or the |
| 121 | +> inner query ends. This is basically a nested-loop scenario, with the |
| 122 | +> only (minimally) redeeming social value being that the planner realizes |
| 123 | +> it should pick a fast-start plan for the inner query. I think it should |
| 124 | +> be possible to convert this form into a modified kind of join (sort of |
| 125 | +> the reverse of an outer join: rather than at least one result per |
| 126 | +> lefthand row, at most one result per lefthand row), and then we could |
| 127 | +> use join methods that are more efficient than nested-loop. But no one's |
| 128 | +> tried to make that happen yet. |
| 129 | + |
| 130 | +That's what I was thinking...where abouts does all that activity happen? |
| 131 | + |
| 132 | +I assume the planner knows that it doesn't have to reevaluate the subquery |
| 133 | +if it's not correlated? |
| 134 | + |
| 135 | +Chris |
| 136 | + |
| 137 | + |
| 138 | +---------------------------(end of broadcast)--------------------------- |
| 139 | +TIP 2: you can get off all lists at once with the unregister command |
| 140 | + (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) |
| 141 | + |
| 142 | +From pgsql-sql-owner+M6001=candle.pha.pa.us=pgman@postgresql.org Mon Dec 17 02:00:10 2001 |
| 143 | +Return-path: <pgsql-sql-owner+M6001=candle.pha.pa.us=pgman@postgresql.org> |
| 144 | +Received: from rs.postgresql.org (server1.pgsql.org [64.39.15.238] (may be forged)) |
| 145 | + by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id fBH709411405 |
| 146 | + for <pgman@candle.pha.pa.us>; Mon, 17 Dec 2001 02:00:09 -0500 (EST) |
| 147 | +Received: from postgresql.org (postgresql.org [64.49.215.8]) |
| 148 | + by rs.postgresql.org (8.11.6/8.11.6) with ESMTP id fBH6psR80624 |
| 149 | + for <pgman@candle.pha.pa.us>; Mon, 17 Dec 2001 00:56:15 -0600 (CST) |
| 150 | + (envelope-from pgsql-sql-owner+M6001=candle.pha.pa.us=pgman@postgresql.org) |
| 151 | +Received: from sss.pgh.pa.us ([192.204.191.242]) |
| 152 | + by postgresql.org (8.11.3/8.11.4) with ESMTP id fBH6iCm63171 |
| 153 | + for <pgsql-sql@postgresql.org>; Mon, 17 Dec 2001 01:44:12 -0500 (EST) |
| 154 | + (envelope-from tgl@sss.pgh.pa.us) |
| 155 | +Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) |
| 156 | + by sss.pgh.pa.us (8.11.4/8.11.4) with ESMTP id fBH6i3i29733; |
| 157 | + Mon, 17 Dec 2001 01:44:03 -0500 (EST) |
| 158 | +To: "Christopher Kings-Lynne" <chriskl@familyhealth.com.au> |
| 159 | +cc: "Stephan Szabo" <sszabo@megazone23.bigpanda.com>, |
| 160 | + "MindTerm" <mindterm@yahoo.com>, pgsql-sql@postgresql.org |
| 161 | +Subject: Re: [SQL] 'IN' performance |
| 162 | +In-Reply-To: <GNELIHDDFBOCMGBFGEFOEENFCAAA.chriskl@familyhealth.com.au> |
| 163 | +References: <GNELIHDDFBOCMGBFGEFOEENFCAAA.chriskl@familyhealth.com.au> |
| 164 | +Comments: In-reply-to "Christopher Kings-Lynne" <chriskl@familyhealth.com.au> |
| 165 | + message dated "Mon, 17 Dec 2001 14:33:40 +0800" |
| 166 | +Date: Mon, 17 Dec 2001 01:44:03 -0500 |
| 167 | +Message-ID: <29730.1008571443@sss.pgh.pa.us> |
| 168 | +From: Tom Lane <tgl@sss.pgh.pa.us> |
| 169 | +Precedence: bulk |
| 170 | +Sender: pgsql-sql-owner@postgresql.org |
| 171 | +Status: OR |
| 172 | + |
| 173 | +"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes: |
| 174 | +> That's what I was thinking...where abouts does all that activity happen? |
| 175 | + |
| 176 | +The infrastructure for different join rules already exists. There'd |
| 177 | +need to be a new JOIN_xxx type added to the various join nodes in the |
| 178 | +executor, but AFAICS that's just a minor extension. The part that is |
| 179 | +perhaps not trivial is in the planner. All the existing inner and outer |
| 180 | +join types start out expressed as joins in the original query. To make |
| 181 | +IN into a join, the planner would have to hoist up a clause from WHERE |
| 182 | +into the join-tree structure. I think it can be done, but I have not |
| 183 | +thought hard about where and how, nor about what semantic restrictions |
| 184 | +might need to be checked. |
| 185 | + |
| 186 | + regards, tom lane |
| 187 | + |
| 188 | +---------------------------(end of broadcast)--------------------------- |
| 189 | +TIP 2: you can get off all lists at once with the unregister command |
| 190 | + (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) |
| 191 | + |
0 commit comments