Re: LIKE search and performance
От | Alexander Staubo |
---|---|
Тема | Re: LIKE search and performance |
Дата | |
Msg-id | 88daf38c0705230905u1a981102k691b47bd36011a80@mail.gmail.com обсуждение исходный текст |
Ответ на | LIKE search and performance ("Andy" <frum@ar-sd.net>) |
Ответы |
Re: LIKE search and performance
|
Список | pgsql-performance |
On 5/23/07, Andy <frum@ar-sd.net> wrote: > An example would be: > SELECT * FROM table > WHERE name like '%john%' or street like '%srt%' > > Anyway, the query planner always does seq scan on the whole table and that > takes some time. How can this be optimized or made in another way to be > faster? There's no algorithm in existence that can "index" arbitrary substrings the way you think. The only rational way to accomplish this is to first break the text into substrings using some algorithm (eg., words delimited by whitespace and punctuation), and index the substrings individually. You can do this using vanilla PostgreSQL, and you can use Tsearch2 and/or its GIN indexes to help speed up the searches. The simplest solution would be to put all the substrings in a table, one row per substring, along with an attribute referencing the source table/row. At this point you have effectively reduced your search space: you can use a query to isolate the words in your "dictionary" that contain the substrings. So a query might be: select ... from persons where id in ( select person_id from person_words where word like '%john%'; ) The "like" search, even though it will use a sequential scan, is bound to be much faster on these small words than searching for substrings through large gobs of text in the persons table. Note that PostgreSQL *can* exploit the index for *prefix* matching, if you tell it to use the right operator class: create index persons_name_index on persons (name text_pattern_ops); or, if you're using varchars (though there is rarely any reason to): create index persons_name_index on persons (name varchar_pattern_ops); (These two may be identical internally. Check the docs.) Now you can do: select ... from persons where name like 'john%'; which will yield a query plan such as this: Index Scan using persons_name_index on persons (cost=0.00..8.27 rows=1 width=29) (actual time=0.184..0.373 rows=51 loops=1) Index Cond: ((name ~>=~ 'john'::text) AND (name ~<~ 'joho'::text)) Filter: (title ~~ 'john%'::text) Alexander.
В списке pgsql-performance по дате отправления: