Query using a GIN index on f_unaccent() seems slow?

April 10, 2018

I have a table of products with almost 20M rows, including the product names.

I want fast full-text search by name, so I created this index:

CREATE INDEX uprice_item_occurrence_unaccent_name_trgm_idx ON price_item_occurrence USING gin (f_unaccent(name) gin_trgm_ops);

I was hoping the following query would take less than (say) 500 ms:

select * from price_item_occurrence as oo
where f_unaccent(oo.name) % f_unaccent('iphone');

But it takes almost 2 s:

postgres=# explain analyze select * from price_item_occurrence as oo where f_unaccent(oo.name) % f_unaccent('iphone');
                                                                            QUERY PLAN                                                                              
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on price_item_occurrence oo  (cost=1956.42..63674.14 rows=16570 width=287) (actual time=247.918..1880.759 rows=94 loops=1)
  Recheck Cond: (f_unaccent((name)::text) % 'iphone'::text)
  Rows Removed by Index Recheck: 87838
  Heap Blocks: exact=76663
  ->  Bitmap Index Scan on uprice_item_occurrence_unaccent_name_trgm_idx  (cost=0.00..1952.28 rows=16570 width=0) (actual time=195.418..195.418 rows=88962 loops=1)
        Index Cond: (f_unaccent((name)::text) % 'iphone'::text)
Planning time: 0.444 ms
Execution time: 1880.833 ms

The database may be busy, but I am not sure.
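
One way to tell whether a busy server or a cold cache is inflating these timings is to add the BUFFERS option to EXPLAIN. The following is a minimal sketch that reuses the query above; it changes nothing except the amount of detail in the plan output:

```sql
-- BUFFERS adds "shared hit" and "read" page counts to each plan node.
-- "hit" pages were found in PostgreSQL's shared buffers; "read" pages had to be
-- fetched from outside them (OS cache or disk), which is usually slower.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM   price_item_occurrence AS oo
WHERE  f_unaccent(oo.name) % f_unaccent('iphone');
```

Running it twice in a row and comparing the "read" counts gives a rough idea of how much of the time is I/O rather than trigram rechecking.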

I tried playing with select set_limit(0.9); (raising the threshold). It helped a bit, but not much.
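
For reference, set_limit() has been deprecated since PostgreSQL 9.6 in favor of the pg_trgm.similarity_threshold setting. The sketch below shows the equivalent setting along with two pg_trgm helper functions that can help explain why so many rows are removed by the recheck. The product name in the last statement is made up for illustration:

```sql
-- Trigrams extracted from the search term. The index lookup is based on these,
-- and the heap recheck then computes the real similarity for each candidate row.
SELECT show_trgm('iphone');

-- Similarity score of a single candidate (hypothetical product name).
-- A row passes the % operator only if this score reaches the threshold.
SELECT similarity('iphone', 'Apple iPhone 7 32GB');

-- Session-level equivalent of set_limit(0.9).
SET pg_trgm.similarity_threshold = 0.9;

-- Show the threshold currently in effect (pg_trgm must already be loaded
-- in the session, which the function calls above take care of).
SHOW pg_trgm.similarity_threshold;
```

Long product names contain many trigrams, so their similarity to a short term like 'iphone' tends to be low even when the term appears verbatim in the name.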

I am using Postgres 10, I can change the Postgres configuration, and I am open to suggestions.

I tried it with ilike and it improved somewhat:

postgres=# explain analyze select * from price_item_occurrence as oo where f_unaccent(oo.name) ilike ('%' || f_unaccent('iphone') || '%');
                                                                            QUERY PLAN                                                                             
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on price_item_occurrence oo  (cost=3135.08..416823.45 rows=166075 width=286) (actual time=50.258..670.085 rows=65917 loops=1)
  Recheck Cond: (f_unaccent((name)::text) ~~* '%iphone%'::text)
  Rows Removed by Index Recheck: 10
  Heap Blocks: exact=59750
  ->  Bitmap Index Scan on uprice_item_occurrence_unaccent_name_trgm_idx  (cost=0.00..3093.56 rows=166075 width=0) (actual time=37.385..37.385 rows=67700 loops=1)
        Index Cond: (f_unaccent((name)::text) ~~* '%iphone%'::text)
Planning time: 0.545 ms
Execution time: 675.776 ms
(8 rows)

Roughly 2.8x faster (about 676 ms versus 1881 ms).

I tried limit 10:

postgres=# explain analyze select * from price_item_occurrence as oo where f_unaccent(oo.name) % f_unaccent('iphone') limit 10;
                                                                               QUERY PLAN                                                                                
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit  (cost=373.27..410.51 rows=10 width=287) (actual time=268.718..589.131 rows=10 loops=1)
  ->  Bitmap Heap Scan on price_item_occurrence oo  (cost=373.27..62493.45 rows=16680 width=287) (actual time=268.715..589.123 rows=10 loops=1)
        Recheck Cond: (f_unaccent((name)::text) % 'iphone'::text)
        Rows Removed by Index Recheck: 18917
        Heap Blocks: exact=17100
        ->  Bitmap Index Scan on uprice_item_occurrence_unaccent_name_trgm_idx  (cost=0.00..369.10 rows=16680 width=0) (actual time=165.958..165.958 rows=69268 loops=1)
              Index Cond: (f_unaccent((name)::text) % 'iphone'::text)
Planning time: 0.397 ms
Execution time: 589.187 ms
(9 rows)

That is faster too, and perhaps almost good enough.

I have seen a GIN index typically perform much faster than GiST for these queries. Try this index:
CREATE INDEX price_item_occurrence_name_trgm_gin_idx ON price_item_occurrence
USING GIN (f_unaccent(name) gin_trgm_ops);
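
An expression index like this one requires f_unaccent() to be declared IMMUTABLE, because the stock unaccent() function is only STABLE. The thread does not show how f_unaccent() is defined, so the following is only a sketch of a commonly used wrapper. It assumes the unaccent and pg_trgm extensions are installed in the public schema:

```sql
-- Both extensions ship with PostgreSQL's contrib modules.
CREATE EXTENSION IF NOT EXISTS unaccent;
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Assumed definition (not taken from the question): naming the dictionary
-- explicitly removes the dependency on the search path, which is what makes
-- it defensible to declare the wrapper IMMUTABLE and use it in an index.
CREATE OR REPLACE FUNCTION f_unaccent(text)
  RETURNS text
  LANGUAGE sql IMMUTABLE AS
$func$
SELECT public.unaccent('public.unaccent', $1)
$func$;
```

If the unaccent dictionary is ever changed, indexes built on this function would have to be rebuilt, since PostgreSQL trusts the IMMUTABLE declaration.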
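Related:
Does PostgreSQL support "accent insensitive" collations?
Using ILIKE with unaccent and only a right-hand wildcard
How is LIKE implemented?
Pattern matching with LIKE, SIMILAR TO or regular expressions in PostgreSQL
Trigram index for ILIKE patterns not working as expected
All the basic advice for performance optimization applies. For starters, your table needs to be VACUUMed and ANALYZEd often enough.
Configuring PostgreSQL for read performance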
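
To check the VACUUM and ANALYZE advice above, a minimal sketch using the table name from the question: the first statement refreshes the table manually, and the second reads the standard statistics view to show when maintenance last ran.

```sql
-- Reclaim dead tuples and refresh planner statistics in one pass.
VACUUM (VERBOSE, ANALYZE) price_item_occurrence;

-- When did manual or automatic maintenance last touch the table,
-- and how many dead tuples are currently estimated?
SELECT relname,
       last_vacuum,
       last_autovacuum,
       last_analyze,
       last_autoanalyze,
       n_dead_tup
FROM   pg_stat_user_tables
WHERE  relname = 'price_item_occurrence';
```

Stale statistics may be part of why the plans above estimate about 16570 rows where only 94 are returned, although estimates for the % operator are rough in any case.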

Source: https://dba.stackexchange.com/questions/203429
