Postgres 查詢不使用索引，但使用索引更快

September 18, 2020

抱歉，如果這似乎是一個重複的問題。我在 AWS RDS 上使用 Postgres 11.6。我有 2 張桌子：

CREATE TABLE public.e
(
   id character varying(32) COLLATE pg_catalog."default" NOT NULL,
   p_id character varying(32) COLLATE pg_catalog."default" NOT NULL,
   CONSTRAINT e_pkey PRIMARY KEY (id)
)
WITH (
   OIDS = FALSE
)
TABLESPACE pg_default;

CREATE TABLE public.ed
(
   e_id character varying(32) COLLATE pg_catalog."default" NOT NULL,
   &lt;other columns + primary key&gt;
)
WITH (
   OIDS = FALSE
)
TABLESPACE pg_default;

我有一個索引ed.e_id：

CREATE INDEX ix_ed_e_id
   ON public.ed USING btree
   (e_id COLLATE pg_catalog."default" ASC NULLS LAST)
   TABLESPACE pg_default;

當我執行此查詢時：

select *
from ed, e
where e.id = ed.e_id
and e.p_id = '5c7cae8df6d10f1064b2eaf5';

（使用時問題依然存在from ed inner join e on e.id = ed.e_id）

explain analyze計劃是：

Gather  (cost=1136.68..141235.01 rows=28320 width=311) (actual time=0.456..871.155 rows=102709 loops=1)
 Workers Planned: 2
 Workers Launched: 2
 -&gt;  Hash Join  (cost=136.68..137403.01 rows=11800 width=311) (actual time=0.241..688.095 rows=34236 loops=3)
       Hash Cond: (ed.e_id = e.id)
       -&gt;  Parallel Seq Scan on ed ed  (cost=0.00..133210.10 rows=1544610 width=218) (actual time=0.005..314.524 rows=1235269 loops=3)
       -&gt;  Hash  (cost=135.67..135.67 rows=81 width=93) (actual time=0.125..0.126 rows=81 loops=3)
             Buckets: 1024  Batches: 1  Memory Usage: 19kB
             -&gt;  Bitmap Heap Scan on e e  (cost=4.91..135.67 rows=81 width=93) (actual time=0.045..0.097 rows=81 loops=3)
                   Recheck Cond: ((p_id)::text = '5c7cae8df6d10f1064b2eaf5'::text)
                   Heap Blocks: exact=31
                   -&gt;  Bitmap Index Scan on ix_e_p_id  (cost=0.00..4.89 rows=81 width=0) (actual time=0.035..0.035 rows=81 loops=3)
                         Index Cond: ((p_id)::text = '5c7cae8df6d10f1064b2eaf5'::text)
Planning Time: 0.329 ms
Execution Time: 877.804 ms

用一個Parallel Seq Scan on ed為ed.e_id匹配。

當 ISET SESSION enable_seqscan = OFF時，解釋計劃是：

Nested Loop  (cost=0.72..395895.14 rows=28320 width=311) (actual time=0.037..60.068 rows=102709 loops=1)
 -&gt;  Index Scan using e_pkey on e e  (cost=0.29..917.61 rows=81 width=93) (actual time=0.019..4.995 rows=81 loops=1)
       Filter: ((p_id)::text = '5c7cae8df6d10f1064b2eaf5'::text)
       Rows Removed by Filter: 10522
 -&gt;  Index Scan using ix_ed_e_id on ed ed  (cost=0.43..4757.83 rows=11844 width=218) (actual time=0.013..0.334 rows=1268 loops=81)
       Index Cond: (e_id = e.id)
Planning Time: 0.273 ms
Execution Time: 64.675 ms

快了一個數量級（877ms vs 64ms）！我試過VACUUM ANALYZE ed了，但這沒有幫助。我什至嘗試將e.id&更改ed.e_id為UUID類型，但這也無濟於事。

如何說服 Postgres 使用ix_ed_e_id索引而不設置enable_seqscan為關閉？

似乎 PostgreSQL 高估了索引掃描的成本，這導致它更喜歡雜湊連接而不是嵌套循環連接。
有兩個參數告訴 PostgreSQL 硬體並影響它對索引掃描成本的估計：
random_page_cost: 與相比越大，seq_page_costPostgreSQL 估計索引掃描的隨機 I/O 與順序 I/O 相比就越昂貴。因此，您可以降低該參數以鼓勵索引掃描。
effective_cache_size：這告訴優化器有多少記憶體可用於記憶體數據。如果該值很高，它將假定索引被記憶體並且價格索引掃描較低。
也許調整這些參數會改變 PostgreSQL 的想法，儘管成本估計相差甚遠。

引用自：https://dba.stackexchange.com/questions/275638

Postgres 查詢不使用索引，但使用索引更快

相關問答

PostgreSQL（大數據）：按標籤搜尋並按時間戳排序

使用帶有 pg_restore 的並行恢復沒有加速

Postgres 優化 IN 查詢

為什麼我在 Postgres 13 中的空間查詢比在 Postgres 11 中慢？

Postgres 11：升級後查詢計劃使用 seq 掃描

為什麼我的查詢處於空閒狀態？