Postgresql
Postgres 查詢不使用索引,但使用索引更快
抱歉,如果這似乎是一個重複的問題。我在 AWS RDS 上使用 Postgres 11.6。我有 2 張桌子:
CREATE TABLE public.e ( id character varying(32) COLLATE pg_catalog."default" NOT NULL, p_id character varying(32) COLLATE pg_catalog."default" NOT NULL, CONSTRAINT e_pkey PRIMARY KEY (id) ) WITH ( OIDS = FALSE ) TABLESPACE pg_default; CREATE TABLE public.ed ( e_id character varying(32) COLLATE pg_catalog."default" NOT NULL, <other columns + primary key> ) WITH ( OIDS = FALSE ) TABLESPACE pg_default;
我有一個索引
ed.e_id
:CREATE INDEX ix_ed_e_id ON public.ed USING btree (e_id COLLATE pg_catalog."default" ASC NULLS LAST) TABLESPACE pg_default;
當我執行此查詢時:
select * from ed, e where e.id = ed.e_id and e.p_id = '5c7cae8df6d10f1064b2eaf5';
(使用時問題依然存在
from ed inner join e on e.id = ed.e_id
)
explain analyze
計劃是:Gather (cost=1136.68..141235.01 rows=28320 width=311) (actual time=0.456..871.155 rows=102709 loops=1) Workers Planned: 2 Workers Launched: 2 -> Hash Join (cost=136.68..137403.01 rows=11800 width=311) (actual time=0.241..688.095 rows=34236 loops=3) Hash Cond: (ed.e_id = e.id) -> Parallel Seq Scan on ed ed (cost=0.00..133210.10 rows=1544610 width=218) (actual time=0.005..314.524 rows=1235269 loops=3) -> Hash (cost=135.67..135.67 rows=81 width=93) (actual time=0.125..0.126 rows=81 loops=3) Buckets: 1024 Batches: 1 Memory Usage: 19kB -> Bitmap Heap Scan on e e (cost=4.91..135.67 rows=81 width=93) (actual time=0.045..0.097 rows=81 loops=3) Recheck Cond: ((p_id)::text = '5c7cae8df6d10f1064b2eaf5'::text) Heap Blocks: exact=31 -> Bitmap Index Scan on ix_e_p_id (cost=0.00..4.89 rows=81 width=0) (actual time=0.035..0.035 rows=81 loops=3) Index Cond: ((p_id)::text = '5c7cae8df6d10f1064b2eaf5'::text) Planning Time: 0.329 ms Execution Time: 877.804 ms
用一個
Parallel Seq Scan on ed
為ed.e_id
匹配。當 I
SET SESSION enable_seqscan = OFF
時,解釋計劃是:Nested Loop (cost=0.72..395895.14 rows=28320 width=311) (actual time=0.037..60.068 rows=102709 loops=1) -> Index Scan using e_pkey on e e (cost=0.29..917.61 rows=81 width=93) (actual time=0.019..4.995 rows=81 loops=1) Filter: ((p_id)::text = '5c7cae8df6d10f1064b2eaf5'::text) Rows Removed by Filter: 10522 -> Index Scan using ix_ed_e_id on ed ed (cost=0.43..4757.83 rows=11844 width=218) (actual time=0.013..0.334 rows=1268 loops=81) Index Cond: (e_id = e.id) Planning Time: 0.273 ms Execution Time: 64.675 ms
快了一個數量級(877ms vs 64ms)!我試過
VACUUM ANALYZE ed
了,但這沒有幫助。我什至嘗試將e.id
&更改ed.e_id
為UUID
類型,但這也無濟於事。如何說服 Postgres 使用
ix_ed_e_id
索引而不設置enable_seqscan
為關閉?
似乎 PostgreSQL 高估了索引掃描的成本,這導致它更喜歡雜湊連接而不是嵌套循環連接。
有兩個參數告訴 PostgreSQL 硬體並影響它對索引掃描成本的估計:
random_page_cost
: 與 相比越大,seq_page_cost
PostgreSQL 估計索引掃描的隨機 I/O 與順序 I/O 相比就越昂貴。因此,您可以降低該參數以鼓勵索引掃描。effective_cache_size
:這告訴優化器有多少記憶體可用於記憶體數據。如果該值很高,它將假定索引被記憶體並且價格索引掃描較低。也許調整這些參數會改變 PostgreSQL 的想法,儘管成本估計相差甚遠。