Postgresql
Postgres 帶有 order by、index 和 limit 的慢查詢
我正在嘗試提高 postgres(9.6) 查詢的性能。這是我的架構,表包含大約 6000 萬行。
Column | Type | Modifiers --------------------------+-----------------------------+----------- transaction_id | text | not null network_merchant_name | text | network_merchant_id | text | network_merchant_mcc | integer | network_merchant_country | text | issuer_country | text | merchant_id | text | remapped_merchant_id | text | created_at | timestamp without time zone | updated_at | timestamp without time zone | remapped_at | timestamp without time zone | Indexes: "mapped_transactions_pkey" PRIMARY KEY, btree (transaction_id) "ix_mapped_transactions_remapped_at" btree (remapped_at NULLS FIRST)
這是我要執行的查詢。
SELECT * FROM mapped_transactions ORDER BY remapped_at ASC NULLS FIRST LIMIT 10000;
這是查詢計劃:
QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.57..1511.67 rows=10000 width=146) (actual time=327049.374..327345.341 rows=10000 loops=1) Buffers: shared hit=574937 read=210425 dirtied=356 written=4457 I/O Timings: read=146625.381 write=59.637 -> Index Scan using ix_mapped_transactions_remapped_at on mapped_transactions (cost=0.57..16190862.91 rows=107145960 width=146) (actual time=327049.364..327339.402 rows=10000 loops=1) Buffers: shared hit=574937 read=210425 dirtied=356 written=4457 I/O Timings: read=146625.381 write=59.637 Planning time: 0.125 ms Execution time: 327348.322 ms (8 rows)
我不明白為什麼
remapped_at
列上有索引需要這麼多時間。另一方面,如果我反向訂購,它會很快。
SELECT * FROM mapped_transactions ORDER BY remapped_at DESC NULLS LAST LIMIT 10000;
計劃是:
QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.57..1511.67 rows=10000 width=146) (actual time=0.020..9.268 rows=10000 loops=1) Buffers: shared hit=1802 -> Index Scan Backward using ix_mapped_transactions_remapped_at on mapped_transactions (cost=0.57..16190848.04 rows=107145866 width=146) (actual time=0.018..4.759 rows=10000 loops=1) Buffers: shared hit=1802 Planning time: 0.080 ms Execution time: 11.561 ms (6 rows)
有人可以幫助我如何提高第一個查詢的性能嗎?
更新
- 我通過重建表和重新索引數據解決了這個問題。
VACUUM FULL ANALYZE
不是一個選項,因為該表正在被使用,我不想鎖定它。- 該指數的表現正在迅速惡化。我在 7 小時前重建了索引,性能還不錯。現在查詢在大約 10 秒內得到答复。請注意,此表寫得很重。如何使索引快速?我是否必須經常重新索引表?此表中沒有刪除,但有很多更新。
我又進行了一些測試,似乎這個解決方案對我不起作用。隨著時間的推移,索引會不斷降級,如果我不重新索引(或刪除並再次創建),我想要執行的查詢將變得越來越慢。有一些方法可以在不阻塞其他讀取/寫入的情況下創建索引,但目前我不會使用它們,因為它們不容易擴展。我無法用我所擁有的知識解決這個問題,所以我將採用另一種方法。這種方法使用具有恆定列且永遠不會更新的索引。感謝大家幫助我。
我無法確認 PostgreSQL 9.5。所以這可能是一種回歸。樣本數據,
CREATE TABLE mapped_transactions(remapped_at) AS SELECT CASE WHEN x::int % 42 = 0 THEN null ELSE x::int END FROM generate_series( 1, 40e6 ) AS gs(x); CREATE INDEX ON mapped_transactions (remapped_at NULLS FIRST); VACUUM ANALYZE mapped_transactions;
我們在這裡做
NULLS FIRST
,它有效,EXPLAIN ANALYZE SELECT * FROM mapped_transactions ORDER BY remapped_at ASC NULLS FIRST LIMIT 10000; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.56..260.24 rows=10000 width=4) (actual time=0.069..4.308 rows=10000 loops=1) -> Index Only Scan using mapped_transactions_remapped_at_idx on mapped_transactions (cost=0.56..1038716.81 rows=40000016 width=4) (actual time=0.067..2.740 rows=10000 loops=1) Heap Fetches: 0 Planning time: 0.232 ms Execution time: 5.017 ms (5 rows)
試
DESC NULLS LAST
EXPLAIN ANALYZE SELECT * FROM mapped_transactions ORDER BY remapped_at DESC NULLS LAST LIMIT 10000; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.56..260.24 rows=10000 width=4) (actual time=0.073..4.429 rows=10000 loops=1) -> Index Only Scan Backward using mapped_transactions_remapped_at_idx on mapped_transactions (cost=0.56..1038716.81 rows=40000016 width=4) (actual time=0.071..2.865 rows=10000 loops=1) Heap Fetches: 0 Planning time: 0.114 ms Execution time: 5.137 ms (5 rows)
您可能需要編寫一個測試案例。不清楚你正在經歷什麼。