PostgreSQL
Why is this SQL query with a UNION significantly faster than the same query without it?
I'm trying to optimize a query that never completes on Postgres 12.7. It pegs the CPU at 100% for hours, even days, and never returns:
SELECT "id", "counter", "item_id", "item_name", "type", "updated_time"
FROM "changes"
WHERE (type = 1 OR type = 3) AND user_id = 'kJ6GYJNPM4wdDY5dUV1b8PqDRJj6RRgW'
   OR type = 2 AND item_id IN (SELECT item_id FROM user_items
                               WHERE user_id = 'kJ6GYJNPM4wdDY5dUV1b8PqDRJj6RRgW')
ORDER BY "counter" ASC
LIMIT 100;
More or less at random, I tried rewriting it with a UNION, which I believe is equivalent. Essentially the query has two parts: one for type = 1 or 3, and one for type = 2.
( SELECT "id", "counter", "item_id", "item_name", "type", "updated_time"
  FROM "changes"
  WHERE (type = 1 OR type = 3)
    AND user_id = 'kJ6GYJNPM4wdDY5dUV1b8PqDRJj6RRgW' )
UNION
( SELECT "id", "counter", "item_id", "item_name", "type", "updated_time"
  FROM "changes"
  WHERE type = 2
    AND item_id IN (SELECT item_id FROM user_items
                    WHERE user_id = 'kJ6GYJNPM4wdDY5dUV1b8PqDRJj6RRgW') )
ORDER BY "counter" ASC
LIMIT 100;
This query returns within 10 seconds, while the other one still hadn't returned after days. Any idea what causes such a huge difference?
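One side note on the rewrite: UNION also deduplicates its inputs, which is what the Sort/Unique pair in the UNION plan below is doing. If each row has exactly one type (so the two branches can never produce the same row), a UNION ALL variant should be equivalent and skip that step. This is a sketch only; verify against your data that the branches really are disjoint:

```sql
-- Same rewrite, but with UNION ALL: assuming type = 1/3 and type = 2
-- branches can never return the same row, deduplication is unnecessary.
( SELECT "id", "counter", "item_id", "item_name", "type", "updated_time"
  FROM "changes"
  WHERE (type = 1 OR type = 3)
    AND user_id = 'kJ6GYJNPM4wdDY5dUV1b8PqDRJj6RRgW' )
UNION ALL
( SELECT "id", "counter", "item_id", "item_name", "type", "updated_time"
  FROM "changes"
  WHERE type = 2
    AND item_id IN (SELECT item_id FROM user_items
                    WHERE user_id = 'kJ6GYJNPM4wdDY5dUV1b8PqDRJj6RRgW') )
ORDER BY "counter" ASC
LIMIT 100;
```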
Query plans
For the original query:
                                                                 QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1001.01..1697110.80 rows=100 width=119)
   ->  Gather Merge  (cost=1001.01..8625312957.40 rows=508535 width=119)
         Workers Planned: 2
         ->  Parallel Index Scan using changes_pkey on changes  (cost=0.98..8625253259.82 rows=211890 width=119)
               Filter: ((((type = 1) OR (type = 3)) AND ((user_id)::text = 'kJ6GYJNPM4wdDY5dUV1b8PqDRJj6RRgW'::text)) OR ((type = 2) AND (SubPlan 1)))
               SubPlan 1
                 ->  Materialize  (cost=0.55..18641.22 rows=143863 width=33)
                       ->  Index Only Scan using user_items_user_id_item_id_unique on user_items  (cost=0.55..16797.90 rows=143863 width=33)
                             Index Cond: (user_id = 'kJ6GYJNPM4wdDY5dUV1b8PqDRJj6RRgW'::text)
For the UNION query:
 Limit  (cost=272866.63..272866.88 rows=100 width=212) (actual time=10564.742..10566.964 rows=100 loops=1)
   ->  Sort  (cost=272866.63..273371.95 rows=202128 width=212) (actual time=10564.739..10566.950 rows=100 loops=1)
         Sort Key: changes.counter
         Sort Method: top-N heapsort  Memory: 69kB
         ->  Unique  (cost=261604.20..265141.44 rows=202128 width=212) (actual time=9530.376..10493.030 rows=147261 loops=1)
               ->  Sort  (cost=261604.20..262109.52 rows=202128 width=212) (actual time=9530.374..10375.845 rows=147261 loops=1)
                     Sort Key: changes.id, changes.counter, changes.item_id, changes.item_name, changes.type, changes.updated_time
                     Sort Method: external merge  Disk: 19960kB
                     ->  Gather  (cost=1000.00..223064.76 rows=202128 width=212) (actual time=2439.116..7356.233 rows=147261 loops=1)
                           Workers Planned: 2
                           Workers Launched: 2
                           ->  Parallel Append  (cost=0.00..201851.96 rows=202128 width=212) (actual time=2421.400..7815.315 rows=49087 loops=3)
                                 ->  Parallel Hash Join  (cost=12010.60..103627.94 rows=47904 width=119) (actual time=907.286..3118.898 rows=24 loops=3)
                                       Hash Cond: ((changes.item_id)::text = (user_items.item_id)::text)
                                       ->  Parallel Seq Scan on changes  (cost=0.00..90658.65 rows=365215 width=119) (actual time=1.466..2919.855 rows=295810 loops=3)
                                             Filter: (type = 2)
                                             Rows Removed by Filter: 428042
                                       ->  Parallel Hash  (cost=11290.21..11290.21 rows=57631 width=33) (actual time=78.190..78.191 rows=48997 loops=3)
                                             Buckets: 262144  Batches: 1  Memory Usage: 12416kB
                                             ->  Parallel Index Only Scan using user_items_user_id_item_id_unique on user_items  (cost=0.55..11290.21 rows=57631 width=33) (actual time=0.056..107.247 rows=146991 loops=1)
                                                   Index Cond: (user_id = 'kJ6GYJNPM4wdDY5dUV1b8PqDRJj6RRgW'::text)
                                                   Heap Fetches: 11817
                                 ->  Parallel Seq Scan on changes changes_1  (cost=0.00..95192.10 rows=36316 width=119) (actual time=2410.556..7026.664 rows=73595 loops=2)
                                       Filter: (((user_id)::text = 'kJ6GYJNPM4wdDY5dUV1b8PqDRJj6RRgW'::text) AND ((type = 1) OR (type = 3)))
                                       Rows Removed by Filter: 1012184
 Planning Time: 65.846 ms
 Execution Time: 10575.679 ms
(27 rows)
Definitions
                                        Table "public.changes"
    Column     |         Type          | Collation | Nullable |                 Default
---------------+-----------------------+-----------+----------+------------------------------------------
 counter       | integer               |           | not null | nextval('changes_counter_seq'::regclass)
 id            | character varying(32) |           | not null |
 item_type     | integer               |           | not null |
 item_id       | character varying(32) |           | not null |
 item_name     | text                  |           | not null | ''::text
 type          | integer               |           | not null |
 updated_time  | bigint                |           | not null |
 created_time  | bigint                |           | not null |
 previous_item | text                  |           | not null | ''::text
 user_id       | character varying(32) |           | not null | ''::character varying
Indexes:
    "changes_pkey" PRIMARY KEY, btree (counter)
    "changes_id_unique" UNIQUE CONSTRAINT, btree (id)
    "changes_id_index" btree (id)
    "changes_item_id_index" btree (item_id)
    "changes_user_id_index" btree (user_id)
                                      Table "public.user_items"
    Column    |         Type          | Collation | Nullable |                Default
--------------+-----------------------+-----------+----------+----------------------------------------
 id           | integer               |           | not null | nextval('user_items_id_seq'::regclass)
 user_id      | character varying(32) |           | not null |
 item_id      | character varying(32) |           | not null |
 updated_time | bigint                |           | not null |
 created_time | bigint                |           | not null |
Indexes:
    "user_items_pkey" PRIMARY KEY, btree (id)
    "user_items_user_id_item_id_unique" UNIQUE CONSTRAINT, btree (user_id, item_id)
    "user_items_item_id_index" btree (item_id)
    "user_items_user_id_index" btree (user_id)
Counts per type
postgres=> select count(*) from changes where type = 1;
  count
---------
 1201839
(1 row)

postgres=> select count(*) from changes where type = 2;
 count
--------
 888269
(1 row)

postgres=> select count(*) from changes where type = 3;
 count
-------
 83849
(1 row)
Number of item_ids per user_id
postgres=> SELECT min(ct), max(ct), avg(ct), sum(ct)
           FROM (SELECT count(*) AS ct FROM user_items GROUP BY user_id) x;
 min |  max   |          avg          |   sum
-----+--------+-----------------------+---------
   6 | 146991 | 2253.0381526104417671 | 1122013
(1 row)
When the OR query is tolerable, it got lucky: it found 100 matching rows of type 1 or 3 before hitting any type 2 rows that have to be checked against the other table. When it is intolerable, it evidently does have to check against the other table, and it does so in a very slow way, by walking through all of that table's rows for each candidate. It should be using a hashed subplan here, not a plain subplan. The only reason I can think of that it wouldn't use a hashed subplan is that your work_mem setting is very low, so the planner thinks it can't fit the hash table in memory and falls back to a truly awful method.
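You can check both parts of this diagnosis from a psql session. This is a sketch using the question's own query; when the planner does choose the hashed form, the Filter line of the EXPLAIN output reads "(hashed SubPlan 1)" rather than "(SubPlan 1)":

```sql
-- Check the current setting; the PostgreSQL default is only 4MB,
-- too small for a hash table over ~147k item_ids.
SHOW work_mem;

-- Re-plan the original query (EXPLAIN without ANALYZE, so nothing runs)
-- and look at the Filter line for "hashed SubPlan".
EXPLAIN
SELECT "id", "counter", "item_id", "item_name", "type", "updated_time"
FROM "changes"
WHERE (type = 1 OR type = 3) AND user_id = 'kJ6GYJNPM4wdDY5dUV1b8PqDRJj6RRgW'
   OR type = 2 AND item_id IN (SELECT item_id FROM user_items
                               WHERE user_id = 'kJ6GYJNPM4wdDY5dUV1b8PqDRJj6RRgW')
ORDER BY "counter" ASC
LIMIT 100;
```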
A hashed subplan has no way to spill to disk, so if the planner thinks it would use too much memory, it won't schedule one at all. On the UNION side, the hash join can spill to disk, so the planner is willing to use it there.
If you raise your work_mem, the OR plan should get fast. It doesn't take much: in my hands 10MB was enough (though that's still small for a modern server; I'd probably set it to at least 100MB unless you have a good reason not to).
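A sketch of how to try that, experimenting at session level first before touching the server-wide setting (note that work_mem is a per-sort/per-hash budget for each backend, so a large global value multiplied across many connections can exhaust RAM):

```sql
-- Per-session: takes effect immediately, only for this connection.
SET work_mem = '100MB';

-- Re-run EXPLAIN ANALYZE on the original OR query here and confirm
-- the subplan is now hashed and the query finishes quickly.

-- If it helps, persist it server-wide; this needs a config reload,
-- not a restart.
ALTER SYSTEM SET work_mem = '100MB';
SELECT pg_reload_conf();
```

Alternatively, work_mem can be set per role (ALTER ROLE ... SET work_mem = ...) if only this application's user needs the larger budget.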