Postgresql
SORT BY LIMIT LEFT JOIN 查詢慢/查詢計劃錯誤
我有以下 3 個表 a、b、c 都具有復合唯一鍵(application_id、tid),但是所有這些表都與外鍵無關(這是我們解決方案的限制)。所有表都有 (application_id, tid) 和時間戳列的索引。
我目前的問題是奇怪的查詢時間,列表按時間戳排序。當我要求表a並左加入b我的查詢時間很差(請參閱下面的查詢和查詢計劃)但是當左加入c查詢時如預期的那樣快(請參閱下一個)
加入b:
SELECT a.application_id, a.timestamp, b.dua, b.dpr FROM applications AS a LEFT JOIN features AS b ON b.application_id = a.application_id AND b.tid = a.tid WHERE a.tid = '00010-1' ORDER BY a.timestamp DESC, a.application_id DESC +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |QUERY PLAN | +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |Limit (cost=187051.77..187051.82 rows=21 width=127) (actual time=4579.007..4579.017 rows=21 loops=1) | | Output: i.application_id, i.timestamp, b.dua, b.dpr | | -> Sort (cost=187051.77..187542.06 rows=196115 width=127) (actual time=4579.006..4579.012 rows=21 loops=1) | | Output: i.application_id, i.timestamp, b.dua, b.dpr | | Sort Key: i.t DESC, i.application_id DESC | | Sort Method: top-N heapsort Memory: 28kB | | -> Hash Right Join (cost=45786.40..181764.20 rows=196115 width=127) (actual time=313.709..4503.010 rows=195204 loops=1) | | Output: i.application_id, i.timestamp, b.dua, b.dpr | | Hash Cond: ((b.tid = i.tid ) AND (b.application_id = i.application_id)) | | -> Bitmap Heap Scan on public.b (cost=14548.92..139604.43 rows=180366 width=122) (actual time=95.169..1133.541 rows=181194 loops=1) | | Output: b.dua, b.dpr, b.application_id, b.tid | | Recheck Cond: (b.tid = '00010-1'::text) | | Rows Removed by Index Recheck: 58154 | | Heap Blocks: exact=33962 lossy=53026 | | -> Bitmap Index Scan on b_pkey (cost=0.00..14503.83 rows=180366 width=0) (actual time=85.135..85.136 rows=181207 loops=1) | | Index Cond: (b.tid = '00010-1'::text) | | -> Hash (cost=26379.75..26379.75 rows=196115 width=51) (actual time=218.276..218.276 rows=195204 loops=1) | | Output: i.application_id, i.timestamp, i.tid | | Buckets: 65536 Batches: 8 Memory Usage: 2692kB | | -> Bitmap Heap Scan on public.a i (cost=14184.31..26379.75 rows=196115 width=51) (actual time=46.996..137.642 rows=195204 loops=1) | | Output: i.application_id, i.timestamp, i.tid| | Recheck Cond: (i.tid = '00010-1'::text) | | Heap Blocks: exact=9724 | | -> Bitmap Index Scan on "a_idx" (cost=0.00..14135.28 rows=196115 width=0) (actual time=45.295..45.295 rows=222169 loops=1)| | Index Cond: (i.tid = '00010-1'::text) | |Planning time: 0.845 ms | |Execution time: 4579.093 ms | +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
加入c:
SELECT a.application_id, a.timestamp , c.fs , c.pds FROM a LEFT JOIN c ON a.application_id = c.application_id AND a.tid = c.tid WHERE a.tid = '00010-1' ORDER BY a.timestamp DESC, a.application_id DESC LIMIT 21; +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |QUERY PLAN | +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |Limit (cost=89302.01..89302.07 rows=21 width=53) (actual time=714.657..714.669 rows=21 loops=1) | | Output: i.application_id, i.t, c.fs, c.pds | | -> Sort (cost=89302.01..89792.30 rows=196115 width=53) (actual time=714.656..714.660 rows=21 loops=1) | | Output: i.application_id, i.timestamp, c.fs, c.pds | | Sort Key: i.t DESC, i.application_id DESC | | Sort Method: top-N heapsort Memory: 27kB | | -> Hash Right Join (cost=62593.93..84014.44 rows=196115 width=53) (actual time=344.028..674.053 rows=195208 loops=1) | | Output: i.application_id, i.t, c.fs, c.pds | | Hash Cond: ((c.tid = i.tid) AND (c.application_id = i.application_id)) | | -> Bitmap Heap Scan on public.c c (cost=31356.46..44628.14 rows=181655 width=57) (actual time=99.070..192.647 rows=181196 loops=1) | | Output: c.fs, c.pds, c.application_id, c.tid | | Recheck Cond: (c.tid = '00010-1'::text) | | Heap Blocks: exact=5373 | | -> Bitmap Index Scan on c_pkey (cost=0.00..31311.04 rows=181655 width=0) (actual time=97.912..97.912 rows=181196 loops=1) | | Index Cond: (c.tid = '00010-1'::text) | | -> Hash (cost=26379.75..26379.75 rows=196115 width=51) (actual time=244.532..244.532 rows=195208 loops=1) | | Output: i.application_id, i.timestamp, i.tid | Buckets: 65536 Batches: 8 Memory Usage: 2693kB | | -> Bitmap Heap Scan on public.a i (cost=14184.31..26379.75 rows=196115 width=51) (actual time=52.118..155.189 rows=195208 loops=1) | | Output: i.application_id, i.timestamp, i.tid | | Recheck Cond: (i.tid = '00010-1'::text) | | Heap Blocks: exact=9724 | | -> Bitmap Index Scan on "a_idx" (cost=0.00..14135.28 rows=196115 width=0) (actual time=50.129..50.129 rows=222173 loops=1)| | Index Cond: (i.tid = '00010-1'::text) | |Planning time: 0.346 ms | |Execution time: 714.749 ms | +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
表B和C的行數相對相同(300k),但它們的性能卻大不相同。我從查詢計劃中找出的原因是B表的順序掃描,但我不明白為什麼查詢計劃不首先解析A(將以 21 行結束)然後嘗試左連接其餘的(行來自C或B )。
您能否建議我可以更改任何調整參數以改進第一次查詢?或者提示我如何重寫查詢以從 SORT 和 LIMIT 中受益?到目前為止,我發現如果我為表a提供任何其他條件。
提前感謝您的幫助
更新:由於我公司的政策,查詢和查詢計劃已被混淆,並且表a是別名,因此在查詢計劃中它也被引用為i
簡
如果您有一個可以在應用限制後按順序獲取行的索引,那麼它應該自動按照您想要的方式進行查詢。所以
create index on applications (tid, timestamp, application_id)
如果您不想建構專用索引,您仍然可以編寫查詢以使其首先使用 CTE 處理“a”:
WITH t as (select * from applications where tid='00010-1' ORDER BY a.timestamp DESC, a.application_id DESC LIMIT 21) SELECT a.application_id, a.timestamp, b.dua, b.dpr FROM t AS a LEFT JOIN features AS b ON b.application_id = a.application_id AND b.tid = a.tid ORDER BY a.timestamp DESC, a.application_id DESC
查看您已有的計劃,點陣圖堆掃描的有損塊和散列連接的 8 個批次都表明您的 work_mem 對於您正在執行的查詢可能太低了。此外,表 B 的數據比表 C 的數據更分散(87,000 塊對 5400 個塊,行數大致相同),儘管這不太可能完全解釋時間上的差異。
最後,9.6 已經很老了(儘管仍然受支持)。從那時起,性能有了許多改進。如果您花費大量時間擔心性能,您應該計劃儘早升級。