Postgresql

Postgres 為特定使用者執行 SEQ 掃描

  • April 26, 2021

我試圖了解為什麼以下查詢對特定使用者執行這麼長時間(大約 5 分鐘),但對於涉及相同行數的其他使用者,它需要 2 / 3 秒

有一個主表“長度”與 FK 到“會話”表,長度表在 session_id 和 stroke_type 列上有一個 lengths_idx 索引,通常在這裡使用

我可以看到它沒有為該特定使用者使用“長度”表上的 lengths_idx 索引,這可能是該表的索引/統計資訊有問題嗎?

在 Amazon RDS 上執行的 Postgresql 9.6

SELECT x.sess_id, round(count(x.*) * x.size,2)
FROM ( 
   SELECT l.session_id as sess_id, 
           l.stroke_type, 
           s.pool_size as size,row_number() OVER (ORDER BY l.session_id, l.stroke_type, l.length_number) - l.length_number AS grp 
   FROM lengths as l, user_sessions as us, sessions as s 
   WHERE l.session_id = us.session_id 
   AND us.user_id = 1234 
   AND l.session_id = s.session_id     
   AND CAST(l.stroke_type as text) = CAST('free' as text) 
) x 
GROUP BY x.sess_id, x.stroke_type, grp, x.size 
ORDER BY count(x.*) DESC LIMIT 1

快速執行的查詢解釋(索引掃描):

Limit  (cost=5341076.22..5341076.23 rows=1 width=61) (actual time=2817.451..2817.452 rows=1 loops=1)
 ->  Sort  (cost=5341076.22..5341077.08 rows=342 width=61) (actual time=2817.449..2817.449 rows=1 loops=1)
       Sort Key: (count(x.*)) DESC
       Sort Method: top-N heapsort  Memory: 25kB
       ->  GroupAggregate  (cost=5341062.54..5341074.51 rows=342 width=61) (actual time=2776.621..2817.029 rows=1136 loops=1)
             Group Key: x.sess_id, x.stroke_type, x.grp, x.size
             ->  Sort  (cost=5341062.54..5341063.40 rows=342 width=66) (actual time=2776.573..2793.144 rows=57154 loops=1)
                   Sort Key: x.sess_id, x.stroke_type, x.grp, x.size
                   Sort Method: external sort  Disk: 4248kB
                   ->  Subquery Scan on x  (cost=5341036.18..5341048.15 rows=342 width=66) (actual time=2627.123..2710.884 rows=57154 loops=1)
                         ->  WindowAgg  (cost=5341036.18..5341044.73 rows=342 width=25) (actual time=2627.108..2679.127 rows=57154 loops=1)
                               ->  Sort  (cost=5341036.18..5341037.03 rows=342 width=17) (actual time=2627.091..2644.249 rows=57154 loops=1)
                                     Sort Key: l.session_id, l.stroke_type, l.length_number
                                     Sort Method: external sort  Disk: 1512kB
                                     ->  Nested Loop  (cost=1.43..5341021.79 rows=342 width=17) (actual time=3.403..2571.724 rows=57154 loops=1)
                                           Join Filter: (us.session_id = l.session_id)
                                           ->  Nested Loop  (cost=0.86..13212.89 rows=1367 width=13) (actual time=1.854..494.049 rows=1079 loops=1)
                                                 ->  Index Only Scan using user_sessions_pkey on user_sessions us  (cost=0.43..1911.49 rows=1367 width=4) (actual time=0.947..86.638 rows=1079 loops=1)
                                                       Index Cond: (user_id = 2055)
                                                       Heap Fetches: 290
                                                 ->  Index Scan using sessions_pkey on sessions s  (cost=0.43..8.26 rows=1 width=9) (actual time=0.374..0.376 rows=1 loops=1079)
                                                       Index Cond: (session_id = us.session_id)
                                           ->  Index Scan using lengths_idx on lengths l  (cost=0.57..3897.30 rows=12 width=12) (actual time=1.208..1.899 rows=53 loops=1079)
                                                 Index Cond: (session_id = s.session_id)
                                                 Filter: ((stroke_type)::text = 'free'::text)
                                                 Rows Removed by Filter: 0
Planning time: 3.090 ms
Execution time: 2819.171 ms

慢速執行的查詢解釋(SEQ SCAN):

Limit  (cost=5759727.25..5759727.25 rows=1 width=61) (actual time=315354.229..315354.231 rows=1 loops=1)
 ->  Sort  (cost=5759727.25..5759728.36 rows=444 width=61) (actual time=315354.228..315354.228 rows=1 loops=1)
       Sort Key: (count(x.*)) DESC
       Sort Method: top-N heapsort  Memory: 25kB
       ->  GroupAggregate  (cost=5759709.49..5759725.03 rows=444 width=61) (actual time=315304.470..315353.719 rows=1094 loops=1)
             Group Key: x.sess_id, x.stroke_type, x.grp, x.size
             ->  Sort  (cost=5759709.49..5759710.60 rows=444 width=66) (actual time=315304.374..315324.696 rows=54090 loops=1)
                   Sort Key: x.sess_id, x.stroke_type, x.grp, x.size
                   Sort Method: external sort  Disk: 4016kB
                   ->  Subquery Scan on x  (cost=5759674.42..5759689.96 rows=444 width=66) (actual time=315123.778..315226.785 rows=54090 loops=1)
                         ->  WindowAgg  (cost=5759674.42..5759685.52 rows=444 width=25) (actual time=315123.762..315189.061 rows=54090 loops=1)
                               ->  Sort  (cost=5759674.42..5759675.53 rows=444 width=17) (actual time=315123.744..315147.132 rows=54090 loops=1)
                                     Sort Key: l.session_id, l.stroke_type, l.length_number
                                     Sort Method: external sort  Disk: 1432kB
                                     ->  Nested Loop  (cost=2484.15..5759654.90 rows=444 width=17) (actual time=251.322..315047.897 rows=54090 loops=1)
                                           ->  Hash Join  (cost=2483.72..5756479.95 rows=444 width=16) (actual time=250.846..312922.269 rows=54090 loops=1)
                                                 Hash Cond: (l.session_id = us.session_id)
                                                 ->  Seq Scan on lengths l  (cost=0.00..5750146.58 rows=1025390 width=12) (actual time=0.458..282569.670 rows=101216939 loops=1)
                                                       Filter: ((stroke_type)::text = 'free'::text)
                                                       Rows Removed by Filter: 103123974
                                                 ->  Hash  (cost=2461.52..2461.52 rows=1776 width=4) (actual time=118.626..118.626 rows=1059 loops=1)
                                                       Buckets: 2048  Batches: 1  Memory Usage: 54kB
                                                       ->  Index Only Scan using user_sessions_pkey on user_sessions us  (cost=0.43..2461.52 rows=1776 width=4) (actual time=0.963..118.151 rows=1059 loops=1)
                                                             Index Cond: (user_id = 43266)
                                                             Heap Fetches: 446
                                           ->  Index Scan using sessions_pkey on sessions s  (cost=0.43..7.14 rows=1 width=9) (actual time=0.037..0.038 rows=1 loops=54090)
                                                 Index Cond: (session_id = l.session_id)
Planning time: 0.572 ms
Execution time: 315356.321 ms

估計lengths很差。

嘗試使用ANALYZE. 如果僅此還不夠,請增加細節:

ALTER TABLE lenghts ALTER stroke_type SET STATISTICS 1000;
ANALYZE lengths;

引用自:https://dba.stackexchange.com/questions/290489