點陣圖堆掃描很慢

March 6, 2020

我有 2 個表：mailings和form_responses。form_responses表與mailings表有一個外鍵，因此mailing具有一對多的 form_responses。我想要做的是計算郵件統計：每個郵件獲取總form_responses以及收到最後一個form_response的時間。我想退回最後收到form_response的前 20 封**郵件。但是當它必須查詢 50k+ 行時，查詢變得非常慢。

郵件（~100k 行）

   Column    |            Type             |                     Modifiers                     
--------------+-----------------------------+---------------------------------------------------
id           | integer                     | pkey
project_id   | integer                     | not null
deleted      | boolean                     |
active       | boolean                     |
created_at   | timestamp with time zone    | 

Indexes:
"mailings_pkey" PRIMARY KEY, btree (id)
"mailing_on_project_id_where_not_deleted_and_active" btree (project_id) (WHERE deleted IS NOT true AND active IS TRUE)
+ individual indexes on almost all columns

form_responses（~6m 行）

   Column    |            Type             |                     Modifiers                     
--------------+-----------------------------+---------------------------------------------------
id           | integer                     | pkey
project_id   | integer                     | not null
mailing_id   | integer                     | 
deleted      | boolean                     |
is_unanswered| boolean                     |
created_at   | timestamp with time zone    | 

Indexes:
"form_responses_pkey" PRIMARY KEY, btree (id)
"form_responses_on_pid_cr_time_where_not_del_not_is_unans" btree (project_id, created_time) (WHERE deleted IS NOT true AND is_uanswered IS NOT TRUE)
+ individual indexes on almost all columns

我想在一個日期範圍內獲取項目（id=1）的郵件統計資訊。我只想要未刪除且處於活動狀態的郵件的統計資訊，並且只計算未刪除的回复。

所以這給了我以下查詢

EXPLAIN (ANALYZE, BUFFERS)
SELECT
 count(DISTINCT form_responses.id) AS total,
 max(form_responses.created_time) AS form_response_last_received,
 (SELECT mailings.name FROM mailings WHERE id = form_responses.mailing_id) AS name,
 form_responses.mailing_id AS mailing_id FROM form_responses
JOIN
 mailings ON mailings.id = form_responses.mailing_id
WHERE
 form_responses.deleted IS NOT true AND
 form_responses.is_unanswered IS NOT true AND
 form_responses.created_time &gt;= '2017-08-02 00:00:00' AND
 form_responses.created_time &lt;= '2018-01-29 00:00:00' AND
 form_responses.project_id = 1 AND
 mailings.deleted IS NOT true AND
 mailings.project_id = 1 AND
 mailings.active IS true
GROUP BY form_responses.mailing_id
ORDER BY form_response_last_received DESC
LIMIT 20 OFFSET 0

這給了我以下輸出：

Limit  (cost=308997.69..308997.74 rows=20 width=16) (actual time=7762.554..7762.565 rows=20 loops=1)
 Buffers: shared hit=74535 read=13513 dirtied=1
 -&gt;  Sort  (cost=308997.69..308998.19 rows=197 width=16) (actual time=7762.552..7762.557 rows=20 loops=1)
       Sort Key: (max(form_responses.created_time))
       Sort Method: top-N heapsort  Memory: 26kB
       Buffers: shared hit=74535 read=13513 dirtied=1
       -&gt;  GroupAggregate  (cost=307039.18..308992.45 rows=197 width=16) (actual time=7673.523..7762.278 rows=686 loops=1)
             Group Key: form_responses.mailing_id
             Buffers: shared hit=74535 read=13513 dirtied=1
             -&gt;  Sort  (cost=307039.18..307111.58 rows=28961 width=16) (actual time=7673.277..7702.783 rows=137542 loops=1)
                   Sort Key: form_responses.mailing_id
                   Sort Method: quicksort  Memory: 12592kB
                   Buffers: shared hit=71790 read=13512 dirtied=1
                   -&gt;  Hash Join  (cost=7376.42..304892.90 rows=28961 width=16) (actual time=44.576..7609.818 rows=137542 loops=1)
                         Hash Cond: (form_responses.mailing_id = mailings.id)
                         Buffers: shared hit=71790 read=13512 dirtied=1
                         -&gt;  Bitmap Heap Scan on form_responses  (cost=3815.27..299862.55 rows=134811 width=16) (actual time=41.439..7499.305 rows=137543 loops=1)
                               Recheck Cond: ((project_id = 1) AND (created_time &gt;= '2017-08-02 00:00:00+00'::timestamp with time zone) AND (created_time &lt;= '2018-01-29 00:00:00+00'::timestamp with time zone) AND (deleted IS NOT TRUE) AND (is_unanswered IS NOT TRUE))
                               Heap Blocks: exact=84014
                               Buffers: shared hit=71032 read=13512
                               -&gt;  Bitmap Index Scan on form_responses_on_pid_cr_time_where_not_del_not_is_unans  (cost=0.00..3781.57 rows=134811 width=0) (actual time=23.892..23.892 rows=137543 loops=1)
                                     Index Cond: ((project_id = 1) AND (created_time &gt;= '2017-08-02 00:00:00+00'::timestamp with time zone) AND (created_time &lt;= '2018-01-29 00:00:00+00'::timestamp with time zone))
                                     Buffers: shared hit=530
                         -&gt;  Hash  (cost=3286.82..3286.82 rows=21946 width=4) (actual time=3.093..3.093 rows=2866 loops=1)
                               Buckets: 4096  Batches: 1  Memory Usage: 101kB
                               Buffers: shared hit=758 dirtied=1
                               -&gt;  Bitmap Heap Scan on mailings  (cost=874.50..3286.82 rows=21946 width=4) (actual time=0.403..2.145 rows=2866 loops=1)
                                     Recheck Cond: ((project_id = 1) AND (deleted IS NOT TRUE) AND (active IS TRUE))
                                     Heap Blocks: exact=732
                                     Buffers: shared hit=758 dirtied=1
                                     -&gt;  Bitmap Index Scan on mailing_on_project_id_where_not_deleted_and_active  (cost=0.00..869.01 rows=21946 width=0) (actual time=0.315..0.315 rows=2866 loops=1)
                                           Index Cond: (project_id = 1)
                                           Buffers: shared hit=26
             SubPlan 1
               -&gt;  Index Scan using mailings_pkey on mailings mailings_1  (cost=0.42..8.44 rows=1 width=26) (actual time=0.003..0.003 rows=1 loops=686)
                     Index Cond: (id = form_responses.mailing_id)
                     Buffers: shared hit=2745 read=1
                     Planning time: 0.865 ms
Execution time: 7764.951 ms

在此範例中，它查詢約 135k 行，最多可達到約 600k。如您所見，99% 的時間都花在點陣圖堆掃描上。我db.t2.large在 AWS 上執行一個具有 100gb (SSD) 儲存的 RDS 數據庫，它執行 postgres 9.4.7

有沒有辦法避免進行點陣圖堆掃描？
或者有什麼方法可以提高這個速度？
我可以創建一個覆蓋索引，因此它不必讀取表單響應表本身嗎？
我應該將form_responses表聚集在project_id和/或mailing_id上嗎？
我應該升級我的硬體嗎？
我對我的數據庫要求太多了嗎？

可以避免點陣圖掃描set enable_bitmapscan to off。然後你可以看到它選擇了什麼，以及該選擇是如何執行的。這通常是一種探索性工具，在生產系統中以這種方式全域設置通常不是一個好主意。但是您可以只在一個會話中設置，因此在生產系統上進行測試沒有問題。
如果您使用相同的參數第二次執行相同的查詢，它會更快嗎？如果是這樣，您可以通過將表預熱form_responses到記憶體中來提高性能（例如，使用pg_prewarm擴展），假設您的記憶體足夠大以容納所有內容。你也可以嘗試增加effective_io_concurrency，如果 RDS 允許的話。
如果您創建一個索引，mailing_on_project_id_where_not_deleted_and_active但也包括列列表mailing_id的id末尾，那麼它可以像一個覆蓋索引一樣，您可以讓它使用僅索引掃描。你必須保持你的桌子充分吸塵，才能最大限度地發揮作用。預設的清理級別可能不夠激進，具體取決於數據更新的模式。
PostgreSQL 基於索引而不是列對錶進行集群。分群mailing_on_project_id_where_not_deleted_and_active可能會幫助這個特定的查詢。
根據上述答案和結果，您可能需要升級硬體。如果form_responses不適合 RAM，則更多 RAM 會有所幫助。如果這不可行並且需要從磁碟讀取，那麼從 HDD 升級到 SSD 會有所幫助（除非 RDS 已經在 SSD 上，在這種情況下，除非它受到限制，否則您首先不應該遇到問題） .
此查詢可能會在 PostgreSQL 9.6 中引入並在 10 中改進的一些並行查詢實現中受益，但我不知道 RDS 是否支持這些。

引用自：https://dba.stackexchange.com/questions/196721

點陣圖堆掃描很慢

相關問答

對多個 OUTER JOIN 進行計數和分組

聯接表並根據條件匯總行

Postgres Count 在同一個查詢中具有不同的條件

從 3 個表中按月獲取合併的行數

加入未嵌套的整數數組列併計算數組中的出現次數

使用 where 和排序子句連接 3 個表