Postgresql 10：具有精確堆塊的點陣圖堆掃描

January 23, 2022

我有以下查詢：

select ro.*
from courier c1
   join courier c2 on c2.real_physical_courier_1c_id = c1.real_physical_courier_1c_id
   join restaurant_order ro on ro.courier_id = c2.id
   left join jsonb_array_elements(items) jae on true
   left join jsonb_array_elements(jae-&gt;'options') ji on true
   inner join catalogue c on c.id in ((jae-&gt;'id')::int, (ji-&gt;'id')::int)
   join restaurant r on r.id = ro.restaurant_id
where c1.id = '7b35cdab-b423-472a-bde1-d6699f6cefd3' and ro.status in (70, 73)
group by ro.order_id, r.id ;

這是查詢計劃的一部分，它需要大約 95% 的時間：

-&gt;  Parallel Bitmap Heap Scan on restaurant_order ro  (cost=23.87..2357.58 rows=1244 width=1257) (actual time=11.931..38.163 rows=98 loops=2)"
     Recheck Cond: (status = ANY ('{70,73}'::integer[]))"
     Heap Blocks: exact=28755"
     -&gt;  Bitmap Index Scan on ro__status  (cost=0.00..23.34 rows=2115 width=0) (actual time=9.168..9.168 rows=51540 loops=1)"
           Index Cond: (status = ANY ('{70,73}'::integer[]))"

我有一些問題。

首先是點陣圖索引掃描部分。Postgres 遍歷 51540 條 ro__status 索引記錄，Index Cond: (status = ANY ('{70,73}'::integer[]))"並創建一個包含 28755 個元素的點陣圖。它的鍵是對應表行的物理位置（exact在Heap Blocks節中表示）。它是否正確？
其次，這張圖被傳遞到 Bitmap Heap Scan 階段。Recheck Cond實際上並沒有執行，因為堆塊不是有損樣式。點陣圖堆掃描按元組的物理位置對點陣圖進行排序，以啟用順序訪問。然後它分兩次依次讀取表數據 ( loops=2) 並獲得不超過 196 個表行。那是對的嗎？
線中反映的點陣圖大小Heap Blocks: exact=28755隨時間變化很大。差異是兩個數量級。比如昨天是500左右，為什麼會這樣？
現在，為什麼在點陣圖索引掃描階段創建的點陣圖有這麼多鍵？有 ro__status 索引可以表明只有大約 200 條狀態為 70 和 73 的記錄。我想不出任何原因阻止 postgres 只保留那些實際滿足index cond. 成本似乎很大：而不是約 200 個鍵，而是 28755 個！
為什麼點陣圖堆掃描需要這麼長時間？據我所知，有兩次順序讀取（loops=2），它應該花費更少的時間，不是嗎？或者，按元組的物理位置排序的點陣圖是罪魁禍首嗎？
我應該擔心估計不佳嗎？如果是這樣，增加 default_statistics_target 應該會有所幫助，對吧？現在預設為 100。

以防萬一，這是一個完整的計劃：

"Group  (cost=51297.15..52767.65 rows=19998 width=1261) (actual time=42.555..42.555 rows=0 loops=1)"
"  Group Key: ro.order_id, r.id"
"  -&gt;  Gather Merge  (cost=51297.15..52708.83 rows=11764 width=1261) (actual time=42.554..45.459 rows=0 loops=1)"
"        Workers Planned: 1"
"        Workers Launched: 1"
"        -&gt;  Group  (cost=50297.14..50385.37 rows=11764 width=1261) (actual time=38.850..38.850 rows=0 loops=2)"
"              Group Key: ro.order_id, r.id"
"              -&gt;  Sort  (cost=50297.14..50326.55 rows=11764 width=1261) (actual time=38.850..38.850 rows=0 loops=2)"
"                    Sort Key: ro.order_id, r.id"
"                    Sort Method: quicksort  Memory: 25kB"
"                    Worker 0:  Sort Method: quicksort  Memory: 25kB"
"                    -&gt;  Nested Loop  (cost=31.84..45709.27 rows=11764 width=1261) (actual time=38.819..38.819 rows=0 loops=2)"
"                          -&gt;  Nested Loop Left Join  (cost=27.21..5194.50 rows=5882 width=1325) (actual time=38.819..38.819 rows=0 loops=2)"
"                                -&gt;  Nested Loop Left Join  (cost=27.20..5076.49 rows=59 width=1293) (actual time=38.818..38.818 rows=0 loops=2)"
"                                      -&gt;  Nested Loop  (cost=27.20..5074.49 rows=1 width=1261) (actual time=38.818..38.818 rows=0 loops=2)"
"                                            -&gt;  Hash Join  (cost=26.93..5073.59 rows=1 width=1257) (actual time=38.817..38.818 rows=0 loops=2)"
"                                                  Hash Cond: (c2.real_physical_courier_1c_id = c1.real_physical_courier_1c_id)"
"                                                  -&gt;  Nested Loop  (cost=24.28..5068.22 rows=1038 width=1267) (actual time=11.960..38.732 rows=98 loops=2)"
"                                                        -&gt;  Parallel Bitmap Heap Scan on restaurant_order ro  (cost=23.87..2357.58 rows=1244 width=1257) (actual time=11.931..38.163 rows=98 loops=2)"
"                                                              Recheck Cond: (status = ANY ('{70,73}'::integer[]))"
"                                                              Heap Blocks: exact=28755"
"                                                              -&gt;  Bitmap Index Scan on ro__status  (cost=0.00..23.34 rows=2115 width=0) (actual time=9.168..9.168 rows=51540 loops=1)"
"                                                                    Index Cond: (status = ANY ('{70,73}'::integer[]))"
"                                                        -&gt;  Index Scan using courier_pkey on courier c2  (cost=0.41..2.18 rows=1 width=26) (actual time=0.005..0.005 rows=1 loops=195)"
"                                                              Index Cond: (id = ro.courier_id)"
"                                                  -&gt;  Hash  (cost=2.63..2.63 rows=1 width=10) (actual time=0.039..0.039 rows=1 loops=2)"
"                                                        Buckets: 1024  Batches: 1  Memory Usage: 9kB"
"                                                        -&gt;  Index Scan using courier_pkey on courier c1  (cost=0.41..2.63 rows=1 width=10) (actual time=0.034..0.034 rows=1 loops=2)"
"                                                              Index Cond: (id = '7b35cdab-b423-472a-bde1-d6699f6cefd3'::uuid)"
"                                            -&gt;  Index Only Scan using restaurant_pkey on restaurant r  (cost=0.27..0.89 rows=1 width=4) (never executed)"
"                                                  Index Cond: (id = ro.restaurant_id)"
"                                                  Heap Fetches: 0"
"                                      -&gt;  Function Scan on jsonb_array_elements jae  (cost=0.00..1.00 rows=100 width=32) (never executed)"
"                                -&gt;  Function Scan on jsonb_array_elements ji  (cost=0.01..1.00 rows=100 width=32) (never executed)"
"                          -&gt;  Bitmap Heap Scan on catalogue c  (cost=4.63..6.87 rows=2 width=4) (never executed)"
"                                Recheck Cond: ((id = ((jae.value -&gt; 'id'::text))::integer) OR (id = ((ji.value -&gt; 'id'::text))::integer))"
"                                -&gt;  BitmapOr  (cost=4.63..4.63 rows=2 width=0) (never executed)"
"                                      -&gt;  Bitmap Index Scan on catalogue_pkey  (cost=0.00..0.97 rows=1 width=0) (never executed)"
"                                            Index Cond: (id = ((jae.value -&gt; 'id'::text))::integer)"
"                                      -&gt;  Bitmap Index Scan on catalogue_pkey  (cost=0.00..0.97 rows=1 width=0) (never executed)"
"                                            Index Cond: (id = ((ji.value -&gt; 'id'::text))::integer)"
"Planning Time: 1.113 ms"
"Execution Time: 45.588 ms"

它建構了一個包含 51,540 個項目的點陣圖。然後將其（大致）分成兩半，一個用於兩個並行過程中的每一個。的報告exact=28755顯然僅針對其中一個過程。（如果您通過禁用並行查詢set max_parallel_workers_per_gather TO 0，則生成的計劃將更容易理解。這通常是我在研究查詢性能時要做的第一件事，除非我正在嘗試研究並行化。無論我做出什麼改進都會然後通常轉換回並行執行。）
點陣圖本質上是物理順序的。對其進行排序與創建它不是一個單獨的步驟。PostgreSQL 以該順序一個接一個地讀取塊。如果它決定將這些單獨的讀取合併為順序讀取，則取決於作業系統/文件系統。以我的經驗，你必須閱讀幾乎每一個區塊才能產生良好的效果。如果您只讀取每五個（隨機）塊，那麼您還不如進行隨機讀取。我無法從您的數據中看出 28755 個塊代表表的哪一部分。
現在，為什麼在點陣圖索引掃描階段創建的點陣圖有這麼多鍵？有 ro__status 索引可以表明只有大約 200 條狀態為 70 和 73 的記錄
PostgreSQL 中的索引本身並不包含任何可見性資訊。“ro__status”無法知道其中哪些 ctid 仍然可見，因此必須將它們全部塞入點陣圖中。然後它們中的大多數在堆掃描階段被拒絕為不可見。（這沒有明確報告，重新檢查和過濾拒絕的方式。你必須通過點陣圖大小和最終行數之間的差異來推斷它。在 BitmapAnd 和 BitmapOr 的情況下，你甚至不能輕易做到這一點，因為點陣圖大小不准確）。
所以這就是問題的癥結所在，您訪問超過 50,000 個元組只是為了找到 195 個活的元組。從索引中清除那些死元組是清理的主要工作之一。所以很可能你的桌子沒有得到足夠的吸塵。您可以非常簡單地對此進行測試，清理桌子並查看是否可以解決問題。如果沒有，那麼您可能有長期持有的快照，這會阻止真空有效，所以去尋找那些。
Btree 索引確實具有“微真空”功能，其中正常索引掃描會殺死一個索引元組，它發現該索引元組指向死對所有堆元組。但是點陣圖索引掃描並沒有實現這一點，因為它們在初始協商後不會重新訪問索引，因此沒有很好的機會殺死索引元組。點陣圖掃描將從這個微真空中受益，但不會自己執行微真空。這可能會導致一種反常的情況，即點陣圖掃描比正常索引掃描更受歡迎，索引的相關部分變得越臃腫，直到點陣圖掃描開始執行更差。增加吸塵可以解決這個問題，但如果您不想進一步增加它，那麼您通常可以不鼓勵點陣圖。

引用自：https://dba.stackexchange.com/questions/306458

Postgresql 10：具有精確堆塊的點陣圖堆掃描

相關問答

具有大 IN 的 Postgres 查詢，並且在臨時表上加入似乎不起作用

postgresql 計劃器/優化器的特定索引問題

不同 Postgres 數據庫上相同查詢的不同執行時間

PostgreSQL IO 成本邏輯

Postgres Planner 偶爾不使用 GIN 索引

為什麼 COALESCE 阻止在 varchar 上使用索引，而不是在文本列上？