postgresql 計劃器/優化器的特定索引問題

November 18, 2021

我有一個問題，我不知道了，似乎找不到已經討論過的類似問題，所以…
我/必須在 pgsql 10.15 / RHEL x86_64 上。
rel A 有大約 8.13 億條記錄。
格式就像 id,….,snapzeit。
col snapzeit 不可為空，非唯一，時間戳。
date_trunc(‘minute’, snapzeit) 上有一個索引。
不涉及 refin 約束。
最近分析
rel B 只是時間戳的一個名為 t 的列，並且有 13 行；獨特的時間戳。rel B 之所以存在，是因為我來自使用 generate_series，這比我目前的問題還要慢。
我想從 A 中獲取所有 snapzeit = Bt，這將是大約 255k 行。
如果我通過給出一個文字列表（與 B 中的值相同）進行過濾，它的速度很快：
select * from A
where date_trunc('minute',snapzeit) in ('2021-11-01 04:30',...);
查詢計劃：
Bitmap Heap Scan on A  (cost=1294171.21..24450480.97 rows=69102630 width=193) (actual time=16.184..278.121 rows=255589 loops=1)
 Recheck Cond: (date_trunc('minute'::text, snapzeit) = ANY ('{"2021-11-17 04:00:00",$CUT_OUT$'::timestamp without time zone[]))
 Heap Blocks: exact=5879
 -&gt;  Bitmap Index Scan on IDX_A  (cost=0.00..1276895.55 rows=69102630 width=0) (actual time=15.367..15.368 rows=255589 loops=1)
       Index Cond: (date_trunc('minute'::text, snapzeit) = ANY ('{"2021-11-17 04:00:00",$CUT_OUT$'::timestamp without time zone[]))
Planning time: 0.157 ms
Execution time: 456.024 ms
如果我使用 rel B 則需要很長時間。我添加了第二個謂詞，所以它完全會在 2021 年回歸：
select * from A
where date_trunc('minute',snapzeit) in (
   select date_trunc('minute',t) from B
)
and snapzeit &gt;= '2021-11-10';
查詢計劃：
Nested Loop  (cost=622164.37..18870692.83 rows=12866340 width=193) (actual time=3051.373..42762.069 rows=255589 loops=1)
 -&gt;  HashAggregate  (cost=1.21..1.38 rows=17 width=8) (actual time=0.057..0.133 rows=17 loops=1)
       Group Key: date_trunc('minute'::text, B.t)
       -&gt;  Seq Scan on test  (cost=0.00..1.17 rows=17 width=8) (actual time=0.024..0.039 rows=17 loops=1)
 -&gt;  Bitmap Heap Scan on A  (cost=622163.16..1108754.04 rows=128663 width=193) (actual time=2478.495..2492.003 rows=15035 loops=17)
       Recheck Cond: ((date_trunc('minute'::text, snapzeit) = date_trunc('minute'::text, B.t)) AND (snapzeit &gt;= '2021-11-10 00:00:00'::timestamp without time zone))
       Heap Blocks: exact=5879
       -&gt;  BitmapAnd  (cost=622163.16..622163.16 rows=128663 width=0) (actual time=2478.424..2478.424 rows=0 loops=17)
             -&gt;  Bitmap Index Scan on IDX_A  (cost=0.00..75117.91 rows=4065181 width=0) (actual time=0.870..0.870 rows=15035 loops=17)
                   Index Cond: (date_trunc('minute'::text, snapzeit) = date_trunc('minute'::text, B.t))
             -&gt;  Bitmap Index Scan on IDX2_A (cost=0.00..540579.67 rows=25732679 width=0) (actual time=2464.373..2464.373 rows=36137255 loops=17)
                   Index Cond: (snapzeit &gt;= '2021-11-10 00:00:00'::timestamp without time zone)
Planning time: 0.313 ms
Execution time: 42936.373 ms
rel A 在 snapzeit 上也有一個 IDX，沒有那個 date_trunc。
嘗試了 ANY() 的變體，在 A.snapzeit=Bt 上內部連接 A 和 B，子選擇中沒有 date_trunc；它保持緩慢。
我的期望將是與“字面”情況類似的表現；那麼，在第二種情況下，所有動作從何而來，而它可以將這些少數值帶到索引中，僅此而已？
另一方面，我覺得我忽略了一些簡單/明顯的事情。有什麼提示嗎？
乾杯和thanx很多，JJ
編輯：
輸入 array() 一切都很快：
select * from A
   where date_trunc('minute',snapzeit) = ANY(ARRAY (
       select date_trunc('minute',t) from B
   ));
由@LaurenzAlbe 解決，謝謝。
雖然我仍然會說規劃器/優化器應該自己做。
乾杯，JJ

實驗表明，以下足以讓 PostgreSQL 使用更快的索引掃描：
WHERE date_trunc('minute',snapzeit) IN (SELECT ...)
到
WHERE date_trunc('minute',snapzeit) = ANY (array(SELECT ...))

引用自：https://dba.stackexchange.com/questions/302823

postgresql 計劃器/優化器的特定索引問題

相關問答

PostgreSQL 執行計劃中帶有數字和下劃線的奇怪表

Postgres 選擇 seq 掃描而不是索引掃描

臨時表上的索引使用情況

PostgreSQL 錯誤地使用主鍵索引進行最小/最大查詢

如何使用索引進行簡單的時間範圍連接？

指定 JOIN 和 LIMIT 時 SELECT 非常慢，但找不到/很少找到結果