Postgresql

Postgresql 無法使用我的覆蓋索引並退回到更慢的點陣圖掃描

  • November 5, 2018

我試圖弄清楚為什麼我的表在索引掃描快得多時使用點陣圖堆掃描。

桌子:

webarchive=# \d web_pages
                                              Table "public.web_pages"
     Column       |            Type             |                              Modifiers
-------------------+-----------------------------+---------------------------------------------------------------------
id                | bigint                      | not null default nextval('web_pages_id_seq'::regclass)
state             | dlstate_enum                | not null
errno             | integer                     |
url               | text                        | not null
starturl          | text                        | not null
netloc            | text                        | not null
file              | bigint                      |
priority          | integer                     | not null
distance          | integer                     | not null
is_text           | boolean                     |
limit_netloc      | boolean                     |
title             | citext                      |
mimetype          | text                        |
type              | itemtype_enum               |
content           | text                        |
fetchtime         | timestamp without time zone |
addtime           | timestamp without time zone |
normal_fetch_mode | boolean                     | default true
ignoreuntiltime   | timestamp without time zone | not null default '1970-01-01 00:00:00'::timestamp without time zone
Indexes:
   "web_pages_pkey" PRIMARY KEY, btree (id)
   "ix_web_pages_url" UNIQUE, btree (url)
   "ix_web_pages_distance" btree (distance)
   "ix_web_pages_fetchtime" btree (fetchtime)
   "ix_web_pages_id" btree (id)
   "ix_web_pages_id_state" btree (id, state)
   "ix_web_pages_netloc" btree (netloc)
   "ix_web_pages_priority" btree (priority)
   "ix_web_pages_state" btree (state)
   "web_pages_netloc_fetchtime_idx" btree (netloc, fetchtime)
   "web_pages_netloc_id_idx" btree (netloc, id)
Foreign-key constraints:
   "web_pages_file_fkey" FOREIGN KEY (file) REFERENCES web_files(id)
Tablespace: "main_webarchive_tablespace"

詢問:

EXPLAIN ANALYZE UPDATE
   web_pages
SET
   state = 'new'
WHERE
   (state = 'fetching' OR state = 'processing')
AND
   id <= 150000000;

在這種情況下,由於我有一個覆蓋索引 ( ix_web_pages_id_state),我希望查詢規劃器只進行索引掃描。但是,它會生成一個點陣圖堆掃描,這要慢得多:

                                                                         QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Update on web_pages  (cost=524.06..532.09 rows=2 width=671) (actual time=2356.900..2356.900 rows=0 loops=1)
  ->  Bitmap Heap Scan on web_pages  (cost=524.06..532.09 rows=2 width=671) (actual time=2356.896..2356.896 rows=0 loops=1)
        Recheck Cond: (((state = 'fetching'::dlstate_enum) OR (state = 'processing'::dlstate_enum)) AND (id <= 150000000))
        Heap Blocks: exact=6
        ->  BitmapAnd  (cost=524.06..524.06 rows=2 width=0) (actual time=2353.388..2353.388 rows=0 loops=1)
              ->  BitmapOr  (cost=151.98..151.98 rows=6779 width=0) (actual time=2021.635..2021.636 rows=0 loops=1)
                    ->  Bitmap Index Scan on ix_web_pages_state  (cost=0.00..147.41 rows=6779 width=0) (actual time=2021.616..2021.617 rows=11668131 loops=1)
                          Index Cond: (state = 'fetching'::dlstate_enum)
                    ->  Bitmap Index Scan on ix_web_pages_state  (cost=0.00..4.57 rows=1 width=0) (actual time=0.015..0.016 rows=0 loops=1)
                          Index Cond: (state = 'processing'::dlstate_enum)
              ->  Bitmap Index Scan on web_pages_pkey  (cost=0.00..371.83 rows=16435 width=0) (actual time=0.046..0.047 rows=205 loops=1)
                    Index Cond: (id <= 150000000)
Planning time: 0.232 ms
Execution time: 2406.234 ms
(14 rows)

如果我強制它不進行點陣圖堆掃描(按set enable_bitmapscan to off;),它會生成一個更快的計劃:

                                                             QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Update on web_pages  (cost=0.56..38591.75 rows=2 width=671) (actual time=0.284..0.285 rows=0 loops=1)
  ->  Index Scan using web_pages_pkey on web_pages  (cost=0.56..38591.75 rows=2 width=671) (actual time=0.281..0.281 rows=0 loops=1)
        Index Cond: (id <= 150000000)
        Filter: ((state = 'fetching'::dlstate_enum) OR (state = 'processing'::dlstate_enum))
        Rows Removed by Filter: 181
Planning time: 0.190 ms
Execution time: 0.334 ms
(7 rows)

我重新執行了一次真空分析,以查看表統計資訊是否可能已過時,但這似乎沒有任何好處。此外,以上是在多次重新執行相同的查詢之後,所以我認為記憶體也不應該是相關的。

我怎樣才能誘導計劃者在這裡生成一個更高效的計劃?


編輯:正如評論中所建議的,我添加了一個 index "ix_web_pages_state_id" btree (state, id)。不幸的是,它沒有幫助。

我還嘗試過降低random_page_cost(低至 0.5)以及增加統計目標,但兩者都沒有任何效果。


進一步編輯 - 刪除 OR 條件:

EXPLAIN ANALYZE UPDATE
   web_pages
SET
   state = 'new'
WHERE
   state = 'fetching'
AND
   id <= 150000000;

產量:

                                                                      QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
Update on web_pages  (cost=311.83..315.84 rows=1 width=589) (actual time=2574.654..2574.655 rows=0 loops=1)
  ->  Bitmap Heap Scan on web_pages  (cost=311.83..315.84 rows=1 width=589) (actual time=2574.650..2574.651 rows=0 loops=1)
        Recheck Cond: ((id <= 150000000) AND (state = 'fetching'::dlstate_enum))
        Heap Blocks: exact=6
        ->  BitmapAnd  (cost=311.83..311.83 rows=1 width=0) (actual time=2574.556..2574.556 rows=0 loops=1)
              ->  Bitmap Index Scan on web_pages_pkey  (cost=0.00..49.60 rows=1205 width=0) (actual time=0.679..0.680 rows=726 loops=1)
                    Index Cond: (id <= 150000000)
              ->  Bitmap Index Scan on ix_web_pages_state  (cost=0.00..261.98 rows=7122 width=0) (actual time=2519.950..2519.951 rows=11873888 loops=1)
                    Index Cond: (state = 'fetching'::dlstate_enum)

進一步編輯 - MOAR WEIRDNESS:

我重寫了查詢以使用子查詢:

EXPLAIN ANALYZE UPDATE
   web_pages
SET
   state = 'new'
WHERE
   (state = 'fetching' OR state = 'processing')
AND
   id IN (
       SELECT 
           id 
       FROM 
           web_pages 
       WHERE 
           id <= 150000000
   );

並且產生的執行計劃優於迄今為止的所有其他執行計劃。有時

                                                                       QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Update on web_pages  (cost=1.12..13878.31 rows=1 width=595) (actual time=2.773..2.774 rows=0 loops=1)
  ->  Nested Loop  (cost=1.12..13878.31 rows=1 width=595) (actual time=2.772..2.773 rows=0 loops=1)
        ->  Index Scan using web_pages_pkey on web_pages web_pages_1  (cost=0.56..3533.34 rows=1205 width=14) (actual time=0.000..0.602 rows=181 loops=1)
              Index Cond: (id <= 150000000)
        ->  Index Scan using web_pages_pkey on web_pages  (cost=0.56..8.58 rows=1 width=585) (actual time=0.010..0.010 rows=0 loops=181)
              Index Cond: (id = web_pages_1.id)
              Filter: ((state = 'fetching'::dlstate_enum) OR (state = 'processing'::dlstate_enum))
              Rows Removed by Filter: 1
Planning time: 0.891 ms
Execution time: 2.894 ms
(10 rows)

Update on web_pages  (cost=21193.19..48917.78 rows=2 width=595)
 ->  Hash Semi Join  (cost=21193.19..48917.78 rows=2 width=595)
       Hash Cond: (web_pages.id = web_pages_1.id)
       ->  Bitmap Heap Scan on web_pages  (cost=270.14..27976.00 rows=7126 width=585)
             Recheck Cond: ((state = 'fetching'::dlstate_enum) OR (state = 'processing'::dlstate_enum))
             ->  BitmapOr  (cost=270.14..270.14 rows=7126 width=0)
                   ->  Bitmap Index Scan on ix_web_pages_state  (cost=0.00..262.01 rows=7126 width=0)
                         Index Cond: (state = 'fetching'::dlstate_enum)
                   ->  Bitmap Index Scan on ix_web_pages_state  (cost=0.00..4.57 rows=1 width=0)
                         Index Cond: (state = 'processing'::dlstate_enum)
       ->  Hash  (cost=20834.15..20834.15 rows=7112 width=14)
             ->  Index Scan using web_pages_pkey on web_pages web_pages_1  (cost=0.56..20834.15 rows=7112 width=14)
                   Index Cond: ((id > 1883250000) AND (id <= 1883300000))

在這一點上,我不知道發生了什麼。我所知道的是每個案例都是由set enable_bitmapscan to off;.


好的,我昨晚執行的非常長時間執行的事務完成了,我設法VACUUM VERBOSE ANALYZE在桌子上執行了一個:

webarchive=# VACUUM ANALYZE VERBOSE web_pages;
INFO:  vacuuming "public.web_pages"
INFO:  scanned index "ix_web_pages_distance" to remove 33328301 row versions
DETAIL:  CPU 6.85s/21.21u sec elapsed 171.28 sec
INFO:  scanned index "ix_web_pages_fetchtime" to remove 33328301 row versions
DETAIL:  CPU 6.20s/25.28u sec elapsed 186.53 sec
INFO:  scanned index "ix_web_pages_id" to remove 33328301 row versions
DETAIL:  CPU 7.37s/29.56u sec elapsed 226.49 sec
INFO:  scanned index "ix_web_pages_netloc" to remove 33328301 row versions
DETAIL:  CPU 8.47s/41.44u sec elapsed 260.50 sec
INFO:  scanned index "ix_web_pages_priority" to remove 33328301 row versions
DETAIL:  CPU 5.65s/16.35u sec elapsed 180.78 sec
INFO:  scanned index "ix_web_pages_state" to remove 33328301 row versions
DETAIL:  CPU 4.51s/21.14u sec elapsed 189.60 sec
INFO:  scanned index "ix_web_pages_url" to remove 33328301 row versions
DETAIL:  CPU 26.59s/78.52u sec elapsed 969.99 sec
INFO:  scanned index "web_pages_netloc_fetchtime_idx" to remove 33328301 row versions
DETAIL:  CPU 8.23s/48.39u sec elapsed 301.37 sec
INFO:  scanned index "web_pages_netloc_id_idx" to remove 33328301 row versions
DETAIL:  CPU 15.52s/43.25u sec elapsed 423.68 sec
INFO:  scanned index "web_pages_pkey" to remove 33328301 row versions
DETAIL:  CPU 8.12s/33.43u sec elapsed 215.93 sec
INFO:  scanned index "ix_web_pages_id_state" to remove 33328301 row versions
DETAIL:  CPU 8.22s/33.26u sec elapsed 214.43 sec
INFO:  scanned index "ix_web_pages_state_id" to remove 33328301 row versions
DETAIL:  CPU 6.01s/28.04u sec elapsed 174.19 sec
INFO:  "web_pages": removed 33328301 row versions in 3408348 pages
DETAIL:  CPU 89.90s/50.24u sec elapsed 1928.70 sec
INFO:  index "ix_web_pages_distance" now contains 29463963 row versions in 215671 pages
DETAIL:  33328301 index row versions were removed.
32914 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "ix_web_pages_fetchtime" now contains 29463982 row versions in 253375 pages
DETAIL:  33328301 index row versions were removed.
40460 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "ix_web_pages_id" now contains 29464000 row versions in 238212 pages
DETAIL:  33328301 index row versions were removed.
21081 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "ix_web_pages_netloc" now contains 29464025 row versions in 358150 pages
DETAIL:  33328301 index row versions were removed.
99235 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "ix_web_pages_priority" now contains 29464032 row versions in 214923 pages
DETAIL:  33328301 index row versions were removed.
21451 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "ix_web_pages_state" now contains 29466359 row versions in 215150 pages
DETAIL:  33328301 index row versions were removed.
81340 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "ix_web_pages_url" now contains 29466350 row versions in 1137027 pages
DETAIL:  33197635 index row versions were removed.
236405 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "web_pages_netloc_fetchtime_idx" now contains 29466381 row versions in 539255 pages
DETAIL:  33328301 index row versions were removed.
220594 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "web_pages_netloc_id_idx" now contains 29466392 row versions in 501276 pages
DETAIL:  33328301 index row versions were removed.
144217 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "web_pages_pkey" now contains 29466394 row versions in 236560 pages
DETAIL:  33173411 index row versions were removed.
20559 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "ix_web_pages_id_state" now contains 29466415 row versions in 256699 pages
DETAIL:  33328301 index row versions were removed.
27194 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "ix_web_pages_state_id" now contains 29466435 row versions in 244076 pages
DETAIL:  33328301 index row versions were removed.
91918 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "web_pages": found 33339704 removable, 29367176 nonremovable row versions in 4224021 out of 4231795 pages
DETAIL:  2541 dead row versions cannot be removed yet.
There were 2079389 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 330.54s/537.34u sec elapsed 7707.90 sec.
INFO:  vacuuming "pg_toast.pg_toast_705758310"
INFO:  scanned index "pg_toast_705758310_index" to remove 7184381 row versions
DETAIL:  CPU 7.32s/13.70u sec elapsed 240.71 sec
INFO:  "pg_toast_705758310": removed 7184381 row versions in 2271192 pages
DETAIL:  CPU 62.81s/46.41u sec elapsed 1416.12 sec
INFO:  index "pg_toast_705758310_index" now contains 114558558 row versions in 338256 pages
DETAIL:  7184381 index row versions were removed.
2033 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_toast_705758310": found 7184381 removable, 40907769 nonremovable row versions in 11388831 out of 29033065 pages
DETAIL:  5 dead row versions cannot be removed yet.
There were 74209 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 433.26s/247.73u sec elapsed 8444.85 sec.
INFO:  analyzing "public.web_pages"
INFO:  "web_pages": scanned 600000 of 4232727 pages, containing 4191579 live rows and 4552 dead rows; 600000 rows in sample, 29569683 estimated total rows
VACUUM

它仍在生成非僅索引查詢,儘管執行時間與僅索引查詢更一致。我不明白為什麼行為發生瞭如此大的變化。執行時間很長的查詢會導致這麼多成本嗎?

webarchive=# EXPLAIN ANALYZE UPDATE
       web_pages
   SET
       state = 'new'
   WHERE
       (state = 'fetching' OR state = 'processing')
   AND
       id IN (
           SELECT
               id
           FROM
               web_pages
           WHERE
               id > 1883250000
           AND
               id <= 1883300000
       );
                                                                        QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Update on web_pages  (cost=36.00..9936.00 rows=1 width=594) (actual time=37.856..37.857 rows=0 loops=1)
  ->  Nested Loop Semi Join  (cost=36.00..9936.00 rows=1 width=594) (actual time=37.852..37.853 rows=0 loops=1)
        ->  Bitmap Heap Scan on web_pages  (cost=35.44..3167.00 rows=788 width=584) (actual time=23.984..31.489 rows=2321 loops=1)
              Recheck Cond: ((state = 'fetching'::dlstate_enum) OR (state = 'processing'::dlstate_enum))
              Heap Blocks: exact=2009
              ->  BitmapOr  (cost=35.44..35.44 rows=788 width=0) (actual time=22.347..22.348 rows=0 loops=1)
                    ->  Bitmap Index Scan on ix_web_pages_state  (cost=0.00..30.47 rows=788 width=0) (actual time=22.326..22.327 rows=9202 loops=1)
                          Index Cond: (state = 'fetching'::dlstate_enum)
                    ->  Bitmap Index Scan on ix_web_pages_state_id  (cost=0.00..4.57 rows=1 width=0) (actual time=0.017..0.017 rows=0 loops=1)
                          Index Cond: (state = 'processing'::dlstate_enum)
        ->  Index Scan using ix_web_pages_id_state on web_pages web_pages_1  (cost=0.56..8.58 rows=1 width=14) (actual time=0.001..0.001 rows=0 loops=2321)
              Index Cond: ((id = web_pages.id) AND (id > 1883250000) AND (id <= 1883300000))
Planning time: 2.677 ms
Execution time: 37.945 ms
(14 rows)

有趣的是,ID 偏移量的似乎會影響規劃:

webarchive=# EXPLAIN ANALYZE UPDATE
       web_pages
   SET
       state = 'new'
   WHERE
       (state = 'fetching' OR state = 'processing')
   AND
       id IN (
           SELECT
               id
           FROM
               web_pages
           WHERE
               id >  149950000
           AND
               id <= 150000000
       );
                                                                       QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
Update on web_pages  (cost=1.12..17.18 rows=1 width=594) (actual time=0.030..0.031 rows=0 loops=1)
  ->  Nested Loop  (cost=1.12..17.18 rows=1 width=594) (actual time=0.026..0.028 rows=0 loops=1)
        ->  Index Scan using ix_web_pages_id_state on web_pages web_pages_1  (cost=0.56..8.58 rows=1 width=14) (actual time=0.022..0.024 rows=0 loops=1)
              Index Cond: ((id > 149950000) AND (id <= 150000000))
        ->  Index Scan using ix_web_pages_id_state on web_pages  (cost=0.56..8.59 rows=1 width=584) (never executed)
              Index Cond: (id = web_pages_1.id)
              Filter: ((state = 'fetching'::dlstate_enum) OR (state = 'processing'::dlstate_enum))
Planning time: 1.531 ms
Execution time: 0.155 ms
(9 rows)

查詢計劃器是否在其計劃中考慮查詢參數的值?我原以為規劃與查詢參數無關,但現在考慮一下,使用參數來改進規劃是有意義的,所以我可以看到它以這種方式工作。

有趣的是,點陣圖掃描現在的性能似乎要高得多。

webarchive=# set enable_bitmapscan to off;
SET
webarchive=#     EXPLAIN ANALYZE UPDATE
       web_pages
   SET
       state = 'new'
   WHERE
       (state = 'fetching' OR state = 'processing')
   AND
       id IN (
           SELECT
               id
           FROM
               web_pages
           WHERE
               id > 1883250000
           AND
               id <= 1883300000
       );
                                                                         QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
Update on web_pages  (cost=1.12..82226.59 rows=1 width=594) (actual time=66.993..66.994 rows=0 loops=1)
  ->  Nested Loop  (cost=1.12..82226.59 rows=1 width=594) (actual time=66.992..66.993 rows=0 loops=1)
        ->  Index Scan using web_pages_pkey on web_pages web_pages_1  (cost=0.56..21082.82 rows=7166 width=14) (actual time=0.055..20.206 rows=8567 loops=1)
              Index Cond: ((id > 1883250000) AND (id <= 1883300000))
        ->  Index Scan using web_pages_pkey on web_pages  (cost=0.56..8.52 rows=1 width=584) (actual time=0.004..0.004 rows=0 loops=8567)
              Index Cond: (id = web_pages_1.id)
              Filter: ((state = 'fetching'::dlstate_enum) OR (state = 'processing'::dlstate_enum))
              Rows Removed by Filter: 1
Planning time: 1.963 ms
Execution time: 67.112 ms
(10 rows)

webarchive=# set enable_bitmapscan to on;
SET
webarchive=#     EXPLAIN ANALYZE UPDATE
       web_pages
   SET
       state = 'new'
   WHERE
       (state = 'fetching' OR state = 'processing')
   AND
       id IN (
           SELECT
               id
           FROM
               web_pages
           WHERE
               id > 1883250000
           AND
               id <= 1883300000
       );
                                                                        QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Update on web_pages  (cost=36.00..9936.00 rows=1 width=594) (actual time=23.331..23.331 rows=0 loops=1)
  ->  Nested Loop Semi Join  (cost=36.00..9936.00 rows=1 width=594) (actual time=23.327..23.328 rows=0 loops=1)
        ->  Bitmap Heap Scan on web_pages  (cost=35.44..3167.00 rows=788 width=584) (actual time=6.727..17.027 rows=1966 loops=1)
              Recheck Cond: ((state = 'fetching'::dlstate_enum) OR (state = 'processing'::dlstate_enum))
              Heap Blocks: exact=3825
              ->  BitmapOr  (cost=35.44..35.44 rows=788 width=0) (actual time=3.499..3.499 rows=0 loops=1)
                    ->  Bitmap Index Scan on ix_web_pages_state  (cost=0.00..30.47 rows=788 width=0) (actual time=3.471..3.472 rows=21996 loops=1)
                          Index Cond: (state = 'fetching'::dlstate_enum)
                    ->  Bitmap Index Scan on ix_web_pages_state_id  (cost=0.00..4.57 rows=1 width=0) (actual time=0.022..0.023 rows=0 loops=1)
                          Index Cond: (state = 'processing'::dlstate_enum)
        ->  Index Scan using ix_web_pages_id_state on web_pages web_pages_1  (cost=0.56..8.58 rows=1 width=14) (actual time=0.001..0.001 rows=0 loops=1966)
              Index Cond: ((id = web_pages.id) AND (id > 1883250000) AND (id <= 1883300000))
Planning time: 0.774 ms
Execution time: 23.425 ms
(14 rows)

所以我認為問題只是索引有很多不再有效的行,過濾這些行的過程是主要的時間成本。這裡的根本問題是,(我認為)MVCC 系統與 VACUUM 系統在極長時間執行的事務的上下文中互動的方式。

這是有道理的(回想起來),在每個可以使用該索引的事務都完成之前,不能從索引中刪除條目。從文件中:

但是 PostgreSQL 中的任何表掃描還有一個額外的要求:它必須驗證每個檢索到的行對查詢的 MVCC 快照是“可見的”,如第 13 章所述。可見性資訊不儲存在索引條目中,僅儲存在堆條目中;所以乍一看似乎每行檢索都需要一個堆訪問。如果最近修改了表行,情況確實如此。但是,對於很少更改的數據,有一種方法可以解決這個問題。

在這種情況下,我啟動了一個數據庫轉儲,然後繼續進行了一系列清理工作(這涉及到大量的行流失)。這將導致對每個索引查詢進行大量堆查找,因為索引包含大量現在已刪除的行。

不過,這主要是假設性的,因為我沒有資源來嘗試重現這種情況。

無論如何,@jjanes 關於長時間執行查詢的提示是我在這裡找到兔子洞的關鍵。

這部分顯然是混亂的:

->  Bitmap Index Scan on ix_web_pages_state  
    (cost=0.00..147.41 rows=6779 width=0) 
    (actual time=2021.616..2021.617 rows=11668131 loops=1)
Index Cond: (state = 'fetching'::dlstate_enum)

它發現的行數比它想像的多一千倍。真空分析應該已經解決了這個問題。也許你的真空沒有做太多,因為一些長期開放的交易正在阻止它刪除死元組?

進行 VACUUM VERBOSE ANALYZE 可以對此有所了解。

您還可以使用測試查詢進一步探索:

EXPLAIN (ANALYZE,BUFFERS) SELECT COUNT(*) FROM web_pages WHERE state = 'fetching'

嘗試通過擺弄 enable_* 參數使其同時作為點陣圖掃描和正常索引掃描執行,以便我們可以比較兩個報告。特別是,保持打開的事務將導致點陣圖索引掃描節點的實際計數更高,因為它必須在測試可見性之前對行進行計數,而普通索引掃描會在增加計數之前丟棄不可見的行。

引用自:https://dba.stackexchange.com/questions/221728