Why are some count queries so slow?

April 17, 2019

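I have a table users with 2.6 million rows, and SELECT COUNT(*) FROM users takes 2 seconds to complete, but the same query on the stories table, which has 5.2 million rows, takes more than an hour.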

The EXPLAIN plans are very similar:

explain select count(*) from users;

Finalize Aggregate  (cost=89417.87..89417.88 rows=1 width=8)
  ->  Gather  (cost=89417.65..89417.86 rows=2 width=8)
        Workers Planned: 2
        ->  Partial Aggregate  (cost=88417.65..88417.66 rows=1 width=8)
              ->  Parallel Seq Scan on users  (cost=0.00..85702.72 rows=1085972 width=0)

explain select count(*) from stories;

Finalize Aggregate  (cost=428235.66..428235.67 rows=1 width=8)
  ->  Gather  (cost=428235.45..428235.66 rows=2 width=8)
        Workers Planned: 2
        ->  Partial Aggregate  (cost=427235.45..427235.46 rows=1 width=8)
              ->  Parallel Index Only Scan using stories__is_permanently_deleted__idx on stories  (cost=0.43..421752.81 rows=2193057 width=0)

Postgres version:

                                                           version                                                            
PostgreSQL 10.6 (Ubuntu 10.6-0ubuntu0.18.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0, 64-bit

Table definition of stories:

        Column          |  Type   | Collation | Nullable |                      Default                      | Storage  | Stats target | Description 
-------------------------+---------+-----------+----------+---------------------------------------------------+----------+--------------+-------------
id                      | bigint  |           | not null | nextval('stories_id_seq'::regclass) | plain    |              | 
rating                  | integer |           | not null |                                                   | plain    |              | 
number_of_pluses        | integer |           | not null |                                                   | plain    |              | 
number_of_minuses       | integer |           | not null |                                                   | plain    |              | 
title                   | text    |           | not null |                                                   | extended |              | 
content_blocks          | jsonb   |           | not null |                                                   | extended |              | 
created_at_timestamp    | bigint  |           | not null |                                                   | plain    |              | 
story_url               | text    |           | not null |                                                   | extended |              | 
tags                    | jsonb   |           | not null |                                                   | extended |              | 
number_of_comments      | integer |           | not null |                                                   | plain    |              | 
is_deleted              | boolean |           | not null |                                                   | plain    |              | 
is_rating_hidden        | boolean |           | not null |                                                   | plain    |              | 
has_mine_tag            | boolean |           | not null |                                                   | plain    |              | 
has_adult_tag           | boolean |           | not null |                                                   | plain    |              | 
is_longpost             | boolean |           | not null |                                                   | plain    |              | 
author_id               | bigint  |           | not null |                                                   | plain    |              | 
author_username         | text    |           | not null |                                                   | extended |              | 
author_profile_url      | text    |           | not null |                                                   | extended |              | 
author_avatar_url       | text    |           | not null |                                                   | extended |              | 
community_link          | text    |           | not null |                                                   | extended |              | 
community_name          | text    |           | not null |                                                   | extended |              | 
comments_are_hot        | boolean |           | not null |                                                   | plain    |              | 
added_timestamp         | bigint  |           | not null |                                                   | plain    |              | 
last_update_timestamp   | bigint  |           | not null |                                                   | plain    |              | 
next_update_timestamp   | bigint  |           | not null |                                                   | plain    |              | 
task_taken_at_timestamp | bigint  |           | not null |                                                   | plain    |              | 
is_permanently_deleted  | boolean |           | not null | false                                             | plain    |              | 
Indexes:
   "stories_pkey" PRIMARY KEY, btree (id)
   "stories__added_timestamp__idx" btree (added_timestamp)
   "stories__is_permanently_deleted__idx" btree (is_permanently_deleted)
   "stories__last_update_timestamp__idx" btree (last_update_timestamp)
   "stories__next_update_timestamp__idx" btree (next_update_timestamp)
   "stories__task_taken_at_timestamp__idx" btree (task_taken_at_timestamp)

Index definition:

    Index "public.stories__is_permanently_deleted__idx"
        Column         |  Type   |       Definition       | Storage 
------------------------+---------+------------------------+---------
is_permanently_deleted | boolean | is_permanently_deleted | plain
btree, for table "public.stories"

After reindexing (as suggested):

EXPLAIN (ANALYZE, BUFFERS) select count(*) from stories;

Finalize Aggregate  (cost=356218.22..356218.23 rows=1 width=8) (actual time=273577.971..273577.971 rows=1 loops=1)
  Buffers: shared hit=186467 read=65977 dirtied=24 written=1166
  ->  Gather  (cost=356218.01..356218.22 rows=2 width=8) (actual time=272647.858..273602.243 rows=3 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=186467 read=65977 dirtied=24 written=1166
        ->  Partial Aggregate  (cost=355218.01..355218.02 rows=1 width=8) (actual time=272938.947..272938.948 rows=1 loops=3)
              Buffers: shared hit=186467 read=65977 dirtied=24 written=1166
              ->  Parallel Index Only Scan using stories__is_permanently_deleted__idx on stories  (cost=0.43..349741.33 rows=2190671 width=0) (actual time=0.386..271148.590 rows=1752497 loops=3)
                    Heap Fetches: 654818
                    Buffers: shared hit=186467 read=65977 dirtied=24 written=1166
Planning time: 0.726 ms
Execution time: 273602.447 ms

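While you get a Parallel Seq Scan for users, you get a Parallel Index Only Scan for stories, which should typically be faster than a sequential scan on the table.
If it is that much slower, the obvious suspect is index bloat (or, worse, index corruption).
Recreate the index and test again to see whether that is the cause. If so, investigate what is bloating (or corrupting) your index. Corruption should be an extremely rare exception, unless you are running on faulty RAM or storage.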
REINDEX INDEX stories__is_permanently_deleted__idx;
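To judge whether bloat was actually the problem, it can help to compare the index size before and after the REINDEX. The following is only a sketch using the table and index names from the question; which sizes count as "bloated" is a judgment call that depends on the data. The second query is an additional check, not something the steps above require: the plan in the question reports Heap Fetches: 654818, meaning the index-only scan still had to visit the heap for those entries, and the visibility-map counters in pg_class give a rough idea of how much of the table is marked all-visible.

```sql
-- Run once before and once after REINDEX and compare index_size.
SELECT pg_size_pretty(pg_relation_size('stories__is_permanently_deleted__idx'::regclass)) AS index_size,
       pg_size_pretty(pg_relation_size('stories'::regclass))                              AS heap_size,
       pg_size_pretty(pg_total_relation_size('stories'::regclass))                        AS total_size;

-- Rough visibility-map check (counters are as of the last VACUUM / ANALYZE):
-- the closer relallvisible is to relpages, the fewer heap fetches
-- an index-only scan should need.
SELECT relpages, relallvisible
FROM   pg_class
WHERE  oid = 'public.stories'::regclass;
```

If the index shrinks dramatically after REINDEX, bloat is a plausible explanation; if it barely changes while the query stays slow, as in the updated plan in the question, the cause probably lies elsewhere.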
If you don't need an exact count, there are faster alternatives:
Fast way to discover the row count of a table in PostgreSQL
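As a minimal sketch of one such alternative: the planner's own row estimate is stored in pg_class.reltuples and can be read almost instantly. It is only as fresh as the last VACUUM or ANALYZE (or autovacuum run) on the table, so treat the result as an approximation. The schema name public is taken from the index definition in the question.

```sql
-- Estimated row count from planner statistics; no table or index scan involved.
SELECT reltuples::bigint AS estimated_rows
FROM   pg_class
WHERE  oid = 'public.stories'::regclass;
```

Running ANALYZE stories; beforehand refreshes the estimate if it looks stale.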
**Aside:** reorder the columns of stories like this to save about 20 bytes per row:
         Column          |  Type   
-------------------------+---------
id                      | bigint
created_at_timestamp    | bigint
added_timestamp         | bigint
last_update_timestamp   | bigint
next_update_timestamp   | bigint
task_taken_at_timestamp | bigint
author_id               | bigint
rating                  | integer
number_of_pluses        | integer
number_of_minuses       | integer
number_of_comments      | integer
is_deleted              | boolean
is_rating_hidden        | boolean
has_mine_tag            | boolean
has_adult_tag           | boolean
is_longpost             | boolean
comments_are_hot        | boolean
is_permanently_deleted  | boolean
author_username         | text
author_profile_url      | text
author_avatar_url       | text
community_link          | text
community_name          | text
title                   | text
story_url               | text
content_blocks          | jsonb
tags                    | jsonb
See:
Configuring PostgreSQL for read performance
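The saving comes from alignment padding: a bigint must start on an 8-byte boundary, so a boolean placed in front of it wastes the bytes in between. A small sketch that makes this visible follows. The first query lists the current column order together with each type's length and alignment; the second compares two made-up rows (the literal values are placeholders, not data from the question) that hold the same values in a different order.

```sql
-- Current physical column order with type length and alignment
-- (typalign: c = 1 byte, s = 2, i = 4, d = 8; typlen -1 = variable length).
SELECT a.attnum, a.attname, t.typname, t.typlen, t.typalign
FROM   pg_attribute a
JOIN   pg_type t ON t.oid = a.atttypid
WHERE  a.attrelid = 'public.stories'::regclass
AND    a.attnum > 0
AND    NOT a.attisdropped
ORDER  BY a.attnum;

-- Same four values, two orderings: the interleaved row needs padding
-- before each bigint, the grouped row does not.
SELECT pg_column_size(ROW(true, 1::bigint, true, 1::bigint)) AS interleaved,
       pg_column_size(ROW(1::bigint, 1::bigint, true, true)) AS grouped;
```

Note that PostgreSQL has no command to reorder the columns of an existing table in place, so applying the new order means creating a new table and copying the data over.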

Source: https://dba.stackexchange.com/questions/235094
