DISTINCT / GROUP BY performance

November 7, 2017

I'm trying to select the most recent message authors in a room, using this (simplified) table:

            Table "public.message"
Column  |  Type   | Nullable |               Default               
---------+---------+----------+-------------------------------------
id      | bigint  | not null | nextval('message_id_seq'::regclass)
room    | integer | not null | 
author  | integer | not null | 
created | integer | not null | 
Indexes:
   "message_pkey" PRIMARY KEY, btree (id)
   "message_author_created_room" btree (author, created, room)
   "message_room_author_created" btree (room, author, created)
   "message_room_created" btree (room, created)
   "message_room_id" btree (room, id)

The problem is that this query is slow:

select message.author as id, max(message.created) as mc from message
where room=12 group by message.author order by mc desc limit 50;

Here is the output of explain (analyze, verbose, buffers):

miaou=> explain (analyze, verbose, buffers) select message.author as id, max(message.created) as mc from message
where room=12 group by message.author order by mc desc limit 50;
                                                                                            QUERY PLAN                                                                                             
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit  (cost=10627.14..10627.26 rows=50 width=8) (actual time=54.887..54.901 rows=50 loops=1)
  Output: author, (max(created))
  Buffers: shared hit=490
  ->  Sort  (cost=10627.14..10629.19 rows=820 width=8) (actual time=54.885..54.891 rows=50 loops=1)
        Output: author, (max(created))
        Sort Key: (max(message.created)) DESC
        Sort Method: top-N heapsort  Memory: 29kB
        Buffers: shared hit=490
        ->  Finalize GroupAggregate  (cost=1000.46..10599.90 rows=820 width=8) (actual time=14.019..54.788 rows=160 loops=1)
              Output: author, max(created)
              Group Key: message.author
              Buffers: shared hit=490
              ->  Gather Merge  (cost=1000.46..10583.50 rows=1640 width=8) (actual time=14.007..54.636 rows=248 loops=1)
                    Output: author, (PARTIAL max(created))
                    Workers Planned: 2
                    Workers Launched: 2
                    Buffers: shared hit=490
                    ->  Partial GroupAggregate  (cost=0.43..9394.18 rows=820 width=8) (actual time=3.439..34.733 rows=83 loops=3)
                          Output: author, PARTIAL max(created)
                          Group Key: message.author
                          Buffers: shared hit=2989
                          Worker 0: actual time=0.297..49.593 rows=116 loops=1
                            Buffers: shared hit=1550
                          Worker 1: actual time=6.624..40.612 rows=60 loops=1
                            Buffers: shared hit=949
                          ->  Parallel Index Only Scan using message_room_author_created on public.message  (cost=0.43..8904.09 rows=96377 width=8) (actual time=0.030..20.067 rows=73907 loops=3)
                                Output: author, created
                                Index Cond: (message.room = 12)
                                Heap Fetches: 139
                                Buffers: shared hit=2989
                                Worker 0: actual time=0.035..28.355 rows=109834 loops=1
                                  Buffers: shared hit=1550
                                Worker 1: actual time=0.030..23.723 rows=79112 loops=1
                                  Buffers: shared hit=949
Planning time: 0.211 ms
Execution time: 57.071 ms

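I'd like to know how to make it faster. The goal that really matters is getting the N most recent authors. Is there a faster way to query this information?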

select count(author) from message where room=12;
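Because author cannot be null, this can simply be SELECT count(*). Doing so may result in an index-only scan on message_room_created or message_room_id. If you need distinct authors, which this query does not give you, you have to use count(DISTINCT author).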
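The three variants mentioned above can be written side by side. This is only a sketch using the table and column names from the question; none of it has been timed against the asker's data, and which index the planner picks is an assumption that EXPLAIN would have to confirm.

```sql
-- count(author) ignores NULLs, but author is declared NOT NULL,
-- so it returns the same number as count(*).
SELECT count(author) FROM message WHERE room = 12;

-- count(*) needs no column beyond the filter, so any index that
-- leads with room (e.g. message_room_created or message_room_id)
-- is a candidate for an index-only scan.
SELECT count(*) FROM message WHERE room = 12;

-- Counts each author once, which the two queries above do not do.
SELECT count(DISTINCT author) FROM message WHERE room = 12;
```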
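This is also planned as a parallel query, which is relatively new functionality. You may want to try SET max_parallel_workers_per_gather = 0 to disable parallel query and repost the results.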
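A minimal sketch of that experiment, assuming it is run in the same psql session as the original query: the setting is changed for the session only, the query from the question is explained again, and the setting is then reset.

```sql
-- Disable parallel workers for this session only.
SET max_parallel_workers_per_gather = 0;

-- Same query as in the question, so the two plans are comparable.
EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT message.author AS id, max(message.created) AS mc
FROM message
WHERE room = 12
GROUP BY message.author
ORDER BY mc DESC
LIMIT 50;

-- Return to the configured default afterwards.
RESET max_parallel_workers_per_gather;
```

With parallelism off, the Gather Merge and Partial/Finalize nodes should disappear from the plan, which makes it easier to see how much of the runtime is the index scan itself.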
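Either way, regarding this remark:
"This one is slow (same duration, about 45 ms)"
If you have index scans and heap fetches over millions of rows, it is not fair to call 45 ms "slow".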
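For the stated goal of the N most recent authors, one alternative that is not part of the answer above is to walk the room's messages from newest to oldest and stop once N distinct authors have been seen, instead of aggregating every row in the room. The sketch below does this with a recursive CTE; the CTE name latest and the array column seen are made up for illustration, N is fixed at 50, and whether it actually beats the GROUP BY plan depends on the data and has not been measured here.

```sql
WITH RECURSIVE latest AS (
    -- Start with the newest message in the room.
    (SELECT author, created, ARRAY[author] AS seen
     FROM message
     WHERE room = 12
     ORDER BY created DESC
     LIMIT 1)
    UNION ALL
    -- Each step finds the newest message by an author not seen yet.
    SELECT m.author, m.created, l.seen || m.author
    FROM latest l
    CROSS JOIN LATERAL (
        SELECT author, created
        FROM message
        WHERE room = 12
          AND created <= l.created
          AND author <> ALL (l.seen)
        ORDER BY created DESC
        LIMIT 1
    ) m
    -- Stop after 50 distinct authors.
    WHERE cardinality(l.seen) < 50
)
SELECT author AS id, created AS mc
FROM latest
ORDER BY mc DESC;
```

Each step is meant to be a short backward scan of message_room_created (room, created), so the work is roughly proportional to the number of messages between the newest one and the 50th author's latest message. If a handful of authors dominate a very active room, many rows are skipped per step and the benefit shrinks.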

Source: https://dba.stackexchange.com/questions/190296
