Postgresql

從具有數百萬行的 PostgresQL 數據庫中檢索數據需要很長時間

  • April 11, 2020

我一直在開發一個使用者可以註冊為使用者、創建讀書俱樂部並邀請其他人(成員)加入的系統。使用者和會員都可以向俱樂部添加圖書,也可以為其他會員添加的圖書投票。我最近嘗試添加很多數據來檢查數據庫是否執行良好,之後我發現實際獲取我喜歡的數據需要花費很多時間。我想獲取俱樂部中的所有書籍,包括他們的選票和投票支持該俱樂部的成員的姓名。

我的數據庫圖(通過 dbdiagram.io 創建,查看

數據庫圖 為了方便地查詢數據庫,我決定使用Hasura,這是一個開源服務,只需查看資料結構就可以創建 GraphQL 後端(我使用的是 PostgresQL)。我使用以下查詢來獲取我想要的數據:

query GetBooksOfClubIncludingVotesAndMemberName {
 books(
   where: {
     club_id: {_eq: "3"}, 
     state:{_eq: 0 }
   }, 
   order_by: [
     { fallback : asc },
     { id: asc }
   ]
 ) {
   id
   isbn
   state
   votes {
     member {
       id
       name
     }
   }
 }    
}

該查詢當然會轉換為 SQL 語句

SELECT
 coalesce(
   json_agg(
     "root"
     ORDER BY
       "root.pg.fallback" ASC NULLS LAST,
       "root.pg.id" ASC NULLS LAST
   ),
   '[]'
 ) AS "root"
FROM
 (
   SELECT
     row_to_json(
       (
         SELECT
           "_8_e"
         FROM
           (
             SELECT
               "_0_root.base"."id" AS "id",
               "_0_root.base"."isbn" AS "isbn",
               "_7_root.ar.root.votes"."votes" AS "votes"
           ) AS "_8_e"
       )
     ) AS "root",
     "_0_root.base"."id" AS "root.pg.id",
     "_0_root.base"."fallback" AS "root.pg.fallback"
   FROM
     (
       SELECT
         *
       FROM
         "public"."books"
       WHERE
         (
           (("public"."books"."club_id") = (('3') :: bigint))
           AND (("public"."books"."state") = (('0') :: smallint))
         )
     ) AS "_0_root.base"
     LEFT OUTER JOIN LATERAL (
       SELECT
         coalesce(json_agg("votes"), '[]') AS "votes"
       FROM
         (
           SELECT
             row_to_json(
               (
                 SELECT
                   "_5_e"
                 FROM
                   (
                     SELECT
                       "_4_root.ar.root.votes.or.member"."member" AS "member"
                   ) AS "_5_e"
               )
             ) AS "votes"
           FROM
             (
               SELECT
                 *
               FROM
                 "public"."votes"
               WHERE
                 (("_0_root.base"."id") = ("book_id"))
             ) AS "_1_root.ar.root.votes.base"
             LEFT OUTER JOIN LATERAL (
               SELECT
                 row_to_json(
                   (
                     SELECT
                       "_3_e"
                     FROM
                       (
                         SELECT
                           "_2_root.ar.root.votes.or.member.base"."id" AS "id",
                           "_2_root.ar.root.votes.or.member.base"."name" AS "name"
                       ) AS "_3_e"
                   )
                 ) AS "member"
               FROM
                 (
                   SELECT
                     *
                   FROM
                     "public"."members"
                   WHERE
                     (
                       ("_1_root.ar.root.votes.base"."member_id") = ("id")
                     )
                 ) AS "_2_root.ar.root.votes.or.member.base"
             ) AS "_4_root.ar.root.votes.or.member" ON ('true')
         ) AS "_6_root.ar.root.votes"
     ) AS "_7_root.ar.root.votes" ON ('true')
   ORDER BY
     "root.pg.fallback" ASC NULLS LAST,
     "root.pg.id" ASC NULLS LAST
 ) AS "_9_root";

使用EXPLAIN ANALYZE前面的語句執行此語句時,它告訴我大約需要 9217 毫秒才能完成,請檢查下面的分析響應

                                                                        QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate  (cost=12057321.11..12057321.15 rows=1 width=32) (actual time=9151.967..9151.967 rows=1 loops=1)
  ->  Sort  (cost=12057312.92..12057313.38 rows=182 width=37) (actual time=9151.856..9151.865 rows=180 loops=1)
        Sort Key: books.fallback, books.id
        Sort Method: quicksort  Memory: 72kB
        ->  Nested Loop Left Join  (cost=66041.02..12057306.09 rows=182 width=37) (actual time=301.721..9151.490 rows=180 loops=1)
              ->  Index Scan using book_club on books  (cost=0.43..37888.11 rows=182 width=42) (actual time=249.506..304.469 rows=180 loops=1)
                    Index Cond: (club_id = '3'::bigint)
                    Filter: (state = '0'::smallint)
              ->  Aggregate  (cost=66040.60..66040.64 rows=1 width=32) (actual time=49.134..49.134 rows=1 loops=180)
                    ->  Nested Loop Left Join  (cost=0.72..66040.46 rows=3 width=32) (actual time=0.037..49.124 rows=3 loops=180)
                          ->  Index Only Scan using member_book on votes  (cost=0.43..66021.32 rows=3 width=8) (actual time=0.024..49.104 rows=3 loops=180)
                                Index Cond: (book_id = books.id)
                                Heap Fetches: 540
                          ->  Index Scan using members_pkey on members  (cost=0.29..6.38 rows=1 width=36) (actual time=0.005..0.005 rows=1 loops=540)
                                Index Cond: (id = votes.member_id)
                                SubPlan 2
                                  ->  Result  (cost=0.00..0.04 rows=1 width=32) (actual time=0.000..0.000 rows=1 loops=540)
                    SubPlan 3
                      ->  Result  (cost=0.00..0.04 rows=1 width=32) (actual time=0.000..0.000 rows=1 loops=540)
              SubPlan 1
                ->  Result  (cost=0.00..0.04 rows=1 width=32) (actual time=0.001..0.002 rows=1 loops=180)
Planning Time: 0.788 ms
JIT:
  Functions: 32
  Options: Inlining true, Optimization true, Expressions true, Deforming true
  Timing: Generation 4.614 ms, Inlining 52.818 ms, Optimization 113.442 ms, Emission 81.939 ms, Total 252.813 ms
Execution Time: 9217.899 ms
(27 rows)

表大小為:

  relname    | rowcount
--------------+----------
books        |  1153800
members      |    19230
votes        |  3461400
clubs        |     6410
users        |        3

這需要的時間太長了。在我之前的設計中,我沒有任何索引,這導致它更慢。我已經添加了索引,但我仍然對我必須等待那麼久的事實不太滿意。關於資料結構或任何東西,我有什麼可以改進的嗎?

編輯 相同的選擇語句,但現在EXPLAIN (ANALYZE, BUFFERS)按照建議使用:

                                                                        QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate  (cost=12057321.11..12057321.15 rows=1 width=32) (actual time=8896.202..8896.202 rows=1 loops=1)
  Buffers: shared hit=2392279 read=9470
  ->  Sort  (cost=12057312.92..12057313.38 rows=182 width=37) (actual time=8896.097..8896.106 rows=180 loops=1)
        Sort Key: books.fallback, books.id
        Sort Method: quicksort  Memory: 72kB
        Buffers: shared hit=2392279 read=9470
        ->  Nested Loop Left Join  (cost=66041.02..12057306.09 rows=182 width=37) (actual time=222.978..8895.801 rows=180 loops=1)
              Buffers: shared hit=2392279 read=9470
              ->  Index Scan using book_club on books  (cost=0.43..37888.11 rows=182 width=42) (actual time=174.471..214.000 rows=180 loops=1)
                    Index Cond: (club_id = '3'::bigint)
                    Filter: (state = '0'::smallint)
                    Buffers: shared hit=113 read=9470
              ->  Aggregate  (cost=66040.60..66040.64 rows=1 width=32) (actual time=48.211..48.211 rows=1 loops=180)
                    Buffers: shared hit=2392166
                    ->  Nested Loop Left Join  (cost=0.72..66040.46 rows=3 width=32) (actual time=0.028..48.202 rows=3 loops=180)
                          Buffers: shared hit=2392166
                          ->  Index Only Scan using member_book on votes  (cost=0.43..66021.32 rows=3 width=8) (actual time=0.018..48.187 rows=3 loops=180)
                                Index Cond: (book_id = books.id)
                                Heap Fetches: 540
                                Buffers: shared hit=2390546
                          ->  Index Scan using members_pkey on members  (cost=0.29..6.38 rows=1 width=36) (actual time=0.004..0.004 rows=1 loops=540)
                                Index Cond: (id = votes.member_id)
                                Buffers: shared hit=1620
                                SubPlan 2
                                  ->  Result  (cost=0.00..0.04 rows=1 width=32) (actual time=0.000..0.000 rows=1 loops=540)
                    SubPlan 3
                      ->  Result  (cost=0.00..0.04 rows=1 width=32) (actual time=0.000..0.000 rows=1 loops=540)
              SubPlan 1
                ->  Result  (cost=0.00..0.04 rows=1 width=32) (actual time=0.008..0.008 rows=1 loops=180)
Planning Time: 0.400 ms
JIT:
  Functions: 32
  Options: Inlining true, Optimization true, Expressions true, Deforming true
  Timing: Generation 2.060 ms, Inlining 9.923 ms, Optimization 94.927 ms, Emission 68.793 ms, Total 175.702 ms
Execution Time: 8898.360 ms
(35 rows)

**編輯 2:**按照答案中的建議 使用select * from pg_prepared_xacts;和。select * from pg_stat_activity;第一個語句沒有顯示任何行,第二個語句我沒有註意到任何舊的 xact_start 時間,這是在之前(昨天)執行之後完成的VACUUM FULL votes。不幸的是,跑步VACUUM FULL votes並不能解決問題。

語句的輸出:

booky=# select * from pg_prepared_xacts;
transaction | gid | prepared | owner | database
-------------+-----+----------+-------+----------
(0 rows)

booky=# select * from pg_stat_activity;
datid  | datname | pid | usesysid | usename  | application_name | client_addr | client_hostname | client_port |         backend_start         |          xact_start           |          query_start          |         state_change          | wait_event_type |     wait_event      | state  | backend_xid | backend_xmin |              query              |         backend_type
--------+---------+-----+----------+----------+------------------+-------------+-----------------+-------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------+-----------------+---------------------+--------+-------------+--------------+---------------------------------+------------------------------
       |         |  31 |          |          |                  |             |                 |             | 2020-04-05 08:41:47.959657+00 |                               |                               |                               | Activity        | AutoVacuumMain      |        |             |              |                                 | autovacuum launcher
       |         |  33 |       10 | postgres |                  |             |                 |             | 2020-04-05 08:41:47.959964+00 |                               |                               |                               | Activity        | LogicalLauncherMain |        |             |              |                                 | logical replication launcher
169575 | booky   |  48 |       10 | postgres | psql             |             |                 |          -1 | 2020-04-05 10:05:20.847798+00 | 2020-04-05 10:07:47.534381+00 | 2020-04-05 10:07:47.534381+00 | 2020-04-05 10:07:47.534382+00 |                 |                     | active |             |     15265333 | select * from pg_stat_activity; | client backend
       |         |  29 |          |          |                  |             |                 |             | 2020-04-05 08:41:47.959346+00 |                               |                               |                               | Activity        | BgWriterHibernate   |        |             |              |                                 | background writer
       |         |  28 |          |          |                  |             |                 |             | 2020-04-05 08:41:47.959688+00 |                               |                               |                               | Activity        | CheckpointerMain    |        |             |              |                                 | checkpointer
       |         |  30 |          |          |                  |             |                 |             | 2020-04-05 08:41:47.959501+00 |                               |                               |                               | Activity        | WalWriterMain       |        |             |              |                                 | walwriter
(6 rows)

感謝@Lennart,我添加了一個似乎已經解決了這個問題的索引。它從大約 8900 毫秒一直到 35 毫秒,太棒了

要創建的索引:

CREATE INDEX IX1_VOTES ON VOTES (book_id, member_id)
 ->  Index Only Scan using member_book on votes  (cost=0.43..66021.32 rows=3 width=8) (actual time=0.024..49.104 rows=3 loops=180)
        Index Cond: (book_id = books.id)
        Heap Fetches: 540

49.104 * 180 = 8839,這基本上是你所有的時間。很可能大部分時間都是去 IO 從表中讀取隨機頁面(如果你打開track_io_timings然後EXPLAIN (ANALYZE, BUFFERS)我們會有一個明確的答案)。

如果你真空“投票”,從而擺脫堆提取,它幾乎肯定會解決問題。

  ->  Index Only Scan using member_book on votes  (cost=0.43..66021.32 rows=3 width=8) (actual time=0.018..48.187 rows=3 loops=180)
          Index Cond: (book_id = books.id)
          Heap Fetches: 540
          Buffers: shared hit=2390546

如果這是在 VACUUM 完成後完成的,那麼您可能有某種長期執行的事務處於打開狀態,這會阻止 VACUUM 有效地完成其工作。此外,達到 2,390,546 個緩衝區來獲得 540 行似乎非常奇怪。同樣,這可能是由於一些長時間打開的事務導致索引和/或表中的大量膨脹。

是否select * from pg_prepared_xacts;顯示任何行?是否select * from pg_stat_activity顯示任何舊時代xact_start?如果這兩個都不是,那麼你能做一個VACUUM FULL votes看看是否能解決問題嗎?

引用自:https://dba.stackexchange.com/questions/264321