Retrieving data from a PostgreSQL database with millions of rows takes a very long time
I have been building a system where people can register as users, create book clubs, and invite others (members) to join. Both users and members can add books to a club, and can vote on books added by other members. I recently inserted a large amount of data to check whether the database performs well, and found that fetching the data I want takes far too long. I want to fetch all books in a club, including their votes and the names of the members who cast those votes.
My database diagram (created with dbdiagram.io):
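Since the diagram may not render here, below is a hypothetical sketch of the relevant tables, inferred from the query and plans further down; all column types and constraints are assumptions, not the actual schema:

```sql
-- Hypothetical reconstruction of the schema; inferred from the question,
-- not taken from the actual database. Types are assumptions.
CREATE TABLE clubs   (id bigserial PRIMARY KEY);
CREATE TABLE members (id bigserial PRIMARY KEY, name text NOT NULL);
CREATE TABLE books (
    id       bigserial PRIMARY KEY,
    club_id  bigint    NOT NULL REFERENCES clubs (id),
    isbn     text,
    state    smallint  NOT NULL,
    fallback boolean   -- appears in the ORDER BY below; actual type unknown
);
CREATE TABLE votes (
    book_id   bigint NOT NULL REFERENCES books (id),
    member_id bigint NOT NULL REFERENCES members (id)
);
```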
To query the database conveniently I decided to use Hasura, an open-source service that creates a GraphQL backend just by inspecting your data structure (I am using PostgreSQL). I use the following query to fetch the data I want:
```graphql
query GetBooksOfClubIncludingVotesAndMemberName {
  books(
    where: { club_id: { _eq: "3" }, state: { _eq: 0 } },
    order_by: [{ fallback: asc }, { id: asc }]
  ) {
    id
    isbn
    state
    votes {
      member {
        id
        name
      }
    }
  }
}
```
This query is, of course, translated into a SQL statement:
```sql
SELECT
  coalesce(
    json_agg(
      "root"
      ORDER BY "root.pg.fallback" ASC NULLS LAST, "root.pg.id" ASC NULLS LAST
    ),
    '[]'
  ) AS "root"
FROM
  (
    SELECT
      row_to_json(
        (
          SELECT "_8_e"
          FROM (
            SELECT
              "_0_root.base"."id" AS "id",
              "_0_root.base"."isbn" AS "isbn",
              "_7_root.ar.root.votes"."votes" AS "votes"
          ) AS "_8_e"
        )
      ) AS "root",
      "_0_root.base"."id" AS "root.pg.id",
      "_0_root.base"."fallback" AS "root.pg.fallback"
    FROM
      (
        SELECT *
        FROM "public"."books"
        WHERE (
          (("public"."books"."club_id") = (('3') :: bigint))
          AND (("public"."books"."state") = (('0') :: smallint))
        )
      ) AS "_0_root.base"
      LEFT OUTER JOIN LATERAL (
        SELECT coalesce(json_agg("votes"), '[]') AS "votes"
        FROM (
          SELECT
            row_to_json(
              (
                SELECT "_5_e"
                FROM (
                  SELECT "_4_root.ar.root.votes.or.member"."member" AS "member"
                ) AS "_5_e"
              )
            ) AS "votes"
          FROM
            (
              SELECT *
              FROM "public"."votes"
              WHERE (("_0_root.base"."id") = ("book_id"))
            ) AS "_1_root.ar.root.votes.base"
            LEFT OUTER JOIN LATERAL (
              SELECT
                row_to_json(
                  (
                    SELECT "_3_e"
                    FROM (
                      SELECT
                        "_2_root.ar.root.votes.or.member.base"."id" AS "id",
                        "_2_root.ar.root.votes.or.member.base"."name" AS "name"
                    ) AS "_3_e"
                  )
                ) AS "member"
              FROM (
                SELECT *
                FROM "public"."members"
                WHERE (
                  ("_1_root.ar.root.votes.base"."member_id") = ("id")
                )
              ) AS "_2_root.ar.root.votes.or.member.base"
            ) AS "_4_root.ar.root.votes.or.member" ON ('true')
        ) AS "_6_root.ar.root.votes"
      ) AS "_7_root.ar.root.votes" ON ('true')
    ORDER BY "root.pg.fallback" ASC NULLS LAST, "root.pg.id" ASC NULLS LAST
  ) AS "_9_root";
```
Running the statement with `EXPLAIN ANALYZE` prepended tells me it takes about 9217 ms to complete; see the analyze output below:

```
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate  (cost=12057321.11..12057321.15 rows=1 width=32) (actual time=9151.967..9151.967 rows=1 loops=1)
  ->  Sort  (cost=12057312.92..12057313.38 rows=182 width=37) (actual time=9151.856..9151.865 rows=180 loops=1)
        Sort Key: books.fallback, books.id
        Sort Method: quicksort  Memory: 72kB
        ->  Nested Loop Left Join  (cost=66041.02..12057306.09 rows=182 width=37) (actual time=301.721..9151.490 rows=180 loops=1)
              ->  Index Scan using book_club on books  (cost=0.43..37888.11 rows=182 width=42) (actual time=249.506..304.469 rows=180 loops=1)
                    Index Cond: (club_id = '3'::bigint)
                    Filter: (state = '0'::smallint)
              ->  Aggregate  (cost=66040.60..66040.64 rows=1 width=32) (actual time=49.134..49.134 rows=1 loops=180)
                    ->  Nested Loop Left Join  (cost=0.72..66040.46 rows=3 width=32) (actual time=0.037..49.124 rows=3 loops=180)
                          ->  Index Only Scan using member_book on votes  (cost=0.43..66021.32 rows=3 width=8) (actual time=0.024..49.104 rows=3 loops=180)
                                Index Cond: (book_id = books.id)
                                Heap Fetches: 540
                          ->  Index Scan using members_pkey on members  (cost=0.29..6.38 rows=1 width=36) (actual time=0.005..0.005 rows=1 loops=540)
                                Index Cond: (id = votes.member_id)
                          SubPlan 2
                            ->  Result  (cost=0.00..0.04 rows=1 width=32) (actual time=0.000..0.000 rows=1 loops=540)
                          SubPlan 3
                            ->  Result  (cost=0.00..0.04 rows=1 width=32) (actual time=0.000..0.000 rows=1 loops=540)
              SubPlan 1
                ->  Result  (cost=0.00..0.04 rows=1 width=32) (actual time=0.001..0.002 rows=1 loops=180)
Planning Time: 0.788 ms
JIT:
  Functions: 32
  Options: Inlining true, Optimization true, Expressions true, Deforming true
  Timing: Generation 4.614 ms, Inlining 52.818 ms, Optimization 113.442 ms, Emission 81.939 ms, Total 252.813 ms
Execution Time: 9217.899 ms
(27 rows)
```
The table sizes are:

```
 relname | rowcount
---------+----------
 books   |  1153800
 members |    19230
 votes   |  3461400
 clubs   |     6410
 users   |        3
```
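(For reference, row counts in this shape can be read from the statistics collector; a hedged guess at the kind of query used, since the question does not show it:)

```sql
-- Approximate live-row count per user table, from PostgreSQL's
-- statistics collector. This is an assumption about how the
-- counts above were gathered, not taken from the question.
SELECT relname, n_live_tup AS rowcount
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;
```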
This takes far too long. In my previous design I had no indexes at all, which made it even slower. I have since added indexes, but I am still not happy about having to wait this long. Is there anything I can improve, in the data structure or elsewhere?
**Edit:** The same select statement, but now using `EXPLAIN (ANALYZE, BUFFERS)` as suggested:

```
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate  (cost=12057321.11..12057321.15 rows=1 width=32) (actual time=8896.202..8896.202 rows=1 loops=1)
  Buffers: shared hit=2392279 read=9470
  ->  Sort  (cost=12057312.92..12057313.38 rows=182 width=37) (actual time=8896.097..8896.106 rows=180 loops=1)
        Sort Key: books.fallback, books.id
        Sort Method: quicksort  Memory: 72kB
        Buffers: shared hit=2392279 read=9470
        ->  Nested Loop Left Join  (cost=66041.02..12057306.09 rows=182 width=37) (actual time=222.978..8895.801 rows=180 loops=1)
              Buffers: shared hit=2392279 read=9470
              ->  Index Scan using book_club on books  (cost=0.43..37888.11 rows=182 width=42) (actual time=174.471..214.000 rows=180 loops=1)
                    Index Cond: (club_id = '3'::bigint)
                    Filter: (state = '0'::smallint)
                    Buffers: shared hit=113 read=9470
              ->  Aggregate  (cost=66040.60..66040.64 rows=1 width=32) (actual time=48.211..48.211 rows=1 loops=180)
                    Buffers: shared hit=2392166
                    ->  Nested Loop Left Join  (cost=0.72..66040.46 rows=3 width=32) (actual time=0.028..48.202 rows=3 loops=180)
                          Buffers: shared hit=2392166
                          ->  Index Only Scan using member_book on votes  (cost=0.43..66021.32 rows=3 width=8) (actual time=0.018..48.187 rows=3 loops=180)
                                Index Cond: (book_id = books.id)
                                Heap Fetches: 540
                                Buffers: shared hit=2390546
                          ->  Index Scan using members_pkey on members  (cost=0.29..6.38 rows=1 width=36) (actual time=0.004..0.004 rows=1 loops=540)
                                Index Cond: (id = votes.member_id)
                                Buffers: shared hit=1620
                          SubPlan 2
                            ->  Result  (cost=0.00..0.04 rows=1 width=32) (actual time=0.000..0.000 rows=1 loops=540)
                          SubPlan 3
                            ->  Result  (cost=0.00..0.04 rows=1 width=32) (actual time=0.000..0.000 rows=1 loops=540)
              SubPlan 1
                ->  Result  (cost=0.00..0.04 rows=1 width=32) (actual time=0.008..0.008 rows=1 loops=180)
Planning Time: 0.400 ms
JIT:
  Functions: 32
  Options: Inlining true, Optimization true, Expressions true, Deforming true
  Timing: Generation 2.060 ms, Inlining 9.923 ms, Optimization 94.927 ms, Emission 68.793 ms, Total 175.702 ms
Execution Time: 8898.360 ms
(35 rows)
```
**Edit 2:** As suggested in the answer, I ran

```sql
select * from pg_prepared_xacts;
select * from pg_stat_activity;
```

The first statement shows no rows, and in the second I do not notice any old `xact_start` times. This check was done after running `VACUUM FULL votes` earlier (yesterday); unfortunately, running `VACUUM FULL votes` did not solve the problem. Output of the statements:
```
booky=# select * from pg_prepared_xacts;
 transaction | gid | prepared | owner | database
-------------+-----+----------+-------+----------
(0 rows)

booky=# select * from pg_stat_activity;
 datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+---------+-----+----------+---------+------------------+-------------+-----------------+-------------+---------------+------------+-------------+--------------+-----------------+------------+-------+-------------+--------------+-------+--------------
        |       | 31 |    |          |      |  |  |    | 2020-04-05 08:41:47.959657+00 |  |  |  | Activity | AutoVacuumMain      |        |  |          |                                 | autovacuum launcher
        |       | 33 | 10 | postgres |      |  |  |    | 2020-04-05 08:41:47.959964+00 |  |  |  | Activity | LogicalLauncherMain |        |  |          |                                 | logical replication launcher
 169575 | booky | 48 | 10 | postgres | psql |  |  | -1 | 2020-04-05 10:05:20.847798+00 | 2020-04-05 10:07:47.534381+00 | 2020-04-05 10:07:47.534381+00 | 2020-04-05 10:07:47.534382+00 |  |  | active |  | 15265333 | select * from pg_stat_activity; | client backend
        |       | 29 |    |          |      |  |  |    | 2020-04-05 08:41:47.959346+00 |  |  |  | Activity | BgWriterHibernate   |        |  |          |                                 | background writer
        |       | 28 |    |          |      |  |  |    | 2020-04-05 08:41:47.959688+00 |  |  |  | Activity | CheckpointerMain    |        |  |          |                                 | checkpointer
        |       | 30 |    |          |      |  |  |    | 2020-04-05 08:41:47.959501+00 |  |  |  | Activity | WalWriterMain       |        |  |          |                                 | walwriter
(6 rows)
```
Thanks to @Lennart I added an index that seems to have solved the problem: execution time went from roughly 8,900 ms all the way down to 35 ms. Amazing!

The index to create:
```sql
CREATE INDEX IX1_VOTES ON VOTES (book_id, member_id)
```
```
->  Index Only Scan using member_book on votes  (cost=0.43..66021.32 rows=3 width=8) (actual time=0.024..49.104 rows=3 loops=180)
      Index Cond: (book_id = books.id)
      Heap Fetches: 540
```
49.104 * 180 = 8839 ms, which is essentially all of your time. Most of it is probably spent on IO, reading random pages from the table (if you turn on `track_io_timing` and then run `EXPLAIN (ANALYZE, BUFFERS)`, we would have a definitive answer). If you vacuum "votes" and thereby get rid of the heap fetches, it will almost certainly fix the problem.
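A minimal sketch of that diagnosis, assuming sufficient privileges on the database (the query itself is elided):

```sql
-- Enable per-session I/O timing so EXPLAIN reports time spent in reads
SET track_io_timing = on;

EXPLAIN (ANALYZE, BUFFERS)
SELECT ... ;  -- the generated query from above

-- Vacuum "votes" so its visibility map is updated and the
-- index-only scan no longer needs heap fetches
VACUUM ANALYZE votes;
```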
```
->  Index Only Scan using member_book on votes  (cost=0.43..66021.32 rows=3 width=8) (actual time=0.018..48.187 rows=3 loops=180)
      Index Cond: (book_id = books.id)
      Heap Fetches: 540
      Buffers: shared hit=2390546
```
If this was captured after the VACUUM finished, then you probably have some long-running open transaction that is preventing VACUUM from doing its job effectively. Also, hitting 2,390,546 buffers to fetch 540 rows seems extraordinary; again, that is probably due to severe bloat in the index and/or table caused by some long-open transaction.
Does `select * from pg_prepared_xacts;` show any rows? Does `select * from pg_stat_activity;` show any old `xact_start` times? If neither, can you run `VACUUM FULL votes` and see whether that fixes the problem?