Slow Postgres query performance with count/sort
I inherited a Django/Postgres application from another developer and found that many of its queries were slow. I have spent a lot of time properly organizing/normalizing data that previously wasn't, which helped a great deal, but some queries still take minutes.
In particular, several queries summarize the data by showing image counts against particular criteria. A sample is shown below: top images by hashtag.
The query returns the hashtags with the highest image counts. The image-hashtag relationship is many-to-many through a join table. There are various other WHERE conditions that can be applied to this query, but this is the simplest example; at a minimum we filter by the created timestamp and the query_id column.
Postgres version:
PostgreSQL 9.6.11 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9), 64-bit
Hardware: AWS RDS Postgres db.t2.large, recently upgraded from db.t2.medium, which seemed to eliminate the recheck-condition step in many queries.
The query:
SELECT base_hashtag.*, COUNT(DISTINCT base_image.id)
FROM base_hashtag
JOIN base_imagehashtag ON base_hashtag.id = base_imagehashtag.hashtag_id
JOIN base_image ON base_imagehashtag.image_id = base_image.id
WHERE base_image.query_id = '566591d4-33a3-4a96-a2d9-99e7bd625c18'
  AND base_image.created > 1548979200
  AND base_image.created < 1551398400
  AND base_hashtag.id <> 1
GROUP BY base_hashtag.id
ORDER BY base_hashtag.count DESC
LIMIT 40;
Table definitions: below is a simplified subset of the table columns relevant to the sample query. There are a number of potentially redundant indexes, but I don't believe they affect SELECT performance and I intend to clear them out in due course.
-- Table: public.base_image

CREATE TABLE public.base_image (
    id integer NOT NULL DEFAULT nextval('base_image_id_seq'::regclass),
    image_url character varying(100000) COLLATE pg_catalog."default",
    username character varying(100000) COLLATE pg_catalog."default",
    created integer,
    location_id character varying(10000000) COLLATE pg_catalog."default",
    query_id uuid,
    CONSTRAINT base_image_pkey PRIMARY KEY (id),
    CONSTRAINT base_image_query_id_b2da2903_fk_base_query_pk_id FOREIGN KEY (query_id)
        REFERENCES public.base_query (pk_id) MATCH SIMPLE
        ON UPDATE NO ACTION ON DELETE NO ACTION
        DEFERRABLE INITIALLY DEFERRED
) WITH (OIDS = FALSE) TABLESPACE pg_default;

CREATE INDEX base_image_0bbeda9c ON public.base_image USING btree (query_id) TABLESPACE pg_default;
CREATE INDEX base_image_created_idx ON public.base_image USING btree (created) TABLESPACE pg_default;
CREATE INDEX base_image_created_imageid_queryid ON public.base_image USING btree (id, created, query_id) TABLESPACE pg_default;
CREATE INDEX base_image_id_idx ON public.base_image USING btree (id) TABLESPACE pg_default;
CREATE INDEX base_image_id_query_id_idx ON public.base_image USING btree (id, query_id) TABLESPACE pg_default;
CREATE UNIQUE INDEX base_image_query_id_id_idx ON public.base_image USING btree (query_id, id) TABLESPACE pg_default;
CREATE INDEX base_image_query_id_idx ON public.base_image USING btree (query_id) TABLESPACE pg_default;

-- Table: public.base_imagehashtag

CREATE TABLE public.base_imagehashtag (
    id integer NOT NULL DEFAULT nextval('base_imagehashtag_id_seq'::regclass),
    hashtag_id integer NOT NULL,
    image_id integer NOT NULL,
    CONSTRAINT base_imagehashtag_pkey PRIMARY KEY (id),
    CONSTRAINT base_imagehashtag_image_id_96133dcc_uniq UNIQUE (image_id, hashtag_id),
    CONSTRAINT base_imagehashtag_hashtag_id_0d819bb9_fk_base_hashtag_id FOREIGN KEY (hashtag_id)
        REFERENCES public.base_hashtag (id) MATCH SIMPLE
        ON UPDATE NO ACTION ON DELETE NO ACTION
        DEFERRABLE INITIALLY DEFERRED,
    CONSTRAINT base_imagehashtag_image_id_79e99aa4_fk_base_image_id FOREIGN KEY (image_id)
        REFERENCES public.base_image (id) MATCH SIMPLE
        ON UPDATE NO ACTION ON DELETE NO ACTION
        DEFERRABLE INITIALLY DEFERRED
) WITH (OIDS = FALSE) TABLESPACE pg_default;

CREATE INDEX base_imagehashtag_e4858d5c ON public.base_imagehashtag USING btree (hashtag_id) TABLESPACE pg_default;
CREATE INDEX base_imagehashtag_f33175e6 ON public.base_imagehashtag USING btree (image_id) TABLESPACE pg_default;
CREATE INDEX base_imagehashtag_imageid_hashtag_id ON public.base_imagehashtag USING btree (hashtag_id, image_id) TABLESPACE pg_default;

-- Table: public.base_hashtag

CREATE TABLE public.base_hashtag (
    id integer NOT NULL DEFAULT nextval('base_hashtag_id_seq'::regclass),
    name character varying(255) COLLATE pg_catalog."default" NOT NULL,
    CONSTRAINT base_hashtag_pkey PRIMARY KEY (id),
    CONSTRAINT base_hashtag_name_key UNIQUE (name)
) WITH (OIDS = FALSE) TABLESPACE pg_default;

CREATE INDEX base_hashtag_name_a8f89285_like ON public.base_hashtag USING btree (name COLLATE pg_catalog."default" varchar_pattern_ops) TABLESPACE pg_default;
Cardinality (approximate):
- base_image: 23M
- base_hashtag: 5M
- base_imagehashtag: 211M
Query plan:
Limit (cost=7895851.20..7895851.30 rows=40 width=36) (actual time=188165.607..188165.641 rows=40 loops=1)
  Buffers: shared hit=137658 read=1129357, temp read=652059 written=652045
  ->  Sort (cost=7895851.20..7904963.31 rows=3644846 width=36) (actual time=188165.605..188165.618 rows=40 loops=1)
        Sort Key: (count(base_hashtag.*)) DESC
        Sort Method: top-N heapsort  Memory: 28kB
        Buffers: shared hit=137658 read=1129357, temp read=652059 written=652045
        ->  GroupAggregate (cost=7434552.18..7780638.93 rows=3644846 width=36) (actual time=178023.985..188051.536 rows=290908 loops=1)
              Group Key: base_hashtag.id
              Buffers: shared hit=137658 read=1129357, temp read=652059 written=652045
              ->  Merge Join (cost=7434552.18..7716854.12 rows=3644846 width=68) (actual time=178023.656..186172.736 rows=3812014 loops=1)
                    Merge Cond: (base_hashtag.id = base_imagehashtag.hashtag_id)
                    Buffers: shared hit=137658 read=1129357, temp read=652059 written=652045
                    ->  Index Scan using base_hashtag_pkey on base_hashtag (cost=0.43..205295.87 rows=5894881 width=64) (actual time=0.014..1961.031 rows=4341714 loops=1)
                          Filter: (id <> 1)
                          Rows Removed by Filter: 1
                          Buffers: shared hit=39911
                    ->  Materialize (cost=7433940.57..7452164.81 rows=3644847 width=8) (actual time=177776.569..181299.695 rows=4055074 loops=1)
                          Buffers: shared hit=97747 read=1129357, temp read=652059 written=652045
                          ->  Sort (cost=7433940.57..7443052.69 rows=3644847 width=8) (actual time=177776.566..179584.317 rows=4055074 loops=1)
                                Sort Key: base_imagehashtag.hashtag_id
                                Sort Method: external merge  Disk: 71336kB
                                Buffers: shared hit=97747 read=1129357, temp read=652059 written=652045
                                ->  Hash Join (cost=1201261.72..6937033.15 rows=3644847 width=8) (actual time=46237.509..174816.562 rows=4055074 loops=1)
                                      Hash Cond: (base_imagehashtag.image_id = base_image.id)
                                      Buffers: shared hit=97747 read=1129357, temp read=631968 written=631954
                                      ->  Seq Scan on base_imagehashtag (cost=0.00..3256836.88 rows=211107488 width=8) (actual time=0.103..72452.659 rows=211056752 loops=1)
                                            Buffers: shared hit=16405 read=1129357
                                      ->  Hash (cost=1194732.58..1194732.58 rows=397931 width=4) (actual time=1284.293..1284.294 rows=261208 loops=1)
                                            Buckets: 131072  Batches: 8  Memory Usage: 2174kB
                                            Buffers: shared hit=81342, temp written=667
                                            ->  Bitmap Heap Scan on base_image (cost=180976.73..1194732.58 rows=397931 width=4) (actual time=950.553..1191.553 rows=261208 loops=1)
                                                  Recheck Cond: ((created > 1548979200) AND (created < 1551398400) AND (query_id = '566591d4-33a3-4a96-a2d9-99e7bd625c18'::uuid))
                                                  Rows Removed by Index Recheck: 179762
                                                  Heap Blocks: exact=8033 lossy=43452
                                                  Buffers: shared hit=81342
                                                  ->  BitmapAnd (cost=180976.73..180976.73 rows=397931 width=0) (actual time=948.947..948.947 rows=0 loops=1)
                                                        Buffers: shared hit=29857
                                                        ->  Bitmap Index Scan on base_image_created_idx (cost=0.00..32677.36 rows=1488892 width=0) (actual time=171.198..171.198 rows=1484511 loops=1)
                                                              Index Cond: ((created > 1548979200) AND (created < 1551398400))
                                                              Buffers: shared hit=5154
                                                        ->  Bitmap Index Scan on base_image_query_id_idx (cost=0.00..148100.16 rows=6159946 width=0) (actual time=760.218..760.219 rows=6274189 loops=1)
                                                              Index Cond: (query_id = '566591d4-33a3-4a96-a2d9-99e7bd625c18'::uuid)
                                                              Buffers: shared hit=24703
Planning time: 0.689 ms
Execution time: 188176.901 ms
As you can see, the query can take minutes to complete. We use caching at the application level, but users often submit unique queries, so that only helps a little. Many queries like this run concurrently to populate a dashboard, which puts a huge load on the database. I'm not expecting instant responses, but I am hoping the time can be brought down to a few seconds rather than minutes.
I appreciate that a large amount of data is involved in this query, since as I understand it, all the hashtag counts have to be computed before the results can be sorted to take the top values.
I know there is plenty of general information out there on tuning Postgres, but I'm hoping someone can spot an obvious inefficiency in this setup and point me in the right direction so I can start experimenting.
EDIT
Same query with set enable_seqscan=off:
Limit (cost=8386776.05..8386776.15 rows=40 width=36) (actual time=75981.675..75981.712 rows=40 loops=1)
  Buffers: shared hit=43238 read=414995, temp read=20088 written=20088
  ->  Sort (cost=8386776.05..8395626.09 rows=3540015 width=36) (actual time=75981.674..75981.687 rows=40 loops=1)
        Sort Key: (count(base_hashtag.*)) DESC
        Sort Method: top-N heapsort  Memory: 28kB
        Buffers: shared hit=43238 read=414995, temp read=20088 written=20088
        ->  GroupAggregate (cost=7941630.37..8274877.45 rows=3540015 width=36) (actual time=64678.453..75862.227 rows=290908 loops=1)
              Group Key: base_hashtag.id
              Buffers: shared hit=43238 read=414995, temp read=20088 written=20088
              ->  Merge Join (cost=7941630.37..8212927.18 rows=3540015 width=68) (actual time=64678.087..73943.416 rows=3812014 loops=1)
                    Merge Cond: (base_imagehashtag.hashtag_id = base_hashtag.id)
                    Buffers: shared hit=43238 read=414995, temp read=20088 written=20088
                    ->  Sort (cost=7939229.85..7948079.89 rows=3540015 width=8) (actual time=64541.756..66215.773 rows=4055074 loops=1)
                          Sort Key: base_imagehashtag.hashtag_id
                          Sort Method: external merge  Disk: 71344kB
                          Buffers: shared hit=35902 read=375789, temp read=20088 written=20088
                          ->  Merge Join (cost=34646.33..7457355.98 rows=3540015 width=8) (actual time=58023.450..61805.075 rows=4055074 loops=1)
                                Merge Cond: (base_image.id = base_imagehashtag.image_id)
                                Buffers: shared hit=35902 read=375789
                                ->  Index Only Scan using base_image_country_created_imageid_query_id on base_image (cost=0.56..268172.32 rows=384264 width=4) (actual time=217.707..832.100 rows=261208 loops=1)
                                      Index Cond: ((query_id = '566591d4-33a3-4a96-a2d9-99e7bd625c18'::uuid) AND (created > 1548979200) AND (created < 1551398400))
                                      Heap Fetches: 0
                                      Buffers: shared hit=35724 read=9321
                                ->  Index Only Scan using base_imagehashtag_image_id_96133dcc_uniq on base_imagehashtag (cost=0.57..6621360.73 rows=212655744 width=8) (actual time=3.366..34853.999 rows=113968400 loops=1)
                                      Heap Fetches: 0
                                      Buffers: shared hit=178 read=366468
                    ->  Index Scan using base_hashtag_pkey on base_hashtag (cost=0.43..205820.87 rows=5910767 width=64) (actual time=0.041..3822.781 rows=7862820 loops=1)
                          Filter: (id <> 1)
                          Rows Removed by Filter: 1
                          Buffers: shared hit=7336 read=39206
Planning time: 0.609 ms
Execution time: 75991.874 ms
EDIT 2
Same query with enable_mergejoin=off and enable_hashjoin=off:
Limit (cost=44144721.02..44144721.12 rows=40 width=36) (actual time=25487.908..25487.942 rows=40 loops=1)
  Buffers: shared hit=17297009 read=83, temp read=88580 written=88580
  ->  Sort (cost=44144721.02..44154290.01 rows=3827596 width=36) (actual time=25487.906..25487.915 rows=40 loops=1)
        Sort Key: (count(base_hashtag.*)) DESC
        Sort Method: top-N heapsort  Memory: 28kB
        Buffers: shared hit=17297009 read=83, temp read=88580 written=88580
        ->  GroupAggregate (cost=43947180.16..44023732.08 rows=3827596 width=36) (actual time=21791.076..25377.264 rows=290908 loops=1)
              Group Key: base_hashtag.id
              Buffers: shared hit=17297009 read=83, temp read=88580 written=88580
              ->  Sort (cost=43947180.16..43956749.15 rows=3827596 width=68) (actual time=21790.924..23315.027 rows=3812014 loops=1)
                    Sort Key: base_hashtag.id
                    Sort Method: external merge  Disk: 259024kB
                    Buffers: shared hit=17297009 read=83, temp read=88580 written=88580
                    ->  Nested Loop (cost=1.56..43214685.68 rows=3827596 width=68) (actual time=2.965..17525.982 rows=3812014 loops=1)
                          Buffers: shared hit=17297009 read=83
                          ->  Nested Loop (cost=1.13..15358809.52 rows=3827597 width=8) (actual time=2.508..3727.578 rows=4055074 loops=1)
                                Buffers: shared hit=1060857 read=42
                                ->  Index Only Scan using base_image_queryid_created_id on base_image (cost=0.56..17668.67 rows=417605 width=4) (actual time=0.018..85.571 rows=261208 loops=1)
                                      Index Cond: ((query_id = '566591d4-33a3-4a96-a2d9-99e7bd625c18'::uuid) AND (created > 1548979200) AND (created < 1551398400))
                                      Heap Fetches: 0
                                      Buffers: shared hit=4205
                                ->  Index Only Scan using base_imagehashtag_image_id_96133dcc_uniq on base_imagehashtag (cost=0.57..28.74 rows=800 width=8) (actual time=0.002..0.006 rows=16 loops=261208)
                                      Index Cond: (image_id = base_image.id)
                                      Heap Fetches: 0
                                      Buffers: shared hit=1056652 read=42
                          ->  Index Scan using base_hashtag_pkey on base_hashtag (cost=0.43..7.27 rows=1 width=64) (actual time=0.002..0.002 rows=1 loops=4055074)
                                Index Cond: (id = base_imagehashtag.hashtag_id)
                                Filter: (id <> 1)
                                Rows Removed by Filter: 0
                                Buffers: shared hit=16236152 read=41
Planning time: 0.507 ms
Execution time: 25527.201 ms
Since you are willing to use cached query results, one possibility is to create a materialized view that precomputes the values for all base_image.query_id in one pass, rather than computing each one on demand, and stores them. Computing all of them at once doesn't look like it should take much longer than computing just one. But this only works if the values being compared against base_image.created are stable, or at least follow a predictable pattern. So something like:
create materialized view foobar as
SELECT base_image.query_id, base_hashtag.*, COUNT(DISTINCT base_image.id) as dist_count
FROM base_hashtag
JOIN base_imagehashtag ON base_hashtag.id = base_imagehashtag.hashtag_id
JOIN base_image ON base_imagehashtag.image_id = base_image.id
WHERE base_image.created > 1548979200
  AND base_image.created < 1551398400
  AND base_hashtag.id <> 1
GROUP BY base_image.query_id, base_hashtag.id
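The dashboard could then select from the view by query_id. As a sketch of how that might look (the index names and refresh schedule here are illustrative, not prescriptive):

```sql
-- Serve the dashboard from the precomputed view instead of the base tables.
-- The unique index is also what allows REFRESH ... CONCURRENTLY below.
CREATE UNIQUE INDEX foobar_queryid_id_idx ON foobar (query_id, id);
CREATE INDEX foobar_queryid_count_idx ON foobar (query_id, dist_count DESC);

SELECT *
FROM foobar
WHERE query_id = '566591d4-33a3-4a96-a2d9-99e7bd625c18'
ORDER BY dist_count DESC
LIMIT 40;

-- Re-run on whatever schedule the "created" window follows; CONCURRENTLY
-- avoids blocking readers while the view is rebuilt.
REFRESH MATERIALIZED VIEW CONCURRENTLY foobar;
```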
Sticking with the query itself, it looks like your work_mem is much too low for this type of query. That is indicated by
Buckets: 131072  Batches: 8  Memory Usage: 2174kB
and
Heap Blocks: exact=8033 lossy=43452
If you can afford to increase work_mem to the point where the hash can be done in a single batch and there are no lossy blocks in the bitmap scan, that would help. (Although the bitmap scan is not part of the bottleneck to begin with, so it can't be a huge help.)
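You don't need to raise it globally; work_mem can be set just for the session that runs the report queries. The 256MB figure below is only a starting point to experiment with, not a recommendation:

```sql
-- Raise work_mem for this session only, aiming for a single hash batch
-- and an all-exact (no lossy blocks) bitmap heap scan.
SET work_mem = '256MB';
-- ... run the report query here ...
RESET work_mem;
```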
An index
on base_image (query_id, created, id)
could provide some help by avoiding the BitmapAnd and allowing an index-only scan, but it too can only help so much, because fetching the data from "base_image" is not the slowest step. I don't think you will see a major improvement without getting rid of the massive "Seq Scan on base_imagehashtag". I would like to see the EXPLAIN (ANALYZE, BUFFERS) of the query after you
set enable_seqscan=off
and then see what it chooses. Even if it ends up slower, seeing what it does may provide some clues. One other thing: upgrading from 9.6 to v11 would open up parallel query execution. If you are not running multiple queries at the same time, that won't improve efficiency, but it will improve latency. Although on a db.t2.large it may not help much: in my experience hyperthreading is useless in this situation, so 2 vCPUs probably just means 1 CPU, leaving nothing to parallelize with.
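For reference, the index suggested above would be created like this (the name is arbitrary):

```sql
-- Matches the WHERE clause on (query_id, created) and includes id, so
-- the planner can satisfy the base_image side with an index-only scan.
CREATE INDEX base_image_queryid_created_id
    ON public.base_image (query_id, created, id);
```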
Looking at the run with enable_seqscan=off, it switched from scanning the whole "base_imagehashtag" table to scanning one of its indexes. That is an improvement, but now scanning that entire index is the bottleneck instead, so it is not a huge one.
What I would like to see over that index is a nested loop, rather than a merge join against it. So could you show another
explain (analyze, buffers)
after set enable_mergejoin=off
? That will be better targeted at what I want to see than enable_seqscan was. A nested loop can show terrible performance when run with a cold cache but much better performance with a warm one, and that is what we want to find out. Since scanning the full index is still slow and still the bottleneck, I think a nested loop is the only option you have left before resorting to hardware upgrades or restructuring your data representation. If you just want production to use the 75-second plan rather than the 180-second plan, there are several routes to pursue. If your application/framework/abstraction layers offer the flexibility, you could simply set enable_seqscan=off right before issuing the query and reset it right after. Fixing the root problem would be nicer, but sometimes you have to pick your battles and take the easier approach.
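If you go that route, a transaction-scoped setting is safer than a session-wide one, because it resets itself on commit or rollback:

```sql
BEGIN;
-- SET LOCAL applies only inside this transaction, so there is no
-- cleanup to forget even if the application errors out mid-way.
SET LOCAL enable_seqscan = off;
-- ... run the dashboard query here ...
COMMIT;
```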
One suspicious line is this one:
-> Index Only Scan using base_imagehashtag_image_id_96133dcc_uniq on base_imagehashtag (cost=0.57..6621360.73 rows=212655744 width=8) (actual time=3.366..34853.999 rows=113968400 loops=1)
Why does it find half the number of rows it expected? If that estimate were accurate, I think the planner would have costed this plan favorably enough to use it without resorting to tricks. Are your statistics up to date (i.e., does VACUUM ANALYZE run often enough)? The only other thing I can think of here is that the scan of base_image_country_created_imageid_query_id for the given "query_id" and "created" values returns only base_image.id values from the bottom half of the full range of base_image.id, so the other side of the merge join can stop early. But if that is the case, I don't see how to get the planner to take advantage of it. Also, the actual performance will depend on the exact parameter values used in the query.
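One way to check whether the statistics are current is to look at pg_stat_user_tables and, if they look stale, analyze the tables by hand. A sketch:

```sql
-- When were these tables last analyzed, manually or by autovacuum?
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname IN ('base_image', 'base_imagehashtag', 'base_hashtag');

-- Refresh planner statistics manually if the timestamps look stale
ANALYZE base_image;
ANALYZE base_imagehashtag;
```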
I think another way to encourage a full index scan over the seq scan would be to lower "cpu_index_tuple_cost" relative to "cpu_tuple_cost". I have recently been thinking that the default setting of "cpu_index_tuple_cost" may simply be too high anyway. If you get the chance, how far do you have to lower "cpu_index_tuple_cost" before it switches plans?
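The experiment could look like this, again scoped to the session so nothing changes globally (0.001 is just a trial value to start bisecting from):

```sql
SHOW cpu_index_tuple_cost;          -- the default is 0.005
SET cpu_index_tuple_cost = 0.001;   -- trial value; adjust until the plan flips
EXPLAIN (ANALYZE, BUFFERS)
SELECT ...;                          -- the original hashtag-count query
RESET cpu_index_tuple_cost;
```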