PostgreSQL

Slow performance of a postgres query using count/sort

  • May 28, 2019

I've inherited a Django/postgres application from another developer and I'm finding that many of the queries are slow. I've spent a lot of time properly organising/normalising the data, which hadn't been done before, and that has helped a lot, but some queries still take minutes to run.

In particular, we have several queries that summarise the data by showing counts of images matching certain criteria; an example is shown below: top images by hashtag.

The query returns the hashtags with the greatest number of images. The image-hashtag relationship is a many-to-many relationship across a join table. There are various other WHERE conditions that can be applied to this query, but this is the simplest example: at a minimum we filter by the created timestamp and by the query_id field.

Postgres version: PostgreSQL 9.6.11 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9), 64-bit

Hardware: AWS RDS Postgres db.t2.large, recently upgraded from db.t2.medium, which seems to have eliminated the Recheck Cond step in many queries.

The query

SELECT
   base_hashtag.*,
   COUNT(DISTINCT base_image.id)
FROM
   base_hashtag
   JOIN base_imagehashtag ON base_hashtag.id = base_imagehashtag.hashtag_id
   JOIN base_image ON base_imagehashtag.image_id = base_image.id
WHERE
   base_image.query_id = '566591d4-33a3-4a96-a2d9-99e7bd625c18'
   AND base_image.created > 1548979200
   AND base_image.created < 1551398400
   AND base_hashtag.id <> 1
GROUP BY
   base_hashtag.id
ORDER BY
   base_hashtag.count DESC
LIMIT
   40
;

Table definitions: below is a simplified subset of the table columns relevant to the example query. There are a number of potentially redundant indexes; I don't believe they affect SELECT performance, and I intend to clear them out in due course.

-- Table: public.base_image

CREATE TABLE public.base_image
(
   id integer NOT NULL DEFAULT nextval('base_image_id_seq'::regclass),
   image_url character varying(100000) COLLATE pg_catalog."default",
   username character varying(100000) COLLATE pg_catalog."default",
   created integer,
   location_id character varying(10000000) COLLATE pg_catalog."default",
   query_id uuid,
   CONSTRAINT base_image_pkey PRIMARY KEY (id),
   CONSTRAINT base_image_query_id_b2da2903_fk_base_query_pk_id FOREIGN KEY (query_id)
       REFERENCES public.base_query (pk_id) MATCH SIMPLE
       ON UPDATE NO ACTION
       ON DELETE NO ACTION
       DEFERRABLE INITIALLY DEFERRED
)
WITH (
   OIDS = FALSE
)
TABLESPACE pg_default;

CREATE INDEX base_image_0bbeda9c
   ON public.base_image USING btree
   (query_id)
   TABLESPACE pg_default;

CREATE INDEX base_image_created_idx
   ON public.base_image USING btree
   (created)
   TABLESPACE pg_default;

CREATE INDEX base_image_created_imageid_queryid
   ON public.base_image USING btree
   (id, created, query_id)
   TABLESPACE pg_default;

CREATE INDEX base_image_id_idx
   ON public.base_image USING btree
   (id)
   TABLESPACE pg_default;

CREATE INDEX base_image_id_query_id_idx
   ON public.base_image USING btree
   (id, query_id)
   TABLESPACE pg_default;

CREATE UNIQUE INDEX base_image_query_id_id_idx
   ON public.base_image USING btree
   (query_id, id)
   TABLESPACE pg_default;

CREATE INDEX base_image_query_id_idx
   ON public.base_image USING btree
   (query_id)
   TABLESPACE pg_default;

-- Table: public.base_imagehashtag

CREATE TABLE public.base_imagehashtag
(
   id integer NOT NULL DEFAULT nextval('base_imagehashtag_id_seq'::regclass),
   hashtag_id integer NOT NULL,
   image_id integer NOT NULL,
   CONSTRAINT base_imagehashtag_pkey PRIMARY KEY (id),
   CONSTRAINT base_imagehashtag_image_id_96133dcc_uniq UNIQUE (image_id, hashtag_id),
   CONSTRAINT base_imagehashtag_hashtag_id_0d819bb9_fk_base_hashtag_id FOREIGN KEY (hashtag_id)
       REFERENCES public.base_hashtag (id) MATCH SIMPLE
       ON UPDATE NO ACTION
       ON DELETE NO ACTION
       DEFERRABLE INITIALLY DEFERRED,
   CONSTRAINT base_imagehashtag_image_id_79e99aa4_fk_base_image_id FOREIGN KEY (image_id)
       REFERENCES public.base_image (id) MATCH SIMPLE
       ON UPDATE NO ACTION
       ON DELETE NO ACTION
       DEFERRABLE INITIALLY DEFERRED
)
WITH (
   OIDS = FALSE
)
TABLESPACE pg_default;

CREATE INDEX base_imagehashtag_e4858d5c
   ON public.base_imagehashtag USING btree
   (hashtag_id)
   TABLESPACE pg_default;

CREATE INDEX base_imagehashtag_f33175e6
   ON public.base_imagehashtag USING btree
   (image_id)
   TABLESPACE pg_default;

CREATE INDEX base_imagehashtag_imageid_hashtag_id
   ON public.base_imagehashtag USING btree
   (hashtag_id, image_id)
   TABLESPACE pg_default;

-- Table: public.base_hashtag

CREATE TABLE public.base_hashtag
(
   id integer NOT NULL DEFAULT nextval('base_hashtag_id_seq'::regclass),
   name character varying(255) COLLATE pg_catalog."default" NOT NULL,
   CONSTRAINT base_hashtag_pkey PRIMARY KEY (id),
   CONSTRAINT base_hashtag_name_key UNIQUE (name)
)
WITH (
   OIDS = FALSE
)
TABLESPACE pg_default;

CREATE INDEX base_hashtag_name_a8f89285_like
   ON public.base_hashtag USING btree
   (name COLLATE pg_catalog."default" varchar_pattern_ops)
   TABLESPACE pg_default;

Cardinalities (approximate):

  • base_image: 23M
  • base_hashtag: 5M
  • base_imagehashtag: 211M

Query plan

Limit  (cost=7895851.20..7895851.30 rows=40 width=36) (actual time=188165.607..188165.641 rows=40 loops=1)
  Buffers: shared hit=137658 read=1129357, temp read=652059 written=652045
  ->  Sort  (cost=7895851.20..7904963.31 rows=3644846 width=36) (actual time=188165.605..188165.618 rows=40 loops=1)
        Sort Key: (count(base_hashtag.*)) DESC
        Sort Method: top-N heapsort  Memory: 28kB
        Buffers: shared hit=137658 read=1129357, temp read=652059 written=652045
        ->  GroupAggregate  (cost=7434552.18..7780638.93 rows=3644846 width=36) (actual time=178023.985..188051.536 rows=290908 loops=1)
              Group Key: base_hashtag.id
              Buffers: shared hit=137658 read=1129357, temp read=652059 written=652045
              ->  Merge Join  (cost=7434552.18..7716854.12 rows=3644846 width=68) (actual time=178023.656..186172.736 rows=3812014 loops=1)
                    Merge Cond: (base_hashtag.id = base_imagehashtag.hashtag_id)
                    Buffers: shared hit=137658 read=1129357, temp read=652059 written=652045
                    ->  Index Scan using base_hashtag_pkey on base_hashtag  (cost=0.43..205295.87 rows=5894881 width=64) (actual time=0.014..1961.031 rows=4341714 loops=1)
                          Filter: (id <> 1)
                          Rows Removed by Filter: 1
                          Buffers: shared hit=39911
                    ->  Materialize  (cost=7433940.57..7452164.81 rows=3644847 width=8) (actual time=177776.569..181299.695 rows=4055074 loops=1)
                          Buffers: shared hit=97747 read=1129357, temp read=652059 written=652045
                          ->  Sort  (cost=7433940.57..7443052.69 rows=3644847 width=8) (actual time=177776.566..179584.317 rows=4055074 loops=1)
                                Sort Key: base_imagehashtag.hashtag_id
                                Sort Method: external merge  Disk: 71336kB
                                Buffers: shared hit=97747 read=1129357, temp read=652059 written=652045
                                ->  Hash Join  (cost=1201261.72..6937033.15 rows=3644847 width=8) (actual time=46237.509..174816.562 rows=4055074 loops=1)
                                      Hash Cond: (base_imagehashtag.image_id = base_image.id)
                                      Buffers: shared hit=97747 read=1129357, temp read=631968 written=631954
                                      ->  Seq Scan on base_imagehashtag  (cost=0.00..3256836.88 rows=211107488 width=8) (actual time=0.103..72452.659 rows=211056752 loops=1)
                                            Buffers: shared hit=16405 read=1129357
                                      ->  Hash  (cost=1194732.58..1194732.58 rows=397931 width=4) (actual time=1284.293..1284.294 rows=261208 loops=1)
                                            Buckets: 131072  Batches: 8  Memory Usage: 2174kB
                                            Buffers: shared hit=81342, temp written=667
                                            ->  Bitmap Heap Scan on base_image  (cost=180976.73..1194732.58 rows=397931 width=4) (actual time=950.553..1191.553 rows=261208 loops=1)
                                                  Recheck Cond: ((created > 1548979200) AND (created < 1551398400) AND (query_id = '566591d4-33a3-4a96-a2d9-99e7bd625c18'::uuid))
                                                  Rows Removed by Index Recheck: 179762
                                                  Heap Blocks: exact=8033 lossy=43452
                                                  Buffers: shared hit=81342
                                                  ->  BitmapAnd  (cost=180976.73..180976.73 rows=397931 width=0) (actual time=948.947..948.947 rows=0 loops=1)
                                                        Buffers: shared hit=29857
                                                        ->  Bitmap Index Scan on base_image_created_idx  (cost=0.00..32677.36 rows=1488892 width=0) (actual time=171.198..171.198 rows=1484511 loops=1)
                                                              Index Cond: ((created > 1548979200) AND (created < 1551398400))
                                                              Buffers: shared hit=5154
                                                        ->  Bitmap Index Scan on base_image_query_id_idx  (cost=0.00..148100.16 rows=6159946 width=0) (actual time=760.218..760.219 rows=6274189 loops=1)
                                                              Index Cond: (query_id = '566591d4-33a3-4a96-a2d9-99e7bd625c18'::uuid)
                                                              Buffers: shared hit=24703
Planning time: 0.689 ms
Execution time: 188176.901 ms

As you can see, the query can take minutes to complete. We use caching at the application level, but users frequently submit unique queries, so it only helps so much. Many queries of this kind run concurrently to populate a dashboard, which puts a huge load on the database. I'm not expecting instant responses, but I'm hoping this can be brought down to seconds rather than minutes.

I appreciate that there is a lot of data involved in this query, since as far as I can tell it has to compute all of the hashtag counts before it can sort the results to pick out the top values.

I know there is plenty of information out there on tuning postgres, but I'm hoping someone can spot an obvious inefficiency in this setup and point me in the right direction so that I can start experimenting.

Edit

The same query with set enable_seqscan=off:

Limit  (cost=8386776.05..8386776.15 rows=40 width=36) (actual time=75981.675..75981.712 rows=40 loops=1)
  Buffers: shared hit=43238 read=414995, temp read=20088 written=20088
  ->  Sort  (cost=8386776.05..8395626.09 rows=3540015 width=36) (actual time=75981.674..75981.687 rows=40 loops=1)
        Sort Key: (count(base_hashtag.*)) DESC
        Sort Method: top-N heapsort  Memory: 28kB
        Buffers: shared hit=43238 read=414995, temp read=20088 written=20088
        ->  GroupAggregate  (cost=7941630.37..8274877.45 rows=3540015 width=36) (actual time=64678.453..75862.227 rows=290908 loops=1)
              Group Key: base_hashtag.id
              Buffers: shared hit=43238 read=414995, temp read=20088 written=20088
              ->  Merge Join  (cost=7941630.37..8212927.18 rows=3540015 width=68) (actual time=64678.087..73943.416 rows=3812014 loops=1)
                    Merge Cond: (base_imagehashtag.hashtag_id = base_hashtag.id)
                    Buffers: shared hit=43238 read=414995, temp read=20088 written=20088
                    ->  Sort  (cost=7939229.85..7948079.89 rows=3540015 width=8) (actual time=64541.756..66215.773 rows=4055074 loops=1)
                          Sort Key: base_imagehashtag.hashtag_id
                          Sort Method: external merge  Disk: 71344kB
                          Buffers: shared hit=35902 read=375789, temp read=20088 written=20088
                          ->  Merge Join  (cost=34646.33..7457355.98 rows=3540015 width=8) (actual time=58023.450..61805.075 rows=4055074 loops=1)
                                Merge Cond: (base_image.id = base_imagehashtag.image_id)
                                Buffers: shared hit=35902 read=375789
                                ->  Index Only Scan using base_image_country_created_imageid_query_id on base_image  (cost=0.56..268172.32 rows=384264 width=4) (actual time=217.707..832.100 rows=261208 loops=1)
                                      Index Cond: ((query_id = '566591d4-33a3-4a96-a2d9-99e7bd625c18'::uuid) AND (created > 1548979200) AND (created < 1551398400))
                                      Heap Fetches: 0
                                      Buffers: shared hit=35724 read=9321
                                ->  Index Only Scan using base_imagehashtag_image_id_96133dcc_uniq on base_imagehashtag  (cost=0.57..6621360.73 rows=212655744 width=8) (actual time=3.366..34853.999 rows=113968400 loops=1)
                                      Heap Fetches: 0
                                      Buffers: shared hit=178 read=366468
                    ->  Index Scan using base_hashtag_pkey on base_hashtag  (cost=0.43..205820.87 rows=5910767 width=64) (actual time=0.041..3822.781 rows=7862820 loops=1)
                          Filter: (id <> 1)
                          Rows Removed by Filter: 1
                          Buffers: shared hit=7336 read=39206
Planning time: 0.609 ms
Execution time: 75991.874 ms

Edit 2

The same query with enable_mergejoin=off and enable_hashjoin=off:

Limit  (cost=44144721.02..44144721.12 rows=40 width=36) (actual time=25487.908..25487.942 rows=40 loops=1)
  Buffers: shared hit=17297009 read=83, temp read=88580 written=88580
  ->  Sort  (cost=44144721.02..44154290.01 rows=3827596 width=36) (actual time=25487.906..25487.915 rows=40 loops=1)
        Sort Key: (count(base_hashtag.*)) DESC
        Sort Method: top-N heapsort  Memory: 28kB
        Buffers: shared hit=17297009 read=83, temp read=88580 written=88580
        ->  GroupAggregate  (cost=43947180.16..44023732.08 rows=3827596 width=36) (actual time=21791.076..25377.264 rows=290908 loops=1)
              Group Key: base_hashtag.id
              Buffers: shared hit=17297009 read=83, temp read=88580 written=88580
              ->  Sort  (cost=43947180.16..43956749.15 rows=3827596 width=68) (actual time=21790.924..23315.027 rows=3812014 loops=1)
                    Sort Key: base_hashtag.id
                    Sort Method: external merge  Disk: 259024kB
                    Buffers: shared hit=17297009 read=83, temp read=88580 written=88580
                    ->  Nested Loop  (cost=1.56..43214685.68 rows=3827596 width=68) (actual time=2.965..17525.982 rows=3812014 loops=1)
                          Buffers: shared hit=17297009 read=83
                          ->  Nested Loop  (cost=1.13..15358809.52 rows=3827597 width=8) (actual time=2.508..3727.578 rows=4055074 loops=1)
                                Buffers: shared hit=1060857 read=42
                                ->  Index Only Scan using base_image_queryid_created_id on base_image  (cost=0.56..17668.67 rows=417605 width=4) (actual time=0.018..85.571 rows=261208 loops=1)
                                      Index Cond: ((query_id = '566591d4-33a3-4a96-a2d9-99e7bd625c18'::uuid) AND (created > 1548979200) AND (created < 1551398400))
                                      Heap Fetches: 0
                                      Buffers: shared hit=4205
                                ->  Index Only Scan using base_imagehashtag_image_id_96133dcc_uniq on base_imagehashtag  (cost=0.57..28.74 rows=800 width=8) (actual time=0.002..0.006 rows=16 loops=261208)
                                      Index Cond: (image_id = base_image.id)
                                      Heap Fetches: 0
                                      Buffers: shared hit=1056652 read=42
                          ->  Index Scan using base_hashtag_pkey on base_hashtag  (cost=0.43..7.27 rows=1 width=64) (actual time=0.002..0.002 rows=1 loops=4055074)
                                Index Cond: (id = base_imagehashtag.hashtag_id)
                                Filter: (id <> 1)
                                Rows Removed by Filter: 0
                                Buffers: shared hit=16236152 read=41
Planning time: 0.507 ms
Execution time: 25527.201 ms

Since you are willing to use cached query results, one possibility would be to create a materialized view that pre-computes the values for all base_image.query_id in one go, rather than computing each one on demand and caching it. It doesn't look like computing all of those values at once should take much longer than computing just one. But this only works if the value compared against base_image.created is stable, or at least follows a predictable pattern. So something like:

create materialized view foobar as SELECT
   base_image.query_id,
   base_hashtag.*,
   COUNT(DISTINCT base_image.id) as dist_count
FROM
   base_hashtag
   JOIN base_imagehashtag ON base_hashtag.id = base_imagehashtag.hashtag_id
   JOIN base_image ON base_imagehashtag.image_id = base_image.id
WHERE
   base_image.created > 1548979200
   AND base_image.created < 1551398400
   AND base_hashtag.id <> 1
GROUP BY
   base_image.query_id,
   base_hashtag.id
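
With the view in place, each dashboard request becomes a cheap indexed read, and the view is refreshed on whatever schedule fits your data. A minimal sketch, assuming the view is named foobar as above; the supporting index and the choice to order by dist_count are my additions, not something your current query does (it orders by base_hashtag.count):

CREATE INDEX ON foobar (query_id, dist_count DESC);

SELECT *
FROM foobar
WHERE query_id = '566591d4-33a3-4a96-a2d9-99e7bd625c18'
ORDER BY dist_count DESC
LIMIT 40;

-- re-run periodically (e.g. from cron) to pick up newly ingested images
REFRESH MATERIALIZED VIEW foobar;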

Sticking with the query itself, it looks like your work_mem is too low for this type of query. That is indicated by these lines:

Buckets: 131072  Batches: 8  Memory Usage: 2174kB

Heap Blocks: exact=8033 lossy=43452

If you can afford to increase work_mem to the point where the hashing is done in a single batch, and so that there are no lossy blocks in the bitmap heap scan, that would help. (Although the bitmap scan is not part of the bottleneck to begin with, so it can't really be a huge help.)
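
For example, work_mem can be raised for just the one session or query rather than globally; the 256MB below is only a starting guess, to be tuned until the hash reports a single batch and the bitmap heap scan no longer reports lossy blocks:

SET work_mem = '256MB';
-- re-run the query (or its EXPLAIN (ANALYZE, BUFFERS)) and check for "Batches: 1" and no "lossy" heap blocks
RESET work_mem;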

An index on base_image (query_id, created, id) could help somewhat by avoiding the BitmapAnd and enabling an index-only scan, but again it can only help so much, because fetching the data from "base_image" is not the slowest step.
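
That index would look something like the following (the name is arbitrary; judging by the index used in the "Edit 2" plan above, base_image_queryid_created_id, you may since have created one much like it):

CREATE INDEX base_image_queryid_created_id_idx
   ON public.base_image USING btree (query_id, created, id);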

I don't think you will see a major improvement without getting rid of the massive "Seq Scan on base_imagehashtag". I'd like to see the EXPLAIN (ANALYZE, BUFFERS) of the query after you set enable_seqscan=off, to see what it chooses then. Even if it ends up being slower, seeing what it does may provide some clues.
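
Concretely, in a psql session that would be (the setting only affects the current session):

SET enable_seqscan = off;

EXPLAIN (ANALYZE, BUFFERS)
SELECT base_hashtag.*, COUNT(DISTINCT base_image.id)
FROM base_hashtag
   JOIN base_imagehashtag ON base_hashtag.id = base_imagehashtag.hashtag_id
   JOIN base_image ON base_imagehashtag.image_id = base_image.id
WHERE base_image.query_id = '566591d4-33a3-4a96-a2d9-99e7bd625c18'
   AND base_image.created > 1548979200
   AND base_image.created < 1551398400
   AND base_hashtag.id <> 1
GROUP BY base_hashtag.id
ORDER BY base_hashtag.count DESC
LIMIT 40;

RESET enable_seqscan;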

Another thing is that upgrading from 9.6 to v11 would open up parallel query execution. It won't improve efficiency, but it would improve latency, provided you aren't already running many queries simultaneously. Although with a db.t2.large it may not help much: in my experience hyperthreading is not useful in this situation, so 2 vCPUs probably just means 1 CPU, leaving nothing to parallelize over.


Looking at the execution with enable_seqscan=off, it has switched from scanning the entire "base_imagehashtag" table to scanning one of its indexes in full. That is an improvement, but now scanning the whole index is the bottleneck instead, so it is not a huge one.

I was hoping to see a nested loop over that index, rather than a merge join against it. So could you show another explain (analyze, buffers), this time after set enable_mergejoin=off? That would target what I want to see better than enable_seqscan did. A nested loop could show terrible performance when run against a cold cache but much better performance against a warm cache, and that is what we want to find out. Since scanning the full index is still slow and is the bottleneck, I think a nested loop is the only option you have left before resorting to a hardware upgrade or restructuring your data representation.

If you just want to get it to use the 75-second plan rather than the 180-second plan in production, there are several routes you could pursue. If your application/framework/abstraction layer gives you the flexibility, simply set enable_seqscan=off right before running the query and reset it afterwards. Fixing the root cause would be nicer, but sometimes you have to pick your battles and take the simpler approach.
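
If the query can be wrapped in a transaction, SET LOCAL keeps the override scoped to that transaction, so it cannot leak into other queries on the same connection (a sketch; how you issue these statements from Django depends on your data-access layer):

BEGIN;
SET LOCAL enable_seqscan = off;   -- reverts automatically at COMMIT or ROLLBACK
-- run the hashtag-count query here
COMMIT;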

One suspicious line is this one:

->  Index Only Scan using base_imagehashtag_image_id_96133dcc_uniq on base_imagehashtag  (cost=0.57..6621360.73 rows=212655744 width=8) (actual time=3.366..34853.999 rows=113968400 loops=1)

Why does it find only half as many rows as it expects? If that estimate were correct, the planner would have considered this plan faster, so it should have chosen it without you having to resort to tricks. Are your statistics up to date (e.g. is vacuum analyze run often enough)? The only other thing I can think of here is that the scan of base_image_country_created_imageid_query_id for the given "query_id" and "created" values only returns base_image.id values from the lower half of the full range of base_image.id, so the other side of the merge join gets to stop early. If that is the case, though, I don't know how to exploit that fact to get the planner to do better. Also, the actual performance would then depend on the exact parameter values used in the query.
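
To check how recently the join table's statistics were gathered, and to refresh them by hand, something along these lines works (using the standard pg_stat_user_tables view):

SELECT relname, last_analyze, last_autoanalyze, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'base_imagehashtag';

ANALYZE base_imagehashtag;   -- or VACUUM (ANALYZE) base_imagehashtag;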

Another way to encourage full index scans over seq scans would be to lower the value of "cpu_index_tuple_cost" relative to "cpu_tuple_cost". I've been thinking lately that the default setting of "cpu_index_tuple_cost" may simply be too high anyway. If you can try it, how much do you have to lower "cpu_index_tuple_cost" before it switches plans?
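
The defaults are cpu_tuple_cost = 0.01 and cpu_index_tuple_cost = 0.005. An experiment could look like this (0.001 is just a value to start searching from, not a recommendation):

SET cpu_index_tuple_cost = 0.001;
-- re-run EXPLAIN (ANALYZE, BUFFERS) of the query and see whether the seq scan is gone
RESET cpu_index_tuple_cost;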

Source: https://dba.stackexchange.com/questions/237650