Postgresql

優化單個“下一個”行的 UPDATE 查詢

  • August 13, 2021

我們從 new relic 獲得的資訊表明,我們的查詢性能很差(執行大約需要 16 秒)。我們在 Debian Linux 上使用 PostgreSQL 11.8。

桌子:

postgres=# SELECT table_name, column_name, data_type, character_maximum_length, character_octet_length, numeric_precision, numeric_precision_radix, numeric_scale, is_nullable, is_updatable FROM information_schema.columns WHERE table_name = 'store_codeinventory';
    table_name      |    column_name    |        data_type         | character_maximum_length | character_octet_length | numeric_precision | numeric_precision_radix | numeric_scale | is_nullable | is_updatable 
---------------------+-------------------+--------------------------+--------------------------+------------------------+-------------------+-------------------------+---------------+-------------+--------------
store_codeinventory | id                | integer                  |                          |                        |                32 |                       2 |             0 | NO          | YES
store_codeinventory | code              | character varying        |                      255 |                   1020 |                   |                         |               | NO          | YES
store_codeinventory | limit_days        | integer                  |                          |                        |                32 |                       2 |             0 | NO          | YES
store_codeinventory | deactivation_date | timestamp with time zone |                          |                        |                   |                         |               | YES         | YES
store_codeinventory | cost              | numeric                  |                          |                        |                 8 |                      10 |             2 | NO          | YES
store_codeinventory | price             | numeric                  |                          |                        |                 8 |                      10 |             2 | NO          | YES
store_codeinventory | created_date      | timestamp with time zone |                          |                        |                   |                         |               | YES         | YES
store_codeinventory | claimed_date      | timestamp with time zone |                          |                        |                   |                         |               | YES         | YES
store_codeinventory | is_active         | boolean                  |                          |                        |                   |                         |               | NO          | YES
store_codeinventory | book_id           | integer                  |                          |                        |                32 |                       2 |             0 | YES         | YES
store_codeinventory | code_import_id    | integer                  |                          |                        |                32 |                       2 |             0 | NO          | YES
store_codeinventory | creator_id        | integer                  |                          |                        |                32 |                       2 |             0 | NO          | YES
store_codeinventory | inv_id            | integer                  |                          |                        |                32 |                       2 |             0 | YES         | YES
store_codeinventory | label_id          | integer                  |                          |                        |                32 |                       2 |             0 | YES         | YES
store_codeinventory | recipient_id      | integer                  |                          |                        |                32 |                       2 |             0 | YES         | YES
store_codeinventory | purchase_id       | integer                  |                          |                        |                32 |                       2 |             0 | YES         | YES
(16 rows)

我無權訪問生產,但我嘗試填寫本地數據庫並了解問題。詢問:

UPDATE store_codeinventory set recipient_id = 1168, claimed_date = NOW()
   where id = ANY((select array(select id from store_codeinventory
       where recipient_id is NULL and inv_id = 72 and is_active=true
       ORDER BY ID ASC LIMIT 1 FOR UPDATE)) ::integer[]) and recipient_id is NULL;

查詢計劃:

                                                                                             QUERY PLAN                                                                                               
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Update on store_codeinventory  (cost=0.64..47.43 rows=10 width=92) (actual time=0.291..0.291 rows=0 loops=1)
  InitPlan 2 (returns $3)
    ->  Result  (cost=0.34..0.35 rows=1 width=32) (actual time=0.060..0.060 rows=1 loops=1)
          InitPlan 1 (returns $2)
            ->  Limit  (cost=0.29..0.34 rows=1 width=10) (actual time=0.058..0.058 rows=1 loops=1)
                  ->  LockRows  (cost=0.29..1031.77 rows=19963 width=10) (actual time=0.057..0.057 rows=1 loops=1)
                        ->  Index Scan using store_codeinventory_pkey on store_codeinventory store_codeinventory_1  (cost=0.29..832.14 rows=19963 width=10) (actual time=0.053..0.053 rows=1 loops=1)
                              Filter: ((recipient_id IS NULL) AND is_active AND (inv_id = 72))
                              Rows Removed by Filter: 94
  ->  Index Scan using store_codeinventory_pkey on store_codeinventory  (cost=0.29..47.08 rows=10 width=92) (actual time=0.065..0.066 rows=1 loops=1)
        Index Cond: (id = ANY ($3))
        Filter: (recipient_id IS NULL)
Planning Time: 0.365 ms
Execution Time: 0.327 ms
(14 rows)

我不太擅長寫sql。有人可以給我建議瓶頸在哪裡嗎?據我了解,可能有兩個地方:LockRows 和 Index Scan,對嗎?我認為第一個是引起FOR UPDATE的,第二個是 - select where ORDER BY ID ASC LIMIT?有沒有辦法優化它們?

UPD。有關索引的資訊(我刪除了未在查詢中使用的列):

postgres=# SELECT indexname, indexdef FROM pg_indexes WHERE schemaname = 'public' and tablename = 'store_codeinventory' AND (indexdef LIKE '%inv_id%' OR indexdef LIKE '%pkey%' OR indexdef LIKE '%recipient_id%');
                          indexname                            |                                                                        indexdef                                                                         
----------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------
store_codeinventory_inv_id_recipient_id_is_active_822b92e5_idx | CREATE INDEX store_codeinventory_inv_id_recipient_id_is_active_822b92e5_idx ON public.store_codeinventory USING btree (inv_id, recipient_id, is_active)
store_codeinventory_recipient_id_e32fbb18                      | CREATE INDEX store_codeinventory_recipient_id_e32fbb18 ON public.store_codeinventory USING btree (recipient_id)
store_codeinventory_inv_id_21e20eb7                            | CREATE INDEX store_codeinventory_inv_id_21e20eb7 ON public.store_codeinventory USING btree (inv_id)
store_codeinventory_pkey                                       | CREATE UNIQUE INDEX store_codeinventory_pkey ON public.store_codeinventory USING btree (id)
(4 rows)

UPD2我不太擅長 sql,但我決定分析該查詢並得出結論,我們可以簡化它。您對此變體的看法:

UPDATE store_codeinventory set recipient_id = 1168, claimed_date = NOW()
   where id = (select id from store_codeinventory
       where recipient_id is NULL and inv_id = 72 and is_active=true 
       ORDER BY ID ASC LIMIT 1 FOR UPDATE) and recipient_id is NULL;

我認為我們不需要數組,因為在子選擇中我們只會得到一個或 null。然後我認為LIMIT 1獲得ordered by ID ASC最小的id值是一樣的。如果我們寫:

UPDATE store_codeinventory set recipient_id = 1168, claimed_date = NOW()
   where id = (select MIN(id) from store_codeinventory
       where recipient_id is NULL and inv_id = 72 and is_active=true);

是否相同?

您的第一個查詢是昂貴的廢話 - 就像您已經發現自己一樣。該陣列沒有任何作用。

第二個查詢更有意義。

第三個查詢背後的想法是基於一個誤解。ORDER BY id LIMIT 1至少與 一樣有效min(id)。兩者都可以使用索引。第一個是更通用的。除其他外,您可以添加一個鎖定子句,例如FOR UPDATE- 如果可以同時進行寫訪問,您需要確保它是安全的。

所以:

UPDATE store_codeinventory
SET    recipient_id = 1168
    , claimed_date = now()
WHERE  id = (
  SELECT id
  FROM   store_codeinventory
  WHERE  recipient_id IS NULL
  AND    inv_id = 72
  AND    is_active
  ORDER  BY id
  LIMIT  1
  FOR    UPDATE SKIP LOCKED  -- !?
  );

有了索引支持,這應該很快。您需要正確的索引。任何額外的索引都會使寫入表的成本更高。因此,添加您實際需要的索引並避免使用其他索引。

最佳索引取決於相關列的各自特徵和數據分佈:

recipient_id
inv_id
is_active
id

範例:如果recipient_idNULL 值很少,則部分索引會捕捉:

CREATE INDEX ON store_codeinventory (id)
WHERE recipient_id IS NULL;

或者,如果謂詞始終相同,則此索引將是完美的:

CREATE INDEX ON store_codeinventory (id)
WHERE  recipient_id IS NULL
AND    inv_id = 72
AND    is_active;

SKIP LOCKED僅為競爭下一個“免費”的多個事務添加鎖定子句id- 這看起來是我們的案例。

見(密切相關!):

引用自:https://dba.stackexchange.com/questions/297951