優化單個“下一個”行的 UPDATE 查詢

August 13, 2021

我們從 new relic 獲得的資訊表明，我們的查詢性能很差（執行大約需要 16 秒）。我們在 Debian Linux 上使用 PostgreSQL 11.8。

桌子：

postgres=# SELECT table_name, column_name, data_type, character_maximum_length, character_octet_length, numeric_precision, numeric_precision_radix, numeric_scale, is_nullable, is_updatable FROM information_schema.columns WHERE table_name = 'store_codeinventory';
    table_name      |    column_name    |        data_type         | character_maximum_length | character_octet_length | numeric_precision | numeric_precision_radix | numeric_scale | is_nullable | is_updatable 
---------------------+-------------------+--------------------------+--------------------------+------------------------+-------------------+-------------------------+---------------+-------------+--------------
store_codeinventory | id                | integer                  |                          |                        |                32 |                       2 |             0 | NO          | YES
store_codeinventory | code              | character varying        |                      255 |                   1020 |                   |                         |               | NO          | YES
store_codeinventory | limit_days        | integer                  |                          |                        |                32 |                       2 |             0 | NO          | YES
store_codeinventory | deactivation_date | timestamp with time zone |                          |                        |                   |                         |               | YES         | YES
store_codeinventory | cost              | numeric                  |                          |                        |                 8 |                      10 |             2 | NO          | YES
store_codeinventory | price             | numeric                  |                          |                        |                 8 |                      10 |             2 | NO          | YES
store_codeinventory | created_date      | timestamp with time zone |                          |                        |                   |                         |               | YES         | YES
store_codeinventory | claimed_date      | timestamp with time zone |                          |                        |                   |                         |               | YES         | YES
store_codeinventory | is_active         | boolean                  |                          |                        |                   |                         |               | NO          | YES
store_codeinventory | book_id           | integer                  |                          |                        |                32 |                       2 |             0 | YES         | YES
store_codeinventory | code_import_id    | integer                  |                          |                        |                32 |                       2 |             0 | NO          | YES
store_codeinventory | creator_id        | integer                  |                          |                        |                32 |                       2 |             0 | NO          | YES
store_codeinventory | inv_id            | integer                  |                          |                        |                32 |                       2 |             0 | YES         | YES
store_codeinventory | label_id          | integer                  |                          |                        |                32 |                       2 |             0 | YES         | YES
store_codeinventory | recipient_id      | integer                  |                          |                        |                32 |                       2 |             0 | YES         | YES
store_codeinventory | purchase_id       | integer                  |                          |                        |                32 |                       2 |             0 | YES         | YES
(16 rows)

我無權訪問生產，但我嘗試填寫本地數據庫並了解問題。詢問：

UPDATE store_codeinventory set recipient_id = 1168, claimed_date = NOW()
   where id = ANY((select array(select id from store_codeinventory
       where recipient_id is NULL and inv_id = 72 and is_active=true
       ORDER BY ID ASC LIMIT 1 FOR UPDATE)) ::integer[]) and recipient_id is NULL;

查詢計劃：

                                                                                             QUERY PLAN                                                                                               
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Update on store_codeinventory  (cost=0.64..47.43 rows=10 width=92) (actual time=0.291..0.291 rows=0 loops=1)
  InitPlan 2 (returns $3)
    -&gt;  Result  (cost=0.34..0.35 rows=1 width=32) (actual time=0.060..0.060 rows=1 loops=1)
          InitPlan 1 (returns $2)
            -&gt;  Limit  (cost=0.29..0.34 rows=1 width=10) (actual time=0.058..0.058 rows=1 loops=1)
                  -&gt;  LockRows  (cost=0.29..1031.77 rows=19963 width=10) (actual time=0.057..0.057 rows=1 loops=1)
                        -&gt;  Index Scan using store_codeinventory_pkey on store_codeinventory store_codeinventory_1  (cost=0.29..832.14 rows=19963 width=10) (actual time=0.053..0.053 rows=1 loops=1)
                              Filter: ((recipient_id IS NULL) AND is_active AND (inv_id = 72))
                              Rows Removed by Filter: 94
  -&gt;  Index Scan using store_codeinventory_pkey on store_codeinventory  (cost=0.29..47.08 rows=10 width=92) (actual time=0.065..0.066 rows=1 loops=1)
        Index Cond: (id = ANY ($3))
        Filter: (recipient_id IS NULL)
Planning Time: 0.365 ms
Execution Time: 0.327 ms
(14 rows)

我不太擅長寫sql。有人可以給我建議瓶頸在哪裡嗎？據我了解，可能有兩個地方：LockRows 和 Index Scan，對嗎？我認為第一個是引起FOR UPDATE的，第二個是 - select where ORDER BY ID ASC LIMIT？有沒有辦法優化它們？

UPD。有關索引的資訊（我刪除了未在查詢中使用的列）：

postgres=# SELECT indexname, indexdef FROM pg_indexes WHERE schemaname = 'public' and tablename = 'store_codeinventory' AND (indexdef LIKE '%inv_id%' OR indexdef LIKE '%pkey%' OR indexdef LIKE '%recipient_id%');
                          indexname                            |                                                                        indexdef                                                                         
----------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------
store_codeinventory_inv_id_recipient_id_is_active_822b92e5_idx | CREATE INDEX store_codeinventory_inv_id_recipient_id_is_active_822b92e5_idx ON public.store_codeinventory USING btree (inv_id, recipient_id, is_active)
store_codeinventory_recipient_id_e32fbb18                      | CREATE INDEX store_codeinventory_recipient_id_e32fbb18 ON public.store_codeinventory USING btree (recipient_id)
store_codeinventory_inv_id_21e20eb7                            | CREATE INDEX store_codeinventory_inv_id_21e20eb7 ON public.store_codeinventory USING btree (inv_id)
store_codeinventory_pkey                                       | CREATE UNIQUE INDEX store_codeinventory_pkey ON public.store_codeinventory USING btree (id)
(4 rows)

UPD2我不太擅長 sql，但我決定分析該查詢並得出結論，我們可以簡化它。您對此變體的看法：

UPDATE store_codeinventory set recipient_id = 1168, claimed_date = NOW()
   where id = (select id from store_codeinventory
       where recipient_id is NULL and inv_id = 72 and is_active=true 
       ORDER BY ID ASC LIMIT 1 FOR UPDATE) and recipient_id is NULL;

我認為我們不需要數組，因為在子選擇中我們只會得到一個或 null。然後我認為LIMIT 1獲得ordered by ID ASC最小的id值是一樣的。如果我們寫：

UPDATE store_codeinventory set recipient_id = 1168, claimed_date = NOW()
   where id = (select MIN(id) from store_codeinventory
       where recipient_id is NULL and inv_id = 72 and is_active=true);

是否相同？

您的第一個查詢是昂貴的廢話 - 就像您已經發現自己一樣。該陣列沒有任何作用。
第二個查詢更有意義。
第三個查詢背後的想法是基於一個誤解。ORDER BY id LIMIT 1至少與一樣有效min(id)。兩者都可以使用索引。第一個是更通用的。除其他外，您可以添加一個鎖定子句，例如FOR UPDATE- 如果可以同時進行寫訪問，您需要確保它是安全的。
所以：
UPDATE store_codeinventory
SET    recipient_id = 1168
    , claimed_date = now()
WHERE  id = (
  SELECT id
  FROM   store_codeinventory
  WHERE  recipient_id IS NULL
  AND    inv_id = 72
  AND    is_active
  ORDER  BY id
  LIMIT  1
  FOR    UPDATE SKIP LOCKED  -- !?
  );
有了索引支持，這應該很快。您需要正確的索引。任何額外的索引都會使寫入表的成本更高。因此，添加您實際需要的索引並避免使用其他索引。
最佳索引取決於相關列的各自特徵和數據分佈：
recipient_id
inv_id
is_active
id
範例：如果recipient_idNULL 值很少，則部分索引會捕捉：
CREATE INDEX ON store_codeinventory (id)
WHERE recipient_id IS NULL;
或者，如果謂詞始終相同，則此索引將是完美的：
CREATE INDEX ON store_codeinventory (id)
WHERE  recipient_id IS NULL
AND    inv_id = 72
AND    is_active;
SKIP LOCKED僅為競爭下一個“免費”的多個事務添加鎖定子句id- 這看起來是我們的案例。
見（密切相關！）：
Postgres 更新…限制 1

引用自：https://dba.stackexchange.com/questions/297951

優化單個“下一個”行的 UPDATE 查詢

相關問答

優化查詢以在分區表中跨多天獲取數據

FROM 子句中的相關函式是否針對每一行執行？

具有大 IN 的 Postgres 查詢，並且在臨時表上加入似乎不起作用

列的數據類型會影響查詢性能嗎？

為什麼優化器不在我的表上使用聚群索引？

列的數據類型會影響查詢的性能嗎？