PostgreSQL

Deadlock with multi-row INSERTs despite ON CONFLICT DO NOTHING

  • August 7, 2019

Setup

I have a bulk insert function set_interactions(arg_rows text) that looks like this:

with inserts as (
   insert into interaction (
       thing_id,
       associate_id, created_time)
   select t->>'thing_id', t->>'associate_id', now() from
   json_array_elements(arg_rows::json)  t
   ON CONFLICT (thing_id, associate_id) DO NOTHING
   RETURNING thing_id, associate_id
) select into insert_count count(*) from inserts;

-- Followed by an insert in an unrelated table that has two triggers, neither of which touch any of the tables here (also not by any of their triggers, etc.)

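(I wrap it this way because I need a count of the rows actually inserted, without resorting to the "fake row update" trick.)

The table interaction has:

  1. Only one constraint: a multi-column primary key (thing_id, associate_id)
  2. No indexes
  3. Only one trigger: after insert, for each row.

The trigger does this: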

DECLARE associateId text;

BEGIN

-- Go out and get the associate_id for this thing_id
BEGIN
   SELECT thing.associate_id INTO STRICT associateId FROM thing WHERE thing.id = NEW.thing_id;
   EXCEPTION
   WHEN NO_DATA_FOUND THEN
       RAISE EXCEPTION 'Could not map the thing to an associate!';
   WHEN TOO_MANY_ROWS THEN
       RAISE EXCEPTION 'Could not map the thing to a SINGLE associate!'; -- thing PK should prevent this
END;

-- We don't want to add an association between an associate interacting with their own things
IF associateId != NEW.associate_id THEN

   -- Insert the new association, if it doesn't yet exist
   INSERT INTO associations ("thing_owner", "associate")
   VALUES (associateId, NEW.associate_id)
   ON CONFLICT DO NOTHING;

END IF;

RETURN NULL;

END;

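Both interactions and associations have no more columns than what you see in the statements above.

The problem

Sometimes the application gets deadlock detected when it calls set_interactions(). It may call the function with 1-100 rows of unsorted data; the "conflicting" batches may or may not have identical input (either at the whole-batch level or per conflicting row).

Error details: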

deadlock detected
while inserting index tuple (37605,46) in relation "associations"
SQL statement INSERT INTO associations ("thing_owner", "associate")
    VALUES (associateId, NEW.associate_id)
    ON CONFLICT DO NOTHING;
PL/pgSQL function aud.addfriendship() line 19 at SQL statement
SQL statement "with inserts as (
        insert into interaction (
            thing_id,
            associate_id, created_time)
        select t->>'thing_id', t->>'associate_id', now() from
        json_array_elements(arg_rows::json)  t
        ON CONFLICT (thing_id, associate_id) DO NOTHING
        RETURNING thing_id, associate_id
    ) select                  count(*) from inserts"
PL/pgSQL function setinteractions(text) line 7 at SQL statement
Process 31370 waits for ShareLock on transaction 111519214; blocked by process 31418.
Process 31418 waits for ShareLock on transaction 111519211; blocked by process 31370.
error: deadlock detected

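What I've tried

I thought the function might sometimes be called with duplicate data within a single call. That is not the case: duplicates would instead produce a guaranteed error, ON CONFLICT DO UPDATE command cannot affect row a second time.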

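I have been unable to reproduce the deadlock, even when attempting 1,000 calls to set_interactions() at once with identical parameters, or with the same pair of rows (differing within the pair) for the thing_id and associate_id values plus other values, so that the calls are not somehow optimized away before reaching PostgreSQL. (They should not be optimized away by the database either, since the function is marked volatile.) This was from a single-threaded back end; then again, the application itself runs only one such back end in production, which is where the deadlocks occurred. I even tried running these 1,000 calls against a full copy of the production database, and even under load from a second back end, and additionally by [...] from interactions. They succeeded without complaint.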

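https://rcoh.svbtle.com/postgres-unique-constraints-can-cause-deadlock mentions trying to avoid relying on unique indexes (which, as I understand it, is what a PK implies) when inserting duplicates. However, that was written before ON CONFLICT DO UPDATE existed, which I thought would solve this problem.

How can this query deadlock "randomly", and how can I fix it? (Also, why can't I reproduce it with the approach above?)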

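The ON CONFLICT clause prevents duplicate key errors. There can still be friction between concurrent transactions trying to enter the same keys or update the same rows. So it is no insurance against deadlocks.

Most importantly, add a consistent order for the input rows with ORDER BY. To make sure the order is enforced, I use a CTE, which materializes the result. (I think it should work with a subquery, too; the CTE is just to be sure.) Otherwise, mutually entangled inserts trying to enter identical index tuples into the unique index can lead to the deadlock you observed. The manual:

The best defense against deadlocks is generally to avoid them by being certain that all applications using a database acquire locks on multiple objects in a consistent order.

Also, since set_interactions() is a PL/pgSQL function, this is simpler and cheaper: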
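To make the lock-ordering point concrete, here is a minimal sketch of how two sessions can deadlock on a unique index even with ON CONFLICT DO NOTHING. The table demo_assoc and its values are hypothetical and only stand in for associations; the statements are meant to be run by hand in two separate psql sessions, in the order shown.

```sql
-- Hypothetical stand-in table, assumed for illustration only
CREATE TABLE demo_assoc (
   thing_owner text,
   associate   text,
   PRIMARY KEY (thing_owner, associate)
);

-- Session A
BEGIN;
INSERT INTO demo_assoc VALUES ('a', 'b') ON CONFLICT DO NOTHING;

-- Session B
BEGIN;
INSERT INTO demo_assoc VALUES ('c', 'd') ON CONFLICT DO NOTHING;

-- Session A: finds B's uncommitted key ('c', 'd') and waits for B's transaction
INSERT INTO demo_assoc VALUES ('c', 'd') ON CONFLICT DO NOTHING;

-- Session B: finds A's uncommitted key ('a', 'b') and would wait for A,
-- closing the cycle; PostgreSQL aborts one session with "deadlock detected"
INSERT INTO demo_assoc VALUES ('a', 'b') ON CONFLICT DO NOTHING;
```

If both sessions insert their keys in the same sorted order, the second session simply waits for the first to finish and no cycle can form. The same reasoning applies to the rows of a single multi-row INSERT.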

WITH data AS (
  SELECT t->>'thing_id' AS t_id, t->>'associate_id' AS a_id
  -- Or, if not type text, cast right away:
  -- SELECT (t->>'thing_id')::int AS t_id, (t->>'associate_id')::int AS a_id
  FROM   json_array_elements(arg_rows::json) t
  ORDER  BY 1, 2  -- deterministic, stable order (!!)
  )
INSERT INTO interaction (thing_id, associate_id, created_time)
SELECT t_id, a_id, now()
FROM   data
ON     CONFLICT (thing_id, associate_id) DO NOTHING;

GET DIAGNOSTICS insert_count = ROW_COUNT;

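There is no need for the additional CTE with RETURNING and another count(*).

The trigger function also looks bloated. There is no need for a nested block, since you are not catching the errors, only raising exceptions, which roll back the whole transaction either way. The exception is pointless as well:

-- thing PK should prevent this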
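For context, the statement above might sit inside the function roughly as follows. This is only a sketch: the integer return type and the DECLARE section are assumptions, the columns are assumed to be of type text, and the follow-up insert into the unrelated table mentioned in the question is left out.

```sql
-- Sketch only: return type and variable declaration are assumptions
CREATE OR REPLACE FUNCTION set_interactions(arg_rows text)
  RETURNS integer
  LANGUAGE plpgsql AS
$func$
DECLARE
   insert_count integer;
BEGIN
   WITH data AS (
      SELECT t->>'thing_id' AS t_id, t->>'associate_id' AS a_id
      FROM   json_array_elements(arg_rows::json) t
      ORDER  BY 1, 2   -- same deterministic order in every call
      )
   INSERT INTO interaction (thing_id, associate_id, created_time)
   SELECT t_id, a_id, now()
   FROM   data
   ON     CONFLICT (thing_id, associate_id) DO NOTHING;

   -- ROW_COUNT counts only rows actually inserted; skipped conflicts are excluded
   GET DIAGNOSTICS insert_count = ROW_COUNT;

   RETURN insert_count;
END
$func$;
```

A call could then look like SELECT set_interactions('[{"thing_id": "t1", "associate_id": "a1"}]'); where the JSON values are made up for illustration.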

The trigger function boils down to:

BEGIN
  -- Insert the new association, if it doesn't yet exist
  INSERT INTO associations (thing_owner, associate)
  SELECT t.associate_id, NEW.associate_id
  FROM   thing t
  WHERE  t.id = NEW.thing_id                 -- PK guarantees 0 or 1 result
  AND    t.associate_id <> NEW.associate_id  -- exclude association to self
  ON     CONFLICT DO NOTHING;

  RETURN NULL;
END
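Wrapped into complete definitions, the simplified trigger could look like the sketch below. The function name aud.addfriendship is taken from the error message in the question; the trigger name is hypothetical. Note one behavioral difference from the original: when no matching row exists in thing, this version inserts nothing instead of raising an exception.

```sql
-- Function name from the error message; everything else is a sketch
CREATE OR REPLACE FUNCTION aud.addfriendship()
  RETURNS trigger
  LANGUAGE plpgsql AS
$func$
BEGIN
   -- Insert the new association, if it doesn't yet exist
   INSERT INTO associations (thing_owner, associate)
   SELECT t.associate_id, NEW.associate_id
   FROM   thing t
   WHERE  t.id = NEW.thing_id                 -- PK guarantees 0 or 1 result
   AND    t.associate_id <> NEW.associate_id  -- exclude association to self
   ON     CONFLICT DO NOTHING;

   RETURN NULL;  -- the return value is ignored for AFTER row triggers
END
$func$;

CREATE TRIGGER interaction_add_association   -- hypothetical trigger name
AFTER INSERT ON interaction
FOR EACH ROW EXECUTE PROCEDURE aud.addfriendship();  -- EXECUTE FUNCTION in PostgreSQL 11+
```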

You could drop the trigger and the function set_interactions() altogether and just run this query, which does everything useful I can see in the question:

WITH data AS (
  SELECT (t->>'thing_id')::int AS t_id, (t->>'associate_id')::int AS a_id  -- assuming int
  FROM   json_array_elements(arg_rows::json) t
  ORDER  BY 1, 2  -- (!!)
  )
, ins_inter AS (
  INSERT INTO interaction (thing_id, associate_id, created_time)
  SELECT t_id, a_id, now()
  FROM   data
  ON     CONFLICT (thing_id, associate_id) DO NOTHING
  RETURNING thing_id, associate_id
  )
, ins_ass AS (
  INSERT INTO associations (thing_owner, associate)
  SELECT t.associate_id, i.associate_id
  FROM   ins_inter i
  JOIN   thing     t ON t.id = i.thing_id
                    AND t.associate_id <> i.associate_id  -- exclude association to self
  ON     CONFLICT DO NOTHING
  )
SELECT count(*) FROM ins_inter;

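Now I don't see any remaining chance of a deadlock. Of course, all other transactions that might write to the same tables concurrently must stick to the same order of rows.

If that's not possible and you are still thinking about SKIP LOCKED, see:

Source: https://dba.stackexchange.com/questions/194756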