兩次更新導致死鎖
我有一個帶有以下sheme的mysql隊列:
CREATE TABLE IF NOT EXISTS `media` ( `mid` int(11) NOT NULL AUTO_INCREMENT, `order_type` int(11) NOT NULL, `media_id` varchar(50) COLLATE utf8_unicode_ci NOT NULL, `package` int(11) NOT NULL, `sent` int(11) NOT NULL, `timestamp` datetime NOT NULL, `t_check` int(11) NOT NULL, `performed_by` varchar(20) COLLATE utf8_unicode_ci DEFAULT NULL, `last_run` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, `flag` int(11) NOT NULL, `package_sent_diff` int(11) AS (package-sent) PERSISTENT, PRIMARY KEY (`mid`), KEY `last_run` (`last_run`), KEY `performed_by` (`performed_by`), KEY `sent` (`sent`), KEY `package` (`package`), KEY `sent_package` (`sent`,`package`), KEY `package_sent_diff` (`package_sent_diff`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=238343961;
package 是工作量, sent 是執行任務的頻率。如果
sent < package
有工作可供工人接受。package_sent_diff
用於索引查詢性能必須完成的數量。performed_by
用於將工作分配給工人。MariaDB 版本 10.0.20,該表有超過 200 萬行。
自動送出已開啟
幾個工作人員基於此隊列執行。
簡單地概述他們的查詢工作:
一名工人執行更新以聲明 10 個未使用的工作
UPDATE media SET performed_by = '71602155f18ac6001eb', last_run = NOW() WHERE flag = 0 AND t_check < 75 AND sent < 20 AND last_run < NOW() AND performed_by IS NULL AND sent < 4377 AND package_sent_diff > 0 LIMIT 10;
然後工人從表中選擇所有資訊
SELECT * FROM media WHERE performed_by = '71602155f18ac6001eb'
然後它執行作業並將作業的發送值加一併釋放該行以供以後執行
UPDATE media SET sent = sent + 1, performed_by = NULL WHERE mid = 238323961
此時我經常看到
package_sent_diff
key發生死鎖簡化程式碼:
$unique = uniqid(); $mysqli->query("UPDATE media SET performed_by = '".$unique."', last_run = NOW() WHERE flag = 0 AND t_check < 75 AND sent < 20 AND last_run < NOW() AND performed_by IS NULL AND sent < 4377 AND package_sent_diff > 0 LIMIT 10"); $query = $mysqli->query("SELECT * FROM media WHERE performed_by = '".$unique."'"); while($job = $query->fetc_assoc()) { if(doJob($job)) { $mysqli->query("UPDATE media SET sent = sent + 1, performed_by = NULL WHERE mid = ".$job['mid']); } }
這是結果
SHOW ENGINE INNODB STATUS
------------------------ LATEST DETECTED DEADLOCK ------------------------ 2015-09-10 14:09:22 7f7f27fb6700 *** (1) TRANSACTION: TRANSACTION 2544765863, ACTIVE 0 sec updating or deleting mysql tables in use 1, locked 1 LOCK WAIT 4 lock struct(s), heap size 1184, 3 row lock(s), undo log entries 1 MySQL thread id 725683, OS thread handle 0x7f7f50fb6700, query id 989929718 localhost media_queue updating UPDATE media SET sent = sent + 1, performed_by = NULL WHERE mid = 238323961 *** (1) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 2565 page no 59448 n bits 768 index `package_sent_diff` of table `media_queue`.`media` trx table locks 1 total table locks 13 trx id 2544765863 lock_mode X locks gap before rec insert intention waiting lock hold time 0 wait time before grant 0 *** (2) TRANSACTION: TRANSACTION 2544765389, ACTIVE 1 sec fetching rows, thread declared inside InnoDB 2350 mysql tables in use 1, locked 1 3029 lock struct(s), heap size 357928, 25327 row lock(s) MySQL thread id 725748, OS thread handle 0x7f7f27fb6700, query id 989929439 localhost media_queue updating UPDATE media SET performed_by = '71602155f18ac6001eb', last_run = NOW() WHERE flag = 0 AND t_check < 75 AND sent < 20 AND last_run < NOW() AND performed_by IS NULL AND sent < 4377 AND package_sent_diff > 0 LIMIT 10 *** (2) HOLDS THE LOCK(S): RECORD LOCKS space id 2565 page no 59448 n bits 768 index `package_sent_diff` of table `media_queue`.`media` trx table locks 1 total table locks 13 trx id 2544765389 lock_mode X lock hold time 0 wait time before grant 0 *** (2) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 2565 page no 55655 n bits 1192 index `package_sent_diff` of table `media_queue`.`media` trx table locks 1 total table locks 13 trx id 2544765389 lock_mode X waiting lock hold time 0 wait time before grant 0 *** WE ROLL BACK TRANSACTION (1)
要求作業的查詢和更新列的查詢似乎
sent
相互衝突,從而導致死鎖。執行的工人越多,我看到死鎖的頻率就越高
我經常讀到一個應用程序必須預料到死鎖,並且應該簡單地重新發出導致死鎖的查詢,但這並沒有給我帶來任何結果。
無論我多久重複一次
UPDATE media SET sent = sent + 1, performed_by = NULL WHERE mid = 238323961
查詢,3 次或 20 次,如果該查詢已經遇到死鎖,它總是會導致死鎖。我怎樣才能改變事情以擺脫僵局?保持快速查詢很重要,這樣我就可以在該表上執行 100 多個工作人員,而不會使其速度過慢。
//編輯:
添加複合索引後
ALTER TABLE `media_queue`.`media` ADD INDEX `performed_by_package_sent_diff` ( `performed_by` , `package_sent_diff` ) COMMENT '';
並將第二個查詢修改為
UPDATE media SET sent = sent + 3, performed_by = NULL WHERE mid = 243674295 AND performed_by = '92817855f2e21978b76'
我看到的死鎖減少了 10 倍,但死鎖仍然會發生。
這是
SHOW ENGINE INNODB STATUS
輸出:------------------------ LATEST DETECTED DEADLOCK ------------------------ 2015-09-11 15:48:29 7f7f487ff700 *** (1) TRANSACTION: TRANSACTION 2585082648, ACTIVE 0 sec updating or deleting mysql tables in use 1, locked 1 LOCK WAIT 4 lock struct(s), heap size 1184, 3 row lock(s), undo log entries 1 MySQL thread id 748336, OS thread handle 0x7f7f255b7700, query id 1018467234 localhost media_queue updating UPDATE media SET sent = sent + 1, performed_by = NULL WHERE mid = 244045072 AND performed_by = '11514555f2f24e094e7' *** (1) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 2587 page no 61522 n bits 784 index `performed_by_package_sent_diff` of table `media_queue`.`media` trx table locks 1 total table locks 41 trx id 2585082648 lock_mode X locks gap before rec insert intention waiting lock hold time 0 wait time before grant 0 *** (2) TRANSACTION: TRANSACTION 2585081924, ACTIVE 2 sec fetching rows, thread declared inside InnoDB 1852 mysql tables in use 1, locked 1 4559 lock struct(s), heap size 521768, 66353 row lock(s) MySQL thread id 748490, OS thread handle 0x7f7f487ff700, query id 1018466800 localhost media_queue Searching rows for update UPDATE media SET performed_by = '35359855f2f4a5a7eaf', last_run = NOW() WHERE flag = 0 AND t_check < 75 AND sent < 20 AND last_run < NOW() AND performed_by IS NULL AND sent < 4370 AND package_sent_diff > 0 LIMIT 10 *** (2) HOLDS THE LOCK(S): RECORD LOCKS space id 2587 page no 61522 n bits 784 index `performed_by_package_sent_diff` of table `media_queue`.`media` trx table locks 1 total table locks 41 trx id 2585081924 lock_mode X lock hold time 1 wait time before grant 0 *** (2) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 2587 page no 73537 n bits 520 index `performed_by_package_sent_diff` of table `media_queue`.`media` trx table locks 1 total table locks 41 trx id 2585081924 lock_mode X waiting lock hold time 0 wait time before grant 0 *** WE ROLL BACK TRANSACTION (1)
同樣在 90% 的情況下,如果在死鎖發生後重複查詢,我可以強制查詢通過。
這是盡可能優化還是可以完全擺脫死鎖?
添加
ORDER BY mid
可能會防止死鎖。但我真正的答案是“忍受它”。也就是說,認識到您遇到了死鎖並簡單地重新執行
UPDATE
。你被放慢了一點,但除此之外沒有造成任何傷害。無關:
KEY `sent` (`sent`), KEY `sent_package` (`sent`,`package`),
其中第一個是多餘的,可以刪除。
答案應該是修改您選擇的更新,讀取與更新 where 子句相關的主鍵,然後使用其主鍵單獨更新每一行。
我在大約 100 個 mariadb 數據庫的環境中工作,每個數據庫大約 32-40 個 cpu,384gbram,每個節點都有 6tb 到 33tb 的 mariadb 數據庫。從字面上看,每秒執行 300,000 到 500,000 筆交易,具體取決於交易量。
許多服務可以在 SUPER HIGH VOLUME 環境中執行的唯一方法是 100% 所有更新和刪除都必須按主鍵,這意味著您只需要單獨的行級鎖。這是當今 mysql 數據庫上可用的最精細的鎖定集。