Mysql
通過 percona 回放回放日誌時出現奇怪的鎖定問題
重播日誌時出現問題
我正在對一個新的數據庫節點進行基準測試(最後的規格)並且遇到了一些奇怪的行為:
如此處所述,我:
- 創建了轉儲 (innobackupex ftw)
- 我記錄了一個小時的所有查詢
- 設置我的新數據庫(與實時數據庫相同的 my.cnf 僅具有更高的innodb_buffer_pool_size)
- 開始重播我的慢查詢日誌
根據文件:
percona-playback --mysql-host=127.0.0.1\ --mysql-user=root --mysql-schema=my_db\ --query-log-file=slow.log
這可以正常工作大約 15 分鐘,然後我開始遇到奇怪的鎖定問題:
Error during query: Lock wait timeout exceeded; try restarting transaction, number of tries 0
我開始調試數據庫上的目前負載,發現只有一個查詢正在執行:
(取自 innodb 狀態)
---TRANSACTION 1C5264768, ACTIVE 44 sec inserting mysql tables in use 1, locked 1 LOCK WAIT 2 lock struct(s), heap size 376, 1 row lock(s) MySQL thread id 4289, OS thread handle 0x7f7fb0779700, query id 77515 localhost 127.0.0.1 root update insert into sessions (a, b, c, d, e, e, f, g, h, i, j, k, l, m, n, o, p, q) values (0, 682, ------- TRX HAS BEEN WAITING 44 SEC FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 4549 page no 7875876 n bits 104 index `PRIMARY` of table `production`.`sessions` trx id 1C5264768 lock_mode X insert intention waiting ------------------ TABLE LOCK table `production`.`sessions` trx id 1C5264768 lock mode IX RECORD LOCKS space id 4549 page no 7875876 n bits 104 index `PRIMARY` of table `production`.`sessions` trx id 1C5264768 lock_mode X insert intention waiting ---TRANSACTION 1C526475D, ACTIVE (PREPARED) 452 sec 2 lock struct(s), heap size 376, 1 row lock(s) MySQL thread id 1722, OS thread handle 0x7f7fb083d700, query id 77311 localhost 127.0.0.1 root Trx read view will not see trx with id >= 1C526475E, sees < 1C525BA04 TABLE LOCK table `production`.`sessions` trx id 1C526475D lock mode IX RECORD LOCKS space id 4549 page no 7875876 n bits 104 index `PRIMARY` of table `production`.`sessions` trx id 1C526475D lock_mode X ---------------------------- END OF INNODB MONITOR OUTPUT
只有一張桌子打開:
mysql> SHOW OPEN TABLES from production where In_use != 0; +----------------------+--------------+--------+-------------+ | Database | Table | In_use | Name_locked | +----------------------+--------------+--------+-------------+ | production | sessions | 1 | 0 | +----------------------+--------------+--------+-------------+ 1 row in set (0.00 sec)
這種情況會保持這種狀態大約 3-4 分鐘,然後突然播放繼續。
這些問題不會在實時數據庫上發生:我們在鎖定方面存在一些問題,但我們從未超過innodb_lock_wait_timeout值。
我很可能遺漏了一些明顯的東西,但是對於我的生活,我無法弄清楚,但是為什麼重播會像那樣掛起,或者更好的是,為什麼 mysql 會保持這種鎖定狀態?
慢日誌中的相關條目來自我們的 jee 伺服器:
XA START 0xbe681101606ce8d1676630322c7365727665722c5035313337,0x676630322c7365727665722c50353133372c00,0x4a5453; insert into sessions (a, b, c, d, e, e, f, g, h, i, j, k, l, m, n, o, p, q) values (0, 682, ...); XA END 0xbe681101606ce8d1676630322c7365727665722c5035313337,0x676630322c7365727665722c50353133372c00,0x4a5453;
hibernate的事務處理與鎖的生成方式和不關閉方式有什麼關係嗎?
伺服器規格
- Ubuntu 12.04.2 LTS
- percona-server-server-5.5 版本 5.5.32-rel31.0-549.precise
相關配置:
max_connections = 1500 sort_buffer_size = 1M thread_cache_size = 1000 max_heap_table_size = 512M tmp_table_size = 512M join_buffer_size = 67108864 expand_fast_index_creation = ON open_files_limit = 65535 table_definition_cache = 4096 table_open_cache = 262144 max_allowed_packet = 16M thread_stack = 192K query_cache_limit = 1M query_cache_size = 512M thread_concurrency = 8 query_cache_type = 1 long_query_time = 2 log_slave_updates = 1 expire_logs_days = 10 max_binlog_size = 100M
Innodb 配置:
default_storage_engine = InnoDB innodb_file_per_table = 1 innodb_old_blocks_time = 1000 innodb_buffer_pool_size = 163456M innodb_log_file_size = 256M innodb_flush_method = O_DIRECT innodb_read_io_threads = 4 innodb_write_io_threads = 4 innodb_doublewrite = FALSE innodb_flush_log_at_trx_commit = 2
感謝您在這方面的任何幫助或經驗!
編輯
我一直在玩一些 innodb 變數,並且在innodb_show_verbose_locks的幫助下能夠確定更多。在這個例子中:
---TRANSACTION 1C52D8AB4, ACTIVE 49 sec inserting mysql tables in use 1, locked 1 LOCK WAIT 2 lock struct(s), heap size 376, 1 row lock(s) MySQL thread id 18602, OS thread handle 0x7f007a4a0700, query id 624263 localhost 127.0.0.1 root update INSERT INTO `images` (A,B,C...) VALUES (....) ------- TRX HAS BEEN WAITING 49 SEC FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 51 page no 16791 n bits 152 index `PRIMARY` of table `production`.`images` trx id 1C52D8AB4 lock_mode X insert intention waiting Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0 0: len 8; hex 73757072656d756d; asc supremum;; ------------------ TABLE LOCK table `production`.`images` trx id 1C52D8AB4 lock mode IX RECORD LOCKS space id 51 page no 16791 n bits 152 index `PRIMARY` of table `production`.`images` trx id 1C52D8AB4 lock_mode X insert intention waiting Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0 0: len 8; hex 73757072656d756d; asc supremum;; ---TRANSACTION 1C52D8AA9, ACTIVE 151 sec 2 lock struct(s), heap size 376, 1 row lock(s) MySQL thread id 18460, OS thread handle 0x7f007454e700, query id 624243 localhost 127.0.0.1 root TABLE LOCK table `production`.`images` trx id 1C52D8AA9 lock mode IX RECORD LOCKS space id 51 page no 16791 n bits 152 index `PRIMARY` of table `production`.`images` trx id 1C52D8AA9 lock_mode X Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0 0: len 8; hex 73757072656d756d; asc supremum;;
事務1C52D8AA9和1C52D8AB4在地址**73757072656d756d上都有一個 IX 鎖定,這很好,因為我從這篇文章中收集到,因為 innodb 使用 MGL 鎖定。然而,後續X 鎖定(參見此處:“id 1C52D8AB4 lock_mode X insert intent waiting”)缺失。
我似乎找到了答案……至少它有效。
經過幾個小時的反複試驗,我發現問題似乎是我的慢日誌中的事務。我找不到任何文件來支持這一點,所以我仍然不確定我是否只是使用了錯誤的工具。
我評論了所有:
- XA 開始
- XA結束
- XA 送出
- XA 準備
- 開始;
- 犯罪;
我的日誌中的行,它沒有一個鎖就可以工作。