損壞的 InnoDB 表、mysqlcheck 和 mysqldump 崩潰伺服器

May 27, 2020

我管理一個 Percona XtraDB 集群，該集群使用帶有擺動連接的網路儲存。我們會周期性地遇到高 iowait 崩潰並以只讀方式重新掛載 fs。不幸的是，更換儲存目前不是一種選擇。

最近我注意到當執行 mysqldump 或 mysqlcheck 時，它們會使節點上的 MySQL 伺服器崩潰，並出現錯誤mysqlcheck: Got error: 2013: Lost connection to MySQL server during query when executing 'CHECK TABLE ... '

以下是mysqld.log崩潰期間的內容：

InnoDB: Error in pages 9479 and 9480 of index "PRIMARY" of table "foobar"."quux"
InnoDB: broken FIL_PAGE_NEXT or FIL_PAGE_PREV links
2015-09-28 14:39:45 7f015813b700 InnoDB: Page dump in ascii and hex (16384 bytes):

(...)

InnoDB: End of page dump
2015-09-28 14:39:45 7f015813b700 InnoDB: uncompressed page, stored checksum in field1 4038097986, calculated checksums for field1: crc32 2787032309, innodb 4038097986, none 3735928559, stored checksum in field2 1190336748, calculated checksums for field2: crc32 2787032309, innodb 1190336748, none 3735928559, page LSN 4 3652646491, low 4 bytes of LSN at page end 3652646491, page number (if stored to page already) 9479, space id (if created with &gt;= MySQL-4.1.1 and stored already) 18
InnoDB: Page may be an index page where index id is 67
InnoDB: (index "PRIMARY" of table "foobar"."quux")
2015-09-28 14:39:45 7f015813b700 InnoDB: Page dump in ascii and hex (16384 bytes):

(...) 

InnoDB: End of page dump
2015-09-28 14:39:46 7f015813b700 InnoDB: uncompressed page, stored checksum in field1 554678569, calculated checksums for field1: crc32 2178598661, innodb 554678569, none 3735928559, stored checksum in field2 1065260512, calculated checksums for field2: crc32 2178598661, innodb 1065260512, none 3735928559, page LSN 10 202985777, low 4 bytes of LSN at page end 202985777, page number (if stored to page already) 6792, space id (if created with &gt;= MySQL-4.1.1 and stored already) 18
InnoDB: Page may be an index page where index id is 67
InnoDB: (index "PRIMARY" of table "foobar"."quux")
InnoDB: Corruption of an index tree: table "foobar"."quux", index "PRIMARY",
InnoDB: father ptr page no 55234, child page no 9479
PHYSICAL RECORD: n_fields 14; compact format; info bits 0
0: len 30; hex 34616434393538322d353232372d653863302d326466662d353461663639; asc 4ad49582-5227-e8c0-2dff-54af69; (total 36 bytes);
1: len 6; hex 000000000dd7; asc       ;;
2: len 7; hex c8000001741ea1; asc     t  ;;
3: len 5; hex 99951259cb; asc    Y ;;
4: len 5; hex 99951259cb; asc    Y ;;
5: len 30; hex 62633965323864352d383865382d343466322d393337322d353339303931; asc bc9e28d5-88e8-44f2-9372-539091; (total 36 bytes);
6: len 30; hex 62633965323864352d383865382d343466322d393337322d353339303931; asc bc9e28d5-88e8-44f2-9372-539091; (total 36 bytes);
7: len 1; hex 80; asc  ;;
8: len 30; hex 64356664666538352d656431652d346465362d383363612d616439663164; asc d5fdfe85-ed1e-4de6-83ca-ad9f1d; (total 36 bytes);
9: len 8; hex 436f6e7461637473; asc Contacts;;
10: len 4; hex 6c696e6b; asc link;;
11: len 30; hex 7b226f626a656374223a7b226e616d65223a224d72204a6f7264616e204e; asc {"object":{"name":"Mr John Hacker; (total 343 bytes);
12: len 4; hex 80000000; asc     ;;
13: len 30; hex 7b226e616d65223a22222c22646f635f6f776e6572223a22222c22757365; asc {"name":"","doc_owner":"","use; (total 72 bytes);
n_owned: 0; heap_no: 2; next rec: 751
PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 30; hex 34616435616262302d303535662d333939612d613038652d353439396461; asc 4ad5abb0-055f-399a-a08e-5499da; (total 36 bytes);
1: len 4; hex 0000d7c2; asc     ;;
n_owned: 0; heap_no: 277; next rec: 4688
InnoDB: You should dump + drop + reimport the table to fix the
InnoDB: corruption. If the crash happens at the database startup, see
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html about
InnoDB: forcing recovery. Then dump + drop + reimport.
2015-09-28 14:39:46 7f015813b700  InnoDB: Assertion failure in thread 139643749381888 in file btr0btr.cc line 1492
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
12:39:46 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https://bugs.launchpad.net/percona-xtradb-cluster

key_buffer_size=25165824
read_buffer_size=131072
max_used_connections=7
max_threads=202
thread_count=10
connection_count=5
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 105204 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0xf4de780
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f015813ad38 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x8fa965]
/usr/sbin/mysqld(handle_fatal_signal+0x4b4)[0x665644]
/lib64/libpthread.so.0(+0xf710)[0x7f0185a25710]
/lib64/libc.so.6(gsignal+0x35)[0x7f0183e6b625]
/lib64/libc.so.6(abort+0x175)[0x7f0183e6ce05]
/usr/sbin/mysqld[0xa10d84]
/usr/sbin/mysqld[0xa16cc8]
/usr/sbin/mysqld[0x917920]
/usr/sbin/mysqld(_ZN7handler8ha_checkEP3THDP15st_ha_check_opt+0x6a)[0x5a422a]
/usr/sbin/mysqld[0x835fc3]
/usr/sbin/mysqld(_ZN19Sql_cmd_check_table7executeEP3THD+0xc2)[0x836cd2]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x33d5)[0x6ed235]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x658)[0x6f0958]
/usr/sbin/mysqld[0x6f0acd]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x19d5)[0x6f2de5]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x22b)[0x6f42cb]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x17f)[0x6bc52f]
/usr/sbin/mysqld(handle_one_connection+0x47)[0x6bc717]
/usr/sbin/mysqld(pfs_spawn_thread+0x12a)[0xaf611a]
/lib64/libpthread.so.0(+0x79d1)[0x7f0185a1d9d1]
/lib64/libc.so.6(clone+0x6d)[0x7f0183f218fd]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7efebc091d30): is an invalid pointer
Connection ID (thread ID): 2336
Status: NOT_KILLED

很明顯，桌子foobar.quux嚴重損壞。使用數據庫的應用程序仍然有效（儘管性能有所降低），SELECT語句也是如此。

mysqlcheck 工具不能用來修復它，所以我知道的解決方案是做 a SELECT * FROM quux INTO OUTFILE，刪除表，然後LOAD DATA INFILE在下一個維護視窗期間做 a 。這種處理方式是否存在缺點，是否有其他選擇來修復表格？

**編輯：**我重新啟動了 MySQL 伺服器，其值innodb_force_recovery從 1 增加到 4，結果始終相同：

mysqldump 失敗並出現錯誤mysqldump: Error 2013: Lost connection to MySQL server during query when dumping table quux at row: 156915
MySQL 命令SELECT * FROM quux INTO OUTFILE '/root/quux.sql';在出現錯誤後不久失敗ERROR 2013 (HY000): Lost connection to MySQL server during query

我要試試innodb_force_recovery=5嗎innodb_force_recovery=6？可能有什麼缺點？

mysqlcheck不會修復 InnoDB 表中的損壞。您需要從表中轉儲數據並重新創建它。
innodb_force_recovery使用選項啟動 MySQL 。嘗試從 1 到 6 的值，直到 MySQL 啟動。
用轉儲表mysqldump。
放下桌子。
重啟 MySQL w/o innodb_force_recovery。
重新載入轉儲。

對於它的價值，我發現它mysqlcheck -o可以有效地修復損壞的 InnoDB 表，因為它實際上做了“重新創建 + 分析”。我有一個損壞的表，它阻止了 mysql 啟動。首先我試過：
mysqlcheck -r db_name table_name
返回：
db_name.table_name
note     : The storage engine for the table doesn't support repair
然後我嘗試了：
mysqlcheck -o db_name table_name
在這裡…
db_name.table_name
note     : Table does not support optimize, doing recreate + analyze instead
status   : OK
不確定這是否會修復各種損壞，但它對我有用。

引用自：https://dba.stackexchange.com/questions/116730

損壞的 InnoDB 表、mysqlcheck 和 mysqldump 崩潰伺服器

相關問答

mysql 5.5 損壞 - 缺少臨時表

MySQL 5.6 中的頻繁損壞索引。這是一個 MySQL 錯誤嗎？

防止 MySQL 索引損壞

在並發截斷命令期間伺服器崩潰後 MySQL INNODB 損壞

無法載入 mysql 轉儲文件

即使在完全轉儲和重新導入之後，MySQL 也會損壞