Amazon-Rds

Aurora MySQL 5.7 fails randomly

  • October 28, 2021

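This is the 5th time. It happens once a week (Tuesday or Wednesday, 03:00-07:00 UTC+0). The console shows the instance as available, but it cannot be reached. We tried waiting to see whether the instance would recover on its own, but nothing happened after about 30 minutes. So I rebooted it manually, and it came back online once the reboot finished (about 5 minutes).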

It would be helpful to understand what actually went wrong. This is just a development server with very few users and records.
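The symptom described above (console status "available" while the database cannot be reached) can be confirmed independently of the console with a plain TCP probe. The following is a minimal sketch, not part of the original post: the function name `endpoint_reachable` is hypothetical, and port 3306 is taken from the `Tcp port: 3306` line in the external log. A successful TCP handshake only shows that something is listening; it does not prove that mysqld is accepting logins.

```python
import socket


def endpoint_reachable(host: str, port: int = 3306, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Calling `endpoint_reachable()` with the cluster endpoint from a host inside the same network, for example from a scheduled job during the 03:00-07:00 UTC window, would give a timestamped record of when the instance stopped answering.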

Engine: Aurora MySQL 5.7.12
DB instance class: db.t2.small
Backup time: 16:00-16:30 UTC+0
Maintenance time: sun:17:00-sun:17:30 UTC+0

These are the only logs available after rebooting the instance.

error/mysql-error-running.log.2018-07-24.03 Tue Jul 24 11:14:06 GMT+800 2018    11.8 kB
error/mysql-error-running.log.2018-07-24.04 Tue Jul 24 11:30:00 GMT+800 2018    285.5 kB
error/mysql-error-running.log.2018-07-24.05 Tue Jul 24 12:30:00 GMT+800 2018    31.1 kB
error/mysql-error-running.log.2018-07-24.06 Tue Jul 24 13:30:00 GMT+800 2018    31.8 kB
error/mysql-error-running.log.2018-07-24.07 Tue Jul 24 14:30:00 GMT+800 2018    32.9 kB
error/mysql-error-running.log.2018-07-24.08 Tue Jul 24 15:30:00 GMT+800 2018    29 kB
error/mysql-error-running.log.2018-07-24.09 Tue Jul 24 16:30:00 GMT+800 2018    32.1 kB
error/mysql-error-running.log.2018-07-24.10 Tue Jul 24 17:30:00 GMT+800 2018    27.5 kB
error/mysql-error-running.log.2018-07-24.11 Tue Jul 24 18:30:00 GMT+800 2018    31.7 kB
error/mysql-error-running.log.2018-07-24.12 Tue Jul 24 19:30:00 GMT+800 2018    27.1 kB
error/mysql-error-running.log.2018-07-24.13 Tue Jul 24 20:30:00 GMT+800 2018    22.4 kB
error/mysql-error-running.log.2018-07-24.14 Tue Jul 24 21:30:00 GMT+800 2018    22.8 kB
error/mysql-error-running.log.2018-07-24.15 Tue Jul 24 22:30:00 GMT+800 2018    24.7 kB
error/mysql-error-running.log.2018-07-24.16 Tue Jul 24 23:30:00 GMT+800 2018    24.7 kB
error/mysql-error.log   Wed Jul 25 00:34:45 GMT+800 2018    2.6 kB
external/mysql-external.log Wed Jul 25 00:30:00 GMT+800 2018    7.6 kB
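One thing the listing above makes easy to check is whether any hourly error log is unusually large. The sketch below is an illustration only: it parses the listing as pasted and flags hourly files more than three times the median size. The helper names and the factor of 3 are arbitrary assumptions, not anything from the original post.

```python
import re
from statistics import median

LISTING = """\
error/mysql-error-running.log.2018-07-24.03 Tue Jul 24 11:14:06 GMT+800 2018    11.8 kB
error/mysql-error-running.log.2018-07-24.04 Tue Jul 24 11:30:00 GMT+800 2018    285.5 kB
error/mysql-error-running.log.2018-07-24.05 Tue Jul 24 12:30:00 GMT+800 2018    31.1 kB
error/mysql-error-running.log.2018-07-24.06 Tue Jul 24 13:30:00 GMT+800 2018    31.8 kB
error/mysql-error-running.log.2018-07-24.07 Tue Jul 24 14:30:00 GMT+800 2018    32.9 kB
error/mysql-error-running.log.2018-07-24.08 Tue Jul 24 15:30:00 GMT+800 2018    29 kB
error/mysql-error-running.log.2018-07-24.09 Tue Jul 24 16:30:00 GMT+800 2018    32.1 kB
error/mysql-error-running.log.2018-07-24.10 Tue Jul 24 17:30:00 GMT+800 2018    27.5 kB
error/mysql-error-running.log.2018-07-24.11 Tue Jul 24 18:30:00 GMT+800 2018    31.7 kB
error/mysql-error-running.log.2018-07-24.12 Tue Jul 24 19:30:00 GMT+800 2018    27.1 kB
error/mysql-error-running.log.2018-07-24.13 Tue Jul 24 20:30:00 GMT+800 2018    22.4 kB
error/mysql-error-running.log.2018-07-24.14 Tue Jul 24 21:30:00 GMT+800 2018    22.8 kB
error/mysql-error-running.log.2018-07-24.15 Tue Jul 24 22:30:00 GMT+800 2018    24.7 kB
error/mysql-error-running.log.2018-07-24.16 Tue Jul 24 23:30:00 GMT+800 2018    24.7 kB
error/mysql-error.log   Wed Jul 25 00:34:45 GMT+800 2018    2.6 kB
external/mysql-external.log Wed Jul 25 00:30:00 GMT+800 2018    7.6 kB
"""

# File name first, size in kB last; the timestamp in between is skipped.
LINE_RE = re.compile(r"^(\S+)\s+.*?\s+([\d.]+)\s*kB$")


def parse_listing(text):
    """Return (file name, size in kB) pairs for every line that parses."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            entries.append((match.group(1), float(match.group(2))))
    return entries


def oversized(entries, factor=3.0):
    """Flag hourly error logs larger than factor times the median hourly size."""
    hourly = [e for e in entries if "error-running" in e[0]]
    threshold = factor * median(size for _, size in hourly)
    return [e for e in hourly if e[1] > threshold]


if __name__ == "__main__":
    for name, size in oversized(parse_listing(LISTING)):
        print(f"{name}: {size} kB")
```

With the listing as pasted, the only file flagged is the `.04` log at 285.5 kB, which covers the hour right after the restart.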

external/mysql-external.log

/rdsdbbin/oscar/bin/mysqld, Version: 5.7.12 (MySQL Community Server (GPL)). started with:
Tcp port: 3306 Unix socket: /tmp/mysql.sock
Time,ServerHost,User,UserHost,Command,Payload
/rdsdbbin/oscar/bin/mysqld, Version: 5.7.12 (MySQL Community Server (GPL)). started with:
Tcp port: 3306 Unix socket: /tmp/mysql.sock
Time,ServerHost,User,UserHost,Command,Payload
/rdsdbbin/oscar/bin/mysqld, Version: 5.7.12 (MySQL Community Server (GPL)). started with:
Tcp port: 3306 Unix socket: /tmp/mysql.sock
Time,ServerHost,User,UserHost,Command,Payload
----------------------- END OF LOG ----------------------

error/mysql-error-running.log.2018-07-24.03 shows: https://pastebin.com/ywmXLR5g

error/mysql-error-running.log.2018-07-24.04 shows: https://pastebin.com/g1dkR6rj

error/mysql-error-running.log.2018-07-24.18 shows: https://pastebin.com/g0aAXfaT

None of the other logs show anything (see image).

[image]

Event log

July 24, 2018 at 11:14:14 AM UTC+8  DB instance restarted
July 24, 2018 at 11:13:31 AM UTC+8  Error restarting mysql: Engine bootstrap failed with no mysqld process running...
July 24, 2018 at 11:12:01 AM UTC+8  Recovery of the DB instance is complete.
July 24, 2018 at 11:04:26 AM UTC+8  Recovery of the DB instance has started. Recovery time will vary with the amount of data to be recovered.
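The event log is stamped in UTC+8, while the failure window, backup window, and maintenance window in the question are given in UTC+0. The sketch below converts the event times to UTC to check that they fall inside the reported 03:00-07:00 window and outside the 16:00-16:30 backup window. It is illustrative only: the helper names are hypothetical, and `strptime` with `%B` and `%p` assumes an English locale.

```python
from datetime import datetime, time, timedelta, timezone

UTC8 = timezone(timedelta(hours=8))

# Timestamps copied from the event log above (all UTC+8).
EVENT_STAMPS = [
    "July 24, 2018 at 11:14:14 AM",  # DB instance restarted
    "July 24, 2018 at 11:13:31 AM",  # Error restarting mysql
    "July 24, 2018 at 11:12:01 AM",  # Recovery complete
    "July 24, 2018 at 11:04:26 AM",  # Recovery started
]

REPORTED_WINDOW = (time(3, 0), time(7, 0))    # failure window from the question, UTC
BACKUP_WINDOW = (time(16, 0), time(16, 30))   # backup window from the question, UTC


def to_utc(stamp: str) -> datetime:
    """Parse a console timestamp given in UTC+8 and convert it to UTC."""
    local = datetime.strptime(stamp, "%B %d, %Y at %I:%M:%S %p").replace(tzinfo=UTC8)
    return local.astimezone(timezone.utc)


def in_window(moment: datetime, window) -> bool:
    """True if the time of day of moment lies in [start, end)."""
    start, end = window
    return start <= moment.time() < end


if __name__ == "__main__":
    for stamp in EVENT_STAMPS:
        utc = to_utc(stamp)
        print(utc.isoformat(), in_window(utc, REPORTED_WINDOW), in_window(utc, BACKUP_WINDOW))
```

For example, the recovery start at 11:04:26 AM UTC+8 is 03:04:26 UTC, which is inside the reported failure window and well away from the backup window.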

CPU utilization (07-24-2018) [image]

CPU utilization (July 11, 2018 to July 24, 2018) [image]

Special thanks to @WilsonHauck. After 4 weeks of monitoring, manually upgrading Aurora to the latest version resolved the problem.

Version 2.01.1 already includes several bug fixes that address unexpected restarts. https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/AuroraMySQL.Updates.20Updates.html

To upgrade your Aurora cluster manually:

  1. Go to RDS in the AWS console
  2. Navigate to Clusters
  3. Select your cluster
  4. Click Actions >> Upgrade now
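The console steps above could probably be scripted as well. The following is a rough sketch, assuming the console's "Upgrade now" action corresponds to applying a pending `system-update` maintenance action on the cluster, which the original answer does not confirm. The function names are hypothetical; `describe_pending_maintenance_actions` and `apply_pending_maintenance_action` are the boto3 RDS client calls. The client is passed in so the logic can be tried against a stub before touching a real cluster.

```python
def pending_actions(rds_client, cluster_arn):
    """Return the names of pending maintenance actions for one resource ARN."""
    resp = rds_client.describe_pending_maintenance_actions(
        ResourceIdentifier=cluster_arn
    )
    actions = []
    for resource in resp.get("PendingMaintenanceActions", []):
        for detail in resource.get("PendingMaintenanceActionDetails", []):
            actions.append(detail["Action"])
    return actions


def apply_now(rds_client, cluster_arn, action="system-update"):
    """Apply the pending action immediately; return False if none is pending.

    Assumption: "system-update" is the action the console applies for
    "Upgrade now". Check the output of pending_actions() first.
    """
    if action not in pending_actions(rds_client, cluster_arn):
        return False
    rds_client.apply_pending_maintenance_action(
        ResourceIdentifier=cluster_arn,
        ApplyAction=action,
        OptInType="immediate",
    )
    return True
```

With boto3 installed and credentials configured, this would be called as `apply_now(boto3.client("rds"), cluster_arn)`, where `cluster_arn` is the ARN of your own cluster. Applying the update restarts the instance, so on anything other than a development server it is safer to schedule it.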

Source: https://dba.stackexchange.com/questions/213030