為什麼 MySQL 5.7 複製在我的新伺服器上非常慢
我對 MySQL 複製有疑問,我想找到有關它的建議或資訊。
讓我向您展示基礎設施。
師父:這是我們主要的 MySQL 伺服器。我想用Slave B替換它。
Slave A:這是我們連接到Master的複制伺服器。這台伺服器沒有問題。
Slave B:這是我們的 MySQL 伺服器,將取代Master。目前,此伺服器被配置為複制以準備傳輸。
目標: 我有兩個執行良好的 MySQL 伺服器(主/從)。我想用一個新的更高效的伺服器(從 B )替換我們的****主伺服器。為此,我們設置了第二個從 B複製以準備傳輸,但我遇到了問題。我試圖調查這個問題,但我不是 MySQL 專家,我想尋求幫助以找到問題。
PS:我在****Slave B上安裝了Percona Monitoring Management和sysbench工具。因此,如果需要,我可以輕鬆地從這些工具中提供資訊來完成我的文章。
機器硬體細節:
掌握:
奴隸A
從機 B
https://versus.com/en/intel-xeon-e5-2650-vs-intel-xeon-silver-4214r
觀察複製延遲:
掌握:
掌握文件mysql-bin.024080的工作
show master status\G *************************** 1. row *************************** File: mysql-bin.024080 Position: 62186738
奴隸 B:
從站 B擁有所有文件(mysql-bin.024080),它適用於文件mysql-bin.024075。這些文件之間的差距隨著時間的推移而增加。目前 SQL_THREAD 落後於 5 個二進制日誌。
show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 10.55.55.21 Master_User: slave_replicator Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.024080 Read_Master_Log_Pos: 60111446 Relay_Log_File: mysql-relay-bin.002952 Relay_Log_Pos: 154669142 Relay_Master_Log_File: mysql-bin.024075 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 154668929 Relay_Log_Space: 2631436265 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 68158 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 195390 Master_UUID: 361633ab-6bb7-11e7-829c-1a76636a64a1 Master_Info_File: /var/lib/mysql/master.info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Reading event from the relay log Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: Executed_Gtid_Set: Auto_Position: 0 Replicate_Rewrite_DB: Channel_Name: Master_TLS_Version:
奴隸A:
從站 A是最新的。
show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 10.55.55.21 Master_User: slave_replicator Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.024080 Read_Master_Log_Pos: 88327716 Relay_Log_File: mysql-relay-bin.417046 Relay_Log_Pos: 88327835 Relay_Master_Log_File: mysql-bin.024080 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 88327716 Relay_Log_Space: 88328255 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 195390 Master_UUID: 361633ab-6bb7-11e7-829c-1a76636a64a1 Master_Info_File: /var/lib/mysql/master.info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: Executed_Gtid_Set: Auto_Position: 0 Replicate_Rewrite_DB: Channel_Name: Master_TLS_Version: 1 row in set (0.00 sec)
我確定了3種方法
1:MySQL配置問題 MySQL 的很多參數都會影響伺服器性能,因此優化MySQL非常重要。當我改變sync_binlog = 0和innodb_flush_log_at_trx_commit = 2參數時,伺服器性能大大提高並允許從 B的複制趕上。但是,我想離開sync_binlog = 1和innodb_flush_log_at_trx_commit = 1。您知道任何其他可能影響複製速度的參數嗎?
2:單執行緒 CPU 問題 MySQL 在單個程序上執行複制,因此只佔用一個 CPU。由於Slave B (2.40GHz) 的 CPU 比Slave A (2.60GHz) 慢。從站 A上的 MySQL 複製過程對工作負載沒有任何問題,而從站 B無法支持。我怎樣才能驗證這一點?
3:硬體 問題 可以考慮硬體問題。我怎樣才能找到瓶頸?
這一定是 I/O 問題,因為磁碟在Slave B上很忙。但是讀寫性能看起來很糟糕?
ATOP - SLAVE B 2021/01/28 19:07:53 -------------- 10s elapsed PRC | sys 0.41s | user 0.73s | #proc 576 | #trun 2 | #tslpi 668 | #tslpu 2 | #zombie 0 | clones 0 | #exit 0 | CPU | sys 3% | user 7% | irq 0% | idle 4722% | wait 69% | steal 0% | guest 0% | curf 1.97GHz | curscal 56% | CPL | avg1 1.86 | avg5 1.67 | avg15 1.79 | | csw 67549 | | intr 28125 | | numcpu 48 | MEM | tot 187.6G | free 161.3G | cache 5.7G | buff 148.7M | slab 716.4M | shmem 0.5M | vmbal 0.0M | hptot 0.0M | hpuse 0.0M | SWP | tot 511.0M | free 429.1M | | | | | | vmcom 168.4G | vmlim 94.3G | DSK | sda | busy 99% | read 325 | write 4319 | KiB/r 16 | KiB/w 10 | MBr/s 0.5 | MBw/s 4.2 | avio 2.12 ms | NET | transport | tcpi 359 | tcpo 417 | udpi 0 | udpo 0 | tcpao 1 | tcppo 1 | tcprs 0 | udpie 0 | NET | network | ipi 364 | ipo 345 | ipfrw 0 | deliv 362 | | | icmpi 3 | icmpo 0 | NET | enp61s0 0% | pcki 132 | pcko 81 | sp 10 Gbps | si 10 Kbps | so 77 Kbps | erri 0 | erro 0 | drpo 0 | NET | enp61s0 0% | pcki 131 | pcko 138 | sp 10 Gbps | si 72 Kbps | so 55 Kbps | erri 0 | erro 0 | drpo 0 | PID TID RDDSK WRDSK WCANCL DSK CMD 1/2 12588 - 5520K 22508K 0K 99% mysqld
這是我的SLAVE B(Raid 10) 上的 SSD https://www.actualis.com/fr/sata/7075-intel-s4510-384tb-sata-6gb-s-3d-tlc-25-1dwpd-ssdsc2kb038t8。 html
我在Slave B上執行了 IO 測試,以檢查我的問題是否來自 I/O。我們可以看到,這個結果比之前只有 MySQL Replication 的結果要好得多。
為什麼我無法使用 MySQL 獲得相同的 I/O 性能?
執行 fio 的 ATOP
$> fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=2G --readwrite=randrw --rwmixread=75 --numjobs=1``` ```bash ATOP - SLAVE B 2021/02/01 09:07:29 -------------- 10s elapsed PRC | sys 1.14s | user 1.04s | #proc 587 | #trun 1 | #tslpi 680 | #tslpu 0 | #zombie 0 | clones 0 | | #exit 2 | CPU | sys 10% | user 6% | irq 5% | idle 4780% | wait 0% | | steal 0% | guest 0% | curf 1.58GHz | curscal 45% | CPL | avg1 2.23 | avg5 5.91 | avg15 5.57 | | | csw 437791 | intr 416652 | | | numcpu 48 | MEM | tot 187.6G | free 3.2G | cache 56.0G | buff 162.9M | slab 2.2G | shmem 0.7M | shrss 0.0M | vmbal 0.0M | hptot 0.0M | hpuse 0.0M | SWP | tot 511.0M | free 429.3M | | | | | | | vmcom 168.4G | vmlim 94.3G | DSK | sda | busy 68% | read 209257 | write 70808 | KiB/r 4 | KiB/w 4 | MBr/s 82.5 | MBw/s 27.7 | avq 63.06 | avio 0.02 ms | NET | transport | tcpi 389 | tcpo 555 | udpi 0 | udpo 0 | tcpao 1 | tcppo 1 | tcprs 0 | tcpie 0 | udpie 0 | NET | network | ipi 392 | ipo 385 | ipfrw 0 | deliv 391 | | | | icmpi 2 | icmpo 0 | NET | enp61s0 0% | pcki 188 | pcko 234 | sp 10 Gbps | si 13 Kbps | so 243 Kbps | erri 0 | erro 0 | drpi 0 | drpo 0 | NET | enp61s0 0% | pcki 20 | pcko 47 | sp 10 Gbps | si 1 Kbps | so 50 Kbps | erri 0 | erro 0 | drpi 0 | drpo 0
從 B 上執行 MySQL 的 ATOP
ATOP - Slave B 2021/02/03 08:16:50 -------------- 10s elapsed PRC | sys 0.55s | user 0.71s | #proc 573 | #tslpi 785 | #tslpu 2 | #zombie 0 | #exit 1 | CPU | sys 5% | user 6% | irq 0% | idle 4738% | wait 52% | curf 1.90GHz | curscal 54% | CPL | avg1 2.94 | avg5 2.26 | avg15 1.82 | csw 85384 | intr 34355 | | numcpu 48 | MEM | tot 187.6G | free 92.5G | cache 72.4G | buff 178.5M | slab 2.7G | vmbal 0.0M | hptot 0.0M | SWP | tot 511.0M | free 429.5M | | | | vmcom 169.3G | vmlim 94.3G | DSK | sda | busy 99% | read 727 | write 4837 | MBr/s 1.1 | MBw/s 5.9 | avio 1.78 ms | NET | transport | tcpi 1232 | tcpo 1284 | udpi 0 | udpo 0 | tcpao 2 | tcppo 2 | NET | network | ipi 1236 | ipo 1208 | ipfrw 0 | deliv 1234 | icmpi 2 | icmpo 0 | NET | enp61s0 0% | pcki 1019 | pcko 1002 | sp 10 Gbps | si 508 Kbps | so 105 Kbps | erro 0 | NET | enp61s0 0% | pcki 121 | pcko 78 | sp 10 Gbps | si 9 Kbps | so 77 Kbps | erro 0 |
執行 MySQL 的 Master 上的 ATOP
ATOP - Master 2021/02/03 08:13:07 ------ 10s elapsed PRC | sys 2.75s | user 9.46s | #proc 313 | #tslpi 468 | #tslpu 1 | #zombie 0 | #exit 0 | CPU | sys 23% | user 89% | irq 1% | idle 2820% | wait 68% | curf 2.10GHz | curscal ?% | CPL | avg1 2.65 | avg5 2.51 | avg15 1.74 | csw 114254 | intr 111525 | | numcpu 30 | MEM | tot 117.6G | free 540.8M | cache 83.6M | dirty 0.0M | buff 125.8M | slab 316.8M | | SWP | tot 120.0G | free 100.6G | | | | vmcom 121.2G | vmlim 178.8G | PAG | scan 62526 | stall 0 | | | swin 13616 | | swout 13017 | LVM | master--vg-root | busy 48% | read 17477 | write 1580 | MBr/s 27.30 | MBw/s 0.65 | avio 0.25 ms | LVM | a--vg-swap_1 | busy 31% | read 13617 | write 13017 | MBr/s 5.32 | MBw/s 5.08 | avio 0.12 ms | DSK | xvda | busy 61% | read 30832 | write 2685 | MBr/s 32.62 | MBw/s 5.74 | avio 0.18 ms | NET | transport | tcpi 7823 | tcpo 7678 | udpi 2 | udpo 2 | tcpao 1 | tcppo 385 | NET | network | ipi 7825 | ipo 6857 | ipfrw 0 | deliv 7825 | icmpi 0 | icmpo 0 | NET | eth0 ---- | pcki 7829 | pcko 6858 | si 984 Kbps | so 2202 Kbps | erri 0 | erro 0 |
每秒速率 = RPS
為您的 Slave B my.cnf 考慮的建議
$$ mysqld $$部分
innodb_lru_scan_depth=100 # from 1024 to conserve 90% of CPU cycles used for function every second innodb_log_buffer_size=1G # from 16M for ~ 30 minutes in RAM before write to media innodb_io_capacity=1500 # from 200 to enable higher IOPS to your SSD storage read_rnd_buffer_size=128K # from 256K to reduce handler_read_rnd_next RPS of 6,753
有關其他建議,請查看配置文件、聯繫資訊的網路配置文件和可免費下載的實用程序腳本,以幫助進行性能調整。
使用 Slave B 有更多機會提高處理速度。