Galera
MariaDB 10.1 Galera cluster error
I am trying to set up a MariaDB Galera Cluster with 2 nodes:
Node 1 / 172.23.0.2:
wsrep_on=ON
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
binlog_format=ROW
wsrep_cluster_address='gcomm://'
wsrep_sst_receive_address = '172.23.0.2:4444'
wsrep_cluster_name='cluster'
wsrep_node_name='n_01'
wsrep_sst_method=rsync
wsrep_sst_auth=cluster_user:cluster_pass
Node 2 / 172.23.0.3:
wsrep_on=ON
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
binlog_format=ROW
wsrep_cluster_address='gcomm://172.23.0.2'
wsrep_sst_receive_address = '172.23.0.3:4444'
wsrep_cluster_name='cluster'
wsrep_node_name='n_02'
wsrep_sst_method=rsync
wsrep_sst_auth=cluster_user:cluster_pass
The first node starts without errors:
Variable_name         Value
--------------------  ---------
wsrep_cluster_size    1
wsrep_cluster_status  Primary
wsrep_connected       ON
wsrep_ready           ON
But when I start the 2nd node, I get this:
mariadb.service - MariaDB database server
   Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/mariadb.service.d
           └─migrated-from-my.cnf-settings.conf
   Active: failed (Result: exit-code) since jeu. 2017-08-24 19:11:32 CEST; 14s ago
  Process: 14656 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
  Process: 20222 ExecStart=/usr/sbin/mysqld $MYSQLD_OPTS $_WSREP_NEW_CLUSTER $_WSREP_START_POSITION (code=exited, status=1/FAILURE)
  Process: 18861 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= || VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ] && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, status=0/SUCCESS)
  Process: 18858 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
 Main PID: 20222 (code=exited, status=1/FAILURE)
   Status: "MariaDB server is down"
   CGroup: /system.slice/mariadb.service
           ├─20357 /bin/bash -ue /usr//bin/wsrep_sst_rsync --role joiner --address 172.23.0.3 --datadir /home/mysql/ --parent 20309 --binlog /var/log/mariadb/binlog/mysql_binlog
           ├─20391 rsync --daemon --no-detach --port 4444 --config /home/mysql//rsync_sst.conf
           ├─22006 /bin/bash -ue /usr//bin/wsrep_sst_rsync --role joiner --address 172.23.0.3 --datadir /home/mysql/ --parent 21997 --binlog /var/log/mariadb/binlog/mysql_binlog
           ├─22638 sleep 0.2
           └─22648 sleep 0.2

août 24 19:11:23 ovh38 mysqld[20222]: 2017-08-24 19:11:23 127079663351552 [ERROR] WSREP: Failed to read 'ready <addr>' from: wsrep_sst_rsync --role 'joiner' --address '172.23.0.3:4444' --datadir '/home/mysql/' --pa...log/mysql_binlog'
août 24 19:11:23 ovh38 mysqld[20222]: Read: '(null)'
août 24 19:11:23 ovh38 mysqld[20222]: 2017-08-24 19:11:23 127079663351552 [ERROR] WSREP: Process completed with error: wsrep_sst_rsync --role 'joiner' --address '172.23.0.3:4444' --datadir '/home/mysql/' --parent '...eady in progress)
août 24 19:11:23 ovh38 mysqld[20222]: 2017-08-24 19:11:23 127080155712256 [ERROR] WSREP: Failed to prepare for 'rsync' SST. Unrecoverable.
août 24 19:11:23 ovh38 mysqld[20222]: 2017-08-24 19:11:23 127080155712256 [ERROR] Aborting
août 24 19:11:32 ovh38 mysqld[20222]: Error in my_thread_global_end(): 1 threads didn't exit
août 24 19:11:32 ovh38 systemd[1]: mariadb.service: main process exited, code=exited, status=1/FAILURE
août 24 19:11:32 ovh38 systemd[1]: Failed to start MariaDB database server.
août 24 19:11:32 ovh38 systemd[1]: Unit mariadb.service entered failed state.
août 24 19:11:32 ovh38 systemd[1]: mariadb.service failed.
Hint: Some lines were ellipsized, use -l to show in full.
Update:
The cause of this error was that an rsync process was already running and holding port 4444, so the solution was to kill it:
Proto Recv-Q Send-Q Adresse locale   Adresse distante  Etat    PID/Program name
tcp   0      0      0.0.0.0:21       0.0.0.0:*         LISTEN  1087/proftpd: (acce
tcp   0      0      0.0.0.0:22       0.0.0.0:*         LISTEN  708/sshd
tcp   0      0      0.0.0.0:4444     0.0.0.0:*         LISTEN  15510/rsync
tcp6  0      0      :::80            :::*              LISTEN  19059/httpd
tcp6  0      0      :::22            :::*              LISTEN  708/sshd
tcp6  0      0      :::443           :::*              LISTEN  19059/httpd
tcp6  0      0      :::4444          :::*              LISTEN  15510/rsync
tcp6  0      0      :::545           :::*              LISTEN  19059/httpd

# kill -9 15510
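The same check can be scripted before starting the joiner. This is only a minimal sketch: the helper name `port_in_use` is hypothetical, it probes 127.0.0.1 by default, and it only tells you that something is listening on the port, not which process owns it (the netstat listing above is still needed for the PID).

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper for illustration; not part of MariaDB or Galera.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    SST_PORT = 4444  # port taken from wsrep_sst_receive_address in the question
    if port_in_use(SST_PORT):
        print(f"port {SST_PORT} is already in use; look for a leftover rsync daemon")
    else:
        print(f"port {SST_PORT} is free")
```

Run it on the joining node before `systemctl start mariadb`; if the port is reported as busy, identify the owner with netstat as shown above.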
I then tried to restart the second node:
systemctl start mariadb
On the first node: SHOW STATUS LIKE 'wsrep_cluster%'
Variable_name             Value
------------------------  ------------------------------------
wsrep_cluster_conf_id     2
wsrep_cluster_size        2
wsrep_cluster_state_uuid  00edfa0e-88d5-11e7-8f43-5ea901e83b3a
wsrep_cluster_status      Primary
However, another error appeared on the second node:
# systemctl status mariadb.service
● mariadb.service - MariaDB database server
   Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/mariadb.service.d
           └─migrated-from-my.cnf-settings.conf
   Active: failed (Result: timeout) since ven. 2017-08-25 10:42:09 CEST; 11min ago
  Process: 14656 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
  Process: 12697 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= || VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ] && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, status=0/SUCCESS)
  Process: 12685 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
 Main PID: 15310
   CGroup: /system.slice/mariadb.service
           ├─15310 /usr/sbin/mysqld --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1
           ├─15468 /bin/bash -ue /usr//bin/wsrep_sst_rsync --role joiner --address 172.23.0.3 --datadir /home/mysql/ --parent 15310 --binlog /var/log/mariadb/binlog/mysql_binlog
           ├─15510 rsync --daemon --no-detach --port 4444 --config /home/mysql//rsync_sst.conf
           ├─15980 /bin/bash -ue /usr//bin/wsrep_sst_rsync --role joiner --address 172.23.0.3 --datadir /home/mysql/ --parent 15901 --binlog /var/log/mariadb/binlog/mysql_binlog
           ├─18646 sleep 0.2
           ├─18670 sleep 0.2
           ├─18675 sleep 0.2
           ├─18676 sleep 0.2
           ├─18686 sleep 0.2
           ├─20357 /bin/bash -ue /usr//bin/wsrep_sst_rsync --role joiner --address 172.23.0.3 --datadir /home/mysql/ --parent 20309 --binlog /var/log/mariadb/binlog/mysql_binlog
           ├─22006 /bin/bash -ue /usr//bin/wsrep_sst_rsync --role joiner --address 172.23.0.3 --datadir /home/mysql/ --parent 21997 --binlog /var/log/mariadb/binlog/mysql_binlog
           └─23982 /bin/bash -ue /usr//bin/wsrep_sst_rsync --role joiner --address 172.23.0.3 --datadir /home/mysql/ --parent 23794 --binlog /var/log/mariadb/binlog/mysql_binlog

août 25 10:39:10 ovh38 mysqld[15310]: 2017-08-25 10:39:10 115191544425216 [Note] WSREP: New cluster view: global state: 00edfa0e-88d5-11e7-8f43-5ea901e83b3a:0, view# 2: Primary, number of nodes: 2, my index: 0, protocol version 3
août 25 10:39:10 ovh38 mysqld[15310]: 2017-08-25 10:39:10 115191544425216 [Warning] WSREP: Gap in state sequence. Need state transfer.
août 25 10:39:10 ovh38 mysqld[15310]: 2017-08-25 10:39:10 115190975993600 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '172.23.0.3' --datadir '/home/mysql/' --parent '15310' --binlog '/var/log/...g/mysql_binlog' '
août 25 10:39:10 ovh38 rsyncd[15510]: rsyncd version 3.0.9 starting, listening on port 4444
août 25 10:39:13 ovh38 mysqld[15310]: 2017-08-25 10:39:13 115191014389504 [Note] WSREP: (42fd49d1, 'tcp://0.0.0.0:4567') turning message relay requesting off
août 25 10:40:39 ovh38 systemd[1]: mariadb.service start operation timed out. Terminating.
août 25 10:42:09 ovh38 systemd[1]: mariadb.service stop-final-sigterm timed out. Skipping SIGKILL. Entering failed mode.
août 25 10:42:09 ovh38 systemd[1]: Failed to start MariaDB database server.
août 25 10:42:09 ovh38 systemd[1]: Unit mariadb.service entered failed state.
août 25 10:42:09 ovh38 systemd[1]: mariadb.service failed.
Hint: Some lines were ellipsized, use -l to show in full.
Any ideas on how to solve this?
You must specify the IP addresses of all nodes in the configuration file of every node:
wsrep_cluster_address="gcomm://IP.node1,IP.node2,IP.node3"
Please refer to the Galera documentation.
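As a small illustration of that setting, the sketch below builds the same `wsrep_cluster_address` line for the two nodes from the question. The function name `cluster_address` is hypothetical and is used only to show the format: a `gcomm://` prefix followed by a comma-separated list of every node's address, identical on each node.

```python
def cluster_address(node_ips):
    """Build a gcomm:// URL that lists every node in the cluster.

    Hypothetical helper for illustration; the IPs come from the question.
    """
    if not node_ips:
        raise ValueError("at least one node IP is required")
    return "gcomm://" + ",".join(node_ips)


nodes = ["172.23.0.2", "172.23.0.3"]
line = f'wsrep_cluster_address="{cluster_address(nodes)}"'
print(line)  # wsrep_cluster_address="gcomm://172.23.0.2,172.23.0.3"
```

Note that in the question, node 1 uses an empty `gcomm://`, which makes that node bootstrap a new cluster every time it starts, so it can never rejoin an existing one with that setting.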
Also, to avoid split-brain, you should add a third node or an arbitrator.
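The arithmetic behind that advice can be sketched as follows, assuming all nodes carry equal weight and that a node disappears abruptly (crash or network partition) rather than leaving gracefully. The function `has_quorum` is a hypothetical illustration of the majority rule, not Galera's actual implementation.

```python
def has_quorum(reachable: int, total: int) -> bool:
    """Strict majority: more than half of the nodes must still see each other."""
    return 2 * reachable > total


for total in (2, 3):
    survivors = total - 1  # one node is lost unexpectedly
    print(f"{total} nodes, {survivors} left: quorum={has_quorum(survivors, total)}")
# 2 nodes, 1 left: quorum=False
# 3 nodes, 2 left: quorum=True
```

With two nodes, the survivor holds exactly half of the votes and cannot tell a peer crash from a network split, so it stops being Primary. A third member (a full node or an arbitrator) gives the remaining pair a clear majority.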