
MariaDB Galera cluster replication error: state never received?

  • February 16, 2017

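I have a very simple, out-of-the-box MariaDB setup with Galera Cluster. The cluster's master node runs and reports that only one node is connected to the cluster, namely the master itself. When I try to attach a second node to the cluster, I get a state-receive error and the process aborts. The configuration on the master is: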

[mariadb]
wsrep_cluster_address=gcomm://
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
innodb_locks_unsafe_for_binlog=1

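This is in /etc/my.cnf.d/zabbix_cluster.cnf. The slave's configuration looks similar, except that it contains the master's name. When I run service mysql restart on the slave, the output says MySQL started successfully, but pgrep mysql then returns nothing. Checking /var/log/mysql/error.log for that startup shows a state-receive error saying the node will never receive state. The output looks like this: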

130806 10:10:15 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
130806 10:10:15 mysqld_safe WSREP: Running position recovery with --log_error=/tmp/tmp.uZwsHWfH6y
130806 10:10:17 mysqld_safe WSREP: Recovered position 00000000-0000-0000-0000-000000000000:-1
130806 10:10:17 [Warning] option 'general_log': boolean value '/var/log/mysql/mysqld.log' wasn't recognized. Set to OFF.
130806 10:10:17 [Warning] option 'slow_query_log': boolean value '/var/log/mysql-slow-queries.log' wasn't recognized. Set to OFF.
130806 10:10:17 [Note] WSREP: wsrep_start_position var submitted: '00000000-0000-0000-0000-000000000000:-1'
130806 10:10:17 InnoDB: The InnoDB memory heap is disabled
130806 10:10:17 InnoDB: Mutexes and rw_locks use GCC atomic builtins
130806 10:10:17 InnoDB: Compressed tables use zlib 1.2.3
130806 10:10:17 InnoDB: Using Linux native AIO
130806 10:10:17 InnoDB: Initializing buffer pool, size = 128.0M
130806 10:10:17 InnoDB: Completed initialization of buffer pool
130806 10:10:17 InnoDB: highest supported file format is Barracuda.
130806 10:10:17  InnoDB: Waiting for the background threads to start
130806 10:10:18 Percona XtraDB (http://www.percona.com) 1.1.8-29.3 started; log sequence number 1598129
130806 10:10:18 [Note] Plugin 'FEEDBACK' is disabled.
130806 10:10:18 [Note] Event Scheduler: Loaded 0 events
130806 10:10:18 [Note] WSREP: Read nil XID from storage engines, skipping position init
130806 10:10:18 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
130806 10:10:18 [Note] WSREP: wsrep_load(): Galera 23.2.4(r147) by Codership Oy <info@codership.com> loaded succesfully.
130806 10:10:18 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
130806 10:10:18 [Note] WSREP: Reusing existing '/var/lib/mysql//galera.cache'.
130806 10:10:18 [Note] WSREP: Passing config to GCS: base_host = 10.162.111.109; base_port = 4567; cert.log_conflicts = no; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
130806 10:10:18 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
130806 10:10:18 [Note] WSREP: Start replication
130806 10:10:18 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
130806 10:10:18 [Note] WSREP: protonet asio version 0
130806 10:10:18 [Note] WSREP: backend: asio
130806 10:10:18 [Note] WSREP: GMCast version 0
130806 10:10:18 [Note] WSREP: (4e646cee-feaa-11e2-0800-10aa5e70a57b, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
130806 10:10:18 [Note] WSREP: (4e646cee-feaa-11e2-0800-10aa5e70a57b, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
130806 10:10:18 [Note] WSREP: EVS version 0
130806 10:10:18 [Note] WSREP: PC version 0
130806 10:10:18 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer 'zabbixcrt02:'
130806 10:10:19 [Note] WSREP: declaring 21415a01-fea8-11e2-0800-7061deb24ae4 stable
130806 10:10:19 [Note] WSREP: Node 21415a01-fea8-11e2-0800-7061deb24ae4 state prim
130806 10:10:19 [Note] WSREP: view(view_id(PRIM,21415a01-fea8-11e2-0800-7061deb24ae4,8) memb {
       21415a01-fea8-11e2-0800-7061deb24ae4,
       4e646cee-feaa-11e2-0800-10aa5e70a57b,
} joined {
} left {
} partitioned {
})
130806 10:10:19 [Note] WSREP: gcomm: connected
130806 10:10:19 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
130806 10:10:19 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
130806 10:10:19 [Note] WSREP: Opened channel 'my_wsrep_cluster'
130806 10:10:19 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
130806 10:10:19 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
130806 10:10:19 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.29-MariaDB'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MariaDB Server, wsrep_23.7.3.rXXXX
130806 10:10:19 [Note] WSREP: STATE EXCHANGE: sent state msg: 4eb2dd53-feaa-11e2-0800-5d7a774f5dbf
130806 10:10:19 [Note] WSREP: STATE EXCHANGE: got state msg: 4eb2dd53-feaa-11e2-0800-5d7a774f5dbf from 0 (ceszabbixcrt02)
130806 10:10:19 [Note] WSREP: STATE EXCHANGE: got state msg: 4eb2dd53-feaa-11e2-0800-5d7a774f5dbf from 1 (ceszabbixcrt03)
130806 10:10:19 [Note] WSREP: Quorum results:
       version    = 2,
       component  = PRIMARY,
       conf_id    = 7,
       members    = 1/2 (joined/total),
       act_id     = 0,
       last_appl. = -1,
       protocols  = 0/4/2 (gcs/repl/appl),
       group UUID = bcb32946-fea7-11e2-0800-32db11e867f1
130806 10:10:19 [Note] WSREP: Flow-control interval: [23, 23]
130806 10:10:19 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 0)
130806 10:10:19 [Note] WSREP: State transfer required:
       Group state: bcb32946-fea7-11e2-0800-32db11e867f1:0
       Local state: 00000000-0000-0000-0000-000000000000:-1
130806 10:10:19 [Note] WSREP: New cluster view: global state: bcb32946-fea7-11e2-0800-32db11e867f1:0, view# 8: Primary, number of nodes: 2, my index: 1, protocol version 2
130806 10:10:19 [Warning] WSREP: Gap in state sequence. Need state transfer.
130806 10:10:21 [Note] WSREP: Prepared SST request: mysqldump|10.162.111.109:3306
130806 10:10:21 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
130806 10:10:21 [Note] WSREP: Assign initial position for certification: 0, protocol version: 2
130806 10:10:21 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (bcb32946-fea7-11e2-0800-32db11e867f1): 1 (Operation not permitted) at galera/src/replicator_str.cpp:prepare_for_IST():442. IST will be unavailable.
130806 10:10:21 [Note] WSREP: Node 1 (zabbixcrt03) requested state transfer from '*any*'. Selected 0 (zabbixcrt02)(SYNCED) as donor.
130806 10:10:21 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 0)
130806 10:10:21 [Note] WSREP: Requesting state transfer: success, donor: 0
130806 10:10:24 [Warning] WSREP: 0 (zabbixcrt02): State transfer to 1 (zabbixcrt03) failed: -2 (No such file or directory)
130806 10:10:24 [ERROR] WSREP: gcs/src/gcs_group.c:gcs_group_handle_join_msg():719: Will never receive state. Need to abort.
130806 10:10:24 [Note] WSREP: gcomm: terminating thread
130806 10:10:24 [Note] WSREP: gcomm: joining thread
130806 10:10:24 [Note] WSREP: gcomm: closing backend
130806 10:10:25 [Note] WSREP: view(view_id(NON_PRIM,21415a01-fea8-11e2-0800-7061deb24ae4,8) memb {
       4e646cee-feaa-11e2-0800-10aa5e70a57b,
} joined {
} left {
} partitioned {
       21415a01-fea8-11e2-0800-7061deb24ae4,
})
130806 10:10:25 [Note] WSREP: view((empty))
130806 10:10:25 [Note] WSREP: gcomm: closed
130806 10:10:25 [Note] WSREP: /usr/sbin/mysqld: Terminated.
130806 10:10:25 mysqld_safe Number of processes running now: 0
130806 10:10:25 mysqld_safe WSREP: not restarting wsrep node automatically
130806 10:10:25 mysqld_safe mysqld from pid file /var/lib/mysql/zabbixcrt03.pid ended

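I am not sure why this is happening or what it means. I can see that the node connects, but something is not being transferred from the master to the slave. What should I look for, focus on, or do?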

I have also made sure that /var/lib/mysql/ is owned by mysql:mysql. The permissions on that directory are 755.

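  1. To get the cluster working with a minimal configuration file, I first had to take out all the wsrep statements until I was ready for them.
  2. Then start MySQL and create a database user for clustering/replication on both servers.
  3. Then add the line wsrep_sst_auth=<dbuser>:<passwd> on both servers.
  4. Now restart the master MariaDB instance first, wait until it is up, and then restart the slave MariaDB instance.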

Now, when you issue SHOW STATUS LIKE 'wsrep_%' inside MariaDB, it should show 2 nodes in the cluster.
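As a rough sketch of steps 2 and 3, here is what the addition could look like. The user name sst_user and password change_me are placeholders made up for illustration, not values from the original post. The log above shows the mysqldump SST method, in which (as far as I understand it) the donor logs in to the joiner with these credentials, so the account should exist on both servers with enough privileges to load a full dump:

```ini
# Sketch only: sst_user / change_me are placeholder credentials, not from the post.
# Step 2, run in the mysql client on BOTH servers while the wsrep lines are still removed:
#   CREATE USER 'sst_user'@'%' IDENTIFIED BY 'change_me';
#   GRANT ALL PRIVILEGES ON *.* TO 'sst_user'@'%';

[mariadb]
# ... the wsrep_* and InnoDB settings from the question go back in here unchanged ...

# Step 3: credentials used for the state snapshot transfer (SST), same line on both nodes
wsrep_sst_auth=sst_user:change_me
```

Restricting the account's host part to the addresses of the cluster nodes instead of '%' would be tighter. After the restarts in step 4, wsrep_cluster_size is the value in the SHOW STATUS output that should read 2.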

Source: https://dba.stackexchange.com/questions/47636