PostgreSQL
Promoting a hot standby to master with streaming replication on PostgreSQL 9.4
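I currently have 2 database servers running CentOS 6 and PostgreSQL 9.4 (identical replicas on the same network). They are set up as a master and a backup created from the master with pg_basebackup. The master copies WAL to the backup server's file system, and the backup is kept up to date from that WAL and connects to the master using streaming replication. This all works fine until I kill the master and try to promote the backup to be the new master. I am running kill -9 to simulate a catastrophic failure of the master, and shortly afterwards I touch the trigger file on the backup. The problem is that as soon as I touch the trigger file on the backup server, it detects the file and then fails to start up as a master.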
Here is the log from the backup when I disconnect it from the master:
< 2015-10-05 14:20:52.888 BST >LOG: trigger file found: /tmp/psql.trigger
< 2015-10-05 14:20:52.888 BST >LOG: redo is not required
< 2015-10-05 14:20:52.904 BST >LOG: restored log file "000000010000000000000004" from archive
< 2015-10-05 14:20:52.906 BST >PANIC: record with zero length at 0/40000C8
< 2015-10-05 14:20:52.906 BST >LOG: startup process (PID 3101) was terminated by signal 6: Aborted
< 2015-10-05 14:20:52.906 BST >LOG: terminating any other active server processes
< 2015-10-05 14:21:08.580 BST >LOG: database system was interrupted while in recovery at log time 2015-10-05 13:50:08 BST
< 2015-10-05 14:21:08.580 BST >HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
< 2015-10-05 14:21:08.935 BST >LOG: entering standby mode
< 2015-10-05 14:21:08.947 BST >LOG: restored log file "000000010000000000000004" from archive
< 2015-10-05 14:21:08.949 BST >LOG: record with zero length at 0/40000C8
< 2015-10-05 14:21:08.951 BST >FATAL: could not connect to the primary server: could not connect to server: Connection refused
    Is the server running on host "192.168.56.101" and accepting TCP/IP connections on port 5432?
The log time of 13:50 is just when the server was started in recovery mode.
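One way to confirm which state the backup is really in, rather than inferring it from the log, is to read the cluster state that pg_controldata reports. The sketch below is only an illustration: the helper name cluster_state is hypothetical, and it assumes pg_controldata is on the PATH and that PGDATA points at the backup's data directory.

```shell
#!/bin/sh
# Extract the "Database cluster state" value from pg_controldata output
# read on stdin. The helper name is hypothetical.
cluster_state() {
    awk -F': *' '/^Database cluster state/ { print $2 }'
}

# Usage (assumes PGDATA is set to the backup's data directory):
#   pg_controldata "$PGDATA" | cluster_state
# A backup that is still replaying WAL should report "in archive recovery";
# a server that was started as a normal master reports "in production".
```

If the backup reports anything other than a recovery state before the trigger file is touched, it has already been started outside recovery mode at some point.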
The configuration in postgresql.conf on the master is:
listen_addresses = '*'
wal_level = archive
max_wal_senders = 2
archive_mode = on
archive_command = 'rsync -a %p postgres@192.168.56.102:/apps/pgsql/9.4/write_ahead_logs/%f'
archive_timeout = 600
The configuration in recovery.conf on the backup is:
standby_mode = 'on'
primary_conninfo = 'host=192.168.56.101 port=5432 user=backups password=abc'
restore_command = 'cp /apps/pgsql/9.4/write_ahead_logs/%f %p'
archive_cleanup_command = 'pg_archivecleanup /apps/pgsql/9.4/write_ahead_logs %r'
trigger_file = '/tmp/psql.trigger'
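I'm sure I'm missing something important, but after searching the internet and the documentation I haven't been able to find it. Any help would be greatly appreciated.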
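Solved. The problem was that I had Monit running in the background, and it was starting the backup database before I had written recovery.conf (the servers are provisioned by Chef).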
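A supervisor starting the backup before recovery.conf exists can also be guarded against with a small wrapper that the supervisor calls instead of starting PostgreSQL directly. This is a minimal sketch and not the fix I actually used: both function names are hypothetical, the data directory is passed in as an argument, and pg_ctl is assumed to be on the PATH.

```shell
#!/bin/sh
# Succeeds only if the data directory is set up as a standby, i.e.
# recovery.conf is present (this applies to PostgreSQL 9.4 and earlier).
standby_configured() {
    datadir=$1
    [ -f "$datadir/recovery.conf" ]
}

# Hypothetical wrapper for a supervisor such as Monit: refuse to start
# the server if it would come up outside recovery mode.
start_standby() {
    datadir=$1
    if standby_configured "$datadir"; then
        pg_ctl -D "$datadir" start
    else
        echo "refusing to start: $datadir/recovery.conf is missing" >&2
        return 1
    fi
}
```

Note that PostgreSQL renames recovery.conf to recovery.done once the server has been promoted, so a wrapper like this would only suit a machine that is still meant to be a standby.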
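The solution was to pass the -R flag to pg_basebackup, which writes recovery.conf as part of the backup and so ensures the database never starts outside recovery mode. Starting it outside recovery mode had left it incompatible with the WAL and the streaming replication from the master.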
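For reference, the backup step with -R might look roughly like the sketch below. The host, user, and recovery settings are taken from the configuration in the question; the function names are hypothetical, the data directory is passed in as an argument, and password handling (for example via ~/.pgpass) is left out. Since -R writes only a minimal recovery.conf, the archive and trigger settings are appended afterwards.

```shell
#!/bin/sh
# Take a base backup and let pg_basebackup write recovery.conf itself (-R),
# so the data directory never exists without one.
take_base_backup() {
    datadir=$1
    pg_basebackup -h 192.168.56.101 -p 5432 -U backups \
        -D "$datadir" -X stream -R
}

# Append the settings that the minimal recovery.conf from -R does not
# contain. Values are copied from the recovery.conf in the question.
append_recovery_settings() {
    datadir=$1
    cat >> "$datadir/recovery.conf" <<'EOF'
restore_command = 'cp /apps/pgsql/9.4/write_ahead_logs/%f %p'
archive_cleanup_command = 'pg_archivecleanup /apps/pgsql/9.4/write_ahead_logs %r'
trigger_file = '/tmp/psql.trigger'
EOF
}

# Usage (assumes PGDATA points at an empty target directory):
#   take_base_backup "$PGDATA" && append_recovery_settings "$PGDATA"
```

Even if something starts the server between the two steps, recovery.conf is already in place, so it comes up as a standby rather than as an independent master.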