Timeline 2 of the primary does not match recovery target timeline 1

August 21, 2018

I want to build a high-availability (HA) setup with PostgreSQL and pgpool. This is what I am aiming for:

 .-----.                  .--------.
 |     |           W      |   DB   |
 | APP |-----+----------->| MASTER |----.
 |     |     |            |        |    |
 `-----`     |            `--------`    | STREAMING REPLICATION
             |            .--------.    |
             |     R      |   DB   |    |
             +----------->| SLAVE1 |<---+
             |            |        |    |
             |            `--------`    |
             |            .--------.    |
             |     R      |   DB   |    |
             `----------->| SLAVE2 |<---`
                          |        |
                          `--------`
After failover:

 .-----.                  .--------.
 |     |                  |   DB   |
 | APP |-----+----        | MASTER |
 |     |     |            |  FAIL  |
 `-----`     |            `--------`
             |            .--------.
             |    W/R     |   DB   |
             +----------->| SLAVE1 |---.
             |            |        |   |
             |            `--------`   |
             |            .--------.   | STREAMING REPLICATION
             |     R      |   DB   |   |
             `----------->| SLAVE2 |<--`
                          |        |
                          `--------`

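Following the article "HA PostgreSQL cluster via Streaming Replication + pgpool-II", I created 4 virtual machines (VMs) with Vagrant: one master (M), two slaves (S1 and S2), and one for pgpool (APP). When I run queries from APP, everything works fine. When I stop the master, S1 is promoted to master and S2 should follow the new master. That is where the trouble starts. I don't know what this error means or how to fix it: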

2014-08-18 17:04:56 UTC FATAL:  timeline 2 of the primary does not match recovery target timeline 1
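
For context, PostgreSQL starts a new timeline whenever a standby is promoted, and the timeline ID is encoded as the first eight hexadecimal digits of every WAL segment file name. The sketch below only illustrates that naming convention; the helper name `timeline_of_wal` and the sample file name are made up here and do not come from the setup in this question.

```shell
#!/bin/sh
# timeline_of_wal NAME
# Prints the timeline ID (decimal) encoded in a WAL segment file name.
# Assumption: NAME follows the usual 24-hex-digit layout, where the
# first 8 digits are the timeline ID.
timeline_of_wal() {
    hex=$(printf '%s' "$1" | cut -c1-8)
    printf '%d\n' "0x${hex}"
}
```

Called as `timeline_of_wal 00000002000000000000000A`, this reports a segment written on timeline 2, which is the timeline a standby still targeting timeline 1 would refuse to follow.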

I am using:

postgres 9.1
pgpool 3.1.1-1

Update

S2 emits this error when S1 is promoted to master.

Update 2

Node IP addresses:

M  - 192.168.1.10
S1 - 192.168.1.11
S2 - 192.168.1.12
APP- 192.168.1.13

This is the recovery.conf configuration on S1 and S2:

standby_mode = 'on'
primary_conninfo = 'host=192.168.1.10 port=5432 user=replicator password=replicator'
trigger_file = '/tmp/postgresql.trigger.5432'

Update 3

These are the backends in my /etc/pgpool2/pgpool.conf file:

...
backend_hostname0 = '192.168.1.10'
backend_port0 = 5432
backend_weight0 = 0
backend_data_directory0 = '/var/lib/postgresql/9.1/main'
backend_flag0 = 'ALLOW_TO_FAILOVER'

backend_hostname1 = '192.168.1.11'
backend_port1 = 5432
backend_weight1 = 0
backend_data_directory1 = '/var/lib/postgresql/9.1/main'
backend_flag1 = 'ALLOW_TO_FAILOVER'

backend_hostname2 = '192.168.1.12'
backend_port2 = 5432
backend_weight2 = 0
backend_data_directory2 = '/var/lib/postgresql/9.1/main'
backend_flag2 = 'ALLOW_TO_FAILOVER'


#------------------------------------------------------------------------------
# FAILOVER AND FAILBACK
#------------------------------------------------------------------------------

failover_command = '/var/lib/postgresql/bin/failover.sh %d %M %m'

This is the failover.sh file:

#!/bin/sh
FALLING_NODE=$1
OLD_MASTER=$2
NEW_MASTER=$3
SLAVE1='slave1'
SLAVE2='slave2'
if test $FALLING_NODE -eq 0
then
ssh -T $SLAVE1 touch /tmp/postgresql.trigger.5432
ssh -T $SLAVE1 "while test ! -f /var/lib/postgresql/9.1/main/recovery.done; $
ssh -T $SLAVE2 "sed -i 's/192.168.1.10/192.168.1.11/' /var/lib/postgresql/9.$
ssh -T $SLAVE2 /etc/init.d/postgresql restart
/usr/sbin/pcp_attach_node 10 localhost 9898 pgpool pgpool 2
fi
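
One detail about the `sed` substitution in this script: the dots in `192.168.1.10` act as regular-expression wildcards, and the bare pattern would also match inside a longer address such as `192.168.1.100`. A more defensive version of the same rewrite might look like the sketch below. The function name `repoint_standby` is hypothetical, and the file to edit is passed as an argument because the path in the script above is truncated.

```shell
#!/bin/sh
# repoint_standby CONF OLD_HOST NEW_HOST
# Rewrites host=OLD_HOST to host=NEW_HOST on the primary_conninfo line of
# the given recovery.conf. Hypothetical helper, not part of pgpool.
repoint_standby() {
    conf=$1
    old=$2
    new=$3
    # Escape the dots so they match literally instead of "any character".
    old_re=$(printf '%s' "$old" | sed 's/\./\\./g')
    tmp=$(mktemp) || return 1
    # Require a space or closing quote after the address so that
    # 192.168.1.10 does not match the start of 192.168.1.100.
    sed "/^primary_conninfo/ s/host=${old_re}\([ ']\)/host=${new}\1/" "$conf" > "$tmp" &&
        cat "$tmp" > "$conf"    # overwrite in place, keeping owner and mode
    status=$?
    rm -f "$tmp"
    return $status
}
```

On S2 this could be invoked as `repoint_standby <path to recovery.conf> 192.168.1.10 192.168.1.11` before restarting PostgreSQL.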

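This detailed blog post covers the issue.
In short, on versions before 9.3 both the primary and the standby servers must have archive_mode = on and an archive_command set; 9.3 removed this requirement.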
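
As a rough sketch of what that can mean on the standby side: before 9.3, a standby typically also needs `recovery_target_timeline = 'latest'` and a `restore_command` in recovery.conf so that it can fetch the timeline history file from the WAL archive and follow the promoted server. The helper below is hypothetical (`ensure_setting` is not a PostgreSQL or pgpool tool), and the archive directory `/mnt/wal_archive` in the usage comment is an assumed location, not something taken from the question.

```shell
#!/bin/sh
# ensure_setting CONF KEY VALUE
# Appends "KEY = 'VALUE'" to CONF unless KEY is already set there.
# Hypothetical helper for illustration only.
ensure_setting() {
    conf=$1
    key=$2
    value=$3
    if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$conf"; then
        printf "%s = '%s'\n" "$key" "$value" >> "$conf"
    fi
}

# Possible usage on each standby (paths are assumptions):
#   conf=/var/lib/postgresql/9.1/main/recovery.conf
#   ensure_setting "$conf" recovery_target_timeline latest
#   ensure_setting "$conf" restore_command 'cp /mnt/wal_archive/%f %p'
```

The matching `archive_command` on the primary and on the promotable standby would have to copy WAL and history files into the same archive location for this to work.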

Source: https://dba.stackexchange.com/questions/74326