事務送出前恢復停止 - Postgres 11 PITR

August 17, 2021

我正在嘗試在重複節點上進行 PITR。這是我正在遵循的步驟。

在源節點上開始備份。

select pg_start_backup('backup')

在目標節點上使用這些 rsync 命令複製目標節點上的所有文件：

rsync  -av sourcedb:/TPINFO01/wal_archive/ /TPINFO01/wal_archive
rsync --delete -av sourcedb:/TPINFO01/datadg/tablespace/ /TPINFO01/datadg/tablespace
rsync --delete -av sourcedb:/TPINFO01/datadg/data/ /TPINFO01/datadg/data --exclude 'pg_log' --exclude 'pg_replslot' --exclude 'postgresql.conf' --exclude 'recovery.conf'

停止源節點上的備份

select pg_stop_backup();

通過 crontab，每隔一分鐘從源數據庫中複製 WAL 文件。在源數據庫上啟用了連續的 WAL 歸檔。

* * * 08 * /home/postgres/rsync_wal.sh &gt; /dev/null 2&gt;&1

rsync_wal.sh

#!/bin/sh

rsync -av sourcedb:/TPINFO01/wal_archive/ /TPINFO01/wal_archive

第二天，我recovery.conf在 targetdb 上創建如下：

recovery_target_time = '2021-08-16 02:00:00 JST'
recovery_target_action = 'pause'
recovery_target_inclusive = 'false'
restore_command = 'rsync -a /TPINFO01/wal_archive/%f %p'

它恢復文件但因以下錯誤而停止：

2021-08-16 16:33:52.853 JST,,,3608,,611a14e0.e18,1,,2021-08-16 16:33:52 JST,,0,LOG,00000,"database system was interrupted while in recovery at log time 2021-08-16 02:27:59 JST",,"If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.",,,,,,,""
2021-08-16 16:33:53.023 JST,,,3608,,611a14e0.e18,2,,2021-08-16 16:33:52 JST,,0,LOG,00000,"starting point-in-time recovery to 2021-08-16 02:00:00+09",,,,,,,,,""
2021-08-16 16:33:53.120 JST,,,3608,,611a14e0.e18,3,,2021-08-16 16:33:52 JST,,0,LOG,00000,"restored log file ""0000000100009928000000A7"" from archive",,,,,,,,,""
2021-08-16 16:33:53.144 JST,,,3608,,611a14e0.e18,4,,2021-08-16 16:33:52 JST,1/0,0,LOG,00000,"redo starts at 9928/A73B0DA8",,,,,,,,,""
2021-08-16 16:33:53.145 JST,,,3608,,611a14e0.e18,5,,2021-08-16 16:33:52 JST,1/0,0,LOG,00000,"recovery stopping before commit of transaction 74385349, time 2021-08-16 02:30:01.469553+09",,,,,,,,,""
2021-08-16 16:33:53.145 JST,,,3608,,611a14e0.e18,6,,2021-08-16 16:33:52 JST,1/0,0,FATAL,XX000,"requested recovery stop point is before consistent recovery point",,,,,,,,,""

這裡有什麼問題？目標節點上的數據庫甚至沒有啟動時如何損壞？我錯過了一些步驟嗎？

這是備份文件的樣子

START WAL LOCATION: 9928/9D000028 (file 00000001000099280000009D)
STOP WAL LOCATION: 9928/9D000168 (file 00000001000099280000009D)
CHECKPOINT LOCATION: 9928/9D000098
BACKUP METHOD: pg_start_backup
BACKUP FROM: master
START TIME: 2021-08-15 21:58:00 JST
LABEL: backup
START TIMELINE: 1
STOP TIME: 2021-08-15 22:03:44 JST
STOP TIMELINE: 1

正如 Laurenz Albe 在評論中提到的，我在pg_stop_backup().
我重複了上述步驟，唯一的區別是，將 WAL 段從源數據庫複製到目標數據庫的 cronjob 在備份期間連續執行，以確保不會失去任何 WAL 段。
它以這種方式工作，我可以在我想要的時間做 PITR。

您的 WAL 歸檔似乎無法可靠執行。
rsync與其使用from cron 作業來複製 WAL 段，這似乎偶爾會失去段，不如使用 PostgreSQLarchive_command來歸檔 WAL 段。然後 PostgreSQL 將確保所有段都被歸檔。

引用自：https://dba.stackexchange.com/questions/298087

事務送出前恢復停止 - Postgres 11 PITR

相關問答

執行 barman recover 命令後 Postgres 伺服器保持只讀模式

PostgreSQL 中差異備份的 PgBackRest 策略

已定義表空間的 pg_basebackup

如何正確執行恢復？

可以將臨時表空間放在易失性儲存上還是從備份中省略它？（PostgreSQL）

postgres：在 PITR 期間未刪除 WAL 文件