Postgresql
事務送出前恢復停止 - Postgres 11 PITR
我正在嘗試在重複節點上進行 PITR。這是我正在遵循的步驟。
- 在源節點上開始備份。
select pg_start_backup('backup')
- 在目標節點上使用這些 rsync 命令複製目標節點上的所有文件:
rsync -av sourcedb:/TPINFO01/wal_archive/ /TPINFO01/wal_archive rsync --delete -av sourcedb:/TPINFO01/datadg/tablespace/ /TPINFO01/datadg/tablespace rsync --delete -av sourcedb:/TPINFO01/datadg/data/ /TPINFO01/datadg/data --exclude 'pg_log' --exclude 'pg_replslot' --exclude 'postgresql.conf' --exclude 'recovery.conf'
- 停止源節點上的備份
select pg_stop_backup();
- 通過 crontab,每隔一分鐘從源數據庫中複製 WAL 文件。在源數據庫上啟用了連續的 WAL 歸檔。
* * * 08 * /home/postgres/rsync_wal.sh > /dev/null 2>&1
rsync_wal.sh
#!/bin/sh rsync -av sourcedb:/TPINFO01/wal_archive/ /TPINFO01/wal_archive
- 第二天,我
recovery.conf
在 targetdb 上創建如下:recovery_target_time = '2021-08-16 02:00:00 JST' recovery_target_action = 'pause' recovery_target_inclusive = 'false' restore_command = 'rsync -a /TPINFO01/wal_archive/%f %p'
- 它恢復文件但因以下錯誤而停止:
2021-08-16 16:33:52.853 JST,,,3608,,611a14e0.e18,1,,2021-08-16 16:33:52 JST,,0,LOG,00000,"database system was interrupted while in recovery at log time 2021-08-16 02:27:59 JST",,"If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.",,,,,,,"" 2021-08-16 16:33:53.023 JST,,,3608,,611a14e0.e18,2,,2021-08-16 16:33:52 JST,,0,LOG,00000,"starting point-in-time recovery to 2021-08-16 02:00:00+09",,,,,,,,,"" 2021-08-16 16:33:53.120 JST,,,3608,,611a14e0.e18,3,,2021-08-16 16:33:52 JST,,0,LOG,00000,"restored log file ""0000000100009928000000A7"" from archive",,,,,,,,,"" 2021-08-16 16:33:53.144 JST,,,3608,,611a14e0.e18,4,,2021-08-16 16:33:52 JST,1/0,0,LOG,00000,"redo starts at 9928/A73B0DA8",,,,,,,,,"" 2021-08-16 16:33:53.145 JST,,,3608,,611a14e0.e18,5,,2021-08-16 16:33:52 JST,1/0,0,LOG,00000,"recovery stopping before commit of transaction 74385349, time 2021-08-16 02:30:01.469553+09",,,,,,,,,"" 2021-08-16 16:33:53.145 JST,,,3608,,611a14e0.e18,6,,2021-08-16 16:33:52 JST,1/0,0,FATAL,XX000,"requested recovery stop point is before consistent recovery point",,,,,,,,,""
這裡有什麼問題?目標節點上的數據庫甚至沒有啟動時如何損壞?我錯過了一些步驟嗎?
這是備份文件的樣子
START WAL LOCATION: 9928/9D000028 (file 00000001000099280000009D) STOP WAL LOCATION: 9928/9D000168 (file 00000001000099280000009D) CHECKPOINT LOCATION: 9928/9D000098 BACKUP METHOD: pg_start_backup BACKUP FROM: master START TIME: 2021-08-15 21:58:00 JST LABEL: backup START TIMELINE: 1 STOP TIME: 2021-08-15 22:03:44 JST STOP TIMELINE: 1
正如 Laurenz Albe 在評論中提到的,我在
pg_stop_backup()
.我重複了上述步驟,唯一的區別是,將 WAL 段從源數據庫複製到目標數據庫的 cronjob 在備份期間連續執行,以確保不會失去任何 WAL 段。
它以這種方式工作,我可以在我想要的時間做 PITR。
您的 WAL 歸檔似乎無法可靠執行。
rsync
與其使用from cron 作業來複製 WAL 段,這似乎偶爾會失去段,不如使用 PostgreSQLarchive_command
來歸檔 WAL 段。然後 PostgreSQL 將確保所有段都被歸檔。