Postgresql
postgresql 複製:WAL 在段可以歸檔之前旋轉?
使用 PostgreSQL 10.7
今天早上檢查我的日誌,我經常看到這個錯誤:
archive command failed with exit code 1 The failed archive command was: test ! -f /opt/bitnami/postgresql/archive/00000001000005D30000008B && cp pg_wal/00000001000005D30000008B /opt/bitnami/postgresql/archive/00000001000005D30000008B
我剛剛看到不推薦使用文件中的命令,我想知道是否有人可以指出更好的歸檔命令的方向。
同時,此段不在
pg_wal
,archive_status
或archive
目錄中。我注意到日誌中有一些檢查點警告。我們以前在複製伺服器上遇到過類似的問題,我認為打開存檔會有所幫助,因為由於某種原因它沒有啟用(現在是這樣),但現在看來問題出在其他地方。
數據庫確實每天都承受著沉重的負載,cronjob 將它們恢復為匹配生產數據庫,這似乎是發生檢查點錯誤的時候。
現在我唯一的選擇是創建一個新的基本備份,因為我至少缺少一個 WAL 段嗎?
這是相關部分
postgresql.conf
#------------------------------------------------------------------------------ # WRITE AHEAD LOG #------------------------------------------------------------------------------ # - Settings - wal_level = 'hot_standby' # (change requires restart) #fsync = on # flush data to disk for crash safety # (turning this off can cause # unrecoverable data corruption) #synchronous_commit = on # synchronization level; # off, local, remote_write, remote_apply, or on #wal_sync_method = fsync # the default is the first option # supported by the operating system: # open_datasync # fdatasync (default on Linux) # fsync # fsync_writethrough # open_sync #full_page_writes = on # recover from partial page writes #wal_compression = off # enable compression of full-page writes #wal_log_hints = off # also do full page writes of non-critical updates # (change requires restart) #wal_buffers = -1 # min 32kB, -1 sets based on shared_buffers # (change requires restart) #wal_writer_delay = 200ms # 1-10000 milliseconds #wal_writer_flush_after = 1MB # measured in pages, 0 disables #commit_delay = 0 # range 0-100000, in microseconds #commit_siblings = 5 # range 1-1000 # - Checkpoints - #checkpoint_timeout = 5min # range 30s-1d max_wal_size = '400MB' #min_wal_size = 80MB #checkpoint_completion_target = 0.5 # checkpoint target duration, 0.0 - 1.0 #checkpoint_flush_after = 0 # measured in pages, 0 disables #checkpoint_warning = 30s # 0 disables # - Archiving - archive_mode = on # enables archiving; off, on, or always # (change requires restart) archive_command = 'test ! -f /opt/bitnami/postgresql/archive/%f && cp %p /opt/bitnami/postgresql/archive/%f' # command to use to archive a logfile segment # placeholders: %p = path of file to archive # %f = file name only # e.g. 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f' #archive_timeout = 0 # force a logfile segment switch after this # number of seconds; 0 disables
對於我能做些什麼來防止這種情況再次發生,我將不勝感激。
該文件應保留在 pg_wal 中,直到歸檔命令成功。如果存檔從未成功,則該文件應該仍然存在。也許您的文件系統已損壞並失去文件,或者您的員工正在搞亂他們不應該做的事情。
您使用的命令可能發生的情況是,如果它報告成功然後系統崩潰,並且當它恢復時存檔文件不存在或沒有正確的數據,因為它從未同步到磁碟。但這種情況不會在您的日誌中留下歸檔錯誤消息。