How do I fix "could not write to file pg_wal/xlogtemp" after a failed command?

March 17, 2022

I ran vacuum full analyze, but it took too long and used 100% of the RAM, so I killed it. Now Postgres won't start. I get:

* Starting PostgreSQL 14 database server                                                                                            
* Error: /usr/lib/postgresql/14/bin/pg_ctl /usr/lib/postgresql/14/bin/pg_ctl start -D /var/lib/postgresql/14/main -l /var/log/postgresql/postgresql-14-main.log -s -o  -c config_file="/etc/postgresql/14/main/postgresql.conf"  exited with status 1:
2022-03-15 22:51:01.259 PDT [915] LOG:  starting PostgreSQL 14.2 (Ubuntu 14.2-1.pgdg20.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit
2022-03-15 22:51:01.259 PDT [915] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2022-03-15 22:51:01.259 PDT [915] LOG:  listening on IPv6 address "::", port 5432
2022-03-15 22:51:01.263 PDT [915] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2022-03-15 22:51:01.271 PDT [916] LOG:  database system was interrupted while in recovery at 2022-03-15 21:52:50 PDT
2022-03-15 22:51:01.271 PDT [916] HINT:  This probably means that some data is corrupted and you will have to use the last backup for recovery.
2022-03-15 22:51:02.578 PDT [917] postgres@foo FATAL:  the database system is starting up
2022-03-15 22:51:02.595 PDT [918] postgres@foo FATAL:  the database system is starting up
2022-03-15 22:51:02.612 PDT [919] postgres@foo FATAL:  the database system is starting up
2022-03-15 22:51:02.641 PDT [920] postgres@foo FATAL:  the database system is starting up
2022-03-15 22:51:02.656 PDT [921] postgres@foo FATAL:  the database system is starting up
2022-03-15 22:51:02.683 PDT [922] postgres@foo FATAL:  the database system is starting up
2022-03-15 22:51:03.570 PDT [916] LOG:  database system was not properly shut down; automatic recovery in progress
2022-03-15 22:51:03.573 PDT [916] LOG:  redo starts at 36/21030540
2022-03-15 22:51:04.368 PDT [916] LOG:  redo done at 36/2EFFF1D8 system usage: CPU: user: 0.40 s, system: 0.38 s, elapsed: 0.79 s    
2022-03-15 22:51:04.376 PDT [916] PANIC:  could not write to file "pg_wal/xlogtemp.916": No space left on device
2022-03-15 22:51:04.378 PDT [915] LOG:  startup process (PID 916) was terminated by signal 6: Aborted
2022-03-15 22:51:04.378 PDT [915] LOG:  aborting startup due to startup process failure
2022-03-15 22:51:04.386 PDT [915] LOG:  database system is shut down
pg_ctl: could not start server
Examine the log output.

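The database is small and should not come close to filling the disk, so vacuum full analyze must have created a lot of garbage files. What is the right way to fix this? I don't know how to use pg_archivecleanup, I can't find an archive directory, and /var/lib/postgresql/14/main/pg_xlog does not exist. This is everything in the folder: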

sudo ls /var/lib/postgresql/14/main/
PG_VERSION  pg_commit_ts  pg_multixact  pg_serial     pg_stat_tmp  pg_twophase postgresql.auto.conf
base        pg_dynshmem   pg_notify     pg_snapshots  pg_subtrans  pg_wal      postmaster.opts
global      pg_logical    pg_replslot   pg_stat       pg_tblspc    pg_xact

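This is a development database, so in the worst case I can delete everything. But I would like to know how else I could fix this if it happened in production.

OS: Ubuntu 20 on WSL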

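Clearly you have run out of disk space for pg_wal.
To recover:
1. Increase the disk size (do not delete files)
2. Start PostgreSQL
3. pg_dumpall the cluster
4. Stop the PostgreSQL server and delete the data directory (save the configuration files)
5. Create a new cluster with initdb, add the configuration and start it
6. Restore the dump from step 3
This will get rid of the files left behind by the crash.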
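Next time, don't run VACUUM (FULL) unless you have a clear indication that it is needed, and in particular don't run it on every table in the database, only on the tables that really need it.
Note: to collect statistics, run ANALYZE. And VACUUM (FULL) is not a better version of VACUUM but something entirely different: it rewrites each table into a new file, which takes an exclusive lock and needs extra disk space while it runs.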
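As a rough illustration of the order of the recovery steps listed above, here is a minimal Python sketch that only builds and prints the commands; it does not run anything. It assumes a plain pg_ctl/initdb setup. On Debian/Ubuntu packaging the main configuration lives under /etc/postgresql and the pg_ctlcluster wrappers are normally used instead, so treat the commands as placeholders. The function name `recovery_plan` and the /mnt/spare paths are hypothetical, and the dump file must live outside the data directory.

```python
import shlex


def recovery_plan(data_dir, dump_file, conf_backup_dir):
    """Return the recovery steps as argv lists. Nothing is executed here."""
    auto_conf = f"{data_dir}/postgresql.auto.conf"
    return [
        # Step 2: with the disk already enlarged, start the server
        # so that crash recovery can finish.
        ["pg_ctl", "start", "-w", "-D", data_dir],
        # Step 3: dump all databases plus global objects such as roles.
        ["pg_dumpall", "-f", dump_file],
        # Step 4: stop the server, save the configuration,
        # then delete the data directory.
        ["pg_ctl", "stop", "-D", data_dir],
        ["cp", "-a", auto_conf, conf_backup_dir],
        ["rm", "-rf", data_dir],
        # Step 5: create a fresh cluster, put the configuration back, start it.
        ["initdb", "-D", data_dir],
        ["cp", "-a", f"{conf_backup_dir}/postgresql.auto.conf", data_dir],
        ["pg_ctl", "start", "-w", "-D", data_dir],
        # Step 6: replay the dump from step 3 into the new cluster.
        ["psql", "-X", "-f", dump_file, "postgres"],
    ]


if __name__ == "__main__":
    # Data directory taken from the question; the other two paths are made up.
    plan = recovery_plan(
        "/var/lib/postgresql/14/main",
        "/mnt/spare/all.sql",
        "/mnt/spare/conf",
    )
    for argv in plan:
        print(shlex.join(argv))
```

Printing the plan first makes it easy to review the destructive step (removing the data directory) and to confirm that the dump is taken before it.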

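Find enough files to delete or move elsewhere so that the system has enough room to start. Once it stays up long enough to start and complete a checkpoint, it will probably free more space on its own. In the listing you show, I don't see a directory for the human-readable log files (named "log" by default in recent versions, "pg_log" in older ones). That is usually the best candidate to move. So next, what about other files that are elsewhere on the same partition but outside the PostgreSQL data directory?

Source: https://dba.stackexchange.com/questions/308782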
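To decide what to move, it helps to see how much space is free and which directories are largest. The following is a minimal sketch using only the Python standard library; the helper names (`dir_size`, `largest_subdirs`, `free_bytes`, `report`) are made up for this example, and it only reads sizes, it never deletes or moves anything.

```python
import os
import shutil


def dir_size(path):
    """Sum the apparent sizes (st_size) of files under path, not following symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def largest_subdirs(parent, top=5):
    """Return up to `top` (size_in_bytes, name) pairs for subdirectories, largest first."""
    sizes = []
    with os.scandir(parent) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((dir_size(entry.path), entry.name))
    return sorted(sizes, reverse=True)[:top]


def free_bytes(path):
    """Free bytes on the filesystem that holds path."""
    return shutil.disk_usage(path).free


def report(path, top=5):
    """Print free space and the largest subdirectories of path."""
    print(f"free on filesystem: {free_bytes(path) / 2**20:.0f} MiB")
    for size, name in largest_subdirs(path, top):
        print(f"{size / 2**20:10.1f} MiB  {name}")


# Example (run as a user that can read the directory):
#   report("/var/lib/postgresql/14/main")
```

Running `report` on the parent of the data directory, or on other directories on the same partition, is one way to look for candidates outside the PostgreSQL data directory.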
