Postgresql
與 Patroni 一起使用時,PostgreSQL 程序陷入“分叉的新後端”循環
我有一個 3 節點的 Patroni2.1.1 + PGSQL13.4 集群,其中一個節點在完成基本備份引導後表現得很奇怪。當通過相同的過程時,我有另一個相同的節點工作正常。
基本上,一旦 Patroni 嘗試連接到 localhost:5432,伺服器似乎就當機並重新分叉(據我所知)。您將在“循環”下方看到從
connection received: host=::1 port=XXXXX
.2021-11-01 12:01:43.395 EDT [242925] LOG: starting PostgreSQL 13.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.4.1 20200928 (Red Hat 8.4.1-1), 64-bit 2021-11-01 12:01:43.396 EDT [242925] LOG: listening on IPv6 address "::1", port 5432 2021-11-01 12:01:43.396 EDT [242925] LOG: listening on IPv4 address "127.0.0.1", port 5432 2021-11-01 12:01:43.397 EDT [242925] LOG: listening on IPv4 address "192.168.23.13", port 5432 2021-11-01 12:01:43.398 EDT [242925] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" 2021-11-01 12:01:43.401 EDT [242925] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432" 2021-11-01 12:01:43.405 EDT [242928] LOG: database system was interrupted; last known up at 2021-11-01 12:01:25 EDT 2021-11-01 12:01:43.405 EDT [242928] DEBUG: removing all temporary WAL segments 2021-11-01 12:01:43.436 EDT [242928] DEBUG: executing restore command "cp /var/lib/pgsql/13/archive/00000077.history pg_wal/RECOVERYHISTORY" cp: cannot stat '/var/lib/pgsql/13/archive/00000077.history': No such file or directory 2021-11-01 12:01:43.440 EDT [242928] DEBUG: could not restore file "00000077.history" from archive: child process exited with exit code 1 2021-11-01 12:01:43.441 EDT [242928] LOG: entering standby mode 2021-11-01 12:01:43.441 EDT [242928] DEBUG: backup time 2021-11-01 12:01:25 EDT in file "backup_label" 2021-11-01 12:01:43.441 EDT [242928] DEBUG: backup label pg_basebackup base backup in file "backup_label" 2021-11-01 12:01:43.441 EDT [242928] DEBUG: backup timeline 118 in file "backup_label" 2021-11-01 12:01:43.441 EDT [242928] DEBUG: executing restore command "cp /var/lib/pgsql/13/archive/00000076.history pg_wal/RECOVERYHISTORY" 2021-11-01 12:01:43.538 EDT [242928] LOG: restored log file "00000076.history" from archive 2021-11-01 12:01:43.540 EDT [242928] DEBUG: executing restore command "cp /var/lib/pgsql/13/archive/000000760000000100000000 pg_wal/RECOVERYXLOG" 2021-11-01 12:01:43.689 EDT [242928] LOG: restored log file "000000760000000100000000" from archive 2021-11-01 12:01:43.692 EDT [242928] DEBUG: got WAL segment from archive 2021-11-01 12:01:43.693 EDT [242928] DEBUG: checkpoint record is at 1/60 2021-11-01 12:01:43.693 EDT [242928] DEBUG: redo record is at 1/28; shutdown false 2021-11-01 12:01:43.693 EDT [242928] DEBUG: next transaction ID: 531; next OID: 24606 2021-11-01 12:01:43.693 EDT [242928] DEBUG: next MultiXactId: 1; next MultiXactOffset: 0 2021-11-01 12:01:43.693 EDT [242928] DEBUG: oldest unfrozen transaction ID: 478, in database 1 2021-11-01 12:01:43.693 EDT [242928] DEBUG: oldest MultiXactId: 1, in database 1 2021-11-01 12:01:43.693 EDT [242928] DEBUG: commit timestamp Xid oldest/newest: 0/0 2021-11-01 12:01:43.693 EDT [242928] DEBUG: transaction ID wrap limit is 2147484125, limited by database with OID 1 2021-11-01 12:01:43.693 EDT [242928] DEBUG: MultiXactId wrap limit is 2147483648, limited by database with OID 1 2021-11-01 12:01:43.693 EDT [242928] DEBUG: starting up replication slots 2021-11-01 12:01:43.693 EDT [242928] DEBUG: starting up replication origin progress state 2021-11-01 12:01:43.697 EDT [242928] DEBUG: resetting unlogged relations: cleanup 1 init 0 2021-11-01 12:01:43.698 EDT [242928] DEBUG: initializing for hot standby 2021-11-01 12:01:43.698 EDT [242928] DEBUG: my backend ID is 1 2021-11-01 12:01:43.698 EDT [242928] LOG: redo starts at 1/28 2021-11-01 12:01:43.698 EDT [242928] DEBUG: prune KnownAssignedXids to 531 2021-11-01 12:01:43.698 EDT [242928] CONTEXT: WAL redo at 1/28 for Standby/RUNNING_XACTS: nextXid 531 latestCompletedXid 530 oldestRunningXid 531 2021-11-01 12:01:43.698 EDT [242928] DEBUG: 0 KnownAssignedXids (num=0 tail=0 head=0) 2021-11-01 12:01:43.698 EDT [242928] CONTEXT: WAL redo at 1/28 for Standby/RUNNING_XACTS: nextXid 531 latestCompletedXid 530 oldestRunningXid 531 2021-11-01 12:01:43.698 EDT [242928] DEBUG: recovery snapshots are now enabled 2021-11-01 12:01:43.698 EDT [242928] CONTEXT: WAL redo at 1/28 for Standby/RUNNING_XACTS: nextXid 531 latestCompletedXid 530 oldestRunningXid 531 2021-11-01 12:01:43.698 EDT [242928] DEBUG: prune KnownAssignedXids to 531 2021-11-01 12:01:43.698 EDT [242928] CONTEXT: WAL redo at 1/D8 for Standby/RUNNING_XACTS: nextXid 531 latestCompletedXid 530 oldestRunningXid 531 2021-11-01 12:01:43.698 EDT [242928] DEBUG: end of backup reached 2021-11-01 12:01:43.698 EDT [242928] CONTEXT: WAL redo at 1/110 for XLOG/BACKUP_END: 1/28 2021-11-01 12:01:43.699 EDT [242932] DEBUG: checkpointer updated shared memory configuration values 2021-11-01 12:01:43.700 EDT [242928] LOG: consistent recovery state reached at 1/138 2021-11-01 12:01:43.700 EDT [242928] DEBUG: executing restore command "cp /var/lib/pgsql/13/archive/000000760000000100000001 pg_wal/RECOVERYXLOG" 2021-11-01 12:01:43.700 EDT [242925] LOG: database system is ready to accept read only connections cp: cannot stat '/var/lib/pgsql/13/archive/000000760000000100000001': No such file or directory 2021-11-01 12:01:43.704 EDT [242928] DEBUG: could not restore file "000000760000000100000001" from archive: child process exited with exit code 1 2021-11-01 12:01:43.704 EDT [242928] DEBUG: could not open file "pg_wal/000000760000000100000001": No such file or directory 2021-11-01 12:01:43.704 EDT [242928] DEBUG: switched WAL source from archive to stream after failure 2021-11-01 12:01:43.705 EDT [242936] DEBUG: find_in_dynamic_libpath: trying "/usr/pgsql-13/lib/libpqwalreceiver" 2021-11-01 12:01:43.706 EDT [242936] DEBUG: find_in_dynamic_libpath: trying "/usr/pgsql-13/lib/libpqwalreceiver.so" 2021-11-01 12:01:44.370 EDT [242925] DEBUG: forked new backend, pid=242938 socket=12 2021-11-01 12:01:44.371 EDT [242938] LOG: connection received: host=::1 port=36714 2021-11-01 12:01:51.235 EDT [242938] DEBUG: shmem_exit(0): 0 before_shmem_exit callbacks to make 2021-11-01 12:01:51.235 EDT [242938] DEBUG: shmem_exit(0): 0 on_shmem_exit callbacks to make 2021-11-01 12:01:51.235 EDT [242938] DEBUG: proc_exit(0): 1 callbacks to make 2021-11-01 12:01:51.235 EDT [242938] DEBUG: exit(0) 2021-11-01 12:01:51.235 EDT [242938] DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make 2021-11-01 12:01:51.235 EDT [242938] DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make 2021-11-01 12:01:51.235 EDT [242938] DEBUG: proc_exit(-1): 0 callbacks to make 2021-11-01 12:01:51.236 EDT [242925] DEBUG: forked new backend, pid=242941 socket=12 2021-11-01 12:01:51.237 EDT [242941] LOG: connection received: host=127.0.0.1 port=43428 2021-11-01 12:01:51.237 EDT [242925] DEBUG: reaping dead processes 2021-11-01 12:01:51.237 EDT [242925] DEBUG: server process (PID 242938) exited with exit code 0 2021-11-01 12:01:58.284 EDT [242941] DEBUG: shmem_exit(0): 0 before_shmem_exit callbacks to make 2021-11-01 12:01:58.284 EDT [242941] DEBUG: shmem_exit(0): 0 on_shmem_exit callbacks to make 2021-11-01 12:01:58.284 EDT [242941] DEBUG: proc_exit(0): 1 callbacks to make 2021-11-01 12:01:58.284 EDT [242941] DEBUG: exit(0) 2021-11-01 12:01:58.284 EDT [242941] DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make 2021-11-01 12:01:58.284 EDT [242941] DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make 2021-11-01 12:01:58.284 EDT [242941] DEBUG: proc_exit(-1): 0 callbacks to make 2021-11-01 12:01:58.285 EDT [242925] DEBUG: reaping dead processes 2021-11-01 12:01:58.285 EDT [242925] DEBUG: server process (PID 242941) exited with exit code 0 2021-11-01 12:01:59.295 EDT [242925] DEBUG: forked new backend, pid=242944 socket=12 2021-11-01 12:01:59.295 EDT [242944] LOG: connection received: host=::1 port=36758 2021-11-01 12:02:06.301 EDT [242944] DEBUG: shmem_exit(0): 0 before_shmem_exit callbacks to make 2021-11-01 12:02:06.301 EDT [242944] DEBUG: shmem_exit(0): 0 on_shmem_exit callbacks to make 2021-11-01 12:02:06.301 EDT [242944] DEBUG: proc_exit(0): 1 callbacks to make 2021-11-01 12:02:06.301 EDT [242944] DEBUG: exit(0) 2021-11-01 12:02:06.301 EDT [242944] DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make 2021-11-01 12:02:06.301 EDT [242944] DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make 2021-11-01 12:02:06.301 EDT [242944] DEBUG: proc_exit(-1): 0 callbacks to make 2021-11-01 12:02:06.301 EDT [242925] DEBUG: forked new backend, pid=242948 socket=12 2021-11-01 12:02:06.302 EDT [242925] DEBUG: reaping dead processes 2021-11-01 12:02:06.302 EDT [242925] DEBUG: server process (PID 242944) exited with exit code 0 2021-11-01 12:02:06.302 EDT [242948] LOG: connection received: host=127.0.0.1 port=43470 2021-11-01 12:02:13.440 EDT [242948] DEBUG: shmem_exit(0): 0 before_shmem_exit callbacks to make 2021-11-01 12:02:13.440 EDT [242948] DEBUG: shmem_exit(0): 0 on_shmem_exit callbacks to make 2021-11-01 12:02:13.440 EDT [242948] DEBUG: proc_exit(0): 1 callbacks to make 2021-11-01 12:02:13.440 EDT [242948] DEBUG: exit(0) 2021-11-01 12:02:13.440 EDT [242948] DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make 2021-11-01 12:02:13.440 EDT [242948] DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make 2021-11-01 12:02:13.440 EDT [242948] DEBUG: proc_exit(-1): 0 callbacks to make 2021-11-01 12:02:13.441 EDT [242925] DEBUG: reaping dead processes 2021-11-01 12:02:13.441 EDT [242925] DEBUG: server process (PID 242948) exited with exit code 0 2021-11-01 12:02:14.148 EDT [242925] DEBUG: postmaster received signal 2 2021-11-01 12:02:14.148 EDT [242925] LOG: received fast shutdown request 2021-11-01 12:02:14.150 EDT [242925] LOG: aborting any active transactions 2021-11-01 12:02:14.150 EDT [242933] DEBUG: shmem_exit(0): 1 before_shmem_exit callbacks to make 2021-11-01 12:02:14.150 EDT [242933] DEBUG: shmem_exit(0): 5 on_shmem_exit callbacks to make 2021-11-01 12:02:14.150 EDT [242933] DEBUG: proc_exit(0): 2 callbacks to make 2021-11-01 12:02:14.150 EDT [242933] DEBUG: exit(0) 2021-11-01 12:02:14.150 EDT [242933] DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make 2021-11-01 12:02:14.150 EDT [242933] DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make 2021-11-01 12:02:14.150 EDT [242933] DEBUG: proc_exit(-1): 0 callbacks to make 2021-11-01 12:02:14.150 EDT [242928] DEBUG: shmem_exit(1): 1 before_shmem_exit callbacks to make 2021-11-01 12:02:14.150 EDT [242928] DEBUG: shmem_exit(1): 7 on_shmem_exit callbacks to make 2021-11-01 12:02:14.150 EDT [242928] DEBUG: removing all KnownAssignedXids 2021-11-01 12:02:14.150 EDT [242928] DEBUG: release all standby locks 2021-11-01 12:02:14.150 EDT [242928] DEBUG: proc_exit(1): 2 callbacks to make 2021-11-01 12:02:14.150 EDT [242928] DEBUG: exit(1) 2021-11-01 12:02:14.150 EDT [242928] DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make 2021-11-01 12:02:14.150 EDT [242928] DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make 2021-11-01 12:02:14.150 EDT [242928] DEBUG: proc_exit(-1): 0 callbacks to make 2021-11-01 12:02:14.151 EDT [242925] DEBUG: reaping dead processes 2021-11-01 12:02:14.151 EDT [242925] DEBUG: reaping dead processes 2021-11-01 12:02:14.449 EDT [242925] DEBUG: forked new backend, pid=242954 socket=12 2021-11-01 12:02:14.450 EDT [242954] LOG: connection received: host=::1 port=36798 2021-11-01 12:02:15.915 EDT [242936] FATAL: terminating walreceiver process due to administrator command 2021-11-01 12:02:15.915 EDT [242936] DEBUG: shmem_exit(1): 1 before_shmem_exit callbacks to make 2021-11-01 12:02:15.915 EDT [242936] DEBUG: shmem_exit(1): 6 on_shmem_exit callbacks to make 2021-11-01 12:02:15.915 EDT [242936] DEBUG: proc_exit(1): 2 callbacks to make 2021-11-01 12:02:15.915 EDT [242936] DEBUG: exit(1) 2021-11-01 12:02:15.916 EDT [242936] DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make 2021-11-01 12:02:15.916 EDT [242936] DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make 2021-11-01 12:02:15.916 EDT [242936] DEBUG: proc_exit(-1): 0 callbacks to make 2021-11-01 12:02:15.917 EDT [242925] DEBUG: reaping dead processes 2021-11-01 12:02:15.917 EDT [242932] LOG: shutting down 2021-11-01 12:02:15.917 EDT [242932] DEBUG: skipping restartpoint, already performed at 1/28 2021-11-01 12:02:15.919 EDT [242932] DEBUG: updated min recovery point to 1/1000000 on timeline 118 2021-11-01 12:02:15.920 EDT [242932] DEBUG: shmem_exit(0): 1 before_shmem_exit callbacks to make 2021-11-01 12:02:15.920 EDT [242932] DEBUG: shmem_exit(0): 5 on_shmem_exit callbacks to make 2021-11-01 12:02:15.920 EDT [242932] DEBUG: proc_exit(0): 2 callbacks to make 2021-11-01 12:02:15.920 EDT [242932] DEBUG: exit(0) 2021-11-01 12:02:15.920 EDT [242932] DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make 2021-11-01 12:02:15.920 EDT [242932] DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make 2021-11-01 12:02:15.920 EDT [242932] DEBUG: proc_exit(-1): 0 callbacks to make 2021-11-01 12:02:15.921 EDT [242925] DEBUG: reaping dead processes 2021-11-01 12:02:15.921 EDT [242934] DEBUG: writing stats file "pg_stat/global.stat" 2021-11-01 12:02:15.921 EDT [242934] DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make 2021-11-01 12:02:15.921 EDT [242934] DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make 2021-11-01 12:02:15.921 EDT [242934] DEBUG: proc_exit(-1): 0 callbacks to make 2021-11-01 12:02:15.922 EDT [242925] DEBUG: reaping dead processes 2021-11-01 12:02:17.890 EDT [242925] DEBUG: postmaster received signal 2 2021-11-01 12:02:18.617 EDT [242954] DEBUG: shmem_exit(0): 0 before_shmem_exit callbacks to make 2021-11-01 12:02:18.617 EDT [242954] DEBUG: shmem_exit(0): 0 on_shmem_exit callbacks to make 2021-11-01 12:02:18.617 EDT [242954] DEBUG: proc_exit(0): 1 callbacks to make 2021-11-01 12:02:18.617 EDT [242954] DEBUG: exit(0) 2021-11-01 12:02:18.617 EDT [242954] DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make 2021-11-01 12:02:18.617 EDT [242954] DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make 2021-11-01 12:02:18.617 EDT [242954] DEBUG: proc_exit(-1): 0 callbacks to make 2021-11-01 12:02:18.618 EDT [242925] DEBUG: reaping dead processes 2021-11-01 12:02:18.618 EDT [242925] DEBUG: server process (PID 242954) exited with exit code 0 2021-11-01 12:02:18.618 EDT [242925] DEBUG: shmem_exit(0): 0 before_shmem_exit callbacks to make 2021-11-01 12:02:18.618 EDT [242925] DEBUG: shmem_exit(0): 5 on_shmem_exit callbacks to make 2021-11-01 12:02:18.618 EDT [242925] DEBUG: cleaning up dynamic shared memory control segment with ID 2104592565 2021-11-01 12:02:18.631 EDT [242925] DEBUG: proc_exit(0): 2 callbacks to make 2021-11-01 12:02:18.631 EDT [242925] LOG: database system is shut down 2021-11-01 12:02:18.631 EDT [242925] DEBUG: exit(0) 2021-11-01 12:02:18.631 EDT [242925] DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make 2021-11-01 12:02:18.631 EDT [242925] DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make 2021-11-01 12:02:18.631 EDT [242925] DEBUG: proc_exit(-1): 0 callbacks to make 2021-11-01 12:02:18.633 EDT [242927] DEBUG: logger shutting down 2021-11-01 12:02:18.633 EDT [242927] DEBUG: shmem_exit(0): 0 before_shmem_exit callbacks to make 2021-11-01 12:02:18.633 EDT [242927] DEBUG: shmem_exit(0): 0 on_shmem_exit callbacks to make 2021-11-01 12:02:18.633 EDT [242927] DEBUG: proc_exit(0): 0 callbacks to make 2021-11-01 12:02:18.633 EDT [242927] DEBUG: exit(0) 2021-11-01 12:02:18.633 EDT [242927] DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make 2021-11-01 12:02:18.633 EDT [242927] DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make 2021-11-01 12:02:18.633 EDT [242927] DEBUG: proc_exit(-1): 0 callbacks to make
如果我停止 Patroni,伺服器將開始正常執行並按預期加入複製。使用者 postgres 可以使用 psql 實用程序甚至使用 pgadmin 進行本地或遠端連接。
有人可以解釋這裡發生了什麼嗎?我仍在學習 PGSQL,但我對這些日誌還不太滿意……謝謝!
有關更多上下文,您還可以查看我發布的 Patroni 問題#2104 。
OK 終於找到原因了!在 strace 上使用 strace 我發現了以下循環:
388084 sendmsg(5, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\0\0\0\35", iov_len=4}, {iov_base="\2\0\0\n26:94667\0\226\321\200\2732\253J\266\263\301\2\243\364\360\261\375", iov_len=29}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 33 388084 read(5, "\0\0\0\267", 4) = 4 388084 read(5, "\0\0\0\0", 4) = 4 388084 read(5, "\0\0\0\0\0\0\0\1\0\0\0\2\0\0\0\vDOMAIN.COM\0\0\0\10P"..., 183) = 183
原來,postgres 使用者的錯誤主體仍然被記憶體,並且在 kdestroy -A 重新開始工作之後。