Ubuntu

PostgreSQL 10 and Ubuntu - cannot start the PostgreSQL server

  • August 26, 2020

One day my PostgreSQL server stopped working. I checked the log; somehow it had been shut down.

root@ip_address:/# tail /var/log/postgresql/postgresql-10-main.log
2020-02-19 06:47:49.215 CET [23497] LOG:  received smart shutdown request
2020-02-19 06:47:49.477 CET [23497] LOG:  worker process: logical replication launcher (PID 23512) exited with exit code 1
2020-02-19 06:47:49.482 CET [23507] LOG:  shutting down
2020-02-19 06:47:49.546 CET [23497] LOG:  database system is shut down

When I run psql,

root@ip_address:/# psql
psql: could not connect to server: No such file or directory
   Is the server running locally and accepting
   connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?

It complains that there is no such file or directory, so I checked whether PostgreSQL was running.

root@ip_address:/# systemctl status postgresql
● postgresql.service - PostgreSQL RDBMS
  Loaded: loaded (/lib/systemd/system/postgresql.service; enabled; vendor preset: enabled)
  Active: active (exited) since Sun 2020-03-08 16:19:24 CET; 26min ago
 Process: 30136 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
Main PID: 30136 (code=exited, status=0/SUCCESS)

Mar 08 16:19:24 vps584959 systemd[1]: Starting PostgreSQL RDBMS...
Mar 08 16:19:24 vps584959 systemd[1]: Started PostgreSQL RDBMS.

It appears to be running. However, when I check the PostgreSQL clusters:

root@ip_address:/# pg_lsclusters
Ver Cluster Port Status Owner    Data directory              Log file
10  main    5432 down   postgres /var/lib/postgresql/10/main /var/log/postgresql/postgresql-10-main.log

The cluster is down, so I tried to start it:

root@ip_address:/# pg_ctlcluster 10 main start
Error: Config owner (deploy:1003) and data owner (postgres:114) do not match, and config owner is not root

I could not get it to work. Then I tried:

sudo chown -R deploy:postgres /var/lib/postgresql/10/ && sudo chmod -R u=rwX,go= /var/lib/postgresql/10/

And tried again:

root@ip_address:/# pg_ctlcluster 10 main start
Job for postgresql@10-main.service failed because the service did not take the steps required by its unit configuration.
See "systemctl status postgresql@10-main.service" and "journalctl -xe" for details.
root@ip_address:/# systemctl status postgresql@10-main.service
● postgresql@10-main.service - PostgreSQL Cluster 10-main
  Loaded: loaded (/lib/systemd/system/postgresql@.service; indirect; vendor preset: enabled)
  Active: failed (Result: protocol) since Sun 2020-03-08 16:59:53 CET; 2min 52s ago
 Process: 31635 ExecStart=/usr/bin/pg_ctlcluster --skip-systemctl-redirect 10-main start (code=exited, status=1/FAILURE)
Main PID: 23497 (code=exited, status=0/SUCCESS)

Mar 08 16:59:53 vps584959 systemd[1]: Starting PostgreSQL Cluster 10-main...
Mar 08 16:59:53 vps584959 postgresql@10-main[31635]: Error: /usr/lib/postgresql/10/bin/pg_ctl /usr/lib/postgresql/10/bin/pg_ctl start -D /var/lib/postgresql/10/main -l /var/log/postgre
Mar 08 16:59:53 vps584959 systemd[1]: postgresql@10-main.service: Can't open PID file /var/run/postgresql/10-main.pid (yet?) after start: No such file or directory
Mar 08 16:59:53 vps584959 systemd[1]: postgresql@10-main.service: Failed with result 'protocol'.
Mar 08 16:59:53 vps584959 systemd[1]: Failed to start PostgreSQL Cluster 10-main.

I don't know what else to try. Has anyone had the same problem?

More information:

root@ip_address:/var/run/postgresql# ls
total 0
drwxrwsr-x  3 postgres postgres   60 Feb 19 06:47 .
drwxr-xr-x 28 root     root     1060 Mar  8 13:58 ..
drwxr-s---  2 postgres postgres   40 Feb 19 06:47 10-main.pg_stat_tmp
root@vps584959:~# journalctl -xe
Mar 08 17:46:07 vps584959 sudo[2154]: root : TTY=pts/0 ; PWD=/root ; USER=root ; COMMAND=/bin/systemctl start postgresql@10-main
Mar 08 17:46:07 vps584959 sudo[2154]: pam_unix(sudo:session): session opened for user root by root(uid=0)
Mar 08 17:46:07 vps584959 systemd[1]: Starting PostgreSQL Cluster 10-main...
-- Subject: Unit postgresql@10-main.service has begun start-up
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Unit postgresql@10-main.service has begun starting up.
Mar 08 17:46:07 vps584959 postgresql@10-main[2157]: Error: Config owner (deploy:1003) and data owner (root:0) do not match, and config owner is not root
Mar 08 17:46:07 vps584959 systemd[1]: postgresql@10-main.service: Can't open PID file /var/run/postgresql/10-main.pid (yet?) after start: No such file or directory
Mar 08 17:46:07 vps584959 systemd[1]: postgresql@10-main.service: Failed with result 'protocol'.
Mar 08 17:46:07 vps584959 systemd[1]: Failed to start PostgreSQL Cluster 10-main.
-- Subject: Unit postgresql@10-main.service has failed
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Unit postgresql@10-main.service has failed.
--
-- The result is RESULT.
Mar 08 17:46:07 vps584959 sudo[2154]: pam_unix(sudo:session): session closed for user root
Mar 08 17:46:08 vps584959 sshd[2152]: Invalid user ftp1 from x.x.x.x port 57060
Mar 08 17:46:08 vps584959 sshd[2152]: pam_unix(sshd:auth): check pass; user unknown
Mar 08 17:46:08 vps584959 sshd[2152]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=x.x.x.x
Mar 08 17:46:09 vps584959 sshd[2152]: Failed password for invalid user ftp1 from 159.89.196.75 port 57060 ssh2
Mar 08 17:46:10 vps584959 sshd[2152]: Received disconnect from x.x.x.x port 57060:11: Bye Bye [preauth]
Mar 08 17:46:10 vps584959 sshd[2152]: Disconnected from invalid user ftp1 159.89.196.75 port 57060 [preauth]
Mar 08 17:46:11 vps584959 sshd[2150]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=x.x.x.x  user=root
Mar 08 17:46:12 vps584959 sshd[2159]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=x.x.x.x  user=root
Mar 08 17:46:13 vps584959 sshd[2150]: Failed password for root from xx.xx.xx.xx port 20408 ssh2

Update

  • Still not working.
root@myserver:~# chown -R postgres:postgres /etc/postgresql/10/main/
root@myserver:~#
root@myserver:~#
root@myserver:~# pg_ctlcluster 10 main start
Job for postgresql@10-main.service failed because the service did not take the steps required by its unit configuration.
See "systemctl status postgresql@10-main.service" and "journalctl -xe" for details.
root@myserver:~# systemctl status postgresql@10-main.service
● postgresql@10-main.service - PostgreSQL Cluster 10-main
  Loaded: loaded (/lib/systemd/system/postgresql@.service; indirect; vendor preset: enabled)
  Active: failed (Result: protocol) since Thu 2020-03-12 00:09:43 CET; 7s ago
 Process: 23767 ExecStart=/usr/bin/pg_ctlcluster --skip-systemctl-redirect 10-main start (code=exited, status=1/FAILURE)

root@vps584959:~# systemctl status postgresql@10-main.service
● postgresql@10-main.service - PostgreSQL Cluster 10-main
  Loaded: loaded (/lib/systemd/system/postgresql@.service; indirect; vendor preset: enabled)
  Active: failed (Result: protocol) since Thu 2020-03-12 00:09:43 CET; 11min ago
 Process: 23767 ExecStart=/usr/bin/pg_ctlcluster --skip-systemctl-redirect 10-main start (code=exited, status=1/FAILURE)

Update 2

root@vps584959:~# journalctl -xe
--
-- Unit UNIT has finished starting up.
--
-- The start-up result is RESULT.
Mar 14 13:55:16 vps584959 systemd[31170]: Startup finished in 171ms.
-- Subject: User manager start-up is now complete
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- The user manager instance for user 0 has been started. All services queued
-- for starting have been started. Note that other services might still be starting
-- up or be started at any later time.
--
-- Startup of the manager took 171927 microseconds.
Mar 14 13:55:17 vps584959 sshd[31156]: Failed password for root from 49.88.112.111 port 29693 ssh2
Mar 14 13:55:18 vps584959 sshd[31156]: Received disconnect from 49.88.112.111 port 29693:11:  [preauth]
Mar 14 13:55:18 vps584959 sshd[31156]: Disconnected from authenticating user root 49.88.112.111 port 29693 [preauth]
Mar 14 13:55:18 vps584959 sshd[31156]: PAM 2 more authentication failures; logname= uid=0 euid=0 tty=ssh ruser= rhost=49.88.112.111  user=root
Mar 14 13:55:33 vps584959 sshd[31363]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=134.17.94.237  user=root
Mar 14 13:55:35 vps584959 sshd[31363]: Failed password for root from 134.17.94.237 port 3684 ssh2
Mar 14 13:55:35 vps584959 sshd[31363]: Received disconnect from 134.17.94.237 port 3684:11: Bye Bye [preauth]
Mar 14 13:55:35 vps584959 sshd[31363]: Disconnected from authenticating user root 134.17.94.237 port 3684 [preauth]
Mar 14 13:55:43 vps584959 systemd[1]: Starting PostgreSQL Cluster 10-main...
-- Subject: Unit postgresql@10-main.service has begun start-up
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Unit postgresql@10-main.service has begun starting up.
Mar 14 13:55:43 vps584959 postgresql@10-main[31373]: Error: /usr/lib/postgresql/10/bin/pg_ctl /usr/lib/postgresql/10/bin/pg_ctl start -D /var/lib/postgresql/10/main -l /var/log/p
Mar 14 13:55:43 vps584959 postgresql@10-main[31373]: 2020-03-14 13:55:43.696 CET [31378] FATAL:  private key file "/etc/ssl/private/ssl-cert-snakeoil.key" must be owned by the da
Mar 14 13:55:43 vps584959 postgresql@10-main[31373]: 2020-03-14 13:55:43.698 CET [31378] LOG:  database system is shut down
Mar 14 13:55:43 vps584959 postgresql@10-main[31373]: pg_ctl: could not start server
Mar 14 13:55:43 vps584959 postgresql@10-main[31373]: Examine the log output.
Mar 14 13:55:43 vps584959 systemd[1]: postgresql@10-main.service: Can't open PID file /var/run/postgresql/10-main.pid (yet?) after start: No such file or directory
Mar 14 13:55:43 vps584959 systemd[1]: postgresql@10-main.service: Failed with result 'protocol'.
Mar 14 13:55:43 vps584959 systemd[1]: Failed to start PostgreSQL Cluster 10-main.
-- Subject: Unit postgresql@10-main.service has failed
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Unit postgresql@10-main.service has failed.
--
-- The result is RESULT.

And:

root@vps584959:~# cat /var/log/postgresql/postgresql-10-main.log
.
.
2020-03-12 00:09:43.609 CET [23773] FATAL:  private key file "/etc/ssl/private/ssl-cert-snakeoil.key" must be owned by the database user or root
2020-03-12 00:09:43.611 CET [23773] LOG:  database system is shut down
pg_ctl: could not start server
Examine the log output.
2020-03-14 13:55:43.696 CET [31378] FATAL:  private key file "/etc/ssl/private/ssl-cert-snakeoil.key" must be owned by the database user or root
2020-03-14 13:55:43.698 CET [31378] LOG:  database system is shut down
pg_ctl: could not start server
Examine the log output.
root@vps584959:~#

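Checking systemctl status postgresql tells you nothing about your PostgreSQL instance, because it is only an umbrella service (note the ExecStart=/bin/true in its status output). In your case you want systemctl status postgresql@10-main.service. When the instance is down, that unit correctly reports it as not active.

This is described at the top of /lib/systemd/system/postgresql@.service: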

$ head /lib/systemd/system/postgresql@.service
# systemd service template for PostgreSQL clusters. The actual instances will
# be called "postgresql@version-cluster", e.g. "postgresql@9.3-main". The
# variable %i expands to "version-cluster", %I expands to "version/cluster".
# (%I breaks for cluster names containing dashes.)

[Unit]
Description=PostgreSQL Cluster %i
AssertPathExists=/etc/postgresql/%I/postgresql.conf
RequiresMountsFor=/etc/postgresql/%I /var/lib/postgresql/%I
PartOf=postgresql.service
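
To find which per-cluster unit to inspect, the cluster list can be mapped onto the naming scheme described in that header. The following is only a sketch: the function name cluster_units is made up for illustration, and it assumes the column layout of the pg_lsclusters output shown in the question (version, cluster name, port, status, with one header line).

```shell
#!/bin/sh
# cluster_units (hypothetical helper): read pg_lsclusters output on stdin and
# print "<systemd unit> <status>" for each cluster.
# Assumes the columns are: Ver Cluster Port Status ... preceded by a header line.
cluster_units() {
    awk 'NR > 1 && NF >= 4 { printf "postgresql@%s-%s.service %s\n", $1, $2, $4 }'
}

# Typical use on the affected host (not executed here):
#   pg_lsclusters | cluster_units
#   systemctl status postgresql@10-main.service
```

For the cluster in the question this yields the unit postgresql@10-main.service, which is the one whose status and journal are worth reading.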

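As for the original problem of the server not starting: postgres should own the configuration files and the data directory, along with everything in them. For some reason you, someone, or something reassigned those files to the deploy user, and that does not work with the way PostgreSQL is set up on Ubuntu/Debian. Just keep the permissions and owners as they were originally set.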
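
One way to confirm the mismatch before changing anything is to compare the owner of the configuration file with the owner of the data directory. This is a rough sketch and not what pg_ctlcluster does internally: owner_of and same_owner are names invented here, the paths are the ones from the question, and stat -c is the GNU coreutils form found on Ubuntu.

```shell
#!/bin/sh
# owner_of (hypothetical helper): print the numeric uid that owns a path.
owner_of() {
    stat -c '%u' "$1"
}

# same_owner (hypothetical helper): succeed only if both paths have the same
# owning uid, approximating the "config owner vs data owner" comparison.
same_owner() {
    [ "$(owner_of "$1")" = "$(owner_of "$2")" ]
}

# Paths from the question; adjust for other versions or cluster names.
CONF=/etc/postgresql/10/main/postgresql.conf
DATA=/var/lib/postgresql/10/main

# Only report when both paths exist, so the script is harmless elsewhere.
if [ -e "$CONF" ] && [ -e "$DATA" ]; then
    if same_owner "$CONF" "$DATA"; then
        echo "config and data owners match"
    else
        echo "owner mismatch: config uid=$(owner_of "$CONF") data uid=$(owner_of "$DATA")"
    fi
fi

# If they differ, handing both trees back to postgres would look like:
#   sudo chown -R postgres:postgres /etc/postgresql/10/main /var/lib/postgresql/10/main
```

The chown line is left as a comment on purpose, so the check can be run first and the change made deliberately.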


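Regarding the SSL key file, per the comments: the following error indicates that the permissions and ownership of the SSL private key file must be restored as well.

PostgreSQL error: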

FATAL:  private key file "/etc/ssl/private/ssl-cert-snakeoil.key" must be owned by the database user or root

Commands to restore ownership and permissions:

$ sudo chown root:ssl-cert /etc/ssl/private/ssl-cert-snakeoil.key
$ sudo chmod 640 /etc/ssl/private/ssl-cert-snakeoil.key
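
To verify the result afterwards, a small check along these lines may help. It is a sketch under assumptions: key_mode_ok is a hypothetical helper name, the 0640 ceiling simply mirrors the chmod above, and stat -c is the GNU coreutils form found on Ubuntu.

```shell
#!/bin/sh
# key_mode_ok (hypothetical helper): succeed if the file carries no permission
# bits beyond rw-r----- (0640).
key_mode_ok() {
    mode=$(stat -c '%a' "$1") || return 1
    # Interpret the printed mode as octal and look for bits outside 0640.
    [ $(( 0$mode & 0777 & ~0640 )) -eq 0 ]
}

# On the affected host, owner, group and mode can be inspected directly
# (reading /etc/ssl/private normally requires root):
#   sudo stat -c '%U:%G %a' /etc/ssl/private/ssl-cert-snakeoil.key
```

After the two commands above, the stat line should report root:ssl-cert as owner and group and 640 as the mode.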

Source: https://serverfault.com/questions/1006099