MySQL

MariaDB Galera cluster over IPSec

  • September 2, 2019

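I am expanding my server infrastructure, so I ordered servers in a second data center that I want to use as a failover location.

To keep things clear, I will call the main data center "ffm1" and the failover location "ffm2".

At both locations I have a replicated pfSense firewall, and the two sites are connected through an IPSec tunnel.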

10.0.0.0/16 - ffm1 local subnet
10.3.0.0/16 - ffm2 local subnet

10.1.0.0/16 and 10.2.0.0/16 are reserved for other data centers with interconnects.

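Most things already work fine over the IPSec tunnel: I can use my internal repositories, DNS servers, LDAP, and so on.

At ffm1 I have an HA Galera cluster with 4 MariaDB nodes and two MaxScale instances, which are failover-balanced by my pfSense load balancer.

Now I want to extend the cluster to ffm2. The two data centers are connected with 2 x 100 Gbit, and I have 2 x 10 Gbit uplinks on both sides, so that should be fine.

So I installed a node at ffm2 and set up the Galera configuration: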

#
# * Galera-related settings
#
[galera]
# Mandatory settings
wsrep_on=ON
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address=gcomm://10.0.5.11,10.0.5.12,10.0.5.13,10.0.5.14,10.3.0.26
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2

But when I try to start MariaDB, it fails with an error. I found the following log entries:

Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: Read nil XID from storage engines, skipping position init
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: wsrep_load(): Galera 25.3.26(r3857) by Codership Oy <info@codership.com> loaded successfully.
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: CRC-32C: using "slicing-by-8" algorithm.
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap: 1
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 10.3.0.26; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = yes; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: GCache history reset: 886e1511-9278-11e9-b808-5bf36b96cc3c:0 -> 00000000-0000-0000-0000-000000000000:-1
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: wsrep_sst_grab()
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: Start replication
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: protonet asio version 0
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: Using CRC-32C for message checksums.
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: backend: asio
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: gcomm thread scheduling priority set to other:0
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Warning] WSREP: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: restore pc from disk failed
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: GMCast version 0
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: (0f8b0c37, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: (0f8b0c37, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: EVS version 0
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer '10.0.5.11:,10.0.5.12:,10.0.5.13:,10.0.5.14:,10.3.0.26:'
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: (0f8b0c37, 'tcp://0.0.0.0:4567') connection established to 0f8b0c37 tcp://10.3.0.26:4567
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Warning] WSREP: (0f8b0c37, 'tcp://0.0.0.0:4567') address 'tcp://10.3.0.26:4567' points to own listening address, blacklisting
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: (0f8b0c37, 'tcp://0.0.0.0:4567') connection established to 2c2ef983 tcp://10.0.5.13:4567
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: (0f8b0c37, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: (0f8b0c37, 'tcp://0.0.0.0:4567') connection established to d4b50fbb tcp://10.0.5.11:4567
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: (0f8b0c37, 'tcp://0.0.0.0:4567') connection established to c241e84f tcp://10.0.5.12:4567
Aug 24 09:37:44 ffcdb1 mysqld: 2019-08-24  9:37:44 0 [Note] WSREP: (0f8b0c37, 'tcp://0.0.0.0:4567') connection established to 3f000499 tcp://10.0.5.14:4567
Aug 24 09:37:45 ffcdb1 mysqld: 2019-08-24  9:37:45 0 [Warning] WSREP: last inactive check more than PT1.5S ago (PT1.50881S), skipping check
Aug 24 09:37:46 ffcdb1 mysqld: 2019-08-24  9:37:46 0 [Note] WSREP: declaring 2c2ef983 at tcp://10.0.5.13:4567 stable
Aug 24 09:37:46 ffcdb1 mysqld: 2019-08-24  9:37:46 0 [Note] WSREP: declaring 3f000499 at tcp://10.0.5.14:4567 stable
Aug 24 09:37:46 ffcdb1 mysqld: 2019-08-24  9:37:46 0 [Note] WSREP: declaring c241e84f at tcp://10.0.5.12:4567 stable
Aug 24 09:37:46 ffcdb1 mysqld: 2019-08-24  9:37:46 0 [Note] WSREP: declaring d4b50fbb at tcp://10.0.5.11:4567 stable
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: (0f8b0c37, 'tcp://0.0.0.0:4567') connection to peer 0f8b0c37 with addr tcp://10.3.0.26:4567 timed out, no messages seen in PT3S
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: Node 2c2ef983 state prim
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: view(view_id(PRIM,0f8b0c37,83) memb {
Aug 24 09:37:47 ffcdb1 mysqld: 0f8b0c37,0
Aug 24 09:37:47 ffcdb1 mysqld: 2c2ef983,0
Aug 24 09:37:47 ffcdb1 mysqld: 3f000499,0
Aug 24 09:37:47 ffcdb1 mysqld: c241e84f,0
Aug 24 09:37:47 ffcdb1 mysqld: d4b50fbb,0
Aug 24 09:37:47 ffcdb1 mysqld: } joined {
Aug 24 09:37:47 ffcdb1 mysqld: } left {
Aug 24 09:37:47 ffcdb1 mysqld: } partitioned {
Aug 24 09:37:47 ffcdb1 mysqld: })
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: save pc into disk
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: (0f8b0c37, 'tcp://0.0.0.0:4567') turning message relay requesting off
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: gcomm: connected
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: Opened channel 'my_wsrep_cluster'
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 5
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: Waiting for SST to complete.
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 11a1a631-c642-11e9-9bf8-2a2f60544da4
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: STATE EXCHANGE: sent state msg: 11a1a631-c642-11e9-9bf8-2a2f60544da4
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: STATE EXCHANGE: got state msg: 11a1a631-c642-11e9-9bf8-2a2f60544da4 from 0 (ffcdb1.HOSTNAME.de)
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: STATE EXCHANGE: got state msg: 11a1a631-c642-11e9-9bf8-2a2f60544da4 from 1 (db3.HOSTNAME.intern)
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: STATE EXCHANGE: got state msg: 11a1a631-c642-11e9-9bf8-2a2f60544da4 from 2 (db4.HOSTNAME.intern)
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: STATE EXCHANGE: got state msg: 11a1a631-c642-11e9-9bf8-2a2f60544da4 from 3 (db2.HOSTNAME.intern)
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: STATE EXCHANGE: got state msg: 11a1a631-c642-11e9-9bf8-2a2f60544da4 from 4 (db1.HOSTNAME.intern)
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: Quorum results:
Aug 24 09:37:47 ffcdb1 mysqld: version    = 4,
Aug 24 09:37:47 ffcdb1 mysqld: component  = PRIMARY,
Aug 24 09:37:47 ffcdb1 mysqld: conf_id    = 67,
Aug 24 09:37:47 ffcdb1 mysqld: members    = 4/5 (joined/total),
Aug 24 09:37:47 ffcdb1 mysqld: act_id     = 26678596,
Aug 24 09:37:47 ffcdb1 mysqld: last_appl. = -1,
Aug 24 09:37:47 ffcdb1 mysqld: protocols  = 0/9/3 (gcs/repl/appl),
Aug 24 09:37:47 ffcdb1 mysqld: group UUID = 886e1511-9278-11e9-b808-5bf36b96cc3c
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: Flow-control interval: [36, 36]
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: Trying to continue unpaused monitor
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 26678596)
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 2 [Note] WSREP: State transfer required:
Aug 24 09:37:47 ffcdb1 mysqld: Group state: 886e1511-9278-11e9-b808-5bf36b96cc3c:26678596
Aug 24 09:37:47 ffcdb1 mysqld: Local state: 00000000-0000-0000-0000-000000000000:-1
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 2 [Note] WSREP: New cluster view: global state: 886e1511-9278-11e9-b808-5bf36b96cc3c:26678596, view# 68: Primary, number of nodes: 5, my index: 0, protocol version 3
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 2 [Warning] WSREP: Gap in state sequence. Need state transfer.
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '10.3.0.26' --datadir '/var/lib/mysql/'   --parent '809'  ''  '''
Aug 24 09:37:47 ffcdb1 rsyncd[859]: rsyncd version 3.1.2 starting, listening on port 4444
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 2 [Note] WSREP: Prepared SST request: rsync|10.3.0.26:4444/rsync_sst
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 2 [Note] WSREP: REPL Protocols: 9 (4, 2)
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 2 [Note] WSREP: Assign initial position for certification: 26678596, protocol version: 4
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: Service thread queue flushed.
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 2 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (886e1511-9278-11e9-b808-5bf36b96cc3c): 1 (Operation not permitted)
Aug 24 09:37:47 ffcdb1 mysqld: at galera/src/replicator_str.cpp:prepare_for_IST():482. IST will be unavailable.
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: Member 0.0 (ffcdb1.HOSTNAME.de) requested state transfer from '*any*'. Selected 1.0 (db3.HOSTNAME.intern)(SYNCED) as donor.
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 0 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 26678596)
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 2 [Note] WSREP: Requesting state transfer: success, donor: 1
Aug 24 09:37:47 ffcdb1 mysqld: 2019-08-24  9:37:47 2 [Note] WSREP: GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> 886e1511-9278-11e9-b808-5bf36b96cc3c:26678596
Aug 24 09:37:51 ffcdb1 rsyncd[891]: name lookup failed for 10.0.5.13: Name or service not known
Aug 24 09:37:51 ffcdb1 rsyncd[891]: connect from UNKNOWN (10.0.5.13)
Aug 24 09:37:51 ffcdb1 rsyncd[891]: rsync to rsync_sst/ from UNKNOWN (10.0.5.13)
Aug 24 09:37:51 ffcdb1 rsyncd[891]: receiving file list
Aug 24 09:37:51 ffcdb1 rsyncd[891]: sent 25 bytes  received 336 bytes  total size 0
Aug 24 09:37:51 ffcdb1 rsyncd[893]: name lookup failed for 10.0.5.13: Name or service not known
Aug 24 09:37:51 ffcdb1 rsyncd[893]: connect from UNKNOWN (10.0.5.13)
Aug 24 09:37:51 ffcdb1 rsyncd[893]: rsync to rsync_sst-data_dir/ from UNKNOWN (10.0.5.13)
Aug 24 09:37:51 ffcdb1 rsyncd[893]: receiving file list
Aug 24 09:37:52 ffcdb1 kernel: random: crng init done
Aug 24 09:38:09 ffcdb1 rsyncd[893]: sent 44 bytes  received 281087150 bytes  total size 281018368
Aug 24 09:38:09 ffcdb1 rsyncd[964]: name lookup failed for 10.0.5.13: Name or service not known
Aug 24 09:38:09 ffcdb1 rsyncd[964]: connect from UNKNOWN (10.0.5.13)
Aug 24 09:38:09 ffcdb1 rsyncd[964]: rsync to rsync_sst-log_dir/ from UNKNOWN (10.0.5.13)
Aug 24 09:38:09 ffcdb1 rsyncd[964]: receiving file list
Aug 24 09:38:16 ffcdb1 rsyncd[964]: sent 63 bytes  received 100688097 bytes  total size 100663296
Aug 24 09:38:16 ffcdb1 rsyncd[997]: name lookup failed for 10.0.5.13: Name or service not known
Aug 24 09:38:16 ffcdb1 rsyncd[998]: name lookup failed for 10.0.5.13: Name or service not known
Aug 24 09:38:16 ffcdb1 rsyncd[997]: connect from UNKNOWN (10.0.5.13)
Aug 24 09:38:16 ffcdb1 rsyncd[998]: connect from UNKNOWN (10.0.5.13)
Aug 24 09:38:16 ffcdb1 rsyncd[998]: rsync to rsync_sst/./test from UNKNOWN (10.0.5.13)
Aug 24 09:38:16 ffcdb1 rsyncd[997]: rsync to rsync_sst/./designeroutlet_live from UNKNOWN (10.0.5.13)
Aug 24 09:38:16 ffcdb1 rsyncd[998]: receiving file list
Aug 24 09:38:16 ffcdb1 rsyncd[997]: receiving file list
Aug 24 09:38:16 ffcdb1 rsyncd[998]: sent 48 bytes  received 225 bytes  total size 65
Aug 24 09:38:16 ffcdb1 rsyncd[1001]: name lookup failed for 10.0.5.13: Name or service not known
Aug 24 09:38:16 ffcdb1 rsyncd[1001]: connect from UNKNOWN (10.0.5.13)
Aug 24 09:38:16 ffcdb1 rsyncd[1001]: rsync to rsync_sst/./mysql from UNKNOWN (10.0.5.13)
Aug 24 09:38:16 ffcdb1 rsyncd[1001]: receiving file list
Aug 24 09:38:16 ffcdb1 rsyncd[1001]: sent 1739 bytes  received 1228736 bytes  total size 1222390
Aug 24 09:38:16 ffcdb1 rsyncd[1007]: name lookup failed for 10.0.5.13: Name or service not known
Aug 24 09:38:16 ffcdb1 rsyncd[1007]: connect from UNKNOWN (10.0.5.13)
Aug 24 09:38:16 ffcdb1 rsyncd[1007]: rsync to rsync_sst/./performance_schema from UNKNOWN (10.0.5.13)
Aug 24 09:38:17 ffcdb1 rsyncd[1007]: receiving file list
Aug 24 09:38:17 ffcdb1 rsyncd[1007]: sent 48 bytes  received 221 bytes  total size 61
Aug 24 09:38:45 ffcdb1 systemd: Created slice User Slice of root.
Aug 24 09:38:45 ffcdb1 systemd-logind: New session 1 of user root.
Aug 24 09:38:45 ffcdb1 systemd: Started Session 1 of user root.
Aug 24 09:39:14 ffcdb1 systemd: mariadb.service start operation timed out. Terminating.
Aug 24 09:39:14 ffcdb1 mysqld: Terminated
Aug 24 09:39:14 ffcdb1 mysqld: WSREP_SST: [INFO] Joiner cleanup. rsync PID: 859 (20190824 09:39:14.291)
Aug 24 09:39:14 ffcdb1 rsyncd[859]: sent 0 bytes  received 0 bytes  total size 0
Aug 24 09:39:14 ffcdb1 rsyncd[997]: rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at io.c(504) [generator=3.1.2]
Aug 24 09:39:14 ffcdb1 rsyncd[997]: rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at io.c(504) [receiver=3.1.2]
Aug 24 09:39:14 ffcdb1 mysqld: WSREP_SST: [INFO] Joiner cleanup done. (20190824 09:39:14.824)
Aug 24 09:40:44 ffcdb1 systemd: mariadb.service stop-final-sigterm timed out. Skipping SIGKILL. Entering failed mode.
Aug 24 09:40:44 ffcdb1 systemd: Failed to start MariaDB 10.3.16 database server.
Aug 24 09:40:44 ffcdb1 systemd: Unit mariadb.service entered failed state.
Aug 24 09:40:44 ffcdb1 systemd: mariadb.service failed.
Aug 24 09:40:44 ffcdb1 systemd: Reached target Multi-User System.
Aug 24 09:40:44 ffcdb1 systemd: Starting Update UTMP about System Runlevel Changes...
Aug 24 09:40:44 ffcdb1 systemd: Started Update UTMP about System Runlevel Changes.
Aug 24 09:40:44 ffcdb1 systemd: Startup finished in 579ms (kernel) + 916ms (initrd) + 3min 4.123s (userspace) = 3min 5.618s.
Aug 24 09:41:44 ffcdb1 rsyncd[1000]: rsync: [receiver] write error: Broken pipe (32)
Aug 24 09:41:44 ffcdb1 mysqld: 2019-08-24  9:41:44 0 [ERROR] WSREP: Process completed with error: wsrep_sst_rsync --role 'joiner' --address '10.3.0.26' --datadir '/var/lib/mysql/'   --parent '809'  ''  '': 3 (No such process)
Aug 24 09:41:44 ffcdb1 mysqld: 2019-08-24  9:41:44 0 [ERROR] WSREP: Failed to read uuid:seqno and wsrep_gtid_domain_id from joiner script.
Aug 24 09:41:44 ffcdb1 mysqld: 2019-08-24  9:41:44 0 [ERROR] WSREP: SST failed: 3 (No such process)
Aug 24 09:41:44 ffcdb1 mysqld: 2019-08-24  9:41:44 0 [ERROR] Aborting
Aug 24 09:41:44 ffcdb1 mysqld: 2019-08-24  9:41:44 0 [Warning] WSREP: 1.0 (db3.HOSTNAME.intern): State transfer to 0.0 (ffcdb1.HOSTNAME.de) failed: -255 (Unknown error 255)
Aug 24 09:41:44 ffcdb1 mysqld: 2019-08-24  9:41:44 0 [ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():737: Will never receive state. Need to abort.

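Any idea why? SELinux is disabled and the required ports are open (TCP and UDP, both tested!).

I cannot explain this. Why does it happen?

I know I still have to set up DNS resolution! But that is not the error, is it?

Edit: Maybe the OS version matters too: I am using CentOS 7.

When I start the MariaDB service at ffm2 and check the cluster size, I get a size of 5 (which is correct). After the start command fails, it goes back to 4.

So I think the communication basically works, doesn't it?

Update 2: When I remove the Galera settings and start the MariaDB service, it works, and I also have all the data from the cluster available. Very strange…

No ideas?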

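Fixed by setting a higher timeout limit for the systemd service. The very first start took too long because the node has to complete its initial sync (the full SST) before the service counts as started.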
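
For anyone hitting the same thing, here is a minimal sketch of what such an override could look like. The log above shows systemd terminating the start at 09:39:14, about 90 seconds after the first mysqld entry at 09:37:44, which is consistent with a 90-second start timeout, while rsync was still copying data. The drop-in file name and the value of 900 seconds are my own assumptions for illustration, not the values actually used here; pick a value that comfortably covers a full SST for your data size.

```ini
# /etc/systemd/system/mariadb.service.d/timeout.conf
# (hypothetical file name; any *.conf file in this drop-in directory is read)
[Service]
# Give the first start, including the full state transfer (SST),
# more time before systemd gives up and terminates mysqld.
# 900 is an example value only, not a recommendation.
TimeoutStartSec=900
```

After creating the file, run `systemctl daemon-reload` and start the mariadb service again so the new limit takes effect.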

Source: https://serverfault.com/questions/980546