MariaDB Galera replication under Vagrant
I am provisioning a Galera MySQL cluster under Vagrant using a multi-machine Vagrantfile.
I do not believe the problem lies with Vagrant itself.
Vagrant version
Vagrantfile
Vagrant.configure(2) do |config|
  config.vm.box = "ubuntu/trusty64"
  config.vm.provider "virtualbox" do |vb|
    vb.memory = "2048"
  end
  config.ssh.forward_agent = true

  if Vagrant.has_plugin?("vagrant-cachier")
    config.cache.scope = :box
    config.cache.enable :apt
  end

  config.vm.define "core0" do |core0|
    core0.vm.network "private_network", ip: "192.168.50.3"
    core0.vm.hostname = "core0"
    core0.vm.provision :hosts, :sync_hosts => true
    core0.vm.provision "shell", inline: <<-SHELL
      sudo python /vagrant/bootstrap.pex --core-nodes core0 core1 core2 --node-zero
    SHELL
  end

  config.vm.define "core1" do |core1|
    core1.vm.network "private_network", ip: "192.168.50.4"
    core1.vm.hostname = "core1"
    core1.vm.provision :hosts, :sync_hosts => true
    core1.vm.provision "shell", inline: <<-SHELL
      sudo python /vagrant/bootstrap.pex --master core0 --core
    SHELL
  end

  config.vm.define "core2" do |core2|
    core2.vm.network "private_network", ip: "192.168.50.5"
    core2.vm.hostname = "core2"
    core2.vm.provision :hosts, :sync_hosts => true
    core2.vm.provision "shell", inline: <<-SHELL
      sudo python /vagrant/bootstrap.pex --master core0 --core
    SHELL
  end
end
Vagrant plugins
I am using the vagrant-cachier and vagrant-hosts plugins.
Vagrant creates and provisions each VM in turn, and I then run a two-stage provision so that networking between the boxes is established before attempting to form the cluster:
vagrant up --provision-with hosts && vagrant provision --provision-with shell
The shell provisioner uses Salt to create and install MariaDB and Gluster.
MariaDB version
vagrant@core0:~$ sudo apt-cache policy mariadb-server-core-10.1
mariadb-server-core-10.1:
  Installed: 10.1.18+maria-1~trusty
  Candidate: 10.1.18+maria-1~trusty
vagrant@core0:~$ sudo apt-cache policy galera-3
galera-3:
  Installed: 25.3.18-trusty
  Candidate: 25.3.18-trusty
I configure the cluster address in galera.cnf as:
wsrep_cluster_address = gcomm://core2,core0,core1
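For context, a minimal sketch of the relevant galera.cnf section on these nodes; the cluster name TestSystem and the xtrabackup-v2 SST method are taken from the logs below, and the provider path is only the usual location for the galera-3 package, so it may differ:
[galera]
wsrep_on = ON
# assumed default location of the wsrep provider shipped by the galera-3 package
wsrep_provider = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_name = TestSystem
wsrep_cluster_address = gcomm://core2,core0,core1
wsrep_sst_method = xtrabackup-v2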
When hosts core1 and core2 attempt to join core0, they are unable to do so.
core1 joins the cluster
core1 is able to find core0 and retrieve the current cluster state:
Oct 12 15:15:02 core1 mysqld: 2016-10-12 15:15:02 140403237877696 [Note] WSREP: gcomm: connecting to group 'TestSystem', peer 'core2:,core0:,core1:'
Oct 12 15:15:02 core1 mysqld: 2016-10-12 15:15:02 140403237877696 [Note] WSREP: (a61950db, 'tcp://0.0.0.0:4567') connection established to a61950db tcp://127.0.0.1:4567
Oct 12 15:15:02 core1 mysqld: 2016-10-12 15:15:02 140403237877696 [Note] WSREP: (a61950db, 'tcp://0.0.0.0:4567') connection established to a61950db tcp://127.0.1.1:4567
Oct 12 15:15:02 core1 mysqld: 2016-10-12 15:15:02 140403237877696 [Warning] WSREP: (a61950db, 'tcp://0.0.0.0:4567') address 'tcp://127.0.1.1:4567' points to own listening address, blacklisting
Oct 12 15:15:02 core1 mysqld: 2016-10-12 15:15:02 140403237877696 [Note] WSREP: (a61950db, 'tcp://0.0.0.0:4567') connection established to a5301480 tcp://192.168.50.3:4567
Oct 12 15:15:02 core1 mysqld: 2016-10-12 15:15:02 140403237877696 [Note] WSREP: (a61950db, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
Oct 12 15:15:03 core1 mysqld: 2016-10-12 15:15:03 140403237877696 [Note] WSREP: declaring a5301480 at tcp://192.168.50.3:4567 stable
Oct 12 15:15:03 core1 mysqld: 2016-10-12 15:15:03 140403237877696 [Note] WSREP: Node a5301480 state prim
core2 unavailable
As expected, core2 is not available at this point:
Oct 12 15:15:03 core1 mysqld: 2016-10-12 15:15:03 140403237877696 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.50.5:4567
Oct 12 15:15:03 core1 mysqld: 2016-10-12 15:15:03 140403237877696 [Note] WSREP: gcomm: connected
SST failure
core1 attempts to connect to core0 using 10.0.2.15, the Vagrant NAT address:
Oct 12 15:15:03 core1 mysqld: 2016-10-12 15:15:03 140403237563136 [Note] WSREP: State transfer required:
Oct 12 15:15:03 core1 mysqld: #011Group state: a530f9fd-908d-11e6-a72a-b2e3a6b91029:1113
Oct 12 15:15:03 core1 mysqld: #011Local state: 00000000-0000-0000-0000-000000000000:-1
Oct 12 15:15:03 core1 mysqld: 2016-10-12 15:15:03 140403237563136 [Note] WSREP: New cluster view: global state: a530f9fd-908d-11e6-a72a-b2e3a6b91029:1113, view# 2: Primary, number of nodes: 2, my index: 1, protocol version 3
Oct 12 15:15:03 core1 mysqld: 2016-10-12 15:15:03 140403237563136 [Warning] WSREP: Gap in state sequence. Need state transfer.
Oct 12 15:15:03 core1 mysqld: 2016-10-12 15:15:03 140402002753280 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.0.2.15' --datadir '/var/lib/mysql/' --parent '9043' --binlog '/var/log/mariadb_bin/mariadb-bin' '
Oct 12 15:15:03 core1 mysqld: WSREP_SST: [INFO] Logging all stderr of SST/Innobackupex to syslog (20161012 15:15:03.985)
Oct 12 15:15:03 core1 -wsrep-sst-joiner: Streaming with xbstream
Oct 12 15:15:03 core1 -wsrep-sst-joiner: Using socat as streamer
Oct 12 15:15:04 core1 -wsrep-sst-joiner: Evaluating timeout -k 110 100 socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} )
Oct 12 15:15:04 core1 mysqld: 2016-10-12 15:15:04 140403237563136 [Note] WSREP: Prepared SST request: xtrabackup-v2|10.0.2.15:4444/xtrabackup_sst//1
Oct 12 15:15:04 core1 mysqld: 2016-10-12 15:15:04 140403237563136 [Note] WSREP: REPL Protocols: 7 (3, 2)
Oct 12 15:15:04 core1 mysqld: 2016-10-12 15:15:04 140402075592448 [Note] WSREP: Service thread queue flushed.
Oct 12 15:15:04 core1 mysqld: 2016-10-12 15:15:04 140403237563136 [Note] WSREP: Assign initial position for certification: 1113, protocol version: 3
Oct 12 15:15:04 core1 mysqld: 2016-10-12 15:15:04 140402075592448 [Note] WSREP: Service thread queue flushed.
Oct 12 15:15:04 core1 mysqld: 2016-10-12 15:15:04 140403237563136 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (a530f9fd-908d-11e6-a72a-b2e3a6b91029): 1 (Operation not permitted)
Oct 12 15:15:04 core1 mysqld: #011 at galera/src/replicator_str.cpp:prepare_for_IST():482. IST will be unavailable.
Oct 12 15:15:04 core1 mysqld: 2016-10-12 15:15:04 140402019526400 [Note] WSREP: Member 1.0 (core1) requested state transfer from '*any*'. Selected 0.0 (core0)(SYNCED) as donor.
Oct 12 15:15:04 core1 mysqld: 2016-10-12 15:15:04 140402019526400 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 1113)
Oct 12 15:15:04 core1 mysqld: 2016-10-12 15:15:04 140403237563136 [Note] WSREP: Requesting state transfer: success, donor: 0
Oct 12 15:15:04 core1 mysqld: 2016-10-12 15:15:04 140402019526400 [Warning] WSREP: 0.0 (core0): State transfer to 1.0 (core1) failed: -32 (Broken pipe)
Oct 12 15:15:04 core1 mysqld: 2016-10-12 15:15:04 140402019526400 [ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():736: Will never receive state. Need to abort.
wsrep status on core0
Logging into mysql on core0 and running:
SHOW GLOBAL STATUS LIKE 'wsrep_%'
+------------------------------+--------------------------------------+
| Variable_name                | Value                                |
+------------------------------+--------------------------------------+
...
| wsrep_cluster_state_uuid     | a530f9fd-908d-11e6-a72a-b2e3a6b91029 |
| wsrep_cluster_status         | Primary                              |
| wsrep_gcomm_uuid             | a5301480-908d-11e6-a84e-0b2444c3985f |
| wsrep_incoming_addresses     | 10.0.2.15:3306                       |
| wsrep_local_state            | 4                                    |
| wsrep_local_state_comment    | Synced                               |
| wsrep_local_state_uuid       | a530f9fd-908d-11e6-a72a-b2e3a6b91029 |
...
+------------------------------+--------------------------------------+
So core0 appears to advertise its wsrep incoming address as 10.0.2.15:3306, which is not the address I expect, 192.168.50.3:3306.
ifconfig on core0
This shows the NAT address on eth0:
vagrant@core0:~$ ifconfig
eth0      Link encap:Ethernet  HWaddr 08:00:27:de:04:89
          inet addr:10.0.2.15  Bcast:10.0.2.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fede:489/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:218886 errors:0 dropped:0 overruns:0 frame:0
          TX packets:81596 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:205966097 (205.9 MB)  TX bytes:6015101 (6.0 MB)

eth1      Link encap:Ethernet  HWaddr 08:00:27:bc:f7:ee
          inet addr:192.168.50.3  Bcast:192.168.50.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:febc:f7ee/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:261637 errors:0 dropped:0 overruns:0 frame:0
          TX packets:244284 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:59467905 (59.4 MB)  TX bytes:114065906 (114.0 MB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:246320 errors:0 dropped:0 overruns:0 frame:0
          TX packets:246320 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:64552545 (64.5 MB)  TX bytes:64552545 (64.5 MB)
How/why is the address being set to this, and is there a way to make it use the correct address?
Update
How wsrep_incoming_addresses is set
While wsrep_cluster_address must be specified when the node starts, wsrep_incoming_addresses is determined internally during initialization. On Linux, the command used to determine the IP address picks the first available global IP address from the interface list:
ip addr show | grep '^\s*inet' | grep -m1 global | awk ' {print $2 } ' | sed 's/\/.*//'
https://mariadb.atlassian.net/browse/MDEV-5487
My output:
vagrant@core0:~$ ip addr show | grep '^\s*inet' | grep -m1 global | awk '
> {print $2 }
> ' | sed 's/\/.*//'
10.0.2.15
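Since eth0 (the Vagrant NAT interface) is listed first, that is the address the command picks up. For comparison, a quick sketch of querying the private interface directly, assuming it is eth1 as in the ifconfig output above:
ip addr show eth1 | grep -m1 'inet ' | awk '{print $2}' | sed 's/\/.*//'
On core0 this prints 192.168.50.3, the address I would want advertised.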
The solution here was to set wsrep_sst_receive_address:
https://mariadb.com/kb/en/mariadb/galera-cluster-system-variables/#wsrep_sst_receive_address
Setting this to the hostname I assigned to the Vagrant box resolved the problem for me.
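A minimal sketch of the per-node entry in galera.cnf, shown for core0 (core1 and core2 use their own hostnames):
# galera.cnf on core0; the hostname resolves to 192.168.50.3 through the
# /etc/hosts entries synced by vagrant-hosts
wsrep_sst_receive_address = core0
Because the hostname resolves to the private-network address rather than the NAT address, the SST stream is set up on the correct interface. The related wsrep_node_address variable can be used in a similar way to control the address a node advertises.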