Ceph
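Why are oVirt VMs with Ceph disks stuck in the "Waiting for launch" state?
My setup consists of Ceph Mimic (CentOS 7, set up with ceph-ansible), a cinder/keystone combination on the Pike release, and oVirt 4.2.5.1.
The external cinder provider is set up and I can create disks.
When I create a VM and start it, the VM is shown as "Waiting for launch" in the oVirt dashboard.
On the oVirt node where it is supposed to run, I checked the VM with libvirt: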
# virsh --readonly list
 Id    Name                           State
----------------------------------------------------
 14    testceph                       paused
The domain configuration looks fine as well. Most importantly, the Ceph mons are listed in the disk configuration.
# virsh --readonly dumpxml
<domain type='kvm' id='15'>
  <name>testceph</name>
  <uuid>036a2385-2b4f-48f9-bcf9-8f2882ecde36</uuid>
  <metadata xmlns:ns0="http://ovirt.org/vm/tune/1.0" xmlns:ovirt-vm="http://ovirt.org/vm/1.0">
    <ns0:qos/>
    <ovirt-vm:vm xmlns:ovirt-vm="http://ovirt.org/vm/1.0">
      <ovirt-vm:clusterVersion>4.2</ovirt-vm:clusterVersion>
      <ovirt-vm:destroy_on_reboot type="bool">False</ovirt-vm:destroy_on_reboot>
      <ovirt-vm:launchPaused>false</ovirt-vm:launchPaused>
      <ovirt-vm:memGuaranteedSize type="int">2730</ovirt-vm:memGuaranteedSize>
      <ovirt-vm:minGuaranteedMemoryMb type="int">2730</ovirt-vm:minGuaranteedMemoryMb>
      <ovirt-vm:resumeBehavior>auto_resume</ovirt-vm:resumeBehavior>
      <ovirt-vm:startTime type="float">1535016868.02</ovirt-vm:startTime>
      <ovirt-vm:device mac_address="00:1a:4a:16:01:78">
        <ovirt-vm:specParams/>
        <ovirt-vm:vm_custom/>
      </ovirt-vm:device>
    </ovirt-vm:vm>
  </metadata>
  <maxMemory slots='16' unit='KiB'>16777216</maxMemory>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <vcpu placement='static' current='1'>16</vcpu>
  <iothreads>1</iothreads>
  <resource>
    <partition>/machine</partition>
  </resource>
  <sysinfo type='smbios'>
    <system>
      <entry name='manufacturer'>oVirt</entry>
      <entry name='product'>oVirt Node</entry>
      <entry name='version'>7-5.1804.el7.centos.2</entry>
      <entry name='serial'>49434D53-0200-9031-2500-31902500FB7F</entry>
      <entry name='uuid'>036a2385-2b4f-48f9-bcf9-8f2882ecde36</entry>
    </system>
  </sysinfo>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.3.0'>hvm</type>
    <smbios mode='sysinfo'/>
  </os>
  <features>
    <acpi/>
  </features>
  <cpu mode='custom' match='exact' check='partial'>
    <model fallback='forbid'>Nehalem</model>
    <topology sockets='16' cores='1' threads='1'/>
    <numa>
      <cell id='0' cpus='0' memory='4194304' unit='KiB'/>
    </numa>
  </cpu>
  <clock offset='variable' adjustment='0' basis='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw' error_policy='report'/>
      <source file='/rhev/data-center/mnt/192.168.10.6:_media_ovirt-cd-images/c394242c-81ae-4d6a-a193-65157cc84702/images/11111111-1111-1111-1111-111111111111/ubuntu-server-18.04.iso' startupPolicy='optional'/>
      <backingStore/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
      <boot order='2'/>
      <alias name='ua-fcca0dff-d833-4f28-b782-78ce0b016afe'/>
      <address type='drive' controller='0' bus='1' target='0' unit='0'/>
    </disk>
    <disk type='network' device='disk' snapshot='no'>
      <driver name='qemu' type='raw' cache='none' error_policy='stop' io='threads'/>
      <auth username='cinder'>
        <secret type='ceph' uuid='c6020051-6cd3-4ddf-982e-3d94c080de9c'/>
      </auth>
      <source protocol='rbd' name='volumes/volume-9b95b28c-9eec-4110-9973-88c161d3503f'>
        <host name='192.168.20.21' port='6789'/>
        <host name='192.168.20.22' port='6789'/>
        <host name='192.168.20.23' port='6789'/>
      </source>
      <target dev='sda' bus='scsi'/>
      <serial>9b95b28c-9eec-4110-9973-88c161d3503f</serial>
      <boot order='1'/>
      <alias name='ua-9b95b28c-9eec-4110-9973-88c161d3503f'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='usb' index='0' model='piix3-uhci'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='scsi' index='0'>
      <alias name='scsi0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <controller type='ide' index='0'>
      <alias name='ide'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='00:1a:4a:16:01:78'/>
      <source bridge='ovirtmgmt'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <filterref filter='vdsm-no-mac-spoofing'/>
      <link state='up'/>
      <mtu size='1500'/>
      <alias name='ua-446090cf-6758-4b3d-bd87-eb8b61442a46'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channels/036a2385-2b4f-48f9-bcf9-8f2882ecde36.ovirt-guest-agent.0'/>
      <target type='virtio' name='ovirt-guest-agent.0'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channels/036a2385-2b4f-48f9-bcf9-8f2882ecde36.org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <alias name='channel1'/>
      <address type='virtio-serial' controller='0' bus='0' port='2'/>
    </channel>
    <channel type='spicevmc'>
      <target type='virtio' name='com.redhat.spice.0'/>
      <alias name='channel2'/>
      <address type='virtio-serial' controller='0' bus='0' port='3'/>
    </channel>
    <input type='mouse' bus='ps2'>
      <alias name='input0'/>
    </input>
    <input type='keyboard' bus='ps2'>
      <alias name='input1'/>
    </input>
    <graphics type='spice' port='5900' tlsPort='5901' autoport='yes' listen='192.168.10.11' passwdValidTo='1970-01-01T00:00:01'>
      <listen type='network' address='192.168.10.11' network='vdsm-ovirtmgmt'/>
      <channel name='main' mode='secure'/>
      <channel name='display' mode='secure'/>
      <channel name='inputs' mode='secure'/>
      <channel name='cursor' mode='secure'/>
      <channel name='playback' mode='secure'/>
      <channel name='record' mode='secure'/>
      <channel name='smartcard' mode='secure'/>
      <channel name='usbredir' mode='secure'/>
    </graphics>
    <graphics type='vnc' port='5902' autoport='yes' listen='192.168.10.11' keymap='en-us' passwdValidTo='1970-01-01T00:00:01'>
      <listen type='network' address='192.168.10.11' network='vdsm-ovirtmgmt'/>
    </graphics>
    <video>
      <model type='qxl' ram='65536' vram='32768' vgamem='16384' heads='1' primary='yes'/>
      <alias name='ua-31537f3a-f1cf-4269-839c-bc82721ff7f3'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <stats period='5'/>
      <alias name='ua-beeefd9c-c2bd-4836-83f0-a28657219b3e'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </memballoon>
    <rng model='virtio'>
      <backend model='random'>/dev/urandom</backend>
      <alias name='ua-7d25e1b5-4d03-4bc3-80f6-83a80d69b391'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </rng>
  </devices>
  <seclabel type='dynamic' model='selinux' relabel='yes'>
    <label>system_u:system_r:svirt_t:s0:c565,c625</label>
    <imagelabel>system_u:object_r:svirt_image_t:s0:c565,c625</imagelabel>
  </seclabel>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+107:+107</label>
    <imagelabel>+107:+107</imagelabel>
  </seclabel>
</domain>
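For anyone who wants to check the disk section without reading the whole dump, here is a minimal Python sketch that pulls the RBD image, the cephx user and the monitor list out of a libvirt domain XML. The function name `rbd_disks` and the trimmed `SAMPLE` string are my own illustration, not part of oVirt or libvirt; the sample only repeats the disk element from the dump above.

```python
import xml.etree.ElementTree as ET


def rbd_disks(domain_xml):
    """Return one dict per RBD-backed disk found in a libvirt domain XML string."""
    root = ET.fromstring(domain_xml)
    disks = []
    for disk in root.findall("./devices/disk[@type='network']"):
        source = disk.find("source")
        # Skip network disks that are not served over the rbd protocol.
        if source is None or source.get("protocol") != "rbd":
            continue
        auth = disk.find("auth")
        disks.append({
            "image": source.get("name"),
            "user": auth.get("username") if auth is not None else None,
            "monitors": [
                (host.get("name"), int(host.get("port", "6789")))
                for host in source.findall("host")
            ],
        })
    return disks


# Trimmed-down copy of the RBD disk element from the dump above (illustration only).
SAMPLE = """
<domain type='kvm'>
  <devices>
    <disk type='network' device='disk' snapshot='no'>
      <auth username='cinder'>
        <secret type='ceph' uuid='c6020051-6cd3-4ddf-982e-3d94c080de9c'/>
      </auth>
      <source protocol='rbd' name='volumes/volume-9b95b28c-9eec-4110-9973-88c161d3503f'>
        <host name='192.168.20.21' port='6789'/>
        <host name='192.168.20.22' port='6789'/>
        <host name='192.168.20.23' port='6789'/>
      </source>
      <target dev='sda' bus='scsi'/>
    </disk>
  </devices>
</domain>
"""

for d in rbd_disks(SAMPLE):
    print(d["image"], d["user"], d["monitors"])
```

The same function should work on the full `virsh dumpxml` output saved to a file, since it only looks at `devices/disk` elements and ignores everything else.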
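Yes. I can see the oVirt node and the Ceph node talking to each other, but the oVirt node only talks to the Ceph monitor. No OSDs are involved. Both nodes can reach each other, and jumbo frames are enabled and working. In this output, the Ceph monitor is 192.168.20.21 and the oVirt node is 192.168.20.11.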
[root@ceph1 ~]# tcpflow -c -i enp3s0 src or dst host 192.168.20.11
tcpflow: listening on enp3s0
192.168.020.021.06789-192.168.020.011.44734: ceph v027
192.168.020.011.44734-192.168.020.021.06789: ceph v027
192.168.020.011.44734-192.168.020.021.06789: *D
192.168.020.011.44734-192.168.020.021.06789:
192.168.020.021.06789-192.168.020.011.44734:
192.168.020.021.06789-192.168.020.011.44736: ceph v027
192.168.020.011.44736-192.168.020.021.06789: ceph v027
192.168.020.011.44736-192.168.020.021.06789: *D
192.168.020.021.06789-192.168.020.011.44736:
192.168.020.021.06789-192.168.020.011.44738: ceph v027
192.168.020.011.44738-192.168.020.021.06789: ceph v027
192.168.020.011.44738-192.168.020.021.06789: *D!
192.168.020.021.06789-192.168.020.011.44738:
192.168.020.021.06789-192.168.020.011.44740: ceph v027
192.168.020.011.44740-192.168.020.021.06789: ceph v027
192.168.020.011.44740-192.168.020.021.06789: *D"
192.168.020.021.06789-192.168.020.011.44740:
192.168.020.021.06789-192.168.020.011.44742: ceph v027
192.168.020.011.44742-192.168.020.021.06789: ceph v027
192.168.020.011.44742-192.168.020.021.06789: *D#
192.168.020.021.06789-192.168.020.011.44742:
192.168.020.021.06789-192.168.020.011.44754: ceph v027
192.168.020.011.44754-192.168.020.021.06789: ceph v027
192.168.020.011.44754-192.168.020.021.06789: *D)
192.168.020.021.06789-192.168.020.011.44754:
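Since the capture only shows monitor traffic, one way to rule out plain firewall or routing trouble is to test from the oVirt node whether the monitor and OSD ports accept TCP connections at all. Below is a rough Python sketch of such a check. The helper names are mine, and the OSD endpoints have to come from your own cluster (for example from `ceph osd dump`); by default Ceph OSDs listen somewhere in the 6800-7300 port range. A successful connect only proves the port is reachable, it says nothing about whether the client library can actually speak to the daemon.

```python
import socket
import sys


def tcp_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def parse_endpoint(text):
    """Split 'host:port' into a (host, port) tuple."""
    host, _, port = text.rpartition(":")
    return host, int(port)


def check_endpoints(endpoints, timeout=2.0):
    """Map each (host, port) pair to whether it accepted a TCP connection."""
    return {ep: tcp_reachable(ep[0], ep[1], timeout) for ep in endpoints}


if __name__ == "__main__":
    # Endpoints are passed on the command line as host:port arguments.
    targets = [parse_endpoint(arg) for arg in sys.argv[1:]]
    for (host, port), ok in check_endpoints(targets).items():
        print(f"{host}:{port} {'open' if ok else 'unreachable'}")
```

Run on the oVirt node, it could be called with the three monitors from the disk config plus whatever OSD addresses the cluster reports, e.g. `python3 check_ceph_ports.py 192.168.20.21:6789 192.168.20.22:6789 192.168.20.23:6789 <osd-host>:<osd-port>` (the script name and the OSD placeholder are hypothetical).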
This goes on until the qemu log on the oVirt node says:
2018-08-23T09:34:31.233493Z qemu-kvm: -drive file=rbd:volumes/volume-9b95b28c-9eec-4110-9973-88c161d3503f:id=cinder:auth_supported=cephx\;none:mon_host=192.168.20.21\:6789\;192.168.20.22\:6789\;192.168.20.23\:6789,file.password-secret=ua-9b95b28c-9eec-4110-9973-88c161d3503f-secret0,format=raw,if=none,id=drive-ua-9b95b28c-9eec-4110-9973-88c161d3503f,serial=9b95b28c-9eec-4110-9973-88c161d3503f,cache=none,werror=stop,rerror=stop,aio=threads: 'serial' is deprecated, please use the corresponding option of '-device' instead
2018-08-23T09:39:31.281126Z qemu-kvm: -drive file=rbd:volumes/volume-9b95b28c-9eec-4110-9973-88c161d3503f:id=cinder:auth_supported=cephx\;none:mon_host=192.168.20.21\:6789\;192.168.20.22\:6789\;192.168.20.23\:6789,file.password-secret=ua-9b95b28c-9eec-4110-9973-88c161d3503f-secret0,format=raw,if=none,id=drive-ua-9b95b28c-9eec-4110-9973-88c161d3503f,serial=9b95b28c-9eec-4110-9973-88c161d3503f,cache=none,werror=stop,rerror=stop,aio=threads: error connecting: Connection timed out
2018-08-23 09:39:31.291+0000: shutting down, reason=failed
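So what keeps oVirt and Ceph from talking? Handing the image over through cinder works, but somehow oVirt never contacts the Ceph OSDs…
There seems to be a problem with the librbd1 library that ships with CentOS 7, the base of oVirt 4.2.x. It is too old to work with Ceph v13.x (aka Mimic), and possibly with v12 (Luminous) as well.
For a discussion of this issue, see this post on the oVirt forum.
There is a way to upgrade the library.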
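Before upgrading, it may help to confirm how far apart the client library and the cluster really are. The sketch below is only an illustration: it maps Ceph major version numbers to release names and flags a client whose major version is lower than the cluster's. The function names are mine, and the version strings in the example are placeholders, not values from this setup; take the real ones from `rpm -q librbd1` on the oVirt node and `ceph versions` on the cluster. A lower major version is a hint, not proof of incompatibility.

```python
# Ceph major version numbers and their release names.
CEPH_RELEASES = {
    9: "infernalis",
    10: "jewel",
    11: "kraken",
    12: "luminous",
    13: "mimic",
}


def ceph_major(version):
    """Extract the major version number from a string such as '13.2.1'."""
    return int(version.split(".")[0])


def release_name(version):
    """Return the Ceph release name for a version string."""
    major = ceph_major(version)
    if major == 0:
        # Releases up to hammer used 0.x numbering (hammer is 0.94).
        return "hammer or older"
    return CEPH_RELEASES.get(major, "unknown")


def client_older_than_cluster(client_version, cluster_version):
    """True if the client library's major version is below the cluster's."""
    return ceph_major(client_version) < ceph_major(cluster_version)


# Placeholder versions for illustration only; substitute the real ones.
client, cluster = "10.2.5", "13.2.1"
print(release_name(client), release_name(cluster),
      client_older_than_cluster(client, cluster))
```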