
LVM: "Couldn't find device with uuid" yet blkid finds the UUID

  • October 2, 2018

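I have a SLES 11.2 PPC (3.0.58-0.6.6-ppc64) system that has lost track of its volume group (the LVs hold data that is not critical, but it would be nice to get it back). The disks are attached over two fibre paths from a SAN.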

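The problem began when I rebooted the system ahead of a planned power outage last Friday. I had no time to troubleshoot before shutting it down again. The volume group had been in use without problems for about two years.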

vgscan and pvscan return nothing:

# pvscan -vP
Partial mode. Incomplete logical volumes will be processed.
Wiping cache of LVM-capable devices
Wiping internal VG cache
Walking through all physical volumes
No matching physical volumes found
# vgscan -vP
Partial mode. Incomplete logical volumes will be processed.
 Wiping cache of LVM-capable devices
 Wiping internal VG cache
Reading all physical volumes.  This may take a while...
 Finding all volume groups
No volume groups found

vgcfgrestore reports that it cannot find the PVs:

# vgcfgrestore vgclients
Couldn't find device with uuid PyKfIa-cCs9-gBoh-Qb50-yOw4-dHQw-N1YELU.
Couldn't find device with uuid FXfSAO-P9hO-Dgtl-0Ihf-x2jX-TnHU-kSqUA2.
Cannot restore Volume Group vgclients with 2 PVs marked as missing.
Restore failed.

But blkid can find those UUIDs:

# blkid -t UUID=PyKfIa-cCs9-gBoh-Qb50-yOw4-dHQw-N1YELU
/dev/mapper/3600a0b800029df24000011084db97741: UUID="PyKfIa-cCs9-gBoh-Qb50-yOw4-dHQw-N1YELU" TYPE="LVM2_member" 
/dev/sdl: UUID="PyKfIa-cCs9-gBoh-Qb50-yOw4-dHQw-N1YELU" TYPE="LVM2_member" 
/dev/sdw: UUID="PyKfIa-cCs9-gBoh-Qb50-yOw4-dHQw-N1YELU" TYPE="LVM2_member" 
# blkid -t UUID=FXfSAO-P9hO-Dgtl-0Ihf-x2jX-TnHU-kSqUA2
/dev/mapper/3600a0b800029df24000017ae4f45f30b: UUID="FXfSAO-P9hO-Dgtl-0Ihf-x2jX-TnHU-kSqUA2" TYPE="LVM2_member" 
/dev/sdg: UUID="FXfSAO-P9hO-Dgtl-0Ihf-x2jX-TnHU-kSqUA2" TYPE="LVM2_member" 
/dev/sdr: UUID="FXfSAO-P9hO-Dgtl-0Ihf-x2jX-TnHU-kSqUA2" TYPE="LVM2_member" 

/etc/lvm/backup/vgclients has all the right information and does not mark the PVs as missing:

# egrep "(N1YELU|kSqUA2|dm-|ALLOC)" /etc/lvm/backup/vgclients
                   id = "PyKfIa-cCs9-gBoh-Qb50-yOw4-dHQw-N1YELU"
                   device = "/dev/dm-7"    # Hint only
                   status = ["ALLOCATABLE"]
                   id = "FXfSAO-P9hO-Dgtl-0Ihf-x2jX-TnHU-kSqUA2"
                   device = "/dev/dm-12"   # Hint only
                   status = ["ALLOCATABLE"]

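I have confirmed on the SAN that the volumes dedicated to (and named for) LVM on this server are presented, and that the identifiers (ending in f30b or 7741) match between the SAN and the server: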

# multipath -ll | egrep -A5 "(f30b|7741)"
3600a0b800029df24000017ae4f45f30b dm-7 IBM,1814      FAStT
size=575G features='1 queue_if_no_path' hwhandler='1 rdac' wp=rw
|-+- policy='round-robin 0' prio=6 status=active
| `- 6:0:0:1   sdr  65:16  active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
 `- 5:0:0:1   sdg  8:96   active ghost running
--
3600a0b800029df24000011084db97741 dm-12 IBM,1814      FAStT
size=834G features='1 queue_if_no_path' hwhandler='1 rdac' wp=rw
|-+- policy='round-robin 0' prio=6 status=active
| `- 5:0:0:7   sdl  8:176  active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
 `- 6:0:0:7   sdw  65:96  active ghost running

Neither device has a partition table (by design):

# fdisk -l /dev/dm-7 /dev/dm-12 | grep table
Disk /dev/dm-7 doesn't contain a valid partition table
Disk /dev/dm-12 doesn't contain a valid partition table

I can read from the devices directly:

# dd if=/dev/dm-7 of=/tmp/a bs=1024 count=1
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 0.00121051 s, 846 kB/s
# strings /tmp/a
LABELONE
LVM2 001FXfSAOP9hODgtl0Ihfx2jXTnHUkSqUA2
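
The same check can be scripted. The sketch below is a minimal illustration, not an LVM tool: has_lvm_label is a hypothetical helper name, and it assumes the LVM2 label magic LABELONE sits at the start of one of the first four 512-byte sectors (by default the second), which matches the dd output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of LVM): succeed if the LVM2 label magic
# "LABELONE" starts any of the first four 512-byte sectors of a device
# or image file.
has_lvm_label() {
    dev=$1
    for sector in 0 1 2 3; do
        # Read 8 bytes at the start of the sector; drop NUL bytes so the
        # shell does not complain about binary data.
        magic=$(dd if="$dev" bs=1 skip=$((sector * 512)) count=8 2>/dev/null | tr -d '\0')
        if [ "$magic" = "LABELONE" ]; then
            return 0
        fi
    done
    return 1
}

# Usage (device name taken from the multipath output in this post):
#   has_lvm_label /dev/dm-7 && echo "LVM2 label present"
```

A positive result only shows that the label is readable on disk. It says nothing about whether LVM's device filter lets the tools look at that device.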

I have tried rebooting, as well as deleting sd(r|g|l|w) and dm-(7|12) and rescanning, to no effect.

I tried recreating the PVs from the backup values, but it still says it cannot find them.

# pvcreate --uuid "PyKfIa-cCs9-gBoh-Qb50-yOw4-dHQw-N1YELU" --restorefile /etc/lvm/backup/vgclients /dev/mapper/3600a0b800029df24000011084db97741 -t
 Test mode: Metadata will NOT be updated and volumes will not be (de)activated.
 Couldn't find device with uuid PyKfIa-cCs9-gBoh-Qb50-yOw4-dHQw-N1YELU.
 Couldn't find device with uuid FXfSAO-P9hO-Dgtl-0Ihf-x2jX-TnHU-kSqUA2.
 Device /dev/mapper/3600a0b800029df24000011084db97741 not found (or ignored by filtering).

Here is my lvm.conf. As far as I know, the only change I ever made was to raise the log level:

# egrep -v "^( *#|$)" /etc/lvm/lvm.conf
devices {
   dir = "/dev"
   scan = [ "/dev" ]
   preferred_names = [ ]
   filter = [ "a|^/dev/sda$|", "r/.*/" ]
   cache = "/etc/lvm/.cache"
   write_cache_state = 1
   sysfs_scan = 1      
   md_component_detection = 1
   ignore_suspended_devices = 0
}
log {
   verbose = 0
   syslog = 1
   overwrite = 0
   level = 2

   indent = 1
   command_names = 0
   prefix = "  "
}
backup {
   backup = 1
   backup_dir = "/etc/lvm/backup"
   archive = 1
   archive_dir = "/etc/lvm/archive"

   retain_min = 10
   retain_days = 30
}
shell {
   history_size = 100
}
global {

   umask = 077
   test = 0
   units = "h"
   activation = 1
   proc = "/proc"
   locking_type = 3
   fallback_to_clustered_locking = 1
   fallback_to_local_locking = 1
   locking_dir = "/var/run/lvm/lock"
}
activation {
   missing_stripe_filler = "/dev/ioerror"
   reserved_stack = 256
   reserved_memory = 8192
   process_priority = -18
   mirror_region_size = 512
   readahead = "auto"
   mirror_log_fault_policy = "allocate"
   mirror_device_fault_policy = "remove"

   udev_rules = 1
   udev_sync = 1
}
dmeventd {
   mirror_library = "libdevmapper-event-lvm2mirror.so"
   snapshot_library = "libdevmapper-event-lvm2snapshot.so"
}

So what gives? Where did my VG go, and how do I get it back?

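A document in the Novell knowledge base seems to apply here: it explains that on SLES, LVM does not scan multipath devices by default, so in this situation it never sees them.

To resolve this, you can apply the workaround that Novell provides: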

In the devices section of /etc/lvm/lvm.conf, change the filter to:

filter = [ "a|/dev/sda.*|", "a|/dev/disk/by-id/dm-uuid-.*mpath-.*|", "r|.*|"]

(This is for SLES 11. For other versions, see the linked KB article.)
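
To see why the filter in the question hides the multipath devices, it may help to emulate how the list is evaluated: patterns are tried in order, the first one that matches a device path decides whether it is accepted (a) or rejected (r), and a path that matches nothing is accepted. The sketch below is only an illustration under those assumptions. filter_verdict is a hypothetical helper, not part of LVM; it takes patterns written as a:REGEX or r:REGEX instead of lvm.conf syntax, it checks one path name at a time, and the by-id path in the usage lines is an assumed name for the multipath symlink.

```shell
#!/bin/sh
# Hypothetical helper (not part of LVM): first-match-wins evaluation of
# an accept/reject pattern list against a single device path.
# Usage: filter_verdict PATH RULE...   where RULE is a:REGEX or r:REGEX
filter_verdict() (
    path=$1
    shift
    for rule in "$@"; do
        action=${rule%%:*}   # text before the first colon: a or r
        regex=${rule#*:}     # text after the first colon: the pattern
        if printf '%s\n' "$path" | grep -Eq -- "$regex"; then
            if [ "$action" = a ]; then echo accept; else echo reject; fi
            exit 0           # leaves only this subshell function
        fi
    done
    echo accept              # no pattern matched: accepted by default
)

# Filter from the question: only /dev/sda passes, everything else is rejected.
filter_verdict /dev/mapper/3600a0b800029df24000011084db97741 \
    'a:^/dev/sda$' 'r:.*'
# prints: reject

# Filter from the workaround: the multipath by-id name is accepted.
filter_verdict /dev/disk/by-id/dm-uuid-mpath-3600a0b800029df24000011084db97741 \
    'a:/dev/sda.*' 'a:/dev/disk/by-id/dm-uuid-.*mpath-.*' 'r:.*'
# prints: accept
```

With the workaround filter the plain sdX path devices are still rejected, so LVM only sees each PV once, through its multipath map.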

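When I ran into the same problem, vgreduce --removemissing helped me.

Here is how I created the problem.

I was planning to extend an existing logical volume, but in another terminal I ran fdisk on the device I was going to use for the extension, because it had an old partition table that I wanted to remove. I had already added this disk to the volume group (I just had not extended the logical volume itself yet), so once the physical volume no longer existed, the missing-UUID problem appeared.

To fix it, I ran vgreduce --removemissing, and after that everything was fine.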

Source: https://serverfault.com/questions/499300