Linux
ZFS fails to recognize its own physical disks
I have a recurring problem with a ZFS pool in which ZFS stops recognizing its own, correctly labelled (or so it appears) physical devices.
Ubuntu 20.04.2 LTS
5.11.0-44-generic #48~20.04.2-Ubuntu SMP Tue Dec 14 15:36:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
libzfs2linux/now 0.8.3-1ubuntu12.11 amd64 [installed,upgradable to: 0.8.3-1ubuntu12.13]
zfs-zed/now 0.8.3-1ubuntu12.11 amd64 [installed,upgradable to: 0.8.3-1ubuntu12.13]
zfsutils-linux/now 0.8.3-1ubuntu12.11 amd64 [installed,upgradable to: 0.8.3-1ubuntu12.13]
Example scenarios:
- I can create a pool, attach a completely unrelated disk (e.g. an external USB drive), and on reboot (with the USB disk attached) ZFS reports one of the disks in its pool as missing.
- The same seems to happen when one (or more) of the drives is moved to a different controller. All the physical disks are present and all the labels/UUIDs appear intact; the only thing that changes is the device-node assignment.
It is hard to believe that ZFS would assemble a pool based on the system's device-assignment order while ignoring its own labels/UUIDs, but that is what it looks like.
agatek@mmstorage:~$ zpool status
  pool: mmdata
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid. Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: scrub in progress since Sun Jan 9 13:03:23 2022
        650G scanned at 1.58G/s, 188G issued at 468M/s, 22.7T total
        0B repaired, 0.81% done, 0 days 14:00:27 to go
config:

        NAME                                        STATE     READ WRITE CKSUM
        mmdata                                      DEGRADED     0     0     0
          raidz1-0                                  DEGRADED     0     0     0
            ata-HGST_HDN726040ALE614_K7HJG8HL       ONLINE       0     0     0
            6348126275544519230                     FAULTED      0     0     0  was /dev/sdb1
            ata-HGST_HDN726040ALE614_K3H14ZAL       ONLINE       0     0     0
            ata-HGST_HDN726040ALE614_K4K721RB       ONLINE       0     0     0
            ata-WDC_WD40EZAZ-00SF3B0_WD-WX12D514858P  ONLINE     0     0     0
            ata-ST4000DM004-2CV104_ZTT24X5R         ONLINE       0     0     0
            ata-WDC_WD40EZAZ-00SF3B0_WD-WX62D711SHF4  ONLINE     0     0     0
            sdi                                     ONLINE       0     0     0

errors: No known data errors

agatek@mmstorage:~$ blkid
/dev/sda1: UUID="E0FD-8D4F" TYPE="vfat" PARTUUID="7600a192-967b-417f-b726-7f5524be71a5"
/dev/sda2: UUID="9d8774ec-051f-4c60-aaa7-82f37dbaa4a4" TYPE="ext4" PARTUUID="425f31b2-f289-496a-911b-a2f8a9bb5c25"
/dev/sda3: UUID="e0b8852d-f781-4891-8e77-d8651f39a55b" TYPE="ext4" PARTUUID="a750bae3-c6ea-40a0-bdfa-0523e358018b"
/dev/sdb1: LABEL="mmdata" UUID="16683979255455566941" UUID_SUB="13253481390530831214" TYPE="zfs_member" PARTLABEL="zfs-5360ecc220877e69" PARTUUID="57fe2215-aa69-2f46-b626-0f2057a2e4a7"
/dev/sdd1: LABEL="mmdata" UUID="16683979255455566941" UUID_SUB="17929921080902463088" TYPE="zfs_member" PARTLABEL="zfs-f6ef14df86c7a6e1" PARTUUID="31a074a3-300d-db45-b9e2-3495f49c4bee"
/dev/sde1: LABEL="mmdata" UUID="16683979255455566941" UUID_SUB="505855664557329830" TYPE="zfs_member" PARTLABEL="zfs-6326993c142e4a03" PARTUUID="37f4954d-67fd-8945-82e6-d0db1f2af12e"
/dev/sdg1: LABEL="mmdata" UUID="16683979255455566941" UUID_SUB="1905592300789522892" TYPE="zfs_member" PARTLABEL="zfs-9d379d5bfd432a2b" PARTUUID="185eff00-196a-a642-9360-0d4532d54ec0"
/dev/sdi1: LABEL="mmdata" UUID="16683979255455566941" UUID_SUB="15862525770363300383" TYPE="zfs_member" PARTLABEL="zfs-3c99aa22a45c59bf" PARTUUID="89f1600a-b58e-c74c-8d5e-6fdd186a6db0"
/dev/sdh1: LABEL="mmdata" UUID="16683979255455566941" UUID_SUB="15292769945216849639" TYPE="zfs_member" PARTLABEL="zfs-ee9e1c9a5bde878c" PARTUUID="2e70d63b-00ba-f842-b82d-4dba33314dd5"
/dev/sdf1: LABEL="mmdata" UUID="16683979255455566941" UUID_SUB="5773484836304595337" TYPE="zfs_member" PARTLABEL="zfs-ee40cf2140012e24" PARTUUID="e5cc3e2a-f7c9-d54e-96de-e62a723a9c3f"
/dev/sdc1: LABEL="mmdata" UUID="16683979255455566941" UUID_SUB="6348126275544519230" TYPE="zfs_member" PARTLABEL="zfs-0d28f0d2715eaff8" PARTUUID="a328981a-7569-294a-bbf6-9d26660e2aad"
Here is what happened to the pool above. One of the devices had failed earlier. I attached a replacement disk to a second controller and ran a replace; it succeeded and the pool was fine. Next, the faulted device was removed from the pool and physically replaced by the replacement disk (a controller change). After rebooting I found the pool in a DEGRADED state, with one of the devices reported as missing. The scrub shown above was triggered by a zpool clear command.
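The sequence I ran was roughly the following (the by-id name of the replacement disk is a placeholder here, not the actual device):

```shell
# Identify the faulted vdev; zpool status reports it by GUID
# (6348126275544519230) once the original device is gone.
zpool status mmdata

# Resilver onto the new disk attached to the second controller.
# ata-NEW_DISK_SERIAL is a placeholder for the real by-id name.
zpool replace mmdata 6348126275544519230 /dev/disk/by-id/ata-NEW_DISK_SERIAL

# After the reboot and physical swap, clear the error state;
# this is the command that kicked off the scrub shown above.
zpool clear mmdata
```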
So, as blkid shows, there are 8 disks, all correctly partitioned and labelled (I believe), yet one of the devices is not recognized as part of the pool. What can be done in this situation? It is extremely annoying, as resilvering the pool takes several days.
If you add a device to a pool using a /dev/sdX path, that name can change, because the Linux kernel does not guarantee any ordering of those drive entries across reboots or controller changes. In your output, /dev/sdi is a member of the pool, and that name can change at any time. You should try

zpool export mmdata

to take the array offline, and then

zpool import -d /dev/disk/by-id mmdata

to import it again using the drives' persistent IDs.
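A minimal sketch of the whole fix, assuming nothing is using the pool's datasets so the export can complete cleanly:

```shell
# Take the pool offline; this fails if any dataset is busy.
sudo zpool export mmdata

# Re-import, telling ZFS to scan the persistent /dev/disk/by-id
# names instead of the volatile /dev/sdX nodes.
sudo zpool import -d /dev/disk/by-id mmdata

# Verify that every vdev is now listed by a stable ata-*/wwn-* name
# rather than an sdX device node.
zpool status mmdata
```

After this, the pool membership no longer depends on the order in which the kernel enumerates the disks, so moving a drive between controllers should not degrade the pool.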