Linux
What steps should I take to best attempt to recover a failed software RAID5 setup?
My RAID has failed, and I'm not sure what the best steps are to try to recover it.

I have 4 drives in a RAID5 configuration. One of them (`sde1`) appears to have failed, but `md` cannot start the array because it says `sdd1` is non-fresh. Is there anything I can do to recover the array?

I've pasted some excerpts from `/var/log/messages` and `mdadm --examine` below:
/var/log/messages

```
$ egrep -w sd[b,c,d,e]\|raid\|md /var/log/messages
nas kernel: [...] sd 5:0:0:0: [sde]
nas kernel: [...] sd 5:0:0:0: [sde] CDB:
nas kernel: [...] end_request: I/O error, dev sde, sector 937821218
nas kernel: [...] sd 5:0:0:0: [sde] killing request
nas kernel: [...] md/raid:md0: read error not correctable (sector 937821184 on sde1).
nas kernel: [...] md/raid:md0: Disk failure on sde1, disabling device.
nas kernel: [...] md/raid:md0: Operation continuing on 2 devices.
nas kernel: [...] md/raid:md0: read error not correctable (sector 937821256 on sde1).
nas kernel: [...] sd 5:0:0:0: [sde] Unhandled error code
nas kernel: [...] sd 5:0:0:0: [sde]
nas kernel: [...] sd 5:0:0:0: [sde] CDB:
nas kernel: [...] end_request: I/O error, dev sde, sector 937820194
nas kernel: [...] sd 5:0:0:0: [sde] Synchronizing SCSI cache
nas kernel: [...] sd 5:0:0:0: [sde]
nas kernel: [...] sd 5:0:0:0: [sde] Stopping disk
nas kernel: [...] sd 5:0:0:0: [sde] START_STOP FAILED
nas kernel: [...] sd 5:0:0:0: [sde]
nas kernel: [...] md: unbind<sde1>
nas kernel: [...] md: export_rdev(sde1)
nas kernel: [...] md: bind<sdd1>
nas kernel: [...] md: bind<sdc1>
nas kernel: [...] md: bind<sdb1>
nas kernel: [...] md: bind<sde1>
nas kernel: [...] md: kicking non-fresh sde1 from array!
nas kernel: [...] md: unbind<sde1>
nas kernel: [...] md: export_rdev(sde1)
nas kernel: [...] md: kicking non-fresh sdd1 from array!
nas kernel: [...] md: unbind<sdd1>
nas kernel: [...] md: export_rdev(sdd1)
nas kernel: [...] md: raid6 personality registered for level 6
nas kernel: [...] md: raid5 personality registered for level 5
nas kernel: [...] md: raid4 personality registered for level 4
nas kernel: [...] md/raid:md0: device sdb1 operational as raid disk 2
nas kernel: [...] md/raid:md0: device sdc1 operational as raid disk 0
nas kernel: [...] md/raid:md0: allocated 4338kB
nas kernel: [...] md/raid:md0: not enough operational devices (2/4 failed)
nas kernel: [...] md/raid:md0: failed to run raid set.
nas kernel: [...] md: pers->run() failed ...
```
mdadm --examine

```
$ mdadm --examine /dev/sd[bcdefghijklmn]1
/dev/sdb1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 4dc53f9d:f0c55279:a9cb9592:a59607c9
           Name : NAS:0
  Creation Time : Sun Sep 11 02:37:59 2011
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3907027053 (1863.02 GiB 2000.40 GB)
     Array Size : 5860538880 (5589.05 GiB 6001.19 GB)
  Used Dev Size : 3907025920 (1863.02 GiB 2000.40 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : e8369dbc:bf591efa:f0ccc359:9d164ec8

    Update Time : Tue May 27 18:54:37 2014
       Checksum : a17a88c0 - correct
         Events : 1026050

         Layout : left-symmetric
     Chunk Size : 512K

    Device Role : Active device 2
    Array State : A.A. ('A' == active, '.' == missing)
/dev/sdc1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 4dc53f9d:f0c55279:a9cb9592:a59607c9
           Name : NAS:0
  Creation Time : Sun Sep 11 02:37:59 2011
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3907027053 (1863.02 GiB 2000.40 GB)
     Array Size : 5860538880 (5589.05 GiB 6001.19 GB)
  Used Dev Size : 3907025920 (1863.02 GiB 2000.40 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 78221e11:02acc1c8:c4eb01bf:f0852cbe

    Update Time : Tue May 27 18:54:37 2014
       Checksum : 1fbb54b8 - correct
         Events : 1026050

         Layout : left-symmetric
     Chunk Size : 512K

    Device Role : Active device 0
    Array State : A.A. ('A' == active, '.' == missing)
/dev/sdd1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 4dc53f9d:f0c55279:a9cb9592:a59607c9
           Name : NAS:0
  Creation Time : Sun Sep 11 02:37:59 2011
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3907027053 (1863.02 GiB 2000.40 GB)
     Array Size : 5860538880 (5589.05 GiB 6001.19 GB)
  Used Dev Size : 3907025920 (1863.02 GiB 2000.40 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : fd282483:d2647838:f6b9897e:c216616c

    Update Time : Mon Oct 7 19:21:22 2013
       Checksum : 6df566b8 - correct
         Events : 32621

         Layout : left-symmetric
     Chunk Size : 512K

    Device Role : Active device 3
    Array State : AAAA ('A' == active, '.' == missing)
/dev/sde1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 4dc53f9d:f0c55279:a9cb9592:a59607c9
           Name : NAS:0
  Creation Time : Sun Sep 11 02:37:59 2011
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3907027053 (1863.02 GiB 2000.40 GB)
     Array Size : 5860538880 (5589.05 GiB 6001.19 GB)
  Used Dev Size : 3907025920 (1863.02 GiB 2000.40 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : e84657dd:0882a7c8:5918b191:2fc3da02

    Update Time : Tue May 27 18:46:12 2014
       Checksum : 33ab6fe - correct
         Events : 1026039

         Layout : left-symmetric
     Chunk Size : 512K

    Device Role : Active device 1
    Array State : AAA. ('A' == active, '.' == missing)
```
You have a double drive failure, and one of those drives has been dead for six months. With RAID5, that is not recoverable. Replace the failed hardware and restore from backups.

Going forward, consider RAID6 with large drives like these, and make sure you have proper monitoring in place to catch device failures so you can respond as quickly as possible.
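For example, `mdadm` itself can run in monitor mode and mail you when a member drops out. A minimal sketch, assuming mail delivery already works on the host (the address is a placeholder, and on Debian-based systems the config file lives at `/etc/mdadm/mdadm.conf`):

```
# In /etc/mdadm.conf: where alert mail should go
MAILADDR admin@example.com

# Run the monitor as a daemon; --test generates one test alert per
# array at startup so you can confirm the mail path actually works
mdadm --monitor --scan --daemonise --test
```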
Well, if your backups aren't current, you could try forcing a reassembly in degraded mode with the three remaining drives…
```
mdadm -v --assemble --force /dev/md0 /dev/sdb1 /dev/sdc1 /dev/sde1
```
Since `sde1` is only slightly out of sync in terms of Update Time and Events count, I suspect you will be able to access most of your data. I have done this successfully many times in similar RAID5 failure situations.
- sdb1 Update Time: Tue May 27 18:54:37 2014
- sdc1 Update Time: Tue May 27 18:54:37 2014
- sdd1 Update Time: Mon Oct 7 19:21:22 2013
- sde1 Update Time: Tue May 27 18:46:12 2014
- sdb1 Events: 1026050
- sdc1 Events: 1026050
- sdd1 Events: 32621
- sde1 Events: 1026039
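If the forced assembly succeeds, a reasonable follow-up is to verify the degraded array, copy your data off read-only, and only then rebuild. A sketch using the same device names as above (the replacement partition `/dev/sdf1` is hypothetical, standing in for whatever new disk you add):

```
# Quick sanity check: compare update times and event counts across members
mdadm --examine /dev/sd[bcde]1 | egrep '/dev/sd|Update Time|Events'

# Confirm the array assembled and is running degraded (3 of 4 devices)
cat /proc/mdstat
mdadm --detail /dev/md0

# Mount read-only and copy off anything important before stressing
# the remaining drives with a resync
mount -o ro /dev/md0 /mnt

# Once backed up: add a replacement disk and let the array rebuild
# (/dev/sdf1 is hypothetical, not a device from the question)
mdadm --manage /dev/md0 --add /dev/sdf1
```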