CentOS 7

Partitioning a 10TB hard drive fails?

  • October 1, 2021

Before I tried to partition the 10TB drive again, parted could see it:

# parted /dev/sdb
(parted) print list                                                       
Model: ATA ST10000NM0016-1T (scsi)
Disk /dev/sdb: 10.0TB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start   End     Size    File system  Name     Flags
1      1049kB  10.0TB  10.0TB  xfs          primary
....
....
....

Then I simply tried to partition it again, and it failed:

[root@localhost ~]# parted /dev/sdb
GNU Parted 3.1
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mklabel gpt                                                      
Warning: The existing disk label on /dev/sdb will be destroyed and all data on this disk will be lost. Do you want to continue?
Yes/No? Yes
Error: end of file while reading /dev/sdb
Retry/Ignore/Cancel? Retry                                                
Error: end of file while reading /dev/sdb
Retry/Ignore/Cancel? Cancel                                               
(parted) q                                                                
Warning: Error fsyncing/closing /dev/sdb: Input/output error
Retry/Ignore? Retry                                                       
Warning: Error fsyncing/closing /dev/sdb: Input/output error
Retry/Ignore? Ignore                                                      

After that, the drive disappeared. I tried rebooting, but the drive is still not visible.

This post suggests using gdisk /dev/sdb. However, I think the disk is too badly damaged for gdisk to recognize it:

# gdisk -l /dev/sdb
GPT fdisk (gdisk) version 0.8.10

Problem opening /dev/sdb for reading! Error is 2.
The specified file does not exist!

Output of lsblk:

# lsblk 
NAME                 MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                    8:0    1 447.1G  0 disk 
├─sda1                 8:1    1     2G  0 part /boot
└─sda2                 8:2    1 445.1G  0 part 
 ├─centos-root      253:0    0    30G  0 lvm  /
 ├─centos-swap      253:1    0     4G  0 lvm  [SWAP]
 ├─centos-var       253:2    0    30G  0 lvm  /var
 ├─centos-coredumps 253:3    0    30G  0 lvm  /coredumps
 └─centos-latest    253:4    0 351.1G  0 lvm  /latest

Output of ls -ltr /dev/sd*:

brw-rw---- 1 root disk 8, 0 Feb 10 16:00 /dev/sda
brw-rw---- 1 root disk 8, 2 Feb 10 16:00 /dev/sda2
brw-rw---- 1 root disk 8, 1 Feb 10 16:00 /dev/sda1

lshw -class disk, parted -l, and fdisk -l do not see the drive either.

I see something suspicious in dmesg:

[Wed Feb 10 13:27:39 2021] ata13: softreset failed (1st FIS failed)
[Wed Feb 10 13:27:49 2021] ata13: softreset failed (device not ready)
[Wed Feb 10 13:28:06 2021] ata13: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[Wed Feb 10 13:28:11 2021] ata13.00: qc timeout (cmd 0xec)
[Wed Feb 10 13:28:11 2021] ata13.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[Wed Feb 10 13:28:17 2021] ata13: link is slow to respond, please be patient (ready=0)
[Wed Feb 10 13:28:21 2021] ata13: softreset failed (device not ready)
[Wed Feb 10 13:28:31 2021] ata13: softreset failed (1st FIS failed)
[Wed Feb 10 13:28:41 2021] ata13: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[Wed Feb 10 13:28:51 2021] ata13.00: qc timeout (cmd 0xec)
[Wed Feb 10 13:28:51 2021] ata13.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[Wed Feb 10 13:28:51 2021] ata13: limiting SATA link speed to 3.0 Gbps
[Wed Feb 10 13:28:52 2021] ata13: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[Wed Feb 10 13:29:13 2021] ata13.00: qc timeout (cmd 0x47)
[Wed Feb 10 13:29:13 2021] ata13.00: READ LOG DMA EXT failed, trying unqueued
[Wed Feb 10 13:29:13 2021] ata13.00: failed to get NCQ Send/Recv Log Emask 0x40
[Wed Feb 10 13:29:13 2021] ata13.00: ATA-10: ST10000NM0016-1TT101, SNE0, max UDMA/133
[Wed Feb 10 13:29:13 2021] ata13.00: 19532873728 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
[Wed Feb 10 13:29:13 2021] ata13.00: failed to set xfermode (err_mask=0x40)
[Wed Feb 10 13:29:13 2021] ata13.00: disabled
[Wed Feb 10 13:29:13 2021] ata13: hard resetting link
[Wed Feb 10 13:29:23 2021] ata13: softreset failed (1st FIS failed)
[Wed Feb 10 13:29:23 2021] ata13: hard resetting link
[Wed Feb 10 13:29:33 2021] ata13: softreset failed (device not ready)
[Wed Feb 10 13:29:33 2021] ata13: hard resetting link
[Wed Feb 10 13:29:39 2021] ata13: link is slow to respond, please be patient (ready=0)
[Wed Feb 10 13:29:49 2021] ata13: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[Wed Feb 10 13:29:49 2021] ata13: EH complete
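
With this many repeated resets it can help to count the failure messages per ATA port, for example to see whether the errors follow the drive when it is moved to another port. The following is a minimal sketch, not a definitive tool: summarize_ata_errors is a hypothetical helper name, and it assumes the log lines have the same shape as the ones above (a field such as "ata13:" or "ata13.00:" followed by a message containing "failed", "timeout", or "disabled").

```shell
#!/bin/sh
# Summarise libata failure messages per ATA port from dmesg-style text on stdin.
# Assumption: lines look like "[timestamp] ata13: softreset failed (...)" or
# "[timestamp] ata13.00: qc timeout (...)", as in the log above.
summarize_ata_errors() {
    awk '
        # Only consider lines that report a failure, a timeout, or a disabled device.
        /failed|timeout|disabled/ {
            for (i = 1; i <= NF; i++) {
                # Find the port field, e.g. "ata13:" or "ata13.00:".
                if ($i ~ /^ata[0-9]+(\.[0-9]+)?:$/) {
                    port = $i
                    sub(/[.:].*$/, "", port)   # "ata13.00:" -> "ata13"
                    count[port]++
                    break
                }
            }
        }
        END {
            # One line per port: "<port> <number of failure lines>".
            for (p in count) print p, count[p]
        }
    ' | sort
}
```

Typical use would be dmesg -T | summarize_ata_errors, which prints one line per port with the number of matching messages.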

=================================

Update #1

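I read this post and turned off acpi; another post suggested a power issue, so I turned off tuned-adm. Then the disk came back. I ran parted /dev/sdb like last time, and this time mklabel gpt did not give Error: end of file while reading /dev/sdb, but when I continued with mkpart primary xfs 0% 1%, it gave me Error: /dev/sdb: unrecognised disk label. I rebooted the machine and tried again: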

(parted) mkpart primary xfs 0% 1%                                         
(parted) mkpart primary xfs 1% 2%                                         
(parted) mkpart primary ext4 2% 3%                                        
(parted) mkpart primary ext4 3% 4%
(parted) mkpart primary btrfs 4% 5%                                       
(parted) mkpart primary btrfs 5% 6%                                       
(parted) mkpart primary xfs 6% 100%                                       
(parted) print                                                            
Model: ATA ST10000NM0016-1T (scsi)
Disk /dev/sdb: 10.0TB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start   End     Size    File system  Name     Flags
1      1049kB  100GB   100GB   xfs          primary
2      100GB   200GB   100GB                primary
3      200GB   300GB   100GB                primary
4      300GB   400GB   100GB                primary
5      400GB   500GB   100GB                primary
6      500GB   600GB   100GB                primary
7      600GB   10.0TB  9401GB               primary

(parted) q                                                                

It worked. But this seems very unstable. I checked dmesg again and found similar but different failures:

[Thu Feb 11 00:58:31 2021] ata15.00: qc timeout (cmd 0xec)
[Thu Feb 11 00:58:31 2021] ata15.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[Thu Feb 11 00:58:32 2021] ata15: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[Thu Feb 11 00:58:42 2021] ata15.00: qc timeout (cmd 0xec)
[Thu Feb 11 00:58:42 2021] ata15.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[Thu Feb 11 00:58:42 2021] ata15: limiting SATA link speed to 3.0 Gbps
[Thu Feb 11 00:58:44 2021] ata15: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[Thu Feb 11 00:59:12 2021] ata15.00: ATA-10: ST10000NM0016-1TT101, SNE0, max UDMA/133
[Thu Feb 11 00:59:12 2021] ata15.00: 19532873728 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
[Thu Feb 11 00:59:12 2021] ata15.00: configured for UDMA/133
[Thu Feb 11 00:59:12 2021] scsi 14:0:0:0: Direct-Access     ATA      ST10000NM0016-1T SNE0 PQ: 0 ANSI: 5
[Thu Feb 11 00:59:12 2021] sd 14:0:0:0: [sdb] 19532873728 512-byte logical blocks: (10.0 TB/9.09 TiB)
[Thu Feb 11 00:59:12 2021] sd 14:0:0:0: [sdb] 4096-byte physical blocks
[Thu Feb 11 00:59:12 2021] sd 14:0:0:0: [sdb] Write Protect is off
[Thu Feb 11 00:59:12 2021] sd 14:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[Thu Feb 11 00:59:12 2021] sd 14:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[Thu Feb 11 00:59:19 2021]  sdb:
[Thu Feb 11 00:59:19 2021] sd 14:0:0:0: [sdb] Attached SCSI removable disk
[Thu Feb 11 00:59:37 2021] SGI XFS with ACLs, security attributes, no debug enabled

Any idea what is going on?

Thanks.

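It turned out to be a faulty SATA controller.

I replaced the SATA cable and the entire hard drive. Same problem. I reinstalled the whole operating system, same problem.

Replacing the SATA controller fixed the problem.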
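
To find out which controller an ATA port from dmesg (such as ata13 or ata15) belongs to, the port can be looked up in sysfs. This is a minimal sketch under stated assumptions: it assumes the kernel exposes /sys/class/ata_port/<port> as a symlink into a path ending in <controller>/<port>/ata_port/<port>, and that readlink -f and sed -E are available (GNU tools). ata_port_controller and SYSFS_ROOT are hypothetical names introduced here; SYSFS_ROOT exists only so the function can be tried against a fake directory tree.

```shell
#!/bin/sh
# Print the parent device of an ATA port such as "ata13"; for an onboard or
# add-in SATA controller this is typically its PCI address.
# Assumption: /sys/class/ata_port/<port> resolves to
# .../<controller>/<port>/ata_port/<port>.
ata_port_controller() {
    port=$1
    root=${SYSFS_ROOT:-/sys}   # hypothetical override, used for testing only
    link="$root/class/ata_port/$port"
    [ -e "$link" ] || { echo "no such ATA port: $port" >&2; return 1; }
    # Resolve the symlink, then keep the path component just above
    # "<port>/ata_port/<port>".
    readlink -f "$link" |
        sed -E 's#.*/([^/]+)/ata[0-9]+/ata_port/ata[0-9]+$#\1#'
}
```

If the result is a PCI address, passing it to lspci -s should show which controller the port sits on. On systems where the controller is not a PCI device, the printed component will be something else.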

I ran into a similar hardware problem with a brand-new disk - dmesg:

ataX: softreset failed (device not ready)

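and similar messages, until the drive failed completely - I checked almost everything and searched the whole internet with Google.

I noticed a scrubbing sound, and sometimes short sounds of the drive spinning up and down. In my case, an inadequate power cable was the cause. Once the drive was connected to a separate power supply, no new failures were logged.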

Source: https://serverfault.com/questions/1053064