Recovering a hard drive after a sudden power loss
Here is my partition table:
mercurial@providence:~$ sudo fdisk -l
Disk /dev/sda: 465.78 GiB, 500107862016 bytes, 976773168 sectors
Disk model: ST9500420AS
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xc8000000

Device     Boot     Start       End   Sectors   Size Id Type
/dev/sda1  *         2048    206847    204800   100M  7 HPFS/NTFS/exFAT
/dev/sda2          206848 170128349 169921502    81G  7 HPFS/NTFS/exFAT
/dev/sda3       170144408 337927335 167782928    80G  7 HPFS/NTFS/exFAT
/dev/sda4       337930238 976769023 638838786 304.6G  f W95 Ext'd (LBA)
/dev/sda5       337930240 727652351 389722112 185.9G 83 Linux
/dev/sda6       727654400 968382463 240728064 114.8G 83 Linux
/dev/sda7       968384512 976769023   8384512     4G 82 Linux swap / Solaris
I have two Linux distributions installed: one on /dev/sda5 and the other (Debian) on /dev/sda6. While I was booting Debian on mains power, the machine turned off.
When I rebooted and tried to boot into Debian (/dev/sda6), it would not boot and gave me some errors. Later I booted the other Linux on /dev/sda5. That system boots fine, but it keeps giving me this error:

mercurial@providence:~$ sudo dmesg | grep ata
...
[ 40.300376] print_req_error: I/O error, dev sda, sector 727654456 flags 0
[ 40.302000] Buffer I/O error on dev sda6, logical block 7, async page read
[ 40.303663] ata1: EH complete
[ 40.740148] Adding 4192252k swap on /dev/sda7. Priority:-2 extents:1 across:4192252k FS
[ 44.577187] ata1.00: exception Emask 0x0 SAct 0x678000 SErr 0x0 action 0x0
[ 44.579899] ata1.00: irq_stat 0x40000008
[ 44.582976] ata1.00: failed command: READ FPDMA QUEUED
[ 44.585847] ata1.00: cmd 60/00:a8:00:20:5f/01:00:2b:00:00/40 tag 21 ncq dma 131072 in
             res 41/40:00:38:20:5f/00:01:2b:00:00/00 Emask 0x409 (media error) <F>
[ 44.591895] ata1.00: status: { DRDY ERR }
[ 44.594937] ata1.00: error: { UNC }
[ 44.655930] ata1.00: configured for UDMA/133
[ 44.655977] sd 0:0:0:0: [sda] tag#21 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 44.655982] sd 0:0:0:0: [sda] tag#21 Sense Key : Medium Error [current]
[ 44.655985] sd 0:0:0:0: [sda] tag#21 Add. Sense: Unrecovered read error - auto reallocate failed
[ 44.655990] sd 0:0:0:0: [sda] tag#21 CDB: Read(10) 28 00 2b 5f 20 00 00 01 00 00
[ 44.655993] print_req_error: I/O error, dev sda, sector 727654456 flags 0
...
My hard disk seems relatively healthy:
mercurial@providence:~$ sudo smartctl -a /dev/sda
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-5.2.0-kali2-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Momentus 7200.4
Device Model:     ST9500420AS
Serial Number:    5VJ954YE
LU WWN Device Id: 5 000c50 02eacc16b
Firmware Version: D005SDM1
User Capacity:    500,107,862,016 bytes [500 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Wed Oct 9 11:05:53 2019 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (    0) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 106) minutes.
Conveyance self-test routine
recommended polling time:        (   3) minutes.
SCT capabilities:              (0x103f) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   097   089   006    Pre-fail  Always       -       176927878
  3 Spin_Up_Time            0x0003   100   098   085    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   094   094   020    Old_age   Always       -       6416
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   078   060   030    Pre-fail  Always       -       78548952961
  9 Power_On_Hours          0x0032   064   064   000    Old_age   Always       -       31923
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   094   037   020    Old_age   Always       -       6304
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       1256
188 Command_Timeout         0x0032   099   095   000    Old_age   Always       -       4295037805
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   060   036   045    Old_age   Always   In_the_past 40 (157 57 40 32 0)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       1140
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       1094
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       245233
194 Temperature_Celsius     0x0022   040   064   000    Old_age   Always       -       40 (0 15 0 0 0)
195 Hardware_ECC_Recovered  0x001a   044   024   000    Old_age   Always       -       176927878
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       1
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       1
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       31560 (96 61 0)
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       144327995
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       25265341
254 Free_Fall_Sensor        0x0032   001   001   000    Old_age   Always       -       88

SMART Error Log Version: 1
ATA Error Count: 978 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 978 occurred at disk power-on lifetime: 31923 hours (1330 days + 3 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 ff ff ff 4f 00      00:06:53.760  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00      00:06:53.759  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00      00:06:53.759  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00      00:06:53.758  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00      00:06:53.758  READ FPDMA QUEUED

Error 977 occurred at disk power-on lifetime: 31923 hours (1330 days + 3 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 ff ff ff 4f 00      00:06:51.007  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00      00:06:51.007  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00      00:06:51.007  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00      00:06:51.006  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00      00:06:51.006  READ FPDMA QUEUED

Error 976 occurred at disk power-on lifetime: 31923 hours (1330 days + 3 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 ff ff ff 4f 00      00:06:48.218  READ FPDMA QUEUED
  61 00 38 ff ff ff 4f 00      00:06:48.217  WRITE FPDMA QUEUED
  27 00 00 00 00 00 e0 00      00:06:48.217  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
  ec 00 00 00 00 00 a0 00      00:06:48.215  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      00:06:48.215  SET FEATURES [Set transfer mode]

Error 975 occurred at disk power-on lifetime: 31923 hours (1330 days + 3 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 ff ff ff 4f 00      00:06:45.262  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00      00:06:45.256  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00      00:06:45.246  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00      00:06:45.242  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00      00:06:45.235  READ FPDMA QUEUED

Error 974 occurred at disk power-on lifetime: 31923 hours (1330 days + 3 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 20 ff ff ff 4f 00      00:01:35.260  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00      00:01:35.253  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00      00:01:35.249  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00      00:01:35.249  READ FPDMA QUEUED
  60 00 18 ff ff ff 4f 00      00:01:35.247  READ FPDMA QUEUED

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     11388         -
# 2  Short offline       Completed without error       00%     10594         -
# 3  Short offline       Completed without error       00%      9649         -
# 4  Short offline       Completed without error       00%      9203         -
# 5  Short offline       Completed without error       00%      8696         -
# 6  Short offline       Completed without error       00%      7998         -
# 7  Short offline       Completed without error       00%      7611         -
# 8  Short offline       Completed without error       00%      7116         -
# 9  Short offline       Completed without error       00%      6753         -
#10  Short offline       Aborted by host               40%      6540         -
#11  Extended offline    Completed without error       00%      6512         -
#12  Short offline       Completed without error       00%      6499         -
#13  Short offline       Completed: read failure       90%      6496         577789318
#14  Short offline       Completed without error       00%      6296         -
#15  Short offline       Completed without error       00%      6182         -
#16  Short offline       Aborted by host               90%      5832         -
#17  Short offline       Completed without error       00%      4715         -
#18  Short offline       Completed without error       00%      4704         -
#19  Short offline       Completed without error       00%      4291         -
#20  Short offline       Completed without error       00%      4252         -
#21  Short offline       Completed without error       00%      4008         -
1 of 1 failed self-tests are outdated by newer successful extended offline self-test #11

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
I'm getting a new battery today! Apart from that, what else can I do?
Edit
I ran some more tests, and the results came back positive!
mercurial@providence:~$ sudo smartctl -H /dev/sda6
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-5.2.0-kali2-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Please note the following marginal Attributes:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
190 Airflow_Temperature_Cel 0x0022   053   036   045    Old_age   Always   In_the_past 47 (157 57 48 46 0)

mercurial@providence:~$ sudo badblocks -v /dev/sda6 > badsectors.txt
Checking blocks 0 to 120364031
Checking for bad blocks (read-only test): done
Pass completed, 4 bad blocks found. (4/0/0 errors)
As you can see, 4 bad blocks were found…
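Bad news: your hard disk is relatively unhealthy. Well, it is healthier than a disk that has crashed completely, but it is damaged. Both the Linux kernel messages and the SMART log confirm errors reading some areas of the disk: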
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   097   089   006    Pre-fail  Always       -       176927878
  7 Seek_Error_Rate         0x000f   078   060   030    Pre-fail  Always       -       78548952961
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       1256
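To see which area the kernel messages point at, the failing sector can be matched against the partition table from the question. The following is only a rough sketch: the partition boundaries are copied from the fdisk output, the names `PARTITIONS` and `locate` are made up for illustration, and the 4 KiB page size behind the kernel's "logical block" is an assumption.

```python
# Partition layout copied from the fdisk output in the question
# (start and end sectors, 512-byte sectors). The extended container
# /dev/sda4 is left out because it only wraps sda5 to sda7.
PARTITIONS = {
    "/dev/sda1": (2048, 206847),
    "/dev/sda2": (206848, 170128349),
    "/dev/sda3": (170144408, 337927335),
    "/dev/sda5": (337930240, 727652351),
    "/dev/sda6": (727654400, 968382463),
    "/dev/sda7": (968384512, 976769023),
}

SECTOR_SIZE = 512
PAGE_SIZE = 4096  # assumption: the kernel's "logical block" is a 4 KiB page


def locate(sector):
    """Return (partition, sector offset inside it), or None if unallocated."""
    for name, (start, end) in PARTITIONS.items():
        if start <= sector <= end:
            return name, sector - start
    return None


failing = 727654456  # from "I/O error, dev sda, sector 727654456"
partition, offset = locate(failing)
logical_block = offset * SECTOR_SIZE // PAGE_SIZE
print(partition, offset, logical_block)  # /dev/sda6 56 7
```

Under those assumptions the unreadable sector sits 56 sectors into /dev/sda6, which agrees with the "Buffer I/O error on dev sda6, logical block 7" line in the same log.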
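I'd recommend making (and verifying) an immediate backup, or several backups, of all important data (on all partitions!) before you do anything else with this machine. Preferably back up the most important data by hand first, and the less important data afterwards.
Any further activity on that disk (including trying to assess the extent of the damage) may cause more damage.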
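Once you are sure you have backed up all the data you care about, you can use badblocks(8) to assess the extent of the damage. If the disk is under warranty, you may be able to get a replacement (if necessary, be sure to destroy any confidential data first, for example with the manufacturer's low-level format program). If it is not under warranty, I'd recommend buying a new disk.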
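When reading the list that badblocks writes, it helps to remember that its block numbers are relative to the device it was run on and, unless -b is given, count 1024-byte blocks rather than 512-byte sectors. A minimal sketch of the conversion back to whole-disk sector numbers follows; the function name is invented for illustration, and since the contents of badsectors.txt are not shown in the question, the block number 28 is purely hypothetical.

```python
SECTOR_SIZE = 512         # logical sector size reported by fdisk
BADBLOCKS_BLOCK = 1024    # badblocks' default block size when -b is not given
SDA6_START = 727654400    # first sector of /dev/sda6 in the fdisk output
SDA6_SECTORS = 240728064  # size of /dev/sda6 in sectors, also from fdisk


def block_to_sectors(block, part_start=SDA6_START):
    """Whole-disk sector numbers covered by one badblocks block."""
    per_block = BADBLOCKS_BLOCK // SECTOR_SIZE
    first = part_start + block * per_block
    return list(range(first, first + per_block))


# Sanity check against "Checking blocks 0 to 120364031" in the question.
total_blocks = SDA6_SECTORS * SECTOR_SIZE // BADBLOCKS_BLOCK
print(total_blocks - 1)  # 120364031

example_block = 28  # hypothetical entry, not taken from badsectors.txt
print(block_to_sectors(example_block))  # [727654456, 727654457]
```

The block count matches the range badblocks reported, which suggests the default block size was indeed in effect for that run.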
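If neither of those is an option, you can try using badblocks -w to force the disk to attempt to reallocate the damaged sectors (or use fdisk(8) in expert mode to create smaller partitions in the undamaged areas of the disk). Even if that works, the disk is likely to develop more problems in the future, so you probably shouldn't put any irreplaceable data on it. Note that this destroys data, and you will need to reinstall the operating system and restore the data on all damaged partitions. (There is also badblocks -n, which would destroy only the files that are already damaged and leave the rest of the system as it was, but IMHO a one-time clean reinstall is better than perpetual uncertainty about whether any given error comes from damaged system files or from something else.)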
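For the route of creating smaller partitions around the damage, the boundaries have to be worked out by hand before they are typed into fdisk. The sketch below shows one way to do that arithmetic. It is not a recommendation: the helper name, the 1 GiB safety margin, and the 2048-sector alignment are all assumptions chosen for illustration, and the single bad sector is the one from the kernel log in the question.

```python
def usable_ranges(start, end, bad_sectors, margin, align=2048):
    """Split [start, end] into sector ranges that stay `margin` sectors
    away from every bad sector, with each range start rounded up to a
    multiple of `align`."""
    ranges = []
    cursor = start
    for bad in sorted(bad_sectors):
        lo, hi = bad - margin, bad + margin
        if lo - 1 >= cursor:
            ranges.append((cursor, lo - 1))
        cursor = max(cursor, hi + 1)
    if cursor <= end:
        ranges.append((cursor, end))

    aligned = []
    for first, last in ranges:
        first = -(-first // align) * align  # round up to the alignment
        if first <= last:
            aligned.append((first, last))
    return aligned


SDA6 = (727654400, 968382463)     # from the fdisk output in the question
BAD = [727654456]                 # from the kernel log in the question
MARGIN = 2 * 1024 * 1024          # assumption: 1 GiB of 512-byte sectors

print(usable_ranges(SDA6[0], SDA6[1], BAD, MARGIN))
# [(729753600, 968382463)]
```

With these numbers the bad sector is so close to the start of /dev/sda6 that the only usable range begins about 1 GiB in. A real bad-block list would likely contain more entries and yield several ranges.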