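Filesystem is now read-only after reaching 100% storage capacity; how do I reset it to read-write?
Yesterday our server (Ubuntu 18.04) reached 100% storage capacity and one of our filesystems was switched to read-only mode, see: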
/dev/md3 / ext4 ro,relatime,errors=remount-ro,data=ordered 0 0
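I have already tried several solutions from other answers on serverfault, but none of them seems to fit my situation. For example, I tried running the following command: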
sudo mount -o remount,rw /dev/md3 /
but this results in the message: mount: /: cannot remount /dev/md3 read-write, is write-protected.
How can I fix this so that the filesystem is read-write again?
Thanks!
Update with debug information:
mdadm --detail /dev/md3
/dev/md3:
        Version : 0.90
  Creation Time : Fri Nov 10 10:07:34 2017
     Raid Level : raid1
     Array Size : 20478912 (19.53 GiB 20.97 GB)
  Used Dev Size : 20478912 (19.53 GiB 20.97 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 3
    Persistence : Superblock is persistent

    Update Time : Sat Sep 18 09:15:35 2021
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

Consistency Policy : unknown

           UUID : 4b632ac4:ae1a7c2b:a4d2adc2:26fd5302
         Events : 0.861

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       8       19        1      active sync   /dev/sdb3
And from dmesg:
dmesg | grep "md3"
[67448453.830094] EXT4-fs error (device md3): ext4_remount:4840: Abort forced by user
Output of tune2fs:
tune2fs -l /dev/md3
tune2fs 1.44.1 (24-Mar-2018)
Filesystem volume name:   /
Last mounted on:          /
Filesystem UUID:          d1a985c4-8c5e-4034-93e0-629b8e65f161
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags:         signed_directory_hash
Default mount options:    user_xattr acl
Filesystem state:         clean with errors
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              1281120
Block count:              5119728
Reserved block count:     255986
Free blocks:              445848
Free inodes:              1001361
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      1022
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8160
Inode blocks per group:   510
Flex block group size:    16
Filesystem created:       Fri Nov 10 10:07:39 2017
Last mount time:          Tue Jul 30 17:51:41 2019
Last write time:          Thu Sep 16 20:06:05 2021
Mount count:              7
Maximum mount count:      -1
Last checked:             Fri Nov 10 10:07:39 2017
Check interval:           0 (<none>)
Lifetime writes:          4013 GB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
First orphan inode:       663035
Default directory hash:   half_md4
Directory Hash Seed:      ae316af1-086d-470f-af27-0c10ca25f3c8
Journal backup:           inode blocks
FS Error count:           8
First error time:         Thu Sep 16 20:06:04 2021
First error function:     ext4_lookup
First error line #:       1607
First error inode #:      930317
First error block #:      0
Last error time:          Sat Sep 18 09:15:35 2021
Last error function:      ext4_remount
Last error line #:        4840
Last error inode #:       685456
Last error block #:       0
Debug output from
e2fsck -n /dev/md3
:e2fsck -n /dev/md3 e2fsck 1.44.1 (24-Mar-2018) Warning: skipping journal recovery because doing a read-only filesystem check. / contains a file system with errors, check forced. Pass 1: Checking inodes, blocks, and sizes Inodes that were part of a corrupted orphan linked list found. Fix? no Inode 101 was part of the orphaned inode list. IGNORED. Inode 117 was part of the orphaned inode list. IGNORED. Inode 292 was part of the orphaned inode list. IGNORED. Inode 460 was part of the orphaned inode list. IGNORED. Inode 465 was part of the orphaned inode list. IGNORED. Inode 471 was part of the orphaned inode list. IGNORED. Inode 487 was part of the orphaned inode list. IGNORED. Inode 529 was part of the orphaned inode list. IGNORED. Inode 562 was part of the orphaned inode list. IGNORED. Inode 564 was part of the orphaned inode list. IGNORED. Inode 707 was part of the orphaned inode list. IGNORED. Inode 723 was part of the orphaned inode list. IGNORED. Inode 918 was part of the orphaned inode list. IGNORED. ... Deleted inode 402614 has zero dtime. Fix? no ... Inode 783370, end of extent exceeds allowed value (logical block 1024, physical block 3068928, len 76) Clear? no Inode 783370, i_blocks is 8784, should be 8200. Fix? no Inode 783470, end of extent exceeds allowed value (logical block 2708, physical block 1322783, len 193) Clear? no Inode 783470, i_blocks is 23200, should be 21672. Fix? no Inode 1047956 was part of the orphaned inode list. IGNORED. Pass 2: Checking directory structure Entry 'tmp' in /tmp/systemd-private-bb09aae54cab4e12844e5844d11ca5eb-certbot.service-VSBnVY (685456) has deleted/unused inode 685457. Clear? no Entry '1159_key-certbot.pem' in /etc/letsencrypt/keys (930317) has deleted/unused inode 920168. Clear? no Entry '1159_key-certbot.pem' in /etc/letsencrypt/keys (930317) has an incorrect filetype (was 1, should be 0). Fix? no Entry '1110_csr-certbot.pem' in /etc/letsencrypt/csr (930318) has deleted/unused inode 920176. Clear? 
no Entry '1110_csr-certbot.pem' in /etc/letsencrypt/csr (930318) has an incorrect filetype (was 1, should be 0). Fix? no Entry '1106_key-certbot.pem' in /etc/letsencrypt/keys (930317) has deleted/unused inode 920166. Clear? no Entry '1106_key-certbot.pem' in /etc/letsencrypt/keys (930317) has an incorrect filetype (was 1, should be 0). Fix? no Entry '1109_key-certbot.pem' in /etc/letsencrypt/keys (930317) has deleted/unused inode 920173. Clear? no Entry '1109_key-certbot.pem' in /etc/letsencrypt/keys (930317) has an incorrect filetype (was 1, should be 0). Fix? no Entry '1146_csr-certbot.pem' in /etc/letsencrypt/csr (930318) has deleted/unused inode 920172. Clear? no Entry '1146_csr-certbot.pem' in /etc/letsencrypt/csr (930318) has an incorrect filetype (was 1, should be 0). Fix? no ... Pass 3: Checking directory connectivity Pass 4: Checking reference counts Inode 685456 ref count is 3, should be 2. Fix? no Pass 5: Checking group summary information Block bitmap differences: -34565 -(53721--53734) -(59721--59761) -(59981--59983) -(61106--61184) -(61540--61544) -(70964--71007) -(71274--71313) -(84938--84989) -(85084--85107) -(85592--85599) -(116400--116408) -(116423--116436) -(128700--128703) -(128708--128721) -(138904--138914) -(165045--165150) -(169691--169713) -(169717--169742) -(464896--471464) -(471552--471989) -(472928--472947) -(499200--499612) -(501408--501434) -(503808--504070) -(513024--513301) -(513408--513491) -(589477--589480) -(711431--711441) -(747968--748030) -(838733--838740) -(838755--838758) -(838772--838783) -(838791--838800) -(838805--838816) -(838824--838835) -(848384--848972) -(875840--875880) -(1032187--1033031) -(1083840--1083878) -(1120110--1120132) -(1322783--1322975) -(1631196--1631251) -(1635150--1635169) -(1635360--1635391) -(1635571--1635575) -(1635848--1635855) -(1635996--1636001) -1648860 -1648880 -(1715533--1715536) -(1740800--1741311) -(1746432--1746573) -(1750528--1750729) -(1867776--1867880) -(1870717--1871294) 
-(1880576--1880791) -(1888256--1888258) -1888260 -(1888272--1888273) -(1888275--1888767) -(2226402--2226405) -(2235495--2235719) -(2266304--2266332) -(2301560--2301629) -(2528723--2528753) -(2589088--2589117) -(2597312--2597374) -(2597696--2597757) -(2614784--2615295) -(2619392--2619458) -(2619904--2620297) -2636181 -(2671360--2671491) -(2687328--2687350) -(3068928--3069003) -(3196998--3197002) -(3228728--3228738) -(3236697--3236703) -(3252961--3252970) -(3264276--3264277) -(3264287--3264298) -(3285164--3285170) -(3299518--3299524) -(3399680--3400062) -(3441024--3441129) -(3574080--3574142) -(3601664--3601795) -(3659648--3659724) -(3660672--3660755) -(3704233--3704234) -(3704237--3704242) -3707626 -3708898 -3709310 -3709356 -3709398 -3709984 -(3751694--3751696) -(3751707--3751711) -(3751767--3751768) -(3751774--3751775) -(3751800--3751814) -(3771264--3771343) -(3830025--3830040) -(3860480--3867203) -(3867616--3867644) -(3868160--3868618) -(3869696--3870139) -(4045457--4045483) -(4087936--4088023) -(4088032--4088055) -(4088320--4088780) -(4088960--4089064) -(4089088--4089126) -(4091136--4091324) -(4091392--4092119) -(4092928--4094514) -(4094976--4095854) -(4097088--4097120) -(4097536--4097816) -(4109312--4110157) -(4250368--4250378) -(4278497--4278513) -(4296960--4297014) -(4325486--4325616) -(4325632--4325707) -(4326688--4327074) -(4328826--4328961) -(4329202--4329314) -(4329600--4329666) -(4329764--4329804) -(4332027--4332178) -(4332406--4332476) -(4333568--4333942) -(4334372--4334454) -(4334564--4335227) -(4621153--4621176) -(4669781--4670170) -(4696470--4696548) -(4697074--4697429) -(4697662--4697711) -(4726778--4727894) -(5055921--5056185) -(5056648--5056667) -(5106412--5106620) -(5106668--5107034) Fix? no Free blocks count wrong for group #76 (3374, counted=3375). Fix? no Free blocks count wrong (445848, counted=445849). Fix? 
no Inode bitmap differences: -101 -117 -292 -460 -465 -471 -487 -529 -562 -564 -707 -723 -918 -(1837--1838) -2041 -2714 -3593 -3654 -3659 -3894 -3976 -4336 -4425 -5193 -5244 -5252 -5930 -5951 -5967 -(7066--7069) -7431 -8492 -8651 -9298 -9583 -9592 -14261 -14270 -18093 -19214 -21301 -(27843--27844) -27847 -27849 -(27853--27856) -(27868--27869) -(27872--27873) -27875 -27879 -27883 -27885 -(27889--27890) -27892 -162842 -391708 -391741 -391759 -391763 -(391800--391802) -(391804--391805) -(391812--391814) -(391831--391833) -391870 -391873 -391878 -391900 -391902 -(391910--391911) -391915 -391919 -391927 -391956 -392493 -392719 -393759 -393795 -395132 -395134 -395161 -395165 -395221 -395234 -395267 -395289 -(395312--395313) -395315 -395325 -395336 -395387 -395630 -396550 -396589 -(396699--396700) -402594 -(402596--402598) -402601 -(402604--402606) -402608 -(402611--402614) -407918 -413872 -413874 -413881 -413885 -413897 -413900 -413908 -421042 -421202 -421226 -426391 -652905 -(652931--652935) -663035 -685457 -920162 -(920164--920176) -1047956 Fix? no Directories count wrong for group #84 (17, counted=16). Fix? no Free inodes count wrong for group #96 (80, counted=82). Fix? no Free inodes count wrong for group #112 (486, counted=487). Fix? no Free inodes count wrong (1001361, counted=1001364). Fix? no /: ********** WARNING: Filesystem still has errors ********** /: 279759/1281120 files (0.7% non-contiguous), 4673880/5119728 blocks
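It was filesystem corruption, not the disk filling up, that caused the switch to read-only mode, exactly as the mount option errors=remount-ro dictates.
Back up your important data and configuration and copy it off the machine. Prepare a recovery plan in case something needed for booting turns out to be damaged. If possible, move important services to another machine. There will be some downtime.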
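One way to see that this is corruption rather than a lack of space is to look at a few fields of the tune2fs -l output posted in the question. The following is only a minimal sketch: the function name fs_health_summary is made up for illustration, and it assumes the field labels appear exactly as they do in that output.

```shell
# Summarize the fields of `tune2fs -l` output that bear on the diagnosis.
# Reads the tune2fs output on stdin, for example:
#   tune2fs -l /dev/md3 | fs_health_summary
# (fs_health_summary is a hypothetical helper, not part of e2fsprogs.)
fs_health_summary() {
    # Split each line at the colon that follows the label, swallowing padding.
    awk -F': *' '
        $1 == "Filesystem state"     { state = $2 }
        $1 == "FS Error count"       { errors = $2 }
        $1 == "Free blocks"          { free = $2 }
        $1 == "Reserved block count" { reserved = $2 }
        END {
            printf "state=%s\n", state
            # A missing "FS Error count" line is reported as 0.
            printf "error_count=%d\n", errors
            printf "free_blocks=%d\n", free
            printf "reserved_blocks=%d\n", reserved
        }'
}
```

Fed the values from the question, this reports a state of "clean with errors", an error count of 8, and 445848 free blocks against 255986 reserved ones, so at the time of that snapshot the superblock still recorded free space beyond the root reserve.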
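I notice that this system is not rebooted often (only 7 mounts since 2017, and the last reboot was in 2019). So I suggest setting the maximum mount count to 1, so that the filesystem is checked on every boot: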
tune2fs -c 1 /dev/md3
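Then reboot. The init scripts should check and repair the filesystem during boot. However, the corruption may be severe enough to require manual intervention, so make sure someone is near the server and ready to help. And if the corruption has touched something important, you may run into strange problems.
In the worst case you will have to reinstall the system. But don't forget to set the maximum mount count to 1 again afterwards.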
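To confirm that the new setting took effect before rebooting, the two mount counters in the tune2fs -l output can be compared. This is a simplified sketch under stated assumptions: mount_count_check_due is a hypothetical helper name, and it models only the mount-count rule (a check is due once the mount count reaches a positive maximum), not the other conditions under which e2fsck forces a check, such as the error flag this filesystem already carries.

```shell
# Decide, from `tune2fs -l` output on stdin, whether the mount-count rule
# alone would make the next boot-time fsck run a full check.
# Exit status: 0 = check due, 1 = not due.
# (mount_count_check_due is a hypothetical helper, not part of e2fsprogs.)
mount_count_check_due() {
    awk -F': *' '
        $1 == "Mount count"         { count = $2 + 0 }
        $1 == "Maximum mount count" { max = $2 + 0 }
        END {
            # A maximum of 0 or -1 disables the mount-count rule entirely.
            if (max > 0 && count >= max) exit 0
            exit 1
        }'
}

# Example (needs root, so shown as a comment only):
#   tune2fs -l /dev/md3 | mount_count_check_due && echo "check due at next boot"
```

With the values from the question (mount count 7, maximum -1) the rule is disabled; after tune2fs -c 1 the maximum becomes 1 and the same mount count of 7 makes the check due.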
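Why did the filesystem get corrupted? It just happens. Blocks are cached in memory, and the corruption may have occurred there because of cosmic rays. That is a very rare event, but it does happen. Disks are not ideal either and cannot detect every error; they have a non-zero bit error rate (look up the actual value in your device's datasheet), so there is a very small but real chance that data is read back corrupted. If this happens to a metadata block, the problems can accumulate (a filesystem driver guided by wrong information may make incorrect assumptions and corrupt the filesystem further), which is why it is important to check the filesystem from time to time.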
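Between full checks, the kernel log can give an early hint that something is wrong. The sketch below counts ext4 error lines for one device in text read from stdin; count_ext4_errors is a hypothetical helper name, and it assumes the message format seen in the dmesg output in the question.

```shell
# Count kernel log lines reporting an ext4 error for a given device.
# Reads log text on stdin, for example:  dmesg | count_ext4_errors md3
# (count_ext4_errors is a hypothetical helper written for illustration.)
count_ext4_errors() {
    # -c prints the number of matching lines, -F matches the text literally.
    # grep exits non-zero when nothing matches, so `|| true` keeps the
    # function's exit status at 0 while the printed count stays 0.
    grep -cF "EXT4-fs error (device $1)" || true
}
```

A non-zero count from a periodic job would be a reason to schedule a proper e2fsck run rather than a diagnosis in itself.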