Hard-Drive

Find used hard drive data, resolve disk pressure

  • December 5, 2019

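I have a server running Ubuntu 18.04 that is also a K8s worker node. From time to time I see K8s kill pods on this machine because of disk pressure, and when I run df -h --total I can see that 85% (1.5T) of the disk mounted at / is in use: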

~$ df -h --total
Filesystem      Size  Used Avail Use% Mounted on
udev            126G     0  126G   0% /dev
tmpfs            26G  5.3M   26G   1% /run
/dev/sda2       1.8T  1.5T  276G  85% /
tmpfs           126G     0  126G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           126G     0  126G   0% /sys/fs/cgroup
/dev/loop0       90M   90M     0 100% /snap/core/7917
/dev/loop1       90M   90M     0 100% /snap/core/8039
/dev/sdb1       9.8G  203M  9.1G   3% /boot
/dev/sdb2       511M  6.1M  505M   2% /boot/efi
/dev/sdb3       1.8T  100M  1.7T   1% /home
/dev/loop2      128K  128K     0 100% /snap/austin/42
/dev/loop3      3.0M  3.0M     0 100% /snap/micro/648
tmpfs            26G     0   26G   0% /run/user/1001
total           4.0T  1.5T  2.4T  38% -

The problem is that when I go to / and run sudo du -BG -s *, I can only account for 313G of used data, and nothing more:

/$ sudo du -BG -s *
1G  bin
1G  boot
0G  dev
1G  etc
1G  home
0G  initrd.img
0G  initrd.img.old
1G  lib
1G  lib64
1G  lost+found
1G  media
1G  mnt
1G  opt
du: cannot access 'proc/22512/task/22580/fdinfo/20': No such file or directory
du: cannot access 'proc/45752/task/45752/fd/4': No such file or directory
du: cannot access 'proc/45752/task/45752/fdinfo/4': No such file or directory
du: cannot access 'proc/45752/fd/3': No such file or directory
du: cannot access 'proc/45752/fdinfo/3': No such file or directory
0G  proc
1G  root
1G  run
1G  sbin
1G  snap
1G  srv
9G  swap.img
0G  sys
1G  tmp
3G  usr
313G    var
0G  vmlinuz
0G  vmlinuz.old
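
A minimal sketch of a variant of this check, assuming GNU du and sort are available. The function name top_level_usage is hypothetical. It uses -x so that du stays on the filesystem of the starting directory (which skips /proc, /sys and the separately mounted /home and /boot shown in the df output), and sorts the result so the largest entries come last:

```shell
# Summarize first-level directory sizes on a single filesystem, largest last.
# top_level_usage is a hypothetical helper name, not part of any tool.
top_level_usage() {
    dir="${1:-/}"
    # -x            : do not cross into other mounted filesystems
    # -BM           : report sizes in mebibytes (rounded up), so sort -n works
    # --max-depth=1 : one line per first-level entry plus one for the total
    # 2>/dev/null   : hide "cannot access" noise from vanishing files
    du -x -BM --max-depth=1 "$dir" 2>/dev/null | sort -n
}
```

On the server this would be run as root, for example `sudo sh -c '. ./usage.sh; top_level_usage /'`, where usage.sh is an assumed file holding the function.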

How can I find the remaining data and resolve the disk pressure problem?

Update

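My problem is different from the one in the suggested solution. In that case the problem was files that had been deleted, whereas my problem is Docker. I have posted an answer so that I can close this question.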

I found a way, at https://unix.stackexchange.com/a/382696/380398, to make lsof show a list of the open files and sort them by size:

sudo lsof \
| grep REG \
| grep -v "stat: No such file or directory" \
| grep -v DEL \
| awk '{if ($NF=="(deleted)") {x=3;y=1} else {x=2;y=0}; {print $(NF-x) "  " $(NF-y) } }'  \
| sort -n -u  \
| numfmt  --field=1 --to=iec
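
To make the awk stage of this pipeline easier to follow, here is a small sketch that runs the same field selection on two made-up lsof-style lines. The function name extract_size_and_name and both sample lines (process names, PIDs, sizes, paths) are invented for illustration. When the last field is "(deleted)", the size sits three fields before it and the name one field before it; otherwise the size is two fields before the name, which is the last field:

```shell
# Same field logic as the awk stage above, wrapped in a hypothetical helper.
extract_size_and_name() {
    awk '{
        # Lines for deleted files end in "(deleted)", which shifts the
        # size and name columns one position to the left.
        if ($NF == "(deleted)") { x = 3; y = 1 } else { x = 2; y = 0 }
        print $(NF - x) "  " $(NF - y)
    }'
}

# Fabricated sample input: one regular open file and one deleted file.
printf '%s\n' \
    'dockerd 1234 root 45w REG 8,2 1048576 131 /var/lib/docker/containers/example/example-json.log' \
    'java 5678 app 7u REG 8,2 2048 977 /tmp/example.tmp (deleted)' \
    | extract_size_and_name
# prints:
# 1048576  /var/lib/docker/containers/example/example-json.log
# 2048  /tmp/example.tmp
```

Because the fields are split on whitespace, a path containing spaces would be cut off at its last word; that limitation applies to the pipeline as well.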

When I ran it, I got:

118M  /usr/bin/kubelet
168M  /var/lib/docker/containers/ce98aeb3e061c31e81d232933fa21f055169924cd0411ec276d51ae008dbb993/ce98aeb3e061c31e81d232933fa21f055169924cd0411ec276d51ae008dbb993-json.log
185M  /var/lib/docker/containers/933c29608da9d954dc941fc741ffe0b012e6ec55a8befa95b8487f2367596577/933c29608da9d954dc941fc741ffe0b012e6ec55a8befa95b8487f2367596577-json.log
207M  /var/lib/docker/containers/2d4c2967fe22b1eb79b234e465f36ad062c8f390659c2f2f42ad31636be8a1be/2d4c2967fe22b1eb79b234e465f36ad062c8f390659c2f2f42ad31636be8a1be-json.log
272M  /var/lib/docker/containers/4b8daa87cda051a3b2bfd1b89c70763dca990b65b0eb211260f0e6d92b972da9/4b8daa87cda051a3b2bfd1b89c70763dca990b65b0eb211260f0e6d92b972da9-json.log
343M  /var/lib/docker/containers/52cb2d7fceb6bef7a01f7e5c666cb05e0eb62537d54a9b8da8865eba9e51c728/52cb2d7fceb6bef7a01f7e5c666cb05e0eb62537d54a9b8da8865eba9e51c728-json.log
1.1G  /var/lib/docker/containers/fe2c73fd47b37a7a5e70bd1f07508bec7dad024c75b859d933b6fa5bba649f18/fe2c73fd47b37a7a5e70bd1f07508bec7dad024c75b859d933b6fa5bba649f18-json.log
1.1G  /var/lib/docker/containers/8887ea0b31603e0a5b21c934ce06bb4a35133df2367eccb5ad9e2a07eb884bd3/8887ea0b31603e0a5b21c934ce06bb4a35133df2367eccb5ad9e2a07eb884bd3-json.log
42G  /var/lib/docker/containers/1f7180db9e41b66f3646bdf021644b23c1a954830191807532af813f5aa5cde6/1f7180db9e41b66f3646bdf021644b23c1a954830191807532af813f5aa5cde6-json.log
83G  /var/lib/docker/containers/a456e37303998844207c79fc3cdb63878765d7a3151c35051cb071545c75cec7/a456e37303998844207c79fc3cdb63878765d7a3151c35051cb071545c75cec7-json.log
220G  /var/lib/docker/containers/60aad026e90035790ff5f6f1ad714e6187bec5dfeb5b1d3156b7cda1d00cc251/60aad026e90035790ff5f6f1ad714e6187bec5dfeb5b1d3156b7cda1d00cc251-json.log
260G  /var/lib/docker/containers/52c866da942a3228ba56265210ef4f13fbc96ebc1c0214501df189901a829414/52c866da942a3228ba56265210ef4f13fbc96ebc1c0214501df189901a829414-json.log
560G  /var/lib/docker/containers/f56a9853ef993ce3843a2d6acf5c9603a283e64fb4b81d6523342c6ad03243ad/f56a9853ef993ce3843a2d6acf5c9603a283e64fb4b81d6523342c6ad03243ad-json.log

This correctly adds up to 1.5T (once I also add the other things I could already see earlier).
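
Since every large entry in the listing is a Docker *-json.log file, a more direct check is to look at those files alone. The following is a sketch only, assuming GNU find, sort and numfmt; the function name largest_container_logs is hypothetical, and the default path is simply the directory that appears in the listing above:

```shell
# List Docker json-file logs under a directory, smallest first, with
# human-readable sizes. largest_container_logs is a hypothetical helper.
largest_container_logs() {
    root="${1:-/var/lib/docker/containers}"
    # %s is the file size in bytes, %p the path (GNU find -printf).
    find "$root" -type f -name '*-json.log' -printf '%s %p\n' \
        | sort -n \
        | numfmt --field=1 --to=iec
}
```

On the server this would need root, since /var/lib/docker is normally not readable by other users. Capping log growth with the json-file logging driver's max-size and max-file options is one commonly used way to keep such files from growing back; whether that fits this cluster is an assumption on my part.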

Source: https://serverfault.com/questions/994409